Page:Crowdsourcing and Open Access.djvu/25

2010]

CROWDSOURCING AND OPEN ACCESS

615

differing opinions about whether the Wikipedia biographies of Presidents Barack Obama^[1] or George W. Bush^[2] adhere to the stated standard of neutrality; it is less easy to imagine users reasonably adhering to different views about whether the text reproduced at Wikisource matches the content of the published source.^[3]

Like Distributed Proofreaders, Wikisource now draws most new content from users who proofread and correct the text extracted from scanned page images of a published source.^[4] Unlike Distributed Proofreaders, however, Wikisource was not originally engineered with proofreading of page scans in mind. This functionality has been in place only during the last two to three years of the project’s existence.^[5] Nevertheless, the site now offers a clean and well-organized user interface that at least rivals, and perhaps exceeds, the usefulness and intuitive functionality of Distributed Proofreaders.

First, each scanned volume image accessible at Wikisource^[6] (which typically, although not always, corresponds to a separately bound hard copy volume of a work as originally published) has a so-called “Index page” that reproduces identifying information about that

(last visited Feb. 10, 2010) (briefly addressing neutrality issue in context of publication of misleading extracts from a work).

[1] See generally Barack Obama, http://en.wikipedia.org/wiki/Barack_Obama (last visited Feb. 10, 2010).
[2] See generally George W. Bush, http://en.wikipedia.org/wiki/George_W._Bush (last visited Feb. 10, 2010).
[3] This risk seems particularly low for works more recently added to Wikisource, many of which are assembled from scanned page images of the published original sources and which permit easy user verification of the text against the source image.
[4] Statistical descriptions of Wikisource (or, indeed, any of the WMF projects) involve substantial risks of error due to the constant flux of additions and deletions to the project. With that caveat in mind, however, it is possible to make some very broad points to illustrate the relative magnitude of the works available at the English-language Wikisource. As of January 2010, Wikisource included approximately 321,000 individual pages of scanned text, a figure that may undercount the actual number of scanned images available at the site, not all of which have yet been used to produce a corresponding text page. See Wikisource Statistics—Tables—English—Database records per namespace, http://stats.wikimedia.org/wikisource/EN/TablesWikipediaEN.htm#namespaces (last visited Feb. 10, 2010) (the column heading “104” in this table corresponds to the “Page” namespace used on the project and marks the number of text records that match a scanned page image at Wikisource).
[5] The necessary “ProofreadPage” software extension was added to the MediaWiki software that underlies all WMF sites in mid-2007. See Extension:ProofreadPage, http://www.mediawiki.org/wiki/Extension:Proofread_Page (last visited Feb. 10, 2010).
[6] A few of these files are hosted at Wikisource itself, although the more common practice appears to be to host the files at Wikimedia Commons, where they are equally usable by all WMF projects.