Help:ProofreadPage extension

ProofreadPage Extension
Wikisource uses the ProofreadPage extension, which allows you to render text along with its corresponding scanned image.

Discussion

The side-by-side ProofreadPage extension shows a transcribed text and a scan of the original document on one page. These pages use the prefix 'Page:', and collections of them are displayed on a page whose name begins with 'Index:'. While the extension supports many file types, a document at Wikisource is usually a DjVu file with OCR.

The ProofreadPage extension is enabled by default at Wikisource and should come up automatically when a page in the "Page:" namespace is edited. However, for this to work the editor's browser (and extensions such as NoScript) must allow script processing. Your Special:Preferences page (section "Gadgets") allows you to control certain features, such as whether the OCR button is enabled and whether the text and scan appear side by side or one above the other by default.

Users new to proofreading can experiment with the concept and test their abilities with these simple introductory tests on the Distributed Proofreaders website. Working examples can be seen by finding a project in progress, such as Wikisource:Proofread of the Month.

Once you've found a project you want to work on, go to its index page. There you'll find links to the project's pages, colored by their status. Select a page that needs work (one that is not green), open it in the editor, and make whatever changes are appropriate, either to the text or to the status. Then preview and save.

Anyone can proofread and correct most pages at Wikisource. However, editors must log in to an account in order to change the proofread status. Edits made from IP addresses cannot change this status.

When corrections and formatting are complete, mark the page as proofread; it is then ready for the main namespace. Leave the page as 'not proofread' until it is done, and mark it as problematic if appropriate.

Rationale

The ProofreadPage extension is intended to allow easy comparison of text to the original. It has the following advantages:

  • Credibility: it makes it possible for Wikisource to guarantee that the text corresponds to its scanned source.
  • Improved collaboration: because everyone has direct access to the scanned book, anyone can proofread texts and fix typos. This restores the wiki way of collaborating.
  • Security: text is better protected against vandalism (any falsification can be detected immediately; texts are not accessed directly, but through transclusion, which deters inexperienced vandals).
  • No limitations on rendering: a book can be rendered in two different ways, without duplicating data:
  1. As a set of pages. Each page shows a column of OCR text beside the scanned image. This mode is meant for contributors.
  2. Broken into its logical organization (such as chapters or poems) using transclusion. This mode is meant for readers.
  • Fairness of comparisons: since book pages are not in the 'main' namespace, they are not included in the statistical count of text units. A count of pages is available here. This method of comparison uses the same unit of measure for all texts (the page), which puts an end to the temptation of slicing texts into arbitrarily small units in order to increase statistics.

Limitations

The poem tag does not work well because it adds a carriage return at the end of a block. It's also not possible to use <pre> formatting, since the line breaks are suppressed during transclusion. To solve this issue, add <br /> tags to the beginning of lines.
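For long passages, the <br /> workaround can be applied mechanically before pasting text into the editor. The following is only a minimal sketch in Python; the function name `add_line_breaks` is hypothetical and not part of the extension, and it assumes that the first line of the passage and any blank lines (for example, stanza breaks) should be left untouched.

```python
def add_line_breaks(text: str) -> str:
    """Prefix every non-blank line after the first with a <br /> tag.

    Assumption: the first line needs no tag, and blank lines are kept
    as they are so that stanza breaks survive.
    """
    result = []
    for index, line in enumerate(text.splitlines()):
        if index == 0 or not line.strip():
            # Leave the opening line and blank lines unchanged.
            result.append(line)
        else:
            # Put the tag at the beginning of the line, as described above.
            result.append("<br />" + line)
    return "\n".join(result)


if __name__ == "__main__":
    verse = "First line of the verse\nSecond line of the verse\n\nNext stanza"
    print(add_line_breaks(verse))
```

The output can then be pasted into the page's edit box; check the preview, since the right placement of tags may differ from text to text.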

To make it easier to proofread pages whose images are rotated, the Rotate Image Firefox extension can be used.

See also