User:Zoeannl/Project guideline/Proofreaders Guide for proofreaders

From Wikisource

Search this page

Help:Searching

Wikimedia general search help

Before you proofread

The Primary Rule

"Don't change what the author wrote!"

The electronic book should accurately convey the intent of the author. If the author spelled words oddly, we leave them spelled that way. If the author wrote outrageous racist or biased statements, we leave them that way. If the author put commas, superscripts, or footnotes every third word, we keep the commas, superscripts, or footnotes. We are proofreaders, not editors; if something in the text does not match the original page image, you should change the text so that it does match. (See Scannos, Typos and Mispunct for proper handling of obvious misprints.)

We do change minor typographical conventions that don't affect the sense of what the author wrote. For example, we rejoin words that were broken at the end of a line (End-of-line Hyphenation) and we close up abbreviations such as i. e. to i.e. Changes such as these help us produce a consistently formed digital version of the book. The proofreading rules we follow are designed to achieve this result. Please carefully read the rest of the Proofreading Guidelines with this concept in mind.

To proofread a page, you should edit the text so that it matches the scan as closely as possible.

You do not have to make an identical, photographic copy of the scan. Wikisource is a website, not a book, and the text is more important than the typography. You should just try to get as close as possible. Some things work in books but do not work on Wikisource. For example, columns of text are not necessary and do not work well on Wikisource; they should be ignored during proofreading to produce one column of text. Remember that several pages will be added together in the main namespace when proofreading is finished.

To assist the validator, the transcluder, and any subsequent editing, we also preserve line breaks at the proofreading stage. This allows them to easily compare the lines in the text to the lines in the image, improving accuracy of transcription.

Scannos, Typos and Mispunct

  • To create the desired electronic copy of a printed book: the physical book is scanned to produce a digital file (WS prefers .djvu files to work from); the file is uploaded to Wikimedia Commons and then, in Wikisource, pages are converted to text using an optical character recognition (OCR) program. This is the text we proofread. Errors produced by this process are uncommon for normal text, and are called Scannos. At Wikisource, we aim to faithfully replicate the original printed version. All text should be exactly copied if possible.
  • All books before electronic printing were printed from set type. Printers placed individual letters, in mirror image and reverse, onto a form which was then inked and printed (see w:Typesetting). Errors produced by this process are amazingly uncommon but when they occur, these printer’s errors are called Typos. We do not correct these directly. As they can cause consternation for readers, we mark them with {{SIC}} with a suggested correction. This lets the reader know that the text has been accurately proofread, without diverging from the original text. See below.
  • While all text (letters and numbers) is exactly copied, some punctuation, fonts and symbols are standardised to ease and enable wikifying the text. E.g. Punctuation is "closed" (spaces removed) so it stays attached to the associated words; some abbreviations and contractions are closed; title pages, headings and fancy fonts are standardised; ellipses are standardised to … regardless of how many dots there were originally; straight quotes ' and " are used instead of curly quotes ‘, ’ and “, ”; and when we transcribe footnotes, we do not keep the original symbol markers in the text, we number them automatically using templates.
  • If punctuation is obviously wrong, e.g. mismatched or missing quotes; . instead of ,; or just plain missing, this is called Mispunct. We make executive decisions on whether to fix it. It isn't worth {{SIC}}ing punctuation and we want to avoid reader consternation.


If there is an obvious misspelling on the printed page, {{SIC}} it.

Umrbella --> Umrbella {{SIC|Umrbella|Umbrella}}
Um brella --> Um brella {{SIC|Um brella|Umbrella}}

The greatest strength of Wikisource over other digital transcription services is that the original scan is permanently attached to the proofread digital copy, so the proofreading can be validated at any time. This also means readers can check for errors against the original and make corrections. By checking the page's history, we can see whether an error occurred in the original document scan (a scanno) or was introduced by a proofreader (a typo). These corrections are generally considered minor edits.

Wikifying

The process of producing a webpage, able to be searched and viewed from many digital platforms, offers particular opportunities and compromises in comparison to the written work.

  • Wikisource:Wikilinks outlines the acceptable limits of adding to a text at Wikisource.
  • Help:Templates is an overview of the great things we can do with templates. Templates allow the content to be viewed by various digital platforms and allow updated functions over time.

The compromises are myriad and often idiosyncratic to how Wikisource (WS) and Wikipedia (WP) work.

Examples are:

  • We do not indent the start of a paragraph; we separate paragraphs with a hard return. This is because leaving a space at the beginning of a line creates a text box on wiki.
  • We close spaces around punctuation (such as semi-colons) and abbreviations, and we use the ellipsis character … so they are not split when text wraps.
  • We use straight quotes ' and " instead of curly quotes ‘, ’ and “, ”; as curly quotes cause issues on some platforms.
  • Exact duplication is often not possible or feasible for digital viewing. There are standardised ways of presenting digital text. This proofreaders' guide reflects standards used at Distributed Proofreaders (DP) for the Project Gutenberg (PG) publications. These ebooks are produced as downloadable files through a rigorous proofreading process. The DP training process is recommended for proofreaders here: it is very effective and well worth the short time it takes to complete the first stage and will put you in good stead here.
  • We do not use raw html markup. Use templates where possible. Templates can be updated later to work better. The voluntary and co-operative nature of Wikipedia has led to some inconsistencies and odd workarounds. Use the templates that are available, within their current capacity. Ask for advice on current best practice and, by all means, make your opinion known at the Wikisource:Scriptorium; add to the history of discontent and over time this should hopefully improve… Generally the Wikisource way is to not have many set rules, to allow for development of skills and processes, so just do your best.
See the Wikisource Style Guide for more.
  • We leave hard returns at the end of lines at the proofread stage to improve the accuracy of proofreading. We close text at validation to improve efficiency of files. See Transclusion.

If you join in a current work (Proofread of the Month, List of Proofread books on WS), you can get feedback as you learn. You can also ask for comment at the Scriptorium Help for specific challenges.

Places to start Wikisourcing

About This Document

This document is written to explain the proofreading rules we use to maintain consistency when proofreading a single book. This helps us all do proofreading the same way, which in turn makes it easier for the transcluder, and any subsequent editors who will work on this e-book.

Always check the Index page Discussion for notes on variations from this guide. If the book has a Project Manager shepherding the book through its production, they may specify how they wish the book to proceed. Please respect their involvement and follow their instructions. {{ping}} the Project Manager if you have any queries or suggestions. If the Project Manager seems absent, ask for support at the Scriptorium.

If there is no Project Manager, and there are issues not covered by this guide, ask for support at the Scriptorium. Feel welcome to nominate yourself as the Project Manager and ask for support if you want it. See Project management

If there are any items missing, or items that you consider should be done differently, or if something is vague, please let us know. Talk to us at the Scriptorium.

When examples of templates are given, n indicates any number; x any letter; {{Lorem ipsum}} any text. These are prompts to put your number/letter/text "here". Using the pulldown menu on the tool bar to insert templates is recommended.

w:Help:Wiki markup

Project Discussion

When you select a project for proofreading, you start at the Index page. On this page there is a tab called "Discussion" containing information specific to that project (book). Read this before you start proofreading pages! If the Project Manager wants you to do something in this book differently from the way specified in these Guidelines, that will be noted here. Instructions in the Discussion page override the rules in these Guidelines, so follow them. Finally, this is also where the Project Manager may give you interesting tidbits of information about the author or the project. The Discussion page is also the place to ask questions about this book (using {{ping}}), inform the Project Manager about problems, etc. This page is also often used by proofreaders to alert other proofreaders to recurring issues within the project and how they can best be addressed.

On the Index page, the status of pages is displayed. You cannot validate pages which you have proofread. If two people work on the same page at the same time then a conflict arises on submission.

To make a note on a page for subsequent editors: <!-- xxx -->


Page headers

These are proofread into the "header" space. These are not transcluded into the final book but are done to remain true to the original work.

{{rh}} is typically used.

Proofreading at the Character Level:

Font style

Italics and Bold

For font emphasis, Wikipedians use ''double apostrophes'' for italic and '''three apostrophes''' for '''bold'''

For font emphasis, double apostrophes for italic and three apostrophes for bold

You can even do '''''bold''' within italic'' or '''bold with ''italic'''''

You can even do bold within italic or bold with italic 

This markup only applies within a text line and the OCR will automatically put a hard return at the end of each line.

So if the text you want to format spans two lines, you will need to use {{italic block}} or {{bolder}} or {{bold block}}.

{{italic block/s}}
By channels of coolness the echoes are calling,<br />
And down the dim gorges I hear the creek falling:<br />
It lives in the mountain where moss and the sedges<br />
Touch with their beauty the banks and the ledges.<br />
Through breaks of the cedar and sycamore bowers<br />
Struggles the light that is love to the flowers;<br />
And, softer than slumber, and sweeter than singing,<br />
The notes of the bell-birds are running and ringing.<br />
{{italic block/e}}

produces:

By channels of coolness the echoes are calling,
And down the dim gorges I hear the creek falling:
It lives in the mountain where moss and the sedges
Touch with their beauty the banks and the ledges.
Through breaks of the cedar and sycamore bowers
Struggles the light that is love to the flowers;
And, softer than slumber, and sweeter than singing,
The notes of the bell-birds are running and ringing.


It is considered good form to italicize words or phrases individually, i.e. around punctuation, so that the punctuation stays outside the markup. For example: By Carl Marx. --> By ''Carl Marx''. Cf. Darwin. --> ''Cf.'' ''Darwin''. In a list: Apples, bananas and peaches. --> ''Apples'', ''bananas'' and ''peaches''. Where a whole sentence is italicized, the period is included. However, watch for italicized punctuation, especially : and ;, where we respect the original.

Many typefaces found in older books used the same design for numbers in both regular text and italics or bold. For dates and similar phrases, format the entire phrase with one set of markup, rather than marking the words as italics (or bold) and not the numbers.


{{lighter}} can be applied to some title or decorative pages where multiple, unreplicable, fonts are used.


Italicized single quotes

To italicize 'this', use the single quote template for the printed single quotes (enclose single quotes in curly brackets) i.e. ''{{'}}this{{'}}'' to get 'this'.

Italicized links

The italics markup must be outside the link markup, or the link will not work; however, internal italicisation can be used in piped links.

Incorrect: He died with [[''Turandot'']] still unfinished.
Correct: He died with ''[[Turandot]]'' still unfinished.
He died with Turandot still unfinished.
Correct: The [[USS Adder (SS-3)|USS ''Adder'' (SS-3)]] was a submarine.
The USS Adder (SS-3) was a submarine.

Italicised {{hws}} and {{hwe}}

Italicise the parts and whole words within the template.

{{hws|''begin''|''beginning''}}

{{hwe|''ning''|''beginning''}}

beginning

Underline {{underline}}

{{underline|Emphasised}}

produces:

Emphasised

Overline {{overline}}

{{overline|Obscure maths}}

produces:

Obscure maths

Strikethrough {{strike}}

{{strike|strike through}}

produces:

strike through

Gothic Type {{blackletter}}

{{blackletter|New York}}

produces:

New York

Cursive {{cursive}}

{{cursive|signed}}

produces:

signed

Serif {{serif}}

{{serif|Title}}

produces:

Title

Roman numerals {{roman}}

{{roman|499}}

produces:

CDXCIX
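
When checking a numeral against the scan, it can help to work the conversion through by hand. The following is a minimal Python sketch of the usual subtractive-notation conversion; it is not how the {{roman}} template is implemented, and the function name to_roman and the 1 to 3999 range are assumptions made for illustration.

```python
def to_roman(number: int) -> str:
    """Convert a positive integer (1..3999) to upper-case Roman numerals."""
    if not 1 <= number <= 3999:
        raise ValueError("this sketch only handles 1 to 3999")
    # Value/symbol pairs in descending order, including the subtractive forms.
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in pairs:
        # Take as many of this symbol as fit, then carry on with the remainder.
        count, number = divmod(number, value)
        out.append(symbol * count)
    return "".join(out)


print(to_roman(499))  # CDXCIX
```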

Spaced Out Text (gesperrt) {{sp}}

{{sp|Spaced Out}}

produces:

Spaced Out

{{sp}} is a simple version of {{Letter-spacing}}, which has additional formatting provisions.

Format spaced out text with the {{sp}} template. This was a typesetting technique used for emphasis in some older books, especially in German and on Titles.

Coloured Text {{greyed}}, {{red}}, {{green}}

Text can be coloured using templates. Red text was often used as a highlight in older works, especially on the title page. Greyed text can be used to indicate (important) text that has been written or typed onto the original document.

{{greyed|grey text}}, {{red|red text}}, {{green|green text}}

produces:

grey text, red text, green text

Non-Keyboard Characters

Please proofread these using the proper symbols or accented characters to match the image, where possible. Non-keyboard characters can be found on your pull-down menu. There is also a Gadget Editing tool for keyboard shortcuts under Preferences (tab at top right of this page, if you are logged in).

Suggestions for Gadget Proofreading shortcut keys
ligatures ^oe ^ae ^OE ^AE
emdash ---
endash --
minus -
Diacritical mark sample above below
macron (straight line) ¯ [=x] [x=]
2 dots (dieresis, umlaut) ¨ [:x] [x:]
1 dot · [.x] [x.]
grave accent ` ["x] [x"]
acute accent (aigu) ´ ['x] [x']
circumflex ˆ [^x] [x^]
caron (v-shaped symbol) [vx] [xv]
breve (u-shaped symbol) [)x] [x)]
tilde ˜ [~x] [x~]
cedilla ¸ [,x] [x,]

Characters can be cut and pasted from character lists. See Unicode/List_of_useful_symbols, w:Wikipedia:Mathematical_symbols, right menu lists punctuation, w:List_of_XML_and_HTML_character_entity_references, punctuation, w:Unicode_symbols currency symbols, w:Mathematical_operators_and_symbols_in_Unicode

w:Typographic_ligature#Ligatures_in_Unicode_(Latin_alphabets)

Typographical characters

It is not necessary to duplicate typographical conventions. Replace with modern conventional characters.

Templates are available for antiquated characters. Each row below gives the antiquated character, the template(s), and the modern character replacement:
ſ {{s}} s
ʃ {{s}} s
ſſ ſſ {{ss|1}} {{ss}} ss
ʃʃ {{ss|2}} {{ss}} ss
ß {{ss|3}} {{ss}} ss
ʃß {{ss|4}} {{ss}} ss
ƒ {{f}} f
Template:Ff {{ff}} ff


If specified by the project manager, typographical conventions may be replicated. The project manager is then responsible for producing both original and modern type transclusions, where the original characters are converted by bot to modern type.


{{ditto}}

Spacing

&ensp; | | "n" space, as in the space an "n" takes up
&emsp; | | "m" space
&thinsp; | | thin space


  • Some special characters muck up templates e.g. = and ; and : and # and [&#35;]. When using these characters in templates, surround in curly brackets e.g. {{=}} or {{:}}.


Non-Latin Characters

Some projects contain text printed in non-Latin languages; that is, characters other than the Latin A...Z—for example, Greek, Cyrillic (used in Russian, Slavic, and other languages), Hebrew, or Arabic characters. These characters are available in the pull-down menus but if you are not sure of your Greek (etc.), you can get help, using a missing template e.g. {{Greek missing}} where the Greek characters should be, for an expert to fill in the gap.

Superscripts

Insert superscript from the pulldown menu. This works well only for single letters. Older books often abbreviated words as contractions and printed the final letters as superscripts; insert these into the {{sup}} template, as {{sup|x}}. For example:

Original Image:

Genʳˡ Washington defeated Lᵗ Cornwall's army.

Correctly Proofread Text: Gen{{sup|rl}} Washington defeated Lᵗ Cornwall's army.

If the superscript represents a footnote marker, then see the Footnotes section instead.

The Project Manager may specify in the Project Comments that superscripted text be marked differently.

Italics are inside the template. E.g. 107th ''107''{{sup|''th''}}


Subscripts

Subscripted text is often found in scientific works, but is not common in other material. Proofread these with subscript from the pulldown menu or if characters are unavailable, insert into the {{sub}} {{sub|x}} template using the subscripted text. For example:

Original Image: H₂O. H2O.

Correctly Proofread Text: H₂O. H{{sub|2}}O.


Large, Opening Capital Letter

Large initial letters at the start of a chapter, section, or paragraph are duplicated with a template. If it is an ornate initial, note it in the Index Discussion page (with .djvu number) as someone may take the trouble to insert an image instead.

Proofread a large first letter that sits below the first line using the drop initial template {{di|x}}.

Rarely, you may have a large first letter sitting on the first line. Use {{largeinitial|x}} for these.

If you have a quotation mark or apostrophe before the initial, use {{float left|"}} before the initial template

c.f.

"I {{di|"I}}

P Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.



Words in Small Capitals

Using the Small capitals template {{sc|Lorem}}, proofread the characters in Small Caps (capital letters which are smaller than the standard capitals) as lower case letters; capitals remain as capitals. For example:

Original Image:

  • Popular Science Monthly
  • Creighton, J. E.
  • a.m.
  • B.C.

Correctly Proofread Text:

  • {{sc|Popular Science Monthly}}
  • {{sc|Creighton, J. E.}}
  • {{sc|a.m.}}
  • {{smaller|B.C.}}

It is considered good form to include a whole name etc. within the template rather than only the part with small caps, as in {{sc|Creighton, J. E.}}. If the OCR has converted small caps to all capitals, you must retype the words into the edit page. If it is a long passage, it may be worth using the lower case template, {{lc}}, to convert the passage: preview the result, then copy and paste it into the Edit page.

{{lc|LOWERCASE}} LOWERCASE {{lc}}

{{uc|uppercase}} uppercase {{uc}}

{{capitalize|capitalize}} capitalize {{capitalize}}


Punctuation

Double Quotes

Proofread “double quotes” (straight or curly) as plain ASCII (keyboard) " (shift ') double quotes. Do not change double quotes to single quotes. Leave them as the author wrote them. See Chapter Headings if a double quote is missing at the start of a chapter.

The French equivalent, guillemets «like this», are available from the pulldown menus when editing or creating pages.

For other quotation marks, use '. This applies to languages which use marks »like this«; „like this“; “this way„

The Project Manager may instruct you in the Project Comments to proofread non-English language quotation marks differently for a particular book. Please be sure not to apply those directions to other projects.

Check for matching opening and closing quotes and close up the space between the marks and the quoted text. Some books have quotes where every new line has another quotemark—these are to be removed, leaving the beginning and ending quote, as the line spacing will change in transclusion. Check the Project Discussion for alternative directions.


Single Quotes

Proofread these as the plain ASCII (keyboard) ' single quote (apostrophe). Do not change single quotes to double quotes. Leave them as the author wrote them. Convert ‘ ’ to ' '.
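
As an illustration of the double and single quote rules, here is a minimal Python sketch that maps curly quotes to their plain keyboard equivalents. It is not a Wikisource tool; the name straighten_quotes is hypothetical, and guillemets and other language-specific marks are deliberately left untouched.

```python
# Curly quotation marks mapped to their plain keyboard equivalents.
STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote / apostrophe
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones; all other characters are kept."""
    return text.translate(STRAIGHT)


print(straighten_quotes("\u201cReally?\u201d said \u2018Pat\u2019"))
```

A double quote stays a double quote and a single quote stays a single quote, so the author's choice between them is preserved.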


Quote Marks on Each Line

Proofread quotation marks at the beginning of each line of a quotation by removing all of them except for the one at the start of the quotation. If a quotation like this goes on for multiple paragraphs, leave the quote mark that appears on the first line of each paragraph.

However, in poetry keep the extra quote marks where they appear in the image, since the line breaks will not be changed.

Often there is no closing quotation mark until the very end of the quoted section of text, which may not be on the same page you are proofreading. Leave it that way—do not add closing quotation marks that are not in the page image.

There are some language-specific exceptions. In French, for example, dialog within quotations uses a combination of different punctuation to indicate various speakers. If you are not familiar with a particular language, check the Project Discussion, or leave a message for the Project Manager in the Project Discussion for clarification.


Original Image:

Clearly he wasn't an academic with a preface like this
one. “I do not give the name of the play, act or scene,
“in head or foot lines, in my numerous quotations from
“Shakspere, designedly leaving the reader to trace and
“find for himself a liberal education by studying the
“wisdom of the Divine Bard.
“There are many things in this volume that the ordinary
“mind will not understand, yet I only contract with the
“present and future generations to give rare and rich
“food for thought, and cannot undertake to furnish the
“reader brains with each book!”


Correctly Proofread Text:

Clearly he wasn't an academic with a preface like this
one. "I do not give the name of the play, act or scene,
in head or foot lines, in my numerous quotations from
Shakspere, designedly leaving the reader to trace and
find for himself a liberal education by studying the
wisdom of the Divine Bard.
"There are many things in this volume that the ordinary
mind will not understand, yet I only contract with the
present and future generations to give rare and rich
food for thought, and cannot undertake to furnish the
reader brains with each book!"



End-of-sentence Periods

Proofread periods that end sentences with a single space after them.


Punctuation Spacing

Spaces before punctuation sometimes appear because books typeset in the 1700s and 1800s often used partial spaces before punctuation such as a semicolon or colon.

In general, a punctuation mark should have a space after it but no space before it. If the OCR'd text has no space after a punctuation mark, add one; if there is a space before punctuation, remove it. This applies even to languages such as French that normally use spaces before punctuation characters. However, punctuation marks that normally appear in pairs, such as "quotation marks", (parentheses), [brackets], and {braces} normally have a space before the opening mark, which should be retained.

Original Image:

and so it goes ; ever and ever.

Correctly Proofread Text:

and so it goes; ever and ever.
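
The "no space before" half of this rule can be sketched in Python as below. This is only an illustration under stated assumptions: the name close_punctuation is hypothetical, periods are deliberately left alone so that spaced-out dots are not disturbed, and the "add a space after" half is omitted because it needs human judgement (for example in i.e. or in times such as 3:30).

```python
import re

# One or more spaces or tabs directly before ; : ! ? or ,
_SPACE_BEFORE = re.compile(r"[ \t]+([;:!?,])")


def close_punctuation(text: str) -> str:
    """Remove the space that old typesetting left before punctuation marks."""
    return _SPACE_BEFORE.sub(r"\1", text)


print(close_punctuation("and so it goes ; ever and ever."))
```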


Trailing Space at End-of-line

Do not bother inserting spaces at the ends of lines of text; any such spaces will automatically be removed from the text when you save the page. When the text is transcluded, each end-of-line will be converted into a space.


Dashes, Hyphens, and Minus Signs

There are generally four such marks you will see in books:

Hyphens.-These are used to join words together, or sometimes to join prefixes or suffixes to a word. Leave these as a single hyphen, with no spaces on either side. Note that there is a common exception to this shown in the second example below.

En-dashes.-These are just a little longer, and are used for a range of numbers, or for a mathematical minus sign. Proofread these as a single hyphen, too. Spaces before or after are determined by the way it was done in the book; usually no spaces in number ranges, usually spaces around mathematical minus signs, sometimes both sides, sometimes just before.

Em-dashes & long dashes.—These serve as separators between words—sometimes for emphasis like this—or when a speaker gets a word caught in his throat——! Proofread these as an em-dash (from the pull-down menu) if the dash is as long as 2-3 letters (an em-dash) or use {{bar}} for a custom length. {{bar|3}} looks like this———. Don't leave a space before or after, even if it looks like there was a space in the original book image.

E.g.

{{sc|Vol 1.—a}}
Vol 1.—a

Deliberately Omitted or Censored Words or Names. If represented by a dash in the image, proofread these as an equivalent length {{bar}}. When it represents a word, we leave appropriate space around it like it's really a word. If it's only part of a word, then no spaces—join it with the rest of the word.

See also the guidelines for end-of-line and end-of-page hyphens and dashes.

End-of-line Hyphenation and Dashes

Where a hyphen appears at the end of a line, join the two halves of the hyphenated word back together. Remove the hyphen when you join it, unless it is really a hyphenated word like well-meaning. See Dashes, Hyphens, and Minus Signs for examples of each kind. If possible, keep the joined word on the top line, and put a line break after it to preserve the line formatting—this makes it easier for volunteers in later rounds. If the word is followed by punctuation, then carry that punctuation onto the top line, too.

Words like to-day and to-morrow that we don't commonly hyphenate now were often hyphenated in the old books we are working on. Leave them hyphenated the way the author did. Check the Index Discussion page for any advice if you're not sure whether the author hyphenated a word. If no consensus is noted, leave the hyphen when you join the word, and leave a note on the Index Discussion page giving the .djvu page on which this occurs so someone can determine how the author typically wrote this word.

Similarly, if an em-dash appears at the start or end of a line of your OCR'd text, join it with the other line so that there are no spaces or line breaks around it. However, if the author used an em-dash to start or end a paragraph or a line of poetry, you should leave it as it is, without joining it to the next line. See Dashes, Hyphens, and Minus Signs for examples.
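
The hyphen part of this procedure can be sketched in Python as follows. This is a rough illustration, not an existing Wikisource gadget: the name rejoin_hyphenated and the keep_hyphen parameter (a set of words known to be genuinely hyphenated in the book) are assumptions, and em-dashes are not handled. Deciding which words keep their hyphen still needs a human check against the rest of the book.

```python
def rejoin_hyphenated(lines, keep_hyphen=frozenset()):
    """Join words split by an end-of-line hyphen onto the upper line."""
    result = list(lines)
    for i in range(len(result) - 1):
        line = result[i].rstrip()
        nxt = result[i + 1].lstrip()
        # Only a single trailing hyphen counts; skip "--" and empty next lines.
        if not line.endswith("-") or line.endswith("--") or not nxt:
            continue
        # First token of the next line, including any attached punctuation.
        parts = nxt.split(maxsplit=1)
        tail = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        head_start = line.rfind(" ") + 1
        head = line[head_start:-1]  # word fragment without the hyphen
        hyphenated = f"{head}-{tail}"
        # Compare without trailing punctuation against the known list.
        if hyphenated.rstrip(".,;:!?\"'").lower() in keep_hyphen:
            joined = hyphenated
        else:
            joined = head + tail
        result[i] = line[:head_start] + joined  # joined word stays on the top line
        result[i + 1] = rest                    # line break is preserved
    return result


page = [
    "something Pat had already become accus-",
    "tomed to, and a well-",
    "meaning friend.",
]
for row in rejoin_hyphenated(page, keep_hyphen={"well-meaning"}):
    print(row)
```

The joined word, with any punctuation attached to it, stays on the top line and the line break after it is kept, which matches the guidance above.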

End-of-page Hyphenation and Dashes

Where a hyphen appears at the end of a page, the first part is used without the hyphen or em-dash. Use the hyphenated word start {{hws}} and end {{hwe}} templates: {{hws|first part|whole word}} and {{hwe|second part|whole word}}. If it is really a hyphenated word like well-meaning, include the hyphen in the whole word, i.e. {{hws|first part|whole-word}} and {{hwe|second part|whole-word}}. See Dashes, Hyphens, and Minus Signs for examples of each kind.

For example:

Original Image:

something Pat had already become accus-

Correctly Proofread Text:

something Pat had already become {{hws|accus|accustomed}}

To continue the above example on the next page:

Original Image:

tomed to from having to do his own family

Correctly Proofread Text:

{{hwe|tomed|accustomed}} to from having to do his own family

These templates rejoin the word when the pages are combined to produce the final e-book (transcluded). Please do not join the fragments across the pages yourself.
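
To make the pairing of the two templates concrete, here is a small Python sketch that builds both strings from the two fragments. The helper name split_word_templates and its hyphenated flag are hypothetical; the template syntax itself is the one given above.

```python
def split_word_templates(first: str, second: str, hyphenated: bool = False):
    """Return the {{hws}} and {{hwe}} markup for a word split across two pages."""
    # A genuinely hyphenated word keeps its hyphen in the whole-word argument.
    whole = f"{first}-{second}" if hyphenated else first + second
    hws = "{{hws|" + first + "|" + whole + "}}"    # goes at the end of the first page
    hwe = "{{hwe|" + second + "|" + whole + "}}"   # goes at the start of the next page
    return hws, hwe


print(split_word_templates("accus", "tomed"))
print(split_word_templates("well", "meaning", hyphenated=True))
```

Both templates carry the same whole word, which is the thing to double-check when entering them by hand.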

For Footnotes that split words over pages, see Footnotes.

Ellipsis "…"

The guidelines are different for English and Languages Other Than English (LOTE).

Ellipses of omission should be entered as the actual character (i.e. …, in the Symbol section of the pull-down menu) without surrounding spaces. However, note that not all strings of dots within written dialogue are ellipses of omission. In some cases, an author uses a sequence of three or more dots to indicate a pause, and in such situations there should be separate consecutive dots in order to preserve the tempo of the dialogue.

ENGLISH: An ellipsis is a character on its own, consisting of three consecutive dots. Regarding the spacing, in the middle of a sentence treat the three dots as a single word (i.e., usually a space before the 3 dots and a space after). At the end of a sentence treat the ellipsis as ending punctuation, with no space before it.

Note that there will also be an ending punctuation mark at the end of a sentence; in the case of a period there will be a period plus an ellipsis, four dots in total. A good hint that you're at the end of a sentence is the use of a capital letter at the start of the next word, or the presence of an ending punctuation mark (e.g., a question mark or exclamation point).

LOTE: (Languages Other Than English) Use the general rule "Follow closely the style used in the printed page." In particular, insert spaces, if there are spaces before or between the periods, and use the same number of periods as appear in the image. Sometimes the printed page is unclear; in that case, save as a Problematic page.

English examples:

Original Image | Correctly Proofread Text
That I know . . . is true. | That I know … is true.
This is the end.... | This is the end….
The moving finger writes; and. . . The poet surely had a pen though! | The moving finger writes; and…. The poet surely had a pen though!
Wherefore art thou Romeo. . . ? | Wherefore art thou Romeo…?
“I went to the store, . . .” said Harry. | "I went to the store, ..." said Harry.
“... And I did too!” said Sally. | "... And I did too!" said Sally.
“Really? . . . Oh, Harry!” | "Really?... Oh, Harry!"


Braces

{{brace2}}

{{brace2|3|l}} spans 3 lines and faces left; use r for a right-facing brace.

To include text within the bracket, use a table:

{|
|{{brace2|4|l}}
|do<br/>ray<br/>fa<br/>ti
|{{brace2|4|r}}
|}
do
ray
fa
ti


Contractions

In English, remove any extra space in contractions. For example, would n't should be proofread as wouldn't and 't is as 'tis.

This was a 19th century printers' convention in which the space was retained to indicate that 'would' and 'not' were originally separate words. It is also sometimes an artifact of the OCR. Remove the extra space in either case.

Some Project Managers may specify in the Project Discussion page not to remove extra spaces in contractions, particularly in the case of books that contain slang, dialect, or poetry.


Fractions

Proofread fractions as follows:

A diagonal fraction bar (a virgule): use {{frac}} encased in {{fs70}}, so that ¾ is written {{fs70|{{frac|3|4}}}}. Please do not use the actual fraction symbols unless specifically requested in the Project Discussion.
A horizontal fraction bar (a vinculum): use {{sfrac}} encased in {{fs70}}, as {{fs70|{{sfrac|n|d}}}} (n over d), to fit inline.
Mixed numbers take the whole number as the first argument: 2½ is {{fs70|{{frac|2|1|2}}}} with a diagonal bar, and {{fs70|{{sfrac|2|1|2}}}} with a horizontal bar.

For fractions with a numerator of one, only the denominator needs to be entered: {{fs70|{{sfrac|d}}}} gives 1/d, so {{fs70|{{sfrac|3}}}} gives 1/3.

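Leaders

Leaders are the rows of periods or other punctuation used to align items, such as the dots that run to the end of a line in a Table of Contents. Remove them.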

Maths

Music

Problem templates

These should be used if there is a problem that you cannot fix yourself. When using one of these, also set the progress to "problematic" (blue).

Template Used where..
{{missing score}} ..a musical score should be included.
{{missing math formula}} ..a mathematical formula should be included.
{{illegible}} ..the text cannot be read.
{{arabic missing}} ..Arabic characters are used.*
{{chinese missing}} ..Chinese characters are used.*
{{greek missing}} ..Greek characters are used.*
{{hebrew missing}} ..Hebrew characters are used.*
{{symbol missing}} ..unknown symbols are used.
* Where you cannot read or write in these languages.


Abbreviations

There are exceptions to the Primary Rule with regard to abbreviations; see the Manual of Style. Specific examples are i.e., e.g., a.m., B.C. and other abbreviations from Latin, which are always contracted (written without internal spaces) to avoid splitting across lines. According to the Manual of Style, abbreviations should be SIC'd in full, except when "very common".

Obscure words

Foreign words should be linked to Wiktionary using [[wikt:Article|word]]. Common phrases may also be in Wiktionary. E.g.

[[wikt:ceteris paribus|''caeteris paribus'']] caeteris paribus

To add a word to Wiktionary, go to Requested entries

Longer quotes in foreign languages may have translations in Wikiquote and should be linked with [[q:Author firstname lastname|displayed name of Author]]. If the author’s page exists (blue link), check whether your quote is already there; if not, be brave and add it, or leave a note in the Project Discussion. You can get it translated at ??. If your link is red, click the link and search Wikiquote, as the name may be spelled differently there. To add an author, click the red link and follow q:Help:Starting_a_new_page, or leave a note in the Project Discussion.

Proofreading at the Paragraph Level:

<br /> forces a new line without a hard return and is compatible with use inside templates.

Font size

When a whole section of a publication is in a smaller size (e.g. Correspondence, Subscriptions), we keep the text at normal size, as this is usually a space-saving device for the publishers which we don’t need to follow. When the size of text changes to define different parts of a text (e.g. quotations or poetry), we follow the publisher's lead.

If you want a bigger font size, or a smaller size, that's easy.

If you want {{larger|a bigger font size}}, or a {{smaller|smaller size}}, that's easy.

We don't use absolutes, like "large" so as to make reading text easier whatever the size of the device, but there are even bigger and huge size available, as well as even smaller and tiny.

We don't use absolutes, like "large" so as to make reading text easier whatever the size of the device, but there are {{x-larger|even bigger}}   and {{xx-larger|huge}} size available, as well as {{x-smaller|even smaller}} and {{xx-smaller|tiny}}.

For font sizes that are less than 100%, the following templates are designed to have line heights proportional to the font size. The number indicates the font size as a percentage of normal, e.g. fs90 is 90% of the normal font size.

  • {{fs90/s}} {{fs90/e}} used to enclose a block of paragraphs and/or span pages. When used to span pages, the {{fs90/e}} is placed in the footer of the first page, to terminate the block and {{fs90/s}} is placed in the header of the following page to begin the new block. This way the transcluded text in the main namespace will be enclosed with a single set of templates because headers and footers are excluded. Click on this link to see an example.
  • {{fs85}} 85% font size and 100% line height.
  • {{fs75}} 75% font size and 95% line height.
  • {{fs70}} 70% font size and 90% line height. - Used inline to match the line height of fraction templates {{fs70|{{over||}}}} and {{fs70|{{frac||}}}}.
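As a minimal sketch of the page-spanning arrangement described for {{fs90/s}} and {{fs90/e}}, the block below shows where each template might sit on two consecutive pages. The HTML comments are annotations only and the sample text is hypothetical; "header box" and "footer box" refer to the separate header and footer fields of the page being edited.

```wikitext
<!-- Page 1, body: the block opens in the page text -->
{{fs90/s}}
First part of the smaller-print passage, which runs on to the next page

<!-- Page 1, footer box: closes the block for this page only -->
{{fs90/e}}

<!-- Page 2, header box: reopens the block for this page only -->
{{fs90/s}}

<!-- Page 2, body: the block closes in the page text -->
and finishes here.
{{fs90/e}}
```

Because headers and footers are excluded on transclusion, the combined text should be left with only the opening template from the first page and the closing template from the second.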

More: 100% and smaller font size and style comparisons table.

{{font-size}}

Thought breaks: Extra spaces, stars, or lines between paragraphs

Separator template needs work. Should automatically spread bullets across page, unless length specified.

In the image, most paragraphs start on the line immediately after the end of the previous one. Sometimes two paragraphs are separated to indicate a "thought break." A thought break may take the form of a line of stars, hyphens, or some other character, a plain or floridly decorated horizontal line, a simple decoration, or even just an extra blank line or two.

A thought break may represent a change of scene or subject, a lapse in time, or a bit of suspense.

Templates are used for more complex layout.

If more than one line is needed between paragraphs, use the double hard return template, {{dhr|nem}}. A plain {{dhr}} gives two lines; this is preferable to two blank lines because it shows that the formatting has been considered.

The rule template {{rule|5em}} uses its argument to govern the length of the rule:


"em" is the width of a single wide character whereas "px" stands for pixels.

Template: {{rule}}
Examples: {{rule}}{{rule}}, {{rule|height=4px}}{{rule}}, {{rule|15%}}, {{rule|5em}}

Template: {{custom rule}}
Examples:
{{custom rule|sp|100|d|6|sp|10|d|10|sp|10|d|6|sp|100}}
{{custom rule|c|6|sp|40|do|7|fy1|40|do|7|sp|40|c|6}}
{{custom rule|fc|140}}

Template: {{separator}}
Example: {{separator}}
Result: · · · · ·

Template: {{***}}
Examples: {{***}} gives ***; {{***|5|3em|char=@}} gives @@@@@

Template: {{Asterism}}
Example: {{Asterism}}


{{nw}} * * *

Line Breaks

Leave all line breaks in so that later in the process other volunteers can easily compare the lines in the text to the lines in the image. Be especially careful about this when rejoining hyphenated words or moving words around em-dashes. If the previous proofreader removed the line breaks, please replace them so that they once again match the image.

Remove any spaces at the beginning of the line.

A space at the beginning of a line creates a box.

Line breaks can cause problems, especially with templates, links and tables, and with italics or bold, which are closed by the line ending. Remove the line break in these cases.

Line spacing

If there is more than one blank line between lines of text, use the double return template: {{dhr}} for 2 lines; {{dhr|nem}} for a custom gap. For example:

This is a double line

gap.

This is a 4 em

gap.
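The markup behind the two examples above was presumably along the following lines. This is a reconstruction rather than the original source, and it assumes the second gap was produced with {{dhr|4em}}.

```wikitext
This is a double line
{{dhr}}
gap.

This is a 4 em
{{dhr|4em}}
gap.
```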

Borders

  • Graphic frames
{{overfloat image}}
  • {{border |maxwidth= 350px|padding= 15px|Lorum ipsum}}
Lorum ipsum
  • {{frame}}
  • Table border {{ts|bt|bl|br|bb}}

Headings

Chapter Headings

Proofread chapter headings as they appear in the image, but do not insert line breaks. We center the entire heading so that it formats appropriately for the width of the display, not according to the original page.

A chapter heading may start a bit farther down the page than the page header and won't have a page number on the same line. Chapter Headings are often printed all caps; if so, keep them as all caps.

Watch out for a missing double quote at the start of the first paragraph, which some publishers did not include or which the OCR missed due to a large capital in the image. If the author started the paragraph with dialog, insert the double quote.

Put {{dhr|4em}} before the "CHAPTER XXX". Include this even if the chapter starts on a new page; there are no 'pages' in an e-book, so the blank lines are needed. Then separate with a blank line each additional part of the chapter heading, such as a chapter description, opening quote, etc., and finally leave two blank lines {{dhr}} before the Section title or the start of the text of the chapter.

While chapter headings may appear to be bold or spaced out, these are usually the result of font or font size changes and should not be marked.

Headings (Chapter Headings, Section Headings, Captions, etc.) may appear to be in all small caps, but this is usually the result of a change in font size and should not be marked as small caps.

example here
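A minimal sketch of a chapter heading laid out as described in this section might look like the following. The chapter description and opening sentence are hypothetical, and using {{c}} for each part of the heading is an assumption based on the advice to center the entire heading.

```wikitext
{{dhr|4em}}
{{c|CHAPTER XXX}}

{{c|A short chapter description}}
{{dhr}}
The first paragraph of the chapter begins here.
```

The {{dhr|4em}} supplies the space above the heading, the blank line separates the parts of the heading, and the {{dhr}} leaves two lines before the text starts.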


Section Headings

Some books have sections within chapters. Format these headings as they appear in the image. Leave 2 blank lines {{dhr}} before the heading and one line after, unless the Project Manager has requested otherwise. If you are not sure if a heading indicates a chapter or a section, post a question in the Project Discussion, noting the page number.

Mark any italics or mixed case small caps that appear in the image. While section headings may appear to be bold or spaced out, these are usually the result of font or font size changes and should not be marked. The extra blank lines separate the heading, so do not mark the font change as well.


Other Major Divisions in Texts

Major Divisions in the text such as Preface, Foreword, Table of Contents, Introduction, Prologue, Epilogue, Appendix, References, Conclusion, Glossary, Summary, Acknowledgements, Bibliography, etc., should be formatted in the same way as Chapter Headings, i.e. {{dhr|4em}} before the heading and {{dhr}} before the start of the text.

Book Title pages are treated differently.


Alignment

Text is by default aligned left, but where it is required to manually align text to the left, use {{left}}. To float a block of text to the left without affecting text alignment within the block, use {{float left}} or {{block left}}.

To align text to the right, use {{right}}. To float a block of text to the right without affecting text alignment within the block, use {{float right}} or {{block right}}.

To center text, use the center template {{c|Lorem}}. To float a block of text to the center without affecting text alignment within the block, use {{block center|Lorem ipsum}}.

Template Example Result
{{left}} {{left|this text<br/>is left justified}}
this text
is left justified
{{center}}, {{c}} {{c|this text<br/>is center justified}}

this text
is center justified

{{right}} {{right|this text<br/>is right justified}}
this text
is right justified
{{block left}}, {{float left}} {{block left|this block of text<br/>is left justified}}
this block of text
is left justified
{{block center}} {{block center|this block of text<br/>is center justified}}

this block of text
is center justified

{{block right}} {{block right|this block of text<br/>is right justified}}

this block of text
is right justified

{{float right}} Signed. {{float right|{{sc|J. Dewey}}{{gap}}}} Signed.J. Dewey

{{justify}} - may align better

{{zfloat right}}, {{zfloat left}}

Page:The_Continental_Monthly,_Volume_5.djvu/8

Block quotes

Block quotations are blocks of text (typically several lines and sometimes several pages) that are distinguished from the surrounding text by wider margins, a smaller font size, different indentation, or other means.

{{quote}} indents on left and right margins. It also enables citation. ****template needs work. Not smaller font size. Would be handy to be able to dictate fs and margin size.

{{lh2/s}} for specifying line height

{{fs90/s}} {{fs90/e}} used to enclose a block of paragraphs and/or span pages. When used to span pages, the {{fs90/e}} is placed in the footer of the first page, to terminate the block and {{fs90/s}} is placed in the header of the following page to begin the new block. This way the transcluded text in the main namespace will be enclosed with a single set of templates because headers and footers are excluded. Click on this link to see an example.

Also {{fs85}}, {{fs75}}, {{fs70}}. {{font-size-x}} allows you to specify the font size as a percentage or in em, with proportionate line spacing.

Indenting/Paragraph Spacing

"em" is the width of a single wide character whereas "px" stands for pixels.

Put a blank line before the start of a paragraph, unless it starts at the top of a page. In the image, a new paragraph is indicated by an indentation. Where there is a gap between paragraphs (not just an indentation), we use {{dhr}}.

Example: Page:Lombard Street (1917).djvu/266

Contrary to the original scan, proofread paragraphs are not indented. However, there are exceptions in poems in which alternate lines are indented, and indented lists, where inserting a table is not warranted. In such cases there are two templates available:

  • Use {{gap|nem}} template where there is a wide gap or indent in the text. A default {{gap}} looks like this. A {{gap|4em}} looks like this. A {{gap|.5em}} looks like this.
  • Use {{spaces|n}} template where there is a short gap or indent in the text. A {{spaces|2}} looks like this  .

On transclusion, when pages are joined together, they will run immediately from the bottom of one page to the top of the next. If a paragraph finishes at the bottom of the page, it is followed by the no-operation template {{nop}} as the last line, which forces a new paragraph at the start of the next page. {{nop}} is not needed after an image, a table or some templates.
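As a small illustration of the {{nop}} rule, the last lines of a page whose final paragraph ends at the bottom might look like this. The sentence itself is hypothetical.

```wikitext
and so the last sentence of the paragraph ends at the foot of the page.
{{nop}}
```

Here {{nop}} sits alone on the final line, so the text at the top of the next page starts a new paragraph instead of running on.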

To indent every line of a paragraph except the first, use the Hanging indent template, {{hi}}. To indent a block of text left, put a colon (:) before the block; the block must not contain hard returns. For more control, the template {{left margin}} is available. Template {{dent}} combines the functionality of {{left margin}}, {{text-indent}} and {{hi}}.

Wikimarkup

Markup Renders as

; Term
: Definition 1
: Definition 2
: Definition 3
: Definition 4

Term

Definition 1
Definition 2
Definition 3
Definition 4

Wikimarkup: use * for items in an unordered list and # for ordered lists.

Template Example Result
Hanging indent template, {{hi}} {{hi|This paragraph of text has a hanging indent, often used on long entries in tables or lists}}
This paragraph of text has a hanging indent, often used on long entries in tables or lists
{{left margin}}, : {{left margin|2em|This block of text is indented left 2 "ems", to offset it from the main body}}

This block of text is indented left 2 "ems", to offset it from the main body

{{Text-indent}}, : {{Text-indent|2em|This text is indented left 2 "ems", to offset it from the main body}}

This text is indented left 2 "ems", to offset it from the main body

{{both margins}}, : {{both margins|2em|2em|This block of text is indented 2 "ems", to offset it from the main body}}

This block of text is indented 2 "ems", to offset it from the main body

{{dent}} {{dent|4em|-2em|This block of text is formatted with both a left margin and a hanging indent}}

This block of text is formatted with both a left margin and a hanging indent


{{left}}

{{left|Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.|5em}}

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

{{right|Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.|5em}}

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

{{center|Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.|5em}}

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.



See Template:Hanging_indent for instructions on hanging indents over pages.


Page Headers and Footers

Cut page headers and page footers, but not footnotes, from the text. These may already have been placed in the header or footer box when the page was created, as part of the index page set-up. If the page is already created, or the automatic text doesn't match, add the text using the running header template {{rh|n|Title|x}}.

Extra blank lines at the top of the page should be removed except where we intentionally add them for formatting. But blank lines at the bottom of the page are fine—these are removed when you save the page.

A chapter heading will usually start further down the page and won't have a page number on the same line. See the example below.

Insert example namespace here



Images

Mark missing Images with {{missing image}} where the image should be.

Proofread any caption text as it is printed, preserving the line breaks. If the caption falls in the middle of a paragraph, use blank lines to set it apart from the rest of the text. Text that could be (part of) a caption should be included, such as "See page 66" or a title within the bounds of the illustration.

Exemplar: Popular_Science_Monthly Proofreading_guide

Template:FreedImg/span

See: Help:Adding_images


References, Footnotes and Endnotes

Proofread each footnote and insert it between <ref> and </ref> tags, as in <ref>Lorem ipsum</ref>, at the point where it is referenced in the text. Remove any markers such as * or 1; the numbering is added automatically on transclusion.

Separate paragraphs within the footnote by placing the double return template, {{dhr}}, at the end of each paragraph.

Endnotes are just footnotes that have been located together at the end of a chapter or at the end of the book, instead of on the bottom of each page. These are proofread in the same manner as footnotes.

See: Help:Footnotes_and_endnotes, Help:Page_breaks#Footnotes_across_page_breaks


Original Image:

The principal persons involved in this argument were Caesar*, former military
leader and Imperator, and the orator Cicero†. Both were of the aristocratic
(Patrician) class, and were quite wealthy.

* Gaius Julius Caesar.
† Marcus Tullius Cicero.

Correctly Proofread Text:

The principal persons involved in this argument were Caesar[1], former military leader and Imperator, and the orator Cicero[2]. Both were of the aristocratic (Patrician) class, and were quite wealthy.

  1. Gaius Julius Caesar.
  2. Marcus Tullius Cicero.
The principal persons involved in this argument were Caesar<ref>Gaius Julius Caesar.</ref>, former military
leader and Imperator, and the orator Cicero<ref>Marcus Tullius Cicero.</ref>. Both were of the aristocratic
(Patrician) class, and were quite wealthy.

<references />

Original Footnoted Poetry:

Mary had a little lamb[1]
Whose fleece was white as snow
And everywhere that Mary went
The lamb was sure to go!

1 This lamb was obviously of the Hampshire breed,
well known for the pure whiteness of their wool.

A most determined ovine.

Correctly Proofread Text:

Mary had a little lamb[1]
Whose fleece was white as snow
And everywhere that Mary went
The lamb was sure to go!

  1. This lamb was obviously of the Hampshire breed, well known for the pure whiteness of their wool.A most determined ovine.
<poem>
Mary had a little lamb<ref>This lamb was obviously of the Hampshire breed, well known for the pure whiteness of their wool.{{dhr}}A most determined ovine.</ref>
:Whose fleece was white as snow
And everywhere that Mary went
:The lamb was sure to go!
</poem>

<references />


Paragraph Side-Descriptions (Sidenotes)

Some books will have short descriptions of the paragraph along the side of the text. These are called sidenotes. They are usually used consistently throughout a text. The OCR may place the sidenotes anywhere on the page, and may even intermingle the sidenote text with the rest of the text.

Sidenotes in Wikisource are problematic. Refer to Project Discussion page for direction on how to handle them in the text you are working on.


Lists

w:Template:Plainlist

Examples and Exemplars: Ads in Science And Hypothesis

Multiple Columns

Proofread ordinary text that has been printed in multiple columns as a single column. Place the text from the left-most column first, the text from the next column below that, and so on. Do not mark where the columns were split, just join them together.

Original Image:

Andersen, Hans Christian | Daguerre, Louis J. M. | Melville, Herman
Bach, Johann Sebastian | Darwin, Charles | Newton, Isaac
Balboa, Vasco Nunez de | Descartes, René | Pasteur, Louis
Bierce, Ambrose | Earhart, Amelia | Poe, Edgar Allan
Carroll, Lewis | Einstein, Albert | Ponce de Leon, Juan
Churchill, Winston | Freud, Sigmund | Pulitzer, Joseph
Columbus, Christopher | Lewis, Sinclair | Shakespeare, William
Curie, Marie | Magellan, Ferdinand | Tesla, Nikola


Correctly Formatted Text:

Andersen, Hans Christian

Bach, Johann Sebastian

Balboa, Vasco Nunez de

Bierce, Ambrose

Carroll, Lewis

Churchill, Winston

Columbus, Christopher

Curie, Marie

Daguerre, Louis J. M.

Darwin, Charles

Descartes, René

Earhart, Amelia

Einstein, Albert

Freud, Sigmund

Lewis, Sinclair

Magellan, Ferdinand

Melville, Herman

Newton, Isaac

Pasteur, Louis

Poe, Edgar Allan

Ponce de Leon, Juan

Pulitzer, Joseph

Shakespeare, William

Tesla, Nikola


See also the Index and Table sections of the Proofreading Guidelines.


Tables

If you don't want to tackle the formatting involved with tables, be sure that all the information in a table is correctly proofread and make a note in the Project Discussion of an unformatted table and the djvu number and/or mark {{missing table}}. Separate items with spaces as needed, but do not worry about precise alignment. Retain line breaks (while handling end-of-line hyphenation and dashes normally). Ignore any periods or other punctuation (leaders) used to align the items.

Formatting guidelines for Tables, examples and more help is at Help:Table.

Template:Aligned_table

Help:Page_breaks#Tables_across_page_breaks


Poetry and Epigrams

Preserve the relative indentation of the individual lines of the poem or epigram by adding : in front of the indented lines to make them resemble the image. If the entire poem is centered on the printed page, use {{block center}}. Move the lines to the left margin, and preserve the relative indentation of the lines.

When a line of verse is too long for the printed page, many books wrap the continuation onto the next printed line and place a wide indentation in front of it. These continuation lines should be rejoined with the line above. Continuation lines usually start with a lower case letter. They will appear randomly unlike normal indentation, which occurs at regular intervals in the meter of the poem.

If a row of dots appears in a poem, treat this as a thought break.

Line Numbers in poetry should be kept.

Check the Project Discussion for the specific project you are formatting. Books of poetry often have special instructions from the Project Manager. Many times, you won't have to follow all these formatting guidelines for a book that is mostly or entirely poetry.

  • Enclose the poem in <poem>Lorem ipsum</poem> to maintain line breaks.
  • Leave each line left justified and maintain the line breaks. Insert a blank line between stanzas when there is one in the image. If there are indents, use {{gap}} or : (multiple : for larger indentation).
  • A common style: {{block center/s}}<poem>{{fs90/s}}{{fqm}}Lorem ipsum{{fs90/e}}</poem>{{block center/e}}
  • In that style, {{block center/s}} centers the poem (and can span pages), and {{fs90/s}} sets the font size to 90%.
  • The template order is necessary because the font template line height is not applied to the contents, unless it is the innermost template.
  • The <poem></poem> tags can't span pages. In poems that span pages, the tag must be terminated with </poem> at the last line of the poem on the page and inserted anew with <poem> on the following page.
  • Poems that begin with a double quote (") require the Floating quotation mark template {{fqm}} in place of the ", to retain the proper centering of the poem.
  • Each line will need to be italicised separately.
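The rule that <poem> tags cannot span pages can be sketched as follows for a poem that runs over a page break. The verse lines are hypothetical and the HTML comments are annotations only.

```wikitext
<!-- Page 1, body: the poem is closed at its last line on this page -->
<poem>
First line of the stanza on this page
:An indented second line
</poem>

<!-- Page 2, body: the poem is opened again at the top -->
<poem>
The stanza carries on at the top of the next page
:With its own indented line
</poem>
```

Each page is then well-formed on its own, whatever happens to the surrounding pages.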

See Help:Poetry for more help and examples.


Letters and Correspondence

Format letters and correspondence as you would paragraphs. Put a blank line before the start of the letter; do not duplicate any indenting.

Consecutive heading or footer lines (such as addresses, date blocks, salutations, or signatures) can be formatted with alignment templates or as a table.

Examples

If the correspondence is printed differently than the main text, see Block Quotations.

Single Word at Bottom of Page

Proofread this by deleting the word, even if it's the second half of a hyphenated word.

In some older books, the single word at the bottom of the page (called a "catchword", usually printed near the right margin) indicates the first word on the next page of the book (called an "incipit"). It was used to alert the printer to print the correct reverse (called "verso"), to make it easier for printers' helpers to make up the pages prior to binding, and to help the reader avoid turning over more than one page.


Proofreading at the Page Level:

Words hyphenated across pages

Running Headers and Footers

{{rh}} provides easy formatting for headers and footers, including page numbers.

Some running headers, and how to format the template to display them, can be seen in this table:

Examples of RunningHeader
Template Result
{{rh|1|CHAPTER TITLE|[1830.}}
1
CHAPTER TITLE
[1830.
{{rh||''Book Title''|{{{pagenum}}}}}
Book Title
55
{{rh|{{sc|Vol}}. 2||Page 101}}
Vol. 2
Page 101

Help:Beginner's_guide_to_typography

Blank Page

Most blank pages, or pages with an illustration but no text, will already be marked with [Without text]. Leave this marking as is. If the page is blank, and [Without text] does not appear, click the appropriate circle before saving.


Front Matter and Back Title Page

Proofread all the text just as it was printed on the page, whether all capitals, upper and lower case, etc., including the years of publication or copyright.


Older books often show the first letter as a large ornate graphic. Proofread this as just the letter, and note it in the Index Discussion page (with the .djvu page number), as someone may take the trouble to insert an image instead.

Examples


Table of Contents

Proofread the Table of Contents just as it is printed in the book, whether all capitals, upper and lower case, etc. If there are Small Capitals, see the guidelines for Small Capitals. Leave a blank line between entries, and place a {{dhr}} before sections.

Periods or other punctuation (leaders) used to align the page numbers are removed; templates handle the alignment instead. To have a go, see Formatting Table of Contents, or leave it for the transcluder, who will also be adding links for chapters etc.

{{TOCstyle}}


Indexes

You don't need to align the page numbers in index pages as they appear in the image; just make sure that the numbers and punctuation match the image and retain the line breaks. Ensure that all the text and numbers are correct.

Indexes are often printed in 2 columns; this doesn’t work on a wiki, so we produce a single column of entries. See Multiple Columns.

Place one blank line before each entry in the index.

Treat each new section in an index (A, B, C...) the same as a section heading by placing a {{dhr}} before it.

Use {{nop}} if the page ends with a complete entry. If an entry continues onto the next page, do not add {{nop}}.
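Putting the index rules together, the end of an index page that finishes on a complete entry might be proofread roughly as below. The names are borrowed from the Multiple Columns example, and the page numbers are hypothetical.

```wikitext
{{dhr}}
B

Bach, Johann Sebastian, 12, 40

Balboa, Vasco Nunez de, 7

Bierce, Ambrose, 33
{{nop}}
```

The {{dhr}} marks the new letter section, each entry has one blank line before it, and the closing {{nop}} is there only because the page ends with a whole entry.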

Please check the Project Comments as the Project Manager may request different formatting, such as treating the index like a Table of Contents instead.

Using {{TOCstyle}}, Mrs Beeton's Book of Household Management

Links to pages will be managed by the transcluder.

See also Multiple Columns.


Sections

Help:Subpages


Plays: Actor Names/Stage Directions

In dialog, treat a change in speaker as a new paragraph, with one blank line before it. If the speaker's name is on its own line, treat that as a separate paragraph as well.

Stage directions are kept as they are in the original image, so if the stage direction is on a line by itself, proofread it that way; if it is at the end of a line of dialog, leave it there. Stage directions often begin with an opening bracket and omit the closing bracket. This convention is retained; do not close the brackets.

Sometimes, especially in metrical plays, a word is split due to page-size constraints and placed above or below following a (, rather than having a line of its own. Please rejoin the word as per normal end-of-line hyphenation. See the example.

For all plays:

Format cast listings (Dramatis Personæ) as lists.

Treat each new Act the same as a chapter heading by placing {{dhr|4em}} before it and {{dhr}} after.

Treat each new Scene the same as a section heading by placing {{dhr}} before it.

Format actor names as they are in the original image, whether they are italics, bold, or all capital letters.

Stage directions are formatted as they are in the original image, so if the stage direction is on a line by itself, format it that way; if it is at the end of a line of dialog, leave it there; if it is right-justified at the end of a line of dialog, leave at least six spaces between the dialog and the stage directions.

Stage directions often begin with an opening bracket and omit the closing bracket. This convention is retained; do not close the brackets. Italics markup is generally placed inside the brackets.

For metrical plays (plays written as poetry):

Many plays are metrical, and like poetry should not be rewrapped. Surround metered text with <poem></poem> as for poetry. If stage directions are on their own line, do not surround these with <poem></poem>. (Since stage directions are not metrical, and can be safely rewrapped, they should not be contained within the <poem></poem> tags that protect the metrical dialog.)

Preserve relative indentation of dialog as with poetry.

Rejoin metrical lines that were split due to width restrictions of the paper, just as in poetry. If the continuation is only a word or so, it is often shown on the line above or below following a (, rather than having a line of its own. See the example.

Please check the Project Discussion, as the Project Manager may specify different handling.
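
A hypothetical page from a metrical play, marked up along the lines described above, might look like the sketch below. The act, scene, speaker, and verse are invented for illustration; only the templates and tags named in this section are used, and keeping the speaker's name outside the <poem></poem> tags is an assumption.

```wikitext
{{dhr|4em}}
ACT II.
{{dhr}}

{{dhr}}
SCENE I.

[''Enter Lucio.''

LUCIO.

<poem>
The morning wakes upon the eastern hill,
    And yet the house is still.
</poem>

[''Exit.''
```

The stage directions keep their unclosed opening bracket, the italics markup sits inside the bracket, and only the metered lines are wrapped in <poem></poem> so that their line breaks and relative indentation are preserved.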


Anything else that needs special handling or that you're unsure of

While proofreading, if you encounter something that isn't covered in these guidelines that you think needs special handling or that you are not sure how to handle, post your question, noting the djvu (page) number, in the Index Discussion.

You should also put a note in the Page discussion page to explain what the problem or question is. Include your signature ~~~~. Any comments put in by a previous volunteer must be left in place. See the next section for details.


Previous Proofreaders' Notes/Comments

Any notes or comments put in by a previous volunteer must be left in place. You may add agreement or disagreement to the existing note but even if you know the answer, you absolutely must not remove the comment. If you have found a source which clarifies the problem, please cite it so it can be referred to.

If you come across a note from a previous volunteer that you know the answer to, please take a moment and provide feedback to them by clicking on their name in the signature and posting a message to them explaining how to handle the situation in the future. Please, as already stated, do not remove the note.


Common Problems

Formatting

Some texts have very challenging formatting. It is quite all right to leave this for someone else to do, but do leave a note on the Index Discussion page with the djvu page number. If you want to have a go, there are extra formatting guides to follow, and you can then ask at the Scriptorium for someone to look at your effort. Some examples of complex formatting tasks include:

Spaced-out text
Font size changes
Footnotes that continue for more than one page
Images
Sidenotes
Arrangement of data in tables
Indentation (in poetry or elsewhere)
Tables of Contents


Problem templates

These should be used if there is a problem that you cannot fix yourself. When using one of these, also set the progress to "problematic" (blue). Only use the template once per page. The "expert" will check the whole page for what's missing.

| Template | Used where... |
|---|---|
| {{missing image}} | an image should be included. |
| {{Framed page}} | a frame should be included. |
| {{missing table}} | a table should be included. |
| {{missing score}} | a musical score should be included. |
| {{missing chess diagram}} | a chess diagram should be included. |
| {{missing math formula}} | a mathematical formula should be included. |
| {{illegible}} | the text cannot be read. |
| {{arabic missing}} | Arabic characters are used.* |
| {{chinese missing}} | Chinese characters are used.* |
| {{greek missing}} | Greek characters are used.* |
| {{hebrew missing}} | Hebrew characters are used.* |
| {{symbol missing}} | unknown symbols are used. |

* Where you cannot read or write in these languages.

Common OCR Problems

OCR commonly has trouble distinguishing between similar characters. Some examples are:

  • The digit '1' (one), the lowercase letter 'l' (ell), the small-caps 'i' and the uppercase letter 'I' (aye).
  • The digit '0' (zero), and the uppercase letter 'O'.
  • Dashes & hyphens: Proofread these carefully—OCR'd text often has only one hyphen for an em-dash. See the guidelines for hyphenated words and em-dashes for more detailed information.
  • Parentheses ( ) and curly braces { }.

Watch out for these. Normally the context of the sentence is sufficient to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading.

Noticing these is much easier if you use a mono-spaced font such as DPCustomMono or Courier.


OCR Problems: Scannos

Another common OCR issue is misrecognition of characters. We call these errors "scannos" (like "typos"). This misrecognition can create a word that:

  • appears to be correct at first glance, but is actually misspelled. This can usually be caught by running WordCheck from the proofreading interface. The recommended word reference is Wiktionary.
  • is changed to a different but otherwise valid word that does not match what is in the page image. This is subtle because it can only be caught by someone actually reading the text.

Possibly the most common example of the second type is "and" being OCR'd as "arid". Other examples: "eve" for "eye", "Torn" for "Tom", "train" for "tram". This type is harder to spot, and we have a special term for it: "Stealth Scannos". We collect examples of Stealth Scannos in the Index discussion page.

Spotting scannos is much easier if you use a mono-spaced font such as DPCustomMono or Courier. To aid proofreading, the use of WordCheck (or its equivalent) is recommended.


OCR Problems: Is that ° º really a degree sign?

There are three different symbols that can look very similar in the image and that the OCR software interprets the same way (and usually incorrectly):

  • The degree sign °: This should be used only to indicate degrees (of temperature, of angle, etc.).
  • The superscript o: Virtually all other occurrences of a raised o should be proofread as {{sup|o}}.
  • The masculine ordinal º: Proofread this as a superscript o too, unless the special character is requested in the Index Discussion page. It may be used in languages such as Spanish and Portuguese, and is the equivalent of the -th in English 4th, 5th, etc. It follows numbers and has a feminine equivalent, the superscript a (ª).
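
As a small illustration of the distinction, the two sentences below (invented for this sketch) show a true degree sign and a raised o proofread with {{sup|o}}:

```wikitext
The thermometer stood at 98° in the shade.
See N{{sup|o}} 4 of the series.
```

In the first line the symbol measures temperature, so the degree sign is kept; in the second the raised o is part of an abbreviation, so it is entered as a superscript.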


Handwritten Notes in Book

Do not include handwritten notes in a book, unless the handwriting overwrites faded printed text to make it more visible. In particular, do not include handwritten marginal notes made by readers.


Bad Image

If an image is bad (not loading, mostly illegible, etc.), please post about this bad image in the Index discussion and click on the "Problematic" button so this page is noted.

Note that some page images are quite large, and it is common for your browser to have difficulty displaying them, especially if you have several windows open or are using an older computer. Before reporting this as a bad page, try zooming in on the image, closing some of your windows and programs, or posting in the Index discussion to see if anyone else has the same problem.


Wrong Image for Text/Wrong text

If there is a wrong image for the text given, please post about this bad page in the Index discussion and click on the "Problematic" button so this page is noted.

It's fairly common for the OCR'd text to be mostly correct, but missing the first line or two of the text. Please just type in the missing line(s). If nearly all of the lines are missing in the text box, then either type in the whole page (if you are willing to do that), or skip it. If there are several pages like this, you might post a note in the Index discussion.


Previous Proofreader Mistakes

If a previous proofreader made a lot of mistakes or missed a lot of things, please take a moment to provide feedback to them by clicking on their name on the History page and posting a private message to them explaining how to handle the situation so that they will know how in the future.

Please be nice! Everyone here is a volunteer and presumably trying their best. The point of your feedback message should be to inform them of the correct way to proofread, rather than to criticize them. Give a specific example from their work showing what they did, and what they should have done.

If the previous proofreader did an outstanding job, you can also send them a message about that—especially if they were working on a particularly difficult page. You can thank them from the History page.


Printer Errors/Misspellings

Correct all of the words that the OCR has misread (scannos), but do not correct what may appear to you to be misspellings or printer errors that occur on the page image. Many of the older texts have words spelled differently from modern usage and we retain these older spellings, including any accented characters.

If you believe something is an obvious printer's error, use the SIC template with the original and corrected text: {{SIC|wort|word}}. The original will appear on the page with a line underneath; when the cursor hovers over it, the corrected text is revealed. Be generous and respectful to the author's idiosyncrasies, as English is an evolving language and these books demonstrate its progress (or not).

Also if it is in reference to someone or something, and you want to create a link to the Author page or wikipage, it is appropriate to SIC the modern name or term.

For example:

Original text

… as my Lord Shaftbury says …

Corrected text, as displayed

… as my Lord Shaftbury says …

Corrected text, as entered

… as my [[w:Earl_of_Shaftesbury|{{SIC|Lord Shaftbury|Lord Shaftesbury}}]] says …


Factual Errors in Texts

Do not correct factual errors in the author's book. Many of the books we are proofreading have statements of fact in them that we no longer accept as accurate. Leave them as the author wrote them. See Printer Errors/Misspellings for how to leave a note if you think the printed text is not what the author intended.

Annotated copies of the text may be created when it is transcluded. This is the appropriate place to make such notes.


Annotation

Wikis offer a wonderful opportunity to enrich the reading experience through annotation—adding links to extra information or comment. There is a fairly broad range of possibilities and some rules about what is appropriate where.

1. When the text is transcluded, an Annotated copy may be made—anything publishable can be added to these copies.

2. The original copy should be retained unaltered. If there is a direct reference to an Author or publication, it is usually appropriate to link it to one of the following, in order of preference:

1. Wikisource—These will have links to relevant Wikipedia articles.
2. Wikipedia—If no Wikisource link is available.

3. Links for concepts, people, places, events, etc. may be appropriate, but generally less is best.

Alphabetical Index to the Guidelines

About This Document
Accented/Non-ASCII Characters
Actor Names (Plays)
ae Ligatures
Anything else that needs special handling
Back Title Page
Bad Image
Bad Text
Blank Page
Bold Text
Capital Letter, Ornate (Drop Cap)
Capitals, Small
Captions, Illustration
Catchwords
Chapter Headings
Characters, Accented/Non-ASCII
Characters with Diacritical Marks
Columns, Multiple
Comments, Previous Proofreaders'
Common OCR Problems
Contents, Table of
Contractions
Dashes
Dashes, End-of-line
Dashes, End-of-page
Degree Signs
Diacritical Marks
Double Quotes
Double Quotes, missing at start of chapter
Drama
Drop Cap
Drop-down Menus
Ellipsis
Em-dashes
Endnotes
End-of-line Hyphenation and Dashes
End-of-line Space
End-of-page Hyphenation and Dashes
End-of-sentence Periods
Epigrams
Extra Spaces Between Words
Errors, Factual
Errors, Printer
Factual Errors in Texts
Fixing Errors on Previous Pages
Footers, Page
Footnotes
Formatting
Forum
Fractions
Front/Back Title Page
Full Stops, End-of-sentence
Greek Text
Handwritten Notes in Book
Handy Proofreading Guide
Headers, Page
Headings, Chapter
Hebrew Text
Hyphenation, End-of-line
Hyphenation, End-of-page
Hyphens
Illustrations
Image, Bad
Indenting, Paragraph
Indexes
Inserting Special Characters
Italics
Keyboard Shortcuts for Latin-1 Characters
Language Other Than English (LOTE), Ellipses in
Large, Ornate Opening Capital Letter (Drop Cap)
Latin-1 Characters, Inserting
Ligatures
Line Breaks
Line Numbers
Lowered Text (Subscripts)
Minus Signs
Misspellings, Printer
Mistakes, Previous Proofreader
Multiple Columns
Non-ASCII Characters
Non-Latin Characters
Notes, Handwritten
Notes, Previous Proofreaders'
Numbers, Line
OCR Problems, Common
OCR Problems: Is that ° º really a degree sign?
OCR Problems: Scannos
oe Ligatures
Ordinal Symbol
Ornate Capital Letter (Drop Cap)
Other things that you're unsure of
Page, Blank
Page Headers/Page Footers
Page, Title
Paragraph Side-Descriptions (Sidenotes)
Paragraph Spacing/Indenting
Period Pause "..." (Ellipsis)
Periods, End-of-sentence
Plays: Actor Names/Stage Directions
Poetry
Preexisting Formatting
Previous Proofreader Mistakes
Previous Proofreaders' Notes/Comments
Previous Pages, Fixing Errors on
Primary Rule
Project Comments
Project Discussion
Punctuation Spacing
Printer Errors/Misspellings
Quote Marks on Each Line
Quotes, Double
Quotes, Missing at start of chapter
Quotes, Single
Raised Text (Superscripts)
Scannos
Shortcuts for Latin-1 Characters
Sidenotes
Single Quotes
Single Word at Bottom of Page
Small Capitals
Space at End-of-line
Spaces, Extra
Spacing, Paragraph
Spacing, Punctuation
Special Characters, Inserting
Stage Directions (Plays)
Subscripts
Summary Guidelines
Superscripts
Table of Contents
Tables
Tabs
Text, Wrong Image for
Title Page
Titles, Chapter
Trailing Space at End-of-line
Word at Bottom of Page
WordCheck
Words in Small Capitals
Wrong Image for Text