User:Ineuw/TPSM Proofreading guide

From Wikisource
The Popular Science Monthly Project
Proofreading guide
 


Page namespace proofreading strategy

Pages considered as special

  • Article title pages.
  • Index pages at the end of each volume.
  • Pages with images.
  • Pages with tables.
  • Article titles were standardized and proofread.
  • The same process was applied to the recurring monthly section titles and paragraph start styles.

PSM Templates

An attempt was made to use Templates almost exclusively. HTML tags are limited to the minimum where a template isn't available or possible.

Templates were selected for their relevance and their versatility in reducing the editing process to a logical minimum. Defaults were used where available, and a compact list of the templates is given in the Proofreading tools section.

Project specific templates

 
Description | Recurring title | Note | Template | Shortcut | Namespace
Article title | No | Article titles of all Volumes | PSMTitle | Pt | Page
Correspondence section title 1 | Yes | From Volume 3 to Volume 47 | PSMCorrespondence | Pcor | Page
Correspondence section title 2 | Yes | From Volume 48 to Volume 56 | PSMCorrespondence2 | PCor2 | Page
Correspondence section title 3 | Yes | Volume 57 only | PSMCorrespondence3 | PCor3 | Page
Discussion and Correspondence | Yes | From Volume 57 to Volume 69 | PSMDiscuss&Correspond | PD&C | Page
Editor's Table 1 | Yes | From Volume 1 to Volume 47, no dot | PSMEditorsTable | Pedit | Page
Editor's Table 2 | Yes | From Volume 48 to Volume 57, with dot | PSMEditorsTable2 | Pedit2 | Page
End of article graphic rule | No | Terminates articles when ending in mid page | PSM rule | | Page
Entertaining Varieties | Yes | From Volume 20 to Volume 22 | PSMEntVar | | Page
Monthly first article title 1 | No | From Volume 1 to Volume 48 December | PSMPage1Title | | Page
Monthly first article title 2 | No | From Volume 48 January to Volume 57 June | PSMPage1Title2 | | Page
Monthly first article title 3 | No | From Volume 57 July to the end of Volume 87 | PSMPage1Title3 | | Page
Monthly first article title 4 | No | Begins from Volume 88 | PSMPage1Title4 | | Page
Fragments of Science | Yes | From Volume 48 November to Volume 57 May | PSMFragmentsOfScience | Pfos | Page
General Notices | Yes | From Volume 48 November to Volume 56 December | PSMGeneralNotices | Pgn | Page
Index header with underline | Yes | From Volume 1 to Volume 56 | PSMIndex | | Page
Index header without underline | Yes | From Volume 57 to Volume 87 | PSMIndex2 | | Page
Link to PSM article | No | Link to a main namespace article | PSM link | | Main
Literary Notes | Yes | From Volume 1 to Volume 47 October | PSMLitNotes | Plit | Page
Literary review paragraph template | No | Incorporates Hanging indent and paragraph padding | PSMLitReview | Plr | Page
Minor Paragraphs | Yes | From Volume 48 November to Volume 57 May | PSMMinorParagraphs | Pmp | Page
Miscellany | Yes | From Volume 1 to Volume 10 December | PSMMisc | Pmm | Page
Navigator to index scans and images | No | Links to project Index scans and commons images | Psm | | Wikiproject
Notes title 1 | Yes | From Volume 1 to Volume 47 October | PSMNotes | Pn | Page
Notes title 2 | Yes | From Volume 48 November to Volume 57 May | PSMNotes2 | Pn2 | Page
Obituary Note (single) | Yes | Volume 2 April | PSMObitNote | POn2 | Page
Obituary Notes (plural) | Yes | From Volume 35 May to Volume 47 October | PSMObitNotes | POn | Page
Obituary | Yes | From Volume 35 May to Volume 47 October | PSMObitTitle | POb | Page
Page display frame bottom | No | Main namespace article frame bottom | PSMLayoutBottom | | Main
Page display frame top | No | Main namespace article frame top | PSMLayoutTop | | Main
Popular Miscellany | Yes | From Volume 10 January to Volume 47 October | PSMPopMisc | Pm | Page
Project home page link | No | Project link of convenience | PSMProjectHome | | Wikiproject
Project pages title | No | Template of convenience | TPSMProject | | Wikiproject
Publications received 1 | Yes | From Volume 2 May to Volume 57 May | PSMPubRec | Ppr | Page
Publications received 2 | Yes | From Volume 48 to Volume 50 | PSMPubRec2 | Ppr2 | Page
Scientific Literature 1 | Yes | From Volume 48 November to Volume 57 May | PSMScientificLiterature | Psl | Page
Scientific Literature 2 | Yes | From Volume 57 June to Volume 63 September | PSMScientificLiterature2 | Psl2 | Page
Shorter Articles | Yes | From Volume 63 September to Volume 68 | PSMShorterArticles | Psa | Page
Shorter Articles2 | Yes | From Volume 69 to Volume 70, no dot | PSMShorterArticles2 | Psa2 | Page
Shorter Articles And Correspondence | Yes | Used when both articles are on the same page | PSMShorterArticlesAndCorrespondence | Psac | Page
Shorter Articles And Discussion | Yes | Used when both articles are on the same page | PSMShorterArticlesAndDiscussion | Psad | Page
Table of Contents | No | Table of contents title | PSMToC | | Main
Table required template | No | Maintenance template | PSMTable | | Page
The Progress of Science | Yes | From Volume 57 to Volume 68, with dot | PSMProgressOfScience | Pps | Page
The Progress of Science | Yes | From Volume 68 to Volume 87, no dot | PSMProgressOfScience2 | Pps2 | Page
Volume title page 1 | No | From Volume 1 to Volume 47 | TPSM | | Page
Volume title page 2 | No | From Volume 48 to Volume 56 | TPSM2 | | Page
Volume title page 3 | No | From Volume 57 to Volume 87 | TPSM3 | | Page

Section tags

A simple alphanumeric coding is devised for the section begin and end tag codes, when two articles begin and end on the same page.

The scheme greatly simplifies section coding when proofreading, and when referencing the section codes during final page assembly in the main namespace. In the context of this project, this scheme cannot cause a section tag conflict.

The codes are made up of the following segments:

End of article begin and end section code segments:

 E = End of article
27 = .djvu page number
<section begin=E27 /><section end=E27 />

The article that follows on the same page uses the same code segments, except that the code is prefixed by 'B' to indicate the beginning section of the article.

 B = Beginning of article
27 = .djvu page number
<section begin=B27 /><section end=B27 />

An example of the above coding scheme is visible in the edit view of .djvu page 27/17 here, and the final results are visible on this page.
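The coding scheme above can be sketched in a few lines of Python. This is only an illustration of how the codes are composed; the function name `section_tags` is hypothetical and is not part of any project tool.

```python
def section_tags(kind: str, djvu_page: int) -> tuple[str, str]:
    """Return the (begin, end) section tags for one article fragment.

    kind is "E" (end of article) or "B" (beginning of article);
    djvu_page is the .djvu page number, as in the scheme above.
    The function name and signature are illustrative assumptions.
    """
    if kind not in ("E", "B"):
        raise ValueError("kind must be 'E' or 'B'")
    code = f"{kind}{djvu_page}"
    return f"<section begin={code} />", f"<section end={code} />"


# Tags for the article ending on .djvu page 27 and the one beginning there.
end_begin, end_end = section_tags("E", 27)
start_begin, start_end = section_tags("B", 27)
```

Because the code combines the article position with the page number, two articles sharing a page always receive different codes.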

Articles

Article main titles are followed by a variety of sub-title styles specific to their content. These are followed by the first paragraph, which leads in with a dropped initial. Some articles name no author.

This page has one main and five sub titles, as displayed below. Otherwise, article titles consist of one main and, at most, three sub-titles. Since the styles differ, there is good visual contrast, even when the font-size difference is less than 10%. Titles can start anywhere on a page.

In Wiki editor

Body of text

Look for tables and lists. Using a spreadsheet to design a table or a list, and then converting it to a wiki table is a good timesaving solution.

Image display is a multiple step process, described separately under the Managing images section below.

If the last word of the page is hyphenated, check the next page for the complete word, and insert the {{hws}} and {{hwe}} templates with the required parameters on both pages.

References and footnotes

{{smallrefs}}</div>

renders footnotes in a small font. Footnote references are numbered automatically. An example of a two-page footnote can be seen on this page, when viewed in edit mode.

Unsupervised changes

Before proceeding, a note for those who are familiar only with Windows and the MS Word .doc definition of a paragraph, which considers a paragraph terminated at the first line break (<Enter key>): the OCR text does not follow that convention.

A line break, or Carriage Return (CR), is referred to as \newline here. The OCR scan terminates every line with a \newline, and uses two \newlines to render paragraph separation when the text is pasted in the editor.


The following process preserves text separated by two \newlines and replaces single \newline codes with a space to form a single continuous text line for each paragraph. References to character codes can be found for \newline, here, and for space, here.

  1. Trim paragraphs to remove preceding and trailing spaces by searching for occurrences of space before and after a \newline.
  2. Replace two spaces with one space.
  3. Replace two \newlines with a unique symbol to mask actual paragraph breaks. This <<>> symbol example is easily created on any English language keyboard, and is unlikely to exist in any text.
  4. Replace a single \newline with space. This formats the text into one continuous line and correctly defines paragraph breaks.
  5. Replace the paragraph symbol <<>>, with two \newlines to restore paragraph breaks.
  6. Replace colon preceded by space ( :), with a colon (:).
  7. Replace semicolon preceded by space ( ;), with a semicolon (;).
  8. Replace question mark preceded by space ( ?), with a question mark (?).
  9. Replace exclamation mark preceded by space ( !), with an exclamation mark (!).

All of the above can be instantaneously accomplished in a text editor with macro capability.
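As an illustration, the nine steps above can be sketched as a small Python function instead of an editor macro. The function name `unsupervised_cleanup` is hypothetical. The sketch also makes two assumptions beyond the list: runs of more than two spaces are collapsed in one pass, and runs of more than two \newlines are treated as a single paragraph break.

```python
import re

PARAGRAPH_MARK = "<<>>"  # temporary placeholder from step 3


def unsupervised_cleanup(text: str) -> str:
    """Apply the unsupervised OCR clean-up steps to pasted text."""
    # Step 1: remove spaces immediately before and after each \newline.
    text = re.sub(r" *\n *", "\n", text)
    # Step 2: collapse runs of spaces into one space (assumption: any run).
    text = re.sub(r" {2,}", " ", text)
    # Step 3: mask paragraph breaks (assumption: two or more \newlines).
    text = re.sub(r"\n{2,}", PARAGRAPH_MARK, text)
    # Step 4: a remaining single \newline is only a line wrap; use a space.
    text = text.replace("\n", " ")
    # Step 5: restore the paragraph breaks.
    text = text.replace(PARAGRAPH_MARK, "\n\n")
    # Steps 6 to 9: drop the space before a colon, semicolon,
    # question mark, or exclamation mark.
    text = re.sub(r" +([:;?!])", r"\1", text)
    return text


sample = "The quick  brown \n fox jumps ;\nover the lazy dog ?\n\nSecond para :\nhere !"
cleaned = unsupervised_cleanup(sample)
```

Words hyphenated at a line end are joined with a space after the hyphen by step 4, so they still need the supervised pass described in the next section.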

Supervised changes

These changes must be supervised because they occur in various contexts.

  1. Search for the hyphen - (ANSI 045), and join the words as required. Hyphenated words that are correct by themselves are left as is, this being the typesetting style of the time.
  2. Search for the double quotation mark " (ANSI 034) and check for matching opening and closing marks. Delete any space between the marks and the enclosed text.
  3. An occasional typographical style is applied to a series of paragraphs, where each paragraph opens with a double quotation mark but has no closing mark.
  4. Search for the single quotation mark ' (ANSI 039) surrounded by spaces. These marks are used to enclose text within, or in place of, double quotation marks.
  5. Search for 'ae' and 'oe', which are most likely incorrectly rendered ligatures 'æ' and 'œ'. Assumptions about their existence in the text can be made from the article's subject matter.
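Since these changes need a human decision, a script can only list candidates for review. The following is a minimal sketch of such a listing, assuming simple regular expressions are good enough to flag the cases above; the names `PATTERNS` and `find_candidates` are hypothetical, and many hits (for example 'does' for 'oe') will be false positives.

```python
import re

# Illustrative patterns only; each hit must be checked against the scan.
PATTERNS = {
    "hyphen": re.compile(r"\w+- ?\w+"),
    "single quote surrounded by space": re.compile(r" ' "),
    "possible ligature": re.compile(r"\w*(?:ae|oe)\w*", re.IGNORECASE),
}


def find_candidates(text: str) -> list[tuple[int, str, str]]:
    """Return (offset, label, matched text) for every spot to review."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    # An odd number of double quotation marks means one is unmatched,
    # or the page uses the open-only paragraph style from item 3.
    if text.count('"') % 2:
        hits.append((-1, "unbalanced double quotes", '"'))
    return sorted(hits)


review = find_candidates('The encyclo- paedia says " hello')
```

Nothing is changed automatically; the list only tells the proofreader where to look.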

OCR anomalies

In early volumes, some symbols and characters are ignored by the OCR. These include the em dash (—), currency symbols ($ and £), and the degree sign (°) used as a temperature indicator, among others.

In several instances, the OCR process has difficulty in distinguishing certain characters and commonly misreads the following:

  1. Words beginning with an uppercase 'W' are sometimes rendered with a preceding double quotation mark, as in "W. Compare with the original.
  2. Short words beginning with 'w' are occasionally garbled, as in 'w T e', which is supposed to be 'we'. Correct these by searching for ' w ' surrounded by spaces.
  3. Occasionally, the lowercase 'h' is rendered as 'b'.
  4. Words containing 'g' are problematic.
  5. In words containing 'p', the letter is often replaced by 'jj'.
  6. Uppercase 'N' is often rendered incorrectly.
  7. The uppercase 'R' is often rendered as 'K', 'E', or 'B'. Spell check finds the error, unless the change produces a meaningful word.
  8. Ligatures.

Spell check

If possible, perform a spell check. Bad spelling in the original is indicated by the {{sic}} template. Outdated but correct spelling is left as is. The {{sic}} template is invisible in read mode, but in edit mode it indicates that a previous editor was aware of the spelling.

A US English dictionary is sufficient, and spelling variations of English words are ignored.

If simultaneous use of multiple dictionaries is possible, then UK English, French and German dictionaries are useful.

An alphabetic list of archaically spelled words and proper names collected from the Volumes can be found on this page.


  1. Proofread to insert missing 'em—dashes' (ANSI 0151).
  2. Proofread to ''italicize'' text. Referenced publication names are always italicized.
  3. References to the publication itself are always in small caps, as in The Popular Science Monthly or The Monthly.
  4. If there is an image, insert the image template in the same place as in the original, and add the Fig no. and caption to the template, even if it's showing in the image.

The following appear with decreasing frequency and have a relation to the article topic.

  1. Proofread for ambiguous text missed by the spell check. This may be an incorrectly rendered scientific, technical, or currency symbol, such as a fraction '½', a degree sign '°', or a currency symbol '£'.
  2. Aside from formatted titles, some articles use CAPITALIZED and Small caps for emphasis.

Select the 'Proofread' button and save the edits.

Managing images

Image naming and preparation

Using the image upload wizard

Images are stored on Wikimedia Commons and can only be uploaded by registered users. Use this form for uploading images.

Image category selection

An image being uploaded to the Commons must include the Volume sub category found grouped under the Popular Science Monthly illustrations Main category. For example: This link shows all images from Volume 5.

Additional categories about the image are helpful and THIS PAGE is the search page for the extensive Wikimedia categories list.

Proofreading tools

The preference is that the Wiki editor should be able to do all the operations, but it's not yet possible. External editors are used to perform the outlined process, and open source software is desirable, but this is not always possible or convenient.


Image layout

PSM V54 D323 Two simple shapes creating optical illusions.png
Fig. 11.—This represents an ordinary table-glass, the bottom of the glass and the entire rear side, except the upper portion, being seen through the transparent nearer side, and the rear apparently projecting above the front. But it fluctuates in appearance between this and a view of the glass in which the bottom is seen directly, partly from underneath, the whole of the rear side is seen through the transparent front, and the front projects above the back. Fig. 12.—In this scroll the left half may at first seem concave and the right convex, it then seems to roll or advance like a wave, and the left seems convex and the right concave, as though the trough of the wave had become the crest, and vice versa.
 

 
PSM V54 D322 Simple shape creating optical illusion 1.png
PSM V54 D322 Simple shape creating optical illusion 3.png
Fig. 8.—This drawing may be viewed as the representation of a book standing on its half-opened covers as seen from the back of the book; or as the inside view of an open book showing the pages.
PSM V54 D322 Simple shape creating optical illusion 2.png
Fig. 10.—The smaller square may be regarded as either the nearer face of a projecting figure or as the more distant face of a hollow figure. Fig. 9.—When this figure is viewed as an arrow, the upper or feathered end seems flat; when the rest of the arrow is covered, the feathered end may be made to project or recede like the book cover in Fig. 8.
 

 
PSM V54 D238 Flakes of volcanic ash.png
PSM V54 D238 Swelled fused volcanic ash particles.png
Fig. 2.—Flakes of Volcanic Ash. Magnified about 100 diameters. A, flake with a branching rib; B, fragment of a broken hollow sphere of glass; C, fragment with drawn out tubular vesicles; D and E, plain fragments of broken pumice bubbles. (From American Geologist, April, 1893.)
Fig. 3.—A Particle of Volcanic Ash swelled up by Fusion. Magnified 100 diameters.
{|align=center width="430" {{ts|sm85|bc|lh95}} border=1
|[[File:PSM V54 D238 Flakes of volcanic ash.png|frameless|center|215px|]]
|width=10px| 
|rowspan=2 |[[File:PSM V54 D238 Swelled fused volcanic ash particles.png|frameless|center|215px|]]
|-
|rowspan=2 width=210px {{ts|aj|it|vtb}}|{{sc|Fig. 2.—Flakes of Volcanic Ash.}} Magnified about 100 diameters. A, flake with a branching rib; B, fragment of a broken hollow sphere of glass; C, fragment with drawn out tubular vesicles; D and E, plain fragments of broken pumice bubbles. (From American Geologist, April, 1893.)
|
|-
|
|width=210px {{ts|aj|it|vtb}}|{{sc|Fig. 3.—A Particle of Volcanic Ash swelled up by Fusion.}} Magnified 100 diameters.
|}

Text editors

Any text editor (preferably with a spell check dictionary and a keyboard macro feature) can be used.

A US English dictionary is sufficient, although, if multiple dictionaries are possible, scientific or academic dictionaries would be good additional choices. An alphabetic list of archaic spellings and proper names collected from Volume 1 can be found on this page, although the list needs to be cleaned up.

All of the unsupervised tasks listed previously are instantly performed by a single macro. In addition, frequently used templates can be assigned to keyboard macros.

Image editor

The editor is used to trim excessive blank space surrounding the image and save it as a .JPG file. Image clarity of drawings doesn't improve by changing the pixel size, and photos didn't appear in print until 1880.

Conversion tools

Dictionaries

Main namespace

Web page names

Taking into account the number of volumes, the volumes' biannual cycle (May to October, November to April), the monthly issues, and the article titles, a minimum of four page name segments (a root and three subsequent branches) is required to satisfy uniqueness and to identify the project, the volume, the issue, and the article.

  1. The root segment is made up of the publication name, which satisfies all volumes.
  2. The first sub segment is the Volume and number.
  3. The second sub segment indicates the issue month and year. Volumes span six months, counted from the first issue in May 1872; thus, even-numbered volumes span two years. Also, there may be merged volumes.
  4. The third sub segment is the article title, matching the Table of Contents, which was matched to each article.

In addition to general article titles, this structure takes lecture series into account by attaching their sub-title numbers to the series title (the lecture series are numbered), thus avoiding a fourth sub segment (fifth page name segment).

Series
Popular Science Monthly/Volume 1/May 1872/The Study of Sociology I
Popular Science Monthly/Volume 1/May 1872/Natural History of Man I
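The four segments can be assembled mechanically. The sketch below is only an illustration that reproduces the two series examples above; the function name `psm_page_name` and its parameters are hypothetical, and merged volumes are not handled.

```python
ROOT = "Popular Science Monthly"  # root segment, shared by all volumes


def psm_page_name(volume: int, issue: str, title: str, part: str = "") -> str:
    """Join the root and the three sub segments into a page name.

    issue is the month and year of the issue, e.g. "May 1872".
    part is the optional series number (e.g. "I"), appended to the
    title so that no fifth segment is needed.
    """
    article = f"{title} {part}".strip()
    return "/".join([ROOT, f"Volume {volume}", issue, article])


sociology = psm_page_name(1, "May 1872", "The Study of Sociology", "I")
history = psm_page_name(1, "May 1872", "Natural History of Man", "I")
```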
Recurring monthly features

Table of Contents

  1. The Table of Contents entries are the article titles of the main namespace. While the original article titles are uppercase, the web page names and the Table of Contents entries capitalize each word, except pronouns, prepositions and conjunctions, which are lowercase unless they are the first word of the title.
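The capitalization rule can be sketched as follows. The word list `MINOR_WORDS` is an assumed, incomplete sample of pronouns, prepositions and conjunctions, and the function name `toc_title` is hypothetical. Proper names and Roman numerals (for example a series number 'II') are not handled and need manual correction.

```python
# Assumed sample of words kept lowercase; the real list would be longer.
MINOR_WORDS = {
    "of", "in", "on", "to", "for", "by", "with",  # prepositions
    "and", "or", "but",                           # conjunctions
    "its", "his", "their",                        # pronouns
}


def toc_title(original: str) -> str:
    """Convert an uppercase printed title to the Table of Contents form."""
    words = original.lower().split()
    result = []
    for position, word in enumerate(words):
        if position > 0 and word in MINOR_WORDS:
            result.append(word)  # minor word, not first: stays lowercase
        else:
            result.append(word[:1].upper() + word[1:])
    return " ".join(result)


example = toc_title("THE STUDY OF SOCIOLOGY")
```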

Comments on the contents

For students of science, technology and social history, the publication provides a fascinating view through the window of the printed word, and what a view it is. The articles were written by the great minds of the 19th century, and the depth and diverse range of subjects covered is a mine of pure gold. The language, the terminology, and the spelling of the day, coupled with an occasional tone of condescension employed in addressing the reading audience, enhance the experience.

The publication aimed to reach a wide audience by disseminating information and publicizing issues of wide-ranging interest for the emerging 19th century middle class, which was thirsty for knowledge. The novel approach of combining the perceived desire of the public with a platform for the dissemination of academic thought was well received.

Of great interest is the level of scientific knowledge and the social issues of the day. It's somewhat eerie to read that the then prevailing views expressed on matters of public health, education, nutrition, employment, natural resources and pollution are still familiar in our time. The knowledge espoused ranges from the quaint to the surprisingly advanced, with many theories still in the process of being debated and formulated when this is written.

The typesetting

The increased confidence in the viability of the enterprise is palpable, as indicated by the proudly published reviews on this page of the June 1872 issue. There was a positive reception by the press, the academic community, and the interested public. Subtle changes appear progressively after this issue. The typesetting style is progressively streamlined and displays increased professionalism.

The composition is progressively improved, and the payoff for the Wikisource proofreader is the reduced frequency of typographical embellishments. The number of quotes, italics and em dashes is no more than a couple per page, and that's good news. Of course there are some extreme exceptions, like idiosyncratic writing where the word "practical", enclosed in quotes, appears nine times on a single page.

After spending time at a daily paper, observing Linotype machines in operation, surrounded by typesetters and proofreaders at work, a sense of amazement is felt when one considers that these pages were manually set character by character, space by space, block by block, to a justified paragraph format, laid inversely and with very few errors.

Subject and style

While randomly sampling articles by topic, an interesting relationship can be discerned between the article's subject, the typographical style, and even the word count per page. Articles on morality, religion, and religious thought contain an increased number of typographical embellishments to emphasize their absolute, exhorting, admonishing and cautionary messages. This is indicated by an increased number of em dashes, double quotes, single quotes, italics, and capitalized text. There is a lot of rolling of the holy.