User:Beeswaxcandle/Notes


Notes for History of Norfolk/Volume 2:

Thetford starts djvu 11 and finishes 157
Grimshou starts djvu 158 and finishes 280
Wayland starts djvu 281 and finishes 383
Forehoe starts djvu 384 and finishes 581 - this section will have several "no matches". It starts 10 pages out and finishes 22 pages out.

Clear through to 426. Jump to 431. OK to 462. Jump to 467. OK to 570. 573 by itself. 576 to 581 to finish.
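The run of matched pages above can be written down as ranges so that the "no match" pages fall out mechanically. This is only a sketch: it assumes "Clear through to 426" starts at djvu 384 (the start of Forehoe) and that every page between two matched runs is a "no match". The names `MATCHED_RANGES` and `unmatched_pages` are made up for illustration.

```python
# Matched djvu page ranges for Forehoe, read from the note above.
# Assumption: the first run begins at djvu 384, where the section starts.
MATCHED_RANGES = [(384, 426), (431, 462), (467, 570), (573, 573), (576, 581)]


def unmatched_pages(ranges):
    """Return the djvu pages lying between consecutive matched ranges."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        gaps.extend(range(prev_end + 1, next_start))
    return gaps


gaps = unmatched_pages(MATCHED_RANGES)
print(gaps)       # the pages to check by hand
print(len(gaps))  # how many pages were skipped in total
```

Under these assumptions the skipped pages number 12, which would be consistent with the section drifting from 10 pages out to 22 pages out, although the notes do not state that connection.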


Notes for Varieties of Religious Experience:

All print page numbers in TOC work in the Index.
Preface djvu 9
Lecture 1 djvu 17
Lecture 2 djvu 42
Lecture 3 djvu 53
Lecture 4&5 djvu 94
Lecture 6&7 djvu 143
Lecture 8 djvu 182
Lecture 9 djvu 205
Lecture 10 djvu 233
Lecture 11,12,13 djvu 275
Lecture 14&15 djvu 342
Lecture 16&17 djvu 395
Lecture 18 djvu 446
Lecture 19 djvu 474
Lecture 20 djvu 501
Postscript djvu 536
Lecture 5 djvu
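The start pages listed above could be turned into <pages> tags for transclusion. The sketch below assumes each section ends on the djvu page before the next one starts, which may not hold if there are blank pages or sections sharing a page. The index file name is hypothetical, the incomplete "Lecture 5 djvu" entry is left out, and the Postscript gets no tag because its last page is not recorded in the notes.

```python
# Start pages copied from the notes above.
STARTS = [
    ("Preface", 9),
    ("Lecture 1", 17),
    ("Lecture 2", 42),
    ("Lecture 3", 53),
    ("Lecture 4&5", 94),
    ("Lecture 6&7", 143),
    ("Lecture 8", 182),
    ("Lecture 9", 205),
    ("Lecture 10", 233),
    ("Lecture 11,12,13", 275),
    ("Lecture 14&15", 342),
    ("Lecture 16&17", 395),
    ("Lecture 18", 446),
    ("Lecture 19", 474),
    ("Lecture 20", 501),
    ("Postscript", 536),
]

# Hypothetical index file name, used only for illustration.
INDEX = "Varieties of Religious Experience.djvu"


def pages_tags(starts, index):
    """Build a <pages> tag per section, ending on the page before the next start."""
    tags = {}
    for (name, first), (_, next_first) in zip(starts, starts[1:]):
        tags[name] = f'<pages index="{index}" from={first} to={next_first - 1} />'
    return tags


tags = pages_tags(STARTS, INDEX)
print(tags["Lecture 1"])
```

Each generated range would still need checking against the scan before use.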


Hi, can I check my understanding of your thoughts about Match & Split before I say anything on the Scriptorium?

In my wanderings through Special:LonelyPages and Special:LongPages over the past couple of weeks I've seen a variety of texts for which we don't currently have scans within en.ws (or are not linked in some way to scans that we do have). I see them as falling into five main groupings:

  1. The text layer from an IA scan copied and pasted into a mainspace page with no clean-up done (e.g. The Vatican as a World Power). The only reasonable solution for this group is to gradually work through them adding the djvu file, proofing that and then switching the mainspace text to <pages> - such as I'm doing at the moment with The Czar: A Tale of the Time of the First Napoleon.
  2. Text is sourced from PG/DP (e.g. Critique of Pure Reason (Meiklejohn)). For this group, we should offer a link to their scans (where available). We only need to host scans of these works ourselves if there is an edition that is sufficiently different to be worth hosting and making available separately, or if their scan is not available for linking.
  3. Text is sourced from some other proofread source (e.g. History of Norfolk/Volume 6). This group is variable in the quality of the proofreading and we should either link to their scans (where available) or host scans ourselves. I'm not sure, however, whether we want to proofread the scans we host in this group or whether we would just offer a link. (In the case of the particular work I've used as an example the IA OCR layer is not particularly good and M&S might be appropriate.)
  4. Copied and pasted from some other website (e.g. Go Toppers Go).
  5. Transcribed (or scanned) by the user from a copy held by them (e.g. Shimer College History (1853-1950))

For the last two groups we would like to host proper scans, but won't always be able to. Where we do host a scan we would proofread and transclude over the top of the pasted text. (Note that I have deliberately excluded the Case Law and Executive Orders groups as these are not germane to the M&S conversation.)

The only practical use for M&S is for the third group, in cases where we are choosing to host the scans, want to link them, and the djvu text layer is mediocre to poor. However, Help:Match and split says that the djvu text layer is required to be of a sufficiently reasonable quality. This means that M&S is of limited practical use. That being so, it should be hidden in some way from most general users in the en.ws domain. The easiest way of doing that would be to remove the link on Help:Contents.