Wikisource:WikiProject NLS


This project co-ordinates efforts to retrieve and present works stored by the National Library of Scotland.

Project Overview

The National Library of Scotland has a collection of over 3,000 Scottish chapbooks which have been digitised and published on the Library's Digital Gallery. In March 2020, the Library undertook a project to correct the OCR on this collection by uploading the chapbooks to Wikisource, proofreading them, then exporting the corrected OCR and loading it back into the Digital Gallery. The project began during the Coronavirus crisis and has been a useful way of using staff time while they work from home; in the first four weeks of the project more than 60 members of staff took part. In addition, it has served as a good vehicle for engaging Library staff with the wider Wikimedia environment.

Workflow

We worked with Beeswaxcandle to develop a workflow that would allow us to meet Wikisource's quality standards while being able to complete texts at a reasonable pace. Please contact Gweduni if you would like more detail about any of the steps below.

The workflow contains the following stages:

  1. Uploading books to Wikisource
  2. Proofreading
  3. Validating
  4. Transclusion
  5. Export of OCR

Uploading books to Wikisource

We upload approximately 40 books per day to Wikisource. This involves identifying the items we want to upload, preparing a metadata spreadsheet and then uploading to Wikimedia Commons through Pattypan. The books are then moved across to Wikisource from Wikimedia Commons and added to our internal tracking spreadsheet, from which they are distributed to staff members for editing.
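
As a rough check on scale, the daily rate above can be related to the size of the collection. This is a minimal sketch only: the five-day working week is an assumption rather than something stated on this page, and the constant and function names are illustrative.

```python
# Figures quoted on this page.
BOOKS_PER_DAY = 40        # "approximately 40 books per day"
COLLECTION_SIZE = 3000    # "over 3,000 Scottish chapbooks" (approximate)

# Assumption: uploads happen on five working days per week.
WORKING_DAYS_PER_WEEK = 5


def weekly_uploads(per_day: int = BOOKS_PER_DAY,
                   days: int = WORKING_DAYS_PER_WEEK) -> int:
    """Number of books uploaded in one working week."""
    return per_day * days


def weeks_to_upload(total: int, per_week: int) -> int:
    """Whole weeks needed to upload `total` books (ceiling division)."""
    return -(-total // per_week)


print(weekly_uploads())                                     # 200
print(weeks_to_upload(COLLECTION_SIZE, weekly_uploads()))   # 15
```

Under that assumption, 40 books per day corresponds to 200 per week, which matches the weekly target given in the Progress section.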

The Index pages for such works are in Category:WikiProject NLS.

Proofreading

Some members of the team are responsible for the initial proofreading of the books. This stage contains some mandatory steps, such as ensuring spelling and punctuation are correct, using line breaks correctly, and adopting a consistent approach to illegible text, blank pages and blurry pages. Staff who are more confident in the Wikisource platform can pick up some of the optional steps, including alignment, formatting, special characters and generating tables or columns (any of these optional steps that are missed are then picked up at the Validating stage).

NLS Proofreading guide: https://commons.wikimedia.org/wiki/File:National_Library_of_Scotland_Wikisource_Project_Proofreading_Guide.pdf

Validating

A separate group of more advanced users undertakes the validation stage. This stage involves checking that the book has been proofread correctly and adding tags and additional formatting so that the book better reflects the look of the original item.

NLS Validation and Transcluding Guide: https://commons.wikimedia.org/wiki/File:National_Library_of_Scotland_Wikisource_Project_Validating_and_Transcluding_Guide.pdf

Transclusion

Transclusion is the publication of the item on Wikisource. It involves adding a header template, linking author details through to Wikidata, adding copyright templates and updating the item's status to "Transcluded". Once an item has been transcluded we are able to export the completed OCR.

Export of OCR

The Library's developer team are currently testing the process for exporting the completed OCR and reimporting into the Library's Digital Gallery.
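
The Library's own export process is still being tested and is not described on this page. Purely as an illustration of one way corrected text could be read back from Wikisource, the sketch below uses the public MediaWiki Action API (action=query with prop=revisions). The English Wikisource endpoint, the example page title and the function names are assumptions made for this example, and no network request is made.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: the works are hosted on English Wikisource.
API_ENDPOINT = "https://en.wikisource.org/w/api.php"


def build_wikitext_query(title: str) -> str:
    """Build an Action API URL that requests the current wikitext of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_wikitext(response: dict) -> Optional[str]:
    """Pull the wikitext out of a decoded formatversion=2 JSON response."""
    pages = response.get("query", {}).get("pages", [])
    if not pages or pages[0].get("missing"):
        return None
    revisions = pages[0].get("revisions", [])
    if not revisions:
        return None
    return revisions[0]["slots"]["main"]["content"]


# Hypothetical page title and a hand-written response of the expected shape.
url = build_wikitext_query("Page:Example.pdf/1")
sample = {
    "query": {
        "pages": [
            {
                "title": "Page:Example.pdf/1",
                "revisions": [{"slots": {"main": {"content": "Corrected text"}}}],
            }
        ]
    }
}
print(url)
print(extract_wikitext(sample))
```

The returned wikitext still contains any templates and formatting tags added during validation, so a real export would need a further step to reduce it to plain OCR text.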

Progress

The collection contains approximately 3,000 chapbooks, and we plan to upload around 200 every week. We hope to have as many of the 3,000 transcluded as possible before the end of the Coronavirus lockdown, and will make this an ongoing piece of work once staff return to the Library. The "Uploaded" column shows the number of items uploaded, while the "To proofread" column shows the number to be proofread once duplicates or possible duplicates have been excluded.

2020 Progress
Date | Uploaded | To proofread | Started | Proofread | Validated | Transcluded | Percentage transcluded
03/04/2020 243 243 154 106 59 0 0%
10/04/2020 358 266 241 190 77 8 0%
17/04/2020 558 377 280 230 89 24 1%
24/04/2020 753 509 348 294 148 61 2%
01/05/2020 1107 754 398 352 215 100 3%
08/05/2020 1350 888 396 354 188 178 6%
15/05/2020 1454 976 552 514 337 236 8%
22/05/2020 1663 1090 588 555 381 321 11%
29/05/2020 1739 1188 662 613 447 400 13%
05/06/2020 1858 1326 740 696 535 493 16%
12/06/2020 2150 1526 812 766 616 587 20%
19/06/2020 2398 1715 878 835 684 646 22%
26/06/2020 2553 1856 934 892 729 680 23%
03/07/2020 2698 2068 1015 974 770 724 24%
10/07/2020 2975 2395 1103 1066 830 791 26%
17/07/2020 2997 2412 1192 1155 901 882 29%
24/07/2020 2997 2376 1267 1233 956 938 31%
31/07/2020 2997 2376 1314 1282 999 972 32%
07/08/2020 2997 2376 1351 1319 1003 979 33%
14/08/2020 2997 2376 1384 1351 1016 987 33%
21/08/2020 2997 2376 1399 1367 1025 989 33%
28/08/2020 2997 2376 1416 1383 1026 995 33%
04/09/2020 2997 2376 1448 1416 1035 1000 33%
11/09/2020 2997 2376 1476 1440 1046 1032 34%
18/09/2020 2997 2376 1496 1462 1060 1042 35%
25/09/2020 2997 2376 1518 1481 1068 1043 35%
02/10/2020 2997 2376 1531 1496 1079 1043 35%
09/10/2020 2997 2376 1544 1509 1089 1045 35%
16/10/2020 2997 2376 1554 1519 1097 1047 35%
23/10/2020 2997 2376 1563 1527 1108 1048 35%
30/10/2020 2997 2376 1577 1541 1116 1053 35%
13/11/2020 2997 2376 1596 1575 1133 1061 35%
20/11/2020 2997 2376 1609 1586 1147 1064 36%
27/11/2020 2997 2376 1622 1599 1153 1064 36%
04/12/2020 2997 2376 1630 1607 1159 1064 36%
11/12/2020 2997 2376 1636 1613 1162 1064 36%
18/12/2020 2997 2376 1639 1618 1170 1073 36%
25/12/2020 2997 2376 1644 1623 1190 1075 36%
2021 Progress
Date | Uploaded | To proofread | Started | Proofread | Validated | Transcluded | Percentage transcluded
01/01/2021 2997 2376 1648 1625 1199 1075 36%
08/01/2021 2997 2376 1684 1653 1200 1075 36%
15/01/2021 2997 2376 1735 1695 1235 1081 36%
22/01/2021 2997 2376 1785 1742 1275 1086 36%
29/01/2021 2997 2376 1834 1795 1299 1112 37%
05/02/2021 2997 2376 1888 1846 1326 1129 38%
12/02/2021 2997 2376 1934 1892 1353 1129 38%
19/02/2021 2997 2376 1985 1940 1388 1133 38%
26/02/2021 2997 2376 2022 1989 1417 1134 38%
05/03/2021 2997 2376 2053 2025 1433 1136 38%
12/03/2021 2997 2376 2075 2048 1444 1137 38%
19/03/2021 2997 2376 2102 2074 1450 1137 38%
26/03/2021 2997 2376 2113 2087 1457 1137 38%
02/04/2021 2997 2376 2135 2104 1485 1137 38%
09/04/2021 2997 2376 2157 2122 1497 1137 38%
16/04/2021 2997 2376 2186 2162 1525 1137 38%
23/04/2021 2997 2376 2200 2177 1548 1137 38%
30/04/2021 2997 2376 2212 2190 1566 1137 38%
07/05/2021 2997 2376 2220 2198 1575 1137 38%
14/05/2021 2997 2376 2230 2212 1585 1137 38%
21/05/2021 2997 2376 2243 2221 1591 1137 38%
28/05/2021 2997 2376 2253 2233 1595 1140 38%
04/06/2021 2997 2376 2264 2240 1598 1140 38%
11/06/2021 2997 2376 2271 2245 1601 1140 38%
18/06/2021 2997 2376 2274 2249 1607 1140 38%
25/06/2021 2997 2376 2277 2253 1616 1140 38%
02/07/2021 2997 2376 2285 2261 1621 1140 38%
09/07/2021 2997 2376 2290 2266 1628 1140 38%
16/07/2021 2997 2376 2295 2272 1631 1140 38%
23/07/2021 2997 2376 2300 2275 1636 1140 38%
30/07/2021 2997 2376 2302 2276 1641 1140 38%
27/08/2021 2997 2376 2307 2284 1644 1140 38%
24/09/2021 2997 2376 2315 2290 1645 1140 38%
29/10/2021 2997 2376 2317 2294 1647 1140 38%
26/11/2021 2997 2376 2317 2294 1648 1140 38%
31/12/2021 2997 2376 2317 2294 1650 1146 38%
2022 Progress
Date | Uploaded | To proofread | Started | Proofread | Validated | Transcluded | Percentage transcluded
28/01/2022 2997 2376 2317 2294 1654 1146 38%
25/02/2022 2997 2376 2318 2295 1654 1146 38%
25/03/2022 2997 2376 2318 2295 1654 1146 38%
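
This page does not say how the final column of the progress tables is calculated. The published figures appear consistent with dividing the number of transcluded items by the 2,997 items eventually uploaded and rounding to the nearest whole percent. The sketch below works on that assumption; the constant and function names are illustrative.

```python
# Assumption: the denominator is the final "Uploaded" total from the tables.
TOTAL_UPLOADED = 2997


def percent_transcluded(transcluded: int, total: int = TOTAL_UPLOADED) -> int:
    """Whole-number percentage, rounding halves up, using integer arithmetic."""
    return (200 * transcluded + total) // (2 * total)


# A few (date, transcluded) pairs copied from the tables above.
rows = [("10/04/2020", 8), ("20/11/2020", 1064), ("25/03/2022", 1146)]
for date, transcluded in rows:
    print(f"{date}: {percent_transcluded(transcluded)}%")
# 10/04/2020: 0%
# 20/11/2020: 36%
# 25/03/2022: 38%
```

Integer arithmetic is used so that the result does not depend on floating-point rounding behaviour.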

Items for Proofreading by the Wikisource Community

The majority of proofreading is being undertaken by staff at the NLS, but if members of the wider Wikisource community would like to undertake some proofreading, the ten listed below are available. Please sign your username against an item if you are working on it.

List of Works Completed

Accidents

Adventures and adventurers

Apparitions

Clothing and Dress

Courtship and seduction

Covenanters

Crime and Punishment

Curiosities and Wonders

Elegiac Poetry

Emigration

Executions

Fairs

Freemasonry, Incest and Ireland

* An abstract of the bloody massacre in Ireland

Jacobites, Kings and Rulers, Last Words

Marriage

Murders

Occupations

Pirates, Politics, Prophesies and Prostitutes

Religion and Morality

Riots, Robbery and Wit & Humour

Scotland and Scots

Slavery, Soldiers and Sailors, Sports, Street Life, Suicide

Temperance, Treason, Transvestites, Trials, War

To Do

Uploads that may need metadata scan re-alignment at Commons: /Scrambled Resolved.

Contributors

Gweduni (talk) 10:20, 6 April 2020 (UTC)

Mandarasa (talk) 10:23, 6 April 2020 (UTC)

LilacRoses (talk) 10:24, 6 April 2020 (UTC)

Tamheaney (talk) 09:39, 7 April 2020 (UTC)

Chime Hours (talk) 14:34, 9 July 2020 (UTC)

AndrewOfWyntoun (talk) 14:34, 9 July 2020 (UTC)