Wikisource:Bot requests/Archives/2022

Warning Please do not post any new comments on this page.
This is a discussion archive, although the comments contained were likely posted before and after its creation date.
See current discussion or the archives index.

Find and replace conversion of {{page}} to <pages /> in 1911 Encyclopædia Britannica

Is find and replace with regex something that can be done by bot? If so, I'd like to request that it be used to convert some uses of {{page}} in 1911 Encyclopædia Britannica—specifically, finding this pattern

<div class="prose">
{{page|[index name]/[page number]|num=[display number]|section="[section name]"}}
</div>

and converting it to

<div class="prose">
<pages index="[index name]" include=[page number] onlysection="[section name]" />
</div>

This PetScan is a list of 1911 Encyclopædia Britannica pages which use {{page}}. Thank you! —CalendulaAsteraceae (talkcontribs) 07:24, 13 January 2022 (UTC)
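A minimal sketch of the find and replace described above, written with Python's standard `re` module. The function name `convert_single_page` is hypothetical, and the pattern assumes the exact parameter order shown in the request (`num` before `section`); it is an illustration, not the code any bot operator actually ran.

```python
import re

# Matches exactly one {{page}} call wrapped in <div class="prose">...</div>.
# Assumes the parameter order index/page, num, section, as in the request.
PAGE_IN_DIV = re.compile(
    r'<div class="prose">\s*\n'
    r'\{\{page\|(?P<index>[^|/{}]+)/(?P<page>\d+)'
    r'\|num=[^|{}]*'
    r'\|section="?(?P<section>[^"|{}]+)"?\}\}\s*\n'
    r'</div>'
)


def convert_single_page(wikitext):
    """Replace a lone {{page}} inside the prose div with a <pages /> tag."""
    def repl(m):
        return (
            '<div class="prose">\n'
            f'<pages index="{m["index"]}" include={m["page"]} '
            f'onlysection="{m["section"]}" />\n'
            '</div>'
        )
    return PAGE_IN_DIV.sub(repl, wikitext)
```

Because the closing `</div>` must follow the first `{{page}}` directly, a div holding two or more calls is left untouched.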

@CalendulaAsteraceae:, there are some pages where
<div class="prose"></div>
is not present. What shall be done there? Add it? Replace only the {{page}} part? Mpaa (talk) 21:52, 14 January 2022 (UTC)
There are also multi-page cases (which is not straightforward to process), how shall this be converted? e.g:
{{page|EB1911 - Volume 04.djvu/273|num=258|section=part2}}
{{page|EB1911 - Volume 04.djvu/274|num=259}}
{{page|EB1911 - Volume 04.djvu/275|num=260}}
{{page|EB1911 - Volume 04.djvu/276|num=261}}
{{page|EB1911 - Volume 04.djvu/277|num=262}}
{{page|EB1911 - Volume 04.djvu/278|num=263}}
{{page|EB1911 - Volume 04.djvu/279|num=264|section=Borneo}}
@Mpaa: The reason I specified pages with the <div class="prose">…</div> and only one invocation of {{page}} is that it's easy to do with find and replace. The div ensures there aren't other invocations of {{page}} above or below, since naively replacing them all with a <pages /> tag would cause weird line breaks since each use of <pages /> is wrapped in a div. Basically, I know it won't get everything, but it would make a significant dent since a lot of 1911 Encyclopædia Britannica entries follow this pattern. I can request find and replace for additional patterns, but I figured I'd start with one, especially since I don't know if this is something that can be done by bot at the moment. —CalendulaAsteraceae (talkcontribs) 04:26, 15 January 2022 (UTC)
To actually answer your questions, if there's just the one use of {{page}} without the <div class="prose"></div>, the {{page}} part should be replaced, but there's no need to add the div. For the example you gave, it should be replaced with
<pages index="EB1911 - Volume 04.djvu" from=273 fromsection="part2" to=279 tosection="Borneo" />
but even if that can't be done by bot, I'll still appreciate whatever help I can get! I've been doing find and replace manually on the simpler patterns and that is not one of them.
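One way the multi-page case could be handled is sketched below. It is only an illustration under stated assumptions: the helper name `pages_tag_for_run` is hypothetical, the run must use a single index with consecutive page numbers, and only the first and last calls may carry a `section` parameter. Anything else returns `None` so it can be left for manual conversion.

```python
import re

PAGE_CALL = re.compile(
    r'\{\{page\|(?P<index>[^|/{}]+)/(?P<page>\d+)(?P<params>[^{}]*)\}\}'
)
SECTION = re.compile(r'\|section="?([^"|]+)"?')


def pages_tag_for_run(wikitext):
    """Build one <pages from=... to=... /> tag from a run of {{page}} calls."""
    calls = []
    for m in PAGE_CALL.finditer(wikitext):
        sec = SECTION.search(m["params"])
        calls.append((m["index"], int(m["page"]), sec.group(1) if sec else None))
    if len(calls) < 2:
        return None
    numbers = [c[1] for c in calls]
    same_index = len({c[0] for c in calls}) == 1
    consecutive = numbers == list(range(numbers[0], numbers[0] + len(numbers)))
    middle_has_section = any(c[2] for c in calls[1:-1])
    if not same_index or not consecutive or middle_has_section:
        return None  # not a simple run; leave for a human
    attrs = [f'index="{calls[0][0]}"', f'from={numbers[0]}']
    if calls[0][2]:
        attrs.append(f'fromsection="{calls[0][2]}"')
    attrs.append(f'to={numbers[-1]}')
    if calls[-1][2]:
        attrs.append(f'tosection="{calls[-1][2]}"')
    return "<pages " + " ".join(attrs) + " />"
```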
Another pretty common pattern is
<div class="prose">
{{page|EB1911 - Volume 08.djvu/429|num=408|section="Donation of Constantine"}}
{{page|EB1911 - Volume 08.djvu/430|num=409|section="Donation of Constantine"}}
</div>
and the div makes a difference there in terms of find and replace because there are also pages that do this
<div class="prose">
{{page|EB1911 - Volume 08.djvu/427|num=406|section="Donatello"}}
{{page|EB1911 - Volume 08.djvu/428|num=407|section="Donatello"}}
{{page|EB1911 - Volume 08.djvu/429|num=408|section="Donatello"}}
</div>
and a find and replace that didn't include the div would catch entries with more than two uses of {{page}} as well.
One more caution: some entries have nonconsecutive numbers due to plates, e.g.
{{page|EB1911 - Volume 09.djvu/325|num=308|section="Embossing"}}
{{page|EB1911 - Volume 09.djvu/328|num=309|section="Embossing"}}
so find and replace will need to account for that. The above, I would replace with
<pages index="EB1911 - Volume 09.djvu" include="325,328" onlysection="Embossing" />
CalendulaAsteraceae (talkcontribs) 04:48, 15 January 2022 (UTC)
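For entries whose pages all share one section, including the nonconsecutive plate case, a sketch along these lines could build the `include` list. The name `include_tag` is hypothetical, and the pattern assumes every call has the `num` then `section` parameter order used in the examples above.

```python
import re

PAGE_CALL = re.compile(
    r'\{\{page\|(?P<index>[^|/{}]+)/(?P<page>\d+)'
    r'\|num=[^|{}]*\|section="?(?P<section>[^"|{}]+)"?\}\}'
)


def include_tag(wikitext):
    """Collapse same-section {{page}} calls into one <pages include=... /> tag."""
    calls = [(m["index"], m["page"], m["section"])
             for m in PAGE_CALL.finditer(wikitext)]
    if not calls:
        return None
    one_index = len({c[0] for c in calls}) == 1
    one_section = len({c[2] for c in calls}) == 1
    if not one_index or not one_section:
        return None  # mixed indexes or sections need a different conversion
    pages = ",".join(c[1] for c in calls)
    return (f'<pages index="{calls[0][0]}" include="{pages}" '
            f'onlysection="{calls[0][2]}" />')
```

Listing the page numbers explicitly avoids pulling in the plate pages that sit between them in the scan.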
Pages with a single {{page}} entry are done. Multi-page case cannot be done with standard tools, logic/code needs to be worked out first. Mpaa (talk) 14:39, 15 January 2022 (UTC)
Awesome; thank you! —CalendulaAsteraceae (talkcontribs) 21:46, 15 January 2022 (UTC)

Related request

Follow-up request: basically the same thing for single {{page}} entries of The New Student's Reference Work. A lot of pages look like


{{page|LA2-NSRW-1-0013.jpg|section=Abalone|num=1}}

(space is deliberate, that's a newline) and could be replaced with e.g.


<pages index="The New Student's Reference Work/Vol I" from="LA2-NSRW-1-0013.jpg" to="LA2-NSRW-1-0013.jpg" fromsection="Abalone" tosection="Abalone" />

The general pattern is


{{page|LA2-NSRW-[volume Arabic number]-[page scan number].jpg|section=[section]|num=[page number]}}

to


<pages index="The New Student's Reference Work/Vol [volume Roman number]" from="LA2-NSRW-[volume Arabic number]-[page scan number].jpg" to="LA2-NSRW-[volume Arabic number]-[page scan number].jpg" fromsection="[section]" tosection="[section]" />

There are 5 volumes. So, when you have time for this, I'd appreciate it! —CalendulaAsteraceae (talkcontribs) 22:16, 15 January 2022 (UTC)
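The general pattern above could be expressed roughly as follows. This is a sketch only: `convert_nsrw` and the `ROMAN` lookup table are hypothetical names, and the pattern assumes the `section` then `num` parameter order shown in the example.

```python
import re

# Arabic volume number in the file name -> Roman numeral in the index title.
ROMAN = {"1": "I", "2": "II", "3": "III", "4": "IV", "5": "V"}

NSRW_PAGE = re.compile(
    r'\{\{page\|(?P<file>LA2-NSRW-(?P<vol>[1-5])-\d+\.jpg)'
    r'\|section=(?P<section>[^|{}]+)\|num=[^|{}]*\}\}'
)


def convert_nsrw(wikitext):
    """Rewrite single NSRW {{page}} calls as <pages /> tags."""
    def repl(m):
        index = "The New Student's Reference Work/Vol " + ROMAN[m["vol"]]
        return (
            f'<pages index="{index}" from="{m["file"]}" to="{m["file"]}" '
            f'fromsection="{m["section"]}" tosection="{m["section"]}" />'
        )
    return NSRW_PAGE.sub(repl, wikitext)
```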

PetScan link for NSRW --CalendulaAsteraceae (talkcontribs) 07:29, 16 January 2022 (UTC)
Done.
If there is a way to query pages with only one single instance of "page" (or maybe one could start from PetScan query and do a prefiltering) for future runs, a possibility could be to:
1. create a new (temporary) template which implements the <pages> part starting from the parameters in page
2. run the following: https://github.com/wikimedia/pywikibot/blob/master/scripts/template.py#L27
so we could avoid the hassle of parsing params, different order of params in {{page}} etc.
Step 2 could also be broken down in two separate steps: 2a. template replacement, 2b. template substitution. Mpaa (talk) 15:51, 16 January 2022 (UTC)
@Mpaa: Thank you so much! If you started with the PetScan query and excluded results with either multiple instances of "page" on one line, or instances of "page" on two consecutive lines, I think that should filter the results appropriately. To be clear, the patterns I'm suggesting filtering out are
{{page|[anything]}}[anything]{{page|[anything]}}
and
{{page|[anything]}}
{{page|[anything]}}
The 1911 Encyclopædia Britannica PetScan still has about 900 results if you want to try filtering those pages to see if the filters work like I think they should. —CalendulaAsteraceae (talkcontribs) 03:59, 17 January 2022 (UTC)
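The two exclusion patterns could be checked with something like the sketch below. The function name `passes_filter` is hypothetical, and the patterns assume the template is always written in lower case as `{{page|`.

```python
import re

# Two {{page}} calls on the same line.
SAME_LINE = re.compile(r'\{\{page\|[^\n]*\}\}[^\n]*\{\{page\|')
# A {{page}} call at the end of one line and another at the start of the next.
CONSECUTIVE_LINES = re.compile(r'\{\{page\|[^\n]*\}\}[ \t]*\n[ \t]*\{\{page\|')


def passes_filter(wikitext):
    """True if the page uses {{page}} and matches neither exclusion pattern."""
    if "{{page|" not in wikitext:
        return False
    return not (SAME_LINE.search(wikitext) or CONSECUTIVE_LINES.search(wikitext))
```

As with the filters described above, two calls separated by a blank line or by other text would still pass, so results are worth spot-checking.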
Update: Category:Pages needing conversion is clear, and AFAICT all the works that can be converted to use <pages /> have been. —CalendulaAsteraceae (talkcontribs) 08:10, 25 January 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 21:51, 19 March 2022 (UTC)

Mark all pages in Index:Evgenii Zamyatin - We (Zilboorg translation).pdf as proofread

Per a message on my talk page, an unregistered user proofread all the pages in the Index. Can they be marked as proofread? Languageseeker (talk) 02:33, 5 February 2022 (UTC)

Done I made sample checks and seemed OK. Mpaa (talk) 19:18, 5 February 2022 (UTC)
Thanks! Languageseeker (talk) 22:16, 12 February 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 21:51, 19 March 2022 (UTC)

Mark all pages in Index:Memory; how to develop, train, and use it - Atkinson - 1919.djvu as proofread

Per a message on my talk page, an unregistered user proofread all the pages in the Index. Can they be marked as proofread? Seems ok. Languageseeker (talk)

Done Mpaa (talk) 21:17, 12 February 2022 (UTC)
Thanks! Languageseeker (talk) 22:16, 12 February 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 21:51, 19 March 2022 (UTC)

Please mark all pages in Index:Murder on the Links - 1985.djvu as proofread

As for the above two topics, could you please mark all not proofread pages in this index as proofread? Thanks, TeysaKarlov

Done Mpaa (talk) 21:41, 19 March 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 21:51, 19 March 2022 (UTC)

Mark all Pages in Index:Zakhar Berkut(1944).djvu as Proofread

Could someone please mark all the pages in Index:Zakhar Berkut(1944).djvu as proofread? They have been proofread by an anonymous volunteer, and looking through the book indicates a high level of proofreading. Many thanks. Languageseeker (talk) 02:53, 5 March 2022 (UTC)

Done Mpaa (talk) 17:22, 5 March 2022 (UTC)
Thank you. Languageseeker (talk) 04:13, 8 March 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 21:51, 19 March 2022 (UTC)

Mark all Pages in Index:Candide Smollett E. P. Dutton.djvu as Proofread

Could someone please mark all the pages in Index:Candide Smollett E. P. Dutton.djvu as proofread? They have been proofread by an anonymous volunteer, and looking through the book indicates a high level of proofreading. Many thanks. Languageseeker (talk) 04:14, 8 March 2022 (UTC)

Done Mpaa (talk) 16:50, 13 March 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 21:51, 19 March 2022 (UTC)

Make null edits to pages with DefaultSort errors

This section was archived on a request by: --Xover (talk) 10:36, 10 April 2022 (UTC)

There are a lot of pages with DefaultSort errors which don't show up in Category:Works with DefaultSort error because they haven't been edited since that category was added to MediaWiki:Duplicate-defaultsort. I've been able to find them by searching for "Default sort key", but it would be a lot easier if the pages showed up in the category, so I'd like to request null edits to all the pages that show up in this search (and page 2). Thank you! —CalendulaAsteraceae (talkcontribs) 06:36, 4 February 2022 (UTC)

Doing… Xover (talk) 13:16, 4 February 2022 (UTC)
@CalendulaAsteraceae: Done Xover (talk) 19:53, 4 February 2022 (UTC)
Thank you! —CalendulaAsteraceae (talkcontribs) 22:19, 4 February 2022 (UTC)

Middlemarch

This section was archived on a request by: --Mpaa (talk) 21:18, 30 April 2022 (UTC)

I am asking to move the page Middlemarch and all its subpages to Middlemarch (1971) to make place for a version page, as we have also another version Middlemarch (1874) --Jan Kameníček (talk) 16:20, 25 January 2022 (UTC)

Done, moved to Middlemarch (1871). Mpaa (talk) 14:50, 29 January 2022 (UTC)


Please mark all pages of Index:The Science of Getting Rich - Wattles - 1910.djvu as proofread

This section was archived on a request by: --Mpaa (talk) 21:18, 30 April 2022 (UTC)

Hello again. Another case of pages marked not proofread by user(s) without an account. Please mark all such pages as proofread. Also, is it possible for a bot to convert the cursive templates used in some places of this work to conventional italics? If so, please do that too. Thanks, TeysaKarlov (talk) 19:40, 29 March 2022 (UTC)

@Tylopous has already removed the cursive text and proofread this index. If both you and they are okay with it, maybe all the proofread pages could instead be marked as validated? Thanks, TeysaKarlov (talk) 05:44, 3 April 2022 (UTC)
Hi, well, I was quite careful in proofreading. Of course I can have missed something, but in principle all has now been proofread twice (with the exception of some frontmatter perhaps), as usual for validation.--Tylopous (talk) 06:01, 3 April 2022 (UTC)
Done Mpaa (talk) 20:24, 7 April 2022 (UTC)


Please mark all pages of Index:The Secret of Chimneys - 1987.djvu as proofread

This section was archived on a request by: --Mpaa (talk) 21:18, 30 April 2022 (UTC)

Another case of pages marked not proofread by user(s) without an account. Please mark all such pages as proofread. Languageseeker (talk) 08:58, 9 April 2022 (UTC)

Done Mpaa (talk) 16:09, 9 April 2022 (UTC)
@Mpaa Thank you! Languageseeker (talk) 16:22, 9 April 2022 (UTC)


Please mark all pages of the main text of Index:Anne of Avonlea (1909).djvu as proofread

This section was archived on a request by: --Mpaa (talk) 21:18, 30 April 2022 (UTC)

Another work proofread by user(s) without account(s). Thanks, TeysaKarlov (talk) 02:48, 17 April 2022 (UTC)

Done Mpaa (talk) 11:00, 17 April 2022 (UTC)

Please upgrade all not-proofread pages of Index:Through Bolshevik Russia - Snowden - 1920.djvu to proofread

This section was archived on a request by: --Mpaa (talk) 21:18, 30 April 2022 (UTC)

More contributions from user(s) without account(s). Thanks (and thanks for the above), TeysaKarlov

Never mind now, looks like someone beat you to it. TeysaKarlov (talk) 20:07, 20 April 2022 (UTC)

Please upgrade all not-proofread pages of Index:Anne of the Island (1920).djvu to proofread

This section was archived on a request by: --Mpaa (talk) 21:18, 30 April 2022 (UTC)

More contributions from user(s) without account(s). Thanks, TeysaKarlov

Done Mpaa (talk) 21:18, 30 April 2022 (UTC)
@Mpaa Thanks, and sorry that I missed changing the status of the cover and image pages. TeysaKarlov (talk) 02:40, 1 May 2022 (UTC)

Please upgrade all not-proofread pages of Index:Anne's house of dreams (1920 Canada).djvu to proofread

This section was archived on a request by: --Mpaa (talk) 21:11, 10 May 2022 (UTC)
Done

Please upgrade all not-proofread pages of Index:A Treatise on Painting.djvu to proofread

Unregistered user's work. I went through and checked for images (many thanks to @Sp1nd01 for adding them), so we should be clear to proceed. Thanks, TeysaKarlov (talk) 20:17, 16 June 2022 (UTC)

Done. Mpaa (talk) 19:58, 17 June 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 20:06, 4 July 2022 (UTC)

Please upgrade all not-proofread pages of Index:From Passion to Peace - Allen - 1910.djvu to proofread

Hi Mpaa, everything looks good to me to proceed with upgrading these pages. Is there something I am missing? Or is it just preferable for the person making the request to be different from the person who did the proofreading? Thanks, TeysaKarlov (talk) 03:08, 3 July 2022 (UTC)

@Mpaa Thanks! TeysaKarlov (talk) 20:55, 4 July 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 20:06, 4 July 2022 (UTC)

Please upgrade all not-proofread pages of Index:As a Man Thinketh - Allen - 1913.djvu to proofread

This section was archived on a request by: Mpaa (talk) 18:21, 11 August 2022 (UTC)

Thanks, TeysaKarlov (talk) 04:11, 5 June 2022 (UTC)

Done. Mpaa (talk) 12:34, 5 June 2022 (UTC)

@TeysaKarlov, @Mpaa: Did you notice that some pages of the work use wiki-formatting instead of the correct formatting; that the chapter pages were transcluded incorrectly, and that the IP who proofread the work added a poem to the front that does not appear in the scan? Given these issues, how certain are you that the IP has actually proofread the work? --EncycloPetey (talk) 18:16, 5 June 2022 (UTC)

@EncycloPetey Generally, I look over a few pages to see that the text looks like it matches, and see if e.g. they have italicised words that should be, have used nop's etc, which they had. I don't look over the whole work if the IP address looks familiar. As for your three issues, I am not sure what the incorrect formatting was, overall it seemed formatted fairly well, and I have most definitely seen plenty of works logged-in users marked as proofread that looked far worse (please clarify the formatting issue). I don't judge transclusion when making this bot request; as far as I was aware, that has no bearing on whether a page should be marked proofread or not proofread. As for an added poem, no I did not see this. Do you mean as part of the incorrect transclusion, or on one of the source pages (I don't see any of your edits having removed a poem). As of now, it also seems that about 7 pages have been validated with zero changes made to each by someone else (and one more by me just now, again without change), so overall, I would still say fairly confident that the work has been proofread (again, considering any transclusion issues to have no bearing on this). TeysaKarlov (talk) 20:17, 5 June 2022 (UTC)
I hope you have enjoyed reading this book 2001:4450:8156:4900:D8A7:A084:84BB:B3C0 20:59, 20 June 2022 (UTC)

Please convert any quotation marks/apostrophes from curly to straight in Index:The last man (Second Edition 1826 Volume 3).djvu

This section was archived on a request by: Mpaa (talk) 18:21, 11 August 2022 (UTC)

Only volume 3 is affected (all should be straight in vols. 1 and 2). Thanks, TeysaKarlov (talk) 21:10, 16 July 2022 (UTC)

@TeysaKarlov done. Mpaa (talk) 13:47, 18 July 2022 (UTC)

Template replacement in The Works of H G Wells Volume 5.pdf

This section was archived on a request by: Mpaa (talk) 18:21, 11 August 2022 (UTC)

Could all the instances of bar|3 be replaced with longdash and all curly quotes replaced with straight quotes in Index:The Works of H G Wells Volume 5.pdf. Languageseeker (talk) 13:49, 22 July 2022 (UTC)

@Languageseeker done. Mpaa (talk) 14:45, 22 July 2022 (UTC)
@Mpaa Thanks!
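The two replacements requested in this section could look roughly like the sketch below. It assumes the templates are written as `{{bar|3}}` and `{{longdash}}`, and the function name `clean_page` is hypothetical; it is not the code that was actually run.

```python
import re

# Curly double and single quotes mapped to their straight equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"', "\u201d": '"',
    "\u2018": "'", "\u2019": "'",
})
BAR3 = re.compile(r"\{\{\s*bar\s*\|\s*3\s*\}\}", re.IGNORECASE)


def clean_page(text):
    """Swap {{bar|3}} for {{longdash}} and straighten curly quotes."""
    text = BAR3.sub("{{longdash}}", text)
    return text.translate(CURLY_TO_STRAIGHT)
```

Going from curly to straight is a lossless character mapping, which is why it suits a bot better than the reverse direction.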

Index:The Works of H G Wells Volume 6.pdf

This section was archived on a request by: Mpaa (talk) 18:21, 11 August 2022 (UTC)

Could all the curly quotes be changed to straight quotes in this text. The proofreader is changing them manually, so I'm hoping to save them time. Languageseeker (talk) 13:15, 6 August 2022 (UTC)

@Languageseeker done. Mpaa (talk) 15:05, 6 August 2022 (UTC)

Please replace all instances of File:Harcourt, Brace and Co. logo (1922).png and File:Harcourt, Brace and company logo.png with File:Harcourt Brace & Co. logo.svg. The two pngs are of quite low quality and the SVG is an accurate representation of this publisher's logo. Languageseeker (talk) 16:46, 15 February 2022 (UTC)

@Languageseeker: Done Xover (talk) 06:41, 3 September 2022 (UTC)
This section was archived on a request by: --Xover (talk) 06:51, 3 September 2022 (UTC)

Batch Copy Pages

Can someone batch copy the pages from Page:The black man.djvu/7 to Page:The black man - his antecedents, his genius, and his achievements (IA blackmanantecede00browrich).pdf/11? While the two editions were printed from the same stereotype plates, the front matter and back matter differ. Stop at Page:The black man.djvu/291. Languageseeker (talk) 20:01, 16 February 2022 (UTC)

Not all the pages match 1:1, see Page:The black man.djvu/32 and Page:The black man - his antecedents, his genius, and his achievements (IA blackmanantecede00browrich).pdf/36 and following. Mpaa (talk) 10:46, 19 February 2022 (UTC)
@Mpaa I did not notice. Thanks for double-checking! Languageseeker (talk) 12:35, 19 February 2022 (UTC)
This section was archived on a request by: --Xover (talk) 06:50, 3 September 2022 (UTC)

Find and replace for Index:Castelvines y Monteses Translated.pdf

I've realized that what I thought was a gap to offset the characters' names from the text of the play was actually just a double space following periods. Given this, I'd like to request that all instances of {{gap|1em}} in the pages of Index:Castelvines y Monteses Translated.pdf be replaced with (a space). Thank you! —CalendulaAsteraceae (talkcontribs) 04:15, 10 April 2022 (UTC)

@CalendulaAsteraceae: Done Xover (talk) 10:12, 10 April 2022 (UTC)
This section was archived on a request by: --Xover (talk) 06:49, 3 September 2022 (UTC)

Default layout related inquiry

Is there a bot that could insert the {{default layout}} template set to Layout 4, in all the main namespace pages I created? — ineuw (talk) 10:08, 13 August 2022 (UTC)

This section was archived on a request by: --Xover (talk) 06:50, 3 September 2022 (UTC)

Please upgrade all non-proofread pages of Index:Calculus Made Easy.pdf to proofread

This section was archived on a request by: Mpaa (talk) 21:34, 6 September 2022 (UTC)

Non-registered user(s) work. Images for pages marked problematic are being added. Thanks, TeysaKarlov (talk) 20:24, 5 September 2022 (UTC)

Done Mpaa (talk) 21:34, 6 September 2022 (UTC)
@Mpaa In the math formulas, can you also replace the period between two numbers with an mdot (·)? For example, 1.0 should be 1·0. Languageseeker (talk) 21:02, 7 September 2022 (UTC)
@Languageseeker I'd like to hear other opinions. Using mdot instead of a decimal point seems like one of those cases where we just mimic the exact formatting (IMO at the expense of readability). Mpaa (talk) 08:27, 11 September 2022 (UTC)

Clean up of Index:The Education of Henry Adams (1907).djvu

This section was archived on a request by: Mpaa (talk) 19:33, 21 September 2022 (UTC)

My main request regarding this index is that any leading line breaks on pages, e.g. Page:The Education of Henry Adams (1907).djvu/365, are removed, as I presume a bot could do this without human interference (there are no ref follows, so the text should always be the first thing on each page). I am not sure if bots can search for when to add the nop template, or for other things like removing spaces before semi-colons, and possibly adding em dashes (the last I doubt), but whatever can be done would be nice. Thanks, TeysaKarlov (talk) 23:49, 17 September 2022 (UTC)
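The two mechanical parts of this request, leading line breaks and spaces before semicolons, could be handled along these lines. The sketch assumes it is given only the page body (not the header or footer), and the name `tidy_page_body` is hypothetical.

```python
import re


def tidy_page_body(text):
    """Drop leading line breaks and any spaces or tabs before a semicolon."""
    text = text.lstrip("\r\n")            # leading blank lines only
    text = re.sub(r"[ \t]+;", ";", text)  # "word ;" -> "word;"
    return text
```

Deciding where a nop template or an em dash belongs depends on the scan, so those are left out here.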

@TeysaKarlov I did some clean up. Not all of the above. Mpaa (talk) 21:55, 19 September 2022 (UTC)
@Mpaa Any help is appreciated. It wasn't something I had a vested interest in, but just thought needed a little more before being marked proofread. Thanks, TeysaKarlov (talk) 19:57, 20 September 2022 (UTC)
@TeysaKarlov Missing em dashes is the major thing left. Mpaa (talk) 20:04, 20 September 2022 (UTC)
Em dashes have been added. Mpaa (talk) 19:33, 21 September 2022 (UTC)

Request:categorize EU Laws

This section was archived on a request by: Mpaa (talk) 19:59, 21 September 2022 (UTC)

I would like to see pages in Category:European Union Laws which have titles containing "decisions" classified to Category:Decisions of the European Union and pages which have titles containing "regulations" grouped into Category:Regulations of the European Union.--Johnson.Xia (talk) 17:35, 3 June 2022 (UTC)
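The title-based rule could be sketched as below. The function name `target_category` is hypothetical, and matching the singular stems "decision" and "regulation" case-insensitively is an assumption that goes slightly beyond the plural forms quoted in the request.

```python
def target_category(title):
    """Pick the more specific EU category from a page title, if any."""
    lowered = title.lower()
    if "decision" in lowered:
        return "Category:Decisions of the European Union"
    if "regulation" in lowered:
        return "Category:Regulations of the European Union"
    return None  # leave the page in Category:European Union Laws
```

A title containing both words would go to the decisions category under this ordering, so such pages would need a manual look.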

@Johnson.Xia It seems resolved to me. Is that right? Mpaa (talk) 21:37, 6 September 2022 (UTC)
Yes, it is. Thanks for your notice!
PS: I would like to categorize Official Journals of EU as well. Johnson.Xia (talk) 21:53, 6 September 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 15:34, 30 October 2022 (UTC)

Bot request for The Adventures of Pinocchio

done Mpaa (talk) 16:16, 8 October 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 15:34, 30 October 2022 (UTC)

Bot request for The story girl

done Mpaa (talk) 18:05, 9 October 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 15:34, 30 October 2022 (UTC)

Bot request for The Game of Life

done. Mpaa (talk) 10:19, 30 October 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 15:34, 30 October 2022 (UTC)

Bot request for The Grumbling Hive

Done. Mpaa (talk) 15:34, 30 October 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 15:34, 30 October 2022 (UTC)

Please convert curly quotes to straight quotes

Done. Mpaa (talk) 10:09, 30 October 2022 (UTC)

The files in this category have a {{Do not move to Commons}} expiry date of 2033, but it should be 2032, because James Francis Horrabin was a British illustrator who died in 1962, and the {{Do not move to Commons}} expiry date is the last year the file cannot be on Commons. There are a lot of off-by-one errors in usages of this template, and I'm working on correcting them. This particular case should be pretty straightforward to fix with a bot: just replace expiry=2033 with expiry=2032 for all pages in this category. Thanks in advance! —CalendulaAsteraceae (talkcontribs) 22:36, 7 November 2022 (UTC)
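The arithmetic and the replacement can be sketched as follows. The 70-year term is an assumption that matches the figures in this request (1962 + 70 = 2032), and the names `last_protected_year` and `fix_expiry` are hypothetical.

```python
import re

EXPIRY = re.compile(
    r"(\{\{\s*Do not move to Commons\s*\|[^{}]*?\bexpiry\s*=\s*)(\d{4})",
    re.IGNORECASE,
)


def last_protected_year(death_year, term_years=70):
    """Last calendar year in which the work is still protected."""
    return death_year + term_years


def fix_expiry(wikitext, wrong, right):
    """Change expiry=<wrong> to expiry=<right> inside the template only."""
    def repl(m):
        year = m.group(2)
        return m.group(1) + (str(right) if int(year) == wrong else year)
    return EXPIRY.sub(repl, wikitext)
```

Restricting the match to the template call keeps any other occurrence of "expiry=2033" on the page untouched.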

Done. Mpaa (talk) 21:09, 8 November 2022 (UTC)
Thank you! —CalendulaAsteraceae (talkcontribs) 04:17, 9 November 2022 (UTC)
This section was archived on a request by: —CalendulaAsteraceae (talkcontribs) 00:44, 13 November 2022 (UTC)

Fixing broken refs imported by BenchBot

This section was archived on a request by: Mpaa (talk) 19:39, 27 November 2022 (UTC)

Back when Wikisource was young and full of promise, BenchBot imported a load of USSC cases. Unfortunately, it didn't properly convert the citations, so now there's a bunch of them sitting in Category:Pages with reference errors (232).

It looks like the majority of the problems stem from stupidly accepting the footnote numbering from whatever website they were imported from, and that numbering resets midway through each case leading to duplicate ref numbers. The fix is to just renumber them sequentially: it risks misnumbering any cites that were out of order in the original, but will clear up the big red error messages and will lead to far fewer total errors on these pages (so the rest can be fixed manually). (example edit)

I don't think this is a complicated task for someone well-versed in PWB (I could be wrong), but it does definitely require coding a custom bot and a bespoke parser for page content (line-by-line regex is probably good enough) to fix. Any takers? --Xover (talk) 20:07, 18 October 2022 (UTC)

@Xover Is this OK? [1], [2] Mpaa (talk) 22:08, 18 October 2022 (UTC)
@Mpaa: Perfect! And I've got to say I'm impressed with how quickly you put that together. I don't think I could have done it that fast even were the environment one I was familiar with (I'm a native Perl speaker, so Python always feels kinda alien, and PWB as a framework is… daunting). Xover (talk) 04:02, 19 October 2022 (UTC)
Hi. I fixed only pages with BenchBot as contributor. Many pages are left, e.g. those where at least one ref is not in the form <ref name="ref[digits]"/>; I did nothing on them. I also skipped pages with an unbalanced number of <ref name="ref[digits]"/> entries vs <ref name="ref[digits]"> entries at the bottom. So in total about 400-500 pages were processed. Mpaa (talk) 20:36, 20 October 2022 (UTC)
@Mpaa: That's awesome. Thanks!
BenchBot generally did a pretty poor job on references so there are multiple problems with these pages. For example, several of them that I randomly checked had footnote markers without an actual corresponding footnote. So they're going to end up somewhat broken no matter what we do. I'm just hoping to get them to a state where they no longer spew big red error messages at our readers (and don't fill up the maintenance categories, hiding other problems). Xover (talk) 05:54, 21 October 2022 (UTC)
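The sequential renumbering discussed in this section can be illustrated with the sketch below. It is not the script that was actually run: the name `renumber_refs` is hypothetical, and it assumes every ref on the page has the form `<ref name="ref[digits]"/>` in the text with a matching `<ref name="ref[digits]">` definition at the bottom. Pages where the two counts differ are skipped, mirroring the skip rule described above.

```python
import re

MARKER = re.compile(r'<ref name="ref\d+"\s*/>')     # self-closing, in the text
DEFINITION = re.compile(r'<ref name="ref\d+"\s*>')  # opening tag of a footnote


def renumber_refs(wikitext):
    """Number markers and definitions 1..n in order of appearance."""
    n_markers = len(MARKER.findall(wikitext))
    n_defs = len(DEFINITION.findall(wikitext))
    if n_markers == 0 or n_markers != n_defs:
        return None  # unbalanced or nothing to do; skip the page

    def numbered(template):
        counter = iter(range(1, n_markers + 1))
        return lambda m: template.format(next(counter))

    text = MARKER.sub(numbered('<ref name="ref{}"/>'), wikitext)
    return DEFINITION.sub(numbered('<ref name="ref{}">'), text)
```

As noted in the request, this pairs the n-th marker with the n-th footnote, so any cites that were out of order in the original end up misnumbered.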
This section was archived on a request by: Mpaa (talk) 19:39, 27 November 2022 (UTC)

There were some unnumbered pages missing in the initial scan that I didn't notice at first. Would it be possible to move all pages (not including front matter) forward 7 pages? E.g. Page:Middle Aged Love Stories (IA middleagedlove00bacorich).djvu/48 -> Page:Middle Aged Love Stories (IA middleagedlove00bacorich).djvu/55 --Beleg Tâl (talk) 15:28, 12 November 2022 (UTC)
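A shift like this is usually planned from the last page backwards so that no move lands on a title that is still occupied. The sketch below only builds the list of moves; the name `plan_shift` is hypothetical, and the last page number passed in is an assumption the caller has to supply.

```python
def plan_shift(index_file, first, last, offset):
    """Return (old_title, new_title) pairs in a collision-free order."""
    if offset == 0:
        return []
    numbers = range(first, last + 1)
    # Moving forward: start from the highest page so targets are free.
    ordered = reversed(numbers) if offset > 0 else numbers
    return [
        (f"Page:{index_file}/{n}", f"Page:{index_file}/{n + offset}")
        for n in ordered
    ]
```

For example, `plan_shift("Middle Aged Love Stories (IA middleagedlove00bacorich).djvu", 48, 50, 7)` lists page 50 first and ends with the 48 to 55 move quoted above.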

@Beleg Tâl done. Mpaa (talk) 15:44, 12 November 2022 (UTC)
@Mpaa: thank you! —Beleg Tâl (talk) 15:45, 12 November 2022 (UTC)
This section was archived on a request by: Mpaa (talk) 19:39, 27 November 2022 (UTC)

Please convert any remaining straight quotes to curly quotes, as per discussion page. There shouldn't be too many, but still a few more than I am inclined to do manually. Thanks, TeysaKarlov (talk) 23:06, 26 November 2022 (UTC)

These are the remaining straight double quotes and these are the remaining straight single quotes. —CalendulaAsteraceae (talkcontribs) 23:36, 26 November 2022 (UTC)
I did the single quotes, for double it is a bit trickier. Mpaa (talk) 11:00, 27 November 2022 (UTC)
Done. Mpaa (talk) 15:24, 27 November 2022 (UTC)
Thanks both! TeysaKarlov (talk) 19:13, 27 November 2022 (UTC)

Would it be possible to remove all internal links like e.g. Simplified Chinese character from Translation:List of Frequently Used Characters in Modern Chinese? Links to Wiktionary or Wikipedia can possibly be preserved, though I would not object to their removal from this unannotated version either. --Jan Kameníček (talk) 14:08, 18 December 2022 (UTC)

@Jan.Kamenicek: We can use Pathoschild's gadget regex editor tool for text replacements especially for one page like that. It is a good tool to master as it is really useful for regular maintenance. — billinghurst sDrewth 06:07, 11 April 2023 (UTC)
Done
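A regex replacement of the kind discussed here could look like the sketch below, shown in Python rather than in the regex editor gadget. The name `strip_internal_links` and the list of prefixes to keep are assumptions: `w:` and `wikt:` stand for Wikipedia and Wiktionary links, and category and file links are kept so they are not turned into plain text.

```python
import re

# Link targets starting with these prefixes are left as they are (assumption).
KEEP_PREFIXES = ("w:", "wikt:", "category:", "file:")
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def strip_internal_links(wikitext):
    """Replace [[target|label]] with label (or target) for internal links."""
    def repl(m):
        target, label = m.group(1), m.group(2)
        if target.strip().lower().startswith(KEEP_PREFIXES):
            return m.group(0)
        return label if label else target
    return LINK.sub(repl, wikitext)
```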
This section was archived on a request by: — billinghurst sDrewth 06:13, 11 April 2023 (UTC)