Wikisource:Bot requests

Bot requests
This page allows users to request that an existing bot accomplish a given task. Note that some tasks may require that an entirely new bot or script be written. This is not the place to ask for help running or writing a bot.

A bot operator performing a task should make a note of it so that other bots don't attempt to do the same. Tasks that are permanently assigned or scheduled for long-term execution are listed on Persistent tasks.

Unassigned requests

Alphabetical work index

Per Wikisource talk:Works#Work index revision, we are attempting to create a new, alphabetical work index along the lines of Wikisource:Authors. Could a bot create the "A" page, please?

Header: {{work index page|A}} (the template has not been created yet).

Then there should be several headings like this:

== Aa ==


== Ab ==


Under each heading there should be a list of all pages whose title begins with those two letters, excluding subpages and pages in Category:Mainspace disambiguation pages, Category:Case disambiguation pages, Category:Versions pages, and Category:Translations pages. After the title, the categories of each page should be listed in parentheses, for example:

I hope I was clear enough. Thanks in advance.--Erasmo Barresi (talk) 09:24, 19 June 2013 (UTC)
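The grouping described in this request can be sketched in plain Python. This is only an illustration, not an existing bot: the input shape (a dict mapping each title to its list of categories), the function name `build_index`, and the `* [[Title]] (categories)` entry format are assumptions, since the request's own example entry did not survive on this page.

```python
from collections import defaultdict

# Categories named in the request whose members are left out of the index.
EXCLUDED = {
    "Mainspace disambiguation pages",
    "Case disambiguation pages",
    "Versions pages",
    "Translations pages",
}

def build_index(pages, letter):
    """Build wikitext for one index page.

    pages: dict mapping a page title to its list of category names
           (hypothetical input shape, e.g. gathered beforehand by a bot).
    letter: the index letter, e.g. "A".
    """
    groups = defaultdict(list)
    for title, cats in pages.items():
        if "/" in title:                  # subpages are excluded
            continue
        if EXCLUDED & set(cats):          # disambiguation, versions, translations
            continue
        if len(title) < 2 or title[0].upper() != letter.upper():
            continue
        groups[title[:2].capitalize()].append((title, cats))

    lines = ["{{work index page|%s}}" % letter.upper()]
    for key in sorted(groups):
        lines.append("")
        lines.append("== %s ==" % key)
        for title, cats in sorted(groups[key]):
            # Entry format is an assumption; categories go in parentheses.
            suffix = " (%s)" % ", ".join(cats) if cats else ""
            lines.append("* [[%s]]%s" % (title, suffix))
    return "\n".join(lines)
```

Titles whose second character is a space (for example those starting with the article "A ") would get their own heading under this sketch; how to treat them is left open in the request.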

Categorizing U.S. Supreme Court decision redirect pages

In the manner of w:Category:United States Supreme Court cases redirects, could a bot create Category:United States Supreme Court cases redirects here and add all pages like 102 U.S. 278 to it, so as to keep all such pages in one place? It Is Me Here t / c 14:18, 28 May 2014 (UTC)

A template like in Wikisource:Redirects#Redirect_sorting should be created first. What is the source Category or how can such pages be identified?--Mpaa (talk) 19:21, 28 May 2014 (UTC)
Curious as to what the point of this would be. "So as to keep all such pages in one place" doesn't tell me much. Hesperian 02:24, 29 May 2014 (UTC)
Same here. Plus I don't consider a redirect a "page" in and of itself, so tracking them seems redundant more often than not imho. They are just 'alternative' titles to some other "true" page.

In this case, these reporter case citations are (for the most part) listed along with the associated case-name on the volume page. 102 U.S. 278 is page 278 of volume 102, so the citation would be listed on United States Reports/Volume 102. Would that be in line with "... keep[ing] all such citations in one place"? -- George Orwell III (talk) 03:34, 29 May 2014 (UTC)

Well, for instance, if the members of the category were listed there alphabetically, you could, at a glance, see whether any were missing (e.g. if it went "270, 272, 273", ...), rather than having to look through hundreds of subpages. Similarly, if (at some point) the decision were taken to e.g. delete all such redirects, you could, by glancing at the Category page, see how many were left, rather than, again, having to browse through hundreds of subpages of United States Reports. It Is Me Here t / c 18:02, 31 May 2014 (UTC)
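The gap check described here (noticing that 271 is absent from "270, 272, 273") could also be scripted rather than done by eye. A minimal sketch, assuming redirect titles follow the "102 U.S. 278" pattern and that the list of expected starting pages comes from the volume page; the function name and both inputs are hypothetical.

```python
import re

# Matches reporter citations such as "102 U.S. 278".
CITATION = re.compile(r"^(\d+) U\.S\. (\d+)$")

def missing_redirects(existing_titles, expected_pages, volume):
    """Return the expected page numbers that have no redirect in this volume."""
    present = set()
    for title in existing_titles:
        m = CITATION.match(title)
        if m and int(m.group(1)) == volume:
            present.add(int(m.group(2)))
    return sorted(set(expected_pages) - present)
```

For example, `missing_redirects(["102 U.S. 270", "102 U.S. 272", "102 U.S. 273"], [270, 271, 272, 273], 102)` reports page 271 as missing.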

Punctuation typography stickler


i would like all existing pages of Index:Sacred Books of the East - Volume 6.djvu and Index:Sacred Books of the East - Volume 9.djvu to undergo the following tedious treatment:

  1. Replace all &acirc; with â, &Acirc; with Â, &icirc; with î, &Icirc; with Î, &ucirc; with û, &Ucirc; with Û, &mdash; with "—" (Perhaps there is already a bot which specializes in converting HTML entities ?)
  2. replace all occurrences of " !" by "{{ep}}" (watch the space before the exclamation point), then all "!" by the same ; then undo some of those by replacing "</ref>{{ep}}" back to "</ref>!".
  3. do the same with " ?" by "{{qm}}", then all "?" (without the preceding space).
  4. Replace " — " (em dash preceded and followed by a space) by "{{—}}"
  5. " :" with "{{':}}", then again for ":", then undo the overkill "{{'{{':}}}}" back to "{{':}}"
  6. If (1) above has been completed, then " ;" with "{{;}}", then again for ";", then undo the overkill "{{{{;}}}}" back to "{{;}}"

Thanks in advance ? --Jerome Charles Potts (talk) 18:09, 15 June 2014 (UTC)
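For reference, the order of operations in items 2 and 3 matters: the spaced form has to be replaced before the bare one, and the `</ref>` case restored last. A minimal sketch of that ordering only, using plain string replacement; the function name is hypothetical, and item 3 is implemented exactly as written (without a `</ref>` undo step, which the request does not mention for question marks).

```python
def apply_spacing_templates(text):
    """Apply items 2 and 3 of the request in the stated order."""
    # Item 2: spaced exclamation marks first, then the remaining bare ones...
    text = text.replace(" !", "{{ep}}")
    text = text.replace("!", "{{ep}}")
    # ...then undo the cases that directly follow a footnote.
    text = text.replace("</ref>{{ep}}", "</ref>!")
    # Item 3: the same two-pass replacement for question marks.
    text = text.replace(" ?", "{{qm}}")
    text = text.replace("?", "{{qm}}")
    return text
```

A blanket replacement like this would also touch "!" in wikitext comments (`<!-- -->`) and table header syntax, so any real run would need to skip those contexts.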

Gday @Jerome Charles Potts:.
  1. We have no bot that does it, as it is legitimate display. It displays fine, what is the issue? Where I have replacements that I wish to undertake regularly, I write them into my custom regex rules and (auto)replace at the time of proofreading
  2. No, and in my opinion the template should not have been created, and we definitely do not want to use it. In fact, why are you creating templates that are not in line with our existing practice?
  3. as above
  4. not one that we would recommend, it is usually the reverse in that there is no spacing either side of em dashes. Poetry of course being a rule to itself.
  5. there is no way that I am going to utilise {{':}}, etc. as a template (that is just ugly IMNSHO, and unfortunate that they have been created).
  6. as above
I can run replacements that remove some of the spaces that exist before punctuation marks. We do try to keep it simple, and not get captured by a typographical moment in time. The words are king. — billinghurst sDrewth 11:51, 17 June 2014 (UTC)
@Billinghurst: Because it reads better: clearer, fluent, not stumbling to decode a pile of glyphs. --Jerome Charles Potts (talk) 22:42, 19 June 2014 (UTC)
Some examples of the problematic pages can be useful. So you are indicating that from your first point the OCR has resulted in the html named entity components being present. If that is the case, then we can review and can understand better why that part of the request, though to note that all text layers needing to be present in the namespace. Would also need a list of the codes you want replaced. That still doesn't address the remainder which complicates the proofreading of the works as it is not the way that we have been handling other works of that period. — billinghurst sDrewth 04:52, 20 June 2014 (UTC)
The OCR provided either unaccented characters or an error such as a "t" for a "û". A previous editor corrected some of those by using those HTML character entities, in pages index="Sacred Books of the East - Volume 6.djvu" from=15 to=124 specifically. The reason why i asked them to be converted is as a preparation to item number 6 (which i should've listed as number 2).
My reply above about clarity is actually about the remainder, items 2 to 6, which are about discreet spacing around some punctuation. Yes, it makes a complicated source code, which i don't see complicating proofreading, quite the opposite in my opinion, since the rendering ends up clearer. --Jerome Charles Potts (talk) 10:22, 20 June 2014 (UTC)
I did point 1. Very few pages and straightforward. HTML made proofreading hard. Just a comment: using characters like "ⅼⅰ" does not help IMO ...--Mpaa (talk) 20:21, 20 June 2014 (UTC)
Thank you much. As concerns the use of roman numerals unicode characters, my feeling is that even though they look ugly in the default font of my web browser on my computer, some time in the future it may be valuable to be able to search through text for roman numeration. So, i'm for it, but without having read anything about the possible debate they at Unicode had over the creation of those characters. --Jerome Charles Potts (talk) 04:12, 22 June 2014 (UTC)
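Point 1, the only part of the request that was carried out, amounts to converting a fixed set of HTML named entities to the characters themselves. A minimal sketch follows; it is not the code that was actually run. It assumes the entities were the named forms (`&acirc;` and so on), and it deliberately avoids a blanket `html.unescape`, which would also convert `&amp;`, `&lt;` and similar entities that wikitext may need to keep.

```python
import re
from html.entities import name2codepoint

# Entity names inferred from the characters listed in item 1 (an assumption).
TARGETS = {"acirc", "Acirc", "icirc", "Icirc", "ucirc", "Ucirc", "mdash"}
ENTITY = re.compile(r"&([A-Za-z]+);")

def convert_entities(text, names=TARGETS):
    """Replace only the listed named entities with their Unicode characters."""
    def repl(match):
        name = match.group(1)
        if name in names:
            return chr(name2codepoint[name])
        return match.group(0)          # leave every other entity untouched
    return ENTITY.sub(repl, text)
```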

Bulk move request

Request: bulk move of pages.

Start: Page:Wisconsin Rapids directory (1921).djvu/31
End: Page:Wisconsin Rapids directory (1921).djvu/379
New start: Page:Wisconsin Rapids directory (1921).djvu/39

(When moving, invalidate the proofreading status of all pages moved.)

The file was recently patched at Commons, hence this request. ShakespeareFan00 (talk) 12:33, 23 August 2014 (UTC)

Done, except for the change of proofreading state.--Mpaa (talk) 16:34, 23 August 2014 (UTC)
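A shift like this one (31 to 39, i.e. +8) needs the moves done in a safe order: when pages move upward, the highest page has to go first so that no target title is still occupied. The sketch below only computes the move plan; it is an illustration under that assumption, not the procedure the bot actually followed, and `plan_moves` is a hypothetical name.

```python
def plan_moves(index_title, start, end, new_start):
    """Return (old_title, new_title) pairs in an order that avoids collisions."""
    offset = new_start - start
    pages = range(start, end + 1)
    # Shifting upward: start from the last page. Shifting downward: from the first.
    ordered = reversed(pages) if offset > 0 else pages
    return [
        ("Page:%s/%d" % (index_title, n), "Page:%s/%d" % (index_title, n + offset))
        for n in ordered
    ]
```

For this request the plan has 349 moves, beginning with /379 to /387 and ending with /31 to /39.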

Work retitle

Index:Botanistsguidet00wauggoog.djvu to Index:The Botanist's Guide Through the Counties of Northumberland and Durham (Vol 1).djvu, filename updated at Commons. ShakespeareFan00 (talk) 22:51, 26 August 2014 (UTC)

If you want to have the first Google page removed, you'd better do it before we move everything. I'll wait until I get an answer here.--Mpaa (talk) 16:50, 27 August 2014 (UTC)
Ah.. That was a consideration. The Google front page, should be a blank (in terms of page namespace) anyway. ShakespeareFan00 (talk) 17:10, 27 August 2014 (UTC)
Not an issue at the moment. Move away! ShakespeareFan00 (talk) 17:54, 27 August 2014 (UTC)
Done. Moved, but there is a problem. The djvu file is actually Vol.1 + Vol.2, see Page:The_Botanist's_Guide_Through_the_Counties_of_Northumberland_and_Durham_(Vol_1).djvu/148. I realized it only after the move was done, while looking at the pagelist and doing random checks. Your call on what to do next.--Mpaa (talk) 21:06, 27 August 2014 (UTC)
Not sure... Ask for a wider opinion as I'm not sure how multi volume in one scan set is done here ShakespeareFan00 (talk) 22:30, 27 August 2014 (UTC)
Don't know, but where the name was cryptic before, it might now be misleading. Maybe not an issue.--Mpaa (talk) 22:36, 27 August 2014 (UTC)

Assigned requests

For noting {{PD-old}}

A quick note to say that I have been running Sdrewthbot through Author: ns pages for those authors listed as dying between 1880 and 1912, either adding the licence where a page has none, or converting any licence that is not PD-old to that licence. — billinghurst sDrewth 07:16, 1 January 2013 (UTC)

Migrating from {{edition}} to edition = y

There are 4k+ pages that transclude {{edition}}. I am considering using my bot to run through and rm the template and add the parameter line edition = y to those pages' {{header}}. Checking the attitude of the community to this proposed bot run. — billinghurst sDrewth

Just been doing some tests and found at least one anomaly. Found that Gettysburg Address had {{versions}} and {{edition}}, which would seem to be a dichotomy. This is probably a case of the main ns page being moved and the talk page being left behind. Initially I updated the versions template, which I quickly rolled back for the preceding reason. I would think that there will probably be similar situations with {{disambiguation}}, in which I would think that the same logic of the display of edition would be incongruous. I will still undertake their replacement to the edition parameter, however they will not display unless we choose to make them display by updating the templates, and this will also apply to these subsidiary templates of header (about 50-70 templates). billinghurst sDrewth 06:19, 27 January 2013 (UTC)
Though not documented, found that {{edition}} takes the parameter title= as used in National Geographic Magazine/Volume 31/Number 6/Our State Flowers/The Mountain Laurel. Skipping these for example work, and probably in any first batch of the replacement run. — billinghurst sDrewth 07:25, 27 January 2013 (UTC)
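The per-page edit proposed in this section can be sketched as a text transformation: remove a bare {{edition}} and add an edition = y line to {{header}}, skipping pages where {{edition}} carries parameters such as title=. This is an illustration only, not the code used for the run; the function name is hypothetical, and it assumes a multi-line {{header}} call whose opening line ends right after the template name.

```python
import re

# {{edition}} with an optional parameter part, e.g. {{edition|title=...}}.
EDITION = re.compile(r"\{\{\s*edition\s*(\|[^{}]*)?\}\}", re.IGNORECASE)
# Opening line of a multi-line {{header}} call (assumed layout).
HEADER_OPEN = re.compile(r"\{\{\s*header\s*\n", re.IGNORECASE)

def migrate_edition(text):
    """Move the edition flag from {{edition}} into {{header}}."""
    m = EDITION.search(text)
    if m is None or m.group(1):
        # No template, or it has parameters (title= etc.): leave the page alone.
        return text
    h = HEADER_OPEN.search(text)
    if h is None:
        return text
    # Add the parameter as the first line inside the header call.
    text = text[:h.end()] + " | edition = y\n" + text[h.end():]
    # Remove the first {{edition}} occurrence.
    return EDITION.sub("", text, count=1)
```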

Notification of forthcoming job — Category:No reference tag

A while back I did some tagging of reference errors, and the WMF task bot has finally caught up. For the main namespace pages in Category:No reference tag, I am planning to append {{smallrefs}} to the pages, either at the end if the page is just a transclusion, or before a copyright tag if one exists on the page. Other pages in the category will just be skipped at this time, and will be revisited. If there is any reason to hold off, or requests, or suggestions, please let me know. — billinghurst sDrewth 13:07, 17 February 2013 (UTC)

Should it be also before Categories, if any?--Mpaa (talk) 13:21, 17 February 2013 (UTC)
My mental + blah blah didn't make it to the page. Caught in an iterative loop elsewhere. Yes, that was my meaning. — billinghurst sDrewth 13:35, 17 February 2013 (UTC)
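The placement rule agreed here ({{smallrefs}} before a copyright tag or categories if present, otherwise at the end) can be sketched as follows. This is a hedged illustration, not the bot's code: the function name is hypothetical, and it assumes copyright tags are templates starting with "PD-" on a line of their own, which will not cover every licence template.

```python
import re

# Assumed shape of a copyright tag: a PD-* template alone on its line.
LICENSE = re.compile(r"^\{\{\s*PD-[^{}]*\}\}\s*$", re.MULTILINE)
# A category link alone on its line.
CATEGORY = re.compile(r"^\[\[\s*Category\s*:[^\]]*\]\]\s*$", re.MULTILINE)

def add_smallrefs(text):
    """Insert {{smallrefs}} before the first licence tag or category, else append."""
    if "{{smallrefs" in text.lower():
        return text                      # already present
    hits = [m.start() for m in (LICENSE.search(text), CATEGORY.search(text)) if m]
    if hits:
        cut = min(hits)
        return text[:cut] + "{{smallrefs}}\n" + text[cut:]
    return text.rstrip("\n") + "\n{{smallrefs}}\n"
```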

Substituting out diacritic templates for the actual diacritic

I have started to substitute diacritic templates for the actual diacritic. Sample edits can be seen in my contribs. If there are no objections, I am going to continue the clean-up. Bye--Mpaa (talk) 14:43, 29 June 2014 (UTC)

Some more done as User:MpaaBot--Mpaa (talk) 19:12, 3 July 2014 (UTC)
Done.--Mpaa (talk) 09:43, 11 August 2014 (UTC)