Wikisource:Bot requests

This page allows users to request that an existing bot accomplish a given task. Note that some tasks may require that an entirely new bot or script be written. This is not the place to ask for help running or writing a bot.

A bot operator performing a task should make a note of it so that other bots don't attempt to do the same. Tasks that are permanently assigned or scheduled for long-term execution are listed on Persistent tasks.


Unassigned requests

Alphabetical work index

Per Wikisource talk:Works#Work index revision, we are attempting to create a new, alphabetical work index along the lines of Wikisource:Authors. Could a bot create the "A" page, please?

Header: {{work index page|A}} (the template has not been created yet).

Then there should be several headings like this:

== Aa ==


== Ab ==


Under each heading there should be a list of all pages whose title begins with those two letters, excluding subpages and pages in Category:Mainspace disambiguation pages, Category:Case disambiguation pages, Category:Versions pages, and Category:Translations pages. After the title, the categories of each page should be listed in parentheses, for example:

I hope I was clear enough. Thanks in advance.--Erasmo Barresi (talk) 09:24, 19 June 2013 (UTC)
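
The grouping described in this request could be sketched roughly as follows. This is only an illustration under stated assumptions: the input shape (a mapping from title to category names without the "Category:" prefix), the function names, and the `* [[Title]] (categories)` line format are all hypothetical, since the example listing is missing above and {{work index page}} does not exist yet.

```python
from collections import defaultdict

def build_index_sections(pages, excluded_categories):
    """Group page titles by their first two letters.

    `pages` maps a title to its list of category names (hypothetical input shape).
    Subpages (titles containing "/") and pages in an excluded category are skipped.
    """
    sections = defaultdict(list)
    for title, categories in pages.items():
        if "/" in title:
            continue  # subpage
        if any(cat in excluded_categories for cat in categories):
            continue  # disambiguation, versions or translations page
        sections[title[:2].capitalize()].append((title, categories))
    return dict(sections)

def render_index(letter, sections):
    """Render the wikitext for one letter page (the line format is an assumption)."""
    lines = ["{{work index page|%s}}" % letter, ""]
    for key in sorted(sections):
        lines.append("== %s ==" % key)
        for title, categories in sorted(sections[key]):
            suffix = " (%s)" % ", ".join(categories) if categories else ""
            lines.append("* [[%s]]%s" % (title, suffix))
        lines.append("")
    return "\n".join(lines)
```

Titles whose second character is a space or punctuation would need extra handling that this sketch does not attempt.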

Categorizing U.S. Supreme Court decision redirect pages

In the manner of w:Category:United States Supreme Court cases redirects, could a bot create Category:United States Supreme Court cases redirects here and add all pages like 102 U.S. 278 to it, so as to keep all such pages in one place? It Is Me Here t / c 14:18, 28 May 2014 (UTC)

A template like in Wikisource:Redirects#Redirect_sorting should be created first. What is the source Category or how can such pages be identified?--Mpaa (talk) 19:21, 28 May 2014 (UTC)
Curious as to what the point of this would be. "So as to keep all such pages in one place" doesn't tell me much. Hesperian 02:24, 29 May 2014 (UTC)
Same here. Plus I don't consider a redirect a "page" in and of itself, so tracking them seems redundant more often than not imho. They are just 'alternative' titles to some other "true" page.

In this case, these reporter case citations are (for the most part) listed along with the associated case-name on the volume page. 102 U.S. 278 is page 278 of volume 102, so the citation would be listed on United States Reports/Volume 102. Would that be in line with "... keep[ing] all such pages citations in one place"? -- George Orwell III (talk) 03:34, 29 May 2014 (UTC)

Well, for instance, if the members of the category were listed there alphabetically, you could, at a glance, see whether any were missing (e.g. if it went "270, 272, 273", ...), rather than having to look through hundreds of subpages. Similarly, if (at some point) the decision were taken to e.g. delete all such redirects, you could, by glancing at the Category page, see how many were left, rather than, again, having to browse through hundreds of subpages of United States Reports. It Is Me Here t / c 18:02, 31 May 2014 (UTC)

Punctuation typography stickler


i would like all existing pages of Index:Sacred Books of the East - Volume 6.djvu and Index:Sacred Books of the East - Volume 9.djvu to undergo the following tedious treatment:

  1. Replace all &acirc; with â, &Acirc; with Â, &icirc; with î, &Icirc; with Î, &ucirc; with û, &Ucirc; with Û, &mdash; with "—" (Perhaps there is already a bot which specializes in converting HTML entities?)
  2. replace all occurrences of " !" by "{{ep}}" (watch the space before the exclamation point), then all "!" by the same ; then undo some of those by replacing "</ref>{{ep}}" back to "</ref>!".
  3. do the same with " ?" by "{{qm}}", then all "?" (without the preceding space).
  4. Replace " — " (em dash preceded and followed by a space) by "{{—}}"
  5. " :" with "{{':}}", then again for ":", then undo the overkill "{{'{{':}}}}" back to "{{':}}"
  6. If (1) above has been completed, then " ;" with "{{;}}", then again for ";", then undo the overkill "{{{{;}}}}" back to "{{;}}"

Thanks in advance ? --Jerome Charles Potts (talk) 18:09, 15 June 2014 (UTC)
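
Item 1 of this request (converting HTML character entities to the literal characters) could be sketched like this. It is a minimal illustration, assuming the pages use the standard named entities for the characters listed; the names `ENTITY_MAP` and `replace_entities` are made up for the example, and only the listed entities are touched so that entities such as `&amp;` stay as they are.

```python
import re

# Standard HTML named entities for the characters listed in item 1.
ENTITY_MAP = {
    "&acirc;": "â", "&Acirc;": "Â",
    "&icirc;": "î", "&Icirc;": "Î",
    "&ucirc;": "û", "&Ucirc;": "Û",
    "&mdash;": "—",
}
ENTITY_RE = re.compile("|".join(re.escape(entity) for entity in ENTITY_MAP))

def replace_entities(wikitext):
    """Replace only the listed entities, leaving every other entity alone."""
    return ENTITY_RE.sub(lambda m: ENTITY_MAP[m.group(0)], wikitext)
```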

Gday @Jerome Charles Potts:.
  1. We have no bot that does it, as it is legitimate display. It displays fine, what is the issue? Where I have replacements that I wish to undertake regularly, I write them into my custom regex rules and (auto)replace at the time of proofreading
  2. No, and in my opinion the template should not have been created, and we definitely do not want to use it. In fact, why are you creating templates that are not in line with our existing practice?
  3. as above
  4. not one that we would recommend, it is usually the reverse in that there is no spacing either side of em dashes. Poetry of course being a rule to itself.
  5. there is no way that I am going to utilise {{':}}, etc. as a template (that is just ugly IMNSHO, and unfortunate that they have been created).
  6. as above
I can run replacements that remove some of the spaces that exist before punctuation marks. We do try to keep it simple, and not get captured by a typographical moment in time. The words are king. — billinghurst sDrewth 11:51, 17 June 2014 (UTC)
@Billinghurst: Because it reads better: clearer, fluent, not stumbling to decode a pile of glyphs. --Jerome Charles Potts (talk) 22:42, 19 June 2014 (UTC)
Some examples of the problematic pages would be useful. So you are indicating, from your first point, that the OCR has resulted in HTML named entities being present. If that is the case, then we can review it and better understand that part of the request, though note that all text layers would need to be present in the namespace. We would also need a list of the codes you want replaced. That still doesn't address the remainder, which complicates the proofreading of the works, as it is not the way that we have been handling other works of that period. — billinghurst sDrewth 04:52, 20 June 2014 (UTC)
The OCR provided either unaccented characters or an error such as a "t" for a "û". A previous editor corrected some of those by using those HTML character entities, in pages index="Sacred Books of the East - Volume 6.djvu" from=15 to=124 specifically. The reason why i asked them to be converted is as a preparation to item number 6 (which i should've listed as number 2).
My reply above about clarity is actually about the remainder, items 2 to 6, which are about discreet spacing around some punctuation. Yes, it makes a complicated source code, which i don't see complicating proofreading, quite the opposite in my opinion, since the rendering ends up clearer. --Jerome Charles Potts (talk) 10:22, 20 June 2014 (UTC)
I did point 1. Very few pages and straightforward. HTML made proofreading hard. Just a comment: using characters like "ⅼⅰ" does not help IMO ...--Mpaa (talk) 20:21, 20 June 2014 (UTC)
Thank you much. As concerns the use of roman numerals unicode characters, my feeling is that even though they look ugly in the default font of my web browser on my computer, some time in the future it may be valuable to be able to search through text for roman numeration. So, i'm for it, but without having read anything about the possible debate they at Unicode had over the creation of those characters. --Jerome Charles Potts (talk) 04:12, 22 June 2014 (UTC)

Bulk move request

Request: bulk move of pages

Start: Page:Wisconsin Rapids directory (1921).djvu/31
End: Page:Wisconsin Rapids directory (1921).djvu/379
New start: Page:Wisconsin Rapids directory (1921).djvu/39

(When moving, invalidate the proofreading status of all pages moved.)

The file was recently patched at Commons, hence this request. ShakespeareFan00 (talk) 12:33, 23 August 2014 (UTC)
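
For a shift like this one (31 to 379 moving up so that 31 lands on 39), the order of the moves matters, because each target must be free when its move runs. A minimal sketch of the move plan, assuming no pages exist beyond the end of the range; `plan_page_moves` is a hypothetical helper, not an existing bot function:

```python
def plan_page_moves(index_name, first, last, new_first):
    """Return (old_title, new_title) pairs in an order that avoids collisions."""
    offset = new_first - first
    numbers = range(first, last + 1)
    if offset > 0:
        # Shifting upwards: start from the last page so no target is still occupied.
        numbers = reversed(numbers)
    return [
        ("Page:%s/%d" % (index_name, n), "Page:%s/%d" % (index_name, n + offset))
        for n in numbers
    ]
```

With the values from this request the offset is +8, so the first move in the plan is /379 to /387 and the last is /31 to /39.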

Done, except for the change of proofreading state.--Mpaa (talk) 16:34, 23 August 2014 (UTC)

Work retitle

Index:Botanistsguidet00wauggoog.djvu to Index:The Botanist's Guide Through the Counties of Northumberland and Durham (Vol 1).djvu, filename updated at Commons. ShakespeareFan00 (talk) 22:51, 26 August 2014 (UTC)

If you want to have the first Google page removed, you'd better do it before we move everything. I'll wait until I get an answer here.--Mpaa (talk) 16:50, 27 August 2014 (UTC)
Ah.. That was a consideration. The Google front page, should be a blank (in terms of page namespace) anyway. ShakespeareFan00 (talk) 17:10, 27 August 2014 (UTC)
Not an issue at the moment. Move away! ShakespeareFan00 (talk) 17:54, 27 August 2014 (UTC)
Done. Moved, but there is a problem. The djvu file is actually Vol.1 + Vol.2, see Page:The_Botanist's_Guide_Through_the_Counties_of_Northumberland_and_Durham_(Vol_1).djvu/148. I realized it only when the move was done, looking at the pagelist and doing random checks. Your call on what to do next.--Mpaa (talk) 21:06, 27 August 2014 (UTC)
Not sure... Ask for a wider opinion as I'm not sure how multi volume in one scan set is done here ShakespeareFan00 (talk) 22:30, 27 August 2014 (UTC)
Don't know, but if the name was cryptic before, now it might be misleading. Maybe not an issue.--Mpaa (talk) 22:36, 27 August 2014 (UTC)

Request the move of pages

Can someone please move the pages of the file Ackermann’s Repository of Arts 1809-v01-Jan-Jun.djvu to Index:Repository of Arts, Series 1, Volume 01, 1809, January-June.djvu? — Ineuw talk 22:29, 4 September 2014 (UTC)

Done by someone — billinghurst sDrewth 13:48, 5 September 2014 (UTC)
No, it is not done; only the index is moved, see [1]. Done --Mpaa (talk) 18:05, 5 September 2014 (UTC)

Set of images for Move to commons template

All files with the pattern "Jotr-visitor-figure-X.png" (where X is in the range 1 to 129) are PD and should be hosted on Commons. Can they please either be moved, or tagged with {{Move to commons}}? Beeswaxcandle (talk) 09:10, 5 September 2014 (UTC)

Shouldn't they be retagged? Now they are {{PD-USGov}}. Also the == {{int:filedesc}} == is missing.--Mpaa (talk) 09:38, 5 September 2014 (UTC)
For anyone who wants a good tool to move images, I have found the tool "For the Common Good" very functional, and I have developed a configuration that works fine. You can add those components easily. — billinghurst sDrewth 12:55, 5 September 2014 (UTC)
Hmm, I may as well make it available for use, and get feedback on improvements … User:Billinghurst/tools/FtCG configuration file for enWS.wiki — billinghurst sDrewth 13:30, 5 September 2014 (UTC)

Done. They will all need the {{information}} template and source data prior to moving. — billinghurst sDrewth 13:47, 5 September 2014 (UTC)

De-categorise subpages of Latter Day Saints' Messenger and Advocate

A new editor has been adding Category:Latter Day Saints' Messenger and Advocate to the subpages of this periodical. I have asked them to stop; however, there are now 215 pages in the category. Could a bot please go through and remove the category from the articles? Thanks, Beeswaxcandle (talk) 20:43, 16 September 2014 (UTC)

Should the previous category Category:Mormon periodicals be restored on all pages?--Mpaa (talk) 21:58, 16 September 2014 (UTC)
No, that should just contain the mainpage for each periodical. However, looking at it causes me to realise that a similar de-categorisation of the article subpages for the other two periodicals in it also needs to happen. Beeswaxcandle (talk) 22:18, 16 September 2014 (UTC)
I am getting a bit lost. Can you please summarize what is needed?--Mpaa (talk) 11:58, 17 September 2014 (UTC)
Category:Latter Day Saints' Messenger and Advocate, Category:Journal of Discourses, and Category:The Seer should be removed from all pages that contain them. Following which these three categories can be deleted. Beeswaxcandle (talk) 02:40, 20 September 2014 (UTC)
I can do that. Just wonder what is the issue, as I have seen this scheme used elsewhere, see e.g. Category:Geological Society of London.--Mpaa (talk) 08:36, 20 September 2014 (UTC)
The issue is that these are work-based categories (which we don't do) and are redundant information on sub-pages where the title already includes the work. The Geological Soc. category is different. Beeswaxcandle (talk) 08:45, 20 September 2014 (UTC)
Done --Mpaa (talk) 10:11, 20 September 2014 (UTC)
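
For reference, the category removal agreed above comes down to a text substitution on each page. A rough sketch, not the code that was actually run: `remove_categories` is a hypothetical helper, and it assumes category names are written with spaces exactly as given (it does not normalise underscores or first-letter case the way MediaWiki does).

```python
import re

def remove_categories(wikitext, category_names):
    """Strip [[Category:Name]] links (with optional sort key) for the given names."""
    for name in category_names:
        pattern = re.compile(
            r"[ \t]*\[\[\s*Category\s*:\s*"
            + re.escape(name)
            + r"\s*(\|[^\]]*)?\]\][ \t]*\n?"
        )
        wikitext = pattern.sub("", wikitext)
    return wikitext
```

Other categories on the page, such as Category:Mormon periodicals on a periodical's main page, are left untouched.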

Request to archive older Scriptorium posts

I don't know if I am out of turn, but wondering if some of the older Scriptorium posts can be archived. The page is very large. — Ineuw talk 16:55, 23 September 2014 (UTC)

Generic re-titler for Index: namespace (aka Special:Move-work)

Currently, Index pages can be moved manually. However, this means that, in order to avoid broken links, the Page: pages also have to be moved manually.

There is therefore a need for a tool-assisted edit script or bot which allows for an entire work to be re-titled.

The 2 input parameters would be the old Index name and the new Index name :-

The bot/script would be required to do the following:-

  1. Search for and compile a list of extant Page: namespace entries for "oldname.ext"
  2. For each Page (keeping a log of redirects created):-
    1. Check what links to the page:
      1. For a <pages> entry, update <pages index="oldname.ext" ...> to <pages index="newname.ext" ...> if not already done (i.e. no reference to oldname.ext remains on the page).
      2. For a link, update [[Page:oldname.ext/...]] to [[Page:newname.ext/...]]
      3. For a transclusion, update {{Page:oldname.ext/ ... }} to {{Page:newname.ext/ ...}}
    2. On the page itself (in addition to the above) update {{raw image}} params, and any links as previously.
    3. Check the index page for links/transclusions and update accordingly.
    4. Finally move the index page to "Index:newname.ext"
    5. Update links to the index page.
    6. Clean up the redirects, either by deleting them or by tagging them for sdelete with a suitable reason.

Can an existing bot cope with such a task? (It's also my view that this sort of mass move relating to Page: and Index: namespaces should really be a Special: page implemented in the relevant MediaWiki extension.) ShakespeareFan00 (talk) 18:20, 28 September 2014 (UTC)
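
The text-rewriting part of the steps above (updating <pages> tags, links and transclusions on pages that refer to the old name) might look roughly like this. It is a sketch only: `retitle_references` is a hypothetical function, it covers none of the actual moving, logging or redirect cleanup, and it assumes the index attribute is written with double quotes.

```python
import re

def retitle_references(wikitext, old_name, new_name):
    """Point <pages>, [[Page:...]] and {{Page:...}} references at the new file name."""
    old = re.escape(old_name)
    # <pages index="oldname.ext" ...> transclusion tags
    wikitext = re.sub(
        r'(<pages\b[^>]*\bindex\s*=\s*")' + old + r'(")',
        lambda m: m.group(1) + new_name + m.group(2),
        wikitext,
    )
    # [[Page:oldname.ext/...]] links and {{Page:oldname.ext/...}} transclusions
    wikitext = re.sub(
        r"(\[\[|\{\{)(\s*Page:)" + old + r"(?=/)",
        lambda m: m.group(1) + m.group(2) + new_name,
        wikitext,
    )
    return wikitext
```

The replacement is passed as a function rather than a string so that backslashes or group-like sequences in the new title are inserted literally.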

De-categorise Journal of Discourses subpages from Category:Mormon religious speeches

Sorry if I'm not doing this quite right; I'm new at Wikisource and trying to follow up on this advice from my talk page.

The w:Journal of Discourses is by definition a 26-volume collection of public Mormon religious speeches, with no other content. Given this, it makes little sense to have the 1000+ subpages of the individual speeches listed separately in Category:Mormon religious speeches. There is other content in that category that does belong in it, and potentially other categories that belong on those pages, but the Journal of Discourses subpages should have Category:Mormon religious speeches removed from the individual pages; in other words, of the Journal of Discourses Wikisource content, only the parent work should be listed in Category:Mormon religious speeches. — AsteriskStarSplat (talk) 16:32, 30 September 2014 (UTC)

Done --Mpaa (talk) 21:46, 30 September 2014 (UTC)

Header fixes to Index:Ninety-three.djvu

Is there a bot that can make fixes to the book title in the running headers on the Pages of the above Index? It should read NINETY-THREE. (italic with a period), but many don't contain periods, or are not italicized, or both. If it were only a handful of Pages affected, I would tackle it manually. I am only being picky about uniformity in case this work was to be considered for Featured Text some day once validated. Thanks, Londonjackbooks (talk) 15:11, 4 December 2014 (UTC)

I can run a bot through if it is considered worthwhile, though since the header component is excluded, I am not certain that it is. For FTC we have never worried about the headers, only what shows in the main ns. — billinghurst sDrewth 13:18, 5 December 2014 (UTC)
In that case, don't worry about it!  :) Thanks, Londonjackbooks (talk) 14:19, 5 December 2014 (UTC)

Assigned requests

For noting {{PD-old}}

A quick note to say that I have been running Sdrewthbot through Author: ns pages for those authors listed as dying between 1880 and 1912, either adding the licence if the page doesn't have one, or converting any licence that doesn't say PD-old to that licence. — billinghurst sDrewth 07:16, 1 January 2013 (UTC)

Migrating from {{edition}} to edition = y

There are 4k+ pages that transclude {{edition}}. I am considering using my bot to run through and rm the template and add the parameter line edition = y to those pages' {{header}}. Checking the attitude of the community to this proposed bot run. — billinghurst sDrewth

Just been doing some tests and found at least one anomaly. Found that Gettysburg Address had {{versions}} and {{edition}}, which would seem to be a dichotomy. This is probably a case of the main ns page being moved and the talk page being left behind. Initially I updated the versions template, which I quickly rolled back for the preceding reason. I would think that there will probably be similar situations with {{disambiguation}}, where the same logic applies and the display of edition would be incongruous. I will still undertake their replacement with the edition parameter; however, they will not display unless we choose to make them display by updating the templates, and this will also apply to these subsidiary templates of header. About 50-70 templates. — billinghurst sDrewth 06:19, 27 January 2013 (UTC)
Though not documented, found that {{edition}} takes the parameter title= as used in National Geographic Magazine/Volume 31/Number 6/Our State Flowers/The Mountain Laurel. Skipping these for example work, and probably in any first batch of the replacement run. — billinghurst sDrewth 07:25, 27 January 2013 (UTC)
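
As a rough illustration of the proposed replacement (not the bot's actual code), the edit on each page could be sketched as below. It assumes the bare form {{edition}} and a {{header}} whose opening `{{header` sits on its own line; `migrate_edition` is a hypothetical name. Pages using {{edition|title=...}} do not match the bare pattern and are therefore left alone, in line with skipping them in a first batch.

```python
import re

# Bare {{edition}} only; parameterised uses such as {{edition|title=...}} do not match.
EDITION_RE = re.compile(r"\{\{\s*edition\s*\}\}[ \t]*\n?", re.IGNORECASE)
# Assumption: the header template opens on a line of its own.
HEADER_RE = re.compile(r"^\{\{\s*header[ \t]*$", re.IGNORECASE | re.MULTILINE)

def migrate_edition(wikitext):
    """Remove {{edition}} and add an `edition = y` line to {{header}}."""
    if not EDITION_RE.search(wikitext) or not HEADER_RE.search(wikitext):
        return wikitext  # nothing to migrate, or a layout this sketch does not handle
    wikitext = EDITION_RE.sub("", wikitext)
    return HEADER_RE.sub(lambda m: m.group(0) + "\n | edition = y", wikitext, count=1)
```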

Notification of forthcoming job — Category:No reference tag

A while back I did some tagging of reference errors, and the WMF task bot has finally caught up. For the main namespace pages in Category:No reference tag, I am planning to append {{smallrefs}} to the pages, either at the end if the page is just a transclusion, or before a copyright tag if one exists on the page. Other pages in the category will just be skipped at this time, and will be revisited. If there is any reason to hold off, or requests, or suggestions, please let me know. — billinghurst sDrewth 13:07, 17 February 2013 (UTC)

Should it be also before Categories, if any?--Mpaa (talk) 13:21, 17 February 2013 (UTC)
My mental + blah blah didn't make it to the page. Caught in an iterative loop elsewhere. Yes, that was my meaning. — billinghurst sDrewth 13:35, 17 February 2013 (UTC)
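
The placement rule discussed here (add {{smallrefs}} at the end of the page, but before a copyright tag or categories if present) could be sketched as follows. This is an illustration under assumptions: copyright tags are taken to be templates starting with `PD-` on their own line, which will not cover every licence template, and `append_smallrefs` is a hypothetical name.

```python
import re

# Assumption: licence tags look like {{PD-...}} and categories like [[Category:...]],
# each starting a line.
TAIL_RE = re.compile(r"^(\{\{\s*PD-|\[\[\s*Category\s*:)", re.MULTILINE)

def append_smallrefs(wikitext):
    """Insert {{smallrefs}} before the first licence tag or category, else at the end."""
    if "{{smallrefs" in wikitext.lower():
        return wikitext  # already present
    match = TAIL_RE.search(wikitext)
    if match is None:
        return wikitext.rstrip("\n") + "\n{{smallrefs}}\n"
    head = wikitext[:match.start()].rstrip("\n")
    return head + "\n{{smallrefs}}\n" + wikitext[match.start():]
```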

Substituting out diacritic templates for the actual diacritic

I have started to substitute diacritic templates for the actual diacritic. Sample edits can be seen in my contribs. If there are no objections, I am going to continue the clean-up. Bye--Mpaa (talk) 14:43, 29 June 2014 (UTC)

Some more done as User:MpaaBot--Mpaa (talk) 19:12, 3 July 2014 (UTC)
Done.--Mpaa (talk) 09:43, 11 August 2014 (UTC)
Excuse me, but as a noob I don't understand. The templates like {{ae}} are not marked deprecated, and I know I read somewhere (sorry, I've read a lot of help files in the last week) that the templates should be used. Therefore I've been using them. Where is this policy explained? Sorry for posting this on this page. Laura1822 (talk) 19:05, 4 September 2014 (UTC)
In general, these might be replaced anytime: Category:Diacritic_templates, see top line. But you're probably right about {{ae}} and similar. As I have applied this list Category_talk:Templates_that_can_be_substituted, I think some instances might have been substituted. That list should be fixed if we do not want {{ae}} & co. to be replaced in future.
TBH, I do not know if this feature is still on: "support functionality for automatically turning on and off the display of ligatures". Comments welcome also from others.--Mpaa (talk) 20:48, 4 September 2014 (UTC)
So as a proofreader, should I use the template or insert the ligature directly? It's certainly easier to do the latter. Yes to some but not to others? What is the point of the template(s)? Laura1822 (talk) 13:18, 5 September 2014 (UTC)
I can't remember too well, but I think it was a compatibility issue. Some displays couldn't render the ligatures, so to avoid a person seeing a little box, the template would break it into the "ae" or "oe" letters instead.—Zhaladshar (Talk) 13:26, 5 September 2014 (UTC)
Template purpose is to assist in the entering of diacritics, see Help:Templates#Character_formatting. IMO, both ways are OK.--Mpaa (talk) 13:35, 5 September 2014 (UTC)
(edit conflict) @Laura1822: some editors didn't know how to do the ligatures, so we just made templates available. You are welcome to use the templates or the actual characters. Every (not) so often, I run a bot through and do a replacement. The frequency is neither here nor there, and in many cases it is not overly necessary, though in a number of edge cases too many templates on a page is problematic. So in the end, it is just worthwhile running a bot through to do the work. Thanks for the question, it is worthwhile, and if you can think of some help pages where you see that it is of value to add the information, then we can look to do that. — billinghurst sDrewth 13:38, 5 September 2014 (UTC)
Thanks! For what it's worth, on the Help page Mpaa linked above, I had completely missed the explanation that it was optional, since I was looking at the tables (probably looking for something else at the time). If the usage of the templates is somewhat obsolete, or intended only for use by proofreaders who can't see them for some reason, then I think that could be clarified. It would help for the templates themselves to have an explanation, or at least a statement that they are optional. Thanks! Laura1822 (talk) 13:56, 5 September 2014 (UTC)
I have improved template pages for Diacritics with doc page.--Mpaa (talk) 20:31, 5 September 2014 (UTC)
This point is still not clear to me: is this feature still used: "support functionality for automatically turning on and off the display of ligatures (see {{ae}})" or we can substitute {{ae}} as well?--Mpaa (talk) 18:18, 5 September 2014 (UTC)