Wikisource:Bot requests

This page allows users to request that an existing bot accomplish a given task. Note that some tasks may require that an entirely new bot or script be written. This is not the place to ask for help running or writing a bot.

A bot operator performing a task should make a note of it so that other bots don't attempt the same task. Tasks that are permanently assigned or scheduled for long-term execution are listed on Persistent tasks.

Unassigned requests

Move all subpages of Who's Who in the Far East (June) 1906-7 to use title case

I was informed by User:Beeswaxcandle that I should use title case instead of all caps in article names, so I request that all subpages of Who's Who in the Far East (June) 1906-7 be moved to title case. Although I could use a bot to move them myself, that would leave tons of redirects for admins to delete. But if an admin can easily batch-delete a list of pages, I can move them myself and then provide the list of pages to delete. I'm sorry for the inconvenience. Thanks, --Stevenliuyi (talk) 08:58, 6 May 2021 (UTC)

@Stevenliuyi: Please review the list at Wikisource:Bot requests/sandbox. I notice that there is at least one English name that needs to be fixed, and the Chinese names didn't convert with the regex that I used. Would you fix or create the target (only) in each pair in the list, and I will get it done. There is no need to fix those that are broken, though you should fix the previous/next links of the articles on either side. Note that, as I did for your other work, I will look to get a work-specific template in place, though I will do that afterwards. — billinghurst sDrewth 13:10, 24 May 2021 (UTC)
I suppose that I really do want to ensure that the Chinese names are capitalised properly. — billinghurst sDrewth 02:57, 25 May 2021 (UTC)
@Stevenliuyi and @Billinghurst: Has this request been actioned (i.e. can it be closed as resolved)? Xover (talk) 10:34, 10 April 2022 (UTC)
@Stevenliuyi: Please see Billinghurst's request (above) for quality control of the list of targets in Wikisource:Bot requests/sandbox. They have done the legwork to prepare for the move, but it cannot progress until you've checked and corrected the target page names. Xover (talk) 05:33, 3 September 2022 (UTC)
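
A minimal sketch of the kind of conversion discussed above, for building a source/target pair list: it title-cases only the subpage part of a page name. The function name and the example page name are hypothetical, and this is not the regex billinghurst used; romanised Chinese names and prefixes such as "Mc" would likely still need manual review.

```python
import re


def title_case_subpage(title: str) -> str:
    """Return the page name with only its subpage part converted to title case.

    Hypothetical helper for illustration; the base page name is left untouched.
    """
    base, sep, sub = title.partition("/")
    if not sep:
        # Not a subpage: nothing to convert.
        return title
    # Capitalise the first letter of every run of letters and lowercase the rest.
    fixed = re.sub(r"[^\W\d_]+", lambda m: m.group(0).capitalize(), sub)
    return base + sep + fixed


if __name__ == "__main__":
    # "EXAMPLE NAME" is a made-up subpage, not an actual entry in the work.
    old = "Who's Who in the Far East (June) 1906-7/EXAMPLE NAME"
    print(old, "->", title_case_subpage(old))
```

Each old/new pair produced this way would still need the manual check of target names requested above before any move is made.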

Add {{R from case citation}} to all redirects from case citations to cases

Would it be possible for a bot to detect redirects pointing from a case citation (e.g. 347 U.S. 483) to a case (here, Brown v. Board of Education), and add {{R from case citation}} to these? It would be useful to have all case citations in one category, given the obvious utility of Wikisource as a caselaw database for lawyers. BD2412 T 04:25, 24 May 2021 (UTC)

@BD2412: What's the identifying feature of a case page? Is it in a particular category? Does it contain a unique template? Once identified, should all redirects to these be so tagged, or only a subset of them? Do the redirects to these have a distinguishing feature to identify them as distinct from all other redirects on the site?
Adding a category to a set of pages is pretty straightforward, so the challenge is how to identify those pages automatically. Doing something to one page (the redirect) based on properties of another page (the actual case page) can also be challenging depending on the details (it may require writing a custom bot rather than just running one of the existing scripts for pywikibot).
Also, how many of these are there? If the criteria are complex, and the number of pages relatively low, it may be better to do it manually or semi-automated (a user script in the browser that finds and tags redirects to the current page on request, say). Xover (talk) 04:06, 25 May 2021 (UTC)
I have been thinking about exactly those issues. I don't know that we have any case citation redirects from non-U.S. cases, so the group to start with would be documents in the category tree under Category:United States case law by court. The redirects themselves will all be in a [Number] [Reporter] [Number] format, so for example the first page of results from a search for pages starting with "1" is almost entirely redirects to cases (everything from 100 L.Ed. 1003 on, with a few exceptions). So, anything in that format redirecting to something in that category tree should be a case citation redirect. I would also note that a great many of these were generated by User:BenchBot when that bot was active, and just counting those from the bot's contributions, there are over 9,000. I suppose I could do those manually, or use BD2412bot, but that will leave stray case citation redirects added by others. BD2412 T 04:25, 25 May 2021 (UTC)
I am doing some manually to see if there are any hitches that come up that way. BD2412 T 19:30, 25 May 2021 (UTC)
@BD2412: I have to admit I completely forgot about this request. Sorry. If the page selection logic is "Check all pages in Category:United States case law by court for incoming redirects, and pick the redirects whose page name matches [Number] [Reporter] [Number]" then I think it's probably doable. It's going to require writing a custom bot though, which I've not yet done so I'd need to find time for both the bot coding and the learning curve (which means it'll be a while before I might tackle it).
Mpaa is vastly more skilled than me in this area though, so perhaps they could be persuaded to help?
Hmm. Or possibly there is a Toolforge tool for querying the wikis for pages that match these criteria? Once we have the list of redirects adding the template to all of them is trivial; it's getting the list of pages (redirects) that's a little bit challenging. Quarry maybe? That'll need understanding SQL JOINs, which make my head hurt, but should be doable for someone with a bit of DBA in their mix. PetScan is also often great for this kind of thing, but I don't think it can be queried for incoming redirects like this. Hmm. And you could probably do this on-wiki in JavaScript too, come to think of it. Xover (talk) 06:26, 3 September 2022 (UTC)
I believe I did most of these manually at some point. Still, there will always be new ones. BD2412 T 06:32, 3 September 2022 (UTC)
Oh, ok. Should we consider this request resolved then (so it'll get archived)? Or were you looking for a permanent bot task to do this automatically as new cases are added? Xover (talk) 06:45, 3 September 2022 (UTC)
A bot to pick these up as new cases are added would be nice. I don't think I could track that manually. BD2412 T 19:01, 3 September 2022 (UTC)
@BD2412, @Xover
To summarize:
1. walk recursively Category:United States case law by court
2. find redirects that point to pages in the above categories
3. if redirect title matches '\d+ [^ ]*? \d+' and does not contain {{R from case citation}}, append it
I made a test for some articles in Category:United States Supreme Court decisions in Volume 107.
Are these edits OK? Mpaa (talk) 22:46, 3 September 2022 (UTC)
@Mpaa: yes, those are absolutely correct. BD2412 T 23:04, 3 September 2022 (UTC)
@BD2412 ongoing. This query gives pages pointed to by a redirect matching the regex but not yet tagged with {{R from case citation}}. From here there are several possibilities to ensure they are pages of interest. I intersected them with pages belonging to (subcategories of) "Category:United States case law by court". Mpaa (talk) 21:20, 4 September 2022 (UTC)
Done. Mpaa (talk) 21:04, 5 September 2022 (UTC)
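
The selection rule Mpaa summarises above could be sketched roughly as follows. This covers only step 3 (the title and template check); walking the category tree and finding incoming redirects are left out. Anchoring the pattern to the whole title, the plain substring check for the template, and the function names are assumptions, not Mpaa's actual bot code.

```python
import re

# Pattern quoted from the discussion above. Requiring it to match the
# whole title (via fullmatch) is an assumption of this sketch.
CITATION_RE = re.compile(r"\d+ [^ ]*? \d+")
TEMPLATE = "{{R from case citation}}"


def needs_tag(title: str, wikitext: str) -> bool:
    """True if a redirect's title looks like a case citation and it is not yet tagged.

    Simplification: the template is detected by an exact substring check.
    """
    return CITATION_RE.fullmatch(title) is not None and TEMPLATE not in wikitext


def tag_redirect(wikitext: str) -> str:
    """Append the template on its own line at the end of the redirect's wikitext."""
    return wikitext.rstrip("\n") + "\n" + TEMPLATE + "\n"


if __name__ == "__main__":
    text = "#REDIRECT [[Brown v. Board of Education]]"
    if needs_tag("347 U.S. 483", text):
        print(tag_redirect(text))
```

A title such as "1 Corinthians 13" also matches the pattern, which is why restricting the targets to the case law category tree in step 2 matters.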

Wikidata bulk edit

I made a query for works on enWS that have WD items with no "instance of" statement. The criteria I used are:

  • Pages in mainspace
  • No redirects or disambiguation pages (this includes Versions and Translations btw)
  • Does not contain a forward slash in the page name (in order to exclude subpages)
  • Is linked to Wikidata, and linked Wikidata item does not have a P31 statement

This query returns 13889 results, which is more than even QuickStatements can handle. Would it be possible for a bot to update these Wikidata items with P31=Q3331189 (instance of = version, edition, or translation)?

Thanks :) —Beleg Tâl (talk) 13:22, 1 November 2021 (UTC)

I think we could be more specific for certain groups, e.g. I have addressed "Presidential Radio Address" articles as "instance of speech". There are several groups of articles that can be identified and then addressed with QuickStatements. After that, the bot can be run on what is left. Mpaa (talk) 23:13, 1 November 2021 (UTC)
@Mpaa: Except they are editions as we host them; the speech would be the parent of the item, per d:WD:Books, as there may be other published editions of the same speech. — billinghurst sDrewth 12:17, 5 September 2022 (UTC)
@Billinghurst I see. I saw others were linked that way and I followed along. If it is not correct, it should be cleaned up, but I have not mastered the Wikidata tools enough to write a bot for it. Mpaa (talk) 21:34, 5 September 2022 (UTC)
We desperately need better Wikidata tools (so we're not dependent on Billinghurst to be on eternal vigilance here). But the current gadget we have for this is loaded from some user's personal page on Russian Wikisource (which is kinda iffy in itself these days), and its code is completely incomprehensible. If anybody knows of or runs across good API docs for how to talk to Wikidata I'd be very interested. As far as I can tell, the only existing API is the main MW:API with some very minor additions for WD, and that's way way too painful to use for our purposes. Xover (talk) 06:15, 6 September 2022 (UTC)
User:Beleg Tâl: why not just do it with PetScan itself? From memory it could do additions. Also note that there is the interwiki Petscan: for these. — billinghurst sDrewth 12:14, 5 September 2022 (UTC)
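
For reference, the statement requested at the top of this thread can be written in QuickStatements' tab-separated V1 syntax as one line per item. The sketch below splits a long list of item IDs into smaller batches; the function name and the batch size are assumptions, and the item IDs would come from the query described above.

```python
from typing import Iterable, Iterator, List

# P31 = "instance of", Q3331189 = "version, edition, or translation",
# as given in the request above.
PROPERTY = "P31"
VALUE = "Q3331189"


def quickstatements_batches(
    item_ids: Iterable[str], batch_size: int = 1000
) -> Iterator[List[str]]:
    """Yield lists of QuickStatements V1 lines, at most batch_size lines each.

    batch_size is an arbitrary illustrative default, not a documented limit.
    """
    batch: List[str] = []
    for qid in item_ids:
        batch.append(f"{qid}\t{PROPERTY}\t{VALUE}")
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch
```

Each yielded batch could be pasted into the tool as a separate run, or a bot could apply the same statement item by item.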

Assigned requests

Copy the proofread text from Index: The last man (Second Edition 1826 Volume 1).djvu to Index:The last man vol 1.djvu

The first and second editions of the three volumes of The Last Man differ only in their title pages. Could the proofread text of the three volumes of the second edition be copied to the scans of the first edition? Languageseeker (talk) 23:29, 16 July 2022 (UTC)

@Mpaa If it is OK to also copy the Page status, better to wait for all 3 vols to be validated. Mpaa (talk) 13:53, 18 July 2022 (UTC)
Makes sense. 13:49, 22 July 2022 (UTC)