Wikisource:Bot requests

Bot requests
This page allows users to request that an existing bot accomplish a given task. Note that some tasks may require that an entirely new bot or script be written. This is not the place to ask for help running or writing a bot.

A bot operator performing a task should make a note of it so that other bots don't attempt to do the same. Tasks that are permanently assigned or scheduled for long-term execution are listed on Persistent tasks.

Unassigned requests

Request to delete specific pages in Index:Southern_Historical_Society_Papers_volume_35.djvu

If someone with delete privileges has a bot they can send to delete all pages in Index:Southern_Historical_Society_Papers_volume_35.djvu that were created by LA2-bot and have a status lower than "Proofread", I would be most grateful. I am working on improving the text in the underlying djvu file and would like to test my changes on this volume. I understand that this is the best way for me to repopulate the pages with revised text. I’ve been working with users Beeswaxcandle and Maury and have their buy-in, but I take full responsibility for this request, so any questions or clarifications should be directed to my talk page (or here, if that’s better for history tracking purposes). If specific page ranges are preferred (or to double-check my request), they would correspond to un-proofread file pages 34-126, 128-172, 175-218, 236-271, 273-307, 309-364, 366-384, and 386-388. Note that the "text" pages displayed on the Index page are 14 less, so they display on the Index page like 20-112, etc. Thanks in advance to the relevant bot and its operator, Dictioneer (talk) 00:52, 14 May 2015 (UTC)
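The page ranges in the request above can be expanded mechanically before handing them to a bot. The following is only a minimal sketch: the helper names are hypothetical, and the "Page:<file>/<n>" title form is an assumption about how the Page namespace names its subpages. The ranges and the offset of 14 are taken from the request itself.

```python
# File-page ranges quoted in the request (inclusive on both ends).
FILE_PAGE_RANGES = [
    (34, 126), (128, 172), (175, 218), (236, 271),
    (273, 307), (309, 364), (366, 384), (386, 388),
]

# The request states that the numbers shown on the Index page are 14 less.
INDEX_OFFSET = 14


def expand_ranges(ranges):
    """Turn inclusive (first, last) pairs into a flat list of file pages."""
    pages = []
    for first, last in ranges:
        pages.extend(range(first, last + 1))
    return pages


def displayed_number(file_page, offset=INDEX_OFFSET):
    """Number shown on the Index page for a given file page."""
    return file_page - offset


def page_title(index_file, file_page):
    """Assumed title form for a page in the Page namespace."""
    return f"Page:{index_file}/{file_page}"


file_pages = expand_ranges(FILE_PAGE_RANGES)
```

A bot operator could iterate over `file_pages`, build each title with `page_title`, and still check the creator and proofreading status of every page before acting on it.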

There is no need to delete the pages. If you press the OCR button, the text with your modifications will appear. The latest text layer is used by the OCR feature.
This does not match my experience of the OCR button in my testing. I was surprised as well, but Beeswaxcandle tells me that an actual OCR routine is invoked, and this has sometimes been my experience on the test page he set up for me. In fact, I’ve been unable to get the OCR button to function predictably or even appear consistently on my toolbar, so maybe you’re both right, depending on some variable that I can’t identify.
I have a comment though. I do not think the right way to update the text is adding wiki-syntax to the text layer of the djvu file. Instead of uploading MB of images each time and deleting the pages, you can just work on a few KB of text and upload it. There are bots for that as well. If you are interested, let me know.— Mpaa (talk) 22:42, 14 May 2015 (UTC)
Thanks for this proposal  —  I am interested in this as an alternative (although I have a sufficiently fast connection that a 35MB upload goes very quickly). What I am most interested in is testing some of the changes I have been working on with user Maury’s and Beeswaxcandle’s input. Whatever makes progress on that goal is the best route -- I’m also happy to try both approaches and report back on their relative efficacies. Let me know your preference for proceeding. My understanding is that I lack the user-privileges to mass-delete a bunch of pages (which is a good thing, given my inexperience!:) so I will need the assistance of an admin in either course of action. Thanks, Dictioneer (talk) 00:08, 15 May 2015 (UTC)
As I said, all you need is to enable your OCR button. You will get the text with the latest text layer.
Regarding the process of updating the djvu text-layer to improve the text, my point is that it is a waste of resources to re-upload every time 35 MB to change a few KB of text, and then change them again in the Page ns. Do it directly in the Page ns. There are bots that upload pages from files.— Mpaa (talk) 11:29, 15 May 2015 (UTC)
If you want, I can try to take the text layer from the latest version of the djvu file and upload it only if latest uploader was LA2.
Thanks, that is functionally equivalent, and I appreciate the offer. I've left a suggestion for helping isolate the problem on your talk page. If you're willing to give my suggestion a try, I think the next step is to upload the text as described above (it might also help to debug the problem). I assume you have the relevant djvused script to emit the text from the most recent Commons upload, but let me know if there's anything I can do to assist. Dictioneer (talk) 17:21, 16 May 2015 (UTC)
Done. As I said on my talk page, my advice is to go down this road again only if the new djvu has a better text layer due to a new OCR process. There is obviously an advantage in porting back the corrected text to the djvu file, but that is a completely different story.— Mpaa (talk) 21:48, 16 May 2015 (UTC)
I've proofread a dozen pages, Maury has validated them with a few suggestions, it all looks good: I think this topic can be marked "complete" by whomever is responsible for such things. Thanks for the help, I will try to follow your advice on future texts. Dictioneer (talk) 22:10, 17 May 2015 (UTC)
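The approach agreed in this thread (take the text layer from the latest djvu and overwrite a page only if its latest editor was the original bot) can be sketched roughly as follows. This is not the script that was actually run. The function names are hypothetical, the `djvused` invocation reflects my reading of its manual (`-e` with `select` and `print-pure-txt`), and the mapping of pages to last editors is assumed to have been fetched separately.

```python
def djvused_text_command(djvu_path, file_page):
    """Build a djvused command line that prints one page's plain text.

    Assumes djvused's `select <page>` and `print-pure-txt` commands.
    The command is only constructed here, not executed.
    """
    return ["djvused", "-e", f"select {file_page}; print-pure-txt", djvu_path]


def pages_to_overwrite(last_editor_by_page, bot_name="LA2-bot"):
    """Keep only pages whose latest editor is still the original bot.

    `last_editor_by_page` maps a file page number to the user name of the
    latest editor; pages touched by a human since are left alone.
    """
    return sorted(
        page for page, editor in last_editor_by_page.items()
        if editor == bot_name
    )
```

For example, `pages_to_overwrite({34: "LA2-bot", 35: "Dictioneer"})` would select only page 34, so manual proofreading work is never overwritten.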

Populate category "Women authors"

I created this category and I think it’s important on this project, but it would take much time to fill it entirely manually, while the task may be easily completed using data from Wikidata, so I propose it to be completed by bot. --Nonexyst (talk) 09:21, 1 April 2015 (UTC)

I can take it. Just need some free time.--Mpaa (talk) 10:00, 1 April 2015 (UTC)
Thanks for response.--Nonexyst (talk) 10:25, 1 April 2015 (UTC)
Done --Mpaa (talk) 21:09, 1 April 2015 (UTC)
Actually this would be more efficient with a switch in the Author template, something like:
{{#switch: {{#property:P21}}| female = [[Category:female]] | male = [[Category:male]] }}
As concerns were raised on my talk page (see User_talk:Mpaa#Categorizing_by_gender), once the discussion is settled with the proper actors (@Prosfilaes:, @EncycloPetey:, @Nonexyst:) and a decision is taken, I can remove the explicit category if needed, should the decision be not to have the category or to implement it directly in the Author template. Waiting for further input in the meantime.--Mpaa (talk) 19:55, 2 April 2015 (UTC)

┌─────────────┘
Well, I still argue for keeping the categorization by gender, since it’s carried out in many wikis including English Wikipedia and French Wikisource, and the gender of an author is strongly expressed in the author’s works, especially in the case of a fiction writer or poet, so the category should be useful. Many wikis use male gender as the default, but nevertheless I don’t oppose creating a "Male authors" category.--Nonexyst (talk) 21:03, 2 April 2015 (UTC)

We would be better to utilise enWP's categorisation module thingy, and just categorise from WD straight, without any intervention by switches. — billinghurst sDrewth 09:37, 3 April 2015 (UTC)
See w:Module:Category handler. billinghurst sDrewth 10:52, 3 April 2015 (UTC)
I miss how "Module:Category handler" fetches data straight from WD, but I leave the implementation to someone more familiar with both Lua and WD. My point was that we do not need to explicitly indicate [[Category:Men/Women author]] on each author page, but only embed the logic which gets the gender info from WD directly into the author template.--Mpaa (talk) 17:02, 3 April 2015 (UTC)
I modified {{Author}} to automatically categorise based on WD (if someone finds a better implementation, please improve). I also created the three categories correspondent to Property P21 values (Category:Female Authors, Category:Male Authors and Category:Transgender Female Authors).
I did not move Women authors, so we have the two options to review. If this is fine, I will remove this last one and delete or redirect the category to Female Authors.
If a better naming is desired, no objections from me.--Mpaa (talk) 20:22, 4 April 2015 (UTC)
I was in doubt how to name the category, but finally chose "Women authors" in order to make it uniform with Wikipedia (Category:Women writers, Category:Women scientists, etc., though, Category:Male writers, etc.), so I think the best is to conform to Wikipedia style in naming categories. And finally all words except the first in these category names should have their first letters decapitalized. --Nonexyst (talk) 21:08, 4 April 2015 (UTC)
Renamed to: Category:Women authors, Category:Male authors and Category:Transgender female authors and added Maintenance categories (Category:Author pages with gender in Wikidata, Category:Author pages with no gender in Wikidata and Category:Author pages with gender manually categorised)--Mpaa (talk) 22:06, 4 April 2015 (UTC)
Great work! My appreciation. Sorry, I didn’t pay attention to one, I think, important detail. I think there’s no need for two categories for transgender authors, since these categories will not be highly populated in the near future; moreover, I don’t know if there exists any transgender female author whose works are eligible for Wikisource, while one trans man definitely exists. So maybe the best solution is "Transgender and transsexual authors", conforming to Wikipedia. In that case, an author should be placed in two categories, e.g. "Women authors" and "Trans… authors", which, as I understand, should be even easier to implement.--Nonexyst (talk) 23:49, 4 April 2015 (UTC)
Fine. Getting tricky ...--Mpaa (talk) 07:39, 5 April 2015 (UTC)
Thanks. To me, it’s complete now. --Nonexyst (talk) 08:52, 5 April 2015 (UTC)
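The categorisation logic discussed in this thread can be illustrated outside the template. This is only a sketch of the idea, not the actual {{Author}} implementation, which is not shown here. The category names come from the renaming comment above; the function name, the lookup by lower-cased label, and the assumption that the maintenance category is added alongside the gender category are all hypothetical.

```python
# Category names as given after the renaming in this thread.
GENDER_CATEGORIES = {
    "female": "Women authors",
    "male": "Male authors",
    "transgender female": "Transgender female authors",
}

HAS_GENDER = "Author pages with gender in Wikidata"
NO_GENDER = "Author pages with no gender in Wikidata"


def author_categories(p21_label):
    """Return the categories for an author given the P21 label, if any.

    `p21_label` stands for the text a property lookup would return; an
    empty string or None means Wikidata holds no gender statement.
    """
    if not p21_label:
        return [NO_GENDER]
    categories = [HAS_GENDER]
    gender_category = GENDER_CATEGORIES.get(p21_label.strip().lower())
    if gender_category is not None:
        categories.append(gender_category)
    return categories
```

Keeping the mapping in a single table means a later renaming, such as the combined transgender category proposed above, would only touch the dictionary and not each author page.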

Replacing the reference tags

Is there a bot that could replace the PSM <references /> tags with {{smallrefs}}? The pages were generated before my time. If there is such a bot, can it be activated/implemented starting with PSM Volume 31? Ineuw (talk) 07:23, 28 February 2015 (UTC)

@Ineuw: We would have to trawl the pages and do a replacement. There may be a means to find them by some arcane prodding of the dumps, however, in the end, it is probably just as easy to just grab the main ns pages for PSM and do the replacement. [I am presuming that you are talking main ns, and not Page: ns, which would not seem a worthwhile exercise.] — billinghurst sDrewth 10:46, 3 April 2015 (UTC)
I’d like to see Page:s done also. Moondyne (talk) 02:04, 5 April 2015 (UTC)
That is a lot of pages trawled for replacements of no demonstrated value. Why do you wish to see it done there? If it is for your viewing, then you can write some css to change its presentation. — billinghurst sDrewth 12:51, 5 April 2015 (UTC)
I don't care. Moondyne (talk)

┌─────────────────────────────────┘
I want to thank whoever inserted the smallrefs into vol 33 of PSM. @Billinghurst: I understand your objection to inserting this in the Page namespace because of its limited value, but aside from my preference of reading the contents in the Page namespace, it also helps me to keep focused on details and correct errors (I) made. This also applies to the running headers - which I put back after removing them initially. All main namespace pages already have the smallrefs.— Ineuw talk 22:30, 21 April 2015 (UTC)
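The text replacement requested in this thread is simple enough to sketch. The following is a minimal illustration, not the bot that was actually used: the function name is hypothetical, and it deliberately matches only the bare self-closing tag, leaving variants with attributes untouched for manual review.

```python
import re

# Matches <references/> and <references />, case-insensitively, but not
# tags carrying attributes such as group="...".
REFERENCES_TAG = re.compile(r"<references\s*/>", re.IGNORECASE)


def replace_references(wikitext):
    """Replace bare <references /> tags with {{smallrefs}}.

    Returns the new text and the number of replacements, so a bot can
    skip saving pages where nothing changed.
    """
    return REFERENCES_TAG.subn("{{smallrefs}}", wikitext)
```

Returning the count lets the caller avoid null edits, which matters when trawling a whole volume of pages.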

Remove two pages from PSM Vol 35

Could someone kindly remove this and this page. They are the last two pages before a series of blank pages. — Ineuw talk 05:33, 23 May 2015 (UTC)

Assigned requests


Substituting out diacritic templates for the actual diacritic

I have started to substitute diacritic templates for the actual diacritic. Sample edits can be seen in my contribs. If there are no objections, I am going to continue the clean-up. Bye--Mpaa (talk) 14:43, 29 June 2014 (UTC)

Some more done as User:MpaaBot--Mpaa (talk) 19:12, 3 July 2014 (UTC)
Done.--Mpaa (talk) 09:43, 11 August 2014 (UTC)
Excuse me, but as a noob I don't understand. The templates like {{ae}} are not marked deprecated, and I know I read somewhere (sorry, I've read a lot of help files in the last week) that the templates should be used. Therefore I've been using them. Where is this policy explained? Sorry for posting this on this page. Laura1822 (talk) 19:05, 4 September 2014 (UTC)
In general, these might be replaced anytime: Category:Diacritic_templates, see top line. But you're probably right about {{ae}} and similar. As I have applied this list Category_talk:Templates_that_can_be_substituted, I think some instances might have been substituted. That list should be fixed if we do not want {{ae}} & co. to be replaced in future.
TBH, I do not know if this feature is still on: "support functionality for automatically turning on and off the display of ligatures". Comments welcome also from others.--Mpaa (talk) 20:48, 4 September 2014 (UTC)
So as a proofreader, should I use the template or insert the ligature directly? It's certainly easier to do the latter. Yes to some but not to others? What is the point of the template(s)? Laura1822 (talk) 13:18, 5 September 2014 (UTC)
I can't remember too well, but I think it was a compatibility issue. Some displays couldn't render the ligatures, so to avoid a person seeing a little box, the template would break it into the "ae" or "oe" letters instead.—Zhaladshar (Talk) 13:26, 5 September 2014 (UTC)
Template purpose is to assist in the entering of diacritics, see Help:Templates#Character_formatting. IMO, both ways are OK.--Mpaa (talk) 13:35, 5 September 2014 (UTC)
(edit conflict) @Laura1822: for some they didn't know how to do the ligatures, so we just made templates available. You are welcome to use the templates or the actual characters. Every (not) so often, I run a bot through and do a replacement. It is neither here nor there on the frequency, and in many cases it is not overtly necessary, though for a number of edge cases too many templates on a page is problematic. So in the end, it is just worthwhile running a bot through to do the work. Thanks for the question, it is worthwhile, and if you can think of some help pages where you see that it is of value to add the information, then we can look to do that. — billinghurst sDrewth 13:38, 5 September 2014 (UTC)
Thanks! For what it's worth, on the Help page Mpaa linked above, I had completely missed the explanation that it was optional, since I was looking at the tables (probably looking for something else at the time). If the usage of the templates is somewhat obsolete, or intended only for use by proofreaders who can't see them for some reason, then I think that could be clarified. It would help for the templates themselves to have an explanation, or at least a statement that they are optional. Thanks! Laura1822 (talk) 13:56, 5 September 2014 (UTC)
I have improved template pages for Diacritics with doc page.--Mpaa (talk) 20:31, 5 September 2014 (UTC)
This point is still not clear to me: is this feature still used: "support functionality for automatically turning on and off the display of ligatures (see {{ae}})" or we can substitute {{ae}} as well?--Mpaa (talk) 18:18, 5 September 2014 (UTC)
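The substitution discussed in this section can be sketched as a lookup from template name to character. This is an illustration only: the two-entry mapping is an assumption for the example and is not the list at Category_talk:Templates_that_can_be_substituted, and the `skip` parameter is a hypothetical way to hold back templates such as {{ae}} while the open question above remains unresolved.

```python
import re

# Illustrative mapping only; the real substitution list lives on-wiki.
LIGATURES = {"ae": "æ", "oe": "œ"}

# A parameterless template call such as {{ae}} or {{ ae }}.
SIMPLE_TEMPLATE = re.compile(r"\{\{\s*([^{}|]+?)\s*\}\}")


def substitute_templates(wikitext, mapping=LIGATURES, skip=frozenset()):
    """Replace known diacritic templates with the character they stand for.

    Templates listed in `skip`, or absent from `mapping`, are left as-is.
    """
    def replace(match):
        name = match.group(1)
        if name in skip or name not in mapping:
            return match.group(0)
        return mapping[name]

    return SIMPLE_TEMPLATE.sub(replace, wikitext)
```

Calling `substitute_templates(text, skip={"ae"})` would substitute the other templates while leaving {{ae}} in place until its status is decided.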