Wikisource:Bot requests/Archives/2011

Warning: Please do not post any new comments on this page. This is a discussion archive first created on 01 January 2011, although the comments contained were likely posted before and after this date. See current discussion or the archives index.

Requests

Years of works

As I update many {{PD-old-70}} to show why PD in the USA, I sometimes move categories of years of works into the {{header}} with the parameter "year=", but I would like to ask if anyone can make a bot to do this task, such as converting [[Category:1900 works]] to | year = 1900 in the header. Manually changing these is time-consuming.--Jusjih (talk) 02:11, 14 October 2009 (UTC)

Had a start at this, and you can see the AWB code that I used at User:SDrewthbot/AWB_modules#removing_Category:.5Cd.5Cd.5Cd.5Cd_works.2C_replace_with_year_in_header billinghurst sDrewth 14:47, 21 October 2010 (UTC)
Finished the recurse categories of Category:19th century works, and working through the 18thC now. I do have a query: what do we want to do with works in the top level of that category, i.e. with no specific year of the work? My initial thought would be that, as they are non-specific, adding them as a year is unproductive and they should be left as they are. I have converted those that are within a decade to YYYYs works as that seemed close enough. — billinghurst sDrewth 01:14, 3 November 2010 (UTC)
To keep track of progress see User:SDrewthbot
To note issues and thoughts that have been raised with me about some of the varied uses of category addition
  • Adding category to subpages. Solutions: where the work is a periodical, keep the year parameter; where the work is chapters, or similar, remove the year; where the work is an edition of an earlier work, keep the year category.
  • Where categories are added to subpages of works where the original work is later incorporated into a collection. Solution: add the category manually rather than use the year parameter, which could be misleading, especially due to its placement against the title rather than the section.
  • Where a work may have been published as a series (sometimes over years) in a journal, so where parts are transcluded together multiple years are required. Solution: add the first year as the year parameter and add subsequent years manually.
  • Where volumes of a work are created over multiple years (and added as subpages). Solution: add the year parameter to each of the volumes, and retain whatever has been used at the top level of the work.
Added here to record findings against request. — billinghurst sDrewth 12:05, 6 November 2010 (UTC)
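The conversion requested above (category to header parameter) can be sketched in Python. This is only an illustration of the text substitution, not the AWB module that was actually run; the function name and the rule of skipping pages that already carry a year parameter or more than one year category are assumptions, chosen to stay clear of the multi-year cases listed above.

```python
import re

# Matches a year category such as [[Category:1900 works]] plus its newline.
YEAR_CAT = re.compile(r"\[\[Category:(\d{4}) works\]\]\n?")
# Matches the opening of the header template.
HEADER_OPEN = re.compile(r"\{\{header", re.IGNORECASE)
# Detects an existing year parameter so it is never overwritten.
HAS_YEAR = re.compile(r"\|\s*year\s*=")


def move_year_to_header(wikitext: str) -> str:
    """Move a single 'YYYY works' category into the header (hypothetical helper)."""
    years = YEAR_CAT.findall(wikitext)
    # Act only on the unambiguous case: one year, a header, no year parameter yet.
    if len(set(years)) != 1:
        return wikitext
    if not HEADER_OPEN.search(wikitext) or HAS_YEAR.search(wikitext):
        return wikitext
    without_cat = YEAR_CAT.sub("", wikitext)
    return HEADER_OPEN.sub(
        lambda m: f"{m.group(0)}\n | year = {years[0]}", without_cat, count=1
    )
```

Pages with several year categories, or with an empty "year =" already present, are returned unchanged and would still need a manual look.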

Tag all pages that use {{use page image}} as "Problematic"

Pages that use {{Use page image}} are Problematic by definition. This list of Page: pages that use that template contains a great many pages that are tagged otherwise. Hesperian 01:02, 9 February 2010 (UTC)

and I would say that the term problematic itself adds to the confusion. They are not problems, they are unresolved or pending work. That said, I would agree that it is probably the closest along with non-proofread. I would prefer that we found an image processing slave, rather than a bot job. billinghurst sDrewth 12:50, 9 February 2010 (UTC)
Template:Book cover is a variant of this template, and I have proposed that template for deletion, see Wikisource:Proposed deletions#Template:Book cover. — billinghurst sDrewth 01:26, 3 November 2010 (UTC)
Done. Beyond a specific batch under discussion, they are either marked "Problematic" or the template has been removed from the respective pages. — billinghurst sDrewth 06:17, 3 November 2010 (UTC)

Move Wikisource index pages

As per Wikisource:Scriptorium#Attempt_at_a_summary, it would be helpful for a bot to move all pages in Category:Wikisource index pages to the portal namespace, except Wikisource:About, Wikisource:Index, Wikisource:Index/Tools and scripts, Wikisource:Index/Community, Help:Contents, and Wikisource:Authors. The pages that are being moved should have their names maintained, so for example Wikisource:Virginia becomes Portal:Virginia. The old pages should be converted to soft redirects. —Spangineerwp (háblame) 23:25, 10 June 2010 (UTC)

Not done. AdamBMorgan (talk · contribs) is undertaking these manually as he provides an underlying structure to the Portal: namespace. — billinghurst sDrewth 13:42, 7 December 2010 (UTC)

Marking index pages as 'Not proofread'

Asking for the bot to process Index:A Study of Mexico.djvu to save and assign 'Not proofread' to the pages. Thanks. - Ineuw (talk) 15:17, 28 June 2010 (UTC)

Comment. That would be a bot that applies the text layer, not specifically to mark as not proofread. Last that I knew that did that was User:mjbot, though it has generally not been required now that the text layer can be added as the pages are created, which was not the case when the bot was created. — billinghurst sDrewth 06:20, 3 November 2010 (UTC)

joining lines considering hyphens when it imports a Page from OCR

Please set Ryuchbot for this purpose. --Ryuchbot (talk) 07:28, 20 July 2010 (UTC)

Please see the information and process at Wikisource:Bots. — billinghurst sDrewth 14:01, 20 July 2010 (UTC)
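For reference, the kind of line joining named in this request can be sketched as below. It is a minimal illustration under stated assumptions, not Ryuchbot's code: the function name is hypothetical, a hyphen at the end of a line is always treated as a word break, and a blank line is taken to mark a paragraph boundary.

```python
import re

# A word split across lines: letter, hyphen, newline, letter.
HYPHEN_BREAK = re.compile(r"(\w)-\n(\w)")
# A single newline that is not part of a blank-line paragraph break.
SOFT_BREAK = re.compile(r"(?<!\n)\n(?!\n)")


def join_ocr_lines(text: str) -> str:
    """Rejoin hyphenated words, then unwrap lines within each paragraph."""
    joined = HYPHEN_BREAK.sub(r"\1\2", text)
    return SOFT_BREAK.sub(" ", joined)
```

Genuine compounds that happen to break at the hyphen (for example "well-" followed by "known") would be merged into one word by this rule, so the output still needs proofreading.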

Move pages of poorly named DJVU file

The pages at Index:Whatsocialclasse00sumnrich.djvu have all been proofread. The DJVU file should be moved to File:What Social Classes Owe to Each Other.djvu on Commons. Once that is done, the index can be moved along with all the corresponding pages. Let me know if you need assistance on Commons (I don't think moving files requires admin rights, but if it does I can do so). —Spangineerwp (háblame) 15:07, 30 August 2010 (UTC)

I am not aware of any bot tool that we have that moves files. Happy to learn if one exists, and how we can get to use it. — billinghurst sDrewth 06:28, 3 November 2010 (UTC)
Moving the files isn't what the bot is needed for; just the page moves: I can move the file if someone's bot can then do the page moves. —Spangineer (háblame) 19:39, 3 January 2011 (UTC)
My bot can move these pages: so I'm now moving the .djvu.
FYI, This tool can move some files from one project to another. JackPotte (talk) 20:05, 3 January 2011 (UTC)
The 180 pages are now being moved... JackPotte (talk) 19:21, 4 January 2011 (UTC)
Done. JackPotte (talk) 20:00, 4 January 2011 (UTC)
Wonderful, thank you! —Spangineer (háblame) 22:21, 4 January 2011 (UTC)
I have been through and fixed up the transclusions. We had blank bodies in the main namespace. Oopsie. — billinghurst sDrewth 23:37, 4 January 2011 (UTC)

Template:PageQuality

The following discussion is closed.

Need to run a bot through to save all pages that utilise the template:PageQuality. This will convert all the pages to use <pagequality>. (non-urgent) — billinghurst sDrewth 13:21, 7 December 2010 (UTC)

As far as I could see this template isn't called directly in all pages: the function pageQuality() from MediaWiki:Common.js does it. Consequently it would be wiser to ask ThomasV what exactly he had in mind. JackPotte (talk) 19:32, 3 January 2011 (UTC)
{{PageQuality}} is the old style, and obviously a template. ThomasV has converted to the tag <pagequality> for better measures and updates. As pages with the old template are saved, mw:Extension:ProofreadPage automatically converts them. The bot process is to just go through and save each file, it is about getting around to it. — billinghurst sDrewth 12:22, 4 January 2011 (UTC)
Tens of thousands of edits, for what purpose? I like the idea of doing this for 'Validated' and 'Without text' pages, as presumably these will only get updated to the new tag if a bot does it. But 'Not proofread', 'Proofread' and 'Problematic' pages are by definition awaiting further edits, so there seems no need to do anything but wait patiently. Hesperian 12:34, 4 January 2011 (UTC)
I didn't do the detail at the time, it was a parked thought bubble. I have already done the old "without text" when I was checking those 18k of files looking for mislabelled text and images. I like your suggested plan to do the validated, and then look and review. — billinghurst sDrewth 12:59, 4 January 2011 (UTC)
I can provide a list if you want. Hesperian 00:49, 5 January 2011 (UTC)
ThomasV has been doing it. Billinghurst (talk) 15:53, 30 March 2011 (UTC)
Done. There was some remainder and I have wiped them away. — billinghurst sDrewth 12:38, 6 May 2011 (UTC)

Template:Dropcap & Template:Dropinitial

Can all instances of {{Dropcap}} be replaced with {{Dropinitial}}? Dropcap has been replaced but is still in use. - AdamBMorgan (talk) 19:45, 2 March 2011 (UTC)

Done, and converted it to a redirect. — billinghurst sDrewth 22:52, 7 May 2011 (UTC)

Template:Portals, Template:Indexes & headers

The function of the templates {{Indexes}} and {{Portals}} has been replaced with the

| portal = 

parameter in the headers. Can all uses of these templates therefore be replaced by bot with the parameter? This will involve replacing pipes (|) with slashes (/). - AdamBMorgan (talk) 19:45, 2 March 2011 (UTC)

Underway, code for AWB custom module at User:SDrewthbot/AWB_modules. It is for indexes, though will be readily modifiable for portals. Billinghurst (talk) 13:41, 29 March 2011 (UTC)
Both done. Billinghurst (talk) 14:33, 30 March 2011 (UTC)

There are a few minor issues with some pages in the Portal: ns which I have left a separate note for ABM. Otherwise this looks complete and the templates can be removed. Billinghurst (talk) 15:55, 30 March 2011 (UTC)
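The replacement described in this request (template call to header parameter, pipes to slashes) might look like the following in Python. It is a sketch only and not the AWB custom module that was used; the function name is hypothetical, and it assumes one {{Indexes}} or {{Portals}} call per page with no nested templates and no existing portal parameter.

```python
import re

# A simple, un-nested call such as {{Portals|Virginia|United States}}.
TEMPLATE = re.compile(r"\{\{(?:Indexes|Portals)\|([^{}]*)\}\}\n?", re.IGNORECASE)
HEADER_OPEN = re.compile(r"\{\{header", re.IGNORECASE)
HAS_PORTAL = re.compile(r"\|\s*portal\s*=")


def template_to_portal_param(wikitext: str) -> str:
    """Fold an Indexes/Portals template into the header's portal parameter."""
    match = TEMPLATE.search(wikitext)
    if match is None or not HEADER_OPEN.search(wikitext) or HAS_PORTAL.search(wikitext):
        return wikitext  # nothing to do, or too risky to edit blindly
    # Template arguments are pipe-separated; the parameter wants slashes.
    names = [part.strip() for part in match.group(1).split("|")]
    value = "/".join(name for name in names if name)
    stripped = wikitext[: match.start()] + wikitext[match.end():]
    return HEADER_OPEN.sub(
        lambda m: f"{m.group(0)}\n | portal = {value}", stripped, count=1
    )
```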

Mass deletion of not-proofread pages

The previous djvu file in Index:Horse shoes and horse shoeing.djvu was not complete; I replaced it and manually aligned the text of the proofread pages. The remaining not-proofread pages can now be deleted (as suggested by Billinghurst). Please delete:

Thanks. --Alex brollo (talk) 06:43, 2 February 2011 (UTC)

I'll take it. Script will be posted to User:TalBot/horseshoe-delete.py soon.--GrafZahl (talk) 09:11, 2 February 2011 (UTC)
The script is ready. Unless there are exceptions objections, I'll start it tomorrow (UTC).--GrafZahl (talk) 09:58, 2 February 2011 (UTC)
Thanks. I'll be more careful next time while choosing the best djvu file. :-( --Alex brollo (talk) 10:08, 2 February 2011 (UTC)
No prob, and of course, I meant "objections" above.--GrafZahl (talk) 10:16, 2 February 2011 (UTC)
Thanks :-) --Alex brollo (talk) 11:22, 3 February 2011 (UTC)

Done. You're welcome.--GrafZahl (talk) 12:35, 3 February 2011 (UTC)

Bot Request: Save DJVU pages

Index:Os Lusíadas (Camões, tr. Burton, 1880), Volume 1.djvu could stand to have a bot run over it to just save each of the 300 pages as "Not Proofread", showing only the OCR text. Isn't there a bot to do this? TheSkullOfRFBurton (talk) 02:40, 31 March 2011 (UTC)

Not done. No readily available script and no designated discernible value in applying text layer. — billinghurst sDrewth 23:45, 10 April 2011 (UTC)

A bot to remove the ? marks

Do we have a bot which removes the w:replacement character (U+FFFD), embedded at the beginning of the text lines of not proofread pages? If we have one, would it be possible to clean up the PSM Pages? Thanks. — Ineuw talk 20:14, 10 April 2011 (UTC)

We could run a bot through them, however, if it is pages that you are going to edit, and the characters are just of nuisance value, then it is probably more worthwhile creating a regex script, either as part of a general cleanup script or as a separate script. You could then just run (click) it when you edit the pages. Running a bot through lots of non-proofread pages to remove a few characters and leave the other text otherwise unaffected is not the most productive. — billinghurst sDrewth 23:43, 10 April 2011 (UTC)

Please don’t bother with a script because I remove them automatically when editing. My concern was that (article title pages excepted) I am assembling not-proofread pages as the titles are harvested, and this would have made the articles more decent looking for the casual reader. Am aware and agree that unfinished pages should not be posted on the main namespace, but this puts me in a sort of a Catch-22 situation and this idea was a stop-gap idea. BTW, thanks for the quick reply.— Ineuw talk 00:38, 11 April 2011 (UTC)
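The regex cleanup suggested in this thread can be illustrated as follows. This is a sketch in Python rather than the in-browser edit script that was discussed, and the function name is hypothetical. It only touches replacement characters at the start of a line, along with the spaces or tabs immediately around them, and leaves any that occur mid-line alone.

```python
import re

# One or more U+FFFD characters at the start of a line, with adjacent blanks.
LEADING_REPLACEMENT = re.compile(r"^[ \t]*\uFFFD+[ \t]*", re.MULTILINE)


def strip_leading_replacement_chars(text: str) -> str:
    """Remove replacement characters embedded at the beginning of text lines."""
    return LEADING_REPLACEMENT.sub("", text)
```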

1001 Children's books

Can someone make a bot to remove the category: Category:1001 Children's Books You Must Read Before You Grow Up from the works listed in there, as well as add

|portal = Children's literature

to each work? Thank you. - Theornamentalist (talk) 23:42, 3 June 2011 (UTC)

Done, and with a general tidy of the works. — billinghurst sDrewth 13:52, 4 June 2011 (UTC)
Thank you - Theornamentalist (talk) 15:29, 8 June 2011 (UTC)

Add alternate URLs for US Federal Case Law

I noticed that most of the US case law that we index (e.g. Federal_Reporter/Second_series/Volume_488, etc.) is available at bulk.resource.org (e.g. http://bulk.resource.org/courts.gov/c/F2/488/ ) and Justia (e.g. http://cases.justia.com/us-court-of-appeals/F2/488 ), as well as OpenJurist (who took the initiative to make the index pages). Considering that bulk.resource.org were the first to make it available free on the 'net, we should certainly include them -- and we might as well include Justia, too. JesseW (talk) 08:17, 21 October 2010 (UTC)

Include them to what end??? To bypass the eventual importation of all case opinions to Wikisource - as is the case with the United States Reports currently in progress??? Granted, without Archive.org existing, BenchBot could not automatically whittle down the pile of work as it does now, but it's not like they aren't littered with mistakes & omissions over there either.

Any interlinks you've seen go from red to blue in any of the other reporters currently listed and framed on WS just happen to have the same case names as ones being created for the United States Reports part of the USSC Project. These will all require disambiguation but we're just not at that point yet.

Sorry; I just don't follow the reasoning for wanting to add something that is, in effect, going to be superseded by the Wikiproject(s) anyway, but I'm sure other folks will review this request irregardless. George Orwell III (talk) 10:42, 21 October 2010 (UTC)

Er, no -- of course it's better to have our own, well-curated copy -- but the two don't seem opposed. It's good to link to useful sources of information (as these are, particularly the bulk.resource.org ones which seem easiest to read, with less distracting stuff added on the side), and it's good to import it into Wikisource. I guess I "just don't follow the reasoning" for opposing linking to multiple copies of material on the basis that we will (eventually) also have a local copy. JesseW (talk) 02:25, 22 October 2010 (UTC)
Maybe we've talked past each other here. I have absolutely NO objection to linking external sources - quite the opposite actually. My problem is 2-fold. The case reporter that was the furthest along was the United States Reporter even before I wound up here. The person (or people?) who set up most of the case lists & volumes, transcribed a nice chunk of case opinions and added many other useful portions related to US Law that the current project people are still using or improving today didn't make things easy. The one thing that in retrospect was a rather large oversight was to use full URLs -- like the ones below pointing back to OpenJurist...
* [hftp://openjurist.org/37/us/1   37 U.S.  1]  (1838)  [[United States v. Laub]]
* [hftp://openjurist.org/37/us/11  37 U.S. 11]  (1838)  [[Lessee of Swayze v. Burke]]
* [hftp://openjurist.org/37/us/27  37 U.S. 27]  (1838)  [[United States v. Woolsey]]
* [hftp://openjurist.org/37/us/32  37 U.S. 32]  (1838)  [[Bank of the United States v. Daniel]]
* [hftp://openjurist.org/37/us/59  37 U.S. 59]  (1838)  [[Bradstreet v. Thomas]]
* [hftp://openjurist.org/37/us/66  37 U.S. 66]  (1838)  [[McKinney v. Carroll]]
* [hftp://openjurist.org/37/us/72  37 U.S. 72]  (1838)  [[United States v. Coombs]] ....etc etc
... and did the same for nearly every inline citation, opinion, footnote, dissent, law review and similar with no rhyme or reason for others to adhere to moving forward. Not only are the full URLs used but they can jump around to the different legal sites from one sentence to the next and then to a third in the one after that!!!!
A template should have been used early on that automated and enabled a minimal window into doing some upkeep by modifying the values in one template if need be rather than thousands of citations across all the cases. Adding to this current state with what you described only introduces another level of entries that will add to the BOTs workload while at the same time introducing a greater chance of botching the conversion(s) in the process.
The other issue is the lack of such templates. The only template that does anything like what is needed on Wikisource right now is {{Federal reporter}} -- and that isn't even set up for all the different external sites last I checked. We need to import, modify, then validate something like Wikipedia's Template:Ussc ({{USSC}}) over here on Wikisource for starters before we even get to throwing a BOT at it. Wouldn't you agree? George Orwell III (talk) 06:44, 22 October 2010 (UTC)
I certainly do. Thank you so much for clarifying! That is indeed a depressingly convoluted mess, and should certainly be straightened out before we set a bot on further confusing things. If/when I have some time, I'll see what I can do to help. JesseW (talk) 19:07, 22 October 2010 (UTC)
I am glad we are on the same page too! Pitching in is the most helpful thing you can do while this request stays 'Open'. As long as you see your entry on this page, the request is open to anyone willing to address it before the folks over on the USSC Project (& BenchBot) do. While the lower number of regular users compared to Wikipedia has many advantages for Wikisource, the honest truth is that this ties up limited resources like BOTS and these requests can take a long time to get to (if at all). In the meantime, helping out is always appreciated and if you run into a roadblock or have a question, stop by the Law Portal or USSC Project page - drop me a note directly even if need be. George Orwell III (talk) 22:44, 22 October 2010 (UTC)

Add in tables

Could someone help me out or find a bot that could make all of the rest of the pages for the book (Index:List of residents 1.djvu) to look like page 10? - Tannertsf (talk) 01:45, 13 May 2011 (UTC)

I don't think so. Best you can do is to look to create a template of the header and have that as an easy addition. Formatting the pages to take text into the tables is going to be a problem on raw text. If you went through and corrected it, had each column on a new line, and a double return between each data row, then we can probably bot the formatting. — billinghurst sDrewth 16:18, 13 May 2011 (UTC)

I was looking for more of a formatting bot, and then we could put the text in. - Tannertsf (talk) 16:20, 13 May 2011 (UTC)

It would be easier the other way around. The text is already there. Fix it to be cell per line, double gap between sets, and we can run the bot through. — billinghurst sDrewth 16:24, 13 May 2011 (UTC)
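The layout proposed here (one cell per line, a blank line between data rows) is simple to turn into table markup. The sketch below shows one way, assuming plain wiki table syntax with no header row or styling; the function name is hypothetical and nothing here reflects the actual formatting of page 10 of the index.

```python
import re


def blocks_to_wikitable(text: str) -> str:
    """Convert 'cell per line, blank line between rows' text to a wiki table."""
    # Split on blank lines; each block is one data row.
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    lines = ["{|"]
    for block in blocks:
        lines.append("|-")  # row separator
        for cell in block.splitlines():
            if cell.strip():
                lines.append(f"| {cell.strip()}")
    lines.append("|}")
    return "\n".join(lines)
```

For example, two rows of two cells each come out as a table with two "|-" rows, each holding two "| cell" lines.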

Moving Pages for relocated Index

The index for A Little Pretty Pocket Book was moved from Index:A Little Prettly Pocket-book.djvu to Index:A Little Pretty Pocket-book.djvu, but apparently the Pages don't move with the index. Could I get a bot to relocate the proofread pages?--T. Mazzei (talk) 00:02, 15 July 2011 (UTC)

Done with tabbed browsing. Sometimes it can be quicker than setting up a bot run. :-) Hesperian 00:49, 15 July 2011 (UTC)

Follow-up to BOT run

... for about 60 Catholic Encyclopedia (1913) articles that didn't seem to get their standard mainspace headers swapped out for the project's customized one, {{CE13}}, during one of the becoming-all-too-frequent unannounced & undocumented maint. BOT runs awhile back.

Manually took care of most non-CE13 strays already. The remaining bangs are with Category:Headers missing parameters. -- George Orwell III (talk) 04:37, 31 July 2011 (UTC)

Done. billinghurst sDrewth 09:47, 31 July 2011 (UTC)

Special:WantedCategories

I would like to empty this page with my bot, according to Category:United States Supreme Court decisions in Volume 1. JackPotte (talk) 15:23, 15 August 2011 (UTC)

Done. JackPotte (talk) 14:28, 20 August 2011 (UTC)

Line feed fix

Formally noting here (as discussed at WS:S) that I have fixed the line feed issue that came about from the modification to the proofread page javascript. Details at User:SDrewthbot/trim trailing LF. billinghurst sDrewth 11:08, 29 October 2011 (UTC)

Bulk move request PSM v43

Have another need for bulk move script per this request on my talk page.

In short, in volume 43 of the Popular Science Monthly project we need to move everything from Djvu position 579 to the new end (902) by an offset of +2. TIA. -- George Orwell III (talk) 12:05, 10 November 2011 (UTC)

... then I'd like to buy a vowel, Alex George Orwell III (talk) 20:32, 11 November 2011 (UTC)
Done. Any articles pointing to those pages should have their pages tags adjusted. Inductiveload (talk/contribs) 05:22, 14 November 2011 (UTC)
The single article using those pages was adjusted. Inductiveload (talk/contribs) 07:11, 14 November 2011 (UTC)
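An offset move of this kind has one ordering pitfall: with a positive offset, the highest page must move first, otherwise a page would be moved onto a title that is still occupied. The sketch below only plans the moves as (old title, new title) pairs and performs nothing on the wiki. The function name and the "Example.djvu" index name used in the example are placeholders, and it is not the bulk move script that was actually run.

```python
def plan_page_moves(index_name: str, first: int, last: int, offset: int):
    """Return (old_title, new_title) pairs in a collision-free order."""
    if offset == 0:
        return []
    positions = range(first, last + 1)
    # Positive offset: start from the top so every target is already vacated.
    # Negative offset: start from the bottom for the same reason.
    ordered = reversed(positions) if offset > 0 else positions
    return [
        (f"Page:{index_name}/{n}", f"Page:{index_name}/{n + offset}")
        for n in ordered
    ]
```

Calling plan_page_moves("Example.djvu", 5, 7, 2) plans 7 to 9, then 6 to 8, then 5 to 7. Any pages tags in transcluding articles still have to be adjusted afterwards, as noted above.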

Bulk move request for PSM volumes 43 and 47

These two volumes need the bulk move script to be run. Thank you. — Ineuw talk 23:46, 13 November 2011 (UTC)

Done for both. Inductiveload (talk/contribs) 07:05, 14 November 2011 (UTC)