User talk:Mpaa

From Wikisource

Welcome

Hello, Mpaa, and welcome to Wikisource! Thank you for joining the project. I hope you like the place and decide to stay. Here are a few good links for newcomers:


You may be interested in participating in

Add the code {{active projects}}, {{PotM}} or {{CotW}} to your page for current wikisource projects.

You can put a brief description of your interests on your user page and contributions to another Wikimedia project, such as Wikipedia and Commons.

Have questions? Then please ask them at either

I hope you enjoy contributing to Wikisource, the library that is free for everyone to use! In discussions, please "sign" your comments using four tildes (~~~~); this will automatically produce your IP address (or username if you're logged in) and the date. If you need help, ask me on my talk page, or ask your question here (click edit) and place {{helpme}} before your question.

Again, welcome! — billinghurst sDrewth 12:00, 7 April 2011 (UTC)

Help with data extraction

Hi. I am asking for your help extracting some text data from the Wikisource text databases, because so far I haven't managed it myself.

The data I need is the first 105 characters of every paragraph, from this page to this page. This text may or may not already contain a wiki link and an anchor, but the necessary part is the words "Page nnn" followed by the reference number. From these I convert and create links and generate the anchors in the text pages.

Reference anchor:
{{fs90/s}}{{anchor|463-1}}[[Page:The Conquest of Mexico Volume 1.djvu/53#53-1|Page 9 (<sup>1</sup>)]].—

Main text source:
{{anchor|53-1}}[[Page:The Conquest of Mexico Volume 1.djvu/463#463-1|<sup>1</sup>]]

So as not to confuse you: both ends are, or will be, anchored and linked for convenience. — Ineuw talk 19:25, 25 August 2015 (UTC)

Done: User:Ineuw/Sandbox. Hope I got you right.— Mpaa (talk) 21:44, 25 August 2015 (UTC)
Perfect, many many thanks.— Ineuw talk 22:36, 25 August 2015 (UTC)
@Mpaa: Could you extract a new data set with the same parameters? Using the previous list I found a number of duplications, which I corrected. Thanks in advance, and don't forget to bill me.— Ineuw talk 09:06, 29 August 2015 (UTC)
Done: User:Ineuw/Sandbox.— Mpaa (talk) 10:08, 29 August 2015 (UTC)
Much thanks :-) — Ineuw talk 18:22, 29 August 2015 (UTC)
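
For anyone wanting to reproduce an extraction like this one, the request above can be sketched in Python. This is only an illustration, not the script that was actually run: the function name, the blank-line paragraph split and the regular expression are assumptions based on the "Page nnn" sample line quoted in the request.

```python
import re

# Assumed shape of the reference text, e.g. "Page 9 (<sup>1</sup>)" or "Page 9 (1)".
# "Page:The ..." inside a wiki link is not matched, because whitespace must follow "Page".
REF_PATTERN = re.compile(r"Page\s+(\d+)\s*\((?:<sup>)?(\d+)(?:</sup>)?\)")


def extract_references(wikitext, width=105):
    """Return (page, ref, head) for each paragraph whose first `width`
    characters contain a "Page nnn (ref)" reference."""
    results = []
    # Assumption: paragraphs are separated by one or more blank lines.
    for paragraph in re.split(r"\n\s*\n", wikitext):
        head = paragraph.strip()[:width]
        match = REF_PATTERN.search(head)
        if match:
            results.append((int(match.group(1)), int(match.group(2)), head))
    return results
```

Each returned tuple holds the page number, the reference number and the 105-character head, which should be enough to build the anchor and link pair shown in the request.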

Volume 2

@Mpaa: Could you please extract the data as above, but for volume 2, and place it again in User:Ineuw/Sandbox? The page range is BEGINNING HERE and ENDING HERE. I am sure that after I have made corrections I will come back for another data extraction. Thanks in advance.— Ineuw talk 03:50, 5 September 2015 (UTC)

done.— Mpaa (talk) 07:16, 5 September 2015 (UTC)
Thanks. — Ineuw talk 19:45, 5 September 2015 (UTC)

Categories by gender

Hi, there! There is a small problem in the current version of the author template with fetching gender data from Wikidata. The problem arises when the Wikidata gender value is set to "unknown value" (see for example Author:C. E. Brewster). Cheers, Captain Nemo (talk) 01:10, 1 September 2015 (UTC).

Tried to fix it. I added Category:Author pages with unknown gender in Wikidata.— Mpaa (talk) 20:45, 1 September 2015 (UTC)

Publication year of Shakespeare's sonnets


Back in 2012 you added |year=1598 to all of Shakespeare's sonnets (see e.g. Sonnet 4 (Shakespeare)). As far as I know the sonnets were all first published in 1609, except 138 and 144, which had previously appeared, probably through piracy, in The Passionate Pilgrim in 1599. Is there any particular reason you have it down as 1598 here? --Xover (talk) 19:51, 10 September 2015 (UTC)

I just moved the year from being an explicit category to a parameter of the header template (which does automatic categorisation by year), so I guess you need to find out who put Category:1598 works on the page.— Mpaa (talk) 18:35, 11 September 2015 (UTC)
Thanks. --Xover (talk) 15:51, 12 September 2015 (UTC)

Updated scripts

Hi Mpaa. I edited your common.js, Regexp toolbar.js, and works.js to update you to the latest version of TemplateScript. You were using a much older version called regex menu framework, so you should notice a lot of improvements. A few of the big changes:

                     regex menu framework   TemplateScript
regex editor                                ✓ an improved regex editor which can save your patterns for later use
compatibility        unknown                ✓ compatible with all skins and modern browsers
custom scripts       limited                ✓ much better framework for writing scripts
supported views      edit                   ✓ add templates and scripts for any view (edit, block, protect, etc)
keyboard shortcuts                          ✓ add keyboard shortcuts for your templates and scripts

I also updated deprecated functions. Let me know if anything breaks. :) —Pathoschild 05:02, 12 September 2015 (UTC)

Hi. Thanks a lot!— Mpaa (talk) 08:00, 12 September 2015 (UTC)

use of

Hi. Are you using to populate pages? I tried

 python -lang:en -family:wikisource -djvu:A_cyclopedia_of_American_medical_biography_vol._1.djvu -index:A_cyclopedia_of_American_medical_biography_vol._1.djvu

then it spat out a string of error messages. Before I start on the fuller diagnosis, I am wondering if you have used the tool, or whether it is compat mode, and I am bashing my head against the desk. Thanks. — billinghurst sDrewth 15:23, 23 September 2015 (UTC)

There is a bug. Will look into it.— Mpaa (talk) 17:01, 23 September 2015 (UTC)
Actually there were a couple. I submitted a patch. Hopefully it will be merged soon (it is difficult to forecast approval time). It is a good thing that someone is starting to use these scripts, as they have very few users and are either new or ported to core, so new bugs might pop up. @John Vandenberg: might help with the approval.— Mpaa (talk) 18:00, 23 September 2015 (UTC)
Merged. Hope it is OK now.— Mpaa (talk) 19:23, 23 September 2015 (UTC)
Thanks. I will have a go later. I sit in Freenode's IRC #pywikibot so I am always happy to prod them for coding checks; I just wasn't comfortable doing so on a lesser-used and specific script.

My plan is to get some of these biographical compilation works pushed through, as that makes them findable in search and may therefore get some people chipping away at them. [@Charles Matthews: FYI.] I am also thinking that we can set up some mini/sub-projects that may be self-sustaining if we get the parent guidance and components right, leveraging what we learnt from the DNB. Once that is going, I then want to look at some of those publications like Gentleman's Magazine, etc. which are otherwise referenced in these biographical works. I am not looking to use it on chaptered works, of either fiction or non-fiction. — billinghurst sDrewth 05:35, 24 September 2015 (UTC)

Potentially crossing the line with this question.

Hi again,

Not that your answer will prevent my supporting you to get the 'crat bit either way -- nor does it preclude the possibility that I'm being outlandishly cautious in my own little world of concerns -- but I'd feel better if I knew for sure; so here it goes...

Are you a resident of Australia?

I know how that may read but the ONLY reason I ask is I'm fairly sure the other 'higher-than-sysop' bit holders are residents of that great nation and I'm a bit concerned one good natural disaster there coupled with some local problem taking place at the same time here could present response issues given the right timing.

I do think you're well suited for the bit no matter how you answer (if at all; a simple 'y/n' will do), and I plan to support you, given that it is better to have someone competent in place despite the slim likelihood of any such confluence of events. Apologies tenfold if you feel this kind of question is beyond anything you should ever need to answer, never mind how 'poor-taste' my asking it in the first place may appear to some. All I'm really trying to establish is whether there may be a "gap" in observation & coverage going forward, since WS is a 24-hours-a-day, 7-days-a-week endeavor; that is mainly why I ask.

Sincerely. -- George Orwell III (talk) 20:59, 23 September 2015 (UTC)

No, in the "Old World" (EU). But there might be a "gap" all the same as I cannot commit to a daily supervision.— Mpaa (talk) 21:14, 23 September 2015 (UTC)
GOIII to note that we have one Oz CU, one Oz 'crat and we are as close together as the UK is to Russia; and I would think that we would be on very different information pipelines. The other CU, and the other 'crat are in the US, and I have no idea of their location to each other. — billinghurst sDrewth 06:03, 24 September 2015 (UTC)

Do you have a bot to auto add unproofed text from OCR?

I was wondering if it was possible to look into mass-adding text for this Index:Ruffhead - The Statutes at Large - vol 3.djvu so that BD2412 and others can run automated cleanup scripts. ShakespeareFan00 (talk) 23:48, 23 September 2015 (UTC)

I am experimenting with Wikisource-bot, as a proof of concept at this stage, where I am targeting a few biographical works that users 1) can find in a search, 2) may wish to fix for a biographical reference at WP, or 3) may want as a cross reference for a work here; as such they can dip in and out as their time permits, and transclude in small parts, and I think that is justifiable. Our previous issues with chaptered works just being added and forgotten about were considered somewhere between valueless and innocuous.

I sense an eagerness for its use, and I would think that the general discussion would need to again be raised to what and where the community thought would be of value to apply non-proofread text; and how it would be envisaged that people are encouraged to proofread its text. In short, that it is being curated, and transcribed, not set and left. Also the benefit in applying the text layer in that form against the issues about having it placed but not progressing in proofread status. So let me see if I can get the tool working, and approved by the community for my plans, then we can look at other uses and tasks with solutions. — billinghurst sDrewth 05:57, 24 September 2015 (UTC)

Fwiw, I have the following workflow with pywikibot:
  • load non-existing pages with pywikibot: the 'preload' functionality will fetch the text layer from the djvu file for me.
  • save all pages in a file that can be used by
  • do the typical clean-up work offline (rh, typos, blanks in punctuation, etc. using text editors, offline scripts, etc. whatever is best in that case)
  • once the result is good enough, I bulk-upload it
  • pywikibot would benefit if it could read/write files containing pages (there is ongoing work on this).
Another option is to work directly online, interposing a clean-up function between fetching the (not existing) page and saving it.
Mpaa (talk) 18:04, 24 September 2015 (UTC)
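
The clean-up step in the workflow above could look something like the following Python sketch. It is a guess at the kind of function meant, not the code actually used: the name clean_page, the drop_running_header flag and the two regular expressions are all assumptions, and no pywikibot calls are shown. Because it takes and returns plain text, the same function could serve either offline on a saved file or online between fetching a page and saving it.

```python
import re


def clean_page(text, drop_running_header=False):
    """Tidy one page of OCR text (illustrative rules only).

    Assumption: when drop_running_header is True, the first line of the
    page is the running header (rh) and is simply discarded.
    Limitation: spaced ellipses such as ". . ." would be collapsed too.
    """
    lines = text.splitlines()
    if drop_running_header and lines:
        lines = lines[1:]
    cleaned = []
    for line in lines:
        # Remove blanks before punctuation: "came ," becomes "came,".
        line = re.sub(r"[ \t]+([,;:.!?])", r"\1", line)
        # Collapse runs of spaces or tabs into a single space.
        line = re.sub(r"[ \t]{2,}", " ", line)
        cleaned.append(line.strip())
    return "\n".join(cleaned)
```

Real clean-up would need rules specific to each work (typos, scannos, footnotes), so this is best read as a skeleton to extend.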

Bot substitution & cleanup

Meant to thank you again for the bot work on War... No small favor. I am flying through the pages now, hoping I don't make too many errors as a result! Londonjackbooks (talk) 19:41, 28 September 2015 (UTC)

Happy to be helpful. Should you need help in future, just ask. In cases like this, a small effort can simplify a lot.— Mpaa (talk) 20:31, 28 September 2015 (UTC)

Crap!; or, What one says when pages are missing from one's book

Thought I'd ask here first; pages 278 & 279 are missing from War. I have images of the missing pages at the ready from another online version (same edition, different printing). The reason that pagination appeared to be squared away is because pages 296 & 297 repeat themselves after p. 297. All is well with pages after that. Do you know how this can be fixed? Sorry, and Thanks! Londonjackbooks (talk) 19:35, 29 September 2015 (UTC)

Yes, it can be fixed. Are you familiar with manipulating djvu files? You should remove the two duplicate images and insert the two new ones in the proper place. Then, we should shift pages correspondingly. If you can refer to djvu page number, it is clearer. If you are not familiar with the process, upload the two images here on WS and we will sort it out somehow.— Mpaa (talk) 20:31, 29 September 2015 (UTC)
I wish I knew how. But DJVU pages 312 & 313 should be removed (they are the pages which repeat), and pages shifted from DJVU pg 294. The pages to insert will fill DJVU pgs 294 & 295. I have uploaded the images. They are located in Category: User images, and are the only images listed. Let me know if I can do anything else, and thanks! Londonjackbooks (talk) 20:50, 29 September 2015 (UTC)
Should be OK now. Make a null edit (top right menu) if images are not aligned yet.— Mpaa (talk) 22:10, 29 September 2015 (UTC)
Great! Thank you so much for the fix... Glad it was repairable. Londonjackbooks (talk) 22:29, 29 September 2015 (UTC)
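
The page shift discussed in this thread can be worked through with a small Python sketch. It is only an illustration of the arithmetic, not the procedure that was followed; the function name and parameters are hypothetical, and the default numbers are the DJVU positions given above (two pages inserted at 294, duplicates at 312 and 313 removed).

```python
def shifted_position(old_page, insert_at=294, inserted=2, removed=(312, 313)):
    """Map a DJVU position in the old file to its position in the fixed file.

    Returns None for pages that are removed as duplicates.
    """
    if old_page in removed:
        return None
    new_page = old_page
    # Pages at or after the insertion point move down by the inserted count.
    if old_page >= insert_at:
        new_page += inserted
    # Every removed page that came earlier pulls later pages back up by one.
    new_page -= sum(1 for r in removed if r < old_page)
    return new_page
```

With these defaults, old positions 294 to 311 move to 296 to 313, and everything from 314 onwards stays where it was, which matches the observation that all is well with the later pages.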



You are now a bureaucrat. Enjoy the crushing pressure. :-D

Hesperian 02:28, 1 October 2015 (UTC)

Congratulations - good job! BD2412 T 03:39, 1 October 2015 (UTC)
Congratulations, excellent choice. . . . . and as they say in Budapest, today Rome, tomorrow the world. — Ineuw talk 22:36, 3 October 2015 (UTC)

Purging Index: ns

I have started Wikisource-bot on the task of purging all the Index: ns (started ~1200 GMT). That should clear up the issue of page editing not updating Special:IndexPage, and where you had identified that some indices were missing their class. — billinghurst sDrewth 12:10, 2 October 2015 (UTC)


Hi, I’ve been busy proofreading Nietzsche the thinker and wondered if you wanted some feedback on your bots’ work? Cheers, Zoeannl (talk) 01:35, 6 October 2015 (UTC)

Yes, that is appreciated, thanks. I actually cleaned up the text offline and just used the bot for fast page uploading.— Mpaa (talk) 21:21, 6 October 2015 (UTC)
The clean-up of footnotes is almost flawless with very few missed. The pre-italicizing is very convenient—to the extent I’ve had to stifle my annoyance when it doesn’t work! almost always because of a scanno. I've noticed 2 minor glitches: it consistently deletes anything on the line before a "Cf." and there have been a couple of missing following footnotes i.e. when split over 2 pages the footnote on the 2nd page isn't there.

Also, if you have done any clean-up on PSM so it's set up for easy proofreading, please let me know. I find such pages very soothing at the end of a long day—when Nietzsche is a bit much. :) Cheers, Zoeannl (talk) 08:08, 7 October 2015 (UTC)

@Zoeannl:, you might want to take a look at PSM 55.— Mpaa (talk) 20:58, 13 October 2015 (UTC)
Lovely. Cheers, Zoeannl (talk) 00:17, 14 October 2015 (UTC)
PSM 56, 58 and are good to go as well. BTW, thanks for proofreading Nietzsche.--Mpaa (talk) 15:23, 25 October 2015 (UTC)

File:Works of Jules Verne - Parke - Vol 5.djvu

Hello Mpaa,

I would prefer you to decide about the naming, whatever system you choose. The French and Canadian system is often (but not always) this one: fr:Livre:Paquin - La cité dans les fers, 1926.djvu. I don't know if Wikidata makes this kind of naming obsolete. Lots of thanks for your help! Regards, --Zyephyrus (talk) 13:18, 8 October 2015 (UTC)

Hi. I just noticed in Central discussions that a bulk page move was required, so I took the task. I had no part in the rest. If a better naming is suggested, I think this should be mentioned in the thread where all this started.— Mpaa (talk) 20:13, 8 October 2015 (UTC)

Wikisource Conference

Hi Mpaa, I'm Aubrey from As you probably know, a bunch of wikisourcerors (and Wikimedia Austria) are organizing a conference from 20 to 22 November, in Wien, Austria. We are trying to reach out to experienced Wikisource editors, because the conference will be (hopefully) an important event for the Wikisource international community. Will you be able to attend? Do you have a chapter that could provide you with a scholarship? If you're interested, please write to me at Thank you very much. --Aubrey (talk) 17:24, 18 October 2015 (UTC)

Hi Aubrey. Thanks for the invitation. Unfortunately, I do not think I will be able to attend.— Mpaa (talk) 17:43, 18 October 2015 (UTC)

A minor request for the text cleanup process

Hi. Would it be possible to alter the <ref>footnote</ref> to <ref>*</ref>? When the asterisk is removed, it's difficult to find at a glance. Thanks — Ineuw talk 17:06, 2 November 2015 (UTC)

Sure, I will do that from now on.— Mpaa (talk) 17:58, 2 November 2015 (UTC)
Thanks.— Ineuw talk 20:53, 3 November 2015 (UTC)


Dear Mpaa, I am contacting you since I found you are interested in music and committed to the Grove Dictionary. I have found an interesting pair of essays on Senesino and Farinelli, the two most famous castratos in London during the time of Handel, in The Westminster Magazine: or, The Pantheon of Taste, vol. 5 (1777), pp. 396-397, and would like to submit both to the English wikisource in the form of texts (of course a scan of the pages can accompany the essays). So far, however, I have only contributed to the German version, and only reference sites listing links to pdfs of old musicological and general newspapers. So I do not know how I can create a new site here containing an article from the Westminster Magazine, the more so since no single article from there seems to be contained in Wikisource yet. I have uploaded the first of the two texts on a separate page in my wikisource profile. The "references" are original footnotes from the text.

Not really, probably you have seen some occasional contribution of mine ... :-)

Three questions on this:
1) Can you explain to me how best to submit this to the English wikisource?

Same as on de.wikisource. Upload the scan to Commons (File:xxx.djvu or pdf, depending on what format you have) and create an Index:xxx.djvu (or pdf).

2) Would you take care of this for me? The errors in the "text only version" supplied by Google Books have been corrected with the exception of two minor things: a) what the ominous "lö" in "Jupiter & lö" stands for and b) what amount is meant by 50 (unknown symbol) L. (50,000 pounds?). The text version has been proofread by me against the scan.

You need to proofread against the scan. Since you already did it, copy-paste the portion of text into the corresponding pages in the Page: ns.
Texts should be faithful to the original, so errors should stay there; maybe you can use {{SIC}}.

3) Is it ok or general usage in the English wikisource to include links to wikipedia to make clear what person or building etc. is meant in the text?--Haendelfan (talk) 11:10, 10 November 2015 (UTC)

Yes, without overlinking. When you transclude a page in Main ns, in the author header (or via wikidata) you can link to from there as well.
If you link the index page here once you have created it, I will keep an eye on it. Mpaa (talk) 23:01, 10 November 2015