User talk:Mpaa


Welcome

Hello, Mpaa, and welcome to Wikisource! Thank you for joining the project. I hope you like the place and decide to stay. Here are a few good links for newcomers:


You may be interested in participating in

Add the code {{active projects}}, {{PotM}} or {{CotW}} to your page for current Wikisource projects.

On your user page you can put a brief description of your interests and of your contributions to other Wikimedia projects, such as Wikipedia and Commons.

Have questions? Then please ask them at either


I hope you enjoy contributing to Wikisource, the library that is free for everyone to use! In discussions, please "sign" your comments using four tildes (~~~~); this will automatically produce your IP address (or username if you're logged in) and the date. If you need help, ask me on my talk page, or ask your question here (click edit) and place {{helpme}} before your question.

Again, welcome! — billinghurst sDrewth 12:00, 7 April 2011 (UTC)


Follow-up on Fill pages with OCR from PDF

Hi Mpaa, maybe you remember the mentioned discussion. As a follow-up, I want to ask whether, and how, it is possible to create pages by providing only the pure text, with the header and footer generated automatically, as happens when a user creates a page manually, e.g. [1]. By the way, how does wikisource.py generate headers and footers? Thank you, --Aschroet (talk) 08:42, 18 August 2016 (UTC)

If you set the header and footer via the Index page, preload should load them.— Mpaa (talk) 18:24, 24 August 2016 (UTC)
Hi Mpaa, I know that your script wikisourcetext.py loads this header and footer. But when I generate the OCR with an external tool, I usually use pagefromfile.py to create the pages. For that script, however, I need to add the header and footer on my own, because preload is not supported there. Do you have any idea how I could combine the text preloading with the retrieval of the text from a file? Thank you, --Aschroet (talk) 17:49, 20 October 2016 (UTC)
The djvutext.py script adds the headers and footers according to the fields on the Index: page. So if you cannot reuse the code from there, you could run a bot with that script to create the pages, then run your own bot to apply the OCR text (maybe?).
pagefromfile.py will overwrite the whole page. I think the best way would be to modify pagefromfile.py to fetch the page from the wiki first and replace only the body with the text taken from the file. It would be very hard to have this accepted as a global patch. If you want, I can make a custom version for you that will work only in the Page namespace. Or, if you describe how you 'generate the OCR by an external tools', maybe there is a better way.— Mpaa (talk) 19:00, 21 October 2016 (UTC)
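The "fetch the page first and replace only the body" idea above can be sketched without any wiki access. This is only an illustration, not code from pagefromfile.py or wikisourcetext.py: it assumes the Page-namespace wikitext starts with a header wrapped in a `<noinclude>` block and ends with a footer wrapped in another one, and the function name `replace_body` is made up for this example.

```python
OPEN_TAG = "<noinclude>"
CLOSE_TAG = "</noinclude>"


def replace_body(wikitext: str, new_body: str) -> str:
    """Return wikitext with only the body swapped; header and footer are kept.

    Assumption: the text starts with a <noinclude> header block and ends
    with a separate <noinclude> footer block.
    """
    if not (wikitext.startswith(OPEN_TAG) and wikitext.endswith(CLOSE_TAG)):
        raise ValueError("expected a header and a footer in <noinclude> blocks")

    # The header runs up to and including the first closing tag.
    header_end = wikitext.find(CLOSE_TAG) + len(CLOSE_TAG)
    # The footer starts at the last opening tag.
    footer_start = wikitext.rfind(OPEN_TAG)
    if footer_start < header_end:
        raise ValueError("header and footer are not two separate blocks")

    return wikitext[:header_end] + new_body + wikitext[footer_start:]


existing = (
    '<noinclude><pagequality level="1" user="Example" />HEADER</noinclude>'
    "old text"
    "<noinclude>FOOTER</noinclude>"
)
print(replace_body(existing, "new OCR text"))
```

A bot built on this would still have to load the existing page text and save the result itself; the sketch covers only the text manipulation.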

Hi Mpaa, I have successfully used wikisourcetext.py several times, for example for de:Index:Sylvicultura oeconomica.pdf. Now, however, I have a case where I had to add some pages to the underlying PDF of an index. This shifted many already-created pages, so the OCR no longer fits. To fix it, I tried running the script again with -force against those pages. Interestingly, the script reports that the pages were written, but they were not. Any idea why this happens? If you want to try it, please do so only for the page mentioned in the command:

python pwb.py scripts/wikisourcetext.py -summary:Seitenerstellung -index:Index:Sylvicultura_oeconomica.pdf -pages:434 -always -force -pt:1

Thank you, --Aschroet (talk) 20:14, 5 February 2017 (UTC)

Hi. The options are misleading. The script only works if the pages do not exist yet. I think there is no way of getting the file text once the page has been created. You have two options: 1. move the pages to compensate for the shift with movepages.py (easiest is to provide the 'from' and 'to' pages via the -pairsfile option), or 2. delete the pages and recreate them from scratch. If you submit a bug at https://phabricator.wikimedia.org/, it will be easier to get it fixed. Thanks for reporting this.— Mpaa (talk) 18:23, 6 February 2017 (UTC)
Thanks, task T157535. --Aschroet (talk) 09:07, 8 February 2017 (UTC)
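For option 1 above, the list of 'from' and 'to' pages can be generated rather than typed by hand. The following is a rough sketch under assumptions: the helper name `shifted_page_pairs`, the example index title, and the page range are invented for illustration, and the `[[from]] [[to]]` line layout is what movepages.py is understood to expect for -pairsfile, so check the script's help text before relying on it. The pairs are ordered so that a move never targets a page that has not itself been moved yet.

```python
def shifted_page_pairs(index_title, first, last, shift, namespace="Page"):
    """Build '[[from]] [[to]]' lines for moving pages first..last by `shift`.

    All names and the pair layout are assumptions made for this sketch.
    """
    if shift == 0:
        raise ValueError("shift must be non-zero")
    if first > last:
        raise ValueError("first must not be greater than last")

    numbers = range(first, last + 1)
    # Shifting upwards: start at the highest page so each target is free.
    # Shifting downwards: start at the lowest page for the same reason.
    if shift > 0:
        numbers = reversed(numbers)

    return [
        f"[[{namespace}:{index_title}/{n}]] [[{namespace}:{index_title}/{n + shift}]]"
        for n in numbers
    ]


# Two pages were inserted before page 10, so pages 10-12 move up by two.
for line in shifted_page_pairs("Example.pdf", 10, 12, 2):
    print(line)
```

On a wiki where the namespace has a local name, the `namespace` argument would need to be changed accordingly. The sketch does not deal with redirects that moves may leave behind.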

SQL statement for sale - very cheap.

Greetings. I've written an SQL statement that extracts the names of the proofreaders/validators of the Book of the Month/Project of the Month from the Page namespace. One has to replace the monthly project name common to all pages (without a page number, which would select only a single page) and terminate it with the % sign, which is the MariaDB/MySQL wildcard character. Beeswaxcandle mentioned that you are working on something like this. Interestingly, the results are now also offered formatted as a wikitable.

This example is the list of contributors to "Tom Swift and His Airship.djvu" — Ineuw talk 09:02, 7 November 2016 (UTC)

rev_user_text
Akme
AnotherAnonymous
BD2412
Beeswaxcandle
Beleg Tál
Chris55
Clpo13
HelicopterLlama
Ineuw
Kathleen.wright5
Megganitis
Mudbringer
Samwilson
Slowking4
Tar-ba-gan
Vivrax
Wikisource-bot
William Maury Morris II
Zoeannl
Hi. I made a one-time query and gave it to BWC for his convenience. I am not active on this right now.— Mpaa (talk) 18:33, 7 November 2016 (UTC)
Thanks, I don't know if he used it, but will leave a message to read this page. — Ineuw talk 19:05, 7 November 2016 (UTC)
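The statement itself is not shown in this thread, so the following is only a guess at its general shape, not Ineuw's actual query. It assumes the older MediaWiki layout in which the `revision` table carries a `rev_user_text` column (the column name shown in the result above) and is joined to `page`, and it assumes 104 is the number of the Page namespace. To keep the sketch runnable without database access, it runs against a tiny in-memory SQLite stand-in with invented rows.

```python
import sqlite3

PAGE_NAMESPACE = 104  # assumption: number of the Page: namespace

QUERY = """
SELECT DISTINCT rev_user_text
FROM revision
JOIN page ON rev_page = page_id
WHERE page_namespace = ?
  AND page_title LIKE ?
ORDER BY rev_user_text
"""


def contributors(conn, title_prefix):
    """List distinct editors of all pages whose title starts with title_prefix."""
    # The trailing % is the wildcard that matches every page of the work.
    rows = conn.execute(QUERY, (PAGE_NAMESPACE, title_prefix + "%"))
    return [row[0] for row in rows]


def demo_connection():
    """Create a minimal stand-in for the two tables, filled with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (page_id INTEGER PRIMARY KEY,
                           page_namespace INTEGER, page_title TEXT);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY,
                               rev_page INTEGER, rev_user_text TEXT);
        INSERT INTO page VALUES (1, 104, 'Demo_Book.djvu/1');
        INSERT INTO page VALUES (2, 104, 'Demo_Book.djvu/2');
        INSERT INTO page VALUES (3, 104, 'Other_Book.djvu/1');
        INSERT INTO revision VALUES (1, 1, 'UserB');
        INSERT INTO revision VALUES (2, 1, 'UserA');
        INSERT INTO revision VALUES (3, 2, 'UserA');
        INSERT INTO revision VALUES (4, 3, 'UserC');
        """
    )
    return conn


print(contributors(demo_connection(), "Demo_Book.djvu"))
```

Note that `_` is itself a single-character wildcard in LIKE patterns, so a title prefix containing underscores matches slightly more loosely than a literal comparison would.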

Spacing around dashes in Aaron's Rod

Please don't do this. I have reverted the edits of this kind that you just made. I deliberately left those spaces in to reflect how the text was printed: note that most dashes are unspaced, but those ones are spaced. BethNaught (talk) 23:23, 7 December 2016 (UTC)

Sorry.— Mpaa (talk) 18:34, 8 December 2016 (UTC)

Removing the old pages from a deleted index file

Could you delete the old pages displayed on this Index:Life in Mexico vol 1.djvu? The original source file was deleted from Commons, but the pages showing on the index are still from the old book. If you look at the publisher at the bottom of Page:Life in Mexico vol 1.djvu/5 (Chapman and Hall) and compare it to the cover on the Index itself (Charles Little and James Brown, Boston), you can see that they are different. All the pages are from the old copy and can be deleted. The contents of the proofread pages were saved. — Ineuw talk 08:04, 15 December 2016 (UTC)

Done.— Mpaa (talk) 19:33, 15 December 2016 (UTC)
Much thanks. Could I have removed the pages using SQL? — Ineuw talk 22:13, 15 December 2016 (UTC)
I do not know, but I am inclined to think it is not possible.— Mpaa (talk) 18:27, 16 December 2016 (UTC)

mass

What are the units of mass? Spectacular Gaets (talk) 18:39, 15 February 2017 (UTC)

kg.— Mpaa (talk) 20:28, 15 February 2017 (UTC)

File:X.jpeg

still needed? — billinghurst sDrewth 13:35, 1 April 2017 (UTC)