From Wikisource
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one. Project members can often be found in the #wikisource IRC channel (webclient). For discussion related to the entire project (not just the English chapter), please use the multilingual Wikisource. There are currently 305 active users here.



This section can be used by anyone to communicate Wikisource-related and relevant information; it is not restricted. Announcements generally have little or no discussion, so if a discussion is warranted, add another section under Other discussions with a link to it in the announcement.

Beta: Create a VisualEditor plugin to integrate with Wikisource

Coren has put a note into Phabricator about the next development stage of having VisualEditor integrate into Wikisource, initially in the standard namespaces, and following that into the Page: namespace. He says that mainspace editing with VisualEditor is usable (including the transclusion tag, though we don't use it). This is currently working on a test server and is scheduled for the deployment train on Tuesday, Apr 5 (deploying on Apr 6 to group 1, which includes the Wikisources). Jdforrester is planning to turn the configuration switch on April 7, at which point VisualEditor will become available as a beta feature on Wikisource for all content namespaces except Page (that's the next part being worked upon).

Feedback is best left directly on the Phabricator ticket. — billinghurst sDrewth 01:56, 3 April 2016 (UTC)

To be clear, while that note was exactly correct, it's probably more useful to note that VisualEditor will be made available in the beta features of every Wikisource on April 7. You can turn it on there at that time. Coren (talk) 14:15, 3 April 2016 (UTC)
it’s working well for article space, not enabled for page namespace yet. test it out and leave feedback. i added a Wikisource:VisualEditor redirect, since it was a redlink in the edit summaries. Slowking4RAN's revenge 01:22, 10 April 2016 (UTC)

250,000 Validated Pages

We reached 250,000 validated pages on Monday 4 April, with this edit [1] by Akme. Beeswaxcandle (talk) 07:00, 5 April 2016 (UTC)

Hurrah! That's great. :) Now, on to the next ¼ million... — Sam Wilson ( TalkContribs ) … 00:21, 6 April 2016 (UTC)


Use sortable tables instead of list items to list works

I was wondering, wouldn't it be better to use sortable tables instead of list items to list works in author pages and portals? This way works can be displayed either alphabetically or by year depending on what the user prefers. It's very difficult to find works when they are listed by year and sometimes it's nice to see in which order the works were written chronologically. So why not have both? Jpez (talk) 05:52, 21 April 2016 (UTC)

That works, but only if the list to be sorted does not have items listed under other items, and only as long as all copies of a work are published under the same title. It wouldn't work very well for the page Author:Aeschylus, because (a) The Oresteia has three sub-parts, (b) there are multiple translations of single works listed, (c) works such as his Χοηφόροι have been titled in English as both "Choephori" and "The Libation Bearers" and will not group together when sorting by title, (d) the different editions of translations by the same translator will have been published in different years, and certainly never in the same year as the original publication.
In any case, you can find a work on a page, if you know the title, by using your browser's "Find" function. --EncycloPetey (talk) 16:30, 21 April 2016 (UTC)
@EncycloPetey: Personally, the way I would set up the Author:Aeschylus page would be completely different. I would get rid of all the sub-lists and only link to the main page of each work. Concerning The Oresteia, I would link to The Oresteia page and not list each work of the trilogy. I don't see the point in listing each and every translation of every work on the author's page when they are all individually listed on each work's page anyway. For example, it would look something like this:


Title                   Year
The Persians            472 BCE
Prometheus Bound        480–410 BCE
Seven against Thebes    467 BCE
The Suppliants          463 BCE
The Oresteia            458 BCE
unsigned comment by Jpez (talk) .
@Jpez: That approach would eliminate all the benefits of being able to see (at a glance) whose translations of each play we have, how many we have, what state they are in, and when the translations were published. The sample table above hides all the information except the original date of performance, which is by far the least valuable piece of information concerning those plays. --EncycloPetey (talk) 01:02, 23 April 2016 (UTC)
@EncycloPetey: Well all that information would only be a click away, and if there are many works and translations the list of them can be overwhelming. I think it's like something you've done here Portal:Greek_language_and_literature with the ancient Greek drama portal. Instead of listing every work there you've created the portal and linked to that. Anyway I don't think this is a serious issue, it's just the way I would set up the page and a way tables might be implemented. Jpez (talk) 05:18, 23 April 2016 (UTC)
@Jpez: No, the information would not be a click away, and that's my point. Each bit of information would be a click away, but the user would have to click all of the links and remember what was on each page all at once to get the overall view currently available by putting it in a single place. The Portal:Ancient Greek drama is separate because the list is many screens long—in fact it is as long as all the other content on Greek language and literature put together—and insofar as it succeeds, it does so because it forms a coherent and separate whole within the corpus of Greek literature. The Portal itself is intended to be exhaustive, and is much longer than most Author pages. But a Portal is likely not a good comparison, since each Portal has the freedom to include or exclude content, and to be structured in any way that is convenient. Our Author pages need to follow a reasonably consistent format for the sake of our users. --EncycloPetey (talk) 05:43, 23 April 2016 (UTC)
Having tables adds an element of complexity and isn't for every user. — billinghurst sDrewth 04:14, 22 April 2016 (UTC)
@Billinghurst: It does, but it can be made easier by using a template. Jpez (talk) 05:18, 23 April 2016 (UTC)
I am very against the idea, as I think the tables look ugly and provide more problems than they solve. In fact I have been trying to get rid of tables in places, as for example Talk:Bible#Page formatting.
While I agree that the above example of Author:Aeschylus is a poor one, as translations should really be listed on the separate {{translations}} page, I can think of other examples of sub-item lists that are relevant. I often use them for works derived, adapted, excerpted, or originally contained in other works; for examples see Author:John Mason Neale, Author:Bernard of Cluny, Author:Katherine Hankey.
What about when you have several sections on a page? Would you put all poetry, prose, dramas, encyclopedia entries, letters, anthologies, etc. into a single table? Would we still have a different column scheme per section table? per page? —Beleg Tâl (talk) 13:26, 22 April 2016 (UTC)
@Beleg Tâl: For argument's sake (since I see no one is too keen on implementing my brilliant idea :), I agree that they can look ugly (as the table on the Bible page does, in my opinion), but I think they could be made to look nice as well; for example, just getting rid of the borders makes it look a bit more presentable.
Title                   Year
The Persians            472 BCE
Prometheus Bound        480–410 BCE
Seven against Thebes    467 BCE
The Suppliants          463 BCE
The Oresteia            458 BCE
A few gaps, a bit of aligning, etc., and you could even make it look much the same as what we are using now (which I like, btw), the only difference being that it would be sortable. If I have time I might come up with something (just for argument's sake). As for different sections etc., I would prefer to use different tables for each section; the setup would be the same as it is now, just sortable. Also, lists can be used within tables if needed. Jpez (talk) 05:18, 23 April 2016 (UTC)
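As a rough illustration of the idea discussed in this thread, MediaWiki can make a table sortable through the `wikitable sortable` classes, and a `data-sort-value` attribute can give a cell a numeric sort key. The Python sketch below only builds the wikitext for such a table. The function name `works_to_sortable_table` and the choice of negative numbers as sort keys for BCE years are assumptions made for this example; this is not an existing Wikisource template, and it does not address the sub-item and multiple-translation concerns raised above.

```python
def works_to_sortable_table(works):
    """Build wikitext for a sortable table.

    `works` is a list of (title, year_label, sort_key) tuples; this
    tuple layout is a hypothetical convention chosen for the sketch.
    """
    lines = ['{| class="wikitable sortable"', "! Title !! Year"]
    for title, year_label, sort_key in works:
        lines.append("|-")
        # data-sort-value lets "472 BCE" sort numerically as -472.
        lines.append(
            f'| [[{title}]] || data-sort-value="{sort_key}" | {year_label}'
        )
    lines.append("|}")
    return "\n".join(lines)


# Works with a single performance year, taken from the sample list above.
works = [
    ("The Persians", "472 BCE", -472),
    ("Seven against Thebes", "467 BCE", -467),
    ("The Suppliants", "463 BCE", -463),
    ("The Oresteia", "458 BCE", -458),
]
table = works_to_sortable_table(works)
print(table)
```

A template wrapping this markup, as suggested in the thread, would keep the complexity away from editors who are not comfortable with table syntax.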

Bot approval requests


Preferably, ask your HELP questions at Wikisource:Scriptorium/Help.

Repairs (and moves)

Other discussions

U.S. Supreme Court Style Manual!

The U.S. Supreme Court Style Manual, viewed by the justices as an internal document for helping law clerks and justices draft opinions in proper form, is going public for the first time, without the court's approval. What's a good way to ensure we get an OCR'd copy of this soon? Buy it and give it to the Internet Archive? They have the infrastructure to OCR it. {{PD-USGov}} applies, of course. --Elvey (talk) 00:54, 31 March 2016 (UTC)

We don't have a purchasing acquisition policy. If you think that there is value in getting it, and reproducing it (where compliant with scope) then it would be through a private purchase, or through a grant application to WMF. IA would indeed be the means to convert to OCR if necessary. Will it not be an electronic document anyway? — billinghurst sDrewth 01:07, 1 April 2016 (UTC)
WMDC has a book grant program, apply if interested -- deadline tomorrow. Slowking4RAN's revenge 01:40, 3 April 2016 (UTC)
@Elvey: note deadline of 4 April — billinghurst sDrewth 03:01, 3 April 2016 (UTC)
Cool. Done. --Elvey (talk) 04:02, 3 April 2016 (UTC)

Problem identified — long tables not wrapping over printed pages (pdf/epub/...)

When pages are exported/printed to EPUB/PDF, long tables no longer wrap across pages; they seem to get stuck on one long page that expands off the bottom. This previously was not the case and I am unsure when it broke. Does anyone have any ideas on what the issue may be, and/or whom we can harass to look at a fix? — billinghurst sDrewth 13:16, 2 April 2016 (UTC)

Pages with <math> markup

In Wikisource, under each user's Preferences -> Appearance -> Math section (at the bottom of the Appearance page), please check that your Math setting is MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools). If you are viewing or editing Wikisource, the older PNG and LaTeX settings are currently generating some gibberish. Billinghurst has requested a fix through Phabricator. Until this is fixed, please check your user Preference settings for Math before editing pages with math. Outlier59 (talk) 02:20, 3 April 2016 (UTC)

English Translations of Puranas

I have got some English translations of Hindu Puranas published by Motilal Banarsidass from the West Bengal Public Library Network, and they are in the public domain. But the names of the authors are not mentioned. How can I add them to Wikisource? -Trinanjon (talk) 03:50, 3 April 2016 (UTC)

There are plenty of Puranas from this publisher available on the WB site. Not all are PD. It may be possible to identify the translator; e.g. J. L. Shastri was the translator of the Siva Purana volumes. So can you provide the specific links of the books you are considering? There are 74 books in this series (1, 2), which will be approx. 100 volumes on completion. Hrishikes (talk) 06:34, 3 April 2016 (UTC)
Is the Siva Purana by J.L. Shastri under PD? I will also be giving the links of the books such as Skanda Purana, Padma Purana, Garuda Purana, Varaha Purana, etc. -Trinanjon (talk) 09:16, 3 April 2016 (UTC)
Published in 1950 (see here; the 1st ed. of vol. 1 is available here, which shows the date as 1970 on the book), so the author was alive then; therefore not PD-India on the URAA date. J. L. Shastri was one of the general editors for the whole series, so none of the lot is likely to be PD. Hrishikes (talk) 10:22, 4 April 2016 (UTC)

April PotM needed

We need a PotM for April up on the Main page. Input at Wikisource talk:Proofread of the Month#April 2016... Londonjackbooks (talk) 15:31, 3 April 2016 (UTC)

I went ahead and made a selection based on previous input. See Talk. Londonjackbooks (talk) 20:43, 3 April 2016 (UTC)

Tech News: 2016-14

22:13, 4 April 2016 (UTC)

Most viewed books with chapters

During my recent exploration, I felt the need for a list of the books with the most views on my Telugu Wikisource. I noticed similar requests for English Wikisource (#1). Starting from the top 1000 pageviews data, I have written an 'R' script to aggregate the page views for all books with chapters (as indicated by the use of '/' in the main namespace page title). I am happy to share the first results at User:Arjunaraoc/201603TopViewsOfBookChapters. I found the top rank going to Constitution_of_India a bit surprising. Do share your feedback. --Arjunaraoc (talk) 11:27, 6 April 2016 (UTC)
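The aggregation described above can be sketched roughly as follows. This is not the R script mentioned in the post; it is a minimal Python illustration that assumes the input is a list of (page title, view count) pairs and that a book is identified by the part of the title before the first '/'. The function name `aggregate_book_views` and the sample numbers are made up for the example.

```python
from collections import Counter


def aggregate_book_views(page_views):
    """Sum views of chapter subpages under their parent book title.

    Only titles containing '/' are counted, on the assumption that
    a '/' marks a chapter subpage of a book.
    """
    totals = Counter()
    for title, views in page_views:
        if "/" not in title:
            continue  # not a chapter subpage, skip it
        book = title.split("/", 1)[0]
        totals[book] += views
    # Highest total first.
    return totals.most_common()


# Illustrative data only; these counts are not real statistics.
sample = [
    ("Constitution_of_India/Part_III", 120),
    ("Constitution_of_India/Part_IV", 80),
    ("The_Problems_of_Philosophy/Chapter_1", 150),
    ("Main_Page", 9999),
]
ranking = aggregate_book_views(sample)
print(ranking)
```

One consequence of this approach is that a book with many moderately viewed chapters can outrank a single heavily viewed page, which may be part of why a long, many-part work ends up at the top.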

Employment Non-Discrimination Act of 2013 - duplicate legislation needs review and possibly merging

We have two copies of the same legislation, one supported by a scan, and the other a copy ... Special:PrefixIndex/Employment Non-Discrimination Act of 2013 Not certain if one is from the HoR and the other the Senate or what. It would be useful for someone conversant with US legislation to have a look-see and work out which is better, whether they should be merged or what. — billinghurst sDrewth 02:10, 7 April 2016 (UTC)

Importing partially completed predominantly English text from Telugu Wikisource

We have a predominantly English text about Telugu grammar, partially proofread, in Telugu Wikisource. Would English Wikisourcers be interested in importing it here and completing it? --Arjunaraoc (talk) 09:18, 7 April 2016 (UTC)

@Arjunaraoc: I presume you meant this link? If so it appears none of Charles Phillip Brown's works are currently on enWS. And in passing, why does it appear to be tagged PD-2013 when the flyleaf reads 1857? Is this either an accident or (as I don't at all read Telugu) some other concern? AuFCL (talk) 10:31, 7 April 2016 (UTC)
AuFCL, Yes. I updated the link now. Copyright tag was incorrect. As we have several books for which copyright was freed via Digital Library of India, many Telugu wikisourcers used PD-2013 for such books. I updated it now as PD-Old. --Arjunaraoc (talk) 23:52, 7 April 2016 (UTC)
@Arjunaraoc: Hi! Can you please explain how copyright was "freed" by DLI? DLI has plenty of copyrighted books, including those published in the 1990s, but how can copyright be deemed as freed by inclusion in DLI? Hrishikes (talk) 02:13, 8 April 2016 (UTC)
@Hrishikes, DLI had done an exercise of contacting authors and publishers to free the copyrights. In the earlier versions of the DLI page, there used to be a section like "search in copyright freed books". As they claim to be compliant with the Indian Copyright Act, DLI being a government body, we are treating all DLI books as copyright freed. Hope that helps. --Arjunaraoc (talk) 04:57, 8 April 2016 (UTC)
@Arjunaraoc: I don't think this is quite in order. If such were the case, DLI would have included a CC license or equivalent for every book, because the works are copyrighted as per Indian Copyright Act, but DLI can, of course, procure the copyrights and release them to PD under CC license. Have they done so? Can you point to any such documentation? Because, other Indic wikisources (I am active in Bengali), and even English Wikisource can also benefit if such is the case. DLI being a Govt body does not automatically make the books copyright-free. Best, Hrishikes (talk) 05:35, 8 April 2016 (UTC)
@Arjunaraoc:, Can you please provide a link or any documentation as a proof to the statement where it has been declared that DLI have been given consent by the copyright-holders and the publishers of the books to release them under CC. DLI has so many books which are not under PD-India and unless there is a proof about their release of license, it will be considered as copyright violation, if I am not wrong. -- Bodhisattwa (talk) 06:42, 8 April 2016 (UTC)
@Hrishikes, @Bodhisattwa: Check this presentation at page 21, where the freed copyrights are mentioned. Commons did not accept our claim recently and deleted several books, making us upload them to the Telugu Wikisource. There were some other presentations on the web about dealing with copyrights, which I am not able to locate now. Hope that helps. --Arjunaraoc (talk) 09:26, 8 April 2016 (UTC)
The presentation link cited in the previous remark may be dead. You may check the latest Copyright Policy of DLI, as archived on the Wayback Machine on April 8, 2016, and contact DLI for any more clarifications. --Arjunaraoc (talk) 09:43, 8 April 2016 (UTC)
@Arjunaraoc: Could not check the 1st link (DLI site is down), but checked the second link, which is basically useless. In it, DLI claims that the works are copyright-free, and states that if copyright holders complain otherwise, then the concerned books will be removed. There is no explanation as to how a book not covered by PD-India (like a book published in 1970) could be copyright-free. A vague claim without specifics never suffices; I can well understand why Commons did not accede to your claims. While adding books to Wikisource (whether directly or through Commons), one should check whether the book can really be deemed PD as per the 1st publication year and the author's death year. One should not go by any "claim" by a website, even if Govt-owned. DLI just claims that they are copyright law compliant, and then continues piling up copyrighted works by the hundreds. Without specific documentation of release under CC or the like, all books seemingly under copyright should be deemed copyrighted. Hrishikes (talk) 10:58, 8 April 2016 (UTC)
@Arjunaraoc: Thanks for the links (the first link doesn't open, though). The second link only shows a claim from DLI that all of their books are copyright-free. But there is no proof that authors and publishers have given their consent to DLI to release their works under a CC license. Furthermore, the link also says that the copyright policy is as per the Indian Copyright Act 1957, according to which books become copyright-free 60 years after the death of the author or first publication, whichever is later. So the claim contradicts itself. -- Bodhisattwa (talk) 13:45, 8 April 2016 (UTC)

Visual Editor now in article space

visual editor is now a beta feature for article space editing. check it out and leave feedback. here is the Phabricator task -- Slowking4RAN's revenge 16:28, 7 April 2016 (UTC)

OCR not working?

Is the OCR button working for anyone?

This file Index:The New International Encyclopædia 1st ed. v. 02.djvu doesn't seem to have a text layer, but when I tried using OCR, my only result was the edit window turned grey. --EncycloPetey (talk) 14:51, 8 April 2016 (UTC)

It worked for me but the result was awful, you'd be much better off typing it yourself than using the OCR produced. Maybe it would be better to upload it to and see if you get a better OCR. Jpez (talk) 08:23, 9 April 2016 (UTC)
The source of that file has a text layer, perhaps there was a problem with the upload wizard. Or maybe the uploader intended to use proofread text from elsewhere. I'm guessing that overwriting the file would fix things. CYGNIS INSIGNIS 08:55, 9 April 2016 (UTC)
OCR layer added. Hrishikes (talk) 14:00, 9 April 2016 (UTC)
Done --Thanks, everyone! --EncycloPetey (talk) 16:35, 9 April 2016 (UTC)

Need an example linking to sections in main namespace, transcluded from the Page namespace

I am trying to link to sections in the Wikisource main namespace which were originally in the Page namespace, but cannot seem to get it to work. Example on te.wikisource: te wikisource page with section tag #జలగం and the page containing the section is page which has section named ##జలగం##. Can someone give an example? --Arjunaraoc (talk) 06:48, 9 April 2016 (UTC)

@Arjunaraoc: I think I see what you are trying to do. Linkage requires the existence of either id= or name= on the element to which you wish the anchor to target. Unfortunately the <section> does not provide this service (as far as I know) so may I suggest augmenting:
<section begin="జలగం"/>{{p|fs150}}జలగం వెంగళరావు ముఖ్యమంత్రిత్వం</p>
—by substituting something like this instead:
<section begin="జలగం"/>{{p|fs150}}<span id="జలగం"/>జలగం</span> వెంగళరావు ముఖ్యమంత్రిత్వం</p>
—which then ought to expose the anchor point "జలగం" for linkage purposes as usual. AuFCL (talk) 07:23, 9 April 2016 (UTC)
  • @AuFCL: I tried
    <section begin="జలగం"/>{{p|fs150}}<span id="jalagam">జలగం వెంగళరావు ముఖ్యమంత్రిత్వం</span></p>
    after correcting a minor typo and using an English name for the id, as otherwise the link does not work. One more doubt: is it possible to see the sections after transclusion directly in the wiki page? --Arjunaraoc (talk) 09:10, 9 April 2016 (UTC)
    @Arjunaraoc: I think I might have misunderstood your requirements. Did you want to (A) construct a destination/landing point for a link (which is what I tried to describe above), or (B) transclude a portion of a page into another page?

    In other words which is the relevant tag between "జలగం"/"jalagam" (case A), or "ఆత్మకథచివరిపేరా" (case B)? I think you may need to re-state the question. Please pardon me for confusing the issue. AuFCL (talk) 09:51, 9 April 2016 (UTC)

  • @AuFCL, Not at all. My requirement is (A). I thought doing (B), even if it is not ultimately used for transclusion, would also help accomplish (A), but it looks like (A) needs special HTML code called <span>..</span>. In this specific case, I did not need section transclusion, so I dropped (B) and used your solution for (A) with a slight change. Hope the revised link makes it clear. The page is linked from Wikipedia. --Arjunaraoc (talk) 10:22, 9 April 2016 (UTC)
  • My additional question about the need for seeing the anchors, is so that I do not need to visit page namespace, before making the link, if the transcluded pages already have anchors.--Arjunaraoc (talk) 10:25, 9 April 2016 (UTC)
O.K. A couple of points: there is nothing special about using <span> to carry the id= attribute; I only chose that as a fairly harmless HTML tag which would not disrupt the rest of the text. Unfortunately the {{p}} template does not make provision for specifying a name/id value; otherwise using its expansion:
<p class="pclass" style="font-size:150%;" id="jalagam">జలగం వెంగళరావు ముఖ్యమంత్రిత్వం</p>
—ought to work equally well. As far as I can tell the te-wikipedia page you specified appears correctly linked to the te-wikisource destination.
With regard to making the anchor-points visible, use of {{anchor}} or {{anchor+}} (both of which are present on teWS) might be what you were looking for, as they create a <span> automatically with the required attributes to establish the anchor point as well as those to provide minimal marking? For example, hovering your mouse cursor over the word "anchor" in this sentence should yield a pop-up identifying message. Maybe this is not obvious enough for your intended purpose? AuFCL (talk) 11:05, 9 April 2016 (UTC)
  • AuFCL, Thanks very much for clarifying very well. Your suggestion about {{Anchor+}} is also useful. English Wikisourcers are always friendly and helpful in my interactions. Thanks a lot. --Arjunaraoc (talk) 17:23, 9 April 2016 (UTC)


Outlier59 (talk) 16:27, 9 April 2016 (UTC)

  • Outlier59, Thanks for your helpful suggestions. I am able to resolve the issue. --Arjunaraoc (talk) 17:23, 9 April 2016 (UTC)

Edit check request

Could someone check this edit[4] of mine? It was supposed to be a 1 char ocr fix but it shows up in the page history as deleting 117 characters. I compared the before and after versions of the page and see only the 1 char fix. So I don't know what's going on. Thanks. 20:53, 9 April 2016 (UTC)

It is ok, just don't care about the byte counter ... BTW, I have no idea why it is not accurate, probably something has changed internally.— Mpaa (talk) 21:01, 9 April 2016 (UTC)

Bradshaw anyone?

Found these when trying to find something:-

A 1906 and a 1944 edition:-

If someone is able to figure out the copyrights, I'm more than willing to attempt transcriptions. ShakespeareFan00 (talk) 19:58, 10 April 2016 (UTC)

The Reshaping of British Railways

There are now scans on [[5]]; one small problem though: the digitising source has marked them as NC, which means that despite the document being an expired Crown copyright (3 years AGO!), the scans can't be put on Commons, unless someone wants to have a very loud row with the University of Southampton. (sigh) 21:38, 10 April 2016 (UTC)

I also found amongst the same collection, the Worboys and Anderson Reports ( which given my recent efforts on UK Traffic signs... I felt might be in scope here). Shame some archives apply NC :( ShakespeareFan00 (talk) 21:38, 10 April 2016 (UTC)

Tech News: 2016-15

20:44, 11 April 2016 (UTC)

Top 100 downloads using WSexport tool

I thought it useful to share the Top 100 downloads using wsexport tool for the month of March 2016. Note that this includes even download of ordinary pages apart from books. Let me know your feedback if any. --Arjunaraoc (talk) 05:25, 12 April 2016 (UTC)

That's really interesting. What's up with The_Problems_of_Philosophy having so many more hits than anything else? — Sam Wilson ( TalkContribs ) … 10:47, 12 April 2016 (UTC)
I think it was Featured Text.. but the FT tag on its discussion page has it March 2015, not 2016. Outlier59 (talk) 11:15, 12 April 2016 (UTC)
There was no featured text for March, 2016. Due to a somewhat daffy implementation of the templates (they operate on months only, without regard to the year), under these circumstances {{Featured text/March}} gets recycled—and as that has not been changed since 2015, The_Problems_of_Philosophy gets a re-airing. AuFCL (talk) 11:51, 12 April 2016 (UTC)
Ah, makes sense now. Thanks. — Sam Wilson ( TalkContribs ) … 23:43, 12 April 2016 (UTC)

Penguin Classics (or any publisher)

I've been playing with SPARQL and Wikidata, and have come up with a little script to make publisher lists like (for example) Portal:Penguin Classics. It's not very useful while there's hardly any data in Wikidata, but maybe one day... :-) I just wanted to see what sort of coverage we've got over that collection. — Sam Wilson ( TalkContribs ) … 10:47, 12 April 2016 (UTC)

not creating djvu?

I uploaded a pdf to a couple of days ago and it seems to have created various files but not the djvu, which was what I was wanting. Did I do something wrong or has something changed over there? Moondyne (talk) 02:36, 13 April 2016 (UTC)

yes, see also Wikisource:Scriptorium#Internet_Archive_no_longer_creates_DjVu-files.21. maybe we need to send them some t-shirts / beer. or build a tool to convert on upload. Slowking4 03:01, 13 April 2016 (UTC)
Aha, that's a bit sad. Moondyne (talk) 04:37, 13 April 2016 (UTC)
Per Wikisource:DjVu vs. PDF, I take it there's now no point in creating a djvu solely for WS. Yes? Moondyne (talk) 04:48, 13 April 2016 (UTC)
That's an interesting question. It sounds like you're right, PDFs should be the preferred format now. Certainly, there are more tools for working with them. — Sam Wilson ( TalkContribs ) … 04:54, 13 April 2016 (UTC)
What a good essay! PDFs are expensive in many ways,—time, cost, transparency and accessibility—I am not moved from my position that they suck. My prejudice was recently reinforced when, up until a couple of weeks ago, some bug caused them to render as garbage for this end-user. I get why are preferring EPUB and that format for readers, but for this site's purposes they are inferior; other online converters to djvu are reasonably successful. PDF should be welcome, but not preferred. CYGNIS INSIGNIS 11:34, 13 April 2016 (UTC)

Fill pages with OCR from PDF

Hello everybody, is there a bot that can create Wiki pages with the contained OCR of the PDF page, e. g. de:Seite:Ludwig Bechstein - Thüringer Sagenbuch - Erster Band.pdf/19. Is that possible with a simple command using Pywikibot? Thank you in advance, --Aschroet (talk) 16:46, 13 April 2016 (UTC)

If you write your own script, you can use ProofreadPage()/IndexPage() as Page classes, they have several convenience methods.
Or you can use Page.preloadText() if you use the standard Page() class.
def preloadText(self):
    """The text returned by EditFormPreloadText.

    See API module "info".

    Application: on Wikisource wikis, text can be preloaded even if
    a page does not exist, if an Index page is present.
    """
If you want, I can write a few lines of code for you. Or if you tell me the index, I can do it for you.— Mpaa (talk) 20:53, 13 April 2016 (UTC)
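A minimal sketch of the approach described in this reply, using the standard Page class and the preloadText() method quoted above, might look like the following. It is an illustration under stated assumptions, not the script offered in the thread: the helper names `page_titles` and `fill_pages`, the edit summary, and the page range are hypothetical, and the code assumes a configured Pywikibot installation with an account allowed to create pages. The namespace prefix ("Seite" on de.wikisource, "Page" here) has to match the target wiki.

```python
def page_titles(index_file, first, last, namespace="Page"):
    """Return Page-namespace titles for a range of scan pages.

    Hypothetical helper: builds titles such as 'Seite:File.pdf/19'.
    """
    return [f"{namespace}:{index_file}/{n}" for n in range(first, last + 1)]


def fill_pages(site, titles, summary="Create page from OCR text layer"):
    """Create each missing page with the text preloaded from the Index.

    Assumes Pywikibot is installed and configured; it is imported here
    so that page_titles() can be used without it.
    """
    import pywikibot

    for title in titles:
        page = pywikibot.Page(site, title)
        if page.exists():
            continue  # never overwrite existing proofreading work
        page.text = page.preloadText()
        page.save(summary=summary)


# Example call (not executed here; it would edit a live wiki):
#   import pywikibot
#   site = pywikibot.Site("de", "wikisource")
#   fill_pages(site, page_titles("Some file.pdf", 19, 25, namespace="Seite"))
```

Skipping pages that already exist keeps the sketch from clobbering text that someone has already corrected; a production bot would also want throttling and bot rights on the target wiki.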
@Mpaa: with djvu going out of vogue with IA, it seems pertinent for pywikibot to look at having a "pdftxt" script that replicates "djvutxt". Then we have the general purpose bot available through the WSes. — billinghurst sDrewth 22:41, 13 April 2016 (UTC)
Thank you for the fast reply. Of course I would prefer the suggested pdftxt script, so that others could use it as well. --Aschroet (talk) 09:29, 14 April 2016 (UTC)
That is feasible, but I cannot say when. If you need something faster for a specific index, just let me know.— Mpaa (talk) 18:31, 14 April 2016 (UTC)
@Billinghurst:, @Aschroet: I made this script; who knows if it will ever be added to the library. But you can fetch it if you like it.— Mpaa (talk) 18:44, 18 April 2016 (UTC)

Mpaa, for de:Index:Ludwig Bechstein - Thüringer Sagenbuch - Erster Band.pdf it would be nice. --Aschroet (talk) 18:39, 14 April 2016 (UTC)

Done, hope no one got angry on de.wikisource, I forgot I have no bot rights there ...— Mpaa (talk) 20:10, 14 April 2016 (UTC)

Footnote on page without marker in the text

Here's an oddball question: When proofreading Page:Craik_History_of_British_Commerce_Vol_2.djvu/183, I found a footnote which does not have a corresponding mark in the body of the text. (It is the first note on the page, to "British Merchant, i. 302.") I determined where (I think) the note should have been inserted (here is the source for that reference on Google Books), but I'm not sure if I should have done that. The location isn't in the source text, after all, even though the note is, and what the author references is data in a seems pretty clear what he meant.

Thoughts? Should I mark the note with the SIC template and a transcriber's note? Leave it out entirely? Do something else?

I've used style="display:none;" for this in the past. I updated the page in question, I think it looks okay. —Beleg Tâl (talk) 14:48, 15 April 2016 (UTC)
Comment: In my experience, sometimes small marks in the text (such as periods, tops of semi-colons, asterisks, and the like) fail to appear because of the quirks of ink printing. It isn't always possible to indicate how such a correction ought to be made. In this instance, I favor inserting the item as a normal footnote, and including a transcriber's note of explanation within the footnote. --EncycloPetey (talk) 18:33, 15 April 2016 (UTC)
I always put them in the most logical place, leave them displayed and put a comment for the validator to explain what I've done. Beeswaxcandle (talk) 19:11, 15 April 2016 (UTC)
Yeah, I too add things like this in when it's reasonably obvious where they should go. Depends on the work, though; books are more predictable than some other types of thing. — Sam Wilson ( TalkContribs ) … 00:45, 16 April 2016 (UTC)

Requesting a GeoNotice for a local event in San Francisco

Hi all, we're launching a monthly series of WikiSalons in San Francisco. The event announcement is here: w:en:Wikipedia:Bay Area WikiSalon, April 2016

Is there a Wikisource admin who would be willing to set up a Geonotice, so it would show up at the top of the watchlist for Wikisourcers in the San Francisco bay area? Here's an example of what would need to be done: w:en:Special:Diff/715314854 Just making an identical edit to the counterpart page here on Wikisource would do the trick. Thanks for any help -- and hoping to see some Wikisource folks at the WikiSalon! -Pete (talk) 22:11, 15 April 2016 (UTC)

Note: I have learned this might be a more complex request than I realized. Some helpful discussion here, on Commons: commons:Commons:Administrators'_noticeboard#Requesting_a_GeoNotice_for_a_local_event_in_San_Francisco -Pete (talk) 17:47, 17 April 2016 (UTC)
Most Projects are not as large as Commons or Wikipedia. For Wikisource (and most other non-pedia projects) posting to the central community discussion page will reach everyone. --EncycloPetey (talk) 17:50, 17 April 2016 (UTC)
Hi @EncycloPetey:, thanks. I'm not sure I believe this -- I think I did several years of Wikisource work before ever looking at the Scriptorium, and I have never checked it anywhere near as often as I look at my Watchlist. I don't know any way to test it, but I'd be rather surprised if the vast majority of users check the Scriptorium on a regular basis. But, if there is no established way of doing something like a Geonotice, I don't see any reason to insist on it. As I said above, I initially thought I was requesting something simple and routine, and am happy to retract the request if that's not the case. -Pete (talk) 05:07, 19 April 2016 (UTC)
I believe that all geonotices are coordinated through meta. I am not aware of any local controls, see m:Special:CentralNotice. billinghurst sDrewth 12:34, 19 April 2016 (UTC)
Thanks @Billinghurst:, but I just checked...CentralNotice can't get more geographically granular than an entire country. So I guess Geonotice is the only tool that will do that, and if it's not currently set up here at Wikisource, it's not worth doing for this. Thanks for all the info though, this has been an informative discussion. -Pete (talk) 19:12, 21 April 2016 (UTC)

Server switch 2016

The Wikimedia Foundation will be testing its newest data center in Dallas. This will make sure Wikipedia and the other Wikimedia wikis can stay online even after a disaster. To make sure everything is working, the Wikimedia Technology department needs to conduct a planned test. This test will show whether they can reliably switch from one data center to the other. It requires many teams to prepare for the test and to be available to fix any unexpected problems.

They will switch all traffic to the new data center on Tuesday, 19 April.
On Thursday, 21 April, they will switch back to the primary data center.

Unfortunately, because of some limitations in MediaWiki, all editing must stop during those two switches. We apologize for this disruption, and we are working to minimize it in the future.

You will be able to read, but not edit, all wikis for a short period of time.

  • You will not be able to edit for approximately 15 to 30 minutes on Tuesday, 19 April and Thursday, 21 April, starting at 14:00 UTC (15:00 BST, 16:00 CEST, 10:00 EDT, 07:00 PDT).
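As a quick check of the local start times listed above, here is a minimal sketch that converts 14:00 UTC on 19 April 2016 into each zone. The fixed offsets (BST +1, CEST +2, EDT -4, PDT -7) and the helper name `local_start_times` are assumptions made for illustration; a real script would more likely look the zones up in a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets assumed to be in effect in April 2016 (hard-coded here so
# the sketch has no dependency on a time zone database).
OFFSETS_HOURS = {"UTC": 0, "BST": 1, "CEST": 2, "EDT": -4, "PDT": -7}


def local_start_times(start_utc):
    """Return {zone name: 'HH:MM'} for a timezone-aware UTC start time."""
    result = {}
    for name, hours in OFFSETS_HOURS.items():
        zone = timezone(timedelta(hours=hours), name)
        result[name] = start_utc.astimezone(zone).strftime("%H:%M")
    return result


# Start of the read-only window on the first switch day.
switch_start = datetime(2016, 4, 19, 14, 0, tzinfo=timezone.utc)
times = local_start_times(switch_start)
for name, clock in times.items():
    print(f"{name}: {clock}")
```

The same call with `datetime(2016, 4, 21, 14, 0, tzinfo=timezone.utc)` covers the switch back, since the offsets do not change between the two dates.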

If you try to edit or save during these times, you will see an error message. We hope that no edits will be lost during these minutes, but we can't guarantee it. If you see the error message, then please wait until everything is back to normal. Then you should be able to save your edit. But, we recommend that you make a copy of your changes first, just in case.

Other effects:

  • Background jobs will be slower and some may be dropped.

Red links might not be updated as quickly as normal. If you create an article that is already linked somewhere else, the link will stay red longer than usual. Some long-running scripts will have to be stopped.

  • There will be a code freeze for the week of 18 April.

No non-essential code deployments will take place.

This test was originally planned to take place on March 22. April 19th and 21st are the new dates. You can read the schedule at They will post any changes on that schedule. There will be more notifications about this. Please share this information with your community. /User:Whatamidoing (WMF) (talk) 21:08, 17 April 2016 (UTC)

Big Birthdays

Well, we missed the chance to celebrate Charlotte Brontë's 200th birthday by featuring one of her works this month, and I don't see anyone else of that stature in literature with a birthday this year.

But 2017 will mark the 200th birthday of Aleksey Konstantinovich Tolstoy (no, not that Tolstoy) as well as Henry David Thoreau. We still have time to prepare for those. --EncycloPetey (talk) 05:20, 18 April 2016 (UTC)

Proposal to globally ban WayneRay from Wikimedia

Per Wikimedia's Global bans policy, I'm alerting all communities in which WayneRay participated that there's a proposal to globally ban his account from all of Wikimedia. Members of the Wikisource community are welcome to participate in the discussion. --Michaeldsuarez (talk) 14:48, 18 April 2016 (UTC)

Tech News: 2016-16

20:40, 18 April 2016 (UTC)

Announce: Unique Devices data available on API

The analytics team is happy to announce that the Unique Devices data is now available to be queried programmatically via an API.

This means that getting the daily number of unique devices for English Wikipedia for the month of February 2016, for all sites (desktop and mobile) is as easy as launching this query

You can get started by taking a look at our docs at wikitech:Analytics/Unique Devices#Quick Start
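For readers who want to try the kind of query described above, here is a hedged sketch that only builds the request URL and makes no network call. The base path and the path segments (`all-sites`, `daily`, `YYYYMMDD` dates) are assumptions based on the public Wikimedia REST API layout and are not given in the announcement; `unique_devices_url` is a hypothetical helper. Check the Quick Start docs for the authoritative format.

```python
from urllib.parse import quote

# Assumed base path of the metrics endpoint; the announcement does not spell
# it out, so verify it against the Quick Start documentation before use.
BASE = "https://wikimedia.org/api/rest_v1/metrics/unique-devices"


def unique_devices_url(project, access_site, granularity, start, end):
    """Build a unique-devices query URL from its path segments (hypothetical helper)."""
    parts = [project, access_site, granularity, start, end]
    return BASE + "/" + "/".join(quote(part, safe="") for part in parts)


# Daily unique devices for English Wikipedia, February 2016, desktop and mobile.
url = unique_devices_url("en.wikipedia.org", "all-sites", "daily", "20160201", "20160229")
print(url)
```

The resulting URL can be pasted into a browser or fetched with any HTTP client. 2016 was a leap year, so the end date for February is the 29th.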

If you are not familiar with the Unique Devices data, the main thing you need to know is that it is a good proxy metric to measure Unique Users; more info below.

Since 2009, the Wikimedia Foundation used comScore to report data about unique web visitors. In January 2016, however, we decided to stop reporting comScore numbers because of certain limitations in the methodology; these limitations translated into misreported mobile usage. We are now ready to replace comScore numbers with the Unique Devices Dataset. While unique devices does not equal unique visitors, it is a good proxy for that metric, meaning that a major increase in the number of unique devices is likely to come from an increase in distinct users. We understand that counting uniques raises fairly big privacy concerns, and we use a very privacy-conscious way to count unique devices: it does not include any cookie by which your browsing history can be tracked.

—NRuiz (WMF), wikitech-l

Not sure if anyone is wishing to play with that data, or the value of it, either way, it is there. — billinghurst sDrewth 12:11, 20 April 2016 (UTC)

Without knowing the likelihood of someone using multiple devices, or the mean number of devices from which users access, the data is of little value. For example, I regularly use four devices to access Wikisource on any given day. --EncycloPetey (talk) 19:35, 21 April 2016 (UTC)
Interesting, especially the split mobile/desktop.— Mpaa (talk) 21:02, 21 April 2016 (UTC)

Catalog of Copyright Entries

i’ve started this long term project by uploading Index:1977 Books and Pamphlets July-Dec.djvu. as historical background, the US copyright office stopped digitizing its records from 1923 to 1977. The Hathi trust has a project to research each orphan work in that period to determine copyright status. they find about half the time works were not renewed making them public domain.[20] there are around 100 volumes of 1600 pages, of book copyright records.

IAuploader does not work, it appears the files are too big (larger than 50MB less than 100MB). i use chunked uploads but it fails half the time. i will approach Hathi trust for comments if this helps their search. user:Mpaa would a bot filling pages be useful for these records? any thoughts would be appreciated. Slowking4₮₳₤₭ 00:35, 25 April 2016 (UTC)

@Slowking4:, do you need help?— Mpaa (talk) 17:49, 26 April 2016 (UTC)
@Mpaa:, i am untutored in the ways of bot page creation, "not proofread". these volumes would seem to be a good fit for that. Slowking4₮₳₤₭ 23:31, 26 April 2016 (UTC)
I am surprised that chunked upload is failing in your case. I have uploaded lots of books in recent times (up to yesterday) by this method, to both Commons and Bengali Wikisource, without any failure, even files more than 300 MB in size (e.g. this file of 392 MB). It works even when the internet connection goes off (by power-cut) and I have to shift to another connection (by wi-fi). Irrespective of net connection problems, the upload continues, with in-between halts. IA upload also works for me, even for files more than 99 MB in size (e.g. this file). Hrishikes (talk) 01:27, 25 April 2016 (UTC)
i find it times out trying to knit chunks together. maybe you will have better luck, have a go at c:File:Catalog_of_Copyright_Entries_1977_Books_and_Pamphlets_Jan-June.djvu & [21]. Slowking4₮₳₤₭ 03:20, 25 April 2016 (UTC)
@Slowking4: The djvu file is corrupt. I'll look into it tonight. Hrishikes (talk) 10:49, 25 April 2016 (UTC)
@Slowking4: Done: Index:Catalog of Copyright Entries 1977 Books and Pamphlets Jan-June.pdf. Hrishikes (talk) 09:49, 26 April 2016 (UTC)
great job, i fear this IA corrupt file problem may be widespread, and a major hurdle along with file size. getting one year readable will make a good first step. thanks. Slowking4₮₳₤₭ 09:59, 26 April 2016 (UTC)
@Slowking4: Finally succeeded with djvu: Index:Catalog o‌f Copyright Entries 1977 Books and Pamphlets Jan-June.djvu. The djvu corruption was due to overcompression. Hrishikes (talk) 15:25, 28 April 2016 (UTC)
the text layer is much better for the jpg version i.e. Page:Catalog of Copyright Entries 1977 Books and Pamphlets Jan-June.pdf/11 versus Page:Catalog o‌f Copyright Entries 1977 Books and Pamphlets Jan-June.djvu/5. thoughts ? Slowking4₮₳₤₭ 22:52, 28 April 2016 (UTC)
@Slowking4: The text layer at IA was created from the high resolution jp2 version, whereas the djvu cum text layer was created by me locally from the pdf version. By the way, please arrange moving 1 to 2. Hrishikes (talk) 00:13, 29 April 2016 (UTC)

Tech News: 2016-17

21:02, 25 April 2016 (UTC)

Wikisource sessions at Open Educational Resources conference

Hi all, the OER conference took place last week at the University of Edinburgh, with an audience of academics, librarians, learning technologists, and related staff, from many different countries. I gave two sessions relating to Wikisource: a short presentation to an audience of around 50, then a longer tour through the site in a computer room, to an audience of about 9 or 10. Twitter reaction was positive: the audience seemed very appreciative of Wikisource, and some voiced an interest in working with it further. I've collected the reactions here. I will stay in touch with those who have expressed an interest and see if we can get them to share some texts. MartinPoulter (talk) 14:59, 26 April 2016 (UTC)

Index:O Douglas - Olivia in India.djvu

No file present. ShakespeareFan00 (talk) 16:22, 26 April 2016 (UTC)

File needs to be undeleted at Commons and then moved to en WS. Billinghurst can do this, having admin rights at both ends. Hrishikes (talk) 17:13, 26 April 2016 (UTC)
Done. billinghurst sDrewth 07:06, 27 April 2016 (UTC)
Thanks :) ShakespeareFan00 (talk) 08:53, 27 April 2016 (UTC)

\mathop not functioning, asking for the replacement of Math extension by SimpleMathJax

I was trying to use the TeX/LaTeX \mathop operator and discovered that it didn't work. It seems to be because of the Math extension, which will disappear sooner or later and be replaced. In fact, I tested on our wikis (1.24.0), and it works with the new and simple SimpleMathJax extension. I tried to read the discussion, and I understand that some browsers can't display the new maths yet (how many can't?), but it looks very nice and I am willing to push the adoption of this new extension, which is not adopted yet. The following code

<math>\mathop{\int\!\!\!\int\!\!\!\int}_{\Pi-\varpi} u(y){\partial a(y) \over \partial y_i}\,\mathrm{d} y_i</math>

should render as

\iiint\limits_{\Pi-\varpi} u(y){\partial a(y) \over \partial y_i}\,\mathrm{d} y_i

but is rendering as:

\mathop{\int\!\!\!\int\!\!\!\int}_{\Pi-\varpi} u(y){\partial a(y) \over \partial y_i}\,\mathrm{d} y_i

--Nbrouard (talk) 08:45, 28 April 2016 (UTC)

Further evidence regarding Author pages

Two days ago, I posted "A Lament for Adonis" in the New texts. This is our first work by the classical author Bion, and the first new text (translation) by Elizabeth Barrett Browning that we've had in a long, long time. Below, you can see the view statistics for these three pages, in the same order I've linked to them in the preceding text.

As I noted before, people are watching our New Texts list, and are visiting the Author pages in addition to the page for the new text. --EncycloPetey (talk) 16:23, 28 April 2016 (UTC)

I don't want to depress your enthusiasm, but please remember many "modern" browsers perform pre-caching; i.e. they will "follow" one or more links deep from the page you are viewing in anticipation that you may choose to follow one of those links. In other words, many of these page views may merely have been the result of an automatic process which cannot be usefully distinguished from actual manual viewing. Just a thought (and of course I hope this is only a misconception). AuFCL (talk) 21:15, 28 April 2016 (UTC)
If that were happening here, I would expect the two author pages to have more nearly the same number of hits. From experience with previous listings, I've seen drops in the number of page views for a work and its author while the page was still in place at the top of the list, after the first two or three days there. So, I rather think that's not what we're seeing, or it's not so common as to produce a noticeable effect in the data. --EncycloPetey (talk) 03:26, 29 April 2016 (UTC)

Patching DjVu files

I know from recent discussion that the Internet Archive no longer generates DjVu files. But do we still have people here who can patch problems in DjVu files? Specifically, can duplicate pages be removed and missing pages inserted, without the need for IA's assistance? --EncycloPetey (talk) 14:48, 1 May 2016 (UTC)

Yes, I can. Hrishikes (talk) 15:19, 1 May 2016 (UTC)
@Hrishikes:, I was already working on it, I have uploaded a fixed version (page 290 is still poor quality).— Mpaa (talk) 15:26, 1 May 2016 (UTC)
@Mpaa, @EncycloPetey: Page 290 corrected. Hrishikes (talk) 16:15, 1 May 2016 (UTC)
Done @Mpaa, @Hrishikes: Thanks for that. I guess the answer is "yes", then. :D --EncycloPetey (talk) 16:33, 1 May 2016 (UTC)

New POTM

Hi everybody, I have changed the POTM option to the work for May, as discussed in the relevant page. I don't know if the action was in order; if not, please revert. Hrishikes (talk) 05:34, 2 May 2016 (UTC)

Tech News: 2016-18

20:09, 2 May 2016 (UTC)

OCR gadget messes up the editing environment

Selection of the OCR gadget forces the page header above the toolbars, and suppresses the Proofread tool option of the advanced editing toolbar. PLEASE SEE THIS IMAGE. — Ineuw talk 20:30, 2 May 2016 (UTC)

PetScan: maintenance tool available for enWS

To bring to the attention of users: PetScan, a new rendition of previous toollabs tools (intersections, categories, ...) by Magnus Manske.

PetScan can generate lists of Wikipedia (and related projects) pages or Wikidata items that match certain criteria, such as all pages in a certain category, or all items with a certain property. PetScan can also combine some temporary lists (here called "sources") in various ways, to create a new one.
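To make the idea of combining "sources" concrete, here is a small sketch of the underlying set logic. It is not PetScan's code or API: the `combine` function, its mode names, and the page titles are all hypothetical, chosen only to illustrate how several page lists can be merged into a new one.

```python
def combine(sources, mode="intersection"):
    """Combine several lists of page titles into one sorted list.

    Hypothetical helper illustrating the set logic only; PetScan itself
    is a web tool with its own interface.
    """
    sets = [set(source) for source in sources]
    if not sets:
        return []
    if mode == "intersection":    # pages present in every source
        combined = set.intersection(*sets)
    elif mode == "union":         # pages present in at least one source
        combined = set.union(*sets)
    elif mode == "difference":    # pages in the first source but no other
        combined = sets[0].difference(*sets[1:])
    else:
        raise ValueError(f"unknown mode: {mode}")
    return sorted(combined)


# Made-up example sources: members of a category, and pages using a template.
in_category = ["Work A", "Work B", "Work C"]
uses_template = ["Work B", "Work C", "Work D"]

print(combine([in_category, uses_template]))                # in both lists
print(combine([in_category, uses_template], "union"))       # in either list
print(combine([in_category, uses_template], "difference"))  # category only
```

On a wiki like this one, the "difference" case is the sort of thing that would find, say, works in a category that lack a given template.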


It would be interesting to hear what uses our contributors can get, or think that we should get from the tool. — billinghurst sDrewth 01:19, 3 May 2016 (UTC)

I know nothing about PetScan or its applications except as revealed in recent discussions. However the link provided seems to lead nowhere. From the "list of tools" though, the correct PetScan link would appear instead to be // whereas the above link expands to // Is there some kind of redirect missing or other kind of known breakage? AuFCL (talk) 10:51, 3 May 2016 (UTC)
There is no breakage. The URL for PetScan is simple as that. It runs on its own virtual machine, so it doesn't conform to the pattern of the other, "shared infrastructure" tools. If you absolutely want a toollabs "internal" link, you can use toollabs:quick-intersection, which redirects to PetScan, but that seems quite pointless to me. --Magnus Manske (talk) 08:31, 3 May 2016 (UTC)

Pollyanna, move to disambiguate or replace?

Hi. Our contributors have recently finished the transcription of Index:Pollyanna.djvu and we already have a Gutenberg version at Pollyanna. I am seeking opinion on whether I move the file and create a {{versions}} page, or whether we replace with the transcluded version in its place? — billinghurst sDrewth 02:10, 3 May 2016 (UTC)

IMO, replacement with scan-backed text is better than versioning. If the PG text was a significant and different edition, then that would be a different matter. Beeswaxcandle (talk) 02:34, 3 May 2016 (UTC)
I agree, replace. My rule of thumb is, keep the Gutenberg version iff:
(a) it is possible to figure out what edition it is based upon. But note that Gutenberg boilerplate header text specifies "Project Gutenberg Etexts are usually created from multiple editions.... Therefore, we do NOT keep these books in compliance with any particular paper edition, usually otherwise." So it will not usually be possible to identify an edition. Sometimes, however, the published information on the title page will be transcribed; or, intratextual evidence will allow it to be attributed to, say, the "magazine text", or the "American book text".
(b) the Gutenberg edition differs in important ways from our sourced edition, so that it is worth continuing to host until we have a sourced version of that edition.
In this case, the Gutenberg Pollyanna fails (a).
Hesperian 02:40, 3 May 2016 (UTC)
I agree with Beeswaxcandle and Hesperian. Replace it if there are no significant differences between the two. Jpez (talk) 06:19, 3 May 2016 (UTC)
Thanks. Until we have a firm statement and guidance in the deletion policy, I will seek the community's opinion where I come across these examples. — billinghurst sDrewth 10:15, 3 May 2016 (UTC)
Note that the Gutenberg boilerplate isn't true; very few of Project Gutenberg's etexts are created from multiple editions. If it went through Distributed Proofreaders (which will be credited in the book), I can probably find the edition information. I have access to a private archive with scans from the Distributed Proofreaders books; the PTB are concerned with displaying scans from other online sources that would not appreciate public display of their scans, and the remaining scans that could be displayed freely aren't separated out.--Prosfilaes (talk) 01:16, 5 May 2016 (UTC)
@Prosfilaes: If we can attribute the work to an edition, I can resurrect the prior version, and disambiguate. Until we have a version, to which we attribute value, and DP/Gutenberg does not, I think that this is going to be a repeating issue. Solutions are better than repeated problems. — billinghurst sDrewth 07:10, 5 May 2016 (UTC)

TOC template and "overuse"?

I am working on a dotted TOC template for Indian Medicinal Plants Part 1. It seems as if the current template is overused, but, if so, then what should I use for the rest of the TOC? Should I "break" the TOC into different parts? - Tannertsf (talk) 15:26, 4 May 2016 (UTC)

Use {{TOCstyle}}. That should only need one template invocation per contents page. ShakespeareFan00 (talk) 19:20, 4 May 2016 (UTC)