Wikisource:Scriptorium

Scriptorium
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one. Project members can often be found in the #wikisource IRC channel (webclient). For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 311 active users here.


Announcements

Note
This section can be used by any person to communicate Wikisource-related and relevant information; it is not restricted. Generally announcements won't have discussion, or it will be minimal, so if a discussion is relevant, it is often best to add another section under Other discussions, with a link in the announcement to that section.

Beta: Create a VisualEditor plugin to integrate with Wikisource

Coren has put a note into Phabricator about the next development stage of integrating VisualEditor into Wikisource, initially in the standard namespaces and after that in the Page: namespace. He says that mainspace editing with VisualEditor is usable (including the transclusion tag, though we don't use it). This is currently working on a test server and is scheduled for the deployment train on Tuesday, April 5 (deployed to group 1, which includes the Wikisources, on April 6). Jdforrester is planning to turn on the configuration switch on April 7, at which point VisualEditor will become available as a beta feature on Wikisource for all content namespaces except Page (that's the next part being worked on).

Feedback is best left directly on the Phabricator ticket. — billinghurst sDrewth 01:56, 3 April 2016 (UTC)

To be clear, while that note was exactly correct, it's probably more useful to note that VisualEditor will be made available in the beta features of every Wikisource on April 7. You can turn it on there at that time. Coren (talk) 14:15, 3 April 2016 (UTC)
it’s working well for article space, not enabled for page namespace yet. test it out and leave feedback. i added a Wikisource:VisualEditor redirect, since it was a redlink in the edit summaries. Slowking4RAN's revenge 01:22, 10 April 2016 (UTC)

250,000 Validated Pages

We reached 250,000 validated pages on Monday 4 April, with this edit [1] by Akme. Beeswaxcandle (talk) 07:00, 5 April 2016 (UTC)

Hurrah! That's great. :) Now, on to the next ¼ million... — Sam Wilson ( TalkContribs ) … 00:21, 6 April 2016 (UTC)

2000 validated indices

Sometime in the past week we passed 2,000 validated works (currently 2,008), with a further 1,139 being proofread. Congrats to all. — billinghurst sDrewth 13:40, 9 May 2016 (UTC)

It looks like it was Index:Rebecca of Sunnybrook Farm (1903).djvu on 2 May. Beeswaxcandle (talk) 09:35, 10 May 2016 (UTC)

Proposals

Use sortable tables instead of list items to list works

I was wondering, wouldn't it be better to use sortable tables instead of list items to list works in author pages and portals? This way works can be displayed either alphabetically or by year depending on what the user prefers. It's very difficult to find works when they are listed by year and sometimes it's nice to see in which order the works were written chronologically. So why not have both? Jpez (talk) 05:52, 21 April 2016 (UTC)

That works, but only if the list to be sorted does not have items listed under other items, and only as long as all copies of a work are published under the same title. It wouldn't work very well for the page Author:Aeschylus, because (a) The Oresteia has three sub-parts, (b) there are multiple translations of single works listed, (c) works such as his Χοηφόροι have been titled in English as both "Choephori" and "The Libation Bearers" and will not group together when sorting by title, and (d) the different editions of translations by the same translator will have been published in different years, and certainly never in the same year as the original publication.
In any case, you can find a work on a page, if you know the title, by using your browser's "Find" function. --EncycloPetey (talk) 16:30, 21 April 2016 (UTC)
@EncycloPetey: Personally, I would set up the Author:Aeschylus page completely differently. I would get rid of all the sub-lists and only link to the main page of each work. Concerning The Oresteia, I would link to The Oresteia page and not list each work of the trilogy. I don't see the point in listing each and every translation of every work on the author's page when they are all individually listed on each work's page anyway. For example, it would look something like this.

Tragedies

Title                   Year
The Persians            472 BCE
Prometheus Bound        480–410 BCE
Seven against Thebes    467 BCE
The Suppliants          463 BCE
The Oresteia            458 BCE
(Unsigned comment by Jpez (talk).)
@Jpez: That approach would eliminate all the benefits of being able to see (at a glance) whose translations of each play we have, how many we have, what state they are in, and when the translations were published. The sample table above hides all the information except the original date of performance, which is by far the least valuable piece of information concerning those plays. --EncycloPetey (talk) 01:02, 23 April 2016 (UTC)
@EncycloPetey: Well all that information would only be a click away, and if there are many works and translations the list of them can be overwhelming. I think it's like something you've done here Portal:Greek_language_and_literature with the ancient Greek drama portal. Instead of listing every work there you've created the portal and linked to that. Anyway I don't think this is a serious issue, it's just the way I would set up the page and a way tables might be implemented. Jpez (talk) 05:18, 23 April 2016 (UTC)
@Jpez: No, the information would not be a click away, and that's my point. Each bit of information would be a click away, but the user would have to click all of the links and remember what was on each page all at once to get the overall view currently available by putting it in a single place. The Portal:Ancient Greek drama is separate because the list is many screens long—in fact it is as long as all the other content on Greek language and literature put together—and insofar as it succeeds, it does so because it forms a coherent and separate whole within the corpus of Greek literature. The Portal itself is intended to be exhaustive, and is much longer than most Author pages. But a Portal is likely not a good comparison, since each Portal has the freedom to include or exclude content, and to be structured in any way that is convenient. Our Author pages need to follow a reasonably consistent format for the sake of our users. --EncycloPetey (talk) 05:43, 23 April 2016 (UTC)
Having tables adds an element of complexity and isn't for every user. — billinghurst sDrewth 04:14, 22 April 2016 (UTC)
@Billinghurst: It does, but it can be made easier by using a template. Jpez (talk) 05:18, 23 April 2016 (UTC)
I am strongly against the idea, as I think the tables look ugly and create more problems than they solve. In fact I have been trying to get rid of tables in places; see, for example, Talk:Bible#Page formatting.
While I agree that the above example of Author:Aeschylus is a poor one, as translations should really be listed on the separate {{translations}} page, I can think of other examples of sub-item lists that are relevant. I often use them for works derived, adapted, excerpted, or originally contained in other works; for examples see Author:John Mason Neale, Author:Bernard of Cluny, Author:Katherine Hankey.
What about when you have several sections on a page? Would you put all poetry, prose, dramas, encyclopedia entries, letters, anthologies, etc. into a single table? Would we still have a different column scheme per section table? per page? —Beleg Tâl (talk) 13:26, 22 April 2016 (UTC)
@Beleg Tâl: For argument's sake (since I see no one is too keen on implementing my brilliant idea :), I agree that they can look ugly (as the table on the Bible page does, in my opinion), but I think they could be made to look nice as well; for example, just getting rid of the borders makes it look a bit more presentable.
Title                   Year
The Persians            472 BCE
Prometheus Bound        480–410 BCE
Seven against Thebes    467 BCE
The Suppliants          463 BCE
The Oresteia            458 BCE
A few gaps, a bit of aligning, etc., and you could even make it look much the same as what we are using now (which I like, btw), the only difference being that it would be sortable. If I have time I might come up with something (just for argument's sake). As for different sections etc., I would prefer to use different tables for each section; the setup would be the same as it is now, just sortable. Also, lists can be used within tables if needed. Jpez (talk) 05:18, 23 April 2016 (UTC)
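As a rough illustration of what a sortable listing would offer, here is a minimal Python sketch that orders the sample works from this thread either by title or by year. The `Work` type, the `sort_works` helper, and the choice to represent a date range by its earliest year are assumptions made for the example; they are not part of any Wikisource template or MediaWiki feature.

```python
from typing import NamedTuple


class Work(NamedTuple):
    title: str
    year: int  # negative = BCE; for a date range, the earliest year (an assumption)


# Titles and dates are taken from the sample table in this thread.
WORKS = [
    Work("The Persians", -472),
    Work("Prometheus Bound", -480),  # listed as 480-410 BCE; earliest bound used
    Work("Seven against Thebes", -467),
    Work("The Suppliants", -463),
    Work("The Oresteia", -458),
]


def sort_works(works, by="title"):
    """Return a new list ordered by 'title' (plain text order) or by 'year' (oldest first)."""
    if by == "title":
        return sorted(works, key=lambda w: w.title.casefold())
    if by == "year":
        return sorted(works, key=lambda w: w.year)
    raise ValueError(f"unknown sort column: {by!r}")


for work in sort_works(WORKS, by="year"):
    print(work.title, work.year)
```

Note that a plain title sort files "The Persians" under T, and it would not group alternative English titles of the same work, which is the limitation EncycloPetey raises above.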

Bot approval requests

Help

Preferably, ask your help questions at Wikisource:Scriptorium/Help.

Repairs (and moves)

Other discussions

U.S. Supreme Court Style Manual!

The U.S. Supreme Court Style Manual, viewed by the justices as an internal document for helping law clerks and justices draft opinions in proper form, is going public for the first time, without the court's approval. What's a good way to ensure we get an OCR'd copy of this soon? Buy it and give it to the Internet Archive? They have the infrastructure to OCR it. {{PD-USGov}} applies, of course. --Elvey (talk) 00:54, 31 March 2016 (UTC)

We don't have a purchasing acquisition policy. If you think that there is value in getting it, and reproducing it (where compliant with scope) then it would be through a private purchase, or through a grant application to WMF. IA would indeed be the means to convert to OCR if necessary. Will it not be an electronic document anyway? — billinghurst sDrewth 01:07, 1 April 2016 (UTC)
WMDC has a book grant program, apply if interested https://wikimediadc.org/wiki/Book_grants -- deadline tomorrow. Slowking4RAN's revenge 01:40, 3 April 2016 (UTC)
@Elvey: note deadline of 4 April — billinghurst sDrewth 03:01, 3 April 2016 (UTC)
Cool. Done. --Elvey (talk) 04:02, 3 April 2016 (UTC)

Problem identified — long tables not wrapping over printed pages (pdf/epub/...)

When pages are exported or printed to EPUB/PDF, long tables no longer wrap across pages; they seem to get stuck on one long page that runs off the bottom. This was not previously the case, and I am unsure when it broke. Does anyone have any idea what the issue may be, and/or who we can harass to look at a fix? — billinghurst sDrewth 13:16, 2 April 2016 (UTC)

Pages with <math> markup

In Wikisource, under each user's Preferences -> Appearance -> Math section (at the bottom of the Appearance page), please check that your Math setting is MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools). If you are viewing or editing Wikisource, the older PNG and LaTeX settings are currently generating some gibberish. Billinghurst has requested a fix through Phabricator. Until this is fixed, please check your user Preference settings for Math before editing pages with math. Outlier59 (talk) 02:20, 3 April 2016 (UTC)

English Translations of Puranas

I have got some English translations of Hindu Puranas published by Motilal Banarsidass from the West Bengal Public Library Network, and they are in the public domain. But the names of the authors are not mentioned. How can I add them to Wikisource? -Trinanjon (talk) 03:50, 3 April 2016 (UTC)

There are plenty of Puranas from this publisher available on the WB site. Not all are PD. It may be possible to identify the translator; e.g. J. L. Shastri was the translator of the Siva Purana volumes. So can you provide the specific links of the books you are considering? There are 74 books in this series (1, 2), and there will be approximately 100 volumes on completion. Hrishikes (talk) 06:34, 3 April 2016 (UTC)
Is the Siva Purana by J.L. Shastri under PD? I will also be giving the links of the books such as Skanda Purana, Padma Purana, Garuda Purana, Varaha Purana, etc. -Trinanjon (talk) 09:16, 3 April 2016 (UTC)
Published in 1950 (see here; 1st ed. of vol. 1 available here, which shows the date as 1970 on the book), so the author was alive then; therefore not PD-India on the URAA date. J. L. Shastri was one of the general editors for the whole series, so none of the lot is likely to be PD. Hrishikes (talk) 10:22, 4 April 2016 (UTC)

April PotM needed

We need a PotM for April up on the Main page. Input at Wikisource talk:Proofread of the Month#April 2016... Londonjackbooks (talk) 15:31, 3 April 2016 (UTC)

I went ahead and made a selection based on previous input. See Talk. Londonjackbooks (talk) 20:43, 3 April 2016 (UTC)

Tech News: 2016-14

22:13, 4 April 2016 (UTC)

Most viewed books with chapters

During my recent exploration, I felt the need for a list of the most-viewed books on my Telugu Wikisource. I noticed similar requests for English Wikisource (#1). Starting from the top 1000 pageviews data, I have written an R script to aggregate the page views for all books with chapters (as indicated by the use of '/' in the main namespace page title). I am happy to share the first results at User:Arjunaraoc/201603TopViewsOfBookChapters. I found the top rank going to Constitution_of_India a bit surprising. Do share your feedback. --Arjunaraoc (talk) 11:27, 6 April 2016 (UTC)
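The aggregation described above might look roughly like the following Python sketch. It is not the R script mentioned in the post; the function name `aggregate_book_views`, the input shape, and the sample rows (including the subpage names and view counts) are hypothetical, and it assumes that only subpage views are counted toward a book.

```python
from collections import Counter


def aggregate_book_views(page_views):
    """Sum views of subpages (titles containing '/') under their root title.

    page_views: iterable of (title, views) pairs, e.g. rows of a top-pageviews list.
    Pages without '/' are skipped, since only books with chapters are counted.
    """
    totals = Counter()
    for title, views in page_views:
        root, sep, _chapter = title.partition("/")
        if sep:  # a '/' was found, so this is a chapter of a book
            totals[root] += views
    return totals.most_common()


# Hypothetical sample rows; subpage names and numbers are made up for illustration.
sample = [
    ("Constitution_of_India/Part_III", 120),
    ("Constitution_of_India/Part_IV", 80),
    ("The_Problems_of_Philosophy/Chapter_1", 150),
    ("Main_Page", 9000),
]
print(aggregate_book_views(sample))
```

Whether views of a book's root page should also be added to its total is a design choice the post does not settle; the sketch leaves them out.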

Employment Non-Discrimination Act of 2013 - duplicate legislation needs review and possibly merging

We have two copies of the same legislation, one supported by a scan, and the other a copy ... Special:PrefixIndex/Employment Non-Discrimination Act of 2013. Not certain if one is from the HoR and the other from the Senate, or what. It would be useful for someone conversant with US legislation to have a look-see and work out which is better, whether they should be merged, or what. — billinghurst sDrewth 02:10, 7 April 2016 (UTC)

I think I’ve sorted them out. They are competing versions of the same unenacted U.S. federal legislation. The Senate version passed; the House version died in committee (at least in the 113th Congress; I did not check to see whether similar bills were introduced in subsequent Congresses). I’ve linked them to each other and indexed them both under Portal:United States Congress. Hope that is satisfactory. Tarmstro99 13:30, 7 May 2016 (UTC)

Importing partially completed predominantly English text from Telugu Wikisource

We have a predominantly English text about Telugu grammar, partially proofread, on Telugu Wikisource. Would English Wikisourcers be interested in importing it here and completing it? --Arjunaraoc (talk) 09:18, 7 April 2016 (UTC)

@Arjunaraoc: I presume you meant this link? If so it appears none of Charles Phillip Brown's works are currently on enWS. And in passing, why does it appear to be tagged PD-2013 when the flyleaf reads 1857? Is this either an accident or (as I don't at all read Telugu) some other concern? AuFCL (talk) 10:31, 7 April 2016 (UTC)
AuFCL, Yes. I updated the link now. Copyright tag was incorrect. As we have several books for which copyright was freed via Digital Library of India, many Telugu wikisourcers used PD-2013 for such books. I updated it now as PD-Old. --Arjunaraoc (talk) 23:52, 7 April 2016 (UTC)
@Arjunaraoc: Hi! Can you please explain how copyright was "freed" by DLI? DLI has plenty of copyrighted books, including those published in the 1990s, but how can copyright be deemed as freed by inclusion in DLI? Hrishikes (talk) 02:13, 8 April 2016 (UTC)
@Hrishikes, DLI had done an exercise of contacting authors and publishers to free the copyrights. In the earlier versions of the DLI page, there used to be a section such as "search in copyright-freed books". As they claim to be compliant with the Indian Copyright Act, DLI being a government body, we are treating all DLI books as copyright-freed. Hope that helps. --Arjunaraoc (talk) 04:57, 8 April 2016 (UTC)
@Arjunaraoc: I don't think this is quite in order. If such were the case, DLI would have included a CC license or equivalent for every book, because the works are copyrighted as per Indian Copyright Act, but DLI can, of course, procure the copyrights and release them to PD under CC license. Have they done so? Can you point to any such documentation? Because, other Indic wikisources (I am active in Bengali), and even English Wikisource can also benefit if such is the case. DLI being a Govt body does not automatically make the books copyright-free. Best, Hrishikes (talk) 05:35, 8 April 2016 (UTC)
@Arjunaraoc:, Can you please provide a link or any documentation as a proof to the statement where it has been declared that DLI have been given consent by the copyright-holders and the publishers of the books to release them under CC. DLI has so many books which are not under PD-India and unless there is a proof about their release of license, it will be considered as copyright violation, if I am not wrong. -- Bodhisattwa (talk) 06:42, 8 April 2016 (UTC)
@Hrishikes, @Bodhisattwa: Check this presentation at page 21, where the freed copyrights are mentioned. Commons did not accept our claim recently and deleted several books, which made us upload them to the Telugu Wikisource. There were some other presentations on the web about dealing with copyrights, which I am not able to locate now. Hope that helps. --Arjunaraoc (talk) 09:26, 8 April 2016 (UTC)
The presentation link cited in the previous remark may be dead. You may check the latest Copyright Policy of DLI, as archived on the Wayback Machine on April 8, 2016, and contact DLI for any further clarification. --Arjunaraoc (talk) 09:43, 8 April 2016 (UTC)
@Arjunaraoc: Could not check the 1st link (DLI site is down), but checked the second link, which is basically useless. In it, DLI claims that the works are copyright-free, and states that if copyright holders complain otherwise, then the books concerned will be removed. There is no explanation as to how a book not covered by PD-India (like a book published in 1970) could be copyright-free. A vague claim without specifics never suffices; I can well understand why Commons did not accede to your claims. While adding books to Wikisource (whether directly or through Commons), one should check whether the book can really be deemed PD as per the 1st publication year and the author's death year. One should not go by any "claim" by a website, even if Govt-owned. DLI just claims that they are copyright law compliant, and then continues piling up copyrighted works by the hundreds. Without specific documentation of release under CC or the like, all books that seem to be under copyright should be deemed copyrighted. Hrishikes (talk) 10:58, 8 April 2016 (UTC)
@Arjunaraoc: Thanks for the links (though the first link doesn't open). The second link only shows a claim from DLI that all of their books are copyright-free. But there is no proof that authors and publishers have given DLI their consent to release their works under a CC license. Furthermore, the link also says that the copyright policy follows the Indian Copyright Act 1957, according to which books become copyright-free 60 years after the death of the author or first publication, whichever is later. So the policy contradicts the claim. -- Bodhisattwa (talk) 13:45, 8 April 2016 (UTC)

Visual Editor now in article space

visual editor is now a beta feature for article space editing. check it out and leave feedback. here is the phabricator task https://phabricator.wikimedia.org/T48580 -- Slowking4RAN's revenge 16:28, 7 April 2016 (UTC)

OCR not working?

Is the OCR button working for anyone?

This file Index:The New International Encyclopædia 1st ed. v. 02.djvu doesn't seem to have a text layer, but when I tried using OCR, my only result was the edit window turned grey. --EncycloPetey (talk) 14:51, 8 April 2016 (UTC)

It worked for me but the result was awful, you'd be much better off typing it yourself than using the OCR produced. Maybe it would be better to upload it to archive.org and see if you get a better OCR. Jpez (talk) 08:23, 9 April 2016 (UTC)
The source of that file has a text layer, perhaps there was a problem with the upload wizard. Or maybe the uploader intended to use proofread text from elsewhere. I'm guessing that overwriting the file would fix things. CYGNIS INSIGNIS 08:55, 9 April 2016 (UTC)
OCR layer added. Hrishikes (talk) 14:00, 9 April 2016 (UTC)
Done --Thanks, everyone! --EncycloPetey (talk) 16:35, 9 April 2016 (UTC)

Need an example linking to sections in main name space, transcluded from the page name space.

I am trying to link to sections in the Wikisource main namespace that were originally in the Page namespace, but cannot get it to work. Example on te.wikisource: te wikisource page with section tag #జలగం and the page containing the section is page which has section named ##జలగం##. Can someone give an example? --Arjunaraoc (talk) 06:48, 9 April 2016 (UTC)

@Arjunaraoc: I think I see what you are trying to do. Linkage requires the existence of either id= or name= on the element that you wish the link to target. Unfortunately the <section> tag does not provide this service (as far as I know), so may I suggest augmenting:
<section begin="జలగం"/>{{p|fs150}}జలగం వెంగళరావు ముఖ్యమంత్రిత్వం</p>
—by substituting something like this instead:
<section begin="జలగం"/>{{p|fs150}}<span id="జలగం"/>జలగం</span> వెంగళరావు ముఖ్యమంత్రిత్వం</p>
—which then ought to expose the anchor point "జలగం" for linkage purposes as usual. AuFCL (talk) 07:23, 9 April 2016 (UTC)
  • @AuFCL: I tried
    <section begin="జలగం"/>{{p|fs150}}<span id="jalagam">జలగం వెంగళరావు ముఖ్యమంత్రిత్వం</span></p>
    
    after correcting a minor typo and using english name for id, as otherwise the link is not working. One more doubt, is it possible to see the sections after transclusion directly in wikipage?. --Arjunaraoc (talk) 09:10, 9 April 2016 (UTC)
    @Arjunaraoc: I think I might have misunderstood your requirements. Did you want to (A) construct a destination/landing point for a link (which is what I tried to describe above), or (B) transclude a portion of a page into another page?

    In other words which is the relevant tag between "జలగం"/"jalagam" (case A), or "ఆత్మకథచివరిపేరా" (case B)? I think you may need to re-state the question. Please pardon me for confusing the issue. AuFCL (talk) 09:51, 9 April 2016 (UTC)

  • @AuFCL, Not at all. My requirement is (A). I thought doing (B), even if it is not ultimately used for transclusion, would also help accomplish (A), but it looks like (A) needs special HTML code called <span>..</span>. In this specific case, I did not need section transclusion, so I dropped (B) and used your solution for (A) with a slight change. Hope the revised link makes it clear. The page is linked from Wikipedia. --Arjunaraoc (talk) 10:22, 9 April 2016 (UTC)
  • My additional question about the need for seeing the anchors, is so that I do not need to visit page namespace, before making the link, if the transcluded pages already have anchors.--Arjunaraoc (talk) 10:25, 9 April 2016 (UTC)
O.K. A couple of points: there is nothing special about using <span> to carry the id= attribute; I only chose that as a fairly harmless HTML tag which would not disrupt the rest of the text. Unfortunately the {{p}} template does not make provision for specifying a name/id value; otherwise using its expansion:
<p class="pclass" style="font-size:150%;" id="jalagam">జలగం వెంగళరావు ముఖ్యమంత్రిత్వం</p>
—ought to work equally well. As far as I can tell the te-wikipedia page you specified appears correctly linked to the te-wikisource destination.
With regard to making the anchor-points visible, use of {{anchor}} or {{anchor+}} (both of which are present on teWS) might be what you were looking for, as they create a <span> automatically with the required attributes to establish the anchor point as well as those to provide minimal marking? For example, hovering your mouse cursor over the word "anchor" in this sentence should yield a pop-up identifying message. Maybe this is not obvious enough for your intended purpose? AuFCL (talk) 11:05, 9 April 2016 (UTC)
  • AuFCL, Thanks very much for clarifying very well. Your suggestion about {{Anchor+}} is also useful. English Wikisourcers are always friendly and helpful in my interactions. Thanks a lot. --Arjunaraoc (talk) 17:23, 9 April 2016 (UTC)

@Arjunaraoc:

Outlier59 (talk) 16:27, 9 April 2016 (UTC)

  • Outlier59, Thanks for your helpful suggestions. I am able to resolve the issue. --Arjunaraoc (talk) 17:23, 9 April 2016 (UTC)
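The anchor approach discussed in this thread (an element carrying an id= attribute, linked to with a #fragment) can be sketched in Python as two small string helpers. The helper names `anchor_span` and `section_link` and the page title "Example page" are hypothetical; only the `<span id=...>` markup and the `[[Page#anchor]]` link form come from the discussion and standard wiki syntax.

```python
from html import escape


def anchor_span(anchor_id: str, text: str) -> str:
    """Wrap text in a <span> whose id attribute serves as the link target."""
    return f'<span id="{escape(anchor_id, quote=True)}">{text}</span>'


def section_link(page_title: str, anchor_id: str, label: str = "") -> str:
    """Build a wikilink that points at the anchor on the transcluding page."""
    target = f"{page_title}#{anchor_id}"
    return f"[[{target}|{label}]]" if label else f"[[{target}]]"


# Markup to place in the Page namespace, using the id chosen in the thread.
print(anchor_span("jalagam", "జలగం వెంగళరావు ముఖ్యమంత్రిత్వం"))
# Link to use elsewhere; "Example page" stands in for the real main namespace title.
print(section_link("Example page", "jalagam"))
```

On wikis that have them, the {{anchor}} or {{anchor+}} templates mentioned above would take the place of the hand-written span.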

Edit check request

Could someone check this edit[4] of mine? It was supposed to be a 1 char ocr fix but it shows up in the page history as deleting 117 characters. I compared the before and after versions of the page and see only the 1 char fix. So I don't know what's going on. Thanks. 50.0.121.79 20:53, 9 April 2016 (UTC)

It is OK; just ignore the byte counter ... BTW, I have no idea why it is not accurate; probably something has changed internally.— Mpaa (talk) 21:01, 9 April 2016 (UTC)

Bradshaw anyone?

Found these when trying to find something:-

A 1906 and a 1944 edition:- https://archive.org/search.php?query=creator%3A%22Henry+Blacklock+%26+Co.%22

If someone is able to figure out the copyrights, I'm more than willing to attempt transcriptions. ShakespeareFan00 (talk) 19:58, 10 April 2016 (UTC)

The Reshaping of British Railways

There are now scans on archive.org [5]. One small problem, though: the digitising source has marked them as NC, which means that despite the document being an expired Crown copyright (3 years AGO!), the scans can't be put on Commons, unless someone wants to have a very loud row with the University of Southampton. (sigh) 21:38, 10 April 2016 (UTC)

I also found, in the same collection, the Worboys and Anderson Reports (which, given my recent efforts on UK traffic signs, I felt might be in scope here). Shame some archives apply NC :( ShakespeareFan00 (talk) 21:38, 10 April 2016 (UTC)

Tech News: 2016-15

20:44, 11 April 2016 (UTC)

Top 100 downloads using WSexport tool

I thought it useful to share the Top 100 downloads using wsexport tool for the month of March 2016. Note that this includes even download of ordinary pages apart from books. Let me know your feedback if any. --Arjunaraoc (talk) 05:25, 12 April 2016 (UTC)

That's really interesting. What's up with The_Problems_of_Philosophy having so many more hits than anything else? — Sam Wilson ( TalkContribs ) … 10:47, 12 April 2016 (UTC)
I think it was Featured Text.. but the FT tag on its discussion page has it March 2015, not 2016. Outlier59 (talk) 11:15, 12 April 2016 (UTC)
There was no featured text for March 2016. Due to a somewhat daffy implementation of the templates (they operate on months only, without regard to the year), under these circumstances {{Featured text/March}} gets recycled—and as that has not been changed since 2015, The_Problems_of_Philosophy gets a re-airing. AuFCL (talk) 11:51, 12 April 2016 (UTC)
Ah, makes sense now. Thanks. — Sam Wilson ( TalkContribs ) … 23:43, 12 April 2016 (UTC)
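The recycling effect AuFCL describes can be mimicked with a tiny Python sketch: if the lookup key is the month alone, an entry last edited in 2015 is served again in 2016. The dictionary and the `featured_text` function are hypothetical stand-ins for the per-month template subpages, not the actual template code.

```python
import datetime

# Hypothetical stand-in for the per-month subpages such as {{Featured text/March}}.
# Keys are month numbers; the value is whatever was last saved for that month.
featured_by_month = {
    3: "The_Problems_of_Philosophy",  # last updated for March 2015
}


def featured_text(date: datetime.date) -> str:
    """Look up the featured text by month only, ignoring the year."""
    return featured_by_month[date.month]


# Both dates map to the same entry, so the 2015 choice reappears in 2016.
print(featured_text(datetime.date(2015, 3, 1)))
print(featured_text(datetime.date(2016, 3, 1)))
```

Keying on a (year, month) pair instead would make a missing month show up as a missing entry rather than as a silent repeat.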

Penguin Classics (or any publisher)

I've been playing with SPARQL and Wikidata, and have come up with a little script to make publisher lists like (for example) Portal:Penguin Classics. It is not very useful while there's hardly any data in Wikidata, but maybe one day... :-) I just wanted to see what sort of coverage we've got over that collection. — Sam Wilson ( TalkContribs ) … 10:47, 12 April 2016 (UTC)

Archive.org not creating djvu?

I uploaded a pdf to archive.org a couple of days ago and it seems to have created various files but not the djvu, which was what I was wanting. Did I do something wrong or has something changed over there? Moondyne (talk) 02:36, 13 April 2016 (UTC)

yes, see also Wikisource:Scriptorium#Internet_Archive_no_longer_creates_DjVu-files.21. maybe we need to send them some t-shirts / beer. or build a tool to convert on upload. Slowking4 03:01, 13 April 2016 (UTC)
Aha, that's a bit sad. Moondyne (talk) 04:37, 13 April 2016 (UTC)
Per Wikisource:DjVu vs. PDF, I take it there's now no point in creating a djvu solely for WS. Yes? Moondyne (talk) 04:48, 13 April 2016 (UTC)
That's an interesting question. It sounds like you're right, PDFs should be the preferred format now. Certainly, there are more tools for working with them. — Sam Wilson ( TalkContribs ) … 04:54, 13 April 2016 (UTC)
What a good essay! PDFs are expensive in many ways,—time, cost, transparency and accessibility—I am not moved from my position that they suck. My prejudice was recently reinforced when, up until a couple of weeks ago, some bug caused them to render as garbage for this end-user. I get why archive.org are preferring EPUB and that format for readers, but for this site's purposes they are inferior; other online converters to djvu are reasonably successful. PDF should be welcome, but not preferred. CYGNIS INSIGNIS 11:34, 13 April 2016 (UTC)

Fill pages with OCR from PDF

Hello everybody, is there a bot that can create wiki pages with the OCR text contained in the PDF page, e.g. de:Seite:Ludwig Bechstein - Thüringer Sagenbuch - Erster Band.pdf/19? Is that possible with a simple command using Pywikibot? Thank you in advance, --Aschroet (talk) 16:46, 13 April 2016 (UTC)

If you write your own script, you can use ProofreadPage()/IndexPage() as Page classes, they have several convenience methods.
Or you can use Page.preloadText() if you use the standard Page() class.
def preloadText(self):
        """
        The text returned by EditFormPreloadText.
        See API module "info".
        Application: on Wikisource wikis, text can be preloaded even if
        a page does not exist, if an Index page is present.
        """
If you want, I can write a few lines of code for you. Or if you tell me the index, I can do it for you.— Mpaa (talk) 20:53, 13 April 2016 (UTC)
@Mpaa: with djvu going out of vogue with IA, it seems pertinent for pywikibot to look at having a "pdftxt" script that replicates "djvutxt". Then we have the general purpose bot available through the WSes. — billinghurst sDrewth 22:41, 13 April 2016 (UTC)
Thank you for the fast reply. Of course I would prefer the suggested pdftxt script, so that others could use it as well. --Aschroet (talk) 09:29, 14 April 2016 (UTC)
That is feasible, but I cannot say when. If you need something faster for a specific index, just let me know.— Mpaa (talk) 18:31, 14 April 2016 (UTC)
@Billinghurst:, @Aschroet: I made this script: wikisourcetext.py, who knows if it will be ever added to the library. But you can fetch it if you like it.— Mpaa (talk) 18:44, 18 April 2016 (UTC)

Mpaa, for de:Index:Ludwig Bechstein - Thüringer Sagenbuch - Erster Band.pdf it would be nice. --Aschroet (talk) 18:39, 14 April 2016 (UTC)

Done, hope no one got angry on de.wikisource, I forgot I have no bot rights there ...— Mpaa (talk) 20:10, 14 April 2016 (UTC)

Footnote on page without marker in the text

Here's an oddball question: When proofreading Page:Craik_History_of_British_Commerce_Vol_2.djvu/183, I found a footnote which does not have a corresponding mark in the body of the text. (It is the first note on the page, to "British Merchant, i. 302.") I determined where (I think) the note should have been inserted (here is the source for that reference on Google Books), but I'm not sure if I should have done that. The location isn't in the source text, after all, even though the note is, and what the author references is data in a table...it seems pretty clear what he meant.

Thoughts? Should I mark the note with the SIC template and a transcriber's note? Leave it out entirely? Do something else?

I've used style="display:none;" for this in the past. I updated the page in question, I think it looks okay. —Beleg Tâl (talk) 14:48, 15 April 2016 (UTC)
Comment: In my experience, sometimes small marks in the text (such as periods, tops of semi-colons, asterisks, and the like) fail to appear because of the quirks of ink printing. It isn't always possible to indicate how such a correction ought to be made. In this instance, I favor inserting the item as a normal footnote, and including a transcriber's note of explanation within the footnote. --EncycloPetey (talk) 18:33, 15 April 2016 (UTC)
I always put them in the most logical place, leave them displayed and put a comment for the validator to explain what I've done. Beeswaxcandle (talk) 19:11, 15 April 2016 (UTC)
Yeah, I too add things like this in when it's reasonably obvious where they should go. Depends on the work, though; books are more predictable than some other types of thing. — Sam Wilson ( TalkContribs ) … 00:45, 16 April 2016 (UTC)

Requesting a GeoNotice for a local event in San Francisco

Hi all, we're launching a monthly series of WikiSalons in San Francisco. The event announcement is here: w:en:Wikipedia:Bay Area WikiSalon, April 2016

Is there a Wikisource admin who would be willing to set up a Geonotice, so it would show up at the top of the watchlist for Wikisourcers in the San Francisco bay area? Here's an example of what would need to be done: w:en:Special:Diff/715314854 Just making an identical edit to the counterpart page here on Wikisource would do the trick. Thanks for any help -- and hoping to see some Wikisource folks at the WikiSalon! -Pete (talk) 22:11, 15 April 2016 (UTC)

Note: I have learned this might be a more complex request than I realized. Some helpful discussion here, on Commons: commons:Commons:Administrators'_noticeboard#Requesting_a_GeoNotice_for_a_local_event_in_San_Francisco -Pete (talk) 17:47, 17 April 2016 (UTC)
Most Projects are not as large as Commons or Wikipedia. For Wikisource (and most other non-pedia projects) posting to the central community discussion page will reach everyone. --EncycloPetey (talk) 17:50, 17 April 2016 (UTC)
Hi @EncycloPetey:, thanks. I'm not sure I believe this -- I think I did several years of Wikisource work before ever looking at the Scriptorium, and I have never checked it anywhere near as often as I look at my Watchlist. I don't know any way to test it, but I'd be rather surprised if the vast majority of users check the Scriptorium on a regular basis. But, if there is no established way of doing something like a Geonotice, I don't see any reason to insist on it...as I said above, I initially thought I was requesting something simple and routine, and am happy to retract the request if that's not the case. -Pete (talk) 05:07, 19 April 2016 (UTC)
I believe that all geonotices are coordinated through meta. I am not aware of any local controls; see m:Special:CentralNotice. billinghurst sDrewth 12:34, 19 April 2016 (UTC)
Thanks @Billinghurst:, but I just checked...CentralNotice can't get more geographically granular than an entire country. So I guess Geonotice is the only tool that will do that, and if it's not currently set up here at Wikisource, it's not worth doing for this. Thanks for all the info though, this has been an informative discussion. -Pete (talk) 19:12, 21 April 2016 (UTC)

Server switch 2016

The Wikimedia Foundation will be testing its newest data center in Dallas. This will make sure Wikipedia and the other Wikimedia wikis can stay online even after a disaster. To make sure everything is working, the Wikimedia Technology department needs to conduct a planned test. This test will show whether they can reliably switch from one data center to the other. It requires many teams to prepare for the test and to be available to fix any unexpected problems.

They will switch all traffic to the new data center on Tuesday, 19 April.
On Thursday, 21 April, they will switch back to the primary data center.

Unfortunately, because of some limitations in MediaWiki, all editing must stop during those two switches. We apologize for this disruption, and we are working to minimize it in the future.

You will be able to read, but not edit, all wikis for a short period of time.

  • You will not be able to edit for approximately 15 to 30 minutes on Tuesday, 19 April and Thursday, 21 April, starting at 14:00 UTC (15:00 BST, 16:00 CEST, 10:00 EDT, 07:00 PDT).

If you try to edit or save during these times, you will see an error message. We hope that no edits will be lost during these minutes, but we can't guarantee it. If you see the error message, then please wait until everything is back to normal. Then you should be able to save your edit. But, we recommend that you make a copy of your changes first, just in case.

Other effects:

  • Background jobs will be slower and some may be dropped.

Red links might not be updated as quickly as normal. If you create an article that is already linked somewhere else, the link will stay red longer than usual. Some long-running scripts will have to be stopped.

  • There will be a code freeze for the week of 18 April.

No non-essential code deployments will take place.

This test was originally planned to take place on March 22. April 19th and 21st are the new dates. You can read the schedule at wikitech.wikimedia.org. They will post any changes on that schedule. There will be more notifications about this. Please share this information with your community. /User:Whatamidoing (WMF) (talk) 21:08, 17 April 2016 (UTC)

Big Birthdays

Well, we missed the chance to celebrate Charlotte Brontë's 200th birthday by featuring one of her works this month, and I don't see anyone else of that stature in literature with a birthday this year.

But 2017 will mark the 200th birthday of Aleksey Konstantinovich Tolstoy (no, not that Tolstoy) as well as Henry David Thoreau. We still have time to prepare for those. --EncycloPetey (talk) 05:20, 18 April 2016 (UTC)

Proposal to globally ban WayneRay from Wikimedia

Per Wikimedia's Global bans policy, I'm alerting all communities in which WayneRay participated that there's a proposal to globally ban his account from all of Wikimedia. Members of the Wikisource community are welcome to participate in the discussion. --Michaeldsuarez (talk) 14:48, 18 April 2016 (UTC)

Tech News: 2016-16

20:40, 18 April 2016 (UTC)

Announce: Unique Devices data available on API

The analytics team is happy to announce that the Unique Devices data is now available to be queried programmatically via an API.

This means that getting the daily number of unique devices for English Wikipedia for the month of February 2016, for all sites (desktop and mobile) is as easy as launching this query

You can get started by taking a look at our docs at wikitech:Analytics/Unique Devices#Quick Start
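
The query link from the announcement did not survive in this copy of the page. As a rough sketch only, the request for the February 2016 daily figures could be built as below. The endpoint layout (`/metrics/unique-devices/{project}/{access-site}/{granularity}/{start}/{end}` on the Wikimedia REST API) and the allowed values are assumptions to check against the wikitech docs, and `unique_devices_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Assumed base path of the Unique Devices endpoint; verify against the docs.
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/unique-devices"
ACCESS_SITES = {"all-sites", "desktop-site", "mobile-site"}
GRANULARITIES = {"daily", "monthly"}


def unique_devices_url(project, start, end, access_site="all-sites", granularity="daily"):
    """Build a Unique Devices query URL; start and end are YYYYMMDD strings."""
    if access_site not in ACCESS_SITES:
        raise ValueError("unknown access site: " + access_site)
    if granularity not in GRANULARITIES:
        raise ValueError("unknown granularity: " + granularity)
    parts = [quote(project, safe=""), access_site, granularity, start, end]
    return BASE_URL + "/" + "/".join(parts)


# Daily unique devices for English Wikipedia, February 2016, desktop and mobile.
url = unique_devices_url("en.wikipedia.org", "20160201", "20160229")
```

Fetching `url` with any HTTP client should return JSON; the response format is not shown here because it is not described in the announcement.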

If you are not familiar with the Unique Devices data, the main thing you need to know is that it is a good proxy metric to measure Unique Users; more info below.

Since 2009, the Wikimedia Foundation used comScore to report data about unique web visitors. In January 2016, however, we decided to stop reporting comScore numbers because of certain limitations in the methodology; these limitations translated into misreported mobile usage. We are now ready to replace comScore numbers with the Unique Devices Dataset. While unique devices does not equal unique visitors, it is a good proxy for that metric, meaning that a major increase in the number of unique devices is likely to come from an increase in distinct users. We understand that counting uniques raises fairly big privacy concerns, and we use a very privacy-conscious way to count unique devices: it does not include any cookie by which your browsing history can be tracked.

—NRuiz (WMF), wikitech-l

Not sure if anyone is wishing to play with that data, or the value of it, either way, it is there. — billinghurst sDrewth 12:11, 20 April 2016 (UTC)

Without knowing the likelihood of someone using multiple devices, or the mean number of devices from which users access, the data is of little value. For example, I regularly use four devices to access Wikisource on any given day. --EncycloPetey (talk) 19:35, 21 April 2016 (UTC)
Interesting, especially the split mobile/desktop.— Mpaa (talk) 21:02, 21 April 2016 (UTC)

Catalog of Copyright Entries

i’ve started this long term project by uploading Index:1977 Books and Pamphlets July-Dec.djvu. as historical background, the US copyright office stopped digitizing its records from 1923 to 1977. The Hathi trust has a project to research each orphan work in that period to determine copyright status. they find about half the time works were not renewed making them public domain.[20] there are around 100 volumes of 1600 pages, of book copyright records.

IAuploader does not work, it appears the files are too big (larger than 50MB less than 100MB). i use chunked uploads but it fails half the time. i will approach Hathi trust for comments if this helps their search. user:Mpaa would a bot filling pages be useful for these records? any thoughts would be appreciated. Slowking4₮₳₤₭ 00:35, 25 April 2016 (UTC)

@Slowking4:, do you need help?— Mpaa (talk) 17:49, 26 April 2016 (UTC)
@Mpaa:, i am untutored in the ways of bot page creation, "not proofread". these volumes would seem to be a good fit for that. Slowking4₮₳₤₭ 23:31, 26 April 2016 (UTC)
I am surprised that chunked upload is failing in your case. I have uploaded lots of books in recent times (up to yesterday) by this method, to both Commons and Bengali Wikisource, without any failure, even files more than 300 MB in size (e.g. this file of 392 MB). It works even when the internet connection goes off (by power-cut) and I have to shift to another connection (by wi-fi). Irrespective of net connection problems, the upload continues, with in-between halts. IA upload also works for me, even for files more than 99 MB in size (e.g. this file). Hrishikes (talk) 01:27, 25 April 2016 (UTC)
i find it times out trying to knit chunks together. maybe you will have better luck, have a go at c:File:Catalog_of_Copyright_Entries_1977_Books_and_Pamphlets_Jan-June.djvu & [21]. Slowking4₮₳₤₭ 03:20, 25 April 2016 (UTC)
@Slowking4: The djvu file is corrupt. I'll look into it tonight. Hrishikes (talk) 10:49, 25 April 2016 (UTC)
@Slowking4: Done: Index:Catalog of Copyright Entries 1977 Books and Pamphlets Jan-June.pdf. Hrishikes (talk) 09:49, 26 April 2016 (UTC)
great job, i fear this IA corrupt file problem may be widespread, and a major hurdle along with file size. getting one year readable will make a good first step. thanks. Slowking4₮₳₤₭ 09:59, 26 April 2016 (UTC)
@Slowking4: Finally succeeded with djvu: Index:Catalog o‌f Copyright Entries 1977 Books and Pamphlets Jan-June.djvu. The djvu corruption was due to overcompression. Hrishikes (talk) 15:25, 28 April 2016 (UTC)
the text layer is much better for the jpg version i.e. Page:Catalog of Copyright Entries 1977 Books and Pamphlets Jan-June.pdf/11 versus Page:Catalog o‌f Copyright Entries 1977 Books and Pamphlets Jan-June.djvu/5. thoughts ? Slowking4₮₳₤₭ 22:52, 28 April 2016 (UTC)
@Slowking4: The text layer at IA was created from the high resolution jp2 version, whereas the djvu cum text layer was created by me locally from the pdf version. By the way, please arrange moving 1 to 2. Hrishikes (talk) 00:13, 29 April 2016 (UTC)
ok, the Index:1977 Books and Pamphlets July-Dec.djvu is ready for proofreading. we will need to copy over the better pdf or Gutenberg text layers. (but they are not complete). Slowking4 01:33, 17 May 2016 (UTC)

Tech News: 2016-17

21:02, 25 April 2016 (UTC)

Wikisource sessions at Open Educational Resources conference

Hi all, the OER conference took place last week at the University of Edinburgh, with an audience of academics, librarians, learning technologists, and related staff from many different countries. I gave two sessions relating to Wikisource: a short presentation to an audience of around 50, then a longer tour through the site in a computer room, to an audience of about 9 or 10. Twitter reaction was positive: the audience seemed very appreciative of Wikisource and some voiced an interest in working with it further. I've collected the reactions here. I will stay in touch with those who have expressed an interest and see if we can get them to share some texts. MartinPoulter (talk) 14:59, 26 April 2016 (UTC)

Index:O Douglas - Olivia in India.djvu

No file present. ShakespeareFan00 (talk) 16:22, 26 April 2016 (UTC)

File needs to be undeleted at Commons and then moved to en WS. Billinghurst can do this, having admin rights at both ends. Hrishikes (talk) 17:13, 26 April 2016 (UTC)
Done. billinghurst sDrewth 07:06, 27 April 2016 (UTC)
Thanks :) ShakespeareFan00 (talk) 08:53, 27 April 2016 (UTC)

\mathop not functioning, asking for the replacement of Math extension by SimpleMathJax

I was trying to use the TeX/LaTeX \mathop command and discovered that it didn't work. This seems to be because of the Math extension, which will disappear sooner or later and be replaced. In fact, I tested on our wikis (1.24.0) and it works with the new and simple SimpleMathJax extension (https://www.mediawiki.org/wiki/Extension:SimpleMathJax). I tried to read the discussion at https://phabricator.wikimedia.org/T99369 and I understand that some browsers can't display the new maths yet (how many can't?), but it looks very nice, and I am willing to push for the adoption of this extension, which is not installed here yet (https://en.wikisource.org/wiki/Special:Version). The following code

<math>\mathop{\int\!\!\!\int\!\!\!\int}_{\Pi-\varpi} u(y){\partial a(y) \over \partial y_i}\,\mathrm{d} y_i</math>

should render as

\iiint\limits_{\Pi-\varpi} u(y){\partial a(y) \over \partial y_i}\,\mathrm{d} y_i

but is rendering as:

\mathop{\int\!\!\!\int\!\!\!\int}_{\Pi-\varpi} u(y){\partial a(y) \over \partial y_i}\,\mathrm{d} y_i

--Nbrouard (talk) 08:45, 28 April 2016 (UTC)

Further evidence regarding Author pages

Two days ago, I posted "A Lament for Adonis" in the New texts. This is our first work by the classical author Bion, and the first new text (translation) by Elizabeth Barrett Browning that we've had in a long, long time. Below, you can see the view statistics for these three pages, in the same order I've linked to them in the preceding text.

As I noted before, people are watching our New Texts list, and are visiting the Author pages in addition to the page for the new text. --EncycloPetey (talk) 16:23, 28 April 2016 (UTC)

I don't want to depress your enthusiasm but please remember many "modern" browsers perform pre-caching; i.e. they will "follow" one or more links deep from the page you are viewing in anticipation that you may choose to follow one of those links. In other words many of these page views may merely have been the result of an automatic process which cannot be usefully distinguished from actual manual viewing. Just a thought (and of course hope this is only a misconception.) AuFCL (talk) 21:15, 28 April 2016 (UTC)
If that were happening here, I would expect the two author pages to have more nearly the same number of hits. From experience with previous listings, I've seen drops in the number of page views for a work and its author while the page was still in place at the top of the list, after the first two or three days there. So, I rather think that's not what we're seeing, or it's not so common as to produce a noticeable effect in the data. --EncycloPetey (talk) 03:26, 29 April 2016 (UTC)

Patching DjVu files

I know from recent discussion that the Internet Archive no longer generates DjVu files. But do we still have people here who can patch problems in DjVu files? Specifically, can duplicate pages be removed and missing pages inserted, without the need for IA's assistance? --EncycloPetey (talk) 14:48, 1 May 2016 (UTC)

Yes, I can. Hrishikes (talk) 15:19, 1 May 2016 (UTC)
@Hrishikes:, I was already working on it, I have uploaded a fixed version (page 290 is still poor quality).— Mpaa (talk) 15:26, 1 May 2016 (UTC)
@Mpaa, @EncycloPetey: Page 290 corrected. Hrishikes (talk) 16:15, 1 May 2016 (UTC)
Done. @Mpaa, @Hrishikes: Thanks for that. I guess the answer is "yes", then. :D --EncycloPetey (talk) 16:33, 1 May 2016 (UTC)

New POTM

Hi everybody, I have changed the POTM option to the work for May, as discussed in the relevant page. I don't know if the action was in order; if not, please revert. Hrishikes (talk) 05:34, 2 May 2016 (UTC)

Tech News: 2016-18

20:09, 2 May 2016 (UTC)

OCR gadget messes up the editing environment

Selection of the OCR gadget forces the page header above the toolbars, and suppresses the Proofread tool option of the advanced editing toolbar. PLEASE SEE THIS IMAGE. — Ineuw talk 20:30, 2 May 2016 (UTC)

PetScan: maintenance tool available for enWS

To bring to the attention of users, Petscan, a new rendition of previous toollabs tools (intersections, categories, ...) by Magnus Manske.

PetScan can generate lists of Wikipedia (and related projects) pages or Wikidata items that match certain criteria, such as all pages in a certain category, or all items with a certain property. PetScan can also combine some temporary lists (here called "sources") in various ways, to create a new one.

m:Petscan

It would be interesting to hear what uses our contributors can get, or think that we should get from the tool. — billinghurst sDrewth 01:19, 3 May 2016 (UTC)

I know nothing about PetScan or its applications except as revealed in recent discussions. However the link provided seems to lead nowhere. From the "list of tools" though, the correct PetScan link would appear instead to be //petscan.wmflabs.org/ whereas the above link expands to //tools.wmflabs.org/Petscan. Is there some kind of redirect missing or other kind of known breakage? AuFCL (talk) 10:51, 3 May 2016 (UTC)
There is no breakage. The URL for PetScan is https://petscan.wmflabs.org/ simple as that. It runs on its own virtual machine, so it doesn't conform to the pattern of the other, "shared infrastructure" tools. If you absolutely want a toollabs "internal" link, you can use toollabs:quick-intersection, which redirects to PetScan, but that seems quite pointless to me. --Magnus Manske (talk) 08:31, 3 May 2016 (UTC)

Pollyanna, move to disambiguate or replace?

Hi. Our contributors have recently finished the transcription of Index:Pollyanna.djvu and we already have a Gutenberg version at Pollyanna. I am seeking opinion on whether I move the file and create a {{versions}} page, or whether we replace with the transcluded version in its place? — billinghurst sDrewth 02:10, 3 May 2016 (UTC)

IMO, replacement with scan-backed text is better than versioning. If the PG text was a significant and different edition, then that would be a different matter. Beeswaxcandle (talk) 02:34, 3 May 2016 (UTC)
I agree, replace. My rule of thumb is, keep the Gutenberg version iff:
(a) it is possible to figure out what edition it is based upon. But note that Gutenberg boilerplate header text specifies "Project Gutenberg Etexts are usually created from multiple editions.... Therefore, we do NOT keep these books in compliance with any particular paper edition, usually otherwise." So it will not usually be possible to identify an edition. Sometimes, however, the published information on the title page will be transcribed; or, intratextual evidence will allow it to be attributed to, say, the "magazine text", or the "American book text".
AND
(b) the Gutenberg edition differs in important ways from our sourced edition, so that it is worth continuing to host until we have a sourced version of that edition.
In this case, the Gutenberg Pollyanna fails (a).
Hesperian 02:40, 3 May 2016 (UTC)
I agree with Beeswaxcandle and Hesperian. Replace it if there are no significant differences between the two. Jpez (talk) 06:19, 3 May 2016 (UTC)
Thanks. Until we have a firm statement and guidance in the deletion policy, I will seek the community's opinion where I come across these examples. — billinghurst sDrewth 10:15, 3 May 2016 (UTC)
Note that the Gutenberg boilerplate isn't true; very few of Project Gutenberg's etexts are created from multiple editions. If it went through Distributed Proofreaders (which will be credited in the book), I can probably find the edition information. I have access to a private archive with scans from the Distributed Proofreaders books; the PTB are concerned with displaying scans from other online sources that would not appreciate public display of their scans, and the remaining scans that could be displayed freely aren't separated out.--Prosfilaes (talk) 01:16, 5 May 2016 (UTC)
@Prosfilaes: If we can attribute the work to an edition, I can resurrect the prior version, and disambiguate. Until we have a version, to which we attribute value, and DP/Gutenberg does not, I think that this is going to be a repeating issue. Solutions are better than repeated problems. — billinghurst sDrewth 07:10, 5 May 2016 (UTC)
DP does care. It is true that books like Pollyanna, which are early PG works, don't have edition information. Mind you, we're not really doing much better; Wikipedia claims that File:Pollyanna (Eleanor Porter book) first edition cover.jpg is the first edition cover, but that's not the cover of the edition we used; not only that, that cover says Boston and our title page says New York. This could have any number of differences from a true first edition that aren't noted on the title page or verso. We never act like the edition information matters to us; it's at Pollyanna, and that page says nothing about the edition besides reproducing the title page.
What are the differences between the Gutenberg copy and our copy? If we care about the versions, that's check number one. Once we know that, it may be easy to find a matching scan in Google Books or HathiTrust. If we don't care about the versions, which it doesn't seem we do for Pollyanna, it's convenient to have a random copy at hand to check against, but that doesn't mean we're reproducing an edition instead of a random copy.--Prosfilaes (talk) 22:03, 6 May 2016 (UTC)
Going back to Hesperian, if the Gutenberg version has significant differences from the Wikisource version, it's worth looking into. It quite likely means there are multiple significant editions out there that we should support (and again, given Google Books, it shouldn't be that hard to find what edition the Gutenberg volume came from in most cases), or that the Wikisource edition is simply a bad edition that either shouldn't be supported or at least should be noted as an abridged, censored, or otherwise edited version that should not be taken as the (e.g.) Pollyanna. If the Gutenberg edition is so abridged, censored, or otherwise edited, it seems polite to mention it upstream so someone can consider marking it and/or replacing it.--Prosfilaes (talk) 22:44, 6 May 2016 (UTC)
I think it's best to make a "versions" page and keep the "unsourced" text—if the "unsourced" text has any sort of source information. It's possible that other websites that created ebooks and/or online texts will later provide specific source information, maybe even including page scans. Two examples I can think of are (1) The Jungle Book (authorama.com version), and (2) the Liberty Fund rendition of Essay on the First Principles of Government, 2nd Edition (1771), not uploaded here, but which I consulted during proofreading the book here on Wikisource [see Index talk:Essay on the First Principles of Government 2nd Ed.djvu]. Another benefit of versions pages is that we might spot more publishing reprints here on Wikisource -- such as the 1910 Jungle Book using the same illustrations as the 1894 Jungle Book. Outlier59 (talk) 23:41, 6 May 2016 (UTC)

TOC template and "overuse"?

I am working on a dotted TOC template for Indian Medicinal Plants Part 1. It seems as if the current template is overused, but, if so, then what should I use for the rest of the TOC? Should I "break" the TOC into different parts? - Tannertsf (talk) 15:26, 4 May 2016 (UTC)

Use {{TOCstyle}} , That should only need one template invocation per contents page... ShakespeareFan00 (talk) 19:20, 4 May 2016 (UTC)

pages tag without a self-closing slash

I've noticed a few instances of pages tags without self-closing slashes (i.e. <pages ...> as opposed to <pages ... />) which aren't turned into page transclusion by the parser. I can only assume that that worked at some point and now doesn't. Is there an easy way to find and correct all of these instances? Prosody (talk) 16:33, 9 May 2016 (UTC)

Putting insource:/\<pages[^\/]*\>/ into the search box returns a bit more than what you ask for but nevertheless might be a useful start.
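
To see why that search returns "a bit more" than requested, the same idea can be tried offline. This is only a sketch: the pattern is a Python adaptation of the search regex above, `unclosed_pages_tags` is a hypothetical helper, and the extra check for a following `</pages>` is my addition to narrow the results.

```python
import re

# An opening <pages ...> tag with no "/" before the ">", i.e. not self-closed.
OPEN_TAG = re.compile(r"<pages\b[^/>]*>")


def unclosed_pages_tags(wikitext):
    """Return <pages ...> tags that are neither self-closed nor followed by </pages>."""
    hits = []
    for match in OPEN_TAG.finditer(wikitext):
        rest = wikitext[match.end():]
        # <pages ...></pages> also matches the pattern, so filter it out here.
        if not rest.lstrip().startswith("</pages>"):
            hits.append(match.group(0))
    return hits


self_closed = '<pages index="A.djvu" from=1 to=2 />'
paired = '<pages index="A.djvu" from=1 to=2></pages>'
broken = '<pages index="A.djvu" from=1 to=2>'
```

Only `broken` is reported by `unclosed_pages_tags`; the plain search matches both `paired` and `broken`, which accounts for the extra results. Like the search regex, the pattern skips any tag whose attributes contain a "/".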

Although unusual practice surely the form <pages …></pages> is legal? For instance American History Told by Contemporaries/Volume 2/Chapter 33 seems to be working fine, and survives a purge operation. What have I missed, please? AuFCL (talk) 22:52, 9 May 2016 (UTC)

I wasn't familiar with the practice of using a closing tag so I was unclear, sorry. It seems that works too. What doesn't work is a pages tag without a self-closing slash or closing tag. I've definitely seen it once here and once on multilingual Wikisource. Prosody (talk) 00:59, 10 May 2016 (UTC)
I am fixing those cases as MpaaBot.— Mpaa (talk) 19:23, 10 May 2016 (UTC)
Hope I fixed all cases "without self closing slash". Let me know if you happen to find more.— Mpaa (talk) 21:43, 10 May 2016 (UTC)
@Prosody: phabricator:T134957 noted. Not sure why I was notified of its creation so if that was of your doing thank you. I further note Danny_B's slightly dismissive closure message neither confirms nor denies the "correctness" of the <pages …></pages> form. Should these be changed also to the self-closing form "to play safe?" I am inclined to think so but obviously don't know for sure. AuFCL (talk) 09:03, 11 May 2016 (UTC)
When {{#tag:pages||index={{subst:BASEPAGENAME}}.djvu|from=|to=}} is substituted, it turns into <pages index=".djvu" from= to= ></pages>. It is correct, and I don't see the value in the conversion of functional and legitimate code. — billinghurst sDrewth 14:06, 11 May 2016 (UTC)

Finger-Prints for Everybody from the Literary Digest

I have photographed two pages of the article "Finger-Prints for Everybody" from the Lit. Dig. of July 19, 1919 (PD-1923). Is WS interested and, if so, how do I give you the image files? 104.229.143.192 17:00, 9 May 2016 (UTC)

Do you have the whole article? ShakespeareFan00 (talk) 07:45, 10 May 2016 (UTC)
HathiTrust has multiple scans available of the entire run of the Literary Digest up to 1922. https://babel.hathitrust.org/cgi/pt?id=mdp.39015028103888;view=1up;seq=298 is a link to scans of this article.--Prosfilaes (talk) 08:35, 10 May 2016 (UTC)
Hot damn! I guess I don't need to keep those! Thanx. 104.229.143.192 15:48, 10 May 2016 (UTC)

Proposed WikiProject merger

Since I don't have a freaking clue how if at all these things are done here, I welcome input at Wikisource talk:WikiProject Bible#Proposed merger regarding a possible merger of the Bible and Bible dictionary WikiProjects. John Carter (talk) 17:31, 9 May 2016 (UTC)

Rather than a straight merger, it might be better to consider the one a sub-project of the other. But most projects here are very fluid. --EncycloPetey (talk) 19:47, 9 May 2016 (UTC)
I was myself thinking of maybe keeping the main WikiProject Bible, with a task force/work group for Bible dictionaries, maybe later another task force for commentaries, etc. So, in general I agree with keeping the page, and material, but think that renaming it to be a subunit of the WikiProject Bible might be beneficial. John Carter (talk) 20:45, 9 May 2016 (UTC)
I would say that both are pretty well dead, so anything that invigorates them is worthwhile. A project is a loose collection of topics and people, so I personally don't think that it matters if the RFC isn't progressive. — billinghurst sDrewth 12:25, 11 May 2016 (UTC)

Tech News: 2016-19

23:22, 9 May 2016 (UTC)

Proofreading works with some language the proofer doesn't know

Am I allowed to mark a page as proofread or validated if it contains some words of a language I don't know? See, for example, Page:Chinese Merry Tales (1909).djvu/27. Most of the work is English, and I can format it and check if the Chinese characters look about right, but I can't read them, and I can't read the tooltips. Tar-ba-gan is doing the Chinese. Outlier59 (talk) 00:13, 10 May 2016 (UTC)

Newbie question - apostrophes and quote marks

Hi. I'm new around here; as a way of finding out about Wikisource I've enjoyed contributing to the current Proofread of the Month. However I'm confused about an aspect of style:

  1. Should apostrophes be of this form: Hanuman’s or this form: Hanuman's? Does it depend on the usage of the text?
  2. Same question for quotation marks (single and double).

The style guide seems to encourage straight quotes, but in the PotM I've seen others leave in the curly quotes/apostrophes rendered by the OCR. At first I changed things to straight, but after observing others I have stopped doing so. Can someone please clear up my confusion? Thanks! BethNaught (talk) 20:30, 12 May 2016 (UTC)

You will get mixed opinions. Mine: in the style guide straight is preferred (I prefer it too). Some people use curly. Most important: consistency through the whole text.— Mpaa (talk) 21:51, 12 May 2016 (UTC)
Would the Index talk: be the appropriate place to ask re this particular book? BethNaught (talk) 21:58, 12 May 2016 (UTC)
Yes, index page talk is the best place to ask. Curly quotes are not used all that often, as far as I've seen. OCR clean up converts curly quotes to straight quotes. Outlier59 (talk) 22:08, 12 May 2016 (UTC)
Thank you; I have asked at the Index talk: which to use. I have another question: I see that some validated pages use the {{hwe}} and {{hws}} templates, but others with hyphenated words over pages do not. Does this matter and do I need to worry about using these templates? BethNaught (talk) 22:16, 12 May 2016 (UTC)
You'll need to do something so that when the pages are transcluded into the work, the hyphens across pages do not show up. By far the easiest method is to use {{hws}} and {{hwe}}, as it is both easy to learn and easy for other new editors to interpret later. --EncycloPetey (talk) 22:20, 12 May 2016 (UTC)
Thank you. I think I've caught all the instances on pages I've edited now. BethNaught (talk) 22:35, 12 May 2016 (UTC)
I prefer ’ because we use ' for bolding and italics. BD2412 T 22:51, 12 May 2016 (UTC)
Perhaps this is to be expected, but I feel like I'm getting decidedly mixed messages. I'm going with straight as per the reply on the index talk. BethNaught (talk) 07:39, 13 May 2016 (UTC)
@BethNaught: Straight is the guidance, and I believe that it was chosen (after a discussion) as the guidance as it is on the keyboard, so easy. Some works come to us with curly quotes, and rather than stamp our feet, or inexpertly change them with the risk of missing, we accept them as they are. So if you are doing them in a fresh work, then we prefer they are straight. If we are working collaboratively on a work, then we should do them as straight. Putting commentary on Index talk: pages is always helpful in explaining why you are doing something, especially where the explanation is differing from the style guide. To note that if you need to do italics with a single quote, you can utilise &apos; to get a straight apostrophe and not have the word converted to bolded. — billinghurst sDrewth 09:33, 13 May 2016 (UTC)
Or <nowiki>'</nowiki>.--Prosfilaes (talk) 09:57, 13 May 2016 (UTC)
Or {{'}}.— Mpaa (talk) 19:39, 13 May 2016 (UTC)
i’m afraid my default spell checker changes my {{'}} into ’, and then snaps back for ''. very annoying, but not motivated to fix. Slowking4 03:11, 16 May 2016 (UTC)
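
For anyone who wants to normalise a page by hand, the curly-to-straight conversion discussed above can be sketched as follows. This is an illustration only, not the code of the OCR clean-up tool mentioned earlier; `straighten_quotes` is a hypothetical name and the character list is an assumption covering just the four common curly marks.

```python
# Map the four common curly quotation marks to their straight equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (also the curly apostrophe)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten_quotes(text):
    """Replace curly quotes and apostrophes with straight ones."""
    return text.translate(CURLY_TO_STRAIGHT)


example = straighten_quotes("\u201cHanuman\u2019s\u201d")
```

Whichever style a work uses, running one conversion over every page is an easy way to get the consistency recommended in this thread. Note that a straight apostrophe next to italic or bold markup may still need &apos;, <nowiki>'</nowiki>, or {{'}} as described above.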

Hathitrust

Hi, Does anyone have an account on Hathitrust, to get some books? I'd like to get the following books. Thanks in advance, Yann (talk) 20:59, 15 May 2016 (UTC)

it’s working for me. is it a geo-lock? [42] do these links work for you?
they say they are digitized by google at Univ Michigan, maybe an upload to IA is in order. maybe you will need a VPN. Slowking4 03:07, 16 May 2016 (UTC)

Tech News: 2016-20

16:01, 16 May 2016 (UTC)

A new "Welcome" dialog

Hello everyone. This is a heads-up about a change which has just been announced in Tech News: Add the "welcome" dialog (with button to switch) to the wikitext editor.

In a nutshell, later this week this will provide a one-time "Welcome" message in the wikitext editor which explains that anyone can edit, and every improvement helps. The user can then start editing in the wikitext editor right away, or switch to the visual editor. (This is the equivalent of an already existing welcome message for visual editor users, which suggests the option to switch to the wikitext editor. If you have already seen this dialog in the visual editor, you will not see the new one in the wikitext editor.)

  • I want to make sure that, although users will see this dialog only once, they can read it in their language as much as possible. Please read the instructions if you can help with that.
  • I also want to underline that the dialog does not change in any way the current site-wide configuration of the visual editor. Nothing changes permanently for users who chose to hide the visual editor in their Preferences or for those who don't use it anyway, or for wikis where it's still a Beta Feature, or for wikis where certain groups of users don't get the visual editor tab, etc.
    • There is a slight chance that you see a few more questions than usual about the visual editor. Please refer people to the documentation or to the feedback page, and feel free to ping me if you have questions too!
  • Finally, I want to acknowledge that, while not everyone will see that dialog, many of you will; if you're reading this you are likely not the intended recipients of that one-time dialog, so you may be confused or annoyed by it, and if this is the case, I'm truly sorry about that. This message should also save you from having to explain the same thing over and over again: just point to this section. Please feel free to cross-post this message at other venues on this wiki if you think it will help users avoid feeling caught by surprise by this change.

If you want to learn more, please see https://phabricator.wikimedia.org/T133800; if you have feedback or think you need to report a bug with the dialog, you can post in that task (or at mediawiki.org if you prefer).

Thanks for your attention and happy editing, Elitre (WMF) 16:50, 16 May 2016 (UTC)

How does a "Welcome to Wikipedia" message affect our project here at Wikisource? --EncycloPetey (talk) 21:43, 16 May 2016 (UTC)
@Elitre (WMF): I'm seeing the same "Welcome to Wikipedia" message on your links here as EncycloPetey questioned. Can you clarify how this impacts Wikisource? Outlier59 (talk) 00:34, 17 May 2016 (UTC)
it’s the same old broadcast tl;dr when people complain (English Wikipedia drama). I keep getting confused trying to counsel newbies who have wikitext editor, but want VE, and have to find the small pencil (whatever that means). or pushing the "combined button", to get to wikitext. it affects you here, if you are experimenting with VE in article space. they keep thrashing the interface, maybe one day they will settle on something so we can train on it. Slowking4 01:46, 17 May 2016 (UTC)
EncycloPetey, Outlier59, the message will say "Welcome to Wikisource" here of course. --Elitre (WMF) (talk) 05:50, 17 May 2016 (UTC) PS: Slightly OT, the hard work of adapting the visual editor to Wikisource is mainly led by volunteers User:Coren and User:Tpt who would certainly benefit from direct feedback about what's working and what you'd like to see improved. Thanks!
As a reminder—this wiki features a Single Edit Tab system; if you're not sure you know or remember how that works, you can read the guide (which details, among other things, how to switch between editors from the buttons on the toolbar); you can change your editing settings at any time, by the way. (I had also written a very quick intro to the visual editor, in case anyone is interested). Best, --Elitre (WMF) (talk) 14:34, 17 May 2016 (UTC)
This doesn't look very applicable to what Wikisource does. Most of our valued content is transcluded from items in a different namespace, and changes must be compared in the edit window against the source document. This doesn't even seem to be enabled, so it doesn't apply to this project or to any Wikisource for that matter. --EncycloPetey (talk) 18:12, 17 May 2016 (UTC)
So, how often will a user see the Welcome message from the visual editor? I've gotten it twice in the past 24 hours, but never had to experience it before. --EncycloPetey (talk) 09:22, 19 May 2016 (UTC)
Six times now in 24 hours, and I do not see the message in my native language, but in whatever language is native to the project where I am editing. This message is a bad idea, and I wish there were some way to never see it on any project while logged in, instead of on every project and in every language every time I visit after every time I log in. --EncycloPetey (talk) 20:32, 19 May 2016 (UTC)

Desirability of manuscripts in Wikisource?

Hi, I'm in the process of talking to other people in the Harold B. Lee Library (where I work as a Wikipedia coordinator) about making our transcriptions of works more accessible. Many of our transcriptions are of old personal journals. They're definitely in the public domain; I'm just not sure if transcriptions of manuscripts are within the scope of Wikisource. Would they fall under original contributions? Rachel Helps (BYU) (talk) 18:11, 17 May 2016 (UTC)

Are these scans of handwritten originals, scans of typed transcriptions/editorial collations of journals, or typed up into a modern digital format (like Project Gutenberg works)? ShakespeareFan00 (talk) 18:44, 17 May 2016 (UTC)
There are scans of the handwritten originals that we have in our special collections along with transcriptions made by student workers (which I'd need to adapt to Wikisource standards). I don't think Project Gutenberg would accept them, since they were never formally published, but I could be wrong. Rachel Helps (BYU) (talk) 19:15, 17 May 2016 (UTC)
These definitely sound like they are in scope. Does the library also hold short works like correspondence and ephemera (see User:MartinPoulter's efforts for the latter :) ShakespeareFan00 (talk) 19:26, 17 May 2016 (UTC)
Yes, we have loads of letters and ephemera, but not much of it is scanned yet. Thanks for your information; I'll let my colleagues know that Wikisource is a potential place to put transcriptions of manuscript material. Rachel Helps (BYU) (talk) 19:58, 17 May 2016 (UTC)
@Rachel Helps (BYU): when it comes to personal journals, it is not necessarily a direct yes; there is a bit of a notability or referential requirement. So if the person is notable to WS standards (an author) then yes. If the journal is by someone listed in WP/Wikt/Wikiquote/.. then yes, as they have the requisite 'notability'. If the work itself is notable in that it is required as a citable reference at enWP, etc., then yes. If it is old uncle Jim's scratchings about his trip to the town and he bought his chewing tobacco, then no. The same goes for people's wills: we want the notable wills, or the wills of notable people, but that of the cocky who shore sheep for Farmer Squire, no.

We definitely don't need formal publication, and that is explained in WS:WWI, and thanks for taking the time to ask. :-) — billinghurst sDrewth 07:09, 18 May 2016 (UTC)

yes, welcome here, although the Smithsonian transcription, and NARA citizen archivist have set up their own sites. they push their own volunteers with twitter, and other social media. unpublished manuscripts may have a 95 years term, so check license status. if you wanted to have an editathon / intro to wikicode, with a google hangout, i’d be happy to participate. Slowking4 17:43, 18 May 2016 (UTC)
Unpublished manuscripts don't have a 95-year term. They have a life+70 term. Anonymous works are more hairy than in Europe; if I understand it right, they still have a life+70 term, but one can assume (with proper checking) that the author died at most 50 years from creation.--Prosfilaes (talk) 23:19, 18 May 2016 (UTC)
I am going to assume that someone who works at an archival library would be able to contact any descendants who might stake a claim on a copyright and perhaps get them to indicate whether the heirs choose to enforce the copyright or not. John Carter (talk) 23:40, 18 May 2016 (UTC)
Generally Wikimedia insists on an explicit license, rather than an unenforced copyright.
Moreover, part of my frustration with copyright is that my grandfather died ten years ago, and I don't have a clue where my aunt's share in his copyright might have ended up. People move, change their names, die, leave stuff to random people, and pretty quickly it's very hard to tell who might hold copyright of some minor work.--Prosfilaes (talk) 01:12, 19 May 2016 (UTC)
welcome, to the fun of orphan works. In the US, and this is a US library, it is "Works created before 1978 and first published after or in 1978 are protected for the earlier of 95 years from publication or registration for copyright or 120 years from creation (for anonymous or corporate works) or 70 years after death of the creator for known authors" hence, the setup of institutional transcription on their servers. as hathi trust found nothing finds the copyright holder like publishing. if commons won’t take the manuscripts, we should, as easy "fair use". Slowking4 12:48, 20 May 2016 (UTC)

Tool "Cite this page"

Anyone noticed that Wikisource uses a Wikipedia "Cite this page" tool? Here's the opening blurb....

IMPORTANT NOTE: Most educators and professionals do not consider it appropriate to use tertiary sources such as encyclopedias as a sole source for any information—citing an encyclopedia as an important reference in footnotes or bibliographies may result in censure or a failing grade. Wikipedia articles should be used for background information, as a reference for correct terminology and search terms, and as a starting point for further research....   As with any community-built reference, there is a possibility for error in Wikipedia's content—please check your facts against multiple sources and read our disclaimers for more information. [bold added]

How do we get this fixed? Outlier59 (talk) 21:43, 18 May 2016 (UTC)

We edit MediaWiki:Citethispage-content. billinghurst sDrewth 11:54, 19 May 2016 (UTC)
Thanks! I'll see if I can re-word that. Outlier59 (talk) 13:43, 19 May 2016 (UTC)
I don't have permission to change that. Outlier59 (talk) 13:48, 19 May 2016 (UTC)
Nope, and that page would only be changed by consensus. Suggestions can be made on its talk page. — billinghurst sDrewth 15:22, 19 May 2016 (UTC)
apparently User:George Orwell III decided to add this "important notice" without consensus, an interesting historical note [49]. Slowking4 12:37, 20 May 2016 (UTC)
I wonder whether the function as a whole makes sense for Wikisource, and I suspect it is used very little even on Wikipedia. Possibly the best thing to do is to hide it from the toolbar.--Erasmo Barresi (talk) 20:35, 22 May 2016 (UTC)
Citing a "community-built", "tertiary source" encyclopedia such as Wikipedia is probably used very little because it's difficult to verify/trace to reliable primary sources. As long as Wikisource provides page scans to verify Wikisource texts against original publications, we're making progress here. It's not perfect, but it's progress. I say keep the citation option, but CORRECT it. As User:Slowking4 noticed, User:George Orwell III changed it. I say CORRECT this. Outlier59 (talk) 01:26, 23 May 2016 (UTC)
well, I do appreciate the attempt to educate the public about how to use sources, but no amount of tl;dr chastisement will work; it will require lots of one on one tutoring of the next generation. and the talk of Wikipedia is a tell, this warning is not applicable here, maybe the consensus is to fix their problem, or maybe we should blank it. an essay on what and how to use wikisource, with an executive summary might help. Slowking4 01:38, 23 May 2016 (UTC)
Can you say that in plain English? Outlier59 (talk) 01:57, 23 May 2016 (UTC)
GOIII is currently unavailable due to one of life's lesser disasters and may not be able to respond for some time.

In the meantime how about this:

@Outlier59: what was the basis for you discovering this apparent fault? What was the application for which you needed to "cite" a page? If for curiosity or experimentation sakes alone then Erasmo Barresi's questioning of the existence of the link to the tool stands and I support removing that link. Surely only if there is a proven potential application for it is there any incentive to "fix" anything? AuFCL (talk) 02:06, 23 May 2016 (UTC)

I agree with the idea of removing the link. It seems to me to display the wrong data anyway, because when I click 'cite this page' I would expect to be given citation styles for the actual work (or part thereof) that I'm reading. For example, citing Against the Grain/Chapter I should give "Joris-Karl Huysmans" as the author and not "Wikisource contributors". It should also say what chapter this is. So yeah, I'm in favour of the simplification of the sidebar tool list. — Sam Wilson ( TalkContribs ) … 12:14, 23 May 2016 (UTC)

Tech News: 2016-21

18:40, 23 May 2016 (UTC)