Wikisource:Scriptorium

Scriptorium
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one. Project members can often be found in the #wikisource IRC channel (webclient). For discussion related to the entire project (not just the English chapter), please use the multilingual Wikisource.

Announcements

Note
This section can be used by anyone to communicate Wikisource-related and relevant information; it is not restricted. Announcements generally have little or no discussion, so if a discussion is warranted, add another section under Other discussions and link to that section from the announcement.

Proposals

Template and its documentation deletion proposal

I propose that this template with its documentation be deleted. Its history shows no links to articles or pages; it was created in 2012 and has been superseded by the {{FIS}} template.— Ineuw talk 23:24, 7 June 2015 (UTC)

Please make your proposal at WS:Proposed deletions. Beeswaxcandle (talk) 05:28, 8 June 2015 (UTC)
Though if it is predominantly yours, and not used after these years, just go ahead and delete it. Though, as Beeswaxcandle rightly says, deletion requests belong at that other page — billinghurst sDrewth 14:01, 8 June 2015 (UTC)
Deletion is done, and in the future I will do as instructed.— Ineuw talk 04:26, 10 June 2015 (UTC)
@Ineuw:Don't forget to clean up the (now orphan) ex-template documentation. AuFCL (talk) 21:21, 13 July 2015 (UTC)
Thank you, my guide and mentor, and I don't say this facetiously.— Ineuw talk 21:29, 13 July 2015 (UTC)

Split the scriptorium?

Perhaps this has been discussed before (I couldn't find anything in the archives) but wouldn't this page be better off being split with a dedicated subpage for each of the current sections e.g. a separate bot approvals page? It might be just me but it seems to take a bit longer to load this page than similar pages elsewhere. Green Giant (talk) 18:54, 22 July 2015 (UTC)

@Green Giant: We could look to remove the transclusion to the /Help subpage, and replace it with a link to the page, and the ability to start a question as a new section on the subpage. — billinghurst sDrewth 03:25, 6 August 2015 (UTC)
Yeah that's the approach I had in mind. I was thinking that we could have a set of tabs like c:Template:Administrators' noticeboard, leaving only the Other Discussions visible here. I don't know how popular that might be though. Green Giant (talk) 12:12, 6 August 2015 (UTC)
Strong support for this proposal. Pages should be kept reasonably short (and loadable), and concisely directed in terms of their topic. Cheers! BD2412 T 14:35, 6 August 2015 (UTC)
I think this is an excellent idea. This page is long and it is sometimes difficult to keep on top of all the discussions going on at once. Support. Beleg Tâl (talk) 14:37, 6 August 2015 (UTC)
Support: one subpage for each of the current sections and no transclusion, as in Wikipedia's Village pump. I just made a draft of the section index (feel free to edit it), and there I included links to pages which are outside the Scriptorium, like Wikisource:Possible copyright violations, but nevertheless may be what users are looking for when they click on "Central discussion" in the sidebar.
With regard to this, what about enabling Flow in some or all of the subpages?--Erasmo Barresi (talk) 09:55, 10 August 2015 (UTC)
I have removed the transcluded section, and added a click link to start a new question. — billinghurst sDrewth 10:30, 10 August 2015 (UTC)

Updating obsolete gadgets

My recent voyage through gadgetland reminded me that the Regex search and replace gadget is obsolete and was superseded long ago by a much better version from Pathoschild. Would it be possible to replace the old gadget with the new code so that everyone can benefit from this tool, rather than having to paste this code into their common.js file?

mw.loader.load('//tools-static.wmflabs.org/meta/scripts/pathoschild.templatescript.js');

This new script saves the search and replace parameters to be reused, rather than having to be retyped. — Ineuw talk 04:34, 16 August 2015 (UTC)

I have poked Pathoschild; he has global rights to update. — billinghurst sDrewth 00:27, 17 August 2015 (UTC)
TemplateScript is the newer version of the obsolete regex menu framework. It's much more powerful, but it's not backwards compatible — I'll update custom scripts that use it over the next little while, and update the gadget when I'm done. —Pathoschild 00:55, 17 August 2015 (UTC)
When you say "not backwards compatible", what does that mean? I don't know who uses the old regex framework, but as far as I know it didn't store any parameters for reuse. Did I miss something (which is not unusual)? — Ineuw talk 08:34, 17 August 2015 (UTC)
TemplateScript was designed as the successor to the regex menu framework, so it's mostly backward compatible. Here's a quick overview off the top of my head:
| | Regex menu framework | TemplateScript | backward compatible |
|---|---|---|---|
| regex editor | ✓ | improved | |
| custom scripts | ✓ | improved | ✘ different schema |
| custom sidebars | "Scripts" sidebar | ✓ any number of custom sidebars | |
| supported views | edit | ✓ any (edit, block, protect, etc) | |
| gadget support | no-conflict mode (one gadget) | ✓ built-in support (any number of gadgets) | |
| compatibility | unknown | ✓ all skins, modernish browsers, MW extensions | |
| keyboard shortcuts | | | |
| save regex patterns | | | |
| custom plugins | | | |
| translatable | | | |
The only breaking change is how you define custom scripts in your *.js pages. TemplateScript was written to take this much further than the regex menu framework did, so it uses a more expressive approach to defining custom scripts. Here's a sample complex script from a recent migration (using the new helpers):
Regex menu framework:
function rmflinks() {
   regexTool('remove linebreaks','linebreaks()');
}

function linebreaks() {
   /* exception pattern */
   var pattern = '\{\{[^}]+\}\}';

   /* store exceptions in an array */
   var patternlocal = new RegExp(pattern, 'ig');
   var exceptionvalues = editbox.value.match(patternlocal);

   /* replace exceptions with placeholders */
   var patternlocal = new RegExp(pattern, 'i');
   for(var x=0; x<exceptionvalues.length; x++) {
      editbox.value = editbox.value.replace(patternlocal, '~exception~');
   }

   regex(/([^\n]+)\n([^\n]+)/g,'$1 $2',10);
   regex(/(=+.+?=+) *([^\n]+)/g,'$1\n$2');

   /* restore placeholders */
   for(var i=0; i<exceptionvalues.length; i++) {
      var pattern = new RegExp('~exception~');
      editbox.value = editbox.value.replace(pattern, exceptionvalues[i]);
   }
}
TemplateScript:
pathoschild.TemplateScript.add({
   name: 'remove linebreaks',
   script: function(context) {
      var escaped = context.helper.escape(/\{\{[^}]+\}\}/g);
      context.helper
         .replace(/([^\n])\n([^\n])/g, '$1 $2')
         .replace(/(=+.+?=+) *([^\n]+)/g, '$1\n$2');
      context.helper.unescape(escaped);
   }
});
So to migrate the gadget to TemplateScript, I first need to update custom scripts. —Pathoschild 18:37, 17 August 2015 (UTC)
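Both sample scripts above rely on the same idea: hide the template calls behind placeholders, run the replacements, then put the templates back. The sketch below shows that idea in plain JavaScript on an ordinary string. It does not use either framework, and the function names `protect`, `restore` and `removeLinebreaks` are hypothetical names chosen for illustration, not part of TemplateScript or the regex menu framework.

```javascript
// Swap every match of `pattern` for a numbered placeholder and remember
// the original text so it can be restored afterwards.
// Assumption: the input text never contains NUL characters.
function protect(text, pattern) {
  const saved = [];
  const masked = text.replace(pattern, (match) => {
    saved.push(match);
    return '\u0000' + (saved.length - 1) + '\u0000';
  });
  return { masked, saved };
}

// Put the remembered text back in place of each numbered placeholder.
function restore(text, saved) {
  return text.replace(/\u0000(\d+)\u0000/g, (_, index) => saved[Number(index)]);
}

// Same two replacements as the TemplateScript sample, applied to a string.
function removeLinebreaks(text) {
  const { masked, saved } = protect(text, /\{\{[^}]+\}\}/g);
  const joined = masked
    .replace(/([^\n])\n([^\n])/g, '$1 $2')
    .replace(/(=+.+?=+) *([^\n]+)/g, '$1\n$2');
  return restore(joined, saved);
}
```

Numbered placeholders let each template return to its own position, so the restore step does not depend on the order in which matches are processed.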

BOT approval requests

request for bot flag on account KasparBot

I want to perform the same task as on enwiki, frwiki, dawiki, mkwiki, jawiki, kowiki and cswiki, huwiki, bewiki in future. The bot will #1 move authority control information (Template:Authority control) to wikidata and replace the template with a blank {{Authority control}} (see w:en:Wikipedia:Bots/Requests for approval/KasparBot), #2 add {{Authority control}} to pages with authority control information on wikidata but without a local template transclusion on bewiki (see w:en:Wikipedia:Bots/Requests for approval/KasparBot 2). It uses my own Java framework. The bot's tasks are coordinated at Wikidata:WikiProject Authority control/Status. Regards, -- T.seppelt (talk) 08:42, 18 July 2015 (UTC)

Do you make a consistency check between local and wikidata info? If so, what if the two are different? Do you skip the page or who wins?— Mpaa (talk) 15:31, 18 July 2015 (UTC)
I skip the page. All problems will be tracked at a special section at the bot's tool. Regards, -- T.seppelt (talk) 23:58, 18 July 2015 (UTC)

@Mpaa: I could make some test edits. Do you agree? --T.seppelt (talk) 07:26, 22 July 2015 (UTC)

OK for me. Mpaa (talk) 19:16, 22 July 2015 (UTC)
@Mpaa: done. I didn't notice any mistakes. -- T.seppelt (talk) 06:29, 23 July 2015 (UTC)

@Mpaa:, @BirgitteSB:, @Hesperian:, @Zhaladshar: Nothing is happening here. Do you want me to hand in any information or do some more test edits? By the way, you can see the estimated edits of this bot at [1]. Kind regards, -- T.seppelt (talk) 16:13, 24 July 2015 (UTC)

I can't really add a flag because there is not really any consensus. I don't know enough about authority control to really weigh in on this. I'm hoping enough people who do can put together a consensus so that I know whether I can flag the account or not.—Zhaladshar (Talk) 14:32, 25 July 2015 (UTC)
I'd need to know more about this to say, but it looks as though the test edits were all done in the Author namespace. Will the bot's edits be limited to the Author namespace? The stated scope of the bot's edits is very vague. --EncycloPetey (talk) 15:36, 25 July 2015 (UTC)
The bot looks at all pages in Category:Pages using authority control with parameters. User pages won't be affected because they usually don't have Wikidata items. You can inspect the estimated edits at the bot's tool page. Kind regards, --T.seppelt (talk) 19:47, 25 July 2015 (UTC)
@Mpaa:,@EncycloPetey: I started a general discussion. -- T.seppelt (talk) 06:34, 5 August 2015 (UTC)
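The consistency check discussed earlier in this thread (skip the page when the local template and Wikidata disagree, otherwise move the local values) can be sketched as follows. This is only an illustration under stated assumptions: it is not KasparBot's code (which the request says is written in the author's own Java framework), and the function name and the plain-object data shapes are hypothetical.

```javascript
// Hypothetical data shapes: plain objects mapping an identifier name
// (for example "VIAF") to its value. Not KasparBot's actual code.
function planAuthorityControlEdit(localParams, wikidataClaims) {
  const conflicts = [];
  const toMove = {};
  for (const [key, value] of Object.entries(localParams)) {
    const onWikidata = Object.prototype.hasOwnProperty.call(wikidataClaims, key);
    if (!onWikidata) {
      // Only present locally: a candidate for moving to Wikidata.
      toMove[key] = value;
    } else if (wikidataClaims[key] !== value) {
      // Present in both places with different values: a conflict.
      conflicts.push(key);
    }
  }
  // Any conflict means the page is skipped and reported, as described above.
  if (conflicts.length > 0) {
    return { action: 'skip', conflicts };
  }
  return { action: 'migrate', toMove };
}
```

Returning the list of conflicting keys alongside the skip decision gives a tracking page something concrete to report.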

request for bot flag on account YiFeiBot

Removing interlanguage links from pages if the link is already on Wikidata. The bot is already approved as a global bot and locally on commonswiki, enwiki, urwiki, zhwiki, zhwikivoyage. Programmed in Python with the pywikibot framework (source). --Zhuyifei1999 (talk) 05:58, 11 August 2015 (UTC)

From memory, there is already a bot doing this here, so I don't see the point in duplication of function. Beeswaxcandle (talk) 07:09, 16 August 2015 (UTC)
I'm unaware of any active & approved bot on this task. And some simulated runs show many to-be-removed interlanguage links. Would you point out the bot for me? --Zhuyifei1999 (talk) 18:08, 16 August 2015 (UTC)
Oppose - There are links here that should not be removed even if they exist somewhere on Wikidata. Some of the interlinkages are not yet properly figured out, and if we rely solely on Wikidata, we can lose track of the links altogether. I've seen this happening in recent months when someone at Wikidata decided that our copy of a work was an "edition", and so needed to be on a separate data item from the work itself. Because the interwiki links were all at Wikidata, the move meant that all interwiki links disappeared from that work. So, at this point, because reasonable precautions are not yet in place against that sort of thing, I would say "no", we shouldn't start stripping out the interwikis. --EncycloPetey (talk) 13:34, 16 August 2015 (UTC)
Hmm. The removal on Wikidata seems to be vandalism and should be reverted on sight, as existing items should be used whenever possible. --Zhuyifei1999 (talk) 18:08, 16 August 2015 (UTC)
To note that the edits at Wikidata are correct, as our works are editions, and part of a general book, it just hasn't been suitably thought through on how we interlanguage link due to this. — billinghurst sDrewth 00:40, 17 August 2015 (UTC)
Oppose. Agree with EncycloPetey that it is premature to remove interwiki links by bot and transfer them to WD. There are still open discussions to be had about how to link books/editions/translations between languages, and about how best to cater for links between works. I would hope that the Wikisource conference proposed for Vienna in November would be able to have a good roundtable discussion about this matter. @Micru, Aubrey: unsigned comment by billinghurst (talk) 00:37, 17 August 2015.

I withdraw my nomination. If en.ws community decides not to remove the links, I'm fine with this outcome, and wait for the Wikisource conference in November. --Zhuyifei1999 (talk) 14:09, 17 August 2015 (UTC)
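For readers unfamiliar with the task that was proposed in this request, the sketch below removes a local interlanguage link only when the same language and title are already recorded as a sitelink. It is an illustration only, not YiFeiBot's code (which the request says is Python with pywikibot). The function name and the `sitelinks` object are hypothetical, and it assumes each interlanguage link sits on its own line.

```javascript
// `sitelinks` is a hypothetical plain object mapping a language code to the
// page title recorded on Wikidata, e.g. { fr: 'Exemple' }.
// Simplification: titles are compared exactly (no underscore or case
// normalisation), so anything that differs is left in place.
function stripRedundantInterwiki(wikitext, sitelinks) {
  const interwiki = /^\[\[([a-z-]{2,12}):([^\]|]+)\]\]\n?/gm;
  return wikitext.replace(interwiki, (link, lang, title) =>
    sitelinks[lang] === title.trim() ? '' : link
  );
}
```

Links that have no matching sitelink are kept, which is the conservative behaviour in cases like the one raised above, where a work had been moved to a different data item.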

Help

Preferably, ask your HELP questions at Wikisource:Scriptorium/Help.

Repairs (and moves)

Other discussions

Famous passages as separate works

Does anyone want to weigh in on when a famous or popular passage should be listed as its own work on Wikisource? For example: The Lord's Prayer is an excerpt from The Gospel of Matthew, but I doubt anyone would exclude it as not being a work unto itself. However, this could set a precedent for other popular passages: what about the Magnificat, Nunc dimittis, the Beatitudes, etc? I can't think of any non-Biblical examples at the moment. I think that a good treatment of this is the Ten Commandments, which simply links to Bible (King James)/Exodus#Chapter 20. However, this runs into the issue that there are multiple versions/translations and it selects only one. I am inclined to do likewise with The Lord's Prayer, with just a list of parallel passages, and remove other translations. —Beleg Tâl (talk) 21:28, 19 June 2015 (UTC)

Oh, here's another example: the hymn "Jerusalem the Golden" is an excerpt from the poem The Celestial Country, which in turn is a translation of an excerpt from De contemptu mundi. —Beleg Tâl (talk) 21:45, 19 June 2015 (UTC)
A secular example is The Walrus and the Carpenter taken out of Through the Looking Glass. Some of them should probably be redirects to sections/anchors in the text, while others could stand on their own with a cross-reference. @Londonjackbooks: is likely to have an opinion on where the balance point lies as I know she's been converting some poems to redirects. I'm not sure myself where that point of significance is and how it should be measured. Could we think of it in terms of scan-backing? If the work the excerpt comes from has no scan, then the excerpt can stand. If the work has a scan (that we're hosting), then the excerpt should be a redirect (or a versions page in the case of multiple hosting of the excerpt). Beeswaxcandle (talk) 22:59, 19 June 2015 (UTC)
"The Walrus and the Carpenter" (and "Jabberwocky" etc.) is a perfect example of what I was thinking of. Interestingly, those poems do have scan-backing, but only from the scan of Through the Looking Glass. You could, therefore, replace the text of The Walrus and the Carpenter with the transcluded text from the novel, and this would be an improvement to our copy of the poem. However, what then would warrant it having its own page? The only thing I can think of would be if we had a scan of it published as a work by itself.
To give a concrete example from something I am working on, I transcluded the Morning Prayer service from the Book of Common Prayer (ECUSA) the other day, and it contains a number of readings. Some of the readings are obviously separate works quoted in full (like "We Praise Thee" and "O Glorious Light"), and I have put them on their own page. Some of them are also merely readings from scripture, like Psalm 100 or 1 Chronicles 3:1-15, and I have ignored these as excerpts. However, some that I have mentioned, like the Lord's Prayer or the Magnificat, toe the line between, since they are parts of a larger work but are frequently used (quoted, referenced, published) separately.
I would be interested to hear @Londonjackbooks:'s opinion on this. —Beleg Tâl (talk) 18:04, 20 June 2015 (UTC)

Comment: I would think that the famous component will have been separately published outside of the Bible. I would suspect that a disambiguation page is appropriate, using the notes field to record the various sources. It would not be a hard and fast rule, as sometimes we are only going to have one version, and it will be a redirect until we have a subsequent work.

For a passage alone, I would think that we would call that a quotation and from the disambiguation page have a WQ link. — billinghurst sDrewth 13:29, 20 June 2015 (UTC)

Thanks, @Billinghurst:. I think that this is the best way to determine if it deserves its own page, if it has been separately published outside the source work. A concern with this approach, like I mentioned to @Beeswaxcandle:, is that there are works which contain extensive quotations from other works. The one I am working on right now is Book of Common Prayer (ECUSA), but I imagine any reader or anthology would be subject to this issue. Some quotations are full works and should be transcluded onto their own page. Some quotations are clearly meant as excerpts and should not. Some, however, like the Lord's Prayer or the Magnificat, appear to be understood as a full work (which they are sometimes published as), even though they are known to be excerpts.
Perhaps, a good rule of thumb is this: if there exists a copy of the excerpt published as its own work (and not as a quotation), then we can put it on Wikisource, and then we can put a disambiguation page which will link to the scan as well as to the places in the larger work where it exists. Otherwise, we can disallow it. —Beleg Tâl (talk) 18:04, 20 June 2015 (UTC)
@Beleg Tâl: Using your example of We Praise Thee, I would have made that page redirect (Book of Common Prayer (ECUSA)/The Daily Office/Daily Morning Prayer: Rite One#52) to the text within the Book of Common Prayer (ECUSA) where the passage itself begins instead of giving it its own page (even though transcluded). If more than one version exists on WS, then I would create a "We Praise Thee" versions page, pointing readers to the scan-backed versions available. I agree with Beeswaxcandle's assessment above. That would solve the question of whether a passage/poem "deserves its own page" outside of the indexed source text within which it is transcluded. Generally speaking, I believe it does not. There may be exceptions I am not yet aware of. Sorry I chimed in late... just getting back online, and hoping I understood the question. Thanks, Londonjackbooks (talk) 14:13, 8 July 2015 (UTC)
That's very interesting, as We Praise Thee (a.k.a. Te Deum) is a full work unto itself, and not an excerpt from a larger work. This is a slightly different question, and one which we have discussed before (although I don't think there was a definitive conclusion that time)—see Wikisource:Scriptorium/Help/Archives/2014#Works contained in other works. The two questions are of course intertwined, when you have a work such as the Book of Common Prayer which contains full works such as We Praise Thee as well as excerpts such as The Lord's Prayer.
It sounds like you are agreeing with User:Beeswaxcandle with regards to this question, namely that an excerpt should only be listed as its own work if we have a copy of the excerpt that was separately published as its own work. However, it sounds like you also have an opinion on the different question of separately transcluding fully cited works, which is that it should NEVER be done. This is different from what User:Billinghurst said in the previous discussion ("I have separately transcluded a work from an existing work where it is included in full, not an excerpt, and it is incidental to the work itself.").
This is an important discussion in my opinion, as I have done several prayerbooks and hymnals with the intention of having the cited prayers and hymns available on Wikisource, and I intend to continue doing so. I hope to get a good understanding of current consensus if it exists, and to create consensus if no consensus exists currently. Please let me know if I have mistaken your position. —Beleg Tâl (talk) 16:48, 9 July 2015 (UTC)
I don't believe you have mistaken my position—which is only my general opinion. I feel it is redundant to have two accounts on WS from the same transcluded source. I would be interested in viewing a/the piece that @Billinghurst: has transcluded separate from an existing work as mentioned above and in the archived Scriptorium thread. I did admit that there could be exceptions, and I believe I have made such an exception with a few poems here, come to think of it,—for presentation purposes. Thanks, Londonjackbooks (talk) 17:30, 9 July 2015 (UTC)
@Londonjackbooks: Blue Goodness of the Weald as presented in another work. — billinghurst sDrewth 08:15, 11 July 2015 (UTC)
Thanks. Taking a look, that poem is excerpted from "Sussex" by Kipling in The Five Nations. Londonjackbooks (talk) 11:05, 11 July 2015 (UTC)

No file Index:Economic and Social Council Resolution 2007-25.pdf

Commons deletion? ShakespeareFan00 (talk) 15:13, 3 July 2015 (UTC)

See at c:Commons:Deletion requests/File:Economic and Social Council Resolution 2007-25.pdf Hrishikes (talk) 15:32, 3 July 2015 (UTC)
yep, after 1987 c:Template:PD-US-no notice-UN, maybe we need to have a word with the UN for some CC licenses. see also Administrative Instruction ST/AI/189/Add.9/Rev.1 Slowking4Farmbrough's revenge 21:19, 4 July 2015 (UTC)
That user mistakenly marked many files created after 1987 for deletion when a good number still remained in the public domain. See w:Template talk:PD-UN for @George Orwell III:'s explanation. There's a chance that it is still in the public domain but you'd have to look into it. The Haz talk 20:59, 8 July 2015 (UTC)
I should add that document likely was deleted by mistake. It probably should have had c:Template:PD-UN-doc. The Haz talk 21:04, 8 July 2015 (UTC)

Would like to add the Garnett translation of Dead Souls

I see that on archive.org there are two volumes of Constance Garnett's translation of Dead Souls by Gogol, published by Chatto & Windus (London, 1922). Since Garnett died in 1946, I gather this work has just entered the public domain in the US, but won't in the UK for a couple of years. So in this case I should upload the djvu files to Wikisource rather than to Wikimedia Commons? There's already the Hogarth translation on Wikisource (copied from Gutenberg), but the Garnett translation seems to be more accurate and complete. Mudbringer (talk) 07:59, 4 July 2015 (UTC)

I'm not sure of the copyright status in the United States. The Copyright Renewals include
DEAD SOULS, by Nikolay Gogol; translated by Mrs. Edward Garnett [i. e., Constance Black Garnett] (The collected works of Nikolay Gogol, v. 1 and 2) © 23Apr23, (pub. abroad 7Nov22), A704400. R71938, 1Dec50, David Garnett (C)
This states that its copyright date in the US is 1923 even though, as the Copyright Office noted, it was published in 1922. Wikilivres would take it, but let me look for other advice.--Prosfilaes (talk) 01:32, 5 July 2015 (UTC)
Uggh. That is the registration date, which is when copyright would start unless it had been previously published, in which case it should have started then (of course, not according to the 9th Circuit). Possibly the publication abroad was not "general". Technically, I think, the renewal had to be made before the 28th anniversary of the publication starting -- if that was 7 Nov 1922, then the renewal came a month late. If it was 23 April 1923, then the renewal was right in the window. (It was later that registrations were allowed to the end of the last calendar year, but maybe that was close enough and they allowed it.)
Or… maybe this was the old ad interim copyright for the foreign publication, which served a short while until copies could be manufactured in the U.S., at which point a proper registration could be made. Maybe the 1922 date was for the ad interim copyright. The Compendium I stated: "An application covering an American edition of a work first registered for ad interim copyright should state the date of publication of the American edition, but should also indicate the year date of publication of the foreign edition." Later it says: "Ad interim copyright may be extended to the full term if an American edition is manufactured and published during the five-year ad interim period, and if a claim in the American edition is registered. (See item 8.4.6.ll.b.) In such case the full copyright term is computed from the date of first publication abroad." (Compendium I, page 8-7, warning large PDF).
So it sounds like ad interim copyright was for a short time (though it did exempt notice requirements on the foreign works), and it could be extended to the full term by complying with the manufacturing requirements followed by registration, which as a guess sounds like what happened in 1923. But the above is pretty explicit that copyright started with the publication abroad. If so, that would be 1922, regardless of the later registration. In that case, 1923 is the full U.S. registration date and not the start of copyright. I'm wavering but… I think I'll lean towards 1922 as the publication year for that, so U.S. expiry on Jan 1, 1998.
[As an aside, the Compendium I noted the Heim case, the one the 9th Circuit relied on in Twin Books, and noted that the Office did register works without notice under the rule of doubt up until the UCC came into force, which the Office thinks then changed the Heim doctrine. But at the time, and per Heim, that only changed the validity of the registration, not the start date of the copyright… that was a 9th Circuit invention.] The UK publication does have an "All rights reserved" but no copyright notice. It's probably in Twin Books territory but I think most would consider it PD in the US, so I'd lean OK for Wikisource. It should become unambiguously PD in 2019. Carl Lindberg (talk) 03:06, 5 July 2015 (UTC)
Thank you for your research and advice. I guess I'll go ahead and get started on it and see how it goes. Mudbringer (talk) 05:12, 5 July 2015 (UTC)
Just a note -- Constance Garnett was British and died in 1946, and the UK is the country of origin, so the work cannot be uploaded to Commons until 2017 since that is when the UK copyright would expire. Carl Lindberg (talk) 16:14, 5 July 2015 (UTC)
Thanks, that's what I thought. I've uploaded the djvu files to Wikisource here and here. I'll try making the index files tomorrow. Mudbringer (talk) 17:00, 5 July 2015 (UTC)
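The date reasoning in this thread can be checked with a little arithmetic. The sketch below is only an illustration, not legal advice, and rests on assumptions that are not stated in the thread: that a renewal had to be filed during the final year of the first 28-year term, that a work whose term started in 1922 had a 75-year total term, and that one whose term started in 1923 has a 95-year term. The function names are hypothetical.

```javascript
// Add whole years to an ISO date (YYYY-MM-DD), working in UTC.
function addYears(isoDate, years) {
  const d = new Date(isoDate + 'T00:00:00Z');
  d.setUTCFullYear(d.getUTCFullYear() + years);
  return d;
}

// Assumption: the renewal window is the 28th year of the first term,
// i.e. from the 27th to the 28th anniversary of the term's start.
function renewalInWindow(termStart, renewalFiled) {
  const filed = new Date(renewalFiled + 'T00:00:00Z');
  return filed >= addYears(termStart, 27) && filed <= addYears(termStart, 28);
}

// Terms run to the end of the calendar year, so the work is free
// from 1 January of the following year.
function publicDomainYear(startYear, termYears) {
  return startYear + termYears + 1;
}

// Renewal R71938 was filed on 1 Dec 1950.
console.log(renewalInWindow('1922-11-07', '1950-12-01')); // false: about a month late
console.log(renewalInWindow('1923-04-23', '1950-12-01')); // true: inside the window
console.log(publicDomainYear(1922, 75)); // 1998
console.log(publicDomainYear(1923, 95)); // 2019
```

Under these assumptions the numbers match the two outcomes weighed above: a 1922 start gives expiry on 1 January 1998, while a 1923 start gives 2019.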

Index:Compendium of US Copyright Office Practices, II (1984).pdf

Any takers to push the last few index pages into proofread status so I can mark this for validation (barring 2 pages that need symbol images)? ShakespeareFan00 (talk) 22:01, 4 July 2015 (UTC)

Thanks, any takers for validation? ShakespeareFan00 (talk) 13:49, 6 July 2015 (UTC)

Tech News: 2015-28

15:13, 6 July 2015 (UTC)

The World Factbook (1982)

Anyone want to follow the pattern set and assemble this? I'd really appreciate someone else resolving some issues with transcription inconsistencies mostly caused by the way Proofread page handles page breaks. ShakespeareFan00 (talk) 11:18, 8 July 2015 (UTC)

2015 Wikimania meetup

anyone interested in a wikimania meetup? [7] Slowking4Farmbrough's revenge 19:33, 8 July 2015 (UTC)

Limitations on author pages

It may be time to draft a set of standards limiting content on Author pages for authors whose works are not in PD, and whose works will not be PD for a long time. I've come across Author:Alexios Schandermani, which appears to be little more than a personal advertisement for an author's works.

In particular, how much information is right for the author's description, and what is too much? How much information should we provide for works that are not hosted on Wikisource, Wikilivres, or any public internet location? --EncycloPetey (talk) 20:43, 9 July 2015 (UTC)

I looked at it and I do not believe it belongs here on Wikisource. It certainly isn't before 1923, is under copyright and it is self-promotion of his work. Want to buy his books now? It is almost as bad as Brook D. Simpson, instructor from NY working in AZ.edu and posting his books on general Grant on Wikipedia. Want to buy his books? We are allowing self-promotion of books to be purchased but yet what we do on WS is for free. —Maury (talk) 21:05, 9 July 2015 (UTC)
While I agree with your sentiments, I am looking for objective criteria that we could draft, so that when these pages appear, we can point disgruntled contributors to a page explaining the situation, rather than waste time writing an original explanation every time. --EncycloPetey (talk) 21:29, 9 July 2015 (UTC)
Is there any reason to limit them, instead of just forbidding them? If the author doesn't have any free works and nothing will go PD for at least 20 years, there's no real reason to have pages for them. I can see cases where we could have problematic pages for people with a little free work, but I'd rather not add rules for something that's not current--unless it is a current problem.--Prosfilaes (talk) 23:12, 9 July 2015 (UTC)
The author page cited really looks like a promotional piece and detracts from the dignity of the site, but Prosfilaes is right: total prohibition works better than imposing limitations and is easier to control. Here are some proposals:

1. As said above, pages for authors having no PD work and no likely PD work in the next 20 years should not be allowed here.

Agree, though I wouldn't even give 20 years. Caveat: there are exceptions to this rule (covered later) — billinghurst sDrewth
In 2019, new stuff will start entering the PD in the US. Anyone interested enough to start accumulating information to facilitate the entry of works a couple of years ahead of time should be permitted to do so. 20 years is arbitrary, but there's hardly any abuse potential in letting works up to 1940 be listed.--Prosfilaes (talk) 03:03, 13 July 2015 (UTC)

2. If author pages are at all allowed, full bibliography should also be allowed, irrespective of whether it is hosted here, for the sake of completeness, general information and as a stimulus to prospective contributors to add the work here.

Agree and we have always allowed a linking to freely hosted full works elsewhere. — billinghurst sDrewth
I think that ignores some of the issues that brought this up. Assuming that the poems under Author:Alexios Schandermani stay, does that mean that we will provide full bibliography (even though prospective contributors can't add the works) and links to legally hosted non-Free works elsewhere? (The history of that page should be looked through for the many variations on what we could see.)--Prosfilaes (talk) 03:03, 13 July 2015 (UTC)

3. For foreign authors, bibliography should be limited mostly to English translations; there is no use having a foreign language bibliography, that is for the Wikisource in that language.

Disagree, there are referenced/cited works in non-English for authors, and the authors should be linked, especially when they can be interlanguage linked after that. — billinghurst sDrewth
I'll note that "foreign authors" is problematic in a multi-national environment, as well as the implication that nations and languages go together. I think if we have a bibliography, we should have a list of works that can be translated as well as those that just have to be scanned. There are certainly authors where even when translations exist, original names are necessary for clarifying what is a translation of what.--Prosfilaes (talk) 03:03, 13 July 2015 (UTC)

4. Description for authors having Wikipedia pages should be restricted to a few words or phrases, like "British journalist" or "Indian novelist" and the like. The reader can see the rest from the Wikipedia link. A few lines excerpted from Wikipedia may be allowed, without any weasel words or superlatives or eulogistic description.

Agree though I would say minimal text to put the author or their works in context, though I tend less to excerpt enWP, eg. contributor mentions for local multi-author works

5. Original description of a few sentences should be allowed for authors not having Wikipedia articles, but this should be very concise, without having a biased look.

Agree, though I think my comment above about minimal text covers this — billinghurst sDrewth

6. Works listed should not have detailed description. Noting its genre should suffice, detailed content, author's purpose and method of writing it etc. should not feature here. If the work has any outstanding uniqueness (e.g., won a Nobel prize, was the first detective novel in source language etc.), then that information may be provided concisely.

hmmm, detail would normally belong on a work, though if there is some detail and no work, I am not averse to sourced commentary. Minimal and contextual, if it clearly adds value, would be my comment; where disputed, the lesser position should be favoured — billinghurst sDrewth

7. Overall, the whole page should have an objective look, there should not be any imprint of passion or emotion of contributors on the page.

Agree: neutral, not personal opinion and choice; where disputes occur, the lesser is generally preferred — billinghurst sDrewth

8. No linking of books to commercial sites.

Agree focus is on linking to free works — billinghurst sDrewth
submitted for further discussion.
N.B. I have created/modified 4 author pages (1, 2, 3, 4) in some detail. I am not sure whether these pages would meet the community consensus. If not, other contributors are welcome to amend the pages. Hrishikes (talk) 01:39, 10 July 2015 (UTC)
Two pros and a con:
  • Pro: what about communal research? Most of the above arguments only really work if one contributor makes the basic Author: structure, and nobody else significantly touches that page.
  • Con: Isn't it a little bit irresponsible to keep pushing research responsibility out to sister projects (i.e. WP, WD etc.)? If this actually worked then eliminate local Author: pages altogether!
  • Pro: Sometimes details available "erode" over time. There is nothing sadder than finding out a biography was available years earlier but is no more because the hosting site has gone down, never to be restored. Sometimes another authority might keep backups but not always, or worse keeps archives of dead links. For example: Brite Sparks (biographies of Australian science figures) is long gone although the NLA has some entries still available.
  • AuFCL (talk) 08:05, 10 July 2015 (UTC)
I've cut out the promotional material. Is it still objectionable? —Beleg Tâl (talk) 14:57, 10 July 2015 (UTC)
As of now. All those links obscure the two (unlicensed) works on Wikisource. And it's fine to link to Google Books and publishers as reference, but inline like that randomly prioritizes certain sites and obscures the difference between works available and bibliographic information; and linking to works found online dilutes our Free mandate. I guess, I'm looking at a different page; you meant [8]. I don't see the point in removing years and ISBNs. The poetry pushes it into a case I said we might not need to handle now, the case where there's some trivial amount of free work behind more non-free work.--Prosfilaes (talk) 19:18, 10 July 2015 (UTC)
i agree with the not promotional, however, good bibliographies can be hard to find. alternatively, you could also have a style guide = standard list only with isbn, not link to pay or blog sites. (similar to w:Wikipedia:WikiProject Bibliographies). Slowking4Farmbrough's revenge 02:36, 11 July 2015 (UTC)
Always better to have an ISBN and link to Special:BookSources where the reader can pick which external link to follow. Green Giant (talk) 18:00, 11 July 2015 (UTC)

Added commentary inline above. The purpose of our author pages is to provide detail about authors of works and writings in the public domain, and links to those works that are freely available. One exception to that basic premise is that we do have author pages for some significantly notable authors who are not in the public domain, as their works have been added and deleted, and we do this to stop the addition of these works. The premise again is that if there are no works in the public domain, such pages are the exception, and where their presence is disputed we are more likely to delete those author pages. — billinghurst sDrewth 11:40, 12 July 2015 (UTC)

ie. Author:Stephen King The Haz talk 18:29, 14 July 2015 (UTC)

One way to handle biographical details, if details beyond the minimum need to be held for some reason, is to post those to the accompanying Talk page, with any links or references used. Better still, add the information to Wikipedia, but I know they don't always keep stub biographies on people who are not "notable" enough. --EncycloPetey (talk) 00:06, 15 July 2015 (UTC)

Index:British Reptiles, Amphibians, and Fresh-water Fishes.djvu[edit]

No file. - Per a Commons deletion as the images were not yet out of UK copyright. The text was OK and the file could have been localised as I suggested at WS:PD a while ago. ShakespeareFan00 (talk) 08:26, 10 July 2015 (UTC)

File is now local :) ShakespeareFan00 (talk) 11:05, 10 July 2015 (UTC)

Index:The Pilgrim's Progress.djvu[edit]

Concern was previously raised that the new material on the front of this didn't have a clear status. Unless someone's able to provide a better date (I checked archive.org, which didn't have one), I'm considering putting a deletion request at Commons. Pilgrim's Progress itself is of course public domain (the deletion would be solely in relation to the "new material" such as the title bindings and [9]). ShakespeareFan00 (talk) 10:37, 10 July 2015 (UTC)

Further to this, the R. H. Brock identified died in 1943: http://illustrationartgallery.blogspot.co.uk/2010/10/richard-henry-brock.html ShakespeareFan00 (talk) 10:40, 10 July 2015 (UTC)
This would appear to be solely about the notes on Page 6, the index and the fact that there's no date for the edition. ShakespeareFan00 (talk) 10:46, 10 July 2015 (UTC)
The earlier-thread is here- Wikisource:Scriptorium/Archives/2014-10#Index:The_Pilgrim.27s_Progress.djvu ShakespeareFan00 (talk) 11:05, 10 July 2015 (UTC)
@ShakespeareFan00: The Digital Library of India has 17 copies of this work, pertaining to different years/editions. Which one do you want? Hrishikes (talk) 11:11, 10 July 2015 (UTC)
One that is clearly and unambiguously in the public domain internationally. As I said, the problem's arisen as the specific edition doesn't have a definitive date. In the previous thread it was certainly suggested that a "clearly dated" edition be found.  :) ShakespeareFan00 (talk) 11:24, 10 July 2015 (UTC)
Please Check out and choose
  1. 1909 (Harvard Classics vol 15 containing The Pilgrim's Progress by John Bunyan and The Lives of John Donne and George Herbert by Izaak Walton)
  2. 1904 (1956 reprint, Oxford Standard Authors Series)
  3. 1892 (Ward, Lock, Bowden & Co., London)
  4. 1908 (Cassell & Co., London-Paris-NY)
  5. 1904 (Oxford, 1929 reprint)
  6. 1904 (The Pilgrim's Progress, The Holy War and Grace Abounding by John Bunyan, Thomas Nelson & Sons, Lond-Edin-NY)

Hrishikes (talk) 12:12, 10 July 2015 (UTC)

My recommendation is that IF you can show the 1904 Nelson version to be free from copyright restrictions outside the US, that's probably the best bet. ShakespeareFan00 (talk) 13:04, 10 July 2015 (UTC)
Hmm, some of the data you've provided suggests the Nelson version we've got might be a post-1922 reprint version (sigh). I wasn't able to view scans on the DLI link as it complained about a missing URL. This needs some more research.
ShakespeareFan00 (talk) 12:45, 10 July 2015 (UTC)
Confirmed. The version we have is from the 1960s; check the address of the US arm on the colophon page against details here (Thomas Nelson (publisher)). ShakespeareFan00 (talk) 13:04, 10 July 2015 (UTC)
Why are you getting bothered about reprints? As far as I understand, editions have new copyright, not reprints. Of the various works I am now doing, Panchatantra is 1955 reprint, 1925 copyright; The Home & the World is 1957 reprint, 1919 copyright. A reprint has no new material, so no new copyright, except when separately registered for the same. Aside from the copyright aspect, scan quality is also important. DLI scans have variable quality, which is important in case of images. So the file in DLI should be chosen from the angle of scan quality. As you are already engaged in this work, so you are more suited to do the choosing. You can compare all the 17 versions by going to the DLI homepage and author-searching Bunyan. The scans are in TIF format, so a reader is needed, which can be installed from the link given on the DLI page. Moreover, an endless list of this book's versions are available in other sites: Google Books -- 1, 2, 3; Internet Archive -- 1, 2, 3, 4, 5, 6 and more. So there should not be any problem about replacing the WS version with a suitable substitute. Best wishes, Hrishikes (talk) 14:22, 10 July 2015 (UTC)
Sorry but the plugin required on windows is not "free" software... You tried :) ShakespeareFan00 (talk) 15:32, 10 July 2015 (UTC)
In your list DLI 2 & 6 are seemingly broken links, and don't show up by searching on Bunyan as an author..ShakespeareFan00 (talk) 15:57, 10 July 2015 (UTC)
The concern is about the "index" which is not part of the original Bunyan work... ShakespeareFan00 (talk) 15:59, 10 July 2015 (UTC)
Thanks - Index:The pilgrims progress as originally published by John Bunyan ; being a facsimile of the first edition (1878).djvu & Index:The pilgrim's progress by John Bunyan every child can read (1909).djvu ready for proofreading if anyone cares. ShakespeareFan00 (talk) 16:48, 10 July 2015 (UTC)

The second also has a version of The Little Pilgrim which is not currently sourced. ShakespeareFan00 (talk) 16:49, 10 July 2015 (UTC)

@ShakespeareFan00: Yes, 2 and 6 go to broken links because DLI people put wrong linking on the allmetainfo page. Correct links: 2 and 6. The plug-in is very much free, see on this page. If you want to do it without the plug-in, then you can download the pages directly by going to 2 (change page number up to 438) and 6 (change page number up to 755). If you wish to download the books directly as PDF, then you will need DLI downloader, for which, see 1, 2, 3, 4. I have not used these tools as yet, so I don't know whether these work or not. Anyway, this discussion is now practically redundant as you have already added the work from IA; this is only for answering the points you had raised. Hrishikes (talk) 03:04, 11 July 2015 (UTC)

Tech News: 2015-29[edit]

15:06, 13 July 2015 (UTC)

Index:The pilgrim's progress by John Bunyan every child can read (1909).djvu[edit]

And another work done. ShakespeareFan00 (talk) 16:19, 13 July 2015 (UTC)

Marx in English from USSR[edit]

There are some books by Karl Marx, published without year of publication and without names of translators and editors, by the Foreign Languages Publishing House, Moscow, in the Soviet era. I am interested in Notes on Indian History (664-1858) found at https://archive.org/details/notesonindianhis00marxuoft and https://books.google.co.in/books/?id=LQcUAAAAIAAJ. Google Books shows the year as 1947, but it is not so. The Publisher's Note mentions that the work was prepared after the Russian version of 1947. So the year can be deemed to be in the 1950s or thereabouts, and the IA version is the second impression. The then Soviet law forbade copyright, and even if current US law deems the work as non-PD, copyright status is difficult to understand, as the translators/editors are not named in the work. So I request guidance about whether this is addable here. Hrishikes (talk) 04:45, 16 July 2015 (UTC)

In 1996, Russia had a law that was life+50 (plus extensions for some authors), restoring copyright to older works. Thus any such works would have had their copyright restored in the United States and thus have copyright for 95 years from publication.--Prosfilaes (talk) 06:44, 16 July 2015 (UTC)
I agree with Prosfilaes, those works will probably become public domain in the US in 2043 at the earliest. Green Giant (talk) 18:34, 16 July 2015 (UTC)
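For anyone checking the arithmetic behind "2043 at the earliest", here is a minimal sketch. It assumes the 95-year term mentioned above runs to the end of the calendar year, so the work becomes free on the following 1 January, and it takes 1947 (the year of the Russian version) as the earliest possible publication year. The function name is hypothetical.

```python
def us_public_domain_year(publication_year: int, term_years: int = 95) -> int:
    """Return the first calendar year in which the work is out of copyright.

    Assumption: the term covers `term_years` full calendar years counted
    from the publication year, and protection lapses on 1 January after that.
    """
    return publication_year + term_years + 1


# 1947 is only a lower bound for the English edition discussed above.
print(us_public_domain_year(1947))
```

A 1950s publication date would push the result correspondingly later, which is why the estimate is stated as "at the earliest".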

Broadcast free information from space![edit]

Hi all,

There's a cool event based in Uganda, but designed for remote participation, this weekend.

"Outernet" is a project to repurpose satellites to "broadcast" free information, that can be picked up by inexpensive receivers, for free, and then reshared for free over local networks/WiFi. A way to get information to remote and underserved parts of the world. It's one-way communication, so certainly not a replacement for the Internet or a total solution to the Digital Divide -- but a very cool project nonetheless. They are also developing democratic processes for deciding what content to share.

They are having an edit-a-thon this weekend. It runs for 36 continuous hours: 10am Saturday to 10pm Sunday, local time in Uganda.

Event link/signup HERE

And see their blog post

Pete (talk) 15:41, 16 July 2015 (UTC)

Transcribing Bilingual Parallel Texts on English Wikisource[edit]

Going by Multilingual texts, there has been discussion about allowing transcriptions of books that reproduce non-English texts with English translations on facing pages. Most of the bilingual pages I've found here so far have been Wikisource translations, such as this poem by Ovid. There is a page listing one of the most famous series of such books, the Loeb Classical Library, but work on importation and transcription has barely begun.

One book I'm very interested in working on is Swahili Tales which has been started on multilingual Wikisource, but doesn't seem to be currently active. I've tried working on that, but a lot of the templates normally used here on English Wikisource don't seem to work there, and when you try to edit the English-language pages you're warned that they're prohibited. Would it be a grievous breach of etiquette to set up an index file for that book and do the editing here? Does anyone have any thoughts about how to format the final version? It would be very nice to show the Swahili and English texts in parallel, possibly transcluding one page at a time in the rows of a table, or perhaps even better to define each paragraph as a section and to arrange those in parallel in a table. I've set up a sample of what a parallel text might look like here.

I do feel that at least the Swahili text should be on multilingual Wikisource, where it can be categorised with the other Swahili texts, but then the English translation in the book is a significant text in its own right (it has, for example, been translated into Japanese), and it would make an important addition to the collection of folklore texts here. Mudbringer (talk) 05:48, 19 July 2015 (UTC)

The English version, after proofreading in original location, may be transcluded in English Wikisource, by using {{Iwpages}}. A note may be provided within <mark>...</mark> on the index page that the English pages would be transcluded in the English site. That will circumvent the prohibition. Alternately, the whole work may be proofread here after setting up the index file, and then the Swahili pages transcluded in oldwikisource. The template would work there because it was imported here from that site. Hrishikes (talk) 06:03, 19 July 2015 (UTC)
Thanks for the pointer! Looking at the French documentation for {{Iwpages}} I found a bilingual text of works by Cicero on facing pages that has made good progress, in Latin and French. It looks like the procedure is to set up separate index pages for each language, so perhaps I should try to get an English index page set up for Swahili Tales. Unfortunately, it doesn't look like they're planning a facing-page presentation for the Cicero text. I tried putting a few pages into a table here, and the formats don't fit well together. Still, I think I can see the way forward. Thanks! Mudbringer (talk) 08:12, 19 July 2015 (UTC)
@Mudbringer: Two indices are not necessary. One will do fine. See The History of the Bengali Language/Appendix 1 and click Appendix II in the header portion and you will get one option. Hrishikes (talk) 08:22, 19 July 2015 (UTC)
@Hrishikes: I can't find any provision for including alternate pages (e.g. only pages 3,5,7 ...) with {{Iwpages}}, which is what I'd need to do here.
Have you tried this method ---

{{Iwtrans|lang|Page:Book.djvu/72|num=12}}
{{Iwtrans|lang|Page:Book.djvu/74|num=13}}

I have not had occasion to use it, but I guess it should work. I don't know whether exclude/include parameters and the step function for using alternate pages given at Help:Transclusion#Advanced usage will work or can be added to the template. Hrishikes (talk) 12:23, 19 July 2015 (UTC)
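If the per-page calls were to be used for every second page, typing them by hand would be tedious. The following is only a sketch of how the lines could be generated; it copies the call pattern quoted above and assumes that `num` simply increases by one while the scan page number increases by two. The function name and its arguments are hypothetical, and nothing here has been tested against the template itself.

```python
def iwtrans_lines(lang, index, first_page, first_num, count, step=2):
    """Build one {{Iwtrans}} call per selected scan page.

    `step=2` picks alternate pages (for example only the English side of a
    facing-page edition). The parameter layout mirrors the two example
    lines in the discussion and is an assumption beyond that.
    """
    lines = []
    for i in range(count):
        page = first_page + i * step   # scan page in the DjVu file
        num = first_num + i            # value passed as num=
        lines.append("{{Iwtrans|%s|Page:%s/%d|num=%d}}" % (lang, index, page, num))
    return lines


for line in iwtrans_lines("lang", "Book.djvu", 72, 12, 2):
    print(line)
```

The two lines printed are the same two calls shown in the comment above; raising `count` extends the run.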
{{iwtrans}} brings in too much, at least when I tried it here. Mudbringer (talk) 12:47, 19 July 2015 (UTC)
I see the problem now. The only thing certain to work in the current state of templates is creating a second index here, importing the pages with {{Iwpage}} and then going for normal transclusion, with the step function for alternate pages. If this function were present in {{Iwpages}}, that would simplify the matter immensely. You may seek expert opinion from George Orwell III for any other viable option. Hrishikes (talk) 02:44, 20 July 2015 (UTC)
Yes, it looks like that'll provide the most flexibility later on. Thanks a lot for your help! Mudbringer (talk) 05:24, 20 July 2015 (UTC)
Another possibility I've suggested on multilingual ws: {{tiret}} and {{tiret2}} (this one). Would it be useful for many languages? --Zyephyrus (talk) 07:34, 19 July 2015 (UTC)

I've set up an index file for the English text of Swahili Tales on en.wikisource (original index file here), proofread the first very short tale, and made a few tests towards finding a usable format for transcluding and arranging the original text and English translations in parallel. Here is a list of the tests with a few remarks about how well some of them worked. If anyone would care to look at them and leave any comments or suggestions on my talk page I'd be grateful. Mudbringer (talk) 14:07, 20 July 2015 (UTC)

I've tried this test with the {{ts}} template, adding vtp (vertical align top). Not quite satisfied with the result. Can it be of any use? --Zyephyrus (talk) 17:15, 21 July 2015 (UTC)
Yes, that's definitely an improvement to have the sections aligned at the top. Thank you! Mudbringer (talk) 04:17, 22 July 2015 (UTC)

Page:Sheet Metal Drafting.djvu/181[edit]

How to format the long division/square root calculations? I tried looking at the LaTeX wikibook and still couldn't see an easy method. ShakespeareFan00 (talk) 15:56, 20 July 2015 (UTC)

I would suggest <math> is not appropriate here and go for a more textual approach. This is not perfect but at least fairly close to what you want?
Source:
<span style="visibility:hidden;">42</span>√{{overline|487.9347}}|{{underline|22.08+}}<br/>
<span style="visibility:hidden;">42√</span>4<br/>
42|{{overline|{{underline|087}}}}<br/>
<span style="visibility:hidden;">42|0</span>84<br/>
{{underline|4408}}|{{overline|3 9347}}<br/>
<span style="visibility:hidden;">4408|</span>3 5264<br/>
<span style="visibility:hidden;">4408|</span>{{bar|5}}<br/>
<span style="visibility:hidden;">4408|3 </span>4083

Result (the leading characters on each line that the source hides are shown here only to keep the columns aligned):
42√487.9347|22.08+
42√4
42|087
42|084
4408|3 9347
4408|3 5264
4408|—————
4408|3 4083

AuFCL (talk) 21:46, 20 July 2015 (UTC)
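As a cross-check for proofreading the figures on that page, the digit-by-digit ("long division") square root can be sketched in a few lines. This is an illustration of the standard method and not something taken from the book; the function name is hypothetical. Each step is recorded as (trial divisor, working value, product, remainder).

```python
def long_division_sqrt(number: str):
    """Digit-by-digit square root of a decimal string such as "487.9347"."""
    whole, _, frac = number.partition(".")
    if len(whole) % 2:
        whole = "0" + whole          # pad so digits group into pairs
    if len(frac) % 2:
        frac = frac + "0"
    pairs = [whole[i:i + 2] for i in range(0, len(whole), 2)]
    pairs += [frac[i:i + 2] for i in range(0, len(frac), 2)]

    root, remainder, steps = 0, 0, []
    for pair in pairs:
        current = remainder * 100 + int(pair)    # bring down the next pair
        digit = 9
        while (20 * root + digit) * digit > current:
            digit -= 1                           # largest digit that fits
        divisor = 20 * root + digit              # e.g. 42, then 4408
        product = divisor * digit
        remainder = current - product
        steps.append((divisor, current, product, remainder))
        root = root * 10 + digit

    digits = str(root).rjust(len(pairs), "0")
    n_whole = len(whole) // 2
    text = digits[:n_whole] + "." + digits[n_whole:] if frac else digits
    return text, remainder, steps


root, remainder, steps = long_division_sqrt("487.9347")
print(root, remainder)
for step in steps:
    print(step)
```

For 487.9347 the recorded steps give the divisors 42 and 4408, the products 84 and 35264, and the final remainder 4083, which agree with the layout above; the non-zero remainder is what the trailing "+" after 22.08 indicates.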

Thanks :) Probably wrap that in a div and we are done. ShakespeareFan00 (talk)

I had a bit of a further muck around directly on the page but deliberately left the result as unvalidated. Proceed or back it out at your pleasure? AuFCL (talk) 22:05, 21 July 2015 (UTC)

Tech News: 2015-30[edit]

03:05, 21 July 2015 (UTC)

Two questions about Swahili Tales[edit]

I'm making some progress in proofreading Swahili Tales here and on multilingual Wikisource. There are two things I'd like to ask about at this point:

  1. On the archive.org page from which the djvu file was obtained it says "National Library of Scotland holds full rights in this digital resource and agrees to license the resource under the Creative Commons License: Attribution-Noncommercial-Share Alike 2.5 UK: Scotland". Should I edit the Wikimedia Commons file, and the Index files to reflect this?
  2. The pages of this book contain many handwritten notes by John Francis Campbell that are of great interest. For example this page has: "Monday, August 1, 1870 / Present from the Duke of Argyle. — / Read same day. Contains portions of many well known stories of which versions are in Gaelic. See notes at the end of each story. / J. F. Campbell". Would it be permissible to add a page to the Wikisource edition of this book giving transcriptions of the notes, or would it be better to produce a separate article containing them? I'm thinking a standalone article transcribing the notes would be preferable, as that would allow for a logical link from Campbell's author page, and taken as a whole they are a significant work in themselves.

Thanks for any comments or suggestions. Mudbringer (talk) 01:57, 23 July 2015 (UTC)

Commons category in {{plain sister}}[edit]

Could someone who knows Lua edit Module:Plain sister so that the Commons category is retrieved from Wikidata (d:Property:P373) if the "commonscat" parameter is not filled in?--Erasmo Barresi (talk) 11:19, 23 July 2015 (UTC)

Hi Erasmo,

Fwiw... A similar issue concerning the interaction between WikiData and template params such as those found in Plain sister was started just a few days ago and might be better to follow through there than in WS:S. Either way, I believe we'll need "outside" help when it comes to Lua scripting; I don't know of any regular contributor here that is truly fluent in Lua, to be blunt about it. -- George Orwell III (talk) 21:29, 24 July 2015 (UTC)

I should definitely go through the Lua tutorial and familiarize myself with the basics, but the fact that I am hardly the only one who's ignorant in this field kind of reassures me :) Moving to Template talk:Header.--Erasmo Barresi (talk) 18:06, 25 July 2015 (UTC)

For those who like long s[edit]

Here is a seventeenth century item for the long s lovers: Index:The Six Voyages of John Baptista Tavernier.djvu. Other than the long s, proofreading is easy, by copy-pasting from the page-wise online version of the University of Michigan. Hrishikes (talk) 13:38, 23 July 2015 (UTC)

File:William Tell Told Again.djvu[edit]

Commons about to delete (sigh) :( 16:24, 23 July 2015 (UTC)

Uploaded locally. Authors need to be checked before uploading to Commons, and P. G. Wodehouse won't be out of copyright in the EU for 30 years (1975+71 = 2046).--Prosfilaes (talk) 01:12, 24 July 2015 (UTC)

Index:Armistice Day.djvu[edit]

If someone would like to resolve the issue of the "problem scans" then this could be a Featured text for November I think. ShakespeareFan00 (talk) 16:55, 23 July 2015 (UTC)

Index:Sheet Metal Drafting.djvu[edit]

Anyone want to do a pedant check on this? Concerns were expressed that the proof-reading missed some items. ShakespeareFan00 (talk) 22:59, 23 July 2015 (UTC)

Proposal to create PNG thumbnails of static GIF images[edit]

The thumbnail of this gif is of really bad quality.
How a PNG thumb of this GIF would look like

There is a proposal at the Commons Village Pump requesting feedback about the thumbnails of static GIF images: It states that static GIF files should have their thumbnails created in PNG. The advantages of PNG over GIF would be visible especially with GIF images using an alpha channel. (compare the thumbnails on the side)

This change would affect all wikis, so if you support/oppose or want to give general feedback/concerns, please post them to the proposal page. Thank you. --McZusatz (talk) & MediaWiki message delivery (talk) 05:07, 24 July 2015 (UTC)

perhaps the graphs extension will render static renderings obsolete. Slowking4Farmbrough's revenge 02:55, 25 July 2015 (UTC)
And perhaps cynic need not comment as their views are entirely predictable? AuFCL (talk) 03:07, 25 July 2015 (UTC)

Index:1918 Engineer Notebook small.pdf Status check[edit]

According to some information at Ancestry.com the author of these notes was still alive in 1961. This means the status of the notes should be checked as it could be that it wasn't formally registered as such. ShakespeareFan00 (talk) 10:59, 26 July 2015 (UTC)

Umm, what is your point? The work is unpublished, and states as such at Commons. That puts the copyright in a completely different space, and it sounds more like it requires an OTRS permission. — billinghurst sDrewth 13:00, 26 July 2015 (UTC)
That was the concern, that it was unpublished. However, given some recent unpleasantness at Commons, I didn't want to start the Commons investigations process until it was clear it was a problem. ShakespeareFan00 (talk) 13:50, 26 July 2015 (UTC)
Do you have a link to the Ancestry source that says 1964? Because Commons says he died in 1961, and refers to this: http://www.findagrave.com/cgi-bin/fg.cgi?page=gr&GRid=90126972 which is a picture of his grave (or, of course, that of someone else with the same name). Not sure if that changes things re copyright? Also, in case it helps, here's some more info about this particular file: http://www.reddit.com/r/history/comments/16gxgp/bought_an_army_engineering_notebook_from_1918_at/ Sam Wilson ( TalkContribs ) … 11:28, 27 July 2015 (UTC)
My mistake, I'd read a 1 as a 4 on a small image. Amended. It doesn't, as far as I know, change the status if it was previously unpublished. ShakespeareFan00 (talk) 15:33, 27 July 2015 (UTC)

Full page, landscape table[edit]

Proofreading page 30, Wages_in_US_1908-1910 and I have no idea how to make such a table. Have proofread the whole page as an image. Any suggestions? Cheers, Zoeannl (talk) 02:19, 27 July 2015 (UTC)

You may take help of 1, 2, 3. Hrishikes (talk) 04:38, 27 July 2015 (UTC)
I would suggest "twisting" the table within the page so that as much text as possible (in this case all) is upright, and then formatting the resulting table in this configuration. I've made a first attempt: now somebody please pick out and fix the errors I am sure to have introduced. For starters: is that "18" on row 2, data column 8 really a "13" because that makes the percentages work? AuFCL (talk) 06:53, 27 July 2015 (UTC)

Block move request: Index:Views in India, chiefly among the Himalaya Mountains.djvu[edit]

Some missing pages were found after this had been transcribed (namely a preface and some pages of notes).

The pages that need moving are:

Range New range
12 14
13 15
14-172 18-176

Thanks ShakespeareFan00 (talk) 08:47, 27 July 2015 (UTC)
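For whoever carries out the move, the table above can be expanded into individual page moves with a short script. This is a sketch only: the function name is hypothetical, and it assumes (as is the case here) that every page shifts to a higher number, so doing the highest-numbered page first means no move lands on a page that has not itself been moved yet.

```python
def plan_moves(ranges):
    """Expand (old_first, old_last, new_first) rows into single-page moves.

    Returns (old_page, new_page) tuples ordered from the highest old page
    down, which is only safe when every offset is positive.
    """
    moves = []
    for old_first, old_last, new_first in ranges:
        offset = new_first - old_first
        for old in range(old_first, old_last + 1):
            moves.append((old, old + offset))
    moves.sort(reverse=True)   # highest old page first
    return moves


requested = [(12, 12, 14), (13, 13, 15), (14, 172, 18)]
moves = plan_moves(requested)
print(len(moves), moves[0], moves[-1])
```

The first planned move is page 172 to 176 and the last is page 12 to 14, matching the ends of the requested ranges.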

Out of Scope articles?[edit]

https://en.wikisource.org/w/index.php?title=Special:Contributions/Azylicure_14&offset=&limit=500&target=Azylicure+14

Tagged these as out of Wikisource Scope but wanted a second opinion.ShakespeareFan00 (talk) 09:11, 27 July 2015 (UTC)

This was from a sockpuppet account that was copying content and templates from project to project, and attacking and vandalizing userpages of people who called him on it. The same vandal hit Wikiquote a few days ago using multiple accounts. --EncycloPetey (talk) 04:10, 30 July 2015 (UTC)

Tech News: 2015-31[edit]

15:05, 27 July 2015 (UTC)

Need to edit a copy-protected page[edit]

The page Onward, Christian Soldiers is copy protected, but the source text exists on wikisource so the page should be redirected to The Army and Navy Hymnal/Hymns/Onward, Christian Soldiers. Beleg Tâl (talk) 15:43, 29 July 2015 (UTC)

I unprotected the page and leave it up to you to move/redirect/replace as needed with the scan-backed version. Just let us know if we need to protect anything afterwards here. -- George Orwell III (talk) 18:39, 29 July 2015 (UTC)
Thanks. I've redirected it. I don't think protecting it is necessary, as it just gets in the way of legitimate editing, but I guess if it was a highly vandalized page we'll just have to wait and see if the vandalism recurs. —Beleg Tâl (talk) 19:55, 29 July 2015 (UTC)
Also odd that the protected version included two verses that were not supported by the accompanying source text. --EncycloPetey (talk) 04:08, 30 July 2015 (UTC)
Is that odd? It came from a different place. It might be necessary to turn the original into a dab and move the old text somewhere else instead of just deleting the extra verses. (Assuming they're genuine.) — LlywelynII 10:05, 30 July 2015 (UTC)
Resurrected and moved old page. Converted base page to {{versions}} — billinghurst sDrewth 11:35, 30 July 2015 (UTC)
You should consider renaming the old page. The old page is not sheet music, and the new one is, so the disambig is incorrect at best and confusing at worst. —Beleg Tâl (talk) 13:26, 30 July 2015 (UTC)
FYI Onward Christian Soldiers is also available in the Salvation Army Songbook here Onward Christian Soldiers Songbook No. 690 --kathleen wright5 (talk) 03:29, 8 August 2015 (UTC)

EB11, vol. XXVI[edit]

Something's hinky with Volume 26. Anyone know how to fix it?

[If the problem doesn't display on your end, what I'm seeing is Error: Numeric value expected in red text instead of any of the pages. When I try clicking on individual linked pages from the djvu file's page, I can see them but there's no button forward or backward into the other pages that haven't been edited yet.] — LlywelynII 04:36, 30 July 2015 (UTC)

Good grief! This issue has been reported, and fallen into the archives pending action(? As if?) many, multiple times. Either nobody cares or nobody has the sense to mark items "not to be archived until finally addressed." Something might be done one day but for now it appears nobody has the authority or the ability to fix this issue locally. It has been established as being a system problem of scope beyond merely Commons/WikiSource. AuFCL (talk) 05:29, 30 July 2015 (UTC)
And since it does not directly affect Wikipedia, nothing will ever happen to fix the problem. At least that's my experience. So the way to get it fixed is to add broken links and faulty citations all over Wikipedia referencing the content from EB1911 until the Wikipedians start griping about it. . . I'll stop snarking now. --EncycloPetey (talk) 06:40, 30 July 2015 (UTC)
Pardon, LlywelynII if you are feeling picked upon. It is not intentional—you merely happen to be about the dozenth person to ask about this matter. Seriously, let's make this item a mini-index and leave it tagged not to be archived until such time as this particular issue is fixed or otherwise goes away?

Accordingly : See any of (please add any I've missed):

AuFCL (talk) 07:39, 30 July 2015 (UTC)
Thanks for the apology, but nah I don't feel picked on. I can understand your perspective but our EB material is going to be some of the most-used material on the entire site, so it's just something that is going to continue being a problem. Does no one know what the issue is? or we do and we just have to wait for the WikiMedia code monkeys to get around to that particular typewriter?
(And actually there was a complaint I made somewhere about a similar problem in the EB9 and it actually did get fairly promptly addressed so I was assuming it might be something easy.) — LlywelynII 08:32, 30 July 2015 (UTC)
Looking at this conversation, it looks like there's some problem with large numbers of text chunks in the scan? Couldn't we just cut the .djvu file into two pieces? — LlywelynII 10:15, 30 July 2015 (UTC)
@Llywelyn: The issue (as I read it) is that the <pagelist> componentry calls the API of the djvu file for the number of pages, and what it is bringing back is not in a format it comprehends (presuming that it is an error message rather than a number), such that the proofreadpages API spits out that error message. So it fails for the full page span, and it fails for a partial list (I tested.)

With regard to the commentary, if we are wanting to get work done, sometimes we have to be the squeaky wheel, and if we don't make our needs obvious, and clearly state the problem and the effect, then it often won't get traction. What we had on the phabricator ticket about the issue is not enough to get anyone's interest in it being a specific issue that needs speedy resolution; it gives no indication of the size or impact. Phabricator is the avenue to the developers, and lots of foot traffic, votes, and helpful noise across a ticket will bring it to attention. — billinghurst sDrewth 11:13, 30 July 2015 (UTC)

having made the other 26 volumes match and split ready, i’ve been mulling copying over all the articles in vols 26 & 27, from IA ocr. the side by side could be stitched later. (the articles in the volume would be findable in a search and linkable from WP). Slowking4Farmbrough's revenge 23:16, 2 August 2015 (UTC)

Index:French Polynesia.pdf

No file, deleted at commons? ShakespeareFan00 (talk) 11:56, 30 July 2015 (UTC)

It was deleted over there back on 11th June this year by INeverCry (who, worryingly, seems to have been blocked a month later and may not even be an administrator any more?); at least according to c:Commons:Deletion requests/File:French Polynesia.pdf and selected commons logs. AuFCL (talk) 13:00, 30 July 2015 (UTC)

What does a Healthy Community look like to you?


Hi,
The Community Engagement department at the Wikimedia Foundation has launched a new learning campaign. The WMF wants to record community impressions about what makes a healthy online community. Share your views and/or create a drawing and take a chance to win a Wikimania 2016 scholarship! Join the WMF as we begin a conversation about Community Health. Contribute a drawing or answer the questions on the campaign's page.

Why get involved?

The world is changing. The way we relate to knowledge is transforming. As the next billion people come online, the Wikimedia movement is working to bring more users to the wiki projects. The way we interact and collaborate online is key to building sustainable projects. How accessible are Wikimedia projects to newcomers today? Are we helping each other learn?
Share your views on this matter that affects us all!
We invite everyone to take part in this learning campaign. Wikimedia Foundation will distribute one Wikimania Scholarship 2016 among those participants who are eligible.

More information


Happy editing!

MediaWiki message delivery (talk) 23:42, 31 July 2015 (UTC)

Tech News: 2015-32

15:51, 3 August 2015 (UTC)

quote before dropped initial

I have tried a previously provided solution to format a quote mark before a dropped initial but it hasn't worked for me on this page. Any suggestions? — Zoeannl (talk) 03:50, 5 August 2015 (UTC)

Done @Zoeannl: Just "float left" the quote, not the "dropinitial" too, which has its own formatting to push it left. Guidance at Template:Dropinitial describes the options, depending on your desired output — billinghurst sDrewth 04:07, 5 August 2015 (UTC)

Future of Authority control on English Wikisource

Hello everyone,

I am requesting the bot flag for my bot (KasparBot) at the moment. According to Wikidata:WikiProject Authority control/Status the bot will copy authority control information from this wiki to Wikidata and clean up the template (stage 2): {{Authority control|...}} → {{Authority control}}. You can inspect the assumed edits here. The second task includes embedding blank templates in those articles which have authority control information on Wikidata but not on enwikisource (stage 5). The bot runs on enwiki, frwiki and multiple other wikis too. The final aim is to deprecate the local parameters of the template and use Wikidata information by default (stage 4). This would remove differences and improve the consistency of Authority control information on all wikis. Please comment on this proposal. I need a community consensus to run the bot. Thank you very much.

Regards, -- T.seppelt (talk) 06:33, 5 August 2015 (UTC)
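The stage 2 cleanup described above can be pictured with a small sketch. This is only an illustration, not KasparBot's actual code: the function names and the regular expression are assumptions, and it only handles template calls that contain no nested templates.

```python
import re

# Assumption: the call contains no nested templates or links with braces.
AC_TEMPLATE = re.compile(r"\{\{\s*[Aa]uthority control\s*(\|[^{}]*)?\}\}")


def extract_ac_params(wikitext):
    """Return the local parameters of the first Authority control call as a dict."""
    match = AC_TEMPLATE.search(wikitext)
    if match is None or match.group(1) is None:
        return {}
    params = {}
    # group(1) starts with "|", so the first split element is empty and skipped.
    for part in match.group(1).split("|")[1:]:
        name, sep, value = part.partition("=")
        if sep and value.strip():
            params[name.strip()] = value.strip()
    return params


def strip_ac_params(wikitext):
    """Replace every parameterised call with the bare {{Authority control}}."""
    return AC_TEMPLATE.sub("{{Authority control}}", wikitext)
```

In this sketch the parameters would be read first (so they can be compared with Wikidata) and the template stripped only afterwards; unnamed placeholders such as `$1` are ignored by `extract_ac_params`.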

I would welcome the cleanup of authority control by bot, rather than my slow manual processes, and will check out your example edits. I believe that we have cleaned up those author pages where there is information discrepancy, so would think that there is probably a quick and easy job.

Rather than the addition of the authority control template to each author page, I would think that we should be better considering the automatic addition (embedding to base?) of the template to our existing configuration for author pages. I would much rather that we looked to have it applied to the base of all top level author pages (not to subpages) so it never has to be fussed about in addition ever, it is just there. — billinghurst sDrewth 06:57, 5 August 2015 (UTC)

That would be in my eyes the best solution. (we should think about something like this as stage 6..) I could also help with removing the entire template after checking for differences. Regards, --T.seppelt (talk) 07:20, 5 August 2015 (UTC)
1) I also would welcome the cleanup of authority control by bot, rather than my slow manual processes, and will check out your example edits:) We have indeed cleaned up those author pages with discrepancies, but only with respect to viaf, there might be quite a few GND, LCCN, etc to fix. And those values are not necessarily correct ones, they might have been added by the gadget when viaf was added.
2) there is an issue with cleaning up the template: the existing gadget "add authority control" is not working properly if there is a naked {{Authority control}} template on the page. This is why there are quite a few appearances of "{{Authority control|$1}}".
3) I also think that authority control should be a part of author template (though, it's a separate discussion:)
To sum up, I support the proposal. Cheers, Captain Nemo (talk) 07:59, 5 August 2015 (UTC).
The gadget should just be killed, it is well superfluous and pre-dates WD. We now have the means to identify when and where the AC are, and when they are not local, and if we secrete it to every author page, it will pick up the data; and if there is no corresponding page at WD, we should note and resolve that separately. — billinghurst sDrewth 12:28, 5 August 2015 (UTC)
The proposal is not limited to doing Authors pages, and for that reason I oppose this. I've seen how often the data are added incorrectly at Wikidata for our works, or a new data item is started incorrectly there, or a mismatch between items was made. The identification and matching between works cannot be handled by bot. There are just too many errors, and too many additional interwiki linking problems when it comes to dealing with works and editions. --EncycloPetey (talk) 23:28, 5 August 2015 (UTC)
@EncycloPetey: then how are you with limiting this to Author/Portal namespace where we predominantly have AC in place? — billinghurst sDrewth 23:53, 5 August 2015 (UTC)
Follow-up. To note that we have less than 250 works with authority control in the main namespace which is a manageable number to handle manually. If the detail added at WD is problematic for works, then we should be looking to review and provide that feedback to WD as part of their quality assurance processes. — billinghurst sDrewth 23:58, 5 August 2015 (UTC)
If only it were that simple. I've looked at the guide page they have for "books". The talk page is a long running list of unresolved issues. Issues are being raised there, but they're not being resolved. Even for individual works that I've done there, I've sometimes had to go back and forth on conversation with two or more people about issues for which there is no provision and no common-sense solution in place either. --EncycloPetey (talk) 00:34, 6 August 2015 (UTC).
Agree w/EncycloPetey. And the larger problem here is we never really did a good job of tracking edition info. IMO, the year parameter should have been associated with the publisher (& its location/nation/city) rather than the title, work or author. That combination of Year & Publisher would more correctly dictate the print run of any given edition regardless of translations, etc. What we have now makes it too hard for a bot or script to separate a work from its editions and/or the author/editor and so on. Unfortunately, the only way to ensure accuracy at the work vs. edition level without the Publisher part would be to do it manually (pretty much the way we've been dealing with the edition nuance since I first landed here). -- George Orwell III (talk) 00:00, 6 August 2015 (UTC)
Wait a second, that argument is starting to "throw the baby out with the bathwater" with just a blanket NO statement. Firstly, 1) the first part of the proposal is to put our AC information against our WD data that was taken from our site, so that seems like a good thing. If we do that, why would we still want local parameters? If we have concerns that others are not taking the same stringency with WD additions then that is about us controlling the addition of AC template addition here. So we modify the proposal that we don't have the bot add AC templates here, and that seems like a wise choice, we can manage that ourselves by our processes, and manage and review its addition. — billinghurst sDrewth 00:19, 6 August 2015 (UTC)
"Seems" on the face of the proposal perhaps, but definitely NOT in the light of what I know to be the case in actual fact. I've been trying to clean up just the two dozen surviving dramatical works from ancient Greece and have run into all sorts of mismatch problems and errors. The concept "seems" fine in theory, but in fact it is too flawed for me to support it. And, yes, even the data taken from our own site is wrong, as it is from other Wikisource projects. And the problem is not just the addition of data here; I have seen multiple cases where bots were used to generate incorrect items on Wikidata from information here. Damage is already being done. --EncycloPetey (talk) 00:28, 6 August 2015 (UTC)
Then you are saying, don't trust our main namespace data. If that is the case, why don't we kill it all and start afresh. We should then monitor the addition of the template locally, and ensure that the data has been appropriately added at WD, and any data here is pushed to WD. Or, we migrate to a verified template for AC signed by the verifier, similar to how Commons manages Flickr imports. — billinghurst sDrewth 00:45, 6 August 2015 (UTC)
Let me clarify - as for the AC template in Author: namespace; I would tend to agree that our current info has been through enough vetting to "hand over" to Wikidata. That doesn't mean it's perfect by any measure either.

On the other hand, the use and/or exportation of AC info associated primarily with anything in the main or translation namespace is premature at best. That info is just plain not ready for Wikidata by any stretch -- in my opinion -- because our current repository of information for those namespaces lacks the more accurate publisher[-city]-year-edition relationship. That's more of an [Index: to] Header template deficiency rather than an AC template issue or bot problem. -- George Orwell III (talk) 00:48, 6 August 2015 (UTC)

On top of that, Wikidata items do not match with ours (and will not match) for individual works. For example, the English Wikipedia has an article about Shakespeare's Hamlet, which has a matching data item on Wikidata, and all the other Wikipedias have their articles listed there as well. Wikiquote too. However, Wikisource has a particular edition of Hamlet, or several such editions. By current proposed norms on Wikidata, each edition must have its own separate data item, so there is not a direct relationship between any copy of Hamlet here and the general Wikidata item. What must happen is for Wikisource to create a general page for Hamlet where all possible current and future editions will be listed, and that page will be added at Wikidata. Any individual editions we have must become separate data items at Wikisource, with all the information specific to that edition. Then the item for that edition is identified as an edition on the general Wikidata item, and the general item gets a link to the instance of each edition. The same holds true for any translation of a work from the original language, so the eventual result at Wikidata is that every Wikisource work listed anywhere on any Wikisource project will have its own individual data item, with its own specific set of bibliographical info. This edition has a different year, publisher, ISBN, etc? Then it goes on a new data item, and all the AC info has to be tracked down anew.

That is most certainly not how any of the Wikisource projects are currently set up.

This also creates the problem that there will be no wikilinks to, from, or between any works on any Wikisource. The general page with all the Wikipedia links will be separate from the First Folio edition of Hamlet, which will be separate from the Quarto, which will be separate from any later editions, from the French translation, from the Spanish translation, etc. So the ultimate result of the current scheme will necessarily remove all wikilinks between Wikisource projects for individual works.

So, billinghurst, it's not just the main namespace data that's wrong on Wikisource, but rather the entire structure of our main namespace has to change in order to connect properly with Wikidata.

--EncycloPetey (talk) 01:07, 6 August 2015 (UTC)
You and I both have been vocal about the last para (me on wikisource-l, and you at WD), and I would prefer to separate that discussion away from the bot discussion. It is a discussion that needs to happen, but can we please separate it.

I believe that we can agree that at this point of time, that the migration of enWS AC data from the main ns should not occur, and in fact the issue of AC data in mainspace is complex and problematic primarily in association with an edition. Accordingly, I propose that we would recommend to the bot operator (first redraft)

  • to not take enWS main ns AC data to WD (veracity in doubt)
  • to not undertake AC changes by bot to the main ns
  • that the translation ns should be off-limits as it is Wikisource's transcription area, and should actually be devoid of any valid AC data at this point of time
  • that the Author and Portal ns should be able to be migrated, though there will be some errors as no validation has been undertaken
  • that following completion of migration of data the bot operator recheck with this community whether the existing AC templates in the Author and Portal ns should be updated or removed, pending any discussion that we undertake about a suitable replacement strategy.
  • that there is an indication whether this is a one-off proposal to migrate data, or whether it will be a regular check and update process for the bot
Thoughts? — billinghurst sDrewth 01:29, 6 August 2015 (UTC)

┌───────────────────┘
May I present a radical perspective?

Once I trusted Author: pages on enWS, if only because if I happened to spot a flaw to which I knew the correction I could so apply said correction, and take the personal consequences if I introduced an error.

Then along came WD, and that was O.K. because only the language modified (I could edit a "claim" over there instead of wikitext here to fix a fault; and some kind of robotic synchronisation between the systems restored consistency) but the functionality remained largely similar.

NOW you are suggesting all changes must be propagated back from WD (which along the road has modified the rules and I am no longer allowed to modify in ways which previously were not barred) to enWS, and enWS modifications are to be isolated/ignored or perhaps in the near future not permitted either? Slice-by-insidious slice the ability to pay even the shabbiest lip-service to the concept of "the X that anyone can edit" has been stripped until this last step leaves the basic concept in ruins. Is it too high a price to pay to eliminate the core mission? I ask why did it come to this at all?

I must side with EncycloPetey above in opposing this in the proposed form. Either that or I have entirely misunderstood the flow of development and would very much appreciate being reassured that my interpolation of feared future changes is incorrect. AuFCL (talk) 01:35, 6 August 2015 (UTC)

Re: billinghurst: The Author namespace is the only one that the bot has been tested for, and the only one where we've had serious and methodical vetting by experienced users. Thus, the Authors namespace is the only one I think could actually benefit at this time from AC data transfer (in either direction). However, this limitation is strongly at odds with the original bot proposal, and with the response I got when I queried the bot operator about it. So, I would want to hear from the bot operator about this, and be sure he understands the nature and severity of the issue for the other namespaces. (I am of the opinion that no one who properly understands Wikisource is proposing to run a bot. We had the same issue on Wiktionary when Wikidata proposed housing the dictionary information, but that issue is even more deeply challenging, in ways that I won't try to elaborate upon here.) --EncycloPetey (talk) 01:43, 6 August 2015 (UTC)
Continue to concur with EP; the Author namespace is the only namespace where enough iterations of review by us as well as by WD have taken place to be considered remotely accurate. That still won't do us much good if any non-English WS doesn't closely follow our approach and templates to the Author namespace not to mention conflicting info between the domains for any given Author.

And I agree picking up the editions -to- works discussion should take place besides this narrower one. -- George Orwell III (talk) 01:51, 6 August 2015 (UTC)

Re: AuFCL: I share your concerns, and am uncertain towards what direction future development will lead. However, I do see one strong advantage for Wikisource integration with Wikidata: greater accessibility to our resources, and thus a stronger internet presence. FreeBase now pulls directly from Wikidata for searches, and the addition of LoC and other control information will permit users of those libraries and databases to access our works through searches from those institutions' databases. We do more good for the global internet community with that kind of accessibility. --EncycloPetey (talk) 01:51, 6 August 2015 (UTC)
Re portal namespace, there are 49; we can move them manually if you want, I don't mind, though I believe that the users who added them were competent. If it is only the Author: ns, fine with me, it contains >> 99% of our AC additions.
Re AuFCL's comments. Absolutely not, that is melodramatic scaremongering. We control our templates and how we code and use them, each of them has the ability to override whatever data is in WD, and I have made no proposal to change that. Each of us has the ability to edit WD and so manage the data that appears at our site, but yes, it is off-WS. The issue is that our data is static and UNCHECKED, so when a page moves at a sister wiki, it is not updated here; when it is moved at a sister WS, it is not updated here. As new authorities are added or are corrected or merged they won't appear here. Certainly there are risks with big data, but we manage our risks, and get involved in the process, and be rigorous, we don't undertake the ostrich manoeuvre. It is a value proposition and we have an open discussion about what we do and how. WD has had issue with data, and they are very much addressing that, and part of that is taking our data into their system. Changes should be reviewed, and we have a good community to do that, and we should ask for good processes to maintain the integrity of data. There are also benefits, and they should also be part of the discussion and how we measure the value of a proposal. — billinghurst sDrewth 02:48, 6 August 2015 (UTC)
I have no intention of prosecuting this further. Melodramatic or otherwise, that is my honest take upon the apparent trend of changes and I await reassurance or vindication, neither of which have been provided by statements made so far. And yes, I accept this is an argument which should take place elsewhere. AuFCL (talk) 02:58, 6 August 2015 (UTC)
Okay, so there is a (narrowed down) proposal that pushes our Author control authority data to WD where it doesn't exist there. [I have already been through and cleaned up instances of where we have had data mismatches based on the VIAF.] Then the proposal is to display that data here by use of the WD call using the template. For existing authors there should be negligible difference in data [noting that we already have WD populated AC templates, partially or fully.] What are your specific issues with the specific proposal as it sits, and what risks do you see that we would need to manage? — billinghurst sDrewth 03:23, 6 August 2015 (UTC)
With regard to VIAF, do you find multiple VIAF identifiers for the same item much? I have been working through the 40 or so surviving ancient Greek plays, adding authority control data from LoC, BnF, and VIAF. More often than not, the VIAF has two or even three identifiers for the same Greek play--most of which appear to be duplicates as a result of not synchronizing the import of BnF data into the mix (the oddball identifier is usually French, or includes the French listing). Have you found this to be the case? And at some point we'll have to consider how our authority control template will deal with multiple VIAF values. In the immediate case, the question is whether the bot can handle such a situation, namely, that there is a VIAF identifier mismatch, but neither is actually wrong and both should be used. --EncycloPetey (talk) 22:36, 6 August 2015 (UTC)
[For the works that I do, I struggle to find them in VIAF in the first place.] For the authors, I believe that there has been a big effort to merge and resolve duplicates, and how much of that has been due to Wikidata and public collaboration would be interesting to know. So I would suggest that you add the multiples and then set a "preferred" ranking for the one that will continue, or set the unpreferred to "deprecated". I have seen that where VIAF has amalgamated data that they have now created their own system of redirects for the old numbers, so I am hoping that WD will have a system to automate updates and additions either via the VIAF identifier, or via the individual repositories pushing data. [one day!] — billinghurst sDrewth 00:09, 7 August 2015 (UTC)
Unfortunately, when there is more than one VIAF identifier for a work, I do not know which identifier will be the one that continues. The split is sometimes half-and-half, or a three-way split of libraries between identifiers. As I say, I've found this to be the case more often than not (and for well-known and long-standing works of literature such as the ancient Greek dramas!). If I knew someone I could contact who handles VIAF data, and could say "all these have been found to be duplicates", then I would do so. But for now, this state of affairs bodes ill for trying to coordinate VIAF information pertaining to works of literature either at wikidata or here, since I expect the same situation will apply to the VIAF identifiers of other works. What I have tried to do is to include the LoC Authority and the code from the BnF in the Wikidata item, so that users (who may only receive only one of several VIAF identifiers through the template) can still access the French and US listings. I tried doing the GND as well, but the German library data was not as easy to figure out, even when I had the VIAF listing already at hand. --EncycloPetey (talk) 00:38, 7 August 2015 (UTC)
The template will select and display one, and if they are equivalent (and not set separately), then it is not particularly an issue. The VIAF data will update and be resolved and/or the predominant will be set. In my experience, the lowest number is retained for VIAF data, presumably as it is the oldest record, but in the end it doesn't matter, due to the redirects that are implemented, so as long as we are not selecting the wrong work/link, all should be good. — billinghurst sDrewth 01:24, 7 August 2015 (UTC)
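The "preferred" and "deprecated" rankings mentioned above can be illustrated with a short sketch of how a template or module might pick a single VIAF value when an item carries several. The rank names are Wikidata's; the function names and the (value, rank) data shape are assumptions made for this example, and the identifiers in the usage are made up.

```python
def best_values(statements):
    """Return the values a reader should see, given (value, rank) pairs.

    Preferred statements win if any exist; otherwise normal ones are used.
    Deprecated statements are never returned.
    """
    preferred = [value for value, rank in statements if rank == "preferred"]
    if preferred:
        return preferred
    return [value for value, rank in statements if rank == "normal"]


def display_value(statements):
    """Pick the single value to display, or None if nothing usable exists."""
    best = best_values(statements)
    return best[0] if best else None


# Example with made-up identifiers: one cluster marked preferred, one deprecated.
viaf = [("100000001", "deprecated"), ("100000002", "preferred"), ("100000003", "normal")]
print(display_value(viaf))
```

If neither duplicate is marked preferred, this sketch simply shows the first normal-rank value, which matches the remark that equivalent identifiers are not particularly an issue as long as the redirects hold.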
@Billinghurst:, @EncycloPetey:, @George Orwell III:, @AuFCL: Some urban legends seem to be created in this discussion:) So, some hard facts about the relationship between VIAF and Wikidata. Wikidata now is the official partner of VIAF: see here. What that means: 1) wikidata item about a person is the official part of the VIAF cluster; 2) hard data such as birth and death date is sought and used in VIAF cluster; 3) VIAF data from official VIAF partners such as LC, BNF, DNB, etc is fed back to wikidata; 4) the flow of data between viaf and wikidata is done on a regular basis. That means that wikidata has the same up-to-date data as viaf and all its official partners. That means that wikidata data about a person is in no way inferior to any constituent part of viaf including LC, BNF, DNB and so on and so forth. Storing this data locally (at wikisource) means that the data is not updated. If you keep an eye on category:viaf different at wikidata you will see that all differences are due to wikisource having deprecated viaf data (I myself have deleted dozens of deprecated viafs from wikisource in the past month or so!). So, I hope, this discussion can be concluded to everybody's satisfaction.
At the same time, I fully agree that the situation is very different in the main namespace. At least for the time being I think we should manage this on our own, locally. Cheers, Captain Nemo (talk) 08:58, 11 August 2015 (UTC).
Further detail about VIAF and Wikidata interaction can be found here. This is blog by Thom Hickey, who is the chief scientist at OCLC. So, please, no more about accuracy of viaf partner data in wikidata:) Cheers, Captain Nemo (talk) 09:15, 11 August 2015 (UTC).
I unconditionally withdraw my opposition, as my fundamental objection has been proven to rest upon a misunderstanding which I shall not detail in case taken out of context might add to any confusion. (Happy to discuss out of stream should any one happen to be curious.) Accordingly I also strike my comments above. AuFCL (talk) 10:26, 11 August 2015 (UTC)
@Captain Nemo: "wikidata has the same up-to-date data as viaf and all its official partners" is overstating matters. In many cases I've found, wikidata is altogether missing links to VIAF or any library partners of VIAF. --EncycloPetey (talk) 13:50, 11 August 2015 (UTC)
@EncycloPetey: 1) Are you sure you are not confusing your past experiences with the current state of things? If you have a look at the links in my prev. post you will see that the integration started in April 2015. Any past grievances are just private history. 2) If you have in mind situations i) when viaf has multiple clusters for one identity or ii) viaf bundles several people in one cluster, then integration is exactly the way to help to resolve it. Wikidata regularly creates data base dumps for these cases and they are being resolved both by viaf and on wikidata. One recent example I am aware of is viaf clusters related to popes. Those chaps used to have several viaf clusters coming from LC and DNB which are now (mostly) merged. I know of this because I cleaned up many of those stale viaf here on wikisource; there would be no need for that manual work if our authority control template just imported fresh data from wikidata. 3) If you are referring to different spelling and birth/death years then again, viaf now actively seeks and imports this data from wikidata. I don't know exactly how viaf does that but this matching is not a trivial task so it takes time. 4) If you are concerned with something else, please give specific factual examples and I will try to clarify. Cheers, Captain Nemo (talk) 02:01, 12 August 2015 (UTC).
@Captain Nemo: 1) I am raising concerns with the information on Wikidata that I encountered while editing in the past week. So, yes, this is a problem in the current state of affairs on Wikidata. (2) Yes, multiple VIAF for a single entity, often two or three. (3) No, spelling and dates have nothing to do with this. (4) I meant exactly what I said. I spent time last week going over the information for a small group of related data items. I found that most had no data for VIAF, LoC, or BnF. Where they had VIAF data, there were usually additional VIAF identifiers missing because of redundancy in the VIAF data for the same entity. --EncycloPetey (talk) 03:56, 12 August 2015 (UTC)
@EncycloPetey: I had a look at your edits for August on wikidata and failed to find any edit of the type you mention. Please note (maybe it wasn't articulated clearly before) that my statements are about the author namespace only. The plays and their characters and so on are a completely different issue; viaf is not doing those at the moment. If you are referring to wikidata items that you have not edited, please give specific examples and I will try to clarify. Cheers, Captain Nemo (talk) 05:55, 12 August 2015 (UTC).
@Captain Nemo: When you say "viaf is not doing those", I assume you mean that they are not cleaning up redundancy or errors. Sorry, but I did not understand that you were limiting your discussion of Wikidata and VIAF to author information only. With regard to authors, I have not seen any problems on Wikidata that would affect information we use on our site, but I have seen (in the past few days) an experienced Wikidata (and MW) editor who added grossly incorrect information to Author pages. It was the result of adding information that he did not understand himself, and had not bothered to source or check, which was rather ironic given that he was, at the same time, involved in a site discussion about sourcing and verifying claims added to data items. --EncycloPetey (talk) 16:54, 12 August 2015 (UTC)

Hello again,

I would like to reach a consensus on this topic. Please state with Oppose or Support if you agree with this proposal or not. My bot would do this:

  1. fetch entries of Category:Pages using authority control with parameters in namespace 102 (Author)
  2. compare the AC information to Wikidata
    • add claims if no information is available on Wikidata
    • keep the information on Wikisource in case of differences, malformed values etc. and skip step 3
  3. remove the local information from Wikisource
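The three steps above can be sketched as a per-page decision function. This is a hedged illustration rather than the bot's real logic: `plan_edit`, `is_well_formed`, the digits-only VIAF check, and the choice to still add missing claims on a page that also has a conflict are all assumptions made for the example.

```python
def is_well_formed(prop, value):
    # Assumption for illustration: VIAF identifiers are purely numeric;
    # other properties are only checked for being non-empty.
    if prop == "VIAF":
        return value.isdigit()
    return bool(value.strip())


def plan_edit(local, wikidata):
    """Decide what to do for one Author page.

    local:    property name -> value taken from the local template
    wikidata: property name -> set of values already claimed on the item
    Returns (claims_to_add, can_strip_local).
    """
    claims_to_add = {}
    keep_local = False
    for prop, value in local.items():
        if not is_well_formed(prop, value):
            keep_local = True              # malformed value: skip step 3
            continue
        existing = wikidata.get(prop, set())
        if not existing:
            claims_to_add[prop] = value    # step 2, first bullet: add the claim
        elif value not in existing:
            keep_local = True              # step 2, second bullet: values differ
    return claims_to_add, not keep_local


# A page whose VIAF is missing on Wikidata: add it, then strip the local copy.
print(plan_edit({"VIAF": "12345"}, {}))
```

Pages for which `can_strip_local` is false would stay untouched locally, which corresponds to the "malformed value" and "different value on wikidata" rows in the status counts.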

Please also be aware that you can inspect all future edits on https://tools.wmflabs.org/kasparbot/ac.php?select-project-enwikisource=1. All faulty pages will be available there. The current situation is:

count   state
13557   template can be replaced
1023    malformed value
285     different value on wikidata
24      unknown template property
5       technical problems
1       more than one template embedded

I would like to close this RfC on 31st of September. -- T.seppelt (talk) 08:59, 17 August 2015 (UTC)

  1. Support as proposer -- T.seppelt (talk) 08:59, 17 August 2015 (UTC)
  2. Support -- Captain Nemo (talk) 11:55, 17 August 2015 (UTC).
  3. support. I have reviewed the first 1400 listed entries and would just say that the non-VIAF data is not worth transferring or retaining, as it is of little value, and we should just strip the author:ns AC templates of all parameters. Though the 5 would be worth manually eyeballing. — billinghurst sDrewth 14:25, 17 August 2015 (UTC)

Removed Authority control/VIAF gadget

I have removed the VIAF gadget from local display. Magnus has developed an alternate version at Wikidata that is more suited to needs. You can add it to your common.js at Wikidata, or you can add it to your global.js file at meta. If you do edit Wikidata and wish to add authority control data and are stuck, then please talk to me and I can assist. — billinghurst sDrewth 23:26, 16 August 2015 (UTC)

Author template now will pull image from Wikidata if not specifically chosen

I have tested and modified {{author}}. I have added the ability for the template to directly access the image data stored at wikidata if it is available. The specific use of the parameter in the template will override Wikidata choice, so the only difference noted will be that there will be some author pages with images without a parameter call. Once this has the approval of the community I will look to remove the empty parameter field from active templates, then we can talk about how we progress with comparing wikidata and how to populate where our data is not in Wikidata, etc. — billinghurst sDrewth 14:13, 5 August 2015 (UTC)
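The override behaviour described above amounts to a simple fallback rule, sketched here for clarity. It is not the code of {{author}} or Module:Wikidata: the function name is hypothetical, and taking the first Wikidata image when several exist is an assumption of this example.

```python
def choose_image(local_image, wikidata_images):
    """Pick the portrait for an author page.

    local_image:     value of the template's image parameter (may be None or blank)
    wikidata_images: file names stored on the Wikidata item, possibly empty
    """
    # An explicitly set local parameter always wins over Wikidata.
    if local_image and local_image.strip():
        return local_image.strip()
    # Otherwise fall back to Wikidata; using the first entry is an assumption.
    if wikidata_images:
        return wikidata_images[0]
    return None


# Blank local parameter, so the Wikidata image is used.
print(choose_image("", ["Example portrait.jpg"]))
```

Under this rule an empty parameter behaves the same as an absent one, so removing the empty field from existing pages would not change what is displayed.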

/me semi-struts. 1,596 total and still populating in Category:Author_pages_with_Wikidata_image. There may be issues if there are multiple images against an author, so we will have to manually override for the moment. In such situations we need to set a preferred image at WD and get Module:Wikidata updated to provide a single preferred result. — billinghurst sDrewth 14:22, 5 August 2015 (UTC)
Only 5 Author pages with [image?] script errors at this point. Tracked in Category:Pages with script errors. -- George Orwell III (talk) 00:06, 6 August 2015 (UTC)
Nice GOIII, and great that we already have a system to catch those errors, I was wondering how I was going to find them. — billinghurst sDrewth 00:22, 6 August 2015 (UTC)
Fixed the errors, all due to the pages not having an instance set at WD. — billinghurst sDrewth 00:33, 6 August 2015 (UTC)
@George Orwell III: Can you think of a means to track where we have Author pages with red file links? Be they failed manual additions, file removals or by my update to the template. — billinghurst sDrewth 00:58, 6 August 2015 (UTC)
Off-hand I'd say 'not easily'. The section Headings seem consistent enough to come up with a group of them to probe through with some reliability but the entries themselves are not always listed the same nor use the same syntax. If every entry were template-based, syntax, etc. wouldn't matter because we could poll the template parameters themselves to track what main or translation space works exist or are merely listed (red links). I'm not saying it can't be done but a deeper look on my part will have to wait until the weekend. -- George Orwell III (talk) 01:13, 6 August 2015 (UTC)
Additional thought: seems like the possibilities of doing something like that increases if going by what the What Leaves Here? gadget currently provides. -- George Orwell III (talk) 01:26, 6 August 2015 (UTC)
Hold on a sec -- did you mean red-links to the File: namespace for missing author portrait images or red links to works in the main or translation namespace? -- George Orwell III (talk) 01:29, 6 August 2015 (UTC)
File namespace only at this point of time. I was thinking that we may be able to run a sql query from quarry:? — billinghurst sDrewth 01:32, 6 August 2015 (UTC)
Now I follow you. I'd think there would be a way but not sure how. Isn't it just a matter of determining whether or not the red-link contains something like &action=upload ? Isn't that what happens when you click on a File: link to a non-hosted file? -- George Orwell III (talk) 01:37, 6 August 2015 (UTC)
@John Vandenberg, Pathoschild: can either of you help here? — billinghurst sDrewth 03:13, 6 August 2015 (UTC)
Unless I am misunderstanding, Category:Pages with missing files is what you are looking for.--Erasmo Barresi (talk) 17:13, 6 August 2015 (UTC)
<facepalm> I was thinking something namespace specific, and losing the trees for the forest. I will fix up those images over the weekend. Fixed Thanks. — billinghurst 13:24, 7 August 2015 (UTC)
Just a not-so-important remark: When choosing which image to rank as preferred, I base my decision on which image most featured Wikipedia articles have in their leads. See for example d:Q187982.--Erasmo Barresi (talk) 17:22, 6 August 2015 (UTC)
I have already had this argument discussion with WD about 'preferred rank' and image display, and the (easy) ability to return one result rather than many. To me the "preferred" will always be the one that best represents the person, whereas for what I choose to display at enWS will be what best represents them as the author, which for me will mean that I love to show any caricature if there is one available. It is exactly the argument over "preferred" that will always allow us to have choice in the header parameter, but call a default if nothing is chosen. — billinghurst sDrewth 23:54, 6 August 2015 (UTC)
Another issue I've seen that illustrates why "most common use" does not always offer the best choice. I've seen instances where there is a play and a sculpture with the same title, and so there is a disambiguation page to help users distinguish the two. But then the lead image for the play is an image of the statue, which visually negates the effort made to disambiguate the two items. And in many cases, I've found that the most commonly used image is used solely because it was used first, the original article propagated through translation to many wikipedias, and the lesser used image is a newer higher-quality image that is just beginning to be used. Again, in this situation, the more common usage in no way indicates a better selection or preference. --EncycloPetey (talk) 00:46, 7 August 2015 (UTC)
Right, EncycloPetey. My approach could work as a rule of thumb rather than a hard and fast one.--Erasmo Barresi (talk) 08:55, 7 August 2015 (UTC)

Wikisource celebrates Public Domain Day?[edit]

I have recently started the 2016 in public domain page on English Wikipedia and started a discussion on the Wikimedia UK list. I think some Public Domain Day events in January would be a good way of making more people aware of Wikisource and would offer the opportunity to provide support for newcomers. It would be great to hear other people's views. Leutha (talk) 10:04, 8 August 2015 (UTC)

There's not really much for us, since we use US copyright law, and nothing published before 2002 is going to go into the public domain until 2019-01-01. Unpublished or recently published material by all those authors dead 70 years is now PD, but that usually only has interest to scholars.--Prosfilaes (talk) 20:38, 9 August 2015 (UTC)
in the US, the open access librarians tend to blog around January 1. [34] maybe some networking building on wikiloves libraries might bear fruit. Slowking4Farmbrough's revenge 01:24, 10 August 2015 (UTC)

Scanned books from the Archaeological Survey of India[edit]

A good collection is available here, covering subjects from around the world on a multitude of topics. Hrishikes (talk) 05:55, 10 August 2015 (UTC)

@Hrishikes: Be sure to add it to Wikisource:Sources. — billinghurst sDrewth 10:16, 10 August 2015 (UTC)
Yes check.svg Done Hrishikes (talk) 11:26, 10 August 2015 (UTC)

Upload to Commons: an innovation[edit]

Recently I have encountered an obstacle to Commons uploading; even when the file is below 100 MB (but contains a lot of pages), Commons says that the entity is too large when attempting importation from IA and shows internal API error when attempting direct upload. I have bypassed the objection by taking a roundabout route: I first uploaded a very small file with the desired file name and once Commons accepted it, I uploaded the full file as a newer version to the previously uploaded file. This time, uploads were successful. Examples: Index:Debrett's Peerage, Baronetage, Knightage and Companionage.djvu, Index:Dod's Peerage, Baronetage, Knightage etc. of Great Britain and Ireland.djvu, Index:The Peerage, Baronetage and Knightage of the British Empire Part 2.djvu, Index:The Peerage, Baronetage and Knightage of the British Empire Part 1.djvu. This is for general information for others who have faced this kind of problem. Hrishikes (talk) 06:19, 10 August 2015 (UTC)

Clever work around. I am told that the problem there lies with IA-upload rather than with Commons, and we therefore need to see if @Tpt: can tweak his upload tool. To also note that you can directly upload from a url to Commons, so you could just try that and put in place the {{book}} template afterwards. — billinghurst sDrewth 10:19, 10 August 2015 (UTC)
I had tried direct upload too. The file got uploaded, then I had to fill up file name, year, category etc., then the file got submitted. Then Commons refused to publish it citing internal API error. That's why I had to take a work around. Hrishikes (talk) 10:59, 10 August 2015 (UTC)
If you did use the direct upload on Commons and that fails, then we need to lodge that as a bug in Phabricator. I will see if I can find a big file to upload. — billinghurst sDrewth 11:13, 10 August 2015 (UTC)
Finding a big file is not a problem; but you need to find a big file with the specific problem that Commons objects to. Every big file does not get refused. So the best option: download one of my files as mentioned above; change the name and upload --- then you will see the problem. Hrishikes (talk) 11:22, 10 August 2015 (UTC)
Just a little note that Hrishikes won't be able to upload by URL on Commons because that is restricted to reviewers, admins and GW users. Green Giant (talk) 01:26, 13 August 2015 (UTC)
Thanks for the correction … See what happens when they add new things and when you have advanced rights, you don't even realise that not everyone can do or see something! Anyway Hrishikes has uploaded so many things over the past months, they must be close to locking themselves away for a year and have a proofreading binge. :-) — billinghurst sDrewth 01:36, 13 August 2015 (UTC)
wow, i should think a request for GWtoolset rights would be a snowball, (they’ll let anybody have those) ;-) Slowking4Farmbrough's revenge 03:30, 19 August 2015 (UTC)

Author/section fields for anthologized works[edit]

Hi all, this got started on Captain Nemo's talk page, but maybe it'd be better to put here ...

I've been working on two books that are collections of short works in translation, Slavonic Fairy Tales and The Sweet-Scented Name. The first has many "contributers", while the second is drawn from various collections in Russian by Fyodor Sologub, and happens to bear the name of one of the individual stories.

To begin with the latter collection, I'd like to know which is more appropriate: to put the story titles into the headers in the title field, or into the section field. So here it's a choice between how I did it for Turandina or for Lohengrin. One thing that makes me feel uncomfortable is that the title "The Sweet-Scented Name" gets more prominence in the header than do the individual story titles. It already appears twice on each story's page, and I feel like its third appearance in the header is visually overkill. One precedent I am going by is Tolstoy's Twenty-three Tales and the individual stories, such as A Prisoner in the Caucasus, which have the story titles in the title field. So far I see no serious problem either way, but when there are collections in which different sections are by different authors the handling of the header information gets very complicated, and I've seen some bad results and complicated work-arounds. For example, in the story Best Russian Short Stories/The Cloak we read in the header "Best Russian Short Stories by Nikolai Gogol" even though Gogol is only the author of the one story "The Cloak". That particular problem is avoided in The World's Famous Orations/Volume 6/At the Bar of the House of Lords, but at the cost of filling in the section field with

''At the Bar of the House of Lords''<br />by {{Author link|Isaac Butt}}

a format which would be difficult to have everyone apply consistently. So what I'd like to propose is to explicitly allow for the option of placing titles of individual works from within an anthology book into the title field, in which case the author field should consistently give the author of the individual work rather than of the collection as a whole. Does that cause any serious problems in how the header data is handled? Mudbringer (talk) 11:22, 10 August 2015 (UTC)

The "Gogol" problem is solved by leaving the author field empty and adding the "contributor" field. Cheers, Captain Nemo (talk) 13:33, 10 August 2015 (UTC).
With such works, I have carried through name of book with | override author = (ed.) Author name (possibly leave empty <shrug>) then use the section and contributor parameters. — billinghurst sDrewth 14:11, 10 August 2015 (UTC)
And the example that you cite probably predates the contributor parameter; plus, it was having to do hacks like that which brought the change about. Well, that and the fact that we have many journals transcribed that have many articles by different authors.
@Billinghurst: ... OK, I was having trouble with the spelling of "contributor" "facepalm". I tried reworking the first story in Slavonic Fairy Tales based on your explanation, so could you please check to see if I've understood everything: Carried Away by the Wind, vs. what I was doing before: Why is the Sole of Man's Foot Uneven?. The way I've done it leaves an extraneous space before a comma, but I'll not be fussy. Does bolding the section field cause any problems? One thing I couldn't have figured out from studying the help on the header template was the difference between "override author" and "override_author" ... the latter was leaving the author's name in the header, when I was trying to suppress it. Thanks! Mudbringer (talk) 14:50, 10 August 2015 (UTC)
I tried fixing The Cloak, and that seems to work ok, but the previous story in that collection, for which a translator is given, The Queen of Spades, winds up with T. Keane as the translator for Best Russian Short Stories, rather than just for that story. Mudbringer (talk) 15:57, 10 August 2015 (UTC)
You can use override_contributor for this. For example: The seven great hymns of the mediaeval church/Dies Iræ/Dix. —Beleg Tâl (talk) 21:07, 10 August 2015 (UTC)

Re the underscore in parameters[edit]

We have had a legacy issue in that the underscore in the template has long been used, and we have carried on using it with new parameters to maintain consistency; however, it does catch people out in that it needs to be present in the parameter. I have just updated the template so you can alternatively use override author, and when I have a chance I will look to update so we can have "override editor", "override translator" and "override contributor". I am not even certain that the underscore needs to be mandated now (it did back then) and whether it can be omitted in the template, so I will have a play in the sandbox first to see if we can simplify. As GOIII briefly discussed, the whole Template:Header is showing its age and has significant legacy issues, and it is probably due for a refresh, especially with Lua now being available, and with Wikidata being present, and the old microdata hack probably able to be cast out. But not today. — billinghurst sDrewth 01:37, 11 August 2015 (UTC)

Tech News: 2015-33[edit]

14:57, 10 August 2015 (UTC)

creating Special:MyPage/EditCounterOptIn.js[edit]

Hi. Just thought that it would be useful to remind (new) users that there is a useful count tool which we have linked from the bottom of your Special:MyContributions page. It is even more useful if users create the page Special:MyPage/EditCounterOptIn.js (or use m:Special:MyPage/EditCounterOptIn.js if you want counts globally) and content is not needed, just the creation of the page. Thanks, it helps me to work out when I can apply autopatrolled rights for users, and to assess users for admin rights. — billinghurst sDrewth 05:11, 12 August 2015 (UTC)

Thanks for the pointer! I'd not noticed the opt-in thing. I think the global page should be m:Special:MyPage/EditCounterGlobalOptIn.js though? — Sam Wilson ( TalkContribs ) … 13:04, 12 August 2015 (UTC)
@Billinghurst: I have opted in globally, but the monthly stats are still invisible—a notice appears in their place. Is this normal?--Erasmo Barresi (talk) 15:39, 13 August 2015 (UTC)
<grrr> Never work with children, animals AND SOFTWARE APPLICATIONS! It looks to be having issues. I will see if it is part of the Labs current works, whether it will self fix, or we need to ping Max. — billinghurst sDrewth 22:59, 13 August 2015 (UTC)
It was working correctly the other day when I looked; now is broken. There seem to be PHP errors appearing in the HTML output, which are stuffing up the Javascript. — Sam Wilson ( TalkContribs ) … 23:22, 13 August 2015 (UTC)

Author:Georges Perrott[edit]

Is this author the same as w:Georges Perrot? The PSM work listed would conform to the genre of the French author. Is the last "t" a typo or real? Hrishikes (talk) 01:05, 15 August 2015 (UTC)

VIAF lists sol et le climat de la Grèce, leurs rapports avec le caractère de sa civilisation et de son art as one of the works of "Georges Perrot Archéologue et helléniste français" and the trailing comment on Popular Science Monthly/Volume 42/December 1892/The Environment of Grecian Culture reads Translated for The Popular Science Monthly from the Revue des Deux Mondes. My French is not all that good but this seems circumstantial evidence they are one and the same person? AuFCL (talk) 01:36, 15 August 2015 (UTC)

Tech News: 2015-34[edit]

16:17, 17 August 2015 (UTC)

New namespaces?[edit]

Anybody else notice the 5 new namespaces being listed in the various ns related input boxes?

  • Gadget (ns:2300)
  • Gadget talk (ns:2301)
  • Gadget definition (ns:2302)
  • Gadget definition talk (ns:2303)
  • Topic (ns:2600)

Anyone? Pointers to more info appreciated. -- George Orwell III (talk) 01:39, 18 August 2015 (UTC)

On the "Gadget" set of namespaces I found the following: "The difference intended to be made as part of the ResourceLoader 2 project is moving these out of the prefixed "Gadget-" scope in the MediaWiki:-namespace (which is intended for messages, not actual wiki content (let alone executable resources). Instead move them to a new Gadget:-namespace only editable by users with the editgadget right" (mw:ResourceLoader/Version 2 Design Specification).--Erasmo Barresi (talk) 08:12, 18 August 2015 (UTC)
So does this mean Flow is coming to the Scriptorium? :) Am I too against tradition if I say that's a good thing? ;-) — Sam Wilson ( TalkContribs ) … 23:05, 18 August 2015 (UTC)
There is a "by request" model at Wikidata, and I would think that WMF will probably introduce it cautiously. @Quiddity (WMF): would you mind adding to this conversation, and pointing this community to pertinent information. — billinghurst sDrewth 02:35, 19 August 2015 (UTC)
Sounds sensible. I reckon it'd be nice to have here. :) — Sam Wilson ( TalkContribs ) … 03:11, 19 August 2015 (UTC)
If there is consensus, it will be very easy to have it enabled. Personally, I favor it for all talk pages plus the discussion pages in the Wikisource namespace (like this one), but it is possible to enable it on a smaller scale if wished.--Erasmo Barresi (talk) 12:23, 19 August 2015 (UTC)

Epubs produced from enWS[edit]

Tpt's recent post reminded me that he has stats for wsexport tool usage. I had a quick peek, and here are the numbers of epubs that have been generated for 2015

Epubs produced from late Mar to mid Aug 2015
Mar 3415
Apr 13234
May 98804
Jun 19480
Jul 48659
Aug 11215

They are stunning numbers, and thanks to @Tpt: for his magnificent tool, and I think that we can do better if we look at making that output format more dominant. — billinghurst sDrewth 11:25, 18 August 2015 (UTC)
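As a quick back-of-the-envelope check on the figures above, the monthly counts can be totalled as follows. This is only a sketch: the numbers are copied from the table in this post, the variable names are made up for illustration, and March and August cover partial months, so they are not directly comparable with the others.

```python
# Monthly epub counts copied from the table above (late Mar to mid Aug 2015).
# Mar and Aug are partial months, so month-to-month comparison is only rough.
monthly_epubs = {
    "Mar": 3415,
    "Apr": 13234,
    "May": 98804,
    "Jun": 19480,
    "Jul": 48659,
    "Aug": 11215,
}

# Total epubs generated over the whole period.
total_epubs = sum(monthly_epubs.values())

# Month with the highest number of generated epubs.
peak_month = max(monthly_epubs, key=monthly_epubs.get)

print(f"Total epubs: {total_epubs}")
print(f"Peak month: {peak_month} ({monthly_epubs[peak_month]})")
```

A per-work breakdown would need the underlying wsexport logs, which are not shown here.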

That's amazing! So many readers! (and some bots?) Is it possible to get stats for individual works? And yeah, I agree, it'd be cool to make the epub download links much more obvious. — Sam Wilson ( TalkContribs ) … 23:01, 18 August 2015 (UTC)
@Samwilson: When I come in with a mobile to the main page, Template:Featured download shows big and bold, so it would be interesting to know how whacked that template/button is as part of the downloads; so yes, a breakdown of the stats would be useful. Personally I feel that the template could be useful in some format to highlight that the works are downloadable and would consider its use more liberally ESPECIALLY as in the mobile version there is NO sidebar, and no ready means to even see that an epub version is available. Hmm, it may even be worth adding as part of Template:New texts even just as an icon. — billinghurst sDrewth 02:59, 19 August 2015 (UTC)
I think it would be a good thing to be able to identify the works most downloaded by our readers. The top three could be featured in main page in "Readers' Choice" category. Hrishikes (talk) 04:32, 20 August 2015 (UTC)
I agree with you on the above Hrishikes. I also like to see the most {popular} downloaded categories, in order, from highest to lowest as well as the most popular download format. Respects, —Maury (talk) 07:03, 20 August 2015 (UTC)
Ditto. We live in an island all by ourselves. A bridge with the readers is highly desirable, so that the community can ascertain what types of works the readers most want and community efforts can be based on that feedback, e.g., in POTM etc. Hrishikes (talk) 07:25, 20 August 2015 (UTC)
@Billinghurst: Detailed statistics of ePub downloads (07 2015) I have published on User:Zdzislaw/stats/2015 07‎. Zdzislaw (talk) 18:07, 24 August 2015 (UTC)

How can we improve Wikimedia grants to support you better?[edit]

Hello,

The Wikimedia Foundation would like your feedback about how we can reimagine Wikimedia Foundation grants, to better support people and ideas in your Wikimedia project. Ways to participate:

Feedback is welcome in any language.

With thanks,

I JethroBT (WMF), Community Resources, Wikimedia Foundation. 05:21, 19 August 2015 (UTC)

New feature "Watch changes in category membership"[edit]

recent changes page view with categorization

Hi, coming with this week’s software changes, it will be possible to watch when a page is added to or removed from a category (T9148). The feature has been requested by the German Community and is part of the Top 20 technical wishlist. The feature was already deployed to Mediawiki.org on August 18 and it will be rolled out on Wikisource between 6-8 pm UTC today. It will be available on all Wikipedias from Thursday 20 on, likewise between 6-8 pm UTC. In this RFC-Proposal, you can find the details of the technical implementation. The feature was implemented via a new "recent changes" type for categorization. Through this, categorization will be logged and shown on the recent changes page. The categorization log in "recent changes" is the data basis for the watchlist: when you watch a category, pages added to or removed from that category will be shown on the watchlist. The categorization of pages can be turned off in the watchlist preferences as well as recent changes preferences. If you have any questions or remarks about the feature or if you find a bug, please get in touch! Bugs can also be reported directly in Phabricator, just add the project “TCB-Team” to the respective task. Cheers, Birgit Müller (WMDE) (talk) 15:11, 19 August 2015 (UTC)

For those of us not fluent in technobabble, does this feature track only explicit category membership, or also membership controlled by template parameters? --EncycloPetey (talk) 16:24, 19 August 2015 (UTC)
The new feature will also log categorisation by templates, but it will limit the display to the template name and the number of pages embedding it. Kai Nissen (WMDE) (talk) 17:48, 19 August 2015 (UTC)
This feature is very boring in Special:RecentChanges on all wikisources because it shows three entries for any page validation, with three identical diff links, so we get only 16 real changes per page of 50 in RC. It's possible to work around it through a user preference, but that preference should be to hide cat changes in RC by default. — Phe 20:19, 19 August 2015 (UTC)
We've been asking for this on en.Wikipedia for years. It makes it possible to observe the removal of legitimate categories, by accident or by vandalism. BD2412 T 21:06, 19 August 2015 (UTC)
dontcha know categories are broken, and the answer is wikidata? maybe we could get GWtoolset as a top 20? Slowking4Farmbrough's revenge 22:12, 19 August 2015 (UTC)
We can't use wikidata for Page:*, and Page:* is the only really problematic namespace. — Phe 00:20, 20 August 2015 (UTC)
The proofread extension uses categories to mark the state of pages, and changing the state from proofread to validated generates three RC entries: one for the cat removed, one for the cat added and one for the page change itself. I don't think this is evil for any namespace other than Page:, but for Page: we really have a specific use of categories. I was first thinking of asking that the default preference be to hide cat changes in RC on all wikisources, but it'll probably be better to allow blacklisting a namespace, a sort of "do not generate cat change RC entry for this namespace". — Phe 00:20, 20 August 2015 (UTC)
I don't find that problematic. It's not like pages will be going through category changes dozens of times. BD2412 T 02:09, 20 August 2015 (UTC)
I've already turned it off for RC in my preferences. It's a right nuisance when trying to patrol RC. The five Page namespace categories (Category:Proofread, Category:Problematic &c.) are assigned automatically and there is no need to watch pages move between them as they move through the proofreading process. When these moves are appearing on RC, then it prevents clearly seeing the real changes. Beeswaxcandle (talk) 02:28, 20 August 2015 (UTC)
@Birgit Müller (WMDE), Kai Nissen (WMDE), Phe: Totally agree that it will be problematic in/for our Page: namespace in its current methodology. So a wiki needs the ability to exclude a namespace, or maybe as part of a naming hierarchy (preferably by local setting rather than by a wiki configuration request through phabricator); or by the ability to [hide/show] the functionality in recent changes. I also think that we need to get this onto Phabricator prior to the rollout, as it is going to be problematic, and should have been tested in a broader wiki-space prior to implementation. An alternative would be that the categorisation change(s) are noted against the edit, rather than against the category, so it is all on one line per edit. — billinghurst sDrewth 03:47, 20 August 2015 (UTC)
Flagged this as an issue for WSes on the Phabricator ticket above. I have asked if there is a means to halt rollout to the WSes this week. — billinghurst sDrewth 03:54, 20 August 2015 (UTC)
@Billinghurst:I am probably being blind but I see no reference? To which ticket are you referring please? AuFCL (talk) 04:08, 20 August 2015 (UTC)
Also, it already is rolled out. Beeswaxcandle (talk) 04:13, 20 August 2015 (UTC)
Thanks BWC, WPs! We are also going to see it added regularly to any subpage. Maybe we could have "ignored categories", cf. hidden categories which exists. @AuFCL: T9148 . To note that of 500 edits, 212 are categorisation changes in three categories, though switching to "Show page categorization" does tidy things nicely. — billinghurst sDrewth 04:20, 20 August 2015 (UTC)
D'oh! I really am going blind! Thanks. AuFCL (talk) 05:49, 20 August 2015 (UTC)
┌─────────────┘
The proposal seems to say that recentchanges will not be affected - "In order to prevent recentchanges from being flooded by high-usage template categorization, there won't be recentchanges entries for the pages embedding the template". "Possible to watch" sounds to me like editors being able to voluntarily add a category to a watchlist for purposes of keeping track of membership. BD2412 T 03:56, 20 August 2015 (UTC)

┌─────────────────────────────────┘
Perhaps this is a little bit early to start noting bugs but I have observed this peculiarity: if some page changes to either enter or leave a watched category, my watchlist throws up a '(changed since last visit)' link to the changed page, which is nice. What is not so nice is that visiting that link does not reset its status in the watchlist, but visiting the category instead does! So to de-flag a notified event one has to navigate to a page which one otherwise has no reason to access? Seems a bit odd. AuFCL (talk) 09:37, 20 August 2015 (UTC)

Hi AuFCL, it is not too early to report bugs or comment on the feature, thanks! There is already a Phabricator task for the link issue you mentioned - T109688. We will collect all the comments on problematic issues, improvement ideas and reported bugs on the various project pages as well as in Phabricator and check through them within the next few weeks. Thanks to everyone here. Birgit Müller (WMDE) (talk) 14:49, 20 August 2015 (UTC)

Note to the community, this change was rolled back (temporary suspension). I haven't found the conversation that led to the rollback, just the comment on the initial phabricator card that states that the rollback has occurred. There were a number of issues identified, so one would think that there is some more work to do prior to the next version. So while we have been picking fault with our issues, we should not lose sight of the excellent initiative that WMDE has taken on board, thank them for their effort and wish them luck with their refinement of the change. @Birgit Müller (WMDE), Kai Nissen (WMDE): billinghurst sDrewth 23:37, 20 August 2015 (UTC)

@Billinghurst:, thank you for posting the update regarding the revert of this change. After it got deployed to the production systems it became apparent that under some special circumstances it may cause privacy issues: In case of some types of templates entries in the watchlist and recent changes were showing the IP address instead of the user name.
It's good that this came to our attention as the developer team can start to work on a solution for this now. Thanks to all who already tested the new feature and gave valuable comments. It is a big help for the next version of this feature. Kai Nissen (WMDE) (talk) 10:02, 21 August 2015 (UTC)

Bug in WikiEditor?[edit]

Since this morning, I am seeing three sets of Bangla script in the same box, one after the other, in the WikiEditor. Not that I am facing any problem, but still, it looks uncanny. Hrishikes (talk) 04:46, 20 August 2015 (UTC)

@Hrishikes: Have you posted to phab:? —Justin (koavf)TCM 14:10, 21 August 2015 (UTC)
No, as I am not having any major problem. No letter is absent; although there are three full sets. Hrishikes (talk) 14:37, 21 August 2015 (UTC)
Follow-up: I think this repetition of sets is now a general feature: sets are repeated 2 to 4 times depending on the browser. I have checked Latin, Latin Extended and Bangla sets. Hrishikes (talk) 05:01, 23 August 2015 (UTC)

1901 work with 2000 copyright notice[edit]

In 2000 the uncensored version of Mark Twain's 1901 essay, "The United States of Lyncherdom", taken directly from Twain's manuscript, was first published in the journal Prospects. (It was first published in a censored version in a 1923 anthology of Twain's essays, Europe and Elsewhere.) I presume the work itself has entered the public domain because it was created more than 95 years ago and it has been more than 95 years since Twain's death. However, the journal printing of the essay contains the following disclaimer:

This previously unpublished essay by Mark Twain is ©2000 by Richard A. Watson and Chemical Bank as Trustees of the Mark Twain Foundation, which reserves all reproduction or dramatization rights in every medium. It is published here with the permission of the University of California Press and Robert H. Hirst, General Editor of the Mark Twain Project.

Is there any legal basis for this copyright notice, or may I freely transcribe this work? IvanhoeIvanhoe (talk) 20:08, 20 August 2015 (UTC)

If the uncensored version of the essay was not published until 2000, then it could indeed be under copyright. --EncycloPetey (talk) 20:47, 20 August 2015 (UTC)
OK, the uncensored version is under copyright until 1 January 2048, it appears, and the censored version is under copyright until 1 January 2019. IvanhoeIvanhoe (talk) 21:41, 20 August 2015 (UTC)
@IvanhoeIvanhoe: As the first version was published in 1923 it falls under that copyright, presuming that it was renewed, which puts the work under copyright, and you are correct until 2019. I would venture that the same date would apply to uncensored version rather than 2048. My reason is that they could not put a new copyright date of 2000 as that was more than 70 years after his death,[59]. I don't know how they could claim a new start without special legislation, so it could only be a claim under the previous copyright. — billinghurst sDrewth 10:54, 21 August 2015 (UTC)
I don't know what you're pointing to on the Cornell page. The life+70 rule was basically irrelevant until 2002. New editions of old works do get new copyrights, and if there's enough new text I don't see why this one wouldn't. So the new text is under copyright until 2048, IMO, as is the rest of Mark Twain's unpublished stuff, as it was "published" in a ridiculously expensive microfilm edition in 2001, just to grab that copyright.--Prosfilaes (talk) 21:17, 21 August 2015 (UTC)
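As a rough illustration of the arithmetic behind the two dates quoted in this thread (1 January 2019 and 1 January 2048), here is a minimal Python sketch. It assumes the simplified US rules as they are described above: a 95-year term for a 1923 publication whose copyright was renewed, and, for an older work first published between 1978 and 2002, protection for life+70 or through the end of 2047, whichever is later. The function and constant names are hypothetical, and the sketch does not settle which rule actually applies to the uncensored text.

```python
# Simplified rules as discussed in this thread (assumptions, not legal advice).
RENEWED_TERM_YEARS = 95        # published with notice and renewed
LIFE_PLUS_YEARS = 70           # general life+70 term
LATE_PUBLICATION_FLOOR = 2047  # assumed floor for old works first published 1978-2002


def pd_year_published_and_renewed(publication_year: int) -> int:
    """Return the year on whose 1 January the work leaves copyright."""
    # Protection runs through the end of the 95th year after publication.
    return publication_year + RENEWED_TERM_YEARS + 1


def pd_year_first_published_1978_2002(author_death_year: int) -> int:
    """Return the year on whose 1 January the work leaves copyright."""
    last_protected_year = max(author_death_year + LIFE_PLUS_YEARS,
                              LATE_PUBLICATION_FLOOR)
    return last_protected_year + 1


# Censored version, published 1923.
print(pd_year_published_and_renewed(1923))
# Uncensored version, first published 2000; Twain died in 1910.
print(pd_year_first_published_1978_2002(1910))
```

Under these assumptions the first call gives 2019 and the second gives 2048, matching the dates mentioned above.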

Genealogical works from Peter Crombecq

Hi,

After looking up one of our mayors from the 13th century, I stumbled across the genealogical works of Peter Crombecq, which I cited as a source. The PDF files he created are on the site of his provider, though. This means that they will disappear if the internet account is suspended, which inevitably it will be eventually.

So I asked him if it would be OK to host his works on Wikisource. The first question is, is this the right platform to 'publish' his efforts? He doesn't mind the works to become available to the public. In the PDF files he simply mentions that he wants his name to be mentioned when the material is used.

I asked a question about this on the Dutch Wikisource Scriptorium counterpart a few days ago, but the site is relatively dead... So it seems better to come and ask over here.

The next question is: Is it OK for me to upload the 'converted' books? Or would it be better if he created an account and uploaded them himself? He doesn't want to 'waste' too much time on it, though. So I proposed that I would convert the books to wikiformat, and either he or I can then conveniently upload them.

This is the PDF file of the first book I started to work on:

http://users.telenet.be/PeterCrombecq/Genea%20Stek/Crombecq/

It starts with a few chapters of introduction, then the bulk of the book with blocks about people that were involved in the city council of Leuven. I created a page with this introduction, then a page per person:

book
introduction
Coutereel, Peeter an example of an entry

This does not follow the 'layout' of the PDF, which has the introduction over several pages and then in the second part several entries per page. Is that a problem? It is more logical this way and it's easier to connect the entry of 1 person to his/her wikidata item.

I'm adding the author template on each page of the book, does that make sense? It is in compliance with his request to have the source mentioned. I'm mentioning his name, because the way I see it, this series of wikisource pages will/would become the source.

Maybe I should refer to the PDF file on his provider's website, or would it be acceptable to upload that PDF to Commons? Of course, if that is acceptable, maybe I'm putting in entirely too much effort by converting it to Wikisource... The advantage of having it in this format on Wikisource would be that others can chime in and add to it, though. --Polyglot (talk) 18:25, 23 August 2015 (UTC)

I can only answer from the perspective of the English WS as there may be different perspectives on the Dutch WS. @Dick Bos:, if there are things that I've missed, could you please assist?
  • If these texts were in English, they would be welcome here. The copyright release in the text is compatible with us hosting them;
  • The pdfs should be uploaded to Commons, and it's fine for you to do so;
  • Including the author's name in the header template on each page you create on nlWS is the correct way to mention his name. You should also include a license template on the book's mainpage;
  • Yes, keep the print pages for the Introduction together into a single WS page;
  • I recommend that you use the Proofread extension to wikify the books rather than doing that before uploading.
Beeswaxcandle (talk) 22:50, 23 August 2015 (UTC)

Tech News: 2015-35

13:02, 24 August 2015 (UTC)

Some maintenance work that I am considering

I am considering doing the following maintenance work over the next while, and thought that I would flag my intentions and allow for any discussion to take place

  1. update {{Author}} so we can determine which author pages have Wikipedia links for which the WP link data is not recorded at Wikidata. I think that there will be none; however, I just want to make sure. At that point I believe that we will be able to remove all Wikipedia links from Author templates (as redundant).
  2. remove all wikidata parameter fields from the Author template, they are redundant, and we have a ready means to identify where there is no WD linking, and that maintenance is being managed on a regular basis
  3. from Category:Index Validated (initially) and then Category:Index Proofread, identify the overarching works for each file and then look to put the appropriate badging on each of the WD links to WS. Once that is done, I propose that we amend / extend the {{Featured download}} button to be displayed on any work that is proofread or validated (i.e. has the label). We will also need to have some sort of watch function on the works that enter these categories so they can be badged systematically.

Any thoughts and ideas welcomed. — billinghurst sDrewth 14:36, 24 August 2015 (UTC)

I am not sure I understand your third point, but I agree with the first two. However, I have noticed that often, if a WikiData entry exists for an author, but the Author header does not contain a sister wiki parameter (or portal link or related author), then the sister wiki box does not show up. For this reason, I have often put a Wikipedia parameter in the template to force it to show up. The other option is to use {{plain sister}}, but if you add this template and then someone adds a portal parameter then you will get two sister wiki boxes. I think fixing this will need to be part of the maintenance work you describe. —Beleg Tâl (talk) 15:33, 24 August 2015 (UTC)
@Beleg Tâl: Re point 3. We have the template, which we have been utilising on featured works only (big and bright as per the Main page), whereas on the majority of works we rely on the link in the sidebar to indicate alternate format downloads. 1) In mobile there is no sidebar, so you cannot see it, and epubs are predominantly a mobile technology; 2) it is a little link somewhat hidden away; 3) we have only utilised it on a subset of works. My plan is to look to have it more widely available where works have an elevated status, using the newly available WD badges for Wikisources.[62] These badges are what I am referring to with regard to the indexes and their corresponding works, and that work needs to be started prior to work on the templating, which I will bring to the community to discuss in detail.

With regard to Wikidata label visibility, we recently updated {{header}} to apply #if statements in a different way, so those links should show without additional intervention. To my understanding, with {{author}} either a direct link or a search link displays without any intervention; if that is not the case, then some examples would be useful. — billinghurst sDrewth 23:26, 24 August 2015 (UTC)

@Billinghurst: On your first point, you can already verify that there are no such pages at Wikisource:Maintenance of the Month/Wikidata/Wikipedia authors.
@Beleg Tâl: Could you point to a page where this problem occurs?--Erasmo Barresi (talk) 12:00, 25 August 2015 (UTC)
@Billinghurst: Support 1 and 2. Would like to propose 2a. remove all remaining deprecated parameter fields from the Author template (commons, wikiquote, etc). Cheers, Captain Nemo (talk) 12:28, 25 August 2015 (UTC).
I am okay with that proposal to remove the sisters in author namespace. Are people comfortable now that as WQ and Commons are migrated to WD and the duplication finders/mergers have been at work that they are now superfluous to our needs? [Again Author namespace only at this point] — billinghurst sDrewth 12:45, 25 August 2015 (UTC)
@Billinghurst: It would seem that you are correct, and this behaviour has already been fixed since I last encountered it. —Beleg Tâl (talk) 13:20, 25 August 2015 (UTC)
@Billinghurst: First, the E-Pub icon in mobile mode should be next to the pencil (edit) and star (add/rem to watchlist) icons per my tweaks to MediaWiki:Mobile.css. If it's still "hidden away" for you, please let me know.

Second, I have no problems with your suggested maintenance tasks, though I still think it's only a dent in the larger issue when it comes to Wikidata. Unless ALL the key parameters in the various header-to-namespace templates are made into formal messages in the MW (ns:8) namespace so they can be translated and put in use by our sister language WS domains at the same time, in the long run, all we are managing to accomplish here is inadvertently creating a preference for English-driven Wikidata over our counterparts'. Plus we are "doing" too much in our header templates as it is on the local level; if 'everybody was working from the same page' (i.e. applying formalized messages for key template parameters), much of the currently entrenched if... then... when... localized template jargon can be safely handled by wikidata &/or lua instead imo. -- George Orwell III (talk) 20:20, 25 August 2015 (UTC)

@Kaldari: is this the sort of thing where we can look to the Community-Tech group to assist the Wikisource community to modernise and utilise better practices?

Noisy watchlist warning Author: namespace. With my bot account User:SDrewthbot I have started a run on cleansing redundant parameters from Author: namespace (parameters = Wikipedia · Wikiquote · Wikivoyage · Wikinews · Wikidata · Wikibooks · Commons · CommonsCat · empty image) for Category:Authors-A with 1 recurse (800 pages). Please put any feedback/issues here or my talk page. I will look to schedule the remaining works to start in 24 hours dependent on feedback.

Note also that I have placed a notice on Special:Watchlist. If your watchlist has plenty of authors on it, you may wish to temporarily mask bot edits in your Special:Preferences. — billinghurst sDrewth 12:01, 26 August 2015 (UTC)
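For anyone wanting to see what this kind of cleanup amounts to, here is a minimal Python sketch of stripping redundant parameters from an Author template call. It is not the code SDrewthbot runs. It assumes one `| name = value` parameter per line, assumes the parameter names are the lower-case forms of those listed above, and reads "empty image" as an `image` parameter with no value. The function name `clean_author_params` is hypothetical.

```python
import re

# Parameter names assumed from the list in the announcement above.
REDUNDANT = {"wikipedia", "wikiquote", "wikivoyage", "wikinews",
             "wikidata", "wikibooks", "commons", "commonscat"}
# Parameters that are only removed when they carry no value.
REMOVE_IF_EMPTY = {"image"}

# Matches a line of the form "| name = value" (value may be empty).
PARAM_LINE = re.compile(r"^\s*\|\s*([^=|]+?)\s*=\s*(.*?)\s*$")


def clean_author_params(wikitext: str) -> str:
    """Drop redundant sister-project parameters, one parameter per line."""
    kept = []
    for line in wikitext.splitlines():
        match = PARAM_LINE.match(line)
        if match:
            name, value = match.group(1).lower(), match.group(2)
            if name in REDUNDANT:
                continue  # always redundant: skip the whole line
            if name in REMOVE_IF_EMPTY and value == "":
                continue  # e.g. an empty "image" parameter
        kept.append(line)
    return "\n".join(kept)


sample = "\n".join([
    "{{author",
    " | firstname = Mark",
    " | lastname = Twain",
    " | wikipedia = Mark Twain",
    " | commonscat = Mark Twain",
    " | image = ",
    "}}",
])
print(clean_author_params(sample))
```

A line-based regular expression is the simplest thing that works for tidy templates. Calls with several parameters on one line, or values spanning multiple lines, would need a real wikitext parser instead.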

We should not remove category links to Commons at this time. Wikidata is still debating the "correct" way to attach Commons categories to items, and it's a bit of a mess. Please put back the categories that you have already removed. --EncycloPetey (talk) 14:03, 26 August 2015 (UTC)
We pick whichever methodology that they use and we pick up their multiple links, be it the interwiki link, or the separate commons category claim. We will pick up whichever changes that they make, it is no problem. [We don't pick up creator, but we don't need to as it is only a display layer at this point of time, and we can easily pick that up whenever.]

Proposing to only list top level of a work as a featured text

I am looking in Category:Featured texts and I see that the root page and the subpages for some works are listed as featured texts. To me, we should list the top level of the work, and not the successive cascading pages. So that would be things like Popular Science Monthly/Volume 1 and Doctor Syn and no subpages. Removing the {{featured}} template from the subpages will remove them from the categorisation. For some works it has only been the top level that has been done, so I propose to remove the template from the subpages of the nominated works where it has been added. — billinghurst sDrewth 06:18, 25 August 2015 (UTC)

Agree. Subpages should be considered to inherit all categories of their parent and do not need them to be explicitly stated. If a category placed on a parent would cause a subpage to be implicitly in an inappropriate category, then the parental category is wrong. Beeswaxcandle (talk) 20:33, 25 August 2015 (UTC)
The text we used at WS:FT and WS:FTC was inconsistent, so I have aligned to top level only, and will look to do the resulting maintenance later. — billinghurst sDrewth 00:19, 26 August 2015 (UTC)
Job done; each work has the featured template once, at the level it was awarded. — billinghurst sDrewth 13:41, 26 August 2015 (UTC)

Wanted: A really awesome tool for extracting images and uploading them to Commons.

I have been browsing through the thousands of pages with missing images (linked from {{Missing image}}), and it occurs to me that it is going to take an enormous effort to manually extract and upload all of these images to Commons (or, at least, all of the images that are eligible for listing there). What we need is a single tool that:

  1. Extracts the image on the page from the page;
  2. Associates the image with the publication information relevant to the public domain status of the image (which should be on the Index page); and
  3. Uploads the image to Commons if eligible there, with standard information, or here if only eligible here.

It doesn't matter if the extraction process gets more from the page than necessary, because images can fairly easily be refined once they are uploaded; it is the initial uploading that is most time consuming, in my experience. Now, how do we get this done? BD2412 T 17:40, 26 August 2015 (UTC)

I think you'll find that @Hesperian: is already partway there with his {{raw image}} process. The problem really comes with step 1, in that the quality of the images in .djvu and .pdf files is not good enough. For IA files, we can access the jpeg2 images (which is what the raw image process does). For files that are hosted elsewhere we have problems. Beeswaxcandle (talk) 20:49, 26 August 2015 (UTC)
My concern is with the next step - the process for uploading the image as an image file with the categories of copyright information required for uploads. A better version of an image can always be uploaded, but it's a pain to input, for example, the work name, author name, publication date, description of the work, and category information (which, for images from a specific work should include an "Images from [work]" category), when these steps must be repeated for dozens of images from a given work. We currently have over 10,000 image files that need such an extraction. BD2412 T 21:21, 26 August 2015 (UTC)
My issue is with the idea of a bot that makes bulk copyright claims. I won't go there. If someone does want to take this on, then please PLEASE ensure that your bot plays nicely with HesperianBot. That means talk to me early and often about what HesperianBot does, and why, and how not to break it. Hesperian 01:09, 27 August 2015 (UTC)
Bulk copyright claims would be an important part of the process. If a DJVU file exists on Commons, and you have a hundred or so images {{extracted from}} that DJVU file, you should be able to simply apply the copyright of the source file to all of the images at once. —Beleg Tâl (talk) 21:27, 27 August 2015 (UTC)
sorry about adding a bunch. the missing image template links to a page, on commons. could there be a way to snip images as a derivative on commons, and serve up in a semi-automatic way on source? Slowking4Farmbrough's revenge 01:52, 28 August 2015 (UTC)
@Beleg Tâl: Unfortunately, that isn't always true. Sometimes the copyright on the images is different from the work, and almost always the illustrator / painter / photographer was NOT the same person as the author of the text. And even if the copyright and creator information is the same for all the images in a work (which often isn't the case), each image needs its caption and text placement information included in the Commons description. Besides which, sometimes the source of the image files is Hathi Trust, when the DjVu came from IA. There are a lot of factors involved in setting the image description at Commons, and I frankly do not see any way that most of the information could ever be handled by anything like a bot, unless the bot were loading information from a prepared file for all the images. It couldn't simply be duplicated from other image files, because the image information will vary from image to image. --EncycloPetey (talk) 03:47, 28 August 2015 (UTC)
I do realize that the illustrator and the author are separate people with separate copyrights. However, if the DJVU file contains the works of both, then its copyright needs to reflect the copyright of all its contents, doesn't it? If the DJVU file license does not account for the copyright of the images, then the license is already incorrect. In that case, it would be better in my opinion to leave it to Commons to correct the licenses, maybe with a bot of their own to assist.
With regard to images from HathiTrust, these would not be impacted because they're not extracted from the DJVU file.
A simple yet reasonable automated description would also be possible, like "Illustration from page $page of $djvufile" - which is no different than what I manually put in the description of all the illustrations I upload to Commons myself. —Beleg Tâl (talk) 12:09, 28 August 2015 (UTC)
Further thoughts, just brainstorming:
  • Commons already allows you to bulk-upload files with the same license tag. So even if we don't want it to pull the license from the DJVU file, we can still simply specify the license once for all.
  • What if we create a tag on Commons that transcludes the license tag from the source DJVU, with a short message saying "this file was extracted by a bot from $source.djvu, the license for the source file is {{#section:File:source.djvu|License}}"
  • We could also have the bot tag the image for human review after upload; it would still be less complex than the current system but still ensure that the file is manually checked for problems
Beleg Tâl (talk) 12:30, 28 August 2015 (UTC)
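The brainstormed points above could be sketched roughly as follows in Python. This only builds the wikitext for a file description page; it does not extract or upload anything. The description wording and the `{{#section:...|License}}` transclusion are taken from the suggestions above. The function name `describe_extracted_image` and the `{{needs human review}}` tag are hypothetical placeholders, and the source file page would need a labelled "License" section for the transclusion to return anything.

```python
def describe_extracted_image(djvu_file: str, page: int) -> str:
    """Build description-page wikitext for an image cut from a DjVu page."""
    # Simple automated description, as suggested in the thread.
    description = f"Illustration from page {page} of {djvu_file}"
    # License note that transcludes the "License" section of the source file.
    license_note = (
        f"This file was extracted by a bot from {djvu_file}; the license for "
        f"the source file is {{{{#section:File:{djvu_file}|License}}}}"
    )
    # Hypothetical tag asking a human to check the upload afterwards.
    review_tag = "{{needs human review}}"
    return "\n".join([description, license_note, review_tag])


print(describe_extracted_image("Example.djvu", 12))
```

Keeping the license as a transclusion rather than a copied tag means a later correction to the source file's license would carry over to the extracted images, while the review tag leaves the final copyright judgement to a person.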