Wikisource:Scriptorium

Scriptorium
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one. Project members can often be found in the #wikisource IRC channel (webclient). For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource.


Announcements

Note
This section can be used by any person to communicate Wikisource-related and relevant information; it is not restricted. Announcements generally have little or no discussion, so if a discussion is relevant, add another section to Other discussions and link to that section from the announcement.

Proposals

Split the scriptorium?

Perhaps this has been discussed before (I couldn't find anything in the archives) but wouldn't this page be better off being split with a dedicated subpage for each of the current sections e.g. a separate bot approvals page? It might be just me but it seems to take a bit longer to load this page than similar pages elsewhere. Green Giant (talk) 18:54, 22 July 2015 (UTC)

@Green Giant: We could look to remove the transclusion to the /Help subpage, and replace it with a link to the page, and the ability to start a question as a new section on the subpage. — billinghurst sDrewth 03:25, 6 August 2015 (UTC)
Yeah that's the approach I had in mind. I was thinking that we could have a set of tabs like c:Template:Administrators' noticeboard, leaving only the Other Discussions visible here. I don't know how popular that might be though. Green Giant (talk) 12:12, 6 August 2015 (UTC)
Strong support for this proposal. Pages should be kept reasonably short (and loadable), and concisely directed in terms of their topic. Cheers! BD2412 T 14:35, 6 August 2015 (UTC)
I think this is an excellent idea. This page is long and it is sometimes difficult to keep on top of all the discussions going on at once. Support --Beleg Tâl (talk) 14:37, 6 August 2015 (UTC)
Support: one subpage for each of the current sections and no transclusion, as in Wikipedia's Village pump. I just made a draft of the section index (feel free to edit it), and there I included links to pages which are outside the Scriptorium, like Wikisource:Possible copyright violations, but nevertheless may be what users are looking for when they click on "Central discussion" in the sidebar.
With regard to this, what about enabling Flow in some or all of the subpages?--Erasmo Barresi (talk) 09:55, 10 August 2015 (UTC)
I have removed the transcluded section, and added a click link to start a new question. — billinghurst sDrewth 10:30, 10 August 2015 (UTC)

Updating obsolete gadgets

My recent voyage through the gadgetland reminded me that the Regex search and replace gadget is obsolete and was superseded long ago by a much better version from Pathoschild. Would it be possible to replace the old gadget with the new code so that everyone can benefit from this tool, rather than having to paste this code into their common.js file?

mw.loader.load('//tools-static.wmflabs.org/meta/scripts/pathoschild.templatescript.js');

This new script saves the search and replace parameters to be reused, rather than having to be retyped. — Ineuw talk 04:34, 16 August 2015 (UTC)

I have poked Pathoschild, he has global rights to update. — billinghurst sDrewth 00:27, 17 August 2015 (UTC)
TemplateScript is the newer version of the obsolete regex menu framework. It's much more powerful, but it's not backwards compatible — I'll update custom scripts that use it over the next little while, and update the gadget when I'm done. —Pathoschild 00:55, 17 August 2015 (UTC)
When you say "not backwards compatible", what does that mean? I don't know who uses the old regex framework, but as far as I know it didn't store any parameters for reuse. Did I miss something (which is not unusual)? — Ineuw talk 08:34, 17 August 2015 (UTC)
TemplateScript was designed as the successor to the regex menu framework, so it's mostly backward compatible. Here's a quick overview off the top of my head:
Feature            | Regex menu framework          | TemplateScript                                  | Backward compatible
regex editor       | ✓                             | improved                                        |
custom scripts     | ✓                             | improved                                        | ✘ different schema
custom sidebars    | "Scripts" sidebar             | ✓ any number of custom sidebars                 |
supported views    | edit                          | ✓ any (edit, block, protect, etc)               |
gadget support     | no-conflict mode (one gadget) | ✓ built-in support (any number of gadgets)      |
compatibility      | unknown                       | ✓ all skins, modernish browsers, MW extensions  |
keyboard shortcuts |                               |                                                 |
save regex patterns |                              |                                                 |
custom plugins     |                               |                                                 |
translatable       |                               |                                                 |
The only breaking change is how you define custom scripts in your *.js pages. TemplateScript was written to take this much further than the regex menu framework did, so it uses a more expressive approach to defining custom scripts. Here's a sample complex script from a recent migration (using the new helpers):
Regex menu framework:
function rmflinks() {
   regexTool('remove linebreaks','linebreaks()');
}

function linebreaks() {
   /* exception pattern */
   var pattern = '\{\{[^}]+\}\}';

   /* store exceptions in an array (match() returns null when nothing matches) */
   var patternlocal = new RegExp(pattern, 'ig');
   var exceptionvalues = editbox.value.match(patternlocal) || [];

   /* replace exceptions with placeholders */
   var patternlocal = new RegExp(pattern, 'i');
   for(var x=0; x<exceptionvalues.length; x++) {
      editbox.value = editbox.value.replace(patternlocal, '~exception~');
   }

   regex(/([^\n]+)\n([^\n]+)/g,'$1 $2',10);
   regex(/(=+.+?=+) *([^\n]+)/g,'$1\n$2');

   /* restore placeholders */
   for(var i=0; i<exceptionvalues.length; i++) {
      var pattern = new RegExp('~exception~');
      editbox.value = editbox.value.replace(pattern, exceptionvalues[i]);
   }
}
TemplateScript:
pathoschild.TemplateScript.add({
   name: 'remove linebreaks',
   script: function(context) {
      /* protect templates from the replacements below */
      var escaped = context.helper.escape(/\{\{[^}]+\}\}/g);
      context.helper
         .replace(/([^\n])\n([^\n])/g,'$1 $2')
         .replace(/(=+.+?=+) *([^\n]+)/g,'$1\n$2');
      /* restore the protected templates */
      context.helper.unescape(escaped);
   }
});
So to migrate the gadget to TemplateScript, I first need to update custom scripts. —Pathoschild 18:37, 17 August 2015 (UTC)
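For anyone following along, the protect / replace / restore idea that both versions share can be sketched in plain JavaScript with no framework at all. This is only an illustration: the function name joinLinesOutsideTemplates and the NUL-delimited placeholder are made up for this example, and it is not how TemplateScript implements its helpers internally.

```javascript
// Hypothetical helper (not part of either framework): joins single line
// breaks while leaving anything inside {{...}} untouched.
function joinLinesOutsideTemplates(text) {
  var stash = [];

  // 1. Swap each {{...}} for a numbered placeholder and remember the original.
  var masked = text.replace(/\{\{[^}]+\}\}/g, function (match) {
    stash.push(match);
    return '\u0000' + (stash.length - 1) + '\u0000';
  });

  // 2. Join lines that are separated by exactly one line break.
  masked = masked.replace(/([^\n])\n([^\n])/g, '$1 $2');

  // 3. Put the stashed templates back in place of their placeholders.
  return masked.replace(/\u0000(\d+)\u0000/g, function (whole, index) {
    return stash[Number(index)];
  });
}
```

Numbering the placeholders means each template is restored to its own position even if two templates are identical. The sketch makes a single pass; the old script passes a third argument (10) to its regex() call, which is not reproduced here.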

BOT approval requests

request for bot flag on account KasparBot

I want to perform the same task as on enwiki, frwiki, dawiki, mkwiki, jawiki, kowiki and cswiki, huwiki, bewiki in future. The bot will #1 move authority control information (Template:Authority control) to wikidata and replace the template with a blank {{Authority control}} (see w:en:Wikipedia:Bots/Requests for approval/KasparBot), #2 add {{Authority control}} to pages with authority control information on wikidata but without a local template transclusion on bewiki (see w:en:Wikipedia:Bots/Requests for approval/KasparBot 2). It uses my own Java framework. The bot's tasks are coordinated at Wikidata:WikiProject Authority control/Status. Regards, -- T.seppelt (talk) 08:42, 18 July 2015 (UTC)

Do you make a consistency check between local and wikidata info? If so, what if the two are different? Do you skip the page or who wins?— Mpaa (talk) 15:31, 18 July 2015 (UTC)
I skip the page. All problems will be tracked at a special section at the bot's tool. Regards, -- T.seppelt (talk) 23:58, 18 July 2015 (UTC)
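The skip-on-conflict rule described in this exchange could look roughly like the sketch below. It is an illustration only, not KasparBot's code (the operator says the bot uses his own Java framework); the function name, the plain key/value objects and the returned shape are all assumptions made for the example.

```javascript
// Hypothetical decision function. localParams are the values in the local
// {{Authority control}} template; wikidataClaims are the values on Wikidata.
function planAuthorityControlEdit(localParams, wikidataClaims) {
  var toCopy = {};
  var keys = Object.keys(localParams);
  for (var i = 0; i < keys.length; i++) {
    var key = keys[i];
    var local = localParams[key];
    var remote = wikidataClaims[key];
    if (remote === undefined) {
      // Only known locally: a candidate for copying to Wikidata.
      toCopy[key] = local;
    } else if (remote !== local) {
      // The two sides disagree: leave the page alone and report the conflict.
      return { action: 'skip', conflict: key };
    }
  }
  return { action: 'migrate', copy: toCopy };
}
```

For example, planAuthorityControlEdit({ VIAF: '1' }, { VIAF: '2' }) returns an object whose action is 'skip', so neither side "wins" and the page is left for a human to check.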

@Mpaa: I could make some test edits. Do you agree? --T.seppelt (talk) 07:26, 22 July 2015 (UTC)

ok for me --Mpaa (talk) 19:16, 22 July 2015 (UTC)
@Mpaa: done. I didn't notice any mistakes. -- T.seppelt (talk) 06:29, 23 July 2015 (UTC)

@Mpaa:, @BirgitteSB:, @Hesperian:, @Zhaladshar: Nothing happens here. Do you want me to hand in any information or do some more test edits? By the way, you can see the estimated edits of this bot at [1]. Kind regards, -- T.seppelt (talk) 16:13, 24 July 2015 (UTC)

I can't really add a flag because there is not really any consensus. I don't know enough about authority control to really weigh in on this. I'm hoping enough people who do can put together a consensus so that I know whether I can flag the account or not.—Zhaladshar (Talk) 14:32, 25 July 2015 (UTC)
I'd need to know more about this to say, but it looks as though the test edits were all done in the Author namespace. Will the bot's edits be limited to the Author namespace? The stated scope of the bot's edits is very vague. --EncycloPetey (talk) 15:36, 25 July 2015 (UTC)
The bot looks at all pages in Category:Pages using authority control with parameters. User pages won't be affected because they usually don't have Wikidata items. You can inspect the estimated edits at the bot's tool page. Kind regards, --T.seppelt (talk) 19:47, 25 July 2015 (UTC)
@Mpaa:,@EncycloPetey: I started a general discussion. -- T.seppelt (talk) 06:34, 5 August 2015 (UTC)

request for bot flag on account YiFeiBot

Removing interlanguage links from pages if the link is already on Wikidata. The bot is already approved as global bot and locally on commonswiki, enwiki, urwiki, zhwiki, zhwikivoyage. Programmed in python with pywikibot framework (source). --Zhuyifei1999 (talk) 05:58, 11 August 2015 (UTC)
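As a rough illustration of the task (not YiFeiBot's actual code, which is said to be Python on pywikibot), the check could be sketched like this. The function name and the shape of the sitelinks object are assumptions for the example; a link is dropped only when Wikidata already records the same title for the same language.

```javascript
// Hypothetical sketch: removes [[xx:Title]] lines whose language AND title
// are already present in the sitelinks taken from Wikidata.
function stripRedundantInterlanguageLinks(wikitext, sitelinks) {
  return wikitext.replace(/^\[\[([a-z-]+):([^\]\n]+)\]\]\n?/gm, function (link, lang, title) {
    var onWikidata = Object.prototype.hasOwnProperty.call(sitelinks, lang) &&
      sitelinks[lang] === title;
    return onWikidata ? '' : link;
  });
}
```

A link whose local target differs from the Wikidata sitelink is left in place, which is the cautious behaviour for the edition/work mismatches raised in the replies.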

From memory, there is already a bot doing this here, so I don't see the point in duplication of function. Beeswaxcandle (talk) 07:09, 16 August 2015 (UTC)
I'm unaware of any active & approved bot on this task. And some simulated runs show many to-be-removed interlanguage links. Would you point out the bot for me? --Zhuyifei1999 (talk) 18:08, 16 August 2015 (UTC)
Oppose - There are links here that should not be removed even if they exist somewhere on Wikidata. Some of the interlinkages are not yet properly figured out, and if we rely solely on Wikidata, we can lose track of the links altogether. I've seen this happening in recent months when someone at Wikidata decided that our copy of a work was an "edition", and so needed to be on a separate data item from the work itself. Because the interwiki links were all at Wikidata, the move meant that all interwiki links disappeared from that work. So, at this point, because reasonable precautions are not yet in place against that sort of thing, I would say "no", we shouldn't start stripping out the interwikis. --EncycloPetey (talk) 13:34, 16 August 2015 (UTC)
Hmm. The removal on Wikidata seems to be vandalism and should be reverted on sight, as existing items should be used whenever possible. --Zhuyifei1999 (talk) 18:08, 16 August 2015 (UTC)
To note that the edits at Wikidata are correct, as our works are editions, and part of a general book, it just hasn't been suitably thought through on how we interlanguage link due to this. — billinghurst sDrewth 00:40, 17 August 2015 (UTC)
Oppose Agree with EncycloPetey that it is premature to remove interwiki links by bot and transfer them to WD. There are still open discussions to be had about how to link books/editions/translations between languages, and about how best to cater for links between works. I would hope that the Wikisource conference proposed for Vienna in November would be able to have a good roundtable discussion about this matter. @Micru, Aubrey: unsigned comment by billinghurst (talk) 00:37, 17 August 2015.

I withdraw my nomination. If en.ws community decides not to remove the links, I'm fine with this outcome, and wait for the Wikisource conference in November. --Zhuyifei1999 (talk) 14:09, 17 August 2015 (UTC)

Help

Preferably, ask your HELP questions at Wikisource:Scriptorium/Help.

Repairs (and moves)

Other discussions

Need to edit a copy-protected page

The page Onward, Christian Soldiers is copy protected, but the source text exists on wikisource so the page should be redirected to The Army and Navy Hymnal/Hymns/Onward, Christian Soldiers. --Beleg Tâl (talk) 15:43, 29 July 2015 (UTC)

I unprotected the page and leave it up to you to move/redirect/replace as needed with the scan-backed version. Just let us know if we need to protect anything afterwards here. -- George Orwell III (talk) 18:39, 29 July 2015 (UTC)
Thanks. I've redirected it. I don't think protecting it is necessary, as it just gets in the way of legitimate editing, but I guess if it was a highly vandalized page we'll just have to wait and see if the vandalism recurs. —Beleg Tâl (talk) 19:55, 29 July 2015 (UTC)
Also odd that the protected version included two verses that were not supported by the accompanying source text. --EncycloPetey (talk) 04:08, 30 July 2015 (UTC)
Is that odd? It came from a different place. It might be necessary to turn the original into a dab and move the old text somewhere else instead of just deleting the extra verses. (Assuming they're genuine.) — LlywelynII 10:05, 30 July 2015 (UTC)
Resurrected and moved old page. Converted base page to {{versions}} — billinghurst sDrewth 11:35, 30 July 2015 (UTC)
You should consider renaming the old page. The old page is not sheet music, and the new one is, so the disambig is incorrect at best and confusing at worst. —Beleg Tâl (talk) 13:26, 30 July 2015 (UTC)
FYI Onward Christian Soldiers is also available in the Salvation Army Songbook here Onward Christian Soldiers Songbook No. 690 --kathleen wright5 (talk) 03:29, 8 August 2015 (UTC)

EB11, vol. XXVI

Something's hinky with Volume 26. Anyone know how to fix it?

[If the problem doesn't display on your end, what I'm seeing is Error: Numeric value expected in red text instead of any of the pages. When I try clicking on individual linked pages from the djvu file's page, I can see them but there's no button forward or backward into the other pages that haven't been edited yet.] — LlywelynII 04:36, 30 July 2015 (UTC)

Good grief! This issue has been reported, and fallen into the archives pending action(? As if?) many, multiple times. Either nobody cares or nobody has the sense to mark items "not to be archived until finally addressed." Something might be done one day but for now it appears nobody has the authority or the ability to fix this issue locally. It has been established as being a system problem of scope beyond merely Commons/WikiSource. AuFCL (talk) 05:29, 30 July 2015 (UTC)
And since it does not directly affect Wikipedia, nothing will ever happen to fix the problem. At least that's my experience. So the way to get it fixed is to add broken links and faulty citations all over Wikipedia referencing the content from EB1911 until the Wikipedians start griping about it. . . I'll stop snarking now. --EncycloPetey (talk) 06:40, 30 July 2015 (UTC)
Pardon, LlywelynII if you are feeling picked upon. It is not intentional—you merely happen to be about the dozenth person to ask about this matter. Seriously, let's make this item a mini-index and leave it tagged not to be archived until such time as this particular issue is fixed or otherwise goes away?

Accordingly : See any of (please add any I've missed):

AuFCL (talk) 07:39, 30 July 2015 (UTC)
Thanks for the apology, but nah I don't feel picked on. I can understand your perspective but our EB material is going to be some of the most-used material on the entire site, so it's just something that is going to continue being a problem. Does no one know what the issue is? or we do and we just have to wait for the WikiMedia code monkeys to get around to that particular typewriter?
(And actually there was a complaint I made somewhere about a similar problem in the EB9 and it actually did get fairly promptly addressed so I was assuming it might be something easy.) — LlywelynII 08:32, 30 July 2015 (UTC)
Looking at this conversation, it looks like there's some problem with large numbers of text chunks in the scan? Couldn't we just cut the .djvu file into two pieces? — LlywelynII 10:15, 30 July 2015 (UTC)
@Llywelyn: The issue (as I read it) is that the <pagelist> componentry calls the API of the djvu file for the number of pages, and what it is bringing back is not in a format it comprehends (presuming that it is an error message rather than a number), such that proofreadpages api spits out that error message. So it fails for the full page span, and it fails for a partial list (I tested.)

With regard to the commentary, if we are wanting to get work done, sometimes we have to be the squeaky wheel, and if we don't make our needs obvious, and clearly state the problem, and the effect, then it often won't get traction. What we had on the phabricator ticket about the issue is not enough to get anyone's interest in it being a specific issue that needs speedy resolution; it gives no indication of the size or impact. Phabricator is the avenue to the developers, and lots of foot traffic, votes, and helpful noise across a ticket will bring it to attention. — billinghurst sDrewth 11:13, 30 July 2015 (UTC)

having made the other 26 volumes match and split ready, i’ve been mulling copying over all the articles in vols 26 & 27, from IA ocr. the side by side could be stitched later. (the articles in the volume would be findable in a search and linkable from WP). Slowking4Farmbrough's revenge 23:16, 2 August 2015 (UTC)

Tech News: 2015-32

15:51, 3 August 2015 (UTC)

quote before dropped initial

I have tried a previously provided solution to format a quote mark before a dropped initial but it hasn't worked for me on this page. Any suggestions? — Zoeannl (talk) 03:50, 5 August 2015 (UTC)

Done. @Zoeannl: Just "float left" the quote, not the "dropinitial" too, which has its own formatting to push it left. There is guidance at Template:Dropinitial on the options, depending on your desired output — billinghurst sDrewth 04:07, 5 August 2015 (UTC)

Future of Authority control on English Wikisource

Hello everyone,

I am requesting the bot flag for my bot (KasparBot) at the moment. According to Wikidata:WikiProject Authority control/Status the bot will copy authority control information from this wiki to Wikidata and clean up the template (stage 2): {{Authority control|...}} → {{Authority control}} You can inspect the assumed edits here. The second task includes embedding blank templates to those articles which have authority control information on Wikidata but not on enwikisource (stage 5). The bot runs on enwiki, frwiki and multiple other wikis too. The final aim is to deprecate the local parameters of the template and use Wikidata information by default (stage 4). This would remove differences and improve the consistency of Authority control information on all wikis. Please comment on this proposal. I need a community consensus to run the bot. Thank you very much.

Regards, -- T.seppelt (talk) 06:33, 5 August 2015 (UTC)
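The stage 2 clean-up, {{Authority control|...}} → {{Authority control}}, amounts to a text substitution along the lines of the sketch below. This is an illustration only, not the bot's code; the function name is made up, and the pattern assumes the template's parameters contain no nested templates.

```javascript
// Hypothetical sketch: blanks the parameters of {{Authority control|...}}.
// Assumption: parameter values never contain "{" or "}".
function blankAuthorityControl(wikitext) {
  return wikitext.replace(
    /\{\{\s*[Aa]uthority control\s*\|[^{}]*\}\}/g,
    '{{Authority control}}'
  );
}
```

A template that is already blank has no "|" and so is left untouched, which makes the substitution safe to run more than once on the same page.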

I would welcome the cleanup of authority control by bot, rather than my slow manual processes, and will check out your example edits. I believe that we have cleaned up those author pages where there is information discrepancy, so would think that there is probably a quick and easy job.

Rather than the addition of the authority control template to each author page, I would think that we should be better considering the automatic addition (embedding to base?) of the template to our existing configuration for author pages. I would much rather that we looked to have it applied to the base of all top level author pages (not to subpages) so it never has to be fussed about in addition ever, it is just there. — billinghurst sDrewth 06:57, 5 August 2015 (UTC)

That would be in my eyes the best solution. (we should think about something like this as stage 6..) I could also help with removing the entire template after checking for differences. Regards, --T.seppelt (talk) 07:20, 5 August 2015 (UTC)
1) I also would welcome the cleanup of authority control by bot, rather than my slow manual processes, and will check out your example edits:) We have indeed cleaned up those author pages with discrepancies, but only with respect to viaf, there might be quite a few GND, LCCN, etc to fix. And those values are not necessarily correct ones, they might have been added by the gadget when viaf was added.
2) there is an issue with cleaning up the template: the existing gadget "add authority control" is not working properly if there is a naked {{Authority control}} template on the page. This is why there are quite a few appearances of "{{Authority control|$1}}".
3) I also think that authority control should be a part of author template (though, it's a separate discussion:)
To sum up, I support the proposal. Cheers, Captain Nemo (talk) 07:59, 5 August 2015 (UTC).
The gadget should just be killed, it is well superfluous and pre-dates WD. We now have the means to identify when and where the AC are, and when they are not local, and if we secrete it to every author page, it will pick up the data; and if there is no corresponding page at WD, we should note and resolve that separately. — billinghurst sDrewth 12:28, 5 August 2015 (UTC)
The proposal is not limited to doing Authors pages, and for that reason I oppose this. I've seen how often the data are added incorrectly at Wikidata for our works, or a new data item is started incorrectly there, or a mismatch between items was made. The identification and matching between works cannot be handled by bot. There are just too many errors, and too many additional interwiki linking problems when it comes to dealing with works and editions. --EncycloPetey (talk) 23:28, 5 August 2015 (UTC)
@EncycloPetey: then how are you with limiting this to Author/Portal namespace where we predominantly have AC in place? — billinghurst sDrewth 23:53, 5 August 2015 (UTC)
Follow-up. To note that we have less than 250 works with authority control in the main namespace which is a manageable number to handle manually. If the detail added at WD is problematic for works, then we should be looking to review and provide that feedback to WD as part of their quality assurance processes. — billinghurst sDrewth 23:58, 5 August 2015 (UTC)
If only it were that simple. I've looked at the guide page they have for "books". The talk page is a long running list of unresolved issues. Issues are being raised there, but they're not being resolved. Even for individual works that I've done there, I've sometimes had to go back and forth on conversation with two or more people about issues for which there is no provision and no common-sense solution in place either. --EncycloPetey (talk) 00:34, 6 August 2015 (UTC).
Agree w/EncycloPetey. And the larger problem here is we never really did a good job of tracking edition info. IMO, the year parameter should have been associated with the publisher (& its location/nation/city) rather than the title, work or author. That combination of Year & Publisher would more correctly dictate the print run of any given edition regardless of translations, etc. What we have now makes it too hard for a bot or script to separate a work from its editions and/or the author/editor and so on. Unfortunately, the only way to ensure accuracy at the work vs. edition level without the Publisher part would be to do it manually (pretty much the way we've been dealing with the edition nuance since I first landed here). -- George Orwell III (talk) 00:00, 6 August 2015 (UTC)
Wait a second, that argument is starting to "throw the baby out with the bathwater" with just a blanket NO statement. Firstly, 1) the first part of the proposal is to put our AC information against our WD data that was taken from our site, so that seems like a good thing. If we do that, why would we still want local parameters? If we have concerns that others are not taking the same stringency with WD additions then that is about us controlling the addition of AC template addition here. So we modify the proposal that we don't have the bot add AC templates here, and that seems like a wise choice, we can manage that ourselves by our processes, and manage and review its addition. — billinghurst sDrewth 00:19, 6 August 2015 (UTC)
"Seems" on the face of the proposal perhaps, but definitely NOT in the light of what I know to be the case in actual fact. I've been trying to clean up just the two dozen surviving dramatical works from ancient Greece and have run into all sorts of mismatch problems and errors. The concept "seems" fine in theory, but in fact it is too flawed for me to support it. And, yes, even the data taken from our own site is wrong, as it is from other Wikisource projects. And the problem is not just the addition of data here; I have seen multiple cases where bots were used to generate incorrect items on Wikidata from information here. Damage is already being done. --EncycloPetey (talk) 00:28, 6 August 2015 (UTC)
Then you are saying, don't trust our main namespace data. If that is the case, why don't we kill it all and start afresh. We should then monitor the addition of the template locally, and ensure that the data has been appropriately added at WD, and any data here is pushed to WD. Or, we migrate to a verified template for AC signed by the verifier, similar to how Commons manages Flickr imports. — billinghurst sDrewth 00:45, 6 August 2015 (UTC)
Let me clarify - as for the AC template in Author: namespace; I would tend to agree that our current info has been through enough vetting to "hand over" to Wikidata. That doesn't mean it's perfect by any measure either.

On the other hand, the use and/or exportation of AC info associated primarily with anything in the main or translation namespace is premature at best. That info is just plain not ready for Wikidata by any stretch -- in my opinion -- because our current repository of information for those namespaces lacks the more accurate publisher[-city]-year-edition relationship. That's more of an [Index: to] Header template deficiency rather than an AC template issue or bot problem. -- George Orwell III (talk) 00:48, 6 August 2015 (UTC)

On top of that, Wikidata items do not match with ours (and will not match) for individual works. For example, the English Wikipedia has an article about Shakespeare's Hamlet, which has a matching data item on Wikidata, and all the other Wikipedias have their articles listed there as well. Wikiquote too. However, Wikisource has a particular edition of Hamlet, or several such editions. By current proposed norms on Wikidata, each edition must have its own separate data item, so there is not a direct relationship between any copy of Hamlet here and the general Wikidata item. What must happen is for Wikisource to create a general page for Hamlet where all possible current and future editions will be listed, and that page will be added at Wikidata. Any individual editions we have must become separate data items at Wikisource, with all the information specific to that edition. Then the item for that edition is identified as an edition on the general Wikidata item, and the general item gets a link to the instance of each edition. The same holds true for any translation of a work from the original language, so the eventual result at Wikidata is that every Wikisource work listed anywhere on any Wikisource project will have its own individual data item, with its own specific set of bibliographical info. This edition has a different year, publisher, ISBN, etc? Then it goes on a new data item, and all the AC info has to be tracked down anew.

That is most certainly not how any of the Wikisource projects are currently set up.

This also creates the problem that there will be no wikilinks to, from, or between any works on any Wikisource. The general page with all the Wikipedia links will be separate from the First Folio edition of Hamlet, which will be separate from the Quarto, which will be separate from any later editions, from the French translation, from the Spanish translation, etc. So the ultimate result of the current scheme will necessarily remove all wikilinks between Wikisource projects for individual works.

So, billinghurst, it's not just the main namespace data that's wrong on Wikisource, but rather the entire structure of our main namespace has to change in order to connect properly with Wikidata.

--EncycloPetey (talk) 01:07, 6 August 2015 (UTC)
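The work/edition split described in this post can be pictured as a small data structure. Everything in the sketch below is hypothetical: the item ids, labels, page titles and field names are invented for illustration and are much simpler than Wikidata's real item format. The point is only that each Wikisource page hangs off an edition item, so finding the other language versions means walking from the work to each of its editions.

```javascript
// Hypothetical, simplified items (not Wikidata's real JSON).
var items = {
  Q_work: {
    label: 'Hamlet (the work)',
    sitelinks: { enwiki: 'Hamlet' },
    editions: ['Q_folio', 'Q_french']
  },
  Q_folio: {
    label: 'Hamlet (First Folio edition)',
    editionOf: 'Q_work',
    sitelinks: { enwikisource: 'Example English page' }
  },
  Q_french: {
    label: 'Hamlet (French translation)',
    editionOf: 'Q_work',
    sitelinks: { frwikisource: 'Example French page' }
  }
};

// Collect every Wikisource page reachable from a work through its editions.
function wikisourcePagesForWork(allItems, workId) {
  var pages = [];
  (allItems[workId].editions || []).forEach(function (editionId) {
    var links = allItems[editionId].sitelinks || {};
    Object.keys(links).forEach(function (site) {
      if (/wikisource$/.test(site)) {
        pages.push({ site: site, title: links[site] });
      }
    });
  });
  return pages;
}
```

Note that the work item itself carries no Wikisource sitelink in this model, which is the loss of direct interwiki links the post is warning about.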
You and I both have been vocal about the last para (me on wikisource-l, and you at WD), and I would prefer to separate that discussion away from the bot discussion. It is a discussion that needs to happen, but can we please separate it.

I believe that we can agree that at this point of time, that the migration of enWS AC data from the main ns should not occur, and in fact the issue of AC data in mainspace is complex and problematic primarily in association with an edition. Accordingly, I propose that we would recommend to the bot operator (first redraft)

  • to not take enWS main ns AC data to WD (veracity in doubt)
  • to not undertake AC changes by bot to the main ns
  • that the translation ns should be off-limit as it is Wikisource'd transcription area, and should actually be devoid of any valid AC data at this point of time
  • that the Author and Portal ns should be able to be migrated, though there will be some errors as no validation has been undertaken
  • that following completion of migration of data that the bot operator to recheck with this community whether the existing AC templates in the Author and Portal ns should be updated or removed, pending any discussion that we undertake about a suitable replacement strategy.
  • that there is an indication whether this is a one-off proposal to migrate data, or whether it will be a regular check and update process for the bot
Thoughts? — billinghurst sDrewth 01:29, 6 August 2015 (UTC)
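To make the redraft concrete, the per-namespace rules could be written down as a small table that a bot operator checks before touching a page. This is a sketch of the proposal as worded above, not an agreed policy or any bot's configuration; the names AC_BOT_POLICY and mayExportToWikidata, and the choice to refuse namespaces that are not listed, are assumptions.

```javascript
// Hypothetical policy table mirroring the first redraft.
var AC_BOT_POLICY = {
  Main:        { exportToWikidata: false, localTemplateEdits: 'no' },
  Translation: { exportToWikidata: false, localTemplateEdits: 'no' },
  Author:      { exportToWikidata: true,  localTemplateEdits: 'recheck with community after migration' },
  Portal:      { exportToWikidata: true,  localTemplateEdits: 'recheck with community after migration' }
};

// Assumption: any namespace not listed above is treated as off-limits.
function mayExportToWikidata(namespace) {
  var rule = Object.prototype.hasOwnProperty.call(AC_BOT_POLICY, namespace)
    ? AC_BOT_POLICY[namespace]
    : null;
  return Boolean(rule && rule.exportToWikidata);
}
```

With this table, mayExportToWikidata('Author') is true and mayExportToWikidata('Main') is false. Whether the run is one-off or recurring, the last bullet above, is left out because the redraft only asks for an indication.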

May I present a radical perspective?

Once I trusted Author: pages on enWS, if only because if I happened to spot a flaw to which I knew the correction I could so apply said correction, and take the personal consequences if I introduced an error.

Then along came WD, and that was O.K. because only the language modified (I could edit a "claim" over there instead of wikitext here to fix a fault; and some kind of robotic synchronisation between the systems restored consistency) but the functionality remained largely similar.

NOW you are suggesting all changes must be propagated back from WD (which along the road has modified the rules and I am no longer allowed to modify in ways which previously were not barred) to enWS, and enWS modifications are to be isolated/ignored or perhaps in the near future not permitted either? Slice-by-insidious slice the ability to pay even the shabbiest lip-service to the concept of "the X that anyone can edit" has been stripped until this last step leaves the basic concept in ruins. Is it too high a price to pay to eliminate the core mission? I ask why did it come to this at all?

I must side with EncycloPetey above in opposing this in the proposed form. Either that or I have entirely misunderstood the flow of development and would very much appreciate being reassured that my interpolation of feared future changes is incorrect. AuFCL (talk) 01:35, 6 August 2015 (UTC)

Re: billinghurst: The Author namespace is the only one that the bot has been tested for, and the only one where we've had serious and methodical vetting by experienced users. Thus, the Authors namespace is the only one I think could actually benefit at this time from AC data transfer (in either direction). However, this limitation is strongly at odds with the original bot proposal, and with the response I got when I queried the bot operator about it. So, I would want to hear from the bot operator about this, and be sure he understands the nature and severity of the issue for the other namespaces. (I am of the opinion that no one who properly understands Wikisource is proposing to run a bot. We had the same issue on Wiktionary when Wikidata proposed housing the dictionary information, but that issue is even more deeply challenging, in ways that I won't try to elaborate upon here.) --EncycloPetey (talk) 01:43, 6 August 2015 (UTC)
Continue to concur with EP; the Author namespace is the only namespace where enough iterations of review by us as well as by WD have taken place to be considered remotely accurate. That still won't do us much good if any non-English WS doesn't closely follow our approach and templates to the Author namespace not to mention conflicting info between the domains for any given Author.

And I agree picking up the editions -to- works discussion should take place besides this narrower one. -- George Orwell III (talk) 01:51, 6 August 2015 (UTC)

Re: AuFCL: I share your concerns, and am uncertain towards what direction future development will lead. However, I do see one strong advantage for Wikisource integration with Wikidata: greater accessibility to our resources, and thus a stronger internet presence. FreeBase now pulls directly from Wikidata for searches, and the addition of LoC and other control information will permit users of those libraries and databases to access our works through searches from those institutions' databases. We do more good for the global internet community with that kind of accessibility. --EncycloPetey (talk) 01:51, 6 August 2015 (UTC)
Re portal namespace, there are 49 [12]; we can move them manually if you want, I don't mind, though I believe that the users who added them were competent. If it is only the Author: ns, fine with me, it contains >> 99% of our AC additions.
Re AuFCL's comments. Absolutely not, that is melodramatic scaremongering. We control our templates and how we code and use them, each of them has the ability to override whatever data is in WD, and I have made no proposal to change that. Each of us has the ability to edit WD and so manage the data that appears at our site, but yes, it is off-WS. The issue is that our data is static and UNCHECKED, so when a page moves at a sister wiki, it is not updated here; when it is moved at a sister WS, it is not updated here. As new authorities are added or are corrected or merged they won't appear here. Certainly there are risks with big data, but we manage our risks, get involved in the process, and are rigorous; we don't undertake the ostrich manoeuvre. It is a value proposition and we have an open discussion about what we do and how. WD has had issues with data, and they are very much addressing that, and part of that is taking our data into their system. Changes should be reviewed, and we have a good community to do that, and we should ask for good processes to maintain the integrity of data. There are also benefits, and they should also be part of the discussion and how we measure the value of a proposal. — billinghurst sDrewth 02:48, 6 August 2015 (UTC)
I have no intention of prosecuting this further. Melodramatic or otherwise, that is my honest take upon the apparent trend of changes and I await reassurance or vindication, neither of which have been provided by statements made so far. And yes, I accept this is an argument which should take place elsewhere. AuFCL (talk) 02:58, 6 August 2015 (UTC)
Okay, so there is a (narrowed down) proposal that pushes our Author control authority data to WD where it doesn't exist there. [I have already been through and cleaned up instances of where we have had data mismatches based on the VIAF.] Then the proposal is to display that data here by use of the WD call using the template. For existing authors there should be negligible difference in data [noting that we already have WD populated AC templates, partially or fully.] What are your specific issues with the specific proposal as it sits, and what risks do you see that we would need to manage? — billinghurst sDrewth 03:23, 6 August 2015 (UTC)
With regard to VIAF, do you find multiple VIAF identifiers for the same item much? I have been working through the 40 or so surviving ancient Greek plays, adding authority control data from LoC, BnF, and VIAF. More often than not, the VIAF has two or even three identifiers for the same Greek play--most of which appear to be duplicates as a result of not synchronizing the import of BnF data into the mix (the oddball identifier is usually French, or includes the French listing). Have you found this to be the case? And at some point we'll have to consider how our authority control template will deal with multiple VIAF values. In the immediate case, the question is whether the bot can handle such a situation, namely, that there is a VIAF identifier mismatch, but neither is actually wrong and both should be used. --EncycloPetey (talk) 22:36, 6 August 2015 (UTC)
[For the works that I do, I struggle to find them in VIAF in the first place.] For the authors, I believe that there has been a big effort to merge and resolve duplicates, and how much of that has been due to Wikidata and public collaboration would be interesting to know. So I would suggest that you add the multiples and then set a "preferred" ranking for the one that will continue, or set the unpreferred to "deprecated". I have seen that where VIAF has amalgamated data that they have now created their own system of redirects for the old numbers, so I am hoping that WD will have a system to automate updates and additions either via the VIAF identifier, or via the individual repositories pushing data. [one day!] — billinghurst sDrewth 00:09, 7 August 2015 (UTC)
Unfortunately, when there is more than one VIAF identifier for a work, I do not know which identifier will be the one that continues. The split is sometimes half-and-half, or a three-way split of libraries between identifiers. As I say, I've found this to be the case more often than not (and for well-known and long-standing works of literature such as the ancient Greek dramas!). If I knew someone I could contact who handles VIAF data, and could say "all these have been found to be duplicates", then I would do so. But for now, this state of affairs bodes ill for trying to coordinate VIAF information pertaining to works of literature either at wikidata or here, since I expect the same situation will apply to the VIAF identifiers of other works. What I have tried to do is to include the LoC Authority and the code from the BnF in the Wikidata item, so that users (who may only receive only one of several VIAF identifiers through the template) can still access the French and US listings. I tried doing the GND as well, but the German library data was not as easy to figure out, even when I had the VIAF listing already at hand. --EncycloPetey (talk) 00:38, 7 August 2015 (UTC)
The template will select and display one, and if they are equivalent (and not set separately), then it is not particularly an issue. The VIAF data will update and be resolved and/or the predominant will be set. In my experience, the lowest number is retained for VIAF data, presumably as it is the oldest record, but in the end it doesn't matter, due to the redirects that are implemented, so as long as we are not selecting the wrong work/link, all should be good. — billinghurst sDrewth 01:24, 7 August 2015 (UTC)
@Billinghurst:, @EncycloPetey:, @George Orwell III:, @AuFCL: Some urban legends seem to be created in this discussion:) So, some hard facts about the relationship between VIAF and Wikidata. Wikidata now is the official partner of VIAF: see here. What that means: 1) the wikidata item about a person is an official part of the VIAF cluster; 2) hard data such as birth and death date is sought and used in the VIAF cluster; 3) VIAF data from official VIAF partners such as LC, BNF, DNB, etc. is fed back to wikidata; 4) the flow of data between VIAF and wikidata is done on a regular basis. That means that wikidata has the same up-to-date data as VIAF and all its official partners. That means that wikidata data about a person is in no way inferior to any constituent part of VIAF including LC, BNF, DNB and so on and so forth. Storing this data locally (at wikisource) means that the data is not updated. If you keep an eye on category:viaf different at wikidata you will see that all differences are due to wikisource having deprecated VIAF data (I myself have deleted dozens of deprecated VIAFs from wikisource in the past month or so!). So, I hope, this discussion can be concluded to everybody's satisfaction.
At the same time, I fully agree that the situation is very different in the main namespace. At least for the time being I think we should manage this on our own, locally. Cheers, Captain Nemo (talk) 08:58, 11 August 2015 (UTC).
Further detail about VIAF and Wikidata interaction can be found here. This is blog by Thom Hickey, who is the chief scientist at OCLC. So, please, no more about accuracy of viaf partner data in wikidata:) Cheers, Captain Nemo (talk) 09:15, 11 August 2015 (UTC).
I unconditionally withdraw my opposition, as my fundamental objection has been proven to rest upon a misunderstanding which I shall not detail in case taken out of context might add to any confusion. (Happy to discuss out of stream should any one happen to be curious.) Accordingly I also strike my comments above. AuFCL (talk) 10:26, 11 August 2015 (UTC)
@Captain Nemo: "wikidata has the same up-to-date data as viaf and all its official partners" is overstating matters. In many cases I've found, wikidata is altogether missing links to VIAF or any library partners of VIAF. --EncycloPetey (talk) 13:50, 11 August 2015 (UTC)
@EncycloPetey: 1) Are you sure you are not confusing your past experiences with the current state of things? If you have a look at the links in my prev. post you will see that the integration started in April 2015. Any past grievances are just private history. 2) If you have in mind situations i) when viaf has multiple clusters for one identity or ii) viaf bundles several people in one cluster, then integration is exactly the way to help to resolve it. Wikidata regularly creates database dumps for these cases and they are being resolved both by viaf and on wikidata. One recent example I am aware of is viaf clusters related to popes. Those chaps used to have several viaf clusters coming from LC and DNB which are now (mostly) merged. I know of this because I cleaned up many of those stale viafs here on wikisource; there would be no need for that manual work if our authority control template just imported fresh data from wikidata. 3) If you are referring to different spellings and birth/death years then again, viaf now actively seeks and imports this data from wikidata. I don't know exactly how viaf does that but this matching is not a trivial task so it takes time. 4) If you are concerned with something else, please give specific factual examples and I will try to clarify. Cheers, Captain Nemo (talk) 02:01, 12 August 2015 (UTC).
@Captain Nemo: 1) I am raising concerns with the information on Wikidata that I encountered while editing in the past week. So, yes, this is a problem in the current state of affairs on Wikidata. (2) Yes, multiple VIAF for a single entity, often two or three. (3) No, spelling and dates have nothing to do with this. (4) I meant exactly what I said. I spent time last week going over the information for a small group of related data items. I found that most had no data for VIAF, LoC, or BnF. Where they had VIAF data, there were usually additional VIAF identifiers missing because of redundancy in the VIAF data for the same entity. --EncycloPetey (talk) 03:56, 12 August 2015 (UTC)
@EncycloPetey: I had a look at your edits for August on wikidata and failed to find any edit of the type you mention. Please note (maybe it wasn't articulated clearly before) that my statements are about the author namespace only. The plays and their characters and so on are a completely different issue; viaf is not doing those at the moment. If you are referring to wikidata items that you have not edited, please give specific examples and I will try to clarify. Cheers, Captain Nemo (talk) 05:55, 12 August 2015 (UTC).
@Captain Nemo: When you say "viaf is not doing those", I assume you mean that they are not cleaning up redundancy or errors. Sorry, but I did not understand that you were limiting your discussion of Wikidata and VIAF to author information only. With regard to authors, I have not seen any problems on Wikidata that would affect information we use on our site, but I have seen (in the past few days) an experienced Wikidata (and MW) editor who added grossly incorrect information to Author pages. It was the result of adding information that he did not understand himself, and had not bothered to source or check, which was rather ironic given that he was, at the same time, involved in a site discussion about sourcing and verifying claims added to data items. --EncycloPetey (talk) 16:54, 12 August 2015 (UTC)

Hello again,

I would like to reach a consensus on this topic. Please state Oppose or Support to indicate whether or not you agree with this proposal. My bot would do this:

  1. fetch entries of Category:Pages using authority control with parameters in namespace 102 (Author)
  2. compare the AC information to Wikidata
    • add claims if no information is available on Wikidata
    • keep the information on Wikisource in case of differences, malformed values etc. and skip step 3
  3. remove the local information from Wikisource
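
The steps above can be sketched roughly as follows. This is only an illustration of the decision logic, not the bot's actual code: the function name `plan_page`, the dictionary shapes, and the rule that an empty value counts as malformed are all assumptions made for the example.

```python
from typing import Dict, Tuple


def plan_page(local: Dict[str, str], wikidata: Dict[str, str]) -> Tuple[Dict[str, str], bool]:
    """Decide what to do for one Author page (hypothetical helper).

    local    -- authority control parameters found on the Wikisource page
    wikidata -- values already present on the linked Wikidata item
    Returns (claims_to_add, remove_local).
    """
    claims_to_add: Dict[str, str] = {}
    keep_local = False
    for key, raw in local.items():
        value = raw.strip()
        if not value:
            # Assumption: an empty value stands in for "malformed".
            keep_local = True
        elif key not in wikidata:
            # Step 2, first bullet: nothing on Wikidata yet, so add a claim.
            claims_to_add[key] = value
        elif wikidata[key] != value:
            # Step 2, second bullet: values differ, keep local data, skip step 3.
            keep_local = True
    # Step 3 only happens when nothing forced us to keep the local data.
    return claims_to_add, not keep_local
```

For instance, `plan_page({"VIAF": "123"}, {})` proposes one new claim and allows the local parameter to be removed, while `plan_page({"VIAF": "123"}, {"VIAF": "456"})` proposes nothing and leaves the page untouched.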

Please be also aware that you can inspect all future edits on https://tools.wmflabs.org/kasparbot/ac.php?select-project-enwikisource=1. All faulty pages will be available there. The current situation is:

count state
13557 template can be replaced
1023 malformed value
285 different value on wikidata
24 unknown template property
5 technical problems
1 more than one template embedded
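
As a quick check of what these figures imply, the share of pages that could be converted automatically can be worked out from the table; the dictionary below simply restates the counts listed above.

```python
# Counts copied from the table above.
counts = {
    "template can be replaced": 13557,
    "malformed value": 1023,
    "different value on wikidata": 285,
    "unknown template property": 24,
    "technical problems": 5,
    "more than one template embedded": 1,
}

total = sum(counts.values())
share = counts["template can be replaced"] / total
print(f"{total} pages, {share:.1%} replaceable")  # 14895 pages, 91.0% replaceable
```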

I would like to close this RfC on 31st of September. -- T.seppelt (talk) 08:59, 17 August 2015 (UTC)

  1. Support as proposer -- T.seppelt (talk) 08:59, 17 August 2015 (UTC)
  2. Support -- Captain Nemo (talk) 11:55, 17 August 2015 (UTC).
  3. support. I have reviewed the first 1400 listed entries and would just say that the non-VIAF data is not worth transferring or retaining and would be of little value, and we should just strip the author:ns AC templates of all parameters. Though the 5 would be worth manually eyeballing. — billinghurst sDrewth 14:25, 17 August 2015 (UTC)

Removed Authority control/VIAF gadget[edit]

I have removed the VIAF gadget from local display. Magnus has developed an alternate version at Wikidata that is more suited to needs. You can add it to your common.js at Wikidata, or you can add it to your global.js file at meta. If you do edit Wikidata and wish to add authority control data and are stuck, then please talk to me and I can assist. — billinghurst sDrewth 23:26, 16 August 2015 (UTC)

Author template now will pull image from Wikidata if not specifically chosen[edit]

I have tested and modified {{author}}. I have added the ability for the template to directly access the image data stored at wikidata if it is available. The specific use of the parameter in the template will override Wikidata choice, so the only difference noted will be that there will be some author pages with images without a parameter call. Once this has the approval of the community I will look to remove the empty parameter field from active templates, then we can talk about how we progress with comparing wikidata and how to populate where our data is not in Wikidata, etc. — billinghurst sDrewth 14:13, 5 August 2015 (UTC)

/me semi-struts. 1,596 total and still populating in Category:Author_pages_with_Wikidata_image. There may be issues if there are multiple images against an author, so we will have to manually override for the moment. With such situations we need to set a preferred image at WD and get Module:Wikidata updated to provide a single preferred result. — billinghurst sDrewth 14:22, 5 August 2015 (UTC)
Only 5 Author pages with [image?] script errors at this point. Tracked in Category:Pages with script errors. -- George Orwell III (talk) 00:06, 6 August 2015 (UTC)
Nice GOIII, and great that we already have a system to catch those errors, I was wondering how I was going to find them. — billinghurst sDrewth 00:22, 6 August 2015 (UTC)
Fixed the errors, all due to the pages not having an instance set at WD. — billinghurst sDrewth 00:33, 6 August 2015 (UTC)
@George Orwell III: Can you think of a means to track where we have Author pages with red file links? Be they failed manual additions, file removals or by my update to the template. — billinghurst sDrewth 00:58, 6 August 2015 (UTC)
Off-hand I'd say 'not easily'. The section Headings seem consistent enough to come up with a group of them to probe through with some reliability but the entries themselves are not always listed the same nor use the same syntax. If every entry were template-based, syntax, etc. wouldn't matter because we could poll the template parameters themselves to track what main or translation space works exist or are merely listed (red links). I'm not saying it can't be done but a deeper look on my part will have to wait until the weekend. -- George Orwell III (talk) 01:13, 6 August 2015 (UTC)
Additional thought: seems like the possibilities of doing something like that increases if going by what the What Leaves Here? gadget currently provides. -- George Orwell III (talk) 01:26, 6 August 2015 (UTC)
Hold on a sec -- did you mean red-links to the File: namespace for missing author portrait images or red links to works in the main or translation namespace? -- George Orwell III (talk) 01:29, 6 August 2015 (UTC)
File namespace only at this point of time. I was thinking that we may be able to run a sql query from quarry:? — billinghurst sDrewth 01:32, 6 August 2015 (UTC)
Now I follow you. I'd think there would be a way but not sure how. Isn't it just a matter of determining whether or not the red-link contains something like &action=upload ? Isn't that what happens when you click on a File: link to a non-hosted file? -- George Orwell III (talk) 01:37, 6 August 2015 (UTC)
@John Vandenberg, Pathoschild: can either of you help here? — billinghurst sDrewth 03:13, 6 August 2015 (UTC)
Unless I am misunderstanding, Category:Pages with missing files is what you are looking for.--Erasmo Barresi (talk) 17:13, 6 August 2015 (UTC)
<facepalm> I was thinking something namespace specific, and losing the trees for the forest. I will fix up those images over the weekend. Fixed Thanks. — billinghurst 13:24, 7 August 2015 (UTC)
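
For a namespace-specific view of that category, one option is to filter it through the MediaWiki API's `categorymembers` list. The sketch below only builds the request URL and does not fetch anything; the helper name `missing_files_query` is made up for illustration, and namespace 102 is taken to be Author, as stated in the bot proposal earlier on this page.

```python
from urllib.parse import urlencode

API = "https://en.wikisource.org/w/api.php"


def missing_files_query(namespace: int = 102, limit: int = 500) -> str:
    """Build an API URL listing pages with missing files in one namespace."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:Pages with missing files",
        "cmnamespace": namespace,  # 102 = Author on enWS
        "cmlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


print(missing_files_query())
```

Opening the printed URL in a browser should return the matching pages as JSON, which avoids needing a database query for the File-namespace case.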
Just a not-so-important remark: When choosing which image to rank as preferred, I base my decision on which image most featured Wikipedia articles have in their leads. See for example d:Q187982.--Erasmo Barresi (talk) 17:22, 6 August 2015 (UTC)
I have already had this discussion with WD about 'preferred rank' and image display, and the (easy) ability to return one result rather than many. To me the "preferred" will always be the one that best represents the person, whereas what I choose to display at enWS will be what best represents them as the author, which for me will mean that I love to show any caricature if there is one available. It is exactly the argument over "preferred" that will always allow us to have choice in the header parameter, but call a default if nothing is chosen. — billinghurst sDrewth 23:54, 6 August 2015 (UTC)
Another issue I've seen that illustrates why "most common use" does not always offer the best choice. I've seen instances where there is a play and a sculpture with the same title, and so there is a disambiguation page to help users distinguish the two. But then the lead image for the play is an image of the statue, which visually negates the effort made to disambiguate the two items. And in many cases, I've found that the most commonly used image is used solely because it was used first, the original article propagated through translation to many wikipedias, and the lesser used image is a newer higher-quality image that is just beginning to be used. Again, in this situation, the more common usage in no way indicates a better selection or preference. --EncycloPetey (talk) 00:46, 7 August 2015 (UTC)
Right, EncycloPetey. My approach could work as a rule of thumb rather than a hard and fast one.--Erasmo Barresi (talk) 08:55, 7 August 2015 (UTC)

Wikisource celebrates Public Domain Day?[edit]

I have recently started the 2016 in public domain page on English Wikipedia and started a discussion on the Wikimedia UK list. I think some Public Domain Day events in January would be a good way of making more people aware of Wikisource and would offer the opportunity to provide support for newcomers. It would be great to hear other people's views. Leutha (talk) 10:04, 8 August 2015 (UTC)

There's not really much for us, since we use US copyright law, and nothing published before 2002 is going to go into the public domain until 2019-01-01. Unpublished or recently published material by all those authors dead 70 years is now PD, but that usually only has interest to scholars.--Prosfilaes (talk) 20:38, 9 August 2015 (UTC)
in the US, the open access librarians tend to blog around January 1. [13] maybe some networking building on wikiloves libraries might bear fruit. Slowking4Farmbrough's revenge 01:24, 10 August 2015 (UTC)

Scanned books from the Archaeological Survey of India[edit]

A good collection is available here, covering subjects from around the world on a multitude of topics. Hrishikes (talk) 05:55, 10 August 2015 (UTC)

@Hrishikes: Be sure to add it to Wikisource:Sources. — billinghurst sDrewth 10:16, 10 August 2015 (UTC)
Done. Hrishikes (talk) 11:26, 10 August 2015 (UTC)

Upload to Commons: an innovation[edit]

Recently I have encountered an obstacle to Commons uploading; even when the file is below 100 MB (but contains a lot of pages), Commons says that the entity is too large when attempting importation from IA and shows internal API error when attempting direct upload. I have bypassed the objection by taking a roundabout route: I first uploaded a very small file with the desired file name and once Commons accepted it, I uploaded the full file as a newer version to the previously uploaded file. This time, uploads were successful. Examples: Index:Debrett's Peerage, Baronetage, Knightage and Companionage.djvu, Index:Dod's Peerage, Baronetage, Knightage etc. of Great Britain and Ireland.djvu, Index:The Peerage, Baronetage and Knightage of the British Empire Part 2.djvu, Index:The Peerage, Baronetage and Knightage of the British Empire Part 1.djvu. This is for general information for others who have faced this kind of problem. Hrishikes (talk) 06:19, 10 August 2015 (UTC)

Clever work around. I am told that the problem there lies with IA-upload rather than with Commons, and we therefore need to see if @Tpt: can tweak his upload tool. To also note that you can directly upload from a url to Commons, so you could just try that and put in place the {{book}} template afterwards. — billinghurst sDrewth 10:19, 10 August 2015 (UTC)
I had tried direct upload too. The file got uploaded, then I had to fill up file name, year, category etc., then the file got submitted. Then Commons refused to publish it citing internal API error. That's why I had to take a work around. Hrishikes (talk) 10:59, 10 August 2015 (UTC)
If you did use the direct upload on Commons and that fails, then we need to lodge that as a bug in Phabricator. I will see if I can find a big file to upload. — billinghurst sDrewth 11:13, 10 August 2015 (UTC)
Finding a big file is not a problem; but you need to find a big file with the specific problem that Commons objects to. Every big file does not get refused. So the best option: download one of my files as mentioned above; change the name and upload --- then you will see the problem. Hrishikes (talk) 11:22, 10 August 2015 (UTC)
Just a little note that Hrishikes won't be able to upload by URL on Commons because that is restricted to reviewers, admins and GW users. Green Giant (talk) 01:26, 13 August 2015 (UTC)
Thanks for the correction … See what happens when they add new things and when you have advanced rights, you don't even realise that not everyone can do or see something! Anyway Hrishikes has uploaded so many things over the past months, they must be close to locking themselves away for a year and have a proofreading binge. :-) — billinghurst sDrewth 01:36, 13 August 2015 (UTC)
wow, i should think a request for GWtoolset rights would be a snowball, (they’ll let anybody have those) ;-) Slowking4Farmbrough's revenge 03:30, 19 August 2015 (UTC)

Author/section fields for anthologized works[edit]

Hi all, this got started on Captain Nemo's talk page, but maybe it'd be better to put here ...

I've been working on two books that are collections of short works in translation, Slavonic Fairy Tales and The Sweet-Scented Name. The first has many "contributers", while the second is drawn from various collections in Russian by Fyodor Sologub, and happens to bear the name of one of the individual stories.

To begin with the latter collection, I'd like to know which is more appropriate, to put the story titles into the headers in the title field, or into the section field. So here it's a choice between how I did it for Turandina or for Lohengrin. One thing that makes me feel uncomfortable is that the title "The Sweet-Scented Name" gets more prominence in the header than do the individual story titles. It already appears twice on each story's page, and I feel like its third appearance in the header is visually overkill. One precedent I am going by is Tolstoy's Twenty-three Tales and the individual stories, such as A Prisoner in the Caucasus, which have the story titles in the title field. So far I see no serious problem either way, but when there are collections in which different sections are by different authors the handling of the header information gets very complicated, and I've seen some bad results, and complicated work-arounds. For example, in the story Best Russian Short Stories/The Cloak we read in the header "Best Russian Short Stories by Nikolai Gogol" even though Gogol is only the author of the one story "The Cloak". That particular problem is avoided in The World's Famous Orations/Volume 6/At the Bar of the House of Lords, but at the cost of filling in the section field with

''At the Bar of the House of Lords''<br />by {{Author link|Isaac Butt}}

a format which would be difficult to have everyone apply consistently. So what I'd like to propose is to explicitly allow for the option of placing titles of individual works from within an anthology book into the title field, in which case the author field should consistently give the author of the individual work rather than of the collection as a whole. Does that cause any serious problems in how the header data is handled? Mudbringer (talk) 11:22, 10 August 2015 (UTC)

The "Gogol" problem is solved by leaving the author field empty and adding the "contributor" field. Cheers, Captain Nemo (talk) 13:33, 10 August 2015 (UTC).
With such works, I have carried through name of book with | override author = (ed.) Author name (possibly leave empty <shrug>) then use the section and contributor parameters. — billinghurst sDrewth 14:11, 10 August 2015 (UTC)
And the example that you found probably predates the contributor parameter; plus it was having to do hacks like that which brought the change about. Well, that and that we have many journals transcribed that have many articles by different authors.
@Billinghurst: ... OK, I was having trouble with the spelling of "contributor" "facepalm". I tried reworking the first story in Slavonic Fairy Tales based on your explanation, so could you please check to see if I've understood everything: Carried Away by the Wind, vs. what I was doing before: Why is the Sole of Man's Foot Uneven?. The way I've done it leaves an extraneous space before a comma, but I'll not be fussy. Does bolding the section field cause any problems? One thing I couldn't have figured out from studying the help on the header template was the difference between "override author" and "override_author" ... the latter was leaving the author's name in the header, when I was trying to suppress it. Thanks! Mudbringer (talk) 14:50, 10 August 2015 (UTC)
I tried fixing The Cloak, and that seems to work ok, but the previous story in that collection, for which a translator is given, The Queen of Spades, winds up with T. Keane as the translator for Best Russian Short Stories, rather than just for that story. Mudbringer (talk) 15:57, 10 August 2015 (UTC)
You can use override_contributor for this. For example: The seven great hymns of the mediaeval church/Dies Iræ/Dix. —Beleg Tâl (talk) 21:07, 10 August 2015 (UTC)

Re the underscore in parameters[edit]

We have had a legacy issue in that the underscore in the template has long been used, and we have carried on using it with new parameters to maintain consistency; however, it does catch people out in that it needs to be present in the parameter. I have just updated the template so you can alternatively use override author, and when I have a chance I will look to update so we can have "override editor", "override translator" and "override contributor". I am not even certain that the underscore needs to be mandated now (it did back then) and whether it can be omitted in the template, so I will have a play in the sandbox first to see if we can simplify. As GOIII briefly discussed, the whole Template:Header is showing its age and has significant legacy issues, and it is probably due for a refresh, especially with Lua now being available, and with Wikidata being present, and the old microdata hack probably able to be cast out. But not today. — billinghurst sDrewth 01:37, 11 August 2015 (UTC)

Tech News: 2015-33[edit]

14:57, 10 August 2015 (UTC)

creating Special:MyPage/EditCounterOptIn.js[edit]

Hi. Just thought that it would be useful to remind (new) users that there is a useful count tool which we have linked from the bottom of your Special:MyContributions page. It is even more useful if users create the page Special:MyPage/EditCounterOptIn.js (or use m:Special:MyPage/EditCounterOptIn.js if you want counts globally) and content is not needed, just the creation of the page. Thanks, it helps me to work out when I can apply autopatrolled rights for users, and to assess users for admin rights. — billinghurst sDrewth 05:11, 12 August 2015 (UTC)

Thanks for the pointer! I'd not noticed the opt-in thing. I think the global page should be m:Special:MyPage/EditCounterGlobalOptIn.js though? — Sam Wilson ( TalkContribs ) … 13:04, 12 August 2015 (UTC)
@Billinghurst: I have opted in globally, but the monthly stats are still invisible—a notice appears in their place. Is this normal?--Erasmo Barresi (talk) 15:39, 13 August 2015 (UTC)
<grrr> Never work with children, animals AND SOFTWARE APPLICATIONS! It looks to be having issues. I will see if it is part of the Labs current works, whether it will self fix, or we need to ping Max. — billinghurst sDrewth 22:59, 13 August 2015 (UTC)
It was working correctly the other day when I looked; now is broken. There seem to be PHP errors appearing in the HTML output, which are stuffing up the Javascript. — Sam Wilson ( TalkContribs ) … 23:22, 13 August 2015 (UTC)

Author:Georges Perrott[edit]

Is this author the same as w:Georges Perrot? The PSM work listed would conform to the genre of the French author. Is the last "t" a typo or real? Hrishikes (talk) 01:05, 15 August 2015 (UTC)

VIAF lists sol et le climat de la Grèce, leurs rapports avec le caractère de sa civilisation et de son art as one of the works of "Georges Perrot Archéologue et helléniste français" and the trailing comment on Popular Science Monthly/Volume 42/December 1892/The Environment of Grecian Culture reads Translated for The Popular Science Monthly from the Revue des Deux Mondes. My French is not all that good but this seems circumstantial evidence they are one and the same person? AuFCL (talk) 01:36, 15 August 2015 (UTC)

Tech News: 2015-34

16:17, 17 August 2015 (UTC)

New namespaces?

Anybody else notice the 5 new namespaces being listed in the various ns related input boxes?

  • Gadget (ns:2300)
  • Gadget talk (ns:2301)
  • Gadget definition (ns:2302)
  • Gadget definition talk (ns:2303)
  • Topic (ns:2600)

Anyone? Pointers to more info appreciated. -- George Orwell III (talk) 01:39, 18 August 2015 (UTC)

On the "Gadget" set of namespaces I found the following: "The difference intended to be made as part of the ResourceLoader 2 project is moving these out of the prefixed "Gadget-" scope in the MediaWiki:-namespace (which is intended for messages, not actual wiki content (let alone executable resources). Instead move them to a new Gadget:-namespace only editable by users with the editgadget right" (mw:ResourceLoader/Version 2 Design Specification).--Erasmo Barresi (talk) 08:12, 18 August 2015 (UTC)
So does this mean Flow is coming to the Scriptorium? :) Am I too against tradition if I say that's a good thing? ;-) — Sam Wilson ( TalkContribs ) … 23:05, 18 August 2015 (UTC)
There is a "by request" model at Wikidata, and I would think that WMF will probably introduce it cautiously. @Quiddity (WMF): would you mind adding to this conversation, and pointing this community to pertinent information. — billinghurst sDrewth 02:35, 19 August 2015 (UTC)
Sounds sensible. I reckon it'd be nice to have here. :) — Sam Wilson ( TalkContribs ) … 03:11, 19 August 2015 (UTC)
If there is consensus, it will be very easy to have it enabled. Personally, I favor it for all talk pages plus the discussion pages in the Wikisource namespace (like this one), but it is possible to enable it on a smaller scale if wished.--Erasmo Barresi (talk) 12:23, 19 August 2015 (UTC)

Epubs produced from enWS

A post by Tpt reminded me that he has stats for wsexport tool usage. I had a quick peek, and here are the numbers of epubs that have been generated in 2015:

Epubs produced from late Mar to mid Aug 2015

Month   Epubs
Mar      3415
Apr     13234
May     98804
Jun     19480
Jul     48659
Aug     11215

They are stunning numbers, and thanks to @Tpt: for his magnificent tool, and I think that we can do better if we look at making that output format more dominant. — billinghurst sDrewth 11:25, 18 August 2015 (UTC)

That's amazing! So many readers! (and some bots?) Is it possible to get stats for individual works? And yeah, I agree, it'd be cool to make the epub download links much more obvious. — Sam Wilson ( TalkContribs ) … 23:01, 18 August 2015 (UTC)
@Samwilson: When I come in with a mobile to the main page, Template:Featured download shows big and bold, so it would be interesting to know how whacked that template/button is as part of the downloads; so yes, a breakdown of the stats would be useful. Personally I feel that the template could be useful in some format to highlight that the works are downloadable and would consider its use more liberally ESPECIALLY as in the mobile version there is NO sidebar, and ready means to even see that an epub version is available. Hmm, it may even be worth adding as part of Template:New texts even just as an icon. — billinghurst sDrewth 02:59, 19 August 2015 (UTC)
I think it would be a good thing to be able to identify the works most downloaded by our readers. The top three could be featured in main page in "Readers' Choice" category. Hrishikes (talk) 04:32, 20 August 2015 (UTC)
I agree with you on the above, Hrishikes. I would also like to see the most popular downloaded categories, in order from highest to lowest, as well as the most popular download format. Respects, —Maury (talk) 07:03, 20 August 2015 (UTC)
Ditto. We live in an island all by ourselves. A bridge with the readers is highly desirable, so that the community can ascertain what types of works the readers most want and community efforts can be based on that feedback, e.g., in POTM etc. Hrishikes (talk) 07:25, 20 August 2015 (UTC)
@Billinghurst: Detailed statistics of ePub downloads (07 2015) I have published on User:Zdzislaw/stats/2015 07‎. Zdzislaw (talk) 18:07, 24 August 2015 (UTC)

How can we improve Wikimedia grants to support you better?

Hello,

The Wikimedia Foundation would like your feedback about how we can reimagine Wikimedia Foundation grants, to better support people and ideas in your Wikimedia project. Ways to participate:

Feedback is welcome in any language.

With thanks,

I JethroBT (WMF), Community Resources, Wikimedia Foundation. 05:21, 19 August 2015 (UTC)

New feature "Watch changes in category membership"

recent changes page view with categorization

Hi, coming with this week’s software changes, it will be possible to watch when a page is added to or removed from a category (T9148). The feature has been requested by the German Community and is part of the Top 20 technical wishlist. The feature was already deployed to Mediawiki.org on August 18 and it will be rolled out on Wikisource between 6-8 pm UTC today. It will be available on all Wikipedias from Thursday 20 August on, likewise between 6-8 pm UTC. In this RFC-Proposal, you can find the details of the technical implementation. The feature was implemented via a new "recent changes" type for categorization. Through this, categorization will be logged and shown on the recent changes page. The categorization log in "recent changes" is the data source for the watchlist: when you watch a category, pages added to or removed from that category will be shown on the watchlist. The categorization of pages can be turned off in the watchlist preferences as well as recent changes preferences. If you have any questions or remarks about the feature or if you find a bug, please get in touch! Bugs can also be reported directly in Phabricator, just add the project “TCB-Team” to the respective task. Cheers, Birgit Müller (WMDE) (talk) 15:11, 19 August 2015 (UTC)

For those of us not fluent in technobabble, does this feature track only explicit category membership, or also membership controlled by template parameters? --EncycloPetey (talk) 16:24, 19 August 2015 (UTC)
The new feature will also log categorisation by templates, but it will limit the display to the template name and the number of pages embedding it. Kai Nissen (WMDE) (talk) 17:48, 19 August 2015 (UTC)
This feature is very annoying in Special:RecentChanges on all Wikisources because it shows three entries for any page validation, with three identical diff links, so we get only about 16 real changes per page of 50 in RC. It's possible to work around it through a user preference, but that preference should hide category changes in RC by default. — Phe 20:19, 19 August 2015 (UTC)
We've been asking for this on en.Wikipedia for years. It makes it possible to observe the removal of legitimate categories, by accident or by vandalism. BD2412 T 21:06, 19 August 2015 (UTC)
dontcha know categories are broken, and the answer is wikidata? maybe we could get GWtoolset as a top 20? Slowking4Farmbrough's revenge 22:12, 19 August 2015 (UTC)
We can't use wikidata for Page:*, and Page:* is the only really problematic namespace. — Phe 00:20, 20 August 2015 (UTC)
The proofread extension uses categories to mark the state of pages, and changing the state from proofread to validated generates three RC entries: one for the category removed, one for the category added and one for the page change itself. I don't think this is a problem for any namespace other than Page:, but for Page: we really have a specific use of categories. I first thought to ask that the default preference be to hide category changes in RC on all Wikisources, but it will probably be better to allow blacklisting a namespace, a sort of "do not generate category-change RC entries for this namespace". — Phe 00:20, 20 August 2015 (UTC)
I don't find that problematic. It's not like pages will be going through category changes dozens of times. BD2412 T 02:09, 20 August 2015 (UTC)
I've already turned it off for RC in my preferences. It's a right nuisance when trying to patrol RC. The five Page namespace categories (Category:Proofread, Category:Problematic &c.) are assigned automatically and there is no need to watch pages move between them as they move through the proofreading process. When these moves are appearing on RC, then it prevents clearly seeing the real changes. Beeswaxcandle (talk) 02:28, 20 August 2015 (UTC)
@Birgit Müller (WMDE), Kai Nissen (WMDE), Phe: Totally agree that it will be problematic in/for our Page: namespace in its current methodology. So a wiki needs the ability to exclude a namespace, or maybe as part of a naming hierarchy (preferably by local setting rather than by a wiki configuration request through phabricator); or by the ability to [hide/show] the functionality in recent changes. I also think that we need to get this onto Phabricator prior to the rollout, as it is going to be problematic, and should have been tested in a broader wiki-space prior to implementation. An alternative would be that the categorisation change(s) are noted against the edit, rather than against the category, so it is all on one line per edit. — billinghurst sDrewth 03:47, 20 August 2015 (UTC)
Flagged this as an issue for WSes on the Phabricator ticket above. I have asked if there is a means to halt rollout to the WSes this week. — billinghurst sDrewth 03:54, 20 August 2015 (UTC)
@Billinghurst:I am probably being blind but I see no reference? To which ticket are you referring please? AuFCL (talk) 04:08, 20 August 2015 (UTC)
Also, it already is rolled out. Beeswaxcandle (talk) 04:13, 20 August 2015 (UTC)
Thanks BWC, WPs! We are also going to see it added regularly to any subpage. Maybe we could have "ignored categories", cf. hidden categories which exists. @AuFCL: T9148 . To note that of 500 edits, 212 are categorisation changes in three categories, though switching to "Show page categorization" does tidy things nicely. — billinghurst sDrewth 04:20, 20 August 2015 (UTC)
D'oh! I really am going blind! Thanks. AuFCL (talk) 05:49, 20 August 2015 (UTC)
┌─────────────┘
The proposal seems to say that recentchanges will not be affected - "In order to prevent recentchanges from being flooded by high-usage template categorization, there won't be recentchanges entries for the pages embedding the template". "Possible to watch" sounds to me like editors being able to voluntarily add a category to a watchlist for purposes of keeping track of membership. BD2412 T 03:56, 20 August 2015 (UTC)

┌─────────────────────────────────┘
Perhaps this is a little bit early to start noting bugs but I have observed this peculiarity: if some page changes to either enter or leave a watched category, my watchlist throws up a '(changed since last visit)' link to the changed page, which is nice. What is not so nice is that visiting that link does not reset its status in the watchlist, but visiting the category instead does! So to de-flag a notified event one has to navigate to a page which one otherwise has no reason to access? Seems a bit odd. AuFCL (talk) 09:37, 20 August 2015 (UTC)

Hi AuFCL, it is not too early to report bugs or comment on the feature, thanks! There is already a Phabricator task for the link issue you mentioned - T109688. We will collect all the comments on problematic issues, improvement ideas and reported bugs on the various project pages as well as in Phabricator and check through them within the next few weeks. Thanks to everyone here. Birgit Müller (WMDE) (talk) 14:49, 20 August 2015 (UTC)

Note to the community: this change was rolled back (temporary suspension). I haven't found the conversation that led to the rollback, just the comment on the initial phabricator card that states that the rollback has occurred. There were a number of issues identified, so one would think that there is some more work to do prior to the next version. So while we have been picking fault with our issues, we should not lose sight of the excellent initiative that WMDE has taken on board, thank them for their effort and wish them luck with their refinement of the change. @Birgit Müller (WMDE), Kai Nissen (WMDE):billinghurst sDrewth 23:37, 20 August 2015 (UTC)

@Billinghurst:, thank you for posting the update regarding the revert of this change. After it got deployed to the production systems it became apparent that under some special circumstances it may cause privacy issues: In case of some types of templates entries in the watchlist and recent changes were showing the IP address instead of the user name.
It's good that this came to our attention as the developer team can start to work on a solution for this now. Thanks to all who already tested the new feature and gave valuable comments. It is a big help for the next version of this feature. Kai Nissen (WMDE) (talk) 10:02, 21 August 2015 (UTC)
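The flooding described in this thread (one real edit plus two category moves for each page validation) can be illustrated with a minimal sketch. It assumes each recent-changes entry is a dictionary with a "type" field and that categorisation entries carry the value "categorize"; both the field name and the sample data are assumptions made for illustration, not a description of how the deployed feature stores its log.

```python
def split_recent_changes(entries):
    """Separate categorisation entries from all other recent-changes entries.

    Assumes each entry is a dict with a "type" key and that category
    membership changes use the (assumed) value "categorize".
    """
    other = [e for e in entries if e.get("type") != "categorize"]
    categorisation = [e for e in entries if e.get("type") == "categorize"]
    return other, categorisation


# Hypothetical sample: validating one Page: page produces one edit
# plus one category removal and one category addition.
sample = [
    {"type": "edit", "title": "Page:Example.djvu/12"},
    {"type": "categorize", "title": "Category:Proofread"},
    {"type": "categorize", "title": "Category:Validated"},
]

real, noise = split_recent_changes(sample)
print(len(real), "real change(s),", len(noise), "categorisation entries")
```

With a page size of 50 and three entries per validation, roughly a third of the visible entries would be real changes, which matches the figure of about 16 quoted above.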

Bug in WikiEditor?

Since this morning, I am seeing three sets of Bangla script in the same box, one after the other, in the WikiEditor. Not that I am facing any problem, but still, it looks uncanny. Hrishikes (talk) 04:46, 20 August 2015 (UTC)

@Hrishikes: Have you posted to phab:? —Justin (koavf)TCM 14:10, 21 August 2015 (UTC)
No, as I am not having any major problem. No letter is absent; although there are three full sets. Hrishikes (talk) 14:37, 21 August 2015 (UTC)
Follow-up: I think this repetition of sets is now a general feature: sets are repeated 2 to 4 times depending on the browser. I have checked Latin, Latin Extended and Bangla sets. Hrishikes (talk) 05:01, 23 August 2015 (UTC)

1901 work with 2000 copyright notice

In 2000 the uncensored version of Mark Twain's 1901 essay, "The United States of Lyncherdom", taken directly from Twain's manuscript, was first published in the journal Prospects. (It was first published in a censored version in a 1923 anthology of Twain's essays, Europe and Elsewhere.) I presume the work itself has entered the public domain because it was created more than 95 years ago and it has been more than 95 years since Twain's death. However, the journal printing of the essay contains the following disclaimer:

This previously unpublished essay by Mark Twain is ©2000 by Richard A. Watson and Chemical Bank as Trustees of the Mark Twain Foundation, which reserves all reproduction or dramatization rights in every medium. It is published here with the permission of the University of California Press and Robert H. Hirst, General Editor of the Mark Twain Project.

Is there any legal basis for this copyright notice, or may I freely transcribe this work? IvanhoeIvanhoe (talk) 20:08, 20 August 2015 (UTC)

If the uncensored version of the essay was not published until 2000, then it could indeed be under copyright. --EncycloPetey (talk) 20:47, 20 August 2015 (UTC)
OK, the uncensored version is under copyright until 1 January 2048, it appears, and the censored version is under copyright until 1 January 2019. IvanhoeIvanhoe (talk) 21:41, 20 August 2015 (UTC)
@IvanhoeIvanhoe: As the first version was published in 1923 it falls under that copyright, presuming that it was renewed, which puts the work under copyright, and you are correct until 2019. I would venture that the same date would apply to uncensored version rather than 2048. My reason is that they could not put a new copyright date of 2000 as that was more than 70 years after his death,[38]. I don't know how they could claim a new start without special legislation, so it could only be a claim under the previous copyright. — billinghurst sDrewth 10:54, 21 August 2015 (UTC)
I don't know what you're pointing to on the Cornell page. The life+70 rule was basically irrelevant until 2002. New editions of old works do get new copyrights, and if there's enough new text I don't see why it wouldn't. So the new text is under copyright until 2048, IMO, as is the rest of Mark Twain's unpublished stuff, as it was "published" in a ridiculous expensive microfilm edition in 2001, just to grab that copyright.--Prosfilaes (talk) 21:17, 21 August 2015 (UTC)
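The two dates quoted in this thread (1 January 2019 and 1 January 2048) can be reproduced with a little arithmetic. This is only a sketch of the rules as participants describe them: a 95-year term from publication for a renewed 1923 work, and, for a pre-1978 work first published between 1978 and 2002, the later of life plus 70 years or the end of 2047. The function names are hypothetical, and whether the second rule actually applies to the 2000 text is exactly what is disputed above.

```python
def pd_year_published_and_renewed(publication_year, term_years=95):
    """Year whose 1 January ends the term, assuming the term runs
    through the end of the calendar year publication_year + term_years."""
    return publication_year + term_years + 1


def pd_year_first_published_1978_2002(author_death_year):
    """Year whose 1 January ends the term, assuming protection lasts until
    the later of life + 70 years or 31 December 2047 (as argued in the thread)."""
    last_protected_year = max(author_death_year + 70, 2047)
    return last_protected_year + 1


# Censored version, published in 1923 (renewal presumed)
print(pd_year_published_and_renewed(1923))

# Uncensored version, first published in 2000; Mark Twain died in 1910
print(pd_year_first_published_1978_2002(1910))
```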

Genealogical works from Peter Crombecq

Hi,

After looking up one of our mayors from the 13th century, I stumbled across the genealogical works of Peter Crombecq, which I cited as a source. The PDF files he created are on the site of his provider though. This means that they will disappear if the internet account is suspended, which it inevitably will be eventually.

So I asked him if it would be OK to host his works on Wikisource. The first question is, is this the right platform to 'publish' his efforts? He doesn't mind the works to become available to the public. In the PDF files he simply mentions that he wants his name to be mentioned when the material is used.

I asked a question about this on the Dutch Wikisource Scriptorium counterpart a few days ago, but the site is relatively dead... So it seems better to come and ask over here.

The next question is: Is it OK for me to upload the 'converted' books? Or would it be better he creates an account and uploads himself? He doesn't want to 'waste' too much time on it though. So I proposed I would convert the books to wikiformat and either he or I can then conveniently upload.

This is the PDF file of the first book, I started to work on:

http://users.telenet.be/PeterCrombecq/Genea%20Stek/Crombecq/

It starts with a few chapters of introduction, then the bulk of the book with blocks about people that were involved in the city council of Leuven. I created a page with this introduction, then a page per person:

book
introduction
Coutereel, Peeter an example of an entry

This does not follow the 'layout' of the PDF, which has the introduction over several pages and then in the second part several entries per page. Is that a problem? It is more logical this way and it's easier to connect the entry of 1 person to his/her wikidata item.

I'm adding the author template on each page of the book, does that make sense? It is in compliance with his request to have the source mentioned. I'm mentioning his name, because the way I see it, this series of wikisource pages will/would become the source.

Maybe I should refer to the PDF file on his provider's website, or would it be acceptable to upload that PDF to Commons? Of course, if that is acceptable, maybe I'm doing entirely too much effort by converting it to wikisource... The advantage of having it in this format on Wikisource would be that others can chime in and add to it though. --Polyglot (talk) 18:25, 23 August 2015 (UTC)

I can only answer from the perspective of the English WS as there may be different perspectives on the Dutch WS. @Dick Bos:, if there are things that I've missed, could you please assist?
  • If these texts were in English, they would be welcome here. The copyright release in the text is compatible with us hosting them;
  • The pdfs should be uploaded to Commons, and it's fine for you to do so;
  • Including the author's name in the header template on each page you create on nlWS is the correct way to mention his name. You should also include a license template on the book's mainpage;
  • Yes, keep the print pages for the Introduction together into a single WS page;
  • I recommend that you use the Proofread extension to wikify the books rather than doing that before uploading
Beeswaxcandle (talk) 22:50, 23 August 2015 (UTC)

Tech News: 2015-35

13:02, 24 August 2015 (UTC)

Some maintenance work that I am considering

I am considering doing the following maintenance work over the next while, and thought that I would flag my intentions and allow for any discussion to take place

  1. update {{Author}} so we can determine which author pages have Wikipedia links for which the WP link data is not recorded at Wikidata. I think that there will be none; however, I just want to make sure. At that point I believe that we will be able to remove all Wikipedia links from Author templates (as redundant).
  2. remove all wikidata parameter fields from the Author template, they are redundant, and we have a ready means to identify where there is no WD linking, and that maintenance is being managed on a regular basis
  3. from Category:Index Validated (initially) and then Category:Index Proofread, identify the overarching works for each file and then look to put that appropriate badging on each of the WD links to WS. Once that is done, I propose that we amend / extend {{Featured download}} button to be displayed on any work that is proofread or validated (ie. has the label). We will also need to have some sort of watch function on the works that enter these categories so they can be badged systematically.

Any thoughts and ideas welcomed. — billinghurst sDrewth 14:36, 24 August 2015 (UTC)

I am not sure I understand your third point, but I agree with the first two. However, I have noticed that often, if a WikiData entry exists for an author, but the Author header does not contain a sister wiki parameter (or portal link or related author), then the sister wiki box does not show up. For this reason, I have often put a Wikipedia parameter in the template to force it to show up. The other option is to use {{plain sister}}, but if you add this template and then someone adds a portal parameter then you will get two sister wiki boxes. I think fixing this will need to be part of the maintenance work you describe. —Beleg Tâl (talk) 15:33, 24 August 2015 (UTC)
@Beleg Tâl:Re point 3. We have the template which we have been utilising on featured works only (big and bright as per Main page), whereas on majority of works we rely on the link in the sidebar to indicate alternate format downloads. 1) In mobile there is no sidebar, so you cannot see, and epubs are predominantly a mobile technology; 2) it is little link somewhat hidden away; 3) we have only utilised on a subset of works. My plan is to look to have it more widely available for where works have an elevated status with the newly available WD badges available for Wikisources[41]. These badges are to what I am referring with regard to the indexes and their corresponding works, and that work needs to be started prior to work on the templating, which I will bring to the community to discuss in detail.

With regard to Wikidata label visibility, we recently updated {{header}} to apply #if statements in a different way so those links should show without additional intervention. To my understanding with {{author}} it has been the case of either a direct link or a search link displays without any intervention, if that is not the case, then some examples would be useful. — billinghurst sDrewth 23:26, 24 August 2015 (UTC)

@Billinghurst: On your first point, you can already verify that there are no such pages at Wikisource:Maintenance of the Month/Wikidata/Wikipedia authors.
@Beleg Tâl: May you point to a page where this problem occurs?--Erasmo Barresi (talk) 12:00, 25 August 2015 (UTC)
@Billinghurst: Support 1 and 2. Would like to propose 2a. remove all remaining deprecated parameter fields from the Author template (commons, wikiquote, etc). Cheers, Captain Nemo (talk) 12:28, 25 August 2015 (UTC).
I am okay with that proposal to remove the sisters in author namespace. Are people comfortable now that as WQ and Commons are migrated to WD and the duplication finders/mergers have been at work that they are now superfluous to our needs? [Again Author namespace only at this point] — billinghurst sDrewth 12:45, 25 August 2015 (UTC)
@Billinghurst: It would seem that you are correct, and this behaviour has already been fixed since I last encountered it. —Beleg Tâl (talk) 13:20, 25 August 2015 (UTC)
@Billinghurst: First, the E-Pub icon in mobile mode should be next to the pencil (edit) and star (add/rem to watchlist) icons per my tweaks to MediaWiki:Mobile.css. If it's still "hidden away" for you, please let me know.

Second, I have no problems with your suggested maintenance tasks, though I still think it's only a dent in the larger issue when it comes to Wikidata. Unless ALL the key parameters in the various header-to-namespace templates are made into formal messages in the MW (ns:8) namespace so they can be translated and put in use by our sister language WS domains at the same time, in the long run all we are managing to accomplish here is inadvertently creating a preference for English-driven Wikidata over our counterparts'. Plus we are "doing" too much in our header templates as it is on the local level; if 'everybody was working from the same page' (i.e. applying formalized messages for key template parameters), much of the currently entrenched if... then... when... localized template jargon can be safely handled by wikidata &/or lua instead imo. -- George Orwell III (talk) 20:20, 25 August 2015 (UTC)

@Kaldari: is this the sort of thing where we can look to the Community-Tech group to assist the Wikisource community to modernise and utilise better practices?

Noisy watchlist warning Author: namespace. With my bot account User:SDrewthbot I have started a run on cleansing redundant parameters from Author: namespace (parameters = Wikipedia · Wikiquote · Wikivoyage · Wikinews · Wikidata · Wikibooks · Commons · CommonsCat · empty image) for Category:Authors-A with 1 recurse (800 pages). Please put any feedback/issues here or my talk page. I will look to schedule the remaining works to start in 24 hours dependent on feedback.

Note also that I have placed a notice on Special:Watchlist. If your watchlist has plenty of authors upon it, you may wish to temporarily mask bot edits in your Special:Preferences. — billinghurst sDrewth 12:01, 26 August 2015 (UTC)

We should not remove category links to Commons at this time. Wikidata is still debating the "correct" way to attach Commons categories to items, and it's a bit of a mess. Please put back the categories that you have already removed. --EncycloPetey (talk) 14:03, 26 August 2015 (UTC)
We pick whichever methodology that they use and we pick up their multiple links, be it the interwiki link, or the separate commons category claim. We will pick up whichever changes that they make, it is no problem. [We don't pick up creator, but we don't need to as it is only a display layer at this point of time, and we can easily pick that up whenever.]

Proposing to only list top level of a work as a featured text

I am looking in Category:Featured texts and I see that the root page and the subpages for some works are listed as featured texts. To me, we should list the top level of the work, and not the successive cascading pages. So that would be things like Popular Science Monthly/Volume 1 and Doctor Syn and no subpages. Removing the {{featured}} template from the subpages will remove them from the categorisation. For some works it has only been the top level that has been done, so I propose to remove the template from the subpages of the nominated works where it has been added. — billinghurst sDrewth 06:18, 25 August 2015 (UTC)

Agree. Subpages should be considered to inherit all categories of their parent and do not need them to be explicitly stated. If a category placed on a parent would cause a subpage to be implicitly in an inappropriate category, then the parental category is wrong. Beeswaxcandle (talk) 20:33, 25 August 2015 (UTC)
The text we used at WS:FT and WS:FTC was inconsistent, so I have aligned to top level only, and will look to do the resulting maintenance later. — billinghurst sDrewth 00:19, 26 August 2015 (UTC)
Job done: each work has the featured template once, at the level it was awarded. — billinghurst sDrewth 13:41, 26 August 2015 (UTC)

Wanted: A really awesome tool for extracting images and uploading them to Commons.

I have been browsing through the thousands of pages with missing images (linked from {{Missing image}}), and it occurs to me that it is going to take an enormous effort to manually extract and upload all of these images to Commons (or, at least, all of the images that are eligible for listing there). What we need is a single tool that:

  1. Extracts the image on the page from the page;
  2. Associates the image with the publication information relevant to the public domain status of the image (which should be on the Index page); and
  3. Uploads the image to Commons if eligible there, with standard information, or here if only eligible here.

It doesn't matter if the extraction process gets more from the page than necessary, because images can fairly easily be refined once they are uploaded; it is the initial uploading that it is most time consuming, in my experience. Now, how do we get this done? BD2412 T 17:40, 26 August 2015 (UTC)

I think you'll find that @Hesperian: is already partway there with his {{raw image}} process. The problem really comes with step 1, in that the quality of the images in .djvu and .pdf files is not good enough. For IA files, we can access the jpeg2 images (which is what the raw image process does). For files that are hosted elsewhere we have problems. Beeswaxcandle (talk) 20:49, 26 August 2015 (UTC)
My concern is with the next step - the process for uploading the image as an image file with the categories of copyright information required for uploads. A better version of an image can always be uploaded, but it's a pain to input, for example, the work name, author name, publication date, description of the work, and category information (which, for images from a specific work, should include an "Images from [work]" category), when these steps must be repeated for dozens of images from a given work. We currently have over 10,000 image files that need such an extraction. BD2412 T 21:21, 26 August 2015 (UTC)
My issue is with the idea of a bot that makes bulk copyright claims. I won't go there. If someone does want to take this on, then please PLEASE ensure that your bot plays nicely with HesperianBot. That means talk to me early and often about what HesperianBot does, and why, and how not to break it. Hesperian 01:09, 27 August 2015 (UTC)
Bulk copyright claims would be an important part of the process. If a DJVU file exists on Commons, and you have a hundred or so images {{extracted from}} that DJVU file, you should be able to simply apply the copyright of the source file to all of the images at once. —Beleg Tâl (talk) 21:27, 27 August 2015 (UTC)
sorry about adding a bunch. the missing image template links to a page, on commons. could there be a way to snip images as a derivative on commons, and serve up in a semi-automatic way on source? Slowking4Farmbrough's revenge 01:52, 28 August 2015 (UTC)
@Beleg Tâl: Unfortunately, that isn't always true. Sometimes the copyright on the images is different from the work, and almost always the illustrator / painter / photographer was NOT the same person as the author of the text. And even if the copyright and creator information is the same for all the images in a work (which often isn't the case), each image needs its caption and text placement information included in the Commons description. Besides which, sometimes the source of the image files is Hathi Trust, when the DjVu came from IA. There are a lot of factors involved in setting the image description at Commons, and I frankly do not see any way that most of the information could ever be handled by anything like a bot, unless the bot were loading information from a prepared file for all the images. It couldn't simply be duplicated from other image files, because the image information will vary from image to image. --EncycloPetey (talk) 03:47, 28 August 2015 (UTC)
I do realize that the illustrator and the author are separate people with separate copyrights. However, if the DJVU file contains the works of both, then its copyright needs to reflect the copyright of all its contents, doesn't it? If the DJVU file license does not account for the copyright of the images, then the license is already incorrect. In that case, it would be better in my opinion to leave it to Commons to correct the licenses, maybe with a bot of their own to assist.
With regard to images from HathiTrust, these would not be impacted because they're not extracted from the DJVU file.
A simple yet reasonable automated description would also be possible, like "Illustration from page $page of $djvufile" - which is no different than what I manually put in the description of all the illustrations I upload to Commons myself. —Beleg Tâl (talk) 12:09, 28 August 2015 (UTC)
Further thoughts, just brainstorming:
  • Commons already allows you to bulk-upload files with the same license tag. So even if we don't want it to pull the license from the DJVU file, we can still simply specify the license once for all.
  • What if we create a tag on Commons that transcludes the license tag from the source DJVU, with a short message saying "this file was extracted by a bot from $source.djvu, the license for the source file is {{#section:File:source.djvu|License}}"
  • We could also have the bot tag the image for human review after upload; it would still be less complex than the current system but still ensure that the file is manually checked for problems
Beleg Tâl (talk) 12:30, 28 August 2015 (UTC)
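The automated description and the transcluded license note suggested above could be sketched roughly as follows. This is only an illustration in Python, not an existing bot: the function names and the example file name are hypothetical, and the `{{#section:...}}` wikitext is copied from the suggestion above without checking that the source file page actually defines a section named "License".

```python
from string import Template

# Description text as proposed in the discussion above.
DESCRIPTION = Template("Illustration from page $page of $djvufile")

# License note that transcludes the license section of the source file page.
# Assumption: the source file page on Commons has a labeled section "License".
LICENSE_NOTE = Template(
    "This file was extracted by a bot from $djvufile; the license for the "
    "source file is {{#section:File:$djvufile|License}}"
)


def build_description(page: int, djvufile: str) -> str:
    """Return the one-line description for an extracted illustration."""
    return DESCRIPTION.substitute(page=page, djvufile=djvufile)


def build_license_note(djvufile: str) -> str:
    """Return the wikitext note pointing back at the source file's license."""
    return LICENSE_NOTE.substitute(djvufile=djvufile)


if __name__ == "__main__":
    # "Example.djvu" is a made-up file name used only for illustration.
    print(build_description(12, "Example.djvu"))
    print(build_license_note("Example.djvu"))
```

A bot built along these lines would still need a human to add captions and to correct creator details where the illustration's copyright differs from that of the text, as discussed above.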
I would also note that to some degree the author can be irrelevant. We have some illustrated works that were published 150+ years ago. For those works, the illustrations will be in the public domain, period, due to their age. We also have a number of government documents that include government-made maps, which are uncopyrightable government work product, regardless of who drew them. BD2412 T 02:58, 29 August 2015 (UTC)

Tech News: 2015-36

21:36, 31 August 2015 (UTC)

New typography: Am I going blind?

A new month, a new software release: Special:Version now reports

1.26wmf20 (5a73576)
02:34, 1 September 2015

Is it just me, or is everything which previously was normal text now bold, and everything which used to be bold now even bolder? (This is certainly not the case on Wikipedia; so is this change an accident or is this the new normal?) AuFCL (talk) 06:44, 1 September 2015 (UTC)

Please ignore above. Turned out to be a local browser cache issue. Senility? AuFCL (talk) 09:06, 1 September 2015 (UTC)

Enclosure Act for Shifnal

Hello, please let me know if this text is on en.wikisource. I couldn't find it myself. Thank you very much. --Bel Bonjour ~ Ambre Troizat (talk) 10:31, 1 September 2015 (UTC)

m:Grants:PEG/WCUG Wikisource/Wikisource Conference 2015

Hi to all. I am not sure how many people subscribe to the mailing list Wikisource-L, where User:Aubrey and User:Micru have put together a proposal for an internal Wikisource-specific conference. The proposal for the conference is located at m:Grants:PEG/WCUG Wikisource/Wikisource Conference 2015 and it would be great if you could spend a few minutes reading through its content. Aubrey and Micru have already successfully instituted a Wikisource Community User Group, which the responsible part of WMF has agreed should be a permanent fixture of Wikimedia.

Anyway, the proposal exists, and it is open for your thoughts and your endorsement, should you choose to give it. If it proceeds, you might even wish to seek some funding to attend from your local WMF chapter, if one exists, or make it part of your plans to see a part of Europe that you haven't seen before. — billinghurst sDrewth 14:22, 1 September 2015 (UTC)