MediaWiki talk:Proofreadpage index template



Please add:

Thanks. --HombreDHojalata (talk) 19:19, 20 July 2013 (UTC)

Added. -- George Orwell III (talk) 02:56, 21 July 2013 (UTC)

Please also add pt:Predefinição:Proofread Index Lugusto 17:33, 24 December 2013 (UTC)

Added. -- George Orwell III (talk) 17:43, 24 December 2013 (UTC)

Propose to track empty KEY

As part of my clean-up of translation status, I am seeing many examples of a missing key parameter. I propose that we add a tracking category so that pages where it is left empty are more readily identified and corrected. — billinghurst sDrewth 10:09, 8 May 2016 (UTC)

Proposal to add microformat attribute

I'd like to add the ws-cover microformat parameter to this template, to make it easier to scrape the URL of the cover image. This could be done either by adding id="ws-cover" to the paragraph element that wraps the cover image, or |class=ws-cover to the actual [[File:...]] syntax that displays the cover image (in which case, it could be left off the placeholder image, which would probably be nicer).

The current code is this:

<p style="margin-top:0; margin-bottom:0;">{{#if:{{{Image|}}}|{{#iferror: {{#expr: 1 + {{{Image}}} }} | {{{Image|[[File:Placeholder book.svg|frameless|link=Special:Upload]]}}} | [[File:{{PAGENAMEE}}|page={{{Image}}}|frameless]]}} }}</p>

I suspect some conversation was had about this once upon a time, but I can't find anything about it now. (Most other microformat parameters are already there.)

Sam Wilson 03:39, 24 July 2016 (UTC)

(Sorry, the 'current code' I pasted above had one of the suggested alterations in it; I've removed that now.) Sam Wilson 08:41, 25 July 2016 (UTC)
Done. I've added the class to the cover image. Hope no one minds. Sam Wilson 09:25, 2 August 2016 (UTC)
@Samwilson: Pardon my pointing this out but my reading of mul:Wikisource:Microformat#Ids_2 suggests the class ought to be applied to an HTML element enclosing a text node containing the path fragment of the image name. In fact your edit results in the non-text-enclosing <img> tag being classified. Unless the screen scraping process is a lot more flexible/smarter than I believe it to currently be I am not sure you have achieved the result you expected. It might be better to explicitly encode a <span id="ws-cover" style="display:none;">{{PAGENAMEE}}/{{{Image}}}</span> just before the closing </p>?

And in any case, neither version of this logic is going to work for things like Index:The Pathway of Roses, Larson, 1913, as this happens to be a perfectly legal structure which fails the enclosing {{#iferror: check. AuFCL (talk) 11:44, 2 August 2016 (UTC)
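For illustration only, here is a minimal Python sketch of how a scraper might read the hidden span suggested above. It assumes the rendered page contains an element with id="ws-cover" whose only child is a text node of the form PAGENAME/Image. The class and function names (WsCoverParser, extract_cover) are hypothetical and are not part of any existing Wikisource tool.

```python
from html.parser import HTMLParser


class WsCoverParser(HTMLParser):
    """Collect the text of the first element carrying id="ws-cover".

    Assumption: that element holds only a text node, so capturing
    stops at the first end tag seen after it opens.
    """

    def __init__(self):
        super().__init__()
        self._capturing = False
        self.cover = None  # stays None if no ws-cover element is found

    def handle_starttag(self, tag, attrs):
        if self.cover is None and dict(attrs).get("id") == "ws-cover":
            self._capturing = True
            self.cover = ""

    def handle_endtag(self, tag):
        self._capturing = False

    def handle_data(self, data):
        if self._capturing:
            self.cover += data


def extract_cover(html):
    """Return the ws-cover text (whitespace-trimmed), or None if absent."""
    parser = WsCoverParser()
    parser.feed(html)
    parser.close()
    return parser.cover.strip() if parser.cover is not None else None
```

With markup like this, a class placed on a bare img tag yields nothing to read, which is the problem described above, whereas the hidden span yields the path fragment directly.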

Eeek yes, you're quite right @AuFCL. See, I'm trying to extract data about cover pages in order to add values to books on Wikidata for the image (P18) property. The addition of a class to this template is really just to add something to nominate this image over any other (that might be on the Index page) as the cover image... I must confess that I'd read the description of the microformat and rather made the jump in my mind that "of course it'd be trivial for the script to extract the name of the commons file..."! So, sorry. This needs some further thought, especially since, as you say, it doesn't work where the {{{image}}} parameter is not an integer. Sigh.

Not only that, but it needs to be possible to split the thing into two parts: the commons filename, and the page number (which can then be added as a qualifier (e.g.)). I'll carry on with my daft regex for the time being I think! And I'll revert my change to this template.

Sam Wilson 02:30, 3 August 2016 (UTC)
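As a rough sketch of the two-part split described above (Commons filename plus page number), something like the following Python could serve. It is not the regex actually in use. The function name split_cover and both patterns are hypothetical, and it assumes the Image field holds either a bare integer (a page of the Index's own file) or a [[File:...]] link with an optional page= option.

```python
import re

# Hypothetical patterns for illustration; real Index pages may use other forms.
FILE_LINK = re.compile(
    r"\[\[\s*(?:File|Image)\s*:\s*([^|\]]+)((?:\|[^\]]*)?)\]\]",
    re.IGNORECASE,
)
PAGE_OPT = re.compile(r"\|\s*page\s*=\s*([0-9]+)")


def split_cover(page_title, image_param):
    """Return (commons_filename, page_number_or_None), or None if unrecognised."""
    # The template uses the page name without the "Index:" namespace prefix.
    if page_title.startswith("Index:"):
        page_title = page_title[len("Index:"):]

    value = image_param.strip()

    # Case 1: a bare integer means "page N of the Index's own file".
    if re.fullmatch(r"[0-9]+", value):
        return (page_title, int(value))

    # Case 2: an explicit file link, possibly with a page= option.
    match = FILE_LINK.search(value)
    if match:
        filename = match.group(1).strip()
        page = PAGE_OPT.search(match.group(2))
        return (filename, int(page.group(1)) if page else None)

    # Anything else (empty, free text, other markup) is left unresolved.
    return None
```

The page number is kept separate so that it could be added as a qualifier rather than folded into the filename.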

@Samwilson: As I mentioned (I thought clearly), we already populate that data from c:template:book using Mediawiki:Gadget-Fill_Index.js, which sets definitions and calls fr:MediaWiki:Gadget-Fill_Index.js. That should be usable, I would have thought. — billinghurst sDrewth 04:19, 3 August 2016 (UTC)


Wikidata could be used to pull much if not all of the information from the relevant fields. What does the community think about adding support for it in this template? NMaia (talk) 22:17, 29 October 2017 (UTC)

You'll have to be more specific. In most situations, the data isn't available at Wikidata when we set up an Index template. Wikidata may have a data item for the work, but rarely will it already have a data item for the edition being transcribed.
For example, the play Oedipus Rex has data item d:Q148643 for the play itself, but has d:Q24054383 for the 1878 Catalan translation by Enric Franco, d:Q24790575 for the 1843 Italian translation by Felice Bellotti, d:Q24791190 for the 1916 Polish translation by Kazimierz Morawski, d:Q23691937 for the 1878 revised English translation by Plumptre, etc. Every individual edition of a work has a separate set of data on a separate data item. And unless the data item for that specific edition already exists, the Index page cannot be loaded from Wikidata.
These are always separate data items, so the only way this would be viable is to first create the data item before starting an Index page, which doesn't actually save on any work. I've set up hundreds of Index pages on the English Wikisource, yet there has never been a time that the necessary data item already existed at Wikidata. It would be more confusing for current editors because of having to edit in a new location with new policies prior to beginning work on their local Wikisource. --EncycloPetey (talk) 00:36, 30 October 2017 (UTC)
That's a good point, but I'd remind that multilingual works with 1:1 interwikis do exist, like Le Corbeau (Mallarmé). Come to think of it, it would be better if the template itself was a frontend for editing on Wikidata, to make sure the data about the editions is more easily reusable and machine-readable. In any case, Wikidata data should only be a fallback, so no duplication of work would be necessary. NMaia (talk) 00:57, 30 October 2017 (UTC)
Yes, such items do exist, and I've created many dozens myself. But in each and every case where the data item exists, it was created after the work existed on Wikisource. --EncycloPetey (talk) 01:19, 30 October 2017 (UTC)
(ec) While it is plausible, the file at Commons and the edition item at Wikidata need to exist first. So we are better to start at Commons and put the metadata against the file, thooooooough to step back again, we are even better to get the file and use that to create the item at WD. That requires structured data at Commons, and that is coming. Once we can test for the existence of a file:{{PAGENAME}} that has metadata at WD, then when we create Index:{{PAGENAME}} it will be possible to pull metadata components. Getting data input way upstream is the aim, and then we can discuss manipulation at our level. — billinghurst sDrewth 01:06, 30 October 2017 (UTC)
One big caveat is that the data at are frequently dead wrong. Someone imported the data from somewhere else, and no one bothered to check it. If humans aren't visually checking the data being imported from there, then we just end up propagating bad data faster, necessitating even more cleanup. --EncycloPetey (talk) 01:18, 30 October 2017 (UTC)
yeah, i did whinge a little too loud at the IA people at wikimania about the metadata. they were very nice, and commons has items with no metadata. but would not want to import all IA books from there, better to pull commons. and feedback wikidata to IA. we have an ontology dilemma of edition versus work to untangle, no one on the internet is doing it, so we may be leading the way. if we had a database of first editions, or query of worldcat by firsts that might be a starting point. Slowking4SvG's revenge 02:19, 30 October 2017 (UTC)
Indeed, we do seem to be leading the way there, as you say, which does make the work all the more troublesome for us. I've seen time and again that most of the folks not working on Wikidata don't realize the enormity of the Wikisource work/edition/translation problem, not even to a small degree, so the issue is compounded by the fact that we first have to convince the data-wranglers of the situation and explain its complexity. I've tried to discuss it at Wikidata, with people who ought to have the easiest time understanding the issues, yet it took more than a week of back-and-forth posts to explain it to just one other editor recently. Wikidata doesn't even have a complete model or guidelines in place for dealing with many of the edition/translation issues that will have to be sorted out. --EncycloPetey (talk) 03:45, 30 October 2017 (UTC)
yes, we need a wikidata wikiproject for bibliographic metadata. [1] ? the wikicite people did a good start, we need to build on what the librarians have done elsewhere, rather than reinventing the wheel, and imposing an ontology top down. and maybe distill your wiki’splaining into an FAQ there. Slowking4SvG's revenge 17:15, 30 October 2017 (UTC)
@Slowking4: Re: "we need to build on what the librarians have done elsewhere". It's much worse than that. I just cleaned up Wikidata items for Edith Wharton's first 10 novels. That's Edith Wharton, the American author who won the Pulitzer Prize for her novel The Age of Innocence, and whose Ethan Frome is taught in American schools. Of the 10 novels that I worked on, 4 of them didn't have a data item in Wikidata; I had to start one for those. Most shockingly, most of her novels had no authority ID listed in the Library of Congress, and many had no listing in BnF or a GND ID. Some of them don't even have articles on Wikipedia. So even for well-known authors, it's hard to build on what librarians have done, because sometimes there isn't anything there to build upon. --EncycloPetey (talk) 00:35, 6 November 2017 (UTC)
@EncycloPetey:, well i go to LOC for images, and worldcat for metadata [2], i take it you found this "help" [3], [4]. i know some people at LOC - maybe we can team, if you have some concrete asks. and maybe should have started with Jane Austen, given Wadewitz’s work. very few literature articles; they would rather do battleships; it was me and books2read doing author biographies. maybe we need a literature user group; we should start with the wikicite people, move them beyond ontologies, and on to action plans. or put something on the wishlist - m:2017 Community Wishlist Survey. or get your essay in [5], [6] - and btw, should move this to scriptorium, to gauge interest: having policy discussions on templates, is how they roll on WP or commons, not WS. Slowking4SvG's revenge 01:51, 6 November 2017 (UTC)
@Slowking4: I don't think what I'm saying constitutes a policy discussion; it's more of a "here are some problems we face". What I'd really like to do is host some sort of workshop / practicum in San Francisco, but between my current work schedule and my current health issues, that's not going to happen very soon. I think I could demonstrate workflow while making people aware of issues in a way that also teaches new editors for Wikisource / Wikidata / Wikipedia interchange. --EncycloPetey (talk) 02:03, 6 November 2017 (UTC)
if you have a workshop idea, submit it for m:WikiCite_2018. even if you do not have time, memorialize on meta, go for a rapid grant, and maybe we can get some teamwork going. even a good problem statement would be good. Slowking4SvG's revenge 02:16, 6 November 2017 (UTC)
I always use the 'populate from Commons' gadget for Index pages.

Personally, I don't think the data in Index pages is really all that important; I'm much more interested in getting {{header}} to pull data from Wikidata. The problem with Index pages is that they've not got proper sitelinks from Wikidata to here, but just a URL property value, and so we can't look-up from here what a particular page's WD item is. We could do so, once we've made a link from the Index page to the mainspace top-level work page.

And I agree with EncycloPetey that the workflow is sort of backwards if the WD item has to be created before the Index page.—Sam Wilson 01:20, 30 October 2017 (UTC)

Create WD from Commons file, it is an edition, and we have to create that for our index pages anyway. I imagine a delicious adaptation for the book template. However, to do that we have to have all our ducks in order to know that we have publishers and authors already set. — billinghurst sDrewth 03:17, 30 October 2017 (UTC)