Wikisource:Scriptorium: Difference between revisions

From Wikisource
Latest comment: 4 years ago by Mpaa in topic The Outline of History
→‎The Outline of History: no success for me
Line 939: Line 939:
::adding file links for convenience. [[c:file:The Outline of History Vol 1.djvu]] and [[c:file:The Outline of History Vol 2.djvu]] — [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 03:52, 13 January 2020 (UTC)
:::The two djvu files have been moved here, guessing that at some point someone will complain. I have also left a note on the deletion discussion to address this matter. — [[user:billinghurst|billinghurst]] ''<span style="font-size:smaller">[[user talk:billinghurst|sDrewth]]</span>'' 04:07, 13 January 2020 (UTC)
::::{{ping|billinghurst}} I do not know how you managed to create a duplicate here. I tried for the other images (via pywikibot/API) but failed due to "''API error fileexists-shared-forbidden: A file with this name exists already in the shared file repository''". I also tried to import from Commons (see [[:File:Page 011 (Vol. 1 - The Outline of History, H.G. Wells).png]]), but I guess there is no local file here. I hope you have the tools to make a mass move from there to here. All I could do is save a local copy on my PC. [[User:Mpaa|Mpaa]] ([[User talk:Mpaa|talk]]) 22:11, 13 January 2020 (UTC)


== A personal essay from a kindred site ==

Revision as of 22:11, 13 January 2020

Scriptorium

The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel webclient. For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 395 active users here.

Announcements

Changes to Template:Header

To the template

some alterations
  1. for the parameter contributor there is now a synonym section_author
  2. for the parameter override_contributor there is now a synonym override_section_author

This was requested because "contributor" was said to cause some level of confusion. Whether we should migrate usage and/or deprecate the term has not been discussed.

some additions
  1. the parameter section_translator, wikilinked
  2. the parameter override_section_translator, not wikilinked and takes formatting

This allows the translator of a subpart of a work to be recorded; previously, using translator applied it to the work as a whole, not the subsection.

The documentation has been updated.
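
A minimal sketch of how the new parameters might look on a subpage, assuming the template's usual title, author, section, previous, next and notes parameters. The work, the names and the chapter links are hypothetical placeholders, not taken from any real page.

```wikitext
<!-- Hypothetical subpage of a multi-author, partly translated work -->
{{header
 | title              = [[../]]
 | author             = John Smith      <!-- hypothetical author of the whole work -->
 | section            = Chapter 3
 | section_author     = Mary Jones      <!-- synonym for contributor; placeholder name -->
 | section_translator = Jane Doe        <!-- wikilinked by the template; placeholder name -->
 | previous           = [[../Chapter 2/]]
 | next               = [[../Chapter 4/]]
 | notes              =
}}
```

Where the name needs custom formatting or should not be wikilinked, override_section_translator (or override_section_author) would presumably be used in place of the plain parameter.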

It is preferred that any discussion should be handled in a new section on this page, rather than as part of this announcement. Thanks. — billinghurst sDrewth 12:40, 30 December 2019 (UTC)

Proposals

New speedy deletion criterion for person-based categories

The following discussion is closed and will soon be archived:

There is community consensus for a new G8 criterion for speedy deletion of person-based categories.


[Addendum for clarity in the archives: the <s>…</s> markup above was added after the fact by two editors who do not agree with the community's decision, but for whatever reason do not wish to open a new discussion to attempt to persuade the community to their point of view. That the closing text is struck through does not indicate an absence of consensus for this decision; merely that they wish to annotate it with their dissent. Note also that this discussion was closed and reopened mid-way through in a way that is not visible in the archived text: the revision history of WS:S during the period of the discussion must be examined to get the full picture. Any further discussion of this issue (including any contrary community decisions) is likely to be found on WS:S with keywords "G8", "person-based", or "author-based". --Xover (talk) 08:09, 9 January 2020 (UTC)] Reply

The <s>…</s> markup above was added by me, who regards the struck text as a mischievously counterfactual summary of the discussion, and the "addendum for clarity" as little better. Hesperian 23:29, 9 January 2020 (UTC)Reply

Following on from a discussion at WS:PD#Speedy deletion of author based categories.

It is long established and in the main uncontroversial that English Wikisource does not use person-based categories (of the type "Works by John Smith", "Poetry by John Smith", etc.). Some previous discussions can be found at: 1, 2, and 3 (and the two following threads). However, absent a speedy deletion criterion specifically for these, admins have to rely on the provision for precedent-based deletions. In practice this means such categories must be brought to WS:PD to be rubber stamped, wait at least two weeks (because of inertia and habit), and then hopefully someone will remember to process them. Eventually.

I therefore propose that we extend the deletion policy with a new G8 criterion as follows:

  • Person-based categories—Categories where the defining characteristic is person-based. This includes, but is not limited to, author-based categories like "Works by author name".

All deletions (modulo CU type concerns) are subject to community challenge in any case, and are clearly visible in the deletion log, so there is no particular benefit to the bureaucracy where there exists no significant uncertainty or controversy. --Xover (talk) 14:32, 15 July 2019 (UTC)Reply

 Support, but I'd note that there is an exception discussed in link #2: namely, American presidential documents categorized by president. This is due to the fact that the administration of the executive branch is tied to who is the president at the time. There was no consensus as to the scope of this exception: what kinds of presidential documents it applies to, or whether other governments may have the same treatment, etc. —Beleg Tâl (talk) 14:42, 15 July 2019 (UTC)Reply
 Oppose 2 weeks is not too long to wait. organization of subject of a work is useful, a migration to a stable ontology is necessary. Slowking4Rama's revenge 13:58, 30 July 2019 (UTC)Reply
2 weeks is definitely too long to wait when a full bureaucratic procedure with a foregone conclusion could be replaced with a simple administrative action. —Beleg Tâl (talk) 14:32, 30 July 2019 (UTC)
Also it is worth pointing out that this proposal is not regarding whether such categories should be kept or deleted (since we have already established that they should be deleted), but only whether they should be posted to WS:PD before we delete them. —Beleg Tâl (talk) 18:51, 30 July 2019 (UTC)Reply
And that strictly speaking, under current policy, they can be deleted a few days after a notice has been posted to WS:PD (no two week wait required, just that the discussion must have "started"). It's just that habit and inertia inevitably means that almost all cases will in practice suffer this 2+ week purely bureaucratic delay. I'm a big believer in process and the value of bureaucracy when properly deployed, but even I think this one is a pointless waste of volunteer time. We have issues that require actual discussion or other action that have sat open on the noticeboards for a year and a half; we should not waste those resources on filling out forms in triplicate for issues that are not controversial. Any deletion can be reviewed and overturned, if needed, by the community; let's save the cautious multiple-safeguards approach for stuff that might actually need it. --Xover (talk) 19:11, 30 July 2019 (UTC)Reply
I always wait until there has been a full month of inactivity, since there are many editors who only edit occasionally, but that's just me. —Beleg Tâl (talk) 19:17, 30 July 2019 (UTC)Reply
 Support --EncycloPetey (talk) 17:40, 30 July 2019 (UTC)Reply
 Support --Jan Kameníček (talk) 19:38, 30 July 2019 (UTC)Reply
 Support though if possible I'd like to see the exception Beleg Tâl specified firmed up a bit, i.e. perhaps a general exception for things like governments, ministries, and reigns which are "person-based" but serve an obviously different function to categories-by-author (noting on the UK side things like Category:Acts of the Parliament of Great Britain passed under George III). —Nizolan (talk) 00:44, 1 August 2019 (UTC)Reply
  • Note Based on the discussion above I have added the above criterion with an additional limitation to exempt things like UK governments tied to a monarch's regnal period or the administrations of US presidents. I read the above as general support for this criterion—sufficient for adding it—but with some remaining uncertainty about the optimum phrasing. I'll therefore leave this discussion open for a while longer so that interested parties may object or suggest better wording. I'll also add that minor changes to the wording (that do not change the meaning) can easily be made later with a proposal at the policy talk page. And we can always bring bigger changes up here for reevaluation if it causes problems. --Xover (talk) 19:32, 11 August 2019 (UTC)Reply

Deletion review

I long ago (2005) gathered together historical documents related to the life of Indigenous Australian warrior Yagan in Category:Yagan. This has always seemed to me a reasonable category, but it just got speedily deleted without so much as a how-d'-y'-do.

The examples given in this proposal were of the form "Works by John Smith", "Poetry by John Smith", etc. No other examples were given in the discussion. So I'm not sure if the community really intends that categories like this would be deleted. Can we review this please?

Hesperian 23:48, 2 September 2019 (UTC)

Hmm. I'm not going to express an opinion on "should" / "should not" for this, but I will note that based on my understanding of the discussions this would indeed be the intended effect. The defining characteristic of the category is that its members relate somehow to a specific person, and for such the consensus appeared to be that portals were better suited. But perhaps there is a distinction between Category:Yagan and Category:John Smith that I am not seeing? Or is it the specificity: Category:Foo by Person is bad, but Category:Person is acceptable? --Xover (talk) 03:58, 3 September 2019 (UTC)


As things stand:

  • I can gather together documents about the Battle of Borodino in Category:Battle of Borodino, because that's an event.
  • I can gather together documents about Fort Knox in Category:Fort Knox, because that's a place.
  • I can gather together documents about scissors in Category:Scissors, because they are objects.
  • I can gather together documents about intelligence in Category:Intelligence, because that's an abstract concept.
  • But I can't gather together documents about Yagan in Category:Yagan, because he was a person.

Can no-one see how bizarrely arbitrary this is??

And it hasn't even really been discussed, since the only examples given above are "Works by" categories, the deletion of which makes perfect sense. Hesperian 11:50, 3 September 2019 (UTC)Reply

Fully agree with Hesperian, the speedy deletion is a misinterpretation of the guidance. The "category:works of ..." is to ensure that works of authors are added to author pages, and not categorised. There is no determination that it would relate to anything else. Categorisation has always existed for people, again our biggest issue is how to separate author categorisation from subject categorisation. — billinghurst sDrewth 12:39, 3 September 2019 (UTC)Reply
Read the policy, it does not say "works by …", it says "person-based". —Beleg Tâl (talk) 12:56, 3 September 2019 (UTC)Reply
Per our deletion policy (as updated according to the consensus in the above discussion), "Person-based categories" are now a criterion for speedy deletion. This "includes, but is not limited to, author-based categories", but "the defining characteristic is person-based". This was very explicit in the above proposal. My deletion of Category:Yagan was therefore 100% within our deletion policy. You can propose a reversion to the older version of the deletion policy, and a restoration of Category:Yagan (even though it is entirely redundant of Portal:Yagan), but I will have no part in it. —Beleg Tâl (talk) 12:53, 3 September 2019 (UTC)Reply
Also: as things stood before the above discussion, I could gather together documents about Yagan in Category:Yagan, but couldn't gather together documents about Yazid III in Category:Yazid III, which is just as bizarrely arbitrary. —Beleg Tâl (talk) 13:03, 3 September 2019 (UTC)Reply
(ec) It is my opinion that it is not a positive change. 0-100 in four seconds. I find the statement It is long established and in the main uncontroversial that English Wikisource does not use person-based categories to not be the case, especially as it has been the case since 2005. Something that was entirely in scope and I believe would have been kept in a PD, is now going to a speedy deletion and deleted without conversation. I find that inappropriate, and for that to have been implemented in four weeks is an example of poor implementation and poor policy. I am wondering where this community is going, and the lack of vision that this represents. — billinghurst sDrewth 13:14, 3 September 2019 (UTC)Reply
It may also have simply flown under the radar. It is also just one category affected, and a completely redundant one at that (equally redundant to any Author-based categories). And the proposal to update the policy was done entirely by the books, and is a significant benefit to the community. —Beleg Tâl (talk) 13:30, 3 September 2019 (UTC)Reply
And it has been long established and in the main uncontroversial that English Wikisource does not use categories for individuals who have pages in Author space; the fact that there existed one or two categories for an individual in Portal space is (to me) a minor detail and I would have also considered it long established and uncontroversial that these were also unwelcome. —Beleg Tâl (talk) 13:33, 3 September 2019 (UTC)Reply


Of most concern to me in this new G8 is, what if Portal:Yagan did not exist? In that case, Category:Yagan would be the only way in which we had organised our material by topic, yet it would still be summarily deletable under this new G8.

I think a more coherent policy position might be:

We don't want to organise our material by both Author/Portal and Category. So it is fine to create a category for a topic if there is no corresponding Author/Portal page. But be aware that this is a stopgap -- once someone has created the Author/Portal page, the category may be deleted.

Note that this doesn't distinguish people from other topics. Category:Yagan is fine, but only until Portal:Yagan has been created. Even Category:Works by John Doe is fine, but only until Author:John Doe has been created.

I think the biggest problem with this position is the really big topics that would be better handled by a category than by an Author/Portal page e.g. War. In that case, I would say keep the category and ditch the portal, which would be unmaintainable. In a speedy criterion there would certainly need to be something to prevent deletion of categories that contained subcategories or a collection of portal/author pages.

Thoughts? Hesperian 22:50, 3 September 2019 (UTC)


Since the attitude to concerns raised here has been "I will have no part in it" followed by non-participation in the discussion, I have boldly replaced "person-based" with "author-based". I accept the new G8 was proposed, discussed and implemented in good faith, but subsequent objections have made it clear that there is no consensus for speedy deletion in the gap between "person-based" and "author-based".

To be clear: we may not agree on whether Category:Yagan should have been deleted, but I think we can all agree that the deletion was contentious, and speedy delete criteria are intended to capture non-contentious matters.

Hesperian 07:53, 6 September 2019 (UTC)

@Hesperian: I'm not going to revert that because I think at least temporarily going back to the status quo is prudent when a concern has been raised so soon after implementation. But I do object in principle to your approach here: whatever the problems with the new G8, it was properly discussed, consensus determined, and implemented. For you to unilaterally reverse it is not a good practice, no matter the merits of your concerns with it. The proper description of the thread above is, strictly speaking, not "absence of consensus" but rather "complaints after the fact" (possibly good, proper, and meritorious complaints, but still after the fact). So I am going to insist that this removal of the new criterion is a temporary measure while discussion is ongoing, and not the new status quo. If no new consensus is reached here then we revert back to what was previously decided. (To be clear, if you had suggested we should temporarily revert I would have supported that. It is your acting unilaterally with an apparent intent to change the status quo I object to.)
That being said I am absolutely open to being convinced of anything from the new criterion needing to be tweaked and to it needing to be dropped altogether. The reason I am not currently actively discussing is that I do not feel I sufficiently grasp the issue and am mulling it over. Your distinction between "person-based" and "author-based" has not been apparent to me prior to your latest comment, and I now suspect that that distinction is the crux of your objection; but I still do not grasp why you do not feel a portal would be sufficient. On the other hand, reasonably curated categories are cheap, and can conceivably be automatically applied to works included in a portal.
I also suspect, though I may of course be entirely mistaken, that what we are discussing here is not actually a speedy criterion, but rather a more fundamental issue of category and portal policy. I am not convinced the speedy criterion is a useful proxy for that debate, on the one hand, and that the former will resolve itself neatly if the latter is settled, on the other. --Xover (talk) 08:30, 6 September 2019 (UTC)Reply
@Hesperian: "I will have no part in it" is me, not the community. I agree with Xover that it is necessary to establish a new consensus with the community to make a subsequent update to the deletion policy (in which discussion I will remain neutral). And like I said to TE(æ)A,ea.: three days is not remotely sufficient for closing a discussion. Be patient. —Beleg Tâl (talk) 12:20, 6 September 2019 (UTC)Reply
  •  Comment There is definitely a long-established practice that we collect and curate works that relate to authors, and due to our strong preference to curate, we determined to not categorise, which would have a duplication and a confusion. It has not been the case for individuals who were not authors, and it should not be a requirement that we have to curate such pages, especially where a person may be mentioned on a page(s) though not be the focus of the pages. For instance, the page The Perth Gazette and Western Australian Journal/Volume 1/Number 28 would be considered for categorisation in "Category:Yagan" though would not particularly be the focus of a page and put onto a Portal: ns page. I would definitely not expect someone to have to make edits to a portal page to that target, though I would have no qualms with someone categorising. Where we have authors, we have wikilink'd back to author pages for that relevance. So it is my belief that these non-author categories should not be speedied, if there is a case for their deletion, then bring it to the community. I also believe that a proposer should be listing consequences of their suggested policy changes, not leaving it to the community. I find the above consensus to be a troubling "yes ... tick and flick" exercise by the community without an in-depth exploration of the consequences, approving a change to speedy deletion should be items that are completely non-controversial.

    The above deletion discussion started with the scope of a PD discussion about author categories, and then specifically addressed two author related categories. No examples were given of non-author categories that would have been wrapped up in the change of our guidance, nor that we were going to now speedy delete categories that have been existing for greater than 10 years. I have a strong belief that anything that has existed for over 10 years onsite should not be speedied, and that speedy deletions are only best applied to recent additions.

    Xover: You suggested the policy change, then summarily closed less than four weeks later, and implemented. May I suggest that is not the ideal practice either, as this is a change of policy where all person categories are deleted, not as indicated in the discussion that it was an existing process and the speedy being the only change. We are not a huge community, we don't have the same editing rates, or the diversity of eyes to analyse such situations, and that is traditionally why we have left discussions open for extended periods. — billinghurst sDrewth 10:55, 7 September 2019 (UTC)Reply

    @Billinghurst: "Too quickly closed" is a fair complaint, although I don't entirely agree with that assessment. I agree there should be plenty of time for the community to ponder, scrutinise, discuss, and decide; and in fact was somewhat disappointed that the proposal did not garner wider participation and more discussion. I agree speedy criteria should have a firm basis, which broad participation in the proposal is the best way to ensure (and document!). But I also observe that community participation in such discussions is distressingly low in general, and by that yardstick the above was about the most I felt one could realistically hope for. When no further comments either way surfaced—not even any "Unsure" or "Wait, I need to think a bit more"—I felt that was sufficient to implement. If we want to have much longer timeframes to tease out every possible community comment then we should have specific guidance to that effect (and I do mean a specific number of weeks).
    I agree that speedy should be for uncontroversial things, but then my understanding was that this was uncontroversial. My intent in making the proposal was not to change practice regarding use of categories vs. portals, but rather to eliminate a pointless two-week wait and bureaucratic box-ticking for something that was a priori determined would be deleted. I do however disagree that speedy should not be applicable to, for example, decade old clear copyvio. The purpose of speedy deletions is to reduce bureaucracy and make maintenance more efficient—where possible—and to reduce the demands on the community's time and attention in formal discussions. Because, as you point out, such participation is perhaps our scarcest resource! The age of the material affected is entirely orthogonal to whether it falls within one of the speedy deletion criteria.
    "Uncontroversial" is a better distinction, but even there some nuance is needed. The policy that leads to the deletion (by whatever process) must be unambiguously decided: it must be uncontroversial that that was what the community decided. The issue itself, though, can still be plenty controversial: there are some contributors who would never see anything deleted, for any reason, and express their frustration with copyright law and our copyright policy in every copyright discussion they participate in (nevermind proposed deletions). That someone disagrees with the community's decision, once made, is not a valid reason for considering the implementation of that decision controversial.
    On the issue at hand, though, I (am starting to) see the person/author distinction, but I am having trouble understanding how a portal is any less suited for a person than for an author. To my mind the very same arguments for portal over category for authors apply equally to persons. Why wouldn't The Perth Gazette and Western Australian Journal/Volume 1/Number 28 go in the portal? Or is it the perceived relative amount of effort in curating the two approaches? Hesperian's more coherent policy position seems to suggest that that is the case.
    I don't think starting with a category but deleting it if a portal is created is a particularly rational approach, but as a proposal it does speak directly to the relationship between categories and portals. To me, the opposite end of the spectrum (that you also address) seems more elucidating: once a topic is sufficiently large, a portal becomes an awkward way to organise the information. In those cases I could see an argument for using both; the category for everything and the portal for the highlights. But that's an argument that will be relevant only rarely (relatively speaking) and only in the reverse order (only once the portal is "full" does the category come into play). Most person-related topics will not have too many relevant works for a portal.
    Or perhaps a different angle of attack would aid common understanding: Categories, Portals, and Author-pages overlap in various ways and in different degrees, and so we should establish some coherent guidance on the purpose of each, what to use each for, and how to distinguish between them in difficult cases. Perhaps in discussing what that guidance should be we would better understand the various perspectives than through the proxy of a speedy criterion? For example, do we want a portal about a person as a historical figure if that person is also an author? Is an Author: page and a Portal: the same thing except for inclusion criteria? Do the same layout rules and restrictions apply to both? --Xover (talk) 03:19, 9 September 2019 (UTC)Reply
i am sad that admins persist in summarily deleting, for contentious issues that require a consensus. we need a standard of elevating issues on chat before deletion. and a standard of practice of how to organize ontologies of "subject of" and "depicts". i don’t care how- portals, categories, subsection, anything that can be linked from wikidata. but we need an organizational consensus, not deletion. Slowking4Rama's revenge 03:43, 13 September 2019 (UTC)Reply
@Slowking4: But, but, but, but you do not understand the sysop perspective. They delete without consequence (for themselves, as from a sysop's perspective a deleted page may be viewed/restored, and viewed without going through with the restore. See? No consequence!) As for the plebs, tough! Them's oughta put in an application to be tiara'd like good little princesses… 114.78.171.144 06:09, 13 September 2019 (UTC)
114.78: I realise you're taking the piss here, but I actually agree that this is an important difference in perspective to take into account. One thing is that the consequences of deletion can in some (but not all!) cases appear smaller to those with the technical ability to view and restore deleted pages, but the perspective is also shifted when you have long backlogs of tasks that either can only be resolved (in practice) by deletion or where deletion is a fairly foregone conclusion. To have to conduct a formal analysis, formulate it cogently, and run a community discussion is a lot of effort. The relatively low community participation in those discussions means they have a tendency to deadlock, and if resolved are too local to support any kind of future precedent. When a lot of your tasks are dealing with that dynamic, you will naturally tend to develop a bias (big or small) toward more efficient resolutions like having speedy criteria for whatever the issue at hand is.
But when you spend a lot of time going through the maintenance backlogs you also gain the very real experience that tells you that a lot of stuff has been dumped here with no followup, attempts to format properly, or even giving minimal source or copyright information. There is literally no hope of these works being brought up to standard as they are, and would in any case be easier to recreate from scratch than fix in place, even if they aren't blatant copyright violations. While we certainly need to watch for and not get fooled by the previously mentioned bias, we also should let ourselves be guided by this experience. Sometimes the perspective of those who work the maintenance backlogs (which is not by any means limited to just admins!) gives them a better foundation for reasoning about an issue than those who work primarily on their own transcriptions (and sometimes not). --Xover (talk) 07:25, 13 September 2019 (UTC)Reply
your "guided by experience" does not address the power dynamics of a summary standard of practice. when you undertake an action. no matter how reasonable or justified you may feel, while the community is feeling ill-used, then you might want to rethink your action, if you would presume to lead a community. we have a lot of ban-able admins. Slowking4Rama's revenge 11:44, 13 September 2019 (UTC)Reply
@Xover, @Slowking4: My sincere apologies if my comment came across solely as micturient. When young fresh meat front up to gain the authority bit it is entirely reasonable they not realise they are actually signing up for a melange of teacher, executioner, judge and neat-freak. What is less excusable is that some of them never even learn of the damage they do to the parallel roles whilst obsessing over the matter of the moment. Ordinary users are watchers and judgers too and may take away quite unexpected conclusions from administrator actions. Looked at another way the spread of intelligence is (sadly) unrelated to the authority role granted. That there never seems to be a shortage of potential idiot actions does not mean it is a good idea to go down each and every rabbit-hole.
On the other hand the occasional well-reasoned explanation might even result in the next applicant putting their hand up and taking some pressure off the backlog slaves. If that flags me as both bitter and optimistic then just handle it. I have to. 114.78.171.144 22:06, 13 September 2019 (UTC)
@Slowking4: I have suggested above that the ontological discussion might be a better way to approach this issue than the speedy criterion. What are the ontological categories we need to handle, and what tool or structure of those we have available to us would be best to handle each? If we can figure out some guidance on that then what should be kept and what should be deleted will, hopefully, follow naturally. Perhaps you could flesh out your thoughts regarding "subject of" and "depicts" with that in mind? --Xover (talk) 07:25, 13 September 2019 (UTC)Reply
we would need to group together all those works, for which people seem to use categories. we have categories on authors, we could start with a wikidata infobox at author pages. if the community wants portals for subjects, then we will need an infobox and migration from categories to portals. (this is different from how it is done on commons) you could then link on wikidata, and have some query function to aid search, we need some wayfinding to aid search of topics. Slowking4Rama's revenge 11:53, 13 September 2019 (UTC)

Further discussion needed (New speedy deletion criterion for person-based categories)

I am quite a bit concerned about this, and have unarchived it to prevent it lingering on unresolved.

We are now in a situation where the community has voted to implement a criterion for speedy deletion, that allows any administrator to delete such matter at their own discretion with no a priori community approval (all admin actions are, of course, subject to a posteriori review by the community), but where at least two long-standing and very experienced contributors have objected to the core issue after the fact, and levelled criticisms at the formalities of the community decision process. Their objections are reasonable ones (in the "reasonable men may disagree" sense), and the criticisms of the process valid.

To make clear the procedural issues, the proposal described the issue as "in the main uncontroversial", which the objections have demonstrated was not entirely accurate, and it was closed after a mere four weeks (two weeks after the last comment), when an objection became apparent after six weeks. Additionally, relating to the core issue, those who disagree feel the examples provided in the proposal do not accurately reflect the criterion as it was implemented. These are all valid complaints and the responsibility for these deficiencies in the procedure falls to me (my apologies).

But, in any case, the core issue remains: we now have a speedy criterion that two very respected and experienced community members have valid and strongly held objections to.

The arguments of those who object are presented above under the "Deletion review" thread. I had hoped that the community would chime in on that discussion such that it would be possible to assess whether the community shares the concerns of those who have objected, or whether they still support the criterion as implemented.

But as that has not happened I would like to directly request that the community chime in to make clear their position on how to handle this.

  • Despite the criticisms, the original community vote was valid and concluded with support, so the default outcome, if no change is mandated here, is that the criterion as written will be implemented. It is currently temporarily suspended as a conservative measure since objections have been raised.
    • In particular, this means that if you do not express an opinion now you will in practical effect be reaffirming the original outcome!
  • Does the community feel that the concerns raised are serious enough to invalidate the previous vote and revert to the status quo ante?
  • Does the community feel we should proceed as per the existing vote and adjust course as necessary at a later date?
  • Alternately, does the community feel we should proceed as previously voted but with specific changes to the wording of the criterion?
    • For example, Hesperian has specifically proposed replacing "Person-based" with "Author-based" in the criterion.
  • Would the community prefer a new proposal, that better explains the issues, be made and a new vote held on that?
  • In essence: do you have any opinion or recommendation on how this disagreement should be handled such that we end up with the issue settled?
    • Not everyone needs to agree with the outcome, but everyone should preferably feel that the outcome was fairly arrived at!

Pinging previous participants in the vote/discussion (but everyone is, of course, encouraged to chime in): Beleg Tâl, Slowking4, EncycloPetey, Jan Kameníček, Nizolan, billinghurst, Hesperian.

This has dragged on unresolved and it's the kind of thing that has the potential to create conflicts and discord down the line so, despite the sheer amount of text and rehashing, please chime in and make your position clear! --Xover (talk) 07:22, 14 October 2019 (UTC)

  •  Comment The examples used of the purpose and solutions did not adequately represent the proposal. I don't believe that any long-held page that appears valid at a point in time should be speedy deleted with a change in policy, especially where it is unclear in the proposal that such pages were being incorporated. My understanding of our approach was that we would not build author category listing pages those to go. — billinghurst sDrewth 09:56, 14 October 2019 (UTC)Reply
    @Billinghurst: It is not clear to me from this comment how you would prefer to resolve this issue. Could you make that explicit? --Xover (talk) 06:46, 15 October 2019 (UTC)Reply
    Don't speedy delete long-held pages.

    If you are putting forward a policy change, then identify the pages that are going to be caught by the policy change. Look to use best examples, not examples where we are already in agreement. If you are deleting and you come across long-held pages that you believe are caught by a policy change, and they have not been specifically mentioned, then have the open discussion so that we have a consensus that this is what we were looking to do. Administrators are the implementers of consensus, not the determiners of what happens here, and we should be looking to be considerate. Err on the good side and the patient side. In reality, for many things there is no hurry, despite some of us at some stages just wanting to get things tidied away. billinghurst sDrewth 22:46, 21 October 2019 (UTC)

    @Billinghurst: Apart from the age exemption (which I have addressed above somewhere), this is all good advice and I agree whole-heartedly. But now you're just chiding me. What, specifically, is your preferred way to resolve this issue? Do you want the new speedy criterion rolled back and removed? Do you want its text changed from "Person-based" to "Author-based"? Or are you proposing an entirely different, general, rule that no content older than X time units may ever be deleted under any criterion for speedy deletion?
    Because right now we have an existing, valid, community decision in favour of the new criterion with the "Person-based" meaning, but I am bending over backwards to try to make sure the concerns you and Hesperian have raised are taken into account (giving everyone a chance to change their minds if they are swayed by your concerns).
    If your goal is to censure me for insufficiently researching and documenting the consequences of the new policy, or for failing to insist on a longer period before being closed, then, fine, consider me suitably chastened. But me standing dressed in a white sheet in church on three Sundays isn't really going to change much. So far, of those who originally supported the new criterion, only Jan has chimed in and they reaffirm their original position. If you want a different outcome you need to at least tell us what it is. --Xover (talk) 07:49, 22 October 2019 (UTC)
  • As I said before, I remain  Neutral regarding the proposed change from the current "person-based" deletion rationale to the proposed "author-based" rationale. —Beleg Tâl (talk) 15:19, 14 October 2019 (UTC)Reply
    I voted for deletion of person-based categories and I hope that the vote also counts in this way. If somebody wishes only deletion of author-based categories instead, it should be suggested as an alternative rule. I admit it is my fault I did not protest when somebody changed the proposal without others expressing their consent clearly, but still: changing rules needs explicit consent, which is missing here.
    That said, I do not think that the idea of treating author-based categories differently from categories of other people is good.
    • Firstly, this can be a source of big confusion to many readers browsing categories: some people are included in the category tree and others not, and an accidental visitor to Wikisource unfamiliar with our internal rules will not find the clue.
    • Secondly, it is not defined who is considered to be an author by this rule: A person who is author of a work at Wikisource? A person who is author of a work eligible to be added to Wikisource? A person who is author of a work in English or translated into English, although it won't become eligible for WS for decades? Or any person who is author of whatever in any language, which may but also may not be translated into English in the future? We have some definition in the Style guide which says that "... author ... is any person who has written any text that is included in Wikisource." However, too many contributors refuse to follow this definition and create author pages of people who have no work here, sometimes even authors who have never written anything in English and nothing by them has been translated into English so far (example). I am afraid the same will sooner or later happen with categories.
    • Let's say that we determine some line dividing authors and the rule will say which authors can have categories and which not. The rule could be: authors who have an author page cannot have a category, and vice versa (or any other definition). Again: an accidental visitor browsing categories will be confused, unable to find our internal clue why Alois Rašín can be included in the category tree and Karel Kramář not.
    To conclude it, the best way is the simplest: forbid all person-based categories and organize people only in the author and portal namespaces, or alternatively allow categories for everybody. I am for the first of these two choices. Jan Kameníček (talk) 20:10, 14 October 2019 (UTC)Reply
  • comment, i am concerned about increased use of speedy deletion, that has been abused elsewhere. i would prefer use of maintenance task flows in the open. i do not see a pressing problem. but maybe this is overblown, and the admin task flow here will not be abused. i raised my concern and got dismissed, which is fine with me.
  • what we really need is a consensus about how we structure our data with wikidata. (be it categories, portals or tags) we need a stable page, about work subjects, that can link to wikidata. we have a "works about" section for authors. but we need it for non-authors also. Slowking4Rama's revenge 14:09, 15 October 2019 (UTC)
    Your concerns were not dismissed, some editors merely disagreed with them. But in the interest of clarity, in view of your comment here and your original oppose vote to the proposal, do I understand correctly that your preferred resolution to this issue is to roll back to before this proposal and have no speedy deletion criterion for this at all? --Xover (talk) 14:41, 15 October 2019 (UTC)Reply
yeah, apparently, i have out of consensus views of those who show up for process discussions. i just want some stable bibliographic metadata about "depicted people" and subjects. i am open to how to structure it, and what is the road map to get there. i do not care about rolling back a particular direction that i think is mistaken. (the problem with deletion is that it decreases the slim possibility of quality improvement, since it hides quality defects rather than making them more visible.) Slowking4Rama's revenge 15:43, 15 October 2019 (UTC)
@Slowking4: do you see the value in having a general essay and guidance on how we handle people who are not authors. Then having a range of means to handle these depending on the person's notability, and possibly the number of references/sources that we are having for these people. Some of the solutions will be here at enWS, and others may be at WD. Our policy guidance of 2010 probably needs to evolve with the implementation of Wikidata which is a bigger people resource and allows interactions and linking differently than our 2010 focus on enWP linking to notable people. Here I am thinking something akin to Wikisource:For Wikipedians and it might be something like [[Wikisource:For Wikidatans]] and [[Wikisource:Managing people data at Wikisource]]. — billinghurst sDrewth 22:55, 21 October 2019 (UTC)Reply

Notice that discussion about to be closed

Since this discussion has been stalled since October (and ongoing since July), and no new consensus appears to be likely, I intend to formally close this discussion at some point before the new year as reaffirming the original consensus. If further or new discussions related to this issue are needed I suggest they be brought up in a separate thread, and unless they represent a concrete proposal, that the thread be opened down in the regular discussion section. --Xover (talk) 08:33, 14 December 2019 (UTC)Reply

This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --Xover (talk) 10:49, 29 December 2019 (UTC)

Update to NopInserter Gadget

The following discussion is closed and will soon be archived:

There is a clear absence of community support for this change, though the lack of even a single objection suggests this may be due to apathy or lack of interest rather than any perceived or actual problem with the proposal. In future, the community is strongly urged to express their preference explicitly in order to enable consensus-based processes to function.

A while back, while debugging an unrelated issue, I found a bug in MediaWiki:Gadget-NopInserter.js that prevented it from displaying the intended visual indication of its operation. I also found that at the time of implementation there had been some differing preferences for what type of visual indicator be used and when prompting for confirmation was appropriate. I have therefore created an updated version in User:Xover/Gadget-NopInserter.js that fixes the bug and adds configuration options for whether to confirm addition of a {{nop}}, the style of visual indicator, and the duration of the indicator effect. To try it out you can add the following to your common.js (but disable the site-wide gadget in your preferences first!):

mw.config.set('userjs-nopinserter', {
	dontConfirmNopAddition: true,
	notificationStyle: "highlight",
	notificationTimeout: 1000
});

mw.loader.load('//en.wikisource.org/w/index.php?title=User:Xover/Gadget-NopInserter.js&action=raw&ctype=text/javascript');

It also fixes the bug that prevented the site-wide gadget from actually showing the outline based highlight it was supposed to. And for good measure I added support for a notificationStyle using mediawiki's bubble notifications (set notificationStyle: "message" to try it out). The weird double-negative construction of "dontConfirmNopAddition" is just because I've preserved the default behaviour of the site-wide gadget. If you remove everything except the mw.loader.load line (no setting of options) you will get the old default behaviour with just the bugfix.
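As a rough sketch of the alternative just described, the same options can be set with the bubble-notification style. The option names and the loader URL are the ones given in this post; the `typeof mw` guard, the variable name `nopInserterOptions`, and the reading of the timeout as milliseconds are assumptions added for illustration only.

```javascript
// Options for the userspace copy of NopInserter, using the names from the post above.
const nopInserterOptions = {
	dontConfirmNopAddition: true,  // presumably: add {{nop}} without prompting first
	notificationStyle: "message",  // MediaWiki bubble notification instead of "highlight"
	notificationTimeout: 1000      // duration of the indicator; unit assumed to be milliseconds
};

// Assumption: guard so the snippet does nothing outside a MediaWiki page.
if (typeof mw !== "undefined") {
	mw.config.set('userjs-nopinserter', nopInserterOptions);
	mw.loader.load('//en.wikisource.org/w/index.php?title=User:Xover/Gadget-NopInserter.js&action=raw&ctype=text/javascript');
}
```

As with the original snippet, this would go in common.js with the site-wide gadget disabled first.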

The changes can be seen in this diff.

The changed version has had some limited testing and seems ready for wider testing. I therefore propose that we update MediaWiki:Gadget-NopInserter.js with this version. Note that since we do not have interface administrators locally, I will have to request this edit from the Stewards at meta, and they will require a community discussion to verify that this is indeed a change in line with community consensus. It would therefore be very helpful if as many as possible indicated whether or not you support this proposal. --Xover (talk) 12:37, 6 October 2019 (UTC)Reply

I think local bureaucrats can set the "Interface administrators" bit. Mpaa (talk) 14:18, 6 October 2019 (UTC)
Bureaucrats have the technical ability to flip that bit, yes. But by WMF Legal-imposed policy it requires 2FA, and so can't just be assigned ad hoc like other local permissions. And since we have no permanent interface admins, nor any "list of people willing and able to make interface admin-edits", we don't actually have any functioning local way to request such changes; unless you yourself happen to have 2FA enabled for other reasons. Thus, asking the Stewards at Meta is actually the easiest option for getting such changes made currently. --Xover (talk) 05:55, 7 October 2019 (UTC)Reply
OK, just a bit weird that it is 'technically' possible but not 'legally' possible without 2FA. If they want to be on the safe side, they should not allow without 2FA.
 Support Anyhow, I am fine with the proposal. Mpaa (talk) 19:37, 7 October 2019 (UTC)
Yeah, the 2FA requirement is kinda dumb to begin with (not that it helps that we don't have a functioning local Interface Admin policy). In any case, this thread, so far, does not demonstrate community support for the proposed change (absence of objections is not the same as support), so it appears this change will not be implemented. Note that if the community's reticence should happen to be about the other changes, I can redo this patch to only fix the (7 years old) bug. In the meantime, anyone that wishes may of course use the copy in my userspace using the syntax described above, but absent any indications of interest I probably will not be actively maintaining that copy. --Xover (talk) 13:39, 5 November 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:18, 14 December 2019 (UTC)

Proposed changes to WS:WWI regarding advertisements

The following discussion is closed and will soon be archived:

The proposed changes have been implemented.

There is a proposal to update the wording of our policy regarding the inclusion of advertisements, in particular advertisements that are part of a larger transcluded text. Please see the discussion at Wikisource talk:What Wikisource includes#Proposed changes to Advertisement section. —Beleg Tâl (talk) 13:50, 15 November 2019 (UTC)Reply

This is more of a clarification than a change of policy. The previous instructions were very vague and confusing. Kaldari (talk) 23:27, 18 November 2019 (UTC)Reply
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:48, 14 December 2019 (UTC)

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

Once a Week Vol. 7

The following discussion is closed and will soon be archived:

resolved

Vol. 7 of Once a Week is missing pages 629 and 630. Of the two scans at IA, this is the one that has all the pages. Could someone please use it to repair the Djvu, or replace the whole thing if necessary, since the complete scan is fairly decent quality. Levana Taylor (talk) 22:32, 23 November 2019 (UTC)Reply

@Levana Taylor: Done. Sorry about the delay. --Xover (talk) 13:50, 29 November 2019 (UTC)Reply
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Mpaa (talk) 20:37, 21 December 2019 (UTC)

Index:Poetry of the Magyars.djvu

The following discussion is closed and will soon be archived:

Resolved.

I have uploaded a corrected source file, but the file correction necessitates some page moves. I have made the more complicated moves, but would appreciate it if someone with a bot can complete the process.

Only pages in the (DjVu) range /48 to /216 need to be moved, and these pages need to be moved one page down. That is:

  • Page:Poetry of the Magyars.djvu/48 --> Page:Poetry of the Magyars.djvu/47
  • Page:Poetry of the Magyars.djvu/49 --> Page:Poetry of the Magyars.djvu/48
  • ...
  • Page:Poetry of the Magyars.djvu/215 --> Page:Poetry of the Magyars.djvu/214
  • Page:Poetry of the Magyars.djvu/216 --> Page:Poetry of the Magyars.djvu/215

All pages outside the range stated above already are in the correct location. I had to do some of those by hand because the original file was missing two pages, in addition to other problems, so not all pages needed to be moved one. --EncycloPetey (talk) 20:05, 26 December 2019 (UTC)Reply
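For anyone preparing a similar request, the move list above can be generated rather than typed out. This is only a sketch: `buildMoves` is a hypothetical helper, it only builds and prints the list (the actual moves would still be carried out by a bot), and it assumes that /47 is free before the first move.

```javascript
// Build the list of page moves for DjVu positions 48..216, each shifted one page down.
const INDEX = "Page:Poetry of the Magyars.djvu";

// Hypothetical helper: returns [{from, to}, ...] for positions first..last.
function buildMoves(first, last) {
	const moves = [];
	// Ascending order matters: each target (n - 1) was vacated by the previous move.
	for (let n = first; n <= last; n++) {
		moves.push({ from: `${INDEX}/${n}`, to: `${INDEX}/${n - 1}` });
	}
	return moves;
}

const moves = buildMoves(48, 216);
for (const m of moves) {
	console.log(`${m.from} --> ${m.to}`);
}
```

Running the moves in descending order instead would try to move each page onto a position that is still occupied, which is why the list is built from /48 upwards.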

Done (in a minute). Mpaa (talk) 20:43, 26 December 2019 (UTC)Reply
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 10:56, 29 December 2019 (UTC)

Other discussions

How shall I transcribe two books in one?

The following discussion is closed and will soon be archived:

Resolved.

I have started working on a publication of engravings by Wenceslaus Hollar. The book does not contain the year of publication, but HathiTrust states that it was published between 1794 and 1812. The book looks like a reprint of originally two separate books, one published in 1640 and the other in 1643. The problem is that this reprint does not have one title common for both parts.

Can I transcribe the publication as two separate works under their individual titles? Or should I transcribe them as one work and devise some title? I was considering using the first of the titles for the whole publication, but it would be really misleading, as it speaks only about England, while the other part deals with various European countries. --Jan Kameníček (talk) 19:50, 22 November 2019 (UTC)Reply

I'd just transcribe them as two separate works, if there's no overall introduction or anything.--Prosfilaes (talk) 21:04, 23 November 2019 (UTC)Reply
I also think it is the best solution, but I wanted to have it confirmed by somebody else. Thank you very much. --Jan Kameníček (talk) 21:25, 23 November 2019 (UTC)Reply
I, on the other hand, would probably transcribe them as one work and devise some title, like I did with The Holly & the Ivy, and Twelve Articles and Lyra Ecclesiastica. Beleg Tâl (talk) 21:30, 23 November 2019 (UTC)
Hm, simple connection of two titles with "and" could also be a solution. I’ll think about it for a while, thanks as well. --Jan Kameníček (talk) 23:20, 23 November 2019 (UTC)Reply
I don't think there is a clear answer in general; this sort of thing needs a judgement call for each work, and with quite some leeway for individual contributor preference. It also needs to be considered whether the book in question is actually a publication and not merely two works bound together (as was common practice for collectors of all stripes in the 18th and early 19th century). And on this particular book the fact the two works have the same publisher might suggest they are one publication, while the fact both included works have separate colophons suggests they are independent publications bound together. Similarly, there appears to be no front or end matter that is common to both works: they share only the binding. It is hard to be categorical, but I suspect I would have eventually landed on treating these as separate works that had merely been bound together. But I would not have faulted anyone for landing on the opposite.
Incidentally, the publishers, “Laurie & Whittle”, are still around, trading these days as “Imray Laurie Norie & Wilson Ltd”. --Xover (talk) 08:32, 24 November 2019 (UTC)Reply
I know of a number of examples where works more or less related were packed into one binding out of publishing constraints. I think that we should make sure that separate parts are separated out, like they would be in an anthology or magazine, and make them available individually, even if they are under a higher level heading for the complete work.--Prosfilaes (talk) 02:00, 27 November 2019 (UTC)
Which has been done by creating redirects at the root where they have been displayed as subpages. Where they have a set of known publishing components, especially with regard to how they are portrayed at Wikidata, then keeping to the known truth is best. Here the provenance of the work is simply not known, we just know that they shared the same binding.

We know that many of our works were singly published, serially published, and multiply published, so do what makes most sense that maintains the credibility of the publication/work(s). Document it well either in notes, or on talk page, so that someone can understand what you did when looked at in five years time. — billinghurst sDrewth 04:30, 27 November 2019 (UTC)Reply

Thanks everybody for valuable opinions. I have considered them all and finally decided to keep them together (as the publisher enclosed them in common binding), but as two separate subpages and with explanation in the note. I think this solution shows that originally they were separate and at the same time it is faithful to the intention of the reprint’s publisher. --Jan Kameníček (talk) 00:11, 8 December 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 10:57, 29 December 2019 (UTC)

Page deletions

The following discussion is closed and will soon be archived:

Resolved.

A quick speedy: Template:Ws diclist smallcaps.css, but I can't tag the page as such. This was created in error. ShakespeareFan00 (talk) 19:07, 9 December 2019 (UTC)

@ShakespeareFan00: Done --Xover (talk) 19:30, 9 December 2019 (UTC)Reply
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 10:59, 29 December 2019 (UTC)

How often does the database for searches actually update?

https://en.wikisource.org/w/index.php?title=Special:Search&limit=500&offset=0&ns0=1&ns1=1&ns2=1&ns3=1&ns4=1&ns5=1&ns6=1&ns7=1&ns8=1&ns9=1&ns10=1&ns11=1&ns12=1&ns13=1&ns14=1&ns15=1&ns100=1&ns101=1&ns102=1&ns103=1&ns104=1&ns105=1&ns106=1&ns107=1&ns114=1&ns115=1&ns828=1&ns829=1&ns2300=1&ns2301=1&ns2302=1&ns2303=1&sort=create_timestamp_desc&search=insource%3A%2Fclearfix%2F&advancedSearch-current={}

I can edit, update the search and find that entries I'd ALREADY resolved still appear in the Search Results. I assume the searches are being cached somehow? ShakespeareFan00 (talk) 18:48, 13 December 2019 (UTC)Reply

Database updates with edits … mw:help:CirrusSearch billinghurst sDrewth 11:16, 15 December 2019 (UTC)
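For readers trying to decode the long search link in the question, the sketch below rebuilds an equivalent query from its parts: a regular-expression `insource:` search for "clearfix" across the ticked namespaces, newest pages first. The parameter names and values are taken from that link; the helper `range` and the variable names are illustrative, and the empty `advancedSearch-current` parameter is left out.

```javascript
// Illustrative helper: inclusive integer range.
function range(from, to) {
	const out = [];
	for (let n = from; n <= to; n++) out.push(n);
	return out;
}

// Namespaces ticked in the link from the question.
const namespaces = [
	...range(0, 15), ...range(100, 107), 114, 115, 828, 829, ...range(2300, 2303)
];

const params = new URLSearchParams({
	title: "Special:Search",
	limit: "500",
	offset: "0",
	sort: "create_timestamp_desc",
	search: "insource:/clearfix/"  // regex search of page source
});
for (const ns of namespaces) {
	params.set(`ns${ns}`, "1");
}

const searchUrl = `https://en.wikisource.org/w/index.php?${params.toString()}`;
console.log(searchUrl);
```

The order of the query parameters differs from the original link, which should not matter for how the request is interpreted.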

Stalled files in Category:Index - File to check

The files in this category have not yet been pagelisted, and in some instances also lack the authorship information needed to determine their status. It would be appreciated if other contributors in a position to provide additional information could do so. ShakespeareFan00 (talk) 10:05, 14 December 2019 (UTC)

Talk to the uploaders, or work with them to get it done. Not certain that we should be imposing this upon users here. We all know the place exists, and there are thousands of works that need more work than these. All maintenance is important, and we all impose it upon others, and choose to do what we wish to when we can. — billinghurst sDrewth 11:14, 15 December 2019 (UTC)Reply
unclear to me what you want to "check" with Index:KJV 1778 Oxford Edition.pdf ? do you want better metadata? or a completed index pagination? a quality improvement team to triage and improve indexes would be a better approach, than an ad hoc "look at this" approach. Slowking4Rama's revenge 14:57, 16 December 2019 (UTC)
Ideally both improved meta-data and fully completed page-lists would be desirable. There are only 5 remaining items in this category. :) ShakespeareFan00 (talk) 17:27, 16 December 2019 (UTC)

FYI: Moved author pages not updating at Wikidata

On moving author pages in the past few days, the links are not being updated at Wikidata. I have flagged the issue at Wikidata chat, and am seeking their guidance on how to progress the matter, especially not knowing which part of the system is at fault.

I have no idea whether it is more than author pages, or not. I don't see that it is main namespace pages, though I haven't dug through sufficiently to check, though nothing evident in the tracking category. — billinghurst sDrewth 10:25, 15 December 2019 (UTC)Reply

Problematic redirects (versions in a page with other works) and WD

The following set of triplet links are local redirect · wikidata link for redirect · redirect target

The issue is that redirects should not have wikidata items, especially when they become a part of a page that could have its own item.

Paths to resolution

  1. do nothing locally; at WD delete the interwiki (redirect) links, request WD item deletion; or
  2. locally convert to a versions page as we know that these are all reproductions, not originals; at WD make the pages work pages (merge items if work page already exists); or
  3. make a specific edition page (new concept) with the single link to the edition page; at WD we make the link an edition page

These items, themselves, are empty of description at WD bar the wikilink.

Of these, I prefer option 2. It gives the best linkage and visibility, and sets it within any broader context. — billinghurst sDrewth 11:10, 15 December 2019 (UTC)Reply

Wikidata does allow items to link to redirect, as far as I know, but I do think that it is preferable to have it link to a versions page with only one single version (option #2) than any of the other suggested alternatives. —Beleg Tâl (talk) 15:28, 17 December 2019 (UTC)Reply
As far as I can tell, this happened because we had items at those locations, then Wikidata items were created for them en masse using a bot, then someone here cleaned up the local item by redirecting the page to a scan-backed copy. The result is that WD now has a data item for the redirect. In the case of at least the first two items, we would want a WD item eventually, since both are self-contained compositions that might wish to be referenced. One option is to create a version page. Another is to back-create an edition using section editing. That is, insert sections into the main work so that the individual portion can be transcluded in isolation, with bibliographic references. --EncycloPetey (talk) 17:40, 17 December 2019 (UTC)Reply

Match and Split (Phebot) not working

Match and split is a really useful tool, especially when used in conjunction with OCR text which has already been somewhat cleaned up. But the bot that drives it has been offline for a couple days. I see that User:Phe has not been active here on English Wikisource for many months. Does anybody know a good way to get this bot (or another process that performs the same functions) back online? -Pete (talk) 23:57, 16 December 2019 (UTC)Reply

You can try Phabricator, but having the experience of fruitless begging for fixing the Phe’s OCR tool there for many months, I do not see your chances very promising. --Jan Kameníček (talk) 00:25, 17 December 2019 (UTC)Reply
Thank you Jan, I'm happy to create a phab ticket, but I'd like clarity on a couple points first, if you or anybody is able to provide it. (1) Does anybody know whether Phe's code is available anywhere public? (2) Does anybody know whether the specifications for that code were articulated or discussed prior to its being created? (3) Is this the kind of code that can, or should, be run on the wmflabs site? Apart from getting the code itself, what obstacles to that might exist? -Pete (talk) 18:39, 17 December 2019 (UTC)
@Peteforsyth: The code for all Phe's tools is available at https://github.com/phil-el/phetools, and all the tools run at the Toolserver here: https://tools.wmflabs.org/phetools/.
The problem here is that they are maintained only by Phe (who is a volunteer and cannot be expected to be at our disposal in any given timeframe, or at all for that matter), so nobody else can do much about problems with them, and Phe has been unavailable since this summer. If anybody wants to fork the code and set up an alternate tool then that is absolutely possible, but these are not particularly simple tools so it will take a commensurate amount of skill; a not insignificant investment of time to understand the code and get the alternate tools running; and a ditto time commitment to maintaining them over the long haul (otherwise we'll just be back here in three months).
I'd be happy to help however I can if anyone wants to make the attempt, but I'm allergic to PHP and Python (which are the main languages used in phe-tools) and cannot commit to any predictable amount of time. --Xover (talk) 19:06, 17 December 2019 (UTC)Reply
Thank you Xover, very helpful. I am of course very cognizant that nobody is obligated to do anything here -- at this point I'm just wanting to help document what would be desirable to get done. This helps a great deal, I will open a phabricator ticket. -Pete (talk) 19:10, 17 December 2019 (UTC)Reply
One further question, then -- I notice that the robot is now running (yay!) Is it because it lives on the toolserver, that it's possible for it to be started without Phe's intervention? (I had previously thought this was a bot that ran on Phe's own computer.) When the robot goes down like it did a few days ago, what is it that needs to happen (in its current state) for it to be restarted -- and what is the correct process for requesting that thing to happen? -Pete (talk) 19:15, 17 December 2019 (UTC)Reply
@Peteforsyth: The Toolserver is a relatively complex beast, so reducing it to simple statements is going to be misleading. Nobody but the maintainer can normally restart a tool, unless there is a security or infrastructure stability issue that requires intervention by "sysadmin"-type people. However, the various actual servers that make up the service we call "Toolserver" are sometimes rebooted for other reasons (which has the side-effect of restarting the tools running there), or one of them crashes (which is fixed by rebooting it, which has as a side-effect… etc.), or …
In this particular case (Match & Split down) it seems likely that the issue was a transient one with some component in the Toolserver service, and that the tool was not restarted as such: whatever happened in the infrastructure had as a side-effect that Match & Split started working again.
Which, by the way, is bad news for the OCR problem: if it was something a restart would fix, it is likely it would have resolved itself by now because the hosts get rebooted periodically for other reasons. That it still fails suggests that actual changes to the tool are needed, and that's something that requires an active maintainer with available cycles.
For Toolserver tools in general, the way to get the tool restarted is to contact the maintainer. They're the only ones with the access rights to do so. --Xover (talk) 07:29, 18 December 2019 (UTC)Reply
For the record, it looks like Tpt is also a maintainer of Phetools, which is good news since Tpt is still quite active on Wikisource. If either Phe or Tpt want to add me as a maintainer as well, I would be happy to help keep an eye on it. I'm a secondary maintainer on lots of random Tool Forge tools. Kaldari (talk) 22:36, 18 December 2019 (UTC)Reply
@Kaldari: That'd be great! But note that 1) I suspect tpt may have pings disabled, and 2) they may have been added as a maintainer on the same terms as you're volunteering for. Phe has not edited any Wikimedia project or had any activity on Github for 6+ months now (frankly I'm a little worried about them), and without their explicit consent tpt may not be comfortable adding you on their own cognisance. --Xover (talk) 07:22, 19 December 2019 (UTC)Reply

Page numbers not displayed

The following discussion is closed and will soon be archived:

Resolved.

Does anybody know why the page numbers of "Russian Government Links To And Contacts With The Trump Campaign" and of other subpages of the work are not displayed? --Jan Kameníček (talk) 23:26, 17 December 2019 (UTC)Reply

It could be because some of the "page numbers" are 15 characters long or longer. Page numbers typically should be 4 or 5 characters max. --EncycloPetey (talk) 23:29, 17 December 2019 (UTC)Reply
That was it. I reworked the pagelist and it helped. Thanks. --Jan Kameníček (talk) 00:23, 18 December 2019 (UTC)Reply
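For anyone hitting the same symptom, over-long labels of the kind described above can be spotted by scanning the pagelist labels for length. This is a minimal sketch only: the `labels` mapping, the helper name `overlong_labels`, and the 5-character limit (taken from the rule of thumb above) are assumptions for illustration, not anything the ProofreadPage extension itself provides.

```python
MAX_LABEL_LEN = 5  # rule of thumb from the discussion, not a documented limit


def overlong_labels(labels, max_len=MAX_LABEL_LEN):
    """Return {position: label} for every label longer than max_len characters."""
    return {pos: label for pos, label in labels.items() if len(label) > max_len}


# Hypothetical pagelist: scan position -> displayed page label
labels = {1: "Cover", 2: "i", 3: "Appendix A, page 1", 4: "12"}
print(overlong_labels(labels))
```

Any label the check reports is a candidate for shortening in the pagelist.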
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 11:00, 29 December 2019 (UTC)

Index:Old-folks.jpg

The following discussion is closed and will soon be archived:

Resolved.

Just doing a random page validation and I find that I am unable to validate the Index:Old-folks.jpg page. I don't receive the validated button when attempting to save it. Other pages on other works do show the validated button. Is there a problem with this particular page? Sp1nd01 (talk) 10:20, 18 December 2019 (UTC)Reply

@Sp1nd01: For some reason, a previous edit had managed to remove the username from the (invisible) noinclude section where the page status is stored. I've edited the page (set it to problematic and then back to proofread) so that my username got inserted there, so now you should be able to set it to validated. --Xover (talk) 11:04, 18 December 2019 (UTC)Reply
@Xover: Thank you, that has now worked as expected. Sp1nd01 (talk) 13:31, 18 December 2019 (UTC)Reply
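The situation described above (a page status whose username has gone missing) can be checked mechanically. The sketch below assumes the status is stored in the page header as a tag of the form `<pagequality level="3" user="Name" />` inside the noinclude section; that format, the regular expression, and the helper names are assumptions for illustration, so verify them against the raw wikitext of a real page before relying on this.

```python
import re

# Assumed header format: <pagequality level="3" user="SomeUser" />
PAGEQUALITY_RE = re.compile(r'<pagequality\s+level="(\d)"\s+user="([^"]*)"\s*/>')


def page_status(wikitext):
    """Return (level, user) from the pagequality tag, or None if no tag is found."""
    match = PAGEQUALITY_RE.search(wikitext)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def missing_proofreader(wikitext):
    """True when a status tag exists but its user attribute is empty."""
    status = page_status(wikitext)
    return status is not None and status[1].strip() == ""


broken = '<noinclude><pagequality level="3" user="" /></noinclude>Some text'
fixed = '<noinclude><pagequality level="3" user="Xover" /></noinclude>Some text'
print(missing_proofreader(broken), missing_proofreader(fixed))
```

A page flagged this way would need the fix described above (re-saving with a status change) before another user can validate it.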
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 11:01, 29 December 2019 (UTC)

Weird Tales vol. no. 1 scan

The following discussion is closed and will soon be archived:

Resolved.

There are two scans, a .pdf and a .djvu, with the same naming scheme. Practice (for other scans) has been to use .djvu files, but there are generally more .pdf (non-.djvu) files available (if I am not mistaken). Could the proper scan be determined, and the data from the improper scan be transferred? TE(æ)A,ea. (talk) 19:48, 18 December 2019 (UTC).Reply

DjVu scans are all-around easier to use on Wikisource. PDFs are easier to make, but have more technical problems. --EncycloPetey (talk) 19:50, 18 December 2019 (UTC)Reply
The DjVu was a replacement for the PDF, and the pages have been moved to it.--Prosfilaes (talk) 06:25, 19 December 2019 (UTC)Reply
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 11:02, 29 December 2019 (UTC)

I unsuccessfully tried to get some input on this question in the KaldariBot discussion above, but didn't get any strong opinions. My question is:
Should "featured" text status replace the "validated" status or exist along-side it? In other words, should a featured text be marked as both "featured" and "validated", or is it just "featured" (which implies that it is also validated, proofread, etc.)?
Any feedback on this question would be appreciated. Kaldari (talk) 22:25, 18 December 2019 (UTC)

A text cannot be featured unless it is fully validated. However, "featured" is not a proofreading status step for an Index, so it exists alongside. Beeswaxcandle (talk) 06:03, 20 December 2019 (UTC)Reply
I agree with Beeswaxcandle. The text progress statuses and the featured status are orthogonal: while all featured text are by definition validated, semantically the two mean different things. I'll throw in the caveat that I don't comprehend Wikidata sufficiently to judge whether it might make sense to overload a single property rather than having separate properties; but in all other contexts (logical information model on down to database schema) I would have kept them separate. --Xover (talk) 06:49, 20 December 2019 (UTC)Reply
The closest analogy to this is WP's "Good" and "Featured" status. Each of those is determined independently, however. An article can attain Featured status without attaining "Good" status; two separate procedures and evaluations are made to determine each status. So, if we decide to have both badges for Featured articles, this wouldn't look any different to what WP does. I say this only to note that no WP editors would think it odd that both badges existed on the same item, so we're not likely to confuse anyone if we double-badge our Featured articles. --EncycloPetey (talk) 18:49, 20 December 2019 (UTC)
Here are some examples of enWS pages which are featured but not validated:
So we cannot take it as given that all featured pages are validated. —Beleg Tâl (talk) 19:20, 20 December 2019 (UTC)Reply
Thanks for the feedback, all! Sounds like the consensus is that "featured" and "validated" badges should exist independently of each other. Kaldari (talk) 20:07, 20 December 2019 (UTC)Reply

OCR: Enable the Google-based version, until Phe's Tesseract version is operational?

I have recently read the discussion about broken OCR in some detail. The most recent comments point out that it would be up to English Wikisource to enable a (temporary) replacement until the traditional OCR tool is (hopefully) back in working order, or replaced with a better version.

Since we have a reasonably functional option based on Google's OCR, is there any good reason not to enable that by default, pending a more ideal outcome? Pinging some users involved in the discussion: @Xover, @Koavf, @Tpt, @Ineuw, @Jdforrester (WMF), @AKlapper (WMF), @Jan.Kamenicek: -Pete (talk) 21:19, 19 December 2019 (UTC)Reply

I have the Google OCR button in my gadgets, but my experience is that its output is so bad that I do not use it, so enabling it by default does not solve anything for me. However, I understand that for some people (especially the new ones) it may be better than nothing. The main thing I am afraid of is that once we get a "reasonably functional" tool, we will never get a well functional one.

--Jan Kameníček (talk) 23:40, 19 December 2019 (UTC)Reply

Yes, my experience is similar. But it depends on the text -- for some texts, it does a pretty nice job. IMO it's more important to have something for new users than nothing, when it comes to OCR. For the reasons you describe, I'm sure that Wikisource users would continue to advocate for something more functional regardless of whether or not the Google one is enabled, so I do not share your concern in this instance. -Pete (talk) 23:52, 19 December 2019 (UTC)Reply
Like Jan, I have had little success with Google's OCR tool. I usually find that it's easier to type it by hand when the OCR tool isn't working. But that is in part because I work heavily with: (a) Plays or poetry, where the formatting, capitalization, and punctuation do not follow standard sentence patterns. (b) Works with footnotes, which are in a different size and format, and therefore cause the OCR to bork. (c) Works that contain bits of text in other languages, which never come out right. (d) Works that contain special diacritical marks. (e) Works that contain unusual archaic typography, such as special characters for "ct", or italicized script that the OCR can't handle. If you're working on a text that consists primarily of standard sentences and paragraphs, without italics or any special characters, and without archaic spellings or archaic typography, that is entirely in English, then Google's OCR might be useful. But for me, it isn't. --EncycloPetey (talk) 00:47, 20 December 2019 (UTC)Reply
Something is better than nothing and there is no traction at Phab. I think the OCR tool that I am using now works fine. —Justin (koavf)TCM 00:50, 20 December 2019 (UTC)Reply
 Support If this is a proposal, make Google OCR the default since that is the only working OCR. All this means that it will be listed on the Gadgets page under Editing tools for Page: namespace instead of Development. — Ineuw (talk) 05:39, 20 December 2019 (UTC)Reply
 Support Why not? --Xover (talk) 06:26, 20 December 2019 (UTC)Reply
PS. Aklapper and Jdforrester are just processing and trying to manage all Phabricator tasks (there're a couple of thousand open tasks, iirc, all told). Neither one of them will have any particular opinion on this issue, or the specific Phab regarding Phe's OCR, so there's no need to ping them here. --Xover (talk) 06:43, 20 December 2019 (UTC)Reply
Note that the privacy policy requires informed consent for users before sending their data to non-Wikimedia services, which includes Cloud Services (like the proxy for this tool). The gadget as-is is in violation of the privacy policy and should be fixed to add a modal consent form (immediately, and definitely before this is enabled for users by default). Jdforrester (WMF) (talk) 08:28, 20 December 2019 (UTC)Reply
@Samwilson: ^^^ FYI. I'm trying to read up / do some digging on this to try to figure out what wriggle room there is and / or the broader impact on other gadgets. --Xover (talk) 16:58, 20 December 2019 (UTC)Reply
@Jdforrester (WMF): Why is a proxy hosted and run by the WMF considered a "non-Wikimedia service"? Which part of the privacy policy deals with this? Kaldari (talk) 19:43, 20 December 2019 (UTC)Reply
I guess you could argue that the API is non-Wikimedia. I'd still like to know what the actual wording of the policy is that relates to this, though. Kaldari (talk) 19:55, 20 December 2019 (UTC)Reply
i’ll believe in the "privacy policy" scruples, when i see them implemented in the m:IP Editing: Privacy Enhancement and Abuse Mitigation. until then, editors should expect to be constantly surveilled across all projects. Slowking4Rama's revenge 16:55, 21 December 2019 (UTC)Reply
 Support provided that we first implement the privacy form mentioned by Jdforrester above. -Pete (talk) 17:50, 20 December 2019 (UTC)Reply
@Peteforsyth: Note that that privacy policy issue is purely a formal requirement thing in this particular instance. There is no information that would normally be considered privacy sensitive being transmitted anywhere for this case.
When you hit the OCR button (and only when you actively press the button), the gadget sends the language code of the project (i.e. "en" here on enWS) and the URL of the scanned page image to the Toolserver. The Toolserver doesn't see your IP address because the request passes through a proxy server (managed by the WMF like the wikis). The OCR tool on Toolserver fetches the scanned page image and passes it and the language code to Google's Vision API (all Google sees is the scan image, the language code, and the IP of the Toolserver; your browser never communicates with Google directly). The Google API then returns the extracted text, which the tool on Toolserver returns to your web browser, and which the gadget code then inserts into the text field for editing.
And just to rub salt in the wound, the Google OCR tool/gadget was, AIUI, developed by the WMF Community Tech team; meaning that not only is no actually sensitive data being transmitted, but every component involved that might conceivably be an attack vector is actually under the WMF's direct control.
That said, the privacy policy is not optional and not subject to per-project policies, so we'll have to figure out some way to make this work within those requirements. I'm just not sure how the heck to do that just yet (there is no standard facility for displaying such a prompt, and ditto for saving that choice for next time; showing a confirmation dialog for every single page is… not even an option). --Xover (talk) 18:42, 20 December 2019 (UTC)Reply
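To make the data flow above concrete, and to show one way a remembered consent choice could sit in front of it, here is a minimal sketch. It is not the gadget's or the tool's actual code: `ConsentStore`, `tool_ocr`, `gadget_ocr`, and the stub functions are all hypothetical names, the OCR backend is a stand-in rather than a real Google Vision call, and the consent store is an in-memory dict rather than any real preference facility.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentStore:
    """Hypothetical store: remembers each user's answer so the prompt shows once."""
    answers: dict = field(default_factory=dict)

    def allowed(self, user, ask):
        if user not in self.answers:
            self.answers[user] = bool(ask())  # prompt only on first use
        return self.answers[user]


def tool_ocr(lang, image_url, fetch_image, vision_ocr):
    """Tool side: fetch the scan, then pass only image + language code onward."""
    image = fetch_image(image_url)
    return vision_ocr(image, lang)


def gadget_ocr(user, lang, image_url, consent, ask, tool):
    """Gadget side: sends only the language code and image URL, after consent."""
    if not consent.allowed(user, ask):
        return None
    return tool(lang, image_url)


# Stubs standing in for the network fetch and the external OCR backend.
seen_by_backend = []
prompts = []


def fake_fetch(url):
    return b"scan-bytes"


def fake_vision(image, lang):
    seen_by_backend.append((image, lang))  # everything the backend receives
    return "extracted text"


def ask():
    prompts.append(1)
    return True


consent = ConsentStore()
tool = lambda lang, url: tool_ocr(lang, url, fake_fetch, fake_vision)
first = gadget_ocr("Example", "en", "https://example.org/page1.jpg", consent, ask, tool)
second = gadget_ocr("Example", "en", "https://example.org/page2.jpg", consent, ask, tool)
print(first, len(prompts), seen_by_backend)
```

The point of the sketch is that the backend only ever receives the image and the language code, and that the user is asked once rather than on every page; where the answer would actually be persisted is the open question raised above.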
Makes sense, and thanks for the explanation. My "condition" should not be interpreted too strictly; I of course defer to those more knowledgeable than myself about the proper way to handle this. -Pete (talk) 20:14, 20 December 2019 (UTC)Reply
Google OCR is excellent at reproducing accented Latin characters for my projects about Mexico. I also used the OCR on French Wikisource and it also works very well. It seemed to me that it is also Phe's OCR tool. I was hoping to figure out how I can link to it in my vector.js. I asked this in the French Scriptorium but received no reply. Perhaps someone here can figure it out and let us know? — Ineuw (talk) 10:19, 12 January 2020 (UTC)

New Wikisource users

Hello, all,

I organized a little "Wikisource party" earlier today for some WMF staff who were interested in how things work here. We had lots of questions, and you got several pages proofread as a result.

I also wanted to say that one of them had decided to try it out in advance, and he felt encouraged and reassured when someone thanked him for his first attempt. He's since proofread about another 40 or 50 pages, so it's working. ;-) Thanks for being such a friendly community. Whatamidoing (WMF) (talk) 23:47, 19 December 2019 (UTC)Reply

Thanks for doing that, and for letting us know! You may already know this, but please let people know that WS:S/H is a great resource for asking questions. -Pete (talk) 23:49, 19 December 2019 (UTC)Reply
Thanks so much for taking the initiative to do that. Very much appreciated! And please do let us know if we can assist in any way. --Xover (talk) 06:27, 20 December 2019 (UTC)Reply

IA uploader cannot find an existing archive.org file

I would like to upload volume 31 of the National Geographic Magazine, see [5]. Originally, I downloaded the PDF file from HathiTrust, but my attempts to convert it into DjVu using some online converters failed, so I uploaded it to the Internet Archive and tried to upload it to Commons using IA uploader. However, the upload has already been running for many hours, and when I looked at the log, it said: "invalid ia identifier, I can't locate needed files", which is strange.

May I ask for help with converting and uploading the file? --Jan Kameníček (talk) 13:02, 21 December 2019 (UTC)Reply

sometimes there is a lag as IA does its internal conversions. and then IA uploader goes slow as it converts to djvu. (and commons will not like PDM) i would retry, since this process is an open ticket. Slowking4Rama's revenge 16:48, 21 December 2019 (UTC)Reply
File:The National Geographic Magazine Vol 31 1917.djvu. Mpaa (talk) 17:25, 21 December 2019 (UTC)Reply
Perfect, thanks very much! --Jan Kameníček (talk) 21:20, 21 December 2019 (UTC)Reply
Having looked at the uploaded file in detail, I can see that the converting process has diminished the quality considerably and the OCR layer was destroyed too, see for example [6] --Jan Kameníček (talk) 21:39, 21 December 2019 (UTC)Reply
@Jan.Kamenicek: I think the reason is that IA derived low-quality jp2 images from the PDF file and ran the process based on them. The quality of the IA pages displayed in their reader is also poor. Mpaa (talk) 18:06, 22 December 2019 (UTC)
Sorry if this is a stupid question, but is there a reason you can't just use the PDF? BethNaught (talk) 21:58, 21 December 2019 (UTC)Reply
@BethNaught: The most important reason is that MediaWiki has various problems with PDFs, the biggest of them being that it does not extract the original text layer of PDFs well (for a detailed description of the problem see e.g. here). Conversion into DjVu usually improves it. Besides that, the PDF file is over 100 MB, so it needs to be downsized in some way as Commons does not accept such large files. Conversion into DjVu results in a smaller size; another way would be keeping the PDF but lowering its quality. Conversion into DjVu usually solves most problems, but this time it seems to have made them worse :-( --Jan Kameníček (talk) 22:08, 21 December 2019 (UTC)
I had no problem uploading File:The National Geographic Magazine Vol 31 1917.pdf using the UploadWizard or ChunkedUpload (finishing as I type this). —Justin (koavf)TCM 22:29, 21 December 2019 (UTC)Reply
@Koavf: Oh, that is wonderful, thanks very much! I did not even try it as Commons had always refused me when uploading over 100MB files, so something must have changed there. There is still the problem of text extraction from PDFs, but this time I will go with PDF, as it is (strangely) much better than DJVU, comparing e.g. [7] and [8]. --Jan Kameníček (talk) 23:00, 21 December 2019 (UTC)Reply
No problem. That DJVU is pretty garbage. —Justin (koavf)TCM 23:40, 21 December 2019 (UTC)Reply

Proposing move of pages The Works of Charles Dickens/Volume 1

Cakebot1 has been working on Index:Works of Charles Dickens, ed. Lang - Volume 1.djvu and transcluding as subpages of the "The Works of ..." To me this is not a good place for these works to be, and we touched on this conversation above at #Main namespace works; portal works and tendency to encyclopaedic components or listings. The titles that we use should be representing the works. As we already have The Pickwick Papers we need to determine a number of things.

  1. Whether we prefer to have 32 volumes of works reproduced as subpages (with all the inherent relative link resolutions), or a work to be a rootpage
  2. If the above determines that it is a rootpage, then how we would name works from volumes, when they will also have been previously published
  3. Moving the existing work, as it becomes the disambiguation.

My preference is to move and present the work as individual works, and to create a portal page to support the series as re-published. I would like to get this addressed early, prior to the contributor getting well into the work, and before we get further volumes popping up. [Also noting that we will need to fix the chapter numbering to our style]. — billinghurst sDrewth 00:23, 22 December 2019 (UTC)Reply

This was published as a single collection under a uniform title. Moving these to the main namespace as separate items would break the implicit connection of the series. We'd have to disambiguate these copies from other copies if they move to the main namespace, which would create even more work. We already host "works of..." sets for several other authors this way, so I see no problem with hosting it as is, under the common title with subpages. --EncycloPetey (talk) 00:37, 22 December 2019 (UTC)Reply
We have to disambiguate either way. Please tell me how "Works of ..." is useful, and how it represents the work of the author? It seems more to reflect the later work of a publisher, and just a series build, nothing more. That we made that choice previously could just be an indication of a poor choice, not evidence of how we should do things. Even if we are to keep it under "Works of", "Volume 1" is not a helpful name. — billinghurst sDrewth 03:10, 22 December 2019 (UTC)
Special:PrefixIndex/Works of displays our "Works of", about 50 pages, 25 seem to be redirects. — billinghurst sDrewth 03:14, 22 December 2019 (UTC)Reply
I think you're confused here. This is not a work by Dickens; it is a work by Andrew Lang that includes works by Dickens. Lang has provided selection, ordering, editing, formatting and layout, introductions and prefatory matter, and indices. Extracting the parts of this work that are derived from Dickens and placing them out of the context in which they were published would be misleading, and would be a disservice to, e.g., those who wish to compare Lang's selection or editing with that of earlier or later editions. The included Dickens works are obviously also editions of independent works, and so should appear on versions pages for those works, but that in no way affects the status of Lang's work. --Xover (talk) 08:55, 22 December 2019 (UTC)
I agree with Xover here. Things should generally be organized as they were published, and for most authors with Works collections, there are different variations to worry about, making it more important.--Prosfilaes (talk) 15:43, 22 December 2019 (UTC)Reply
billinghurst, not all of our "works of" publication sets begin with those exact words, or even have the word "works" in their title, so your count heavily underestimates the number of such items we currently have. We have titles like The complete poetical works and letters of John Keats; Poetical Works of John Oldham; Victor Hugo's Works (Guernsey Edition); The Writings of Henry David Thoreau; The Plays of Euripides; or Masterpieces of Greek Literature. Not to mention the many magazines, newspapers, journals, and periodicals that are set up just like this. --EncycloPetey (talk) 21:13, 22 December 2019 (UTC)
Strongly agree with Xover and the others: compilation and anthology works are still works per se and should be preserved as such. Versions pages and redirects are the correct way to connect top level mainspace titles with editions that appear in such collections. Also, I've spent a lot of time consolidating loose works into their appropriate collections, so I have a personal vested interest here also. —Beleg Tâl (talk) 15:42, 23 December 2019 (UTC)Reply

This seems to me a case similar to The Novels and Stories of Henry James: many works published separately over a period of time, but also claimed to be volumes of a set. I did The Novels and Tales of Henry James (n.b. Tales not Stories) in the form The Novels and Tales of Henry James/Volume 2/The American/Chapter 1, and it is still that way now, but I have come to the view that it isn't right and doesn't work. For The Novels and Stories of Henry James, I named each work as a distinct publication (which it is), e.g. Confidence (London: Macmillan & Co., 1921), and simply link them from the works page The Novels and Stories of Henry James. I commend this method to you as the cleaner, and more in accordance with "things should generally be organized as they were published". Hesperian 01:35, 23 December 2019 (UTC)Reply

It's easy to do that for novel-sized works. What about New Hampshire (Frost)? Should each and every poem be in main space? Does that mean that if we do the Atlantic Monthly, a periodical where at least one of those poems were first published, that everything in it should be in main space? I can see the argument that a series of novels should be broken out by title, but I think it gets real messy in the case of anthologies and periodicals, especially when we're talking about excerpts of longer works and intros that cover multiple works.--Prosfilaes (talk) 11:36, 23 December 2019 (UTC)Reply
No, of course not. New Hampshire is a single published work. These sets are not single published works. They are multiple individually published works that the publisher has declared to be volumes in a set. That declaration doesn't change the fact of them having been published individually. If each poem in New Hampshire had been published and printed and bound and sold separately, and subsequently declared by the publisher as aggregating into a single work, then, and only then, would I say that each and every poem should be in mainspace. Hesperian 06:10, 3 January 2020 (UTC)
I don't know that these sets are not single published works; I know that they're separately bound works. Not all the works in The Novels and Tales of Henry James were published separately; presumably many of the short stories were never so published, and the fact that the various collections of stories don't overlap means that they are functionally volumes in a set. Looking at volumes 10-18 of The Novels and Tales, we're going to want those volumes as separate volumes. In practice, I don't think your proposal is much different from saying novel-sized works, besides making an arbitrary historical distinction based on which works happened to have standalone publication and which were originally published in periodicals.--Prosfilaes (talk) 07:50, 3 January 2020 (UTC)
I am not saying that individual works, e.g. single poems, should be presented here separately: no way. I am saying that the books in this set were separately prepared, printed, bound, issued on distinct dates and made available for sale individually (and ultimately will fall into the public domain separately over several years). They are distinct publications, in the literal sense of 'publication': made available to the public via the shelves of the local bookshop. Such distinct publications should be presented here as individual texts, not as subpages of a larger set, where that set is not a publication in that literal sense. Hesperian 23:44, 5 January 2020 (UTC)Reply
I guess I'm saying that The Novels and Tales of Henry James exists only as a concept created by the publisher, not as a literal publication. In which case it is also relevant to note that the publisher Bernhardt Tauchnitz had a 'set' entitled Collections of British Authors that was added to for over 100 years, and ended up with 5370 volumes. Hesperian 23:56, 5 January 2020 (UTC)Reply
But the only name for volume 16 of The Novels and Tales of Henry James is The Novels and Tales of Henry James, volume 16. You can say the same thing about periodicals, which are much more likely to end up with 5,370 volumes.--Prosfilaes (talk) 05:22, 6 January 2020 (UTC)Reply
No indeed. The half-title page reads "The Novels and Tales of Henry James / Volume 16", but the full title page reads "The Author of Beltraffio, The Middle Years, Greville Fane, and Other Tales", and it is commonly indexed under that title.[9] I can't find a good scan of Volume 16, but here's Volume 18. Hesperian 23:57, 6 January 2020 (UTC)
True; which may not be the case for other works. I will note that "Famous Story and other works" can drive bibliographers and collectors nuts, as it can frequently refer to many distinct collections by many publishers.--Prosfilaes (talk) 03:02, 7 January 2020 (UTC)Reply
But it's not so easy to break the novels out in this case, either. It would still be messy. Volume one of the Dickens collection is not the novel; it's volume one of the set as well as volume one of the novel, and the novel exists across more than one volume. --EncycloPetey (talk) 17:25, 23 December 2019 (UTC)Reply

Italian Wikisource

I was surprised to find an English-language text on Italian Wikisource: it:Scientia - Vol. VII/The origin and nature of comets; can it be imported here? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:41, 23 December 2019 (UTC)Reply

Yes, it can. In effect, we would duplicate the Index page here as well as the relevant English language pages. I'm not sure whether anyone has developed a tool that would assist with doing this, as it's an uncommon occurrence. It would be tedious to do by hand, particularly if we're keeping the edit history intact to credit the work done at it.WS, but it could certainly be done. --EncycloPetey (talk) 17:22, 23 December 2019 (UTC)Reply

Naming of governmental works and duplication resulting therefrom

I have just noticed that White House memorandum of a telephone conversation between U.S. President Trump and Ukraine President Zelensky, July 25, 2019, which is based on a validated scan, is a duplicate of the earlier Memorandum of Telephone Conversation with President Zelenskyy of Ukraine. In addition, the earlier page was created by the same user (IP address) that created Letter to Chairman Burr and Chairman Schiff, August 12, 2019, which is the whistle-blower report. Neither of these names, and especially not the second, is a proper indicator of what the text actually is. These could be more readily accessed if names more properly indicative of their contents were given, and if they were connected to the relevant Wikipedia article(s). TE(æ)A,ea. (talk) 22:11, 24 December 2019 (UTC).

we have a style guide Wikisource:Style guide, if you want to add a section on work titles, go for it. item find is pretty bad, but if you want to organize with a category or portal, or wikidata, go for it. -- Slowking4Rama's revenge 04:11, 25 December 2019 (UTC)
Doesn't seem to be a style issue.

Naming of pages can be difficult, especially where it is correspondence; see special:prefixindex/Letter from. Creating redirects, putting it into Wikidata, and linking from the articles at enWP are all part of the process. If you think that a work should have a different/better name, then propose it and we can move it. We are not averse to a diversity of opinion and a discussion. — billinghurst sDrewth 10:03, 25 December 2019 (UTC)

Empty categories

I would like to ask about some currently empty categories (see below). Do they have any use, or can they be deleted?

  • Category:Pages containing image
    There is no explanation of what the category is meant for. There are thousands of pages containing images in Wikisource, but none of them has been categorized here so far. If there is a reason for this category's existence, it should be easy to populate it by some bot. If there is not, I suggest deleting it.
  • Category:Pages containing errors‎
    The category description says: "These pages of non-fiction contain some error on them.", without specifying what sort of errors are meant (spelling+grammar, factual errors by the author, factual errors caused by incomplete human knowledge at the time the work was written…). It is also not clear how it should be populated (by SIC templates or manually?). I suggest deleting it.
  • Category:Texts without page numbers‎
    It is not clear which texts should come here: texts whose original publications are numbered but these numbers are not mirrored at Wikisource (which is true for most texts here which are not backed by scans) or texts whose original publications are not numbered? If there is a reason for existence of this category and after its aim is cleared, it can be at least partially populated by a bot. If there is not such a reason, I suggest to delete it too. --Jan Kameníček (talk) 22:34, 27 December 2019 (UTC)Reply
see also User:Hesperian "Decline. IMO, in a community this small, nothing created in good faith by a regular should be speedily deleted. This should be taken to WS:PD" and User:Billinghurst, User:John Vandenberg, User:Cygnis insignis. -- Slowking4Rama's revenge 00:23, 30 December 2019 (UTC)Reply
They are labelled maintenance/tracking categories so they will presumably have templates that will populate them when something is incorrect in their use. So suggest leave them as they are doing no harm, though I cannot remember what they do. Adding some commentary to them is probably of value. If you are seeing them where you should not be seeing them, then plug in {{maintenance category}}. (Hopefully we are better at labelling, and use of <includeonly> these days.) — billinghurst sDrewth 07:08, 30 December 2019 (UTC)
They might be supposed to be filled by some templates, but one of possible reasons why they are empty is that the templates do not exist anymore. If their purpose is worth to keep them there should be some way to find it out and add it into the categories’ talk pages or somewhere. --Jan Kameníček (talk) 19:19, 30 December 2019 (UTC)Reply
Fully agree that overt labelling and documentation is the way to go. Doesn't change my initial comment. As a community we misused <includeonly> simply for the sake of neatness. — billinghurst sDrewth 01:27, 31 December 2019 (UTC)Reply

Checking page style for court cases

I recently created a page for Valvoline Oil Co. v. Havoline Oil Co., using the existing page, Universal City Studios, Inc. v. Reimerdes, as my source for wiki formatting.

I would like to know if there is anything I should change in the style of any future works I may add; if there's any formatting I shouldn't have included or anything I left out.

Qwertygiy (talk) 19:27, 29 December 2019 (UTC)Reply

A few items: (1) I see no link to the original source of the text copy. A link to the source of the text copy should appear in the header, or on the item's Talk page. (2) You've overlinked. There is no reason to link to Wikipedia articles like "Magazine", "Advertising", or "New York". (3) The judge who authored the decision should be identified in the header or header notes.
Also, you can center an image without using the template; I've done this for the two images. --EncycloPetey (talk) 19:43, 29 December 2019 (UTC)Reply

 Comment @Qwertygiy: Agree with EncycloPetey's comments, see Wikisource:Wikilinks. Maximise internal links, minimal external links where adds true value and unambiguous. So we would do either author = or contributor = for whomever wrote a judgement or wrote an opinion. We would normally do local author links in the body of the work for the judges cited and create relevant author pages.

Some questions and comments

  1. Are the references yours, or were they in the original document? If yours, then they should be moved to the talk page, and use the edition parameter, and a note to point to them. We try to present clean documents, not annotations.
  2. We would normally put the case into WD, if there is an article for the case at enWP, they can share the same item for case law, and this would provide the interwiki links.
  3. At some point we would/should create Portal:United States District Court for the Southern District of New York—their creation is organic, how many other works—sometimes even consider an anchored redirect a subsection to a parent portal page to make it easy to break it out at a later stage.

billinghurst sDrewth 08:54, 30 December 2019 (UTC)

  1. In regards to the references, they were all included verbatim in the source text (in parentheses) or were referencing earlier such citations as supra. Any citations that were integrated into the text rather than thusly isolated, I left in place and merely added links. My reasoning was that such a citation serves the same purpose whether in parenthesis or in footnote reference; the former was easier to create on a 1910s typewriter while the latter is easier to read on a 2010s webpage.
  2. In regards to Wikidata, I'll take a look at the procedures for that. I'm not very familiar with it yet, most of my contributions being solely at enWP.
  3. In regards to the portal, creating one that is a redirect to the subsection of the US case law portal seems like the best idea at the moment, since the half-dozen works added thus far seems a little too small and specific to justify having its own portal, but there are many thousands more that exist and just aren't added as of yet.
  4. In regards to the link to source, I left it in the original commit message; I'll add it to the talk page.
Qwertygiy (talk) 21:33, 30 December 2019 (UTC)Reply

Versions and Wikidata problem

Version pages

In discussion with another editor, I've discovered that the information on Wikisource:Versions does not align with current practice.

Versions pages
Different versions of the same work are listed on "versions pages." Such pages are only for different versions of substantively the same work. Different works should not be listed together on the same versions page, even if they have the same title and/or author; they should be listed on disambiguation pages. This applies even to works that are reviews or analysis of the work. For example, Charles Lamb's prose retelling of Shakespeare's Romeo and Juliet (Shakespeare) is a version of that work, and belongs on a versions page with it. The entry entitled "Romeo and Juliet" in the The New Student's Reference Work is a work about, rather than a version of, Shakespeare's play, and therefore should not be included on a versions page. (Works that share the same title are listed on disambiguation pages; works that share the same subject are listed on portal pages.)

Wikisource:Versions#Versions pages


The key section is: "Charles Lamb's prose retelling of Shakespeare's Romeo and Juliet (Shakespeare) is a version of that work, and belongs on a versions page with it."

Does this mean that movie scripts, operas, retellings, children's adaptations, etc. belong intermixed on the same versions page? And who is considered the "Author" on such a versions page, when each item would actually have a different person who wrote it?

There is an additional problem now that we are connecting to Wikidata. Romeo and Juliet (Shakespeare) is linked to Wikidata item d:Q83186, which is specifically for the play written by William Shakespeare. The retelling by Charles and Mary Lamb has a separate data item, because the author and publication information are different. If we are to treat versions pages as currently described at Wikisource:Versions (quoted above), then we must remove the link from Wikidata, because our content on that page does not match the Wikidata item. We would need to create a new kind of page that lists only editions of the work itself, separate from the versions/retellings/adaptations.

The problem goes yet deeper. If you do not see the issue at play here, look at Macbeth (Shakespeare) and Macbeth. The page Macbeth (Shakespeare) currently lists only editions of the play itself, not the retelling by Charles and Mary Lamb, nor the opera adaptation by Verdi. The page is already crowded with editions, and there are many more besides that are not yet listed, because it is a Shakespeare play. The disambiguation page Macbeth lists the other items by other authors. And note that we currently have three editions of Charles and Mary Lamb's Tales from Shakspeare retellings, in various stages of transcription. Will all of these editions be listed on the same page as all the editions of the Shakespeare play? If so, all editions of Verdi's opera and of any other editions of any derivative works would also all appear mixed together on the same page. Is this desirable?

The current wording of Versions is no doubt the result of an earlier, simpler time when Wikisource did not have many editions of the same work, and did not have to concern itself with the possibility of multiple editions of the same work, nor multiple editions of derivative works. I propose we reverse the current wording so that Versions pages explicitly do not include adaptations or retellings by other authors. --EncycloPetey (talk) 18:43, 30 December 2019 (UTC)Reply

I was thinking about the same problem when I was dealing with some folk tales that were retold by various authors.
The problem might be solved if we had two kinds of version pages: versions of work and versions of story. --Jan Kameníček (talk) 19:07, 30 December 2019 (UTC)Reply
The Italian Wikisource has adopted a separate "Opera:" (Work:) namespace for items that are the same work, but different editions. We could do the same. Having two different kinds of Versions pages would get messy anyway, however we tried to do it. If we opened a new namespace for the items that are the same work/author, that would free up Versions pages to treat items that are the same general story, but with different authors/wording. --EncycloPetey (talk) 19:12, 30 December 2019 (UTC)Reply
What about translations? Does it mean that the new namespace would also host original works and the translation pages would become redundant? --Jan Kameníček (talk) 19:26, 30 December 2019 (UTC)Reply
The pages which list Translations could be rolled into the Work: namespace. They would merely need to accommodate the information about the original language title. But yes, if we decided to go that way, it would mean that we would not need a separate set of Translations pages. The only difference right now between a Versions page and a Translations page is whether or not the original language of the work was English. And we have some marginal cases already which sit astride the two, such as Beowulf, which was written in Old English, so its page lists the Old English editions as well as translations into Modern English. With a separate Work: namespace, we wouldn't have that problem. --EncycloPetey (talk) 19:31, 30 December 2019 (UTC)Reply
I don't really see the problem. Yes, pages like that need to have additional internal structure. But those pages are far and few between. Of course the novel version of "And Then There Were None" and the play version should be on the same version page.--Prosfilaes (talk) 19:46, 30 December 2019 (UTC)Reply
You haven't stated any reasoning, and no, they are not few and far between, and it is a growing problem. Why should works written by different authors appear as "versions" on the same Versions page, instead of on a disambiguation page? Why should a ten page prose summary of the story of Macbeth appear on the same page as a 150 page play with stage directions, when the two have different authors and completely different text? Why not group them instead on a disambiguation page? --EncycloPetey (talk) 20:11, 30 December 2019 (UTC)Reply
Shakespeare's w:Romeo and Juliet (c. 1590–1595) is based on Arthur Brooke's 1562 narrative poem "The Tragical History of Romeus and Juliet" and William Painter's 1567 collection of Italian tales which included a version in prose named "The goodly History of the true and constant love of Romeo and Juliett"; Brooke's version was a translation into English of Pierre Boaistuau's 1559 French version; which was in turn a translation of Matteo Bandello's c. 1531–1545 Giuletta e Romeo; Bandello based his version on Luigi da Porto's c. 1524 Giulietta e Romeo; da Porto based his version on the c. 1476 Mariotto and Gianozza by Masuccio Salernitano, who draws on Dante's Divina Commedia (in canto six of Purgatorio), the Ephesiaca of Xenophon (c. 3rd century), and Pyramus and Thisbe from Ovid.
In the other direction there were the 16th/17th-century quarto and folio editions that may be said to be roughly the author's original editions; followed by something on the order of 25–30 main distinct editions up through the 19th century (starting with Nicholas Rowe, Alexander Pope, and Lewis Theobald in the first half of the 18th century; through the great Tonson editions edited by Samuel Johnson, George Steevens, and Isaac Reed; the 1790 and 1821 Malone editions; John Boydell's copiously illustrated edition; and up to the famous Cambridge/Globe and Arden editions). All of them aim to get at the "true" Shakespeare and seek to substitute their judgement for that of previous editors, leading to wildly differing results (not to mention Way Too Much Drama™ for historiography). And then we have things like Charles and Mary Lamb, who retell the plays in prose at a level aimed at children, and Thomas Bowdler, who expurgated the plays to be fit "for 19th-century women and children". And in contemporary versions we have modern spelling editions, manga versions, etc.
And then you get into adaptations: w:List of films based on Romeo and Juliet list 150+ TV and movie adaptations alone. There are 8 ballets, 9 operas, 5 musicals, and 3 main compositions of classical music.
If we put everything on a versions page, Romeo and Juliet (Shakespeare) would have more than a thousand entries, more than half of them bearing little actual resemblance to the play that William Shakespeare wrote. We need to draw some lines somewhere: Shakespeare's works are extreme examples that help find those points in a way that Christie's paltry two versions do not. But the problem is general. --Xover (talk) 20:46, 30 December 2019 (UTC)Reply
And the vast majority of works that have a version page will have two versions on there. Most books were never reprinted. Few were ever made into plays or came out in significantly different editions. A page for one of Shakespeare's plays can afford to be the exception. Moreover, "would have" is begging trouble from the future. Why not worry about what we have, instead of what we might have?--Prosfilaes (talk) 00:57, 31 December 2019 (UTC)Reply
See Hymn and The Raven.--RaboKarbakian (talk) 20:35, 30 December 2019 (UTC)Reply
Those are Disambiguation pages, which are a separate concern. They are not relevant to the current discussion. --EncycloPetey (talk) 20:48, 30 December 2019 (UTC)Reply
There is a huge amount of variation in this kind of problem. In some cases it is pretty reasonable to consider the two works to be versions of the same work (e.g. the Hebrew and Greek versions of Esther). In some cases it's pretty reasonable to consider the two works to be completely different works sharing only a common underlying theme (e.g. the Routhier and Weir versions of O Canada, which should probably be converted to a disambig page). In the case of adaptations, for example La Fontaine's adaptation of The Tortoise and the Hare, or Seidenbusch's adaptation of Salve Regina, or the Lambs' prose adaptation of Macbeth, I would still consider them to be "versions" of the original work, and would list them on the original work's Versions page. However, I would note that they are adaptations rather than original versions (or direct translations), and generally would put them in a separate section of the Versions page. If the adaptation is significantly different from the original, I would also list it on a disambiguation page. —Beleg Tâl (talk) 20:51, 30 December 2019 (UTC)Reply
(I might, however, give the adaptations their own versions page, and just link to them from the main versions page) —Beleg Tâl (talk) 20:54, 30 December 2019 (UTC)Reply
Oh, here's a good example of what I mean: Alice's Adventures in Wonderland —Beleg Tâl (talk) 20:56, 30 December 2019 (UTC)
How would you feel about the proposal to create a new Work: namespace to solve the issue? --EncycloPetey (talk) 20:58, 30 December 2019 (UTC)Reply
I must admit that I don't really see how a Work: namespace would solve the issue. The same problem you see currently on Versions pages will persist on Work: pages. The idea of using Versions pages for versions-of-story already exists in Portal space (e.g. Portal:Cinderella). We would just be moving the problem around, not addressing it. —Beleg Tâl (talk) 21:06, 30 December 2019 (UTC)Reply
Furthermore: even if we tried to formalize a separate structure for version-of-work and version-of-story, this completely falls apart for most traditional folk stories and songs where there is no real difference between the two and every single edition is wildly different. How many version-of-work pages would you use for Tam Lin? How many version-of-work pages would you use for The Elfin Knight? —Beleg Tâl (talk) 21:11, 30 December 2019 (UTC)Reply
Not to mention that folk stories often have closely parallel versions in different languages, so then you could have some that are "original tellings" in English alongside some that are translated from another language. They'd end up on different pages if there are separate Version and Translation pages. I wish there were enough Wikisource editors to do a Folk Stories Project and sort this stuff out into Portal-style pages instead. Version pages could be reserved for works that are author-associated, a work of that author particularly. E.G. HC Andersen’s "Tinderbox" is a retelling of an old tale but it is universally known as Andersen's "Tinderbox," thus it deserves its own versions page (or rather translations page in English). Levana Taylor (talk) 01:45, 31 December 2019 (UTC)Reply
Right now, there is a broad scope in the problem, as you have noted, on the one hand between items that are the same work, and on the other items that clearly are not the same work. These disparate items may or may not appear on a Versions page at the whim of an editor. A Work: item would always be for a specific work, making a clear distinction between the two types of listings. The Work: namespace would also include translations. Right now, Esther (Bible) is a Translations page, only because the original was not written in English, even though they are simply different Versions of the same work. If we consider that the "original" Romeo and Juliet story was not in English, then "Romeo and Juliet" would need to become a Translations page, and this would be true of many of Shakespeare's plays, as his plays were not the first tellings of the stories. A Work: namespace would absorb all the Translations pages and lists of editions of the same work, and thus draw a clear divide between listings of the same work (Work:) and related works that are derived from each other, which would then be the focus of Versions pages. Right now, we draw a divide between works originally in English and works not originally in English. The proposal would shift that division to editions of the same work versus versions of similar works. The concern about wildly different editions is already a problem. There are two entirely different early editions of Shakespeare's King Lear, and some modern prints of Shakespeare's works include both for completeness. Despite the differences, they are clearly supposed to be the same work by the same author, whereas a retelling by Charles Lamb for Tales from Shakspeare is clearly not an edition of the same work, but is a related story. Works like The Elfin Knight are no different than copies of ancient works, where different manuscripts preserve or lose different passages. --EncycloPetey (talk) 21:21, 30 December 2019 (UTC)
Okay, now I'm just confused about what is being proposed. The suggestion that translations should be merged into Work space, even though translations are the ur-example of derived works by different authors, further reinforces to me that there is no such thing as a clear distinction between "the same work" and "not the same work". —Beleg Tâl (talk) 22:02, 30 December 2019 (UTC)Reply
Translations preserve the content, even though the language has changed. A translation can be placed side-by-side with the original, aligning the texts. We have a translation namespace that does just that for multiple texts, from books of the Bible to poetry by Catullus. In contrast, a retelling by Charles Lamb will bear little resemblance to the parent text, even though it might be written in the same language. In a library catalog, translations of a work are still considered to be copies of the same work; only the language has changed. Whether you're reading Dante's Inferno in medieval Italian, modern Italian, English, or Chinese, it's still Dante's Inferno, and would be catalogued with Dante as the author. Retellings and adaptations will be catalogued under different authors.
To give a modern example: If I found a German "translation" of Stephenie Meyer's Twilight, I would expect it to have the same story, same characters, and same plot as the original; the same number of chapters, the same everything except the language. The novel Fifty Shades of Grey is a retelling of Twilight (originally as fan fiction) by a different author, of the same basic story. In the retelling, however, the setting is different, the character names are different, and there are no supernatural elements. It is a complete retelling. So, the translation bears more in common with its source text than a retelling. Under our current Versions page structure, both novels would be listed on the same Versions page because they derive one from the other. But an English translation from a German copy would be placed on a separate page, simply because it is a translation.
Translations in library catalogs and on Wikidata are treated simply as editions of the source text, with only a data code indicating that the language has changed. Retellings and derivative works are treated completely separately. The proposal would thus align Wikisource practices with both library databases and Wikidata structure. --EncycloPetey (talk) 22:56, 30 December 2019 (UTC)Reply
Your view of a translation is idealistic. In reality, translations can (often silently) take huge liberties with their underlying work, frequently amounting to paraphrases or condensations.--Prosfilaes (talk) 01:05, 31 December 2019 (UTC)
I am aware of translational difficulties, but idealistic or not, it's the view taken at Wikidata and by library catalogs. The same disparity can be found in some editions of works, where spellings, vocabulary, punctuation, and more can be altered by editors. Compare the two "first editions" of Moby-Dick, which made different sets of corrections requested by the author; or the US and UK editions of A Clockwork Orange, for which the US publisher decided to omit the final chapter. Nevertheless, both the UK and US editions of Moby-Dick are considered to be the same work, as are the US and UK editions of A Clockwork Orange. But my point is that other works based on those novels, written by other authors, ought not to be considered the same work as the original. Currently, we make no such distinction. --EncycloPetey (talk) 01:23, 31 December 2019 (UTC)

 Comment Not certain that I wish to wade through that conversation. Keeping it simple. Fixing pages here is all that seems needed.

  • Our versions are for works by the author
    main namespace pages, can link to WD item for enWP article
    If we have versions pages that are out of scope, then fix them.
  • Disambiguation pages are for works of same/similar name by various authors.
    main namespace pages, that link to WD items for disambiguation
    Charles Lamb's retelling is not a version, it is a derivative work and to be disambiguated, and it can have notes that put it in context to the original work. Contributors who have morphed these pages should be pointed to this conversation. Disambiguation pages can be structured to capture some of the aspects of derivative works.
  • Translations versions are for works of the original author, where there are different translations/editions of translations
    main namespace pages, can link to WD item for enWP article
  • Portal: ns pages exist for curation where required.
    portal namespace, would link to WD item for portal (see topic's main Wikimedia portal (P1151) and Wikimedia portal's main topic (P1204))
    Not encouraging these for the work level, though if someone wishes to put the work into creating something that explains a subject matter, and a range of disambiguations, versions, translations, and retellings, then go for it. It would not replace these main namespace pages.

billinghurst sDrewth 01:10, 31 December 2019 (UTC)Reply

But your first item is part of the problem. We currently have advice on versions that differs from what you've described, so some kind of change is necessary. The question is what sort of change? --EncycloPetey (talk) 01:15, 31 December 2019 (UTC)Reply
Please be specific of which bit. Inexact commentary is less than helpful. — billinghurst sDrewth 01:24, 31 December 2019 (UTC)Reply
Read the opening three paragraphs of this discussion. Currently, our guideline advice would put Charles Lamb's retelling of Shakespeare's Romeo and Juliet on the same versions page with the play itself. Your comment says otherwise. Hence, we either need to align practice with the advice, or alter our advice to fit practice, or make some other change to resolve the discrepancy. --EncycloPetey (talk) 01:28, 31 December 2019 (UTC)Reply
Gotcha, if you are quoting a page, can I suggest {{cquote}} so that it is overt. I missed the cue, thought it was your comment. [Note to self don't try and write abusefilters, analyse abuse and try to have conversations. My apologies.] — billinghurst sDrewth 01:33, 31 December 2019 (UTC)


Example of Wikidata item

General comment about works at WD and enWP, the item Romeo and Juliet (Q83186) is for the conceptual work and all that follows, not solely the play. It has from what it was based itself, and derivative works. So I don't see an issue with how it is being linked. It is how we wish to look at the latitude of subsidiary links. — billinghurst sDrewth 01:41, 31 December 2019 (UTC)Reply

The WD item points to derivative works, but doesn't list them as editions of the work itself. The WD structure identifies editions of the work by pointing to editions pages for the editions, and to derivative works by pointing to a data item for the derivative work, which in turn points to editions of that derivative work. Each derivative work has its own separate WD data item, with lists of editions. Currently, we seem to make no such distinction. So while Wikidata has separate data items for Macbeth, the play by Shakespeare, and Macbeth, the opera by Verdi, our advice currently would lump them into a single page. --EncycloPetey (talk) 01:53, 31 December 2019 (UTC)
I believe that is a limited perspective, our WD linkage points to a central coordinating point of the concept of the work. The WP article w:Romeo and Juliet there is more than the words of the work itself. WikiCommons' c:Romeo and Juliet is definitely not focused on the base publication, they are the concept. Our versions page focuses on the conceptual, editions, and derived works, and we are arguing about derived works, and the interwikis definitely cater for that aspect, so I don't see a huge difference. The focus of our page is not on the derived works, each of our presentations has/should have its own item, and that has been our practice — billinghurst sDrewth 02:32, 31 December 2019 (UTC)
The Wikipedia article is about the play. If the Wikipedia article were listed here it would be placed on a disambiguation page. So comparing what the Wikipedia article does to what we do is fallacious reasoning. Commons content will vary depending on the number of items in a category. When a category grows large enough, subcategories are split off. So, for example, commons:Category:Medea (Euripides) is for the play by Euripides, but there are subcategories for the translation of the play by Augusta Webster and for the 2009 Syracuse performance of the play. And I don't think you've actually looked at what commons:Category:Romeo and Juliet has or what subcategories it contains. The key here is that neither Wikipedia nor Commons make distinctions between items that are the work and items about the work. So if we follow your line of reasoning (based solely on mimicking Wikipedia and Commons) then our "Versions" pages ought to contain items that are the work, as well as items about the work, which is what we currently do with Portals. Is that what you are advocating for?
Further, the interwikis between Wikipedias link only to other articles about the Shakespeare play. The interwikis to Wikiquote link only to pages quoting the play. The interwiki links to other Wikisource projects link only to translations of the play. So I don't see any truth to your claim that the interwikis cater to derived works. Yes it's possible to navigate by following additional links, but we do that with {{similar}}.
That said, are you advocating for a change to the advice on the quoted page, or are you advocating for something else? Your preferred course of action isn't clear. --EncycloPetey (talk) 17:29, 31 December 2019 (UTC)Reply
I beg to differ about the WP article, it is an article about the play, and a zillion things that have sprung from it. If it mentions "legacy" and talks about ballet, it has gone beyond the play, and is about the conceptual work in a broader aspect including deemed pertinent derivatives. Commons has artwork that is not from the original play, it is someone's conceptual interpretation from the play, these are derived works on the subject. So as such, I don't see it as black and white as you do for the other sites, and the WD item. That said, I am awaiting others' comments, they are important for framing. — billinghurst sDrewth 12:15, 1 January 2020 (UTC)Reply

Cosmetic problem with {{header}} change

This conversation has been moved to Template talk:Header#title & contributor: one line or two? Levana Taylor (talk) 04:09, 31 December 2019 (UTC)Reply

Happy Public Domain Day!

Here are some things entering the public domain in the next several hours: https://web.law.duke.edu/cspd/publicdomainday/2020/ Justin (koavf)TCM 06:29, 31 December 2019 (UTC)

Do our "1923" templates need to be updated? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:53, 31 December 2019 (UTC)Reply
They shouldn't. {{PD-1923}} was adjusted last year to automatically progress, and {{PD/1923}} uses the code of the first. --EncycloPetey (talk) 17:17, 31 December 2019 (UTC)Reply
Template:PD-anon-1923 does need converting to make it automatic. It's currently stuck on 1924. BethNaught (talk) 11:55, 1 January 2020 (UTC)Reply
We should rather be updating and using {{PD-anon-1996}}, and just leave 1923 alone. — billinghurst sDrewth 12:05, 1 January 2020 (UTC)Reply
@BethNaught: I have updated {{PD-anon-1996}} and {{pd/1996}} so they display dates and text appropriately for the 2020. They are relatively done, so progression each year will be fine too. — billinghurst sDrewth 13:46, 1 January 2020 (UTC)Reply
1923 is not relevant for anonymous works anymore, and 1996 conflates "expired in the US" and "not renewed by the URAA", which isn't something we should be doing.--Prosfilaes (talk) 19:56, 1 January 2020 (UTC)Reply
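
For readers unfamiliar with what "automatically progress" means above: a minimal sketch, assuming the simplified rule that a work is public domain in the US by publication date once 95 years have elapsed, so that the cutoff is the current year minus 95. This is not the code of {{PD-1923}} or any other template; the function names are hypothetical, and the rule deliberately ignores renewals, unpublished works, and the URAA question raised above.

```python
from datetime import date

# Assumption: a fixed 95-year term counted from publication, rolling forward
# each 1 January. Real copyright status depends on more than this.
US_TERM_YEARS = 95


def pd_cutoff_year(today=None):
    """Return the year such that works published before it have expired."""
    today = today or date.today()
    return today.year - US_TERM_YEARS


def is_pd_us_by_publication(pub_year, today=None):
    """True if a work published in pub_year falls before the rolling cutoff."""
    return pub_year < pd_cutoff_year(today)
```

Under this assumption the cutoff was 1924 during 2019 and became 1925 on 1 January 2020, which is why a template hard-coded to one year needs manual updating while one based on elapsed time does not.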

┌─────────────────────────────────┘

Not only do we have at least three templates with "1923" in their names (and quite why we have {{pd/1923}} and {{pd-1923}}, differentiated only by one punctuation character, for apparently very different functions, is anybody's guess), but they hard-code Category:PD-1923 and Category:Author-PD-1923. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:23, 1 January 2020 (UTC)Reply

We have PD-1923 for works where we know their publication date, but not the date of the author, or the date of the author is complex (multiple or corporate), plus it only gives US. PD/1923 allows the split copyright of US and home country based on the author's date of death. PD/ gives us the indication of when we can move to Commons, whereas PD- does not. — billinghurst sDrewth 12:39, 1 January 2020 (UTC)Reply
"PD/ gives us the indication of when we can move to Commons, whereas PD- does not" I know what they do; but does anyone seriously think that "/" vs. "-" is the best way to convey that to colleagues, especially new editors? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:33, 1 January 2020 (UTC)Reply
Slashes means they are subpages, and accordingly can be relatively linked. — billinghurst sDrewth 13:44, 1 January 2020 (UTC)Reply
Template:Pd/1923 is not a "subpage", because Template:Pd does not exist; it's just badly named. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:33, 1 January 2020 (UTC)Reply

┌─────────────────────────────────┘

The year 1923 is no longer relevant (in fact no fixed year is relevant now; the elapsed time is, so the wording of the text could perhaps be updated too). So I suggest

  1. to rename the template {{PD-1923}} to {{PD-US-95}}.
  2. to rename the template {{PD/1923}} in a similar way, e.g. to {{PD/US-95}}, or even better to merge it with the previous one
  3. to rename the template {{PD-anon-1923}} to {{PD-anon-US-95}}, or merge it with the previous ones, making the anon just their parameter, e.g. {{PD-US-95|anon}}
  4. to change the texts to "This work is in the public domain in the United States because it was published more than 95 years ago. …" or something similar. --Jan Kameníček (talk) 12:28, 1 January 2020 (UTC)Reply
While I agree that 1923 has progressed, all the years of publication are relevant thereafter, and it is best to just keep it all harmonised. It was chosen that way to keep it simple; simply pick the year period, add your publication year. "-95" is just going to cause issues, is it -95, -95+1, -95 from today, how many is -95. Fixing the templates at the back end, is pretty easy, and I will just plan to get it done. — billinghurst sDrewth 12:34, 1 January 2020 (UTC)Reply
I do not see any difficulties. It would be as easy to use as e.g. {{PD-anon-70}} or {{PD-old-70}} at Commons, and most people coming here usually have experience with Commons templates. Or alternatively, it can be renamed for {{PD-US-96}} with the text "This work is in the public domain in the United States because it was published at least 96 years ago or earlier. …" The documentation can not only explain it further, but even specify which the latest acceptable year is (with automatic update of the year). If the templates were merged into one, everything would be perfectly harmonized. --Jan Kameníček (talk) 14:36, 1 January 2020 (UTC)Reply
The problem with using a template like "PD-anon-70" is that it is applicable for 10 years or less. Eventually, we will reach 80 years, 90 years, and 95 years, at which point such templates must be replaced based on the increasing number of years since the author's death or the work's publication. Any template that is set to operate based on a fixed range after the author's death or the work's publication date will produce this issue of perpetual monitoring. Such an approach might be possible on Commons, with a larger community to constantly adjust, but for a smaller community like Wikisource, it is not the best approach. It is better to have templates that adjust their display based on information provided about the date of publication or date of the author's death. --EncycloPetey (talk) 18:43, 1 January 2020 (UTC)Reply
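To make the "-95 versus -96" question above concrete, here is a minimal Python sketch of the arithmetic an automatically progressing template has to perform. It assumes a flat 95-year US term running to the end of the calendar year of first publication; the function names are hypothetical, and the sketch deliberately ignores no-notice, no-renewal, and late-registration cases.

```python
US_TERM_YEARS = 95  # assumed flat term, counted to the end of the publication year


def latest_pd_publication_year(current_year: int) -> int:
    """Latest publication year whose term has fully expired in current_year."""
    # A work published in year Y is protected through 31 December of Y + 95,
    # so it enters the public domain on 1 January of Y + 96.
    return current_year - US_TERM_YEARS - 1


def is_pd_by_age(publication_year: int, current_year: int) -> bool:
    """True if the work is public domain in the US purely because of its age."""
    return publication_year <= latest_pd_publication_year(current_year)


print(latest_pd_publication_year(2020))
print(is_pd_by_age(1925, 2020))
```

Under these assumptions, 2020 gives 1924 as the latest acceptable year, so "published before 1 January 1925", "more than 95 years ago" and "at least 96 calendar years ago" all describe the same boundary.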
100% agree that we should remove "1923" because it's not semantically meaningful--we're just saying, "public domain in the United States due to expiration of copyright", whenever that is. —Justin (koavf)TCM 12:49, 1 January 2020 (UTC)Reply
People may indeed be familiar with the license names at Commons, but the problem on Commons is that those templates do not perform the same functions there as they do here. Over there, an item needs two templates: one for pma licensing and another for US licensing. Sometimes there is a combined template, but sometimes not. Also, if we use the same naming as Commons, people may assume our licensing works just like Commons, and it doesn't. I'm not saying that we shouldn't change licensing templates, but I am saying we shouldn't be looking at the confusion on Commons to decide what we should do here. You only have to look at the parameter listings on commons:Template:PD-US-expired to see how confusing a single template can become. Our current system is much easier to use. Also, a reminder that this same discussion happened in 2018. --EncycloPetey (talk) 16:41, 1 January 2020 (UTC)Reply

┌─────────────────────────────────┘

Well, although I really do not see anything difficult in the templates I proposed, I can live with other kind of templates too. However, I am convinced that whatever templates are created or updated they should:

  • have some comprehensible and general name which does not have to be changed every year (current PD-1923 is an example of a template’s name not suitable for our purposes). PD-US-96 is imo suitable, but I am open to other suggestions too.
I'd prefer -95 or US-expired; 95 is confusing, but so is -96 and any other choice, and it's consistent with -70 and -50. They definitely should be changed. I might argue for moving PD-old to PD-old-100 and replacing PD-old with a warning message, since it is the biggest confusion with Commons users.--Prosfilaes (talk) 19:56, 1 January 2020 (UTC)Reply
There is a quasi-project ongoing at Wikisource:Requested texts/1924 for Public Domain Day; I am adding The Box-Car Children (darker story in the 1924 version than in later versions, interestingly). So get over there now and add a few. Lemuritus (talk) 21:13, 1 January 2020 (UTC)Reply
I agree with user:billinghurst that we should leave 1923 templates as they are. They are true indicators when a work came to be in the public domain. 1924 public domain works should be indicated as such. Both of these have informational value even if it sounds incongruent. — Ineuw (talk) 19:26, 2 January 2020 (UTC)Reply
  •  Comment having some more time to think about this, I am wondering whether we just look to having {{PD-US|year of death}} and we gracefully deprecate both {{pd/1923}} and {{PD-1923}} and revert its text back to where it was. We can copy the current updated code over to this new template, and if there are any 1923+ works, then we simply update them over to the new template. I think the use of "expired" is just superfluous, and we can write that into the text.

    My reasoning for PD-US is that it is simple and it somewhat aligns with Commons. We merge the logic of these templates: if YYYY for DoD (date of death) is given, it displays the death date PMA text; if no date is given, then it just gives the standard "out of copyright" text. The PD-1996 and PD/1996 remain as they are as they still have determinative conversations based on year of publication, and year of death; once a year we would run a bot through and convert those on the 95 year boundary. — billinghurst sDrewth 13:35, 3 January 2020 (UTC)Reply
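    A rough sketch of the merged logic described above, in Python rather than template code. The function name, the wording of both messages, and the rounding of the pma figure down to a multiple of ten are all assumptions made for illustration; none of this is the actual text or code of any existing template.

```python
from typing import Optional

US_TERM_YEARS = 95  # assumed flat US term, counted from the year of publication


def pd_us_notice(current_year: int, death_year: Optional[int] = None) -> str:
    """Build illustrative licence text for a hypothetical merged PD-US template."""
    # Works published before 1 January of (current_year - 95) have expired.
    cutoff = current_year - US_TERM_YEARS
    us_text = (
        "This work is in the public domain in the United States because "
        f"it was published before January 1, {cutoff}."
    )
    if death_year is None:
        # No date of death supplied: only the standard US statement is shown.
        return us_text

    # Full calendar years elapsed since the end of the author's year of death,
    # rounded down to a multiple of ten (an assumed display convention).
    years_pma = current_year - death_year - 1
    pma_band = (years_pma // 10) * 10
    pma_text = (
        f"The author died in {death_year}, so this work is also in the public "
        "domain in countries where the copyright term is the author's life "
        f"plus {pma_band} years or less."
    )
    return f"{us_text} {pma_text}"


print(pd_us_notice(2020))
print(pd_us_notice(2020, 1930))
```

    A real template would also need to handle authors who died too recently for any pma term to have expired, which this sketch does not attempt.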

    Agree. --Jan Kameníček (talk) 22:38, 6 January 2020 (UTC)Reply
    Would this template be used for works published less than 95 years ago? The potential problem I see with having PD-US as a template name is that users may place it regardless of the date of original publication. That is, will this template cover "no-renewal" and "no-notice" situations, or will those templates still serve the separate function? If so, then we may be creating a whole new problem. Nor can we rely on the date of that edition's publication as a guideline, since we need to know the date of first publication and/or date of copyright registration, since that is not always the same, and which is not always included by an uploader. For example, I have come across a work whose initial publication date is 1927, but will not enter PD until 2024 because copyright was filed and renewed within six months and overlapped into the following year. If we're going to overhaul things, we need to consider sources of confusion and what information will be needed by someone to verify the template is correct. --EncycloPetey (talk) 23:05, 6 January 2020 (UTC)Reply
    I agree that we should not have one template for works published more than 95 years ago and no-renewal and no-notice situations. I'm not a fan of a flat PD-US, but a mere naming of a template (the main template) isn't going to stop licenses from being carelessly placed on works.
    I don't understand what you mean by the publication date, though. My understanding is that works could be filed for copyright anytime in the first 28 years, even alongside renewals, but the clock started on the earliest of publication, copyright date, or registration.--Prosfilaes (talk) 23:44, 6 January 2020 (UTC)Reply
    If the work was first published in the UK, there was a grace period in which to register a US copyright (six months?) I have found an instance where the initial publication in the UK happened at the end of 1927, but the initial US copyright was granted in early 1928, followed by a renewal. I have verified this with the copyright database at Stanford. So in this instance, the date of initial publication (in the UK) cannot be used to determine copyright status within the US. --EncycloPetey (talk) 00:17, 7 January 2020 (UTC)Reply
    I checked with Clindberg on a similar case, and he said that the clock would have started in 1927. Since it's not an active issue, I don't want to ping him, but I'm pretty sure initial publication was enough.--Prosfilaes (talk) 03:11, 7 January 2020 (UTC)Reply

Need for a specific doohickey

Is there an existing gadget/widget/app/whathaveyou that would allow us to quickly convert selected text into either a mainspace link, or an Author piped link? It’s insane to consider doing it all by hand, but I'm looking at Leo Tolstoy: His Life and Work/Bibliography, Devil Worship (Joseph)/Bibliography, The Old New York Frontier/Bibliography, Ivan the Terrible/Bibliography and Page:Advanced Australia.djvu/242 and just between that small handful of pages, there are more than a hundred works we neither have, nor even have redlinks pointing toward them. If we could get the redlinks going, then we could tackle the next step of masscreating such author pages, or listing the works on Portals, etc. Lemuritus (talk) 21:37, 1 January 2020 (UTC)Reply

Wikisource:TemplateScript could help you out here—it has "Make title link" and "Make author link" scripts. BethNaught (talk) 21:41, 1 January 2020 (UTC)Reply
That said, there's no guarantee the Author pages will have title matching the text, e.g. "Mrs. Besant" <--> "Author:Annie Wood Besant". BethNaught (talk) 21:43, 1 January 2020 (UTC)Reply
That’s half-helpful, it reduces the need to parse the authors on Leo Tolstoy: His Life and Work/Bibliography to about 200 mouseclicks (which I just did :) ), but that’s still a great deal of strain for every single page considering twenty years of backlog is growing larger, not smaller. There's no way for those tools to be keyboarded, as in Ctrl-Q to make the highlighted text an author link? Also, related question where is the link to see which redlinks have the most references? Then, back to the main question, how the community should make it easier to link/redlink selected texts on pages :P Lemuritus (talk) 22:00, 1 January 2020 (UTC)Reply
Top 5000 wanted pages are at Special:WantedPages. Unfortunately, list is not updated often but gives some guidance. Beeswaxcandle (talk) 22:07, 1 January 2020 (UTC)Reply
Bookmarked it, thanks! I added a bunch of Author pages, would be nice if we had a weekly bot skim all Author pages that don't list any redlinks and add a {{populate}} tag to them :) (At least, until better bots can actually find the works themselves ;) ) Lemuritus (talk) 22:32, 1 January 2020 (UTC)Reply
To get a keyboard shortcut, TemplateScript supports key combinations. These consist of a browser-specific prefix (see w:en:Wikipedia:Keyboard shortcuts) followed by some other key (look at the code at MediaWiki:TemplateScript/typography.js to check for a specific command). N.B. this is just from reading code and docs, I haven't tested. BethNaught (talk) 22:12, 1 January 2020 (UTC)Reply
Thanks, that made the second page easier to parse. A tad complicated for new users (and I'm guessing some experienced ones) to even realise is an option, but at least I am a little more prepared. Lemuritus (talk) 23:08, 1 January 2020 (UTC)Reply
There are a number of regex tools available; Pathoschild's TemplateScript is our most popular, as it can do ad hoc regex replacements, or you can embed regex into your sidebar. A number of us have such scripts within our common.js files that we use to make repetitive maintenance or proofreading tasks easier. There is a regex tool within AWB which works quite well, though all such tools should be used with caution.

As a general comment, while I encourage you to build bibliographic lists, I discourage making them all redlinks: they are not the prettiest look, they are vandal targets, and they are often not the most accurate due to case or title differences. So get something in place, but please don't try too hard to perfect the imperfect. We aim for perfection in the transcriptions; the author and portal pages are curated pages. — billinghurst sDrewth 13:18, 3 January 2020 (UTC)Reply

I'm planning to fiddle with {{Populate}} to make it suggest "What links here" to the viewer to find some works; ideally it should have an even-easier "Click here to add this work to this author", but that might be a pipe-dream. Feel free to correct my errors on the template (which a bot should be auto-adding to authorpages). For example I need help getting the "Edit" in the wording to be a link to edit specifically the "Works" portion of the authorpage, not the general page. Also, IRC would mean less spam here on the scriptorium with these needs :P Lemuritus (talk) 23:17, 1 January 2020 (UTC)Reply

If a page is sufficiently structured, you can use regexr or a similar service to bulk-edit the text, adding [[$1]] and [[Author:$1|$1]] where needed. —Beleg Tâl (talk) 13:36, 2 January 2020 (UTC)Reply
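As an illustration of the bulk-edit idea, the following Python sketch applies the same two replacements with the standard `re` module. The "Title, by Author" line format and the sample entries are assumptions for the example; real bibliographies vary, and, as noted above, the author page title may not match the name as printed.

```python
import re

# Assumed line format: "<title>, by <author>", one entry per line.
ENTRY = re.compile(r"^(?P<title>.+?), by (?P<author>.+)$", re.MULTILINE)


def link_bibliography(text: str) -> str:
    """Wrap each title in [[...]] and each author in [[Author:...|...]]."""
    return ENTRY.sub(
        r"[[\g<title>]], by [[Author:\g<author>|\g<author>]]",
        text,
    )


# Hypothetical entries, not taken from any real bibliography page.
sample = "A Sample Work, by Jane Example\nAnother Work, by John Placeholder"
print(link_bibliography(sample))
```

Lines that do not fit the assumed pattern are left untouched, so the result still needs a manual pass before saving.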

an author page has no effect on copyrighted uploads at commons. just as a big stop sign on the creator page has no effect. how would you "know" contemporary authors have not released a CC or PD? for example Author:Robert Swan Mueller; and it would give you a deletion task flow on commons. rather you would need to blacklist at IA-uploader or wizard -- Slowking4Rama's revenge 19:58, 2 January 2020 (UTC)Reply
"have all their works in copyright" was the bit that you ignored. The community has had that conversation about not creating modern author pages where we cannot host works, and deleting those author pages. I am not talking about authors who have works in the public domain, clearly those author pages are suitable for creation. If you want to amend the text used for clarity, then go for it, however, please don't cloud the issue. — billinghurst sDrewth 11:58, 3 January 2020 (UTC)Reply
did not ignore it, just don’t believe it. the problem for you is you do not "know" a work is in copyright until you do the search. and the rushing to all or nothing conclusions is not helpful. and dictating which author pages to create based on first order conclusions is not helpful either. and you do not "know" that US government employees will not write a memoir that is copyrighted. e.g. Author:Barack Hussein Obama. i am not clouding the issue, rather copyright is inherently cloudy, not amenable to false clarity. Slowking4Rama's revenge 15:10, 3 January 2020 (UTC)Reply
And what you have just said, doesn't change my original statement or meaning.

The community has had the discussion about modern author pages, and set the criteria, as we had people creating problematic pages, or creating pages with zero content, or all works in copyright. My "do not create modern author pages" statement was a general short note, not an explanation of the deliberative process. Where someone has done the searches and found that the work(s) are able to be hosted here, or they are freely hosted elsewhere, then we have allowed those pages, so at that point create the author page. Happy for you to usefully add to the discussion if you think that there is some minutiae to be addressed or to correct an error, or to add clarity, however, this persistent nitpicking and what seems to be deliberate opposition and obfuscation, sheerly because you can, is not helpful. — billinghurst sDrewth 01:09, 7 January 2020 (UTC)Reply

Import request

Can someone please import:

So I can make labels (e.g. Page:W. E. B. Du Bois - The Gift of Black Folk.pdf/41) per the method at w:en:Help:Cite link labels. Thanks. —Justin (koavf)TCM 09:31, 3 January 2020 (UTC)Reply

Not done We use our (house) referencing style per Help:Footnotes and endnotes, not replicate the works. Discussion has been had previously and is in the archives of this page. — billinghurst sDrewth 11:51, 3 January 2020 (UTC)Reply
These aren't mutually exclusive. Why not have the MediaWiki pages as well? And correct me if I'm wrong but it would also overcome this problem: "it has the strong drawback of not being able to group footnotes when transcluded to the main namespace." Justin (koavf)TCM 20:16, 3 January 2020 (UTC)Reply
What price The Tragedy of Romeo and Juliet (Dowden)/Act 2/Scene 1 where I have grouped footnotes on transclusion? Beeswaxcandle (talk) 21:17, 3 January 2020 (UTC)Reply
It seems like that works too but again, there is no harm in having the redundancy and would make it easier for persons familiar with en.wp's method. —Justin (koavf)TCM 22:29, 3 January 2020 (UTC)Reply
How would enabling this "redundancy" keep our house referencing style consistent? How do you propose that (for example) lower-case greek reference marks are automatically turned into numbered reference lists when the parser hits {{smallrefs}}? Always remember that our goal is to reproduce content so that it can be used, not to replicate a multitude of presentation styles from the many publishers out there. Beeswaxcandle (talk) 08:59, 4 January 2020 (UTC)Reply
Why is a consistent house referencing style across millions of works from millennia valuable? I realize that we are striving for a typographic rather than a photographic reproduction but if we can have greater consistency with the presentation of the original source, that is desirable. —Justin (koavf)TCM 06:53, 5 January 2020 (UTC)Reply
Says who? Says why? We already differentiate in multiple ways. We work to the words of the author, not slavishly to a typographic production. Repeatedly having this (epithet) argument is so painful. Every time we allow contributor variation I see it more spawns the hydra-response. I would suggest that every time we allow for user variation we have less site consistency, and that is undesirable. — billinghurst sDrewth 11:21, 5 January 2020 (UTC)Reply
Why is consistency desirable? There is no consistent style in terms of typography, page length, language usage, citations, spelling, etc. across all documents. —Justin (koavf)TCM 11:26, 5 January 2020 (UTC)Reply
Knuth designed his own typesetting system to get his books right. The first run of Alice in Wonderland was pulped because the images were too light. There are notable typographic features in Tristram Shandy, including an all black page at one point. The idea we can neatly abstract out "the words of the author" versus "a typographic production" is somewhat problematic, and given that we do include images, frequently unapproved by the author, not something we follow. Absolute site consistency is not a goal that most of us have, which is why we have this argument over and over.--Prosfilaes (talk) 00:46, 7 January 2020 (UTC)Reply

Disambiguation pages that are not disambiguation pages, more collections

We have had spates in time when pages like Emerson have been created, and I don't see that they are disambiguation pages. They are finding pages, and we are just going to get ugly if we think that we can maintain pages like this with the number of surnames and biographical works that we have. I think that it is problematic enough that we have Author:Emerson though can sort of understand why we might, though don't think that we should. We are making horrid rods for our backs whilst hoisting ourselves on our own petards whilst performing rocket surgery. Noting that I am not taking aim at this creation, as it is similar to other sorts of previous creations. We do need to resolve those that do exist. — billinghurst sDrewth 12:58, 3 January 2020 (UTC)Reply

If if if if if we had to collect something like this, I feel that we would be better to do something like a category for Emerson (surname) or Biographies of people named Emerson, not that I truly love such an approach as it is just burdensome. — billinghurst sDrewth 12:58, 3 January 2020 (UTC)Reply
 Delete 100% agree: Emerson is nonsense unless we have works called simply "Emerson", and Author:Emerson is nonsense unless we have authors whose name is simply Emerson. There are also several disambig pages of this sort which I've added to Category:DNB disambiguation pages. I personally think that disambiguation pages for encyclopedia articles (or including encyclopedia articles in disambiguation pages) is not a thing we should be doing in general anyway. —Beleg Tâl (talk) 13:23, 3 January 2020 (UTC)Reply
@Beleg Tâl: I don't want to be seen as just saying "no", so in the cases of the biographical works we would point them to ToC or indexes. For biographical works that don't have either, then we have done our own compiled lists to these work and simply noted that they are compiled lists. Further if someone wanted to do something special for the surname Emerson, then my opinion is do Portal:Emerson and knock yourself out, no expectation that we are ever comprehensive. — billinghurst sDrewth 13:50, 3 January 2020 (UTC)Reply
 Neutral I have created the page Emerson only to make some space where I could move non-author links from Author:Emerson. I have nothing against its deletion if it is decided we do not need such pages. However, disambiguation pages like the Author:Emerson are IMO quite useful and my opinion here is  Keep as people often know only author’s surname and disambiguation page helps them more than just a search results list. --Jan Kameníček (talk) 13:47, 3 January 2020 (UTC)Reply
That is just going to get ugly to maintain, and when do you build one? Wouldn't you be better looking at Wikisource:Authors-Em? One day eventually there will be enough family name data to be able to get these bot generated. — billinghurst sDrewth 13:57, 3 January 2020 (UTC)Reply
Besides the fact that these lists are as badly kept as disambiguation pages and presently are missing many (most?) author pages, they are also not very user friendly:
  1. Ordinary visitors to Wikisource know nothing about the existence of these lists and finding them after they arrive at the WS main page is not very intuitive. They usually just type the surname into the search field. This can still be solved by redirecting the surname e. g. Kennedy to Wikisource:Authors-K#Ke, but
  2. The lists sometimes contain quite a lot of authors beginning with two particular letters. What is more, the lists should be even longer than they are as too many authors are still missing there, and the number of author pages at Wikisource still keeps rising, so the lists are going to be longer and longer. It is not very friendly to force readers to go through long lists of names if they need just one particular name.
  3. Even worse, unlike disambiguation pages, the lists do not contain any other information than birth and death dates, which makes it more difficult to find the author you need. You have to open every single author of the desired surname before you find the one you are looking for. Disambiguation pages usually contain also nationality and occupation, which makes the search much easier.
For these reasons I consider the lists a good option only where the particular disambiguation page does not yet exist; redirecting the surname to the list can be a temporary solution until the disambiguation page is created.--Jan Kameníček (talk) 14:53, 3 January 2020 (UTC)Reply
As I was indicating these Wikisource:Authors-Xx page would be best as autogenerated pages. My understanding is that the means for generating such pages automatically exists. What is missing is the family-name data at WD, and that is the data we have in a field awaiting inhalation.

With regard to being incomplete now, they are, and they would be no more incomplete than a page like Author:Emerson; curated additions are curated additions. With regard to being hard to find, they are linked from every author page, from the author-index link in the top left. Otherwise I have no idea how users arrive here and look for author pages. — billinghurst sDrewth 11:14, 5 January 2020 (UTC)Reply

As a counterpoint, consider a name slightly more common than "Emerson". Can you imagine if we were to add everyone named "John" to Author:John, or everyone named "Henry" to Author:Henry? We should handle this by pushing for improved search functionality. We shouldn't (IMO) handle this by creating curated pages to disambig on partial names—we have enough to do already around here. —Beleg Tâl (talk) 14:01, 3 January 2020 (UTC)Reply
I agree that adding disambiguation pages with first names is unnecessary and impossible to maintain. As for John, only people whose main name is John, like Author:John of the Cross, or whose surname is John, should be added to such lists. (BTW: most pages from Author:John should be either deleted or moved from author ns to portals, as they do not have any works at Wikisource, but that is for a different discussion.) --Jan Kameníček (talk) 14:53, 3 January 2020 (UTC)Reply
you could do it by using wikidata, i.e. https://www.wikidata.org/wiki/Q190388 and "given name" -- Slowking4Rama's revenge 15:02, 4 January 2020 (UTC)Reply
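For anyone wanting to try the Wikidata route, here is a hedged Python sketch that only builds a query string for the Wikidata Query Service; it does not send it. P735 (given name) and P734 (family name) are real Wikidata properties, but the function name and the choice to restrict results to items with an English Wikisource page are assumptions for illustration.

```python
import re


def name_query(name_qid: str, prop: str = "P735") -> str:
    """Return SPARQL listing people with a given name item and an en.wikisource page.

    prop is "P735" for given name or "P734" for family name.
    """
    if not re.fullmatch(r"Q[1-9]\d*", name_qid):
        raise ValueError(f"not a Wikidata item id: {name_qid!r}")
    if prop not in ("P734", "P735"):
        raise ValueError("prop must be P734 (family name) or P735 (given name)")
    return (
        "SELECT ?person ?personLabel ?page WHERE {\n"
        f"  ?person wdt:{prop} wd:{name_qid} .\n"
        "  ?page schema:about ?person ;\n"
        "        schema:isPartOf <https://en.wikisource.org/> .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )


# The item id is the one linked in the comment above; swap in whichever name item applies.
print(name_query("Q190388"))
```

The results are only as complete as the name statements on Wikidata, which is the gap mentioned earlier in this thread.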

Export to PDF, LaTeX, EPUB, ODT

Hello,

An alternative export to the formats PDF, LaTeX, EPUB and ODT is provided at:

https://mediawiki2latex.wmflabs.org/

The server is provided by the Wikimedia Foundation. The software running on it is GPL licensed open source and part of the Debian Linux distribution.

It is easy to integrate it into the sidebar if requested. This has been done on the German Wikiversity and German Wikibooks. If you are interested just copy a few lines into your MediaWiki:Common.js accordingly.

Dirk Hünniger (talk) 08:32, 4 January 2020 (UTC)Reply

This has been deployed on the German Wikisource. You may try it there too. Dirk Hünniger (talk) 16:18, 5 January 2020 (UTC)Reply

help wanted: OAW layout template?

First, a word of justification for proposing what I'm about to propose. I'm aware that the purpose of Wikisource is, above all, to provide proofread texts of the words of old writings, that images are a secondary concern, and that the typographic form that the text was originally presented in hardly concerns Wikisource at all. Nonetheless, since the magazine Once a Week was in its day well known for its illustrations, and is nowadays better known for them than for its texts, it seems necessary to do justice to the illustrations when transcribing the magazine here. And since the illustrations sometimes interact with the texts, attention to page layout becomes necessary in those instances.

I would argue that if some subpages of Once a Week need a global layout applied to them, for consistency's sake a layout should be developed that can be easily applied to all the subpages. There are surely enough subpages to justify creating a template: 2800-odd in the first series alone! (Some 1000 already exist.)

Here are some examples of subpages that have led me to my opinion that they all should be a column 500px wide, justified.

  • The Sweeper of Dunluce: the illustration was designed to wrap around the text in a specific manner, thus constraining the text's width, and making it look best if justified. There are also other articles with illustrations that wrap around text.
  • The Secret That Can't Be Kept: this play needs to be a fixed width so that the right-aligned stage directions don't move too far from the left-aligned dialogue.
  • The Notting Hill Mystery, Section 7: Fixed width ensures that the diagram is next to the text that it illustrates; more legible if justified.

Default layout 2 does have a 500px justified column. However, it also has other features that are unnecessary (such as specifying serif font) or undesirable (such as its sidenotes style). Thus my request to programmers: is anyone willing to create a custom layout for this magazine?

If anyone's interested in working on this, please continue the discussion at Wikisource:Wikiproject Once a Week/Layout talk. -- Levana Taylor (talk) 07:41, 6 January 2020 (UTC)Reply

Can't help with layout request, but I note for the play that you aren't using {{rbstagedir}}. I find this template quite useful rather than fiddling with floats. Beeswaxcandle (talk) 08:08, 6 January 2020 (UTC)Reply
Thanks for that, it’s helpful, but really, the main issue is the illustrations. Levana Taylor (talk) 08:24, 6 January 2020 (UTC)Reply
I don't really have any bright ideas about a one-size-fits-all formatting, but I'd like to recommend that one eye is kept on how the formatting translates to ebooks. Generally the ebook environment is much more lax with respect to sizes, so you can't reason much about the pixel sizes. For example, here is the start of The Sweeper of Dunluce after ePub export. The screen is 1080 pixels wide (but less than 7cm across in real life), and the font size (x-height here about 25px), as on most e-readers, is under user control, as is the font itself. E-readers vary vastly in screen size, CSS capabilities and so on, so there's not a huge amount you can rely on. There are proprietary programs to "Preview" how it looks on specific e-readers, but they don't seem to work on non-Windows/Mac systems.
Also, keeping an eye on mobile, but non-e-reader, formatting can be helpful. For example the fixed width at The Secret That Can't Be Kept causes it to leak off the page to the right on a phone screen. Very often, this can be pre-empted by changing "width" to "max-width", so it can be further constrained by the screen size. This is easy to preview in Firefox and Chrome with Ctrl-Shift-M.
Generally speaking, it normally kinda-sorta works, but it rarely manages to look as swish as it does on the desktop website, unless special care is taken to accommodate these devices. And the more "fancy" formatting is used, the less likely it is it will translate to these devices without something bizarre happening. Inductiveloadtalk/contribs 09:24, 6 January 2020 (UTC)Reply
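The "width" to "max-width" change mentioned above can be shown with a small Python helper that rewrites an inline style string. The function name and the sample style values are made up for the example; it only touches a bare `width` declaration and leaves `max-width`, `min-width`, and `border-width` alone.

```python
import re

# Match "width:" only when it is not the tail of a longer property name
# such as "max-width", "min-width" or "border-width".
_BARE_WIDTH = re.compile(r"(?<![\w-])width(\s*:)")


def relax_fixed_width(style: str) -> str:
    """Turn a fixed 'width' declaration into 'max-width' in an inline style."""
    return _BARE_WIDTH.sub(r"max-width\1", style)


print(relax_fixed_width("width:500px; text-align:justify"))
```

With `max-width` the column still stops growing at the chosen size on a desktop, but it is free to shrink to fit a narrower phone screen.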
As Inductiveload's example points out, you can't square this circle. You can't have fluid, resizeable text and pixel perfect layout. All attempts to do so will fail, often causing accessibility issues in the process. Fluid design or pixel-level control: Choose one. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:48, 6 January 2020 (UTC)Reply
You are right, of course. Is the number of people using ebooks and small screens and mobile devices increasing to the point that it’s time to give up on doing layout that’s optimized for desktop viewing?
NB most of the OAW subpages work fine with fluid layout, and in fact I have been setting text-columns and images to max-width rather than fixed-width wherever I can, and using image-float for the smaller images so they reposition when the text column is shrunk. I’m just bothered by the few pages where it seems like something special is called for. Maybe I should just create a list of those and see if people have bright ideas for making them work semi-okay on desktop and mobile? There are all sorts of issues, such as the fact that the scores generated by Lilypond are a fixed size. Levana Taylor (talk) 12:37, 6 January 2020 (UTC)Reply
Actually, I wonder if the "The Sweeper of Dunluce" example might be failing because it's using percentages instead of pixels for the polygon, when outlining something that is inherently measured in pixels (an image). This will get wonky results when the screen is too small, whereas modern browsers are usually pretty good at doing the right thing with the virtual unit "pixel". I would also assert that, since the text in the ePub-conversion is perfectly legible even without the fancy formatting, the biggest problem is that File:Castle of Dunluce (OAW).png has a white, rather than transparent, background, which would look ugly even on desktop if we (or the user) had a layout with non-white background.
I also see nothing in these examples that is particularly advanced for a typical web layout, and which can scale from desktop to mobile. ePub ebooks are a special case that tries to be some kind of weird hybrid between dynamic layouts (web) and static layouts (PDF) and suffers inherent limitations as a result: and this means ePubs can't really be automatically generated without some severe tradeoffs somewhere. To get the benefits of ePub you really need to author for ePub; and limiting web presentation to the lowest common denominator will just dumb down both platforms.
We should certainly keep ebook readers in mind and work actively to get the best possible presentation of our works there, but I don't think we should do so at the expense of good presentation on the web (which includes mobile web browsers, which tend to be just as powerful as their desktop equivalents). The need to support older web browsers is already a severe limitation to what we can do, and necessitates designing for graceful degradation (IE/Edge: I'm looking at you!). If we add the pseudo-HTML support in your average ebook reader and the functionality of the Mediawiki-to-ePub converter to the support matrix we might as well just go to straight ASCII. --Xover (talk) 13:36, 6 January 2020 (UTC)
I have replied with some thoughts at the Wikisource:Wikiproject Once a Week/Layout talk page. I think nearly all these issues can be mostly resolved with avoiding fixed-width in the source and applying a {{default layout}} instead, PSM-style. This means the "recommended" layout appears for most users, and can still be overridden as desired. Taking that care to make it work in Layout 1 and 2 will probably make it also work 95% in most e-readers (at least ones with a reasonable CSS engine like apps, YMMV on actual Kindles). Some things are just going to be a bit degraded on e-readers like the illuminated drop initials. That's just how it is, I don't propose to kneecap those to keep a generation-1 Kindle happy. But I certainly think WS should at least attempt to produce functional ebooks/mobile content (even if we don't advertise it as well as frWS does).
OAW is a lovely project, it would be really nice if we could get pretty things out of it like ePubs of the serial works and so on. Inductiveload (talk/contribs) 14:08, 6 January 2020 (UTC)
If required, we can insert an OAW layout into the help:layout mix so that it can be set as a default layout. I do not know whether it would then be possible for it to be excluded from the toggled rotation, or whether creation and addition of a default layout automatically puts it into the rotation. — billinghurst sDrewth 00:45, 7 January 2020 (UTC)

21:18, 6 January 2020 (UTC)

If someone wants to do a short proofread, the Appendix of this would be appreciated. ShakespeareFan00 (talk) 09:19, 7 January 2020 (UTC)

The following copyright discussions and proposed deletion discussions have been open for more than 14 days, and with more than 14 days since the last comments, without a clear consensus having emerged. This is typically (but not always) because the issue is not clear cut or revolves around either interpretation of policy, personal preference within the scope afforded by policy, or other judgement calls (possibly in the face of imperfect information). To resolve these discussions, wider input from the community would be valuable.

Copyright discussions require some understanding of copyright and our copyright policy, but often the sticking points are not intricate questions of law so one need not be an intellectual property lawyer to provide valuable input (most actual copyright questions are clear cut, so it's usually not these that linger). For other discussions it is simply the low number of participants that makes determining a consensus challenging, and so any further input on the matter would be helpful. In some cases, even "I have no opinion on this matter" would be helpful in that it tells us that this is a question the community is comfortable letting the generally low number of participants in such discussions decide.


Copyright discussions (WS:CV)


Proposed deletions (WS:PD)


Note that while these are discussions that have lingered the longest without resolution, all discussions on these pages would benefit from wider input. Even if you just agree with everyone else on an obvious case, noting your agreement documents and makes obvious that fact in a way the absence of comments does not. The same reasoning applies for noting your dissent even if everyone else has voted otherwise: it is good to document that a decision was not unanimous.

In short, I encourage everyone to participate in these two venues! --Xover (talk) 09:35, 7 January 2020 (UTC)

Download tool is broken

The download tool seems to be broken. The Featured text links for a download (any format) result in an error. --EncycloPetey (talk) 17:49, 7 January 2020 (UTC)

ia-upload is down too. I assumed it's a more general WMF Labs thing, but [12] works OK, so... Inductiveload (talk/contribs) 18:08, 7 January 2020 (UTC)
Define broken. The httpd daemon is delivering the upload web page, though I didn't try it out. A range of issues can arise with tools: the whole thing being down, the webservice being off for the account, a page not delivering the right output, or the processing not occurring. An accurate description of what is broken will always be more helpful. — billinghurst sDrewth 00:55, 8 January 2020 (UTC)
"I didn't try it out", but you nevertheless made a disparaging comment. But elsewhere you seem to have been thankful for the notification. Odd that. --EncycloPetey (talk) 02:30, 8 January 2020 (UTC)
Huh? Oh sorry, I was seeing "upload" and following that link, which can have lots of components to it, and I didn't want to upload a file to test (and I did say that). That said, there are still multiple components that can be broken for downloads: the site toollabs:; the webservice; the webpage; the output of a file in mobi or pdf or epub; and the generated files themselves and their contents. That user described exactly what they were doing, and what they were wanting, so maybe it comes down a little to mw:How to report a bug. We still don't know what you were trying to do, and whether it is or is not working at this time. — billinghurst sDrewth 04:05, 8 January 2020 (UTC)
I would say that "The Featured text links for a download (any format) result in an error" means that I was trying to pull a download for the Featured text, but instead was given an error message. I did not know yet whether or not I had a bug to report; I simply knew that something that normally works was not working. It is great that you could list several possible things that might have gone wrong, but how would I have known any of that before you posted? --EncycloPetey (talk) 16:51, 8 January 2020 (UTC)Reply
I was not trying to be disparaging; if I came over that way, then you have my apologies. Anyway, does your problem still exist, or do we need to take further action? Specificity, rather than generality, is still king in that regard. — billinghurst sDrewth 21:10, 8 January 2020 (UTC)

Court decision titles

While organizing categories and portals for Portal:United States District Courts, I have noticed that there appears to be very little uniformity of the manner in which court decisions are titled. A quick look at Category:United States District Court decisions shows that while most pages have the simple format A v. B, such as Doe v. MySpace, Inc. or Shelton v. McKinley, there are also multiple pages with formats:

The strongest opinion I have on the matter is that A v. B is generally preferable, as it is cleaner and more human-friendly, but that A v. B should not point to one decision if A v. B (Year) points to a separate decision; in that instance, I feel that A v. B should be an index page (if related) or disambiguation page (if unrelated) pointing to the multiple A v. B (Year) pages.

This still leaves open the question of what to do when a case has multiple decisions within the same year, such as the many pages for United States v. Hubbard in 1979. My opinion there is to endorse A v. B/Citation as a subpage of an index at A v. B, since such decisions are all part of the same case.

But should all decisions be placed at A v. B/Citation (or A v. B (Year)/Citation when applicable), or should that only be used when a case has multiple documents available? Consistency and exactness favor that all decisions use it, but simplicity and conciseness favor that it only be used when necessary.

And once again, that doesn't cover every instance; when there are unrelated cases with the same name in the same year, such as, for example, the theoretical generic title of United States v. Smith, how should they be differentiated? I see no current examples of this situation, so I don't feel that this question is as high a priority; it would just be good to have an established opinion, for the sake of a complete style guide.

THE SHORT VERSION: should A v. B be the preferred court decision name, with A v. B (Year) being used in cases where disambiguation is required, with Citation being used as a subpage only in cases where multiple documents are issued in the same case?
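As a rough illustration only, the decision tree proposed above can be sketched in Python. The function name and its parameters are hypothetical, invented for this sketch; nothing here is an existing Wikisource tool or template, and the rules encoded are just the ones stated in the proposal.

```python
def decision_page_title(case_name, year, citation,
                        needs_disambiguation=False,
                        multiple_documents=False):
    """Return a page title following the proposal in this thread.

    Hypothetical helper, for illustration only:
    - "A v. B" is the default title;
    - "A v. B (Year)" is used only when disambiguation is required;
    - "/Citation" is appended as a subpage only when several
      documents were issued in the same case.
    """
    title = case_name
    if needs_disambiguation:
        title = f"{title} ({year})"
    if multiple_documents:
        title = f"{title}/{citation}"
    return title


# Placeholder values, not real citations.
print(decision_page_title("A v. B", 1979, "Citation"))
print(decision_page_title("A v. B", 1979, "Citation",
                          needs_disambiguation=True,
                          multiple_documents=True))
```

The sketch deliberately leaves open the case raised earlier of unrelated cases with the same name in the same year, since no rule for that has been proposed yet.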

Qwertygiy (talk) 21:39, 7 January 2020 (UTC)

I think that is exactly the correct decision tree here. —Justin (koavf)TCM 23:14, 7 January 2020 (UTC)
  • I think that it is reasonable to assume that our community of interest for court decisions will be people who are in some way involved in legal research (whether lawyers, judges and clerks, law students, or legal writers). There is a predominant form of citation for legal cases in the American legal community, which is the Bluebook, which prescribes the A v. B, Citation (Year) format. I would also note that most of the court decisions that we host are merely one in a series of decisions made with respect to that particular case. For example, Heart of Atlanta Motel, Inc. v. United States, 231 F. Supp. 393 (N.D. Ga. 1964) was the decision of the district court that was appealed to the Supreme Court, which decided the case as Heart of Atlanta Motel, Inc. v. United States, 379 U.S. 241 (1964). These cannot be disambiguated by year because they are in the same year. Note that a case with only the year in the final parentheses will be a U.S. Supreme Court case, with all lower court cases specifying the court. My preference would be to follow the Bluebook citation for American cases. BD2412 T 00:14, 8 January 2020 (UTC)
    I don't have a strong opinion on naming, but I think that people involved in legal research will have access to Westlaw. Our accessors are the common man, who reads Wikipedia and wants to look up an opinion to get the judge's exact wording or a better understanding of the legal principles behind their republic. --Prosfilaes (talk) 01:03, 8 January 2020 (UTC)
    What ↑ said! --Xover (talk) 05:41, 8 January 2020 (UTC)
  • A v. B, Citation (Court Year) should be the way that the case is titled in the header of the page, or when listed in Portals, certainly. However, there are several arguments to make against using it for the title of pages.
    • It complicates linking to pages, as it's a lot easier to remember the proper information for a page styled as Brown v. Board of Education, the way Wikipedia styles that case, than it is to remember 347 U.S. 483 (1954) (and that's before we get into cases with the varieties of F. Cas., F. Rep., F. Supp., F. Supp. 2d, etc.)
    • It would often result in having to recaption the link as [[Brown v. Board of Education, 347 U.S. 483 (1954)|Brown v. Board of Education]] whenever used in navigation or other templates where space is limited.
    • w:Wikipedia:Manual of Style/Legal#In the United States states that Bluebook format should be generally followed for article titles on Wikipedia, but the examples given are w:National Railroad Passenger Corp. v. Boston & Maine Corp., w:Bailey v. Drexel Furniture Co., and w:Carter v. Carter Coal Co., without citations following the titles.
    • It's also worth noting that the Bluebook is neither the official style guide for all U.S. jurisdictions (with significant deviations including the Supreme Court, California, Delaware, Maryland, and Michigan), nor is it a freely available resource; there are independent online citation generators which one can use, but the book and its guidelines themselves are protected by copyright and sold by Harvard, including via paid internet subscription.
Qwertygiy (talk) 01:33, 8 January 2020 (UTC)
  • I think that following a Bluebook citation style for naming American cases and then different styles for different jurisdictions will only prove confusing. Redirects are cheap and we should create them where we can for human linking and machine reading but the proper title of a page should be something that is maximally simple and intelligible to a human, e.g. "Brown v. Board of Education". (Note also that we should use italicized titles for these pages.) —Justin (koavf)TCM 01:59, 8 January 2020 (UTC)Reply

 Comment my opinion is not specific to citation, and more about house style and seeing if we can meander our way to a sensible output for US court cases within our environment.

  • our use of year = YYYY spawns the output (YYYY), output which is overrideable, please consider how that works in the plan
  • remember to plan for disambiguation pages and where they fit into the mix where similarly named
  • we host the world's works so please don't get caught in a US-centric or US-only naming system
  • pagename and page title work best when close, though title should always be what was on the work
  • descriptive page names are useful
  • subpages (where we use a /) are for pages of the same work; different works are not subpages. (And please avoid forward slashes in naming works as part of their general pagename.)

billinghurst sDrewth 00:51, 8 January 2020 (UTC)

  • I don't think we should give too much weight to one specific citation style guide when it comes to page names, except in the sense that it is a strong advantage if the page name (or one of its redirects) bears a close resemblance to how that work will be cited in other works. If almost all other works refer to the play as King Henry the 5th, it would be disadvantageous for us to put it at Henry V. For court cases, that suggests we should aim to have redirects from the common citation styles by default; and it means we should let the Bluebook standard inform our choice of standard naming, as one factor among several.
    I don't have strong opinions on the specific naming (this is not my field), beyond finding Qwertygiy's proposal generally sensible. I would also strongly urge that the outcome of this discussion is some kind of written guideline for such page names that can be referred to in future. --Xover (talk) 05:56, 8 January 2020 (UTC)Reply

Bad text layer extraction from PDFs

As it was already discussed here, Mediawiki has problems with text layer extraction from PDFs. Now I have reported it to phabricator, see task T242169. If anybody were able to add any useful comment there, it would certainly be very helpful. --Jan Kameníček (talk) 18:28, 8 January 2020 (UTC)

404:Not Found (thumbs)

Hi! I'm getting a "404:Not Found" when clicking certain image tabs in Proofread pages. Let's take Index:Scientific American - Series 1 - Volume 009 - Issue 18.pdf as an example. If you click on any page, and then click the "Image" tab, you get the error.

The URL that we are trying to load contains decimals (e.g. https://upload.wikimedia.org/wikipedia/commons/thumb/8/84/Scientific_American_-_Series_1_-_Volume_009_-_Issue_18.pdf/page4-1652.0833333333px-Scientific_American_-_Series_1_-_Volume_009_-_Issue_18.pdf.jpg). If you replace 1652.0833333333px with 1652px in the URL, the thumb loads OK.
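A minimal Python sketch of that manual URL fix, for anyone who wants to check several pages at once. The function name and the regular expression are assumptions made for this illustration; this is not how MediaWiki builds thumbnail URLs, only a way to reproduce the hand edit described above by dropping the fractional part of the width.

```python
import re

# Matches a width such as "1652.0833333333" sitting between "-" and "px-"
# in a thumbnail URL, capturing only the integer part.
_FRACTIONAL_WIDTH = re.compile(r"(?<=-)(\d+)\.\d+(?=px-)")


def integer_width_thumb_url(url: str) -> str:
    """Drop the decimals from the pixel width in a thumbnail URL.

    Hypothetical helper: it only rewrites the string and does not
    check that the resulting URL actually exists.
    """
    return _FRACTIONAL_WIDTH.sub(r"\1", url)


broken = (
    "https://upload.wikimedia.org/wikipedia/commons/thumb/8/84/"
    "Scientific_American_-_Series_1_-_Volume_009_-_Issue_18.pdf/"
    "page4-1652.0833333333px-"
    "Scientific_American_-_Series_1_-_Volume_009_-_Issue_18.pdf.jpg"
)
print(integer_width_thumb_url(broken))
```

URLs whose width is already an integer pass through unchanged, so the function is safe to apply to a mixed list.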

I think this happens with all the files that have been "mirrored" from Commons with decimals in the px. E.g.:

  • Commons: "Original file ‎(1,653 × 2,362 pixels, file size: 4.27 MB (...)".
  • Wikisource: "Original file ‎(1,652.0833333333 × 2,360.4166666667 pixels, file size: 4.27 MB, (...)".

It happens in all Wikisources I've looked at. In one case, the error also prevents the image from displaying in the Page namespace because of the same URL with decimal px values (e.g. ca:Pàgina:Flor d'enamorats (E. Moliné y Brasés).pdf/12; this Catalan page was OK 3 days ago).

So, any idea, please? Thanks. -Aleator (talk) 15:00, 10 January 2020 (UTC)

Also look at any page of Index:AASHTO USRN 1980-06-22.pdf: both types of error (404 and no thumb). Something is wrong in PDFs... -Aleator (talk) 15:15, 10 January 2020 (UTC)
@Aleator: This is probably phab:T242422, and connected to the new version of MediaWiki that was deployed today. —Xover (talk) 16:10, 10 January 2020 (UTC)
@Aleator: As a workaround, you can try to set Scan resolution in edit mode in the Index to an integer value. Ankry (talk) 21:48, 13 January 2020 (UTC)

The Outline of History

All the illustrations for Index:The Outline of History Vol 1.djvu and Index:The Outline of History Vol 2.djvu have been deleted from Commons. I could not find a link to the discussion, but it seems that the illustrations are under copyright in the UK because they are by J. F. Horrabin (d. 1962).

Since there are (were) illustrations on most pages of the work, this will mean that the integrity of the work as a whole has been compromised and may require significant cleanup. --EncycloPetey (talk) 16:59, 12 January 2020 (UTC)

Not a long discussion, see c:Commons:Deletion_requests/Illustrations_by_J._F._Horrabin. If they were in scope to be hosted here, they could have been moved. Anyhow, nothing new: they always act with little consideration for other projects. Mpaa (talk) 17:47, 12 January 2020 (UTC)
c:Commons:Deletion requests/Illustrations by J. F. Horrabin -- this would be a good case for a local copy, if a commons admin wanted to copy it over. Slowking4 ⚔ Rama's revenge 03:17, 13 January 2020 (UTC)
Adding file links for convenience: c:file:The Outline of History Vol 1.djvu and c:file:The Outline of History Vol 2.djvu — billinghurst sDrewth 03:52, 13 January 2020 (UTC)
The two djvu files have been moved here, on the guess that at some point someone will complain. I have also left a note on the deletion discussion to address this matter. — billinghurst sDrewth 04:07, 13 January 2020 (UTC)
@Billinghurst: I do not know how you managed to create a duplicate here; I tried to do the same for the other images (via pywikibot/API) but failed due to "API error fileexists-shared-forbidden: A file with this name exists already in the shared file repository". I also tried to import from commons (see File:Page 011 (Vol. 1 - The Outline of History, H.G. Wells).png), but I guess there is no local file here. I hope you have the tools to make a mass move from there to here. All I could do is save a local copy on my PC. Mpaa (talk) 22:11, 13 January 2020 (UTC)

A personal essay from a kindred site

https://blog.pgdp.net/2020/01/01/ten-eleven-years-at-dp/ —Justin (koavf)TCM 07:27, 13 January 2020 (UTC)

Tech News: 2020-03

MediaWiki message delivery (talk) 18:39, 13 January 2020 (UTC)