Wikisource:Scriptorium

Scriptorium
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel webclient. For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 336 active users here.

Announcements

Changes to Template:Header

To the template (Template:Header):

some alterations
  1. for the parameter contributor there is now a synonym section_author
  2. for the parameter override_contributor there is now a synonym override_section_author

This was requested because "contributor" was said to cause some confusion. Whether we should migrate usage and/or deprecate the term has not been discussed.

some additions
  1. the parameter section_translator, wikilinked
  2. the parameter override_section_translator, not wikilinked and takes formatting

This allows the translators of a subpart of a work to be recorded; previously, use of translator applied it to the section for the work, not to the subsection.

The documentation has been updated.
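
A minimal sketch of how the new parameters might look on a subpage, assuming the template's usual title, author and section parameters. The work, chapter and names are hypothetical placeholders, and the template documentation remains the authority on exact behaviour.

```wikitext
<!-- Hypothetical subpage header; every value below is a placeholder -->
{{header
 | title              = [[../]]
 | author             = Example Author
 | section            = Chapter 1
 | section_author     = Example Contributor  <!-- synonym for contributor -->
 | section_translator = Example Translator   <!-- wikilinked by the template -->
}}
```

Where a name should not be wikilinked, override_section_author and override_section_translator take formatted text instead.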

It is preferred that any discussion should be handled in a new section on this page, rather than as part of this announcement. Thanks. — billinghurst sDrewth 12:40, 30 December 2019 (UTC)

Proposals

New speedy deletion criterion for person-based categories

The following discussion is closed and will soon be archived:
There is community consensus for a new G8 criterion for speedy deletion of person-based categories.


[Addendum for clarity in the archives: the <s>…</s> markup above was added after the fact by two editors who do not agree with the community's decision, but for whatever reason do not wish to open a new discussion to attempt to persuade the community to their point of view. That the closing text is struck through does not indicate an absence of consensus for this decision; merely that they wish to annotate it with their dissent. Note also that this discussion was closed and reopened mid-way through in a way that is not visible in the archived text: the revision history of WS:S during the period of the discussion must be examined to get the full picture. Any further discussion of this issue (including any contrary community decisions) is likely to be found on WS:S with keywords "G8", "person-based", or "author-based". --Xover (talk) 08:09, 9 January 2020 (UTC)]

The <s>…</s> markup above was added by me, who regards the struck text as a mischievously counterfactual summary of the discussion, and the "addendum for clarity" as little better. Hesperian 23:29, 9 January 2020 (UTC)

Following on from a discussion at WS:PD#Speedy deletion of author based categories.

It is long established and in the main uncontroversial that English Wikisource does not use person-based categories (of the type "Works by John Smith", "Poetry by John Smith", etc.). Some previous discussions can be found at: 1, 2, and 3 (and the two following threads). However, absent a speedy deletion criterion specifically for these, admins have to rely on the provision for precedent-based deletions. In practice this means such categories must be brought to WS:PD to be rubber stamped, wait at least two weeks (because inertia and habit), and then hopefully someone will remember to process them. Eventually.

I therefore propose that we extend the deletion policy with a new G8 criterion as follows:

  • Person-based categories—Categories where the defining characteristic is person-based. This includes, but is not limited to, author-based categories like "Works by author name".

All deletions (modulo CU type concerns) are subject to community challenge in any case, and are clearly visible in the deletion log, so there is no particular benefit to the bureaucracy where there exists no significant uncertainty or controversy. --Xover (talk) 14:32, 15 July 2019 (UTC)

Support, but I'd note that there is an exception discussed in link #2: namely, American presidential documents categorized by president. This is due to the fact that the administration of the executive branch is tied to who is the president at the time. There was no consensus as to the scope of this exception: what kinds of presidential documents it applies to, or whether other governments may have the same treatment, etc. —Beleg Tâl (talk) 14:42, 15 July 2019 (UTC)
Oppose 2 weeks is not too long to wait. organization of subject of a work is useful, a migration to a stable ontology is necessary. Slowking4Rama's revenge 13:58, 30 July 2019 (UTC)
2 weeks is definitely too long to wait when a full bureaucratic procedure with a foregone conclusion could be replaced with a simple administrative action. —Beleg Tâl (talk) 14:32, 30 July 2019 (UTC)
Also it is worth pointing out that this proposal is not regarding whether such categories should be kept or deleted (since we have already established that they should be deleted), but only whether they should be posted to WS:PD before we delete them. —Beleg Tâl (talk) 18:51, 30 July 2019 (UTC)
And that strictly speaking, under current policy, they can be deleted a few days after a notice has been posted to WS:PD (no two week wait required, just that the discussion must have "started"). It's just that habit and inertia inevitably means that almost all cases will in practice suffer this 2+ week purely bureaucratic delay. I'm a big believer in process and the value of bureaucracy when properly deployed, but even I think this one is a pointless waste of volunteer time. We have issues that require actual discussion or other action that have sat open on the noticeboards for a year and a half; we should not waste those resources on filling out forms in triplicate for issues that are not controversial. Any deletion can be reviewed and overturned, if needed, by the community; let's save the cautious multiple-safeguards approach for stuff that might actually need it. --Xover (talk) 19:11, 30 July 2019 (UTC)
I always wait until there has been a full month of inactivity, since there are many editors who only edit occasionally, but that's just me. —Beleg Tâl (talk) 19:17, 30 July 2019 (UTC)
Support --EncycloPetey (talk) 17:40, 30 July 2019 (UTC)
Support --Jan Kameníček (talk) 19:38, 30 July 2019 (UTC)
Support though if possible I'd like to see the exception Beleg Tâl specified firmed up a bit, i.e. perhaps a general exception for things like governments, ministries, and reigns which are "person-based" but serve an obviously different function to categories-by-author (noting on the UK side things like Category:Acts of the Parliament of Great Britain passed under George III). —Nizolan (talk) 00:44, 1 August 2019 (UTC)
  • Note Based on the discussion above I have added the above criterion with an additional limitation to exempt things like UK governments tied to a monarch's regnal period or the administrations of US presidents. I read the above as general support for this criterion—sufficient for adding it—but with some remaining uncertainty about the optimum phrasing. I'll therefore leave this discussion open for a while longer so that interested parties may object or suggest better wording. I'll also add that minor changes to the wording (that do not change the meaning) can easily be made later with a proposal at the policy talk page. And we can always bring bigger changes up here for reevaluation if it causes problems. --Xover (talk) 19:32, 11 August 2019 (UTC)

Deletion review

I long ago (2005) gathered together historical documents related to the life of Indigenous Australian warrior Yagan in Category:Yagan. This has always seemed to me a reasonable category, but it just got speedily deleted without so much as a how-d'-y'-do.

The examples given in this proposal were of the form "Works by John Smith", "Poetry by John Smith", etc. No other examples were given in the discussion. So I'm not sure if the community really intends that categories like this would be deleted. Can we review this please?

Hesperian 23:48, 2 September 2019 (UTC)

Hmm. I'm not going to express an opinion on "should" / "should not" for this, but I will note that based on my understanding of the discussions this would indeed be the intended effect. The defining characteristic of the category is that its members relate somehow to a specific person, and for such the consensus appeared to be that portals were better suited. But perhaps there is a distinction between Category:Yagan and Category:John Smith that I am not seeing? Or is it the specificity: Category:Foo by Person is bad, but Category:Person is acceptable? --Xover (talk) 03:58, 3 September 2019 (UTC)


As things stand:

  • I can gather together documents about the Battle of Borodino in Category:Battle of Borodino, because that's an event.
  • I can gather together documents about Fort Knox in Category:Fort Knox, because that's a place.
  • I can gather together documents about scissors in Category:Scissors, because they are objects.
  • I can gather together documents about intelligence in Category:Intelligence, because that's an abstract concept.
  • But I can't gather together documents about Yagan in Category:Yagan, because he was a person.

Can no-one see how bizarrely arbitrary this is??

And it hasn't even really been discussed, since the only examples given above are "Works by" categories, the deletion of which makes perfect sense. Hesperian 11:50, 3 September 2019 (UTC)

Fully agree with Hesperian, the speedy deletion is a misinterpretation of the guidance. The "category:works of ..." is to ensure that works of authors are added to author pages, and not categorised. There is no determination that it would relate to anything else. Categorisation has always existed for people, again our biggest issue is how to separate author categorisation from subject categorisation. — billinghurst sDrewth 12:39, 3 September 2019 (UTC)
Read the policy, it does not say "works by …", it says "person-based". —Beleg Tâl (talk) 12:56, 3 September 2019 (UTC)
Per our deletion policy (as updated according to the consensus in the above discussion), "Person-based categories" are now a criterion for speedy deletion. This "includes, but is not limited to, author-based categories", but "the defining characteristic is person-based". This was very explicit in the above proposal. My deletion of Category:Yagan was therefore 100% within our deletion policy. You can propose a reversion to the older version of the deletion policy, and a restoration of Category:Yagan (even though it is entirely redundant of Portal:Yagan), but I will have no part in it. —Beleg Tâl (talk) 12:53, 3 September 2019 (UTC)
Also: as things stood before the above discussion, I could gather together documents about Yagan in Category:Yagan, but couldn't gather together documents about Yazid III in Category:Yazid III, which is just as bizarrely arbitrary. —Beleg Tâl (talk) 13:03, 3 September 2019 (UTC)
(ec) It is my opinion that it is not a positive change. 0-100 in four seconds. I find the statement It is long established and in the main uncontroversial that English Wikisource does not use person-based categories to not be the case, especially as it has been the case since 2005. Something that was entirely in scope and I believe would have been kept in a PD, is now going to a speedy deletion and deleted without conversation. I find that inappropriate, and for that to have been implemented in four weeks is an example of poor implementation and poor policy. I am wondering where this community is going, and the lack of vision that this represents. — billinghurst sDrewth 13:14, 3 September 2019 (UTC)
It may also have simply flown under the radar. It is also just one category affected, and a completely redundant one at that (equally redundant to any Author-based categories). And the proposal to update the policy was done entirely by the books, and is a significant benefit to the community. —Beleg Tâl (talk) 13:30, 3 September 2019 (UTC)
And it has been long established and in the main uncontroversial that English Wikisource does not use categories for individuals who have pages in Author space; the fact that there existed one or two categories for an individual in Portal space is (to me) a minor detail and I would have also considered it long established and uncontroversial that these were also unwelcome. —Beleg Tâl (talk) 13:33, 3 September 2019 (UTC)


Of most concern to me in this new G8 is, what if Portal:Yagan did not exist? In that case, Category:Yagan would be the only way in which we had organised our material by topic, yet it would still be summarily deletable under this new G8.

I think a more coherent policy position might be:

We don't want to organise our material by both Author/Portal and Category. So it is fine to create a category for a topic if there is no corresponding Author/Portal page. But be aware that this is a stopgap -- once someone has created the Author/Portal page, the category may be deleted.

Note that this doesn't distinguish people from other topics. Category:Yagan is fine, but only until Portal:Yagan has been created. Even Category:Works by John Doe is fine, but only until Author:John Doe has been created.

I think the biggest problem with this position is the really big topics that would be better handled by a category than by an Author/Portal page e.g. War. In that case, I would say keep the category and ditch the portal, which would be unmaintainable. In a speedy criterion there would certainly need to be something to prevent deletion of categories that contained subcategories or a collection of portal/author pages.

Thoughts? Hesperian 22:50, 3 September 2019 (UTC)


Since the attitude to concerns raised here has been "I will have no part in it" followed by non-participation in the discussion, I have boldly replaced "person-based" with "author-based". I accept the new G8 was proposed, discussed and implemented in good faith, but subsequent objections have made it clear that there is no consensus for speedy deletion in the gap between "person-based" and "author-based".

To be clear: we may not agree on whether Category:Yagan should have been deleted, but I think we can all agree that the deletion was contentious, and speedy delete criteria are intended to capture non-contentious matters.

Hesperian 07:53, 6 September 2019 (UTC)

@Hesperian: I'm not going to revert that because I think at least temporarily going back to the status quo is prudent when a concern has been raised so soon after implementation. But I do object in principle to your approach here: whatever the problems with the new G8, it was properly discussed, consensus determined, and implemented. For you to unilaterally reverse it is not a good practice, no matter the merits of your concerns with it. The proper description of the thread above is, strictly speaking, not "absence of consensus" but rather "complaints after the fact" (possibly good, proper, and meritorious complaints, but still after the fact). So I am going to insist that this removal of the new criterion is a temporary measure while discussion is ongoing, and not the new status quo. If no new consensus is reached here then we revert back to what was previously decided. (To be clear, if you had suggested we should temporarily revert I would have supported that. It is your acting unilaterally with an apparent intent to change the status quo I object to.)
That being said I am absolutely open to being convinced of anything from the new criterion needing to be tweaked and to it needing to be dropped altogether. The reason I am not currently actively discussing is that I do not feel I sufficiently grasp the issue and am mulling it over. Your distinction between "person-based" and "author-based" has not been apparent to me prior to your latest comment, and I now suspect that that distinction is the crux of your objection; but I still do not grasp why you do not feel a portal would be sufficient. On the other hand, reasonably curated categories are cheap, and can conceivably be automatically applied to works included in a portal.
I also suspect, though I may of course be entirely mistaken, that what we are discussing here is not actually a speedy criterion, but rather a more fundamental issue of category and portal policy. I am not convinced the speedy criterion is a useful proxy for that debate, on the one hand, and that the former will resolve itself neatly if the latter is settled, on the other. --Xover (talk) 08:30, 6 September 2019 (UTC)
@Hesperian: "I will have no part in it" is me, not the community. I agree with Xover that it is necessary to establish a new consensus with the community to make a subsequent update to the deletion policy (in which discussion I will remain neutral). And like I said to TE(æ)A,ea.: three days is not remotely sufficient for closing a discussion. Be patient. —Beleg Tâl (talk) 12:20, 6 September 2019 (UTC)
  • Comment There is definitely a long-established practice that we collect and curate works that relate to authors, and due to our strong preference to curate, we determined to not categorise, which would have a duplication and a confusion. It has not been the case for individuals who were not authors, and it should not be a requirement that we have to curate such pages, especially where a person may be mentioned on a page(s) though not be the focus of the pages. For instance, the page The Perth Gazette and Western Australian Journal/Volume 1/Number 28 would be considered for categorisation in "Category:Yagan" though would not particularly be the focus of a page and put onto a Portal: ns page. I would definitely not expect someone to have to make edits to a portal page to that target, though I would have no qualms with someone categorising. Where we have authors, we have wikilink'd back to author pages for that relevance. So it is my belief that these non-author categories should not be speedied, if there is a case for their deletion, then bring it to the community. I also believe that a proposer should be listing consequences of their suggested policy changes, not leaving it to the community. I find the above consensus to be a troubling "yes ... tick and flick" exercise by the community without an in-depth exploration of the consequences, approving a change to speedy deletion should be items that are completely non-controversial.

    The above deletion discussion started with the scope of a PD discussion about author categories, and then specifically addressed two author related categories. No examples were given of non-author categories that would have been wrapped up in the change of our guidance, nor that we were going to now speedy delete categories that have been existing for greater than 10 years. I have a strong belief that anything that has existed for over 10 years onsite should not be speedied, and that speedy deletions are only best applied to recent additions.

    Xover: You suggested the policy change, then summarily closed less than four weeks later, and implemented. May I suggest that is not the ideal practice either, as this is a change of policy where all person categories are deleted, not as indicated in the discussion that it was an existing process and the speedy being the only change. We are not a huge community, we don't have the same editing rates, or the diversity of eyes to analyse such situations, and that is traditionally why we have left discussions open for extended periods. — billinghurst sDrewth 10:55, 7 September 2019 (UTC)

    @Billinghurst: "Too quickly closed" is a fair complaint, although I don't entirely agree with that assessment. I agree there should be plenty of time for the community to ponder, scrutinise, discuss, and decide; and in fact was somewhat disappointed that the proposal did not garner wider participation and more discussion. I agree speedy criteria should have a firm basis, which broad participation in the proposal is the best way to ensure (and document!). But I also observe that community participation in such discussions is distressingly low in general, and by that yardstick the above was about the most I felt one could realistically hope for. When no further comments either way surfaced—not even any "Unsure" or "Wait, I need to think a bit more"—I felt that was sufficient to implement. If we want to have much longer timeframes to tease out every possible community comment then we should have specific guidance to that effect (and I do mean a specific number of weeks).
    I agree that speedy should be for uncontroversial things, but then my understanding was that this was uncontroversial. My intent in making the proposal was not to change practice regarding use of categories vs. portals, but rather to eliminate a pointless two-week wait and bureaucratic box-ticking for something that was a priori determined would be deleted. I do however disagree that speedy should not be applicable to, for example, decade old clear copyvio. The purpose of speedy deletions is to reduce bureaucracy and make maintenance more efficient—where possible—and to reduce the demands on the community's time and attention in formal discussions. Because, as you point out, such participation is perhaps our scarcest resource! The age of the material affected is entirely orthogonal to whether it falls within one of the speedy deletion criteria.
    "Uncontroversial" is a better distinction, but even there some nuance is needed. The policy that leads to the deletion (by whatever process) must be unambiguously decided: it must be uncontroversial that that was what the community decided. The issue itself, though, can still be plenty controversial: there are some contributors who would never see anything deleted, for any reason, and express their frustration with copyright law and our copyright policy in every copyright discussion they participate in (nevermind proposed deletions). That someone disagrees with the community's decision, once made, is not a valid reason for considering the implementation of that decision controversial.
    On the issue at hand, though, I (am starting to) see the person/author distinction, but I am having trouble understanding how a portal is any less suited for a person than for an author. To my mind the very same arguments for portal over category for authors apply equally to persons. Why wouldn't The Perth Gazette and Western Australian Journal/Volume 1/Number 28 go in the portal? Or is it the perceived relative amount of effort in curating the two approaches? Hesperian's more coherent policy position seems to suggest that that is the case.
    I don't think starting with a category but deleting it if a portal is created is a particularly rational approach, but as a proposal it does speak directly to the relationship between categories and portals. To me, the opposite end of the spectrum (that you also address) seems more elucidating: once a topic is sufficiently large, a portal becomes an awkward way to organise the information. In those cases I could see an argument for using both; the category for everything and the portal for the highlights. But that's an argument that will be relevant only rarely (relatively speaking) and only in the reverse order (only once the portal is "full" does the category come into play). Most person-related topics will not have too many relevant works for a portal.
    Or perhaps a different angle of attack would aid common understanding: Categories, Portals, and Author-pages overlap in various ways and in different degrees, and so we should establish some coherent guidance on the purpose of each, what to use each for, and how to distinguish between them in difficult cases. Perhaps in discussing what that guidance should be we would better understand the various perspectives than through the proxy of a speedy criterion? For example, do we want a portal about a person as a historical figure if that person is also an author? Is an Author: page and a Portal: the same thing except for inclusion criteria? Do the same layout rules and restrictions apply to both? --Xover (talk) 03:19, 9 September 2019 (UTC)
i am sad that admins persist in summarily deleting, for contentious issues that require a consensus. we need a standard of elevating issues on chat before deletion. and a standard of practice of how to organize ontologies of "subject of" and "depicts". i don’t care how- portals, categories, subsection, anything that can be linked from wikidata. but we need an organizational consensus, not deletion. Slowking4Rama's revenge 03:43, 13 September 2019 (UTC)
@Slowking4: But, but, but, but you do not understand the sysop perspective. They delete without consequence (for themselves, as from a sysop's perspective a deleted page may be view/restored and viewed without going through with restore. See? No consequence!) As for the plebs, tough! Them's oughta put in an application to be tiara'd like good little princesses… 114.78.171.144 06:09, 13 September 2019 (UTC)
114.78: I realise you're taking the piss here, but I actually agree that this is an important difference in perspective to take into account. One thing is that the consequences of deletion can in some (but not all!) cases appear smaller to those with the technical ability to view and restore deleted pages, but the perspective is also shifted when you have long backlogs of tasks that either can only be resolved (in practice) by deletion or where deletion is a fairly foregone conclusion. To have to conduct a formal analysis, formulate it cogently, and run a community discussion is a lot of effort. The relatively low community participation in those discussions means they have a tendency to deadlock, and if resolved are too local to support any kind of future precedent. When a lot of your tasks are dealing with that dynamic, you will naturally tend to develop a bias (big or small) toward more efficient resolutions like having speedy criteria for whatever the issue at hand is.
But when you spend a lot of time going through the maintenance backlogs you also gain the very real experience that tells you that a lot of stuff has been dumped here with no followup, attempts to format properly, or even giving minimal source or copyright information. There is literally no hope of these works being brought up to standard as they are, and would in any case be easier to recreate from scratch than fix in place, even if they aren't blatant copyright violations. While we certainly need to watch for and not get fooled by the previously mentioned bias, we also should let ourselves be guided by this experience. Sometimes the perspective of those who work the maintenance backlogs (which is not by any means limited to just admins!) gives them a better foundation for reasoning about an issue than those who work primarily on their own transcriptions (and sometimes not). --Xover (talk) 07:25, 13 September 2019 (UTC)
your "guided by experience" does not address the power dynamics of a summary standard of practice. when you undertake an action. no matter how reasonable or justified you may feel, while the community is feeling ill-used, then you might want to rethink your action, if you would presume to lead a community. we have a lot of ban-able admins. Slowking4Rama's revenge 11:44, 13 September 2019 (UTC)
@Xover, @Slowking4: My sincere apologies if my comment came across solely as micturient. When young fresh meat front up to gain the authority bit it is entirely reasonable they not realise they are actually signing up for a melange of teacher, executioner, judge and neat-freak. What is less excusable is that some of them never even learn of the damage they do to the parallel roles whilst obsessing over the matter of the moment. Ordinary users are watchers and judgers too and may take away quite unexpected conclusions from administrator actions. Looked at another way the spread of intelligence is (sadly) unrelated to the authority role granted. That there never seems to be a shortage of potential idiot actions does not mean it is a good idea to go down each and every rabbit-hole.
On the other hand the occasional well-reasoned explanation might even result in the next applicant putting their hand up and taking some pressure off the backlog slaves. If that flags me as both bitter and optimistic then just handle it. I have to. 114.78.171.144 22:06, 13 September 2019 (UTC)
@Slowking4: I have suggested above that the ontological discussion might be a better way to approach this issue than the speedy criterion. What are the ontological categories we need to handle, and what tool or structure of those we have available to us would be best to handle each? If we can figure out some guidance on that then what should be kept and what should be deleted will, hopefully, follow naturally. Perhaps you could flesh out your thoughts regarding "subject of" and "depicts" with that in mind? --Xover (talk) 07:25, 13 September 2019 (UTC)
we would need to group together all those works, for which people seem to use categories. we have categories on authors, we could start with a wikidata infobox at author pages. if the community wants portals for subjects, then we will need an infobox and migration from categories to portals. (this is different from how it is done on commons) you could then link on wikidata, and have some query function to aid search, we need some wayfinding to aid search of topics. Slowking4Rama's revenge 11:53, 13 September 2019 (UTC)

Further discussion needed (New speedy deletion criterion for person-based categories)

I am quite a bit concerned about this, and have unarchived it to prevent it lingering on unresolved.

We are now in a situation where the community has voted to implement a criterion for speedy deletion that allows any administrator to delete such matter at their own discretion with no a priori community approval (all admin actions are, of course, subject to a posteriori review by the community), but where at least two long-standing and very experienced contributors have objected to the core issue after the fact, and levelled criticisms at the formalities of the community decision process. Their objections are reasonable ones (in the "reasonable men may disagree" sense), and the criticisms of the process valid.

To make clear the procedural issues, the proposal described the issue as "in the main uncontroversial", which the objections have demonstrated was not entirely accurate, and it was closed after a mere four weeks (two weeks after the last comment), when an objection became apparent after six weeks. Additionally, relating to the core issue, those who disagree feel the examples provided in the proposal do not accurately reflect the criterion as it was implemented. These are all valid complaints and the responsibility for these deficiencies in the procedure falls to me (my apologies).

But, in any case, the core issue remains: we now have a speedy criterion that two very respected and experienced community members have valid and strongly held objections to.

The arguments of those who object are presented above under the "Deletion review" thread. I had hoped that the community would chime in on that discussion such that it would be possible to assess whether the community shares the concerns of those who have objected, or whether they still support the criterion as implemented.

But as that has not happened I would like to directly request that the community chime in to make clear their position on how to handle this.

  • Despite the criticisms, the original community vote was valid and concluded with support, so the default outcome, if no change is mandated here, is that the criterion as written will be implemented. It is currently temporarily suspended as a conservative measure since objections have been raised.
    • In particular, this means that if you do not express an opinion now you will in practical effect be reaffirming the original outcome!
  • Does the community feel that the concerns raised are serious enough to invalidate the previous vote and revert to the status quo ante?
  • Does the community feel we should proceed as per the existing vote and adjust course as necessary at a later date?
  • Alternately, does the community feel we should proceed as previously voted but with specific changes to the wording of the criterion?
    • For example, Hesperian has specifically proposed replacing "Person-based" with "Author-based" in the criterion.
  • Would the community prefer a new proposal, that better explains the issues, be made and a new vote held on that?
  • In essence: do you have any opinion or recommendation on how this disagreement should be handled such that we end up with the issue settled?
    • Not everyone needs to agree with the outcome, but everyone should preferably feel that the outcome was fairly arrived at!

Pinging previous participants in the vote/discussion (but everyone is, of course, encouraged to chime in): Beleg Tâl, Slowking4, EncycloPetey, Jan Kameníček, Nizolan, billinghurst, Hesperian.

This has dragged on unresolved and it's the kind of thing that has the potential to create conflicts and discord down the line so, despite the sheer amount of text and rehashing, please chime in and make your position clear! --Xover (talk) 07:22, 14 October 2019 (UTC)

  • Comment The examples used of the purpose and solutions did not adequately represent the proposal. I don't believe that any long-held page that appears valid at a point in time should be speedy deleted with a change in policy, especially where it is unclear in the proposal that such pages were being incorporated. My understanding of our approach was that we would not build author category listing pages those to go. — billinghurst sDrewth 09:56, 14 October 2019 (UTC)
    @Billinghurst: It is not clear to me from this comment how you would prefer to resolve this issue. Could you make that explicit? --Xover (talk) 06:46, 15 October 2019 (UTC)
    Don't speedy delete long-held pages.

    If you are putting forward a policy change, then identify the pages that are going to be caught by the policy change. Look to use best examples, not examples where we are already in agreement. If you are deleting and you come across long-held pages that you believe are caught by a policy change, and they have not been specifically mentioned, then have the open discussion so that we have a consensus that this is what we were looking to do. Administrators are the implementers of consensus, not the determiners of what happens here, and we should be looking to be considerate. Err on the good side and the patient side. In reality, for many things there is no hurry, despite some of us at some stages just wanting to get things tidied away. billinghurst sDrewth 22:46, 21 October 2019 (UTC)

    @Billinghurst: Apart from the age exemption (which I have addressed above somewhere), this is all good advice and I agree whole-heartedly. But now you're just chiding me. What, specifically, is your preferred way to resolve this issue? Do you want the new speedy criterion rolled back and removed? Do you want its text changed from "Person-based" to "Author-based"? Or are you proposing an entirely different, general, rule that no content older than X time units may ever be deleted under any criterion for speedy deletion?
    Because right now we have an existing, valid, community decision in favour of the new criterion with the "Person-based" meaning, but I am bending over backwards to try to make sure the concerns you and Hesperian have raised are taken into account (giving everyone a chance to change their minds if they are swayed by your concerns).
    If your goal is to censure me for insufficiently researching and documenting the consequences of the new policy, or for failing to insist on a longer period before being closed, then, fine, consider me suitably chastened. But me standing dressed in a white sheet in church on three Sundays isn't really going to change much. So far, of those who originally supported the new criterion, only Jan has chimed in and they reaffirm their original position. If you want a different outcome you need to at least tell us what it is. --Xover (talk) 07:49, 22 October 2019 (UTC)
  • As I said before, I remain Neutral regarding the proposed change from the current "person-based" deletion rationale to the proposed "author-based" rationale. —Beleg Tâl (talk) 15:19, 14 October 2019 (UTC)
    I voted for deletion of person-based categories and I hope that the vote also counts in this way. If somebody wishes only deletion of author-based categories instead, it should be suggested as an alternative rule. I admit it is my fault I did not protest when somebody changed the proposal without others expressing their consent clearly, but still: changing rules needs explicit consent, which is missing here.
    That said, I do not think that the idea of treating author-based categories differently from categories of other people is good.
    • Firstly, this can be a source of big confusion to many readers browsing categories: some people are included in the category tree and others not, and an accidental visitor to Wikisource unfamiliar with our internal rules will not find the clue.
    • Secondly, it is not defined who is considered to be an author by this rule: A person who is the author of a work at Wikisource? A person who is the author of a work eligible to be added to Wikisource? A person who is the author of a work in English or translated into English, although it won't become eligible for WS for decades? Or any person who is the author of whatever in any language, which may but also may not be translated into English in the future? We have some definition in the Style guide which says that "... author ... is any person who has written any text that is included in Wikisource." However, too many contributors refuse to follow this definition and found author pages of people who have no work here, sometimes even authors who have never written anything in English and nothing by them has been translated into English so far (example). I am afraid the same will sooner or later happen with categories.
    • Let's say that we determine some line dividing authors and the rule will say which authors can have categories and which not. The rule could be: authors who have an author page cannot have a category, and vice versa (or any other definition). Again: an accidental visitor browsing categories will be confused, unable to find our internal clue why Alois Rašín can be included in the category tree and Karel Kramář not.
    To conclude it, the best way is the simplest: forbid all person-based categories and organize people only in the author and portal namespaces, or alternatively allow categories for everybody. I am for the first of these two choices. Jan Kameníček (talk) 20:10, 14 October 2019 (UTC)
  • comment, i am concerned about increased use of speedy deletion, that has been abused elsewhere. i would prefer use of maintenance task flows in the open. i do not see a pressing problem. but maybe this is overblown, and the admin task flow here will not be abused. i raised my concern and got dismissed, which is fine with me.
  • what we really need is a consensus about how we structure our data with wikidata. (be it categories, portals or tags) we need a stable page, about work subjects, that can link to wikidata. we have a "works about" section for authors. but we need it for non-authors also. Slowking4Rama's revenge 14:09, 15 October 2019 (UTC)
    Your concerns were not dismissed, some editors merely disagreed with them. But in the interest of clarity, in view of your comment here and your original oppose vote to the proposal, do I understand correctly that your preferred resolution to this issue is to roll back to before this proposal and have no speedy deletion criterion for this at all? --Xover (talk) 14:41, 15 October 2019 (UTC)
yeah, apparently, i have out of consensus views of those who show up for process discussions. i just want some stable bibliographic metadata about "depicted people" and subjects. i am open to how to structure it, and what is the road map to get there. i do not care about rolling back a particular direction that i think is mistaken. (the problem with deletion is that it decreases the slim possibility of quality improvement, since it hides quality defects rather than making them more visible.) Slowking4Rama's revenge 15:43, 15 October 2019 (UTC)
@Slowking4: do you see the value in having a general essay and guidance on how we handle people who are not authors? Then having a range of means to handle these depending on the person's notability, and possibly the number of references/sources that we are having for these people. Some of the solutions will be here at enWS, and others may be at WD. Our policy guidance of 2010 probably needs to evolve with the implementation of Wikidata which is a bigger people resource and allows interactions and linking differently than our 2010 focus on enWP linking to notable people. Here I am thinking something akin to Wikisource:For Wikipedians and it might be something like [[Wikisource:For Wikidatans]] and [[Wikisource:Managing people data at Wikisource]]. — billinghurst sDrewth 22:55, 21 October 2019 (UTC)

Notice that discussion about to be closed

Since this discussion has been stalled since October (and ongoing since July), and no new consensus appears to be likely, I intend to formally close this discussion at some point before the new year as reaffirming the original consensus. If further or new discussions related to this issue are needed I suggest they be brought up in a separate thread, and unless they represent a concrete proposal, that the thread be opened down in the regular discussion section. --Xover (talk) 08:33, 14 December 2019 (UTC)

This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --Xover (talk) 10:49, 29 December 2019 (UTC)

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

Index:Poetry of the Magyars.djvu

The following discussion is closed and will soon be archived:
Resolved.

I have uploaded a corrected source file, but the file correction necessitates some page moves. I have made the more complicated moves, but would appreciate it if someone with a bot can complete the process.

Only pages in the (DjVu) range /48 to /216 need to be moved, and these pages need to be moved one page down. That is:

  • Page:Poetry of the Magyars.djvu/48 --> Page:Poetry of the Magyars.djvu/47
  • Page:Poetry of the Magyars.djvu/49 --> Page:Poetry of the Magyars.djvu/48
  • ...
  • Page:Poetry of the Magyars.djvu/215 --> Page:Poetry of the Magyars.djvu/214
  • Page:Poetry of the Magyars.djvu/216 --> Page:Poetry of the Magyars.djvu/215

All pages outside the range stated above already are in the correct location. I had to do some of those by hand because the original file was missing two pages, in addition to other problems, so not all pages needed to be moved one. --EncycloPetey (talk) 20:05, 26 December 2019 (UTC)
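For anyone preparing a bot run, the move list above can be sketched as follows. This is a minimal illustration only: the helper name `page_moves` is hypothetical, and it merely generates the source and target titles; actually performing the moves with a bot framework is left out. Processing the pages in ascending order matters here, because each target title must be vacated before the next page is moved onto it.

```python
INDEX = "Page:Poetry of the Magyars.djvu"


def page_moves(first=48, last=216, shift=-1):
    """Yield (old_title, new_title) pairs for the requested range.

    Hypothetical helper: it only computes titles and moves nothing.
    Pages are yielded in ascending order so that /47 is filled first,
    which frees /48 for the page coming from /49, and so on.
    """
    for n in range(first, last + 1):
        yield f"{INDEX}/{n}", f"{INDEX}/{n + shift}"


moves = list(page_moves())
print(len(moves), "pages to move")
print(moves[0])
print(moves[-1])
```

A bot operator would feed these pairs to whatever move routine they use, most likely with redirect creation suppressed so that the vacated titles are free for the next move.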

Done (in a minute). Mpaa (talk) 20:43, 26 December 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 10:56, 29 December 2019 (UTC)

Other discussions

How shall I transcribe two books in one?

The following discussion is closed and will soon be archived:
Resolved.

I have started working on a publication of engravings by Wenceslaus Hollar. The book does not contain the year of publication, but HathiTrust states that it was published between 1794 and 1812. The book looks like a reprint of originally two separate books, one published in 1640 and the other in 1643. The problem is that this reprint does not have one title common for both parts.

Can I transcribe the publication as two separate works under their individual titles? Or should I transcribe them as one work and devise some title? I was considering using the first of the titles for the whole publication, but it would be really misleading, as it speaks only about England, while the other part deals with various European countries. --Jan Kameníček (talk) 19:50, 22 November 2019 (UTC)

I'd just transcribe them as two separate works, if there's no overall introduction or anything.--Prosfilaes (talk) 21:04, 23 November 2019 (UTC)
I also think it is the best solution, but I wanted to have it confirmed by somebody else. Thank you very much. --Jan Kameníček (talk) 21:25, 23 November 2019 (UTC)
I, on the other hand, would probably transcribe them as one work and devise some title, like I did with The Holly & the Ivy, and Twelve Articles and Lyra Ecclesiastica. Beleg Tâl (talk) 21:30, 23 November 2019 (UTC)
Hm, simple connection of two titles with "and" could also be a solution. I’ll think about it for a while, thanks as well. --Jan Kameníček (talk) 23:20, 23 November 2019 (UTC)
I don't think there is a clear answer in general; this sort of thing needs a judgement call for each work, and with quite some leeway for individual contributor preference. It also needs to be considered whether the book in question is actually a publication and not merely two works bound together (as was common practice for collectors of all stripes in the 18th and early 19th century). And on this particular book the fact the two works have the same publisher might suggest they are one publication, while the fact both included works have separate colophons suggests they are independent publications bound together. Similarly, there appears to be no front or end matter that is common to both works: they share only the binding. It is hard to be categorical, but I suspect I would have eventually landed on treating these as separate works that had merely been bound together. But I would not have faulted anyone for landing on the opposite.
Incidentally, the publishers, “Laurie & Whittle”, are still around, trading these days as “Imray Laurie Norie & Wilson Ltd”. --Xover (talk) 08:32, 24 November 2019 (UTC)
I know of a number of examples where works more or less related were packed into one binding out of publishing constraints. I think that we should make sure that separate parts are separated out, like they would be in an anthology or magazine, and make them available individually, even if they are under a higher level heading for the complete work.--Prosfilaes (talk) 02:00, 27 November 2019 (UTC)
Which has been done by creating redirects at the root where they have been displayed as subpages. Where they have a set of known publishing components, especially with regard to how they are portrayed at Wikidata, then keeping to the known truth is best. Here the provenance of the work is simply not known, we just know that they shared the same binding.

We know that many of our works were singly published, serially published, and multiply published, so do what makes most sense and maintains the credibility of the publication/work(s). Document it well, either in notes or on the talk page, so that someone can understand what you did when it is looked at in five years' time. — billinghurst sDrewth 04:30, 27 November 2019 (UTC)

Thanks everybody for valuable opinions. I have considered them all and finally decided to keep them together (as the publisher enclosed them in a common binding), but as two separate subpages and with an explanation in the note. I think this solution shows that originally they were separate and at the same time it is faithful to the intention of the reprint’s publisher. --Jan Kameníček (talk) 00:11, 8 December 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 10:57, 29 December 2019 (UTC)

Page deletions

The following discussion is closed and will soon be archived:
Resolved.

A quick speedy: Template:Ws diclist smallcaps.css, but I can't tag the page as such. This was created in error. ShakespeareFan00 (talk) 19:07, 9 December 2019 (UTC)

@ShakespeareFan00: Done --Xover (talk) 19:30, 9 December 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 10:59, 29 December 2019 (UTC)

Page numbers not displayed

The following discussion is closed and will soon be archived:
Resolved.

Does anybody know why the page numbers of "Russian Government Links To And Contacts With The Trump Campaign" and of other subpages of the work are not displayed? --Jan Kameníček (talk) 23:26, 17 December 2019 (UTC)

It could be because some of the "page numbers" are 15 characters long or longer. Page numbers typically should be 4 or 5 characters max. --EncycloPetey (talk) 23:29, 17 December 2019 (UTC)
That was it. I reworked the pagelist and it helped. Thanks. --Jan Kameníček (talk) 00:23, 18 December 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 11:00, 29 December 2019 (UTC)

Index:Old-folks.jpg

The following discussion is closed and will soon be archived:
Resolved.

Just doing a random page validation and I find that I am unable to validate the Index:Old-folks.jpg page. I don't receive the validated button when attempting to save it. Other pages on other works do show the validated button. Is there a problem with this particular page? Sp1nd01 (talk) 10:20, 18 December 2019 (UTC)

@Sp1nd01: For some reason, a previous edit had managed to remove the username from the (invisible) noinclude section where the page status is stored. I've edited the page (set it to problematic and then back to proofread) so that my username got inserted there, so now you should be able to set it to validated. --Xover (talk) 11:04, 18 December 2019 (UTC)
@Xover: Thank you, that has now worked as expected. Sp1nd01 (talk) 13:31, 18 December 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 11:01, 29 December 2019 (UTC)

Weird Tales vol. no. 1 scan

The following discussion is closed and will soon be archived:
Resolved.

There are two scans, a .pdf and a .djvu, with the same naming scheme. Practice (for other scans) has been to use .djvu files, but there are generally more .pdf (non-.djvu) files available (if I am not mistaken). Could the proper scan be determined, and the data from the improper scan be transferred? TE(æ)A,ea. (talk) 19:48, 18 December 2019 (UTC).

DjVu scans are all-around easier to use on Wikisource. PDFs are easier to make, but have more technical problems. --EncycloPetey (talk) 19:50, 18 December 2019 (UTC)
The DjVu was a replacement for the PDF, and the pages have been moved to it.--Prosfilaes (talk) 06:25, 19 December 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 11:02, 29 December 2019 (UTC)

OCR: Enable the Google-based version, until Phe's Tesseract version is operational?

I have recently read the discussion about broken OCR in some detail. The most recent comments point out that it would be up to English Wikisource to enable a (temporary) replacement until the traditional OCR tool is (hopefully) back in working order, or replaced with a better version.

Since we have a reasonably functional option based on Google's OCR, is there any good reason not to enable that by default, pending a more ideal outcome? Pinging some users involved in the discussion: @Xover, @Koavf, @Tpt, @Ineuw, @Jdforrester (WMF), @AKlapper (WMF), @Jan.Kamenicek: -Pete (talk) 21:19, 19 December 2019 (UTC)

I have the Google OCR button in my gadgets, but my experience is that its output is so bad that I do not use it, so enabling it by default does not solve anything for me. However, I understand that for some people (especially the new ones) it may be better than nothing. The main thing I am afraid of is that once we get a "reasonably functional" tool, we will never get a well functional one.

--Jan Kameníček (talk) 23:40, 19 December 2019 (UTC)

Yes, my experience is similar. But it depends on the text -- for some texts, it does a pretty nice job. IMO it's more important to have something for new users than nothing, when it comes to OCR. For the reasons you describe, I'm sure that Wikisource users would continue to advocate for something more functional regardless of whether or not the Google one is enabled, so I do not share your concern in this instance. -Pete (talk) 23:52, 19 December 2019 (UTC)
Like Jan, I have had little success with Google's OCR tool. I usually find that it's easier to type it by hand when the OCR tool isn't working. But that is in part because I work heavily with: (a) Plays or poetry, where the formatting, capitalization, and punctuation do not follow standard sentence patterns. (b) Works with footnotes, which are in a different size and format, and therefore cause the OCR to bork. (c) Works that contain bits of text in other languages, which never come out right. (d) Works that contain special diacritical marks. (e) Works that contain unusual archaic typography, such as special characters for "ct", or italicized script that the OCR can't handle. If you're working on a text that consists primarily of standard sentences and paragraphs, without italics or any special characters, and without archaic spellings or archaic typography, that is entirely in English, then Google's OCR might be useful. But for me, it isn't. --EncycloPetey (talk) 00:47, 20 December 2019 (UTC)
Something is better than nothing and there is no traction at Phab. I think the OCR tool that I am using now works fine. —Justin (koavf)TCM 00:50, 20 December 2019 (UTC)
Support If this is a proposal, make Google OCR the default since that is the only working OCR. All this means is that it will be listed on the Gadgets page under Editing tools for Page: namespace instead of Development. — Ineuw (talk) 05:39, 20 December 2019 (UTC)
Support Why not? --Xover (talk) 06:26, 20 December 2019 (UTC)
PS. Aklapper and Jdforrester are just processing and trying to manage all Phabricator tasks (there're a couple of thousand open tasks, iirc, all told). Neither one of them will have any particular opinion on this issue, or the specific Phab regarding Phe's OCR, so there's no need to ping them here. --Xover (talk) 06:43, 20 December 2019 (UTC)
Note that the privacy policy requires informed consent for users before sending their data to non-Wikimedia services, which includes Cloud Services (like the proxy for this tool). The gadget as-is is in violation of the privacy policy and should be fixed to add a modal consent form (immediately, and definitely before this is enabled for users by default). Jdforrester (WMF) (talk) 08:28, 20 December 2019 (UTC)
@Samwilson: ^^^ FYI. I'm trying to read up / do some digging on this to try to figure out what wriggle room there is and / or the broader impact on other gadgets. --Xover (talk) 16:58, 20 December 2019 (UTC)
@Jdforrester (WMF): Why is a proxy hosted and run by the WMF considered a "non-Wikimedia service"? Which part of the privacy policy deals with this? Kaldari (talk) 19:43, 20 December 2019 (UTC)
I guess you could argue that the API is non-Wikimedia. I'd still like to know what the actual wording of the policy is that relates to this, though. Kaldari (talk) 19:55, 20 December 2019 (UTC)
i’ll believe in the "privacy policy" scruples, when i see them implemented in the m:IP Editing: Privacy Enhancement and Abuse Mitigation. until then, editors should expect to be constantly surveilled across all projects. Slowking4Rama's revenge 16:55, 21 December 2019 (UTC)
Support provided that we first implement the privacy form mentioned by Jdforrester above. -Pete (talk) 17:50, 20 December 2019 (UTC)
@Peteforsyth: Note that that privacy policy issue is purely a formal requirement thing in this particular instance. There is no information that would normally be considered privacy sensitive being transmitted anywhere for this case.
When you hit the OCR button (and only when you actively press the button), the gadget sends the language code of the project (i.e. "en" here on enWS) and the URL of the scanned page image to the Toolserver. The Toolserver doesn't see your IP address because the request passes through a proxy server (managed by the WMF like the wikis). The OCR tool on Toolserver fetches the scanned page image and passes it and the language code to Google's Vision API (all Google sees is the scan image, the language code, and the IP of the Toolserver; your browser never communicates with Google directly). The Google API then returns the extracted text, which the tool on Toolserver returns to your web browser, and which the gadget code then inserts into the text field for editing.
And just to rub salt in the wound, the Google OCR tool/gadget was, AIUI, developed by the WMF Community Tech team; meaning that not only is no actually sensitive data being transmitted, but every component involved that might conceivably be an attack vector is actually under the WMF's direct control.
That said, the privacy policy is not optional and not subject to per-project policies, so we'll have to figure out some way to make this work within those requirements. I'm just not sure how the heck to do that just yet (there is no standard facility for displaying such a prompt, and ditto for saving that choice for next time; showing a confirmation dialog for every single page is… not even an option). --Xover (talk) 18:42, 20 December 2019 (UTC)
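The data flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the gadget's or the tool's actual code: every name here (`OcrRequest`, `gadget_build_request`, `tool_handle`, and the two stand-in functions) is hypothetical, and the image fetch and the OCR call are replaced by local stubs so that nothing is sent over the network. The point it tries to show is that the only things leaving the browser are the language code and the scan URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrRequest:
    """Everything the gadget is described as sending: two fields, no user data."""
    lang: str       # project language code, e.g. "en" on enWS
    image_url: str  # URL of the scanned page image


def gadget_build_request(lang, image_url):
    # Hypothetical gadget side: runs only when the OCR button is pressed.
    return OcrRequest(lang=lang, image_url=image_url)


def tool_handle(request, fetch_image, ocr_backend):
    # Hypothetical tool side: fetch the scan, pass it and the language
    # code to the OCR backend, and hand the extracted text back.
    image = fetch_image(request.image_url)
    return ocr_backend(image, request.lang)


# Local stand-ins (assumptions) so the sketch runs without any network access.
def fake_fetch(url):
    return b"scan bytes for " + url.encode("utf-8")


def fake_ocr(image, lang):
    return f"[{lang}] text extracted from {len(image)} bytes"


req = gadget_build_request("en", "https://example.org/scan.jpg")
text = tool_handle(req, fake_fetch, fake_ocr)
print(text)
```

In this shape the browser only ever talks to the tool, and the OCR backend only ever sees what the tool passes on, which matches the description that the user's browser never communicates with Google directly.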
Makes sense, and thanks for the explanation. My "condition" should not be interpreted too strictly; I of course defer to those more knowledgeable than myself about the proper way to handle this. -Pete (talk) 20:14, 20 December 2019 (UTC)
Google OCR is excellent at reproducing accented Latin characters for my projects about Mexico. I also used the OCR on French Wikisource and it also works very well. It seemed to me that it is also Phe's OCR tool. I was hoping to figure out how I can link to it in my vector.js. I asked this in the French Scriptorium but received no reply. Perhaps someone here can figure it out and let us know? — Ineuw (talk) 10:19, 12 January 2020 (UTC)

Proposing move of pages The Works of Charles Dickens/Volume 1

Cakebot1 has been working on Index:Works of Charles Dickens, ed. Lang - Volume 1.djvu and transcluding as subpages of the "The Works of ..." To me this is not a good place for these works to be, and we touched on this conversation above at #Main namespace works; portal works and tendency to encyclopaedic components or listings. The titles that we use should be representing the works. As we already have The Pickwick Papers we need to determine a number of things.

  1. Whether we prefer to have 32 volumes of works reproduced as subpages (with all the inherent relative link resolutions), or a work to be a rootpage
  2. If the above determines that it is a rootpage, then how we would name works from volumes, when they will also have been previously published
  3. Moving the existing work, as it becomes the disambiguation.

My preference is to move and present the work as individual works, and to create a portal page to support the series as re-published. I would like to get this addressed early, prior to the contributor getting well into the work, and before we get further volumes popping up. [Also noting that we will need to fix the chapter numbering to our style]. — billinghurst sDrewth 00:23, 22 December 2019 (UTC)

This was published as a single collection under a uniform title. Moving these to the main namespace as separate items would break the implicit connection of the series. We'd have to disambiguate these copies from other copies if they move to the main namespace, which would create even more work. We already host "works of..." sets for several other authors this way, so I see no problem with hosting it as is, under the common title with subpages. --EncycloPetey (talk) 00:37, 22 December 2019 (UTC)
We have to disambiguate either way. Please tell me how "Works of ..." is useful, and how it represents the work of the author? It seems more to reflect the later work of a publisher, and just a series build, nothing more. That we made that choice previously could just be an indication of a poor choice, not evidence of how we should do things. If we are to keep it under "Works of", then "Volume 1" is not helpful. — billinghurst sDrewth 03:10, 22 December 2019 (UTC)
Special:PrefixIndex/Works of displays our "Works of", about 50 pages, 25 seem to be redirects. — billinghurst sDrewth 03:14, 22 December 2019 (UTC)
I think you're confused here. This is not a work by Dickens; it is a work by Andrew Lang that includes works by Dickens. Lang has provided selection, ordering, editing, formatting and layout, introductions and prefatory matter, and indices. Extracting the parts of this work that are derived from Dickens' and placing it out of the context in which it was published would be misleading, and would be a disservice to, e.g., those who wish to compare Lang's selection or editing with that of earlier or later editions. The included Dickens works are obviously also editions of independent works, and so should appear on versions pages for those works, but that in no way affects the status of Lang's work. --Xover (talk) 08:55, 22 December 2019 (UTC)
I agree with Xover here. Things should generally be organized as they were published, and for most authors with Works collections, there are different variations to worry about, making it more important.--Prosfilaes (talk) 15:43, 22 December 2019 (UTC)
billinghurst, not all of our "works of" publication sets begin with those exact words, or even have the word "works" in their title, so your count heavily underestimates the number of such items we currently have. We have titles like The complete poetical works and letters of John Keats; Poetical Works of John Oldham; Victor Hugo's Works (Guernsey Edition); The Writings of Henry David Thoreau; The Plays of Euripides; or Masterpieces of Greek Literature. Not to mention the many magazines, newspapers, journals, and periodicals that are set up just like this. --EncycloPetey (talk) 21:13, 22 December 2019 (UTC)
Strongly agree with Xover and the others: compilation and anthology works are still works per se and should be preserved as such. Versions pages and redirects are the correct way to connect top level mainspace titles with editions that appear in such collections. Also, I've spent a lot of time consolidating loose works into their appropriate collections, so I have a personal vested interest here also. —Beleg Tâl (talk) 15:42, 23 December 2019 (UTC)

This seems to me a case similar to The Novels and Stories of Henry James: many works published separately over a period of time, but also claimed to be volumes of a set. I did The Novels and Tales of Henry James (n.b. Tales not Stories) in the form The Novels and Tales of Henry James/Volume 2/The American/Chapter 1, and it is still that way now, but I have come to the view that it isn't right and doesn't work. For The Novels and Stories of Henry James, I named each work as a distinct publication (which it is), e.g. Confidence (London: Macmillan & Co., 1921), and simply link them from the works page The Novels and Stories of Henry James. I commend this method to you as the cleaner, and more in accordance with "things should generally be organized as they were published". Hesperian 01:35, 23 December 2019 (UTC)

It's easy to do that for novel-sized works. What about New Hampshire (Frost)? Should each and every poem be in main space? Does that mean that if we do the Atlantic Monthly, a periodical where at least one of those poems were first published, that everything in it should be in main space? I can see the argument that a series of novels should be broken out by title, but I think it gets real messy in the case of anthologies and periodicals, especially when we're talking about excerpts of longer works and intros that cover multiple works.--Prosfilaes (talk) 11:36, 23 December 2019 (UTC)
No, of course not. New Hampshire is a single published work. These sets are not single published works. They are multiple individually published works that the publisher has declared to be volumes in a set. That declaration doesn't change the fact of them having been published individually. If each poem in New Hampshire had been published and printed and bound and sold separately, and subsequently declared by the publisher as aggregating into a single work, then, and only then, would I say that each and every poem should be in mainspace. Hesperian 06:10, 3 January 2020 (UTC)
I don't know that these sets are not single published works; I know that they're separately bound works. Not all the works in The Novels and Tales of Henry James were published separately; presumably many of the short stories were never so published, and the fact that the various collections of stories don't overlap means that they are functionally volumes in a set. Looking at volumes 10-18 of The Novels and Tales, we're going to want those volumes as separate volumes. In practice, I don't think your proposal is much different from saying novel-sized works, besides making an arbitrary historical distinction based on which works happened to have standalone publication and which were originally published in periodicals.--Prosfilaes (talk) 07:50, 3 January 2020 (UTC)
I am not saying that individual works, e.g. single poems, should be presented here separately: no way. I am saying that the books in this set were separately prepared, printed, bound, issued on distinct dates and made available for sale individually (and ultimately will fall into the public domain separately over several years). They are distinct publications, in the literal sense of 'publication': made available to the public via the shelves of the local bookshop. Such distinct publications should be presented here as individual texts, not as subpages of a larger set, where that set is not a publication in that literal sense. Hesperian 23:44, 5 January 2020 (UTC)
I guess I'm saying that The Novels and Tales of Henry James exists only as a concept created by the publisher, not as a literal publication. In which case it is also relevant to note that the publisher Bernhardt Tauchnitz had a 'set' entitled Collections of British Authors that was added to for over 100 years, and ended up with 5370 volumes. Hesperian 23:56, 5 January 2020 (UTC)
But the only name for volume 16 of The Novels and Tales of Henry James is The Novels and Tales of Henry James, volume 16. You can say the same thing about periodicals, which are much more likely to end up with 5,370 volumes.--Prosfilaes (talk) 05:22, 6 January 2020 (UTC)
No indeed. The half-title page reads "The Novels and Tales of Henry James / Volume 16", but the full title page reads "The Author of Beltraffio, The Middle Years, Greville Fane, and Other Tales", and it is commonly indexed under that title.[1] I can't find a good scan of Volume 16, but here's Volume 18. Hesperian 23:57, 6 January 2020 (UTC)
True; which may not be the case for other works. I will note that "Famous Story and other works" can drive bibliographers and collectors nuts, as it can frequently refer to many distinct collections by many publishers.--Prosfilaes (talk) 03:02, 7 January 2020 (UTC)
But it's not so easy to break the novels out in this case, either. It would still be messy. Volume one of the Dickens collection is not the novel; it's volume one of the set as well as volume one of the novel, and the novel exists across more than one volume. --EncycloPetey (talk) 17:25, 23 December 2019 (UTC)

Empty categories[edit]

I would like to ask about some currently empty categories (see below). Do they have any use, or can they be deleted?

  • Category:Pages containing image
    There is no explanation of what the category is meant for. There are thousands of pages containing images in Wikisource, but none of them has been categorized here so far. If there is a reason for this category's existence, it should be easy to populate it by some bot. If there is not, I suggest deleting it.
  • Category:Pages containing errors
    The category description says: "These pages of non-fiction contain some error on them.", without specifying what sort of errors is meant (spelling+grammar, factual errors by the author, factual errors caused by incomplete human knowledge at the time the work was written…). It is also not clear how it should be populated (by SIC templates or manually?). I suggest deleting it.
  • Category:Texts without page numbers
    It is not clear which texts should come here: texts whose original publications are numbered but these numbers are not mirrored at Wikisource (which is true for most texts here which are not backed by scans) or texts whose original publications are not numbered? If there is a reason for the existence of this category, then once its aim is clarified it can be at least partially populated by a bot. If there is no such reason, I suggest deleting it too. --Jan Kameníček (talk) 22:34, 27 December 2019 (UTC)
see also User:Hesperian "Decline. IMO, in a community this small, nothing created in good faith by a regular should be speedily deleted. This should be taken to WS:PD" and User:Billinghurst, User:John Vandenberg, User:Cygnis insignis. -- Slowking4Rama's revenge 00:23, 30 December 2019 (UTC)
They are labelled maintenance/tracking categories so they will presumably have templates that will populate them when something is incorrect in their use. So suggest leave them as they are doing no harm, though I cannot remember what they do. Adding some commentary to them is probably of value. If you are seeing them where you should not be seeing them, then plug in {{maintenance category}}. (Hopefully we are better at labelling, and use of <includeonly> these days.) — billinghurst sDrewth 07:08, 30 December 2019 (UTC)
They might be supposed to be filled by some templates, but one of the possible reasons why they are empty is that the templates do not exist anymore. If their purpose makes them worth keeping, there should be some way to find it out and add it to the categories' talk pages or somewhere. --Jan Kameníček (talk) 19:19, 30 December 2019 (UTC)
Fully agree that overt labelling and documentation is the way to go. Doesn't change my initial comment. As a community we misused <includeonly> simply for the sake of neatness. — billinghurst sDrewth 01:27, 31 December 2019 (UTC)

Checking page style for court cases[edit]

I recently created a page for Valvoline Oil Co. v. Havoline Oil Co., using the existing page, Universal City Studios, Inc. v. Reimerdes, as my source for wiki formatting.

I would like to know if there is anything I should change in the style of any future works I may add; if there's any formatting I shouldn't have included or anything I left out.

Qwertygiy (talk) 19:27, 29 December 2019 (UTC)

A few items: (1) I see no link to the original source of the text copy. A link to the source of the text copy should appear in the header, or on the item's Talk page. (2) You've overlinked. There is no reason to link to Wikipedia articles like "Magazine", "Advertising", or "New York". (3) The judge who authored the decision should be identified in the header or header notes.
Also, you can center an image without using the template; I've done this for the two images. --EncycloPetey (talk) 19:43, 29 December 2019 (UTC)

Comment @Qwertygiy: Agree with EncycloPetey's comments, see Wikisource:Wikilinks. Maximise internal links; keep external links minimal, using them only where they add true value and are unambiguous. So we would do either author = or contributor = for whoever wrote a judgement or wrote an opinion. We would normally do local author links in the body of the work for the judges cited and create relevant author pages.

Some questions and comments

  1. Are the references yours, or were they in the original document? If yours, then they should be moved to the talk page, and use the edition parameter, and a note to point to them. We try to present clean documents, not annotations.
  2. We would normally put the case into WD; if there is an article for the case at enWP, they can share the same item for case law, and this would provide the interwiki links.
  3. At some point we would/should create Portal:United States District Court for the Southern District of New York. Portal creation is organic and depends on how many other works there are; sometimes we even consider an anchored redirect to a subsection of a parent portal page, to make it easy to break it out at a later stage.

billinghurst sDrewth 08:54, 30 December 2019 (UTC)

  1. In regards to the references, they were all included verbatim in the source text (in parentheses) or were referencing earlier such citations as supra. Any citations that were integrated into the text rather than thusly isolated, I left in place and merely added links. My reasoning was that such a citation serves the same purpose whether in parentheses or in a footnote reference; the former was easier to create on a 1910s typewriter while the latter is easier to read on a 2010s webpage.
  2. In regards to Wikidata, I'll take a look at the procedures for that. I'm not very familiar with it yet, most of my contributions being solely at enWP.
  3. In regards to the portal, creating one that is a redirect to the subsection of the US case law portal seems like the best idea at the moment, since the half-dozen works added thus far seems a little too small and specific to justify having its own portal, but there are many thousands more that exist and just aren't added as of yet.
  4. In regards to the link to source, I left it in the original commit message; I'll add it to the talk page.
Qwertygiy (talk) 21:33, 30 December 2019 (UTC)

Versions and Wikidata problem[edit]

Version pages[edit]

In discussion with another editor, I've discovered that the information on Wikisource:Versions does not align with current practice.

Versions pages
Different versions of the same work are listed on "versions pages." Such pages are only for different versions of substantively the same work. Different works should not be listed together on the same versions page, even if they have the same title and/or author; they should be listed on disambiguation pages. This applies even to works that are reviews or analysis of the work. For example, Charles Lamb's prose retelling of Shakespeare's Romeo and Juliet (Shakespeare) is a version of that work, and belongs on a versions page with it. The entry entitled "Romeo and Juliet" in The New Student's Reference Work is a work about, rather than a version of, Shakespeare's play, and therefore should not be included on a versions page. (Works that share the same title are listed on disambiguation pages; works that share the same subject are listed on portal pages.)

Wikisource:Versions#Versions pages


The key section is: "Charles Lamb's prose retelling of Shakespeare's Romeo and Juliet (Shakespeare) is a version of that work, and belongs on a versions page with it."

Does this mean that movie scripts, operas, retellings, children's adaptations, etc. belong intermixed on the same versions page? And who is considered the "Author" on such a versions page, when each item would actually have a different person who wrote it?

There is an additional problem now that we are connecting to Wikidata. Romeo and Juliet (Shakespeare) is linked to Wikidata item d:Q83186, which is specifically for the play written by William Shakespeare. The retelling by Charles and Mary Lamb has a separate data item, because the author and publication information are different. If we are to treat versions pages as currently described at Wikisource:Versions (quoted above), then we must remove the link from Wikidata, because our content on that page does not match the Wikidata item. We would need to create a new kind of page that lists only editions of the work itself, separate from the versions/retellings/adaptations.

The problem goes yet deeper. If you do not see the issue at play here, look at Macbeth (Shakespeare) and Macbeth. The page Macbeth (Shakespeare) currently lists only editions of the play itself, not the retelling by Charles and Mary Lamb, nor the opera adaptation by Verdi. The page is already crowded with editions, and there are many more besides that are not yet listed, because it is a Shakespeare play. The disambiguation page Macbeth lists the other items by other authors. And note that we currently have three editions of Charles and Mary Lamb's Tales from Shakspeare retellings, in various stages of transcription. Will all of these editions be listed on the same page as all the editions of the Shakespeare play? If so, all editions of Verdi's opera and of any other editions of any derivative works would also all appear mixed together on the same page. Is this desirable?

The current wording of Versions is no doubt the result of an earlier, simpler time when Wikisource did not have many editions of the same work, and did not have to concern itself with the possibility of multiple editions of the same work, nor multiple editions of derivative works. I propose we reverse the current wording so that Versions pages explicitly do not include adaptations or retellings by other authors. --EncycloPetey (talk) 18:43, 30 December 2019 (UTC)

I was thinking about the same problem when I was dealing with some folk tales that were retold by various authors.
The problem might be solved if we had two kinds of version pages: versions of work and versions of story. --Jan Kameníček (talk) 19:07, 30 December 2019 (UTC)
The Italian Wikisource has adopted a separate "Opera:" (Work:) namespace for items that are the same work, but different editions. We could do the same. Having two different kinds of Versions pages would get messy anyway, however we tried to do it. If we opened a new namespace for the items that are the same work/author, that would free up Versions pages to treat items that are the same general story, but with different authors/wording. --EncycloPetey (talk) 19:12, 30 December 2019 (UTC)
What about translations? Does it mean that the new namespace would also host original works and the translation pages would become redundant? --Jan Kameníček (talk) 19:26, 30 December 2019 (UTC)
The pages which list Translations could be rolled into the Work: namespace. They would merely need to accommodate the information about the original language title. But yes, if we decided to go that way, it would mean that we would not need a separate set of Translations pages. The only difference right now between a Versions page and a Translations page is whether or not the original language of the work was English. And we have some marginal cases already which sit astride the two, such as Beowulf, which was written in Old English, so its page lists the Old English editions as well as translations into Modern English. With a separate Work: namespace, we wouldn't have that problem. --EncycloPetey (talk) 19:31, 30 December 2019 (UTC)
I don't really see the problem. Yes, pages like that need to have additional internal structure. But those pages are far and few between. Of course the novel version of "And Then There Were None" and the play version should be on the same version page.--Prosfilaes (talk) 19:46, 30 December 2019 (UTC)
You haven't stated any reasoning, and no, they are not few and far between, and it is a growing problem. Why should works written by different authors appear as "versions" on the same Versions page, instead of on a disambiguation page? Why should a ten page prose summary of the story of Macbeth appear on the same page as a 150 page play with stage directions, when the two have different authors and completely different text? Why not group them instead on a disambiguation page? --EncycloPetey (talk) 20:11, 30 December 2019 (UTC)
Shakespeare's w:Romeo and Juliet (c. 1590–1595) is based on Arthur Brooke's 1562 narrative poem "The Tragical History of Romeus and Juliet" and William Painter's 1567 collection of Italian tales which included a version in prose named "The goodly History of the true and constant love of Romeo and Juliett"; Brooke's version was a translation into English of Pierre Boaistuau's 1559 French version; which was in turn a translation of Matteo Bandello's (c. 1531–1545) Giulietta e Romeo; Bandello based his version on Luigi da Porto's c. 1524 Giulietta e Romeo; da Porto based his version on the c. 1476 Mariotto and Gianozza by Masuccio Salernitano, who draws on Dante's Divina Commedia (in canto six of Purgatorio), the Ephesiaca of Xenophon (c. 3rd century), and Pyramus and Thisbe from Ovid.
In the other direction there were the 16th/17th-century quarto and folio editions that may be said to be roughly the author's original editions; followed by something on the order of 25–30 main distinct editions up through the 19th century (starting with Nicholas Rowe, Alexander Pope, and Lewis Theobald in the first half of the 18th century; through the great Tonson editions edited by Samuel Johnson, George Steevens, and Isaac Reed; the 1790 and 1821 Malone editions; John Boydell's copiously illustrated edition; and up to the famous Cambridge/Globe and Arden editions). All of them aim to get at the "true" Shakespeare and seek to substitute their judgement for that of previous editors, leading to wildly differing results (not to mention Way Too Much Drama™ for historiography). And then we have things like Charles and Mary Lamb who retell the plays in prose at a level aimed at children, and Thomas Bowdler who expurgated the plays to be fit "for 19th-century women and children". And in contemporary versions we have modern spelling editions, manga versions, etc.
And then you get into adaptations: w:List of films based on Romeo and Juliet lists 150+ TV and movie adaptations alone. There are 8 ballets, 9 operas, 5 musicals, and 3 main compositions of classical music.
If we put everything on a versions page, Romeo and Juliet (Shakespeare) would have more than a thousand entries, more than half of them bearing little actual resemblance to the play that William Shakespeare wrote. We need to draw some lines somewhere: Shakespeare's works are extreme examples that help find those points in a way that Christie's paltry two versions do not. But the problem is general. --Xover (talk) 20:46, 30 December 2019 (UTC)
And the vast majority of works that have a version page will have two versions on there. Most books were never reprinted. Few were ever made into plays or came out in significantly different editions. A page for one of Shakespeare's plays can afford to be the exception. Moreover, "would have" is begging trouble from the future. Why not worry about what we have, instead of what we might have?--Prosfilaes (talk) 00:57, 31 December 2019 (UTC)
See Hymn and The Raven.--RaboKarbakian (talk) 20:35, 30 December 2019 (UTC)
Those are Disambiguation pages, which are a separate concern. They are not relevant to the current discussion. --EncycloPetey (talk) 20:48, 30 December 2019 (UTC)
There is a huge amount of variation in this kind of problem. In some cases it is pretty reasonable to consider the two works to be versions of the same work (e.g. the Hebrew and Greek versions of Esther). In some cases it's pretty reasonable to consider the two works to be completely different works sharing only a common underlying theme (e.g. the Routhier and Weir versions of O Canada, which should probably be converted to a disambig page). In the case of adaptations, for example La Fontaine's adaptation of The Tortoise and the Hare, or Seidenbusch's adaptation of Salve Regina, or the Lambs' prose adaptation of Macbeth, I would still consider them to be "versions" of the original work, and would list them on the original work's Versions page. However, I would note that they are adaptations rather than original versions (or direct translations), and generally would put them in a separate section of the Versions page. If the adaptation is significantly different from the original, I would also list it on a disambiguation page. —Beleg Tâl (talk) 20:51, 30 December 2019 (UTC)
(I might, however, give the adaptations their own versions page, and just link to them from the main versions page) —Beleg Tâl (talk) 20:54, 30 December 2019 (UTC)
Oh, here's a good example of what I mean: Alice's Adventures in Wonderland —Beleg Tâl (talk) 20:56, 30 December 2019 (UTC)
How would you feel about the proposal to create a new Work: namespace to solve the issue? --EncycloPetey (talk) 20:58, 30 December 2019 (UTC)
I must admit that I don't really see how a Work: namespace would solve the issue. The same problem you see currently on Versions pages will persist on Work: pages. The idea of using Versions pages for versions-of-story already exists in Portal space (e.g. Portal:Cinderella). We would just be moving the problem around, not addressing it. —Beleg Tâl (talk) 21:06, 30 December 2019 (UTC)
Furthermore: even if we tried to formalize a separate structure for version-of-work and version-of-story, this completely falls apart for most traditional folk stories and songs where there is no real difference between the two and every single edition is wildly different. How many version-of-work pages would you use for Tam Lin? How many version-of-work pages would you use for The Elfin Knight? —Beleg Tâl (talk) 21:11, 30 December 2019 (UTC)
Not to mention that folk stories often have closely parallel versions in different languages, so then you could have some that are "original tellings" in English alongside some that are translated from another language. They'd end up on different pages if there are separate Version and Translation pages. I wish there were enough Wikisource editors to do a Folk Stories Project and sort this stuff out into Portal-style pages instead. Version pages could be reserved for works that are author-associated, a work of that author particularly. E.G. HC Andersen’s "Tinderbox" is a retelling of an old tale but it is universally known as Andersen's "Tinderbox," thus it deserves its own versions page (or rather translations page in English). Levana Taylor (talk) 01:45, 31 December 2019 (UTC)
Right now, there is a broad scope in the problem, as you have noted, on the one hand between items that are the same work, and on the other items that clearly are not the same work. These disparate items may or may not appear on a Versions page at the whim of an editor. A Work: item would always be for a specific work, making a clear distinction between the two types of listings. The Work: namespace would also include translations. Right now, Esther (Bible) is a Translations page, only because the original was not written in English, even though they are simply different Versions of the same work. If we consider that the "original" Romeo and Juliet story was not in English, then "Romeo and Juliet" would need to become a Translations page, and this would be true of many of Shakespeare's plays, as his plays were not the first tellings of the stories. A Work: namespace would absorb all the Translations pages and lists of editions of the same work, and thus draw a clear divide between listings of the same work (Work:) and related works that are derived from each other, which would then be the focus of Versions pages. Right now, we draw a divide between works originally in English and works not originally in English. The proposal would shift that division to editions of the same work versus versions of similar works. The concern about wildly different editions is already a problem. There are two entirely different early editions of Shakespeare's King Lear, and some modern prints of Shakespeare's works include both for completeness. Despite the differences, they are clearly supposed to be the same work by the same author, whereas a retelling by Charles Lamb for Tales from Shakspeare is clearly not an edition of the same work, but is a related story. Works like The Elfin Knight are no different than copies of ancient works, where different manuscripts preserve or lose different passages. --EncycloPetey (talk) 21:21, 30 December 2019 (UTC)
Okay, now I'm just confused about what is being proposed. The suggestion that translations should be merged into Work space, even though translations are the ur-example of derived works by different authors, further reinforces to me that there is no such thing as a clear distinction between "the same work" and "not the same work". —Beleg Tâl (talk) 22:02, 30 December 2019 (UTC)
Translations preserve the content, even though the language has changed. A translation can be placed side-by-side with the original, aligning the texts. We have a translation namespace that does just that for multiple texts, from books of the Bible to poetry by Catullus. In contrast, a retelling by Charles Lamb will bear little resemblance to the parent text, even though it might be written in the same language. In a library catalog, translations of a work are still considered to be copies of the same work; only the language has changed. Whether you're reading Dante's Inferno in medieval Italian, modern Italian, English, or Chinese, it's still Dante's Inferno, and would be catalogued with Dante as the author. Retellings and adaptations will be catalogued under different authors.
To give a modern example: If I found a German "translation" of Stephenie Meyer's Twilight, I would expect it to have the same story, same characters, and same plot as the original; the same number of chapters, the same everything except the language. The novel Fifty Shades of Grey is a retelling of Twilight (originally as fan fiction) by a different author, of the same basic story. In the retelling, however, the setting is different, the character names are different, and there are no supernatural elements. It is a complete retelling. So, the translation bears more in common with its source text than a retelling. Under our current Versions page structure, both novels would be listed on the same Versions page because they derive one from the other. But an English translation from a German copy would be placed on a separate page, simply because it is a translation.
Translations in library catalogs and on Wikidata are treated simply as editions of the source text, with only a data code indicating that the language has changed. Retellings and derivative works are treated completely separately. The proposal would thus align Wikisource practices with both library databases and Wikidata structure. --EncycloPetey (talk) 22:56, 30 December 2019 (UTC)
Your view of a translation is idealistic. In reality, translations can (often silently) take huge liberties with their underlying work, frequently amounting to paraphrases or condensations.--Prosfilaes (talk) 01:05, 31 December 2019 (UTC)
I am aware of translational difficulties, but idealistic or not, it's the view taken at Wikidata and by library catalogs. The same disparity can be found in some editions of works, where spellings, vocabulary, punctuation, and more can be altered by editors. Compare the two "first editions" of Moby-Dick, which made different sets of corrections requested by the author; or the US and UK editions of A Clockwork Orange, for which the US publisher decided to omit the final chapter. Nevertheless, both the UK and US editions of Moby-Dick are considered to be the same work, as are the US and UK editions of A Clockwork Orange. But my point is that other works based on those novels, written by other authors, ought not to be considered the same work as the original. Currently, we make no such distinction. --EncycloPetey (talk) 01:23, 31 December 2019 (UTC)

Comment Not certain that I wish to wade through that conversation. Keeping it simple. Fixing pages here is all that seems needed.

  • Our versions are for works by the author
    main namespace pages, can link to WD item for enWP article
    If we have versions pages that are out of scope, then fix them.
  • Disambiguation pages are for works of same/similar name by various authors.
    main namespace pages, that link to WD items for disambiguation
    Charles Lamb's retelling is not a version, it is a derivative work and to be disambiguated, and it can have notes that put it in context to the original work. Contributors who have morphed these pages should be pointed to this conversation. Disambiguation pages can be structured to capture some of the aspects of derivative works.
  • Translations versions are for works of the original author, where there are different translations/editions of translations
    main namespace pages, can link to WD item for enWP article
  • Portal: ns pages exist for curation where required.
    portal namespace, would link to WD item for portal (see topic's main Wikimedia portal (P1151) and Wikimedia portal's main topic (P1204))
    Not encouraging these for the work level, though if someone wishes to put the work into creating something that explains a subject matter, and a range of disambiguations, versions, translations, and retellings, then go for it. It would not replace these main namespace pages.

billinghurst sDrewth 01:10, 31 December 2019 (UTC)

But your first item is part of the problem. We currently have advice on versions that differs from what you've described, so some kind of change is necessary. The question is what sort of change? --EncycloPetey (talk) 01:15, 31 December 2019 (UTC)
Please be specific about which bit. Inexact commentary is less than helpful. — billinghurst sDrewth 01:24, 31 December 2019 (UTC)
Read the opening three paragraphs of this discussion. Currently, our guideline advice would put Charles Lamb's retelling of Shakespeare's Romeo and Juliet on the same versions page with the play itself. Your comment says otherwise. Hence, we either need to align practice with the advice, or alter our advice to fit practice, or make some other change to resolve the discrepancy. --EncycloPetey (talk) 01:28, 31 December 2019 (UTC)
Gotcha, if you are quoting a page, can I suggest {{cquote}} so that it is overt. I missed the cue, thought it was your comment. [Note to self: don't try to write abusefilters, analyse abuse and have conversations all at once. My apologies.] — billinghurst sDrewth 01:33, 31 December 2019 (UTC)


Example of Wikidata item[edit]

General comment about works at WD and enWP: the item Romeo and Juliet (Q83186) is for the conceptual work and all that follows, not solely the play. It records what the work was itself based on, as well as its derivative works. So I don't see an issue with how it is being linked. It is how we wish to look at the latitude of subsidiary links. — billinghurst sDrewth 01:41, 31 December 2019 (UTC)

The WD item points to derivative works, but doesn't list them as editions of the work itself. The WD structure identifies editions of the work by pointing to editions pages for the editions, and to derivative works by pointing to a data item for the derivative work, which in turn points to editions of that derivative work. Each derivative work has its own separate WD data item, with its own list of editions. Currently, we seem to make no such distinction. So while Wikidata has separate data items for Macbeth, the play by Shakespeare, and Macbeth, the opera by Verdi, our advice currently would lump them into a single page. --EncycloPetey (talk) 01:53, 31 December 2019 (UTC)
I believe that is a limited perspective, our WD linkage points to a central coordinating point of the concept of the work. The WP article w:Romeo and Juliet covers more than the words of the work itself. WikiCommons' c:Romeo and Juliet is definitely not focused on the base publication, they are the concept. Our versions page focuses on the conceptual, editions, and derived works, and we are arguing about derived works, and the interwikis definitely cater for that aspect, so I don't see a huge difference. The focus of our page is not on the derived works, each of our presentations has/should have its own item, and that has been our practice — billinghurst sDrewth 02:32, 31 December 2019 (UTC)
The Wikipedia article is about the play. If the Wikipedia article were listed here it would be placed on a disambiguation page. So comparing what the Wikipedia article does to what we do is fallacious reasoning. Commons content will vary depending on the number of items in a category. When a category grows large enough, subcategories are split off. So, for example, commons:Category:Medea (Euripides) is for the play by Euripides, but there are subcategories for the translation of the play by Augusta Webster and for the 2009 Syracuse performance of the play. And I don't think you've actually looked at what commons:Category:Romeo and Juliet has or what subcategories it contains. The key here is that neither Wikipedia nor Commons make distinctions between items that are the work and items about the work. So if we follow your line of reasoning (based solely on mimicking Wikipedia and Commons) then our "Versions" pages ought to contain items that are the work, as well as items about the work, which is what we currently do with Portals. Is that what you are advocating for?
Further, the interwikis between Wikipedias link only to other articles about the Shakespeare play. The interwikis to Wikiquote link only to pages quoting the play. The interwiki links to other Wikisource projects link only to translations of the play. So I don't see any truth to your claim that the interwikis cater to derived works. Yes it's possible to navigate by following additional links, but we do that with {{similar}}.
That said, are you advocating for a change to the advice on the quoted page, or are you advocating for something else? Your preferred course of action isn't clear. --EncycloPetey (talk) 17:29, 31 December 2019 (UTC)
I beg to differ about the WP article, it is an article about the play, and a zillion things that have sprung from it. If it mentions "legacy" and talks about ballet, it has gone beyond the play, and is about the conceptual work in a broader aspect including deemed pertinent derivatives. Commons has artwork that is not from the original play, it is someone's conceptual interpretation from the play, these are derived works on the subject. So as such, I don't see it as black and white as you do for the other sites, and the WD item. That said, I am awaiting others' comments, they are important for framing. — billinghurst sDrewth 12:15, 1 January 2020 (UTC)

Cosmetic problem with {{header}} change

This conversation has been moved to Template talk:Header#title & contributor: one line or two? Levana Taylor (talk) 04:09, 31 December 2019 (UTC)

Happy Public Domain Day!

Here are some things entering the public domain in the next several hours: https://web.law.duke.edu/cspd/publicdomainday/2020/ Justin (koavf)TCM 06:29, 31 December 2019 (UTC)

Do our "1923" templates need to be updated? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:53, 31 December 2019 (UTC)
They shouldn't. {{PD-1923}} was adjusted last year to automatically progress, and {{PD/1923}} uses the code of the first. --EncycloPetey (talk) 17:17, 31 December 2019 (UTC)
Template:PD-anon-1923 does need converting to make it automatic. It's currently stuck on 1924. BethNaught (talk) 11:55, 1 January 2020 (UTC)
We should rather be updating and using {{PD-anon-1996}}, and just leave 1923 alone. — billinghurst sDrewth 12:05, 1 January 2020 (UTC)
@BethNaught: I have updated {{PD-anon-1996}} and {{pd/1996}} so they display dates and text appropriately for 2020. They are relatively done, so progression each year will be fine too. — billinghurst sDrewth 13:46, 1 January 2020 (UTC)
1923 is not relevant for anonymous works anymore, and 1996 conflates "expired in the US" and "not renewed by the URAA", which isn't something we should be doing.--Prosfilaes (talk) 19:56, 1 January 2020 (UTC)

┌─────────────────────────────────┘

Not only do we have at least three templates with "1923" in their names (and quite why we have {{pd/1923}} and {{pd-1923}}, differentiated only by one punctuation character, for apparently very different functions, is anybody's guess), but they hard-code Category:PD-1923 and Category:Author-PD-1923. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:23, 1 January 2020 (UTC)

We have PD-1923 for works where we know their publication date, but not the date of the author, or the date of the author is complex (multiple or corporate), plus it only gives US. PD/1923 allows the split copyright of US and home country based on the author's date of death. PD/ gives us the indication of when we can move to Commons, whereas PD- does not. — billinghurst sDrewth 12:39, 1 January 2020 (UTC)
"PD/ gives us the indication of when we can move to Commons, whereas PD- does not" I know what they do; but does anyone seriously think that "/" vs. "-" is the best way to convey that to colleagues, especially new editors? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:33, 1 January 2020 (UTC)
Slashes means they are subpages, and accordingly can be relatively linked. — billinghurst sDrewth 13:44, 1 January 2020 (UTC)
Template:Pd/1923 is not a "subpage", because Template:Pd does not exist; it's just badly named. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:33, 1 January 2020 (UTC)

┌─────────────────────────────────┘

The year 1923 is no longer relevant (in fact no fixed year is relevant now, the elapsed time is, so maybe the wording of the text can also be updated). So I suggest

  1. to rename the template {{PD-1923}} to {{PD-US-95}}.
  2. to rename the template {{PD/1923}} in a similar way, e. g. to {{PD/US-95}}, or even better to merge it with the previous one
  3. to rename the template {{PD-anon-1923}} to {{PD-anon-US-95}}, or merge it with the previous ones, making the anon just their parameter, e. g. {{PD-US-95|anon}}
  4. to change the texts to "This work is in the public domain in the United States because it was published more than 95 years ago. …" or something similar. --Jan Kameníček (talk) 12:28, 1 January 2020 (UTC)
While I agree that 1923 has progressed, all the years of publication are relevant thereafter, and it is best to just keep it all harmonised. It was chosen that way to keep it simple; simply pick the year period, add your publication year. "-95" is just going to cause issues, is it -95, -95+1, -95 from today, how many is -95. Fixing the templates at the back end, is pretty easy, and I will just plan to get it done. — billinghurst sDrewth 12:34, 1 January 2020 (UTC)
I do not see any difficulties. It would be as easy to use as e.g. {{PD-anon-70}} or {{PD-old-70}} at Commons, and most people coming here usually have experience with Commons templates. Or alternatively, it can be renamed to {{PD-US-96}} with the text "This work is in the public domain in the United States because it was published at least 96 years ago or earlier. …" The documentation can not only explain it further, but even specify which the latest acceptable year is (with automatic update of the year). If the templates were merged into one, everything would be perfectly harmonized. --Jan Kameníček (talk) 14:36, 1 January 2020 (UTC)
The problem with using a template like "PD-anon-70" is that it is applicable for 10 years or less. Eventually, we will reach 80 years, 90 years, and 95 years, at which point such templates must be replaced based on the increasing number of years since the author's death or the work's publication. Any template that is set to operate based on a fixed range after the author's death or the work's publication date will produce this issue of perpetual monitoring. Such an approach might be possible on Commons, with a larger community to constantly adjust, but for a smaller community like Wikisource, it is not the best approach. It is better to have templates that adjust their display based on information provided about the date of publication or date of the author's death. --EncycloPetey (talk) 18:43, 1 January 2020 (UTC)
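To illustrate a cutoff that advances by itself instead of being hard-coded, here is a minimal Python sketch. The function names and the 95-year constant are assumptions made for illustration and are not the code of any existing template. It treats the publication year as the only input and ignores registration and renewal dates.

```python
from datetime import date

# Assumption: a US term of 95 years after publication, expiring at the end
# of the calendar year, so a work enters the public domain on 1 January of
# (publication year + 96).
US_TERM_YEARS = 95


def latest_expired_publication_year(today=None):
    """Return the latest publication year whose US term has already expired."""
    today = today or date.today()
    return today.year - US_TERM_YEARS - 1


def is_us_expired(publication_year, today=None):
    """True if a work published in publication_year is past the US term."""
    return publication_year <= latest_expired_publication_year(today)
```

With this approach nothing has to be edited when the year changes: on 31 December 2019 the cutoff is 1923, and on 1 January 2020 it becomes 1924.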
100% agree that we should remove "1923" because it's not semantically meaningful--we're just saying, "public domain in the United States due to expiration of copyright", whenever that is. —Justin (koavf)TCM 12:49, 1 January 2020 (UTC)
People may indeed be familiar with the license names at Commons, but the problem on Commons is that those templates do not perform the same functions there as they do here. Over there, an item needs two templates: one for pma licensing and another for US licensing. Sometimes there is a combined template, but sometimes not. Also, if we use the same naming as Commons, people may assume our licensing works just like Commons, and it doesn't. I'm not saying that we shouldn't change licensing template, but I am saying we shouldn't be looking at the confusion on Commons to decide what we should do here. You only have to look at the parameter listings on commons:Template:PD-US-expired to see how confusing a single template can become. Our current system is much easier to use. Also, a reminder that this same discussion happened in 2018. --EncycloPetey (talk) 16:41, 1 January 2020 (UTC)

┌─────────────────────────────────┘

Well, although I really do not see anything difficult in the templates I proposed, I can live with other kind of templates too. However, I am convinced that whatever templates are created or updated they should:

  • have some comprehensible and general name which does not have to be changed every year (current PD-1923 is an example of a template’s name not suitable for our purposes). PD-US-96 is imo suitable, but I am open to other suggestions too.
  • be updated automatically and without any necessary human interference a second after the old year finishes and new one starts, so that contributors have correct templates at hand immediately. --Jan Kameníček (talk) 18:58, 1 January 2020 (UTC)
I'd prefer -95 or US-expired; 95 is confusing, but so is -96 and any other choice, and it's consistent with -70 and -50. They definitely should be changed. I might argue for moving PD-old to PD-old-100 and replacing PD-old with a warning message, since it is the biggest confusion with Commons users.--Prosfilaes (talk) 19:56, 1 January 2020 (UTC)
There is a quasi-project ongoing at Wikisource:Requested texts/1924 for Public Domain Day; I am adding The Box-Car Children (darker story in the 1924 version than in later versions, interestingly). So get over there now and add a few. Lemuritus (talk) 21:13, 1 January 2020 (UTC)
I agree with user:billinghurst that we should leave 1923 templates as they are. They are true indicators when a work came to be in the public domain. 1924 public domain works should be indicated as such. Both of these have informational value even if it sounds incongruent. — Ineuw (talk) 19:26, 2 January 2020 (UTC)
@Ineuw: But the current 1923 templates do not indicate when the work came to be in the public domain at all. They are used for works published until 1922 which came into public domain in 1998 (and I think many of them even much earlier), as well as for works published in 1923 which came into public domain in 2019. --Jan Kameníček (talk) 16:51, 20 January 2020 (UTC)
  • Comment: having some more time to think about this, I am wondering whether we just look to having {{PD-US|year of death}} and we gracefully deprecate both {{pd/1923}} and {{PD-1923}} and revert its text back to where it was. We can copy the current updated code over to this new template, and if there are any 1923+ works, then we simply update them over to the new template. I think the use of "expired" is just superfluous, and we can write that into the text.

    My reasoning for PD-US is that it is simple and it somewhat aligns with Commons. We merge the logic of these templates: if YYYY for DoD is given it displays the death date PMA text; if no date is given then it just gives the standard "out of copyright" text. The PD-1996 and PD/1996 remain as they are as they still have determinative conversations based on year of publication, and year of death; once a year we would run a bot through and convert those on the 95 year boundary. — billinghurst sDrewth 13:35, 3 January 2020 (UTC)
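    The merged logic described here can be sketched roughly as follows, in Python rather than template code. The function name, the notice wording and the way years since death are counted are all assumptions for illustration, not the text or behaviour of any existing template.

```python
def pd_us_notice(current_year, death_year=None):
    """Build a licence notice; add death-date text only when a year is given.

    Hypothetical sketch: the wording below is illustrative only.
    """
    base = "This work is in the public domain in the United States."
    if death_year is None:
        # No year of death supplied: standard "out of copyright" text only.
        return base
    # Assumption: count full calendar years elapsed after the year of death,
    # since pma terms conventionally run to the end of a calendar year.
    pma_years = current_year - death_year - 1
    return (
        f"{base} The author died in {death_year}, "
        f"{pma_years} full calendar years ago."
    )
```

    A single optional parameter keeps the simple case simple: an uploader who does not know the year of death still gets a valid notice.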

    Agree. --Jan Kameníček (talk) 22:38, 6 January 2020 (UTC)
    Would this template be used for works published less than 95 years ago? The potential problem I see with having PD-US as a template name is that users may place it regardless of the date of original publication. That is, will this template cover "no-renewal" and "no-notice" situations, or will those templates still serve the separate function? If so, then we may be creating a whole new problem. Nor can we rely on the date of that edition's publication as a guideline, since we need to know the date of first publication and/or date of copyright registration, since that is not always the same, and which is not always included by an uploader. For example, I have come across a work whose initial publication date is 1927, but will not enter PD until 2024 because copyright was filed and renewed within six months and overlapped into the following year. If we're going to overhaul things, we need to consider sources of confusion and what information will be needed by someone to verify the template is correct. --EncycloPetey (talk) 23:05, 6 January 2020 (UTC)
    I agree that we should not have one template for works published more than 95 years ago and no-renewal and no-notice situations. I'm not a fan of a flat PD-US, but a mere naming of a template (the main template) isn't going to stop licenses from being carelessly placed on works.
    I don't understand what you mean by the publication date, though. My understanding is that works could be filed for copyright anytime in the first 28 years, even alongside renewals, but the clock started on the earliest of publication, copyright date, or registration.--Prosfilaes (talk) 23:44, 6 January 2020 (UTC)
    If the work was first published in the UK, there was a grace period in which to register a US copyright (six months?) I have found an instance where the initial publication in the UK happened at the end of 1927, but the initial US copyright was granted in early 1928, followed by a renewal. I have verified this with the copyright database at Stanford. So in this instance, the date of initial publication (in the UK) cannot be used to determine copyright status within the US. --EncycloPetey (talk) 00:17, 7 January 2020 (UTC)
    I checked with Clindberg on a similar case, and he said that the clock would have started in 1927. Since it's not an active issue, I don't want to ping him, but I'm pretty sure initial publication was enough.--Prosfilaes (talk) 03:11, 7 January 2020 (UTC)

Need for a specific doohickey

Is there an existing gadget/widget/app/whathaveyou that would allow us to quickly convert selected text into either a mainspace link, or an Author piped link? It’s insane to consider doing it all by hand, but I'm looking at Leo Tolstoy: His Life and Work/Bibliography, Devil Worship (Joseph)/Bibliography, The Old New York Frontier/Bibliography, Ivan the Terrible/Bibliography and Page:Advanced Australia.djvu/242 and just between that small handful of pages, there are more than a hundred works we neither have, nor even have redlinks pointing toward them. If we could get the redlinks going, then we could tackle the next step of masscreating such author pages, or listing the works on Portals, etc. Lemuritus (talk) 21:37, 1 January 2020 (UTC)

Wikisource:TemplateScript could help you out here—it has "Make title link" and "Make author link" scripts. BethNaught (talk) 21:41, 1 January 2020 (UTC)
That said, there's no guarantee the Author pages will have title matching the text, e.g. "Mrs. Besant" <--> "Author:Annie Wood Besant". BethNaught (talk) 21:43, 1 January 2020 (UTC)
That’s half-helpful, it reduces the need to parse the authors on Leo Tolstoy: His Life and Work/Bibliography to about 200 mouseclicks (which I just did :) ), but that’s still a great deal of strain for every single page considering twenty years of backlog is growing larger, not smaller. There's no way for those tools to be keyboarded, as in Ctrl-Q to make the highlighted text an author link? Also, related question where is the link to see which redlinks have the most references? Then, back to the main question, how the community should make it easier to link/redlink selected texts on pages :P Lemuritus (talk) 22:00, 1 January 2020 (UTC)
Top 5000 wanted pages are at Special:WantedPages. Unfortunately, list is not updated often but gives some guidance. Beeswaxcandle (talk) 22:07, 1 January 2020 (UTC)
Bookmarked it, thanks! I added a bunch of Author pages, would be nice if we had a weekly bot skim all Author pages that don't list any redlinks and add a {{populate}} tag to them :) (At least, until better bots can actually find the works themselves ;) ) Lemuritus (talk) 22:32, 1 January 2020 (UTC)
To get a keyboard shortcut, TemplateScript supports key combinations. These consist of a browser-specific prefix (see w:en:Wikipedia:Keyboard shortcuts) followed by some other key (look at the code at MediaWiki:TemplateScript/typography.js to check for a specific command). N.B. this is just from reading code and docs, I haven't tested. BethNaught (talk) 22:12, 1 January 2020 (UTC)
Thanks, that made the second page easier to parse. A tad complicated for new users (and I'm guessing some experienced ones) to even realise is an option, but at least I am a little more prepared. Lemuritus (talk) 23:08, 1 January 2020 (UTC)
There are a number of regex tools available; Pathoschild's TemplateScript is our most popular as it can do ad hoc regex replacements, or you can embed regex into your sidebar. A number of us have such scripts within our common.js files that we use to make repetitive maintenance or proofreading tasks easier. There is a regex tool within AWB which works quite well, though all such tools should have caution applied.

As a general comment, while I encourage you to build bibliographic lists, I discourage them to be all redlinks, firstly they are not the prettiest look, they are vandal targets, and they are often not the most accurate due to case differences, or title differences. So get something in place, but please don't overly try to perfect the imperfect. We aim for the perfection in the transcriptions, the author and portal pages are curated pages. — billinghurst sDrewth 13:18, 3 January 2020 (UTC)

I'm planning to fiddle with {{Populate}} to make it suggest "What links here" to the viewer to find some works; ideally it should have an even-easier "Click here to add this work to this author", but that might be a pipe-dream. Feel free to correct my errors on the template (which a bot should be auto-adding to authorpages). For example I need help getting the "Edit" in the wording to be a link to edit specifically the "Works" portion of the authorpage, not the general page. Also, IRC would mean less spam here on the scriptorium with these needs :P Lemuritus (talk) 23:17, 1 January 2020 (UTC)

If a page is sufficiently structured, you can use regexr or a similar service to bulk-edit the text, adding [[$1]] and [[Author:$1|$1]] where needed. —Beleg Tâl (talk) 13:36, 2 January 2020 (UTC)
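As an illustration of this kind of bulk edit, here is a minimal Python sketch. It assumes, hypothetically, that every bibliography line has the form "Author Name: Title"; real pages will need their own pattern. Note that where regexr-style tools write $1, Python's re module uses \1 or \g<name> in the replacement string.

```python
import re

# Assumed line format (hypothetical): "Author Name: Title of Work"
LINE = re.compile(r"^(?P<author>[^:\n]+): (?P<title>.+)$", re.MULTILINE)


def link_bibliography(text):
    """Wrap the author in an Author-namespace piped link and the title in a link."""
    return LINE.sub(
        r"[[Author:\g<author>|\g<author>]]: [[\g<title>]]",
        text,
    )


# Placeholder names, not real works.
print(link_bibliography("Jane Doe: An Example Work"))
```

Lines that do not match the assumed format are left untouched, so the result should still be reviewed by hand, since author page titles often differ from the name as printed.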

  • Comment: please do not create modern authors (those with all their works in copyright). That someone has added a person to a portal is not a determinative reason to create an author page. Thanks. — billinghurst sDrewth 14:01, 2 January 2020 (UTC)
an author page has no effect on copyrighted uploads at commons. just as a big stop sign on the creator page has no effect. how would you "know" contemporary authors have not released a CC or PD? for example Author:Robert Swan Mueller; and it would give you a deletion task flow on commons. rather you would need to blacklist at IA-uploader or wizard -- Slowking4Rama's revenge 19:58, 2 January 2020 (UTC)
"have all their works in copyright" was the bit that you ignored. The community has had that conversation about not creating modern author pages where we cannot host works, and deleting those author pages. I am not talking about authors who have works in the public domain, clearly those author pages are suitable for creation. If you want to amend the text used for clarity, then go for it, however, please don't cloud the issue. — billinghurst sDrewth 11:58, 3 January 2020 (UTC)
did not ignore it, just don’t believe it. the problem for you is you do not "know" a work is in copyright until you do the search. and the rushing to all or nothing conclusions is not helpful. and dictation which author pages to create based on first order conclusions is not helpful either. and you do not "know" that US government employees will not write a memoir that is copyrighted. i.e. Author:Barack Hussein Obama i am not clouding the issue, rather copyright is inherent cloudy, not amenable to false clarity. Slowking4Rama's revenge 15:10, 3 January 2020 (UTC)
And what you have just said, doesn't change my original statement or meaning.

The community has had the discussion about modern author pages, and set the criteria, as we had people creating problematic pages, or creating pages with zero content, or all works in copyright. My "do not create modern author pages" statement was a general short note, not an explanation of the deliberative process. Where someone has done the searches and found that the work(s) are able to be hosted here, or they are freely hosted elsewhere, then we have allowed those pages, so at that point create the author page. Happy for you to usefully add to the discussion if you think that there is some minutiae to be addressed or to correct an error, or to add clarity, however, this persistent nitpicking and what seems to be deliberate opposition and obfuscation, sheerly because you can, is not helpful. — billinghurst sDrewth 01:09, 7 January 2020 (UTC)

Import request

Can someone please import:

So I can make labels (e.g. Page:W. E. B. Du Bois - The Gift of Black Folk.pdf/41) per the method at w:en:Help:Cite link labels. Thanks. —Justin (koavf)TCM 09:31, 3 January 2020 (UTC)

Not done. We use our (house) referencing style per Help:Footnotes and endnotes, not replicate the works. Discussion has been had previously and is in the archives of this page. — billinghurst sDrewth 11:51, 3 January 2020 (UTC)
These aren't mutually exclusive. Why not have the MediaWiki pages as well? And correct me if I'm wrong but it would also overcome this problem: "it has the strong drawback of not being able to group footnotes when transcluded to the main namespace." Justin (koavf)TCM 20:16, 3 January 2020 (UTC)
What price The Tragedy of Romeo and Juliet (Dowden)/Act 2/Scene 1 where I have grouped footnotes on transclusion? Beeswaxcandle (talk) 21:17, 3 January 2020 (UTC)
It seems like that works too but again, there is no harm in having the redundancy and would make it easier for persons familiar with en.wp's method. —Justin (koavf)TCM 22:29, 3 January 2020 (UTC)
How would enabling this "redundancy" keep our house referencing style consistent? How do you propose that (for example) lower-case greek reference marks are automatically turned into numbered reference lists when the parser hits {{smallrefs}}? Always remember that our goal is to reproduce content so that it can be used, not to replicate a multitude of presentation styles from the many publishers out there. Beeswaxcandle (talk) 08:59, 4 January 2020 (UTC)
Why is a consistent house referencing style across millions of works from millennia valuable? I realize that we are striving for a typographic rather than a photographic reproduction but if we can have greater consistency with the presentation of the original source, that is desirable. —Justin (koavf)TCM 06:53, 5 January 2020 (UTC)
Says who? Says why? We already differentiate in multiple ways. We work to the words of the author, not slavishly to a typographic production. Repeatedly having this (epithet) argument is so painful. Every time we allow contributor variation I see it more spawns the hydra-response. I would suggest that every time we allow for user variation we have less site consistency, and that is undesirable. — billinghurst sDrewth 11:21, 5 January 2020 (UTC)
Why is consistency desirable? There is no consistent style in terms of typography, page length, language usage, citations, spelling, etc. across all documents. —Justin (koavf)TCM 11:26, 5 January 2020 (UTC)
Knuth designed his own typesetting system to get his books right. The first run of Alice in Wonderland was pulped because the images were too light. There are notable typographic features in Tristram Shandy, including an all black page at one point. The idea we can neatly abstract out "the words of the author" versus "a typographic production" is somewhat problematic, and given that we do include images, frequently unapproved by the author, not something we follow. Absolute site consistency is not a goal that most of us have, which is why we have this argument over and over.--Prosfilaes (talk) 00:46, 7 January 2020 (UTC)

Disambiguation pages that are not disambiguation pages, more collections

We have had spates in time when pages like Emerson have been created, and I don't see that they are disambiguation pages. They are finding pages, and we are just going to get ugly if we think that we can maintain pages like this with the number of surnames and biographical works that we have. I think that it is problematic enough that we have Author:Emerson though can sort of understand why we might, though don't think that we should. We are making horrid rods for our backs whilst hoisting ourselves on our own petards whilst performing rocket surgery. Noting that I am not taking aim at this creation, as it is similar to other sorts of previous creations. We do need to resolve those that do exist. — billinghurst sDrewth 12:58, 3 January 2020 (UTC)

If if if if if we had to collect something like this, I feel that we would be better to do something like a category for Emerson (surname) or Biographies of people named Emerson, not that I truly love such an approach as it is just burdensome. — billinghurst sDrewth 12:58, 3 January 2020 (UTC)
Delete. 100% agree: Emerson is nonsense unless we have works called simply "Emerson", and Author:Emerson is nonsense unless we have authors whose name is simply Emerson. There are also several disambig pages of this sort which I've added to Category:DNB disambiguation pages. I personally think that disambiguation pages for encyclopedia articles (or including encyclopedia articles in disambiguation pages) is not a thing we should be doing in general anyway. —Beleg Tâl (talk) 13:23, 3 January 2020 (UTC)
@Beleg Tâl: I don't want to be seen as just saying "no", so in the cases of the biographical works we would point them to ToC or indexes. For biographical works that don't have either, then we have done our own compiled lists to these work and simply noted that they are compiled lists. Further if someone wanted to do something special for the surname Emerson, then my opinion is do Portal:Emerson and knock yourself out, no expectation that we are ever comprehensive. — billinghurst sDrewth 13:50, 3 January 2020 (UTC)
Neutral. I have created the page Emerson only to make some space where I could move non-author links from Author:Emerson. I have nothing against its deletion if it is decided we do not need such pages. However, disambiguation pages like the Author:Emerson are IMO quite useful and my opinion here is Keep, as people often know only author’s surname and disambiguation page helps them more than just a search results list. --Jan Kameníček (talk) 13:47, 3 January 2020 (UTC)
That is just going to get ugly to maintain, and when do you build one? Wouldn't you be better looking at Wikisource:Authors-Em? One day eventually there will be enough family name data to be able to get these bot generated. — billinghurst sDrewth 13:57, 3 January 2020 (UTC)
Besides the fact that these lists are as badly kept as disambiguation pages and presently are missing many (most?) author pages, they are also not very user friendly:
  1. Ordinary visitors to Wikisource know nothing about the existence of these lists and finding them after they arrive to the WS main page is not very intuitive. They usually just type the surname into the search field. This can still be solved by redirecting the surname e. g. Kennedy to Wikisource:Authors-K#Ke, but
  2. The lists sometimes contain quite a lot of authors beginning with two particular letters. What is more, the lists should be even longer than they are as too many authors are still missing there, and the number of author pages at Wikisource still keeps rising, so the lists are going to be longer and longer. It is not very friendly to force readers to go through long lists of names if they need just one particular name.
  3. Even worse, unlike disambiguation pages, the lists do not contain any other information than birth and death dates, which makes it more difficult to find the author you need. You have to open every single author of the desired surname before you find the one you are looking for. Disambiguation pages usually contain also nationality and occupation, which makes the search much easier.
For these reasons I consider such a way a good one only if the particular disambiguation page does not yet exist; redirecting the surname to this list can be a temporary solution until the disambiguation page is created.--Jan Kameníček (talk) 14:53, 3 January 2020 (UTC)
As I was indicating these Wikisource:Authors-Xx page would be best as autogenerated pages. My understanding is that the means for generating such pages automatically exists. What is missing is the family-name data at WD, and that is the data we have in a field awaiting inhalation.

With regard to being incomplete now, they are and would be no more incomplete than a page like Author:Emerson; curated additions are curated additions. With regard to being hard to find, they are linked from every author page, from the author-index link in the top left. Otherwise I have no idea how users arrive here and look for author pages. — billinghurst sDrewth 11:14, 5 January 2020 (UTC)

As a counterpoint, consider a name slightly more common than "Emerson". Can you imagine if we were to add everyone named "John" to Author:John, or everyone named "Henry" to Author:Henry? We should handle this by pushing for improved search functionality. We shouldn't (IMO) handle this by creating curated pages to disambig on partial names—we have enough to do already around here. —Beleg Tâl (talk) 14:01, 3 January 2020 (UTC)
I agree that adding disambiguation pages with first names is unnecessary and impossible to maintain. As for John, only people whose main name is John, like Author:John of the Cross, or whose surname is John, should be added to such lists. BTW: most pages from Author:John should be either deleted or moved from author ns to portals, as they do not have any works at Wikisource, but that is for a different discussion. --Jan Kameníček (talk) 14:53, 3 January 2020 (UTC)
you could do it by using wikidata, i.e. https://www.wikidata.org/wiki/Q190388 and "given name" -- Slowking4Rama's revenge 15:02, 4 January 2020 (UTC)

Export to PDF, LaTeX, EPUB, ODT

Hello,

An alternative export to the formats PDF, LaTeX, EPUB and ODT is provided at:

https://mediawiki2latex.wmflabs.org/

The server is provided by the Wikimedia Foundation. The software running on it is GPL licensed open source and part of the Debian Linux distribution.

It is easy to integrate it into the sidebar if requested. This has been done on the German Wikiversity and German Wikibooks. If you are interested just copy a few lines into your MediaWiki:Common.js accordingly.

Dirk Hünniger (talk) 08:32, 4 January 2020 (UTC)

This has been deployed on the German Wikisource. You may try it there too. Dirk Hünniger (talk) 16:18, 5 January 2020 (UTC)

help wanted: OAW layout template?

First, a word of justification for proposing what I'm about to propose. I'm aware that the purpose of Wikisource is, above all, to provide proofread texts of the words of old writings, that images are a secondary concern, and that the typographic form that the text was originally presented in hardly concerns Wikisource at all. Nonetheless, since the magazine Once a Week was in its day well known for its illustrations, and is nowadays better known for them than for its texts, it seems necessary to do justice to the illustrations when transcribing the magazine here. And since the illustrations sometimes interact with the texts, attention to page layout becomes necessary in those instances.

I would argue that if some subpages of Once a Week need a global layout applied to them, for consistency's sake a layout should be developed that can be easily applied to all the subpages. There are surely enough subpages to justify creating a template: 2800-odd in the first series alone! (Some 1000 already exist.)

Here are some examples of subpages that have led me to my opinion that they all should be a column 500px wide, justified.

  • The Sweeper of Dunluce: the illustration was designed to wrap around the text in a specific manner, thus constraining the text's width, and making it look best if justified. There are also other articles with illustrations that wrap around text.
  • The Secret That Can't Be Kept: this play needs to be a fixed width so that the right-aligned stage directions don't move too far from the left-aligned dialogue.
  • The Notting Hill Mystery, Section 7: Fixed width ensures that the diagram is next to the text that it illustrates; more legible if justified.

Default layout 2 does have a 500px justified column. However, it also has other features that are unnecessary (such as specifying serif font) or undesirable (such as its sidenotes style). Thus my request to programmers: is anyone willing to create a custom layout for this magazine?

If anyone's interested in working on this, please continue the discussion at Wikisource:Wikiproject Once a Week/Layout talk. -- Levana Taylor (talk) 07:41, 6 January 2020 (UTC)

Can't help with layout request, but I note for the play that you aren't using {{rbstagedir}}. I find this template quite useful rather than fiddling with floats. Beeswaxcandle (talk) 08:08, 6 January 2020 (UTC)
Thanks for that, it’s helpful, but really, the main issue is the illustrations. Levana Taylor (talk) 08:24, 6 January 2020 (UTC)
I don't really have any bright ideas about a one-size-fits-all formatting, but I'd like to recommend that one eye is kept on how the formatting translates to ebooks. Generally the ebook environment is much more lax with respect to sizes, so you can't reason much about the pixel sizes. For example, here is the start of The Sweeper of Dunluce after ePub export. The screen is 1080 pixels wide (but less than 7cm across in real life), and the font size (x-height here about 25px), as on most e-readers, is under user control, as is the font itself. E-readers vary vastly in screen size, CSS capabilities and so on, so there's not a huge amount you can rely on. There are proprietary programs to "Preview" how it looks on specific e-readers, but they don't seem to work on non-Windows/Mac systems.
Also, keeping an eye on mobile, but non-e-reader, formatting can be helpful. For example the fixed width at The Secret That Can't Be Kept causes it to leak off the page to the right on a phone screen. Very often, this can be pre-empted by changing "width" to "max-width", so it can be further constrained by the screen size. This is easy to preview in Firefox and Chrome with Ctrl-Shift-M.
Generally speaking, it normally kinda-sorta works, but it rarely manages to look as swish as it does on the desktop website, unless special care is taken to accommodate these devices. And the more "fancy" formatting is used, the less likely it is it will translate to these devices without something bizarre happening. Inductiveloadtalk/contribs 09:24, 6 January 2020 (UTC)
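To make the "width" to "max-width" suggestion concrete, here is a minimal sketch that rewrites a fixed width declaration in an inline style string. The function name is hypothetical and the regular expression only handles simple inline styles; it is not a general CSS parser.

```python
import re

# Matches a bare "width:" declaration, but not "max-width:", "min-width:"
# or "border-width:", because those are preceded by a hyphen.
_WIDTH_DECL = re.compile(r"(?<![\w-])width\s*:", re.IGNORECASE)


def relax_fixed_width(style: str) -> str:
    """Turn fixed `width` declarations in an inline style into `max-width`."""
    return _WIDTH_DECL.sub("max-width:", style)


print(relax_fixed_width("width:500px; text-align:justify"))
# prints: max-width:500px; text-align:justify
```

With max-width the column is still 500px on a desktop screen, but it is allowed to shrink when the viewport is narrower, which is the behaviour described for phone screens above.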
As Inductiveload's example points out, you can't square this circle. You can't have fluid, resizeable text and pixel perfect layout. All attempts to do so will fail, often causing accessibility issues in the process. Fluid design or pixel-level control: Choose one. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:48, 6 January 2020 (UTC)
You are right, of course. Is the number of people using ebooks and small screens and mobile devices increasing to the point that it’s time to give up on doing layout that’s optimized for desktop viewing?
NB most of the OAW subpages work fine with fluid layout, and in fact I have been setting text-columns and images to max-width rather than fixed-width wherever I can, and using image-float for the smaller images so they reposition when the text column is shrunk. I’m just bothered by the few pages where it seems like something special is called for. Maybe I should just create a list of those and see if people have bright ideas for making them work semi-okay on desktop and mobile? There are all sorts of issues, such as the fact that the scores generated by Lilypond are a fixed size. Levana Taylor (talk) 12:37, 6 January 2020 (UTC)
Actually, I wonder if the "The Sweeper of Dunluce" example might be failing because it's using percentages instead of pixels for the polygon, when outlining something that is inherently measured in pixels (an image). This will get wonky results when the screen is too small, whereas modern browsers are usually pretty good at doing the right thing with the virtual unit "pixel". I would also assert that, since the text in the ePub-conversion is perfectly legible even without the fancy formatting, the biggest problem is that File:Castle of Dunluce (OAW).png has a white, rather than transparent, background, which would look ugly even on desktop if we (or the user) had a layout with non-white background.
I also see nothing in these examples that is particularly advanced for a typical web layout, and which can scale from desktop to mobile. ePub ebooks are a special case that tries to be some kind of weird hybrid between dynamic layouts (web) and static layouts (PDF) and suffers inherent limitations as a result: and this means ePubs can't really be automatically generated without some severe tradeoffs somewhere. To get the benefits of ePub you really need to author for ePub; and limiting web presentation to the lowest common denominator will just dumb down both platforms.
We should certainly keep ebook readers in mind and work actively to get the best possible presentation of our works there, but I don't think we should do so at the expense of good presentation on the web (which includes mobile web browsers, that tend to be just as powerful as their desktop equivalents). The need to support older web browsers is already a severe limitation to what we can do, and necessitates designing for graceful degradation (IE/Edge: I'm looking at you!). If we add the pseudo-HTML support in your average ebook reader and the functionality of the Mediawiki-to-ePub converter to the support matrix we might as well just go to straight ASCII. --Xover (talk) 13:36, 6 January 2020 (UTC)
I have replied with some thoughts at the Wikisource:Wikiproject Once a Week/Layout talk page. I think nearly all these issues can be mostly resolved with avoiding fixed-width in the source and applying a {{default layout}} instead, PSM-style. This means the "recommended" layout appears for most users, and can still be overridden as desired. Taking that care to make it work in Layout 1 and 2 will probably make it also work 95% in most e-readers (at least ones with a reasonable CSS engine like apps, YMMV on actual Kindles). Some things are just going to be a bit degraded on e-readers like the illuminated drop initials. That's just how it is, I don't propose to kneecap those to keep a generation-1 Kindle happy. But I certainly think WS should at least attempt to produce functional ebooks/mobile content (even if we don't advertise it as well as frWS does).
OAW is a lovely project, it would be really nice if we could get pretty things out of it like ePubs of the serial works and so on. Inductiveloadtalk/contribs 14:08, 6 January 2020 (UTC)
If required, we can insert an OAW layout into the help:layout mix so that it can be set as a default layout. I do not know whether it would then be possible for it to be excluded from the toggled rotation, or whether creation and addition of a default layout automatically puts it into the rotation. — billinghurst sDrewth 00:45, 7 January 2020 (UTC)

Tech News: 2020-02

21:18, 6 January 2020 (UTC)

Index:South - the story of Shackleton's last expedition, 1914-1917.djvu

If someone wants to do a short proofread, the Appendix of this would be appreciated. ShakespeareFan00 (talk) 09:19, 7 January 2020 (UTC)

Copyright and deletion discussions needing community input in January 2020

The following copyright discussions and proposed deletion discussions have been open for more than 14 days, and with more than 14 days since the last comments, without a clear consensus having emerged. This is typically (but not always) because the issue is not clear cut or revolves around either interpretation of policy, personal preference within the scope afforded by policy, or other judgement calls (possibly in the face of imperfect information). Wider input from the community would be valuable in resolving these discussions.

Copyright discussions require some understanding of copyright and our copyright policy, but often the sticking points are not intricate questions of law so one need not be an intellectual property lawyer to provide valuable input (most actual copyright questions are clear cut, so it's usually not these that linger). For other discussions it is simply the low number of participants that makes determining a consensus challenging, and so any further input on the matter would be helpful. In some cases, even "I have no opinion on this matter" would be helpful in that it tells us that this is a question the community is comfortable letting the generally low number of participants in such discussions decide.


Copyright discussions (WS:CV)


Proposed deletions (WS:PD)


Note that while these are discussions that have lingered the longest without resolution, all discussions on these pages would benefit from wider input. Even if you just agree with everyone else on an obvious case, noting your agreement documents and makes obvious that fact in a way the absence of comments does not. The same reasoning applies for noting your dissent even if everyone else has voted otherwise: it is good to document that a decision was not unanimous.

In short, I encourage everyone to participate in these two venues! --Xover (talk) 09:35, 7 January 2020 (UTC)

Download tool is broken

The download tool seems to be broken. The Featured text links for a download (any format) result in an error. --EncycloPetey (talk) 17:49, 7 January 2020 (UTC)

ia-upload is down too. I assumed it's a more general WMF Labs thing, but [3] works OK, so... Inductiveloadtalk/contribs 18:08, 7 January 2020 (UTC)
Define broken. The httpd daemon is delivering the upload web page, though I didn't try it out. There are a range of issues that can come from tools, from whole thing, to webservice being off for the account, to a page not delivering the right output, or the processing not occurring. Some accurate description of broken will always be more helpful. — billinghurst sDrewth 00:55, 8 January 2020 (UTC)
"I didn't try it out", but you nevertheless made a disparaging comment. But elsewhere you seem to have been thankful for the notification. Odd that. --EncycloPetey (talk) 02:30, 8 January 2020 (UTC)
Huh? Oh sorry, was seeing "upload" and following that link, which can have lots of components to it, and I didn't want to upload a file to test (and I did say that). That said, there are still multiple components to "broken" for downloads: the site toollabs:, the webservice, the webpage, the output of a file in mobi or pdf or epub, and the generated files themselves and their contents. That user described exactly what they were doing, and what they were wanting, so maybe it comes down a little to mw:How to report a bug. We still don't know what you were trying to do, and whether it is or is not working at this time. — billinghurst sDrewth 04:05, 8 January 2020 (UTC)
I would say that "The Featured text links for a download (any format) result in an error" means that I was trying to pull a download for the Featured text, but instead was given an error message. I did not know yet whether or not I had a bug to report; I simply knew that something that normally works was not working. It is great that you could list several possible things that might have gone wrong, but how would I have known any of that before you posted? --EncycloPetey (talk) 16:51, 8 January 2020 (UTC)
I was not trying to be disparaging; if I came over that way, then you have my apologies. Anyway, does your problem still exist, or do we need to be taking further actions? Specificity, rather than generality, is still king in that regard. — billinghurst sDrewth 21:10, 8 January 2020 (UTC)

Court decision titles

While organizing categories and portals for Portal:United States District Courts, I have noticed that there appears to be very little uniformity of the manner in which court decisions are titled. A quick look at Category:United States District Court decisions shows that while most pages have the simple format A v. B, such as Doe v. MySpace, Inc. or Shelton v. McKinley, there are also multiple pages with formats:

The strongest opinion I have on the matter is that A v. B is generally preferable, as it is cleaner and more human-friendly, but that A v. B should not point to one decision if A v. B (Year) points to a separate decision; in that instance, I feel that A v. B should be an index page (if related) or disambiguation page (if unrelated) pointing to the multiple A v. B (Year) pages.

This still leaves open the question of what to do when a case has multiple decisions within the same year, such as the many pages for United States v. Hubbard in 1979. My opinion there is to endorse A v. B/Citation as a subpage of an index at A v. B, since such decisions are all part of the same case.

But should all decisions be placed at A v. B/Citation (or A v. B (Year)/Citation when applicable), or should that only be used when a case has multiple documents available? Consistency and exactness favor that all decisions use it, but simplicity and conciseness favor that it only be used when necessary.

And once again, that doesn't cover every instance; when there are unrelated cases with the same name in the same year, such as, for example, the theoretical generic title of United States v. Smith, how should they be differentiated? I see no current examples of this situation, so I don't feel that this question is as high a priority; it would just be good to have an established opinion, for the sake of a complete style guide.

THE SHORT VERSION: should A v. B be the preferred court decision name, with A v. B (Year) being used in cases where disambiguation is required, with Citation being used as a subpage only in cases where multiple documents are issued in the same case?

Qwertygiy (talk) 21:39, 7 January 2020 (UTC)
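The decision tree in the short version above can be sketched as a small function. This is only an illustration of the proposal as worded there, not an agreed convention; the function name, its parameters, and the citation used in the example are hypothetical.

```python
from typing import Optional


def decision_page_name(case: str,
                       year: Optional[int] = None,
                       citation: Optional[str] = None,
                       *,
                       needs_disambiguation: bool = False,
                       multiple_documents: bool = False) -> str:
    """Return a page name following the proposed A v. B / (Year) / Citation scheme."""
    base = case
    if needs_disambiguation:
        # Another, separate decision shares the plain "A v. B" name.
        if year is None:
            raise ValueError("a year is required when disambiguating")
        base = f"{case} ({year})"
    if multiple_documents:
        # Several documents were issued in the same case: use a subpage per citation.
        if citation is None:
            raise ValueError("a citation is required for subpages")
        return f"{base}/{citation}"
    return base


print(decision_page_name("Doe v. MySpace, Inc."))
# prints: Doe v. MySpace, Inc.

# The citation below is a made-up placeholder, not a real reporter citation.
print(decision_page_name("United States v. Hubbard", 1979, "123 F. Supp. 456",
                         multiple_documents=True))
# prints: United States v. Hubbard/123 F. Supp. 456
```

The case of unrelated decisions with the same name in the same year is deliberately left out, since the proposal itself leaves that question open.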

I think that is exactly the correct decision tree here. —Justin (koavf)TCM 23:14, 7 January 2020 (UTC)
  • I think that it is reasonable to assume that our community of interest for court decisions will be people who are in some way involved in legal research (whether lawyers, judges and clerks, law students, or legal writers). There is a predominant form of citation for legal cases in the American legal community, which is the Bluebook, which prescribes the A v. B, Citation (Year) format. I would also note that most of the court decisions that we host are merely one in a series of decisions made with respect to that particular case. For example, Heart of Atlanta Motel, Inc. v. United States, 231 F. Supp. 393 (N.D. Ga. 1964) was the decision of the district court that was appealed to the Supreme Court, which decided the case as Heart of Atlanta Motel, Inc. v. United States, 379 U.S. 241 (1964). These cannot be disambiguated by year because they are in the same year. Note that a case with only the year in the final parentheses will be a U.S. Supreme Court case, with all lower court cases specifying the court. My preference would be to follow the Bluebook citation for American cases. BD2412 T 00:14, 8 January 2020 (UTC)
    I don't have a strong opinion on naming, but I think that people involved in legal research will have access to Westlaw. Our accessors are the common man, who reads Wikipedia and wants to look up an opinion to get the exact judge wording or a better understanding of the legal principles behind their republic.--Prosfilaes (talk) 01:03, 8 January 2020 (UTC)
    What ↑ said! --Xover (talk) 05:41, 8 January 2020 (UTC)
  • A v. B, Citation (Court Year) should be the way that the case is titled in the header of the page, or when listed in Portals, certainly. However, there are several arguments to make against using it for the title of pages.
    • It complicates linking to pages, as it's a lot easier to remember the proper information for a page styled as Brown v. Board of Education, the way Wikipedia styles that case, than it is to remember 347 U.S. 483 (1954) (and that's before we get into cases with the varieties of F. Cas., F. Rep., F. Supp., F. Supp. 2d, etc.)
    • It would often result in having to recaption the link as [[Brown v. Board of Education, 347 U.S. 483 (1954)|Brown v. Board of Education]] whenever used in navigation or other templates where space is limited.
    • w:Wikipedia:Manual of Style/Legal#In the United States states that Bluebook format should be generally followed for article titles on Wikipedia, but the examples given are w:National Railroad Passenger Corp. v. Boston & Maine Corp., w:Bailey v. Drexel Furniture Co., and w:Carter v. Carter Coal Co., without citations following the titles.
    • It's also worth noting that the Bluebook is neither the official style guide for all U.S. jurisdictions (with significant deviations including the Supreme Court, California, Delaware, Maryland, and Michigan), nor is it a freely available resource; there are independent online citation generators which one can use, but the book and its guidelines themselves are protected by copyright and sold by Harvard, including via paid internet subscription.
Qwertygiy (talk) 01:33, 8 January 2020 (UTC)
  • I think that following a Bluebook citation style for naming American cases and then different styles for different jurisdictions will only prove confusing. Redirects are cheap and we should create them where we can for human linking and machine reading but the proper title of a page should be something that is maximally simple and intelligible to a human, e.g. "Brown v. Board of Education". (Note also that we should use italicized titles for these pages.) —Justin (koavf)TCM 01:59, 8 January 2020 (UTC)

Comment: my opinion is not specific to citation, and more about house style and seeing if we can meander our way to a sensible output for US court cases within our environment.

  • our use of year = YYYY spawns the output (YYYY), output which is overrideable, please consider how that works in the plan
  • remember to plan for disambiguation pages and where they fit into the mix where similarly named
  • we host the world's works so please don't get caught in a US-centric or US-only naming system
  • pagename and page title work best when close, though title should always be what was on the work
  • descriptive page names are useful
  • subpages (where we use a /) are for pages of the same work; different works are not subpages. (And please avoid forward slashes in naming works as part of their general pagename.)

billinghurst sDrewth 00:51, 8 January 2020 (UTC)

  • I would think we would use subpages for different writings in the same decision (majority, concurrences, dissents). BD2412 T 02:39, 8 January 2020 (UTC)
    If what I said sounded different to this statement, that was not my intent. The collective components of a judgment from a court is the work. — billinghurst sDrewth 03:40, 8 January 2020 (UTC)
  • I didn't mean to imply that you meant otherwise, I was just making it clear. Cheers! BD2412 T 03:51, 9 January 2020 (UTC)
  • I don't think we should give too much weight to one specific citation style guide when it comes to page names, except in the sense that it is a strong advantage if the page name (or one of its redirects) bears a close resemblance to how that work will be cited in other works. If almost all other works refer to the play as King Henry the 5th, it would be disadvantageous for us to put it at Henry V. For court cases, that suggests we should aim to have redirects from the common citation styles by default; and it means we should let the Bluebook standard inform our choice of standard naming, as one factor among several.
    I don't have strong opinions on the specific naming (this is not my field), beyond finding Qwertygiy's proposal generally sensible. I would also strongly urge that the outcome of this discussion is some kind of written guideline for such page names that can be referred to in future. --Xover (talk) 05:56, 8 January 2020 (UTC)

Bad text layer extraction from PDFs

As it was already discussed here, Mediawiki has problems with text layer extraction from PDFs. Now I have reported it to phabricator, see task T242169. If anybody were able to add any useful comment there, it would certainly be very helpful. --Jan Kameníček (talk) 18:28, 8 January 2020 (UTC)

404:Not Found (thumbs)

Hi! I'm getting a "404:Not Found" when clicking certain image tabs in Proofread pages. Take Index:Scientific American - Series 1 - Volume 009 - Issue 18.pdf as an example: if you click on any page and then click the "Image" tab, you get the error.

The URL that we are trying to load contains decimals (e.g. https://upload.wikimedia.org/wikipedia/commons/thumb/8/84/Scientific_American_-_Series_1_-_Volume_009_-_Issue_18.pdf/page4-1652.0833333333px-Scientific_American_-_Series_1_-_Volume_009_-_Issue_18.pdf.jpg). If you replace 1652.0833333333px with 1652px in the URL, the thumb loads OK.

I think this happens with all the files that have been "mirrored" from Commons with decimals in the px. E.g.:

  • Commons: "Original file ‎(1,653 × 2,362 pixels, file size: 4.27 MB (...)".
  • Wikisource: "Original file ‎(1,652.0833333333 × 2,360.4166666667 pixels, file size: 4.27 MB, (...)".

It happens in all the Wikisources I've looked at. In one case, the error also prevents the image from displaying in the Page namespace, because of the same URL with decimal px values (e.g. ca:Pàgina:Flor d'enamorats (E. Moliné y Brasés).pdf/12; this Catalan page was OK 3 days ago).

So, any idea, please? Thanks. -Aleator (talk) 15:00, 10 January 2020 (UTC)

Also look at any page of Index:AASHTO USRN 1980-06-22.pdf: both types of error (404 and no thumb). Something is wrong in PDFs... -Aleator (talk) 15:15, 10 January 2020 (UTC)
@Aleator: This is probably phab:T242422, and connected to the new version of MediaWiki that was deployed today. —Xover (talk) 16:10, 10 January 2020 (UTC)
@Aleator: As a workaround, you can try to set Scan resolution in edit mode in the Index to an integer value. Ankry (talk) 21:48, 13 January 2020 (UTC)
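For anyone who needs a working thumbnail link in the meantime, the manual fix Aleator describes (dropping the decimals from the pixel width) can be sketched as below. The function name is hypothetical, and the sketch only rewrites the URL string on the client side; it does not address the underlying bug.

```python
import re

# Matches a size token such as "1652.0833333333px-" and captures the integer part.
_DECIMAL_PX = re.compile(r"(?<=[/-])(\d+)\.\d+px-")


def integer_thumb_url(url: str) -> str:
    """Drop the fractional part of the pixel width in a thumbnail URL."""
    return _DECIMAL_PX.sub(r"\1px-", url)


broken = ("https://upload.wikimedia.org/wikipedia/commons/thumb/8/84/"
          "Scientific_American_-_Series_1_-_Volume_009_-_Issue_18.pdf/"
          "page4-1652.0833333333px-"
          "Scientific_American_-_Series_1_-_Volume_009_-_Issue_18.pdf.jpg")
print(integer_thumb_url(broken))
```

URLs whose width is already an integer pass through unchanged, so the function is safe to apply to any thumbnail link.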

The Outline of History

All the illustrations for Index:The Outline of History Vol 1.djvu and Index:The Outline of History Vol 2.djvu have been deleted from Commons. I could not find a link to the discussion, but it seems that the illustrations are under copyright in the UK because they are by J. F. Horrabin (d. 1962).

Since there are (were) illustrations on most pages of the work, this will mean that the integrity of the work as a whole has been compromised and may require significant cleanup. --EncycloPetey (talk) 16:59, 12 January 2020 (UTC)

Not a long discussion, see c:Commons:Deletion_requests/Illustrations_by_J._F._Horrabin. If they were in scope to be hosted here, they could have been moved. Anyhow nothing new, they always act with little consideration for other projects. Mpaa (talk) 17:47, 12 January 2020 (UTC)
c:Commons:Deletion requests/Illustrations by J. F. Horrabin -- this would be a good case for a local copy, if a commons admin wanted to copy it over. Slowking4Rama's revenge 03:17, 13 January 2020 (UTC)
adding file links for convenience. c:file:The Outline of History Vol 1.djvu and c:file:The Outline of History Vol 2.djvu billinghurst sDrewth 03:52, 13 January 2020 (UTC)
The two djvu files have been moved here guessing that at some point that someone will complain. I have also left a note on the deletion discussion to address this matter. — billinghurst sDrewth 04:07, 13 January 2020 (UTC)
@Billinghurst: I do not know how you managed to create a duplicate here, I tried for the other images (via pywikibot/API) but failed due to "API error fileexists-shared-forbidden: A file with this name exists already in the shared file repository". I also tried to import from commons (see File:Page 011 (Vol. 1 - The Outline of History, H.G. Wells).png), but I guess there is no local file here. I hope you have the tools to make a mass move from there to here. All I could do is save a local copy on my PC. Mpaa (talk) 22:11, 13 January 2020 (UTC)
@Mpaa: there is a Magnus OAuth script that allows moving files one by one to the wiki where they are in use, loaded at either my c:common.js or my m:global.js. So I will need to have files undeleted, do the undos, have the usage recognised, then move, then delete; rinse, repeat. It is an admin-restricted task, so it will be somewhat cumbersome. Might be something that we can seek temporary admin rights for a person as admin rights are needed at sending and receiving wikis; or maybe we can locally give admin rights to a Commons admin to help shift. — billinghurst sDrewth 03:39, 14 January 2020 (UTC)
Mpaa. You should have a notification of my request at C for temp admin rights for you. <shrug> to how it will be received. — billinghurst sDrewth 03:59, 14 January 2020 (UTC)
@Billinghurst: I see a couple of other options.
1. export all pages belonging to c:Category:The_outline_of_history_-_being_a_plain_history_of_life_and_mankind_(1920) to an XML file, and have someone with proper rights import them here (I can't import from file, I think a steward can). That would move all page text and history here but no images. Then we can delete files at commons and upload files here (which at this point should be possible via scripts). There is one thing I am not sure of, i.e. whether it is possible to upload a file to a File page that has been left without its image.
2. forget about history, delete files at Commons and make a bulk upload from scratch here. We can then rework offline the above XML file and restore the page text by a script, update categories, insert new templates etc.
Any of these, or the other you are discussing, are for free and I am also quite time bound in this period.
In addition we need to undo all CommonsDelinker edits .... Mpaa (talk) 20:53, 14 January 2020 (UTC)
@Mpaa: As I understand it the wiki File: pages do not actually contain the files; you can have pages in that namespace nude of files. Yes, you can add images to an existing File: ns page, it utilises the same text/link/process as overwriting a file, and yes, we need to undo, and that will be the next component if I am pushing from Commons.
FWIW with regard to rights for importing XML, we would just need a consensus to appoint someone to the role, and whether it is permanent or time limited. If 'crats cannot assign, then we ask stewards to do so, the group already exists, see Special:ListGroupRights#import. — billinghurst sDrewth 21:30, 14 January 2020 (UTC)
@Billinghurst:, then I would go for option 1 if we want/there are no issues in keeping history. Other opinions welcome. I can't assign the role. As test, you might delete this at commons (File:Page 011 (Vol. 1 - The Outline of History, H.G. Wells).png) and I might try to upload here. Mpaa (talk) 22:01, 14 January 2020 (UTC)
I filed a bug for them, they do not exist, see T242795. There are also:
File:Page 045 (Vol. 1 - The Outline of History, H.G. Wells).png
File:Page 299 (Vol. 2 - The Outline of History, H.G. Wells).png
File:Page 322 (Vol. 2 - The Outline of History, H.G. Wells).png
Mpaa (talk) 21:02, 15 January 2020 (UTC)
I also reverted all edits from CommonsDelinker. Mpaa (talk) 21:30, 15 January 2020 (UTC)
@Mpaa: I am done on the moves, will get to the data updates later. The 5 blank/dead/ugh files are there but not there. Maybe we just manually download the files, and just add them under better file names at enWS, then update the links. In the category there are still variants of some of the pages, if these need to be moved over we need to link them to the pages. Also, can you do a check for redlinks in those pages? — billinghurst sDrewth 03:39, 16 January 2020 (UTC)
Finished information updates, think that I am all done here. — billinghurst sDrewth 12:01, 16 January 2020 (UTC)
@Billinghurst: the 5 images are gone, I was not able to manually download them. Number of images in the category matches with what I downloaded, all images in cat are used in Page: ns for the work. I cannot easily check for redlinks, but I think this is good enough. Mpaa (talk) 21:54, 16 January 2020 (UTC)
I have uploaded the 5 images by old-fashioned means. I will do the remaining maintenance later. Mpaa, I am pretty certain that I have a pywikibot script for redlinks for DNB, not certain whether it is in my account, or Wikisource-bot. I will look at that when I have that access. — billinghurst sDrewth 22:35, 16 January 2020 (UTC)

A personal essay from a kindred site

https://blog.pgdp.net/2020/01/01/ten-eleven-years-at-dp/ Justin (koavf)TCM 07:27, 13 January 2020 (UTC)

Tech News: 2020-03

MediaWiki message delivery (talk) 18:39, 13 January 2020 (UTC)

Match and Split is down

"match_and_split robot is not running. Please try again later." Beleg Tâl (talk) 13:32, 15 January 2020 (UTC)

It's up now —Beleg Tâl (talk) 13:27, 16 January 2020 (UTC)

Wiki Loves Folklore

Hello Folks,

Wiki Loves Love is back again in its 2020 iteration as Wiki Loves Folklore, running from 1 February 2020 to 29 February 2020. Join us to celebrate the local cultural heritage of your region with the theme of folklore in the international photography contest at Wikimedia Commons. Images, videos and audios representing different forms of folk cultures and new forms of heritage that haven’t otherwise been documented so far are welcome submissions in Wiki Loves Folklore. Learn more about the contest at Meta-Wiki and Commons.

Kind regards,
Wiki Loves Folklore International Team
— Tulsi Bhagat (contribs | talk)
sent using MediaWiki message delivery (talk) 06:14, 18 January 2020 (UTC)

H. P. Lovecraft revisited

user:Túrelio here is a list of renewals for H. P. Lovecraft, for what it is worth (source: https://cocatalog.loc.gov/ ): User:Slowking4/H. P. Lovecraft -- Slowking4Rama's revenge 23:03, 19 January 2020 (UTC)

That's largely incomplete, since it only lists renewals for works published after 1950. https://exhibits.stanford.edu/copyrightrenewals has the Class A renewals; https://github.com/NYPL/cce-renewals has the renewals done by Gutenberg (as does Gutenberg), which has a handful of non-Class A renewals, Class A being books and submissions to periodicals. https://onlinebooks.library.upenn.edu/cce/ has the full copyright renewals, which includes periodicals, but not necessarily transcriptions.
Author_talk:Howard_Phillips_Lovecraft#Copyright_renewals_in_the_Gutenberg_Files has what was found in the Gutenberg files, which with the LoC link will get all the book and submission to periodical renewals, though not the periodical renewals.--Prosfilaes (talk) 09:48, 20 January 2020 (UTC)
better an incomplete record, than automatic deletion based on the false pretension that everything is pma + 70. uploaders should not have to fight a deletion machine, rather than a neutral copyright review. now to program in the Penn records in the automatic copyright status at wikidata and elsewhere. Slowking4Rama's revenge 03:05, 21 January 2020 (UTC)

Tech News: 2020-04

19:41, 20 January 2020 (UTC)

Curating works for export

We have a lot of works that are considered "done", but they are quite hard to find for a mobile user interested only in the end product:

  • We have Category:Validated texts, but not all validated works have this category set
  • That category is full of things like the XXX (DNB00) pages which makes it very unfriendly by itself, since those pages aren't really targets that you'd be likely to export.
  • For the works in that category, not all are well-suited for export. For example, they might not have a TOC on the front page, which kills the export tool's ability to gather all the pages.
  • Some works are perfectly exportable, but are only Proofread. Again the cat is not a perfect set of all proofread texts.
  • Both cats can also contain works and subpages of works, when you'd only really export the work. Eg. Aesthetic Papers and Aesthetic Papers/Correspondence

I suggest that we start a new category: "Work for export" or something (an analogue of fr:Catégorie:Bon pour export), which consists of works that are known to be set up correctly for a functioning ws-export export:

  • They have a TOC that works to trigger exporting of the work in the right order
    • If there is no visible TOC (e.g. the TOC is on a subpage), then there is a class="ws-summary" to do this instead.
  • No valuable content (e.g. chapter headings) is hidden in the header templates, as this content doesn't get exported.
  • The formatting is suitable for e-readers:
    • Avoid fixed-width text containers
    • Avoid fixed column-based layout (e.g. {{div col}} often looks pretty bad when you're squeezing 4 columns into an e-reader)
    • Avoid over-indenting from either margin, eg by use of ::::::, etc
    • Avoid complex layouts that freak out e-readers (I don't have a really good baseline on what counts here).

Thoughts? Inductiveloadtalk/contribs 13:42, 21 January 2020 (UTC)
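As an illustration only, the formatting "avoids" listed above could be roughed out as a heuristic check over a page's wikitext. This is a minimal sketch, not part of ws-export or any existing tool: the function name `export_warnings`, the warning labels, and the regular expressions are all hypothetical, and the six-colon threshold is taken from the "::::::" example rather than from any agreed rule. It will miss widths set through template parameters.

```python
import re
from typing import List

# Hypothetical heuristics, assumed for illustration only.
# Matches "width: 500px" or "width: 30em", but not "max-width" or "min-width".
FIXED_WIDTH = re.compile(
    r"(?<![-\w])width\s*:\s*\d+(?:\.\d+)?\s*(?:px|em)\b", re.IGNORECASE
)
# Matches the start of a {{div col}} template call.
DIV_COL = re.compile(r"\{\{\s*div col\b", re.IGNORECASE)
# Matches a line indented with six or more colons.
DEEP_INDENT = re.compile(r"^:{6,}", re.MULTILINE)


def export_warnings(wikitext: str) -> List[str]:
    """Return a list of labels for formatting that may render badly on e-readers."""
    warnings = []
    if FIXED_WIDTH.search(wikitext):
        warnings.append("fixed width")
    if DIV_COL.search(wikitext):
        warnings.append("fixed columns")
    if DEEP_INDENT.search(wikitext):
        warnings.append("deep indent")
    return warnings
```

A page that only uses max-width produces no warnings, while a hard-coded `width: 500px` is flagged; anything flagged would still need a human to decide whether the formatting is unavoidable.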

I don't think we should impose some complex formatting rules on exportable works. There's going to be exports to tiny phones and exports to large tablets, or even letter-sized PDFs for printing. We can discuss formatting separately, but there are going to be works where you have to do what you have to do. I've got mathematics books on Kindle right now that didn't come out well, but at least they're there.--Prosfilaes (talk) 17:48, 21 January 2020 (UTC)
Sure there's a lot of "best you can do, this wasn't designed for the format". I was thinking more of the easy things like avoiding fixed columns, fixed widths (e.g. use max-width rather than width) etc - generally avoid setting hard-coded horizontal positions where possible, and it'll come out "OK" on various media sizes. Some things like massive tables are just tough, they are what they are, the reader needs to scroll sideways.
I'd say the majority of "normal" works don't have any major formatting issues that interfere with export, and most of those that do are minor.
The reason I mention formatting is that it's something you should ideally at least consider before you declare a work is "good for export", especially as some remedies are trivial. Also, taking care of ebook formatting fixes the mobile formatting for free. Inductiveloadtalk/contribs 18:04, 21 January 2020 (UTC)
@Inductiveload: I believe you mentioned that {{FreedImg}} is not good with e-readers. What image templates do you prefer instead? I’ve been using FreedImg constantly, but the initial choice was almost entirely because I like its default caption settings (centered, smaller text). If I have to add that formatting by hand to some other template, I will. I notice that {{Plain image with caption}} allows, and requires, almost unlimited handmade css, whereas FreedImg has lots of built-in parameters. Levana Taylor (talk) 19:15, 21 January 2020 (UTC)
I, too, have been using {{FI}} extensively, and share Levana's questions. @Kaldari: I want to make sure you see this topic too, as your recent efforts seem to overlap. -Pete (talk) 19:41, 21 January 2020 (UTC)
This I am not sure about. FreedImg uses the full-sized image from Commons, which can be hundreds of times the size of the image actually needed on the page (e.g. the iceberg image on the template doc page is 300 times the size a normal thumbnail would be). This results in exported files being up to and over 100MB, when they'd "normally" be 1 or 2MB. This is a bit unfriendly to mobile users on limited data plans, and also for people with limited e-reader storage.
It does this so it can scale up as needed to fill space, but it seems to be a rather heavy way to deal with it. I don't have any bright ideas to "fix" it, but I'm not quite sure of what FI is trying to achieve, since it seems to be trying to achieve everything at once.
If you find yourself with substantial work-specific formatting, I would probably suggest one or more "OAW img" templates to wrap a standard image template and handle it for you so you can avoid excessive markup on the content pages. Then you have a single place to keep formatting within the work consistent. Inductiveloadtalk/contribs 19:49, 21 January 2020 (UTC)
A wonderful idea, if someone would create the templates: I think the requirements are quite simple, & I’ve stated them at Wikisource:Wikiproject Once a Week/Layout talk#Image templates. There would be no need to use FI, since the images are never used here at a greater width than 800px, and there’s only a few parameters. Levana Taylor (talk) 20:49, 21 January 2020 (UTC)
I have come up with a solution to the {{FI}} situation: {{large image}}. It will use whatever pixel size you invoke if it has enough space, and if not, it will scale it down to fix the available space. Only the specified file size is ever loaded, which is a >10x reduction in the file size in the OAW images I tried it on.
The template is deliberately simple, it's not intended to be a kitchen-sink of options. 99% of large images are just a centred image. Captions and accoutrements can be done separately with their own formatting. If your use case is not a centred image that scales to fit smaller containers, this is not the template to use. But between this and {{img float}} (left and right only, centre is broken), that should cover nearly all images, I think? Inductiveloadtalk/contribs 17:19, 22 January 2020 (UTC)
Many of your "avoids" would preclude our poetic and dramatic works, as these works typically require a lot of formatting. --EncycloPetey (talk) 23:14, 21 January 2020 (UTC)
Not at all, the thing to avoid is fixed width manipulation, where the container cannot adjust to smaller screens. It's OK to have, say, a block-center, with an unspecified or maximum width, because this will be able to get smaller when the screen does. Having a fixed width ensures that the content will disappear off the right side of the screen if the device is small enough. For example, look at The Desecration of the Han Tombs: if you constrict the width, nearly all the content goes off the right side (because the center-block sets width, not max-width). Compare to, say, Venus and Adonis, where the center-block is not fixed, and this doesn't need the user to scroll right to read every single line, and then left to start the next line. If you need to constrict width (e.g. your poem has extremely long lines), use max-width.
Very occasionally, you may really require a fixed-width container, and if that's truly the only way, then it's how it is: the user will need to scroll. But it's unfriendly to expect them to scroll for every single line. Inductiveloadtalk/contribs 10:01, 22 January 2020 (UTC)
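The difference between width and max-width described above can be sketched with a toy model. This is a simplification assumed for illustration, not a CSS implementation: it ignores margins, padding and shrink-to-fit behaviour, and the function names `rendered_width` and `overflow` are hypothetical.

```python
def rendered_width(viewport_em, width_em=None, max_width_em=None):
    """Simplified model of how wide a block ends up, in em.

    A fixed width ignores the viewport; an unspecified width fills it.
    max-width then caps the result in either case.
    """
    box = width_em if width_em is not None else viewport_em
    if max_width_em is not None:
        box = min(box, max_width_em)
    return box


def overflow(viewport_em, **kwargs):
    """How far the block spills past the right edge of the viewport, in em."""
    return max(0.0, rendered_width(viewport_em, **kwargs) - viewport_em)


# A 40em block on a screen that fits about 23em per line:
fixed = overflow(23, width_em=40)         # the reader must scroll sideways
flexible = overflow(23, max_width_em=40)  # the block shrinks with the screen
```

On a wide screen both variants come out at 40em, so switching to max-width changes nothing on desktop; only the narrow case differs.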
About how many pixels wide is the page display on typical mobile readers, do you have any idea? Levana Taylor (talk) 11:19, 22 January 2020 (UTC)
It's a difficult judgement to make, because devices vary. Phones tend to have about 350-450 "effective" pixels (they often have 1080 or bigger screens, but they scale the content). iPhone 6/7/8 is 375, Galaxy S9 is 360. With the default settings on my phone's browser (effective screen size 1080/3=360), it looks to be about 23em: the line "Hunting he lov'd, but Love he laught to scorn:" just fits on one line.
E-readers vary too, but they also have user-set font size, so any assumptions about pixel-to-em sizes are void. On my phone e-reader app, my current settings are about the same as the mobile website: 23em to a line. But if I zoom the font size out, it's more, and in, it's less. A small tablet device would probably be closer to 40em because the screen is physically bigger, and a full-on iPad could be 50em or more. All depending on the users' settings, users with poor eyesight will likely have the font sizes larger, and the line lengths will be shorter in terms of ems. For reference, 30-50em is roughly the width of an average book's line (of course that depends on font size, page size, margins, etc).
Basically, if you were to make sure the page renders sanely at ~350px, with "normal" font scaling, I think you'd cover nearly all practical devices.
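The arithmetic behind those figures can be sketched as below. The 16px font size is an assumption (a common browser default, not something measured on the mobile site), page margins are ignored, and the function names are made up for the example.

```python
def effective_width_px(physical_px, device_pixel_ratio):
    """Width in scaled ("effective") pixels, e.g. 1080 physical pixels at 3x."""
    return physical_px / device_pixel_ratio


def line_length_em(effective_px, font_size_px=16.0):
    """Approximate ems per line; font_size_px=16 is an assumed default."""
    return effective_px / font_size_px


phone_px = effective_width_px(1080, 3)   # 360.0
phone_em = line_length_em(phone_px)      # 22.5, close to the ~23em observed
```

Raising the font size shrinks the em count for the same screen, which is why a pixel width says little about how much text fits on a line.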
Also note, the mobile website scales images with CSS (max-width: 100% !important;), so though, for example, the image at The Education of the Deaf and Dumb Practically Considered is full-sized (655px), it will be scaled to avoid spilling out on mobile. My e-reader seems to also shrink images to fit the page too in the ePub, but my computer document viewer does not; there is nothing in the ePub that causes this, it's the app's own behaviour, it's not encoded in the ePub. Inductiveloadtalk/contribs 12:13, 22 January 2020 (UTC)
Right, I will try not to have any fixed widths greater than 350px then, although it’ll sometimes simply be inevitable. (An advanced way of handling things like tables and musical scores would be to create a thumbnail image of them, to be clicked if you wanted to see the real thing, but besides being too complicated, that really only makes sense in a mobile-only layout. We’re not going to have separate desktop and mobile versions anytime soon!) Levana Taylor (talk) 13:07, 22 January 2020 (UTC)
Generally, you shouldn't be specifying any width, fixed or otherwise, for text in px anyway, because the px-em mapping is very variable. Someone with poor eyesight who has set a larger font size may only have 10 or 15em to a line.
For images specifically, it seems the mobile Wikisource site and at least some e-reader apps (I tried MoonReader+, Google Play Books and Overdrive) all seem to shrink images down, so even if they're over 350px, they come out just fine on mobile devices.
For scores, the mobile site does not (currently) force the size (so it spills off the right margin), but my e-reader apps do (in an ePub, the score is just another image).
Tables are often an example of "tough cookies", they often just require more width than a 350px screen can deal with. However, generally they are not scanned left-to-right, line-by-line like text, so having to scroll around is not such a burden. Smaller tables generally "just work". Sometimes adding a vertical-align can help when the cell contents wrap.
Do you have a page in mind where a fixed width is required? Inductiveloadtalk/contribs 13:42, 22 January 2020 (UTC)
Fixed width? Fair Drinking, because of how the image with the capital letter has to be next to the text, and the text column and image column really ought to be the same height, given that that's how the image is designed. Such cases are rare though! (And yes, I do specify text width in ems, always.) Levana Taylor (talk) 14:05, 22 January 2020 (UTC)
Hmm, yeah that's a tricky one. You could probably get away with just {{drop initial}}, but then the tail of the poem will jink left if the poem is taller than the image. Example of using drop initial. But I'd say as a one-off case, it's OK to just do it your way, and just be a bit too wide in this page. It's when entire slabs of works are unsuitable for e-readers for no good reason that we should worry.
BTW: This is a good example of where the FI template is loading huge things: the image as loaded is 1,123px wide and is 5,630kB in size (!). A 238px image rendered by the server is only 79kB. Inductiveloadtalk/contribs 14:25, 22 January 2020 (UTC)
That {{DI}} version of "Fair Drinking" utterly doesn't work on the mobile browser I checked, it makes the poem lines wrap badly as well as the end of the poem moving left below the image. Nah, we will have to just treat this poem+image as an unshiftable block. There are really not many like it, though. Levana Taylor (talk) 20:21, 22 January 2020 (UTC)
I think that's fair. I also can't see a way to make this work flawlessly on mobile devices. Almost as if they didn't have phone screens in mind when typesetting things in 1861! Inductiveloadtalk/contribs 21:54, 22 January 2020 (UTC)
Consider Swanwick's translation of the Eumenides, pages 147–149, & 154 or Henry_IV_Part_1_(1917)_Yale/Text/Act_II pages 50–52. The formatting in these plays likely won't tolerate narrow screens. There are dramatic works with much more complicated formatting than this, such as Electra_(Murray) page 81, where the complexity is simply unavoidable. --EncycloPetey (talk) 01:37, 23 January 2020 (UTC)
If it's unavoidable, then it's just how it is. Most of these works render pretty well on mobile and e-readers, exactly because they have not been typeset with fixed width columns, but with dynamic layouts in mind:
  • Swanwick: those pages are still readable on mobile (different minor flaws on mobile and e-reader), but it is still readable and it's a very short section. The rest is formatted OK.
  • Henry IV: seems pretty much OK. The 1em left margin applied throughout is a little annoying in mobile, but it's not terrible, and it does seem to exist in the original, though I kind of suspect it's only an artefact of the typesetting. pp50-52 don't seem any worse than any other?
    • Perhaps the biggest comment I have is that the forced line wrapping of continued lines prevents justification (e.g. p47, first line and others). Compare to lines where the original was forcibly-broken because of the meter of the play, where there is a ragged right margin: p67. This isn't a mobile-only comment: it's impossible to tell continued text from line-broken text on all platforms, because it all presents with a ragged right margin. The same formatting (continuous is justified, verse is not) is used in other Henry IV versions.
  • Electra: p81 looks OK on mobile (as good as plays ever look due to the ~23em line length), looks OK on Overdrive, but not great on Moon Reader+ (the headings end up in the right column prefixed to the lines with an empty void on the left). Again, a short section and still readable. The forced 4em left margin on p11 is a little bit unfriendly and I wonder if there's a better way to do that. It's only simulating the original typesetting (narrow centralised column, with right aligned text). But even then, at my normal font settings, it renders out fine on my phone, because the lines are short.
All three works are generally set (correctly, IMO) with width-agnostic layouts, and have {{default layout|Layout 2}} specified.
Far from "precluding our poetic and dramatic works", these works demonstrate that you absolutely can have such works for export, and (most of) the formatting will translate quite well. I'm not, and I never have been, advocating that we should dump formatting when it interferes with mobile, I'm simply saying when there's a possibility to get it right on mobile without screwing up the "normal" output, we should do so. Very few of my "avoids" above are necessary most of the time: probably under 1% of pages (warning: rear-end-sourced number).
For plays and poems with line breaks specifically, these can be a bit "wrappy" on phones, but that's just how they are, and they'll likely be just fine on tablets and "real" e-readers like Kindles, unless the font size is enormous.
I also do not advocate to make changes to satisfy quirks of individual e-readers: unless it's trivial and non-destructive for us to work around, we should output valid markup and expect devices to deal with it correctly. Inductiveloadtalk/contribs 07:43, 23 January 2020 (UTC)

Amusements in mathematics

I've been contributing a little to Index:Amusements in mathematics.djvu recently, but I have just stumbled on an older index which appears to be an identical duplicate version of the same book, but with considerably less work done on it. Index:Dudeney - Amusements in Mathematics.djvu I'm not sure if I should continue contributing to the newer Index or swap to work on the older Index? Or should one of these be removed? Thanks Sp1nd01 (talk) 15:14, 21 January 2020 (UTC)

They are both copies of the same book. I suggest deleting the index with almost no proofread pages. You needn’t be afraid to continue with the index that you have been contributing to so far. --Jan Kameníček (talk) 16:12, 21 January 2020 (UTC)
Thank you, I've gone ahead and placed a request for deletion on the Proposed deletions page. Sp1nd01 (talk) 19:39, 21 January 2020 (UTC)

Movement Learning and Leadership Development Project

Hello

The Wikimedia Foundation’s Community Development team is seeking to learn more about the way volunteers learn and develop into the many different roles that exist in the movement. Our goal is to build a movement informed framework that provides shared clarity and outlines accessible pathways on how to grow and develop skills within the movement. To this end, we are looking to speak with you, our community to learn about your journey as a Wikimedia volunteer. Whether you joined yesterday or have been here from the very start, we want to hear about the many ways volunteers join and contribute to our movement.

To learn more about the project, please visit the Meta page. If you are interested in participating in the project, please complete this simple Google form. Although we may not be able to speak to everyone who expresses interest, we encourage you to complete this short form if you are interested in participating!

-- LMiranda (WMF) (talk) 19:01, 22 January 2020 (UTC)

Wikisource Conference in Warsaw

Dear Wikisource Community,

Meeting the Wikisource community's expectations, we are working with Wikimedia Polska and the Wikimedia Foundation on organizing the 2nd Wikisource Conference in Warsaw. We already had a survey that showed high interest in the Conference within the community. We also recently had a meeting on the conference organization process and its requirements. We are still at a very early stage of the Conference organization process, but we are hoping this event will happen in September this year.

In order to apply for Wikimedia Foundation support, we need some input from the community about the Conference goals and the community's expectations. If you are a Wikisourcian, wish to participate in the conference, or wish to help the Wikisource community make the conference happen, please fill in the short survey linked below before January 29 (the deadline for grant applications is short). Please also share this request with your communities. Here is the link to the survey:

https://docs.google.com/forms/d/e/1FAIpQLSf7FnFgMLPHeyWtBqjgXwLDYvh5vxeTnsZ0OIjTdSDrZlX0PA/viewform

Feel free to contact us, if you have any questions, suggestions, proposals, or if you wish to help us in any other way.

On behalf of the Organizing Committee,

Nicolas Vigneron

Satdeep Gill

Ankry

Index:Sally in our alley.djvu

This work needs only two pages to be validated. Any takers? (Warning: both pages are full of LilyPond markup.) —Beleg Tâl (talk) 16:19, 24 January 2020 (UTC)

Done Beeswaxcandle (talk) 18:38, 24 January 2020 (UTC)

Open call for Project Grants

Greetings! The Project Grants program is accepting proposals until February 20 to fund both experimental and proven projects such as research, offline outreach (including editathon series, workshops, etc), online organizing (including contests), or providing other support for community building for Wikimedia projects.

We offer the following resources to help you plan your project and complete a grant proposal:

With thanks, I JethroBT (WMF) (talk) 18:38, 24 January 2020 (UTC)