The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel webclient. For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 347 active users here.



New template for hyphenated words across pages

Having just run across a work that tripped multiple edge cases in how ProofreadPage joins together pages, I finally put together a utility template to make dealing with these easier.

The details are in the template documentation, but the short version is that if you have a hyphenated word that has been split across pages (i.e. where the word should still be hyphenated when transcluded into mainspace), or where the page ends with something (like an em-dash) that should be joined with the following page without inserting a space character, you can throw a {{peh}} at the end and get the desired effect in both Page: and mainspace (or in Translation:, or anywhere else). It defaults to a hyphen (“-”) when no arguments are provided, or uses its first argument otherwise (e.g. {{peh|—}}).
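To illustrate the joining behaviour described above, here is a minimal Python sketch. It is only a model under stated assumptions, not the actual ProofreadPage or {{peh}} implementation: the names `PEH_MARKER`, `peh`, `render_page` and `transclude` are hypothetical, and the sketch assumes that pages are normally joined with a single space on transclusion.

```python
# Hypothetical model of page joining; NOT the real ProofreadPage/{{peh}} code.
# Assumption: transclusion normally inserts one space between consecutive pages.

PEH_MARKER = "\x00PEH\x00"  # stand-in for the template's "no space here" signal


def peh(char: str = "-") -> str:
    """Mimic {{peh}} (default hyphen) or {{peh|X}}: emit the character plus a marker."""
    return char + PEH_MARKER


def render_page(text: str) -> str:
    """Page: namespace view: the joining character is shown, the marker is dropped."""
    return text.replace(PEH_MARKER, "")


def transclude(pages: list[str]) -> str:
    """Mainspace view: join pages with a space unless the previous page ended with the marker."""
    out = ""
    for page in pages:
        if out.endswith(PEH_MARKER):
            # Previous page asked for a tight join: drop the marker, add no space.
            out = out[: -len(PEH_MARKER)]
        elif out:
            # Ordinary page boundary: separate with a single space.
            out += " "
        out += page
    return render_page(out)


if __name__ == "__main__":
    print(transclude(["the mother" + peh(), "in-law arrived"]))
    print(transclude(["he stopped" + peh("—"), "and then went on"]))
    print(transclude(["end of one page", "start of the next"]))
```

In this model the hyphen or dash is kept in both views, and only the space that would otherwise appear at the page boundary is suppressed.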

It's had limited testing, but it's simple enough that I don't think there's much risk of weirdness.

Oh, also, you can (I think) achieve the exact same results using {{hws}}/{{hwe}}, so if you're already using those for this then there's no particular reason to switch. This is just intended as a simpler and easier to use way to achieve the same result for those of us (surely I'm not the only one? Right? Right…?) who find {{hws}}/{{hwe}} complicated and confusing to use for these scenarios. --Xover (talk) 17:58, 13 November 2019 (UTC)

Hm, that is a clever workaround… Much easier than hws/hwe. --Jan Kameníček (talk) 18:07, 13 November 2019 (UTC)


New speedy deletion criterion for person-based categories

Following on from a discussion at WS:PD#Speedy deletion of author based categories.

It is long established and in the main uncontroversial that English Wikisource does not use person-based categories (of the type "Works by John Smith", "Poetry by John Smith", etc.). Some previous discussions can be found at: 1, 2, and 3 (and the two following threads). However, absent a speedy deletion criterion specifically for these, admins have to rely on the provision for precedent-based deletions. In practice this means such categories must be brought to WS:PD to be rubber stamped, wait at least two weeks (because of inertia and habit), and then hopefully someone will remember to process them. Eventually.

I therefore propose that we extend the deletion policy with a new G8 criterion as follows:

  • Person-based categories—Categories where the defining characteristic is person-based. This includes, but is not limited to, author-based categories like "Works by author name".

All deletions (modulo CU type concerns) are subject to community challenge in any case, and are clearly visible in the deletion log, so there is no particular benefit to the bureaucracy where there exists no significant uncertainty or controversy. --Xover (talk) 14:32, 15 July 2019 (UTC)

Support, but I'd note that there is an exception discussed in link #2: namely, American presidential documents categorized by president. This is due to the fact that the administration of the executive branch is tied to who is the president at the time. There was no consensus as to the scope of this exception: what kinds of presidential documents it applies to, or whether other governments may have the same treatment, etc. —Beleg Tâl (talk) 14:42, 15 July 2019 (UTC)
Oppose 2 weeks is not too long to wait. organization of subject of a work is useful, a migration to a stable ontology is necessary. Slowking4Rama's revenge 13:58, 30 July 2019 (UTC)
2 weeks is definitely too long to wait when a full bureaucratic procedure with a foregone conclusion could be replaced with a simple administrative action. —Beleg Tâl (talk) 14:32, 30 July 2019 (UTC)
Also it is worth pointing out that this proposal is not regarding whether such categories should be kept or deleted (since we have already established that they should be deleted), but only whether they should be posted to WS:PD before we delete them. —Beleg Tâl (talk) 18:51, 30 July 2019 (UTC)
And that strictly speaking, under current policy, they can be deleted a few days after a notice has been posted to WS:PD (no two week wait required, just that the discussion must have "started"). It's just that habit and inertia inevitably means that almost all cases will in practice suffer this 2+ week purely bureaucratic delay. I'm a big believer in process and the value of bureaucracy when properly deployed, but even I think this one is a pointless waste of volunteer time. We have issues that require actual discussion or other action that have sat open on the noticeboards for a year and a half; we should not waste those resources on filling out forms in triplicate for issues that are not controversial. Any deletion can be reviewed and overturned, if needed, by the community; let's save the cautious multiple-safeguards approach for stuff that might actually need it. --Xover (talk) 19:11, 30 July 2019 (UTC)
I always wait until there has been a full month of inactivity, since there are many editors who only edit occasionally, but that's just me. —Beleg Tâl (talk) 19:17, 30 July 2019 (UTC)
Support --EncycloPetey (talk) 17:40, 30 July 2019 (UTC)
Support --Jan Kameníček (talk) 19:38, 30 July 2019 (UTC)
Support though if possible I'd like to see the exception Beleg Tâl specified firmed up a bit, i.e. perhaps a general exception for things like governments, ministries, and reigns which are "person-based" but serve an obviously different function to categories-by-author (noting on the UK side things like Category:Acts of the Parliament of Great Britain passed under George III). —Nizolan (talk) 00:44, 1 August 2019 (UTC)
  • Note Based on the discussion above I have added the above criterion with an additional limitation to exempt things like UK governments tied to a monarch's regnal period or the administrations of US presidents. I read the above as general support for this criterion—sufficient for adding it—but with some remaining uncertainty about the optimum phrasing. I'll therefore leave this discussion open for a while longer so that interested parties may object or suggest better wording. I'll also add that minor changes to the wording (that do not change the meaning) can easily be made later with a proposal at the policy talk page. And we can always bring bigger changes up here for reevaluation if it causes problems. --Xover (talk) 19:32, 11 August 2019 (UTC)

Deletion review

I long ago (2005) gathered together historical documents related to the life of Indigenous Australian warrior Yagan in Category:Yagan. This has always seemed to me a reasonable category, but it just got speedily deleted without so much as a how-d'-y'-do.

The examples given in this proposal were of the form "Works by John Smith", "Poetry by John Smith", etc. No other examples were given in the discussion. So I'm not sure if the community really intends that categories like this would be deleted. Can we review this please?

Hesperian 23:48, 2 September 2019 (UTC)

Hmm. I'm not going to express an opinion on "should" / "should not" for this, but I will note that based on my understanding of the discussions this would indeed be the intended effect. The defining characteristic of the category is that its members relate somehow to a specific person, and for such the consensus appeared to be that portals were better suited. But perhaps there is a distinction between Category:Yagan and Category:John Smith that I am not seeing? Or is it the specificity: Category:Foo by Person is bad, but Category:Person is acceptable? --Xover (talk) 03:58, 3 September 2019 (UTC)

As things stand:

  • I can gather together documents about the Battle of Borodino in Category:Battle of Borodino, because that's an event.
  • I can gather together documents about Fort Knox in Category:Fort Knox, because that's a place.
  • I can gather together documents about scissors in Category:Scissors, because they are objects.
  • I can gather together documents about intelligence in Category:Intelligence, because that's an abstract concept.
  • But I can't gather together documents about Yagan in Category:Yagan, because he was a person.

Can no-one see how bizarrely arbitrary this is??

And it hasn't even really been discussed, since the only examples given above are "Works by" categories, the deletion of which makes perfect sense. Hesperian 11:50, 3 September 2019 (UTC)

Fully agree with Hesperian, the speedy deletion is a misinterpretation of the guidance. The "category:works of ..." is to ensure that works of authors are added to author pages, and not categorised. There is no determination that it would relate to anything else. Categorisation has always existed for people, again our biggest issue is how to separate author categorisation from subject categorisation. — billinghurst sDrewth 12:39, 3 September 2019 (UTC)
Read the policy, it does not say "works by …", it says "person-based". —Beleg Tâl (talk) 12:56, 3 September 2019 (UTC)
Per our deletion policy (as updated according to the consensus in the above discussion), "Person-based categories" are now a criterion for speedy deletion. This "includes, but is not limited to, author-based categories", but "the defining characteristic is person-based". This was very explicit in the above proposal. My deletion of Category:Yagan was therefore 100% within our deletion policy. You can propose a reversion to the older version of the deletion policy, and a restoration of Category:Yagan (even though it is entirely redundant of Portal:Yagan), but I will have no part in it. —Beleg Tâl (talk) 12:53, 3 September 2019 (UTC)
Also: as things stood before the above discussion, I could gather together documents about Yagan in Category:Yagan, but couldn't gather together documents about Yazid III in Category:Yazid III, which is just as bizarrely arbitrary. —Beleg Tâl (talk) 13:03, 3 September 2019 (UTC)
(ec) It is my opinion that it is not a positive change. 0-100 in four seconds. I find the statement "It is long established and in the main uncontroversial that English Wikisource does not use person-based categories" to not be the case, especially as it has been the case since 2005. Something that was entirely in scope, and that I believe would have been kept in a PD, is now going to speedy deletion and being deleted without conversation. I find that inappropriate, and for that to have been implemented in four weeks is an example of poor implementation and poor policy. I am wondering where this community is going, and at the lack of vision that this represents. — billinghurst sDrewth 13:14, 3 September 2019 (UTC)
It may also have simply flown under the radar. It is also just one category affected, and a completely redundant one at that (equally redundant to any Author-based categories). And the proposal to update the policy was done entirely by the books, and is a significant benefit to the community. —Beleg Tâl (talk) 13:30, 3 September 2019 (UTC)
And it has been long established and in the main uncontroversial that English Wikisource does not use categories for individuals who have pages in Author space; the fact that there existed one or two categories for an individual in Portal space is (to me) a minor detail and I would have also considered it long established and uncontroversial that these were also unwelcome. —Beleg Tâl (talk) 13:33, 3 September 2019 (UTC)

Of most concern to me in this new G8 is, what if Portal:Yagan did not exist? In that case, Category:Yagan would be the only way in which we had organised our material by topic, yet it would still be summarily deletable under this new G8.

I think a more coherent policy position might be:

We don't want to organise our material by both Author/Portal and Category. So it is fine to create a category for a topic if there is no corresponding Author/Portal page. But be aware that this is a stopgap -- once someone has created the Author/Portal page, the category may be deleted.

Note that this doesn't distinguish people from other topics. Category:Yagan is fine, but only until Portal:Yagan has been created. Even Category:Works by John Doe is fine, but only until Author:John Doe has been created.

I think the biggest problem with this position is the really big topics that would be better handled by a category than by an Author/Portal page e.g. War. In that case, I would say keep the category and ditch the portal, which would be unmaintainable. In a speedy criterion there would certainly need to be something to prevent deletion of categories that contained subcategories or a collection of portal/author pages.

Thoughts? Hesperian 22:50, 3 September 2019 (UTC)

Since the attitude to concerns raised here has been "I will have no part in it" followed by non-participation in the discussion, I have boldly replaced "person-based" with "author-based". I accept the new G8 was proposed, discussed and implemented in good faith, but subsequent objections have made it clear that there is no consensus for speedy deletion in the gap between "person-based" and "author-based".

To be clear: we may not agree on whether Category:Yagan should have been deleted, but I think we can all agree that the deletion was contentious, and speedy delete criteria are intended to capture non-contentious matters.

Hesperian 07:53, 6 September 2019 (UTC)

@Hesperian: I'm not going to revert that because I think at least temporarily going back to the status quo is prudent when a concern has been raised so soon after implementation. But I do object in principle to your approach here: whatever the problems with the new G8, it was properly discussed, consensus determined, and implemented. For you to unilaterally reverse it is not a good practice, no matter the merits of your concerns with it. The proper description of the thread above is, strictly speaking, not "absence of consensus" but rather "complaints after the fact" (possibly good, proper, and meritorious complaints, but still after the fact). So I am going to insist that this removal of the new criterion is a temporary measure while discussion is ongoing, and not the new status quo. If no new consensus is reached here then we revert back to what was previously decided. (To be clear, if you had suggested we should temporarily revert I would have supported that. It is your acting unilaterally with an apparent intent to change the status quo I object to.)
That being said I am absolutely open to being convinced of anything from the new criterion needing to be tweaked and to it needing to be dropped altogether. The reason I am not currently actively discussing is that I do not feel I sufficiently grasp the issue and am mulling it over. Your distinction between "person-based" and "author-based" has not been apparent to me prior to your latest comment, and I now suspect that that distinction is the crux of your objection; but I still do not grasp why you do not feel a portal would be sufficient. On the other hand, reasonably curated categories are cheap, and can conceivably be automatically applied to works included in a portal.
I also suspect, though I may of course be entirely mistaken, that what we are discussing here is not actually a speedy criterion, but rather a more fundamental issue of category and portal policy. I am not convinced the speedy criterion is a useful proxy for that debate, on the one hand, and that the former will resolve itself neatly if the latter is settled, on the other. --Xover (talk) 08:30, 6 September 2019 (UTC)
@Hesperian: "I will have no part in it" is me, not the community. I agree with Xover that it is necessary to establish a new consensus with the community to make a subsequent update to the deletion policy (in which discussion I will remain neutral). And like I said to TE(æ)A,ea.: three days is not remotely sufficient for closing a discussion. Be patient. —Beleg Tâl (talk) 12:20, 6 September 2019 (UTC)
  • Comment There is definitely a long-established practice that we collect and curate works that relate to authors, and due to our strong preference to curate, we determined not to categorise, which would have produced duplication and confusion. It has not been the case for individuals who were not authors, and it should not be a requirement that we have to curate such pages, especially where a person may be mentioned on a page(s) though not be the focus of the pages. For instance, the page The Perth Gazette and Western Australian Journal/Volume 1/Number 28 would be considered for categorisation in "Category:Yagan", though Yagan would not particularly be the focus of the page, nor would it necessarily be put onto a Portal: ns page. I would definitely not expect someone to have to make edits to a portal page to that target, though I would have no qualms with someone categorising. Where we have authors, we have wikilink'd back to author pages for that relevance. So it is my belief that these non-author categories should not be speedied; if there is a case for their deletion, then bring it to the community. I also believe that a proposer should be listing consequences of their suggested policy changes, not leaving it to the community. I find the above consensus to be a troubling "yes ... tick and flick" exercise by the community without an in-depth exploration of the consequences; changes approved for speedy deletion should cover only items that are completely non-controversial.

    The above deletion discussion started with the scope of a PD discussion about author categories, and then specifically addressed two author related categories. No examples were given of non-author categories that would have been wrapped up in the change of our guidance, nor that we were going to now speedy delete categories that have been existing for greater than 10 years. I have a strong belief that anything that has existed for over 10 years onsite should not be speedied, and that speedy deletions are only best applied to recent additions.

    Xover: You suggested the policy change, then summarily closed less than four weeks later, and implemented. May I suggest that is not the ideal practice either, as this is a change of policy where all person categories are deleted, not as indicated in the discussion that it was an existing process and the speedy being the only change. We are not a huge community, we don't have the same editing rates, or the diversity of eyes to analyse such situations, and that is traditionally why we have left discussions open for extended periods. — billinghurst sDrewth 10:55, 7 September 2019 (UTC)

    @Billinghurst: "Too quickly closed" is a fair complaint, although I don't entirely agree with that assessment. I agree there should be plenty of time for the community to ponder, scrutinise, discuss, and decide; and in fact was somewhat disappointed that the proposal did not garner wider participation and more discussion. I agree speedy criteria should have a firm basis, which broad participation in the proposal is the best way to ensure (and document!). But I also observe that community participation in such discussions is distressingly low in general, and by that yardstick the above was about the most I felt one could realistically hope for. When no further comments either way surfaced—not even any "Unsure" or "Wait, I need to think a bit more"—I felt that was sufficient to implement. If we want to have much longer timeframes to tease out every possible community comment then we should have specific guidance to that effect (and I do mean a specific number of weeks).
    I agree that speedy should be for uncontroversial things, but then my understanding was that this was uncontroversial. My intent in making the proposal was not to change practice regarding use of categories vs. portals, but rather to eliminate a pointless two-week wait and bureaucratic box-ticking for something that was a priori determined would be deleted. I do however disagree that speedy should not be applicable to, for example, decade old clear copyvio. The purpose of speedy deletions is to reduce bureaucracy and make maintenance more efficient—where possible—and to reduce the demands on the community's time and attention in formal discussions. Because, as you point out, such participation is perhaps our scarcest resource! The age of the material affected is entirely orthogonal to whether it falls within one of the speedy deletion criteria.
    "Uncontroversial" is a better distinction, but even there some nuance is needed. The policy that leads to the deletion (by whatever process) must be unambiguously decided: it must be uncontroversial that that was what the community decided. The issue itself, though, can still be plenty controversial: there are some contributors who would never see anything deleted, for any reason, and express their frustration with copyright law and our copyright policy in every copyright discussion they participate in (never mind proposed deletions). That someone disagrees with the community's decision, once made, is not a valid reason for considering the implementation of that decision controversial.
    On the issue at hand, though, I (am starting to) see the person/author distinction, but I am having trouble understanding how a portal is any less suited for a person than for an author. To my mind the very same arguments for portal over category for authors apply equally to persons. Why wouldn't The Perth Gazette and Western Australian Journal/Volume 1/Number 28 go in the portal? Or is it the perceived relative amount of effort in curating the two approaches? Hesperian's more coherent policy position seems to suggest that that is the case.
    I don't think starting with a category but deleting it if a portal is created is a particularly rational approach, but as a proposal it does speak directly to the relationship between categories and portals. To me, the opposite end of the spectrum (that you also address) seems more elucidating: once a topic is sufficiently large, a portal becomes an awkward way to organise the information. In those cases I could see an argument for using both; the category for everything and the portal for the highlights. But that's an argument that will be relevant only rarely (relatively speaking) and only in the reverse order (only once the portal is "full" does the category come into play). Most person-related topics will not have too many relevant works for a portal.
    Or perhaps a different angle of attack would aid common understanding: Categories, Portals, and Author-pages overlap in various ways and in different degrees, and so we should establish some coherent guidance on the purpose of each, what to use each for, and how to distinguish between them in difficult cases. Perhaps in discussing what that guidance should be we would better understand the various perspectives than through the proxy of a speedy criterion? For example, do we want a portal about a person as a historical figure if that person is also an author? Is an Author: page and a Portal: the same thing except for inclusion criteria? Do the same layout rules and restrictions apply to both? --Xover (talk) 03:19, 9 September 2019 (UTC)
i am sad that admins persist in summarily deleting, for contentious issues that require a consensus. we need a standard of elevating issues on chat before deletion. and a standard of practice of how to organize ontologies of "subject of" and "depicts". i don’t care how- portals, categories, subsection, anything that can be linked from wikidata. but we need an organizational consensus, not deletion. Slowking4Rama's revenge 03:43, 13 September 2019 (UTC)
@Slowking4: But, but, but, but you do not understand the sysop perspective. They delete without consequence (for themselves, as from a sysop's perspective a deleted page may be viewed/restored and viewed without going through with restore. See? No consequence!) As for the plebs, tough! Them's oughta put in an application to be tiara'd like good little princesses… 06:09, 13 September 2019 (UTC)
114.78: I realise you're taking the piss here, but I actually agree that this is an important difference in perspective to take into account. One thing is that the consequences of deletion can in some (but not all!) cases appear smaller to those with the technical ability to view and restore deleted pages, but the perspective is also shifted when you have long backlogs of tasks that either can only be resolved (in practice) by deletion or where deletion is a fairly foregone conclusion. To have to conduct a formal analysis, formulate it cogently, and run a community discussion is a lot of effort. The relatively low community participation in those discussions means they have a tendency to deadlock, and if resolved are too local to support any kind of future precedent. When a lot of your tasks are dealing with that dynamic, you will naturally tend to develop a bias (big or small) toward more efficient resolutions like having speedy criteria for whatever the issue at hand is.
But when you spend a lot of time going through the maintenance backlogs you also gain the very real experience that tells you that a lot of stuff has been dumped here with no followup, attempts to format properly, or even giving minimal source or copyright information. There is literally no hope of these works being brought up to standard as they are, and would in any case be easier to recreate from scratch than fix in place, even if they aren't blatant copyright violations. While we certainly need to watch for and not get fooled by the previously mentioned bias, we also should let ourselves be guided by this experience. Sometimes the perspective of those who work the maintenance backlogs (which is not by any means limited to just admins!) gives them a better foundation for reasoning about an issue than those who work primarily on their own transcriptions (and sometimes not). --Xover (talk) 07:25, 13 September 2019 (UTC)
your "guided by experience" does not address the power dynamics of a summary standard of practice. when you undertake an action. no matter how reasonable or justified you may feel, while the community is feeling ill-used, then you might want to rethink your action, if you would presume to lead a community. we have a lot of ban-able admins. Slowking4Rama's revenge 11:44, 13 September 2019 (UTC)
@Xover, @Slowking4: My sincere apologies if my comment came across solely as micturient. When young fresh meat front up to gain the authority bit it is entirely reasonable they not realise they are actually signing up for a melange of teacher, executioner, judge and neat-freak. What is less excusable is that some of them never even learn of the damage they do to the parallel roles whilst obsessing over the matter of the moment. Ordinary users are watchers and judges too and may take away quite unexpected conclusions from administrator actions. Looked at another way the spread of intelligence is (sadly) unrelated to the authority role granted. That there never seems to be a shortage of potential idiot actions does not mean it is a good idea to go down each and every rabbit-hole.
On the other hand the occasional well-reasoned explanation might even result in the next applicant putting their hand up and taking some pressure off the backlog slaves. If that flags me as both bitter and optimistic then just handle it. I have to. 22:06, 13 September 2019 (UTC)
@Slowking4: I have suggested above that the ontological discussion might be a better way to approach this issue than the speedy criterion. What are the ontological categories we need to handle, and what tool or structure of those we have available to us would be best to handle each? If we can figure out some guidance on that then what should be kept and what should be deleted will, hopefully, follow naturally. Perhaps you could flesh out your thoughts regarding "subject of" and "depicts" with that in mind? --Xover (talk) 07:25, 13 September 2019 (UTC)
we would need to group together all those works, for which people seem to use categories. we have categories on authors, we could start with a wikidata infobox at author pages. if the community wants portals for subjects, then we will need an infobox and migration from categories to portals. (this is different from how it is done on commons) you could then link on wikidata, and have some query function to aid search, we need some wayfinding to aid search of topics. Slowking4Rama's revenge 11:53, 13 September 2019 (UTC)

Further discussion needed (New speedy deletion criterion for person-based categories)

I am quite a bit concerned about this, and have unarchived it to prevent it lingering on unresolved.

We are now in a situation where the community has voted to implement a criterion for speedy deletion that allows any administrator to delete such matter at their own discretion with no a priori community approval (all admin actions are, of course, subject to a posteriori review by the community), but where at least two long-standing and very experienced contributors have objected to the core issue after the fact, and levelled criticisms at the formalities of the community decision process. Their objections are reasonable ones (in the "reasonable men may disagree" sense), and the criticisms of the process valid.

To make clear the procedural issues, the proposal described the issue as "in the main uncontroversial", which the objections have demonstrated was not entirely accurate, and it was closed after a mere four weeks (two weeks after the last comment), when an objection became apparent after six weeks. Additionally, relating to the core issue, those who disagree feel the examples provided in the proposal do not accurately reflect the criterion as it was implemented. These are all valid complaints and the responsibility for these deficiencies in the procedure falls to me (my apologies).

But, in any case, the core issue remains: we now have a speedy criterion that two very respected and experienced community members have valid and strongly held objections to.

The arguments of those who object are presented above under the "Deletion review" thread. I had hoped that the community would chime in on that discussion such that it would be possible to assess whether the community shares the concerns of those who have objected, or whether they still support the criterion as implemented.

But as that has not happened I would like to directly request that the community chime in to make clear their position on how to handle this.

  • Despite the criticisms, the original community vote was valid and concluded with support, so the default outcome, if no change is mandated here, is that the criterion as written will be implemented. It is currently temporarily suspended as a conservative measure since objections have been raised.
    • In particular, this means that if you do not express an opinion now you will in practical effect be reaffirming the original outcome!
  • Does the community feel that the concerns raised are serious enough to invalidate the previous vote and revert to the status quo ante?
  • Does the community feel we should proceed as per the existing vote and adjust course as necessary at a later date?
  • Alternately, does the community feel we should proceed as previously voted but with specific changes to the wording of the criterion?
    • For example, Hesperian has specifically proposed replacing "Person-based" with "Author-based" in the criterion.
  • Would the community prefer a new proposal, that better explains the issues, be made and a new vote held on that?
  • In essence: do you have any opinion or recommendation on how this disagreement should be handled such that we end up with the issue settled?
    • Not everyone needs to agree with the outcome, but everyone should preferably feel that the outcome was fairly arrived at!

Pinging previous participants in the vote/discussion (but everyone is, of course, encouraged to chime in): Beleg Tâl, Slowking4, EncycloPetey, Jan Kameníček, Nizolan, billinghurst, Hesperian.

This has dragged on unresolved and it's the kind of thing that has the potential to create conflicts and discord down the line so, despite the sheer amount of text and rehashing, please chime in and make your position clear! --Xover (talk) 07:22, 14 October 2019 (UTC)

  • Comment The examples used of the purpose and solutions did not adequately represent the proposal. I don't believe that any long-held page that appears valid at a point in time should be speedy deleted with a change in policy, especially where it is unclear in the proposal that such pages were being incorporated. My understanding of our approach was that we would not build author category listing pages those to go. — billinghurst sDrewth 09:56, 14 October 2019 (UTC)
    @Billinghurst: It is not clear to me from this comment how you would prefer to resolve this issue. Could you make that explicit? --Xover (talk) 06:46, 15 October 2019 (UTC)
    Don't speedy delete long-held pages.

    If you are putting forward a policy change, then identify the pages that are going to be caught by the policy change. Look to use best examples, not examples where we are already in agreement. If you are deleting and you come across long-held pages that you believe are caught by a policy change, and they have not been specifically mentioned, then have an open discussion so that we have a consensus that is what we were looking to do. {{smaller|Administrators are the implementers of consensus, not the determiners of what happens here, and we should be looking to be considerate. Err on the good-side and the patient-side. In reality, for many things there is no hurry, despite some of us at some stages just wanting to get things tidied away.}} — billinghurst sDrewth 22:46, 21 October 2019 (UTC)

    @Billinghurst: Apart from the age exemption (which I have addressed above somewhere), this is all good advice and I agree whole-heartedly. But now you're just chiding me. What, specifically, is your preferred way to resolve this issue? Do you want the new speedy criteria rolled back and removed? Do you want its text changed from "Person-based" to "Author-based"? Or are you proposing an entirely different, general, rule that no content older than X time units may ever be deleted under any criterion for speedy deletion?
    Because right now we have an existing, valid, community decision in favour of the new criteria with the "Person-based" meaning, but I am bending over backwards to try to make sure the concerns you and Hesperian have raised are taken into account (giving everyone a chance to change their minds if they are swayed by your concerns).
    If your goal is to censure me for insufficiently researching and documenting the consequences of the new policy, or for failing to insist on a longer period before being closed, then, fine, consider me suitably chastened. But me standing dressed in a white sheet in church on three Sundays isn't really going to change much. So far, of those who originally supported the new criterion, only Jan has chimed in and they reaffirm their original position. If you want a different outcome you need to at least tell us what it is. --Xover (talk) 07:49, 22 October 2019 (UTC)
  • As I said before, I remain Neutral regarding the proposed change from the current "person-based" deletion rationale to the proposed "author-based" rationale. —Beleg Tâl (talk) 15:19, 14 October 2019 (UTC)
    I voted for deletion of person-based categories and I hope that the vote also counts in this way. If somebody wishes only deletion of author-based categories instead, it should be suggested as an alternative rule. I admit it is my fault I did not protest when somebody changed the proposal without others expressing their consent clearly, but still: changing rules needs explicit consent, which is missing here.
    That said, I do not think that the idea of treating author-based categories differently from categories of other people is good.
    • Firstly, this can be a source of big confusion to many readers browsing categories: some people are included in the category tree and others not, and an accidental visitor to Wikisource unfamiliar with our internal rules will not find the clue.
    • Secondly, it is not defined who is considered to be an author by this rule: A person who is author of a work at Wikisource? A person who is author of a work eligible to be added to Wikisource? A person who is author of a work in English or translated into English, although it won't become eligible for WS for decades? Or any person who is author of whatever in any language, which may but also may not be translated into English in the future? We have some definition in the Style guide which says that "... author ... is any person who has written any text that is included in Wikisource." However, too many contributors refuse to follow this definition and found author pages of people who have no work here, sometimes even authors who have never written anything in English and nothing by them has been translated into English so far (example). I am afraid the same will sooner or later happen with categories.
    • Let's say that we determine some line dividing authors and the rule will say which authors can have categories and which not. The rule could be: authors who have an author page cannot have category, and vice versa (or any other definition). Again: an accidental visitor browsing categories will be confused, unable to find our internal clue why Alois Rašín can be included in the category tree and Karel Kramář not.
    To conclude, the best way is the simplest: forbid all person-based categories and organize people only in the author and portal namespaces, or alternatively allow categories for everybody. I am for the first of these two choices. Jan Kameníček (talk) 20:10, 14 October 2019 (UTC)
  • comment, i am concerned about increased use of speedy deletion, that has been abused elsewhere. i would prefer use of maintenance task flows in the open. i do not see a pressing problem. but maybe this is overblown, and the admin task flow here will not be abused. i raised my concern and got dismissed, which is fine with me.
  • what we really need is a consensus about how we structure our data with wikidata. (be it categories, portals or tags) we need a stable page, about work subjects, that can link to wikidata. we have a "works about" section for authors. but we need it for non-authors also.Slowking4Rama's revenge 14:09, 15 October 2019 (UTC)
    Your concerns were not dismissed, some editors merely disagreed with them. But in the interest of clarity, in view of your comment here and your original oppose vote to the proposal, do I understand correctly that your preferred resolution to this issue is to roll back to before this proposal and have no speedy deletion criterion for this at all? --Xover (talk) 14:41, 15 October 2019 (UTC)
yeah, apparently, i have out of consensus views of those who show up for process discussions. i just want some stable bibliographic metadata about "depicted people" and subjects. i am open to how to structure it, and what is the road map to get there. i do not care about rolling back a particular direction that i think is mistaken. (the problem with deletion is that it decreases the slim possibility of quality improvement, since it hides quality defects rather than making them more visible.) Slowking4Rama's revenge 15:43, 15 October 2019 (UTC)
@Slowking4: do you see the value in having a general essay and guidance on how we handle people who are not authors? Then having a range of means to handle these depending on the person's notability, and possibly the number of references/sources that we are having for these people. Some of the solutions will be here at enWS, and others may be at WD. Our policy guidance of 2010 probably needs to evolve with the implementation of Wikidata which is a bigger people resource and allows interactions and linking differently than our 2010 focus on enWP linking to notable people. Here I am thinking something akin to Wikisource:For Wikipedians and it might be something like [[Wikisource:For Wikidatans]] and [[Wikisource:Managing people data at Wikisource]]. — billinghurst sDrewth 22:55, 21 October 2019 (UTC)

Update to NopInserter Gadget

A while back, while debugging an unrelated issue, I found a bug in MediaWiki:Gadget-NopInserter.js that prevented it from displaying the intended visual indication of its operation. I also found that at the time of implementation there had been some differing preferences for what type of visual indicator be used and when prompting for confirmation was appropriate. I have therefore created an updated version in User:Xover/Gadget-NopInserter.js that fixes the bug and adds configuration options for whether to confirm addition of a {{nop}}, the style of visual indicator, and the duration of the indicator effect. To try it out you can add the following to your common.js (but disable the site-wide gadget in your preferences first!):

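mw.config.set('userjs-nopinserter', {
	dontConfirmNopAddition: true,
	notificationStyle: "highlight",
	notificationTimeout: 1000
});
// Load line reconstructed from the description; the raw-load URL form is an assumption.
mw.loader.load('//en.wikisource.org/w/index.php?title=User:Xover/Gadget-NopInserter.js&action=raw&ctype=text/javascript');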

It also fixes the bug that prevented the site-wide gadget from actually showing the outline based highlight it was supposed to. And for good measure I added support for a notificationStyle using MediaWiki's bubble notifications (set notificationStyle: "message" to try it out). The weird double-negative construction of "dontConfirmNopAddition" is just because I've preserved the default behaviour of the site-wide gadget. If you remove everything except the mw.loader.load line (no setting of options) you will get the old default behaviour with just the bugfix.
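As a rough illustration of how the three options described above could be consumed, here is a minimal sketch. It is not the code in User:Xover/Gadget-NopInserter.js: the function name `resolveNopInserterOptions` and all default values are assumptions made for the example. Only the option names and the "highlight" and "message" styles come from this thread.

```javascript
// Hypothetical defaults; the real gadget's defaults may differ.
const ASSUMED_DEFAULTS = {
	dontConfirmNopAddition: false, // assumed: ask for confirmation unless told not to
	notificationStyle: "highlight", // "highlight" and "message" are the styles named above
	notificationTimeout: 1000 // assumed to be milliseconds
};

// Merge whatever the user set under 'userjs-nopinserter' over the defaults.
// Passing null or undefined (no options set) yields a copy of the defaults.
function resolveNopInserterOptions(userConfig) {
	return Object.assign({}, ASSUMED_DEFAULTS, userConfig || {});
}

// In common.js the argument would come from mw.config.get('userjs-nopinserter').
const options = resolveNopInserterOptions({ notificationStyle: "message" });
console.log(options.notificationStyle);
```

Keeping the merge in one small function means a user who sets no options at all still gets a complete options object, which matches the "old default behaviour" case described above.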

The changes can be seen in this diff.

The changed version has had some limited testing and seems ready for wider testing. I therefore propose that we update MediaWiki:Gadget-NopInserter.js with this version. Note that since we do not have interface administrators locally, I will have to request this edit from the Stewards at meta, and they will require a community discussion to verify that this is indeed a change in line with community consensus. It would therefore be very helpful if as many as possible indicated whether or not you support this proposal. --Xover (talk) 12:37, 6 October 2019 (UTC)

I think local bureaucrats can set the "Interface administrators" bit.Mpaa (talk) 14:18, 6 October 2019 (UTC)
Bureaucrats have the technical ability to flip that bit, yes. But by WMF Legal-imposed policy it requires 2FA, and so can't just be assigned ad hoc like other local permissions. And since we have no permanent interface admins, nor any "list of people willing and able to make interface admin-edits", we don't actually have any functioning local way to request such changes; unless you yourself happen to have 2FA enabled for other reasons. Thus, asking the Stewards at Meta is actually the easiest option for getting such changes made currently. --Xover (talk) 05:55, 7 October 2019 (UTC)
OK, just a bit weird that it is 'technically' possible but not 'legally' possible without 2FA. If they want to be on the safe side, they should not allow without 2FA.
Support Anyhow, I am fine with the proposal.Mpaa (talk) 19:37, 7 October 2019 (UTC)
Yeah, the 2FA requirement is kinda dumb to begin with (not that it helps that we don't have a functioning local Interface Admin policy). In any case, this thread, so far, does not demonstrate community support for the proposed change (absence of objections is not the same as support), so it appears this change will not be implemented. Note that if the community's reticence should happen to be about the other changes, I can redo this patch to only fix the (7-year-old) bug. In the meantime, anyone that wishes may of course use the copy in my userspace using the syntax described above, but absent any indications of interest I probably will not be actively maintaining that copy. --Xover (talk) 13:39, 5 November 2019 (UTC)

Proposed update to template:header

There has been a discussion about adding some parameters to the header to capture contributors to sections/subpages of works, and to create synonyms to have more generic parameters available. The proposed changes are at template talk:header#Action to be taken October 2019. Flagging this prior to implementation in case anyone sees problems or has major issues that should stop moving forward. — billinghurst sDrewth 21:43, 17 October 2019 (UTC)

  • Support --Xover (talk) 08:52, 18 October 2019 (UTC)
  • Support, I do miss the section translator in the template. --Jan Kameníček (talk) 17:17, 18 October 2019 (UTC)

Proposed changes to WS:WWI regarding advertisements

There is a proposal to update the wording of our policy regarding the inclusion of advertisements, in particular advertisements that are part of a larger transcluded text. Please see the discussion at Wikisource talk:What Wikisource includes#Proposed changes to Advertisement section. —Beleg Tâl (talk) 13:50, 15 November 2019 (UTC)

Bot approval requests


Comment if we are going to do this, what is the possibility to put the "validated" flag on the wikidata item interwiki for the respective works? To note that I am gathering that this is for situations where the Index: page has been marked as validated and there is a one-to-one relationship with a main namespace page. To note that this list does include subpages of works where the works have been uploaded as parts, some would warrant listing independently as validated, others, not so. [A find of <code/ shows interesting output. — billinghurst sDrewth 05:35, 31 October 2019 (UTC)
@Kaldari: I agree with billinghurst regarding subpages. I would leave them out as it might be controversial and limit entries to top level items for now.Mpaa (talk) 14:44, 10 November 2019 (UTC)
Please check your list for redirects. I have found one redirect in the list. — billinghurst sDrewth 05:39, 31 October 2019 (UTC)
I was planning to have the bot follow redirects when posting the template, but I'll just go ahead and replace any redirects with their targets in the list... Kaldari (talk) 23:03, 31 October 2019 (UTC)
@Billinghurst: I've replaced all the redirects in the list with their ultimate targets. Kaldari (talk) 19:42, 1 November 2019 (UTC)
Adding the flag in Wikidata should be fairly easy to do if the wikibase-api library supports it. If not, I'll need to dig into the actual Wikibase API, which might be complicated. Kaldari (talk) 00:12, 1 November 2019 (UTC)
It looks like wikibase-api does support setting badges. However, I think it would be best to add the badges after all the pages have been templated, categorized, and reviewed for accuracy. My list is mainly based on following title links from the Indexes. However, I've noticed that use of the title field varies considerably. Some link to disambiguation pages, some link to multiple pages, and some don't have links at all. I've tried to go through all the ones that are obviously problematic and fix them by hand, but I imagine I will have missed some. Once the pages are added to Category:Validated texts (by the template), it will be easier to review them all, as I can just open them in new tabs from the category page. Once they are reviewed, anyone could write a script to badge all the pages in the category. Kaldari (talk) 19:41, 1 November 2019 (UTC)
I think that we should have the header module pull the Validated status from Wikidata and display the badge that way, but I support having a bot ensure that the status is correct on Wikidata. —Beleg Tâl (talk) 21:55, 1 November 2019 (UTC)
@Beleg Tâl: I think that's a good idea in theory, but there are some practical problems. Few Wikisource editors bother to create or link Wikidata items to their works. Of the first 5 works in my list, only 2 were linked to Wikisource. If people aren't even linking to Wikidata consistently, I think there's a vanishingly small chance that they will try to keep the Wikidata badges up to date. I imagine that people will just start adding their works to Category:Validated texts manually, as it won't be very intuitive that you have to set a badge in Wikidata (a very obscure feature) in order to add the work to the Category on Wikisource. Plus I don't really see what we gain by having the status tracked on Wikidata rather than in Wikisource directly. Kaldari (talk) 01:33, 2 November 2019 (UTC)
@Kaldari: users will not add Wikidata badges manually, but they will not add Category:Validated texts manually either (I certainly won't). We'd need a bot either way, so we may as well have the bot do it "properly" i.e. by leveraging Wikidata (and thus preventing duplicate data). If we need a bot to create the Wikidata item in the first place, then we should look into that also. —Beleg Tâl (talk) 14:54, 2 November 2019 (UTC)
If I am not wrong, 922 items are missing a Wikidata item. I tried to pull Category:Validated texts from the Wikidata badge but I did not succeed, I could only pull Category:Validated. Does anyone know how?Mpaa (talk) 22:21, 5 November 2019 (UTC)
@Mpaa: The idea that I put forward, would be to have the header module check the text's associated data item, and if the validated badge is present, then it would add Category:Validated texts to the header (similar to how Category:Works with non-existent author pages is added by the header based on the presence of an associated author page, though this uses #ifexist instead of a Wikidata query) —Beleg Tâl (talk) 00:48, 6 November 2019 (UTC)
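The badge-to-category idea above can be sketched as a small decision function. This is only an illustration in JavaScript: a real implementation would live in the Lua header module, and the names `categoriesFromBadges` and `VALIDATED_BADGE_ID` are hypothetical. The badge ID is a placeholder because the real Wikidata item ID is not given in this thread. The input is assumed to have the shape returned by the Wikidata `wbgetentities` API with `props=sitelinks`.

```javascript
// Placeholder only: the real Wikidata item ID of the "validated" badge is not given here.
const VALIDATED_BADGE_ID = "Q_VALIDATED_BADGE";

// `entity` is assumed to look like a wbgetentities result with props=sitelinks:
// entity.sitelinks[site] = { site, title, badges: [...] }.
// Returns the categories the header would add for the given site.
function categoriesFromBadges(entity, site, badgeId) {
	const sitelink = entity && entity.sitelinks && entity.sitelinks[site];
	const badges = (sitelink && sitelink.badges) || [];
	return badges.includes(badgeId) ? ["Category:Validated texts"] : [];
}

// Made-up entity data for the example.
const example = {
	sitelinks: {
		enwikisource: { site: "enwikisource", title: "Example work", badges: [VALIDATED_BADGE_ID] }
	}
};
console.log(categoriesFromBadges(example, "enwikisource", VALIDATED_BADGE_ID));
```

A work with no linked data item, or with a sitelink that carries no badge, simply gets no category, which is the case raised above for works not yet linked to Wikidata.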
I got the idea, I am wondering if someone knows how to pull the badge for a sitelink of a wikidata item, with the current available modules. If not, or if this is not supported, this doesn't seem a good way forward at the moment, until this point is cleared.Mpaa (talk) 22:18, 8 November 2019 (UTC)
@Mpaa: Module:Edition contains code for retrieving the badge of the sitelink, though it doesn't do anything useful with it. —Beleg Tâl (talk) 15:29, 9 November 2019 (UTC)
@Beleg Tâl:, it seems there are a few more steps to be done, like modifying Module:Edition to get the badge ID or the category we want to associate to it.Mpaa (talk) 18:21, 9 November 2019 (UTC)
@Mpaa: it looks like w:Module:Wd does it properly, so we could just import that module. —Beleg Tâl (talk) 13:58, 10 November 2019 (UTC)
Sounds good to me. I am not very familiar with importing pages (I am uncertain whether "Include all templates" needs to be flagged in this case). If someone volunteers, so much the better; otherwise I will give it a try.
Update: As it seems the desired route is to record the information in Wikidata, I will need to write some more code: first to collect all the associated Wikidata items (unless Mpaa has already done this) and then to add the badges in Wikidata. Unfortunately, I'll be at the Wikimedia Technical Conference all next week, but hopefully I can start working on it afterwards if everyone agrees this is the best solution. Kaldari (talk) 20:08, 8 November 2019 (UTC)
If needed, I can generate a list, and also create the missing WD items. Mpaa (talk) 21:21, 11 November 2019 (UTC)

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

Index:Eight Harvard Poets.djvu

The following discussion is closed and will soon be archived:

This work is missing two pages, which are present in this scan on HathiTrust --Beleg Tâl (talk) 14:09, 22 October 2019 (UTC)

@Beleg Tâl: Done --Xover (talk) 15:45, 22 October 2019 (UTC)
@Xover: Thanks. Discussion is continued here: Wikisource:Bot requests#Index:Eight Harvard Poets.djvu --Beleg Tâl (talk) 15:59, 22 October 2019 (UTC)

Would anyone like to volunteer to validate the two pages that were just added? —Beleg Tâl (talk) 12:24, 23 October 2019 (UTC)

@Beleg Tâl: Done --Xover (talk) 13:22, 23 October 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:04, 3 November 2019 (UTC)

Index:Dreams and Dust, by Don Marquis.djvu

The following discussion is closed and will soon be archived:

Would it be possible for someone to swap Page:Dreams and Dust, by Don Marquis.djvu/32 and Page:Dreams and Dust, by Don Marquis.djvu/33? They are in reverse order for some reason. (The transcluded wikitext has already been swapped.) —Beleg Tâl (talk) 20:29, 22 October 2019 (UTC)

@Beleg Tâl: Done Incidentally, this file is a good example of why DjVu rocks: the current file separates the foreground and background into separate layers and is 1.65 MB (avg. 8.1 kB/page). That's 1.65 MB total for 208 pages at 1874x2888 pixels, plus the hidden text layer, plus other format metadata and overhead! Just for kicks I downloaded the scans and regenerated it—and my tooling does not separate the background and foreground—and the file was 37 MB (avg. 182.2 kB/page). Both files are from the exact same raw scan images and the exact same resolution. The raw JPEG 2000 image files are 53 MB altogether (avg. 261 kB/page). CC Kaldari: who may find the data interesting. --Xover (talk) 05:58, 23 October 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:05, 3 November 2019 (UTC)
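The per-page averages quoted in the reply above can be reproduced with a few lines of arithmetic. This is a sketch that assumes 1 MB = 1024 kB, which is the convention that matches the quoted figures; the helper name `kbPerPage` is made up for the example.

```javascript
const PAGES = 208;

// Average size per page in kB, assuming 1 MB = 1024 kB.
function kbPerPage(totalMb, pages) {
	return (totalMb * 1024) / pages;
}

// File sizes in MB as quoted in the discussion.
const sizesMb = { layeredDjvu: 1.65, regeneratedDjvu: 37, rawJpeg2000: 53 };
for (const [name, mb] of Object.entries(sizesMb)) {
	console.log(name, kbPerPage(mb, PAGES).toFixed(1) + " kB/page");
}
```

With these inputs the layered DjVu comes out roughly 22 times smaller than the regenerated file (37 / 1.65).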

Index:Carroll - Alice's Adventures in Wonderland.djvu

The following discussion is closed and will soon be archived:

Please see Kaldari's request at WS:S/H#Need help fixing Alice's Adventures in Wonderland --Beleg Tâl (talk) 17:36, 30 October 2019 (UTC)

This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Kaldari (talk) 18:56, 11 November 2019 (UTC)


The following discussion is closed and will soon be archived:

I updated this author's page with a middle initial. Should the page be moved to Author:Ferdinand A. Moeller even if the name isn't completely filled out? —Crocojim18 (talk) 01:33, 5 November 2019 (UTC)

Looks like it has already been moved. —Beleg Tâl (talk) 20:49, 6 November 2019 (UTC)

Other discussions

US copyright and the inclusion policy

For the longest time, Wikisource's inclusion policy imposed additional criteria for including texts published on or after January 1, 1923. This happened to coincide with what was then the public domain cutoff date for published works in the United States, although the policy makes clear that these are to be considered separate concerns. With the public domain rolling forward in the US at last, a user changed the references to 1923 to a dynamic CURRENTYEAR-95 expression. I have reverted this change pending discussion. Do we want to roll forward our unconditional inclusion threshold along with the copyright law or do we want it to stay put? Phillipedison1891 (talk) 06:54, 25 September 2019 (UTC)
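To make the difference between the fixed 1923 threshold and the dynamic CURRENTYEAR-95 expression concrete, here is a minimal sketch. The function name `rollingThresholdYear` is made up for the example, and how the policy wording applies the resulting year is left to the policy text itself.

```javascript
// Term used in the dynamic expression quoted above (CURRENTYEAR-95).
const TERM_YEARS = 95;

// Year produced by the CURRENTYEAR-95 expression for a given calendar year.
function rollingThresholdYear(currentYear) {
	return currentYear - TERM_YEARS;
}

console.log(rollingThresholdYear(2019)); // 1924
console.log(rollingThresholdYear(new Date().getFullYear()));
```

With the fixed value the threshold stays at 1923 indefinitely; with the rolling expression it advances by one year every January.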

why are you now raising this issue after a gap of nine months? without a discussion on the user’s talk page? Slowking4Rama's revenge 13:33, 25 September 2019 (UTC)
@Phillipedison1891: The change was purposeful, and reflects the point that the 1923+95 has been reached, and each year that now passes increments the year we can have. The community had been awaiting that anniversary, and the consensus was determined at the creation of the template all those years ago. Please undo your change. — billinghurst sDrewth 13:47, 25 September 2019 (UTC)
@Billinghurst: The change has been reverted, although it appears from the original 2007 discussion that the 1923 cutoff for scope was arbitrary and wasn't necessarily linked to public domain. Phillipedison1891 (talk) 16:20, 25 September 2019 (UTC)
the lack of discussion is more a function of a small wiki more interested in doing work, than memorializing consensus. it is not more restrictive than PD, but rather using whatever works commons will allow. the use of simplistic date hurdles is for them, using PD not renewed works are at the hazard of summary deletions there, for example c:Commons:Deletion requests/American seashells (1954). however, fair use of jeppeson charts in PD documents are not ok. but if you want to build a consensus for "what is in scope" go for it. Slowking4Rama's revenge 21:26, 28 September 2019 (UTC)
@Slowking4: Belated thanks to the people that retrieved those... I missed that discussion on Commons about the images, but had actually looked up the original registration and checked for a renewal by number before uploading the book, and it was also cleared by the Copyright Review Management System at HathiTrust (i.e. actual paid copyright experts). Disturbing that people would delete items specifically uploaded as 'not renewed' without actually checking, particularly works with a renewal date that would be in the online USCO database. (FWIW, that publisher was bought out by a big conglomerate in 1968, and most of their publications were probably never renewed.) Jarnsax (talk) 05:52, 8 October 2019 (UTC)
It's also in progress over here at Index:American Seashells (1954).djvu (blatant plug for help) :) Jarnsax (talk) 05:58, 8 October 2019 (UTC)
How strictly will we enforce the URAA copyright restoration? Please consider the choices from m:United States non-acceptance of the rule of the shorter term#Statement from Wikimedia Foundation.--Jusjih (talk) 02:29, 9 October 2019 (UTC)
there is no consensus for URAA interpretation either here or on commons, see also c:Commons:Village_pump/Copyright#URAA_revisited_in_2019 and "The WMF does not plan to remove any content unless it has actual knowledge of infringement or receives a valid DMCA takedown notice. To date, no such notice has been received under the URAA. We are not recommending that community members undertake mass deletion of existing content on URAA grounds, without such actual knowledge of infringement or takedown notices." [1] -- Slowking4Rama's revenge 10:58, 18 October 2019 (UTC)
Last time we had this discussion, it was pretty clear there was consensus for not supporting texts that aren't in the public domain in the US on Wikisource.
Ignoring the URAA doesn't mean that the rule of the shorter term would come into play. A huge number of works as late as the 1980s would be PD. It also doesn't matter for many British and Canadian works, which were routinely published and even renewed in the US. E.g. the last works of H. Rider Haggard are still in copyright in the US, despite his death in 1925, with or without the URAA, because they were renewed. Many more may be out of copyright, no matter when their authors died, because they were registered with the US copyright office and not renewed.--Prosfilaes (talk) 15:50, 18 October 2019 (UTC)
Yes. Chinese Wikisource follows the WMF's approach, without ignoring the URAA. That is why I ask here how strictly we enforce the URAA, actively or passively.--Jusjih (talk) 03:01, 21 October 2019 (UTC)

help please

The following discussion is closed and will soon be archived:

what can i add to this project? Baozon90 (talk) 18:06, 1 October 2019 (UTC)

@Baozon90: There's a lot you can do. One thing that is simple but requires a lot of care and patience is validating pages. You can find any page in Category:Proofread, ensure that the contents match the original, and mark it as validated. Adding a new text is a great thing to do as well but is more complicated. For someone entirely new to Wikisource, I'd recommend trying something small but valuable like validation as there are a lot of pages that could use it but don't require a lot of specialized skill to review. —Justin (koavf)TCM 18:19, 1 October 2019 (UTC)
@Baozon90: If you are particularly indecisive, and want a random page from the Proofread pages, you can try the Special:RandomInCategory page; I suggest you enter "Proofread" into the field there, if you want to follow @Koavf's suggestion of starting with validation. Dcsohl (talk) 17:14, 2 October 2019 (UTC)
Wikisource:Proofread of the Month is a good place to start. and there are maintenance categories. Category:Index Not-Proofread. if you are talking about texts, we accept PD or CC texts; it helps if they are at internet archive [2] --Slowking4Rama's revenge 15:56, 3 October 2019 (UTC)
@Baozon90: ^^^ what Slowking4 said about POtM is perfect for learning about our systems, our quirks, and to be supported while doing so. Once you have the basics, then is the time to branch out. — billinghurst sDrewth 02:34, 4 October 2019 (UTC)
also if the Optics work is a little math heavy for you, there is a list on talk to work on Wikisource_talk:Proofread_of_the_Month. they give a good sample of work in progress. Slowking4Rama's revenge 23:35, 6 October 2019 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:06, 3 November 2019 (UTC)

Broken links to scans from Index pages

The following discussion is closed and will soon be archived:

We can usually get to a scan at Commons when we click to "djvu" or "pdf" in the Source field of an Index page. However, from time to time I come across a page where the link does not work, for example at Index:Bohemian Review, 1917–Czechoslovak Review, May 1919.djvu. How can it be fixed? --Jan Kameníček (talk) 16:18, 6 October 2019 (UTC)

I experimented a bit and found that it's because of the dash in the filename. The Index page template checks if the file exists in order to suppress redlinks. It checks using {{PAGENAMEE}}, which returns the file name "Bohemian_Review,_1917%E2%80%93Czechoslovak_Review,_May_1919.djvu". Using this filename still works: File:Bohemian_Review,_1917–Czechoslovak_Review,_May_1919.djvu - but {{#ifexist:File:Bohemian_Review,_1917%E2%80%93Czechoslovak_Review,_May_1919.djvu}} returns false. —Beleg Tâl (talk) 01:52, 7 October 2019 (UTC)
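The encoded file name quoted above can be reproduced with a short sketch. The function `approximatePagenamee` only approximates what {{PAGENAMEE}} returns: the set of characters it leaves unencoded is an assumption chosen to match the example in this thread, not MediaWiki's actual rule set, and both function names are made up.

```javascript
// Characters that this sketch leaves unencoded (assumed, see lead-in).
const KEEP_LITERAL = { "%2C": ",", "%3B": ";", "%40": "@", "%24": "$", "%2F": "/", "%3A": ":" };

// Approximate the {{PAGENAMEE}} form of a title: underscores for spaces,
// then percent-encoding of everything outside the kept set.
function approximatePagenamee(title) {
	const encoded = encodeURIComponent(title.replace(/ /g, "_"));
	return encoded.replace(/%2C|%3B|%40|%24|%2F|%3A/g, (m) => KEEP_LITERAL[m]);
}

// Undo the encoding to get back a title that an existence check can match.
function decodeTitle(encodedTitle) {
	return decodeURIComponent(encodedTitle).replace(/_/g, " ");
}

const title = "Bohemian Review, 1917–Czechoslovak Review, May 1919.djvu";
const encoded = approximatePagenamee(title);
console.log(encoded);
console.log(decodeTitle(encoded) === title);
```

The en dash (U+2013) is three bytes in UTF-8, which is where the `%E2%80%93` sequence comes from. A lookup that compares this string literally against stored titles will not match unless the encoding is undone first.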
I'm not sure why {{PAGENAMEE}} is used instead of {{PAGENAME}}. A related discussion at Wikisource:Scriptorium/Archives/2011-05#Special:IndexPages seems to suggest there was a bug that needed to be worked around maybe. @Billinghurst: I think you were involved in those discussions, do you know anything about this? —Beleg Tâl (talk) 01:55, 7 October 2019 (UTC)
As is known sometimes It is quite a while and probably numbers of versions ago ... <shrug> I am surprised to see it used inside File: as that seems contrary to logic, though sometimes back in the earlier days we did what worked and who knows what testcases were done, and GOIII was not the best summary-user to understand what was being done or tested. I think that file: and media: use can progress as PAGENAME rather than PAGENAMEE. — billinghurst sDrewth 07:28, 7 October 2019 (UTC)

Done, and the Index: work identified displays, though we should be checking a wider range of index pages with unusual characters in the title to look for good test cases. — billinghurst sDrewth 07:35, 7 October 2019 (UTC)

checked the first and last pages of Category:Index Validated and those unusual lead characters display fine. — billinghurst sDrewth 07:39, 7 October 2019 (UTC)
Bleh. The documentation suggests that this should fail for all files containing the characters ', ", and &. But testing does not confirm that. It does indicate some weirdness though: Index:"I solemnly swear that I won't eat no more ice cream what's made with sugar nor no more candy what's made with sugar. Ho - NARA - 512512.jpg.
What looks like might have happened here is… Well, that MediaWiki magic words are a poorly designed mess. {{PAGENAME}} returns the page name with the characters ', ", and & HTML encoded. {{PAGENAMEE}} returns the page name with a set of characters URL encoded. But the -E variant doesn't actually do real URL encoding, just a weird MediaWiki-specific variant. For proper URL encoding you need to use {{urlencode:…}} from mw:Extension:ParserFunctions. No magic word or function will actually get you the raw page name with no encoding applied.
Back when this checking was implemented in MediaWiki:Proofreadpage index template, using {{PAGENAME}} with {{#ifexist:…}} failed. I'm not quite sure how using the -E variant actually helped, but that appears to be the reason for the change. However, scanning through the listed bugs (a horror story), it looks like (but isn't actually documented anywhere that I can find), that someone at some point implemented a workaround in {{#ifexist:…}} that actually decodes the encoding applied by {{PAGENAME}} in order to make this work.
The net result is that we are relying on a poorly designed mess of stacked special-case workarounds. Lua has functions that may or may not avoid this, but I can't determine that for certain, and it would require rewriting the whole template in Lua, which may not be worth the effort involved (then again, maybe it is!). --Xover (talk) 11:25, 7 October 2019 (UTC)
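To make the two kinds of encoding discussed above concrete, here is a minimal sketch using only the Python standard library. It is not MediaWiki's implementation of {{PAGENAME}}, {{PAGENAMEE}} or {{urlencode:…}}; it only illustrates the general difference between percent-encoding and HTML-entity encoding. The second file name is a made-up example, and keeping the comma unescaped is an assumption chosen so the output matches the string quoted earlier in this thread.

```python
import html
from urllib.parse import quote, unquote

# File name from the thread; the character between the years is an en dash (U+2013).
title = "Bohemian_Review,_1917–Czechoslovak_Review,_May_1919.djvu"

# Percent-encoding: the en dash becomes its three UTF-8 bytes, E2 80 93.
# safe="," is an assumption so that the comma stays literal, as in the thread.
percent_encoded = quote(title, safe=",")
print(percent_encoded)

# Decoding the percent-encoded form recovers the original title.
round_tripped = unquote(percent_encoded)

# HTML-entity encoding is a different scheme: it rewrites ', " and &
# as entities and leaves the en dash alone.
tricky = "Alice's \"Q&A\".jpg"  # hypothetical file name, for illustration only
entity_encoded = html.escape(tricky, quote=True)
print(entity_encoded)
```

A lookup that compares the encoded string against stored titles without decoding it first would miss the page, which is consistent with the {{#ifexist:…}} result reported above.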
The page that you indicated above displays fine for me, no weirdness. Guessing that when we made a change to one of the magic words, that someone changed all, for continuity, not accounting for quirks. Personally, I just want it to work, and care less about the path—it is low exposure, and presumably low overheads in the holistic sense. — billinghurst sDrewth 11:36, 7 October 2019 (UTC)
@Billinghurst: Do you see a "Source jpg" on that Index page (just above the progress field)? It's not showing up for me, not even after purging it. --Xover (talk) 11:54, 7 October 2019 (UTC)
Yes, I see it. To note that in "olden days" that we used to have to wiki-natively insert jpg images, and I would still do so when setting up an Index: page. The use of just the (page) number is not something that I would have even tried to do. — billinghurst sDrewth 12:04, 7 October 2019 (UTC)
That's very strange, as I tested it in a separate browser and not logged in; and it's not even included in the HTML source of the page. Are you sure we're looking at the same page? However, the reason it didn't show up would seem to be that the "Scans" dropdown for this index was set to "other" rather than "jpg". When I changed that it showed up properly as expected. It was thus probably unrelated to the problem that's the topic of this thread. --Xover (talk) 12:45, 7 October 2019 (UTC)
Checkmark This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:13, 3 November 2019 (UTC)

Ongoing discussion about DjVu files at Wikimedia Commons

The following discussion is closed and will soon be archived:
proposal was withdrawn on Commons

Hello Wikisource community,

There is an ongoing discussion about the hosting of files in the DjVu format on Wikimedia Commons over at "Commons:Village pump/Proposals#DjVu is dead. We should deprecate it for new uploads." As this format is almost universally implemented on Wikisource, I request input and feedback from the Wikisource community over there. It's best to engage in discussion before "voting" as this is not a referendum but a proposal. -- DonTrung (徵國單)  (討論 🤙🏻) (方孔錢 ☯) 08:05, 8 October 2019 (UTC)

As notification about this discussion has so far only been posted here at English Wikisource (and a million thanks to Donald Trung for taking the time out to do so. Very much appreciated!), but it is of potential interest to all the different language Wikisources (at least any of them that use DjVu files; but even those that do not use DjVu have indirect stakes in the outcome), we should try to make sure they are also made aware of it. I am uncertain whether we have any established channels for this. mul:Wikisource:Scriptorium would seem to be one possible avenue, but I am uncertain to what degree the non-English projects watch that forum. Can anybody advise on this? Billinghurst perhaps? --Xover (talk) 09:42, 8 October 2019 (UTC)
1) There is the wikisource-l mailing list that would be worthwhile being pinged. 2) We could look to set up a MediaWiki message delivery distribution list at Meta where we list the Wikisource wikis, and add the pages for the respective Scriptoriums per d:Q16503 (63 entries). There may be the odd other pages at some wikis if they don't have a WS:S. — billinghurst sDrewth 10:18, 8 October 2019 (UTC)
m:MassMessage and there is a list of interested users at m:Global message delivery/Targets/Wikisource Community User Group participants and m:Global message delivery/Targets/Wikisource News (en) — billinghurst sDrewth 10:21, 8 October 2019 (UTC)
MassMessage for WS:S list built m:Global message delivery/Targets/Wikisource Scriptoriums
And to note as massmessage is a right allocated to admins, I have it at Meta and can send any prepared message. To note that MM at a wiki will only send local, MM at Meta is able to send globally. — billinghurst sDrewth 10:37, 8 October 2019 (UTC)
@Billinghurst: Wouldn't Donald's original notification here work well?
Sure, I simply wish for it to be seen as the community's considered and sanctioned message, rather than mine. Whether it is notification alone, or part opinion, or consideration of consequence, or condemnation … — billinghurst sDrewth 20:17, 8 October 2019 (UTC)

Hello Wikisource community,

There is an ongoing discussion about the hosting of files in the DjVu format on Wikimedia Commons over at "Commons:Village pump/Proposals#DjVu is dead. We should deprecate it for new uploads." As this format is almost universally implemented on Wikisource, your input and feedback would be valuable over there.

Or perhaps we should specify that this is the English Wikisource community passing it on? Perhaps by tacking on a "The English Wikisource community has been made aware of a discussion that may be of interest to your project." at the beginning? --Xover (talk) 13:25, 8 October 2019 (UTC)

I withdrew the proposal. Kaldari (talk) 05:13, 9 October 2019 (UTC)

i kinda agree with the assessment, but would prefer an action plan to make IA uploader produce pdf’s from jp2. this would sunset the djvu. -- Slowking4Rama's revenge 13:09, 9 October 2019 (UTC)
Such an action plan will have little value if the points raised by Xover at Commons are not solved. --Jan Kameníček (talk) 17:38, 9 October 2019 (UTC)
Checkmark This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:16, 3 November 2019 (UTC)


The following discussion is closed and will soon be archived:
Resolved: script is now imported from global version instead of maintaining a local copy

Is there a reason we maintain our own copy of MediaWiki:InterWikiTransclusion.js instead of just using mul:MediaWiki:InterWikiTransclusion.js directly like frWS does? Our copy is a couple of years out of date and is missing functionality for section transclusion including fixes for phab:T188202. Some pages such as Lapsus Calami (Apr 1891)/Coll. Regal. are broken because our copy is missing these updates. —Beleg Tâl (talk) 15:22, 11 October 2019 (UTC)

@Beleg Tâl: I have changed the file to pull the mulWS script, please see whether it works as expected, there may be dependencies that need to be expressed. — billinghurst sDrewth 09:00, 14 October 2019 (UTC)
Not making a difference for me. — billinghurst sDrewth 09:03, 14 October 2019 (UTC)
Now it does. That damn caching of common.js. — billinghurst sDrewth 09:35, 14 October 2019 (UTC)
Checkmark This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:17, 3 November 2019 (UTC)

"Studies in Irish history, 1649-1775" available in scanned form?

Is anyone able to see a downloadable, OCR'd copy of the work "Studies in Irish history, 1649-1775" by Murray, Alice Effie b. 1877., Gwynn, Stephen Lucius 1864-1950., Mangan, H. , Wilson, Philip. , Butler, William Francis Sir 1838-1910. et al.

I see that it is at HathiTrust though no indication of whether scanned.

If it is, it would be great if someone can get it into so we can take it to Commons. Thanks. — billinghurst sDrewth 03:18, 13 October 2019 (UTC)

It says full view at the bottom, and the first scan (unusually for HathiTrust) even lets you download the PDF directly. I've got a copy of the PDF, though at over 100 MB, I've got to upload it to IA and then sideload it to Commons.--Prosfilaes (talk) 03:33, 13 October 2019 (UTC)
I can't upload it to Commons, since one of the authors died 1950, and there's three others with no death date at hand. It's too big for me to upload it to Wikisource. It's at , if anyone knows how to get it here.--Prosfilaes (talk) 03:42, 13 October 2019 (UTC)
@Prosfilaes: Thanks, the download links must be country specific, and as I am not operating from a VPN, it knows that I am not US-based. I will pull it onto Wikisource once it has derived a new form. If it is still over 100MB, then we can get someone to upload it on the backend. — billinghurst sDrewth 03:47, 13 October 2019 (UTC)
Looking around more, this is the same scan as . The HathiTrust version has an update date of 2012, so there may be a rescan or two, but it ultimately came from the same source.--Prosfilaes (talk) 04:25, 13 October 2019 (UTC)
What?!? I don't even see it in a search result. Sheesh, sorry to put you to the effort. — billinghurst sDrewth 05:40, 13 October 2019 (UTC)
@Billinghurst: Ad not US-based: Try Hathi Download Helper, but the downloading times are long. --Jan Kameníček (talk) 19:30, 13 October 2019 (UTC)

@Billinghurst, @Prosfilaes, @Jan.Kamenicek: I didn't quite follow the above, but took a guess and on the chance it'd be useful I grabbed Internet Archive identifier: studiesinirishhi02obri, regenerated a new DjVu from it, and uploaded here as File:Studies in Irish History, 1649-1775 (1903).djvu with an index at Index:Studies in Irish History, 1649-1775 (1903).djvu. If it's not needed or borked in some way then feel free to delete. --Xover (talk) 11:29, 3 November 2019 (UTC)

Projects from Community Wishlist — skates time

I am reliably informed that it would be beneficial for English Wikisource to think about and plan for projects from the community wishlist, either tidying them up, or extrapolating on them, and to maybe start thinking and planning very soon, or maybe right now. Picking the best and generating a case why they are good projects with excellent outcomes.

Previous years suggestions that have been classified as Wikisource-focused can be seen at

and of course some of the more general improvements can work for us as well.

What is it that we truly want that will make our editing lives better? What is it that will make our sites truly better? What is special about our sites that could and should be better to make us better? more integrated? better findable? Let us think how we can generate benefits, and what are the benefits, rather than just think features.

Building relationships within the broader Wikimedia can be advantageous, especially if we listen. — billinghurst sDrewth 09:43, 15 October 2019 (UTC)


  • Sidenotes and Layouts. We have had discussion here and never had solutions to building sidenotes from works, especially in the migration from a portrait page, to a landscape and wide computer monitor, and also onto a small mobile device. Aligned with that is the toggled layouts that were designed in prehistoric web time, and never fully functioned with sidenotes. So I would love to see if we can have some global work done on our CSS, as it is a true weakness for us. It also flows through to how we present in mobiles, and don't have good use of scripting means to present works in a known universal means, nor easy ability to check.

    So I would like one consideration for a project to be a look at the output of ProofreadPage to various devices and the compatibility of our CSS to do that well. Whether this then flows to other outputs such as EPUB and other portable document formats would be interesting. — billinghurst sDrewth 10:04, 15 October 2019 (UTC)

i am less interested in customizing, since i use wikitext, and it would be too many choices. i tried testing it out, but it rarely helped me do what i wanted to do. it does some things well, like "find and replace", and showing line breaks. but we need a standard menu layout, based on frequent workflows and tasks. i am suggesting that until a menu redesign is done, based on newbie workflow, there will be little take-up. a regrettable lost opportunity. Slowking4Rama's revenge 19:12, 22 October 2019 (UTC)
  • OCR tool. As it has been discussed in many places, Phe's OCR tool has been out of service for many months and there is no hope it will be repaired in some reasonable time. The other available OCR tools do not work satisfactorily: either the quality is bad or they are slow. The new tool that we need should (please add more points):
    • be open for repairs and maintenance, so that the whole community does not have to rely on the availability of a single volunteer
    • provide good quality OCR text, comparable with the Phe's tool or better
    • be quick
    • be able to recognize text in columns (e. g. in magazines or newspapers)
    • be able to recognize foreign characters and diacritics

--Jan Kameníček (talk) 09:22, 22 October 2019 (UTC)

    • Most OCR tools recognize and mark bold and italicized text. If one is being created specially for WS, have it transform that into proper quote-markup. Right now, that information is lost entirely in the output. --Levana Taylor (talk) 19:29, 22 October 2019 (UTC)
    • I'm not completely thrilled about marking bold; especially in much of the text we work on, bold is more often just an OCR error or something that should be marked up in a different way than something that should actually be marked bold.--Prosfilaes (talk) 21:45, 22 October 2019 (UTC)
  • OCR extraction from PDFs. Existing OCR text layer is very poorly extracted from PDF files here, although other applications like Acrobat extract it much better. Interestingly, the OCR extraction improves when the file format is changed into .djvu. We need to be able to extract OCR layer in the original quality directly from PDFs. --Jan Kameníček (talk) 09:30, 22 October 2019 (UTC)
yes, please put OCR on wishlist. the backlog of "Text Layer Requested" requiring use of OCR button is long. Slowking4Rama's revenge 19:18, 22 October 2019 (UTC)

Curly quote templates

I’m pleased to read that typographic quotes are now allowed.

One approach for editors who don't have them on their keyboards would be to create specific templates. E.g. {{sq}} could have variants like sqs (single quotes, straight) and sqc (sq, curly).

Alternately, would there be some way to have the quotes in special subpages so that something like "Name of file.djvu/dq/begin" contains the single character and have a template that transcludes that? You could then switch the quote style for an entire work with just 4 edits.


Pelagic (talk) 12:48, 15 October 2019 (UTC)

Would having a selection of them in the editing toolbar be sufficient? Templates for basic characters are fiddly and tend to look pretty cluttered in the code. --Xover (talk) 13:40, 15 October 2019 (UTC)
Could be a user option to move them out of the "Symbols" section of Special Characters to a more prominent location. Anyone up for writing an option to allow someone to place a customized set of special characters in the main editing toolbar? On second thought, there is already a gadget for keyboard shortcuts for accented characters. Easy to add quotes to that. Levana Taylor (talk) 15:30, 15 October 2019 (UTC)
Hmm, the gadget sounds good, so I have added it in my preferences now, but I cannot find anywhere whether there are any default shortcuts for some characters or how to add new shortcuts… Is there any documentation or help page about it? --Jan Kameníček (talk) 16:56, 15 October 2019 (UTC)
We can also resurrect {{dq}} and similar templates. —Beleg Tâl (talk) 15:47, 15 October 2019 (UTC)
Adding templates sounds as though we are encouraging their use. I wasn't aware that this was the plan of the community with the change. — billinghurst sDrewth 03:29, 16 October 2019 (UTC)
I think the ship has sailed … the idea of allowing curly quotes was greeted with enthusiasm by about 8 out of 10 commenters, so those same people will be busy converting texts, and it’s all to the good to facilitate doing so neatly and completely. Levana Taylor (talk) 04:17, 16 October 2019 (UTC)
Well, the issue here was perhaps more in regards using templates, specifically, for the quote marks. I don't think that's something we should encourage in the general case (with exceptions for special cases, including those where templates are the only reasonably efficient way for a given contributor to use curly quotes). All templates clutter the text in edit mode to some degree, and here we have a multiple-character expansion (6:1) that will occur tens of times on each page. It may even be enough to be a bona fide technical problem in a long quote-heavy work when you transclude a few hundred such pages into mainspace (cf. the issues with dotted toc lines).
For preference we should look to other methods to allow contributors to enter these characters, such as toolbar buttons (but at least one contributor does not use the 2010 wikitext editor or the 2017 wikitext editor), or OS/browser/keyboard-native input methods (but some OS and keyboards make that unreasonably hard). So we may need such templates for those outlier cases, but in that case we should label them accordingly as a stop-gap measure and possibly also run a continuous bot task to substitute them for the actual characters (on enwp they run a bot that automatically substs all instances of templates where the template is tagged as "must be subst:ed" that we could probably coopt for this purpose if needed). --Xover (talk) 04:58, 16 October 2019 (UTC)
I say no to templates. If people wish to do this less preferred means, then they can learn how to use all the means to add non-keyboard characters, or to use their User: drop down on their edittools set up. — billinghurst sDrewth 05:21, 16 October 2019 (UTC)
I also say no to templates. We have worked on removing the need for character templates (such as {{ae}}), so adding them for the various quote marks would be counter to our intentions. In terms of me being the main contributor who does not use either of the wikitext editors, don't initiate something specifically for me. I recognise that I am unusual in this regard. Currently things are working for me just fine without them. If I choose to work on a book in which curly quotes have been decided on, then I will add the characters to my User set in CharInsert and enter them that way. [Of course, works that I bring in and start working on will use straight quotes only.] Beeswaxcandle (talk) 06:05, 16 October 2019 (UTC)
Well, if you don't use them then it is likely that there may be more who choose not to use them. I'm not saying this should be a dealbreaker for any solution we contemplate—like very old web browsers, at a certain point you just can't keep supporting them—but it would behoove us to keep the issue in mind and cater for it when reasonably possible. We want to increase the pool of contributors and make the bar to entry lower, 'cause we sure ain't spoilt for folks willing to do the work! :) --Xover (talk) 06:20, 16 October 2019 (UTC)
Huh??? To whom are you directing this remark? The whole purpose of simple quotation marks is exactly for this purpose of simple editing, and why we have the style guide set as it is. The whole damn thing has been set for simplicity. In the past few years, it is the newer users who have been hyping things up trying to have exact replicas of works. The dinosaurs have long argued "KISS" and aligned with the simpler styles. — billinghurst sDrewth 11:31, 16 October 2019 (UTC)
I was responding to Beeswaxcandle's "Don't worry about me, I'll make do" by saying he's probably not the only one with that particular setup and if we can cater to it without excessive cost then we should cater to it. Recruiting new contributors is one thing, but it's equally important not to lose existing contributors (by making it harder or more frustrating for them to contribute). And in saying that I am merely stating general principles, not advocating any particular solution. But as an example (and only an example), by having a bot that automatically subst:s quote mark templates we could have our cake and eat it too: anybody that prefers it or needs it can enter them using templates, but the bot will replace them with the actual characters within minutes so that the presence of such templates in the work is only temporary. But, again, not a proposed thing we should actually do; just an example to illustrate the sort of thing that we might want to consider when the situation warrants. --Xover (talk) 14:00, 16 October 2019 (UTC)
I agree, if a template shall be allowed, it should be clear that it can be replaced any time by a bot. Mpaa (talk) 22:02, 22 October 2019 (UTC)
It's all very good to say no to templates, but quote templates such as {{dq}} and {{" '}} have been around for years and are in common use throughout the site. It is not a question of encouraging their use, but rather of tolerating their continued use. —Beleg Tâl (talk) 19:38, 22 October 2019 (UTC)

Translations and source tabs

Why do translations not have a "Source" tab? For example: Translation:Sleeping Beauty, which is transcluded from Index:La bella durmiente del bosque.djvu. Kaldari (talk) 17:31, 15 October 2019 (UTC)

Hmm. Good question. Perhaps tpt knows? --Xover (talk) 17:43, 15 October 2019 (UTC)
It's a known issue that has been outstanding since 2013 (i.e. when the Translation namespace was created), see phab:T53980 —Beleg Tâl (talk) 17:46, 15 October 2019 (UTC)
@Kaldari: because the Translation: ns. was our add-on, and it probably never got coded in. Presumably there needs to be some connection to the <page> building to know that it adds the /proofreadpage_source\ tab to the Translation: ns. when it transcludes pages.
I am not a coder so don't expect me to make it. — billinghurst sDrewth 05:40, 21 October 2019 (UTC)

Duplicated proofreading status with Visual Editor

Hi! When creating a page, without any other change (just saving the content as is), if using the Visual Editor then the proofreading status is created twice (e.g. this page). Any idea? Is it a known bug, perhaps? It happens in other Wikisources as well. Thanks! -Aleator (talk) 19:03, 15 October 2019 (UTC)

Found: phab:T202200 —Beleg Tâl (talk) 20:10, 15 October 2019 (UTC)
Not being solved for 1 year and 2 months… How typical. --Jan Kameníček (talk) 21:40, 15 October 2019 (UTC)
yes, it is apparently having a status for header and status for body; status not easy to update in VE. could add to wishlist. Slowking4Rama's revenge 00:21, 16 October 2019 (UTC)
It would seem that it is still physically adding text
<pagequality level="1" user="" />
to the header, whereas the text is no longer actually added any more as it was melded into the page content model. I would have thought that it could have been a fairly easy fix. I would suggest that you bang on about the problem on the phabricator ticket, and ping some of the VE developers. — billinghurst sDrewth 05:19, 16 October 2019 (UTC)
Somewhere in gerrit:/plugins/gitiles/mediawiki/extensions/ProofreadPage/+/master/modules/ve/pageTarget/ and looking for "quality" to see where it adds the tag to the header, and presumably we want to see where it also sits in the new content model. The version presumably in the header needs to go, and ensure that VE changes the content in the other when it edits. — billinghurst sDrewth 05:46, 21 October 2019 (UTC)
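For anyone wanting to check saved pages for the symptom described above, the following is a rough sketch of how the duplicated tag could be detected and reduced to one copy in raw page text. It assumes only the tag form quoted in this thread; the sample text and both function names are hypothetical, and it does not reflect how ProofreadPage or the Visual Editor store or process page status internally.

```python
import re

# Matches self-closing tags like: <pagequality level="1" user="" />
PAGEQUALITY_RE = re.compile(r"<pagequality\b[^>]*/>")


def count_pagequality_tags(wikitext: str) -> int:
    """Return how many pagequality tags appear in the raw text."""
    return len(PAGEQUALITY_RE.findall(wikitext))


def drop_duplicate_pagequality(wikitext: str) -> str:
    """Keep the first pagequality tag and remove any later ones."""
    seen = False

    def keep_first(match: re.Match) -> str:
        nonlocal seen
        if seen:
            return ""  # drop every tag after the first
        seen = True
        return match.group(0)

    return PAGEQUALITY_RE.sub(keep_first, wikitext)


# Hypothetical raw text showing the duplicated status.
sample = (
    '<pagequality level="1" user="" />\n'
    '<pagequality level="1" user="" />\n'
    "Page body"
)
print(count_pagequality_tags(sample))
cleaned = drop_duplicate_pagequality(sample)
```

This only cleans up the symptom after the fact; the actual fix would still need to happen in the extension code mentioned above.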

IA-Upload tool is down

The following discussion is closed and will soon be archived:
tool is back up again

@Samwilson: @Tpt: FYI —Beleg Tâl (talk) 14:55, 16 October 2019 (UTC)

Given the timing I'm going to go ahead and guess that this is related to the ongoing reboot of the cloud hosts on which the toolserver/toolforge runs. There was one batch last Wednesday, one batch today, and there will be another one next Wednesday. From the error message it looks like it might just be that the service needs to be started again. --Xover (talk) 15:38, 16 October 2019 (UTC)
@Xover, @Tpt: I've restarted the web service and it seems to be fine again now. —Sam Wilson 00:02, 17 October 2019 (UTC)
It was down for 14 hours, 45 minutes and 58 seconds. Sam Wilson 00:17, 17 October 2019 (UTC)
Checkmark This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:23, 3 November 2019 (UTC)

Fully validated indexes that aren't transcluded

Here are the 36 fully validated indexes that are not properly transcluded into main space (based on bot data):

Some of these have no transcluded content at all, some of them are partially transcluded, and some of them have more fundamental problems like incomplete indexes. If anyone wants to work on them, go for it! Kaldari (talk) 18:23, 16 October 2019 (UTC)

Here's a PetScan link for future reference: (also dibs on Townshend) —Beleg Tâl (talk) 18:43, 16 October 2019 (UTC)
  • Comment: Which is why I generated the petscan queries ages ago, see User:Billinghurst#Petscan_queries (note that some need to be retweaked, as Magnus's petscan v.2 made his fields case sensitive for initial character (-> category and templates), and I had hoped that he would fix that issue) and why I generally get around to transcluding these at the end of each month, though Southern Historical Society Papers is a massive job, and it is taking me forever to work through Index:Dictionary of Indian Biography.djvu. — billinghurst sDrewth 21:53, 16 October 2019 (UTC)
Index:Dictionary of Indian Biography.djvu can be mostly automated, if you are interested, feel free to ping me. Mpaa (talk) 21:51, 17 October 2019 (UTC)

Thank you both for your efforts in making this info more accessible. Much appreciated. -Pete (talk) 21:57, 16 October 2019 (UTC)

  • As I have noted on my talk page, 'Index:Adelaide Contents.pdf' was transcluded 17.5.2017. 'Index:Felicia Hemans in The Christmas Box, 1829.pdf' and 'Index:Felicia Hemans in The New Monthly Magazine Volume 34 1832.pdf' have been done today. Esme Shepherd (talk) 20:35, 19 October 2019 (UTC)

Tech News: 2019-43

14:18, 21 October 2019 (UTC)

The source of local time digital clock display

The following discussion is closed and will soon be archived:

I have the local time displayed in the upper right corner but the local clock link in my local Vector.js is disabled and the clock in Gadgets is not selected.

The disabled link in my local Vector.js is

//mw.loader.load javascript&smaxage=21600&maxage=86400');

From where does the local time display originate? — Ineuw (talk) 00:21, 27 October 2019 (UTC)

If your gadget is turned off, then it can come from the skin's js, your local common js or your global js configurations. PS. The time displayed is via your preference settings, it isn't local per se. — billinghurst sDrewth 00:48, 27 October 2019 (UTC)
Thanks I know this and checked the vector and global .js files and they were disabled. That is why I asked. It was just curiosity. — Ineuw (talk) 21:26, 27 October 2019 (UTC)
Just figured out where it comes from. When all clock scripts are disabled, in preferences\appearance, there is a local exception for the time. When selected, it will display only the local time. No script is needed. — Ineuw (talk) 21:32, 27 October 2019 (UTC)
Checkmark This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. Xover (talk) 08:23, 3 November 2019 (UTC)

Updated Module:WikidataIB

Hi to all. I have imported the newest version of this module that enables complex connections to WD and predominantly supplies data for our infobox equivalents (headers/footers) from WD items. Since we imported the script a year ago it has almost doubled in size, which would indicate increased potential for pulling data. I also imported the documentation page, so we should hopefully be able to see what functionality differences are listed in the help page (hoping!). If someone sees something of a useful nature that allows us to do more from the documentation, then please do speak up. I have protected the page as we should be inhaling the standard, stable version from Commons/enWP rather than fiddling with it locally. — billinghurst sDrewth 03:04, 28 October 2019 (UTC)

Index:Carroll - Alice's Adventures in Wonderland.djvu

I believe I have found a copy of the missing plate from this book on flickr, by using Google image search, but I have been unable to locate the missing text page. (See the discussion on the project page.)

I have uploaded the copy of the plate that I found into the book category at commons ( Charles Robinson's illustrations of Alice's Adventures in Wonderland ) it is the one named Alice's Adventures in Wonderland - Carroll, Robinson - S205 - The whole pack rose up in the air.jpg

Could someone take a look to see if it can be used and if so inserted into the book at the appropriate point?

If it is not suitable please delete it.

Thanks Sp1nd01 (talk) 14:19, 28 October 2019 (UTC)

@Sp1nd01: Thanks! It's been incorporated into the work. Kaldari (talk) 19:16, 11 November 2019 (UTC)

Tech News: 2019-44

16:09, 28 October 2019 (UTC)

PDF to wikitext import

For PDFs with embedded text (i.e. not requiring OCR), what's the simplest way of getting the text of a PDF in Commons converted to wikimarkup? Apologies if I'm missing an obvious help page, but I've not been able to find it. Evolution and evolvability (talk) 05:08, 29 October 2019 (UTC)

@Evolution and evolvability: See Help:Adding texts. You create an index for the work, then proofread (transcribe and format) each page, which is eventually transcluded together into a reader-visible page in mainspace. Don't hesitate to ask if you need assistance. --Xover (talk) 06:15, 29 October 2019 (UTC)
@Xover: Thanks. I think the bit I am confused by is "The text field may be blank or it might have been automatically filled with the text of that page". If uploading a PDF equivalent to e.g. this, how much of the formatting would be automatically extracted into wikitext? Evolution and evolvability (talk) 10:34, 29 October 2019 (UTC)
@Evolution and evolvability: I'm not entirely sure that document is in scope for Wikisource (we primarily host previously published works). But in any case, what you get out depends on what you put into the PDF file's text layer. You will not normally get any formatting at all, just plain text output. The process we call "Proofreading" involves not just correcting typos (scannos) and such, but also adding wiki formatting (wikifying). For example, page 2 of your example PDF contains this text layer (line-breaks and all):
PDF text layer
Wikimedia Journals

Public trust in Wikipedia is high, yet it has long struggled to gain reputation and engage academic/expert
communities. Similarly, the quality of much of its content is superior to other encyclopedias, yet highly
variable from page to page.
As well as the first stop for information, what if Wikimedia could also be the last stop in some cases with content considered sufficiently trustworthy to be citable? The process of independent peer review
by external experts is a foundation of robust quality-control for information. This is what we have
started to achieve with the WikiJournal project.
After hundreds of years, academic publishing is finally undergoing a rapid transformation.
The Open Access (OA) movement is revolutionising reader access to peer-reviewed research, but the
publishing cost is still out of reach for billions of people who cannot afford ‘article processing fees’,
which can be thousands of dollars for one paper.
A Wikimedia journal platform would not charge for any stage of publication, relying on volunteers and
donations to run the entire project. We have shown how the WikiJournals can draw expertise from
academic and professional communities who otherwise rarely contribute to the Wikimedia movement.

What has been done so far?
Combining academic publishing and Wikipedia has been done in several formats over the last decade.



Dual publishing​: In 2008 ​RNA Biology ​began requiring authors to also write a short Wikipedia
page to accompany any article on a new RNA gene family. In 2016, ​Gene s​ tarted a similar
Journal first publishing​: In 2012, ​PLOS Computational Biology created a format where authors
write an article that is published in the journal and then copied directly to Wikipedia. They were
joined by ​PLOS Genetics​ in 2016, and ​PLOS ONE ​in 2019.
Wikipedia first publishing​: In 2014, ​Open Medicine ​put the first Wikipedia article through
academic peer review and publication, requiring an article processing fee.
All of the above​: Since 2014, the WikiJournal User Group has run a set of journals that specialise
in these formats, hosted within Wikiversity (more in the Proof of Principle section below).
In any case, in addition to the PDF on Commons, is not meta the most appropriate place for such a proposal in wikipage format? --Xover (talk) 11:07, 29 October 2019 (UTC)
  • Comment: if the document is a modern document and published electronically, so not requiring proofreading, then we have always taken those documents electronically, without a scan. We just need to ensure that the source documentation is captured, and that we confirm the licence as provided. — billinghurst sDrewth 11:12, 29 October 2019 (UTC)

Search and replace button

In one of the proposals of the current Wishlist Survey in Meta (UI improvements on Wikisource) there is a mention of some "search and replace button" among the "Advanced" functions of the Wikitext editor. However, I failed to find it. Is it available in English Wikisource too, or do I have to switch it on somewhere, or am I just blind and cannot see it? --Jan Kameníček (talk) 12:07, 30 October 2019 (UTC)

Hm, I have just found it, but it works strangely: it replaces different chunks of text than I ask for!!! --Jan Kameníček (talk) 12:13, 30 October 2019 (UTC)
the find and replace on visual editor works better. (down the page options menu next to the magic pencil) you should test it out. yrmv. Slowking4Rama's revenge 20:49, 30 October 2019 (UTC)
Generally, my experience with VE is very bad (it always looked like the biggest bug in MediaWiki to me :-) ), but I am inclined to give it one more chance :-) However, if the button is displayed outside VE (although the programmers did their best to make it almost invisible), it should work outside VE too… --Jan Kameníček (talk) 21:50, 30 October 2019 (UTC)
i agree about VE, and it is hidden down that menu, but it works well for me. might be worth toggling to VE just to find replace. Slowking4Rama's revenge 22:24, 30 October 2019 (UTC)

Transcription completeness and traditional vs. modern finding aids (i.e., tables of contents, indices, search engines)

I am interested in learning what Wikisource editors think about the relative value of tables of contents and indices, considering that search engines often meet similar needs. In some cases, it seems to me it may be a poor use of an editor's time to create a sophisticated transcription of an existing TOC or index. I've gone into some detail on this here: Talk:Oregon Historical Quarterly

Of course, a reader's needs will vary from one kind of work to the next. I'm not looking for an absolute rule (especially considering that choices about what work to prioritize on Wikisource are made by individual volunteers), but I'm curious about what principles other Wikisource editors apply when making these kinds of decisions. Pinging several editors I've discussed similar issues with: @Kaldari, @Beleg Tâl, @Billinghurst, @EncycloPetey: -Pete (talk) 03:34, 1 November 2019 (UTC)

@Peteforsyth: Well, from a practical viewpoint the conversion utilities for preparing EPUB etc. publications pretty much assume the entire piece is linked (at least indirectly) from the opening page. In a (non-trivial) traditional publication that normally incorporates the contents page(s), and the utilities rely upon this. Trying to transclude a work without such linkage may look great from within Wikisource but more or less guarantees outsiders will see something most frustratingly (and to them incomprehensibly) incomplete… 04:03, 1 November 2019 (UTC)
I have basically the same viewpoint. I transcribe and transclude the tables of contents (with links) so that the exporting tools will work correctly. I usually don't bother to add links within indexes, however. Kaldari (talk) 04:10, 1 November 2019 (UTC)
Agree with above regarding the Contents. And for works with many chapters or sections, the contents allow quick navigation to a specific part from the main page of the work. They can also provide an overview to the reader when there are multiple parts with chapter numbering that restarts with each part. That is, it is not unusual to have more than one "Chapter 2" in a work, and relying on a search to find the specific Chapter 2 you are looking for is not ideal in such cases. Regarding Indices: these can index more than simple words. They also index topically, and for larger topics they inform the reader where specific subtopics are covered. --EncycloPetey (talk) 05:05, 1 November 2019 (UTC)
  • Comment: ToCs tie a work together, so as long as you have a way to tie transcluded subpages, that is the important component. I value ToCs in a work; they set the beginning of a work nicely IMNSHO. I also think that they represent the author's final rendition of a work.

    Indices have never been considered the priority, though they are nice and specific to a work if someone wants to browse a work's contents in a little more detail. We never fuss over their absence, though when done, we add them in. As most people don't add {{engine}} to non-fiction works, searching within a work is often not readily available for most readers, so the index does cover the absence. Phe used to have a good script to convert validated indices to page links, though that stopped a while ago. It is a nice touch if you get that far, though not one anyone would or should castigate for its absence. I will comment that indices are regularly mentioned in book reviews, for their absence or quality, so I find that of interest. — billinghurst sDrewth 05:43, 1 November 2019 (UTC)

I don't see where a search engine reduces the need for a Table of Contents; that's where the author tells you what's in the book, and directs you to the primary section where each major subject is covered. Indexes are a lot of work to properly transcribe and link, but a good index can offer you directions to stuff that may not come up easily if searched (e.g. canonical names or phrases that are hard to search for). It's not the first thing I'd get on, but it has value.--Prosfilaes (talk) 07:28, 1 November 2019 (UTC)

Thank you all, this is very helpful. It's great to hear the thorough views of experienced editors, and it's all persuasive. And it seems useful to think about index and TOC pages as slightly different entities.

  • It's good to know that there is not a strong sense that index pages must be included; I do agree with EncycloPetey and others that they have utility beyond what machine-generated search can provide, but the effort-to-reward ratio isn't always great enough to motivate me to transcribe them.
  • With the ToC, I also agree that they're more important, but I'm realizing that I've encountered an unusual situation with the Oregon Historical Quarterly. I'm still not sure what the best way to proceed is in this case, but it's not worth getting into in a general discussion. If anybody has the patience to take a closer look at how I've set up those pages and discuss it, I'd appreciate your comments at Talk:Oregon Historical Quarterly. -Pete (talk) 16:50, 1 November 2019 (UTC)

Abuse filter edit request

Hi. Can Special:AbuseFilter/36 please be tweaked to also exclude bots? When bots execute mass moves, they flood the log. Thanks, --DannyS712 (talk) 06:34, 2 November 2019 (UTC)

The purpose is to capture such moves where there is the potential for remaining redirects, so it is acting within scope of why I programmed it. As such it is recording what I want to see, so I am not considering it flooding the logs. — billinghurst sDrewth 06:40, 2 November 2019 (UTC)
@Billinghurst: my apologies, I thought it was for tracking misguided moves. However, bots also have suppressredirect, so if redirects aren't needed, wouldn't they be suppressed? Either way, thanks for explaining --DannyS712 (talk) 06:43, 2 November 2019 (UTC)
Yes, that is its primary purpose, though it is broader, for checking and also for clean-up. Not creating redirects is not automatic, and there is no clear means to detect that no redirect has been created, so it is a checking process. It doesn't happen that often, so I am not concerned about the few occasions that it occurs; it never truly floods the logs. Most bot moves usually occur early on, so it hasn't been problematic over the years. — billinghurst sDrewth 07:54, 2 November 2019 (UTC)

NOINDEX meta tag

Is there a reason that we don't place a NOINDEX meta tag on index: and page: pages? I think it would be a better outcome for the potential reader if a Google search on a book title returned mainspace results only. Moondyne (talk) 09:20, 2 November 2019 (UTC)

I don't know if there is a reason. It seems like a reasonable suggestion. —Beleg Tâl (talk) 17:04, 4 November 2019 (UTC)

Index:Canadian Singers and Their Songs.djvu

What has happened to the above Index? It was here when I was working on it yesterday (Sat 2 Nov) in the morning Australian time. It now says Error: No such file. It was nearly proofread. --kathleen wright5 (talk) 07:06, 3 November 2019 (UTC)

@Kathleen.wright5: the file was deleted, per c:Commons:Deletion requests/File:Canadian Singers and Their Songs.djvu. As a result, the work here may need to be deleted too --DannyS712 (talk) 07:37, 3 November 2019 (UTC)
No. The file was apparently moved here. @Beleg Tâl: will be better placed to advise where it arrived. Beeswaxcandle (talk) 07:48, 3 November 2019 (UTC)
(edit conflict) @DannyS712: It was published before 1924 so it is in the public domain in the US (rule of thumb: US term of protection is 95 years from date of publication), and enWS policy is that works must be public domain in the US (vs. Commons that requires PD in both US and country of origin). @Beleg Tâl: On Commons you indicated that you had transwikied it here, but I can't find it. Can you look into it? --Xover (talk) 07:58, 3 November 2019 (UTC)
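The rule of thumb in the comment above can be written as a small check. This is only a sketch: the function name is hypothetical, and it assumes protection runs through the end of the 95th calendar year after publication. It ignores notice, renewal, and the different rules for works published after 1977, so it is no substitute for assessing a work properly.

```javascript
// Hypothetical helper illustrating the "95 years from publication" rule of thumb.
// Assumption: protection lasts through the end of the 95th calendar year after
// publication, so a work is free from 1 January of (publicationYear + 96).
function isLikelyPublicDomainInUS(publicationYear, currentYear) {
  return currentYear >= publicationYear + 96;
}

console.log(isLikelyPublicDomainInUS(1923, 2019)); // true
console.log(isLikelyPublicDomainInUS(1924, 2019)); // false
```

With 2019 as the current year, this matches the "published before 1924" cut-off used in the discussion.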
very sad they should delete an entire compilation based on the dod of a single septuagenarian. but work can continue here when the promised transfer occurs. deletion on commons should never be a deletion rationale here; rather we should have our independent task flow and determination.Slowking4Rama's revenge 12:49, 3 November 2019 (UTC)
@Xover: @Beeswaxcandle: I did import the file (or thought I had done so). You can see that File:Canadian Singers and Their Songs.djvu is not redlinked, and does contain the licensing info I set up, so I'm not sure why the file itself is not there also. Fortunately, I can easily re-upload it from the source, and will do so as soon as I have a chance (probably later this evening). @Slowking4: we did have this discussion here, the work is unambiguously copyrighted in Canada and in violation of Commons policy, and I did (try to) move the file locally as part of our independent task flow and determination. —Beleg Tâl (talk) 21:13, 3 November 2019 (UTC)
@Beleg Tâl: You imported the File: page (the container), not the media:. You cannot special:import media files. — billinghurst sDrewth 21:34, 3 November 2019 (UTC)
as we see deletion is privileged, and saving by transfer is not. it is not obvious that it was a copyright vio since the nominator did not do the work of listing the authors. i guess that is the uploaders job, or the person transcribing here, otherwise we might have work after work deleted out from under a transcription effort. look forward to the required local upload from IA, since fairusebot is a distant memory. Slowking4Rama's revenge 02:52, 4 November 2019 (UTC)
That's lame. Looks like it's already being looked at on Phabricator, phab:T8071. —Beleg Tâl (talk) 14:29, 4 November 2019 (UTC)
I've also added it to the wishlist. —Beleg Tâl (talk) 14:46, 4 November 2019 (UTC)
This index seems to be here in some form. I've just validated Page:Canadian Singers and Their Songs.djvu/124 and it was proofread by Jason Boyd earlier today. [Revision history] --kathleen wright5 (talk) 02:41, 4 November 2019 (UTC)
I have uploaded the file; everything is done. —Beleg Tâl (talk) 14:29, 4 November 2019 (UTC)


Wikilivres is down, and has apparently been so for some time. Does anyone know the prognosis? We have a lot of links to it, from this project, and sister projects. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:11, 4 November 2019 (UTC)

No idea. I would suggest that if we believe that it is not coming back that we can look to creating a new landing page at Meta: (or an agreed alternate site) that explains the site and that it is no longer active. I can then get the mapped interwiki link pointing to that page. We can then work out what we want to do with the links, and explore what is happening with the site. — billinghurst sDrewth 21:19, 4 November 2019 (UTC)

Is posting to a website considered publication here just as it is in US copyright law?

Is posting to a website considered publication at Wikisource just as it is in US copyright law? Is the distribution of visibly perceptible copies considered publication at Wikisource as it is in US copyright law? See User:Richard_Arthur_Norton_(1958-_)/Vunk_-_Quick_Burial_Ground moved to user space because not considered "published". The data is not eligible for copyright because it consists entirely of information that is common property and contains no original authorship. Let me know what you think. --Richard Arthur Norton (1958- ) (talk) 15:42, 4 November 2019 (UTC)

@Richard Arthur Norton (1958- ): Yes and no. Our governing policy on the subject is Wikisource:What Wikisource includes. Analytic works like Reimer's cemetery listing "must have been published in a medium that includes peer review or editorial controls; this excludes self-publication". Most of the time, posting to a website does not fulfil this requirement, as it is usually the author self-publishing the content, or a platform hosting the content without editorial control. —Beleg Tâl (talk) 16:58, 4 November 2019 (UTC)
The South Brunswick Township Public Library published it on their website in April of 2019 based on the copy deposited by Reimer in 1977. The website version has the SBTPL annotations and their indexing at the top of the page, putting it under their editorial control. If she had posted it on her personal website or in a personal blog, it would be self-published. See: and --Richard Arthur Norton (1958- ) (talk) 17:15, 4 November 2019 (UTC)
I'd note that posting to a website is not necessarily considered publication under US copyright law. The Copyright Office says it's unclear, and it seems that it's frequently treated as broadcasting. To be clearly publication, the page would have to explicitly offer downloads or otherwise transfer copies to people.--Prosfilaes (talk) 01:04, 5 November 2019 (UTC)
  • Please note that it is published as a webpage and as a pdf. When I click on the pdf in my Chrome Browser it downloads automatically and opens in my Adobe pdf viewer, so yes it is physically distributed. --Richard Arthur Norton (1958- ) (talk) 17:52, 6 November 2019 (UTC)
If the same site or the blog published a poem from a four-year-old, would you consider it published? I believe that we are talking about a reasonable process of peer review, and the website you discuss and its processes are not clearly that peer review. — billinghurst sDrewth 10:32, 5 November 2019 (UTC)
This isn't a poem, it's a list of names and dates. Regardless of whether the above PDF can be considered "published", I would say it is out of scope. --Xover (talk) 11:14, 5 November 2019 (UTC)
"unless it is published as part of a complete source text"—lists of names and dates can be in scope. —Beleg Tâl (talk) 12:01, 5 November 2019 (UTC)
Your example is just such a larger work that provides context to the list, even if the list is the main meat (unlike, e.g., the list being a short appendix to a hundred+-page work). It's in the borderlands, I would say, but comfortably over the line for inclusion. But the list at issue here is just a list of names and dates because it is excerpted from that context in order to circumvent copyright restrictions, making it out of scope as an excerpt too. With the context it would be copyvio. --Xover (talk) 12:09, 5 November 2019 (UTC)
(ec) Whether it is a list or not is not the issue; it is the peer-reviewed component. Websites can publish any set of snippets without the clear editorial decision and review process that we need in order to host the work. If the list is already published, then we can just as easily link to it at its present site from the author page. — billinghurst sDrewth 12:13, 5 November 2019 (UTC)
Agreed —Beleg Tâl (talk) 13:03, 5 November 2019 (UTC)
it could be data for either wikidata or commons dataset. the site says: "A paper copy is available for use at the South Brunswick Public Library." i would prefer a discussion about scope issues before moving, for some consensus. it would be better to have the scans from the "Somerset County Historical Quarterly," but unclear that cut and paste works are disruptive. better to pivot to more productive editing. Slowking4Rama's revenge 16:35, 6 November 2019 (UTC)

Tech News: 2019-45

16:47, 4 November 2019 (UTC)

Community Wishlist 2020

IFried (WMF) 19:30, 4 November 2019 (UTC)

Copyright in Ethiopia and Template:PD-Ethiopia

Please note that our {{PD-Ethiopia}} is out of date: Ethiopia enacted a copyright law in 2004 but our template refers to a 1960 version. The old law provided protection only during the author's lifetime, but the new law is pma. 50 with some PD-EthiopianGov type exemptions. Crucially, however, our template does not distinguish between copyright status in Ethiopia and copyright status of Ethiopian works in the US.

Since Ethiopia still does not have copyright relations with the US, no Ethiopian works are currently protected by copyright in the US, so they can be freely hosted here.

However, if transferring a file to Commons the distinction becomes relevant. In those circumstances, do not depend on our {{PD-Ethiopia}} tag! Each file with this tag will need to be assessed individually.

Ideally we would modify our Ethiopia-related licensing templates and then review and correctly tag all works in Category:PD-Ethiopia with both Ethiopian and US copyright status (some works may be eligible to move to Commons even under their stricter policy). --Xover (talk) 19:11, 6 November 2019 (UTC)

Fixed —Beleg Tâl (talk) 19:08, 8 November 2019 (UTC)

List of index pages

How is the List of Index Pages supposed to work? It seems that it always gives the same results no matter what is filled in the Search field. --Jan Kameníček (talk) 15:45, 8 November 2019 (UTC)

The results page says "The search engine does not work. Sorry for the inconvenience." So I assume it's supposed to work normally but is broken. —Beleg Tâl (talk) 18:48, 8 November 2019 (UTC)
Oh, thanks, my fault… --Jan Kameníček (talk) 00:36, 9 November 2019 (UTC)
Although my experience with Phabricator is much worse than bad, I have given it a try and reported it, see task T237831. --Jan Kameníček (talk) 20:54, 9 November 2019 (UTC)
In fact it had already been reported two months earlier: task T232710 --Jan Kameníček (talk) 17:51, 10 November 2019 (UTC)

Add Wikidata link to Index page

I made a thing: User:Samwilson/LinkIndexToWikidata.js. It adds a 'Wikidata item' row to the metadata table on Index pages, linking to the Wikidata item that refers to the Index page via Wikisource index page (P1957). If there's no link, it complains to you to fix it. :) To use, add this to your common.js page:
mw.loader.load('//');
—Sam Wilson 23:24, 10 November 2019 (UTC)

@Samwilson: (Stupid-hat question) Why don't we just add the field to the underlying template? Then we can gadgetify the script to make it more available. Or do we just gadgetify it anyway? — billinghurst sDrewth 06:36, 13 November 2019 (UTC)
@Billinghurst: Good question! It's because there's no sitelink from an Index page to its Wikidata item; the only link is via the URL stored in Wikisource index page (P1957), so the way a script can do it is by making a Wikidata Query Service request. A template (or Lua module) can't do that. Or do you mean, why don't we add a field for Wikidata ID to the template? That'd work, but it's duplicating the data (which is maybe not a bad thing; similar things are done elsewhere in the system). —Sam Wilson 12:25, 13 November 2019 (UTC)
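The reverse lookup described above could be sketched roughly as follows. This is not the code of the script under discussion: the function names are hypothetical, and it assumes that the P1957 value is stored as exactly the URL passed in. It only builds the request URL for the public Wikidata Query Service endpoint and does not send anything.

```javascript
// Sketch only: builds (but does not send) a Wikidata Query Service request that
// looks for the item whose "Wikisource index page" (P1957) value is a given URL.
// Function names are hypothetical; the endpoint is the public WDQS SPARQL endpoint.
const WDQS_ENDPOINT = 'https://query.wikidata.org/sparql';

function buildIndexLookupQuery(indexPageUrl) {
  // P1957 holds a URL, so the value is written as an IRI in angle brackets.
  return 'SELECT ?item WHERE { ?item wdt:P1957 <' + indexPageUrl + '> } LIMIT 1';
}

function buildIndexLookupUrl(indexPageUrl) {
  const query = buildIndexLookupQuery(indexPageUrl);
  return WDQS_ENDPOINT + '?format=json&query=' + encodeURIComponent(query);
}
```

A user script could pass the result to a fetch call and read the item URI from the JSON bindings. A template or Lua module has no equivalent way to make such a request, which is the limitation noted in the comment above.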
I stopped bothering adding the index: backlink. It isn't in the WEF framework, and I just stopped bothering as it seemed to be of limited value. If it is being added to the {{book}} template at WD, then we can inhale it with the existing script, or we can enter it manually. Means that I created it a bit earlier. I sometimes wonder whether the duplication may allow for bots to better come along and tidy up. <shrug> — billinghurst sDrewth 12:49, 13 November 2019 (UTC)

Truth be known samwilson I would like to have more of the {{book}} data on the Index: page, hopefully passively added from Commons, or pulled from WD, rather than another manual addition. For instance, where we have an IA work, I would like us to have an active link to that work. I want to be able to more readily link to the jp2 zip file of the work so we can better work with image extraction and clean up, with our no longer actively supporting {{raw image}} extraction. Unfortunately I haven't found an online tool to open an online zip and extract single images, though I am still looking.

Now I don't know the best way to complete the three way dance with Commons and Wikidata, and it is always our issue that IA starts, Commons comes 2nd, then enWS Index: 3rd, enWS main ns, 4th, then usually WD comes 5th. If WD could occur at step two or step three (more automagically) and then Index: page that would be beautiful. Though that wish has never been fulfilled, and I have asked people like Lucas Werkmeister at a conceptual level … to silence, we are way down the food chain. RexxS is really helpful, though I don't like to push acquaintanceships too hard.

What this ignoramus thinks we need is <mode start=dream>

  • Update to MediaWiki:Proofreadpage index template for fields
    • though maybe it is a separate template that we can manually insert to start, or passively embed based on data links to WD (I dunno exactly, out of my paygrade)
  • Update to MediaWiki:Gadget-Fill Index.js, which is the gadget that extracts data from Commons files and adds it to the respective Index: fields

billinghurst sDrewth 01:46, 14 November 2019 (UTC)

@Billinghurst: I'd love to help, of course, but I'm a complete noob here and I don't understand the workflow or the terminology you're using. Checking random works and authors, I find Wikidata links, but no link for a random transcription. Is that where you're stuck at? When I looked at Index:Paradise Lost (1667).djvu, I could see that the linked title and linked author both have Wikidata items, but obviously not the transcription. I think for the moment, Sam is right - you need WDQS to do the reverse lookup. However, I suspect that it should be possible to have a field on the index page that records the Wikidata item containing that link once it's been found. Magnus Manske has a bot that can create lists on wiki-pages from the results of a WDQS query, so maybe a bot run could populate such a field for you? I'll have another think and see what I can work out. RexxS (talk) 18:56, 14 November 2019 (UTC)
Thanks RexxS. The work you found is just going to be complicated for a range of reasons, so let me try something cleaner.

I have prepped a completed and transcluded work hopefully as a better example.

The three djvu-like pages are suitably populated with expected data. All are inter-related, and each contains different data. Note that the WD item is for the edition; I haven't created one for the conceptual "book".

For a work in progress of transcription: Index:The best hundred Irish books.djvu <-> c:File:The best hundred Irish books.djvu <-> Internet Archive identifier : besthundredirish00obri, no wikidata item yet as I usually create those at the end, and no book item as I gave up creating those as too much extra effort.

If we need to get down and dirty then maybe we should pick a user talk space for the conversation, or a scratch space, or an IRC chat. <shrug> Guide me, I am really happy to step through things. Noting my [understanding of WikidataIB = knowledge of WDQS = capability in Module: ns]. (I suck at programming … conceptual hole). — billinghurst sDrewth 01:26, 15 November 2019 (UTC)

Tech News: 2019-46

22:02, 11 November 2019 (UTC)

Spelling errors

I've forgotten the guidance on spelling errors in the original. "Seventeeth" in ... Rich Farmbrough, 19:10 12 November 2019 (GMT)

You can use the {{SIC}} template. --Jan Kameníček (talk) 20:02, 12 November 2019 (UTC)
@Rich Farmbrough: We reproduce as they are. If you do use the template as suggested above, it is up to you whether you include text in the second parameter. Some consider it an annotation and an assumption so do not like it, some do like it. Personally, I use it though generally leave it empty unless it is really helpful to explain the alternate word. If you want to silently leave something inline, then we also have {{sic}}. As a note, if there is a whole swag of old text being reproduced we would not tag it, we let it stand. As per WP in wikilink first error, we would only tag the first error of each type. I saw that Martin has highlighted that work in a WD talk that work that he and I did. — billinghurst sDrewth 06:30, 13 November 2019 (UTC)

If you tweet, especially about Wikisource

Hi. For those who are on Twitter and tweet about Wikisource, a new reminder that some of us maintain @wikisource_en so please do include that account in your tweets as appropriate. Either in twitter, or here, please let us know your account so that we can follow. — billinghurst sDrewth 06:25, 13 November 2019 (UTC)

I'm @pigsonthewing, and will follow the above account as soon as Twitter lets me. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:35, 13 November 2019 (UTC)

AuFCL / MODCHK / random IP editor 114...

Dear AuFCL / MODCHK / random IP editor 114... Hoping that the NSW fires are not near to you, thinking that they are to your north-west and south-west. Best of luck with what is coming through your area. Wildfire sucks. — billinghurst sDrewth 12:16, 13 November 2019 (UTC)

Amen! --Xover (talk) 16:00, 13 November 2019 (UTC)

Greek: Aerodynamics

Could somebody who is able to read and write (rather: type) Greek please enter the words in that language on Page:Aerial Flight - Volume 1 - Aerodynamics - Frederick Lanchester - 1906.djvu/415 and the following page? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:46, 13 November 2019 (UTC)

@Pigsonthewing: If you use {{Greek missing}}, the page will be added automatically to Category:Pages with missing Greek characters which is monitored by users who can type Greek. —Beleg Tâl (talk) 15:10, 13 November 2019 (UTC)
@Beleg Tâl: Something new to learn every day. Done, thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:31, 13 November 2019 (UTC)


Are signatories considered to be a kind of authors, i.e. can they have author pages even if no other work by them eligible for Wikisource exists? --Jan Kameníček (talk) 16:03, 13 November 2019 (UTC)

One more related question: If a person already has their author page, can I add there (under a separate heading) works which they have only signed? --Jan Kameníček (talk) 18:12, 13 November 2019 (UTC)
I think it depends - if they are just one of a whole bunch of signatures e.g. on a petition, I wouldn't bother creating an author page for them (but I might add it under a separate heading on an existing author page as you suggest). On the other hand, a situation like e.g. an official signing a document that was issued in their name but written by one of their staff, I would definitely treat the signing official as a full author. —Beleg Tâl (talk) 19:04, 13 November 2019 (UTC)
I see. What I had in mind was e.g. an international treaty signed by a bunch of statesmen, which is similar to your petition example. --Jan Kameníček (talk) 20:55, 13 November 2019 (UTC)
I would have just wikilinked it unless they are primary. It is one of those quandaries about how do we work with WhatLinksHere and running counts on those things to highlight where an author has exposure beyond the works they wrote. — billinghurst sDrewth 00:35, 14 November 2019 (UTC)
@Billinghurst: I see. So you do not think it should be mentioned at the author page if it is not a primary signature, right? --Jan Kameníček (talk) 13:24, 16 November 2019 (UTC)
I don't see how it is different from being mentioned/appearing in any article that we reproduce. I wikilink to their author page, and would only backlink back to the article where they are the focus. — billinghurst sDrewth 13:55, 16 November 2019 (UTC)

Ability to individually access single JP2 images from Internet Archive work archives

Now I may be completely slow on the uptake, however, today I have just identified that we can directly download individual JP2 files for the pages from a work. [If others had noticed this, then I apologise for missing your communications on this matter.]

Anyway, this means that with something like GIMP, you can paste the URL of the JP2 page directly into GIMP > Open location, load it straight into the application, and edit the best quality file.

@Xover: I think that this means we can probably steal best quality pages from another copy of the same work and rebuild files. Correct?

To see a file list from a file's /details/ page at Internet Archive follow the SHOW ALL link > beside the .zip click the "View Contents" link and VOILA a file list where you can grab a useable link. example link

@Samwilson: if you can suck in the IA link to an index file, we can simply template this based on <ia-identifier>/<ia-identifier>

Alternatively maybe we get this linked up from within the book template at Commons. — billinghurst sDrewth 03:45, 14 November 2019 (UTC)

Comment: I am thinking that at least as an initial measure we could build an optional manual IA field into {{raw image}} that can turn on a component that displays text and link to the directory listing of the JP2 file. At some later point, when we have someone clever we may be able to build some linking of the Page:{{BASEPAGENAME}}/nn back to the Index, and any data known about the Index: page could be used to automatically populate the IA field. Just thoughts, happy to hear something cleverer. — billinghurst sDrewth 06:59, 14 November 2019 (UTC)
@Billinghurst: I'm not quite following your reasoning, but, yes, from a set of individual page images I can generate a DjVu with OCR text layer, regardless of where those page images came from. This is currently using hacky and semi-manual tooling that nobody but myself would ever use unless under duress, but I am investigating options for providing some kind of access to them for anyone to use in a way that is at least reasonably functional for normal people. In the mean time I am happy to generate DjVus for people if I am provided with a comprehensible specification of what page images in what order should make up the resulting DjVu. I can also do things like swap out a page in an existing DjVu, reorder pages in an existing DjVu, delete extraneous pages, etc., and am happy to do so, but, again, provided I get a clear specification of what needs to be done.
As for your larger thrust… I don't think I'm grasping the problem you are aiming to solve?
If we presume a correctly filled out Book template at Commons, the Index:-page preloader gadget can be extended to pull in the source link from there (in fact I think it already does, we just don't store it). The Index: template here can be extended to have a field to store the value from the source field at Commons. And it's possible to make a script that tries to pick out an IA link from that and generate a direct link to the "show all files" directory listing at IA. It is not possible to create a link directly to an individual page image at IA since their page images are arbitrarily named, and because we routinely make changes to works between IA and what's uploaded to Commons (think removing Google scan pages, calibration pages, duplicate pages, etc.). It is also technically possible, today, to make a script to go directly from a page in the Page: namespace to the directory listing at IA. Some or all of these will be somewhat hacky and prone to break, but that's already the case with the Index: preloader gadget and it seems to work enough to be worthwhile. *shrug*
In any case, lots of things are possible in this area, so it's mostly a matter of articulating which problem we are trying to solve. --Xover (talk) 08:08, 14 November 2019 (UTC)
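A minimal sketch, in Python, of the script idea mentioned above: pick an IA identifier out of a source field and build a link to the item's file listing. The function names are hypothetical, and the `/details/<identifier>` and `/download/<identifier>/` URL shapes are assumptions about how Internet Archive addresses items; this is not what the Index: preloader gadget currently does.

```python
import re
from typing import Optional

# Assumed shape of an IA item link: https://archive.org/details/<identifier>
IA_DETAILS_RE = re.compile(
    r"https?://(?:www\.)?archive\.org/details/([A-Za-z0-9._-]+)"
)


def extract_ia_identifier(source_field: str) -> Optional[str]:
    """Return the first IA identifier found in a source field, or None."""
    match = IA_DETAILS_RE.search(source_field)
    return match.group(1) if match else None


def directory_listing_url(identifier: str) -> str:
    """Build a link to the item's file listing (assumed /download/ form)."""
    return f"https://archive.org/download/{identifier}/"
```

Such a helper only gets as far as the directory listing; as noted above, mapping one of our pages to an individual page image still needs a human.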
Problem 1: {{raw image}} was previously used by Hesperian as an indicator to populate converted jp2 images as png images, this upload locally stopped a while ago due to time and effort. And users had to download the PNG and clean, then we have to go through a migration and deletion process. All butt ugly.
Info Template:raw page scan (transclusions: 21,218, links: 5) / Template:Raw image (transclusions: 12,659, links: 21,262)
Problem 2: people have used the jpg images from (expanded) scans at IA as the basis of extracted images to upload to Commons, or as an ugly screenshot to upload. All butt ugly.
(solution to P1 and P2) Links to the folder enable users to at least try to get the best available quality.
Problem 3: There are broken scans here, and often people haven't fixed them as it was too hard to extract a page from a djvu, or to get a source page to OCR separately.
(solution to P3) A new source of a single page to insert into a djvu, or a new source of a single page image to OCR online and insert; I was flagging nothing more than that.
Problem 4: While the scan in the file has been good, the OCR has been rubbish.
(solution to P4) As per S3, we can OCR an individual page and paste in the text.
Re general comment: increasing our general connectivity through Wikidata<->Commons was part of the discussion earlier on this page (Samwilson's script discussion above), and connectivity to IA is more helpful still. Sure, there will be old data, and occasionally broken data, though such a process as this is more likely to find it and get it fixed. I am advocating that we keep taking these steps.
Re book => index. We haven't looked at it as a community, and Jarekt has been better developing it at Commons, and we should review how we utilise the links, the code or the data, at the moment we scrape data, rather than leverage the available sources, and then only complete fields when we need to override.
Re linking, it looks as though direct linking is possible, e.g. [14], though I was more advocating linking to the directory. — billinghurst sDrewth 09:00, 14 November 2019 (UTC)
@Billinghurst: Thanks, I'll try to see if I can come up with anything useful.
Regarding the direct linking, the problem isn't what IA provides, it's that we have no way to figure out which page image a Page: here corresponds to at IA. On the IA side, some scans count pages from zero, some from one; some include Google book pages, calibration pages, etc. that have been removed before upload to Commons, meaning our page 123 may be page 134 at IA. In other words, there's no way for a mere dumb computer to get from one to the other: you need a human being to connect the two. That said, there are things we can do to encourage the humans to add such links if we want them to: the {{raw image}} template can start by asking for an IA identifier if missing, and progress to link to the directory listing if one is provided, and also ask for an IA page identifier that will enable the direct link. --Xover (talk) 09:13, 14 November 2019 (UTC)
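The staged behaviour suggested above for {{raw image}} could be sketched roughly as follows. This is only an illustration in Python, not the logic of module:rawImage; the function name, parameter names, and message wording are all hypothetical.

```python
from typing import Optional


def raw_image_hint(ia_identifier: Optional[str] = None,
                   ia_page: Optional[str] = None) -> str:
    """Return the hint a {{raw image}}-like template might display."""
    if not ia_identifier:
        # Stage 1: nothing known yet, so ask a human for the identifier.
        return "Please add the IA identifier for this work."
    if not ia_page:
        # Stage 2: point at the directory listing and ask for the page.
        return (f"Source files for {ia_identifier} are listed at IA; "
                "please add the IA page identifier for this page.")
    # Stage 3: a human has connected our page to the IA page image.
    return f"Best-quality image: {ia_page} in {ia_identifier}."
```

The point of the staging is that each step only asks for the one piece of information a human has to supply next, rather than trying to compute page offsets automatically.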
I have started a conversation at template talk:raw image though the work is done in module:rawImage which eliminates me from the fix, though maybe not all the grunt work needs to take place in the module. I have also noticed that we give guidance at Help:Adding images and that is part of the above problem. — billinghurst sDrewth 10:24, 14 November 2019 (UTC)
@Xover: if Djvu files are ported from IA leaving the internal 'page id' unchanged, deletions, etc. should not cause problems. Inspecting the local djvu file, we could get the correct IA djvu page. This is not true if new 'page ids' are used when regenerating djvus from IA. This at least could allow offline scripts to work. It would be nice to have this info through an API command (wishful thinking, I know ....) Mpaa (talk) 19:38, 14 November 2019 (UTC)
@Mpaa: Hmm. Interesting. I hadn't realised IA did that. The 'page ids' aren't actually identifiers as such, they're a "page name" and were, I believe, intended to be used essentially like our pagelist tag. I've been avoiding using them because they make it confusing when trying to manually manipulate a DjVu file (the DjVuLibre commands operate on physical page numbers, but DjView displays the "page name"; if the two differ you get seemingly random results). However I hadn't considered the possibility of using them to document the original page image from which the DjVu page was generated. I'll play around a bit when next I touch that code and see if there's anything clever we could do there. --Xover (talk) 19:49, 14 November 2019 (UTC)
For completeness, I mean this sort of info, e.g. <PARAM name="PAGE" value="whofearstospeako00cuma_0001.djvu"/> in here. I always try to leave that unchanged. Then we just need to play with the extension. I have seen bugs related to changing these references, e.g. the page offsets we sometimes get when uploading. Mpaa (talk) 21:20, 14 November 2019 (UTC)
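As a rough illustration of "playing with the extension", the Python sketch below pulls the page name out of a PARAM element like the one quoted above and swaps its suffix. The function names are hypothetical, and the idea that the JP2 page image at IA shares the same base name is an assumption that would need checking against the actual file listing.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Matches elements of the form <PARAM name="PAGE" value="..."/>
PAGE_PARAM_RE = re.compile(r'<PARAM\s+name="PAGE"\s+value="([^"]+)"\s*/>')


def page_name(xml_fragment: str) -> Optional[str]:
    """Return the preserved page name from a PARAM element, or None."""
    match = PAGE_PARAM_RE.search(xml_fragment)
    return match.group(1) if match else None


def guess_image_name(djvu_page_name: str, extension: str = ".jp2") -> str:
    """Swap the .djvu suffix for an image suffix (assumes a shared base name)."""
    return str(PurePosixPath(djvu_page_name).with_suffix(extension))
```

This only works when the page names were left unchanged during upload, which is the condition described above.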

Work-specific disambig pages

A while ago, we agreed that it does not make sense for us to have author-specific disambiguation pages. For example, Sonnet (Shakespeare) should not exist as a disambiguation page, but instead all works by Shakespeare titled "Sonnet" should be listed directly at Sonnet and at Author:William Shakespeare.

I've noticed that we also have a number of work-specific disambiguation pages. For example, 1911 Encyclopædia Britannica/Abdera lists works entitled "Abdera" which are also part of the Encyclopedia Britannica. However, this page is redundant, as the works listed on that page are listed directly at Abdera and at 1911 Encyclopædia Britannica/Vol 1:1.

I would like to start merging these work-specific disambiguation pages into the main disambiguation pages, but I also want to get the community's input before I start. This also ties into my efforts to clean up the Wikidata items for Wikisource mainspace disambiguation pages. —Beleg Tâl (talk) 13:34, 15 November 2019 (UTC)

I would say
  • that the page "1911 Encyclopædia Britannica/Abdera" should redirect to the general disambiguation page for "Abdera" with merging of detail as required.
Philosophically we have agreed
  • one disambiguation page per term
  • where disambiguation pages contain main and other namespace items, then main namespace wins for siting
  • disambiguation pages can exist in any portal to disambiguate within a portal (above rules apply first)
While not desirable, I don't have a particular concern if we have work level disambig pages and nothing at root level where not attached to a WD item—to me they are low priority. That said, we should not have any work level disambiguation pages linked to WD, be it DB1911, DNB or whatever, and creation of further work-level pages should be dissuaded.
billinghurst sDrewth 14:17, 15 November 2019 (UTC)

Proposal for a new Featured texts badge on Wikidata

I've created a proposal on Wikidata for a new "Featured texts" badge, to complement the existing "Featured article", "Featured list", and "Featured portal" badges used by the Wikipedias. If you have an opinion, please comment there (not here). Thanks. Kaldari (talk) 18:27, 15 November 2019 (UTC)

We already use the featured article badge for featured text (aliases at d:Q17437796). I think we felt that the words are interchangeable, and there are no link overlap issues. See The Vampyre <-> d:Q58881954; if it isn't automatically appearing, that is our fault for not properly converting {{featured}} to properly leverage the tag.

We also need to better align d:help:badges of proofread, validated and digital document, as we should be building that into our {{header}} template. I note that there is a separation of Wikisource badge and Wikimedia badge. — billinghurst sDrewth 02:35, 16 November 2019 (UTC)

I have already brought up this topic on the proposal page itself. —Beleg Tâl (talk) 19:00, 16 November 2019 (UTC)