The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel (webclient). For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 394 active users here.



Collective work inclusion criteria

[This is a proposal stemming from the #Policy on substantially empty works section below.]

Since there has been no more input for a month, here we go. This is only a proposal, so any part of it can be changed, or the whole idea rejected. Inductiveload (talk/contribs) 10:58, 25 August 2020 (UTC)

Inclusion criteria for articles

Some works are composed of multiple parts that can stand alone as independent pages. These works are generally encyclopedias, biographical dictionaries, anthologies and periodicals such as magazines and newspapers and so on. Such "collective works" have slightly different criteria for inclusion in the main namespace. The aim of these criteria is:

  • To allow individually-useful articles, or sets of articles, to be transcribed to the main namespace without requiring active transcription of hundreds of pages of unrelated articles
  • To nevertheless make it easy for other users to "drop in" and add more articles to the work.

To be eligible for inclusion, a component of a collective work (e.g. a single magazine article), should satisfy the following criteria:

  • The component should be "non-trivial" in scope and importance. For example, only a title page or single-paragraph "notice to subscribers" in a magazine is unlikely to be considered useful on its own. However, it would still be part of a full transcription of the rest of the parent unit (e.g. a magazine issue).
  • The work should be scan-backed.
  • Main namespace pages should be created for the work at the top level and any intervening levels (e.g. Volume and Issue/Number ranks should exist). Sometimes, the Issue/Number rank redirects to a section on the Volume page.
  • Front matter of each intervening level (the "parent unit", e.g. a magazine volume and issue) should be transcribed and transcluded
  • A table of contents is required for the parent unit in question. Use {{AuxTOC}} if the original work doesn't contain a TOC.
  • Appropriate infrastructure around the work should exist. This might include internal plain link templates ("lkpl"), dedicated article link templates for use on author pages, formatting templates for repeated formatting elements, etc. All templates should be fully documented.
  • The article should be linked to from any relevant author pages and suitable portals
  • Oppose. An article is a complete work. The only requirement for inclusion should be that it actually is an article. This proposal would result in, for example, the deletion of huge numbers (at least hundreds) of perfectly good short stories and similar articles created over more than a decade for no good reason. I can see no reason for demanding every piece of front matter, which might consist of large quantities of indexes, adverts and other material of no great importance but massive bulk and technical difficulty. Insisting on scan backing would be extremely damaging if a particular article is or should be used as a source for Wikipedia. The need to provide online copies of sources to maintain and improve Wikipedia is overwhelmingly more important than the luxury of scan backing. Requiring the creation of templates would be a crushing burden, because most people do not know how to create them. It is in any event wholly unnecessary. Whether the article is linked to is irrelevant to inclusion. I can understand the desire for a main page that links to the article (and even that would take a lot of effort to effect in some cases where a lot of articles have already been created), but the rest is just obstructive. The problem with this proposal is that it would create a massive crushing burden that is wholly unnecessary and produces no useful benefit to the project or readers. It is burdensome restrictions for the sake of restrictions. James500 (talk) 20:18, 29 August 2020 (UTC)
  • Support. Without a system like the one you have described in place, sub-pages of works could be created wantonly without any means of completing the works from which they were derived. If an article, which is a selection from a larger work, is created without any infrastructure, it will be very difficult for other Wikisourcerors to complete the work which has been started, as they will have to find and upload a scan and set up the complicated not-article material without the aid of the person who created the first article. The new system will also make it easier for other contributors to work on smaller parts of a larger work, without worrying about demanding formatting concerns. TE(æ)A,ea. (talk) 12:30, 30 August 2020 (UTC).
    • Content creation should not be described as "wanton". There are means of completing the works from which the sub-pages were derived. If a periodical article is created without so-called infrastructure, it is very easy for other Wikisource editors to complete the work which has been started. It only becomes difficult when someone goes on a deletion spree. And it is massive numbers of nominations that cause problems. James500 (talk) 18:33, 30 August 2020 (UTC)
      • This page is a fine example of what I refer to. A novel contributor, with no previous involvement with this work, or one like it, would have to generate an entire system for reproducing (transcluding) articles from that work. The example I provide is more complete than other pages, and is much more complete, in relation to the whole work, than a single article. It would be very difficult to add to larger works, where the basis is merely articles or other pages in the state of which I complain. TE(æ)A,ea. (talk) 21:21, 30 August 2020 (UTC).
        Oh sheesh is that happening again. Fully agree with you TE(æ)A,ea that it is wanton and of little value. That content does not belong in main namespace. Main namespace is for transcribed work. Constructs and curation belong in portal namespace. I have created the portal and moved the non-mainspace material. — billinghurst sDrewth 23:17, 30 August 2020 (UTC)
        • That page was created more than a year ago. Nothing is "happening again". You did not move the bibliographic information from the mainspace page to the portal. I had to add it to the portal myself. If that important bibliographic information had been deleted by mistake, that is an example of how seriously disruptive the proposed deletion criteria could be. The word "wanton" is needlessly offensive. The primary meaning of the word "wanton" is "sexually promiscuous" and it is applied to other things by analogy. Please do not use that word. James500 (talk) 00:49, 31 August 2020 (UTC)

Comment @Inductiveload:

The proposal, as is, would inhibit the ad hoc transcription of articles from "The Times", e.g. The Times/1914 and things linked from {{The Times link}}. Is that in or out of scope for your proposal? Maybe there should be a declaration of some governing principles first: what is looking to be achieved, and indications of what is trying to be stopped. Then we can get onto a structure. I know that we created {{header periodical}} to capture where we have more sporadic collections of articles from newspapers. [Now I could be convinced that such constructions are better to be in the portal namespace rather than main ns.]

Some examples of pages considered problematic would be useful for context. If the proposal is an effort to have articles from a periodical become part of a hierarchy of the periodical, i.e. subpages, then YES, I fully support that, in contrast to random root-level pages without context to the publication. If the proposal is to set up a fully qualified structure for every periodical where we just want to reproduce one article, then NO. This is self-interest as I regularly want to reproduce an obituary for an author to establish biographical information and we are never going to get all that requisite newspaper construct data, and we are virtually never going to get the scans.

For any newspaper article I have transcribed I will generally do "Periodical name/YYYY/Article name" to give it grounding, and the article would have some "notability". The Times I did an extra hierarchy level. I will accept that there will be early works that I transcribed that may be incomplete by that standard and I would not transcribe them that way today. — billinghurst sDrewth 15:31, 30 August 2020 (UTC)

To be eligible for inclusion, a component of a collective work (e.g. a single magazine article), should satisfy the following criteria:

  • The component should be "non-trivial" in scope and importance. For example, only a title page or single-paragraph "notice to subscribers" in a magazine is unlikely to be considered useful on its own. However, it would still be part of a full transcription of the rest of the parent unit (e.g. a magazine issue).
  • The work should be scan-backed.
  • Main namespace pages should be created for the work at the top level and any intervening levels a suitable, logical subpage hierarchy developed (e.g. Volume and Issue/Number ranks should exist). Sometimes, the Issue/Number rank redirects to a section on the Volume page.
  • Front matter of each intervening level the "parent unit" (e.g. a magazine volume and issue) should be transcribed and transcluded
  • A means to navigate the subpages of the work is required; a table of contents is preferred, though alternatives exist. A table of contents is required for the parent unit in question. Use {{AuxTOC}} if the original work doesn't contain a TOC.
  • Appropriate infrastructure around the work should exist. This might include internal plain link templates ("lkpl"), dedicated article link templates for use on author pages, formatting templates for repeated formatting elements, etc. All templates should be fully documented. (additional) Parent template exist to make this readily easy.
  • The article should be linked to from any relevant author pages and suitable portals; (additional) orphaned pages are not acceptable.
    • If an article is orphaned, that is certainly a reason to add links to the relevant author page or portal. It is not a reason to delete the article. Issues that can be addressed in a very straightforward way by adding links to other pages are not suitable for use as deletion criteria. Why would you delete the page instead of just adding the links? This kind of thing belongs in a style guide. I suggest the words "eligible for inclusion" are the problem with some of these criteria. James500 (talk) 01:33, 31 August 2020 (UTC)
      We are wanting to get people to link. We don't delete a work for lack of linking, we are not that petty. What that criterion does is limit the transcription and addition of the trivial; linking indicates that it requires some relevance. — billinghurst sDrewth 14:57, 31 August 2020 (UTC)
    • @Billinghurst: I mostly agree with your formulation - that's more flexible in the case of newspapers. @TE(æ)A,ea.: has already given an example, but there are several more examples in the #Policy on substantially empty works below.
    • I do still think we should be requiring the front matter, but perhaps only when we have scans. Usually, it's just a title page or issue banner, it usually provides the date and number as in the original and it prevents the main-space page being just a floating TOC: e.g. The Chinese Repository/Volume 1 and The Chinese Repository/Volume 1/Number 1, versus, say, The London Quarterly Review/39 (which doesn't have a scan, so it's kind of fair enough in this case, but if it had a scan, it should get the front matter).
    • I was going to disagree with the removal of the scan section, but if it is downgraded to "if possible", since the current global policy is pretty much "scans if at all possible", it doesn't need to be repeated.
    • For clarification: by "Parent template exist to make this readily easy." do you mean things like Template:Authority/lkpl? Inductiveload (talk/contribs) 11:11, 31 August 2020 (UTC)
      I was meaning template:article link primarily as it is more what we have used for journals. template:authority/link is more aligned to dictionaries and the like. But yes, one of those as the parent template, or used directly. If we have a scan, then yes to front matter, so we can qualify in the regard of its existence.
  • I have a question; let's take Golfers Magazine. I expect that there will be exactly one article ever transcribed from this--Ask the Egyptians, by Rex Stout, an obscure short story by a not so obscure author. I'm glad to provide scans; I think we should demand scans for stuff that wasn't originally published digital. And it will get tucked under a Golfers Magazine/Volume 28/Issue 3/Ask the Egyptians. But how much work do you expect here? I would begrudgingly create a ToC for the issue, but messing with templates seems completely unnecessary.--Prosfilaes (talk) 14:03, 31 August 2020 (UTC)
    Personally I think that scans are nice, maybe preferred, not mandatory. Sometimes getting scans is either not possible, or just problematic. I have numerous newspapers to which I can get access through subscription sites, but producing scans to upload is just MEH! especially if I just want an obituary reproduced. (Noting that where I just want a rough transcription or a snippet that these days I put it on an author talk page.) Have a poke at Category:Obituaries for a range of sources that I and others have used.

    For your example, I would have gone for "Golfers Magazine/YYYY/article name" and then slapped down {{header periodical}} at the root level, as we get more years, then we can break it down further. — billinghurst sDrewth 14:57, 31 August 2020 (UTC)

  • @Prosfilaes:, what I think would be nice here might be:
    • The top level page, pretty much as it is. Doesn't look like there's much more to say about this work.
    • I can't really see any sensible templates (note "might include" in the proposal) to create for this work. It's not a dictionary so it doesn't obviously need a lkpl, and it's not big enough to merit an article link template of its own. Perhaps if all the headers are identical, there could be a formatting helper, but not critically needed.
    • Personally, I'd like to see the cover if there is one and it's "nice" like this one (obviously not a library binding), and the issue header on the issue sub-page, but I can see the argument that it's a bit pointless if there is no intention to transcribe the rest of the issue. The TOC (which already exists in the original work) is something I'd prefer to see if possible, but I do get that it's a bit of an imposition in this case, where only one article is "interesting".
    • A list of the known scans somewhere (90% of periodicals seem to do this in the mainspace, but that's evidently controversial). It looks like Hathi has an incomplete list and the IA has another Google-fied copy of v.12, so in this case probably just what Hathi has. A lot of the time a mish-mash is needed to get a set of links. Uploading is strictly optional - obviously preferred, but we all know how much of a pain it is, and page-listing and checking periodicals is pretty masochistic, so it's absolutely not needed.
    • Again personally, I prefer "Golfers Magazine/Volume 28/Issue 3/Ask the Egyptians" to "Golfers Magazine/1916/Ask the Egyptians" since we might as well put things in the correct place ahead of time and it provides the obvious place for things like front matter. But I know that's not how it's always done, especially for newspapers where the content is often even more sparse, proportionally speaking, than magazines. Inductiveload (talk/contribs) 15:54, 31 August 2020 (UTC)
      @Inductiveload: If we can get that data, then that is definitely preferred, and I would think that for journals we would encourage it. For newspapers, I doubt that we are going to get the coverage, and they are just a lot harder due to how those beasts are constructed. Probably a case of differing guidance, and different tolerances. — billinghurst sDrewth 14:23, 20 September 2020 (UTC)

No-content mainspace pages

This one is probably even more controversial so it's a separate proposal:

Collective works are commonly referenced by other works. Due to this, it is permitted to pre-emptively create the top-level main namespace page to collect incoming links, even when there is no content ready for transclusion. This also allows labour-intensive research into location of scans to be preserved and presented to users even when no transcribed work has been completed. The following is required for such a work:

  • A header with a brief description including active dates, major editors, structure (e.g. series) and so on
  • Redirects from alternative names (e.g. when a work has changed name or is referred to by other names)
  • A listing of volume scans should be added, and it should be as complete as possible, based on availability of scans online. As always, creating Wikisource index pages is preferred, but external scans are acceptable.
  • Creating sub-pages (volumes or issues) should follow the article inclusion criteria. This means a sub-page should not be created if there is no content.
  • Oppose As above, these restrictions are an unnecessary burden that would produce no real benefit and presumably result in a lot of deletions. We do not need lists of editors. We do not need a complete list of volumes. (There may be hundreds of volumes of a particular periodical that have scans. For example, a page with links to scans of twenty volumes should not be deleted because the creator failed to link to scans of another eighty volumes.) Lack of redirects is not a reason to delete these pages either. James500 (talk) 20:37, 29 August 2020 (UTC)
  • Support, mostly. Generally speaking, I think that if a periodical changed its name, then there should be a separate page under the new name; however, redirection pages from alternate titles would be preferable. The other requirements are not overmuch burdensome, and would make useful a page that is otherwise empty, due to a lack of transclusions. TE(æ)A,ea. (talk) 12:30, 30 August 2020 (UTC).
    • None of our periodical pages includes the names of the editors, as far as I am aware. Not one. Under this proposal, every single periodical we have would be deleted. Further, it is not possible to include the names of the editors when they are anonymous. James500 (talk) 18:24, 30 August 2020 (UTC)
      • @James500: "every single periodical we have would be deleted" - or we could make the effort to improve such works as we find them. Generally, an excerpt from Wikipedia or some other source would do just to provide some context. E.g. The Condor vs The Journal of Jurisprudence, which has the dates, but not other useful info, not even the country. For example, even a quick trawl would allow one to write something like "The Journal of Jurisprudence was a Scottish law journal published in Edinburgh from 1857 to 1891. The first successful Scottish law journal, it covered all aspects of the Scottish legal system and included editorials, biographies and short articles as well as case law and reporting of legislation. It merged with the Scottish Law Magazine in 1867. It was largely replaced by the Juridical Review in 1891.". The editors aren't particularly obvious here (so they're not "major editors"), but sometimes editors are important to the work's history and are explicitly noted, e.g. All the Year Round or The New-England Courant.
      • Basically, if a page has zero or near-zero transcribed content, in my mind it can edge over the line into acceptable as long as it's providing useful auxiliary bibliographic information, which might also include collation of various names. This is somewhere WS can actually provide value-add - nowhere else online, as far as I know, provides a venue for this information (IA/Google metadata is terrible, OCLC is not very good at periodicals, Hathi can't be downloaded from easily, none are editable, often a complete scan list uses various sources, etc). However, "it was a periodical and here's a handful of raw external links, kthxbai" doesn't quite cut it, even for someone who thinks these pages can be useful like me.
      • I've said it before several times, but the aim here is not, not, not to get all the pages like The Journal of Jurisprudence deleted, but instead figure out what needs to happen to keep them. To me, a decent blurb and a tidy list of volumes and scans will do it, but that's far from consensus. As it stands, as far as I can tell, the only reason half of Portal:Periodicals isn't getting unceremoniously dumped into Portal space (something I personally would like to find an alternative outcome to) is no one really wants to deal with it. We can fix that by coming up with a minimum level which the pages should meet and then fixing them up. Inductiveload (talk/contribs) 12:37, 31 August 2020 (UTC)
    • @TE(æ)A,ea.: about the names, above is an example, where The Journal of Jurisprudence absorbed the Scottish Law Magazine in 1867. Though technically after the merge TJJ became The Journal of Jurisprudence and the Scottish Law Magazine (e.g. here, but not the title pages), it was still the same work. So in my mind, we could have The Scottish Law Magazine running up to 1867 and then The Journal of Jurisprudence for 1857–1891, with notes about the merge in both headers.
    • Another example of a work that changed name, but remained the same fundamental work is Monthly Law Reporter, which was just The Law Reporter for the first 10 years, and even kept the volume sequencing over the name change (though it added a "new series" number). So The Law Reporter should probably be a redirect. Inductiveload (talk/contribs) 12:37, 31 August 2020 (UTC)
      • The Scottish Law Magazine [and Sheriff Court Reporter] was originally called the Scottish Law Journal and Sheriff Court Record. It has a page already which includes the volumes up to 1867. James500 (talk) 15:10, 1 September 2020 (UTC)
        • @James500: Then a link to it should have been in the description already. I have added it and expanded the description as above. Feel free to add more details. Inductiveload (talk/contribs) 15:50, 1 September 2020 (UTC)
  • Comment Periodical main namespace pages should not contain the curated information of scans, etc.; that is the job of the Portal: namespace. Main namespace should only contain published information for works that we have prepared. So under your proposal, the main ns can exist, and it should contain contents of works that we have transcribed, and there should be a corresponding portal: or there can be a constructed Wikisource: project page where there is a project to do the work. This was discussed years ago, and we have been moving those constructs to portal namespace for years. If there is zero content at the page, and we are unlikely to have it, then it can be redlinked, or maybe if it is that obvious then we don't need a link at all. Examples would be useful. — billinghurst sDrewth 15:42, 30 August 2020 (UTC)
    • You are the only person moving these pages into the portal space. I would like to see a link to the alleged discussion you refer to. James500 (talk) 18:24, 30 August 2020 (UTC)
  • @Billinghurst: I personally don't see huge value in simply shunting just scan links to Portal and leaving them there:
    • It eventually leads to having two parallel volume lists, one with links and one without, sometimes with divergence.
    • It tends to end up with "scratchpad-level" content in Portal, which is supposed to be a nice presentation space.
    • Portals are badly integrated and will probably not be noticed by casual users, or even many Wikisource editors. Especially as the Portal headers never seem to actually link to the mainspace works that exist, but we can fix that.
  • I suggest Portals like Portal:Punch provide some useful value-add, whereas Portal:Notes and Queries does not (yet), and its current content, if anywhere, should be on a WikiProject, just on the mainspace talk page, or even nowhere now all the volumes are uploaded. If the consensus truly is to shunt this all to Portal and move back once there's content, then fine, but I do wonder if that's truly the most ideal strategy. From a pure "only reproduced content in mainspace" angle, perhaps, but does that serve readers best? Inductiveload (talk/contribs)
    @Inductiveload: Main namespace is content for the reader. There is nothing worse for a reader to go to a page and have to drill down multiple pages to find that there is no content just some dashed skeleton of hierarchy. Main namespace is not built to drive transcribers and transcriptions, that is our other content spaces. We can create a page there once we have content to display what we have to read, and point to the portal for what we have to transcribe. It is the reason we put in place the portal namespace. — billinghurst sDrewth 15:08, 31 August 2020 (UTC)
    I also wish to avoid the really ugly situation of people uploading a work, creating the front page, and then just leaving it for other people. That facadism of a work is just problematic, and we know that nothing happens to it. It is why we developed {{ext scan link}} and {{small scan link}} for use in the author namespace to do that role of managing that list build. So portal and author namespaces play that role and keep main namespace cleaner and more functional. — billinghurst sDrewth 15:15, 31 August 2020 (UTC)
  • @Billinghurst: I'm not saying that we should be creating pre-emptive "empty" hierarchies. I'm saying that I don't really see the point of shunting all the scan links off to a portal where they will basically never be found by anyone who isn't extremely familiar with Wikisource and the mainspace/portal split. If a casual reader is after, say, Volume 22 of The Atlantic Monthly, for which we have neither scans nor content, do we serve them better by placing a scan link to the IA on the mainpage next to the redlink so that they can at least find what they wanted, or is it better to have no redlink at all, skip Volume 22 in the list and maybe put the IA link at a portal? If the latter, I'm fairly certain 95%+ of people will just not find that link at WS. We can certainly adopt a stance of if it doesn't exist here, we don't even want casual readers to be presented with an external resource, but that seems slightly walled-gardenish for an open project.
  • "Facadism" is annoying, and it (or the perception of it) is what has brought us to this point via the proposals at WS:PD. As an example from that page, I don't find the concept of the page American Law Review intrinsically offensive in mainspace, even without any content (though perhaps it's a little untidy as-is), but I don't really see the point of American Law Review/Volume 1 as it stands (only a title page and redlinked TOC, though it's a single article away from being useful to me).
    • Notably, I find "facadism" of a collective work much less annoying than, say, only having the preface to a novel. Collective works can have individually-useful things slotted in bit by bit, and if there's a framework around the work, it's even easy to do.
  • And if we do want to ditch this proposal and be strict with Portals in this way, then 1) it needs to be documented that that's how it works (Wikisource:Portal guidelines and Help:Portals don't mention use of Portals for this purpose at all, they focus more on thematic curation) and 2) most existing periodicals need to be converted over: many people reasonably imitate existing structures, we can't blame them for that.
  • And do we allow redirection from a non-existent mainspace page to the portal so it can be found via "normal" linking until such time as there is content? Inductiveload (talk/contribs) 17:09, 31 August 2020 (UTC)
  • The word "facadism" is needlessly offensive and should be deprecated in favour of something that doesn't sound like it refers to habitual dishonesty. I would urge that care be taken when coining neologisms to consider how these words might be taken. James500 (talk) 15:32, 1 September 2020 (UTC)
    What? It means that there is a face only. Nothing more. There is nothing offensive about it and I don't even see where you can draw that inference. You are digging too deep or looking for insult. Front-pageism is meh! So unless you can find a better term can you please AGF. — billinghurst sDrewth 18:58, 1 September 2020 (UTC)

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

Index:A pronouncing and defining dictionary of the Swatow dialect, arranged according to syllables and tones.djvu → Index:Dictionary of the Swatow dialect.djvu and related pages/transclusions

Request to shorten this hideously long file name.

See Index talk:A pronouncing and defining dictionary of the Swatow dialect, arranged according to syllables and tones.djvu#Title of the book; c:Commons talk:File renaming#Renaming a file to have a shorter name?.

The file on Commons has already been renamed. Suzukaze-c (talk) 09:22, 22 August 2020 (UTC)

👋 Suzukaze-c (talk) 05:09, 14 September 2020 (UTC)

Index:The famous speeches of the eight Chicago anarchists in court.djvu

Can someone (@Xover: you are good at this) fix the text layer offset in this Index? —Beleg Tâl (talk) 15:45, 23 September 2020 (UTC)

@Beleg Tâl: Done. Minimal quality control, but I rebuilt it from the source scans at ~2.5x resolution, and a few spot checks didn't show any OCR offset problems. --Xover (talk) 17:54, 23 September 2020 (UTC)

Other discussions

PD-anon-1923 again

The discussion of Happy Public Domain Day! has slipped into the archives without reaching a conclusion, so I would like to remind everyone that the last suggestion in that discussion was to create {{PD-US|year of death}} and deprecate {{PD/1923}} and {{PD-anon-1923}}. Is this solution OK?

BTW: if we decide to keep calling the license templates for pre-1925 works {{PD/1923}} and {{PD-anon-1923}}, it would be necessary at least to adapt the latter one so that it could be used for 1924 anonymous works too. --Jan Kameníček (talk) 16:21, 20 February 2020 (UTC)

Support the change — I don't really care but it makes sense —Beleg Tâl (talk) 16:36, 20 February 2020 (UTC)
  • Support likewise —Nizolan (talk) 01:54, 21 February 2020 (UTC)
  • Oppose because the name emphasizes US. The point of the templates is to cover both US status and international status. A template that names the US will cause confusion, especially to newcomers. --EncycloPetey (talk) 02:02, 21 February 2020 (UTC)
    @EncycloPetey: So in your opinion, even fixing a math error requires consensus? Without consensus, should we believe 1+1=3 rather than 1+1=2? --Liuxinyu970226 (talk) 01:37, 1 April 2020 (UTC)
    Changes to established templates require consensus. We've had previous discussions and the community is divided on the issue concerning these templates. Proceeding with a change when the community has expressed such division is inappropriate because of the community discussion, not because of my opinion. --EncycloPetey (talk) 02:05, 1 April 2020 (UTC)
  • Support. We are US-centric in our copyright approach. Given the number of times I've had to look up these types of templates here and on Commons, I might buy the idea that we should copy them, but otherwise, I think this is going to be as non-confusing as we get.--Prosfilaes (talk) 04:35, 21 February 2020 (UTC)
  • Comment In your proposal, how do we code the year of the author's death for anonymous works? --EncycloPetey (talk) 04:38, 21 February 2020 (UTC)
    I am afraid I do not understand the question: anonymous works do not have any known author. I propose that for anonymous works we would have a template with similar wording as {{PD-anon-1923}}, but it would be called {{PD-anon-US}}. --Jan Kameníček (talk) 09:42, 21 February 2020 (UTC)
    That's also problematic, because the US is just one place that we display license information for. The current template displays that information for both the US and for countries with 95 years pma. --EncycloPetey (talk) 19:46, 21 February 2020 (UTC)

Comment If there is a consensus to act, my recommendation is that we just move/rename the templates:

  • pd/1923|yyyy -> PD-US|yyyy, yyyy=YoD, displays two templates as now
  • PD-1923 -> PD-US, where there is no $1 parameter, it displays the one template
  • PD-anon-1923 -> PD-anon-US|yyyy, year of publication

and update the documentation around the place. Do any required tidying of the template internals, and fix double redirects. No need to deprecate anything, just move to the new nomenclature, and not worry about any of the old usage, or anyone continuing its use, as it matters not. — billinghurst sDrewth 11:15, 21 February 2020 (UTC)

  • Oppose Firstly, because of the US emphasis. Yes, we follow US copyright law, but we also serve an international readership, not to mention contributors who are also bound by the copyright laws of other countries. Secondly, I think replacing "PD-1923" with "PD-US" is confusing. "PD-US" sounds like a generic template for "this work is PD in the US", but under this proposal it would mean "this work is PD in the US for the specific reason that it was published more than 95 years ago". BethNaught (talk) 22:16, 21 February 2020 (UTC)
    I do not understand in what way "the readership" is concerned in this… They see only the text of the template which is going to stay the same. --Jan Kameníček (talk) 23:08, 21 February 2020 (UTC)
    Comment I do not think that the suggested name of the template is more American-centred than the old one. E.g. {{PD/1923|1943}} has two parts: "1923" is the American part referring to the American copyright laws, and the parameter "1943" is international, referring to the countries where PD depends on the year of death. Nothing would change, only the American part would be called "US" instead of the nowadays nonsensical 1923; I really do not see any problem in that. --Jan Kameníček (talk) 23:08, 21 February 2020 (UTC)
    @BethNaught: The thing is that the only consideration we give to copyright compliance with regard to hosting is to the US copyright. Unlike Commons, we don't really care whether it is copyright in the country of origin. It is for this reason that I am reasonably comfortable with just stating PD-US and variants. The additional PD-old-70 and variants are for information only. — billinghurst sDrewth 00:43, 22 February 2020 (UTC)
  • Comment I think this is an important issue, and I'd like to weigh in. I'm probably as familiar as (almost) any Wikimedian with the considerations around copyright law in various countries. But I do not see a clear statement of what the problem is that we're aiming to solve, or what the pros and cons are. I'm sure if I took an hour or two to dig through various archives, I could probably figure it out, but I'm not likely to have the time for that...nor should we expect every voter to do that. So given all that, I'm inclined to gently oppose, simply because I can't figure out what's going on, and it seems unwise to make a change that is difficult for community members to evaluate. Is it possible to sum up the issues more concisely so that I can give it more proper consideration, without having to do all the research myself? -Pete (talk) 22:44, 21 February 2020 (UTC)
    The problem I see is this: Until 1923 it made quite good sense to have a template called PD-1923, because it referred to the fact that only pre-1923 works are in the public domain. However, the situation has changed: currently the time border is 1925-01-01 (or 1924-12-31) and it shifts every year. I find it very confusing to call the template for pre-1925 works PD-1923 (why 1923???). At the same time it does not make sense to change the name of the template every year (PD-1923, …, PD-1925, …); it would be better to find a fitting universal name. --Jan Kameníček (talk) 23:16, 21 February 2020 (UTC)
    Ah, that's very helpful @Jan.Kamenicek:, thank you. I had misunderstood, I thought you were proposing a change to the functionality in addition to the name change.
    I agree that changing the name (a) such that it specifies "US" and (b) such that it references the 95 year rule, rather than the (now outdated) 1923 rule would be worthwhile. I agree with others that we should be cautious about US centrism; but the reality is, with a current title that assumes that it relates to US law, without stating it, we already have a high degree of US centrism in the title. In my view, it's better to state "US" as part of the name, to make it clear to editors (who are the primary audience for a template name) that it's about US law. So, my suggestion would be {{PD-US-95}} or similar. That conveys that it's about US law, and it's about the 95 year rule. Text on the template page/docs could clarify that the 1923 rule is now outdated, and subsumed under the 95 year rule.
    A related issue that I find confusing: I don't understand why we need two separate templates for {{PD-1923}} and {{PD/1923}}. I think this proposal only relates to the latter; would we be leaving PD-1923 intact? A decision on this is probably a matter for a separate discussion, but I'd like to know for sure what the intent of this proposal is. -Pete (talk) 23:45, 21 February 2020 (UTC)
    PD-1923 has no decision-making; it applies just a single template and does not add the PD-old-nn variants. It has been utilised where we have been unable to determine a date of death, or for corporate publications which do not have PMA decisions. I addressed above that they would morph into PD-US, though we would need to handle them as parameterless. — billinghurst sDrewth 00:51, 22 February 2020 (UTC)
    Jan, that's not quite correct. Works published before 1923 are still in PD in the US for the same reason they were before. The 1923 date was a cutoff date beyond which we have never had to check. What has changed is that works that were under copyright later than that (from 1923 and 1924), and had their copyright renewed at one point, have now had that copyright protection expire. The works published before 1923 were not eligible for renewal and entered PD for a different reason than the works published in 1923 and 1924. It is one view to see the date as a shifting cutoff, but the cause of works from 1923 and 1924 entering public domain is actually different from those that were published prior to 1923. --EncycloPetey (talk) 03:13, 22 February 2020 (UTC)
    All works published more than 95 years ago are out of copyright because of the time since publication, no matter whether that's due to copyright notices, or renewals, or being in copyright for a full long term. For a work published before 1923, we've never been concerned about copyright notices or renewals, nor how long work published with copyright notice and renewal got in copyright. Why does it matter that a work published in 1924 may have got 95 years of copyright, whereas a work published in 1922 may have only got 75, when we don't really care about that 95 or 75 in the first place? We have no tag for "published abroad before non-US works got copyright in the US in 1891", because we don't care; it has always been sufficient for our purposes to say that it was published before 1923, and I don't see why it is not now sufficient to say that it was published more than 95 years ago.--Prosfilaes (talk) 04:59, 22 February 2020 (UTC)
    @Prosfilaes: I am presuming that this is in reference to the primary notice about copyright within the US, not the secondary notice for PD-old-nn which relates to copyright elsewhere in the world. The secondary notice can still apply for those of us not in the US, which is why we added it. — billinghurst sDrewth 05:08, 22 February 2020 (UTC)
    Yes, the primary notice. There's no need to worry about now-historical features of non-US countries, but certainly helpful to list the years since death.--Prosfilaes (talk) 05:18, 22 February 2020 (UTC)
    Yes and no. There are authors who have works published prior to 1925 who died late enough to still have works in copyright in their home country, so those notices are still very pertinent per Category:Media not suitable for Commons. — billinghurst sDrewth 05:30, 22 February 2020 (UTC)
    Right; I didn't mean to imply we should change the current secondary notices.--Prosfilaes (talk) 06:42, 22 February 2020 (UTC)
  • Support U.S. copyright is of primary concern to Wikisource. Fixing the license so more 1923 and 1924 works appear on Wikisource even if still under copyright in other countries is so important. Abzeronow (talk) 19:46, 16 March 2020 (UTC)
  • Support as this seems like the least problematic solution to the problem, and it doesn't make sense for us to keep delaying a resolution. Kaldari (talk) 18:09, 14 April 2020 (UTC)
  • Comment It looks as though some people are hedging their bets: arguing for deprecating the template on the one hand but arguing for improving the template on the other. Since the template content has now changed, before this discussion has concluded, then procedurally we should recast all votes, since the template named in this discussion thread no longer has the content it had at the start of this discussion. --EncycloPetey (talk) 20:42, 24 April 2020 (UTC)
    Hedging their bets? It is somehow improper to try and improve Wikisource for now, whether or not this template gets deleted? If we're going to get pedantic about policy, where is it written on the English Wikisource that we should recast all votes?--Prosfilaes (talk) 06:41, 25 April 2020 (UTC)
    No need to restart the votes, as the changes have been reverted. The template is the same as it was before the voting started. No changes should be made to any template if there is a discussion and voting ongoing about its future. If the changes were allowed and at the same time we would have to restart the voting after every change, we may never come to a conclusion; not everybody has time to vote about the same problem again and again. --Jan Kameníček (talk) 09:50, 25 April 2020 (UTC)
  • Support If consensus is needed to fix math errors, so be it. --Liuxinyu970226 (talk) 09:01, 7 May 2020 (UTC)
  • Comment Please note that the new date, 1925, applies to all works except sound recordings (and maybe architecture). The date for sound recordings is 1923. That isn't shown in the local summary of the Hirtle chart, but is in the original. (I dropped a more detailed comment below.)--Sphilbrick (talk) 14:29, 20 July 2020 (UTC)
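
For readers trying to follow the "shifting cutoff" point made in this thread, here is a minimal sketch of the arithmetic, assuming only the 95-years-from-publication rule that several participants describe. The function and constant names are hypothetical and do not correspond to any existing template or tool. It deliberately ignores the special case of sound recordings mentioned above, as well as notices, renewals and non-US terms.

```python
# Illustrative only: models the rolling US cutoff as "published more than
# 95 years ago". Names are hypothetical; sound recordings and other special
# cases raised in the discussion are NOT handled.

US_PUBLICATION_TERM_YEARS = 95


def us_pd_cutoff_year(current_year: int) -> int:
    """Return the earliest publication year that is not yet PD by age alone.

    Works published before 1 January of the returned year are treated as
    public domain in the US under the 95-year rule.
    """
    return current_year - US_PUBLICATION_TERM_YEARS


def is_pd_by_publication_age(publication_year: int, current_year: int) -> bool:
    """True if a work published in publication_year is PD by age in current_year."""
    return publication_year < us_pd_cutoff_year(current_year)


if __name__ == "__main__":
    # In 2020 the cutoff is 1925, matching the "pre-1925" wording in the thread.
    print(us_pd_cutoff_year(2020))
    print(is_pd_by_publication_age(1924, 2020))
    print(is_pd_by_publication_age(1925, 2020))
```

Because the cutoff is computed from the current year, a name built around the rule rather than a fixed year would not need renaming each January, which is the point several supporters make.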

Policy on substantially empty works

[This is imported from WS:PD, where it applies to multiple current proposals, and several other works].

We have quite a few cases of works that are "collective" or "encyclopaedic" in that they comprise many standalone articles of individual value, which are basically just "shell pages", with no substantial content of any sort, not even imported scans or Index pages. For example, and this isn't intended to make any statement about these specific works, they're just examples and they may well get some work done soon during their respective WS:PD discussions:

Based on the usual rate of editing for things like that, unless dragged up into a process like WS:PD, they'll remain that way a very, very long time. I think there might be a case to host a mainspace page for such a work, even though there is zero, or almost zero actual content. Do we want:

  1. Mainspace pages where there is a tiny bit of information like header notes, scan links and maybe detective work on the talk page (not in this case). This provides a place for people to incrementally add content. It also gives "false positive" blue links, since there is actually no "real" content from the work itself, or
  2. Do not have a mainspace page until there's some content. Only host this in terms of author/portal scan links, much like we do for something like a novel.

Personally, I lean (gently) towards #2, but with a fairly low bar for how much content is needed. Say, Indexes, basic templates, a title page and one example article. Ideally, a completed TOC if practical, especially for periodical volumes/numbers. It is fair to not wish to transcribe entire volumes of these works, it is fair to not want to import dozens of scans when you only wanted one, it is fair to only want an article or two, but it's not fair, IMO, to expect the first person who wants to add an article to have to do all the groundwork themselves, despite having been lured in with a blue link. That onus feels more like it should be on the person creating the top-level page in the first place.

I do see some value in periodical top pages with decent lists of volumes and scans where known, because these are often tricky and fiddly to compile from Google books/IA/Hathi, so it's not useless work, even if there are no imported scans (though imported is better than not).

We currently have a large handful of collective works listed for deletion in various levels of "no real content", and, furthermore, every single periodical that gets added can fall into this situation unless the person who adds it also does the groundwork, so I think we could have a think about what we really want to see here. Inductiveloadtalk/contribs 15:43, 3 July 2020 (UTC)

  • I believe that, if there is no scan as an Index: page, the main-namespace page should not exist unless it is being actively completed or is already mostly completed. A few pages (of the volume itself) are not very helpful, and are entirely useless if there is no scan given. TE(æ)A,ea. (talk) 15:59, 3 July 2020 (UTC).
  • I think such preparatory information would ideally be on more centralized WikiProject pages (for the broad subject), both for clarity and to assist in keeping different efforts consistent -- but that it certainly should be retained as visible to non-admins. I think that the red vs blue link issue is minor (but not totally negligible) and outweighed by the disadvantages of hiding the history of previous efforts. I strongly encourage redirecting such pages to appropriate WikiProject pages (after copying over the details there). JesseW (talk) 18:11, 3 July 2020 (UTC)
  • @JesseW: I agree that history shouldn't be deleted, but I think we should approach this in terms of what we want to see from these works, rather than what to do with the handful of examples at PD. There are hundreds of periodicals we could have but don't, and this applies to those as well. If we can come to a conclusion about what is and isn't wanted, we can make all the deletion requested works conform to that easily enough. Inductiveloadtalk/contribs 20:55, 3 July 2020 (UTC)
  • I think these pages are necessary to list index pages and external scans of multi-volume works (such as encyclopaedias and periodicals) especially if they are wholly or partly anonymous or have many authors or are simply large. I think it makes no difference whether such pages are in the mainspace, the portal space or the project space (except that it is harder to find pages outside the mainspace). The point is that these works often have so many volumes (often dozens or hundreds) that they must have their own page, and cannot be merged into a larger portal or wikiproject. If the community starts insisting on index pages, what will happen is the rapid upload of a large number of scans for the periodicals that already have their own page. Likewise if the community insists on transclusion. I also think it is reasonable to have a contents page in the mainspace, as it allows transclusion of articles. Most importantly, new restrictions should not immediately apply to existing pages that were created before the introduction of the restrictions. This is necessary to prevent a bottleneck. James500 (talk) 23:55, 3 July 2020 (UTC)
move the works to a maintenance category, and i will work them; delete them and i will not: i find your sword of Damocles demotivating. Slowking4Rama's revenge 01:55, 5 July 2020 (UTC)
@User:Slowking4: I am not proposing a sword of Damocles. I agree that the imposition of deadlines is counter-productive. I do not support the deletion of any of these pages. I would prefer to see them improved. James500 (talk) 04:38, 5 July 2020 (UTC)
TEA is on his usual deletion spree. not a fan. will not be finding scans to save texts, any more. he can do it. Slowking4Rama's revenge 00:15, 6 July 2020 (UTC)
The entire point of moving this here, and not staying at WS:PD is to decouple from the emotions that get stirred up in a deletion discussion. Let's keep deletion out of this. If we come up with some idea of what we do and don't want, then we can go back to WS:PD and decide what to do. I imagine that all that will be needed will be a fairly limited amount of housework to bring those works up to some standard that we can decide on here, and all the collective works there will be easy keeps. Hopefully with some kind of consensus that we can point at to outline a minimum viable product for such works going forward. There are hundreds and thousands of dictionaries, encyclopedias, periodicals and newspapers that we could/will, quite reasonably, have only snippets of. How do we want to present them? What, exactly, is the minimum threshold? Let's head all those future deletion proposals off at the pass, because deletion proposals often cause friction. Inductiveloadtalk/contribs 00:47, 6 July 2020 (UTC)
and yet deletion is the default method to "motivate" quality improvement. i reject your assertion that "emotions get stirred in a deletion discussion", rather, anger is a valid response to a repeated broken process being kicked down on the volunteers. it is unclear that a minimum threshold is necessary, rather a functional quality improvement process is. until we have one, you should expect to see this periodic stirring of emotions, as the non-leaders act out. Slowking4Rama's revenge 11:53, 9 July 2020 (UTC)
@Slowking4: Thank you for presenting this opinion, and I'm sorry if I have not made myself clear. We do need to figure out how to avoid a de-facto process of using WS:PD as an ill-tempered ad-hoc venue for "forcing" improvements on people who have somehow managed to generate works that are so in need of improvement that another user has nominated them for deletion. Please also consider looking at #Re-purpose_WikiProject_OCR_to_WikiProject_Scans for an idea to have a "functional quality improvement process" to which such works could be referred upon discovery rather than kicking them straight to WS:PD. If you have other ideas or you have previously suggested something similar to address these frustrations, you could detail them there. Personally, I think we should always prefer improvement over deletion. Exactly what the remediation is (refer to a putative WP:Scans, WS:Scriptorium/Help, directly WS:PD as now, or something else) is not what this thread is for. This thread is for discussing what, if anything, should be the tipping point for deeming a page "lacking" and doing something about it, whatever "something" is. I don't think I can be much clearer that this is not about deletion. If we also have a better venue for improvements, then that's even better.
For example, my personal feeling and !vote on A Critical Dictionary of English Literature is "keep and improve", despite it lacking scans or even links to scans, having only one article and no other content, not even a title page: in short, failing almost every criterion suggested so far in this thread. The only thing it does have is good text quality of the one entry. I personally do not think this work should be deleted, but I do think it should be improved in specific ways. The first half of that sentence is not the focus of this discussion, the second half is. Inductiveloadtalk/contribs 14:18, 9 July 2020 (UTC)
deletion threat has been an habitual method of communicating by admins since the beginning of the project. and text dumps have been habitual, following the Gutenberg example. culture change and process change would be required to change those behaviors. we could make it easier to start scan backed works, but the wishlist was not supported. Slowking4Rama's revenge 21:00, 14 July 2020 (UTC)

I don't think this needs to be much of an issue going forward -- we all agree that it's OK to create Index pages for scans, even if none of the Pages have been transcribed yet; so the only case where this would come up is recording research where no scan has yet been identified as suitable to be uploaded. And for that, I still think a WikiProject page is the right location, not mainspace. (Or, if you must, your userpage.) JesseW (talk) 00:59, 6 July 2020 (UTC) I realized I may not have been clear enough here -- in my view, the ideal process goes like this:

  1. Decide on a work you are interested in (in this case, a periodical/encyclopedic one) -- don't record that anywhere on-wiki (except maybe your user page)
  2. Find and upload (to Commons) a scan of one part/issue/etc of the work.
  3. Create a ProofreadPage-managed page in the Index: namespace for the scan. (You can stop after this point, without worry that your work will later be discarded.)
    1. Put further research (on other editions, context, possible wikification, etc.) on that Index_talk page.
    2. Proofread a complete part of the scan (an article from the magazine issue, a chapter from the book, an entry from an encyclopedia, etc.) and transclude it to the mainspace (and create necessary parent pages), and put the further research on the Talk: page of the parent mainspace entry.

If you can't find any scan, and don't want to leave your working notes on your user page, put them on a relevant WikiProject's page.

If you come across such research done by others and misplaced, follow the above process to relocate it to an appropriate place, then redirect the page where you found it to the new location. That's my proposal. JesseW (talk) 01:08, 6 July 2020 (UTC)

@JesseW: It's not clear to me in your above whether when you use the term "index" you refer to a ProofreadPage-managed page in the Index: namespace, or a general wikipage in the main namespace on which an index-like structure (and/or a ToC, or similar) is manually created. Could you clarify? --Xover (talk) 05:14, 6 July 2020 (UTC)
I meant the namespace. Clarified now. JesseW (talk) 05:17, 6 July 2020 (UTC)
  • Hoo-boy. Y'all sure know how to pick the difficult issues…
    My general stance is that: 1) scans and Index: (and Page:) namespace pages have no particular completion criteria to meet to merit inclusion, and can stay in whatever state indefinitely (there may be other reasons to get rid of them, but not this); and 2) the default for mainspace is that only scan-backed complete and finished works that meet a minimum standard for quality should exist there.
    That general stance must be nuanced in two main ways: 1) there must be some kind of grandfather clause for pre-existing pages; and 2) there must exist exceptions for certain kinds of works that meet certain criteria. I won't touch on the grandfather clause here much, except to say I'm generally in favour of making it minimal, maybe something like "No active effort to get rid of older works, but if they're brought to PD for other reasons they're fair game". The design of a grandfather clause for this is a whole separate discussion, and an intelligent one requires analysis of existing pages that would be affected by it. It is always preferable to migrate pages to a modern standard, so a grandfather clause is by definition a second choice option.
    Now, to the meat of the matter: the exceptions…
    We have a clear policy to start from: no excerpts. Works should either be complete as published, or they should not be in mainspace. But quite apart from the historical practices that modify this (which are somewhat subjective and inconsistent, so I'll ignore them for now), there are some fairly obvious cases that suggest a need for more nuance than a simple bright-line rule alone provides. The major ones that come to mind are: 1) massive never-completed projects like EB1911 or the New York Times (EB because it's big; NYT because new PD issues are added every year); 2) compilations or collections of stand-alone works with plausible claim to independent notability.
    For encyclopedias and encyclopedia-like things, we have to accept some subsets due to sheer scale of work. But when that is the grounds for exception, there needs to be some minimum level of completion. I'm not sure I can come up with a specific number of pages/entries or percentage, but it needs to be more than just a single entry (and, obviously, only complete entries). For this kind of exception to apply, I think it needs to be a requirement that the framing structure for it is complete: that is, the mainspace page should give a complete overview of the relevant work even if most of it is redlinks. That includes title pages and other prolegomena when relevant. For a periodical like the NYT, that means complete lists of issues with dates and other such relevant information (e.g. name changes etc.). For preference, these kinds of things should be in Portal: namespace or on a WikiProject page until actually complete, but that will not always be practical (EB1911 and NYT are examples of this). Mainspace or Portal:-space should never contain external links (i.e. to scans) or links to Index: or Page: space (except the implied link of transclusion and the "Source" tab in the MW UI provided by ProofreadPage).
    For exception claimed under independent notability there are a couple of distinct variants.
    Newspaper or magazine articles need to have a certain level of substance in addition to a specific identifiable byline (possibly anonymous or pseudonymous, and possibly identified after the fact by some other source, such as the Letters of Junius) in order to qualify. It is not enough to ipso facto be a newspaper article, a magazine article, a poem, or an encyclopedia entry. On the one hand we have things like dictionaries and thesauri, where an entry could be as little as two words. Or a one-sentence notice without byline in a newspaper. Or two rhymed lines (technically a poem) within a 1000-page scholarly monograph.
    To merit this exception it should be reasonable to argue that the "work" in question should exist as a stand-alone mainspace page (not that we generally want that; but as a test for this exception, it should be reasonable to make such an argument). This would clearly apply to moderately long entries in the EB1911 written by a known author that has their own Wikipedia article. It would apply to short stories or novella-length serialisations in literary magazines by authors that have later become famous (or "are still …"). It would apply to various longer-form journalistic material from identifiable journalists (again, rule of thumb is notable enough for enWP article), including things in magazines that have similar properties. For most periodicals the most relevant atomic (indivisible) part is the issue not the entry or article, but with some commonsense exceptions.
    It would, generally, not apply to things that are works by a single author, like a scholarly monograph that just happens to be arranged in "entries" rather than chapters. It would not apply to things that are essentially lists or tables of data. It would not apply to short entries in something encyclopedia-like or entries that are not by an identifiable author. The OED for example, iirc, is a collective work where entries are by multiple not individually identifiable authors (and each entry is mostly very short too); only the overall editor is usually cited.
    For works claiming this exception too, the framing structure should be complete, even if most of it is redlinks. The same general rules about Portal:/WikiProject and no external or Index:-space links apply. An exception would be for periodicals where new issues enter the public domain every year; and we should generally avoid including even redlinks for the non-PD issues here (but may allow them in a WikiProject page). For non-periodical works in multiple volumes where some volumes were published after the PD cutoff, including listings for the non-PD volumes (but not links to scans; those are a copyvio issue) is ok.
    Poems, short stories, and novellas are a special class of works here. A lot of these were first published in a magazine (possibly serialized), and a lot of them exist as multiple editions in substantially the same form. Some exist in multiple versions. These should all primarily exist the same way as chapters as part of their various containing works; but there are some cases where we might want to have, for example, a series of connected pages of the poems of Emily Dickinson. I am significantly ambivalent about this practice, as it amounts to making our own "edition" or "collection" of her poems (in violation of several of our other policies), but I acknowledge that it is an established practice and it is something that has definite value to our readers. It may be that it is actually a practice that should be governed by its own dedicated policy rather than being handled within these other general policies.
    For the sake of example; applying this to the works Inductiveload listed at the start of this thread would shake out something like this:
    Auction Prices of Books—This work appears to have no sensible subdivisions and is in any case by a single author. I see no obvious reason to grant this work an exception, except under sheer volume of work and even there I would want to see both a substantial proportion completed and some kind of ongoing effort towards completion (no particular time frame, but definitely not infinite and definitely not as an effectively abandoned project). In a deletion discussion I would very likely vote to delete the mainspace pages here (but, as nearly always, to keep the Index: and Page: namespace artifacts). I don't see this as a reasonable candidate for a Portal:, nor really a good fit for a WikiProject (though I probably wouldn't object to a WikiProject if someone really wanted one).
    Central Law Journal/Volume 1—A single volume is too little, so I would want to see a complete structure for the entire Central Law Journal, with level of detail for each volume similar to the one existing volume. Each article in the journal can be individually considered for a stand-alone work exception; but for the collection I would want to see at minimum a full issue finished to justify having the mainspace structure, and preferably multiple issues (in a deletion discussion I might insist on multiple issues). Index: and Page:-space artefacts can, of course, stay. A Portal: might make sense for selections from the journal, of articles that meet the standalone work exception. A WikiProject to coordinate work and track links to scans etc. might be a decent fit here, if someone wanted that. As it currently stands I would probably vote delete for the mainspace artefacts (with option to move whatever content has reuse value to a non-mainspace page for preservation; and undeleting if someone wants to work on something is a low bar).
    A Critical Dictionary of English Literature—The top level mainspace page has near-zero value, existing only to link to the single transcribed entry. For a credible claim to exception to exist it would need to be a complete framework for the work as a whole, and significantly more than a single entry must be complete. I would probably also want to see ongoing work, unless a substantial percentage of the entries were complete. The single finished entry is eligible to claim a standalone work exception, but I think it probably would not meet my bar for that (I might be wrong; and the rest of the community might judge it differently). In a deletion discussion I would probably vote to delete all the mainspace artifacts here (as always keeping Index:/Page: stuff) but with a definite possibility that I might be persuaded on the one completed entry (an absolute requirement for convincing me would be to scan-back it: as a separate issue, my tolerance for grandfathering of non-scan-backed works is small, and effectively zero for new/non-grandfathered works).
    Bradshaw's Monthly Railway Guide—Would need a full framework and a number of individual issues finished to merit a mainspace page. I see no credible subdivisions for a standalone work exception, but might be persuaded otherwise if, say, one of the train tables was used as a (reliable primary) source in a Wikipedia article (implying some sort of notability beyond just being raw data). In a deletion discussion I would probably vote to delete all mainspace artifacts here. If anyone made the argument, I would entertain the notion that there is value in treating train tables like poems, and hosting a series of train tables like we do Dickinson's poems; but that would require a substantial number of them completed.
    For everything above my stance is nuanced by a willingness to accept temporary exceptions for things that are actively being worked: active being operative, but with no particular deadline to complete the work. We have differing amounts of time available, and some works are so labour-intensive or tedious to do, that my personal threshold for "active" is a pretty low bar to clear. If it's months and years between every time you dip in and do a bit I might start to get antsy, but days or weeks probably won't faze me. And that the projected time to completion is very long at that pace is not particularly a problem so long as it is not infinite. Within those parameters I would always tend to err on the side of letting contributors just get on with it in peace, regardless of any of the policy-like rules sketched above.
    I also want to emphasise that I think this is a very difficult issue to deal with. There are a lot of competing concerns, and a lot of grey areas that will likely take individual discussions to resolve. My balance point on this issue is partly formed by a broader concern about our overall quality (we have waay too many works of plain sub-par quality, and too many not up to modern standards) and a hope that by preventing the creation of these kinds of works (rather than deleting them after creation) we will be able to retain the good and desirable exceptions without dragging down quality, and without the traumatic and stressful events that deletions and proposed deletion discussions are.
    And for that very reason I am grateful this issue was brought up here for discussion, and I hope we can end up with some clear guidance, possibly in the form of a policy page, going forward. And in any case, since it will create de facto policy, this is a discussion that needs to stay open for a good long while (there are several community members that have not yet commented whose opinion I would wish to hear before closing this), and depending on how well we manage to structure the consensus, may also require a formal vote (up in the #Proposals section). --Xover (talk) 09:03, 6 July 2020 (UTC)
  • Oppose. It is becoming clear that a policy on incomplete works in the mainspace is going to place enormous pressure on individual editors. I think it would be more effective to start a wikiproject devoted to scan-backing works that lack scans and so on. James500 (talk) 12:14, 6 July 2020 (UTC)
    • @James500: FYI, this thread was made in order to provide an exception to the current policy of "no excerpts". A literal reading of the policy as it stands has a plausible chance of coming down delete on the mainspace pages over at WS:PD. This thread is a chance to come up with a better way to support such partial collective works. That we have several substantially incomplete and abandoned collective works lolling around in mainspace is actually the result of laxity in respect to stated policy (not to say I think it's a bad thing). The deletion proposals, whatever you may think of them, are actually not in contradiction to policy. That said, as always, there is scope to adjust policy. Which is what this is.
    • Now, in terms of a WikiProject to scan back works, I think that is a good idea. See #Re-purpose_WikiProject_OCR_to_WikiProject_Scans above, which proposed to reboot Wikiproject OCR as a scan-backing Wikiproject. Inductiveloadtalk/contribs 14:40, 6 July 2020 (UTC)
      • The policy says "When an entire work is available as a djvu file on commons and an Index page is created here, works are considered in process not excerpts." A literal reading of that policy is that no scan-backed work is an excerpt (it is expected to be completed eventually). Further the policy refers to "Random or selected sections of a larger work". A literal reading of that expression is that it does not include lists of scans, or auxiliary content tables, as they are not "sections" (they are not part of the work), and that not every incomplete portion of a work is either "random or selected" (which would not include starting from the beginning and getting as far as you can, with intent to finish later). I could probably argue that an encyclopedia article or periodical article is a complete work. James500 (talk) 15:16, 6 July 2020 (UTC)
  • Nice wall of text, Xover (and I say that with great respect!) -- it generally makes sense and sounds good to me. As another hopefully illustrative example, take The Works of Voltaire, which I've been digging thru lately. I think this would very much satisfy your criteria as a large work, with sufficient scaffolding to justify the mainspace pages that exist for it. I would love to hear others' thoughts on that. JesseW (talk) 16:07, 6 July 2020 (UTC)
    @JesseW: Yeah, apologies for the length. Brevity is just not my strong suit.
    The Works of Voltaire probably qualifies on sheer scale of work, yes. I don't think the current wikipage at The Works of Voltaire is quite it though: as it currently stands it is more WikiProject than something that should sit in mainspace (its contents are for Wikisource contributors, to organise our effort, not our readers, who want to read finished transcriptions). It also mixes a work page with a versions page in a confusing way. So I would probably say… Move the current page to Wikisource:WikiProject Voltaire; create a new The Works of Voltaire as a pure versions page, linking to…; The Works of Voltaire (1906), that is set up as a work page with the cover and title (and other relevant front matter) of the first volume, and an AuxTOC (and possibly also the {{Works of Voltaire}} volume navigation template). I don't know how tightly coupled the volumes of this edition are (does the first volume have a common ToC or index of works for all the volumes?), so some flexibility on format may be needed to make sense. But as a base rule of thumb it should start from a regular works page and deviate only as needed to accommodate this work (mainly the size is different).
    In any case… With a volume or two completed (they're only ~350 pages each) I'd be perfectly happy having something like that sitting around. With less than that I'd possibly be a bit more iffy, but it's hard to put any kind of hard limit on that. And with somebody actively working on it I'd be in no hurry whatsoever regardless of current level of completion.
    PS. I'm pretty sure a large proportion of the contents of these volumes are works that would qualify under "standalone works" that could exist independently in mainspace, regardless of what's done with the The Works of Voltaire page. Even his individual poems and essays can presumably make a credible claim here (because it's Voltaire; less famous authors would have a higher bar). Better as part of the edition, but also acceptable on their own. --Xover (talk) 16:56, 6 July 2020 (UTC)
  • @JesseW: I personally take no issue with this page's existence (actually I think it's a nice work and a good way to allow an important author's works to be slotted in piece-by-piece). I have some general comments which overlap with this thread (written before Xover's reply, so pardon overlap):
    • First off, I differ with Xover in terms of the scan links: I think they're better than nothing, and I don't see much value in duplicating the volume list onto an auxiliary page just to add scan links. However, I can sympathise with the sentiment that our mainspace shouldn't direct users off-wiki (or at least off-WMF). But if we don't have the scans, and that's what the user wants, they're leaving anyway. Real answer: import moar scans!
    • No scan links are necessary where the volume exists in mainspace and is scan-backed (e.g. v3)
    • Ext scan links should only be used when there is no Index page or imported scan. Use {{small scan link}} or {{Commons link}} when possible (e.g. v2)
    • The first volume list could probably be in an AuxTOC to mark it out as WS-generated content.
    • The "Other editions" section belongs on an auxiliary namespace page (Talk, Portal or Wikisource). I suggest the Talk page is best in this case. Inductiveloadtalk/contribs 17:35, 6 July 2020 (UTC)
  • @Xover: I am in agreement with the majority of what you say. Particularly, I think a framework around any collective work (be it a single-volume biographical dictionary or a 400-issue literary review spanning 80 years) is the critical prerequisite, plus at least some scans, the more the merrier. Where I think I differ:
    • I am inclined to be a bit more relaxed in terms of how much of a work we need. As long as a single article exists, it's not "trivial" (e.g. only a short advert or some incidental text like a "note to correspondents", as opposed to an actual article), it's well-formatted and scan-backed, and a complete framework exists, including front matter and a TOC, such that it is easy for anyone to slot in new pieces, I'd be fairly happy. Lots of periodicals have all sorts of tricky bits like tables of stocks or weather tables and writing into policy that those must be proofread in order to get the "real" articles into mainspace would be a chilling effect, in my opinion. If you allowed an exception, it would be verbose and tricky to capture the spirit without saying "unless, like, it's totally, like, hard, man".
    • I am not dead against scan links in the mainspace at the top level, when such a top-level page exists. See my comments on Voltaire above. I am against them where they could sensibly be on an Author page and they are the only mainspace content.
    • I am ambivalent on the presence of, e.g., disjointed train timetables. It's not my thing to have a smattering of random timetables, but as long as they're individually presented nicely, it's not too offensive to my sensibilities. I might question the sanity of someone who loves doing tables that much, but whatever floats the boats! Also, I think that this might circle back to "good for export" - a mark which certainly would require completed issues or volumes. If you want to get that box ticked, you have to do it all.
    • Re the "notability" aspect of individual articles, I'm not really bothered by that, as I don't think we'll see a flood of total dross because few people really want to take the time to transcribe 1867 articles about cats in a tree from the Nowhere, Arizona Daily Reporter, and, actually I think some of the "dross" can be quite interesting in a slice-of-life kind of a way (always assuming well-formed and scan-backed). And the real dross is usually so bad (no scans, raw OCR, etc) that it can be dealt with outside of this topic. I think part of the value of WS is the tiny, weird and wonderful, not just in blockbusters like War and Peace and Pulitzers. I think I might like to see more of our articles strung together thematically via Portals, but that's another day's issue. Inductiveloadtalk/contribs 17:35, 6 July 2020 (UTC)
      • @Inductiveload: We appear to be mostly in agreement. But… instead of me dropping another wall of text on the remaining points of disagreement, maybe that means we're in a position to try to hash out a draft guidance / policy type page with the rough framework? Then we could go at the remaining issues point by point. Because I think I'm in with a decent chance to persuade you to my point of view on at least some of them, but this thread is fast getting unwieldy (mostly my fault). It would also probably be easier for the community to relate to now, and much easier to lean on in the future. --Xover (talk) 18:31, 6 July 2020 (UTC)
        • @Xover: If there are no more comments forthcoming after a couple of days, I think that makes sense. I don't want to railroad it: considering we have at least one !vote for "do nothing", I'd like to see if there are any other substantially different opinions floating about. Inductiveloadtalk/contribs 17:41, 7 July 2020 (UTC)

The quantity of text here has grown far faster than my ability to absorb it, so rather than continue to put it off, here's my position: I don't see any problem with transcriptions that are scan-backed, even if the transcription only covers a small fraction of the entire scan. If Sally chooses (say) to transcribe a favorite story, that happened to be published in an issue of Harper's back in the 1890s, and goes to the trouble of uploading the full issue, but only creates pages for the one story that interests her, I think that's great. It doesn't matter to me whether she intends to work on the other pages or not. If it's not scan-backed, but it's fairly high quality, I am personally willing to do some work trying to locate a scan and match it up to the text; I'd rather we take that approach, than deletion, though of course deletion is the better option in some cases where the scan is very hard to come by.

If all this has been said above, or if I've misunderstood the topic, my apologies. Please take this comment or leave it, as appropriate. -Pete (talk) 02:00, 8 July 2020 (UTC)

Apologies, I see I had missed the point.

I disagree with Xover's statement that a top-level page for a publication, with a link only to a single article within the publication, has "near-zero value." Such a page can serve an important function linking content together in ways that help the reader (and search engines) find the content they're looking for, or understand the context around it. For instance, A Critical Dictionary of English Literature is linked from the relevant Wikidata entry. The banner on the Wikisource page clearly tells a Wikisource reader that they won't find a full transcription here; and with a simple edit, it could link to a full scan on another site, or (with perhaps a little more effort) even transcription links here on Wikisource. This page has been here since 2010; we don't have any way of knowing what links might have been created elsewhere in the intervening decade. (I do think that new pages like this should not be created without a scan at Commons to be linked to.) -Pete (talk) 02:12, 8 July 2020 (UTC)

I'm really bad with walls of text, so I have only read a tiny portion of the above discussion. But I want to mention a couple of things that I think are worth considering in this discussion.
  • Most of the time, a mainspace "work" that is only a table of contents, but which has none of the actual content, and is not actively being worked on, can be (and should be) deleted as No meaningful content or history under our deletion policy.
  • A mainspace work that has only a little bit of content, but that content is a work unto itself within the scope of Wikisource, should be kept. Most periodicals are like this. For an example, see the Journal of English and Germanic Philology which only has one hosted article, but that hosted article is scan-backed and firmly within scope.
  • On some occasions, empty mainspace works do have value. I ended up creating the page The Roman Breviary, despite its containing no actual content, mostly because there are a lot of works that link to it, using many different titles, and if someone uploaded a copy of the work under one title then many of the links would remain red because they point to different titles of the work. This could be easily solved by creating redirects to a simple placeholder page, so I did. I tried to make the placeholder page as useful as a placeholder page can be, as it contains useful information about the history and authorship of the work, and links to the Index pages where the transcription will take place.

Anyway those are my 2 cents, sorry if they are redundant —Beleg Tâl (talk) 00:40, 29 July 2020 (UTC)


Since there has been no extra input for a month, and not wanting this section to get archived without at least attempting a proposal, I have started a proposal #Collective work inclusion criteria above. Inductiveloadtalk/contribs 11:00, 25 August 2020 (UTC)

I've created Bradshaw's Monthly Railway and Steam Navigation Guide (XVI) - it couldn't be done on one page, due to the very high number of template transclusions. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:52, 1 September 2020 (UTC)

Proposing to delete disambiguation pages that are subpages of works[edit]

The following discussion is closed and will soon be archived:
deleting the disambiguation pages

Preferring to have this as an open conversation at Scriptorium where more eyes will see it rather than WS:PD, though I can move it there if required, especially as we have lightly had this conversation previously.

In times past there were created pages within works

and more recently

If you run report petscan:657164 you will see most that are there today as they don't have WD entries. I will search for all of them at a later time.

These are not pages that exist within the works themselves so have no particular value within the works. They don't sit linked within works, they are just predominantly orphaned pages. Also noting that there are page listings for each of these works that will effectively disambiguate these pages. I will also note that for our numerous biographical works it will be a huge exercise to propagate

If we require disambiguation pages they should be sitting as root pages, and disambiguate all works of the name, as has been done with Emerson. Not that I am enamoured with such pages, nor certain that they actually reflect true disambiguation pages, nor that they are effective or complete. [That is not the conversation today!]

We would also need to update our guidance at Help:Disambiguation pages to be explicit to not create such pages. I would probably write a filter that looks for such pages to deal with any future circumstances. — billinghurst sDrewth 00:15, 26 July 2020 (UTC)

I'm not sure about the subpage ones, but the Supreme Court ones (like Jones v. United States) do seem useful to me, as likely catches for mistaken off-wiki links. But the subpage ones should probably go, indeed. JesseW (talk) 01:25, 26 July 2020 (UTC)
@JesseW: They stay, they are not subpages, and we typically add to them. I have been adding them to existing disambig items at WD where they exist, and will get to creations at some point. That is the actual point of the query. — billinghurst sDrewth 01:44, 26 July 2020 (UTC)
@Billinghurst: Admittedly I am insufficiently caffeinated just now, but I am completely failing to understand what you're talking about here. Aren't the pages you link just the normal main work pages for the works in question? I'm not really seeing the disambiguation aspect, or any subpages. Help? --Xover (talk) 06:46, 26 July 2020 (UTC)
@Xover: I just listed the parent works that contain disambiguation subpages, one needs to run the petscan report to see the pages. — billinghurst sDrewth 07:36, 26 July 2020 (UTC)
petscan:16913508 <= main ns, use template:disambiguation and are subpages; 38 pages — billinghurst sDrewth 07:41, 26 July 2020 (UTC)
lightbulb goes on Oh, I think I see:
Where the idea is that we don't need … /Abdera to dab … /Abdera (Spain) and … /Abdera (Thrace) because navigation in the work (toc, indices, cross links, etc.) points directly to the intended target (or at least should do so)?
Provided I (finally) understood that correctly, I think I agree we should get rid of these that exist now. In the general case I think we'd either want to ban these and never use them, or to create them proactively for all such cases. But even if one were to use them they should not be used in any normal internal links (intrawork they are a detour from the intended target, and interwork there shouldn't be a link unless it's clear whether the target is the entry on Spain or Thrace). Which means that their sole purpose would be as targets of interwiki or off-wiki linking, which, in my opinion, would need some pretty clear and compelling use case to be merited. --Xover (talk) 08:24, 26 July 2020 (UTC)
Pretty much. The specific pages are linked from the parent (somehow). If we were to disambiguate it would be [[Abdera]]. Though to do that for every reused term in our works would be a nightmare. Imagine [[John Smith]] or is it [[Smith, John]]. Anything like that would have to be systematically prepared (Listeriabot?), which would mean that we would have to get all our works into Wikidata, and that is a nightmare scenario with how that is done here. — billinghurst sDrewth 11:39, 26 July 2020 (UTC)
All our works should be in Wikidata anyway. If there's a significant backlog it may be a task for a bot, or otherwise a matching tool of some kind, with automated suggestions and human verification. Does anyone have a measure (or estimate) of the task at hand? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:52, 22 August 2020 (UTC)
Delete x100. I've already created Category:DNB disambiguation pages and Category:EB1911 disambiguation pages to collect these in one place in preparation for such a deletion proposal. —Beleg Tâl (talk) 00:18, 29 July 2020 (UTC)

Closing discussion and deleting these couple of dozen pages. — billinghurst sDrewth 04:40, 28 August 2020 (UTC)

Comment Added text to Help:Disambiguation to reflect this update (see special:diff/10413151) — billinghurst sDrewth 05:00, 28 August 2020 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. — billinghurst sDrewth 05:01, 28 August 2020 (UTC)

Technical Wishes: FileExporter and FileImporter become default features on all Wikis[edit]

Max Klemm (WMDE) 09:13, 6 August 2020 (UTC)

@Max Klemm (WMDE): We've been using FileExporter for a while now on enWS and it's been working great. Very very appreciated feature! However, we also really miss the ability to import files from Commons to Wikisource: the projects have different copyright policies so some files up for deletion on Commons need to be imported locally here. If the UI is difficult with all the different project combinations, we could do that in a local Gadget, just so long as the core functionality was available. We wouldn't even strictly need the configuration stuff on meta; if we can easily move the file and preserve the revision history we can take care of the rest manually. Any chance this could happen at some point? --Xover (talk) 11:35, 6 August 2020 (UTC)
@Xover:, Thanks for your feedback and your question. I will come back to you with an answer at the end of next week, beginning of the week after, since I need to discuss your question with my team, which requires some time. -- For the Technical Wishes Team: Max Klemm (WMDE) (talk) 11:57, 6 August 2020 (UTC)
@Xover:, I am sorry for the late answer. We have recorded your request. However, with a lot of other projects in the making, we do not have the time or manpower to modify the FileImporter in such a way that you can use it to move files from Wikimedia Commons to Wikisource. -- For the Technical Wishes Team: Max Klemm (WMDE) (talk) 09:05, 26 August 2020 (UTC)
@Max Klemm (WMDE): Thanks for looking into this! Is there perhaps some lesser adaptation possible that would let us implement the missing parts ourselves in a Gadget? It is specifically the importing of revisions that requires core/extension support; all UI and higher level logic looks like it should be possible to implement with site/user scripts or Gadgets. --Xover (talk) 11:49, 26 August 2020 (UTC)
@Max Klemm (WMDE): As a note, one can move files from Commons to here using one of Magnus's scripts (excommons.js), though to do that one has to be an administrator at both Commons and the source wiki, and that is very limiting. I would endorse for local administrators to have the ability to transfer into a wiki from Commons. I don't think that it needs to be all-in for such an ability. — billinghurst sDrewth 13:23, 6 August 2020 (UTC)

Project Gutenberg blocked in Italy[edit]

Section moved here from Wikisource:Copyright discussions

see also

Slowking4Rama's revenge 02:34, 26 May 2020 (UTC)

It looks like the rest of the EU is going to go that way, and possibly Wikisource will join them if we get big enough to be noticed. From the EU's perspective, they're not going to tolerate works copyrighted in the EU being available on the Internet in their country.--Prosfilaes (talk) 03:59, 26 May 2020 (UTC)
Ouch! But that's really not a particularly surprising outcome, and a demonstration of the various risks our US-only copyright policy entails. I haven't been following this specific case (pointers welcome!), so I base this only on the headline, but I'd still be willing to bet Commons' will run clear of this while ours will land us square in it if they ever deign to notice we exist. --Xover (talk) 07:04, 26 May 2020 (UTC)
As we have English works, and UK drops out of the EU next year, that is unlikely to be problematic for us, and then that comes down to what the other language wikis are doing. So it would seem that frWS and deWS are primarily the focus. — billinghurst sDrewth 11:51, 26 May 2020 (UTC)
The UK may still want to block us for that reason; Ireland is primarily English-speaking and has English as an official language, and Malta has English as an official language, so I don't see us as off the hook. Not that I suggest doing anything about it on Wikisource's side; if political maneuvers don't stop them, people will just use VPNs to get around the block.--Prosfilaes (talk) 00:48, 27 May 2020 (UTC)
VPNs won't help us. The reason why VPNs have not been stopped yet is simple: only a tiny fraction of readers know about their existence and only a fraction of this fraction can use them. If this changed they would be stopped. --Jan Kameníček (talk) 06:25, 27 May 2020 (UTC)
I'll add a "ditto" for what Prosfilaes said. And add that the issue isn't just whether we get blocked from those places, but also the legal jeopardy this places our readers and (especially) reusers in everywhere that's not the US. Large parts of Asia and some parts of Africa (and let's not forget Australia) have English as a primary written language or as a primary written language for certain purposes. In all these cases we put reusers at risk when we ignore the copyright laws of the country of origin; and our contributors from those areas too, but I am more willing to accept that they can make an informed decision regarding their own risk than our reusers. This move in Italy just exemplifies a risk that is everpresent for all these cases so long as we choose to be so US-centric in our copyright policy.
Note that I am not proposing we should immediately change our copyright policy based on one single country being stronzos. But it's a concrete example of the general and ongoing problem with our policy which I think we should consider very carefully going forward. --Xover (talk) 07:17, 27 May 2020 (UTC)
I'm not worried about legal jeopardy for our readers. Unlike torrents, they have no way of knowing who our readers are, and like torrents, trying to chase down random users is unprofitable and backlash-inducing. Reusers are going to have to deal; if you're going to reuse, you need to know the law you have to follow. Certainly Wikipedia strikes me as hugely dangerous for many reusers, as many pages will violate libel or blasphemy laws, or various local laws prohibiting certain viewpoints. (E.g. India apparently bans maps that they don't agree with, and Poland is getting on anyone who thinks Poles might have had a role in the Holocaust.)
Note that "copyright laws of the country of origin" is a bit of a farce; there's a table of countries on w:en:Rule of the shorter term and many countries don't have the rule of the shorter term, like Mexico and Brazil, with English speaking nations including Australia and Canada (at least wrt the US). The list is pretty short, so I don't know about most of Africa or Asia. I don't know what rules are needed to make Wikisource or Commons copyright-safe in all parts of the world, but we're talking at least life+70 (plus wartime extensions?) and 95 years from publication.
Maybe Commons is safer from widescale blocks like this, but I sure wouldn't reuse anything from there without double-checking, with uploads with bad licenses and many, many works that aren't free in my nation (the US).--Prosfilaes (talk) 11:01, 27 May 2020 (UTC)
Project Gutenberg in Germany has already been blocked for five years, see Court Order. - R. J. Mathar (talk) 10:02, 23 June 2020 (UTC)
  • @Slowking4, @Prosfilaes, @Billinghurst, @Jan.Kamenicek, @R. J. Mathar: I think this issue would be best discussed on the Scriptorium proper, and so if nobody has objections I'll move the whole thread there in the next couple of days rather than just let it slip quietly into the archives here. --Xover (talk) 13:05, 5 August 2020 (UTC)
    Not fussed, I am still of the "do nothing" approach, we comply with copyright and we now move into the area where it favours EU authors for 95 years of copyright. I think that Wikimedia Foundation would love to put their teeth into this one. — billinghurst sDrewth 21:59, 5 August 2020 (UTC)
  • I don't consider our project to be on the same scale or footing as PG. Gutenberg fails to provide transparency or clear licensing on their works. We're very careful here to make the licensing clear, prominent, and correct for every work we host. --EncycloPetey (talk) 00:34, 16 August 2020 (UTC)
Project Gutenberg does not care about how works may be copyright-restricted outside the USA, so Italy, which generally has to apply copyright for life plus 70 years until year end per the European Union, would thus block it as a preventive measure. Please feel thankful that I brought Template:Pd/1923 from Chinese Wikisource for automatic license updating over the years.--Jusjih (talk) 03:04, 20 August 2020 (UTC) (your Eastern Asian cultural bridge)
it is not about " if we get big enough to be noticed", but rather, hard headed enough not to delete 18 works when requested by a european court, seeking to impose european law upon the internet. no chance of that here, with PRP. (interesting Italy should adopt german methods on this occasion, given the conflict here [1]) don't know why you would waste time spinning up anxiety. Slowking4Rama's revenge 12:23, 20 August 2020 (UTC)
I disagree with your theory of spinning up anxiety. Project Gutenberg has asked for it. Assume good faith.--Jusjih (talk) 00:45, 7 September 2020 (UTC)

Image template

On Wikispecies, species:Template:Image calls the primary image for the subject, from Wikidata. It can be added to articles and will not show, if Wikidata has no image, but once an image is added at Wikidata, it shows on the page. The Wikidata image can still be overridden with a local value.

I think such a template would be useful here, for author pages. Can someone import it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:48, 22 August 2020 (UTC)

{{Author}} is supposed to do this already. Beeswaxcandle (talk) 03:47, 25 August 2020 (UTC)
Fair enough. What about portal pages? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:16, 25 August 2020 (UTC)
...or works? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:40, 28 August 2020 (UTC)
For people in the Portal namespace, {{person}} works in the same way as {{Author}} does. Images are only sparingly used elsewhere. TE(æ)A,ea. (talk) 21:20, 28 August 2020 (UTC).

Important: maintenance operation on September 1st

Trizek (WMF) (talk) 13:49, 26 August 2020 (UTC)

The score extension is still disabled

Despite the fact that phab:T257066 is marked as “resolved,” the issue still remains. Is there anyone who can edit that page to correct this? I’ve been meaning to proofread some music, but have been unable to do so. TE(æ)A,ea. (talk) 15:16, 26 August 2020 (UTC).

Note, the ticket has been reopened. —Beleg Tâl (talk) 01:52, 30 August 2020 (UTC)

U.S. copyright of 1929 book

Author:Axel Munthe (1857–1949) wrote The Story of San Michele (1929). Now that he's been dead for 70+ years, and the book is out-of-copyright in Europe and already online at another website, what about the U.S. copyright? Is there a chance that it was never renewed, and the book is free enough for Wikisource? Or should we wait 4 or 9 more years? --LA2 (talk) 20:27, 26 August 2020 (UTC)

  • According to this page, The Story of San Michele was first published in 1929 in the United Kingdom. As that work was still in copyright in that country as of the URAA date, the copyright in the United States, (which is what matters for inclusion on English Wikisource,) was extended to 95 years after the original publication of the book. The work will be eligible for inclusion here in five years. TE(æ)A,ea. (talk) 21:08, 26 August 2020 (UTC).
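The arithmetic behind that answer can be sketched as follows. This is only an illustration: the 95-year term is the one stated in the reply above, and the sketch assumes the term runs to the end of the calendar year, so the work becomes free on the following 1 January. The function names are made up for this example.

```python
from datetime import date

# Term stated in the reply above: 95 years after original publication.
US_TERM_YEARS = 95


def us_public_domain_date(publication_year: int) -> date:
    """First day the work is free of US copyright.

    Assumption: the term runs to the end of the calendar year,
    so the work is free on 1 January of the year after it ends.
    """
    return date(publication_year + US_TERM_YEARS + 1, 1, 1)


def calendar_years_to_wait(publication_year: int, today: date) -> int:
    """Whole calendar years between `today` and the public-domain year."""
    return max(0, us_public_domain_date(publication_year).year - today.year)


if __name__ == "__main__":
    asked_on = date(2020, 8, 26)  # date of the question above
    print(us_public_domain_date(1929))
    print(calendar_years_to_wait(1929, asked_on))
```

Under these assumptions a 1929 publication becomes eligible at the start of 2025, which agrees with the "five years" given in the reply when counted in calendar years from 2020.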

The years of registration vs. publication of the United Nations Treaty Series since Volume 401

Anyone interested in the United Nations Treaty Series? I just found that since Volume 401, each cover displays a year, likely when it was published. This is different from the year when any treaty was registered with the Secretariat of the United Nations. See also commons:Commons:Village pump#The years of registration vs. publication of the United Nations Treaty Series since Volume 401 where I request comments before re-categorizing File:UN Treaty Series - vol 402.pdf to File:UN Treaty Series - vol 599.pdf. Please comment as I also plan to redesign the table at United Nations Treaty Series.--Jusjih (talk) 04:32, 28 August 2020 (UTC)

epub export only targeting the page rather than whole book

Hi there,

The epub export links on the .en books I have set up here seem to be exporting only the top level page, rather than the whole book. The contents page is on the index page, which seems to be required by the export tool.

In contrast, the books at .la Wikisource are exporting fine, with what looks like the same request to the same WSexport tool, so this is very strange. Any ideas? Are people seeing the same thing?

Examples: la:Easy Latin Stories and Key to Easy Latin Stories for beginners.

Thanks for any help you can give! JimKillock (talk) 08:11, 28 August 2020 (UTC)

@JimKillock: Try it now. And a question for you. Is that a real ToC from the published work, or is that one that you created to fill a void?
Thank you; this is working now. In fact I don't think it was broken; I got confused by the fast response and small file size.
The ToC is indeed one I created in order to make the epub export work (the epub export won't work without a ToC on the initial page AIUI). I did put this on the front page, rather than the source, and then moved it to a blank page in case this made a difference with the epub export, and left a note at the source page to this effect. I will move it back; let me know what best practice is here (the source page can't fully reflect the source without breaking the export function). JimKillock (talk) 13:05, 28 August 2020 (UTC)
@JimKillock: The local style for constructed ToC is to apply it directly to the root page of the work and to wrap it in {{AuxTOC}}, not to imply that it was in the work by transcluding it. — billinghurst sDrewth 13:52, 28 August 2020 (UTC)
Thanks @Billinghurst: that's done JimKillock (talk) 14:29, 28 August 2020 (UTC)

Alternative solution to preferences previously stored in cookies

How can I fix the Charinsert display order, so that the "User" definition is the displayed definition? — Ineuw (talk) 00:53, 29 August 2020 (UTC)

Usually the dropdown stays for whichever was last opened. — billinghurst sDrewth 07:32, 30 August 2020 (UTC)
@Billinghurst: Thanks. This is so only for the current session and does not always work. I wonder for what possible technical reason this was omitted from Wikisource cookies? After all, the browser is inundated with Wikisource cookies. I wouldn't care if I could add the current copy of Charinsert to my namespace, but my attempt failed. The advantage would be to set User: as the first entry, and reorder/customize the list. — Ineuw (talk) 14:32, 1 September 2020 (UTC)
It works for me, and is persistent as best as I see. — billinghurst sDrewth 14:34, 1 September 2020 (UTC)
And I do not see that "User" is the best option as most will not use it. — billinghurst sDrewth 14:36, 1 September 2020 (UTC)

Bibliographic data: OCLC, ISSN, DOI, LCCN codes

The related discussion has developed at Index talk:On the Atmospheric Bude-Light.pdf. Wikidata elements allow users to add an OCLC code, but the association of the bibliographic data isn't automated, nor is it always imported from external databases.

This concerns the verifiability of the cited sources, as usually required on Wikipedia and Wikiquote projects. Probably there also exists on Wikisource a similar policy under which anyone can verify the source as well as the exact correspondence of electronic/digitized copies to the related original paper formats and manuscripts. This is useful information for preventing page omissions or image editing in third-party sponsored works with an internal conflict of interest.

According to the CC-BY-SA license, anyone may reproduce or modify the original texts to make a new creative work. Users have the right to know what they are looking at: a copy that closely matches the original that can be read in a university or public library, or something that has been manipulated or is not yet complete. Philosopher81sp (talk) 18:47, 30 August 2020 (UTC)

@Philosopher81sp: The data that you identify belongs at Wikidata, and it is added locally through the application of {{authority control}} to the work. The issue for much of what we do is that we work with old editions of works, and many of these works pre-date those major categorisation aspects. In addition, we work with editions, not with works, and the databases are not so useful at differentiating between the two. I am unaware of a bot that does this.

With regard to verification, we require sources to be added, typically where we are working with scans, that is self-evident and the Commons file will be sourced. If there is no scan to support, then we require source data to be added using {{textinfo}} on the talk page of the work. — billinghurst sDrewth 23:01, 30 August 2020 (UTC)

Yes; in effect the codes in question mainly identify a title and not a specific edition. The code is a starting point for more accurate information with which to identify a single edition.

However, a more complete set of bibliographic data can remedy that problem. The Commons file doesn't ensure the scanned copy is philologically faithful to the original primary source (paper or manuscript). If I remember correctly, CC-BY-SA gives anyone the right to manipulate the original work or to omit some parts, e.g. to create a new original work, even for commercial purposes. The author isn't obliged to explain his changes with respect to the primary source. But this would be of great interest for this project. Readers expect to have a full and complete digitized copy of an original paper or manuscript. It's reasonable to imagine they also want to know where the single scanned edition is physically located and can be seen, even by specialists. I think people providing a scanned copy and the related Commons file can easily upload to Wikidata, or to the article on Wikisource, metadata such as: the pages of each volume, the place and date of publication, the printer/typographer, and the people to whom the work in question has been dedicated. A similar set of information is adequate to unambiguously identify a single edition and enable any user (if provided with the necessary rights) to access and verify the primary source, a paper copy or a manuscript. Thanks for your reply. 13:55, 1 September 2020 (UTC)

Not certain what you are wanting. The edition/version metadata lives at Wikidata, the transcript belongs here. Whilst we have a CC-by license, we aim to truthfully reproduce a work, so what can be done to a work is irrelevant to the work that we are doing to reproduce it. What someone does to our works after they leave is their business, not ours.

CSS and Width in MediaWiki:Proofreadpage_index_data_config

Hi. I am a sysop at Spanish Wikisource. I've noticed that in your index forms there are these two fields we currently don't have: CSS and width. I have some questions, if someone is kind enough to answer them.

  1. Do they actually work as intended?
  2. Where are they documented?
  3. How can I export them to my local Wikisource?

Thanks in advance, --Ignacio Rodríguez (talk) 23:28, 30 August 2020 (UTC)

@Ignacio Rodríguez: From memory you just need to add them into place in your config. These are already defined as part of the extension and it is up to the projects to add and name/translate as needed, explained at mul:Wikisource:ProofreadPage/Improve index pages. — billinghurst sDrewth 23:46, 30 August 2020 (UTC)
@Billinghurst: Thanks! The extension page at mediawiki is very lacking in these things! --Ignacio Rodríguez (talk) 23:53, 30 August 2020 (UTC)

Linking to external audio recording of song

I'm coming from en-WP, and just created my first Wikisource page, Hail, Pomona, Hail, the alma mater of Pomona College. There's an audio recording of the song from a 2000 Pomona College Glee Club performance here; I assume we can't embed that for copyright reasons, but would there be a way to include it as an external link? Sdkb (talk) 19:12, 31 August 2020 (UTC)

@Sdkb: Not a straight yes/no. If there is an out of copyright version, please download it and stick it at Commons, then utilise {{listen}}. If it is in copyright and hosted by the copyright owner, and on an existing article, then we would allow it from within the "notes" section, as such we allow soft "external links". If a work is in copyright, and someone is breaching copyright, no. Sounds like you fall into a permissible "yes". — billinghurst sDrewth 21:47, 31 August 2020 (UTC)
Thanks; I just added it. Sdkb (talk) 21:52, 31 August 2020 (UTC)

Tech News: 2020-36

20:11, 31 August 2020 (UTC)

Legislation and Version History

I've started a project to upload documents related to the legislative history of Social Security in New Zealand. Starting with the Social Security Act 2018, I'm adding the first version of this, then will be adding acts that amend this document, as well as adding new versions of that document that resulted from said amendments. Following this, there are the predecessor acts to this, the Social Security Act 1964, and the Social Security Act 1938.

Once I get a good flow going, I'll be adding a significant amount of versions of largely the same document, as an act of Parliament is really a living document that changes with each amendment. Wikisource however, is static. So it'll involve potentially hundreds of pages that are exactly the same.

Normally here on Wikisource, different versions of publications are uploaded at their own root level, and are treated entirely as their own publication. While different issues/volumes of a magazine or encyclopedia are uploaded as sub-pages of their parent publication.

I feel that legislation is rather a mid-point between these, and since it's such a large area of history, warrants its own approach that works best for the way legislation works. I want to treat different versions as different "issues", and group them under the same publication.

To this end, I've started adding different versions of the Social Security Act 2018 as sub-pages, i.e. Version 56, and Social Security Act 2018/Version 59. I intend to eventually make the root page similar to a versions page, that links to the various versions, but also the amendment acts that affect this legislation.

I'm raising this here as the way I'm structuring this document and its versions has been highlighted as not following the norm of how things are usually done here, so I thought I'd make the case that legislation is different enough that it warrants a more unique approach. Happy to hear thoughts on this, or if people are happy for me to take this approach. Supertrinko (talk) 22:05, 1 September 2020 (UTC)

@Supertrinko: Having now also read through your discussion with Billinghurst on your talk page I'm beginning to understand what you're trying to do; and I think you've gotten a bit stuck on one conception of how you would like this to be that doesn't match Wikisource very well. On the one hand you argue that legislation is so unique that we should have special rules for it, but in the same breath you also argue that we have so much legislation here that we cannot possibly cope with it as independent mainspace works. Neither is actually accurate.
Legislation acts very much like any other works except that it has stronger divisions between editions. That is precisely why any change to a law is, in most jurisdictions, made in the form of an act in the legislative assembly describing the ways in which an existing law is to be amended. The original law is a work, of which the published version is an edition, and the amending legislation is a separate work with an edition (the published version). The original law does not get another edition until and unless a competent body published a consolidated version (the original text but incorporating the subsequent amendments). In the legal sense the law has changed when the amendment is enacted, but in a bibliographic sense the published version of the law is unchanged until a new version is published. Practices vary, but often these are published as "as amended up to date of last amendment" (as, indeed, NZ does: there are 13 editions of this law, the latest edition identified as "as at 01 August 2020"). For us to generate interim "versions" by ourselves applying the amendments would be outside of scope of the project; and if a competent body issues running amended versions those would be simple editions of the original and can be handled like all other editions of works.
And that works just fine with the existing setup for the United States Code and a myriad other pieces of legislation. The limiting factor here is absolutely not anything technical, but sheer man-power. No project to "continuously amend" that I've run across here has been anything but an abject failure: it is a niche project of interest to a single person and dies with that person's waning interest. And the more divergent the setup the less likely anyone else will ever untangle it or be able to contribute. In fact, I note the original text of this law (the "as enacted" edition) hasn't even been proofread yet, but we're expending energy on a navigational and organisational construct for 59+ versions of it. This just doesn't scale.
It also doesn't actually give you what you indicate that you're after. Comparisons between versions are essentially a diff view, which is actually what the amendments are except written to be legible to humans (well, to the degree your ontological classification of practitioners of the legal arts intersects). For that purpose simply proofreading the amendment as written would be better. For more technical text comparison, the version construction is neither necessary (you can compare any two wikipages) nor sufficient (you'll get a diff at the wikitext level, not text level).
However, all that being said, we do have several alternate approaches that may help to achieve your goals. For one thing, it is entirely possible to construct a novel navigation system through standardised templates, regardless of the organisation of the pages. Once all the relevant editions have been proofread, our annotations policy would also permit doing something like a color-coded version that indicates the last time a given paragraph was amended. As an annotation it would have a lot more leeway to be innovative, and might even serve as a useful testing ground for technical innovation that would benefit other such cases.
In sum, I suspect you've jumped straight to a solution before the "figure out how best to solve it" step. --Xover (talk) 06:35, 2 September 2020 (UTC)
@Xover: First I'll say you're actually completely right, I've been solution focused! I do think there's a problem to be solved, but happy to look over the best way to do this.
That said, I'm not suggesting that we can't cope with the number of acts/amendments/versions, I was just suggesting that organisation is necessary, and my go-to was treating different versions of legislation like issues of a periodical, which would neatly organise them via the URL.
A couple points which may be specific to NZ legislation, you mentioned "The original law does not get another edition until and unless a competent body published a consolidated version (the original text but incorporating the subsequent amendments).", in NZ legislation, every amendment triggers a reprint of the original act in a new version, so there's no risk of us generating interim versions that don't exist as a work. In that respect, I think everything is in scope.
Secondly, yes, there's the risk that I lose interest in this project (though with it being related to my work, this'll take a while), from what I've read of discussions on "collective works" on this site, the key thing there is setting up a project in such a way that anyone can make sense of its structure and can pick it up at any moment. I agree this is essential. And with reference to "59+ versions", just a note, with the SSA 2018 that I've added, "Version 56" is weirdly the first version of the document. Other versions are internal versions held by the Parliamentary Counsel Office that aren't publicly available. The second version available is called "Version 59", as versions 57/58 are internal only. I know these are just side-points, just wanting to add some clarity to my particular project.
You're right that it doesn't completely allow version comparison. I did want to keep everything as static separate texts, which is how things are done here at Wikisource, so I did the closest thing I could. But, I can access Section 8 (v56) and Social Security Act 2018/Version 59/Section 8 and see the note that details the change and see the act as it appeared in each version.
So, I'm happy to take a step back from my solution and look at the problem. The problem statement is that navigation of related legislation and versions of that legislation is challenging. So it's not that Wikisource can't handle it, it's that I think navigation of this content might be challenging. Perhaps first I should ask, do people agree this is a problem, that this is a problem that hasn't been solved, and should be solved?
The United States Code is not presented as separate versions as far as I can tell, it's just continuously added to as new laws are incorporated into it. There's nothing like that in NZ law that I can compare to.
The annotation suggestions you've mentioned are something I'll definitely look into, I'll scope out some examples and see how that might look. That might help with the version comparison side of things, but yeah, I think the organisation of the information is still an important issue. Supertrinko (talk) 07:49, 2 September 2020 (UTC)
@Supertrinko: I'm failing to find the time to dig into the details here, so I can't really propose a specific solution. But in the interest of not just leaving you hanging here… From my relatively superficial look my thinking is that what you're after can be achieved by the normal organisation as editions combined with a custom work-specific navigation template. We have some precedent for that approach, and it would be analogous to our use of {{AuxTOC}} (not present in the original work, but clearly visually distinguished as an addition made by Wikisource). So my proposal is that we investigate that approach to see if it meets your needs and can be implemented without running into anything unforeseen. --Xover (talk) 10:36, 5 September 2020 (UTC)
Okay, that's a really good suggestion! In that respect, I could include amendment acts under this template, even though they are not versions of the original act, they just impact it. That is what I'm looking for, so I'll give that a go. Appreciate your assistance here. Supertrinko (talk) 22:14, 7 September 2020 (UTC)
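The distinction drawn in this thread between a diff at the wikitext level and a comparison of the readable text can be sketched as follows. This is a minimal illustration using Python's standard `difflib`, not a description of any Wikisource tool; the two sample strings are invented placeholders, not the wording of any Act, and the helper name is made up for this example.

```python
import difflib
from typing import List


def text_diff(old: str, new: str, old_label: str, new_label: str) -> List[str]:
    """Return a line-based unified diff between two plain-text versions."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",  # input lines carry no trailing newline
        )
    )


if __name__ == "__main__":
    # Placeholder text only; a real comparison would use the proofread
    # text of the two editions with the wiki markup removed.
    older = "Section heading\nPlaceholder sentence, earlier wording."
    newer = "Section heading\nPlaceholder sentence, later wording."
    for line in text_diff(older, newer, "Version 56", "Version 59"):
        print(line)
```

Any two proofread editions could be compared this way regardless of how their pages are organised, which is the point made above about the version construction being unnecessary for comparison.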

How you find out what other people did

Hello, all, I'm looking at phab:T261023 about how people "patrol" other people's contributions. I know that some of you do this, but I've never done it myself here. I'm assuming that the process here is not quite what the big Wikipedias are doing, and I'd like your perspectives to be included. Feel free to join the Phab task (the etiquette there is that anyone can post "information", but "discussion" should usually happen elsewhere) or to ping User:Keegan_(WMF) to any discussion on wiki (or ping me, and I'll pass it along to him). Thanks, Whatamidoing (WMF) (talk) 17:25, 3 September 2020 (UTC)

@Whatamidoing (WMF): Thanks for thinking of us, and taking the time to invite us to the conversation. That is very very much appreciated!
But I'm not entirely sure the sort of patrolling we do here translates particularly to the Toolhub project, since that's a rather specialised use case. Patrolling here is basically a small scale and more laid back variation of enwp's patrolling, and just using plain old Special:RecentChanges with filters for IP and newbie (renamed "learner" now I think?) edits. Oh, and ad hoc use of +autopatrolled whenever an admin gets tired of patrolling edits from someone that's clearly ready to not need handholding.
The biggest difference with enwp is probably that we patrol more to spot, welcome, and assist newbies rather than look for vandals and bad faith actors. And that's more of a cultural thing than a technology issue (although having the tools reinforce the "We're the last line of defence!" mentality that so easily spreads in contexts like enwp's new page patrol is probably not the best idea: WikiLove should permeate all tools intended to interact with new editors). We also inherently have a vastly higher proportion of new pages created than enwp: each page of a scanned book is a wikipage in the Page: namespace here, and each such wikipage is usually only a single edit. Tools assuming that creation of a new wikipage is the best predictor of patroller interest are likely to not be a good fit here. A new editor creating fifty new wikipages is entirely normal here.
In any case… Happy to help, and pleased to be invited, but I'm not sure our experience will have very much to contribute to this particular discussion. I've subscribed to the task and will chime in if relevant. --Xover (talk) 14:28, 4 September 2020 (UTC)
We can/do also utilise a bot that can selectively patrol based on criteria, see User:Wikisource-bot/patrol whitelist, though as time has progressed it has been less used. Xover pretty well covers that we have less targeted vandalism, and we use patrolling more to identify editors we wish to review. When we are comfortable that they have the gist of our namespaces and our templates, then we progress them. We are more likely to allocate the right than have someone ask for it. I will also note that we allow all autoconfirmed users to patrol edits, which is more relaxed than most wikis, and it has worked well here. — billinghurst sDrewth 15:16, 4 September 2020 (UTC)
I think it's important for the devs to hear these differences.
Am I correct in assuming that spam's not a significant problem here? Whatamidoing (WMF) (talk) 18:51, 4 September 2020 (UTC)
Spam here is mostly on User: or User Talk: pages, so relatively easy to deal with. Nuisance edits (e.g. changing dates of publication) are low volume. Other foci in patrolling are copyright violations and preserving the integrity of the text as printed (we get people "correcting" older spellings or changing text based on a newer edition). For the rest, I look for places where I can advise a new editor on proofreading problems and/or how to do things so as to comply with our house style. Beeswaxcandle (talk) 19:10, 4 September 2020 (UTC)
We have multiple namespaces in which we undertake our edits, and lots of our content preparation is in the Page: namespace, so when we get a vandal or a spammer they are pretty evident in the main namespace. As you are aware, editing in the Page namespace is a little more concealed from those unfamiliar with the site, so less targeted unless they are popping through RC. As we also have some standards like {{author}} and {{header}} in certain namespaces, when they are vandalised or missing they show up in AbuseFilters. Similarly, as adding external links is less likely here in main namespace, having AF monitoring there is usually pretty effective. So with those other triggers, we will often find our vandals/spammers by other means equally with patrol marks. — billinghurst sDrewth 01:18, 5 September 2020 (UTC)

Not a forum

I've copied {{Not a forum}} from Wikipedia. Feel free to make it feel more at home. I just felt that in one case, I wanted a nice bright box to remind would-be commentators that we don't really need discussion of the aspects of the work that aren't relevant to Wikisource (in this case, what On the Creation of Niggers‎ says about its author.)--Prosfilaes (talk) 06:33, 6 September 2020 (UTC)

Tech News: 2020-37

15:59, 7 September 2020 (UTC)

Missing horizontal rules (height:0 in global CSS)

It seems that horizontal rules using the <hr> element are no longer visible, due to the following CSS in the global CSS:

hr{height:1px;background-color:#a2a9b1;border:0;margin:0.2em 0}

This appears to be phab:T262507, which is hopefully going to get a fix soon. I have "repaired" the primary user, {{rule}}, where it was reported on the talk page, as this template should be entirely independent of the Mediawiki skin. However, there are quite a few other users of the raw HTML which are also broken. I think we can probably wait for this to be merged rather than faffing with our own site CSS, but we should also probably be looking at the use of bare HR's in content spaces which should probably be {{rule}} (paging WS:IGD?). Inductiveloadtalk/contribs 11:38, 10 September 2020 (UTC)

Invitation to participate in the conversation

Pagelist widget

From Sohom Datta, on the 'Wikisource-l' mailing list:

We are excited to announce that the Wikisource Pagelist widget is now available to be enabled on all Wikisources. Any interface admin on your wiki can enable it by using the instructions on the following page:


Feel free to also give us any feedback on the project talk page on Meta-Wiki as well.

Can someone enable this, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:41, 11 September 2020 (UTC)

@Pigsonthewing: Done and appears to work. Thanks for the heads-up and thanks to the developer. Great tool! Inductiveloadtalk/contribs 10:07, 11 September 2020 (UTC)

Pagelist defaults

This widget allows defaults to be selected from a list, configured by MediaWiki:Proofreadpage pagelist dropdown values.json. I have used the example values with some minor changes. What is the actual preferred terminology for the various pages:

  • Covers: "Cover" or "Cvr"
  • Blank un-numbered pages: "-", "–" or "—" (hyphen, en, em dash)
  • TOCs: "TOC", "ToC"
  • Plates (i.e. whole page images inserted outside the usual numbering sequence): "Image", "Img" or "Plate"

All have the implied "something else" option too and we can add more ("Title"?) Inductiveloadtalk/contribs 10:17, 11 September 2020 (UTC)

My preferences would be:
(because it is on the keyboard)
(because I understand them faster.) --Zyephyrus (talk) 15:31, 13 September 2020 (UTC)
Comment I would like to remind people that we should be utilising existing page numbers over someone's created labels, even when the page numbering is mute, though explicit from surrounding numbers. There is nothing more irritating than seeing ugly, repeating and unusable TOC TOC TOC for adjacent pages. I still fail to see why we use either COVER or CVR, what is that? what is the purpose? just put a dash. Similarly what is the value in "PLATE" unless we are linking to it. These are page numbers/bookmarks, so in the end set them as they should be rather than some creation.

To the question, the hyphen becomes very small to select from pagelist, so I prefer a larger endash, especially as it is the same width as a numeral. — billinghurst sDrewth 23:57, 13 September 2020 (UTC)

  • I also think hyphen instead of dash as it's easier to type. And 'Title' would be useful. What about 'Index' and 'Advertisement/Adv.'? Although, it's always possible to type those in, and they're not that common. @Billinghurst: the reason for 'plate' and similar is mostly for when they're not part of the pagination, i.e. have been pasted in after binding. You're right about hyphen being too small, so maybe it could be a dash for the dropdown label but added to the pagelist as a hyphen? —Sam Wilson 00:02, 14 September 2020 (UTC)
    It should be an endash in the final presentation, so what displays on a ready Index page, and what shows in a page numbering. I care not about the dropdown. I understand that it is a plate and outside of the page numbering. I would still state that it doesn't need a presented label and page numbering that says plate, it can just be the dash. I much prefer to put in the anchors on the page for plates and not rely on a generic page numbering and an ugly displayed label. — billinghurst sDrewth 15:27, 15 September 2020 (UTC)

Wikisource Pagelist Widget: Ready to be enabled

Note: This message is in English, but we encourage translation into other languages. Thank you!

Hello everyone,

We are excited to announce that the Wikisource Pagelist widget is now available to be enabled on all Wikisources. Any interface admin on your wiki can enable it by using the instructions on the following page:

In case your wiki doesn’t have an interface admin, reach out to us on the ‘Help with enabling the widget on your wiki’ section of the project talk page and we will connect you with a global interface admin:

You will need to hold a local discussion around what would be the labels for different page types in your language for the visual mode. (For example, ToC = ਤਤਕਰਾ in Punjabi, title = শিরোনাম in Bengali)

Feel free to give us any feedback on the project talk page on Meta-Wiki as well.


Sohom Datta

Sent by Satdeep Gill (WMF) using MediaWiki message delivery (talk) 05:56, 15 September 2020 (UTC)

See above. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:00, 15 September 2020 (UTC)

Widget making pagelist box too small[edit]

@Sohom data: With the change the pagelist box size is now only three rows, which is a right PITA. If one is not using the gadget, I would be expecting a standard pre-change box size. — billinghurst sDrewth 22:11, 15 September 2020 (UTC)

@Billinghurst: a workaround in the meantime might be something like the following in your common.css:
#wpprpindex-Pages {
	min-height: 15em;
}
Obviously you can change the 15em to suit your preference. Inductiveloadtalk/contribs 10:40, 18 September 2020 (UTC)
We originally wanted to make the box autosizing, so that it would fit all the text the user had currently entered, as well as making sure that people who wanted to use the widget wouldn't have to scroll through quite a bit of empty space. However, due to some issues with OOUI, we couldn't include that in the release. (T262144) I'll create a patch to fix the rows parameter until we have proper autosizing. :) Sohom data (talk) 18:44, 18 September 2020 (UTC)

File import[edit]

On reflection, I think File:A note on grappling tail-hooks in anopheline larvae - M.O.T. Iyengar - 1922.pdf (published in India in 1922, author died in 1972) is not eligible to be hosted on Commons, and should be imported here. Can someone oblige, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:06, 12 September 2020 (UTC)

@Pigsonthewing: Done. Please update the file locally, and add {{do not move to Commons}}. Thanks. — billinghurst sDrewth 15:21, 15 September 2020 (UTC)
I have no idea what you mean by "update the file locally". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:36, 15 September 2020 (UTC)
Typically we would use {{book}} and have a valid licence, and amend categorisation. — billinghurst sDrewth 15:39, 15 September 2020 (UTC)
yeah, just to elaborate - when stuff gets deleted at commons, then there is no mirror here linking to it, and we edit it locally at this project. also information template is notoriously generic, with the wrong fields for artworks and books, so we resort to custom metadata templates. Slowking4Rama's revenge 17:11, 16 September 2020 (UTC)

Tech News: 2020-38[edit]

16:19, 14 September 2020 (UTC)

Georgia v. Public.Resource.Org, No. 18-1150[edit]

[Moved here from Wikisource talk:Copyright discussions#Georgia v. Public.Resource.Org, No. 18-1150. --Xover (talk) 07:07, 18 September 2020 (UTC)]

I am sorry if this is the wrong place to post this. I am most active on en.wikipedia. I just wanted to note this decision and make sure you know about it. In Georgia v. Public.Resource.Org, No. 18-1150, the US Supreme Court ruled that "Georgia may not copyright its entire official code, which includes both the state's laws and annotations".[11] I think this applies not only to Georgia but to every state in the US, when it comes to their laws. The annotations are very useful information for us to archive but I believe they only apply to Georgia because the interpretive works were done by members of the Georgia government. The decision is here. Coffeeandcrumbs (talk) 04:56, 18 September 2020 (UTC)

@Coffeeandcrumbs: Thanks for thinking of us! That is indeed a very useful reference. To ensure it gets seen I've moved it here to the Scriptorium (our version of the Village Pump).
Provided I've understood it correctly, the key factor is that USSC established a new test for "Edict of Government" that focusses not on force of law but rather on the work in question being authored by legislators or a legislative body. On Wikisource—whose copyright policy is to only require a work to be public domain under US copyright law (vs. Commons that also requires the same in the country of first publication)—this has the effect of significantly expanding the application of the "Edict of Government" exception. In practice we've not had many cases where that would have made a difference, but I think there are now a significant number of works that are eligible to be hosted here that were not previously so. --Xover (talk) 07:07, 18 September 2020 (UTC)

Bigger spaces between paragraphs[edit]

I don't expect to get much traction for this, but it's been bothering me for a while so here goes.

In prose text, the spacing between paragraphs doesn't really matter too much. The current paragraph spacing is fine. If we increased the paragraph spacing a bit, it would still be fine.

In poetry, however, the semantic encoding of the poems in HTML is best done by using line breaks between lines, and paragraph breaks between stanzas - but our paragraph breaks are too small for this to adequately space the stanzas, leading many editors to use the less-semantic habit of using a double line break between stanzas. This could be easily and neatly handled by modifying our site-wide CSS to have a little bit more space between paragraphs.

The only downside I can think of is the (relatively) small number of texts that use complex workarounds that depend on the height of paragraph breaks to display properly - but I argue that these do not matter because a) there aren't really that many of them, b) such a display constraint is (essentially) a hack and could not be expected to work forever, c) such a display constraint violates the responsive design principles of web-based texts, especially on websites like Wikisource where multiple formats are presented (web, mobile, epub), and d) we can fix them when we see them.

I am interested to hear the community's thoughts on the subject. —Beleg Tâl (talk) 14:36, 18 September 2020 (UTC)

@Beleg Tâl:: this is easy to do with site CSS, if the poem extension emitted sane HTML. Rather than BR-separated lines and P-separated stanzas, I think it would make more sense to do it this way: for the poem markup:
Line 1a
Line 1b
Line 1c

Line 2a
generate the following HTML:
<div class="poem">
  <div class="stanza">
    <p>Line 1a</p>
    <p>Line 1b</p>
    <p>Line 1c</p>
  </div>
  <div class="stanza">
    <p>Line 2a</p>
  </div>
</div>
This is because this way you can also achieve the very common formatting of hanging indents for continued lines, something like this:
Now Jones had left his new-wed bride to
        keep his house in order.
And hied away to the Hurrum Hills above
        the Afghan border,
by setting a CSS rule for .poem > .stanza, either with TemplateStyles or site-wide. Using BR-separated lines means you get the following, since it's all one paragraph and therefore doesn't get a new indent:
Now Jones had left his new-wed bride to
        keep his house in order.<br/>
        And hied away to the Hurrum Hills above
        the Afghan border,
Once this HTML structure is present, you can also have control over stanza padding-top/-bottom to control the inter-stanza spacing.
There is a Phabricator task for this: phab:T199075, and the similar phab:T8419. Inductiveloadtalk/contribs 15:05, 18 September 2020 (UTC)
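As an illustration only, the markup-to-HTML mapping proposed above could be sketched in Python as follows. This is not how the Poem extension works or would be changed; the function name `poem_to_html` is hypothetical, and the sketch assumes that stanzas are separated by one or more blank lines and that leading whitespace on a line is not significant.

```python
from html import escape


def poem_to_html(text: str) -> str:
    """Convert poem text into the nested div/p structure proposed above.

    Assumption: blank lines separate stanzas; every other line is one verse line.
    """
    stanzas = []   # list of stanzas, each a list of verse lines
    current = []   # lines of the stanza being collected
    for line in text.splitlines():
        if line.strip():
            # Leading/trailing whitespace is dropped (an assumption of this sketch).
            current.append(line.strip())
        elif current:
            # A blank line closes the current stanza; repeated blanks are ignored.
            stanzas.append(current)
            current = []
    if current:
        stanzas.append(current)

    out = ['<div class="poem">']
    for stanza in stanzas:
        out.append('  <div class="stanza">')
        for line in stanza:
            # Escape the text so characters like "<" cannot break the markup.
            out.append(f"    <p>{escape(line)}</p>")
        out.append("  </div>")
    out.append("</div>")
    return "\n".join(out)


if __name__ == "__main__":
    print(poem_to_html("Line 1a\nLine 1b\nLine 1c\n\nLine 2a"))
```

Because each verse line ends up in its own element, a style rule can indent wrapped continuation lines per verse line, and the stanza wrapper gives a separate hook for inter-stanza spacing.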
Many editors (myself included) don't use the poem extension because it doesn't output sane HTML. Unless the poem extension were to be improved (which is unlikely, according to the devs at Phabricator), then maybe the poem extension isn't a useful way of handling this. Also, I do not understand why it would make sense for individual lines of a poem to be semantically tagged as paragraphs unto themselves? I do agree that the ideal method would be to have a special markup for poems that works and that spaces the stanzas accordingly, but my proposal is based on what we ourselves are actually able to do. I suppose we could do it as a template, if we can get past our usual issues with template spread {{Poem begin}} {{Poem-on}}. —Beleg Tâl (talk) 15:10, 18 September 2020 (UTC)
I understand about the poem extension being busticated, but it looks extremely simple and I wonder if we can make/beg for a new or variant poem tag to use for "correct" poems.
Re: Also, I do not understand why it would make sense for individual lines of a poem to be semantically tagged as paragraphs unto themselves?
This is because <p>Line 1<br/>Line 2</p> is considered a single line for the purposes of indentation. Which means on a small screen the following poem:
Now Jones had left his new-wed bride to keep his house in order.
And hied away to the Hurrum Hills above the Afghan border,
wraps to something like
Now Jones had left his new-wed bride
to keep his house in order.
And hied away to the Hurrum Hills
above the Afghan border,
or, with a hanging indent, something like
Now Jones had left his new-wed bride
        to keep his house in order.
        And hied away to the Hurrum Hills
        above the Afghan border,
which means you can't see the original lines. As opposed to how the poems are actually formatted in nearly all books:
Now Jones had left his new-wed bride
        to keep his house in order.
And hied away to the Hurrum Hills
        above the Afghan border,
I think SPAN would work as well as P. Inductiveloadtalk/contribs 16:39, 18 September 2020 (UTC)

Category conversations—some proposals[edit]

I am going to be addressing a number of components about categories as I start to do some re-organisation and some tidying. The basis of the re-organising is primarily to allow categorisation of our biographies; the aligning of occupations of authors and biographies of authors; the creation of meta categories and their alignment to existing WD cats for people.

Some issues that I would like to address and resolve

  • Nomenclature of occupations
  • Deprecate the use of category parameter in headers, with aim of removing
  • To group or not to group biographical/encyclopaedic/dictionary/... subpages by work

Nomenclature of occupations[edit]

Previous discussion at Wikisource:Scriptorium/Archives/2020-05#Time to talk nomenclature of author classification by occupation that looked to determine how we would categorise occupations for more than authors. Examples are

The meta category is configured for HotCat to not allow its selection as the final choice, instead it will show the next layer down.

Still trying to get opinion on which style of category name people would prefer, noting that there are going to be lots,

  1. Authors who are physiologists
  2. Physiologists as authors

I have tossed and turned, though think 2), as they show up in HotCat quicker, and they will sort better in alphabetical lists without the need for defaultsort.


  • Please indicate which style of author occupation nomenclature is preferred.


It might be enough to have just Category:Physiologists which would contain authors who are physiologists + its subcategory Biographies of physiologists‎. This subcategory would be indexed by a space or an asterisk so that it was not alphabetically mixed among other potential author subcategories. The reason is that both names of the suggested subcategory for authors are quite long for a category name. However, if the opinion to have such subcategory prevailed, I would prefer "Authors who are physiologists". --Jan Kameníček (talk) 15:43, 19 September 2020 (UTC)

The issue with this approach is that we would not be able to restrict the addition of author pages or biographical works to the category, so we are still going to need to eyeball and manually clean the categories. I was trying to avoid that sort of process wherever possible.

To also note that I started on biographical and have yet to introduce how we categorise other non-fictional and fictional works, they still need to be within the model. — billinghurst sDrewth 14:17, 20 September 2020 (UTC)

I like this idea of subcategorizing only the biographies. Or, what if we called the author category "Physiologist authors" ? The closest parallel I have found in a brief search of our sister wikis is w:Category:Politicians by occupation which uses the pattern "Physiologist-authors". —Beleg Tâl (talk) 19:25, 19 September 2020 (UTC)
On the other hand, considering that we don't allow categorizing by Biographies-of-Person, then why do we even have categories for Biographies-of-Person's-Occupation at all? —Beleg Tâl (talk) 19:40, 19 September 2020 (UTC)
We don't allow categories based on individuals, and that is due to preferring curated pages for collection(s) of works. — billinghurst sDrewth 14:08, 20 September 2020 (UTC)
"Psychcology authors", surely? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:51, 20 September 2020 (UTC)

Deprecate category parameter in headers[edit]

I would like to propose that we deprecate the use of categories in {{header}}. My reasons are:

  • It makes maintenance difficult. None of pywikibot, AutoWikiBrowser, and HotCat can inherently know that the parameter is used so it takes text replacement methodology; or manually editing categories
  • The use of the parameter allows the categorisation into categories that are either redirected or meta categories (category:category redirects and category:disambiguation categories). Such categorisation should be avoided wherever possible.

You can see the list of usages of the parameter in Category:Works using categories parameter.

Comment Embedding categories inside templates is a bit ugh, unless they are distinct values, and I would suggest we probably shouldn't use that methodology unless it is data that is recorded against the item in Wikidata, eg. gender, birth_year, death_year, etc. and changes align with a WD edit. — billinghurst sDrewth 12:53, 19 September 2020 (UTC)


Agree. --Jan Kameníček (talk) 15:44, 19 September 2020 (UTC)

Support. Also as mentioned, it would be useful to have in the header template the logic to auto-categorize based on Wikidata, but removing the ability to manually categorize this way is a good idea. —Beleg Tâl (talk) 19:27, 19 September 2020 (UTC)

@Beleg Tâl: once a hierarchy is built, then the community can decide on next steps. — billinghurst sDrewth 13:45, 20 September 2020 (UTC)

To group or not to group biographical/encyclopaedic/dictionary/... subpages by work[edit]

The categories utilised by EB1911 form a thorough hierarchy of their own, Category:1911 Encyclopædia Britannica, that is basically independent of our standard category build. Other categorisation has been quite flat. So the question is: do we mass collect in a category of type with a parallel hierarchy, or do we have a subhierarchy that collates subpages by work?

So which is preferred?

  • Category:Biographies of politicians has a conglomeration of works which are forced to defaultsort by the subpagename, so become a mix of all the biographical works by person. This gives a straight alphabetical, though the long page names makes eye-reading difficult.
  • [[:Category:Biographies of politicians in Thom's Who's Who in Ireland]] (as a possibility) means that their names would be a little more eye-readable and further categorised, though it requires additional digging, and biographical types could be linked to a parent, something like [[:Category:Biographies by occupation in Thom's Who's Who in Ireland]]

billinghurst sDrewth 14:04, 20 September 2020 (UTC)


Relationship with Project Gutenberg[edit]

I'm new here, and wondered what Wikisource's relationship with Project Gutenberg is. I've found books like A Passage to India which are incomplete here and have a proofread page while Gutenberg have a finished copy, but no mention is made of that here either for the book or on E. M. Forster's page other than an authority control reference.

But I've also seen other Gutenberg texts imported here, but books Gutenberg finished in 2003 are missing.

Do you have a policy to avoid duplication of effort and bring the sites together, either by manual updates or bots? Vicarage (talk) 09:26, 20 September 2020 (UTC)

@Vicarage: We have no relationship with Gutenberg, though there may be editors who contribute at both places. Some have copied works transcribed there to here, presumably because they wanted to do so. Most people will typically work on something new, rather than regurgitate a work from elsewhere. We can just as easily link to a work at Gutenberg from an author page. — billinghurst sDrewth 13:40, 20 September 2020 (UTC)

Tech News: 2020-39[edit]

21:28, 21 September 2020 (UTC)

Math symbols available?[edit]

I am not certain whether the page Page:The Evolution of British Cattle.djvu/107 is able to be done with math symbols or some variance. Very happy if someone can do something better than I have in place. Thanks. — billinghurst sDrewth 12:19, 22 September 2020 (UTC)

and Page:The Evolution of British Cattle.djvu/111 billinghurst sDrewth 12:42, 22 September 2020 (UTC)
I don't believe that our math plugin supports this kind of diagram. w:Help:Displaying_a_formula recommends building the diagram in TeX, exporting to SVG, and uploading to Commons to use as an image. I do like your approach to it though; maybe you could do something like


for the more complicated ones? —Beleg Tâl (talk) 13:33, 22 September 2020 (UTC)
Anyway, if you can render TeX to export as SVG, I found this site which allows you to create diagrams like this to generate code like this:
● \arrow[r] \arrow[rd] & ● \\
● \arrow[r] \arrow[ru] & ○
Beleg Tâl (talk) 13:43, 22 September 2020 (UTC)
Thanks. I am going with KISS, these representations will do. I am not looking for a facsimile. — billinghurst sDrewth 13:55, 22 September 2020 (UTC)

New feature: Watchlist Expiry[edit]

Hello, everyone! The Community Tech team will be releasing a new feature, which is called Watchlist Expiry. With this feature, you can optionally select to watch a page for a temporary period of time. This feature was developed in response to the #7 request from the 2019 Community Wishlist Survey. To find out when the feature will be enabled on your wiki, you can check out the release schedule on Meta-wiki. To test out the feature before deployment, you can visit or testwiki. Once the feature is enabled on your wiki, we invite you to share your feedback on the project talk page. For more information, you can refer to the documentation page. Thank you in advance, and we look forward to reading your feedback! --IFried (WMF) (talk) 16:46, 23 September 2020 (UTC)

Subject to change, the implementation date is listed as September 22, 2020 for Enwikisource. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:12, 24 September 2020 (UTC)

Wikisource Pagelist Widget: Wikisource Meetup (29th September 2020)[edit]

Hello everyone,

We hope you are doing well!

We reached out to you a couple of weeks ago to share that Wikisource Pagelist Widget is now ready to be enabled on Wikisource. Since then, many language Wikisources have enabled the widget but many are yet to do so.

So, we have decided to organize a Wikisource Meetup to give a live demonstration on how to use the widget in both wikitext and visual modes. There will be some time for the participants to share their feedback and experience with the widget. We will also provide support in case some Wikisource communities are seeking help in enabling the widget.

The meetup will take place on 29 September 2020 at 9:30 AM UTC or 3 PM IST. Google Meet link for the meeting is:

Looking forward to seeing the global Wikisource community connect amid these difficult times when physical meetings have not been taking place.

P.S. If you are planning to attend this meetup and are comfortable in sharing your email address then send us your confirmation in the form of a small email to, this will help us in getting a sense of the number of people that are planning to show-up. We are aware that this time-zone is not convenient for everyone and more meetups can be organized in the future.


Sohom, Sam and Satdeep

Sent by Satdeep using MediaWiki message delivery (talk) 11:03, 24 September 2020 (UTC)