Wikisource:Scriptorium

Scriptorium
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel webclient. For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 451 active users here.

Announcements

Proposals

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

Other discussions

PD-anon-1923 again

The discussion of Happy Public Domain Day! has slipped into the archives without reaching a conclusion, so I would like to remind everyone that the last suggestion in that discussion was to create {{PD-US|year of death}} and deprecate {{PD/1923}} and {{PD-anon-1923}}. Is this solution OK?

BTW: if we decide to keep calling the license templates for pre-1925 works {{PD/1923}} and {{PD-anon-1923}}, it would be necessary at least to adapt the latter one so that it could be used for 1924 anonymous works too. --Jan Kameníček (talk) 16:21, 20 February 2020 (UTC)

Support the change — I don't really care but it makes sense —Beleg Tâl (talk) 16:36, 20 February 2020 (UTC)
  • Support likewise —Nizolan (talk) 01:54, 21 February 2020 (UTC)
  • Oppose because the name emphasizes US. The point of the templates is to cover both US status and international status. A template that names the US will cause confusion, especially to newcomers. --EncycloPetey (talk) 02:02, 21 February 2020 (UTC)
    @EncycloPetey: So in your opinion, even fixing a math error requires consensus? Without consensus, should we believe 1+1=3 rather than 1+1=2? --Liuxinyu970226 (talk) 01:37, 1 April 2020 (UTC)
    Changes to established templates require consensus. We've had previous discussions and the community is divided on the issue concerning these templates. Proceeding with a change when the community has expressed such division is inappropriate because of the community discussion, not because of my opinion. --EncycloPetey (talk) 02:05, 1 April 2020 (UTC)
  • Support. We are US-centric in our copyright approach. Given the number of times I've had to look up these type of templates here and on Commons, I might buy the idea that we should copy them, but otherwise, I think this is going to be as non-confusing as we get.--Prosfilaes (talk) 04:35, 21 February 2020 (UTC)
  • Comment In your proposal, how do we code the year of the author's death for anonymous works? --EncycloPetey (talk) 04:38, 21 February 2020 (UTC)
    I am afraid I do not understand the question: anonymous works do not have any known author. I propose that for anonymous works we would have a template with similar wording as {{PD-anon-1923}}, but it would be called {{PD-anon-US}}. --Jan Kameníček (talk) 09:42, 21 February 2020 (UTC)
    That's also problematic, because the US is just one place that we display license information for. The current template displays that information for both the US and for countries with 95 years pma. --EncycloPetey (talk) 19:46, 21 February 2020 (UTC)

Comment If there is a consensus to act, my recommendation is that we just move/rename the templates

  • pd/1923|yyyy -> PD-US|yyyy, yyyy=YoD, displays two templates as now
  • PD-1923 -> PD-US, where no $1 parameter it displays the one template
  • PD-anon-1923 -> PD-anon-US|yyyy, year of publication

and update the documentation around the place. Do any required internal tidying of the templates, and fix double redirects. No need to deprecate anything, just move to the new nomenclature, and not worry about any of the old usage, or anyone continuing its use, as it matters not. — billinghurst sDrewth 11:15, 21 February 2020 (UTC)

  • Oppose Firstly, because of the US emphasis. Yes, we follow US copyright law, but we also serve an international readership, not to mention contributors who are also bound by the copyright laws of other countries. Secondly, I think replacing "PD-1923" with "PD-US" is confusing. "PD-US" sounds like a generic template for "this work is PD in the US", but under this proposal it would mean "this work is PD in the US for the specific reason that it was published more than 95 years ago". BethNaught (talk) 22:16, 21 February 2020 (UTC)
    I do not understand in what way "the readership" is concerned in this… They see only the text of the template which is going to stay the same. --Jan Kameníček (talk) 23:08, 21 February 2020 (UTC)
    Comment I do not think that the suggested name of the template is more American-centred than the old one. E.g. {{PD/1923|1943}} has got two parts: "1923" is the American part referring to the American copyright laws, and the parameter "1943" is international referring to the countries where PD depends on the year of death. Nothing would change, only the American part would be called "US" instead of the nowadays non-sensical 1923, I really do not see any problem in that. --Jan Kameníček (talk) 23:08, 21 February 2020 (UTC)
    @BethNaught: The thing is that the only consideration we give to copyright compliance with regard to hosting is to the US copyright. Unlike Commons, we don't really care whether it is copyright in the country of origin. It is for this reason that I am reasonably comfortable with just stating PD-US and variants. The additional PD-old-70 and variants are for information only. — billinghurst sDrewth 00:43, 22 February 2020 (UTC)
  • Comment I think this is an important issue, and I'd like to weigh in. I'm probably as familiar as (almost) any Wikimedian with the considerations around copyright law in various countries. But I do not see a clear statement of what the problem is that we're aiming to solve, or what the pros and cons are. I'm sure if I took an hour or two to dig through various archives, I could probably figure it out, but I'm not likely to have the time for that...nor should we expect every voter to do that. So given all that, I'm inclined to gently oppose, simply because I can't figure out what's going on, and it seems unwise to make a change that is difficult for community members to evaluate. Is it possible to sum up the issues more concisely so that I can give it more proper consideration, without having to do all the research myself? -Pete (talk) 22:44, 21 February 2020 (UTC)
    The problem I see is this: Until 1923 it made quite a good sense to have a template called PD-1923, because it referred to the fact that only pre-1923 works are in the public domain. However, the situation has changed, currently the time border is 1925-01-01 (or 1924-12-31) and it shifts every year. I perceive it as very confusing to call the template for pre-1925 works PD-1923 (why 1923???). At the same time it does not make sense to change the name of the template every year (PD-1923, …, PD-1925, …), it would be better to find a fitting universal name. --Jan Kameníček (talk) 23:16, 21 February 2020 (UTC)
    Ah, that's very helpful @Jan.Kamenicek:, thank you. I had misunderstood, I thought you were proposing a change to the functionality in addition to the name change.
    I agree that changing the name (a) such that it specifies "US" and (b) such that it references the 95 year rule, rather than the (now outdated) 1923 rule would be worthwhile. I agree with others that we should be cautious about US centrism; but the reality is, with a current title that assumes that it relates to US law, without stating it, we already have a high degree of US centrism in the title. In my view, it's better to state "US" as part of the name, to make it clear to editors (who are the primary audience for a template name) that it's about US law. So, my suggestion would be {{PD-US-95}} or similar. That conveys that it's about US law, and it's about the 95 year rule. Text on the template page/docs could clarify that the 1923 rule is now outdated, and subsumed under the 95 year rule.
    A related issue that I find confusing: I don't understand why we need two separate templates for {{PD-1923}} and {{PD/1923}}. I think this proposal only relates to the latter; would we be leaving PD-1923 intact? A decision on this is probably a matter for a separate discussion, but I'd like to know for sure what the intent of this proposal is. -Pete (talk) 23:45, 21 February 2020 (UTC)
    PD-1923 involves no decision-making; it applies just a single template and does not add the PD-old-nn variants. It has been utilised where we have been unable to determine a date of death, or for corporate publications which do not have PMA decisions. I addressed above that they would morph into PD-US, though we would need to handle them as parameterless. — billinghurst sDrewth 00:51, 22 February 2020 (UTC)
    Jan, that's not quite correct. Works published before 1923 are still in PD in the US for the same reason they were before. The 1923 date was a cutoff date beyond which we have never had to check. What has changed is that works that were under copyright later than that (from 1923 and 1924), and had their copyright renewed at one point, have now had that copyright protection expire. The works published before 1923 were not eligible for renewal and entered PD for a different reason than the works published in 1923 and 1924. It is one view to see the date as a shifting cutoff, but the cause of works from 1923 and 1924 entering public domain is actually different from those that were published prior to 1923. --EncycloPetey (talk) 03:13, 22 February 2020 (UTC)
    All works published more than 95 years ago are out of copyright because of the time since publication, no matter whether that's due to copyright notices, or renewals, or being in copyright for a full long term. For a work published before 1923, we've never been concerned about copyright notices or renewals, nor how long work published with copyright notice and renewal got in copyright. Why does it matter that a work published in 1924 may have got 95 years of copyright, whereas a work published in 1922 may have only got 75, when we don't really care about that 95 or 75 in the first place? We have no tag for "published abroad before non-US works got copyright in the US in 1891", because we don't care; it has always been sufficient for our purposes to say that it was published before 1923, and I don't see why it is not now sufficient to say that it was published more than 95 years ago.--Prosfilaes (talk) 04:59, 22 February 2020 (UTC)
    @Prosfilaes: I am presuming that this is in reference to the primary notice about copyright within the US, not the secondary notice for PD-old-nn which relates to copyright elsewhere in the world. The secondary notice can still apply for those of us not in the US, which is why we added it. — billinghurst sDrewth 05:08, 22 February 2020 (UTC)
    Yes, the primary notice. There's no need to worry about now-historical features of non-US countries, but certainly helpful to list the years since death.--Prosfilaes (talk) 05:18, 22 February 2020 (UTC)
    Yes and no. There are authors who have works published prior to 1925 who died late enough to still have works in copyright in their home country, so those notices are still very pertinent per Category:Media not suitable for Commons. — billinghurst sDrewth 05:30, 22 February 2020 (UTC)
    Right; I didn't mean to imply we should change the current secondary notices.--Prosfilaes (talk) 06:42, 22 February 2020 (UTC)
  • Support U.S. copyright is of primary concern to Wikisource. Fixing the license so more 1923 and 1924 works appear on Wikisource even if still under copyright in other countries is so important. Abzeronow (talk) 19:46, 16 March 2020 (UTC)
  • Support as this seems like the least problematic solution to the problem, and it doesn't make sense for us to keep delaying a resolution. Kaldari (talk) 18:09, 14 April 2020 (UTC)
  • Comment It looks as though some people are hedging their bets: arguing for deprecating the template on the one hand but arguing for improving the template on the other. Since the template content has now changed, before this discussion has concluded, then procedurally we should recast all votes, since the template named in this discussion thread no longer has the content it had at the start of this discussion. --EncycloPetey (talk) 20:42, 24 April 2020 (UTC)
    Hedging their bets? It is somehow improper to try and improve Wikisource for now, whether or not this template gets deleted? If we're going to get pedantic about policy, where is it written on the English Wikisource that we should recast all votes?--Prosfilaes (talk) 06:41, 25 April 2020 (UTC)
    No need to restart the votes, as the changes have been reverted. The template is the same as it was before the voting started. No changes should be made to any template if there is a discussion and voting ongoing about its future. If the changes were allowed and at the same time we would have to restart the voting after every change, we may never come to a conclusion; not everybody has time to vote about the same problem again and again. --Jan Kameníček (talk) 09:50, 25 April 2020 (UTC)
  • Support If consensus is needed even to fix math errors, so be it. --Liuxinyu970226 (talk) 09:01, 7 May 2020 (UTC)
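
The shifting cutoff that this thread argues about can be sketched in a few lines. This is only an illustration of the 95-years-since-publication rule as described by the participants above; the function names are hypothetical, and the sketch assumes the cutoff has moved by one year every 1 January since 2019 (the year 1923 works expired). It ignores notice, renewal, and every other route into the public domain.

```python
US_TERM_YEARS = 95          # publication-based term discussed in the thread
FIRST_ROLLING_YEAR = 2019   # assumption: earlier years are not modelled here


def us_pd_cutoff_year(current_year: int) -> int:
    """Return the earliest publication year NOT yet covered by the 95-year rule."""
    if current_year < FIRST_ROLLING_YEAR:
        raise ValueError("this sketch only models 2019 onward")
    return current_year - US_TERM_YEARS


def is_pd_by_age(publication_year: int, current_year: int) -> bool:
    """True if the work was published before the cutoff year for current_year."""
    return publication_year < us_pd_cutoff_year(current_year)


if __name__ == "__main__":
    # In 2020 the cutoff is 1 January 1925, as noted in the discussion.
    print(us_pd_cutoff_year(2020))
    print(is_pd_by_age(1924, 2020), is_pd_by_age(1925, 2020))
```

Because the cutoff is computed rather than baked into a name, a year-neutral template name (whichever one the community picks) would never go stale the way "1923" has.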

Tech News: 2020-17

18:44, 20 April 2020 (UTC)

Files affected by CSS changes

The change announced in this Tech News (over-qualified CSS selectors in Wikimedia skins have been removed) affects the following files on enWS. Each change will have to be investigated individually, but as a rule of thumb, CSS files that use something like div.classname can be changed to simply omit the element name: .classname. Other changes will be slightly more involved, for example as given in the news item: div#content must be changed to .mw-body.



None of the needed changes are likely to cause catastrophic failures, but some may be annoyances and it is generally a good idea to implement the necessary changes sooner rather than later. If anybody needs technical assistance then post here so one of our more technical contributors can help out (or feel free to ping me directly, I'm a bit busy IRL but happy to help where I can). --Xover (talk) 06:22, 12 May 2020 (UTC)

Thanks for pinging me. Am I right in understanding that what you're saying is that those identifiers have changed (or are about to) in the HTML, so I should update my user stylesheet to reflect the change? I know my user CSS has some overqualified properties to ensure they take effect (and some !importants too 😞), so hopefully some of those can be made less shoddy? — OwenBlacker (Talk) 07:41, 12 May 2020 (UTC)
@OwenBlacker: Yup. ID selectors (#) almost never need the element name, and classes (.) rarely do. Since the specific element used for those components may change, specifying the element in the selector makes them fragile. The Tech News item is a warning that some of those elements may now change going forward. In addition, it lists the following specific changes to classes and IDs: div#content is now .mw-body. div.portal is now .portal. div#footer is now #footer. The tracker for these changes is phab:T248137. --Xover (talk) 15:02, 12 May 2020 (UTC)
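
For anyone with a long user stylesheet, the rule of thumb and the three specific renames quoted above can be applied mechanically. The following is a rough sketch, not a vetted tool: the regular expression is a heuristic of this example, it only understands simple selectors (not declarations or attribute selectors), and dropping the element name slightly broadens what a selector matches, so each result still needs a human look.

```python
import re

# Renames listed in the Tech News item, as quoted in the thread.
SPECIFIC_RENAMES = {
    "div#content": ".mw-body",
    "div.portal": ".portal",
    "div#footer": "#footer",
}

# Heuristic (an assumption of this sketch): an element name directly followed
# by a class or ID, and not itself part of a class, ID, or pseudo-class.
OVERQUALIFIED = re.compile(r"(?<![\w.#:-])[A-Za-z][A-Za-z0-9]*(?=[.#][A-Za-z_-])")


def simplify_selector(selector: str) -> str:
    """Rewrite one simple CSS selector; declarations are not handled."""
    for old, new in SPECIFIC_RENAMES.items():
        # The lookahead keeps e.g. div#contentSub from matching div#content.
        selector = re.sub(re.escape(old) + r"(?![\w-])", new, selector)
    return OVERQUALIFIED.sub("", selector)


if __name__ == "__main__":
    for sel in ("div#content", "div.classname", "div#footer a", "div#contentSub"):
        print(sel, "->", simplify_selector(sel))
```

Running it over selectors one at a time (rather than over a whole file) keeps property values such as file names with dots out of reach of the heuristic.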

The United Nations Treaty Series should have uniform naming of files

We use File:UNTS 1.pdf, File:United Nations Treaties and international agreements registered - Volume 221 (13 November 1955 - 30 November 1955).pdf, File:UN Treaty Series - vol 935.pdf, etc. here with very inconsistent naming. Please also come to commons:Commons:Village pump#The United Nations Treaty Series should have uniform naming of files to discuss on uniform naming. How will renaming impact our uses of these files?--Jusjih (talk) 03:17, 21 April 2020 (UTC)

@Jusjih: While it is nice and better to have a uniform naming pattern, it has never been a requirement for files. We can manage it through transclusion. If you wish to organise it for future files, or files that are yet to be transcribed, I doubt that we have an issue. If you are saying that we should be renaming files that have been transcribed, I don't think that is a good idea, and I hope that you tell that to Commons. — billinghurst sDrewth 07:58, 24 April 2020 (UTC)
All three files are hosted on Commons, so I wonder how renaming there will impact our transclusion here. As Commons gets growing consensus for "UN Treaty Series - vol 935.pdf", I have uploaded UN Treaty Series - vol 2.pdf through UN Treaty Series - vol 11.pdf.--Jusjih (talk) 03:59, 26 April 2020 (UTC)
My talk at Commons has been archived at Commons:Village_pump/Archive/2020/04#The_United_Nations_Treaty_Series_should_have_uniform_naming_of_files and I have asked any disinterested user to carefully consider if renaming File:UNTS 1.pdf requires substantial efforts.--Jusjih (talk) 05:03, 4 May 2020 (UTC)
Someone else renamed File:UNTS 1.pdf to File:UN Treaty Series - vol 1.pdf on Commons without affecting Index:UNTS 1.pdf, Page:UNTS 1.pdf/1, etc. here. It looks like good news, as renamed Commons files keep their redirects.--Jusjih (talk) 03:36, 5 May 2020 (UTC)

I believe that there may be some more such treaties in Category:PD-UN. TE(æ)A,ea. (talk) 11:36, 9 May 2020 (UTC).

Most of your cited files are partial pages of certain volumes of the United Nations Treaty Series. See also #Duplications of texts, USA treaties, between individual projects and within United States Statutes at Large.--Jusjih (talk) 18:50, 17 May 2020 (UTC)
Billinghurst and I have reached a consensus on Commons talk to grandfather File:UNTS 1.pdf and File:United Nations Treaties and international agreements registered - Volume 221 (13 November 1955 - 30 November 1955).pdf to avoid major impact from renaming. All others are to be named like File:UN Treaty Series - vol 935.pdf.--Jusjih (talk) 03:37, 26 May 2020 (UTC)
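
The outcome above (one naming pattern plus two grandfathered files) is easy to check mechanically before uploading. A small sketch, assuming the pattern is exactly "UN Treaty Series - vol N.pdf" with an unpadded volume number; the helper names are hypothetical and not part of any existing bot or gadget.

```python
import re

# Pattern agreed for new uploads, per the thread above.
CONVENTION = re.compile(r"^File:UN Treaty Series - vol (\d+)\.pdf$")

# Files kept under their existing names to avoid the impact of renaming.
GRANDFATHERED = frozenset({
    "File:UNTS 1.pdf",
    "File:United Nations Treaties and international agreements registered"
    " - Volume 221 (13 November 1955 - 30 November 1955).pdf",
})


def unts_filename(volume: int) -> str:
    """Build the conventional file name for a given volume number."""
    if volume < 1:
        raise ValueError("volume numbers start at 1")
    return f"File:UN Treaty Series - vol {volume}.pdf"


def is_acceptable(name: str) -> bool:
    """True if the name follows the convention or is grandfathered."""
    return name in GRANDFATHERED or CONVENTION.match(name) is not None


if __name__ == "__main__":
    print(unts_filename(935))
    print(is_acceptable("File:UNTS 1.pdf"), is_acceptable("File:UNTS 2.pdf"))
```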

US Supreme Court determination re copyright and government edicts

Nemo bis flagged a recent news report of via Wikisource-L

This definitely gives us broader scope and clarity for some of our copyright discussions, and especially for our application of {{PD-GovEdict}}. Thanks to Nemo bis for that email. — billinghurst sDrewth

[Pinging Prosfilaes if you haven't seen this already.]
Wow! I find myself hard pressed to avoid using profanity as an intensifier in describing the effects of this ruling!
What Roberts and the USSC majority does in this ruling is to turn the "edict of government" doctrine pretty much entirely on its head! The test used to be a "force of law" test: in order to qualify as an edict of government the work itself had to carry some measure of force of law in some manner. This ruling completely changes the test to instead focus on the author of the work: if the author of the work has the authority to define or interpret the law, then the creator of the work cannot be an "author" under the Copyright Act. In practical terms, this means that both judges and legislators, both federal and state-level, when acting in their official roles, cannot hold copyright for works they produce irrespective of the nature of the material!
And it even goes further: this applies to a committee established by the legislature even though it is notionally external to the legislature (includes non-legislators in its makeup), and, yet further, even to a private commercial entity that produces work-for-hire material (ultimately) on behalf of a judge or legislator. Roberts underlines that for these works the "author" for copyright purposes is actually ultimately "the people", and explicitly names this "public domain".
Summing this up in a layman's rule of thumb: we now have the same sort of exception from copyright as {{PD-USGov}} for works by judges and legislators at both federal and state level. And, in terms of US copyright, this principle will apply equally to foreign works (the edicts of government doctrine always has; this ruling does not alter that). --Xover (talk) 09:47, 28 April 2020 (UTC)
Wow. This is going to change some things! I just hope the documents are available, as opposed to just not copyrighted --DannyS712 (talk) 10:16, 28 April 2020 (UTC)

Comment We are going to need to head back to WS:CV and look at the state works that we have deleted, and works by state members. We will need to apply our brains particularly to the responses to the State of the Union addresses, and re-evaluate the US Democratic and Republican national conventions. I would suggest that we are also going to need to look at our documentation at template:PD-EdictGov/doc, and I am presuming that this will only apply to US state legislators, not foreign legislators, or do we think that for US copyright purposes it is more universal? — billinghurst sDrewth 14:29, 28 April 2020 (UTC)

Does this ruling (and I have a view this will go to appeals) change the status of papers submitted in a case? ShakespeareFan00 (talk) 14:38, 28 April 2020 (UTC)
@ShakespeareFan00: This does not affect submissions by parties in legal cases: the parties and their counsel do not define or interpret the law (they just argue it). And the Supreme Court is the court of last appeal. For the states to change this they will have to change the actual law (the Copyright Act). --Xover (talk) 16:19, 28 April 2020 (UTC)
@Billinghurst: I agree we will need to reassess quite a few previous discussions, and the docs (and the template text) definitely need updating. Note that it is by no means a given that this will change the outcome of those discussions (the test is legislators in their role as legislators, so for most actual works we've looked that I can recall the result will be the same); we just need to review them to make sure the changed legal test doesn't affect them.
This ruling only affects US copyright, but that was true previously too. The "edict of government" doctrine has (for our purposes) always been a US issue: the US copyright office will refuse to register such works. However, it has always also applied to a foreign work's US copyright status. If a foreign work meets the new "edict of government" test, it will be ineligible for copyright protection in the US. What it does not affect is the copyright status in its country of origin: if the country of origin has a PD-USGov or its own "edict of government" exception it will be PD in that country, and if they do not then it will be in copyright there. In other words, this change is a big deal for enWS but far less so for Commons (on Commons, only US works will actually be affected by this). --Xover (talk) 16:19, 28 April 2020 (UTC)
not too surprised. the US judges have a tendency to make it up, when confronted with egregious behavior. (see also fair use) for all those copyright maximalists, take note, a literal parsing of the code may not stand up in court. and might not be the right method. and have fun on your backlog. but on the other hand, give it time, the MPAA / author's guild will be talking to their Federalist friends, and they will have another go at this political court. (public domain anywhere is a threat to copyright everywhere.)Slowking4Rama's revenge 23:07, 28 April 2020 (UTC)

The Electronic Frontier Foundation's blog post about this is worth reading (and FWIW is CC BY). This is an issue we dealt with here in my home state of Oregon back in 2008. Nice to have the feds pushing in the right direction on this one. -Pete (talk) 20:01, 5 May 2020 (UTC)

I have drafted an updated Template:PD-EdictGov; take a look. One issue is whether we still even want to cite to the Copyright Office compendium, when the Supreme Court is the controlling authority. Phillipedison1891 (talk) 19:20, 18 May 2020 (UTC)

Wikilivres is live again

https://wikilivres.org/wiki/Special:RecentChanges Does anyone know anything about this? —Justin (koavf)TCM 02:52, 3 May 2020 (UTC)

It looks very very broken. :-( --Zyephyrus (talk) 15:58, 3 May 2020 (UTC)
Extremely broken. As I've said to our friend Koavf elsewhere, hardly any of the links work. And those recent changes end in May 2019. I kept editing it, while it kept getting frustratingly slower and slower, until the start of August 2019. So I guess some work is lost forever. I'll probably look in from time to time to see if it gets unbroken. But that wiki has caused me to shed quite enough blood, sweat and tears over the past few years. It came. It went. It came back. It changed its name. Someone took over and wanted all admins to pay for the privilege of being one. It disappeared and came back under its old name. If it does come back again, its history suggests it won't be long before it disappears once more. I don't think I'll contribute to it again. Simon Peter Hughes (talk) 16:09, 4 May 2020 (UTC)
Chinese Wikisource and I will boycott Wikilivres as being too unstable.--Jusjih (talk) 03:42, 5 May 2020 (UTC)

Is there a summary anywhere of what happened, and/or does anybody have a clear idea what should ideally happen going forward? If it helps, I was in touch with Eclecticology's family after his passing (he was a friend, I wrote his obituary for the Signpost). If there is something a family member would be in a position to do (e.g. taking ownership of the domain), I'd be happy to pass along a request. -Pete (talk) 19:54, 5 May 2020 (UTC)

@Peteforsyth: I own the .ca domain: I got it from his family. I have never owned the .org. —Justin (koavf)TCM 06:58, 12 May 2020 (UTC)

Index:Letters of a Javanese princess, by Raden Adjeng Kartini, 1921.djvu

Was there a style guide for this because it seems to change style mid-book ?

I'm also finding that although 'validated', a lot of lint-errors are showing up, something that I thought validation was supposed to catch? (Not that I'm the best person to comment, given some of my less than perfect validation in the past). ShakespeareFan00 (talk) 22:09, 4 May 2020 (UTC)

The work has been completed by users who are not usual editors of enWS, and they have applied different styles and techniques. Validation is typically only about the text and typical formatting, that it has a lint error means nothing to validation. Lint errors are a process that WMF uses to identify coding that does not align with a standard, nothing more, nothing less.

I started working through transcluding this work yesterday, and I am slowly fixing the formatting, please leave it and ignore it for the while. It also has publishing difficulties that make it slower to transclude. I will need to bot it again as trying to fix all the errors in one hit is problematic with different editors using different syntax. — billinghurst sDrewth 22:24, 4 May 2020 (UTC)

Transcluded and running some span <-> div swaps. Feel free to run some checks after user:sDrewthbot has finished its run. I am guessing that I won't have got it all. It had lots of contributors, and lots of styles. — billinghurst sDrewth 18:32, 6 May 2020 (UTC)
Marking this as Done unless someone has something specific that they see as problematic. — billinghurst sDrewth 15:44, 9 May 2020 (UTC)
  • Note: this problem seems to be universal among Kompetisi Wikisource 2020 participants; I believe some collective action is needed to rectify this problem. TE(æ)A,ea. (talk) 23:32, 8 May 2020 (UTC).
    What do you suggest? Without a plan of action, we'd be tackling the problem in a very hit-or-miss fashion. There are lots of editors, each doing a little. --EncycloPetey (talk) 00:20, 9 May 2020 (UTC)
    • A template message sent to all contributors would, I think, be the most helpful; information about formatting, &c. should be included primarily. It would be best if Wikisource administrators could inform the leaders of the competition, but that’s not on English Wikisource. TE(æ)A,ea. (talk) 11:40, 9 May 2020 (UTC).

Noting the works for a later bot cleanup:

I am uncertain that the former of these is out of copyright in the US. @Rachmat04: who I am guessing is having some role. I will have to look further tomorrow, too late now. — billinghurst sDrewth 15:42, 9 May 2020 (UTC)

Dear everyone. Apologies for creating a mess. The participants were notified about the style and they were asked to follow the guidelines here, but most of them are new at Wikisource and sometimes they think they need to format the paragraphs as in the book. I will inform them again shortly and clean up the formatting. Thank you! ··· 🌸 Rachmat04 · 16:06, 9 May 2020 (UTC)
@Rachmat04: When running things like Kompetisi Wikisource 2020 and Pengguna peserta kompetisi beasiswa ke Jakarta nonton bareng Truth in Numbers on English Wikisource then please at the very least notify us (on our Scriptorium) that you are going to do so, and it would be strongly preferable if you designated someone that could coordinate with the community here. Big bonus points for having a page with information about the competition (or whatever it is) in English if English Wikisource will be more than trivially involved. Anyone can contribute, and we're obviously happy to see any new contributor on the project, but having a bunch of people suddenly swarm in with little familiarity with the local practices is just a recipe for disaster. --Xover (talk) 07:14, 10 May 2020 (UTC)
<standard URAA comment about following WMF legal advice to the letter>--Slowking4Rama's revenge 14:21, 13 May 2020 (UTC)
even better add the link m:Wikilegal/Use of Foreign Works Restored under the URAA on Commons -- billinghurst sDrewth 04:36, 14 May 2020 (UTC)

Time to talk nomenclature of author classification by occupation

We have been dodging the fix for a while and it is probably time to start what I think could be a long conversation on what we could do, probably followed by another on how we will do it. (People may prefer that this is put to a separate RFC subpage rather than here at WS:S)

We have long had occupation categories for author pages (subcats of category:authors by occupation and category:authors by nationality). Below that we have the plainest of names, which do not indicate that the pages there should be from our Author: namespace.

At the same time we have not had a good set of categories for biographies of people (as the subject), though I have been working on the creation of these subcats (through category:biographies of people by nationality and category:biographies of people by occupation) though these creations will be well behind the corresponding author set.

Suggestions

To me, it is time that we start to have clarity in our author category nomenclature, and I don't know exactly what people would want, though we could be as basic as converting Category:Physicians to one of:

  • Category:Physicians (authors); or
  • Category:Author:Physicians; or
  • something more natural text Category:Physicians as authors

All have strengths, and with the HOTCAT system if we utilise {{category redirect}} we can even utilise a couple of schemes and HotCat will put to the designated target. Personally if using HotCat getting it to differentiate quickly is always my ideal.

Ultimately we would have to decide whether the existing Category:Physicians is permanently going to point to its author derivative for all time, or for a temporary time whilst we migrate and settle down the setup, and at a point possibly become a category "disambiguation" page that points to the alternatives for physicians.

And the final question: are we creating separate (and maybe matching) category hierarchies, one for author namespace pages, another for main ns pages, splitting portal ns pages between one or the other? Noting that in the general subject categories when we get some layers down we start to pick up both forms so the names do become necessary. For example Category:Medicine will have below it (somewhere) both biographies and author pages of physicians.

Further, for something like sailors we could have Category:…
  • Sailors as authors or Sailors (authors) or Authors:Sailors
  • Biographies of sailors
  • Stories of sailors (new category created just now to take pages)
and reasonably all three have been added to category:Sailors. The more that I look at it, I think those generic names should be disambiguation categories, especially as HotCat c:Help:Gadget-HotCat has means to manage disambiguation targets.
Do I take this as either a) people don't care, and I can go ahead and do as I please? or b) this is way too big a conversation, and you numb my brain? or c) what the hell are you talking about? — billinghurst sDrewth 04:43, 17 May 2020 (UTC)
A bit of both column B and C. I think your problem statement and the rough thrust of your proposed course of action make sense. I see nothing in the above I actively disagree with.
I think category names should always be "fully qualified" rather than rely on its parent categories to provide part of its definition. I also think long category names are a good thing, and natural language construction of them the best approach. We can have Category:Physicians as a top(ish) level cat, but it would just be a container for Category:Authors who are physicians and Category:Biographies of physicians and Category:Novels about physicians, and so forth (possibly with intermediate "…by topic" container cats).
But I would also very strongly urge that we start by making a guide/principles/documentation/policy/whatever page that describes the entire scheme (principles, examples, guidance on tricky cases, what pages should have what kind of categories, and what pages should have no categories, etc.), before asking the community to actually decide. This stuff is going to be de facto policy (even if we don't call it that) and will be "enforced" (I use the term loosely) through various mechanisms, so fleshing it out more is necessary before properly deciding. And if we have good guidance and a sensible category naming scheme and hierarchy, it's going to be much easier for the community to help clean up / maintain it without stepping on each others' toes. --Xover (talk) 07:16, 17 May 2020 (UTC)

Existing maintenance

I have been very slowly cleaning main namespace articles from our author categories, and am down to fewer than 200, or thereabouts

which will be a reasonable sweep, though not perfect.

Some of the remaining maintenance is people categories (our somewhat contentious people categories) eg.

+++

billinghurst sDrewth 17:49, 6 May 2020 (UTC)

Bot task: archive pages category

Please can someone use a bot or script to add all sub-pages of Wikisource:Scriptorium/Archives (e.g. Wikisource:Scriptorium/Archives/2020-01) to Category:Scriptorium archives? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:54, 7 May 2020 (UTC)

For what purpose? Not certain why we would want to categorise the pages when they all sit in a hierarchy. — billinghurst sDrewth 12:40, 7 May 2020 (UTC)
I'm sure I'm not alone in finding such frequent negativity tiresome, and thus being disinclined to volunteer more of my time here. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:23, 7 May 2020 (UTC)
Don't shit me Andy. That is not negativity, that is asking a question to understand what we are doing, for what outcome, and considering how to best achieve it. That is what we do with bot requests. As you are asking someone else to do something for you, are we not entitled to know the why? Why couldn't you simply answer rather than doubling down?

We have a page hierarchy set up, it is simply easier to transclude the special page and dump the list, which I have done. But I cannot give you the best result without knowing what you are trying to achieve. Not certain why we want to categorise them when they sit in a page hierarchy. — billinghurst sDrewth 13:52, 7 May 2020 (UTC)

"Not certain why we would want to" is not a question; it's negativity, pure and simple, so it appears to be you who is "shitting". And while we can "transclude the special page and dump the list" and that may indeed be "easier", it is not better, and does not have the same beneficial effect, as can be seen at the foot of every transcluded page, where the link to the category is... oh, wait, there isn't one. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:28, 7 May 2020 (UTC)
Not all ideas are good; getting annoyed when people question your ideas doesn't help in a collaborative project.--Prosfilaes (talk) 17:20, 7 May 2020 (UTC)
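For reference, the page-level step of the requested bot task can be sketched in Python. This is only an illustration under stated assumptions: the function name `add_category` is hypothetical, listing the sub-pages of Wikisource:Scriptorium/Archives and saving the edits are deliberately left out, and matching the category name case-insensitively is a simplification of how MediaWiki treats titles.

```python
import re

CATEGORY = "Scriptorium archives"  # category named in the request above


def add_category(wikitext: str, category: str = CATEGORY) -> str:
    """Return wikitext with [[Category:...]] appended, unless already present.

    Hypothetical helper: it only transforms text and never touches the wiki.
    """
    # Tolerate spaces or underscores between words and an optional sort key.
    words = [re.escape(word) for word in category.split()]
    pattern = re.compile(
        r"\[\[\s*category\s*:\s*" + r"[ _]+".join(words) + r"\s*(\|[^\]]*)?\]\]",
        re.IGNORECASE,  # simplification: real titles are case-sensitive after the first letter
    )
    if pattern.search(wikitext):
        return wikitext  # already categorised, so the edit is skipped
    return wikitext.rstrip("\n") + "\n\n[[Category:" + category + "]]\n"
```

Because the function returns its input unchanged when the category is already there, a bot built around it could be re-run over the archive sub-pages without producing duplicate category links.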

You may have lost a long-term contributor because of apparent 'blindness' to a technical problem...

What should have been relatively simple to repair, namely the lint errors in Page namespace here, has resulted in a lot of 'head-against-desk' moments.

This is not an unknown issue, given that there was a phabricator ticket about the attempted tidy up that occurs.

The blindness is not on the part of contributors and admins here, who are very aware of the issue raised here. The 'blindness' exists amongst certain other people I asked about this, whose response was that what was happening was the intended behaviour, which on a wiki that wasn't using proofread page would be understandable.

The other blindness was suggesting that the 'fix' was, as I understood it, to rewrite pages to conform to what the parser was actually doing, rather than what previous contributors thought or assumed it was doing. Again on another wiki, where pages get edited on a regular basis, this would be partly expected. However on Wikisource, pages are generally not edited once they've reached a "stable" validated version, and editing half a million or so pages because of a technical update is perhaps stretching the expectations of what a 'volunteer' effort can do. Re-writing pages was what the attempted changes on the pages were trying to do. However, for whatever reason, EVERY solution tried so far has seemingly had its own issues, and NONE of them are a complete or stable solution.

A third 'blindness' was the suggestion to use POEM tags instead of BR. Yes, this would in some instances resolve the issues of multiple BR tags. However POEM generates its own DIV wrapper internally, which in some instances is undesirable. Also, in its current form there is no <poem role=start><poem role=continuation><poem role=end> syntax to allow content formatted using it to span across Page: breaks. Having to guess where to put specific line-feeds to do the tags page by page inline does not represent a stable solution for contributors like myself.

As was pointed out, I am not compelled to sort out technical problems, so unless a long-term solution to this is forthcoming on a reasonable time-scale, I am considering whether to desist from contributing anything to this project at all, because I can no longer assume Mediawiki can actually support the Proofread page style arrangements in a stable, consistent and usable manner. ShakespeareFan00 (talk) 11:37, 7 May 2020 (UTC)

You choose to be concerned about lint issues, some of us are concerned about transcription and transcluding and couldn't give a toss about lint for lint's sake. We choose our battles. We are all free to make our choices. — billinghurst sDrewth 12:37, 7 May 2020 (UTC)
Thank you for a voice of sanity. And thanks for your suggestions regarding transcluded references on Phabricator.

ShakespeareFan00 (talk) 13:50, 7 May 2020 (UTC)

Query: is there a way to filter the LintErrors reporting so the analysis excludes noincluded portions (like header, footer or documentation content)? You have already, I think, expressed a view somewhere that it was a waste of effort to repair talk namespaces (and, for that matter, certain archives in the Wikisource: namespace). I am starting to have the view that trying to repair endless header and footer content that no one is going to see except when transcribing is a similar drain on resources. ShakespeareFan00 (talk) 13:50, 7 May 2020 (UTC)
The majority of our templates are sizing or positioning. What will typically break is the span sizing, so the words will be a different size from expected. Almost a case of big deal. We have templates that break pages or push transclusion limits that kill the display of a page, and we are concerned about a bit of sizing of text? We have tens of works that are not transcluded. I know where my time is better spent. — billinghurst sDrewth 14:18, 7 May 2020 (UTC)
Template:Polytonic is a span template, and the usage there has <br /> and new lines. It will always trip up. It will need to be done as a <p> or a <div> — billinghurst sDrewth 14:40, 7 May 2020 (UTC)
What's happening is also related to what happens here... Page:Beowulf (Wyatt).djvu/115. In that there is NO line feed after a DIV which is opened in the header... or conversely there's no line-feed in the footer to force the behaviour...
<DIV><!-- No line break--><SPAN>
....
</SPAN>
</DIV>

when for mediawiki purposes what's technically needed is

<DIV>
<SPAN>
....
</SPAN>
</DIV>

Because the appropriate linebreaks between the opening DIV and opening SPAN are present on transclusion, this headache isn't present in transclusion. This as you say isn't a major concern, but it generates a LOT of 'useless' noise in Special:Lint-Errors

I did however find a work-around for putting {{center}} inside a {{float-left}}. I had been using {{span|dpb|ac}}, which seems to cure the div-span swap errors that resulted as well as NOT changing the intended layout. :) Various other block-in-a-span or block-in-a-paragraph problems can be resolved using this workaround. And given how you and other contributors set up the sidenotes, the same approach should in principle also be applicable to them. It's not a complete solution for that, and I'm sure you would be the first to point out certain technical limitations on doing it that way. ShakespeareFan00 (talk)

If {{lang}} usage is getting highlighted as a breach of a <span>, then use {{lang block}} per its /doc page. — billinghurst sDrewth 22:47, 7 May 2020 (UTC)
We don't have a corresponding block template for {{polytonic}}. If we did, I expect it would solve a lot of the lang-related lint issues. Polytonic is not about a particular language, but about handling the pre-20th century forms of text written in Greek script, which included a set of diacritical markings not present in the current form of the language. --EncycloPetey (talk) 02:16, 8 May 2020 (UTC)
We have {{polytonic/s}} and {{polytonic/e}} because I 'cloned' them last night, also {{greek/s}} and {{greek/e}}
I'm using the /s /e convention for block templates. (but see my subsequent topic below.)
Well that approach is problematic. Firstly, please stop and think before acting, and implement an approach that is long term resilient and robust, not another ugly patch. Where we have a span and a block alternative, the block has used the BLOCK nomenclature, and then has the /s /e added to it. Please do not make /s /e based on the span alternative, you are just confusing matters. Also with all of these templates we have tried to have an underlying base template, where the variants spawn from the parent. They are also documented singularly based on the base template, and variations noted. We have {{lang}}, {{lang block}}, {{lang block/s}} and {{lang block/e}} as base templates that should be utilised. — billinghurst sDrewth 02:17, 9 May 2020 (UTC)
don't know why you are irritated by "whose response was that what was happening was the intended behaviour," this project has historically been unsupported, dependent on one volunteer hacker to maintain page code. we were lucky to get temporary attention for google OCR and VE, as much as WMF likes to market the global south wikisources. step back, and check out the other transcription projects off-wiki. there are plusses and minuses - you will be better appreciated, but the code will be opaque. treat wiki as the bad boyfriend: set lots of boundaries, with lots of time outs. Slowking4Rama's revenge 23:02, 8 May 2020 (UTC)


It seems that there is a problem of documentation. I'm not prepared to continue contributing until someone that knows what they are doing can actually sit down and properly document how things like this are SUPPOSED to be done for the long term. I'd also like an apology for my wasted time.

ShakespeareFan00 (talk) 10:44, 9 May 2020 (UTC) ShakespeareFan00 (talk) 10:46, 9 May 2020 (UTC)

Putting ultimatums to your fellow volunteers, especially those who have been around cleaning up other people's messes, including yours, is just bad form. If you need a break, take a break. Otherwise, please take your martyrdom somewhere private, your entrails are showing. — billinghurst sDrewth 11:23, 9 May 2020 (UTC)

Arbitrary break

Okay deep breath. Once again you are voice of sanity, and
A good response would be that someone implements {{polytonic block}}{{polytonic block/s}}{{polytonic block/e}} in an appropriate manner. The difference from a standard lang call is that the XML-lang tag seems to be set up directly (same approach as {{lang}} but implemented directly, for performance reasons perhaps?). It also chooses the font with a stylesheet (not a font param).
I've also, in reviewing some other template code, found some other /s /e versions of templates that don't match what appears to be the naming convention.

Per the naming conventions and usage these I think should be "_block/s" "_block/e" pairs instead, notwithstanding that at present the parent block template for the group might not yet exist. ShakespeareFan00 (talk) 11:59, 9 May 2020 (UTC)


Conventions for naming block equivalent versions of span templates.

Currently there are two approaches used. One is to append -block or block to the name, with /s /e then added for the start and end variants...

Some templates have bypassed this, and use /s and /e directly for their block version pairs (albeit without a corresponding template for a block that sits within an entire page).

It would be reasonable, to perhaps establish one naming convention moving forward?

(Aside: With some templates, the /s /e BLOCK versions do not behave identically to their SPAN versions.. {{float left/s}} for example doesn't even accept the same parameters as {{float-left}}, which is confusing. Maybe someone competent should re-align templates like this so the behaviour IS consistent? ) ShakespeareFan00 (talk) 08:54, 8 May 2020 (UTC)

There is a convention, that there are a few examples of it not being used, means they should be fixed, not propagated. — billinghurst sDrewth 02:19, 9 May 2020 (UTC)
There is a convention but it's not actually documented anywhere, and as SF00 points out there are clearly divergent and inconsistent applications of it. I absolutely agree with billinghurst that diverging examples should be fixed rather than propagated, but I think we need to approach that somewhat systematically. And since a lot of the divergent cases will be in fairly broad use, I think we will need at least a modicum of discussion for each instance before we go ahead with any actual changes. Maybe not for simple renames that leave a redirect behind; but things like merging the inline and block versions to use same/similar code and taking the same arguments should be approached with some caution. --Xover (talk) 07:10, 9 May 2020 (UTC)
Right, so the next question is what the naming convention is for
  1. The initial span-based template.
  2. Its block equivalent
  3. the "paired version" for Page: namespace

? ShakespeareFan00 (talk) 14:33, 9 May 2020 (UTC)

The consensus would seem to be {{foo}} {{foo block}} {{foo block/s}} {{foo block/e}} ?

Conversely there are templates like {{center}} which are conventionally block based but for which a span-based version would be useful for certain purposes. Should these be named Template:Foo inline or Template:Foo span? ShakespeareFan00 (talk) 14:33, 9 May 2020 (UTC)

Centering should always be a div template; it is necessarily a function applied to a block of text to reposition it in the center. A span template, when it exists, allows the function to be applied to part of a line of text. I can think of no situations where we would center a word within an un-centered line of text. There are likewise going to be some templates that should only ever be span, and never div. --EncycloPetey (talk) 15:06, 9 May 2020 (UTC)
The instance is where the nominal parameter of a template is span based, so you do something like this to get a centered heading in a sidenote, for example.
...
This is only an example {{sidenote|{{span|dpb|ac}}heading</span> Rest of sidenote.}}  Rest of main body of text...

The other instance where having an 'in-line' version of certain templates is useful is inside {{FI}} image captions, although in the sandbox version I wrote a multicap option to try and resolve that issue.

(ASIDE: See also {{FIS-c}}, for an image floated inline to the center. Because of how FIS works, all the caption content must of necessity be a SPAN because of HTML conventions. Thus a SPAN-based centering would have to exist for that (albeit it's an inline block). Convoluted, but that's what's provided...) ShakespeareFan00 (talk) 16:19, 9 May 2020 (UTC)
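The naming pattern floated earlier in this thread ({{foo}}, {{foo block}}, {{foo block/s}}, {{foo block/e}}) can be written down as a small Python sketch. The helper names `template_family` and `follows_convention` are hypothetical, and the sketch assumes the space-separated " block" suffix rather than "-block"; it is an illustration of the pattern, not an agreed or documented rule.

```python
def template_family(base: str) -> dict:
    """Names implied by the proposed convention for one span-based template."""
    return {
        "inline": base,                      # e.g. lang
        "block": f"{base} block",            # e.g. lang block
        "block_start": f"{base} block/s",    # opening half for the Page: namespace
        "block_end": f"{base} block/e",      # closing half for the Page: namespace
    }


def follows_convention(name: str) -> bool:
    """False for a /s or /e variant hung directly off the span template name."""
    for suffix in ("/s", "/e"):
        if name.endswith(suffix):
            # The part before /s or /e must itself be a "... block" template.
            return name[: -len(suffix)].endswith(" block")
    return True  # plain names are outside the scope of this check
```

A check like this could be run over a list of template titles to find divergent pairs worth discussing, leaving the decision on each rename to the community.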

Lua module error

While adding the Wikidata info for Narrative of the life and adventures of Paul Cuffe, the software generated the following error message:

Lua error in Module:Edition at line 229: attempt to concatenate local 'badgeName' (a nil value).

This warning appears to be displayed on every page with a Wikidata item. Whatever mechanism was set up for pulling the badge information from Wikidata has broken, so hundreds of works now display this error message. --EncycloPetey (talk) 18:40, 7 May 2020 (UTC)

I believe Kaldari was working on something related a few months ago -- pinging in case it's relevant. -Pete (talk) 18:46, 7 May 2020 (UTC)
Looking... Kaldari (talk) 18:58, 7 May 2020 (UTC)
@EncycloPetey, @Peteforsyth: This looks like a wider Wikidata bug, as I'm seeing similar problems on other wikis that try to pull Wikidata labels via Lua. I've put in a workaround for now. Unfortunately, you may have to purge the cache of the page (by making a null edit) to get the error to go away. Kaldari (talk) 19:58, 7 May 2020 (UTC)
But this wasn't happening on just one page. It was happening on every page I looked at that has its proofread status marked on Wikidata. The workaround seems to have cleared the problem however, so hopefully it has done so for all of those pages. --EncycloPetey (talk) 19:59, 7 May 2020 (UTC)

Lint-noise

The lint noise (and I will call it noise, because it's not showing a major structural problem) about misnested SPANs with header DIV SPAN combos that I was experiencing is easily solved, like this.

Header:

{{fooblock/s}}
<!-- -->

Body

{{barspan|content}}
...

Placing the comment seems to force the tidy-up without adding any line-feeds in the Page:namespace.. It also doesn't add any lines on the transclusion, which I checked.

Given how this handling seems to work, an approach like the one you used for tables might also work, with a {{@force}} dummy template at the start of the page. However, I'm concerned that approach might introduce line-feeds on transclusion, which is of course completely undesirable.

An analogous fix can be applied to a SPAN DIV closing situation in a footer (the comment line and line-feed needing to be at the start of the footer). I put a comment rather than a blank line, so it's not seen as entirely white-space.

If this relatively easy fix can be applied more universally, then I'd like to consider applying it, alongside other reviews for missing formatting or removal of typos as it is undertaken. ShakespeareFan00 (talk) 12:05, 9 May 2020 (UTC)
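If the header fix described above were ever applied in bulk, the text change itself might look like the following Python sketch. The function name `ensure_header_separator` is hypothetical, the sketch assumes the header field of a Page: is available as a plain string, and whether the trailing comment line actually silences the lint report is taken from the description above rather than verified here.

```python
MARKER = "<!-- -->"  # the empty comment line used in the header example above


def ensure_header_separator(header: str) -> str:
    """Make sure a non-empty header ends with the comment line on its own line."""
    if not header.strip():
        return header  # nothing opened in the header, so nothing to separate
    body = header.rstrip("\n")
    if body.split("\n")[-1].strip() == MARKER:
        return header  # already fixed; leave the page untouched
    return body + "\n" + MARKER
```

The footer case would be the mirror image, with the comment line placed at the start of the footer instead of the end of the header.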

Importing images from Wikidata in author-pages

We had an excellent system of automatically importing images from Wikidata in author-pages. But now some items on wikidata have more than one image, and this creates problems. See e.g. Author:Edwin Ray Lankester. It does not look very useful to me to have more than one image on wikidata, but apart from that, we now have no image at all at our author-pages. What is the opinion about this? --Dick Bos (talk) 11:41, 10 May 2020 (UTC)

@Dick Bos: We fix it. Pop over to WD and make one preferred. If you are after the list that needs attention, then category:Pages with missing files. billinghurst sDrewth 13:03, 10 May 2020 (UTC)
@Dick Bos: Fix the template. For example, species:Template:Image handles this well. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:53, 10 May 2020 (UTC)
I am afraid that the supposition that we will keep up with the bots adding non-sencically another (usually worse) image to items which have already had one did not prove right. I have corrected some, but I have other work to do, and so probably do other people too. The problem has been continuing for several weeks (i. e. some of our author pages have been lacking images for several weeks) and nobody knows how long it will continue. And after we finally fix all the broken images, nobody knows when the bots create new mess. Some of the images I checked were really bad and so I removed them from Wikidata completely, but I guess the bot may add them again after some time. Unfortunately, Wikidata lacks any systematic control of the bots and too many people are allowed there to make hundreds of thousands of contributions without any supervision. So if we want to use Wikidata, we also have to create our defence against irresponsible WD bot operators and set the templates to ignore newly added images until a human checks them (and possibly decides that the new one is better and sets it as a preferred one). --Jan Kameníček (talk) 21:15, 10 May 2020 (UTC)
I can't parse "non-sencically"; but to which bots do you refer? Can you provide a diff of a "bot creating new mess"? No-one is "allowed there to make hundreds of thousands of contributions without any supervision"; please don't spread such disinformation. I described the best solution, in the post under which you have commented. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:30, 11 May 2020 (UTC)
I thought it was clear that I wrote the contribution to support your suggested solution, I am sorry if it was not.
Here is a diff of one of many examples of a bot creating mess, here with a picture of Jan Hus. Portraits of many historical people are just imaginary, as often nobody really knows their real appearance. But the bot added a portrait for which it is not even certain whether the unknown painter wanted to depict Jan Hus or somebody completely different, despite the fact that there already was a portrait widely known as depicting Jan Hus and there are also other different portraits in Commons, but the bot added the worst one of them all (and the result was that Wikisource did not show any). I would understand it if there was no image at WD, but there was quite a good one. If the only reason for adding this ambiguous portrait was, as EncycloPetey suggests, that they have this portrait in Catalan Wikipedia, it is one of the worst reasons I can imagine. --Jan Kameníček (talk) 17:42, 11 May 2020 (UTC)
As I've noted elsewhere, these images are most often imported from the Catalan Wikipedia. Making an edit over there to replace a bad image with a better one will typically correct the problem. The other solution would be to request someone with a bot at Wikidata to identify data items for humans that have more than one image, and process them by setting one image to preferred status. --EncycloPetey (talk) 15:43, 11 May 2020 (UTC)
One of the purposes of Wikidata is helping to solve some issues centrally so that they do not have to be solved locally in every single Wikipedia. So why should contributors to a local project go correcting issues to another local project? If Wikidata has a better image than Catalan Wikipedia, there is no reason why Catalan Wikipedia should push their image there. Second image should be added to Wikidata only if there is a good reason to do so. If the reason is that the other image is better, they should remove the worse one at the same time. If the reason is that the image depicts the person e. g. at a different age, they should state so using the qualifier and set the rank of preference. But this should be done immediately upon adding the image, not many weeks later after some other contributor accidentally notices that the image in their local project disappeared. If Wikidata contributors are allowed to run bots without taking responsibility for these basic issues, we need to protect ourselves. About 340(!) author pages are currently experiencing this problem (and there were many more, as people have already fixed many of them). Our contributors should spend their time in a better way than fixing 340 pictures in Catalan WP. And the main thing is that the author pages should not be lacking images until somebody notices and fixes Catalan WP. We need to set our templates in such a way that they would protect us from such problems. --Jan Kameníček (talk) 17:42, 11 May 2020 (UTC)
don't know why you are complaining about wikidata here. a little image quality circle at wikidata could knock out this image issue in no time, and maintain a preferred image. demanding bot operators do it before the fact, is a recipe for drama. you don't want to delete the image, but rather you should deprecate it to avoid edit warring with a bot.Slowking4Rama's revenge 14:00, 13 May 2020 (UTC)
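One possible fallback rule for the "more than one image" case discussed above, sketched in Python rather than in the template's own code. It assumes the statements are shaped like the image (P18) claims in Wikidata's entity JSON, each with a "rank" and a "mainsnak"; the function name `pick_image` is hypothetical, and falling back to the first normal-rank statement is an assumption about what a template could do, not a description of how the author template currently works.

```python
def pick_image(statements: list):
    """Return one file name from a list of image statements, or None."""

    def filename(statement: dict):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            return None  # "no value" / "unknown value" snaks carry no file
        return snak.get("datavalue", {}).get("value")

    # A preferred statement wins; otherwise fall back to the first normal one.
    # Deprecated statements are never used.
    for rank in ("preferred", "normal"):
        for statement in statements:
            if statement.get("rank") == rank:
                name = filename(statement)
                if name:
                    return name
    return None
```

With a rule like this an author page would keep showing an image when a second one is added on Wikidata, while setting a preferred rank there would still override the fallback.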

A question about Preferences

Is there a history of edits in either of my Local and Global Preferences? — Ineuw (talk) 22:34, 10 May 2020 (UTC)

Nothing visible, they are your preferences, not edits, so there is no need for public information. — billinghurst sDrewth 11:54, 11 May 2020 (UTC)
Further questions about preferences would be better among the tech heads at mw: rather than here. — billinghurst sDrewth 11:55, 11 May 2020 (UTC)

History of global and local preference changes

Is there such a thing as a history record of a user's Preference edits?— Ineuw (talk) 02:20, 16 May 2020 (UTC)

@Ineuw: No, sorry. Your changes to the preferences are not "edits" in any meaningful sense. They're like the preference settings in your web browser or other software. It's not unlikely the software logs them somewhere, but that'd be in low-level logs that are not exposed anywhere and would need a pretty severe situation of some kind to justify the developers accessing them. --Xover (talk) 07:50, 16 May 2020 (UTC)
@Ineuw: This is pretty much the second time you have asked that question. I am uncertain why, when you know the solution, that you continue to look for the problem. All of these questions are Mediawiki questions, not enWS questions, and you are better to take them all to mediawikiwiki for the detailed answers. You have had them addressed in the phabricator ticket that you raised. — billinghurst sDrewth 08:06, 16 May 2020 (UTC)

Replacing image of musical score with Lilypond in non-scan-backed works?

A recent discussion elsewhere has brought to light an issue that could benefit from wider community input.


How should we deal with replacing a score image with Lilypond in non-scan-backed works?


Our practice is generally to use Lilypond (gory details) to present musical scores to our readers—including replacing existing images of scores with Lilypond—even if they are not pixel-for-pixel identical. This is generally good and desirable for several reasons, including the ability to automatically generate an audio version of the score.

However, we have a large legacy of works that are not scan-backed, and that include the score as an image in the page. If we replace that image with Lilypond then we also break any possibility of validating the score, which is an even more fundamental practice on the project.

I have not found any specific guidance on this issue anywhere, so I am hoping the community can chime in with some views on how to handle this issue going forward. It's not the biggest problem we have, but it has led to disagreements, so a common course on it would be useful.

Some possibilities that point themselves out:

  • Ignore it. Just replace the image with Lilypond and ignore the lack of validation. The work in question is not scan-backed in any case, so this matters little.
  • Don't do it. Independent validation is fundamental to our processes, and when something precludes that it should not be done. We can live with static images of music scores on these works.
  • Link image in textinfo. We have other information about works in the {{textinfo}} template on the work's talk page, so the image can live there and be available for validation.
  • Include image thumbnail. The image of the score can be included directly on the work's page as a thumbnail so it is available for validation. This is how many non-scan-backed works include other illustrations so it's good enough for this case too.
  • Tag the score with a notice. We can put a text tag on the Lilypond score that makes clear that it is not yet validated, and links to or explains how to find the original image for verification.
  • Something else. A much better idea that the community will come up with in this discussion. 😎

I don't think this is the sort of issue that has a clear-cut right and wrong answer, so it is best dealt with by discussion and seeing if we can extract some rough consensus (as opposed to holding a vote or something like that). In other words, I would very much appreciate any and all thoughts and opinions on this; including "I really don't care about this!" because that's also useful guidance from the community. --Xover (talk) 08:48, 11 May 2020 (UTC)

  • Note: Pinging possibly interested contributors: Beeswaxcandle, EncycloPetey, Beleg Tâl. If there are others who may have a particular interest or relevant perspective on this, please ping them to this discussion. --Xover (talk) 08:52, 11 May 2020 (UTC)
  • I have seen a number of pages like this, especially from Portal:Sheet music. I believe that the preferable solution would be to create an index which includes the images, transclude the pages (if any Lilypond has been written), and leave it like any other work. (It could look like this if there is no Lilypond.) By the way, is there any reliable way for cross-page creation of Lilypond files? I think that it would help with this? TE(æ)A,ea. (talk) 11:35, 11 May 2020 (UTC).
  • I think the best solution which would keep the advantages of using Lilypond instead of a static image would be to link to the source image using {{textinfo}}. This keeps everything simple enough that people who wish to contribute new works in this fashion can do it without difficulty while allowing easy double-checking (without having to comb through the page history...). Similarly, if there are non-trivial differences between the Lilypond and the image then it would pose no problem to tag the page and notify the relevant persons of perceived problems (assuming the person who notices it does not know how to fix it). 107.190.33.254 14:01, 12 May 2020 (UTC)

Index:Japan-Korea GSOMIA (English Text).pdf

The source file of this index was deleted from Commons by Krd (talk · contribs). * Pppery * it has begun... 19:55, 11 May 2020 (UTC)

@Pppery: Thanks for the notification. I have recovered the file and moved it locally, and opened a conversation at WS:CV to have the conversation here about our considerations. — billinghurst sDrewth 02:21, 12 May 2020 (UTC)

Tech News: 2020-20

20:40, 11 May 2020 (UTC)

More indexes with deleted source files

* Pppery * it has begun... 02:29, 12 May 2020 (UTC)

Youtube links and Trump administration video, policy?

I'm interested in contributing transcripts related to the official work of the Trump administration. It would add a lot for these transcripts to have links to video of the events. However the Trump administration has discontinued the Obama administration policy of hosting videos at whitehouse.gov (while also posting on Youtube). The only available video links currently are at Youtube and account-required sites like Facebook. Is it the case that no links to video of (for example) Presidential briefings can be made now since Youtube is blacklisted? Or is there a work-around where I ask for each relevant Youtube link to be whitelisted? thanks Dennis the Peasant (talk) 14:55, 13 May 2020 (UTC)

What type of YouTube pages are they? If they're free, they should be uploaded here. If they're copyrighted uploads with no permission on YouTube, we shouldn't be linking to them. If they're legit but copyrighted, you could ask link by link, or we could whitelist YouTube altogether. What's the account name on YouTube?--Prosfilaes (talk) 00:25, 14 May 2020 (UTC)
If these videos are marked Public Domain on YouTube (and it sounds like they should be, from your description) then I believe it's possible to download them, and upload them to Commons. That would be the ideal scenario, because we would not be reliant in the long run on YouTube or government entities to keep the videos online. -Pete (talk) 01:18, 14 May 2020 (UTC)
Thanks for replies. Copyright-wise, there are two categories of video I am concerned with.
  1. The White House posts content at their Youtube channel https://www.youtube.com/user/whitehouse/videos. This type of self-produced and official content used to be also hosted at whitehouse.gov and would be archived to a new URL after a new president was elected. Now it's only on YouTube.
  2. There is also video of presidential activities that are produced by another entity such as C-Span.
Category 1 could be uploaded to Commons and I have seen this done with some of this content. It's not clear to me that this could be done with Category 2, in which case whitelisted links might be a better solution.
I do not see any indication on the YouTube videos as to whether they are in the PD. They are all free, as in not requiring a paid subscription or a YouTube account to view. Dennis the Peasant (talk) 16:02, 14 May 2020 (UTC)
there is no public domain tag on youtube so it is on you to discern that. https://support.google.com/youtube/answer/2797449?hl=en the white house feed does not clarify matters https://www.youtube.com/channel/UCYxRlFDqcWM4y7FfpiAN3KQ you have a custom license on commons https://commons.wikimedia.org/wiki/Template:PD-USGov-POTUS uploading video is hard https://commons.wikimedia.org/wiki/Commons:YouTube_files#youtube2mediawiki and https://commons.wikimedia.org/wiki/Commons:List_of_Wiki_video_projects -Slowking4Rama's revenge 13:23, 3 June 2020 (UTC)

Discord Chat/Channel in the "Wikimedia Community" Discord server[edit]

Would anyone here be interested in getting a Wikisource-specific chat in the Wikimedia Community Discord server, en.wikipedia.org/wiki/Wikipedia:Discord, in order to help with coordination, project news, project advertising, etc.? If we don't want to be involved in the main server, we could create a Wikisource-specific server for ourselves instead.

Link for anyone interested: discord.com/invite/e8xxGMP --Reboot01 (talk) 04:11, 17 May 2020 (UTC)

@Reboot01: I honestly think there should be a Wikisource-specific server. I created a Wiktionary-specific server in 2018, for instance (for info on that, see wikt:Wiktionary:Discord server). PseudoSkull (talk) 22:02, 17 May 2020 (UTC)
I agree; it will then allow us to have separate chats for different wikiprojects, and cross-language Wikisource collaboration if other-language Wikisources join. --Reboot01 (talk) 23:02, 17 May 2020 (UTC)
I don't feel it appropriate for me to create the server since I'm not an admin, but I'm more than willing to help out in any way I can! The Wiktionary server has proven to be very useful and effective, and is being used actively at present. PseudoSkull (talk) 00:22, 18 May 2020 (UTC)
Are there any admins who would be willing to sort of 'sponsor'/create the Discord server? --Reboot01 (talk) 21:31, 19 May 2020 (UTC)

Interlanguage link template[edit]

Do we have a template like, for example, w:Template:Interlanguage link? It allows users to enter a red link for a non-existent but plausible page which is followed by a parenthetical link to an equivalent page on a sister project. When the red link is created, the sister link is hidden and the regular link is displayed. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:29, 17 May 2020 (UTC)

Period in the title in the template[edit]

Should it be "First Sentence in the Title. Second Sentence in the Title" or "First Sentence in the Title. Second Sentence in the Title." in the template for the title? Note the period at the end of the second one. I see a mix of the two in use. --Richard Arthur Norton (1958- ) (talk) 02:41, 18 May 2020 (UTC)

Are you asking about {{BASEPAGENAME}} or the title parameter in the header template?
  • If the former, then usually just the title, rather than the subtitle of the work. Never have a terminating space in a BASEPAGENAME / url as it will fail miserably in a copy and paste, especially to something like twitter.
  • If the title field, I have seen both, and I have seen a colon used too, and I think that the colon is the current guidance for "title: subtitle". My practical guidance is that we don't want to be overpowering the header field, or having prodigious line wraps. If it is short then it's fine; if it is really long, then don't, as the title page is sufficiently descriptive. If it is long and you still want it, then consider putting the title/subtitle into the notes field.
Be mindful of guidance like https://www.annemini.com/category/how-to/how-to-format-a-title-page-if-your-book-has-a-subtitle/ and of the value of what you are reproducing to the reader. — billinghurst sDrewth 04:00, 18 May 2020 (UTC)

Tech News: 2020-21[edit]

17:18, 18 May 2020 (UTC)

Vandalism[edit]

Hi Admins, I reverted some modifications by User:2604:6000:100E:8160:D069:14F:F972:D1B0. Should his new pages be deleted? --M-le-mot-dit (talk) 05:42, 20 May 2020 (UTC)

@M-le-mot-dit: Thanks for letting us know. The new pages have been deleted. PS. You may want to post messages intended specifically for the admins at WS:AN, since that page is a lot less trafficked and they may be noticed quicker there. --Xover (talk) 06:01, 20 May 2020 (UTC)

Duplicate Index.[edit]

Index:First six books of the elements of Euclid 1847 Byrne.djvu and Index:The first six books of the Elements of Euclid.pdf

Which to retain given that the first linked entry already had some effort attached to it? ShakespeareFan00 (talk) 20:30, 20 May 2020 (UTC)

Senator Paul on Impeachment of President Trump (our censored version)[edit]

I understand that we've decided to censor Senator Paul on Impeachment of President Trump per this discussion, but the way it is currently titled is misleading. A casual reader might think the document was actually redacted by the government -- which I imagine is how the {{redacted}} template is most often used, but this isn't true; the U.S. Printing Office has published the speech in full. One obvious way to clear this up would be to move it to a more clear versioning title such as Senator Paul on Impeachment of President Trump (censored by Wikisource), but I thought I'd run this by the community for any other ideas. Do we perhaps have a template we normally put on our censored documents? -- Kendrick7 (talk) 22:55, 23 May 2020 (UTC)

I agree that the reader deserves better cues about how this has been handled and whose decision resulted in the redaction, thank you for bringing this up. I feel hesitant about the word "censor," which is a bit of a charged word; I would prefer the word "redacted," and ideally it should link to the discussion where the decision was made. -Pete (talk) 00:37, 24 May 2020 (UTC)
Also, I think it would be better in the "description" field in the header, and/or a note or link in the redaction itself, rather than in the page title. (Keep in mind, the page might be printed, and thereby separated from its title.) -Pete (talk) 00:39, 24 May 2020 (UTC)
I see no need to change the title of the work, especially as it was argued that the redacted parts are inconsequential to the work. I have added a comment to the {{textinfo}} on the document's talk page — billinghurst sDrewth 12:56, 24 May 2020 (UTC)
since no one is transcribing the side by side version, it does not matter what you call it- delete it. Slowking4Rama's revenge 01:43, 26 May 2020 (UTC)
Thanks, billinghurst, that adequately addresses my concerns. -- Kendrick7 (talk) 13:13, 30 May 2020 (UTC)

When a source was/is dubious...[edit]

So trying to figure out why the Author:Nathan Haskell Dole doesn't mention project The Russian Fairy Book (of course, since project isn't done, maybe that?) when I come across another of the author's works, a translation from Tolstoy, Where Love is, There God is Also.

Only..., when I look at that, it strikes me immediately as from a bogus, corrupted source. Out pops at me:

"If he can finish and order by a certain time, he accepts it"

and the rest of the sentence has a hyphen where an em-dash would have to be. The source for this was not an edited, printed book.

Then I search on the web, and there are billions of copies of copies, all saying "finish and order". Some you can buy for $40! ;-)

But then when I specifically search for the gotta-be-right "finish an order", out pops matches: e.g. from 1896. Em-dashes! Good English! Hmmm. Another [18]

Then I finally find a decent scan from the author's book of translations. Even the text version at archive.org isn't broken enough to have 'and' instead of 'an'.

So something contributed in 2007 without source attribution, when found to be *not* faithful to a believable source, what to do? Shenme (talk) 06:39, 24 May 2020 (UTC)

@Shenme: Tag it with {{no source}}; upload the good scan; set up an index for it; proofread the relevant pages; and replace the dubious unsourced version with the newly proofread one. If the work in question is too large to reasonably replace like this then tagging it as without a source is the minimum, and it would be very helpful to set up the scan and index and then tag the bad text with {{migrate to}}. --Xover (talk) 07:27, 24 May 2020 (UTC)
@Shenme: Iván Ilyitch and Other Stories; Iván Ilyitch and Other Stories/Where Love Is, There God Is Also; and Index:Iván Ilyitch and Other Stories (1887).djvu. --Xover (talk) 18:46, 4 June 2020 (UTC)

Some CSS for Vector has been simplified[edit]

Hello!

I'd like to make a double-check about a change that was announced in Tech/News/2020/21.

Over-qualified CSS selectors have been changed. The selectors div#p-personal, div#p-navigation, div#p-interaction, div#p-tb, div#p-lang, div#p-namespaces and div#p-variants have all had the div qualifier removed, so they are now, for example, #p-personal, #p-navigation …. This is so the skins can use HTML5 elements. If your gadgets or user styles used them you will have to update them. This only impacts the Vector skin.

On this wiki, this impacted or still impacts the following pages:

How to proceed now? Just visit all these pages and remove div before these CSS selectors if it hasn't been removed so far.

Thank you! SGrabarczuk (WMF) (talk) 13:05, 25 May 2020 (UTC)
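For anyone with many user styles to update, the manual step described above (removing div in front of the listed selectors) could be sketched roughly as follows. This is only an illustration and not a tool provided by WMF: the function name strip_div_qualifier is made up for this example, and it assumes the selectors appear literally as div#p-… in the stylesheet text.

```python
import re

# Vector portlet IDs named in the Tech News announcement.
PORTLET_IDS = (
    "p-personal", "p-navigation", "p-interaction", "p-tb",
    "p-lang", "p-namespaces", "p-variants",
)

# Match "div" only as a whole token that is immediately followed by one of
# the listed IDs, so "div#content", ".mydiv#p-tb" and "div#p-tbx" are left alone.
_DIV_QUALIFIER = re.compile(
    r"(?<![\w-])div(?=#(?:"
    + "|".join(re.escape(i) for i in PORTLET_IDS)
    + r")(?![\w-]))"
)


def strip_div_qualifier(css: str) -> str:
    """Return the CSS text with the div qualifier removed from the listed selectors."""
    return _DIV_QUALIFIER.sub("", css)


if __name__ == "__main__":
    print(strip_div_qualifier("div#p-personal li { display: none; }"))
```

Review the result by hand before saving it, since a regular expression knows nothing about CSS comments or strings.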

Tech News: 2020-22[edit]

14:17, 25 May 2020 (UTC)

Index:Wm. M. Bell's "pilot"; an authoritative book on the manufacture of candies and ice creams (1911).djvu[edit]

I don't really know what I'm doing wrong here, I was trying to create the red link index page here just so I could take a look at how the book is looking.

I copied an existing template and replaced the index name, page no's etc. but when I preview the page before creating it, it reports a message

Error: No such index

I copied and pasted the index name so it shouldn't be a typo, is this a problem with the index name having "" and ; in it? or maybe I am just making some other basic mistake by not really understanding the process.

If anyone has the time, would you be able to create the page for me please? Thanks Sp1nd01 (talk) 13:57, 26 May 2020 (UTC)

@Sp1nd01: I'm betting it's the quotation marks that's confusing ProofreadPage, but I'm not certain. @Tpt: Can you shed any light and suggest possible workarounds? --Xover (talk) 14:54, 26 May 2020 (UTC)
@Xover: Thanks for checking and progressing for me. I hope a workaround is possible, otherwise maybe it can be renamed to remove the quotation marks, but I wouldn't know if that is a valid option at all. Sp1nd01 (talk) 20:32, 26 May 2020 (UTC)

@Sp1nd01: I would suggest using the {{#tag:pages}} syntax, or otherwise using single quotes to wrap the page name:

  • {{#tag:pages||index=Wm. M. Bell's "pilot"; an authoritative book on the manufacture of candies and ice creams (1911).djvu|from=xx|to=yy}} and yes there is a gap between the first two pipes
  • <pages index='Wm. M. Bell's "pilot"; an authoritative book on the manufacture of candies and ice creams (1911).djvu' from=xx to=yy />

What is happening is that the parser typically looks for the quote delimiters to identify the start and finish of the filename, and the quotes inside this filename are confusing it.

Also note that, in our jargon, you are trying to transclude the pages; it did take me a while to work out what you meant. — billinghurst sDrewth 21:26, 26 May 2020 (UTC)
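As a rough illustration of the {{#tag:pages}} form suggested above, the wikitext can be assembled by a small helper. This is only a sketch: tag_pages is a hypothetical name, the page range 1 to 10 is invented for the example, and the check for pipes and braces is a precaution assumed here rather than documented ProofreadPage behaviour.

```python
def tag_pages(index: str, from_page: int, to_page: int) -> str:
    """Build {{#tag:pages||index=...|from=...|to=...}} wikitext for an index file.

    The empty slot between the first two pipes is the (empty) tag content,
    which is why there is a gap there.
    """
    # A pipe or double braces inside the name would be read as parser-function
    # syntax, so refuse them rather than emit broken wikitext (assumed precaution).
    for bad in ("|", "{{", "}}"):
        if bad in index:
            raise ValueError(f"index name contains {bad!r}, which cannot be passed this way")
    return "{{#tag:pages||index=" + index + f"|from={from_page}|to={to_page}" + "}}"


if __name__ == "__main__":
    name = ('Wm. M. Bell\'s "pilot"; an authoritative book on the manufacture '
            "of candies and ice creams (1911).djvu")
    print(tag_pages(name, 1, 10))
```

Because the name is passed as a parser-function parameter rather than a quoted attribute, the apostrophe and double quotes in it need no special handling.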

@Billinghurst: @Xover:, Thanks for the suggestion, the tag pages did the job, the page is created (transcluded) and I can now browse the book. Sp1nd01 (talk) 22:29, 26 May 2020 (UTC)
@Sp1nd01: Works can be renamed, but since we don't have good tools to do so it's not something we usually do unless it's absolutely necessary. Each Page: of the work would need to be moved individually, and there's no way to automate it short of an actual bot run. The JavaScript API for MediaWiki lets you move pages, but it's made with Wikipedia (all single articles) in mind so it has limits that don't really work for Wikisource (where 1000 pages to a work is completely normal). In addition, the Index:, Page: pages, and the File: are all interdependent and connected through the file name (which is already fragile), and the File: will usually be hosted on Commons where we can't rename it directly (it's subject to Commons policy and processes). In other words, if we have any other alternative it is probably preferable to trying to move a work to a new name. --Xover (talk) 04:59, 27 May 2020 (UTC)
{{#tag:pages}} is a great get-out-of-jail-free card that avoids a range of issues and can be used to perform better wiki-magic (described elsewhere).

Generally I would discourage the use of single or double quotes in a name, as they can be quirky, either in transclusion, or when you are trying to utilise some of the tools that convert to Unicode. — billinghurst sDrewth 05:53, 27 May 2020 (UTC)

Gutenberg blocked in Italy[edit]

confused reports of Gutenberg block. here is an account https://www.balcanicaucaso.org/eng/Areas/Italy/Project-Gutenberg-obscured-in-Italy-many-doubts-few-certainties -- Slowking4Rama's revenge 12:53, 27 May 2020 (UTC)

See also Wikisource:Copyright discussions#Project Gutenberg blocked in Italy --Jan Kameníček (talk) 13:54, 27 May 2020 (UTC)

Community Tech Launches Wikisource Improvement Initiative[edit]

Apologies for the broken links in the previous message. This message is in English, but we encourage translation into other languages. Thank you!

Hello everyone,

We hope you are all healthy and safe in these difficult times.

The Community Tech team has just launched a new initiative to improve Wikisource. We will be addressing five separate wishes, which came out of the 2020 Community Wishlist Survey, and we want you to be a part of the process! The projects include the following:

For the first project, the team will focus on the #1 wish: improve ebook exports. We have created a project page, which includes an analysis of the ebook export process. We now invite everyone to visit the page and share their feedback on the project talk page. Please let us know what you think of our analysis; we want to hear from all of you! Furthermore, we hope that you will participate in the other Wikisource improvement projects, which we’ll address in the future. Thank you in advance and we look forward to reading your feedback on the ebook export improvement talk page!

-- IFried (WMF) (Product Manager, Community Tech)

Sent by Satdeep Gill (WMF) using MediaWiki message delivery (talk) 10:51, 28 May 2020 (UTC)

Discussion on encouraging page scans?[edit]

Hello all! Recently I was looking at the ProofreadPage stats and I noticed that a few sites have a "pages with scans" percentage of higher than 97% – meaning that virtually all of their texts are based on transcluded page images, not just copy/pasted text like many of our works. I imagine that those sites have undertaken major efforts to encourage the use of page scans, perhaps by placing restrictions on new texts coming in or even deleting some texts without scans. I haven't been involved in discussions here much lately, so I'm wondering if there have been any discussions along these lines here. I tried a few searches but didn't come up with anything – could someone point me to recent discussion on this topic, if there has been any? Thanks. –Spangineer (háblame) 00:36, 29 May 2020 (UTC)

97% is more than encouraging pages with scans. We probably couldn't get there if we backed every scanned page with scans. I don't think there's been much discussion recently, though we are pushing for it.--Prosfilaes (talk) 03:14, 29 May 2020 (UTC)
Some other interesting comparisons: frWS and deWS have about 0.02% and 0.03% ratios of problematic (blue) pages to total page-namespace pages respectively (enWS is 1.3%) and the unproofread (red) pages to total pages ratios are 25% and 5% to enWS's 39%. I'm pretty sure much of it is down to stricter rules on those subdomains (especially deWS), though I would be interested to know how the first figure is achieved, since most of our problematic pages are either bad scans or pending image extraction. Inductiveloadtalk/contribs 10:12, 29 May 2020 (UTC)
Hey Spangineer; good to see you around. I'm not aware of any discussions specifically addressing this. There have been some indications the community would like to raise our standards, but often in the context of deletion discussions where the issue gets conflated with "deletionist" vs. "inclusionist" tendencies.
I am personally of the opinion that it is past time we made it policy that new additions should be scan-backed, and raised the bar on grandfathered old works (e.g. if they have style problems, are not scan-backed, and have insufficient source information, they should be deleted; or if they are a conflated text, probably copied and pasted from Gutenberg from no specific edition; etc.). Combined with a concerted community effort to migrate old texts to scans, I think we could improve our overall quality massively in relatively short order.
But we are, I think, suffering under some kind of weird aversion to having proper written policy. A lot of stuff that would improve our quality turns out to have been discussed and agreed as policy, only for our actual policy pages never to have been updated accordingly. In the meantime, practice has diverged due to the lack of written policy, to the point where the previously agreed policy can no longer be assumed to be valid. I think that if we are to do anything about our quality, we absolutely need to get over this fear of having proper policy to codify our standards. --Xover (talk) 06:32, 2 June 2020 (UTC)
Thanks for your thoughts! I've been thinking along the same lines... seeing if we can get agreement on a policy that would place some limitations on works with no scans. I agree that it seems reasonable to make sure that our policy pages document actual practice, identifying both the areas where we are strict and areas where we are flexible (and of course there has to be a lot in the latter category given the vast diversity of types of works we host here). Spangineer (háblame) 13:26, 3 June 2020 (UTC)

Help with transclusion required.[edit]

I thought I'd have a try transcluding sections in Lancashire Legends, Traditions, Pageants, Sports, &c., by copying some existing transcluded pages and editing them for the new sections. I've done a few transclusions OK, but have come across a couple of transclusion situations where I'm now out of my depth and need help.

The first issue I hit is that I can't get the links between the Introduction and Memoir of John Harland, F.S.A. (the next and previous links) to turn blue.

The second issue is that the section Sir Bertine Entwisel ends on a page which also holds two other sections: one complete section, and one which continues over a few more pages. My attempt to add section tags for this situation isn't working: my transclusion of Sir Bertine Entwisel now shows the whole of the following section as well as the start of the next one, and I don't know how to fix it. Thanks Sp1nd01 (talk) 16:34, 29 May 2020 (UTC)

Issue number one: the relative link from Title/Part 1/Subpage to Title/Memoir is ../../Memoir, as it has to go "up" two levels and then down to "Memoir".
Issue number two: You were missing the <section end="s1"/> on the last page, so the sections couldn't be parsed. I suggest using the "EasyLST" gadget (Preferences -> Gadgets -> Editing tools) as it takes care of that for you. Inductiveloadtalk/contribs 16:41, 29 May 2020 (UTC)
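The "go up, then down" rule for relative links can be sketched as a small function. This is an illustration only: relative_link is a made-up helper, the page names in the usage example are placeholders, and it assumes that each ../ climbs one subpage level from the current page and that a leading / links to a subpage of the current page.

```python
def relative_link(source: str, target: str) -> str:
    """Compute a relative subpage link from one page title to another.

    Assumes "../" climbs one level from the current page and a leading "/"
    descends into a subpage of the current page.
    """
    if source == target:
        raise ValueError("source and target are the same page")
    src = source.split("/")
    dst = target.split("/")
    # Count the leading path components the two titles share.
    common = 0
    for a, b in zip(src, dst):
        if a != b:
            break
        common += 1
    if common == 0:
        # Different works entirely: a relative link cannot reach the target.
        return target
    ups = len(src) - common          # levels to climb from the source page
    down = "/".join(dst[common:])    # path to descend from the shared ancestor
    if ups == 0:
        return "/" + down            # target is below the source page
    return "../" * ups + down


if __name__ == "__main__":
    print(relative_link("Title/Part 1/Subpage", "Title/Memoir"))  # ../../Memoir
```

The same rule gives ../../Part 2/Introduction for a link from a page such as Title/Part 1/Subpage to Title/Part 2/Introduction.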
Thank you for the help, I think I now understand the solution for the first issue.
Re the second issue, I now have the "EasyLST" gadget enabled, but must admit I don't know how to use it. I am not seeing any new button or link for it anywhere obvious in my interface.
I was checking the diff between your edit and mine to see what was done, and I see the change adding the end of section (<section end="s1" />), however when I actually edit the page I don't see that line present. Is this some system internal code that is hidden from me? Just wondering if I can manually add that line when I next come across one of these tricky pages? I think I spotted a few more of them further on in the book. Sp1nd01 (talk) 21:35, 29 May 2020 (UTC)
With EasyLST, the sections are marked with a ## section_name ## syntax. For example, in old style:
<section begin="s1"/>
Section 1
<section end="s1"/>
<section begin="s2"/>
Section 2
<section end="s2"/>
and with EasyLST:
## s1 ##
Section 1
## s2 ##
Section 2
The ## s1 ## is then replaced with the <section begin="s1"/> by the gadget. Inductiveloadtalk/contribs 21:55, 29 May 2020 (UTC)
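To make the correspondence between the two syntaxes concrete, here is a minimal sketch of the conversion. It is not the gadget's actual code: easylst_to_section_tags is a hypothetical name, and it assumes each ## name ## marker sits on its own line, closes the previous section, and that the last section runs to the end of the page.

```python
import re

# A marker line looks like "## name ##", alone on its line.
_MARKER = re.compile(r"^##\s*(.+?)\s*##\s*$")


def easylst_to_section_tags(text: str) -> str:
    """Expand "## name ##" markers into <section begin/end> tag pairs."""
    out = []
    current = None  # name of the section currently open, if any
    for line in text.splitlines():
        match = _MARKER.match(line)
        if match:
            if current is not None:
                # A new marker implicitly ends the previous section.
                out.append(f'<section end="{current}"/>')
            current = match.group(1)
            out.append(f'<section begin="{current}"/>')
        else:
            out.append(line)
    if current is not None:
        # The last section is closed at the end of the page text.
        out.append(f'<section end="{current}"/>')
    return "\n".join(out)


if __name__ == "__main__":
    print(easylst_to_section_tags("## s1 ##\nSection 1\n## s2 ##\nSection 2"))
```

Running it on the EasyLST example above reproduces the old-style markup, including the closing tag for the last section that is easy to forget when writing the tags by hand.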
Thanks again for the help, I've now managed to transclude one of those situations correctly. Sp1nd01 (talk) 09:35, 30 May 2020 (UTC)
I'm back again with another problem. I have transcluded all of Part 1, but now I don't know how to make the jump from the end of Part 1 to the Part 2 Introduction. The link I've created takes me back to the Part 1 Introduction. Could someone take a look and fix it for me please? Sp1nd01 (talk) 15:27, 5 June 2020 (UTC)

Question about copyright renewal[edit]

I would like to add the book Halek’s stories and evensongs, translated by W. W. Strickland, New York: B. Westermann Co., 1930, which is probably not available online anywhere, but which I can hopefully get from a Prague library and scan. However, before I do so, I would like to ask whether it can be added here. Help:Public domain says that a work is PD if published in the US before 1963 without later copyright renewal. The search I did using the recommended search engine was negative, but I would like to ask for confirmation as I do not have any experience with renewal issues. --Jan Kameníček (talk) 07:13, 1 June 2020 (UTC)

@Jan.Kamenicek: Translations are a bit tricky (cf. my discussion with Prosfilaes elsewhere).
The translation as a separate work, if it was first published in the US without copyright notice and without subsequent registration (renewal), is PD in the US. If the translation was first published outside the US and not published in the US within 30 days, this does not apply (it is considered a foreign work in that case; and I see this work exists in a German edition from about that time).
The original—presuming it was first published in Czechia—is subject to pma. 70 terms. Presuming Strickland translated Halek directly (not a later edited edition), this copyright would have long since expired (probably even longer since the pma. 70 term was presumably introduced much later).
Checking for renewal needs to be a reasonably exhaustive search. The Stanford database is a good quick start, but you also need to check all the relevant copyright registrations as published. And you need to document the search steps you performed. --Xover (talk) 08:43, 1 June 2020 (UTC)
@Xover: Hm, I thought that the Stanford database includes all the relevant copyright registrations… If not, where can I check them? --Jan Kameníček (talk) 09:24, 1 June 2020 (UTC)
BTW, the subtitle of the English edition says "translated from the Czech…" [20], so the German editions should not influence anything. --Jan Kameníček (talk) 09:39, 1 June 2020 (UTC)
You're conflating two different things; if it was published without copyright notice, it was immediately put into the public domain. If it was published with copyright notice, then it had a chance 28 years later to seize an extended period of copyright (now 67 years) by renewing.--Prosfilaes (talk) 09:48, 1 June 2020 (UTC)
Not so much "conflating" as "waving my hands and pretending that complicated stuff doesn't exist unless absolutely forced to go check what the rules were". :) I was sure there were some circumstances, depending on date of publication and the phase of the moon, where something published without notice could magically get copyright by filing a renewal that was really a registration (but here I may be conflating with authors recovering their copyrights in contributions to periodicals, which is a different beast). Or something like that anyway. I just couldn't be arsed to go check and didn't think the details were likely to be relevant. But, in any case, thanks for pointing it out (it's always better to be precise about these things when possible; I was just being lazy). --Xover (talk) 10:02, 1 June 2020 (UTC)
It's probably fine. However, Czech and Slovak literature in English* says the English edition was printed in Germany, but it was still probably technically published in the US? Walter W. Strickland was British, and it seems possible that some of it was originally published in the UK before this book. It seems unlikely that anyone is going to challenge it, but I'd want to look at the book, or at least the title page and verso (copyright page) before being definitive about it. The translation is moot.
The Stanford database should include all the relevant copyright registrations; provided it was searched multiple ways to get around single typos ("Nearer my Ood to thee" was one that escaped the Gutenberg proofing), it should be fine. But one could check out the scans of the physical volumes at Online Books or elsewhere.
* Czech and Slovak literature in English is another volume that you might be interested in. It's a recent bibliography, but PD-USGov. I'll try and set it up if you're interested.--Prosfilaes (talk) 09:46, 1 June 2020 (UTC)
I had a quick look at the scans for the renewals. Normally checking pub + 27, 28 and 29 years covers the bases. I didn't see any renewals under "Strickland" (or "Halek", for that matter):
The 1930 registrations are much more fiddly as they're split into 155 numbers. Inductiveloadtalk/contribs 10:16, 1 June 2020 (UTC)
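The rule of thumb above (check the renewal volumes for publication year plus 27, 28 and 29) is simple arithmetic, sketched here with a hypothetical helper name. It only lists which years to look at; it does not perform any search or settle the copyright status.

```python
def renewal_check_years(publication_year: int) -> list:
    """Years whose renewal records are worth checking for a given publication year."""
    return [publication_year + offset for offset in (27, 28, 29)]


if __name__ == "__main__":
    print(renewal_check_years(1930))  # [1957, 1958, 1959]
```

For the 1930 book discussed here, that means the renewal scans for 1957, 1958 and 1959.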
@everybody: Great, thanks very much for the help with checking. I also went through the links you have provided here and it really looks OK. I have also learned how to deal with similar cases in future. So now I will order the book using an inter-library service, and after it arrives check also its title page and scan it then.
@Prosfilaes: As for Kovtun’s book Czech and Slovak literature in English, I know it and I already downloaded it from HathiTrust a couple of weeks ago. It is on my shortlist of works to add here as HathiTrust states it is in the PD, but I am not sure about the reasons, as it was published in 1988. Is it {{PD-USGov}} because it was published by the Library of Congress? --Jan Kameníček (talk) 11:01, 1 June 2020 (UTC)
It's PD-USGov because it was made by an employee of the US government in the course of their duties; it says on the front that he's part of the "European Division". I'd consider waiting a year, because the front illustration is from Letters from England, 1925, and won't be PD in the US for another year, but could easily be cut out before uploading.--Prosfilaes (talk) 11:11, 1 June 2020 (UTC)
I see, no problem, it will really be better to wait until the next year. --Jan Kameníček (talk) 15:51, 1 June 2020 (UTC)
more info with pdf here https://www.loc.gov/rr/european/bibs/csle.html Slowking4Rama's revenge 12:57, 3 June 2020 (UTC)

Tech News: 2020-23[edit]

22:30, 1 June 2020 (UTC)

Internet Archive[edit]

https://arstechnica.com/tech-policy/2020/06/publishers-sue-internet-archive-over-massive-digital-lending-program/

It would seem that Internet Archive might be in trouble.

ShakespeareFan00 (talk) 20:25, 2 June 2020 (UTC)

Not a surprise: this was entirely predictable, to the point of being inevitable. Nobody thought IA had a strong case when they announced this, and nobody thinks much of their chances now. Even if they can eventually avoid billion-dollar damages, the legal costs on the way will bleed them of (donation) funds and may eventually force them to accept onerous settlement terms. Think along the lines of semi-automated DMCA takedowns and geoblocking: of the kind that lets copyfraudsters squat on public domain works by republishing a new edition (like GBooks is now). I hope they have a brilliant secret legal strategy that will let them win this and set a good precedent, but I don't give much for their chances. I hope someone is currently arranging a mass mirroring of their public domain scans is what I'm saying: without IA our work is going to become a lot harder. --Xover (talk) 06:39, 3 June 2020 (UTC)
Without IA, one use case I had becomes considerably harder for me. IA has good scans of the Catalog of Copyright Entries (and not just for Books and Pamphlets). Without those I cannot, without considerable expense, confirm the status of many 1923-1964 (or pre-1978) works. Having good scans of the CCE (via IA) was invaluable in confirming whether some more obscure older works could be transcribed here (and there are efforts on Wikisource to transcribe some of the other volumes to support efforts on Commons to help identify copyvios). Potentially losing access to these scans means someone would have to physically go to the Copyright Office records in the US (as pre-1978 registrations and renewals are not necessarily fully digitized as of 2020) or assume that any post-1925 work is still in copyright until they can show otherwise. If the publishers' desire to protect 'recent' works makes it harder to check the status of older works, there may be some less conscientious individuals who won't bother trying to check the status of older material. (Commons admins, on the other hand, can be very paranoid about confirming the status of uploaded material.)
The loss of post-1925 material still in copyright wouldn't be a concern for me (given that copyright in the UK is 70 pma and unlikely to change). However, the vast loss of material which is clearly public domain (like the CCE, which is a US Government work), or which in a number of instances was donated to IA by partnering libraries who would reasonably be expected to understand copyright, would be disproportionate. I also get the impression that the lawsuit isn't just about the IA's recent actions, though; I think the publishers want to send a message more generally.

ShakespeareFan00 (talk) 07:05, 3 June 2020 (UTC)

i would not worry. typical authors guild overreach. see also w:Authors Guild v. Google. internet archive will continue hosting content regardless of copyright status. and we will untangle the orphan work mess. IA has more support than the SOPA lobby. if you want to send a message, call western union.Slowking4Rama's revenge 12:45, 3 June 2020 (UTC)

Tracking down an old source

I'm looking for a copy of The Cooperative Production of Knowledge on the Internet—the Case of Wikipedia, can someone help me find it? Sj (talk) 13:17, 4 June 2020 (UTC)

@Sj: The original article is at s:it:Produrre sapere in rete in modo cooperativo - il caso Wikipedia. We have previously hosted one full user translation at The Cooperative Production of Knowledge on the Internet - the Case of Wikipedia and one partial at The Cooperative Production of Knowledge on the Internet—the Case of Wikipedia. Both were deleted in 2006/2007; one as a duplicate, the other as out of scope. There was also a copyright issue raised, and I couldn't find any licensing info at itWS just now. Unless it's a copyvio we can temporarily undelete the translation so you can grab a copy. --Xover (talk) 14:17, 4 June 2020 (UTC)
Interesting. If it was out of scope for Wikisource, is it possible it would be permissible in Wikisource space, or on another project like Wikipedia (WP: namespace), or Wikibooks...? If so, perhaps it could be restored for transfer. -Pete (talk) 19:08, 4 June 2020 (UTC)
It is definitely available under a free license, and silly for it to be considered out of scope (both by current policy, and given that it's in scope on it:s). (C) grant was noted by the original author, I believe in the document itself and also on Meta around 2005. If you could undelete so I can move it to userspace, I would appreciate it -- as a stopgap :) Thanks for tracking this down, xover! Sj (talk) 19:48, 4 June 2020 (UTC)
Indeed, if this is a thesis that passed review and earned a degree, it would be covered by WS:WWI, as I understand it: "An example of such acceptable research work is a thesis that has been scrutinized and accepted by a thesis committee of an accredited university". That would not hold if this didn't pass review or UNICATT is not accredited, but I would say that speedy would not be the right process for that now. The other issue is whether UNICATT claims copyright over theses, but I think that is not very common.
Not to say I'm criticising the original determination, as the inclusion of theses in WWI didn't happen until December 2011. Inductiveloadtalk/contribs 20:06, 4 June 2020 (UTC)
@Inductiveload: In addition to being outside scope based on the then policy, the text was flagged as a copyvio, so I imagine the G5 speedy was an expediency measure as much as anything. However, a thesis would only be in scope iff it had been through the equivalent of peer review, and I can find no indication that that's the case here. It was prepared for that purpose, certainly, but there's no evidence it was accepted.
@Sj: I've trawled through the author's (Valentina Paruzzi) contributions on itWP (Vala) and meta (Elian; note that this identification is dubious as Elian says elsewhere that they do not speak Italian) and can find no reference to this work, much less its licensing. On itWS talk page histories I find that Frieda claims to be a personal friend of Paruzzi, and that the author has sent them the text in an email. They later write that the text "is clearly GFDL" (I'm paraphrasing), but do not indicate on what basis they draw that conclusion (it sounds like it's the text itself, but I've not found that while skimming the Italian original nor our two English translations). None of the above accounts are active anymore, but the Elian account was sporadically active up until 2018, responding to talk page messages, so you may be able to contact them that way (I'd suggest deWP as that seems to be their home wiki).
Absent an actual OTRS ticket confirming the licensing, or at the very least some clear and plausible statement somewhere, I'm not comfortable undeleting this. If someone can find me better info on that I would otherwise be happy to do so. However… It looks like Pathoschild actually has a cut&paste of the text at their meta talk page after someone else asked for it. If that's all you need you can grab it from there.
If anyone wants this hosted permanently here under the revised scope rules (and the copyright stuff has been resolved), I'd be happy to temporarily undelete pending an undeletion discussion at WS:PD. I have to caution that on closer inspection and comparison with the itWS text, it looks like even the "complete" translation is not actually complete, and the quality of it is far from perfect. It would probably be better to start from a PDF of the original thesis and do a new scan-backed translation from that, if a suitably licensed copy was obtained. --Xover (talk) 16:44, 5 June 2020 (UTC)
Agreed, sorry, that was not well expressed. I meant that a speedy under G5 is probably no longer the right approach, since it's no longer clearly out of scope (it might very well be, but it'd probably be worth a check if it happened today). Since the author edited (albeit very briefly) at itWP and wrote about Wikipedia, you could imagine they might legitimately have licensed it GFDL and could be asked (though they appear gone now, so it's probably too late), and the review status could be checked (though I guess for an undergrad thesis, that might be between very rare and unlikely, maybe more possible for a PhD thesis).
The GFDL statement is in the itWS textinfo (simply "Rilasciata in GFDL dall'autrice"), but without OTRS or a concrete source to prove it, and with none of the contributors active for years, it's not really worth much. And if it was incomplete, it's realistically never going to be finished anyway. Inductiveloadtalk/contribs 17:10, 5 June 2020 (UTC)
https://wikispore.wmflabs.org/wiki/Wiki:Main_Page might take it. you really should not predict what work will not be done, some might take it as a challenge. (even if experience shows a growing backlog of unfinished work) Slowking4Rama's revenge 20:37, 5 June 2020 (UTC)

Comment I have undeleted both versions to allow a review.

@Sj: nice to see you. The works would have been out of scope at that time as they were not published translations. We have since modified our scope to allow Wikisource translations of other language wikisource works, and host them in the Translation: ns. (where I have parked both these works).

We should review the two works and determine 1) whether we can retain them, and 2) which version is the better translation. If we do retain one, we should then 3) move it to the best title, and 4) add {{translation header}} to the retained edition. — billinghurst sDrewth 09:07, 6 June 2020 (UTC)