Wikisource:Scriptorium

Scriptorium
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel (webclient). For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 304 active users here.

Announcements

Do you create PDFs on Wikimedia wikis?

Hi everyone, I’m looking for feedback from people who use the function to create PDFs on the Wikimedia wikis, which feels relevant for Wikisource. In short, the main technology we’re using to render them – OCG – is breaking down. The code is old, it’s difficult to maintain, and if we don’t replace it now we might suddenly find ourselves in a situation where we'd have to take it down without having planned to do so.

We have some plans for the future over at mw:Reading/Web/PDF Functionality. If you care about the PDF function, please head over there and tell us on the talk page if anything is missing, or if there’s something in there we shouldn’t spend our time and energy on. /Johan (WMF) (talk) 12:19, 18 May 2017 (UTC)

Proposals

Add portals to default search

The following discussion is closed and will soon be archived: consensus reached for addition of portals to the default search — billinghurst sDrewth 14:03, 9 June 2017 (UTC)
Portals aren't displayed by default when making a simple search with the search box. This most likely makes it impossible for them to be found by users who are unaware of how to search for them. I propose that we add portals to the default search results if possible. Jpez (talk) 04:34, 1 May 2017 (UTC)

Support: I've forgotten the number of times I've had to do an advanced search to look in the Portal namespace. Ciridae (talk) 08:41, 23 May 2017 (UTC)

Support: no-brainer. —Beleg Tâl (talk) 14:38, 23 May 2017 (UTC)

Support: I didn't realize this wasn't already the case. --EncycloPetey (talk) 14:41, 23 May 2017 (UTC)

@Jpez: now deployed. Thanks to Framawiki. billinghurst sDrewth 12:11, 13 June 2017 (UTC)

Thanks @Billinghurst: I had a go but I'm not getting any hits. For example if I do a search for "sheet music" I expect Portal:Sheet music to be at the top of the search. I tried other portals also with the same results. Maybe it needs time to take effect? Jpez (talk) 05:58, 14 June 2017 (UTC)
@Jpez: which means that you are not on the default search. Users can change and save their own search preference; you will need to do that and add the Portal namespace to yours via the advanced search, ticking "Remember selection for future searches". — billinghurst sDrewth 12:13, 14 June 2017 (UTC)

Proposal to allow "fair use" in certain limited scenarios

There have been a few discussions lately about "fair use" on enWS. I think there is one specific scenario in which "fair use" should be acceptable: if a work is released under an acceptable license, but contains some non-free text (or other media) under "fair use" (or with explicit permission of the copyright holder), we should be able to include that text or other media as part of the entire work that has been released freely.

Rationales:

  1. It is not always possible to determine that a selection from a free text is actually a non-free citation included under "fair use".
  2. If an author can release a work under a free license even though it contains "fair use" selections, we should be able to host it even though it contains "fair use" selections.

Example: Green Eggs and Ham is the usual example of a nonfree work that has been published under a free license by a third party under "fair use", as it was included in the congressional record after someone read it out loud in congress. While it would be unacceptable to host Green Eggs and Ham as a work on its own, we could (possibly) host the congressional records under a free license, and my proposal above would simply suggest that we don't need to censor the section that quotes the nonfree work.

Example: The Book of Common Prayer (ECUSA) almost certainly contains translations of religious texts that are non-free. Can you identify these passages?

Anyway, this is just something I was thinking of that might be acceptable, and so I thought I'd bring it up. @Slowking4: this discussion will interest you. I think your idea of what "fair use" should be acceptable is broader than what I suggested above, but this discussion or a sub-discussion could be the place for that as well. —Beleg Tâl (talk) 13:43, 18 July 2017 (UTC)

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

Index and pages move (Thoreau)

Could someone with a bot move the following Index, the associated DjVu File on Commons, and all its created Pages according to the name change:

Index:The writings of Henry David Thoreau volume 2.djvu --> Index:Writings of Henry David Thoreau (1906) v2.djvu

There was more than one edition of The Writings of Henry David Thoreau published, and this is volume 2 of the 1906 edition. This change will therefore be needed for clear disambiguation. Also, Volumes 5, 6, and 7 exist already on both Commons and Wikisource, and they utilize the replacement naming convention. This change will therefore make the naming in the series consistent as well. --EncycloPetey (talk) 00:41, 13 June 2017 (UTC)

Done, leaving redirects in the Index: and File: namespaces, but not in the Page: namespace. — billinghurst sDrewth 00:55, 13 June 2017 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. --EncycloPetey (talk) 01:00, 13 June 2017 (UTC)

Other discussions

95 years ago was 1921

It's approaching 2019 and the release of new, 1923, works into the public domain. We should prepare a collection of 1923 works to be released on 1/1/2019; I plan to find an early copy of The Murder on the Links that we can work from, and I'm currently working on the Renewal Registrations for 1950 to be able to get a good list of some of the major stuff that's going to be freshly out of copyright in English. But I was thinking about leading into that, with an emphasis on 1921 works this year and 1922 works next year. Any interest in this idea?--Prosfilaes (talk) 08:51, 16 April 2017 (UTC)

Let us watch out for any US Congress bill to extend the copyright term. Only if none is passed and signed into federal law will we be able to convert PD-1923 into public domain 95 years after publication.--Jusjih (talk) 02:47, 25 May 2017 (UTC)
On one hand, it seems quite unlikely that they will manage to push through a new bill before at least 1923 works move into the public domain, and on the other treating it as an unlikely impertinence instead of a nigh-certainty will make it easier to fight if it does happen.--Prosfilaes (talk) 08:05, 15 June 2017 (UTC)
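
The date arithmetic behind this thread can be sketched as follows. This is a minimal illustration, assuming the 95-year term stays in force with no extension and that protection runs to the end of the 95th calendar year after publication; the function name is hypothetical and not part of any Wikisource tool.

```python
# Assumption: the US term for these works is 95 years from publication and
# runs through 31 December of the 95th year, with no extension passed.
US_TERM_YEARS = 95


def us_public_domain_year(publication_year: int) -> int:
    """Return the year on whose 1 January the work enters the US public domain."""
    return publication_year + US_TERM_YEARS + 1


if __name__ == "__main__":
    # Works published in 1923 are the first batch discussed in the thread.
    for year in (1923, 1924, 1925):
        print(year, "->", us_public_domain_year(year))
```

Under these assumptions a 1923 publication such as The Murder on the Links becomes available on 1 January 2019, which matches the date given above.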

Proposal: Having H: properly mapped to Help: namespace

The following discussion is closed and will soon be archived: Consensus achieved that we have additional local mapping of H: to Help: as namespacealias — billinghurst sDrewth 04:29, 10 June 2017 (UTC)
At the moment we use a shortcut like H:NS to be a shortcut to Help:Namespaces; this is actually a kludge that shows up in the main namespace. We should be having H: registered as a namespacealias to Help: as we have done for WS: to Wikisource:. As a practical example of this when looking at the prefixindex, note the namespace dropdown as an indicator of where you are in these examples Special:PrefixIndex/H: and Special:PrefixIndex/WS:, and also note the abbreviated listing.

Technically speaking: We should be having H: set as a namespace alias for ns:12/Help: and HT: set as a namespace alias for ns:13/Help talk: (see current definitions in API call). This is a little fix, and does not adversely affect the wiki. It corrects an oversight that we made when we started making better use of the help namespace but implemented the shortcuts incorrectly. [I am mostly asking the non-tech members of the community to trust me that this is needed.]

I request that the community approves this proposal and we will get a site request phabricator to resolve, and the sysadmins to run some scripts that will fix the incorrect namespace components. — billinghurst sDrewth 16:13, 21 May 2017 (UTC)

Support Beleg Tâl (talk) 17:34, 21 May 2017 (UTC)
Support Beeswaxcandle (talk) 06:49, 23 May 2017 (UTC)
Support Ciridae (talk) 08:49, 23 May 2017 (UTC)
Support Spangineer (háblame) 22:00, 24 May 2017 (UTC)
Support Sam Wilson 07:11, 25 May 2017 (UTC)
H: is now an alias for Help:. I have deleted the redirects that we had, which are now superfluous. I have thanked those who contributed, via the ticket. — billinghurst sDrewth 13:53, 9 July 2017 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. — billinghurst sDrewth 13:54, 9 July 2017 (UTC)
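
To illustrate the difference between a shortcut page in the main namespace and a true namespace alias, here is a minimal sketch of alias resolution. The alias table simply mirrors the mappings named in the proposal (H:, HT:, WS:); the function and table names are hypothetical, and the real mapping lives in the site configuration rather than in code like this.

```python
# Hypothetical alias table mirroring the proposal above.
NAMESPACE_ALIASES = {
    "H": "Help",          # ns 12
    "HT": "Help talk",    # ns 13
    "WS": "Wikisource",
}


def resolve_title(title: str, aliases: dict = NAMESPACE_ALIASES) -> str:
    """Expand a leading namespace alias; leave other titles untouched."""
    prefix, sep, rest = title.partition(":")
    if not sep:
        return title  # no colon, so this is a plain main-namespace title
    canonical = aliases.get(prefix.strip().upper())
    if canonical is None:
        return title  # prefix is not a known alias
    return f"{canonical}:{rest}"


if __name__ == "__main__":
    print(resolve_title("H:NS"))
    print(resolve_title("The Time Machine"))
```

With a real alias the software treats H:NS as a title inside the Help namespace, so no redirect page is needed in the main namespace at all.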

Sheet music and pdf export

I've noticed sheet music we transcribe with lilypond isn't rendered at all when you try to export it as pdf. For an example try to export The Child's Own Music Book/Baa, Baa, Black Sheep as pdf. Jpez (talk) 04:09, 6 June 2017 (UTC)

@Jpez: PDF generation via the current internal tool is widely considered problematic, and you can raise that at the talk page of mw:Reading/Web/PDF Functionality. Did you also try the WSexport tool for PDF creation to see if that was any better? — billinghurst sDrewth 05:14, 6 June 2017 (UTC)
@Billinghurst: I also tried the export tool and the same thing happened, it came up blank. I'll raise the issue at wikimedia and see what happens. For now you can click on the musical score and download it as an image but the resolution is low. From what I know lilypond has built-in pdf functionality, maybe it can be integrated. Jpez (talk) 03:56, 7 June 2017 (UTC)
@Jpez: 1) It sounds like a phabricator: ticket to address; 2) it sounds like mentioning in the PDF replacement project that Extension:Score and PDF generation seem to be at odds. — billinghurst sDrewth 11:46, 7 June 2017 (UTC)
Thanks @Billinghurst: I've posted a comment here and I will also create a phabricator ticket when I have the time. I've never used phabricator before so I will need to look into it. Jpez (talk) 05:29, 9 June 2017 (UTC)
There is an old phab ticket for this: phab:T65589. Feel free to comment there. You can log in with your WMF credentials. —Justin (koavf)TCM 15:40, 12 June 2017 (UTC)

Validation before proofreading

The validation option is coming up during page creation. Experimentally I have validated this page. New bug? Hrishikes (talk) 02:16, 9 June 2017 (UTC)

Not just for new pages. I was proofreading earlier and found that the system was offering me the option to validate pages that previously had only been edited by me. [1] --EncycloPetey (talk) 02:25, 9 June 2017 (UTC)
Put it into phabricator, and we should obviously ping @Tpt. billinghurst sDrewth 04:45, 9 June 2017 (UTC)

Answer by Tpt:

We have added a new user right called "pagequality-admin" that is, by default, granted only to admins and allows them to tag any page as validated. It is useful when you want to re-create already validated pages. See task T51482. I'm going to send an email to the mailing list about that.

Hrishikes (talk) 07:49, 9 June 2017 (UTC)
Hopefully @Tpt: will also explain how to turn this off for a particular Wikisource—or at least change it to a flagged right that can temporarily be granted by a 'crat for a particular purpose. Given that we give our readers the guarantee that pages marked as validated have been checked by at least two people, this change is not a good thing for the larger Wikisources. Beeswaxcandle (talk) 08:12, 9 June 2017 (UTC)
If you want to restrict this right to 'crats, just file a bug on phabricator tagged with "site config" and "proofreadpage" to get the change done in the en.wikisource configuration. If you want something that could be temporarily granted, you need to create a new user group (also doable in the en.wikisource configuration). Tpt (talk) 08:46, 9 June 2017 (UTC)
To provide some clarification, if you look at Special:ListGroupRights and search for the term "pagequality" you will see two (new) hits. 1) in user which presumably is the first indicator of the migration from code-control to system permissions as the means that the system progresses through proofreading (only logged in users can progress the page status); 2) the admin pair that is the advanced right to jump straight to validated. If this community does not wish this to be available here, we would need to go through the general consultation phase and consensus process for site requests. Here that would be to have the right pagequality-admin removed from the administrator role, and presumably not available elsewhere. If the community wish for this to be a separate assignable right through Special:UserRights by crats or admins, then a consensus discussion can be used. — billinghurst sDrewth 13:58, 9 June 2017 (UTC)
Noting that there has been some positive commentary on the phab ticket and changes are being proposed. — billinghurst sDrewth 04:02, 10 June 2017 (UTC)

RC filtering is broken too. Can't change namespace from "all", can't change number of changes from "50", can't change number of days from "7". Hesperian 05:04, 9 June 2017 (UTC)

I'm not having problems with those issues right now. Just checked. --EncycloPetey (talk) 05:12, 9 June 2017 (UTC)
Noting that this change in code was reverted. Opinion is still being sought about how to progress, whether it is an assignable right or not. If it is an assignable right, whether the wikis assign it the same, or they can apply to have it set and how they have it set. Comment probably should be added to the phabricator ticket. — billinghurst sDrewth 14:00, 9 July 2017 (UTC)

Hymns for the coronation of Edward VII

Who here has a strong interest and experience in hymns? There is a short collection of Hymns for the Coronation of His Majesty King Edward VII (1902) (transcription project) that would be easier for someone experienced with setting sheet music. It's a short collection, at only 8 pages, but the composers and hymn writers will need to be checked individually first, and the DjVu file might have to be transferred here if the works are not yet in PD in the UK. --EncycloPetey (talk) 00:30, 11 June 2017 (UTC)

I'd be down for researching the attributions. I don't have the patience for LilyPond though. —Beleg Tâl (talk) 01:01, 11 June 2017 (UTC)
Lilypond for hymns is no problem for me. If someone looks after the rest of the text on the pages, I'm happy to do the scores. Beeswaxcandle (talk) 01:06, 11 June 2017 (UTC)
ok, i’ll set them up for you. how did you do this? [2] outstanding. i have an interest in spirituals, and sheet music [3] but gave up. Slowking4SvG's revenge 23:15, 11 June 2017 (UTC)
@Slowking4: Practice and plenty of it. I've been setting scores here and for a choir for about 4 years now. Beeswaxcandle (talk) 07:20, 13 June 2017 (UTC)
Everything's done but the scores (and validation). —Beleg Tâl (talk) 23:55, 11 June 2017 (UTC)
if we have a willing volunteer for scores, we should set up a job queue, so we can proof all but score, and put "missing score" on it. could be a wikisource selling factor with some library GLAMs. Slowking4SvG's revenge 19:05, 12 June 2017 (UTC)
@Slowking4: what do you mean by that? We already do things that way, see Category:Texts with missing musical scores. —Beleg Tâl (talk) 19:20, 12 June 2017 (UTC)
yes, i was hoping to elevate the list, to get in the monthly proofread queue, or with a contest, or a portal. it is a specialized capability that other transcription sites cannot do. we should celebrate, maybe push some to featured status. Slowking4SvG's revenge 19:28, 12 June 2017 (UTC)
It's pretty much me, although Jpez is doing some score work as well. I have limited time to devote to scores amidst everything else I'm involved in here and in RL. Beeswaxcandle (talk) 07:20, 13 June 2017 (UTC)

Comey Statement for the Record Senate Select Committee on Intelligence as public domain or should we delete it here?

The following discussion is closed and will soon be archived: moved to Wikisource:Copyright discussions. Beleg Tâl (talk) 14:26, 12 June 2017 (UTC)

Comey Statement for the Record Senate Select Committee on Intelligence was added here to Wikisource.

I had originally added the file to Wikimedia Commons.

They nominated it for deletion there and they don't think it is public domain, commons:Commons:Deletion requests/File:8 June 2017 Comey Statement for the Record Senate Select Committee on Intelligence.pdf.

Is public testimony in an open public hearing read out loud as such before the United States Congress public domain?

If so, should we keep the document in written format here at Wikisource, and if not, should we delete it from Wikisource?

Thanks for your helpful advice! Sagecandor (talk) 14:01, 12 June 2017 (UTC)

Comment: I'm moving this discussion to Wikisource:Copyright discussions. Beleg Tâl (talk) 14:26, 12 June 2017 (UTC)

Tech News: 2017-24

15:29, 12 June 2017 (UTC)

Books with no chapters

There are some old books, especially poems in my language (Persian), in which the texts of sonnets start as soon as the last one finishes. There are no chapters and you can just distinguish the start of a sonnet by a title or a graphical mark (like this one). Is there any way to tell the wiki's software how to distinguish the end of a sonnet so during the transclusion process it could find out where to end and where to start? --Yousef (talk) 16:34, 12 June 2017 (UTC)

This is done using labelled section transclusion; see Help:Transclusion#How to transclude single-section. —Beleg Tâl (talk) 16:46, 12 June 2017 (UTC)
Thank you! --Yousef (talk) 17:03, 12 June 2017 (UTC)
yes, if there are no chapters / sections or a printed index, you may have to create one from scholarship for ease of use. i.e. A Woman of the Century where i had to find an index on an advert not in book. Slowking4SvG's revenge 18:58, 12 June 2017 (UTC)
And as you are manufacturing subpages, we would often utilise {{auxiliary Table of Contents}} on the front page of the work to display what we have created. — billinghurst sDrewth 01:12, 13 June 2017 (UTC)
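
The idea behind labelled section transclusion can be illustrated with a short sketch. It assumes each sonnet on a page is wrapped in `<section begin="name" />` and `<section end="name" />` markers; the page text, section names and function name are made up for illustration, and the wiki software performs this selection itself during transclusion, so nothing like this needs to be run by editors.

```python
import re


def extract_section(page_text: str, name: str) -> str:
    """Return the text between the begin/end markers of one labelled section."""
    escaped = re.escape(name)
    pattern = re.compile(
        r'<section\s+begin="?{0}"?\s*/>(.*?)<section\s+end="?{0}"?\s*/>'.format(escaped),
        re.DOTALL,
    )
    # A section may be split into several marked pieces; join them in order.
    return "".join(match.group(1) for match in pattern.finditer(page_text))


# Hypothetical page where one sonnet ends and the next begins.
PAGE = (
    '<section begin="sonnet1" />Last lines of sonnet one.<section end="sonnet1" />\n'
    '<section begin="sonnet2" />First lines of sonnet two.<section end="sonnet2" />'
)

if __name__ == "__main__":
    print(extract_section(PAGE, "sonnet2"))
```

Each sonnet's subpage would then request only its own label, so a page that holds the end of one poem and the start of the next contributes the right part to each.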

Tech News: 2017-25

15:44, 19 June 2017 (UTC)

Ship track upload as documentary source?

I'm about to receive a track of the ACX Crystal, recently involved in a collision in Japanese waters. Would this be proper to upload here as a "documentary source"? I expect it to be in a tabular format that can then be converted to a graphic, but not yet plotted as a graphic. - Bri (talk) 18:03, 19 June 2017 (UTC)

what is the license? if it is a document of tabular data, you could argue for PD in the US, but the pdf of the document would go to commons first. or do you want to upload here as "fair use"? Slowking4SvG's revenge 11:42, 20 June 2017 (UTC)
Further to this, we don't allow "fair use" on Wikisource, and we also don't allow reference material such as tables of data unless it is published as part of a complete source text. —Beleg Tâl (talk) 12:25, 20 June 2017 (UTC)
but we very well could, would, and should. given the propensity of commons to delete books in use, it is a matter of time. Slowking4SvG's revenge 19:01, 20 June 2017 (UTC)
If it's added to Commons as '.map' data, it'd be plotted automatically. Like commons:Data:Wikimedians.map for example. I'm not sure Wikisource is the place for pure data. Sam Wilson 12:30, 20 June 2017 (UTC)
Commons:Structured data is acceptable to be uploaded to Commons, usual copyright applies. I would not think that a track would be copyright as fact is not copyrightable. — billinghurst sDrewth 05:05, 24 June 2017 (UTC)

Problem with a pdf file

A pdf file has a problem! When I download it and I go to page 172 using Acrobat reader, I see the page but in the wikisource, no page is shown. This is the page address in fa.wikisource.org. Please help me to solve it. --Yousef (talk) 11:17, 22 June 2017 (UTC)

The page is visible to me. You need to purge your cache. Hrishikes (talk) 12:18, 22 June 2017 (UTC)

Search projects from this project now active in English Wikipedia

Just to let you know, as announced via mailing list service, English Wikipedia is now receiving search results of this project, Wikisource, intended to direct Wikipedia users to this project. Currently, an option to suppress the search results of this project from the English Wikipedia search system is proposed at Village pump's "proposal" subpage, where I invite you to comment. --George Ho (talk) 19:04, 22 June 2017 (UTC)

How do you contribute to Wikisource?

Hi everyone,

I have been proofreading a few pages here, but I feel like I don't understand really how this place works. There are many many projects started, some of them lingering for years. I don't even know how to find out how many books are finished, how many books are ongoing. It seems like a lot of people work for some pages on a book, alone, then very often give up, because this is a very long and sometimes boring task. Apart from a few discussions on the Current Collaborations, I don't see where people talk, so I don't feel like there is an active community. Am I missing a magical place where people discuss, exchange, organize?

A few years ago, I participated in PGDP, where there is a very active forum, with a thread for each project where the different proofreaders can exchange on the formatting or the difficulties to reach a consistent result, or even just share the most interesting/funny quotes of the books they are working on. There were also some specialized teams, like one named the gravediggers if I remember correctly, which focused on the oldest projects, or teams for texts on a specific topic, which could gang up on a given book at the same time. This was made possible by the existence of statistics at the book level, not only at the page level.

So:

  • Is there a lot of discussion and organisation going on somewhere I don't know (other talk pages? IRC? mailing-lists?)
  • Would you be interested in statistics at the project level? (e.g. list of projects with the progress percentage, so that we can quickly finish works almost done, or focus on the oldest ones). I think I could code something giving regular updates. Actually, does it exist in other wikisources?

Koxinga (talk) 20:12, 22 June 2017 (UTC)

This very page (the Scriptorium) is our central discussion forum. You've come to the right place! Discussions regarding a specific project are done on the Index talk page. Bigger projects are organized as WikiProjects. Other discussion forums and lists of places to contribute are listed at Wikisource:Community portal. I'll let someone else speak to statistics as I don't know much about that. The best place to contribute if you don't know where to contribute is probably the proofread of the month. —Beleg Tâl (talk) 21:02, 22 June 2017 (UTC)
dashboard for wikisource progress? yes please! the example that comes to mind is Wikisource:WikiProject DNB/Statistics and Wikisource:WikiProject DNB/Progress. but in general we are too disorganized to do actual reporting, except ad hoc. some tools to make project management & progress communication would be fine. we should really do a wish list, or you could write an idealab - quick grant, if you could write up your own scope. Slowking4SvG's revenge 22:20, 22 June 2017 (UTC)
Of course I know about the talk pages and the Scriptorium, but it is just so empty. There is no feeling of community here.Koxinga (talk) 22:43, 23 June 2017 (UTC)
The Special Pages link on the left-hand side gives you access to a lot of interesting information, and particularly List of index pages is the page to see if you want to find projects at various stages of completion. — I think one of the strengths of English Wikisource is it (usually) allows you to start and work on all sorts of project autonomously, but that does result in a lot of unfinished projects and makes the community spirit a little hard to see at times. I've put up a lot of index pages that I'd like to work on "some day" and a couple of times I've come across one that someone has taken on and finished, which was extremely gratifying. — One thing I do to contribute is search for common scan errors and correct them. One of my favorites has been "thou earnest" for "thou camest". That's a good way to get a glimpse of a lot of interesting material. Anyway, I hope you'll be sticking around, and I agree that more community interaction would be a good thing! Mudbringer (talk) 01:34, 23 June 2017 (UTC)
To add to this, a lot of editors will add a list of the projects they're working on to their user page, so you can get an idea of what people are up to by looking there. Special:RecentChanges will also show what people are currently working on. —Beleg Tâl (talk) 11:54, 23 June 2017 (UTC)
Yes, that's exactly what I mean. It is very gratifying to see someone else working on the same project. Conversely, I came back after a hiatus of a year to find that not a single page had been proofread in the meantime. I do work on some rather specific topics, with Chinese characters that might frighten some contributors, but still, this is rather disheartening.Koxinga (talk) 22:43, 23 June 2017 (UTC)
I enjoy contributing to wikisource, it's one of my favorite pastimes. I like the idea of adding works here and making them available for future generations. Maybe someone 100 years down the line will be reading some of the works we've been adding. I also like the idea of me being able to read works I've never read before and also at the same time making them available for other readers to read. But it has to be enjoyable for me, so I mainly work on subjects I'm interested in and as you mentioned I often might start a book and get disinterested, and then just forget about it. I don't care. This isn't a job, I don't have to contribute if I don't want to, I can wake up tomorrow and never contribute to wikisource again and probably no one will ever notice. I don't want deadlines here, I have them at work. I like contributing here to get away from work and relax. So basically wikisource for me is something enjoyable to do in my free time and having to be forced to finish a work, or work on books we're not interested in just to get it done is the wrong way to go for me. Don't get me wrong, we should strive to get the works we're working on finished, but if we don't or can't who cares, someone else will probably get it done down the line. Jpez (talk) 11:26, 23 June 2017 (UTC)
I am not talking about setting deadlines or anything like that. It is fine if your motivation is entirely internal and you can work alone at your own pace. However, I do think we would get more contribution with more reporting on what is going on, what are the projects moving forward, what are the projects close to completion, etc.Koxinga (talk) 22:43, 23 June 2017 (UTC)
welcome to smaller wikis. there is less chatter and drama, and more work done. a little coaching (management) would be welcome. people tend to ask for help here, ad hoc, rather than systematic reporting; people team to get a project done. we could use a wikisource newsletter, or progress dashboard. if you could make some tools to report project progress semi-automatically, rather than by hand, that would be a big help. Slowking4SvG's revenge 15:19, 26 June 2017 (UTC)
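
As a starting point for the project-level statistics discussed in this thread, here is a minimal sketch of how a progress percentage per index could be computed from page statuses. It assumes the usual proofreading levels (0 without text, 1 not proofread, 2 problematic, 3 proofread, 4 validated) and takes the per-page levels as plain lists; how those levels would be fetched from the wiki is left out, and all names and sample data here are hypothetical.

```python
from collections import Counter

# Assumed page quality levels, in order.
WITHOUT_TEXT, NOT_PROOFREAD, PROBLEMATIC, PROOFREAD, VALIDATED = range(5)


def index_progress(page_levels):
    """Return proofread and validated percentages for one index."""
    counts = Counter(page_levels)
    # Pages without text need no work, so they are left out of the total.
    workable = len(page_levels) - counts[WITHOUT_TEXT]
    if workable == 0:
        return {"proofread_pct": 0.0, "validated_pct": 0.0}
    done = counts[PROOFREAD] + counts[VALIDATED]
    return {
        "proofread_pct": round(100 * done / workable, 1),
        "validated_pct": round(100 * counts[VALIDATED] / workable, 1),
    }


def rank_projects(projects):
    """Sort index names so that those closest to completion come first."""
    return sorted(
        projects,
        key=lambda name: index_progress(projects[name])["proofread_pct"],
        reverse=True,
    )


if __name__ == "__main__":
    sample = {"Index A": [0, 1, 3, 4, 4], "Index B": [1, 1, 1, 3]}
    print(rank_projects(sample))
```

A list ranked this way would let volunteers pick works that are almost done, which is the use suggested above; ranking by date of last edit instead would surface the oldest stalled projects.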

Collaboration products newsletter: 2017-06

08:41, 23 June 2017 (UTC)

License tags in Translation space

What is the best way to put license tags in Translation space? The original work needs an explicit license tag, but I'm not sure about the translation itself. I assume it will always be CC-BY-SA-3.0 and GFDL, but I've seen some editors explicitly release it into PD. Is this allowed? Should the CC-BY-SA-3.0/GFDL licenses be explicitly tagged? I've been tagging them explicitly, as below, but I just want to see if others have a better way.

{{translation license
| original = {{PD-old}}
| translation = {{CC-BY-SA-3.0}}{{GFDL}}
}}

Beleg Tâl (talk) 13:22, 23 June 2017 (UTC)

Our rider on saving is By saving changes, you agree to the Terms of Use, and you irrevocably agree to release your contribution under the CC BY-SA 3.0 License and the GFDL. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license. So that is what is applying for contributor work in Translation: ns. So until we update that, that is what it is. — billinghurst sDrewth 22:33, 23 June 2017 (UTC)

The Time Machine (Heinemann text)

Hrishikes has brought an issue to my attention, which I have looked into as well. This is a bit complicated, so I will summarize, then say more at length.

Summary: Our copy of The Time Machine (Heinemann text) is not the 1895 Heinemann text of the novel by H. G. Wells, but seems rather to be the 1924 revised "Atlantic" text included in an omnibus edition The Time Machine, The Wonderful Visit and Other Stories published by T. Fisher Unwin. [26] As H. G. Wells died in 1946, his works are in PD in the UK. The omnibus was printed in the UK in 1924, and does not seem to have had copyright renewal in the US. So it may be in PD in the US. Hrishikes has located a scan of the Heinemann text and started transcription. So, if our copy of the "not-Heinemann" (Atlantic) text is in PD in the US, then we need to move it to a new location and make room for the actual Heinemann text. But if it is not in PD, then it should be deleted. As an added wrinkle, the "not-Heinemann text" is a Wikisource Featured text.

Identity of the text located at The Time Machine (Heinemann text): It is easily seen that our current copy is not the Heinemann text. Compare the table of contents for the actual Heinemann text with the one on our current copy. The number of chapters and their presentation are completely different. The Heinemann text has 16 chapters with chapter titles, but our copy has 12 chapters without titles. Neither did the 1895 Holt text have 12 chapters. The earliest edition with 12 chapters seems to be the "Atlantic" text that was the result of a revision. The "Atlantic" text may be seen here in an electronic version that preserves the original pagination and page headers.

The Atlantic text and copyright: The "Atlantic" text was published as part of an omnibus edition of Wells' works in the UK in 1924. Details of that publication may be found here. I do not know whether the text was simultaneously published in the US, possibly under a different title, or whether copyright was applied for at that time. However, a search has turned up no evidence of a renewal for that volume. If so, then it seems the copyright in the US for the Atlantic text has expired. The original text was published in 1895, so it would be PD in the US as well, and all of Wells' works entered PD in the UK at the beginning of this year, as it has now been more than 70 years since his death.

Proposed actions:

(1) Feedback and confirmation of findings thus far. Is our text the Atlantic text?
(2a) If our text is the Atlantic text, and in PD, then propose moving it to The Time Machine (Atlantic text), and then proofreading and transcluding the actual 1895 Heinemann text to The Time Machine (Heinemann text) from the scan Index:The Time Machine (H. G. Wells, William Heinemann, 1895).djvu begun by Hrishikes.
(2b) If our text is not the Atlantic text, or is but not in PD, then delete it and proceed with adding the actual Heinemann text from scan etc.
(3) Decide about Featured status for the text. (Let's wait on that discussion until we know whether we're following 2a or 2b).

Original discussion: User talk:EncycloPetey#The Time Machine (Heinemann text). --EncycloPetey (talk) 17:38, 23 June 2017 (UTC)

1- inclined to agree based on chapters, but could not find an internet archive version, or at hathi trust, and not near me at worldcat [27]
2- i would be inclined to keep both, and change the header data for the reprint. (is it Heinemann text, published by Atlantic?)
2- do not see a reason for deletion (although there is a Scribner 1924 edition)
3- we can have delisted featured, we should think about all the old versions not transcluded from page scans
4-- i imagine we will have more of this, as we research editions. (and as our scholarship improves) the metadata at internet archive is so bad, people could be easily confused. Slowking4SvG's revenge 17:54, 23 June 2017 (UTC)
It's not the Heinemann text. The two texts are completely different editions, even having a different number of chapters (16 versus 12). The concern over deletion is that, if this is a 1924 publication, and if copyright was renewed, this edition might not be in PD yet. My research didn't turn up anything, but someone else's search might do so. --EncycloPetey (talk) 18:33, 23 June 2017 (UTC)
if you did not find anything, that is good enough for me. under the current US copyright search, that is the best result you can get. there is no positive proof of non-renewal. we have to set the standard of "good faith search" even if there is a very small chance of facts emerging. this is the standard of hathi trust. Slowking4SvG's revenge 17:26, 24 June 2017 (UTC)
I'd prefer The Time Machine (1924) as the page name, but aside from that I agree with your assessment and support your proposed actions. —Beleg Tâl (talk) 18:20, 23 June 2017 (UTC)
Unless we can verify for certain that the text is specifically from a 1924 edition, I'd hesitate on adding a date to the filename. Doing so might require further changes to the name later, if research turns up additional information. But if we can verify that it is the "Atlantic text", from any edition of that text, then the proposed name will work regardless of the actual date. --EncycloPetey (talk) 18:33, 23 June 2017 (UTC)
  • Comment: It is now an edition of a work with an uncertain source, we could just delete it if it doesn't bring true value. With regard to its copyright status, that does not change whether it is a 1924 version, or not, the copyright will always be the original version. Any copyright in the remainder of the suspected publication will depend on each of the components, and the renewal aspects. — billinghurst sDrewth 22:30, 23 June 2017 (UTC)
    As far as I am aware, the 1924 edition was a complete revision of the text by Wells himself, and not merely an editorial version. Does that affect the possibility of copyright? --EncycloPetey (talk) 22:35, 23 June 2017 (UTC)
    If it wasn't published before 1923, and wasn't previously published in an authorized version in the US, the URAA would have restored it. It's hard to say where the line is legally between a non-copyrightable new version and copyrightable changes, but decent revision should do it. It will be out of copyright in the US in 2020.--Prosfilaes (talk) 01:19, 24 June 2017 (UTC)
    Expert opinion from H. G. Wells's The Time Machine: A Reference Guide (2004) by John R. Hammond, page 19:

In the original edition of The Time Machine, published by Heinemann in 1895, the text was divided into sixteen chapters, and each chapter was given a title. When Wells revised his novels for a collected edition in 1924, the Atlantic Edition, he retained the text of The Time Machine virtually unaltered but reduced the number of chapters from 16 to 12, eliminating the chapter titles.

Most modern editions follow Wells's revision in dividing the text into twelve chapters. In the discussion that follows chapter references follow this practice.

A comparison of the chapter divisions is as follows:

Heinemann chapter (1895)                Atlantic chapter (1924)
 1  Introduction                         1
 2  The Machine                          1
 3  The Time Traveller Returns           2
 4  Time Travelling                      3
 5  In the Golden Age                    4
 6  The Sunset of Mankind                4
 7  A Sudden Shock                       5
 8  Explanation                          5
 9  The Morlocks                         6
10  When the Night Came                  7
11  The Palace of Green Porcelain        8
12  In the Darkness                      9
13  The Trap of the White Sphinx        10
14  The Further Vision                  11
15  The Time Traveller's Return         12
16  After the Story                     12
    Epilogue                            Epilogue

As per above, Heinemann chapter divisions were original, but Atlantic chapter divisions are currently in vogue. "Virtually" no difference in text. So I propose that the text may be migrated to scan, with title unchanged, along with additional chapters. Two pages are missing in the scan, which I am going to fix by blank placeholders. The blanks may be proofread from the Atlantic text. Hrishikes (talk) 02:00, 24 June 2017 (UTC)

The disadvantage of that approach is that we will have no copy of The Time Machine with the chapter divisions that are now in vogue. If we can legally retain a copy of the Atlantic text, then we should do so for this reason. --EncycloPetey (talk) 02:03, 24 June 2017 (UTC)
Wells's books are PD-UK. But the policy here is PD-US. Non-US texts need not have copyright registration/renewal in the U.S., the copyright is restored by the URAA for 95 years after publication. So we have to assess whether modification of chapter divisions, without alteration of text, amounts to significant change, attracting copyright. If the change is deemed as significant, then we cannot retain this text. Anyway, reduction in chapter number and elimination of chapter titles in currently-in-vogue version of the work may be mentioned in the header note, that should suffice.
P. S. It seems that the Atlantic edition was published in U. S. in the same year (1924) by Charles Scribner's Sons (details at http://www.isfdb.org/cgi-bin/pl.cgi?614641) without copyright notice/renewal. Hrishikes (talk) 03:20, 24 June 2017 (UTC)
Adding chapter names might have been copyrightable, but removing them wouldn't be, and splitting a few chapters in two pieces wouldn't be either. I don't know whether a copyright renewal would have been needed, since that depends on US publication within 30 days of first publication, but the changes don't seem copyrightable.--Prosfilaes (talk) 00:32, 25 June 2017 (UTC)
This site gives a date of October 15, 1924 for the first two volumes in the Atlantic Edition of The Works of H. G. Wells, which includes the text in question. —Beleg Tâl (talk) 01:52, 27 June 2017 (UTC)

Proposed action:

Given that: (a) the original work is PD in both UK and US, (b) the "Atlantic" text seems not to differ substantially except by removal of chapter titles and positioning of breaks, I propose we take the following actions:

(1) Move The Time Machine (Heinemann text) to The Time Machine (Atlantic text) to preserve this version.
(2) Add to the empty The Time Machine (Heinemann text) the front matter from the 1895 scan.
(3) Paste into each chapter subpage the relevant Atlantic text, then split-and-match to the Page namespace of the scan.
(4) Proofread the result against the Heinemann text scan, keeping alert for differences.
(4a) If proofreading demonstrates that the Atlantic text is indeed identical or inconsequentially different from the Heinemann text, then we keep both.
(4b) If proofreading reveals significant editorial changes, we can then delete the Atlantic text at its new location, perhaps moving a copy to Wikilivres, and restoring it in 2020 when the US copyright would expire.

--EncycloPetey (talk) 00:45, 25 June 2017 (UTC)

Agreed. I don't think copyright will matter, anyway, it is PD-US-no notice. Additionally, I propose that the header note should mention metadata of this edition, including UK publication by Unwin and US publication by Scribner. And the Featured Text status should move to this new location of the Atlantic text. Hrishikes (talk) 02:17, 25 June 2017 (UTC)
It's only PD-US-no notice if it was published in the US within 30 days of first publication in the UK. Otherwise the copyright (if any) was restored.--Prosfilaes (talk) 03:41, 25 June 2017 (UTC)
i do not believe we have deleted a work based on URAA, so you may not want to open that can of worms, given the WMF legal advice. Slowking4SvG's revenge 14:37, 26 June 2017 (UTC)
@Slowking4: URAA-based deletion is a regular feature here. Premchand's Idgah was deleted under URAA provision, and later restored when it was proved that it was PD-India on URAA date. The works of Jibanananda Das were shifted to Wikilivres under URAA provision. Same with Sokoli Tomari Iccha and Naya Kashmir. There are many more examples. Non-US works are regularly deleted here when it is found that they were not PD-source country on URAA date. The WMF legal advice you referred to is for allowing foreign works that are PD-source country on current date, not merely URAA date. On that advice, Commons has stopped deletion of works that were not PD-source country on URAA date. This practice has not yet started here. If it starts, then the works of Jibanananda Das will need to be restored. Adopting this policy here is risky. You will do well to remember the direct deletion of Anne Frank's Diary by WMF in Dutch Wikisource, overriding the local community, based on URAA. Hrishikes (talk) 17:00, 26 June 2017 (UTC)
this case is very clearly PD not renewed. what evidence do you need? do you want a transcribed catalog of copyright entries?
sorry to hear you are propagating the URAA hysteria. let the restorations begin. i remember that about Anne Frank, why don’t you let me upload it here as fair use, since it is PD in Australia, and i will take the risk. i do not think that the plaintiff will risk a DMCA takedown given w:Lenz v. Universal Music Corp. the federal judges are very consistent, and i have the $10k ante for federal court, don’t need any EFF help. Slowking4SvG's revenge 22:32, 26 June 2017 (UTC)
In order to state clearly this is "PD not renewed", we would need evidence that the edition was registered for copyright in the US within 30 days of the UK publication. Lacking evidence for that, we cannot say for certain this work falls under PD not renewed. If the original copyright was not filed in the US, or was not filed in 30 days, then the edition may retain copyright under URAA. That's rather the whole point. We need evidence of the original copyright filed and meeting the conditions, and we still need to verify that the text was not substantially altered. If no copyright was filed at the correct time, and if the text is substantially altered, this edition may still be under copyright. --EncycloPetey (talk) 22:47, 26 June 2017 (UTC)
If it was published with permission of the copyright holder within 30 days in the US, it's treated as a US work and is out of copyright for lack of notice as well as lack of renewal. If it wasn't an authorized edition, or it was more than 30 days after the UK edition, then any new copyrightable aspects will be under copyright.
Honestly, this seems like a bit much. There's no real evidence that there's anything copyrightable here, and if there is, there are three years left on its copyright. Someone should split and match it against the old scans, but marginal copyright questions like this shouldn't be that much of a concern, IMO.--Prosfilaes (talk) 04:42, 27 June 2017 (UTC)
I'm inclined to agree with this, especially if we aren't able to determine whether the two publications were 30 days apart. By the time we have all the information we need to know whether it is subject to URAA or not, the copyright may well have already expired. —Beleg Tâl (talk) 05:25, 27 June 2017 (UTC)
registration date is here - Oct. 17, 1924 [28] Slowking4SvG's revenge 14:57, 27 June 2017 (UTC)
Anne Frank's Diary doesn't belong here, since the English translation will be in copyright until 2045 (in the US), and the translator was alive as of 2013. Feel free to bring it up with Commons or nl.Wikisource.--Prosfilaes (talk) 04:42, 27 June 2017 (UTC)
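
The two date rules this thread keeps returning to (US publication within 30 days of first publication, and a 95-year term counted from publication) can be sketched as below. This is only an illustration: the function names are hypothetical, and treating October 15, 1924 as the first-publication date and October 17, 1924 as the US date is an assumption based on the dates reported above, not an established finding.

```python
from datetime import date


def published_within_window(first_pub: date, us_pub: date, window_days: int = 30) -> bool:
    """Return True if the US date falls on or within `window_days` after first publication."""
    return 0 <= (us_pub - first_pub).days <= window_days


def first_public_domain_year(pub_year: int, term_years: int = 95) -> int:
    """Terms run to the end of the calendar year, so PD starts the following January 1."""
    return pub_year + term_years + 1


# Assumed inputs, taken from the dates mentioned in the discussion.
first_publication = date(1924, 10, 15)
us_registration = date(1924, 10, 17)

print(published_within_window(first_publication, us_registration))
print(first_public_domain_year(1924))
```

Under these assumptions the two dates are two days apart, and a 95-year term starting in 1924 runs through the end of 2019, which is consistent with the 2020 date given earlier in the thread.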

Disambiguation quandary

The work Once a Week is a literary magazine, but it shares the title with a book by Author:A. A. Milne.

Ordinarily, we would move Once a Week to something like Once a Week (magazine), and use the base name for disambiguation. But the current title is a literary magazine that already has multiple subpages for its series, volumes, and articles. A move would permanently extend the filename of all of the subpages, and require editing all of the links within and between these pages, both in headers and in the Page: namespace.

In this instance, where there is a multi-volume literary magazine involved, would it make more sense to set the disambiguation page at Once a Week (disambiguation), and leave the magazine where it is? --EncycloPetey (talk) 19:19, 23 June 2017 (UTC)

I'm willing to use AWB to disambiguate properly on the magazine. However, is the Milne work being added imminently? If not, there is no need to disambig yet. —Beleg Tâl (talk) 19:33, 23 June 2017 (UTC)
Although the Milne book is not being done yet (there is a good scan at IA [29]), the literary magazine is actively and rapidly growing on Wikisource each day. The longer we delay, the more moves and changes will have to be made. --EncycloPetey (talk) 19:41, 23 June 2017 (UTC)
That's a good rationale. I'll move it over when I'm on my other PC. —Beleg Tâl (talk) 19:54, 23 June 2017 (UTC)
Just to note that the articles are being created as mainspace base pages rather than subpages of the issue. e.g. The philosophy of advertising. Beeswaxcandle (talk) 20:09, 23 June 2017 (UTC)
Good to know. I'll move them to the proper path while I'm at it. —Beleg Tâl (talk) 20:12, 23 June 2017 (UTC)
The Mainspace articles probably ought to be subpages within series, volume, etc., but with redirects left from the Main namespace. I was looking into making those moves when I discovered the disambiguation issue, and decided it ought to be taken care of first. --EncycloPetey (talk) 20:16, 23 June 2017 (UTC)
Agreed. —Beleg Tâl (talk) 20:29, 23 June 2017 (UTC)

Facsimiles of older United States Reports post Google Books' typical full view cut off

Anybody know where these might be found? Prosody (talk) 19:20, 24 June 2017 (UTC)

These volumes are already present at {{List of United States Reports scanned volumes}}. Are you wanting something additional? Hrishikes (talk) 23:47, 24 June 2017 (UTC)
I was unclear, sorry. There are 564 volumes now, and Google Books only has facsimiles publicly available for US users for ones published before ~1920s (not sure what their copyright restriction policies are for users in other countries). Since asking I've found that Internet Archive seems to have some more. Prosody (talk) 17:06, 25 June 2017 (UTC)
the National Archives has it on microfilm through 1997 https://www.archives.gov/research/guide-fed-records/groups/267.html let’s see if i can find a digital copy at citizen archivist. Slowking4SvG's revenge 23:14, 25 June 2017 (UTC)
can’t find a systemic digitization. we have US govt documents, but they are haphazard. maybe a project with a sweep of the scans available would be a start. we have a few of these large projects that are stalled because the scans are crummy and it is so humongous. Slowking4SvG's revenge 01:32, 28 June 2017 (UTC)

Tech News: 2017-26

15:38, 26 June 2017 (UTC)

A word about clearing the cache and page refresh

We are not alone — Ineuw talk 19:30, 26 June 2017 (UTC)

How to see edit history on a whole text

Is it possible to see the edit history of a whole text? I can see the changes made in the last 30 days through selecting "On Watchlist" in the general Wikisource "Recent Changes" page. I would like to look back and see if anyone or any bot has been working on the project I have been working on, namely An_Exposition_of_the_Old_and_New_Testament_(1828). PeterR2 (talk) 09:31, 27 June 2017 (UTC)

@PeterR2: I'm not sure I understand what you mean by seeing the edit history of a whole text, but if you open the page An Exposition of the Old and New Testament (1828) and then click the "Related changes" link in the left panel (in the "Tools" section), is that what you need? The page opened this way shows edits made to the viewed page, to its subpages, and to other pages related to the main page, so you can see the edits to the whole text of the work (since the whole text of the work consists of the main page combined with all of its subpages). P.S. Sorry if I have misunderstood your request. --Nigmont (talk) 21:16, 27 June 2017 (UTC)
I would love to see an option on the watchlist to automatically watch all the subpages of a given page. There are some mediawiki extensions doing that, was the possibility already discussed here? Koxinga (talk) 21:57, 27 June 2017 (UTC)
There is a gadget (although, I can't find it right now because I can't remember what it was called) for watching all pages in a category. There was an idea earlier this year to extend it to cope with following all pages linked on an Index page, but I don't think that bit was finished. As for seeing all history of a work, I think Special:RelatedChanges is the only way, and that has some limitations (mainly that it only goes back 30 days, because it's using data from RecentChanges). Sam Wilson 22:57, 27 June 2017 (UTC)
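
For history older than the 30 days that Special:RelatedChanges covers, one possible workaround is to ask the MediaWiki Action API for the subpages of a work and then for each page's revision history. The sketch below only builds the request URLs and does not contact the server. The helper names are hypothetical, and it assumes the work's text lives in main-namespace subpages; edits made in the Page: namespace would need the same queries against those titles.

```python
from urllib.parse import urlencode

# English Wikisource endpoint of the MediaWiki Action API.
API = "https://en.wikisource.org/w/api.php"


def subpages_query(base_title: str, limit: int = 50) -> str:
    """Build a URL listing main-namespace pages whose titles start with 'base_title/'."""
    params = {
        "action": "query",
        "format": "json",
        "list": "allpages",
        "apprefix": base_title + "/",
        "apnamespace": 0,
        "aplimit": limit,
    }
    return API + "?" + urlencode(params)


def history_query(title: str, limit: int = 50) -> str:
    """Build a URL fetching the most recent revisions (time, user, summary) of one page."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",
        "rvlimit": limit,
    }
    return API + "?" + urlencode(params)


work = "An Exposition of the Old and New Testament (1828)"
print(subpages_query(work))
print(history_query(work))
```

Fetching the first URL, then calling `history_query` for each returned title, would give a per-page history that is not limited to recent changes. Long lists are paginated by the API, which this sketch does not handle.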

Multiple pages on one page

Is there any way to split a PDF page that contains two pages of the original file? This is the file I’m talking about. It’s on Persian Wikisource. --Yousef (talk) 16:39, 29 June 2017 (UTC)

Hi Yousef, I use scantailor to do this. Jpez (talk) 04:37, 30 June 2017 (UTC)
i’ve been known to crop and rearrange, in a publisher program, and then save as pdf, to maintain the page order & pagination of the pdf. Slowking4SvG's revenge 19:12, 30 June 2017 (UTC)
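
For anyone who prefers to script the split, the geometry is simple: cut each scanned spread down the middle and emit the two halves in reading order. The sketch below is a minimal illustration that only computes the crop boxes; `Box` and `split_spread` are hypothetical names, and applying the boxes to a real file would need a PDF or image library, which is not shown. It assumes the two pages sit side by side and that the gutter is at the exact centre, which real scans often violate.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """A rectangle in page coordinates (for example PDF points)."""
    left: float
    bottom: float
    right: float
    top: float


def split_spread(page: Box, right_to_left: bool = False) -> list[Box]:
    """Split a two-page spread at its horizontal midpoint, returned in reading order."""
    middle = (page.left + page.right) / 2
    left_half = Box(page.left, page.bottom, middle, page.top)
    right_half = Box(middle, page.bottom, page.right, page.top)
    # In a right-to-left book (such as Persian) the right-hand page comes first.
    return [right_half, left_half] if right_to_left else [left_half, right_half]


spread = Box(0, 0, 842, 595)  # an A4 landscape sheet, used here only as an example
for half in split_spread(spread, right_to_left=True):
    print(half)
```

Returning the halves in reading order keeps the resulting page sequence aligned with the original pagination, which matters for the index page on Wikisource.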

Tech News: 2017-27

15:31, 3 July 2017 (UTC)

Pagelists

Anyone want to finally clear this backlog? There are some I don't feel happy working with for copyright reasons. ShakespeareFan00 (talk) 14:10, 4 July 2017 (UTC)

Wikilivres is now Bibliowiki

Wikilivres has moved and rebranded; they are now Bibliowiki and are located at https://biblio.wiki . Our internal references to Wikilivres need to be updated.

  • Documentation needs to be updated (I can do this, albeit it may take a while for me to get to it).
  • The interwiki map for [[wikilivres:foobar]] needs to be updated to point to the correct location, and [[bibliowiki:foobar]] should be created as a preferred alternative.
  • Probably other stuff I haven't thought of.

Beleg Tâl (talk) 15:23, 4 July 2017 (UTC)

wikilivres has been redirected and bibliowiki has been created in the global interwiki map. I suggest moving the template to the new name, and updating as necessary. — billinghurst sDrewth 12:30, 8 July 2017 (UTC)

Join the strategy discussion. How do our communities and content stay relevant in a changing world?

Hi!

I'm a Polish Wikipedian currently working for WMF. My task is to ensure that various online communities are aware of the movement-wide strategy discussion, and to facilitate and summarize your talk. Now, I’d like to invite you to Cycle 3 of the discussion.

Between March and May, members of many communities shared their opinions on what they want the Wikimedia movement to build or achieve. (The report written after Cycle 1 is here, and a similar report after Cycle 2 will be available soon.) At the same time, designated people conducted research outside of our movement. They:

  • talked with more than 150 experts and partners from technology, knowledge, education, media, entrepreneurs, and other sectors,
  • researched potential readers and experts in places where Wikimedia projects are not well known or used,
  • researched by age group in places where Wikimedia projects are well known and used.

Now, the research conclusions are published, and Cycle 3 has begun. Our task is to discuss the identified challenges and think how we want to change or align to changes happening around us. Each week, a new challenge will be posted. The discussions will take place until the end of July. The first challenge is: How do our communities and content stay relevant in a changing world?

All of you are invited! If you want to ask a question, please ping me. You might also take a look at our FAQ (recently changed and updated).

Thanks! SGrabarczuk (WMF) (talk) 14:53, 5 July 2017 (UTC)

Is Khrushchev's secret speech in the public domain or not?

This website says that the translation given in it is in the public domain, but apparently translations of this speech have been deleted over and over again from Wikisource. So can I add this translation to Wikisource or not? --Itsused (talk) 09:53, 7 July 2017 (UTC)

Works that are copyrighted do not become public domain when they are published in the Congressional Record. Their publication in the Congressional Record is considered fair use, but fair use is not an acceptable rationale for hosting a work on Wikisource. The Record itself may be {{PD-US-no-notice}} as suggested by the website you linked to, but that license tag applies only to American works and not to works written in the U.S.S.R. —Beleg Tâl (talk) 12:06, 7 July 2017 (UTC)
The translation as such is PD-USGov if it was produced by the State dept., but the translation is a derivative work of the original and so would, for our purposes, inherit the copyright status of the original. The original is for copyright purposes considered to be simultaneously first published in all the former Soviet republics (the "source country" in policy terms), and subject to all the successor states' current copyright laws. For Russia, for example, the term here is effectively pma. 70 (so until 2041). And as a signatory to the international copyright treaties, that means it is covered by copyright in the US as well. So, in other words, no, this work is not suitable for hosting on any Wikimedia project. --Xover (talk) 19:57, 7 July 2017 (UTC)
yet another example to adopt fair use here. english accepts fair use, it is acceptable for hosting there right now. Slowking4SvG's revenge 03:16, 8 July 2017 (UTC)
Just a comment: the original text can be found on the Russian Wikisource.--Itsused (talk) 07:24, 8 July 2017 (UTC)
@Itsused: Well, then it follows that either they know something we don't, or they have a different policy for this, or they simply haven't noticed that there is an issue. Perhaps you could raise the issue there and report back if any new information comes to light? --Xover (talk) 11:46, 8 July 2017 (UTC)
I left a message on their Scriptorium (Forum) telling them to come here, let's see what happens.--Itsused (talk) 12:28, 8 July 2017 (UTC)
Well, anyway, that's the copyright notice at the end of the text (О культе личности и его последствиях. Доклад XX съезду КПСС (Н.С. Хрущёв)): Это произведение не является объектом авторского права. В соответствии со статьёй 1259 Гражданского кодекса Российской Федерации официальные документы государственных органов и органов местного самоуправления муниципальных образований, в том числе законы, другие правовые акты, решения судов, иные материалы законодательного, административного и судебного характера, официальные документы международных организаций, а также их официальные переводы, государственные символы и знаки, а также символы и знаки муниципальных образований не являются объектами авторских прав.--Itsused (talk) 18:57, 8 July 2017 (UTC)
Via Google Translate: "(On the personality cult and its consequences Report to the XX Congress of the CPSU (NS Khrushchev)): This work is not an object of copyright. In accordance with Article 1259 of the Civil Code of the Russian Federation, official documents of state bodies and local self-government bodies of municipalities, including laws, other legal acts, court decisions, other legislative, administrative and judicial materials, official documents of international organizations, as well as their official Translations, state symbols and signs, as well as symbols and signs of municipal entities are not subject to copyright."
Can we really consider a "secret speech" to be an official government document? I am highly skeptical of this. —Beleg Tâl (talk) 19:36, 8 July 2017 (UTC)
were the Pentagon Papers an official government document? Slowking4SvG's revenge 00:13, 9 July 2017 (UTC)
Yes. The Pentagon Papers were prepared by a government agency. They were declassified and released to the public in 2011. --EncycloPetey (talk) 00:27, 9 July 2017 (UTC)
so a secret document can be an official government document. and written speeches are held to be copyright-able. is there any reason to confuse secrecy with authorship? Slowking4SvG's revenge 08:59, 11 July 2017 (UTC)
Following the copyright policy of ruWS is not of much use to us. Different wikisources have different local policies on copyright. For example, the various Indic wikisources follow the policy of PD-India, for the safety of editors. Plenty of works of authors who died in 1956 or before are present there, which are PD-India but not PD-US. Pre-1923 works of authors who died after 1956 get deleted there. Plenty of books are also present in Commons, which are PD-India but not PD-US. Commons used to delete them before, but not now. English Wikisource follows the policy of PD-US, so no use delving into what others do. Fair use also is not applicable here. English Wikipedia uses "extracts" of works or "downsized" images as fair use, Wikisource uses full works. Full works cannot be fair use; these should be explicitly copyright-free, and in our case, non-controversially PD-US. Another thing is that any change of copyright policy in the source country after the URAA date does not affect US copyright except in case of separate bilateral agreement; therefore, such copyright changes in source country are not useful to us for assessing the copyright of the original/foreign works. Policy as it stood on URAA date in the source country is to be considered, not the later and current ones. The current policy of the source country is for the wikisource specific to that country's language, not for us. Hrishikes (talk) 02:34, 9 July 2017 (UTC)
@Hrishikes: I don't think anyone here has suggested enWS should follow ruWS's policy. What I did suggest, however, is that the community on ruWS, by virtue of being closer to the subject matter and the history, and of speaking the relevant language natively, may have been in a better position than us to correctly assess the copyright status of this particular work, and that therefore we may be able to use their assessment to determine what action our policy suggests we take.
I am not sure why you so emphasise the URAA date. The last relevant change (aiui) to the Russian copyright terms was in 1993 when the pma. term was extended from 50 to 70 years, and so this was the applicable term on the URAA date.
In any case, the English translation qua translation is PD-US (PD-USGov), so it all boils down to whether the Russian original is PD or not. I can't see that it is, but the ruWS community may have more information or better understanding. --Xover (talk) 06:19, 9 July 2017 (UTC)
or the culture is more inclusionist. the propensity of defining who we are by what we exclude tends to drive away newcomers. Slowking4SvG's revenge 08:25, 11 July 2017 (UTC)

Khrushchev's speech as document of government body

Arbitrary break because nesting was getting hard to follow...

So… ruWS asserts that this work, hitherto referred to as a "speech", is exempt from copyright in the Soviet Union and (all) its successor states because it is considered to be some form of official document of a government entity. They do this by reference to Article 1259 of the Civil Code of the Russian Federation, the relevant part of which reads:

6. The following are not objects of copyright:

1) official documents of state bodies and bodies of local government of municipal formations, including statutes, other normative acts, judicial decisions, other materials of a legislative, administrative and judicial nature, official documents of international organizations, and also their official translations;

The question then becomes, can the speech in question be considered an "official document of a state body"?

I can think of three immediate arguments in favour of that position:

  1. In the above copyright assertion, the document is titled "On the personality cult and its consequences Report to the XX Congress of the CPSU (NS Khrushchev)". That looks very much like the "Title—Subtitle (Authoring Entity)" format of any typical government report; any number of which are first, and most officially, presented in the form of a speech to a parliamentary body, even though a printed version makes a lot more sense. In fact, this becomes even clearer when you consider the title page for the 1956 printed edition: "Report of the Central Committee of the Communist Party of the Soviet Union to the 20th Party Congress" (with "N. S. Khrushchev" as author). In other words, I think the term "speech" may be misleading here; it could entirely plausibly be a report from the Central Committee (in modern US terms, think "White House"), by Khrushchev in his formal role, to the 20th Congress of the CPSU (modern US, the Congress). In other words, it's no more unusual in that sense than a State of the Union speech in the US, or a Queen's Speech in the UK.
  2. In a lot of jurisdictions (including, aiui, the US and UK; and presumably Russia), even a more typical (scripted/written; not necessarily off-the-cuff remarks) "speech" by a government official, like a minister, is considered an official document of the relevant department or ministry, and so subject to the same copyright terms or exceptions as printed documents.
  3. Even off-the-cuff remarks etc. in a parliamentary body, by members of that body (i.e. Representatives and Senators in the US Congress), are, when included in the record of that assembly, covered by the same copyright terms or exemptions as the overall record. Unless the words had a prior copyright (i.e. a Senator reading a copyrighted work aloud in the Senate), and the record includes them merely as "fair use", the speech of the members in the assembly becomes part of the record and covered by the same copyright. Since Khrushchev was First Secretary (General Secretary) at the time, he was definitely a member of the party and its Congress, and there is no doubt that this particular speech was made in his role as the effective leader of the Soviet Union (if he wasn't, he would most likely have disappeared mid-sente…).

Based on this reasoning, I actually find myself somewhat persuaded that this speech is in fact in the public domain, through a copyright exemption, in the Soviet Union and its successor states; and that its English translation is therefore in the public domain, as a PD-USGov translation of a public domain Russian original, in the US.

Thoughts? What'd I miss, misunderstand, fail to take into account? Does this reasoning hold up? --Xover (talk) 07:19, 9 July 2017 (UTC)

I'll just point out that this document has a Wikipedia page with the history of the document, which may be useful to determine its official status. It looks to me like you may be right. —Beleg Tâl (talk) 22:34, 9 July 2017 (UTC)
Apparently things are a bit more complicated than this. See the discussion at the Russian Scriptorium.--Itsused (talk) 06:14, 10 July 2017 (UTC)

Ok, having trawled through the thread on ruWS and one of the previous deletion discussions linked from there, the summary seems to be thus: Khrushchev made the speech to the 20th Congress of the Communist Party of the Soviet Union, the governing congress of the Communist Party of the Soviet Union, and he did so in his role as the General Secretary of the Communist Party of the Soviet Union (in fact, he held no other relevant offices at the time of the speech). These are not, in fact, entities of the Soviet state, but are all part of the political party. The fact that the Soviet Union was formally a one-party state, that government in practice was controlled by the party, and that the General Secretary of the party was the de facto Leader of the Soviet Union, are all immaterial as far as the copyright laws are concerned. The formal Government of the Soviet Union, whose products the copyright exceptions mentioned above apply to, was the Council of Ministers of the Soviet Union, led by the Chairman (Premier), and the "Chairman of the Presidium of the Supreme Soviet". Even the version eventually published in the Soviet Union in 1989 was actually published by the Communist Party, and not the Soviet state.

In other words, the speech does not fall under any of the "official documents of government"-type exceptions. The matter then becomes one of Khrushchev's copyrights, which, through various steps, ends up not expiring until the end of 2041 (pma. 70). enwp can use it under fair use, but neither Commons nor enWS can, as it's not actually public domain. And ruWS has nominated it for deletion as a result of our raising the question. --Xover (talk) 10:49, 10 July 2017 (UTC)

next thing you will say: he was not head of state, because he did not hold "relevant offices". triumph of "de jure" over "de facto". Slowking4SvG's revenge 08:53, 11 July 2017 (UTC)
Regardless of who had the original copyright to Khrushchev's secret speech, it expired fifty years after 1956, and so did that of the translations. Also, the US is not the first translator of the document. — Ineuw talk 04:31, 12 July 2017 (UTC)
@Ineuw: As we established above, the copyright to the speech vested in Khrushchev personally. The term of protection is then also relative to the date of the death of the author (pma) and not the date of publication (date of publication comes into play only for anonymous works). And while Russia did previously have a pma. 50 copyright term, this was amended in 2004 to be pma. 70. In other words, the relevant reference is 1971 when Khrushchev died, and not 1956 when the speech was made. And the term of protection therefore lasts until the end of 2041. --Xover (talk) 07:51, 12 July 2017 (UTC)
@Xover: If the amendment was of 2004, it does not have U.S. cognizance, so not relevant for this site. Except in case of bilateral agreement, if any. Therefore it would be date of publication + 95 years, i.e., 2051. Hrishikes (talk) 07:57, 12 July 2017 (UTC)
@Hrishikes: Russia is a signatory to the Berne agreement and a WTO member, and so their copyright terms are valid in the US. And even a pma. 50 term would not expire until the end of 2021. --Xover (talk) 08:17, 12 July 2017 (UTC)
@Hrishikes: Xover's information is correct. If it is considered to be Khrushchev's personal property, then perhaps we should ask Khrushchev's son what is the status of the document. P.S: Who will compose and send the email? :-) — Ineuw talk 09:07, 12 July 2017 (UTC)
That's not how the Berne Convention works. Each nation sets its own copyright terms and honors the copyright of other signers within those terms. Those terms have to be at least life+50, but while the copyright length in the US is publication based, the w:Marrakesh Agreement apparently means there's a multinational agreement that the US copyright law, as of the w:Uruguay Round Agreements Act satisfies that requirement. "Copyright Term and the Public Domain in the United States" from Cornell University gives a fairly detailed description of the durations of US copyright law.
In this case, presuming that the copyright wasn't owned by a government (a complexity not covered by Cornell's chart), and it is considered published in 1956 (that would be a pretty broad limited publication), then 2051 is right. If it is considered published in 1989 (the first time it was openly published in the Soviet Union), then it will be under copyright in the US until 2048 (70 years from 1978, as a grandfathered protection for unpublished works).
It's certainly one of the points where the law and the practice don't come particularly close. Sergei Nikitich Khrushchev, who, as the surviving heir, is probably the copyright owner, doesn't seem to act as the owner. The Soviet Union didn't have international copyright relations until 1973, and the laws in the US left most foreign works in the public domain, including this one, so the rules have changed a lot.
Sergei Nikitich Khrushchev is at Brown University, so someone might be able to get an OTRS clearance on the speech.--Prosfilaes (talk) 09:08, 12 July 2017 (UTC)
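The expiry years quoted in this thread all come from simple addition, so a small worked example may help keep them apart. This is only a sketch of the arithmetic: the function names are hypothetical, and it says nothing about which rule actually applies to the speech.

```python
def pma_term_end(death_year: int, term_years: int) -> int:
    """Year a life-plus-N (pma) term runs to, counted from the author's death."""
    return death_year + term_years


def fixed_term_end(start_year: int, term_years: int) -> int:
    """Year a fixed-length term runs to, counted from a publication or reference year."""
    return start_year + term_years


# Figures taken from the discussion above.
print(pma_term_end(1971, 70))    # 2041: pma 70, Khrushchev died in 1971
print(pma_term_end(1971, 50))    # 2021: the older pma 50 term
print(fixed_term_end(1956, 95))  # 2051: publication + 95, if published in 1956
print(fixed_term_end(1978, 70))  # 2048: 70 years from 1978, if published in 1989
```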
"a complexity not covered by Cornell's chart" i.e. you are manufacturing complexities not covered by the gold standard of practice of libraries. do we have any evidence that the Khrushchev heirs have claimed copyright? yes, let’s email everyone to confirm their non-action. what if they use a PD mark, or general statement, which commons won’t accept? it is a Gordian knot of your own making, devoid of any risk assessment. Slowking4SvG's revenge 11:35, 13 July 2017 (UTC)
The gold standard is the law itself, and then the Copyright Office; third parties can at best offer a silver standard. In any case, the complexity not covered is that the URAA does not restore certain government-owned copyrights, so ignoring it just makes it more likely this is under copyright.
Do you see any evidence the heirs of Lord Dunsany have claimed copyright over War Poems? At least in my world, very well known works that have a known heir with a good case for copyright have a decent risk associated with them; perhaps less than the works of Harlan Ellison™, but more than a lot of works I know were copyright-renewed.--Prosfilaes (talk) 17:10, 13 July 2017 (UTC)
a law is not a standard. the copyright office can be captured. i have been to the copyright office as they "consulted" with the author’s guild as they vented on hathi trust. it did not inspire confidence. we can look to the standard of practice of the best third parties as to what our standard of practice should be. being more sophistical about edge cases, is not a higher standard. we have guidance from legal about URAA, would you care to follow it? "ignoring it just makes it more likely this is under copyright." - no, our assessment of the risk does not increase the risk. "decent risk" - clearly we have different views as to what the risk is. if they have not enforced the copyright at russian wikisource, does it exist? Slowking4SvG's revenge 06:58, 14 July 2017 (UTC)
Why are you grumbling like that? Every site needs some rules, otherwise the site cannot run. Strictly speaking, PD-source country is the required item. If it is PD-source country but not PD-URAA, then, yes, it can be uploaded, WMF will turn a blind eye till someone complains; and usually, there will be no complaint. But every Wikisource has locally made some policy. Usually each WS follows the copyright term of the country to which the language primarily belongs. Indic WSes follow PD-India, ruWS likely follows PD-Russia, we at enWS follow PD-US (including PD-URAA). Obviously, following PD-Russia or PD-India and other such foreign rules will lead to a lot of confusion, a site cannot run this way. Moreover, the item under discussion is not yet PD-Russia, that's why it has been nominated for deletion (as noted above) at ruWS. So wherefrom comes the question of its inclusion here? Our copyright policy, as it currently stands, is quite sound, IMO. We require PD-US on its own, or PD-US by URAA, or copyright-release by CC, OTRS etc., or Edict-Gov of any country. Except Edict-Gov, it is totally US-centric. Concentrating on one country's rule won't lead to much confusion. Yes, our servers are US-based, even then, allowing PD of other countries won't legally lead to much problem, as Commons is already allowing it. But the point is, this will lead to much confusion and disarray. We cannot just opt for PD by copyright rules of any country. No WS allows this kind of broad-spectrum thing. So, irrespective of WMF's legal advice about URAA, as you noted, our policy of following the URAA should be continued, if we want to run this site in a cohesive way. Hrishikes (talk) 07:35, 14 July 2017 (UTC)
because russian wikisource has an interpretation of russian law, and everyone here knows better. do not tell me copyright office is better than cornell. they are not. they are political hacks. cornell are librarians. there is a reverence for legal scripture that sounds like the MPAA. risk assessment is not turning a blind eye, rather it is an acknowledgement that the law is not black and white or right and wrong. Slowking4SvG's revenge 07:56, 14 July 2017 (UTC)
@Slowking4: Then what is your point? Copyright Office and Cornell librarians are useless, so let's just disregard them and allow all books here on fair use doctrine and let the copyright go hang? If you have some new policy in mind, please put up a separate and detailed proposal for discussion of the community. That would be a constructive contribution. Hrishikes (talk) 14:48, 14 July 2017 (UTC)
so what is your point? - that you are still right and everyone else is still wrong? i have dealt with the WMF legal team, they are reasonable, and the amateur lawyers on commons are not. actual legal practice is much more than the sum of all the documents. i offered up a policy; you lot are too attached to your "live free or die". see also Wikisource:Scriptorium/Archives/2016-10#Exemption_Doctrine_Policy_.28EDP.29 and historically Wikisource:News/2006-03-08/Debate over fair use on Wikisource. Slowking4SvG's revenge 21:37, 16 July 2017 (UTC)
Which everyone else? I am not seeing anyone else supporting your points. You are proposing fair use and disregarding of URAA (claiming support of WMF Legal), as I understand. If you can provide supporting evidence and rationality, I have no problem with that, that's why I requested you to put up a separate and detailed proposal. To prove your point, you will need to
    • demonstrate that works reproduced in entirety passes test-3 of the four-balance test given in the Fair Use Rule, in accordance with Finding-3 of the US Supreme Court in Harper & Row v. Nation Enterprises.
    • demonstrate that WMF Legal has supported disregarding of URAA in a legally valid way. As stated by WMF Legal, they did go for a legal fight against URAA in w:Golan v. Holder, but lost in court. So their stand does not have legal validity. So they have advised for a case-by-case analysis of every work for URAA assessment, citing complexity of the law, and also advised to wait till a take-down notice is received, for deletion of the work.(1, 2) This is suitable for Commons, not Wikisource. In Commons, your labour consists of uploading the work. If it is deleted, that labour is lost. In Wikisource, a work is proofread, validated, transcluded, sometimes selected for Featured Text. All that labour by multiple editors is lost when the work is deleted.
    • So if you take the fair use stand against URAA, you need to demonstrate that the "lost-in-court" stand of WMF for waiting for take-down notice is suitable for Wikisource.
    • In addition to point 1 above, the Fair Use rule also stipulates use of the reproduced work for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, validity of which needs to be demonstrated in case of Wikisource.
    • As for EDP, it has a very narrow spectrum within the fair use rubric, and not applicable to whole works. Within its narrow spectrum, however, I consider it supportable in this site, however, site-specific policy needs to be developed, preferably with some input from a lawyer.
    • Without point-by-point logical argument with supporting evidence and rationale, only inserting some random comments about fair use, corruption of copyright office etc., in any copyright-related discussion, as you have been recently doing, is not sufficient for convincing the community. And claiming support of "everyone else" also requires evidence. Hrishikes (talk) 01:46, 17 July 2017 (UTC)
I see Wikisource:Scriptorium/Archives/2016-10#Exemption_Doctrine_Policy_.28EDP.29, and feel it's a little limited; my need for fair use would be for NTSB reports, which needs more than images. But it in no way covers this case. I gave you a risk assessment; "very well known works that have a known heir with a good case for copyright have a decent risk associated with them". I have deep problems with risk assessment; most people can't defend even modern work from copyright infringement whereas the Doyle and Christie estates can harass legit users of PD work. Frankly, risk assessment feels wrong; if it's okay to copy stories from the pulps of the 1970s, then we shouldn't worry if they have the name Harlan Ellison™ attached to them.--Prosfilaes (talk) 09:41, 17 July 2017 (UTC)
you raise a good point that the MLK heirs are different from the Khrushchev heirs. do we have any evidence of these heirs claiming copyright? any Russian government officials? any European officials? are we not projecting american legal practice upon foreign legal systems? as we know the law is more than the sum of the texts.
you raise a good point about copyfraud. does not PRP give a license to pre-emptively enforce a fraud? or modify a CC license by writing a stern letter? have not items been deleted with only a stern letter, and not a DMCA? do we agree there are some risks we would undertake, if we had a consensus the item was PD? have not Swedish and German uploaders taken those risks?
"demonstrate that WMF Legal has supported disregarding of URAA in a legally valid way." now you are questioning the legal judgement of WMF? really? where did you go to law school?
"All that labour by multiple editors is lost when the work is deleted." some of us have more labor to be deleted than others.
"demonstrate that works reproduced in entirety passes test-3 of the four-balance test" to who? you? i do not have much confidence in a consensus, given PRP argumentation. there was little interest in a tighter standard. Slowking4SvG's revenge 13:28, 17 July 2017 (UTC)
In the US, do the heirs of Khrushchev have a copyright on the Secret Speech? The answer is apparently yes, and you have made no real attempt to argue otherwise. Are there current exceptions to the rule that we don't post anything that's not in the public domain that we don't have a free license to? No. If you want to argue for an exception, this is not the topic for it.--Prosfilaes (talk) 19:29, 17 July 2017 (UTC)
on the contrary, i have given up reasoning, that it is PD russia, which wikisource russia agrees with. even if i agreed that we should "consider the heirs" (which i do not), those heirs apparently agree it is PD, since they do not enforce their "rights". this is not a private letter, but a public speech with historical implication, regardless of the security. no exception here, merely your tl;dr sophistry. you do not address points about evidence, rather you shift ground and shift burden of proof. "risk assessment feels wrong" - no deleting items that very well could be kept, and diminish the texts available, is wrong. it is not a way to collaborate: it is not dictation. Slowking4SvG's revenge 23:00, 17 July 2017 (UTC)
@Slowking4: As you do not believe in consensus and continue repeating the same points without going through the discussion properly, it seems futile to discuss with you. Anyway, you have raised some allegations against me and repeated some wrong assertions.
  • I have not questioned any "legal judgement of WMF". I could not have, simply because I am not aware that WMF is a court and that they have passed any "legal judgement". I have cited the "legal judgement" of the U.S. Supreme Court. If you read carefully, hopefully you will be able to discover it.
  • I have never asked you to demonstrate anything to me. Please refrain from this kind of allegation. I have repeatedly requested you to give a detailed proposal, containing your points, to the community. The matter of demonstrating pertains to that proposal.
  • The item under discussion is not PD-Russia; nobody has claimed so. It is hosted in ruWS under RusGov, i.e., EdictGov-Russia. This has been challenged in ruWS deletion nomination (1).
Hrishikes (talk) 01:25, 18 July 2017 (UTC)
do not put words in my mouth - i have never said i do not believe in consensus. rather i do not believe in toxic non-leadership. what would you know of consensus? how should i report this incident at wikimania: "out of an abundance of caution Khrushchev's speech was repeatedly deleted, on english, out of concern that his heirs would sue us" - does that summarize your views? Slowking4SvG's revenge 01:35, 18 July 2017 (UTC)
Your exact words were "i do not have much confidence in a consensus ...". Anyway, this item (original version) is not PD-Russia, or EdictGov-Russia or PD-URAA. It is not PD of any kind. Neither is it CC or OTRS. So the author's son appears not to be claiming the copyright. So what copyright tag, according to you, should apply to it? {{Copyright-unclaimed}}? Hrishikes (talk) 02:16, 18 July 2017 (UTC)
thanks for the quote - it is a paradox: i have very little confidence in copyright discussion here or the sub-optimal "consensus" they produce. and yet i abide by them. just look at this discussion; it does not inspire much confidence in copyright determinations. widespread use of tautology and parsing of foreign texts, not evidence or context. "winning is the only thing": we have a generation of admins who would rather reign in hell than serve in heaven. continue with this behavior and it will be wikinews everywhere. how about {{PD-Arse}} Slowking4SvG's revenge 12:39, 19 July 2017 (UTC)
You claimed that it is "PD russia". How is that not either parsing of foreign texts or pulling something out of your ass? US law is basically "works first published between 1923 and 2002 inclusive are in copyright for 95 years from first publication." Once you cross that line, you're dealing with a work presumed copyright and need to actively justify that it's not.
Dealing only with works that were published at least 95 years ago would leave a huge body of valuable works open for us. There are other sites that don't care about copyright nearly as much as us, or at all. PD-Arse is a tag appropriate for the Pirate Bay, not us. That is one of the things that makes Wikimedia wikis different.--Prosfilaes (talk) 13:50, 19 July 2017 (UTC)
no - wikimedia is renowned for the lack of adult supervision. library professionals are appalled by the "cultural buzzsaw", and they are building their own transcription websites, where the volunteers are run-off wikimedians. the fact that historical documents are stuck on internet archive, and academic blogs and google docs is no loss to you. it’s all good, because you are large and in charge. i can link off wiki as easy as wikisource. i wonder what the wikimania audience will say as i raise this issue? Slowking4SvG's revenge 20:27, 20 July 2017 (UTC)
Nothing screams adulthood like ignoring standard English orthography and explaining that copyright problems go away with {{PD-Arse}}. I think you'll actually find that libraries prefer proper English and careful copyright checks.--Prosfilaes (talk) 00:12, 21 July 2017 (UTC)
nothing whispers adulthood like having to have the last word. guy kawasaki is a smart man, and it is funny his parting shot is "You feel surrounded by incompetent idiots – and you can’t help letting them know the truth every now and then". i wonder what prompted that? i’m just reporting what actual librarians say about this place, and it is not "my, aren’t they so diligent about their copyright checks"; but then why don’t you ask a librarian, if you can find one? i guess i will say at wikimania: "copyright hysteria = lost opportunities" how’s that sound to you? Slowking4SvG's revenge 01:10, 21 July 2017 (UTC)

Per project statistics

Following my previous post about progress statistics by project, I decided to do some analysis myself. Based on the latest database dump, I looked at the Page: namespace and only counted the edits which change the status of a page.

It is possible to find many interesting tidbits of information from the different projects. For example:

However, it is mostly interesting to check the status of the backlog. For example:

What I wanted mostly was to know on which projects people are currently working. Dumps are not the most appropriate way to go about it as we miss a few days, but it is possible to know what happened in June using the latest dump from July 1st. In June, 419 projects have been edited (i.e. at least one page changed status), the most active being:

Editions by project in June 2017 (as of July 1st)

| Index name | Index status | Pages validated | Pages proofread | Pages empty | Pages remaining | Number of pages modified | Number of revisions | Number of authors |
|---|---|---|---|---|---|---|---|---|
| Index:Travels in Mexico and life among the Mexicans.djvu | To be proofread | 228 | 396 | 62 | 0 | 667 | 910 | 4 |
| Index:Tarzan of the Apes.djvu | Validated | 407 | 0 | 17 | 6 | 410 | 769 | 3 |
| Index:Thoreau - His Home, Friends and Books (1902).djvu | Validated | 346 | 0 | 38 | 0 | 384 | 745 | 15 |
| Index:The Shaving of Shagpat.djvu | Validated | 306 | 0 | 20 | 0 | 308 | 611 | 2 |
| Index:Ballantyne--The Battery and the Boiler.djvu | Proofread | 116 | 316 | 16 | 4 | 432 | 475 | 2 |
| Index:The Novels and Tales of Henry James, Volume 1 (New York, Charles Scribner's Sons, 1907).djvu | Validated | 550 | 0 | 18 | 0 | 455 | 455 | 2 |
| Index:The Novels and Tales of Henry James, Volume 2 (New York, Charles Scribner's Sons, 1907).djvu | Validated | 564 | 0 | 14 | 0 | 434 | 434 | 2 |
| Index:The Bostonians (London & New York, Macmillan & Co., 1886).djvu | Validated | 451 | 0 | 13 | 0 | 430 | 430 | 1 |
| Index:Cuthbert Bede--Little Mr Bouncer and Tales of College Life.djvu | Proofread | 48 | 256 | 14 | 0 | 311 | 425 | 3 |
| Index:Royal Naval Biography Marshall sp4.djvu | To be proofread | 14 | 420 | 18 | 29 | 383 | 385 | 2 |
| Index:Maud Howe - Atlanta in the South.djvu | To be proofread | 7 | 339 | 10 | 6 | 348 | 352 | 2 |
| Index:Morley--Travels in Philadelphia.djvu | Proofread | 203 | 69 | 16 | 0 | 255 | 347 | 2 |
| Index:Ballinger Price--Us and the Bottle Man.djvu | Validated | 162 | 0 | 14 | 0 | 176 | 338 | 4 |

Is there any interest for this kind of statistics and analysis? I understand Wikisource is currently driven by very dedicated users who start and often finish a work all by themselves. However, for a more casual editor, who wants to simply proofread a few pages and see a complete book including his work without having to wait for years, this could be a good extension to the proofread of the month (which is clearly visible in the table above!).

Technical description: I parsed a dump of the database to extract each project (based on the index pages), each page (based on the page namespace) and each revision changing the status of the page (not proofread, proofread, validated, etc.). The link between the page and the project is made by looking at the page name. This approach means I don't deal well with the projects where the page is not a subpage of the index (there are 8769 of these). I also extracted the number of pages of the file, in order to take into account pages not yet created (I did not find how to get this data from the database directly, so I had to scrape the HTML of the Commons page).
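For readers who want to try something similar, here is a minimal sketch of the counting step described above. It is not the script actually used. It assumes the revisions have already been extracted from the dump into an in-memory list of (page title, revision texts) pairs, and that each revision carries its status in a `<pagequality level="N" ...>` tag; the function names and sample data are hypothetical. It ignores revision dates and pages not yet created.

```python
import re
from collections import defaultdict

# Assumed format of the status marker inside a Page: revision.
QUALITY_RE = re.compile(r'<pagequality\s+level="(\d)"')


def index_for_page(title):
    """Map 'Page:Foo.djvu/12' to 'Index:Foo.djvu'; return None if not a subpage."""
    if not title.startswith("Page:") or "/" not in title:
        return None
    base, _, _ = title[len("Page:"):].rpartition("/")
    return "Index:" + base


def quality_level(text):
    """Return the status level found in a revision, or None if there is none."""
    match = QUALITY_RE.search(text)
    return int(match.group(1)) if match else None


def status_changes(revisions):
    """Count revisions whose status differs from the last known status."""
    changes = 0
    previous = None
    for text in revisions:
        level = quality_level(text)
        if level is None or level == previous:
            continue
        changes += 1
        previous = level
    return changes


def summarise(pages):
    """Aggregate status-changing revisions per index."""
    stats = defaultdict(lambda: {"pages_modified": 0, "status_changes": 0})
    for title, revisions in pages:
        index = index_for_page(title)
        if index is None:
            continue  # not a subpage of an index, so it cannot be linked
        changed = status_changes(revisions)
        if changed:
            stats[index]["pages_modified"] += 1
            stats[index]["status_changes"] += changed
    return dict(stats)


# Hypothetical sample data standing in for the parsed dump.
sample = [
    ("Page:Example.djvu/1", [
        '<pagequality level="1" user="A" />first pass',
        '<pagequality level="3" user="A" />proofread',
        '<pagequality level="3" user="B" />typo fix, same status',
        '<pagequality level="4" user="B" />validated',
    ]),
    ("Page:Example.djvu/2", ['<pagequality level="1" user="A" />first pass']),
    ("Main Page", ["not in the Page: namespace"]),
]
print(summarise(sample))
```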

Koxinga (talk) 08:45, 9 July 2017 (UTC)

From the perspective of "completeness" of works, we are interested in works that are nearly proofread, or nearly validated, that have not been edited for a period of time, so we can put resources to them. They are cheap wins with true value. If you are looking to see missing/non-created pages of a work, then you probably want to get a count of pages from the File: and compare that with the number of subpages of Index:. That would be a neat comparison as that would be another indicator of near completeness.

The other factoids are interesting trivia, though I am not sure that they are particularly enlightening for the site, or our work; though I could just be considered a boring unexcitable, unromantic, task-focused fart. Note that the stats about projects don't consider our multi-volume projects (EB1911, DNB, DMM, +++). Thinking of what would be useful: numbers of Index: works with counts for images missing, score missing, etc. so we could focus efforts, or promote efforts to assist completion. Numbers of edits on works are not relevant, though maybe the time from creation to validation may have some social interest, though even that is dodgy if the work has advertising. We already track our validated and proofread works, and try to keep on top of transclusion status. (As said, I may have the wrong focus for what is interesting to the trivia buffs.) — billinghurst sDrewth 13:19, 9 July 2017 (UTC)

Note that pages remaining (not proofread) can be due to works having their advertising pages remaining, e.g. Ballantyne's work above. A work like that has been marked as proofread (important), and we track by a category that its advertising is not done. So unread pages in a proofread or validated work are expected, whereas a small number of unread pages in a work not yet proofread is interesting. We are a complex beast. :-) — billinghurst sDrewth 13:25, 9 July 2017 (UTC)
Finding a work that had no pages remaining (ie. nothing to be proofread) for a work marked as "not proofread" is very useful as it enables us to review and reclassify as required by the review. — billinghurst sDrewth 14:03, 9 July 2017 (UTC)
Thank you for your comments. A few answers:
  • The trivia was to show the different possibilities, but I mostly aim to do something useful for project tracking and motivation of the different users. I know that at least for me, it would be motivating to see which works are being actively worked on, so that I can see progress when I come back to it, I know I can ask questions and exchange about the project, etc.
  • Yes, I use the actual number of pages of the uploaded file, so I can find the pages not yet created, I mostly consider them the same as the "not proofread" pages but it can be separated if needed.
  • I don't trust the "proofread", "validated", etc. flag in the index. It is manually set, so there can be mistakes in one direction or another. That's why I think it is useful to compare it to the actual situation of each page.
  • It is possible to remove the advertisement pages from my analysis, based on the <pagelist> tags, but we need to define a consistent marking for them. I saw some adv, adv., advt, advert (with a bonus "index to advert"), advertisement. Do we allow all of these or do we try to normalize?
  • I can take into account multi-volume projects and group them together, by looking at the Volumes part of the index page, especially the Category:Scanned volume navigation templates, I will look into it.
Koxinga (talk) 19:43, 9 July 2017 (UTC)
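On the question of advertisement labels, one possible way to recognise the variants listed above is a single pattern, sketched here. The canonical label "Adv" and the function names are hypothetical choices, not an existing convention, and the sketch deliberately leaves "index to advert" unmatched, since whether that counts as an advertisement page is a policy decision.

```python
import re

# Matches adv, adv., advt, advert, adverts, advertisement, advertisements
# (case-insensitive, with an optional trailing full stop).
ADVERT_RE = re.compile(r"adv(?:t|erts?|ertisements?)?\.?", re.IGNORECASE)


def is_advert_label(label):
    """True if a pagelist label looks like one of the advertisement variants."""
    return ADVERT_RE.fullmatch(label.strip()) is not None


def normalise_label(label, canonical="Adv"):
    """Replace any advertisement variant by one canonical label; keep others as is."""
    return canonical if is_advert_label(label) else label


for label in ["adv", "adv.", "advt", "advert", "advertisement", "index to advert", "Title"]:
    print(label, "->", normalise_label(label))
```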
There is something wrong with the information gathered. Index:Popular Science Monthly Volume 12.djvu was completely validated a long time ago, and the proofreading of Index:Travels in Mexico and life among the Mexicans.djvu was also completed, perhaps at the beginning of this month. — Ineuw talk 04:41, 12 July 2017 (UTC)
My analysis is based on a database dump, using the most recent one from July 1st. At the time of this dump, Page:Popular Science Monthly Volume 12.djvu/430 was not yet validated; that was done after I posted this message. For Index:Travels in Mexico and life among the Mexicans.djvu, I did say that there were 0 pages remaining; however, at the time of the dump, even though all the pages had been proofread, the index status was still "to be proofread". It was also changed just after I posted this message. If there is an interest, it would be possible to use the recent changes to update the data more frequently, but judging from the lack of response here, it does not seem worth it. Koxinga (talk) 01:12, 13 July 2017 (UTC)
Thanks for clarifying. It's interesting. — Ineuw talk 09:50, 13 July 2017 (UTC)

Tech News: 2017-28

15:07, 10 July 2017 (UTC)

Using Template:SIC with incorrect punctuation?

I have an unclosed bracket ("parenthesis" for any Americans reading this) in the first paragraph of the commentary on Chapter 14 at Page:An_Exposition_of_the_Old_and_New_Testament_(1828)_vol_1.djvu/125. I can't work out how to show that the first comma after the opening bracket should be a closing bracket, as shown in other editions from the 18th and 19th centuries. --PeterR2 (talk) 08:07, 11 July 2017 (UTC)

My own preference is not to mark anything in these situations. I just replicate what the printed text says. Beeswaxcandle (talk) 08:34, 11 July 2017 (UTC)
I guess people do these things for different reasons. I am working on this because I want to contribute to a reliable online text of a good edition of Matthew Henry's Bible commentary. The only existing one, which is used in various mobile phone apps, is from an unknown edition/editions and therefore not possible to proofread.

--PeterR2 (talk) 08:23, 12 July 2017 (UTC)

If I think it's important to show what seems to be the correct punctuation, sometimes I include the word attached to the punctuation to make it a little more visible. So if I understand correctly the place you're talking about, you could try this: {{SIC|history,|history)}}, which results in: history, — is that more or less what you had in mind? Mudbringer (talk) 10:04, 12 July 2017 (UTC)
I agree, and have occasionally done the same as Mudbringer, when I was concerned about the authenticity of the text. For some texts, it's not worth noting, and in some cases it's actually a printing / scanning issue. I've come across scans where there ought to have been a period at the end of a line, but none is visible in the scan, or where the scan shows a period, but it ought to be a comma. In some of these cases, I have had access to a printed copy, and could see the impression of the period, or the slight bit of ink starting a comma. Sometimes the ink isn't properly distributed by the printer, or the punctuation type was defective at the original press. In those situations, it's not worth marking. --EncycloPetey (talk) 17:36, 12 July 2017 (UTC)
Or just stick the correct punctuation inside a <includeonly> so it displays per the scan in the Page: ns, and it displays corrected in main ns. Stick an html comment in place if you really need to have a comment. I would not normally use {{SIC}} to correct punctuation, it pretty much defeats the purpose if it is a necessary piece of punctuation. — billinghurst sDrewth 12:47, 15 July 2017 (UTC)
In the case PeterR2 brought up, it would be necessary to have history<includeonly>)</includeonly><noinclude>,</noinclude> since there's also a comma that needs to be suppressed in the transcluded text. If this is an approach often taken on Wikisource, wouldn't it be better to have a template to do this, to make it clear that this is a permissible option, and make it possible to find places where this has been done? Mudbringer (talk) 17:59, 15 July 2017 (UTC)
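To make the effect of that markup concrete, here is a small simulation of how the two tags behave. It is a deliberately simplified model, not MediaWiki's parser: it assumes the tags are not nested and have no attributes, and the function names are hypothetical. Viewed directly in the Page: namespace, `<noinclude>` content is shown and `<includeonly>` content is hidden; when transcluded into the main namespace, the reverse applies.

```python
import re


def _strip_block(text, tag):
    """Remove <tag>...</tag> together with its contents."""
    return re.sub(rf"<{tag}>.*?</{tag}>", "", text, flags=re.DOTALL)


def _unwrap(text, tag):
    """Remove the <tag> and </tag> markers but keep what is between them."""
    return re.sub(rf"</?{tag}>", "", text)


def render(wikitext, transcluded):
    """Return the text a reader would see, directly or via transclusion."""
    if transcluded:
        return _unwrap(_strip_block(wikitext, "noinclude"), "includeonly")
    return _unwrap(_strip_block(wikitext, "includeonly"), "noinclude")


snippet = "history<includeonly>)</includeonly><noinclude>,</noinclude>"
print(render(snippet, transcluded=False))  # history,  (matches the scan)
print(render(snippet, transcluded=True))   # history)  (corrected reading)
```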

Style changes?

When I used to edit index pages like this:- Index:The Atlantic Monthly, Volume 18.djvu

The field boxes USED to be in monospace. They currently aren't, meaning that options overrun.

Is this a style change on Mediawiki, or a local configuration issue with an updated Firefox? ShakespeareFan00 (talk) 10:59, 13 July 2017 (UTC)

sure looks like a style change in media wiki. index page editing interface change. is there any documentation / notice for this? Slowking4SvG's revenge 11:20, 13 July 2017 (UTC)
[Wikisource-l] Tech details
GSoC Proposal[2017]: Improvements to ProofreadPage Extension and Wikisource
Weekly reports: Improvements to ProofreadPage Extension and Wikisource, Zdzislaw (talk) 20:50, 13 July 2017 (UTC)
In summary, there was no notice where the ordinary user of enWS would see it. It's a pity the GSoC Proposal was called "improvements" when so far it's resulted in a new user right that had to be reversed almost immediately after implementation and this uglification of something that worked just fine. Beeswaxcandle (talk) 07:51, 14 July 2017 (UTC)
hmmm, I do not think so... page editor interface will be switching over to OOUI soon - is now ready for deployment to Wikimedia wikis, see: The_Atlantic_Monthly/Volume_2/Number_2/The_Autocrat_of_the_Breakfast-Table. So, adaptation of "the rest" of the proofread extension ui is also required. It would rather be nice to say "thank" that someone wants to do... and get some skins for improvement... Zdzislaw (talk) 16:34, 14 July 2017 (UTC)
As I am no UI guy, I don't see why a switch to OOUI also implies a style change, but never mind, I was just curious. Just a comment: the style of the summary section in the pages above is different for the Index and the Main ns.— Mpaa (talk) 17:21, 14 July 2017 (UTC)
After switching to OOUI they will be the same: Index:The_Atlantic_Monthly,_Volume_18.djvu, Zdzislaw (talk) 17:34, 14 July 2017 (UTC)
You mean they will both be glitchy? I'm not getting arrow images by some of the drop-down items (like "Progress") and it looks as though the limitations on values for Year of Publication have gone away. It would have been nice if the change had been explained clearly to non-tech-minded users beforehand, or better still, if it had been tested on non-Wikipedia projects like Wikisource. I hate to think what this will do to Wiktionary editing. --EncycloPetey (talk) 21:21, 15 July 2017 (UTC)

Another question about copyright.

Looked up a short microfilmed article on the New York Times, downloaded the page as .pdf, and typed the contents into a text file (pdf copy and paste was blocked). At the bottom of the microfilmed article was the following claim: Published: February 12, 1877 Copyright © The New York Times. Is it or is it not in the public domain? — Ineuw talk 02:17, 16 July 2017 (UTC)

It's public domain. The assertion of copyright there is at best imprecise (at worst fraudulent). If it was published before 1923 (anywhere in the world) it is safe to assume it's public domain in the US. And since this was first published in the US we need not take into account any differing copyright terms in other countries (i.e. the annoying rules for magazines published in the UK). --Xover (talk) 14:05, 16 July 2017 (UTC)
Anyone can make a claim about anything... Whereas the reality is that I got the most votes at the recent US election... But it doesn't necessarily make it factually correct. — billinghurst sDrewth 14:11, 16 July 2017 (UTC)
do not know why experienced editors keep asking. there is a reflexive naivete towards the false boilerplate that institutions persist in. how many items have been deleted on false claims? it shows that the copyright determination is not balanced, but tilted sharply toward delete. Slowking4SvG's revenge 21:48, 16 July 2017 (UTC)
@Slowking4: Experienced editor, maybe. Knowledgeable, I doubt that. I will bring up the matter with The New York Times, since I love to bother them occasionally. — Ineuw talk 16:21, 18 July 2017 (UTC)
bother them all you want, their legal department does not care. you could also bother Getty, MacArthur Foundation, National Portrait Gallery, London, Louvre, Smithsonian Institution, etc, etc. [37] Slowking4SvG's revenge 12:56, 19 July 2017 (UTC)

Tech News: 2017-29

22:59, 17 July 2017 (UTC)

Future changes previously mentioned: TemplateStyles

Mentioned first above at #Tech News: 2017-24

Hi all,

we'll enable TemplateStyles (mw:Extension:TemplateStyles and mw:Help:TemplateStyles) tomorrow on mediawiki.org, wikitech.wikimedia.org and some test wikis. (Today for those of you in UTC or later time zones.)

TemplateStyles allows editors to add complex CSS to templates with the help of a <templatestyles> tag. This makes template maintenance easier, lowers the barrier of access (previously you had to be an admin to be able to add new CSS) and empowers editors to create more user-, mobile- and print-friendly templates.

For plans for rolling it out to content wikis see phab:T168808.

Gergo Tisza, wikitech-l mailing list
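
As a rough illustration of the <templatestyles> tag described in the announcement (a sketch only: the template name Template:Example, its styles.css subpage, and the class name are hypothetical and do not come from the announcement), a template might load its own stylesheet like this:

```wikitext
<!-- On the hypothetical page Template:Example.
     The stylesheet is assumed to live at Template:Example/styles.css
     and to contain ordinary CSS, for instance:
       .example-box { border: 1px solid #aaa; padding: 0.5em; }
-->
<templatestyles src="Example/styles.css" />
<div class="example-box">{{{1|}}}</div>
```

The point of the design is that the CSS lives on a page alongside the template rather than in a site-wide stylesheet that only admins can edit; see mw:Help:TemplateStyles for the actual rules on what the stylesheet page may contain.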

Schedules are WRONG, the columns don't match the info, please check my comment in February!

Hello, the schedules haven't been corrected, I advised you about this back in February!

Regards, Neil, South Africa