Wikisource:Proposed deletions

WS:PD redirects here. For help with public domain materials, see Help:Public domain.
Proposed deletions

This page is for proposing deletion of specific articles on Wikisource in accordance with the deletion policy, and appealing previously-deleted works. Please add {{delete}} to pages you have nominated for deletion. What Wikisource includes is the policy used to determine whether or not particular works are acceptable on Wikisource. Articles remaining on this page should be deleted if there is no significant opposition after at least a week.

Possible copyright violations should be listed at Copyright discussions. Pages matching a criterion for speedy deletion should be tagged with {{sdelete}} and not reported here (see category).

SpBot archives all sections tagged with {{section resolved|1=~~~~}} after 7 days. For the archive overview, see /Archives.


Ok, I think it's time we have this conversation…

Translation:Manshu describes itself as a Wikisource translation of A 9th century Middle Chinese text regarding the geopolitics of southwest China, particularly the historic kingdom of Nanzhao. It is an important historical source for the period. This translation is based upon a digitized version of the recompiled 1774 movable type edition edited by the 武英 (Palace Museum Library).

However, looking at it more closely it appears to be much more an original analytical work than anything that could be shoehorned to fit within our definition of a mere translation.

The front page is almost entirely original work (apart from a table of contents), partly semi-encyclopedic and partly meta-discussion about the effort itself.

Looking at Chapter 1 we find some actual translation, but mostly comparisons with a professionally published previous translation (Luce) that is quoted extensively, and translator's commentary that far exceeds the actual translated text itself. It also features a lot of images that obviously do not appear in any original, but have been picked to illustrate a particular point (i.e. how Wikipedia would construct an article).

Chapter 2 and onwards are the same, except they lack the extensive quotations from the published translation (Luce), but only because the effort to compare has not reached that point yet. Around Chapter 9 the translation appears incomplete with only the Chinese original text present.

Irrespective of the rest of this work, there is a question regarding the extensive quotations from the previous professional translation (link). It is a 1961 publication with copyright notice, so there is a high probability that it is in copyright (and thus the quotations are also copyvios). I haven't looked at this issue in detail, but if this discussion ends up keeping the work in some form we will have to address that separately (and if it is not in copyright, why are we not transcribing that instead of making our own?). The sole contributor to Translation:Manshu has a somewhat haphazard approach to copyright (e.g. claiming satellite imagery from Google Maps or similar as "own work") so the issue will have to be checked thoroughly.

But all that being said, this is also a great effort and a unique work that really should exist somewhere. If it were completed I'm certain it could have been professionally published, and it would be a real shame if all the effort that's gone into it was wasted. The contributor has not been active since 2018 (and the last large progress was in 2016), so I don't think it very likely that it will now ever be completed; but if a place is found for it even the partial translation is valuable, and could conceivably be completed by others at some point in the future. If the outcome of this discussion is that it is out of scope we should make a real effort to see whether a project like WikiBooks would be interested, and, if not, rather than simply delete it we should move it to the contributor's user space (a practice I am usually vehemently opposed to but am making an exception in this particular case).

In any case, it has kept popping up on my radar for various reasons, and I have always been torn on what to do about its issues. It seems clearly outside of scope per WS:WWI, doesn't meet WS:T, violates WS:ANN, and would most likely need cleanup to meet WS:COPY. So now I'm putting the question before the community: what do we do about this? --Xover (talk) 10:20, 2 April 2021 (UTC)

WS:T ought to address contributions like this; the first section on published works is redundant. Are there examples of Wikisource translations that have been in some way verified (validated)? CYGNIS INSIGNIS 14:43, 8 April 2021 (UTC)
@Cygnis insignis: Not a lot, but they do exist. Translation:On Discoveries and Inventions is a recent example. --Xover (talk) 15:42, 8 April 2021 (UTC)
It could go in User space for the time being. Maybe Wikibooks would want it? --Beleg Tâl (talk) 13:55, 30 May 2021 (UTC)
I'm only able to comment on a small portion of this, which I hope might be helpful: https://cocatalog.loc.gov/cgi-bin/Pwebrecon.cgi?DB=local&PAGE=First has no results for Man shu or southern barbarians as title; nor Luce, Gordon as an author name; nor do Cornell University or Southeast Asia Program or Oey or Fan, Cho seem to have a relevant renewal under their names. Southeast Asia Program as a title reveals registration of others among these data papers as copyrighted works, but no renewal of this one. This suggests the copyright was never renewed on the Luce translation (possibly this is not surprising, as these weren't exactly blockbusters...) and it is now public domain, judging by Help:Public domain#ref renewal. If accurate, this should resolve the WS:COPY concern. Good luck with the rest of this matter! Dingolover6969 (talk) 11:39, 21 January 2022 (UTC)


Response by author

Hi there, I am the primary author, an admin on English Wikipedia. I would say I have spent upwards of 500 hours on this translation. During the time it is alleged that I have been inactive, I was a founding team member at a very important company you would have heard of, and provided some of the earliest COVID map coverage on Wikipedia (webm gif). Currently I run seven (7) companies and have a family, so it is fair to say I have 'other commitments'. I do still intend to complete the translation. Aside from time constraints, partly I have not been active on Wiki projects recently because I am living in China and this makes editing Wikiprojects a massive hassle due to the requirement for a VPN. Nevertheless, I noticed this deletion attempt by Xover and would like to respond objectively for the record. If we summarize the alleged issues they are as follows:

  • The translation includes commentary
    • That is simply because it is a good (i.e. transparent/honest) translation.
    • Any accredited historian will agree this is a good (positive) feature.
    • This does not in any way support 'delete'.
  • The work is incomplete
    • I am still finishing, I am just ridiculously busy and have been so for five years.
    • Incomplete and pending further effort is often simply the nature of voluntary work.
    • This does not in any way support 'delete'.
  • The work includes quotations from previous translations
    • Fully cited and contextually presented, in academia, this is clearly fair use.
    • This does not in any way support 'delete'.
  • The work includes satellite derived images
    • These images were constructed with great care based upon detailed context and are both low resolution and substantially original work in themselves.
    • This does not in any way support 'delete'.
  • The translation is done by the contributor and openly licensed instead of being an out of copyright work of someone else which has been uploaded
    • IMHO as a student of history original translation is *great* to welcome and should be encouraged.
    • This does not in any way support 'delete'.
  • Violates 'What Wikisource Includes' (WWI)
    • Wikisource includes "Works created after 1925" / "Analytical and artistic works".
    • Wikisource includes "Translations"
    • To be perfectly honest I consider this assertion a truly baseless accusation that I frankly find highly offensive.
    • This does not in any way support 'delete'.
  • "Doesn't meet" WS:T
    • Unclear what this means
    • The WST page clearly states that original translations are in-scope and acceptable (there is only one prior English translation and it is bad and incorrect)
    • This does not in any way support 'delete'.
  • Violates WS:ANN
    • I have never seen that page before in my life
    • Apparently it doesn't like parallel text
    • I would suggest strongly that parallel text provides the basis for most high caliber academic translations, it is my view that the policy page is wrong and further discussion to correct it should occur there.
    • This does not in any way support 'delete'.
  • Requires cleanup to meet WS:COPY
    • Unsure what this is actually alleging
    • Aside from original work there are only contextual quotations from other works, in line with an academic translation
    • This does not in any way support 'delete'.

Sincerely, Pratyeka (talk) 10:12, 11 June 2021 (UTC)

I just noticed that Xover also deleted my maps. This is a great loss. I cannot recreate them as I do not have access to the context at the time. This is truly a tragedy. I am ... highly alarmed and stressed at this turn of events and will cease contributing further to Wikipedia projects. Pratyeka (talk) 10:21, 11 June 2021 (UTC)
Could someone with more time please go through the undeletion process on my behalf. It is... truly a great tragedy. Multiple academics had thanked me for this work. Pratyeka (talk) 12:55, 11 June 2021 (UTC)
@Pratyeka: These maps are not appropriate for enWS (or Commons), because they contain copyright material: the satellite photos. There is no allowance here, as there is at enWP, for fair use or de minimis, and resolution doesn't affect it. I imagine the "correct" solution is to either locate suitable base maps from Commons (or NASA or another PD source), draw your own, or commission them via c:Commons:Graphics Lab/Map workshop.
If the presumption of copyright is incorrect (e.g. the photos are PD or freely licenced), then let me know and they can be restored and correct attribution and licence declarations made. In that case, they actually belong at Commons.
Sadly, being thanked by academics does not overrule copyright.
Even if these are copyrighted, I can also provide you with the files if you do not have access to them any more. Inductiveload (talk/contribs) 14:00, 11 June 2021 (UTC)
@Pratyeka: I'm glad to see you're editing again. I'm not sure why you felt it relevant to mention that you have +sysop on enwp, but since you bring it up… as an admin on enwp you should be well familiar with the need to make policy-based arguments in such discussions and to familiarise oneself with the policy on the project. I have raised several policy-based concerns, and your response addresses none of them. However, to reiterate the challenges:
The text on Translation:Manshu is not a mere translation of a previously published work. It contains substantial portions of your own analysis, comparisons, and commentary: all of which is original rather than previously published content. In enwp terms, think of it as "original research": it's not a perfect analogy, but the problem is similar. This is out of scope for English Wikisource. In addition, you include extensive quotations from the other (professionally published) translation, but that translation is not public domain or compatibly licensed. Fair use content is not permitted on English Wikisource (and even on enWP only in very narrow and limited circumstances), which puts it in violation of our licensing policy.
Now, as I wrote above, this is an impressive work and I am sure it is a valuable contribution to the knowledge in that area of study. It just isn't compatible with the policies on Wikisource. In other words, if it is to stay here it will have to be stripped down so that it only contains the translation, without embellishment, of the original text and all non-public domain elements removed. I imagine that's not your first choice as I get the impression it is the analytical parts of the work that interest you the most. So as an alternative, works such as this may be in scope for WikiBooks: their scope explicitly includes original works so long as they fall within their definition of "educational". As another Wikimedia sister project it is possible to import the pages between projects, even preserving revision history. If you need it we can try to facilitate contact with the Wikibooks community to get the ball rolling. --Xover (talk) 19:50, 7 August 2021 (UTC)
  • Oppose. This whole situation is insulting. The work is clearly a Wikisource translation of a work in the public domain, and is thus in scope, your complaints about the annotations aside. This discussion should never have been started, and much less dragged on this long. The problem with the maps is unfortunate, but the rest is irrelevant. TE(æ)A,ea. (talk) 00:25, 4 August 2021 (UTC)
    Indeed, and I would love nothing better than to see much much wider participation in discussions here and on WS:CV so that we could properly determine community consensus and within a reasonable time. That's why I so very much appreciate your efforts to participate in both venues! However, meanwhile we have to operate within the reality that exists. I am sorry if you found this insulting, but there really is no other way to address such issues. --Xover (talk) 18:54, 7 August 2021 (UTC)

The following discussion is closed and will soon be archived:

Closed as premature. A new request for undeletion can be made when the original has been proofread at heWS. In this particular case you can also ask me or EncycloPetey (or in principle any active admin, but since we know the backstory...) directly without going by way of a community discussion.

Please undelete Translation:Mishneh Torah and all of its subpages. I would like to start working on continuing this translation and want the old text as a starting point. Thanks a lot, Sije (talk) 20:47, 15 November 2023 (UTC)

What text will you be working from? Part of the problem with the previous copy was that it had no scan-backed copy on he.WS to work from. See Wikisource:Translations#Wikisource original translations, which notes that one of the things we want in a user-created translation is a "scan supported original language work ... on the appropriate language wiki, where the original language version is complete at least as far as the English translation". As far as we could tell, there is no scan-backed original copy on he.WS, and therefore no stable original copy exists from which to create a translation. --EncycloPetey (talk) 20:18, 16 November 2023 (UTC)
As far as I know, it is not the practice of he.WS to provide scans of any books. Here is a 1566 edition of the first three sections of Mishneh Torah available at Google books. Would it be OK if I work from this text? Sije (talk) 21:18, 16 November 2023 (UTC)
@Sije: While I am not familiar with the policies and practices of heWS, they do certainly use Proofread Page to transcribe scanned originals side-by-side. See e.g. s:he:Index:Hebrewbooks org 38168.djvu.
In any case, this undeletion request is premature; once the work has been proofread on heWS is the time to request undeletion here. Xover (talk) 21:30, 16 November 2023 (UTC)
Well, current practice has also allowed direct translations from the scans, such as this one, and in such a case it imo might not be necessary to insist on the work being proofread in the original language Wikisource. --Jan Kameníček (talk) 19:30, 17 November 2023 (UTC)
In such cases, the transcription already has happened on the parent language Wikisource. We are simply using the same scan locally to allow for side-by-side comparison of the text in the Page namespace, in order that the translation can be checked against the original language text. The text still exists on the parent language WS prior to local translation. --EncycloPetey (talk) 19:40, 17 November 2023 (UTC)
I created index pages at the heWS and one over here as well, although I'm not sure about the technicalities. Any help on how to proceed would be greatly appreciated. Thanks a lot, Sije (talk) 20:03, 17 November 2023 (UTC)
@Sije: Now you proofread the text at heWS. Each physical page in the scan is listed in s:he:Index:משנה תורה דפוס ווארשא-ווילנא כרך ראשון 1.pdf. Go to the first page (physical page 1, logically numbered 2) and transcribe the text and use the standard heWS templates etc. to format the page. heWS will have some guidance and help pages somewhere, or you can ask the community there for help at their village pump / scriptorium. Once you have it finished you change the page status to "Proofread" (the yellow radio button). Then continue to the next page and do the same, and so on until you have proofread every page in the scan. Once that is done you can use transclusion to combine the individual book pages together into one wikipage per chapter (or other relevant subdivision). When the book is fully proofread and transcluded on heWS you can come back here to request undeletion of the old text and start translating it here.
PS. This page (Wikisource:Proposed deletions) is for deletion/undeletion discussions. The best place to ask for assistance is Wikisource:Scriptorium/Help, so I am going to close this thread shortly and you can open new threads there when you have questions. Note that we have limited ability to help with issues on heWS so you may want to find the equivalent place there to ask for help from the heWS community.
PPS. I see the scan has "www.hebrewbooks.org" branding. If that is anything more than them just slapping their branding on a scan of an otherwise public domain book you may want to double-check that there isn't a copyright issue there. If hebrewbooks.org have, for example, added commentary or something there could be parts of it that are covered by copyright. I can't tell, and it looks like a mere scan of an old book, but it's better to make sure before you put too much effort into it. Xover (talk) 09:54, 18 November 2023 (UTC)
Do I have to wait until everything has been proofread (that's a lot of work) or can we do it one part at a time? For example, can the introduction be undeleted once the introduction at heWS has been proofread? Sije (talk) 08:21, 19 November 2023 (UTC)
@Sije: Strictly speaking the policy only requires the work on the original-language Wikisource to be proofread as far as the user translation on enWS. That is, it permits just what you're suggesting here. However, based on experience and the fact that it is quite a lot of work to proofread the original, I am inclined to be very conservative in applying it to avoid having too many unfinished fragments sitting around. How about we do this in batches: work a little ahead at heWS compared to what we undelete here. Would that work for you? Based on the deletion rationale in the previous thread, you will have to do at least some work here on enWS too to make it conform to standards, so working in batches should be a practical way to divide up the task in manageable pieces. Xover (talk) 09:27, 19 November 2023 (UTC)

Hello. I'm an avid editor at Hebrew Wikisource and I fail to see the need to require the user to proofread every page of the scan, for the following reasons:

  1. As you mentioned, you need a scanned backup in the original language. You have it, so why further trouble the user to also create a digital edition in Hebrew? A backup is a backup.
  2. The digital Hebrew version of this book already exists in multiple (if not dozens of) sources throughout the internet, one of them being on he.wikisource itself! The fact that "the proofread box wasn't checked" seems hardly a reason not to be satisfied with the two resources that already exist there: (1) the scan of the original printing, and (2) the digital version of משנה תורה.
  3. As was mentioned previously by Jan Kameníček, you already have exceptions to this rule. In light of the points above, it would make sense to include the current case in that category as well, and not insist on having the user proofread the OCR scan of the PDF, which is inferior to the current digital text of the book that is already on Hebrew Wikisource anyway...

I ask for your further consideration of this topic. Many thanks for all the terrific work you do here and for the whole world at large. Roxette5 (talk) 21:24, 20 November 2023 (UTC)

This is a policy question, and not part of the undeletion request. If you have a question about policies, the Wikisource:Scriptorium is the place to ask those questions. But I will address point 3: no, the item Jan Kameníček linked to as an exception actually does meet our current requirements, since it has a proofread scan on th.WS. So it is not an exception. --EncycloPetey (talk) 22:23, 20 November 2023 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --Xover (talk) 11:56, 9 December 2023 (UTC)

Compilation of various editorials by Carl Schurz, partly taken from The Writings of Carl Schurz and mostly from other sources. The compilation should be deleted as out of scope, although it is possible to identify the sources of the individual editorials and split them under their original publications, if somebody undertakes to do it. Pinging Bob Burkhardt (although inactive since April 2022). -- Jan Kameníček (talk) 20:24, 19 November 2023 (UTC)

  •  Comment This might be moved to the Author: namespace as a subpage. We have similar pages such as Author:Marcus Tullius Cicero/Speeches. But the individual editorials would need to be attached to the works from which they came, as they could not be so moved. --EncycloPetey (talk) 03:23, 20 November 2023 (UTC)
    I would say a portal would be a more natural way to collect these as they're the intersection of an author and a specific publication. The Author: subpages we have are a bit of a mess and overall poorly managed (partly because we have no very clear guidance on them). Xover (talk) 06:33, 25 November 2023 (UTC)
    • Does that mean moving these to the Author namespace? Otherwise there would be no basepage, since the listing page would be moved and the Harper's Weekly page was moved to the Portal namespace in 2021. --EncycloPetey (talk) 03:36, 9 December 2023 (UTC)
      I mean that the wikipage Harper's Weekly Editorials by Carl Schurz should be moved to Portal:Harper's Weekly Editorials by Carl Schurz, and the subpages should be treated as if they were top-level mainspace pages. The subpages can be either preserved as top-level standalone works, or moved within the containing work if it can be identified and it is practicable to do so. According to the notes these are mostly Harper's Weekly, so we could (re)create Harper's Weekly and move these to Harper's Weekly/1893-03-18/The Annexation Policy etc. (the previous page at Harper's Weekly was presumably moved because it was a collection of extlinks, more suited to a WikiProject but just barely passable as a Portal:, not because we can't have a page there if properly formed).
      Hmm. Incidentally, or maybe more "tangentially", perhaps a better approach for Harper's would be Portal:Harper's Weekly as the main page, combined with each issue as a top level page ala. Harper's Weekly (March 18, 1893) and with subpages Harper's Weekly (March 18, 1893)/The Annexation Policy etc.? That would avoid having a constructed (non-scan-backed / transcluded) index portal type thingie at Harper's Weekly, and avoid the multi-level subpage hierarchy, all in all making this simpler for all involved. The page-dab'ed-by-date also jibes well with how we dab other stuff.
      And maybe we could divide the purpose of Portal: in two, narrowing Portal: into a more focussed topic hierarchy (LCCN etc.) and splitting off portals that are periodicals to a new sort of mainspace page (using a hypothetical {{periodical header}} to differentiate).
      Probably unworkable for some reason or other, but might be worth exploring if anybody has the spare cycles to think it through.
      Xover (talk) 11:28, 9 December 2023 (UTC)
      Oh, but to be clear, I am not opposed to deleting Harper's Weekly Editorials by Carl Schurz outright, if that's the consensus, so long as we preserve the subpages somewhere. Moving it to Portal: is just an alternative to deletion; it can't stay in mainspace as is. Xover (talk) 11:50, 9 December 2023 (UTC)
  •  Comment I agree the wikipage at Harper's Weekly Editorials by Carl Schurz is out of scope and should be either deleted, migrated to a Portal:, migrated to a sub-page of Author:Carl Schurz, or converted into a Harper's Weekly (cf. Portal:Harper's Weekly).
    The subpages then become just loose non-scan-backed texts with slightly confused provenance (e.g. at least one is from a later reprint of an editorial first published in Harper's). Given our general practice with such texts I don't think deleting them is appropriate (we'd be being stricter with these than with others), so moving them to non-subpage titles or to titles within a magazine structure would be more appropriate.
    Also, Bob Burkhardt, aka. Library Guy, has been active here on and off for a very long time. It is likely that they will become active again at some point, and leaving notifications on their user talk pages may elicit a response. I think we should try to get their input as a long-time contributor before making a final call on this. --Xover (talk) 06:46, 25 November 2023 (UTC)
    I have left a notification at the talk pages of both Bob Burkhardt and Library Guy. --Jan Kameníček (talk) 09:38, 9 December 2023 (UTC)
Agree with the general sentiment about being overly strict here. For works in periodicals, we generally lack clear guidance on the boundaries between Author: subpages, Portal: and Main:, and on organizing and then linking the scattered works across issues (e.g. we have The Strand Magazine/The Hound of the Baskervilles in Main: but the Holmes short stories in Portal: as well as in Author:). Other contributors haven't placed their works from periodicals under the periodical either, e.g. Landon in The Literary Gazette 1821 isn't a subpage of The Literary Gazette and Her Chance (Wylie) isn't a subpage of Royal. I would be more in favor of adding a maintenance tag or having a discussion on the work's talk page about how to improve things, rather than deletion. MarkLSteadman (talk) 15:56, 25 November 2023 (UTC)
Landon is a separate problem. Landon in The Literary Gazette 1821 claims to be transcluded from Index:Literary Gazette Titles.pdf and backed by File:Literary Gazette Titles.pdf, which is a PDF file containing only the title bits and obviously created in Word on the contributor's own computer (it's not a scan, it's a self-generated PDF). The subpages, e.g. Landon in The Literary Gazette 1821/Stanzas On the Death of Miss Campbell, are transcluded from Index:Landon in The London Literary Gazette 1821.pdf backed by File:Landon in The London Literary Gazette 1821.pdf, which is, you guessed it, a PDF file generated from a Word file "compiled by Peter J. Bolton". All of which were uploaded by Esme Shepherd, whose user page begins "My real name is Peter J. Bolton." That is, this is a user-created compilation of arbitrary excerpts, hidden behind a sheen of scan-backing. And this is just the tip of the iceberg: the user has a particular obsession with Landon and has been creating these self-published collections and custom editions for years. At some point we'll have to go through their entire contribution history to weed out these, but I just don't have the spare capacity to try to unravel this mess (and the contributor does not react well to stress, so both patience and a firm hand are needed). Xover (talk) 11:47, 9 December 2023 (UTC)
My general feeling is that these types of moves are handled by adding maintenance tags ({{standardize}}?) rather than deletion, if it is mostly about moving things around and creating the appropriate larger apparatus (portals, index pages, front matter and contents etc.), adding another big pile of work to the backlog... MarkLSteadman (talk) 01:52, 11 December 2023 (UTC)
    • Not sure where Xover is coming from here. Landon in The Literary Gazette 1821/Stanzas On the Death of Miss Campbell, is transcluded from Page:Landon in The London Literary Gazette 1821.pdf/4, which is a scan from The London Literary Gazette 1821: 22nd Sept 1821, page 602, which is not a Word file at all and is far from arbitrary, having been searched out comprehensively. These texts are not my own work and this method was approved of in the first instance. Esme Shepherd (talk) 17:07, 10 December 2023 (UTC)
      Looking at the mentioned Page:Landon in The London Literary Gazette 1821.pdf/4, it seems to me that only the part titled "Original poetry" has been extracted from the newspaper; the text above seems added and if I were asked to guess, I would guess it was done in Word. The same can be said about all the pages of the "scans". Also the alleged title page really looks like it was written in Word and so does the Contents. So it really looks like a self-made compilation of poems, not a published poetry collection. This can be very confusing to our readers, who have been accustomed to the fact that our indexes have been created from published works, and thus may assume that a collection called "Landon in The Literary Gazette 1821" was really published by somebody somewhere. Although it is beyond any doubt that the compilation was made in good faith, it should be deleted from both the main NS and the index NS (and maybe also from Commons?) as out of scope. --Jan Kameníček (talk) 17:40, 10 December 2023 (UTC)
      As those pages are not relevant for the question of the item under consideration, perhaps they warrant a separate discussion? --EncycloPetey (talk) 19:50, 10 December 2023 (UTC)
  •  Delete. And these could easily fit on an author page or author subpage, which is where these probably really belong IMO. PseudoSkull (talk) 04:35, 9 December 2023 (UTC)
    Rather than outright deleting the page and its contents, there have been proposals to make it a Portal or an Author sub-page. Most of the discussion is now on how to handle the listing and its items. The linked articles are currently organized as subpages, so deleting the base page would leave an auto-generated redlink to that location from each of the article pages. --EncycloPetey (talk) 19:50, 10 December 2023 (UTC)

The following discussion is closed and will soon be archived:

Deleted. Scan made from a secondary transcription, and not a scan of the actual publication.

This is not the original source, but a copy of a different person’s transcription of the original. In addition, the actual original, while in the public domain in the Philippines (country of origin), had its copyright restored in the United States, and remains copyrighted for 95 years after the original date of publication (1934). TE(æ)A,ea. (talk) 22:19, 1 December 2023 (UTC)

  •  Comment The previous deletion discussion was closed on the basis of the work being published before 1923. However, I can find no information presented concerning the date of publication in the discussion. At the time, there was a simultaneous discussion at Commons, which resulted in the file being moved here. The Commons discussion also does not present date information that I can find. Do we know the actual date of publication, or are we assuming the date of 1934 on the Index is correct? --EncycloPetey (talk) 00:01, 2 December 2023 (UTC)
    • EncycloPetey: Looking it up, I got the date wrong. The author died in 1942, so the copyright in the Philippines expired in 1993, which is before the URAA date (1996). Thus, there was no restoration, and the work is in the public domain in the United States. I find no proof anywhere of a pre-1923 date. However, I believe my non-deletion reason still stands, so I don’t think that the discussion should be closed. TE(æ)A,ea. (talk) 00:10, 2 December 2023 (UTC)
      • The only reason I would close this early is if there were a Speedy reason in favor of deletion, or a clear copyright violation. Since the copyright seems OK, and there is no Index for the original publication, this will sit for (at least) the usual week before action. --EncycloPetey (talk) 00:18, 2 December 2023 (UTC)
  •  Delete per nom (minus the resolved copyright issue). --Xover (talk) 09:21, 2 December 2023 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --EncycloPetey (talk) 20:59, 8 December 2023 (UTC)

The following discussion is closed and will soon be archived:

Deleted. List of plays with no links or content from the FF; encyclopedic content beyond scope.

This page merely lists the plays found in the First Folio, with information copied verbatim from the Wikipedia article. It does not link (and never has linked) to the First Folio. Nor does it link to the facsimile of the FF, nor to copies of the plays found in a copy of the FF. As this is simply a list of Shakespeare's plays (which are listed at the author's page) with Wikipedia content, it is redundant and beyond scope, and serves no purpose here. --EncycloPetey (talk) 01:56, 3 December 2023 (UTC)

 Delete. This is really the job of the table of contents on a collective work (which in this case would be the Folio). PseudoSkull (talk) 02:08, 3 December 2023 (UTC)
 Delete as above. --CalendulaAsteraceae (talk • contribs) 03:43, 5 December 2023 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --EncycloPetey (talk) 02:56, 10 December 2023 (UTC)

The following discussion is closed and will soon be archived:

Deleted as redundant.

According to the information filled in by Bob Burkhardt, this seems to be duplicative of Index:The Collected Works of Theodore Parker volume 6.djvu, which is more or less entirely proofread and transcluded. TE(æ)A,ea. (talk) 23:50, 3 December 2023 (UTC)

  •  Keep. Unfortunately, the proofread scan misses page no. 323, while the nominated one has the page. So I suggest keeping the nominated scan, moving the proofread pages here, and deleting the other one. --Jan Kameníček (talk) 00:05, 4 December 2023 (UTC)
    •  Comment A scan repair on the proofread scan could also resolve the issue. A move to the other scan would require a page offset, since the page numbers relative to scan pages are not the same. --EncycloPetey (talk) 02:17, 5 December 2023 (UTC)
      File fixed. Mpaa (talk) 23:29, 5 December 2023 (UTC)
      @Mpaa: Great! However, the file at Commons still links to the same source where this part of the scan is missing, which is quite confusing. Would it be possible to mention the repair and the source of the added part at the file's page at Commons? --Jan Kameníček (talk) 23:58, 5 December 2023 (UTC)
      @Jan.Kamenicek: Done, I added it as a note. Yes, it is indeed worthwhile noting the source of the replacements we make to files. It is not an established practice. We should find the best way to document it (e.g. in the summary of changes, as a "note" in the description, etc.). Mpaa (talk) 21:25, 6 December 2023 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --Jan Kameníček (talk) 07:29, 7 December 2023 (UTC)

This is an author born in 1960, and the only item listed is a link to a page on the New York Times that's not free to access. As there are no free works or expectation thereof, we should delete it. --Prosfilaes (talk) 00:57, 8 December 2023 (UTC)

  • That speech might be PD-ILGov; at least, the same user who created this page uploaded a similar speech citing that as a rationale. TE(æ)A,ea. (talk) 02:50, 8 December 2023 (UTC)
    That isn't a speech but an Op-Ed. But I would have thought that there must be works by him that are PD. -- Beardo (talk) 07:21, 8 December 2023 (UTC)
  •  Comment The same user created Author:Mohammed bin Zayed, which needs serious cleaning up. --EncycloPetey (talk) 20:57, 8 December 2023 (UTC)
  •  Delete unless we can find something authored by him that is in English and is in PD. A look at the Wikiquote page turns up only quotations cited from news articles; and Hebrew Wikisource has only a single external link, as we do. --EncycloPetey (talk) 21:02, 8 December 2023 (UTC)
 Delete unless we can find some PD works. By the way, as this person was a government official, there is a possibility of some edicts existing in English, though it's a stretch. PseudoSkull (talk) 04:39, 9 December 2023 (UTC)

Unused, undocumented, abandoned since 2015, and should in any case not be used in its current form. Xover (talk) 10:49, 9 December 2023 (UTC)

 Delete per nom. --CalendulaAsteraceae (talk • contribs) 23:57, 9 December 2023 (UTC)

This specific index is one of many such indexes; I nominate it as an example, but should the rationale be found sound, I will endeavor to make a list of all such indexes.

This index (and many others) was created by the now-absent User:Languageseeker. My main concern is that the pages of these indexes have been added via match-and-split from some source, likely Project Gutenberg, which does not have a defined original copy. Because of this absence of a real source, and the similarity of the text to the actual text of any given scanned copy, proofreading efforts would likely have to either not check the text against the original source or scrap the existing text entirely to ensure accuracy to the original on Wikisource. In light of this, I think the easiest approach is to delete the indexes and all pages thereunder; if there is organic desire to scan them at some point in the future, the indexes may be re-created, but I do not see a reason to keep the indexes as they stand. TE(æ)A,ea. (talk) 19:12, 9 December 2023 (UTC)

  •  Comment Hmm. I don't see the Index: pages as problematic. But the "Not Proofread" Page: pages that were, as you say, created by Match & Split from a secondary transcription (mostly Gutenberg, but also other sources), I do consider problematic. We don't permit secondary transcriptions added directly to mainspace, so to permit them in Page: makes no sense. And in addition to the problems these create for Proofreading that TE(æ)A,ea. outlines, it is also an issue that many contributors are reluctant to work on Index:es with a lot of extant-but-not-Proofread (i.e. "Red") pages.
    We have around a million (IIRC; it may be half a mill.) of these that were bot-created with essentially raw OCR (the contributor vehemently denies they are "raw OCR", so I assume some fixes were applied, but the quality is very definitely not Proofread). Languageseeker's imports are of much higher quality, but are still problematic. I think we should get rid of both these classes of Page: pages. In fact, I think we should prohibit Not Proofread pages from being transcluded to mainspace (except as a temporary measure, and possibly some other common sense exceptions). --Xover (talk) 20:24, 9 December 2023 (UTC)
    • Xover: Assuming the status of the works to be equal, I would actually consider Languageseeker’s page creations to be worse, because, while it would look better as transcluded, it reduces the overall quality of the transcription. My main problem with the other user’s not-proofread page creations was that he focused a lot on indexes of very technical works, but provided no proofread baseline on which other editors could continue work; that was my main objection at the time, as it is easier to come on and off of work where there is an established style (for a complicated work) as opposed to starting a project and creating those standards yourself. As to the Page:/Index: issue, I ask for index deletion as well because these indexes were created only as a basis for the faulty text import, and I don’t want that to overlook any future transcription of those works. Again, I have no problem with work (or re-creation); I just think that these indexes (which are clearly abandoned, and were faulty ab origine) should be deleted. As for transclusion of not-proofread pages, I don’t think that the practice is so widespread that a policy needs to be implemented (from my experience, at least); the issue is best dealt with on a case-by-case basis, or rather a user-by-user basis (as users can have different ways of turning raw OCR into not-proofread text, then following transclusion and finally proofread status). But of course, that (and the other user’s works, the indexes for which I think should probably be deleted) is a discussion for another time. (I will probably have more spare time starting soon, so I might start a discussion about the other user’s works after this discussion concludes.) TE(æ)A,ea. (talk) 02:28, 10 December 2023 (UTC)
      I'm not understanding what fault there is in the Index page. If the Page: pages had not been created, what problem would exist in the Index: page? --EncycloPetey (talk) 02:53, 10 December 2023 (UTC)
      • EncycloPetey: This isn’t a case where the index page’s existence is inherently bad; but the pages poison the index, in terms of future (potential) proofreading efforts and in terms of abandonment. TE(æ)A,ea. (talk) 03:07, 10 December 2023 (UTC)
  • I support deleting the individual pages of the index. As for the Index page itself, I am OK with both deleting it as abandoned or keeping it to wait for somebody to start the work anew. I also support getting rid of other similar secondary transcriptions. If a discussion on prohibiting transclusion of not-proofread pages into main NS is started somewhere, I will probably support it too. --Jan Kameníček (talk) 00:39, 10 December 2023 (UTC)
 Comment I've always felt uncomfortable with the tendency of some users to want to bulk-add a bunch of Index pages which have the pages correctly labelled, but are left indefinitely with no pages proofread in them. I feel like a "transcription project" (as Index pages are labelled in templates) implies an ongoing, or at least somewhat complete, ordeal, and adding index pages without proofreading anything is really just duplicating data from other places into Wikisource. Not to say there's absolutely no value in adding lots of index pages this way, but the value seems minimal. The fact that index pages mostly rely on duplicate data as it is is already an annoying redundancy on the site, and I think most of what happens on Index pages should just be dealt with in Wikidata, so I think the best place to bulk-add data about works is there, not by mass-creating empty Index pages. I know my comment here is kind of unrelated to the specific issue of the discussion (being, indexes with pages matched and split or something), but the same user (Languageseeker) has tended to do that as well. I am struggling to come up with any specific arguments or policies to support my position against those empty index pages... but it just seems unnecessary, seems like it will cause problems in the future, and on a positive note I do applaud Languageseeker's massive effort (it shows something great about their character as an editor), but unfortunately I think their effort should have been more focused on areas other than the creation of as many Index pages as possible. PseudoSkull (talk) 04:15, 10 December 2023 (UTC)
Bulk-adding anything is probably a bad idea on Wikisource, because so much of what we do here requires a human touch. That being said, so far as I know the Index: pages Languageseeker created were perfectly fine in themselves, including having correct pagelists etc. This step is often complicated for new contributors, so creating the Index: without Proofreading anything is not without merit. It's pointing at an already set up transcription project onsite vs. just (ext)linking to a scan at IA for some users. The latter is an insurmountable effort for quite a lot of contributors. We also have historically permitted things to sit indefinitely in our non-content namespaces if they are merely incomplete rather than actually wrong in some way.
That's not to say that all these Index: pages are necessarily golden, but imo those that are problematic (if any) should be dealt with individually. Xover (talk) 09:08, 10 December 2023 (UTC)
 Comment I do not support creating them, but since they exist, I try to make good use of them. I usually proofread offline for convenience and when I add the text I check the diff. If anything differs, it is an extra check for me as I could be the one who made mistakes. So I would keep them.
BTW, nothing prevents anyone from pressing the OCR button and restarting. Mpaa (talk) 18:35, 10 December 2023 (UTC)
While that is true, my experience is that the kinds of errors introduced by a mystery text layer are insidious, and most editors are unaware of the issue, or fail to notice small problems such as UK/US spelling differences, changes to punctuation, minor word changes, etc. So, while a person could reset the text, what would alert them to the fact that they should, rather than working from the existing unproofed page?
H. G. Wells' First Men in the Moon is a prime example. A well-meaning editor matched-and-split the text into the scan. Two experienced editors crawled through making multiple corrections to validate the work, yet as recently as this past week we have had editors continue to find small mistakes throughout. Experience shows that match-and-split text is actually worse for Wikisource proofreading than the raw OCR because of these persistent text errors. --EncycloPetey (talk) 18:51, 10 December 2023 (UTC)
In my workflow, I start from OCR, then compare what I did with what is available. It is an independent reference which I use for quality check. The probability that I made the same error is low (and the error would be there anyway). It is almost as if someone is validating my text (or vice-versa). For me it is definitely a help. I follow the same process when validating text. I do not look at what is there and then compare. Mpaa (talk) 19:21, 10 December 2023 (UTC)
Right. You do that, and I work similarly. But experience shows that the vast majority of contributors don't do that; they either don't touch the text due to the red pages, or they try to proofread off the extant text and leave behind subtle errors as EncycloPetey outlines. Xover (talk) 19:35, 10 December 2023 (UTC)
We could argue forever. I do not know what evidence you have to say that works started from match-and-split are worse than others. I doubt anyone has real numbers to say that. IMHO it all depends on the attitude of contributors. I have seen works reaching a Validated stage and being crappy all the same. If you want to be consistent, you should delete all pages in a NotProofread state and currently not worked on because I doubt a non-experienced user will look where the text is coming from when editing, from a match-and-split or whatever.
Also, then we should shut down the match-and-split tool or let only admins run it, after being 100% sure that the version to split is the same as the version to scan.
I am not advocating it as a process, I am only saying that what is there is there and it could be useful to some. If the community will decide otherwise, fine, I can cope with that. Mpaa (talk) 20:32, 10 December 2023 (UTC)
"I do not know what evidence you have to say that works started from match-and-split are worse than others." Anecdotal evidence only, certainly. But EncycloPetey gave a concrete example (H. G. Wells' First Men in the Moon), and both of us are asserting that we have seen this time and again: when the starting point is Match & Split text, the odds are high that the result will contain subtle errors in punctuation, US/UK spelling differences, words changed between editions, and so forth. All the things that do not jump out at you as "misspelled". Your experience may, obviously, differ, and it's certainly a valid point that we can end up with poor quality results for other reasons too.
Your argumentum ad absurdum arguments are also well taken, but nobody's arguing we go hog-wild and delete everything. Languageseeker, specifically, went on an import-spree from Gutenberg (and managed to piss off the Distributed Proofreaders in the process), snarfing in a whole bunch of texts in a short period of time. All of these are secondary transcriptions, and Languageseeker was never going to proofread these themselves (their idea was almost certainly to either transclude them as is, or to run them in the Monthly Challenge).
For these sorts of bulk actions that create an unmanageable workload to handle, I think deletion (return to the status quo ante) is a reasonable option. The same would go for the other user that bulk-imported something like 500k/1 mill. (I've got to go check that number) Page: pages of effectively uncorrected OCR. For anything else I'd be more hesitant, and certainly wouldn't want to take a position in aggregate. Those would be case-by-case stuff, but that really isn't an option for these bulk actions. Xover (talk) 07:17, 11 December 2023 (UTC)


 Comment I am against deleting the Index. Creating indexes is one of the most tedious tasks when starting a transcription. Having index pages prepared and checked against the scan will save a lot of work. Mpaa (talk) 21:46, 10 December 2023 (UTC)

This index (which has a number of duplicated pages) should be deleted in favor of Index:The Mahabharata of Krishna-Dwaipayana Vyasa (1884).djvu. Ideally, the name of the latter file should be harmonized with that of the whole set. TE(æ)A,ea. (talk) 03:48, 10 December 2023 (UTC)

Without looking too closely, this can probably be speedied as redundant. Xover (talk) 10:27, 10 December 2023 (UTC)
  • Xover: My nomination was in the hope that someone would feel compelled to get the names sorted out, which takes redirect-quash privileges. TE(æ)A,ea. (talk) 15:01, 10 December 2023 (UTC)
    If it is OK to keep the djvu extension, I can take care of the alignment. Mpaa (talk) 18:46, 10 December 2023 (UTC)

The following discussion is closed and will soon be archived:

Speedied as out of scope.

Just a photo without text. The photo is in commons. -- Beardo (talk) 10:20, 10 December 2023 (UTC)

This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --Xover (talk) 10:26, 10 December 2023 (UTC)

I am pretty sure that this volume (and the 2018, 2019, 2020, and 2021 volumes) are not actual volumes, but user-created compilations. In any case, the original Web-site is broken (for some reason), so there’s no way to confirm. TE(æ)A,ea. (talk) 15:42, 10 December 2023 (UTC)

An unsourced edition. We have three editions fully backed by scans. --EncycloPetey (talk) 20:31, 10 December 2023 (UTC)

 Delete per nom MarkLSteadman (talk) 21:16, 10 December 2023 (UTC)
 Delete PseudoSkull (talk) 21:17, 10 December 2023 (UTC)
 Delete per nom. --Xover (talk) 07:18, 11 December 2023 (UTC)

Non-scan backed edition, where a scan-backed version (of far better quality) does exist. It claims to have been taken from a website, and per the talk page of the edition, this is due to a transwiki from Wikipedia sometime back in the 2000s. We should stay as far away as possible from "website versions", especially where a scan-backed version does exist. PseudoSkull (talk) 10:20, 11 December 2023 (UTC)

Orphaned outdated subpages of Library of Congress Classification

While scrolling Category:Texts without a source I discovered that there were numerous subpages of Library of Congress Classification that are now orphaned, of the form Library of Congress Classification/Class H, subclass HB -- Economic Theory and Demography. These are superseded by those of the form Library of Congress Classification/Class H if I'm not mistaken. Don't know if there are places still linking to those old pages, but can we get away with deleting them? (Redirects could work if necessary.) Arcorann (talk) 11:34, 11 December 2023 (UTC)