Wikisource:Proposed deletions

WS:PD redirects here. For help with public domain materials, see Help:Public domain.
Proposed deletions

This page is for proposing deletion of specific articles on Wikisource in accordance with the deletion policy, and appealing previously-deleted works. Please add {{delete}} to pages you have nominated for deletion. What Wikisource includes is the policy used to determine whether or not particular works are acceptable on Wikisource. Articles remaining on this page should be deleted if there is no significant opposition after at least a week.

Possible copyright violations should be listed at Copyright discussions. Pages matching a criterion for speedy deletion should be tagged with {{sdelete}} and not reported here (see category).

SpBot archives all sections tagged with {{section resolved|1=~~~~}} after 7 days. For the archive overview, see /Archives.


Index:The trail of the golden horn.djvu

This specific index is one of many such indexes; I nominate it as an example, but should the rationale be found sound, I will endeavor to make a list of all such indexes.

This index (and many others) were created by now-absent User:Languageseeker. My main concern is that the pages of these indexes have been added via match-and-split from some source, likely Project Gutenberg, which does not have a defined original copy. Because of this absence of real source, and the similarity of the text to the actual text of any given scanned copy, proofreading efforts would likely have to either not check the text against the original source or scrap the existing text entirely to ensure accuracy to the original on Wikisource. In light of this, I think the easiest approach is to delete the indexes and all pages thereunder; if there is organic desire to scan them at some point in the future, the indexes may be re-created, but I do not see a reason to keep the indexes as they stand. TE(æ)A,ea. (talk) 19:12, 9 December 2023 (UTC)Reply

  •  Comment Hmm. I don't see the Index: pages as problematic. But the "Not Proofread" Page: pages that were, as you say, created by Match & Split from a secondary transcription (mostly Gutenberg, but also other sources), I do consider problematic. We don't permit secondary transcriptions added directly to mainspace, so to permit them in Page: makes no sense. And in addition to the problems these create for Proofreading that TE(æ)A,ea. outlines, it is also an issue that many contributors are reluctant to work on Index:es with a lot of extant-but-not-Proofread (i.e. "Red") pages.
    We have around a million (IIRC; it may be half a mill.) of these that were bot-created with essentially raw OCR (the contributor vehemently denies they are "raw OCR", so I assume some fixes were applied, but the quality is very definitely not Proofread). Languageseeker's imports are of much higher quality, but are still problematic. I think we should get rid of both these classes of Page: pages. In fact, I think we should prohibit Not Proofread pages from being transcluded to mainspace (except as a temporary measure, and possibly some other common sense exceptions). --Xover (talk) 20:24, 9 December 2023 (UTC)Reply
    • Xover: Assuming the status of the works to be equal, I would actually consider Languageseeker’s page creations to be worse, because, while it would look better as transcluded, it reduces the overall quality of the transcription. My main problem with the other user’s not-proofread page creations was that he focused a lot on indexes of very technical works, but provided no proofread baseline on which other editors could continue work—that was my main objection at the time, as it is easier to come on and off of work where there is an established style (for a complicated work) as opposed to starting a project and creating those standards yourself. As to the Page:/Index: issue, I ask for index deletion as well because these indexes were created only as a basis for the faulty text import, and I don’t want that to overlook any future transcription of those works. Again, I have no problem with work (or re-creation), I just think that these indexes (which are clearly abandoned, and were faulty ab origine) should be deleted. As for transclusion of not-proofread pages, I don’t think that the practice is so widespread that a policy needs to be implemented (from my experience, at least); the issue is best dealt with on a case-by-case basis, or rather a user-by-user basis (as users can have different ways of turning raw OCR into not-proofread text, then following transclusion and finally proofread status). But of course, that (and the other user’s works, the indexes for which I think should probably be deleted) are a discussion for another time. (I will probably have more spare time starting soon, so I might start a discussion about the other user’s works after this discussion concludes.) TE(æ)A,ea. (talk) 02:28, 10 December 2023 (UTC)
      I'm not understanding what fault there is in the Index page. If the Page: pages had not been created, what problem would exist in the Index: page? --EncycloPetey (talk) 02:53, 10 December 2023 (UTC)Reply
      • EncycloPetey: This isn’t a case where the index page’s existence is inherently bad; but the pages poison the index, in terms of future (potential) proofreading efforts and in terms of abandonment. TE(æ)A,ea. (talk) 03:07, 10 December 2023 (UTC)Reply
        @TE(æ)A,ea.: Just to be clear, if the outcome here is to delete all the "Not Proofread" Page: pages, would you still consider the Index: pages bad (should be deleted)? So far that seems to be the most controversial part of this discussion, and the part that is a clear departure from established practice. Xover (talk) 07:40, 20 December 2023 (UTC)Reply
        • Xover: Yes, I think those are also bad. They were created en masse for the purpose of adding this poor match-and-split text, and there is no additional value in keeping around hundreds of unused indexes whose only purpose was to facilitate a project consensus (here) clearly indicates is unwise. The main objection on that ground is that indexes are difficult to make; but that is not really true, and in any case is not a real issue, as a new editor who wishes to edit (but not create an index) can simply ask for one to be created. Another problem with these indexes is that they are not connected with other information (like the Author:-pages) that would help new editors find them. Insofar as they exist like this, the only real connection these indexes have to the project at large is through Languageseeker, who is now no longer editing. I don’t think that every abandoned index is a nuisance, but I do believe that this (substantial) group of mass-created indexes is a problem. TE(æ)A,ea. (talk) 21:10, 20 December 2023 (UTC)
  • I support deleting the individual pages of the index. As for the Index page itself, I am OK with both deleting it as abandoned or keeping it to wait for somebody to start the work anew. I also support getting rid of other similar secondary transcriptions. If a discussion on prohibiting transclusion of not-proofread pages into main NS is started somewhere, I will probably support it too. --Jan Kameníček (talk) 00:39, 10 December 2023 (UTC)Reply
 Comment I've always felt uncomfortable with the tendency of some users to want to bulk-add a bunch of Index pages which have the pages correctly labelled, but are left indefinitely with no pages proofread in them. I feel like a "transcription project" (as Index pages are labelled in templates) implies an ongoing, or at least somewhat complete, ordeal, and adding index pages without proofreading anything is really just duplicating data from other places into Wikisource. Not to say there's absolutely no value in adding lots of index pages this way, but the value seems minimal. The fact that index pages mostly rely on duplicate data as it is is already an annoying redundancy on the site, and I think most of what happens on Index pages should just be dealt with in Wikidata, so I think the best place to bulk-add data about works is there, not by mass-creating empty Index pages. I know my comment here is kind of unrelated to the specific issue of the discussion (being, indexes with pages matched and splitted or something), but the same user (Languageseeker) has tended to do that as well. I am struggling to come up with any specific arguments or policies to support my position against those empty index pages... but it just seems unnecessary, seems like it will cause problems in the future, and on a positive note I do applaud Languageseeker's massive effort—it shows something great about their character as an editor—but unfortunately I think their effort should have been more focused on areas other than the creation of as many Index pages as possible. PseudoSkull (talk) 04:15, 10 December 2023 (UTC)Reply
Bulk-adding anything is probably a bad idea on Wikisource, because so much of what we do here requires a human touch. That being said, so far as I know the Index: pages Languageseeker created were perfectly fine in themselves, including having correct pagelists etc. This step is often complicated for new contributors, so creating the Index: without Proofreading anything is not without merit. It's pointing at an already set up transcription project onsite vs. just (ext)linking to a scan at IA for some users. The latter is an insurmountable effort for quite a lot of contributors. We also have historically permitted things to sit indefinitely in our non-content namespaces if they are merely incomplete rather than actually wrong in some way.
That's not to say that all these Index: pages are necessarily golden, but imo those that are problematic (if any) should be dealt with individually. Xover (talk) 09:08, 10 December 2023 (UTC)Reply
Oh, also, what we host on Wikidata vs. what's hosted locally in our Index: pages is a huge and complicated discussion (hmu if you want the outline). For the purposes of this discussion it, imo, makes the most sense to just view that as an entirely orthogonal issue. If and when (and how and why and...) we push some or all our Index: page contents somewhere other than our current solution, it'll deal with these Index:es as well as every other. Xover (talk) 07:33, 20 December 2023 (UTC)Reply
 Comment I do not support creating them, but since they exist, I try to make good use of them. I usually proofread offline for convenience and when I add the text I check the diff. If anything differs, it is an extra check for me as I could be the one who made mistakes. So I would keep them.
BTW, nothing forbids pressing the OCR button and restarting. Mpaa (talk) 18:35, 10 December 2023 (UTC)
While that is true, my experience is that the kinds of errors introduced by a mystery text layer are insidious, and most editors are unaware of the issue, or fail to notice small problems such as UK/US spelling differences, changes to punctuation, minor words changed, etc. So, while a person could reset the text, what would alert them to the fact that they should, rather than working from the existing unproofed page?
H. G. Wells' First Men in the Moon is a prime example. A well-meaning editor matched-and-split the text into the scan. Two experienced editors crawled through making multiple corrections to validate the work, yet as recently as this past week we have had editors continue to find small mistakes throughout. Experience shows that match-and-split text is actually worse for Wikisource proofreading than the raw OCR because of these persistent text errors. --EncycloPetey (talk) 18:51, 10 December 2023 (UTC)Reply
In my workflow, I start from OCR, then compare what I did with what is available. It is an independent reference which I use for quality check. The probability that I did the same error is low (and the error would be anyhow there). It is almost as if someone is validating my text (or vice-versa). For me it is definitely a help. I follow the same process when validating text. I do not look at what is there and then compare. Mpaa (talk) 19:21, 10 December 2023 (UTC)Reply
Right. You do that, and I work similarly. But experience shows that the vast majority of contributors don't do that; they either don't touch the text due to the red pages, or they try to proofread off the extant text and leave behind subtle errors as EncycloPetey outlines. Xover (talk) 19:35, 10 December 2023 (UTC)Reply
We could argue forever. I do not know what evidence you have to say that works started from match-and-split are worse than others. I doubt anyone has real numbers to say that. IMHO it all depends on the attitude of contributors. I have seen works reaching a Validated stage and being crappy all the same. If you want to be consistent, you should delete all pages in a NotProofread state and currently not worked on because I doubt a non-experienced user will look where the text is coming from when editing, from a match-and-split or whatever.
Also, then we should shut down the match-and-split tool or let only admins run it, after being 100% sure that the version to split is the same as the version to scan.
I am not advocating it as a process, I am only saying that what is there is there and it could be useful to some. If the community will decide otherwise, fine, I can cope with that. Mpaa (talk) 20:32, 10 December 2023 (UTC)Reply
"I do not know what evidence you have to say that works started from match-and-split are worse than others." Anecdotal evidence only, certainly. But EncycloPetey gave a concrete example (H. G. Wells' First Men in the Moon), and both of us are asserting that we have seen this time and again: when the starting point is Match & Split text, the odds are high that the result will contain subtle errors in punctuation, US/UK spelling differences, words changed between editions, and so forth. All the things that do not jump out at you as "misspelled". Your experience may, obviously, differ, and it's certainly a valid point that we can end up with poor quality results for other reasons too.
Your argumentum ad absurdum arguments are also well taken, but nobody's arguing we go hog-wild and delete everything. Languageseeker, specifically, went on an import-spree from Gutenberg (and managed to piss off the Distributed Proofreaders in the process), snarfing in a whole bunch of texts in a short period of time. All of these are secondary transcriptions, and Languageseeker was never going to proofread these themselves (their idea was almost certainly to either transclude them as is, or to run them in the Monthly Challenge).
For these sorts of bulk actions that create an unmanageable workload to handle, I think deletion (return to the status quo ante) is a reasonable option. The same would go for the other user that bulk-imported something like 500k/1 mill. (I've got to go check that number) Page: pages of effectively uncorrected OCR. For anything else I'd be more hesitant, and certainly wouldn't want to take a position in aggregate. Those would be case-by-case stuff, but that really isn't an option for these bulk actions. Xover (talk) 07:17, 11 December 2023 (UTC)Reply


 Comment I am against deleting the Index. Creating indexes is one of the most tedious tasks when starting a transcription. Having index pages prepared and checked against the scan will save a lot of work. Mpaa (talk) 21:46, 10 December 2023 (UTC)
  •  Keep the Index, but  Delete the pages. None of the bot-created pages have the header, which is a pain to add after-the-fact unless you can run a bot. The fact that they were created by match-and-split, instead of proofreading the text layer is poor practice. --EncycloPetey (talk) 19:15, 20 December 2023 (UTC)Reply
    There are many recently added "new texts" with no headers. Mpaa (talk) 22:00, 23 December 2023 (UTC)Reply
    • What percent of editors want headers; and what percent do not care? Do you have data? --EncycloPetey (talk) 22:03, 23 December 2023 (UTC)Reply
      No, I am only stating that it is not a good argument for deletion in my opinion, unless it is considered mandatory. Mpaa (talk) 22:22, 23 December 2023 (UTC)
      • It is a good argument if most potential editors want to include the headers, and are put off working on proofreading by the fact that pages were created without the headers in place. There are works I've chosen not to work on for this reason. --EncycloPetey (talk) 23:04, 23 December 2023 (UTC)Reply
      I agree that on its own the lack of headers is not a good argument for deletion. But I read it here to be intended as one additional factor on the scales that added together favour deletion. Which I do think is a valid argument (one can disagree, of course). Xover (talk) 23:57, 23 December 2023 (UTC)Reply
    • Mpaa: That is the result of the efforts of one user, who has declared headers superfluous. I was going to start another discussion on that topic after this one (only one big discussion at a time for me, please). I think that, for all editors who want headers (most of them), not having them (because of the match-and-split seen here) is bad. Also, in response to your other comments above about proofreading over existing text, I usually do that as well, but I prefer proofreading on my own, without needing to check against a base—that’s why I focus on proofreading, not validation. For that same reason, I avoid all-not-proofread indexes like those at issue here. TE(æ)A,ea. (talk) 23:22, 23 December 2023 (UTC)Reply
      I was thinking the same about headers, it would be good to have a consistent approach about works, in all their parts/namespaces. Mpaa (talk) 09:47, 24 December 2023 (UTC)Reply
 Comment In the future, if anyone feels blocked by the lack of headers, or wants to add headers, please make a bot request. Mpaa (talk) 09:47, 24 December 2023 (UTC)
 Comment I am proofreading this specific text. This discussion can serve as a reference for the other indexes, as TE(æ)A,ea. mentioned at the beginning of the discussion. BTW, a list would be useful, so I can fetch before a (possible) deletion. Mpaa (talk) 12:53, 2 January 2024 (UTC)

The Picture in the House (unknown)

Duplicative of Weird Tales/Volume 3/Issue 1/The Picture in the House, starting discussion to decide whether to remove or migrate the librivox recording. MarkLSteadman (talk) 06:43, 29 December 2023 (UTC)Reply

Gah. Tough call.
The two texts are not the same. Both Weird Tales in 1924 and the 1937 reprint use "… the antique and repellent wooden building which blinked with bleared windows from between two huge leafless oaks near the foot of a rocky hill", but the unsourced text uses "elms". LibriVox for once actually gives a source, and in the case of File:LibriVox - picture in the house lovecraft sz.ogg that source is The Picture in the House (unknown) (modulo a page move after the fact here), and the audio narration does match (uses "elms"). The change to "elms" seems to be a later innovation, possibly applied by an editor as late as 1982 (Bloodcurdling Tales of Horror and the Macabre, the earliest use of "elms" there I could find right now), and the likely ultimate source of our text. The texts differ in other ways too, but up to this point the difference could be explained by transcription errors, lack of scan-backing and validation, etc.
So… I don't think we can move the LibriVox file over to our new text (different edition). And because the nominated text is from an indeterminate edition and we have a scan-backed version of this work, we should  Delete The Picture in the House (unknown) too.
But it's really annoying that when LibriVox for once both gives the source text they have used for their reading and actually links back to us, we have to delete the page. I wish they'd coordinate more with us on issues like this so we could get the maximum benefit out of our respective volunteer efforts. Xover (talk) 08:38, 29 December 2023 (UTC)Reply
I guess that the LibriVox version dates to when this was the only version available. Can we put the LibriVox link on The Picture in the House? -- Beardo (talk) 18:33, 29 December 2023 (UTC)
Hmm, no, I don't think so. We can't start amassing random multimedia versions of texts at the dab pages. Eventually we want spoken-word versions of our texts automatically linked from data on Wikidata, and that requires control over which specific edition the spoken-word version is from. Xover (talk) 10:49, 31 December 2023 (UTC)Reply
weak  Delete - it would probably be better for us to just start from scratch, although I recognize its value as being linked to from LibriVox, so maybe it could just be redirected to the current scanned version instead of outright deleted. SnowyCinema (talk) 03:30, 8 March 2024 (UTC)Reply
  • But start from scratch using what? The issue is that our scan-backed copy has a different text from the LibriVox recording. The text of the nominated copy can be attested, but not (yet) from a volume dated before 1945. Ideally, we would find a PD volume with the current text. --EncycloPetey (talk) 04:35, 18 March 2024 (UTC)Reply

Excerpt of just parts of the title page (a pseudo-toc) of an issue of the journal of record for the EU. Xover (talk) 11:29, 11 February 2024 (UTC)Reply

Also Official Journal of the European Union, L 078, 17 March 2014 Xover (talk) 11:34, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 087I, 15 March 2022 Xover (talk) 11:35, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 110, 8 April 2022 Xover (talk) 11:36, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 153, 3 June 2022 Xover (talk) 11:37, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 066, 2 March 2022 Xover (talk) 11:39, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 116, 13 April 2022 Xover (talk) 11:39, 11 February 2024 (UTC)
  •  Keep This isn't an excerpt; it matches the Contents page of the on-line journal and links to the same items, which have also been transcribed. The format does not match as closely as it might, but it's not an excerpt. --EncycloPetey (talk) 04:52, 12 February 2024 (UTC)Reply
    That's not the contents page of the online journal, it's the download page for the journal that happens to display the first page of the PDF (which is the title page, that also happens to list the contents). See here for the published form of this work. What we're hosting is a poorly-formatted de-coupled excerpt of the title page. It's also—regardless of sourcing—just a loose table of contents. Xover (talk) 07:09, 13 February 2024 (UTC)Reply
    I don't understand. You're saying that it matches the contents of the journal, yet somehow it also doesn't? Yet, if I click on the individual items in the contents, I get the named items on a subpage. How is this different from what we do everywhere else on Wikisource? --EncycloPetey (talk) 16:35, 13 February 2024 (UTC)Reply
    They are loose tables of contents extracted from the title pages of issues of a journal. They link horizontally (not to subpages) to extracted texts and function like navboxes, not tables of contents on the top level page of a work. That their formatting is arbitrary and Wikipedia-like just reinforces this.
    The linked texts should strictly speaking also be migrated to a scan of the actual journal, but since those are actual texts (and not a loose navigation aid) I'm more inclined to let them sit there until someone does the work to move them within the containing work and scan-backing them. Xover (talk) 08:35, 20 February 2024 (UTC)Reply
    So, do I understand then that the articles should be consolidated as subpages, like a journal? In which case, these pages are necessary to have as the base page. Deleting them would disconnect all the component articles. It sounds more as though you're unhappy with the page formatting, rather than anything else. They are certainly not "excerpts", which was the basis for nominating them for deletion, and with that argument removed, there is no remaining basis for deletion. --EncycloPetey (talk) 19:41, 25 February 2024 (UTC)Reply

Interlinear Greek Translation:Bible

While I understand that there is value to interlinear texts, we generally only host one user translation of a given work, and we already have Translation:Bible. —Beleg Âlt BT (talk) 20:53, 21 February 2024 (UTC)Reply

@Bobdole2021: I believe this is a project you are involved in —Beleg Âlt BT (talk) 20:54, 21 February 2024 (UTC)Reply
  •  Comment It depends on the source text, in this case, I would think. Translation:Bible is presumably using the Masoretic text (in Hebrew) for the Old Testament, which is different from the source text for Interlinear Greek Translation:Bible, which is from the Greek Septuagint. This is one reason we want our user-created translations to clearly identify what they are translating, so that we can determine what is happening in cases like this. If one is translating the Masoretic, and another the Septuagint, and another the Latin Vulgate, then that feels like legitimate separate translations. But without having a clearly identified starting point for the translation, we cannot determine that. And even for "the Septuagint", whose edition of the Septuagint is being used? There are whole volumes listing the differences between the various Greek copies of the Septuagint. --EncycloPetey (talk) 21:09, 21 February 2024 (UTC)Reply
    Translation:Bible is a bit unique, in that every book has (or should have) its own source. So Hebrew works like Translation:Genesis would be translated from Hebrew, while Greek works like Translation:Esther (Greek) would be translated from Greek, Translation:1 Meqabyan from Ge'ez, etc. —Beleg Âlt BT (talk) 21:13, 21 February 2024 (UTC)Reply
    There would be a good argument to be made, that Translation:Bible should actually be in Portal space, since it is a list of separate works rather than a cohesive work itself. —Beleg Âlt BT (talk) 21:15, 21 February 2024 (UTC)Reply
    Yes, but also no. The Masoretic text is a cohesive collection, and there are published editions that can be used as a basis for translation. The Vulgate is a cohesive collection, and it has published editions too. But the Dead Sea Scrolls are not; they are a collection by virtue of being discovered in the same location together. And even if you consider Genesis a "book" in its own right, there is still no single source text. There is the Masoretic edition in Hebrew, and the Septuagint editions in Greek, and the Vulgate edition of Jerome in Latin. There is not even an editio princeps as often happens with classical texts. Considering Genesis (and the other "books" of the Bible) to be works in their own right does nothing to help the fundamental issues here. --EncycloPetey (talk) 21:39, 21 February 2024 (UTC)Reply
    Unfortunately, Translation:Bible is nowhere near as sensible as all that :p
    I wonder if we could approach this by splitting Translation:Bible into component sources?
    • Hebrew OT books based on the Masoretic text, which is what I assume heWS has (he:ביבליה, not scan-backed)
    • Greek OT books based on the Septuagint, which is what I assume elWS has (el:Η Αγία Γραφή, also not scan-backed)
    • Greek NT books based on this scan at elWS
    • The others I'll need to research further but you get the gist.
    We already have separate translations for Esther (Hebrew and Greek) and Psalms (Hebrew, Greek, and Syriac) so we can just extend this to the rest of the works I guess.
    My main concern is the idea of having a separate "regular" translation and "interlinear" translation of the same work; otherwise, I'm open to whatever needs to be done to clean this mess up —Beleg Âlt BT (talk) 14:55, 26 February 2024 (UTC)Reply
    Perhaps we consider an interlinear translation to be something like the Translation equivalent of an Annotated text, requiring a "clean" copy to exist first? --EncycloPetey (talk) 00:48, 27 February 2024 (UTC)Reply
    Maybe. I seem to recall that at some point we explicitly disallowed interlinear translations, but I can't find it now.
    [update] I found it: WS:ANN disallows "Comparison pages: Pages from different versions of the same work, whether whole works or extracts, placed alongside each other (whether in series or in parallel) to provide a comparison between the different versions." I'm not sure whether that would apply here though. —Beleg Âlt BT (talk) 14:14, 27 February 2024 (UTC)Reply
  •  Comment I'm leaning toward saying we have "no consensus" on this particular item, because we really need a broader discussion where informed folks lay out the issues at stake, and because we probably need a decision on how we want to handle Wikisource translations of the Bible. A lot of things are in play here. --EncycloPetey (talk) 18:12, 13 April 2024 (UTC)Reply
  •  Delete A scan supported original language work must be present on the appropriate language wiki, where the original language version is complete at least as far as the English translation. Wikisource translations, like Annotations, are deliberately restrictive. But once the original text exists as a proofread and scan-backed work at grWS (or mulWS, or…) the discussion here on enWS is going to be vastly simplified. Also: The English Wikisource only collects texts written in the English language. Texts in other languages should be placed in the appropriate language subdomain, or at the general multi-language website [mulWS]. And that's in addition to the guidance for annotations that Beleg Âlt cited above. With strikes against it in three policies I don't see how this particular snowball is helped much by SPF 50. --Xover (talk) 18:55, 13 April 2024 (UTC)Reply
    Are you advocating for deletion of all WS-original translations of the Bible? Are you advocating for deletion of all side-by-side original translations we have? Both of those positions would result in the deletion of a huge number of pages here. Hence, it is a much larger issue than just this particular translation. --EncycloPetey (talk) 19:07, 13 April 2024 (UTC)Reply
    I'm not advocating mass-deletion of much of anything that was created before our current standards were in place (but do hold them to modern standards if they come up individually for other reasons). I'm saying this text, which is relatively speaking a new text, has so many strikes against it that it's not a difficult call. But I do think we should enforce current standards for all new texts (and this one should have been caught in patrolling when created), and most especially we should not turn a blind eye to Translation:-space as some kind of free-for-all.
    I haven't gone looking at what the other Wikisource translations of the bible look like, so I don't know what issues apply to them. But based on the above I suspect where we run into thorny issues is where someone wants to translate archeological artefacts rather than actually published works. In which case, let's apply our existing policies and standards to the ones that were actually published in any meaningful sense and save the big discussion for the archeological artefacts (which exception we may or may not want to accommodate). Xover (talk) 19:40, 13 April 2024 (UTC)
    We have a grandfather rule in WS:T which would exclude Translation:Bible from the requirement that the original be scan-backed on grWS, but I agree that this would be a strike against Interlinear Greek Translation:Bible if we chose to enforce it since it was added after the 2013 changes to WS:T. —Beleg Tâl (talk) 19:42, 13 April 2024 (UTC)Reply
    (On the other hand, I don't see how Xover's second quote would apply, because the very next sentence in that policy is "However, English Wikisource does collect English translations of non-English texts, as well as bilingual editions in which the target language of the translation is English") —Beleg Tâl (talk) 19:43, 13 April 2024 (UTC)Reply
    It's the English bits that are the essence (and for bilingual works we usually transcribe only the English pages). Granted that's watered down a bit by the very wide definition of "English" applied, but it doesn't extend to Greek (Ancient or Modern) beyond short quotes etc. Xover (talk) 19:55, 13 April 2024 (UTC)Reply
    While what you say has been true here for transcriptions of published works, that principle has not been applied to original translations created here. We have a very large number of bitexts in the Translations namespace, so a decision here, using that principle, would affect a very large number of our Translations. So, if this were a transcription of a published work, yes, but in this situation, the waters are far muddier. --EncycloPetey (talk) 20:09, 13 April 2024 (UTC)
    But original translations are supposed to be "transcriptions of published works", just ones where the text is translated into English as it is transcribed. I'm not saying to go retroactively apply this to every old text we have. I'm just saying this text, which is comparatively very recent and came up for discussion here individually and for other reasons, should get the actual standards applied. Xover (talk) 06:47, 19 April 2024 (UTC)Reply
    So you agree that what you are advocating would be a change to practice? What I am saying is that such a change in practice deserves a broader consideration for its impact, beyond the one work. --EncycloPetey (talk) 17:21, 19 April 2024 (UTC)

A court document allegedly from Lexis Advance, but the source is not available to people who do not have an account of the Multimedia University (from Malaysia). With the source being inaccessible it is impossible to say whether it is a second-hand transcription or a transcription of an original document. The biggest problem is that the text is not available anywhere else either, and so it is absolutely impossible to check whether the transcription is correct and complete, whether it was originally in English or it is a translation from Malay (which would raise questions about copyright), or even whether such a text really exists (I believe it does, but we have to be able to check it).

Pinging also Ong Kai Jin, who I have already asked to add a proper licence tag but without any reaction. -- Jan Kameníček (talk) 18:09, 7 March 2024 (UTC)

If indeed this is a document that originated (from legislators) in English it would be in the public domain as an edict of a government, but like you said it is also impossible to tell if that's the case without access to the original source. So,  Delete until evidence of source is provided. SnowyCinema (talk) 00:49, 8 March 2024 (UTC)Reply
I have trimmed the source link so that it no longer depends on the institution's account, but a subscription is still required for access; I am sorry if this is also not accepted. I would say this is the only authentic and original source, since this is an 'unreported' case that was not included in the printed journal, and LexisNexis is the publisher of this journal. The issue is that there is no direct way for other users to validate the text, and I cannot help with that.
Regarding copyright: literary work, which is copyrightable, does not include judicial decisions. This is stated in Section 3 of the Copyright Act 1987. The license tag Template:PD-Malaysia has been prepared. I assume there is no copyright issue here. Ong Kai Jin (talk) 16:34, 8 March 2024 (UTC)
@Ong Kai Jin - to be hosted here, a work needs to be in public domain under US copyright law. I am still unclear - was the actual judgement in English ? -- Beardo (talk) 17:46, 8 March 2024 (UTC)Reply
I think it is reasonable to be under the public domain in both countries? Yes, the actual judgement was written in English. Why is it suspected to be in Malay language? Ong Kai Jin (talk) 19:52, 8 March 2024 (UTC)Reply
 Keep a court judgement issued in English is solidly in scope as {{PD-EdictGov}}. That said, @Ong Kai Jin: is there any way this work can be exported from LexisNexis in PDF format so that it can be properly proofread? —Beleg Tâl (talk) 23:14, 8 March 2024 (UTC)
The website provides two versions of PDF download, the user-customizable and the court-ready, but the court-ready version is not available for 'unreported' cases such as this work. I feel that using the custom PDF would have no authentic value. Ong Kai Jin (talk) 20:23, 9 March 2024 (UTC)

Translation:La Serva Padrona

There is no scan-supported original-language work present on the appropriate Italian Wikisource, as required by Wikisource:Translations. -- Jan Kameníček (talk) 09:50, 28 March 2024 (UTC)

Contracts Awarded by the CPA

Out of scope per WS:WWI as it's a mere listing of data devoid of any published context. Xover (talk) 12:53, 31 March 2024 (UTC)

 Keep if scan-backed to this PDF document. Since the PDF document is from 2004, a time when the WWW existed but wasn't nearly as universal to society as today, I find the thought that this wasn't printed and distributed absurdly unlikely. And the copyright license would be PD-text, since none of the text is complex enough for copyright, being a list of general facts. Also, this document is historically significant, since it involves the relationships between two federal governments during a quite turbulent war in that region. SnowyCinema (talk) 14:25, 31 March 2024 (UTC)Reply
(And it should be renamed to "CPA-CA Register of Awards" to accurately reflect the document.) SnowyCinema (talk) 14:32, 31 March 2024 (UTC)
It's still just a list of data devoid of any context that might justify its inclusion (like if it were, e.g., the appendix to a report on something or other). Xover (talk) 19:51, 13 April 2024 (UTC)
Maybe I should write a user essay on this, since this is something I've had to justify in other discussions, so I can just link to that in the future.
I don't take the policy to mean we don't want compilations of data on principle, or else we'd be deleting works like the US copyright catalogs (which, despite containing introductions, etc., are fundamentally just lists of data). The policy gives its justification on the very page. What we're trying to avoid is, rather, "user-compiled and unverified" data, like Wikisource editors (not external publications) listing resources for a certain project. And if you personally disagree, that's fine, but that's how I read the sentiment of the policy. I think that anything that was published, or at least printed or collected by a reputable-enough source, should be considered fair game. I'm more interested in weeding out research that was compiled on the fly by individual newbie editors than official federal government compilations.
But to be fair, even by my line of logic, this is sort of an iffy case, since the version of the document I gave provides absolutely no context besides "CPA-CA REGISTER OF AWARDS (1 JAN 04- 10 APRIL 04)", so it is difficult to verify the actual validity of the document's publication in 2004. But I would lean to keep this just because I think the likelihood is in favor of the document being valid, and the data is on a notable subject. And if evidence comes to light that proves its validity beyond a shadow of a doubt, then certainly. SnowyCinema (talk) 00:03, 20 April 2024 (UTC)
Evidence of validity: The search metadata gives a date of April 11, 2004, and the parent URL is clearly an early 2000s web page just by the looks of it. My keep vote is sustained. SnowyCinema (talk) 00:16, 20 April 2024 (UTC)

Tale of the Doomed Prince

excerpt translation —Beleg Âlt BT (talk) 16:55, 10 April 2024 (UTC)

 Delete. I find a version of this here, but it appears to be a fairly recent translation, perhaps self-published, linking to a (publisher?) website that is no longer online (blackmask.com), with no indication of a free license. (The TOC page is explicitly copyrighted, but I see no general statement about copyright or licensing.) -Pete (talk) 23:41, 12 April 2024 (UTC)Reply
I spoke too soon. There is a scan of the 1913 book it is excerpted from here. Seems to me it could be kept, matched, and treated as an in-progress transcription of the full work. -Pete (talk) 23:44, 12 April 2024 (UTC)Reply

O My Lord, Your Dwelling Places Are Lovely

This and three other poems attributed to Judah Halevi and translator Solomon Solis-Cohen (d. 1948):

These poems were added to Wikisource in 2008 by Josette, who has not edited here in a decade. The discussion page for the first identifies this web page as the source, and I have confirmed the text matches. However, the web page makes no mention of Solis-Cohen, nor does it attribute any translator or make any assertions about original publication date. I have searched extensively for pieces of the text and metadata at google.com, archive.org, and hathitrust.org, but I've come up with nothing. It is difficult to ascertain the provenance of these translations; it seems unlikely that they are in the public domain, or that we could definitively establish where they came from. -Pete (talk) 20:15, 23 April 2024 (UTC)

Some of the poems have a little note at the bottom, saying they are from A Treasury of Jewish Poetry (1957), which does identify Solis-Cohen as the translator. However, I'm inclined to suspect these translations are copyvio. —Beleg Âlt BT (talk) 21:06, 23 April 2024 (UTC)Reply

The Athenaeum

This has been an empty page since it was created in 2015. --EncycloPetey (talk) 00:06, 25 May 2024 (UTC)

Normally I would suggest speedy deletion (no notable content or history). However, we do appear to have two articles from The Athenaeum that should be moved to subpages of that work: Folk-lore (extracted from The Athenaeum 1846-08-22) and Folk-lore (extracted from The Athenaeum 1876-08-29). —Beleg Tâl (talk) 01:07, 25 May 2024 (UTC)Reply
Looks like we do have justification to keep the page and convert it into a base page for the periodical (though the subpage convention might need to be worked out -- the periodical doesn't have numbered volumes but does number issues continuously). While we're here, do we have conventions on how to handle "æ" in work titles (since The Athenæum was always written as such)? Arcorann (talk) 04:09, 27 May 2024 (UTC)
We don't have a strict convention; there are benefits to both arguments (faithfulness vs accessibility). Either way, redirects should be created so that both spellings direct you to the correct work. —Beleg Âlt BT (talk) 13:24, 27 May 2024 (UTC)

Duplicate O. Henry Stories.

Nominating in bulk stories that have been proofread in collections

MarkLSteadman (talk) 22:16, 28 May 2024 (UTC)

Ok so I know I've been pushing for Proposed Deletion in such cases since they are technically different editions ... but I had another look at WS:CSD and I do think that these fall under the criteria for speedy deletion as Redundant unless they are substantially different.  Delete and convert to redirect. —Beleg Âlt BT (talk) 13:20, 29 May 2024 (UTC)Reply
It takes you a while, but you do usually see sense eventually. :) Xover (talk) 06:24, 31 May 2024 (UTC)
:D —Beleg Tâl (talk) 17:21, 31 May 2024 (UTC)
The plan was always to convert these to redirects once my little O. Henry project was complete. It's very likely that there will be more batches of these in future, but I haven't mapped what non-scan backed dross we have sitting around. In any case,  Delete to convert to redirects. Xover (talk) 06:23, 31 May 2024 (UTC)Reply

Some old bills

These are not scan-backed, of doubtful utility (especially as they are not laws, and the version of the bill is not clear), and they are generally a poorly-formatted mess with clearly editorial hyper-links (many broken) scattered amidst. I may find a few more, but to start:

In addition, these both belong to the roughly hundred-member Category:Proposed United States federal law of the 113th Congress; these should probably all be deleted, as well. TE(æ)A,ea. (talk) 21:43, 8 June 2024 (UTC)Reply

Organon (Owen)/Categories/annotated: redundant copy of scan-backed transclusion

The "annotated" page is an incomplete transcription of Organon (Owen)/Categories; the latter page now properly transcludes the footnotes and sidenotes, making the "annotated" page redundant. Overthrows (talk) 18:02, 11 June 2024 (UTC)