Wikisource:Proposed deletions

WS:PD redirects here. For help with public domain materials, see Help:Public domain.
Proposed deletions

This page is for proposing deletion of specific articles on Wikisource in accordance with the deletion policy, and appealing previously-deleted works. Please add {{delete}} to pages you have nominated for deletion. What Wikisource includes is the policy used to determine whether or not particular works are acceptable on Wikisource. Articles remaining on this page should be deleted if there is no significant opposition after at least a week.

Possible copyright violations should be listed at Copyright discussions. Pages matching a criterion for speedy deletion should be tagged with {{sdelete}} and not reported here (see category).

SpBot archives all sections tagged with {{section resolved|1=~~~~}} after 7 days. For the archive overview, see /Archives.


Index:The trail of the golden horn.djvu


This specific index is one of many such indexes; I nominate it as an example, but should the rationale be found sound, I will endeavor to make a list of all such indexes.

This index (and many others) were created by now-absent User:Languageseeker. My main concern is that the pages of these indexes have been added via match-and-split from some source, likely Project Gutenberg, which does not have a defined original copy. Because of this absence of real source, and the similarity of the text to the actual text of any given scanned copy, proofreading efforts would likely have to either not check the text against the original source or scrap the existing text entirely to ensure accuracy to the original on Wikisource. In light of this, I think the easiest approach is to delete the indexes and all pages thereunder; if there is organic desire to scan them at some point in the future, the indexes may be re-created, but I do not see a reason to keep the indexes as they stand. TE(æ)A,ea. (talk) 19:12, 9 December 2023 (UTC)Reply

  •  Comment Hmm. I don't see the Index: pages as problematic. But the "Not Proofread" Page: pages that were, as you say, created by Match & Split from a secondary transcription (mostly Gutenberg, but also other sources), I do consider problematic. We don't permit secondary transcriptions added directly to mainspace, so to permit them in Page: makes no sense. And in addition to the problems these create for Proofreading that TE(æ)A,ea. outlines, it is also an issue that many contributors are reluctant to work on Index:es with a lot of extant-but-not-Proofread (i.e. "Red") pages.
    We have around a million (IIRC; it may be half a mill.) of these that were bot-created with essentially raw OCR (the contributor vehemently denies they are "raw OCR", so I assume some fixes were applied, but the quality is very definitely not Proofread). Languageseeker's imports are of much higher quality, but are still problematic. I think we should get rid of both these classes of Page: pages. In fact, I think we should prohibit Not Proofread pages from being transcluded to mainspace (except as a temporary measure, and possibly some other common sense exceptions). --Xover (talk) 20:24, 9 December 2023 (UTC)Reply
    • Xover: Assuming the status of the works to be equal, I would actually consider Languageseeker’s page creations to be worse, because, while it would look better as transcluded, it reduces the overall quality of the transcription. My main problem with the other user’s not-proofread page creations was that he focused a lot on indexes of very technical works, but provided no proofread baseline on which other editors could continue work; that was my main objection at the time, as it is easier to come on and off of work where there is an established style (for a complicated work) as opposed to starting a project and creating those standards yourself. As to the Page:/Index: issue, I ask for index deletion as well because these indexes were created only as a basis for the faulty text import, and I don’t want that to overlook any future transcription of those works. Again, I have no problem with work (or re-creation), I just think that these indexes (which are clearly abandoned, and were faulty ab origine) should be deleted. As for transclusion of not-proofread pages, I don’t think that the practice is so widespread that a policy needs to be implemented (from my experience, at least); the issue is best dealt with on a case-by-case basis, or rather a user-by-user basis (as users can have different ways of turning raw OCR into not-proofread text, then following transclusion and finally proofread status). But of course, that (and the other user’s works, the indexes for which I think should probably be deleted) is a discussion for another time. (I will probably have more spare time starting soon, so I might start a discussion about the other user’s works after this discussion concludes.) TE(æ)A,ea. (talk) 02:28, 10 December 2023 (UTC)
      I'm not understanding what fault there is in the Index page. If the Page: pages had not been created, what problem would exist in the Index: page? --EncycloPetey (talk) 02:53, 10 December 2023 (UTC)Reply
      • EncycloPetey: This isn’t a case where the index page’s existence is inherently bad; but the pages poison the index, in terms of future (potential) proofreading efforts and in terms of abandonment. TE(æ)A,ea. (talk) 03:07, 10 December 2023 (UTC)Reply
        @TE(æ)A,ea.: Just to be clear, if the outcome here is to delete all the "Not Proofread" Page: pages, would you still consider the Index: pages bad (should be deleted)? So far that seems to be the most controversial part of this discussion, and the part that is a clear departure from established practice. Xover (talk) 07:40, 20 December 2023 (UTC)Reply
        • Xover: Yes, I think those are also bad. They were created en masse for the purpose of adding this poor match-and-split text, and there is no additional value in keeping around hundreds of unused indexes whose only purpose was to facilitate a project that consensus (here) clearly indicates is unwise. The main objection on that ground is that indexes are difficult to make; but that is not really true, and in any case is not a real issue, as a new editor who wishes to edit (but not create an index) can simply ask for one to be created. Another problem with these indexes is that they are not connected with other information (like the Author:-pages) that would help new editors find them. Insofar as they exist like this, the only real connection these indexes have to the project at large is through Languageseeker, who is now no longer editing. I don’t think that every abandoned index is a nuisance, but I do believe that this (substantial) group of mass-created indexes is a problem. TE(æ)A,ea. (talk) 21:10, 20 December 2023 (UTC)
  • I support deleting the individual pages of the index. As for the Index page itself, I am OK with both deleting it as abandoned or keeping it to wait for somebody to start the work anew. I also support getting rid of other similar secondary transcriptions. If a discussion on prohibiting transclusion of not-proofread pages into main NS is started somewhere, I will probably support it too. --Jan Kameníček (talk) 00:39, 10 December 2023 (UTC)Reply
 Comment I've always felt uncomfortable with the tendency of some users to want to bulk-add a bunch of Index pages which have the pages correctly labelled, but are left indefinitely with no pages proofread in them. I feel like a "transcription project" (as Index pages are labelled in templates) implies an ongoing, or at least somewhat complete, ordeal, and adding index pages without proofreading anything is really just duplicating data from other places into Wikisource. Not to say there's absolutely no value in adding lots of index pages this way, but the value seems minimal. The fact that index pages mostly rely on duplicate data as it is is already an annoying redundancy on the site, and I think most of what happens on Index pages should just be dealt with in Wikidata, so I think the best place to bulk-add data about works is there, not by mass-creating empty Index pages. I know my comment here is kind of unrelated to the specific issue of the discussion (being, indexes with pages matched and splitted or something), but the same user (Languageseeker) has tended to do that as well. I am struggling to come up with any specific arguments or policies to support my position against those empty index pages... but it just seems unnecessary, seems like it will cause problems in the future, and on a positive note I do applaud Languageseeker's massive effort—it shows something great about their character as an editor—but unfortunately I think their effort should have been more focused on areas other than the creation of as many Index pages as possible. PseudoSkull (talk) 04:15, 10 December 2023 (UTC)Reply
Bulk-adding anything is probably a bad idea on Wikisource, because so much of what we do here requires a human touch. That being said, so far as I know the Index: pages Languageseeker created were perfectly fine in themselves, including having correct pagelists etc. This step is often complicated for new contributors, so creating the Index: without Proofreading anything is not without merit. It's pointing at an already set up transcription project onsite vs. just (ext)linking to a scan at IA for some users. The latter is an insurmountable effort for quite a lot of contributors. We also have historically permitted things to sit indefinitely in our non-content namespaces if they are merely incomplete rather than actually wrong in some way.
That's not to say that all these Index: pages are necessarily golden, but imo those that are problematic (if any) should be dealt with individually. Xover (talk) 09:08, 10 December 2023 (UTC)Reply
Oh, also, what we host on Wikidata vs. what's hosted locally in our Index: pages is a huge and complicated discussion (hmu if you want the outline). For the purposes of this discussion it, imo, makes the most sense to just view that as an entirely orthogonal issue. If and when (and how and why and...) we push some or all our Index: page contents somewhere other than our current solution, it'll deal with these Index:es as well as every other. Xover (talk) 07:33, 20 December 2023 (UTC)Reply
 Comment I do not support creating them, but since they exist, I try to make good use of them. I usually proofread offline for convenience and when I add the text I check the diff. If anything differs, it is an extra check for me as I could be the one who made mistakes. So I would keep them.
BTW, nothing stops anyone from pressing the OCR button and starting over. Mpaa (talk) 18:35, 10 December 2023 (UTC)
While that is true, my experience is that the kinds of errors introduced by a mystery text layer are insidious, and most editors are unaware of the issue, or fail to notice small problems such as UK/US spelling differences, changes to punctuation, minor words changed, etc. So, while a person could reset the text, what would alert them to the fact that they should, rather than working from the existing unproofed page?
H. G. Wells' First Men in the Moon is a prime example. A well-meaning editor matched-and-split the text into the scan. Two experienced editors crawled through making multiple corrections to validate the work, yet as recently as this past week we have had editors continue to find small mistakes throughout. Experience shows that match-and-split text is actually worse for Wikisource proofreading than the raw OCR because of these persistent text errors. --EncycloPetey (talk) 18:51, 10 December 2023 (UTC)Reply
In my workflow, I start from OCR, then compare what I did with what is available. It is an independent reference which I use for quality check. The probability that I did the same error is low (and the error would be anyhow there). It is almost as if someone is validating my text (or vice-versa). For me it is definitely a help. I follow the same process when validating text. I do not look at what is there and then compare. Mpaa (talk) 19:21, 10 December 2023 (UTC)Reply
Right. You do that, and I work similarly. But experience shows that the vast majority of contributors don't do that; they either don't touch the text due to the red pages, or they try to proofread off the extant text and leave behind subtle errors as EncycloPetey outlines. Xover (talk) 19:35, 10 December 2023 (UTC)Reply
We could argue forever. I do not know what evidence you have to say that works started from match-and-split are worse than others. I doubt anyone has real numbers to say that. IMHO it all depends on the attitude of contributors. I have seen works reaching a Validated stage and being crappy all the same. If you want to be consistent, you should delete all pages in a NotProofread state and currently not worked on because I doubt a non-experienced user will look where the text is coming from when editing, from a match-and-split or whatever.
Also, then we should shut down the match-and-split tool or let only admins run it, after being 100% sure that the version to split is the same as the version in the scan.
I am not advocating it as a process, I am only saying that what is there is there and it could be useful to some. If the community will decide otherwise, fine, I can cope with that. Mpaa (talk) 20:32, 10 December 2023 (UTC)Reply
I do not know what evidence you have to say that works started from match-and-split are worse than others. Anecdotal evidence only, certainly. But EncycloPetey gave a concrete example (H. G. Wells' First Men in the Moon), and both of us are asserting that we have seen this time and again: when the starting point is Match & Split text, the odds are high that the result will contain subtle errors in punctuation, US/UK spelling differences, words changed between editions, and so forth. All the things that do not jump out at you as "misspelled". Your experience may, obviously, differ, and it's certainly a valid point that we can end up with poor quality results for other reasons too.
Your argumentum ad absurdum arguments are also well taken, but nobody's arguing we go hog-wild and delete everything. Languageseeker, specifically, went on an import-spree from Gutenberg (and managed to piss off the Distributed Proofreaders in the process), snarfing in a whole bunch of texts in a short period of time. All of these are secondary transcriptions, and Languageseeker was never going to proofread these themselves (their idea was almost certainly to either transclude them as is, or to run them in the Monthly Challenge).
For these sorts of bulk actions that create an unmanageable workload to handle, I think deletion (return to the status quo ante) is a reasonable option. The same would go for the other user that bulk-imported something like 500k/1 mill. (I've got to go check that number) Page: pages of effectively uncorrected OCR. For anything else I'd be more hesitant, and certainly wouldn't want to take a position in aggregate. Those would be case-by-case stuff, but that really isn't an option for these bulk actions. Xover (talk) 07:17, 11 December 2023 (UTC)Reply
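The cross-check Mpaa describes earlier in this thread (proofread independently from OCR, then diff the result against the pre-existing text) can be sketched as follows. This is only an illustration using Python's standard `difflib`; the function name `word_differences` and the sample strings are hypothetical, and nothing here reflects a tool actually used on Wikisource.

```python
import difflib


def word_differences(own_text, reference_text):
    """Return (own, reference) word pairs wherever the two texts disagree.

    Both texts are split on whitespace, so the comparison ignores line
    breaks and only reports differing words or runs of words.
    """
    own_words = own_text.split()
    ref_words = reference_text.split()
    matcher = difflib.SequenceMatcher(a=own_words, b=ref_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(own_words[i1:i2]), " ".join(ref_words[j1:j2]))
            )
    return differences


# Hypothetical sample: the proofreader's own page versus a match-and-split page.
own = "the colour of two huge leafless oaks"
existing = "the color of two huge leafless elms"
for mine, theirs in word_differences(own, existing):
    print(f"mine: {mine!r}  existing: {theirs!r}")
```

Each reported pair still has to be checked against the scan by a person; the diff only shows where the two transcriptions disagree, not which one is right.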


 Comment I am against deleting the Index. Creating indexes is one of the most tedious tasks when starting a transcription. Having index pages prepared and checked against the scan will save a lot of work. Mpaa (talk) 21:46, 10 December 2023 (UTC)
  •  Keep the Index, but  Delete the pages. None of the bot-created pages have the header, which is a pain to add after-the-fact unless you can run a bot. The fact that they were created by match-and-split, instead of proofreading the text layer is poor practice. --EncycloPetey (talk) 19:15, 20 December 2023 (UTC)Reply
    There are many recently added "new texts" with no headers. Mpaa (talk) 22:00, 23 December 2023 (UTC)
    • What percent of editors want headers; and what percent do not care? Do you have data? --EncycloPetey (talk) 22:03, 23 December 2023 (UTC)
      No, I am only stating that it is not a good argument for deletion in my opinion, unless it is considered mandatory. Mpaa (talk) 22:22, 23 December 2023 (UTC)
      • It is a good argument if most potential editors want to include the headers, and are put off working on proofreading by the fact that pages were created without the headers in place. There are works I've chosen not to work on for this reason. --EncycloPetey (talk) 23:04, 23 December 2023 (UTC)
      I agree that on its own the lack of headers is not a good argument for deletion. But I read it here to be intended as one additional factor on the scales that added together favour deletion. Which I do think is a valid argument (one can disagree, of course). Xover (talk) 23:57, 23 December 2023 (UTC)
    • Mpaa: That is the result of the efforts of one user, who has declared headers superfluous. I was going to start another discussion on that topic after this one (only one big discussion at a time for me, please). I think that, for all editors who want headers (most of them), not having them (because of the match-and-split seen here) is bad. Also, in response to your other comments above about proofreading over existing text, I usually do that as well, but I prefer proofreading on my own, without needing to check against a base—that’s why I focus on proofreading, not validation. For that same reason, I avoid all-not-proofread indexes like those at issue here. TE(æ)A,ea. (talk) 23:22, 23 December 2023 (UTC)Reply
      I was thinking the same about headers; it would be good to have a consistent approach to works, in all their parts/namespaces. Mpaa (talk) 09:47, 24 December 2023 (UTC)
 Comment In the future, if anyone feels blocked by the lack of headers, or wants to add headers, please make a bot request. Mpaa (talk) 09:47, 24 December 2023 (UTC)
 Comment I am proofreading this specific text. This discussion can serve as a reference for the other indexes, as TE(æ)A,ea. mentioned at the beginning of the discussion. BTW, a list would be useful, so I can fetch the pages before a (possible) deletion. Mpaa (talk) 12:53, 2 January 2024 (UTC)

The Picture in the House (unknown)


Duplicative of Weird Tales/Volume 3/Issue 1/The Picture in the House, starting discussion to decide whether to remove or migrate the librivox recording. MarkLSteadman (talk) 06:43, 29 December 2023 (UTC)Reply

Gah. Tough call.
The two texts are not the same. Both Weird Tales in 1924 and the 1937 reprint use … the antique and repellent wooden building which blinked with bleared windows from between two huge leafless oaks near the foot of a rocky hill, but the unsourced text uses elms. LibriVox for once actually gives a source, and in the case of File:LibriVox - picture in the house lovecraft sz.ogg that source is The Picture in the House (unknown) (modulo a page move after the fact here), and the audio narration does match (uses "elms"). The change to "elms" seems to be a later innovation, possibly applied by an editor as late as 1982 (Bloodcurdling Tales of Horror and the Macabre, the earliest use of "elms" there I could find right now), and the likely ultimate source of our text. The texts differ in other ways too, but up to this point the difference could be explained by transcription errors, lack of scan-backing and validation, etc.
So… I don't think we can move the LibriVox file over to our new text (different edition). And because the nominated text is from an indeterminate edition and we have a scan-backed version of this work, we should  Delete The Picture in the House (unknown) too.
But it's really annoying that when LibriVox for once both gives the source text they have used for their reading and actually links back to us, we have to delete the page. I wish they'd coordinate more with us on issues like this so we could get the maximum benefit out of our respective volunteer efforts. Xover (talk) 08:38, 29 December 2023 (UTC)Reply
I guess that the LibriVox version dates to when this was the only version available. Can we put the LibriVox link on The Picture in the House? -- Beardo (talk) 18:33, 29 December 2023 (UTC)
Hmm, no, I don't think so. We can't start amassing random multimedia versions of texts at the dab pages. Eventually we want spoken-word versions of our texts automatically linked from data on Wikidata, and that requires control over which specific edition the spoken-word version is from. Xover (talk) 10:49, 31 December 2023 (UTC)Reply
weak  Delete - it would probably be better for us to just start from scratch, although I recognize its value as being linked to from LibriVox, so maybe it could just be redirected to the current scanned version instead of outright deleted. SnowyCinema (talk) 03:30, 8 March 2024 (UTC)Reply
  • But start from scratch using what? The issue is that our scan-backed copy has a different text from the LibriVox recording. The text of the nominated copy can be attested, but not (yet) from a volume dated before 1945. Ideally, we would find a PD volume with the current text. --EncycloPetey (talk) 04:35, 18 March 2024 (UTC)Reply

Excerpt of just parts of the title page (a pseudo-toc) of an issue of the journal of record for the EU. Xover (talk) 11:29, 11 February 2024 (UTC)Reply

Also Official Journal of the European Union, L 078, 17 March 2014 Xover (talk) 11:34, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 087I, 15 March 2022 Xover (talk) 11:35, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 110, 8 April 2022 Xover (talk) 11:36, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 153, 3 June 2022 Xover (talk) 11:37, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 066, 2 March 2022 Xover (talk) 11:39, 11 February 2024 (UTC)
Also Official Journal of the European Union, L 116, 13 April 2022 Xover (talk) 11:39, 11 February 2024 (UTC)
  •  Keep This isn't an excerpt; it matches the Contents page of the on-line journal and links to the same items, which have also been transcribed. The format does not match as closely as it might, but it's not an excerpt. --EncycloPetey (talk) 04:52, 12 February 2024 (UTC)Reply
    That's not the contents page of the online journal, it's the download page for the journal that happens to display the first page of the PDF (which is the title page, that also happens to list the contents). See here for the published form of this work. What we're hosting is a poorly-formatted de-coupled excerpt of the title page. It's also—regardless of sourcing—just a loose table of contents. Xover (talk) 07:09, 13 February 2024 (UTC)Reply
    I don't understand. You're saying that it matches the contents of the journal, yet somehow it also doesn't? Yet, if I click on the individual items in the contents, I get the named items on a subpage. How is this different from what we do everywhere else on Wikisource? --EncycloPetey (talk) 16:35, 13 February 2024 (UTC)Reply
    They are loose tables of contents extracted from the title pages of issues of a journal. They link horizontally (not to subpages) to extracted texts and function like navboxes, not tables of contents on the top level page of a work. That their formatting is arbitrary wikipedia-like just reinforces this.
    The linked texts should strictly speaking also be migrated to a scan of the actual journal, but since those are actual texts (and not a loose navigation aid) I'm more inclined to let them sit there until someone does the work to move them within the containing work and scan-backing them. Xover (talk) 08:35, 20 February 2024 (UTC)Reply
    So, do I understand then that the articles should be consolidated as subpages, like a journal? In which case, these pages are necessary to have as the base page. Deleting them would disconnect all the component articles. It sounds more as though you're unhappy with the page formatting, rather than anything else. They are certainly not "excerpts", which was the basis for nominating them for deletion, and with that argument removed, there is no remaining basis for deletion. --EncycloPetey (talk) 19:41, 25 February 2024 (UTC)Reply

Translation:La Serva Padrona


There is no scan supported original language work present on the appropriate Italian Wikisource, as required by Wikisource:Translations. -- Jan Kameníček (talk) 09:50, 28 March 2024 (UTC)Reply

Contracts Awarded by the CPA


Out of scope per WS:WWI as it's a mere listing of data devoid of any published context. Xover (talk) 12:53, 31 March 2024 (UTC)Reply

 Keep if scan-backed to this PDF document. Since the PDF document is from 2004, a time when the WWW existed but wasn't nearly as universal to society as today, I find the thought that this wasn't printed and distributed absurdly unlikely. And the copyright license would be PD-text, since none of the text is complex enough for copyright, being a list of general facts. Also, this document is historically significant, since it involves the relationships between two federal governments during a quite turbulent war in that region. SnowyCinema (talk) 14:25, 31 March 2024 (UTC)Reply
(And it should be renamed to "CPA-CA Register of Awards" to accurately reflect the document.) SnowyCinema (talk) 14:32, 31 March 2024 (UTC)Reply
It's still just a list of data devoid of any context that might justify its inclusion (like if it were, e.g., the appendix to a report on something or other). Xover (talk) 19:51, 13 April 2024 (UTC)Reply
Maybe I should write a user essay on this, since this is something I've had to justify in other discussions, so I can just link to that in the future.
I don't take the policy to mean we don't want compilations of data on principle, or else we'd be deleting works like the US copyright catalogs (which despite containing introductions, etc., the body is fundamentally just a list of data). The policy says the justification on the very page. What we're trying to avoid is, rather, "user-compiled and unverified" data, like Wikisource editors (not external publications) listing resources for a certain project. And if you personally disagree, that's fine, but that's how I read the sentiment of the policy. I think that whether something was published, or at least printed or collected by a reputable-enough source, should be considered fair game. I'm more interested in weeding out research that was compiled on the fly by individual newbie editors, than federal government official compilations.
But to be fair, even in my line of logic, this is sort of an iffy case, since the version of the document I gave gives absolutely no context besides "CPA-CA REGISTER OF AWARDS (1 JAN 04- 10 APRIL 04)" so it is difficult to verify the actual validity of the document's publication in 2004, but I would lean to keep this just because I think the likelihood is in the favor of the document being valid, and the data is on a notable subject. And if evidence comes to light that proves its validity beyond a shadow of a doubt, then certainly. SnowyCinema (talk) 00:03, 20 April 2024 (UTC)Reply
Evidence of validity: The search metadata gives a date of April 11, 2004, and the parent URL is clearly an early 2000s web page just by the looks of it. My keep vote is sustained. SnowyCinema (talk) 00:16, 20 April 2024 (UTC)Reply

The Athenaeum


This has been an empty page since it was created in 2015. --EncycloPetey (talk) 00:06, 25 May 2024 (UTC)Reply

Normally I would suggest speedy deletion (no notable content or history). However, we do appear to have two articles from The Athenaeum that should be moved to subpages of that work: Folk-lore (extracted from The Athenaeum 1846-08-22) and Folk-lore (extracted from The Athenaeum 1876-08-29). —Beleg Tâl (talk) 01:07, 25 May 2024 (UTC)Reply
Looks like we do have justification to keep the page and convert it into a base page for the periodical (though the subpage convention might need to be worked out -- the periodical doesn't have numbered volumes but does number issues continuously). While we're here, do we have conventions on how to handle "æ" in work titles (since The Athenæum was always written as such)? Arcorann (talk) 04:09, 27 May 2024 (UTC)
We don't have a strict convention; there are benefits to both approaches (faithfulness vs accessibility). Either way, redirects should be created so that both spellings direct you to the correct work. —Beleg Âlt BT (talk) 13:24, 27 May 2024 (UTC)

Duplicate O. Henry Stories.


Nominating in bulk stories that have been proofread in collections

MarkLSteadman (talk) 22:16, 28 May 2024 (UTC)Reply

Ok so I know I've been pushing for Proposed Deletion in such cases since they are technically different editions ... but I had another look at WS:CSD and I do think that these fall under the criteria for speedy deletion as Redundant unless they are substantially different.  Delete and convert to redirect. —Beleg Âlt BT (talk) 13:20, 29 May 2024 (UTC)Reply
It takes you a while, but you do usually see sense eventually. :) Xover (talk) 06:24, 31 May 2024 (UTC)Reply
:D —Beleg Tâl (talk) 17:21, 31 May 2024 (UTC)Reply
The plan was always to convert these to redirects once my little O. Henry project was complete. It's very likely that there will be more batches of these in future, but I haven't mapped what non-scan backed dross we have sitting around. In any case,  Delete to convert to redirects. Xover (talk) 06:23, 31 May 2024 (UTC)Reply

A History of the Civil War, 1861-1865


Unformatted copydump that has not been fixed since it was added in 2008. --EncycloPetey (talk) 04:25, 27 June 2024 (UTC)Reply

Maybe it's just me, but aside from one single issue, the formatting looks pretty ok to me. The one issue is that the HTML character entities are not being rendered correctly—and I'm not sure why that is, but I'm sure that once I figure it out it will be a pretty quick fix. —Beleg Tâl (talk) 18:02, 28 June 2024 (UTC)Reply
Turns out it was just using the wrong character numbers (it was using Windows-1252 encoding which differs from Unicode in those places).
Anyway I've fixed that particular issue. I'm not sure whether you would still consider this a copydump worth deleting. —Beleg Tâl (talk) 18:25, 28 June 2024 (UTC)Reply
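For readers unfamiliar with the problem described above: numeric character references in the range 128 to 159 are often written with Windows-1252 byte values, while those numbers map to control characters in Unicode. A minimal sketch of one way to repair such references is below. It assumes the affected text uses decimal references like `&#151;`; the function name `fix_cp1252_entities` is hypothetical, and this is not a record of how the page was actually fixed.

```python
import re


def fix_cp1252_entities(text):
    """Rewrite decimal character references numbered 128-159.

    The number is treated as a Windows-1252 byte and replaced with the
    reference for the Unicode code point that byte stands for.
    References outside that range are left untouched.
    """
    def repl(match):
        number = int(match.group(1))
        if 128 <= number <= 159:
            try:
                char = bytes([number]).decode("cp1252")
            except UnicodeDecodeError:
                # A few byte values (e.g. 129) are undefined in Windows-1252.
                return match.group(0)
            return f"&#{ord(char)};"
        return match.group(0)

    return re.sub(r"&#(\d+);", repl, text)


# Byte 151 in Windows-1252 is the long dash, which is U+2014 (8212) in Unicode.
print(fix_cp1252_entities("1861&#151;1865"))
```

A reference such as `&#233;` is already a valid Unicode code point and passes through unchanged.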
Semi-unrelated—do we know how good Bartleby's transcriptions are? If they are sufficiently faithful to the source material, here is the scan of the edition that this transcription is based on. —Beleg Tâl (talk) 18:28, 28 June 2024 (UTC)Reply
I do not know, but the general principle here is that we do not accept secondhand transcriptions, so we would require evidence that Bartleby's transcriptions are highly accurate. If this particular one is highly accurate, and if we know the specific edition source, then it is a potential candidate for a match-and-split. However, from an initial inspection, I can see that the entire Preface is missing. --EncycloPetey (talk) 18:35, 28 June 2024 (UTC)Reply
A missing preface isn't an issue, since the preface is present on Bartleby and can easily be imported.
I have worked on some poetry collections that were on Bartleby, and I do know that those poetry collections are highly accurate. That being said, without some sort of published guidelines from Bartleby that suggest their standards are at least as high as ours, I'd be hesitant to assume that that level of accuracy applies across the board. —Beleg Tâl (talk) 13:15, 2 July 2024 (UTC)Reply
How did you determine that the particular scan you found is the edition the Bartleby transcription is based on? The title page information is different. --EncycloPetey (talk) 20:27, 29 June 2024 (UTC)Reply
... you're right, the title page information is different. Friggen IA lol. The Bartleby edition is digitized from the New York MacMillan 1917 printing, which is what I thought I had confirmed that scan was, but actually the scan is of the 1919 printing. All the other scans on IA are poor quality early Google digitizations.
I have found a scan of the 1917 edition on HathiTrust, however: https://babel.hathitrust.org/cgi/pt?id=mdp.39015011726042&seq=13 Beleg Tâl (talk) 13:11, 2 July 2024 (UTC)

Come,_Thou_Almighty_King_(unsourced)

Scan-backed version at "Come, Thou Almighty King" in The Army and Navy Hymnal, 1920. ShakespeareFan00 (talk) 12:39, 1 July 2024 (UTC)

I would have speedied this long ago, but for one thing: we don't have a scan-backed edition that includes the second verse beginning "Jesus, our Lord, arise". For this reason, I am conflicted about deleting the unsourced version, and my !vote is Neutral. —Beleg Tâl (talk) 12:59, 2 July 2024 (UTC)
I found a source that includes the second verse, although the typography is not quite the same: [1] --EncycloPetey (talk) 18:15, 2 July 2024 (UTC)

A Patriotic Manifesto for a European Future

The following discussion is closed and will soon be archived:

Deleted. Likely copyvio; no evidence to support CC-BY-SA on the sources.

Digital-born document. I have not seen any copyright statement, but this cannot qualify as a government edict, so it may fail on copyright grounds. --EncycloPetey (talk) 15:47, 6 July 2024 (UTC)

 Delete - the CC-BY-SA license is not corroborated on either of the provided sources of the document, and I have not been able to find any copy of this document that gives a compatible license. —Beleg Tâl (talk) 13:29, 8 July 2024 (UTC)
Also noting that this appears to be a translation of a text originally in German, and I do not believe that either the original or the translation is freely licensed. —Beleg Tâl (talk) 13:31, 8 July 2024 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --EncycloPetey (talk) 18:40, 14 July 2024 (UTC)

Diary of a Lunatic

The following discussion is closed and will soon be archived:

Deleted. Copyvio; misattribution of translator, likely through confusion with Gogol story with similar title.

Added in 2010 without a source. There is no source indicated on the Author page, nor on the author page for the translator. A search at IA turned up nothing either. --EncycloPetey (talk) 04:08, 8 July 2024 (UTC)

 Comment Here is an edition of what appears to be this translation, that predates ours. I highly suspect translation copyvio. Also: I think that the attribution to Garnett may be a mistake, caused by confusion about her translation of Nikolai Gogol's Diary of a Madman. Beleg Tâl (talk) 13:40, 8 July 2024 (UTC)
This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. --EncycloPetey (talk) 18:42, 14 July 2024 (UTC)

Template:Vertical header

Cut&paste import from enWP, made in good faith by Matrix in April. Currently unused, and I'd like to keep it that way because…

…the template is inherently problematic from a technical perspective. Web standards (and hence web browsers) provide no native way to rotate table headers. So all the ways to achieve the apparent same effect are various degrees of hacks, with big drawbacks, and all are prone to breaking. In particular this template makes assumptions about font size and line height that are neither guaranteed in the long term nor even accurate currently, and does not integrate well with our standard table formatting tools. Or put another way, it's a handy template for use in some specialised cases on enWP, but on enWS it's problematic. Xover (talk) 14:34, 9 July 2024 (UTC)

 Support CalendulaAsteraceae (talk • contribs) 18:32, 10 July 2024 (UTC)
 Comment There are a number of works that use vertical headers in this manner, where the text is rotated 90 degrees anticlockwise. As far as I know the closest to achieving this using other templates is to apply {{rotate}}, while {{vlr}} and {{vrl}} both rotate the text 90 degrees clockwise -- completely the wrong direction. Recommendations for what to do in these cases would be appreciated. Arcorann (talk) 04:32, 11 July 2024 (UTC)
The recommendation is to simply not try to reproduce rotated headers, frustrating as that is. The fundamental problem is that web standards simply do not support rotated table headers, and as we've seen for all such cases (dot leaders and drop caps being obvious examples) trying to fake support creates more problems than it is worth. Much like advanced typography, this is a level of fidelity that the state of the art and our tooling simply does not allow us to do in a sustainable fashion. It is absolutely infuriating that web standards still do not support these things, but by trying to hack our way around this fact we are creating problems for ourselves.
However, in this case, I proposed {{vertical header}} for deletion because 1) its calling convention clashes with our other table templates (it's a bad fit) and 2) it is unused. We have, as you noted, {{rotate}} that is equally problematic (it does essentially the same thing, technically speaking), but which is widely used and does not clash with our other table-related templates. Personally I would prefer we not use that either, but that would be a much bigger discussion (it would be a policy-level discussion for the Scriptorium on all such templates, not a proposed deletion for a single template). Xover (talk) 09:55, 12 July 2024 (UTC)
I agree with Arcorann: if you can show what to use for vertical headers in these cases, that would be appreciated. A lot of templates like this are just straight-up hacks, but we need a solution even if it is a bad one. The reason I created this template was that I saw a table in A Dictionary of Music and Musicians (I can't remember the page number/volume) that required it, so I c&p imported the template from enwiki, but then I just forgot. —Matr1x-101 {user page (@ commons) - talk} 19:37, 11 July 2024 (UTC)
  • When I needed something like this, I found a template on Wikipedia—I don’t remember which one—and copied over its formatting. I oppose deletion absent a consistent, community-supported choice. TE(æ)A,ea. (talk) 02:45, 12 July 2024 (UTC)
    @TE(æ)A,ea.: Please keep in mind that when you do this the odds are pretty high that although you got the effect you were looking for in your web browser and on your device, you have created problems for other people using a different web browser and a different device. And if you copied over raw markup into a Page: page we can't even sensibly track the usage to fix it whenever web standards catch up and start providing what we need. I strongly recommend not doing that unless you're enough of a web standards nerd to really know what you're doing and all its implications. Xover (talk) 10:08, 12 July 2024 (UTC)
@Matr1x-101: No, all templates like these are hacks, which is kinda the point. If it was just the single template implementation then we could just fix it or migrate to something better. But for this (and a few other things we commonly run across) web standards simply do not support what we need. Xover (talk) 10:01, 12 July 2024 (UTC)

Index:The collected works of Henrik Ibsen (Volume 4).djvu

This index and all its associated Page: pages are redundant to Index:The collected works of Henrik Ibsen (Heinemann Volume 4).djvu. Further, the one nominated for deletion was created by match-and-split from a copy that did not match the edition, and it lacks formatting and footnotes.

This is part of a cleanup of the larger mess that is our two sets of US / UK editions for this collection, which interlink with each other and do not use consistent naming. --EncycloPetey (talk) 20:28, 11 July 2024 (UTC)

 Delete sounds like a candidate for speedy deletion to me —Beleg Tâl (talk) 22:56, 11 July 2024 (UTC)
Because a match-and-split was applied, it will now require a bot to do all the deleting. The match-and-split can be deleted and replaced by transcluding the copy from the other Index. --EncycloPetey (talk) 03:29, 12 July 2024 (UTC)
 Delete: No reason to keep a match & split of the wrong edition. — Alien333 (what I did & why I did it wrong) 19:08, 14 July 2024 (UTC)

Hanuman Chalisa

This was added without source or license; but the contributor has now added a website as the source. It does not look as though it can be hosted here. --EncycloPetey (talk) 18:07, 18 July 2024 (UTC)

 Delete: online source, not based on a specific edition, and translator unknown. — Alien333 (what I did & why I did it wrong) 18:36, 18 July 2024 (UTC)
 Delete Looks like a modern translation; there are no traces of any old publication containing this text, so unless proved otherwise, we must assume the translation is copyrighted. --Jan Kameníček (talk) 18:56, 18 July 2024 (UTC)

Index:The color printer (1892).djvu and pages..

Transcribed in good faith, but for various reasons it's proving difficult to get the color samples to be consistent between pages. I had a template-based approach (subpages of the index), which I've now removed or marked for speedy deletion. I am of the view that if this work can't be consistently transcribed then it's not worth transcribing, and thus, despite my good-faith effort, it should be removed so as not to have an inaccurate item.

If you do not want this to be deleted, please come up with a way of ensuring a "consistent" reproduction of the colors in it, that's true to the original, because I couldn't make it work.
ShakespeareFan00 (talk) 08:01, 20 July 2024 (UTC)