Wikisource:Proposed deletions
- WS:PD redirects here. For help with public domain materials, see Help:Public domain.
This specific index is one of many such indexes; I nominate it as an example, but should the rationale be found sound, I will endeavor to make a list of all such indexes.
This index (like many others) was created by now-absent User:Languageseeker. My main concern is that the pages of these indexes have been added via match-and-split from some source, likely Project Gutenberg, which does not have a defined original copy. Because of this absence of a real source, and the similarity of the text to the actual text of any given scanned copy, proofreading efforts would likely have to either not check the text against the original source or scrap the existing text entirely to ensure accuracy to the original on Wikisource. In light of this, I think the easiest approach is to delete the indexes and all pages thereunder; if there is organic desire to scan them at some point in the future, the indexes may be re-created, but I do not see a reason to keep the indexes as they stand. TE(æ)A,ea. (talk) 19:12, 9 December 2023 (UTC)
- Comment Hmm. I don't see the Index: pages as problematic. But the "Not Proofread" Page: pages that were, as you say, created by Match & Split from a secondary transcription (mostly Gutenberg, but also other sources), I do consider problematic. We don't permit secondary transcriptions added directly to mainspace, so to permit them in Page: makes no sense. And in addition to the problems these create for Proofreading that TE(æ)A,ea. outlines, it is also an issue that many contributors are reluctant to work on Index:es with a lot of extant-but-not-Proofread (i.e. "Red") pages. We have around a million (IIRC; it may be half a mill.) of these that were bot-created with essentially raw OCR (the contributor vehemently denies they are "raw OCR", so I assume some fixes were applied, but the quality is very definitely not Proofread). Languageseeker's imports are of much higher quality, but are still problematic. I think we should get rid of both these classes of Page: pages. In fact, I think we should prohibit Not Proofread pages from being transcluded to mainspace (except as a temporary measure, and possibly some other common sense exceptions). --Xover (talk) 20:24, 9 December 2023 (UTC)
- Xover: Assuming the status of the works to be equal, I would actually consider Languageseeker’s page creations to be worse, because, while it would look better as transcluded, it reduces the overall quality of the transcription. My main problem with the other user’s not-proofread page creations was that he focused a lot on indexes of very technical works, but provided no proofread baseline on which other editors could continue work—that was my main objection at the time, as it is easier to come on and off of work where there is an established style (for a complicated work) as opposed to starting a project and creating those standards yourself. As to the Page:/Index: issue, I ask for index deletion as well because these indexes were created only as a basis for the faulty text import, and I don’t want that to get in the way of any future transcription of those works. Again, I have no objection to work on them (or to re-creation), I just think that these indexes (which are clearly abandoned, and were faulty ab origine) should be deleted. As for transclusion of not-proofread pages, I don’t think that the practice is so widespread that a policy needs to be implemented (from my experience, at least); the issue is best dealt with on a case-by-case basis, or rather a user-by-user basis (as users can have different ways of turning raw OCR into not-proofread text, then following transclusion and finally proofread status). But of course, that (and the other user’s works, the indexes for which I think should probably be deleted) are a discussion for another time. (I will probably have more spare time starting soon, so I might start a discussion about the other user’s works after this discussion concludes.) TE(æ)A,ea. (talk) 02:28, 10 December 2023 (UTC)
- I'm not understanding what fault there is in the Index page. If the Page: pages had not been created, what problem would exist in the Index: page? --EncycloPetey (talk) 02:53, 10 December 2023 (UTC)
- EncycloPetey: This isn’t a case where the index page’s existence is inherently bad; but the pages poison the index, in terms of future (potential) proofreading efforts and in terms of abandonment. TE(æ)A,ea. (talk) 03:07, 10 December 2023 (UTC)
- @TE(æ)A,ea.: Just to be clear, if the outcome here is to delete all the "Not Proofread" Page: pages, would you still consider the Index: pages bad (should be deleted)? So far that seems to be the most controversial part of this discussion, and the part that is a clear departure from established practice. Xover (talk) 07:40, 20 December 2023 (UTC)
- Xover: Yes, I think those are also bad. They were created en masse for the purpose of adding this poor match-and-split text, and there is no additional value in keeping around hundreds of unused indexes whose only purpose was to facilitate a project that consensus (here) clearly indicates is unwise. The main objection on that ground is that indexes are difficult to make; but that is not really true, and in any case is not a real issue, as a new editor who wishes to edit (but not create an index) can simply ask for one to be created. Another problem with these indexes is that they are not connected with other information (like the Author:-pages) that would help new editors find them. Insofar as they exist like this, the only real connection these indexes have to the project at large is through Languageseeker, who is now no longer editing. I don’t think that every abandoned index is a nuisance, but I do believe that this (substantial) group of mass-created indexes is a problem. TE(æ)A,ea. (talk) 21:10, 20 December 2023 (UTC)
- I support deleting the individual pages of the index. As for the Index page itself, I am OK with both deleting it as abandoned or keeping it to wait for somebody to start the work anew. I also support getting rid of other similar secondary transcriptions. If a discussion on prohibiting transclusion of not-proofread pages into main NS is started somewhere, I will probably support it too. --Jan Kameníček (talk) 00:39, 10 December 2023 (UTC)
- Comment I've always felt uncomfortable with the tendency of some users to want to bulk-add a bunch of Index pages which have the pages correctly labelled, but are left indefinitely with no pages proofread in them. I feel like a "transcription project" (as Index pages are labelled in templates) implies an ongoing, or at least somewhat complete, ordeal, and adding index pages without proofreading anything is really just duplicating data from other places into Wikisource. Not to say there's absolutely no value in adding lots of index pages this way, but the value seems minimal. The fact that index pages mostly rely on duplicate data as it is is already an annoying redundancy on the site, and I think most of what happens on Index pages should just be dealt with in Wikidata, so I think the best place to bulk-add data about works is there, not by mass-creating empty Index pages. I know my comment here is kind of unrelated to the specific issue of the discussion (that is, indexes with match-and-split pages), but the same user (Languageseeker) has tended to do that as well. I am struggling to come up with any specific arguments or policies to support my position against those empty index pages... but it just seems unnecessary, and seems like it will cause problems in the future. On a positive note, I do applaud Languageseeker's massive effort—it shows something great about their character as an editor—but unfortunately I think their effort should have been more focused on areas other than the creation of as many Index pages as possible. PseudoSkull (talk) 04:15, 10 December 2023 (UTC)
- Bulk-adding anything is probably a bad idea on Wikisource, because so much of what we do here requires a human touch. That being said, so far as I know the Index: pages Languageseeker created were perfectly fine in themselves, including having correct pagelists etc. This step is often complicated for new contributors, so creating the Index: without Proofreading anything is not without merit. It's pointing at an already set up transcription project onsite vs. just (ext)linking to a scan at IA for some users. The latter is an insurmountable effort for quite a lot of contributors. We also have historically permitted things to sit indefinitely in our non-content namespaces if they are merely incomplete rather than actually wrong in some way. That's not to say that all these Index: pages are necessarily golden, but imo those that are problematic (if any) should be dealt with individually. Xover (talk) 09:08, 10 December 2023 (UTC)
- Oh, also, what we host on Wikidata vs. what's hosted locally in our Index: pages is a huge and complicated discussion (hmu if you want the outline). For the purposes of this discussion it, imo, makes the most sense to just view that as an entirely orthogonal issue. If and when (and how and why and...) we push some or all our Index: page contents somewhere other than our current solution, it'll deal with these Index:es as well as every other. Xover (talk) 07:33, 20 December 2023 (UTC)
- Comment I do not support creating them, but since they exist, I try to make good use of them. I usually proofread offline for convenience and when I add the text I check the diff. If anything differs, it is an extra check for me as I could be the one who made mistakes. So I would keep them.
- BTW, nobody forbids pressing the OCR button and restarting. Mpaa (talk) 18:35, 10 December 2023 (UTC)
- While that is true, my experience is that the kinds of errors introduced by a mystery text layer are insidious, and most editors are unaware of the issue, or fail to notice small problems such as UK/US spelling differences, changes to punctuation, minor word changes, etc. So, while a person could reset the text, what would alert them to the fact that they should, rather than working from the existing unproofed page?
- H. G. Wells' First Men in the Moon is a prime example. A well-meaning editor matched-and-split the text into the scan. Two experienced editors crawled through making multiple corrections to validate the work, yet as recently as this past week we have had editors continue to find small mistakes throughout. Experience shows that match-and-split text is actually worse for Wikisource proofreading than the raw OCR because of these persistent text errors. --EncycloPetey (talk) 18:51, 10 December 2023 (UTC)
- In my workflow, I start from OCR, then compare what I did with what is available. It is an independent reference which I use for quality check. The probability that I made the same error is low (and the error would be there anyhow). It is almost as if someone is validating my text (or vice-versa). For me it is definitely a help. I follow the same process when validating text. I do not look at what is there and then compare. Mpaa (talk) 19:21, 10 December 2023 (UTC)
- Right. You do that, and I work similarly. But experience shows that the vast majority of contributors don't do that; they either don't touch the text due to the red pages, or they try to proofread off the extant text and leave behind subtle errors as EncycloPetey outlines. Xover (talk) 19:35, 10 December 2023 (UTC)
- We could argue forever. I do not know what evidence you have to say that works started from match-and-split are worse than others. I doubt anyone has real numbers to say that. IMHO it all depends on the attitude of contributors. I have seen works reaching a Validated stage and being crappy all the same. If you want to be consistent, you should delete all pages in a NotProofread state and currently not worked on because I doubt a non-experienced user will look where the text is coming from when editing, from a match-and-split or whatever.
- Also, then we should shut down the match-and-split tool or let only admins run it, after being 100% sure that the version to split is the same as the version to scan.
- I am not advocating it as a process, I am only saying that what is there is there and it could be useful to some. If the community will decide otherwise, fine, I can cope with that. Mpaa (talk) 20:32, 10 December 2023 (UTC)
I do not know what evidence you have to say that works started from match-and-split are worse than others.
Anecdotal evidence only, certainly. But EncycloPetey gave a concrete example (H. G. Wells' First Men in the Moon), and both of us are asserting that we have seen this time and again: when the starting point is Match & Split text, the odds are high that the result will contain subtle errors in punctuation, US/UK spelling differences, words changed between editions, and so forth. All the things that do not jump out at you as "misspelled". Your experience may, obviously, differ, and it's certainly a valid point that we can end up with poor quality results for other reasons too. Your argumentum ad absurdum arguments are also well taken, but nobody's arguing we go hog-wild and delete everything. Languageseeker, specifically, went on an import-spree from Gutenberg (and managed to piss off the Distributed Proofreaders in the process), snarfing in a whole bunch of texts in a short period of time. All of these are secondary transcriptions, and Languageseeker was never going to proofread these themselves (their idea was almost certainly to either transclude them as is, or to run them in the Monthly Challenge). For these sorts of bulk actions that create an unmanageable workload to handle, I think deletion (return to the status quo ante) is a reasonable option. The same would go for the other user that bulk-imported something like 500k/1 mill. (I've got to go check that number) Page: pages of effectively uncorrected OCR. For anything else I'd be more hesitant, and certainly wouldn't want to take a position in aggregate. Those would be case-by-case stuff, but that really isn't an option for these bulk actions. Xover (talk) 07:17, 11 December 2023 (UTC)
- Comment I am against deleting the Index. Creating indexes is one of the most tedious tasks when starting a transcription. Having index pages prepared and checked against the scan will save a lot of work. Mpaa (talk) 21:46, 10 December 2023 (UTC)
- Keep the Index, but Delete the pages. None of the bot-created pages have the header, which is a pain to add after-the-fact unless you can run a bot. The fact that they were created by match-and-split, instead of proofreading the text layer is poor practice. --EncycloPetey (talk) 19:15, 20 December 2023 (UTC)
- There are many recently added "new texts" with no headers. Mpaa (talk) 22:00, 23 December 2023 (UTC)
- What percent of editors want headers; and what percent do not care? Do you have data? --EncycloPetey (talk) 22:03, 23 December 2023 (UTC)
- No, I am only stating that it is not a good argument for deletion in my opinion, unless it is considered mandatory. Mpaa (talk) 22:22, 23 December 2023 (UTC)
- It is a good argument if most potential editors want to include the headers, and are put off working on proofreading by the fact that pages were created without the headers in place. There are works I've chosen not to work on for this reason. --EncycloPetey (talk) 23:04, 23 December 2023 (UTC)
- I agree that on its own the lack of headers is not a good argument for deletion. But I read it here to be intended as one additional factor on the scales that added together favour deletion. Which I do think is a valid argument (one can disagree, of course). Xover (talk) 23:57, 23 December 2023 (UTC)
- Mpaa: That is the result of the efforts of one user, who has declared headers superfluous. I was going to start another discussion on that topic after this one (only one big discussion at a time for me, please). I think that, for all editors who want headers (most of them), not having them (because of the match-and-split seen here) is bad. Also, in response to your other comments above about proofreading over existing text, I usually do that as well, but I prefer proofreading on my own, without needing to check against a base—that’s why I focus on proofreading, not validation. For that same reason, I avoid all-not-proofread indexes like those at issue here. TE(æ)A,ea. (talk) 23:22, 23 December 2023 (UTC)
- I was thinking the same about headers, it would be good to have a consistent approach about works, in all their parts/namespaces. Mpaa (talk) 09:47, 24 December 2023 (UTC)
- Comment In the future, if anyone feels blocked by the lack of headers, or wants to add headers, please make a bot request. Mpaa (talk) 09:47, 24 December 2023 (UTC)
- Comment I am proofreading this specific text. This discussion can serve as a reference for the other indexes, as TE(æ)A,ea. mentioned at the beginning of the discussion. BTW, a list would be useful, so I can fetch them before a (possible) deletion. Mpaa (talk) 12:53, 2 January 2024 (UTC)
- I have removed your discussion-closure notice, EncycloPetey, because I created this discussion as a general issue, not tied to the specific index at hand. Mpaa, just to be clear, you would like a list of the indexes with match-and-split pages, or would you like a more general listing of that user’s indexes? TE(æ)A,ea. (talk) 01:09, 10 January 2024 (UTC)
- Per the description at the top of this page: "This page is for proposing deletion of specific articles on Wikisource in accordance with the deletion policy, and appealing previously-deleted works." This discussion was closed because the decision was to Keep Index:The trail of the golden horn.djvu now that it is proofread. Do you disagree with that decision? Because this page is not for discussing general issues. That needs to happen in the Scriptorium, not here. --EncycloPetey (talk) 02:47, 10 January 2024 (UTC)
- @TE(æ)A,ea. those with match-and-split pages that could be deleted. Mpaa (talk) 20:48, 10 January 2024 (UTC)
Duplicative of Weird Tales/Volume 3/Issue 1/The Picture in the House, starting discussion to decide whether to remove or migrate the librivox recording. MarkLSteadman (talk) 06:43, 29 December 2023 (UTC)
- Gah. Tough call. The two texts are not the same. Both Weird Tales in 1924 and the 1937 reprint use "… the antique and repellent wooden building which blinked with bleared windows from between two huge leafless oaks near the foot of a rocky hill", but the unsourced text uses "… elms …". LibriVox for once actually gives a source, and in the case of File:LibriVox - picture in the house lovecraft sz.ogg that source is The Picture in the House (unknown) (modulo a page move after the fact here), and the audio narration does match (uses "elms"). The change to "elms" seems to be a later innovation, possibly applied by an editor as late as 1982 (Bloodcurdling Tales of Horror and the Macabre, the earliest use of "elms" there I could find right now), and the likely ultimate source of our text. The texts differ in other ways too, but up to this point the difference could be explained by transcription errors, lack of scan-backing and validation, etc. So… I don't think we can move the LibriVox file over to our new text (different edition). And because the nominated text is from an indeterminate edition and we have a scan-backed version of this work, we should Delete The Picture in the House (unknown) too. But it's really annoying that when LibriVox for once both gives the source text they have used for their reading and actually links back to us, we have to delete the page. I wish they'd coordinate more with us on issues like this so we could get the maximum benefit out of our respective volunteer efforts. Xover (talk) 08:38, 29 December 2023 (UTC)
- I guess that the LibriVox version dates to when this was the only version available. Can we put the LibriVox link on The Picture in the House? -- Beardo (talk) 18:33, 29 December 2023 (UTC)
- Hmm, no, I don't think so. We can't start amassing random multimedia versions of texts at the dab pages. Eventually we want spoken-word versions of our texts automatically linked from data on Wikidata, and that requires control over which specific edition the spoken-word version is from. Xover (talk) 10:49, 31 December 2023 (UTC)
- Comment The earliest copy I found with "...elms..." is a Best Supernatural Stories of H.P. Lovecraft on Google books, p. 122, which the site dates to 1945. --EncycloPetey (talk) 20:29, 30 December 2023 (UTC)
- weak Delete - it would probably be better for us to just start from scratch, although I recognize its value as being linked to from LibriVox, so maybe it could just be redirected to the current scanned version instead of outright deleted. SnowyCinema (talk) 03:30, 8 March 2024 (UTC)
- But start from scratch using what? The issue is that our scan-backed copy has a different text from the LibriVox recording. The text of the nominated copy can be attested, but not (yet) from a volume dated before 1945. Ideally, we would find a PD volume with the current text. --EncycloPetey (talk) 04:35, 18 March 2024 (UTC)
Excerpt of just parts of the title page (a pseudo-toc) of an issue of the journal of record for the EU. Xover (talk) 11:29, 11 February 2024 (UTC)
- Also Official Journal of the European Union, L 078, 17 March 2014 Xover (talk) 11:34, 11 February 2024 (UTC)
- Also Official Journal of the European Union, L 087I, 15 March 2022 Xover (talk) 11:35, 11 February 2024 (UTC)
- Also Official Journal of the European Union, L 110, 8 April 2022 Xover (talk) 11:36, 11 February 2024 (UTC)
- Also Official Journal of the European Union, L 153, 3 June 2022 Xover (talk) 11:37, 11 February 2024 (UTC)
- Also Official Journal of the European Union, L 066, 2 March 2022 Xover (talk) 11:39, 11 February 2024 (UTC)
- Also Official Journal of the European Union, L 116, 13 April 2022 Xover (talk) 11:39, 11 February 2024 (UTC)
- Keep This isn't an excerpt; it matches the Contents page of the on-line journal and links to the same items, which have also been transcribed. The format does not match as closely as it might, but it's not an excerpt. --EncycloPetey (talk) 04:52, 12 February 2024 (UTC)
- That's not the contents page of the online journal, it's the download page for the journal that happens to display the first page of the PDF (which is the title page, that also happens to list the contents). See here for the published form of this work. What we're hosting is a poorly-formatted de-coupled excerpt of the title page. It's also—regardless of sourcing—just a loose table of contents. Xover (talk) 07:09, 13 February 2024 (UTC)
- I don't understand. You're saying that it matches the contents of the journal, yet somehow it also doesn't? Yet, if I click on the individual items in the contents, I get the named items on a subpage. How is this different from what we do everywhere else on Wikisource? --EncycloPetey (talk) 16:35, 13 February 2024 (UTC)
- They are loose tables of contents extracted from the title pages of issues of a journal. They link horizontally (not to subpages) to extracted texts and function like navboxes, not tables of contents on the top level page of a work. That their formatting is arbitrary wikipedia-like just reinforces this. The linked texts should strictly speaking also be migrated to a scan of the actual journal, but since those are actual texts (and not a loose navigation aid) I'm more inclined to let them sit there until someone does the work to move them within the containing work and scan-backing them. Xover (talk) 08:35, 20 February 2024 (UTC)
- So, do I understand then that the articles should be consolidated as subpages, like a journal? In which case, these pages are necessary to have as the base page. Deleting them would disconnect all the component articles. It sounds more as though you're unhappy with the page formatting, rather than anything else. They are certainly not "excerpts", which was the basis for nominating them for deletion, and with that argument removed, there is no remaining basis for deletion. --EncycloPetey (talk) 19:41, 25 February 2024 (UTC)
There is no scan supported original language work present on the appropriate Italian Wikisource, as required by Wikisource:Translations. -- Jan Kameníček (talk) 09:50, 28 March 2024 (UTC)
- Comment There is an 1862 Italian copy of the libretto on IA, with just 26 pages and no music score to transcribe, if anyone is willing to transcribe it on it.WS. --EncycloPetey (talk) 17:28, 28 March 2024 (UTC)
- I've started the Italian transcription at it:Indice:La serva padrona - intermezzi due (IA laservapadronain00fede).pdf --EncycloPetey (talk) 20:20, 19 April 2024 (UTC)
- I've set up the English translation match at Index:La serva padrona - intermezzi due (IA laservapadronain00fede).pdf --EncycloPetey (talk) 21:26, 19 April 2024 (UTC)
- Comment In trying to match the translation to the Italian text, I've run into large difficulties. The text lacks most stage directions, and the one version of the opera I've now watched does not match either our translation or the Italian text. Correctly translating the text is therefore near impossible, since I have no context for many of the lines. They appear to depend upon the "business" on stage for understanding what is meant and who the line is spoken to, and there may be some significant differences between 19th-century Italian and modern Italian. There are certainly some culturally dependent context phrases, such as "chocolate" probably meaning "hot chocolate drink" and not simply a confection. I do not think I am likely to finish work on this; is anyone else willing and able to make an attempt? --EncycloPetey (talk) 22:07, 8 June 2024 (UTC)
Out of scope per WS:WWI as it's a mere listing of data devoid of any published context. Xover (talk) 12:53, 31 March 2024 (UTC)
- Keep if scan-backed to this PDF document. Since the PDF document is from 2004, a time when the WWW existed but wasn't nearly as universal to society as today, I find the thought that this wasn't printed and distributed absurdly unlikely. And the copyright license would be PD-text, since none of the text is complex enough for copyright, being a list of general facts. Also, this document is historically significant, since it involves the relationships between two federal governments during a quite turbulent war in that region. SnowyCinema (talk) 14:25, 31 March 2024 (UTC)
- (And it should be renamed to "CPA-CA Register of Awards" to accurately reflect the document.) SnowyCinema (talk) 14:32, 31 March 2024 (UTC)
- It's still just a list of data devoid of any context that might justify its inclusion (like if it were, e.g., the appendix to a report on something or other). Xover (talk) 19:51, 13 April 2024 (UTC)
- Maybe I should write a user essay on this, since this is something I've had to justify in other discussions, so I can just link to that in the future.
- I don't take the policy to mean we don't want compilations of data on principle, or else we'd be deleting works like the US copyright catalogs (which despite containing introductions, etc., the body is fundamentally just a list of data). The policy says the justification on the very page. What we're trying to avoid is, rather, "user-compiled and unverified" data, like Wikisource editors (not external publications) listing resources for a certain project. And if you personally disagree, that's fine, but that's how I read the sentiment of the policy. I think that whether something was published, or at least printed or collected by a reputable-enough source, should be considered fair game. I'm more interested in weeding out research that was compiled on the fly by individual newbie editors, than federal government official compilations.
- But to be fair, even in my line of logic, this is sort of an iffy case, since the version of the document I gave gives absolutely no context besides "CPA-CA REGISTER OF AWARDS (1 JAN 04- 10 APRIL 04)" so it is difficult to verify the actual validity of the document's publication in 2004, but I would lean to keep this just because I think the likelihood is in the favor of the document being valid, and the data is on a notable subject. And if evidence comes to light that proves its validity beyond a shadow of a doubt, then certainly. SnowyCinema (talk) 00:03, 20 April 2024 (UTC)
- Evidence of validity: The search metadata gives a date of April 11, 2004, and the parent URL is clearly an early 2000s web page just by the looks of it. My keep vote is sustained. SnowyCinema (talk) 00:16, 20 April 2024 (UTC)
This has been an empty page since it was created in 2015. --EncycloPetey (talk) 00:06, 25 May 2024 (UTC)
- Normally I would suggest speedy deletion (no notable content or history). However, we do appear to have two articles from The Athenaeum that should be moved to subpages of that work: Folk-lore (extracted from The Athenaeum 1846-08-22) and Folk-lore (extracted from The Athenaeum 1876-08-29). —Beleg Tâl (talk) 01:07, 25 May 2024 (UTC)
- I've done some checking. The Athenaeum is not a unique title. There is also a student paper from Acadian University by this name that has been published since the late 19th century; there was also a (now defunct?) publication from Yale by this title; and there is a well-known Brazilian novel with this as its English title. So at the very least, any hub page for the London literary publication would need to be placed under a disambiguated title. --EncycloPetey (talk) 18:18, 26 July 2024 (UTC)
- Looks like we do have justification to keep the page and convert it into a base page for the periodical (though the subpage convention might need to be worked out -- the periodical doesn't have numbered volumes but does number issues continuously). While we're here, do we have conventions on how to handle "æ" in work titles (since The Athenæum was always written as such)? Arcorann (talk) 04:09, 27 May 2024 (UTC)
- We don't have a strict convention; there are benefits to both arguments (faithfulness vs accessibility). Either way, redirects should be created so that both spellings direct you to the correct work. —Beleg Âlt BT (talk) 13:24, 27 May 2024 (UTC)
No source, no license, no indication of being in the public domain —Beleg Tâl (talk) 17:22, 7 August 2024 (UTC)
- Found the source: [1] — Alien333 (what I did & why I did it wrong) 19:54, 7 August 2024 (UTC)
- The text of the source does not match what we have. I am having trouble finding our opening passages in the link you posted. --EncycloPetey (talk) 19:58, 7 August 2024 (UTC)
- (At least, a sentence matched.) @EncycloPetey: Found it; the content that corresponds to our page starts in the middle of page 44 of that PDF, though the paragraph breaks seem to be made up. — Alien333 (what I did & why I did it wrong) 20:00, 7 August 2024 (UTC)
- That means we have an extract. --EncycloPetey (talk) 00:39, 9 August 2024 (UTC)
- No, it appears that the PDF is a compilation of several different, thematically related documents. His statement (English’d) is one such separate document. TE(æ)A,ea. (talk) 00:53, 9 August 2024 (UTC)
- In which case we do not yet have a source. --EncycloPetey (talk) 00:55, 9 August 2024 (UTC)
- No, that is the source; it’s just that the PDF contains multiple separate documents, like I said. It’s like the “Family Jewel” papers or the “Den of Espionage” documents. TE(æ)A,ea. (talk) 00:58, 9 August 2024 (UTC)
- Sorry, I meant to say that we do not have a source for it as an independently hosted work. To use the provided source, it would need to be moved into the containing work. --EncycloPetey (talk) 01:55, 9 August 2024 (UTC)
- Well, these document collections are a bit messy: they were originally independent documents / works, but they are collected together for release, e.g. because someone filed a FOIA request for all documents related to person X. I don't think it is unreasonable if someone were to extract out the document. I wouldn't object if someone said "I went to an archive and grabbed document X out of Folder Y in Box Z", but if someone requested a digital version of the file from the same archive they might just get the whole box from the archive scanned as a single file. Something like the "Family Jewels" is at least editorially collected, has a cover letter, etc.; this is more like "years 1870-1885 of this magazine are on microfiche roll XXV, so we need to organize by microfiche roll". MarkLSteadman (talk) 11:17, 9 August 2024 (UTC)
- @EncycloPetey since this PDF is published on the DOD/WHS website, doesn't that make this particular collection of documents a publication of DOD/WHS? (Genuine question, I can imagine there are cases -- and maybe this is one -- where it's not useful to be so literal about what constitutes a publication or to go off a different definition. But I'm interested in your thinking.) -Pete (talk) 20:11, 9 August 2024 (UTC)
- Why would a particular website warrant a different consideration in terms of what we consider a publication? How and why do you think it should be treated differently? According to what criteria and standards? --EncycloPetey (talk) 20:23, 9 August 2024 (UTC)
- Your reply seems to assume I have a strong opinion on this. I don't. My question is not for the purpose of advocating a position, but for the purpose of understanding your position. (As I said, it's a genuine question. Meaning, not a rhetorical or a didactic one.) If you don't want to answer, that's your prerogative of course.
- I'll note that Wikisource:Extracts#Project scope states, "The creation of extracts and abridgements of original works involves an element of creativity on the part of the user and falls under the restriction on original writing." (Emphasis is mine.) This extract is clearly not the work of a Wikisource user, so the statement does not apply to it. It's an extract created (or at least published) by the United States Department of Defense, an entity whose publishing has been used to justify the inclusion of numerous works on Wikisource.
- But, I have no strong opinion on this decision. I'm merely seeking to understand the firmly held opinions of experienced Wikisource users. -Pete (talk) 20:42, 9 August 2024 (UTC)
- You misunderstand. The page we currently have on our site is, based on what we have so far, an extract from a longer document. And that extract was made by a user on Wikisource. There is no evidence that the page we currently have was ever published independently, so the extract issue applies here. We can host it as part of the larger work, however, just as we host poems and short stories published in a magazine. We always want the work to be included in the context in which it was published. --EncycloPetey (talk) 20:55, 9 August 2024 (UTC)
- OK. I did understand that to be TEaeA,ea's position, but it appeared to me that you were disagreeing and I did not understand the reasons. Sounds like there's greater agreement than I was perceiving though. Pete (talk) 21:36, 9 August 2024 (UTC)
- I am unclear what you are referring to as a "longer document." Are you referring to the need to transcribe the Russian portion? That there are unreleased pages beyond the piece we have here? Or are you saying the "longer document" is all 53 sets of releases, almost 4000 pages, listed here (https://www.esd.whs.mil/FOIA/Reading-Room/Reading-Room-List_2/Detainee_Related/)? I hope you are not advocating for merging all ~4000 pages into a single continuous page here, so some subdivision I assume is envisioned.
- Re the policy statement: I am not sure that is definitive: if someone writes me a letter or a poem and I paste that into a scrapbook, is the "work" the letter, the scrapbook or both? Does it matter if it is a binder or a folder instead of a scrapbook? If a reporter copies down a speech in a notebook, is the work the speech or the whole notebook. etc. I am pretty sure we haven't defined with enough precision to point to policy to say one interpretation of "work" is clearly wrong, which is why we have the discussion. MarkLSteadman (talk) 05:36, 10 August 2024 (UTC)
- The basic unit in WS:WWI is the published unit; we deal in works that have been published. We would not host a poem you wrote and pasted into a scrapbook, because it has not been published. For us to consider hosting something that has not been published usually requires some sort of extraordinary circumstances. --EncycloPetey (talk) 15:53, 10 August 2024 (UTC)
- From WWSI: "Most written work ... created but never published prior to 1929 may be included", and documentary sources include "personal correspondence and diaries." The point isn't the published works; that is clear. If someone takes the poem, edits it, and publishes it in a collection, it's clear. It's the unpublished works sitting in archives, documentary sources, etc. Is the work the unpublished form in which it went into the archive (e.g. separate letters) or the unpublished form currently in the archives (e.g. bound together)? Or is it that if I request pages 73-78 from the archives those 5 pages in the scan are the work, and if you request pages 67-75 those are a separate work? MarkLSteadman (talk) 17:18, 10 August 2024 (UTC)
- I will just add that in every other context we refer to a work as the physical thing and not a mere scanned facsimile. We don't consider Eighteenth Century Collections Online scanning a particular printed edition and putting up a scan as the "published unit" as distinct from the British Library putting up their scan as opposed to the LOC putting up their scan or finding a version on microfilm. Of course, someone taking documents and doing things (like the Pentagon Papers, or the Family Jewels) might create a new work, but AFAICT in this context it is just mere reproduction. MarkLSteadman (talk) 05:37, 12 August 2024 (UTC)
- In the issue at hand, I am unaware of any second or third releases / publications. As far as I know, there is only the one release / publication. When a collection or selection is released / published from an archive collection, that release is a publication. And we do not have access to the archive. --EncycloPetey (talk) 17:34, 12 August 2024 (UTC)
- We have access, via filing a FOIA request. That is literally how those documents appeared there, they are hosted under: "5 U.S.C. § 552 (a)(2)(D) Records - Records released to the public, under the FOIA," which are by law where records are hosted that have been requested three times. And in general, every archive has policies around access. And I can't just walk into Harvard or Oxford libraries and handle their books either.
- My point isn't that this can't be the interpretation we adopt, or that we can't have stricter policies around archival material. It is just that I don't believe we can point to a statement saying "work" or "published unit" and claim that it "obviously" means that pages 1-5 of a ten-page report are hostable as a "complete work" if someone requests just those five pages via FOIA, while someone cutting out the whole report now needs to have it deleted because it was released as part of a 1000-page document release and hence is now an "extract" of that release. That requires discussion, consensus, pointing to precedent, etc. And if people here agree with that interpretation, go ahead. MarkLSteadman (talk) 03:16, 18 August 2024 (UTC)
- For example, I extracted Index:Alexandra Kollontai - The Workers Opposition in Russia (1921).djvu out of [2]. My understanding of your position is that according to policy the "work" is actually all 5 scans from the Newberry Library archives joined together (or, maybe, only if there is work that was previously unpublished?), and that therefore it is an "extract" in violation of policy. But if I uploaded this [3] instead, that is okay? Or maybe it depends on the access policies of Newberry vs. the National Archives? Or it depends on publication status (so I can extract only published pamphlets from the scans but not something like meeting minutes, so even though they might be in the same scan the "work" is different?) MarkLSteadman (talk) 03:45, 18 August 2024 (UTC)
- If the scan joined multiple published items, that were published separately, I would see no need to force them to be part of the same scan, provided the scan preserves the original publication in toto. I say that because there are Classical texts where all we have is the set of smushed together documents, and they are now considered a "work". This isn't a problem limited to modern scans, archives, and the like. The problem is centuries old. --EncycloPetey (talk) 04:21, 18 August 2024 (UTC)
- So if in those thousands of pages there is a meeting minute or letter between people ("unpublished") then I can't? MarkLSteadman (talk) 13:57, 20 August 2024 (UTC)
- This discussion has gone way beyond my ability to follow it. However, I do want to point out that we do have precedent for considering documents like those contained in this file adequate sources for inclusion in enWS. I mention this because if the above discussion established a change in precedent, there will be a large number of other works that can be deleted under similar argument (including ones which I have previously unsuccessfully proposed for deletion). —Beleg Tâl (talk) 13:14, 13 August 2024 (UTC)
- for example, see the vast majority of works at Portal:Guantanamo —Beleg Tâl (talk) 13:15, 13 August 2024 (UTC)
- (@EncycloPetey, @MarkLSteadman) So, to be clear, the idea would be to say that works which were published once and only once, and as part of a collection of works, but that were created on Wikisource on their own, are to be treated as extracts and deleted per WS:WWI#Extracts?
- If this is the case, it ought to be discussed at WS:S because as BT said a lot of other works would qualify for this that are currently kept because of that precedent, including most of our non-scan-backed poetry and most works that appeared in periodicals. This is a very significant chunk of our content. — Alien333 (what I did & why I did it wrong) 09:29, 14 August 2024 (UTC)
- Also, that would classify encyclopedia articles as extracts, which would finally decide the question of whether it is appropriate to list them on disambiguation pages (i.e., it would not be appropriate, because they are extracts) —Beleg Tâl (talk) 13:23, 14 August 2024 (UTC)
- Extracts are only good for deletion if created separately from the main work. As far as I understood this, if someone does for example a whole collection of documents, they did the whole work, so it's fine, it's only if it's created separately (like this is the case here) that they would be eligible for deletion. Editing comment accordingly. — Alien333 (what I did & why I did it wrong) 15:00, 14 August 2024 (UTC)
- We would not host an article from an encyclopedia as a work in its own right; it would need to be part of its containing work, such as a subpage of the work, and not a stand-alone article. I believe the same principle applies here. --EncycloPetey (talk) 15:36, 14 August 2024 (UTC)
- Much of our non-scan-backed poetry looks like this A Picture Song, which is already non-policy compliant (no source). For those listing a source such as an anthology, policy would generally indicate they should end up being listed as subworks of the anthology they were listed in. I don't think I have seen an example of a poetry anthology scan being split up into a hundred different separate poems transcribed as individual works rather than as a hundred subworks of the anthology work.
- Periodicals are their own mess, especially with works published serially. Whatever we say here also doesn't definitively answer the question of redirects, links, disambiguation, as we already have policies and precedent allowing linking to sub-works (e.g. we allow linking to laws or treaties contained in statute books, collections, appendices, etc.). MarkLSteadman (talk) 02:57, 18 August 2024 (UTC)
- They are non-policy compliant, but the consensus appears to have been that though adding sourceless works is not allowed, we do not delete the old ones, which this, if done, would do. — Alien333 (what I did & why I did it wrong) 07:55, 18 August 2024 (UTC)
Pages of Index:Historical and Biographical Annals of Columbia and Montour Counties, Pennsylvania, Containing a Concise History of the Two Counties and a Genealogical and Biographical Record of Representative Families.pdf
OCR is a mess and all over the place. Just throw the whole thing out and start again, unless someone has the time to calmly realign all the pages. ShakespeareFan00 (talk) 22:44, 18 August 2024 (UTC)
- Delete. The OCR is far too bad to be useful. Definitely better to just delete it. TE(æ)A,ea. (talk) 22:52, 18 August 2024 (UTC)
- Please make sure that you tag the Index with the deletion notice. I see that not only are the created Pages full of OCR errors, but many of those created Pages have content that does not match the scan in any way for the side-by-side comparison. There may be deeper issues with the PDF. --EncycloPetey (talk) 22:53, 18 August 2024 (UTC)
- The index itself is fine. There's no mechanism for mass noming a batch of pages though ShakespeareFan00 (talk) 22:58, 18 August 2024 (UTC)
- Then when you said "throw the whole thing out", you did not mean the Index (which is what is listed)? --EncycloPetey (talk) 23:00, 18 August 2024 (UTC)
- I meant the Page:s; the actual Index: page itself isn't bad. ShakespeareFan00 (talk) 06:38, 19 August 2024 (UTC)
The file is missing two pages, and a number of additional pages are poorly scanned and would need replacement. In addition, the actual scan quality itself is poor, and doesn’t serve easy proofreading. We already have better scans (an illustrated one here and one from a collection here). The images would also be difficult to extract, owing to the same issue. The OCR is poor, and the text added to the pages isn’t useful either. The index and the pages should be deleted. TE(æ)A,ea. (talk) 00:36, 22 August 2024 (UTC)
These are extracts from Toki Pona: The Language of Good (2014) and Toki Pona Dictionary (2021) respectively, and hence fail WS:WWI. It is perhaps worth noting that these extracts have been released under a CC license, while the remainders of these works have not (and are not available online at all). —Beleg Tâl (talk) 13:36, 13 September 2024 (UTC)
- Keep for the reasons you have given. Insofar as they have been separately released, they are whole works. Notes on lipu pu is already scan-backed, and it shouldn’t be too hard to get a copy of the 2014 book for other scan-backing purposes. TE(æ)A,ea. (talk) 16:03, 13 September 2024 (UTC)
- But the first text is pulling content from a copyrighted work and slapping a CC license on it. That's a serious problem, because I don't see how you can claim CC on copyrighted material. The second work appears to be a response to the original text, quoting from it, and does not appear to be an extract. --EncycloPetey (talk) 16:13, 13 September 2024 (UTC)
- EncycloPetey: From a technical standpoint, obviously, one must have copyright on certain material in order to release it under a license. I assumed (without looking in to the matter) that the license was legitimate. I found a forum post which says that the dictionary was released into the public domain in the original book; if this is the case, the license is not an issue. Given that statement, I have ordered the book and will scan the dictionary if it has actually been released. TE(æ)A,ea. (talk) 19:50, 13 September 2024 (UTC)
- Yes, the author of this work has put this section of the book under a CC license. The copyright status of this portion of the work is not an issue. The issue is that it is an extract of a published work, which is currently banned under WS:WWI, and the copyright status of the rest of the book prevents us from hosting it in full. —Beleg Tâl (talk) 01:27, 15 September 2024 (UTC)
- @Beleg Tâl: The Wikisource:Extracts page says that the reason for banning extracts is mainly that "The act of making the extract introduces a bias, placing emphasis on certain points and potentially eliminating counterpoints or contextual information. Even an extract made in good faith may inadvertently change the intent of the original work." But in the case of the Toki Pona dictionary, assigning a different license specifically to the dictionary was the intent of the original work. --Ssvb (talk) 09:38, 17 September 2024 (UTC)
- That is one of the reasons, yes. Another reason given is "The intent of Wikisource is to create a library of freely available, complete texts", which is impossible with the texts in question. —Beleg Tâl (talk) 17:37, 17 September 2024 (UTC)
- Beleg Tâl: Yes, and these are complete works, as I have said. TE(æ)A,ea. (talk) 18:25, 17 September 2024 (UTC)
- I disagree. Giving this extract a different license is not the same as publishing it as its own complete work. It just means that the extract is more accessible. It's still an extract. —Beleg Tâl (talk) 18:45, 17 September 2024 (UTC)
- How does Wikisource handle mixed-license content in general? E.g. a book with public domain text, but also with still copyright-protected illustrations? Or a newspaper issue with some articles already in the public domain, but not the others (edit: Wikisource:Periodical_guidelines#Copyright seems to provide some guidelines)? Should Wikisource have an index for the whole Toki Pona book, but with blank placeholder pages for everything except for the CC-licensed part? --Ssvb (talk) 03:43, 18 September 2024 (UTC)
- Templates like {{image removed}} have been used in cases like that, but that content should be removed from the file also, else commons/us would be hosting copyrighted stuff in that file. — Alien 3 3 3 06:57, 18 September 2024 (UTC)
- But the first text is pulling content from a copyrighted work and slapping a CC license on it. That's a serious problem, because I don't see how you can claim CC on copyrighted material. The second work appears to be a response to the original text, quoting from it, and does not appear to be an extract. --EncycloPetey (talk) 16:13, 13 September 2024 (UTC)
Delete, noting that the "scan" of the Notes on lipu pu seems to be just some self-published pdf. --Jan Kameníček (talk) 16:20, 13 September 2024 (UTC)
- Jan Kameníček: No, that’s just a copy of the “notes” section from the 2021 book, which has been published. TE(æ)A,ea. (talk) 19:50, 13 September 2024 (UTC)
- I see. I am striking my vote for now and will think about it again. --Jan Kameníček (talk) 09:42, 14 September 2024 (UTC)
Various Wikipedia-related pages
All more or less self-published:
- Without any indication of specific authorship other than the publishing organization:
- By editors, published by WMF:
- Written by direction of publishing organization:
- (Non-WM) Blog posts:
- Wikimedia and the new collaborative digital archives (being discussed below)
- Wikipedian-in-residence, a proposal (being discussed below)
— Alien 3 3 3 21:20, 16 September 2024 (UTC)
- None of these items have been tagged as under discussion. All nominations for deletion should be appropriately tagged. --EncycloPetey (talk) 18:41, 17 September 2024 (UTC)
- Just tagged them all, after you reminded me in the above discussion. — Alien 3 3 3 18:55, 17 September 2024 (UTC)
- NOTE: If these Wikisource entries get deleted, we should probably make sure to delete backlinks across Wikimedia, since many sites (including internally) probably still link to those transcriptions. However . . .
- Delete for all the ones that straight-up originated from or exist as wiki articles, per the points I made in the other discussion about the Wikipedia Signpost. It gets more nuanced with a few of these for me, though.
- Delete per nom. --Jan Kameníček (talk) 23:30, 29 September 2024 (UTC)
- Very weak Keep because 1. this is PDF-backed which lends legitimacy, and 2. unlike Editing Wikipedia this didn't just originate straight from Wikipedia itself, and had a direct connection to other organizations in paper form and can be argued to be documentary evidence of interactions between Harvard University and the Wikimedia Foundation. It doesn't help that the individuals involved don't seem particularly notable, though. SnowyCinema (talk) 21:49, 16 September 2024 (UTC)
- Keep this and Editing Wikipedia as they are different in kind from the other works nominated—which are all published-as-Wiki-text blog posts. I would not be opposed to re-nomination later, though. TE(æ)A,ea. (talk) 13:58, 17 September 2024 (UTC)
- Keep for evidentiary value. That it's an important blog post being the first ever reference to Wikipedians in residence, a topic that is notable enough to have its own Wikipedia article now, and did not (apparently) originate from Wikipedia itself, would be sufficient to call historically significant. SnowyCinema (talk) 21:49, 16 September 2024 (UTC)
- Keep as not within scope of deletion rationale. It’s on the edge but I think worthy of inclusion. TE(æ)A,ea. (talk) 13:58, 17 September 2024 (UTC)
- Keep. As the page header says, "preserved due to its significance to the Wikimedia movement". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:19, 28 September 2024 (UTC)
- It may be significant for the Wikimedia movement, yes, but that's far from WS:WWI's "high importance or historical value" of texts like United States Declaration of Independence. Meta is there to document the Wikimedia movement; Wikisource is not here to do that, but to host published texts. — Alien 3 3 3 11:26, 28 September 2024 (UTC)
- Delete, just a blog post. --Jan Kameníček (talk) 23:35, 29 September 2024 (UTC)
- Keep My general position is that if it fell into the public domain naturally in any sense (in this case, being a federal government document), then we should be extremely lenient on its inclusion, since so few modern works actually have the ability to fall under this umbrella. (And the general rule of thumb is, the more modern the work, the higher the page views we get for its transcription are. This one got 23 views this month...) Also, a federal government employee's work essentially has their stamp on it, giving it an inherent sense of documentary/academic legitimacy. SnowyCinema (talk) 02:59, 17 September 2024 (UTC)
- Keep for the same reason: it’s not a work of Wikipedia, so it doesn’t raise that sort of concern. In addition, for the most part, we don’t consider works created by governments to be “self-published” as that would mean most laws, rules, regulations, &c. would be banned. TE(æ)A,ea. (talk) 13:58, 17 September 2024 (UTC)
- My problem is that it's a blog post, not an official document. — Alien 3 3 3 16:32, 17 September 2024 (UTC)
- Delete, just a blog post. --Jan Kameníček (talk) 23:36, 29 September 2024 (UTC)
This is a list of links to various works by Balzac. I think this is supposed to be an anthology, but the links in it do not appear to be from an edition of the anthology, so this should be deleted. —Beleg Tâl (talk) 18:52, 24 September 2024 (UTC)
- Of course, if it's not an anthology, but rather a list of related works, it should be moved to Portal space instead. —Beleg Tâl (talk) 18:53, 24 September 2024 (UTC)
- This is a Schrödinger's contents: All of the listed items were published together in a collection by this title, however the copies we have do not necessarily come from that collection, and many of the items were published elsewhere first. --EncycloPetey (talk) 19:02, 24 September 2024 (UTC)
- None of the copies we have come from that collection, which is why I nominated it for deletion. The closest is Author's Introduction to The Human Comedy which is from The Human Comedy: Introductions and Appendix. —Beleg Tâl (talk) 19:46, 24 September 2024 (UTC)
- There are also a LOT of links to this page, and there is Index:Repertory of the Comedie Humaine.djvu, which is a reference work tied to the work by Balzac. --EncycloPetey (talk) 19:03, 24 September 2024 (UTC)
- The vast majority of the incoming links are through section redirects, so we could just make a portal and change the redirect targets to lead to the portal sections.
- As for Index:Repertory of the Comedie Humaine.djvu, it goes with Repertory of the Comedie Humaine, which is mentioned at La Comédie humaine as a more specific, detailed and distinct work. — Alien 3 3 3 19:26, 24 September 2024 (UTC)
- Yes, it is a distinct work, but it is a reference work about La Comédie humaine, containing links throughout to all the same works, because those works were published in La Comédie humaine, which is the subject of the reference book. This means that it contains the same links to various works issue that the nominated work has. --EncycloPetey (talk) 19:32, 24 September 2024 (UTC)
- We could make the unusual step of creating a Translations page despite having no editions of this anthology. This would handle all the incoming links, and list various scanned editions that could be added in future. It's not unprecedented. —Beleg Tâl (talk) 13:16, 25 September 2024 (UTC)
- These novel series are a bit all over the place, things like The Forsyte Chronicles and Organon get entries, while typically The X Trilogy does not. My sense is that current practice is to group them on Authors / Portals so that is my inclination for the series. Separately, if someone does want to start proofreading one of the published sets under the name, e.g. the Wormeley edition in 30 (1896) or 40 (1906) volumes. MarkLSteadman (talk) 21:12, 24 September 2024 (UTC)
- Sometimes there is no clear distinction between a "series of works" and a "single multi-volume work", which leaves a grey area. However, when the distinction is clear, a "series of works" does not belong in mainspace. To your examples: The Forsyte Chronicles is clearly in the wrong namespace and needs to be moved; but Organon is a Translations page rather than a series, and Organon (Owen) is unambiguously a single two-volume work, so it is where it belongs (though the "Taken Separately" section needs to be split into separate Translations pages). —Beleg Tâl (talk) 13:15, 25 September 2024 (UTC)
- I support changing the page into a translations page. --Jan Kameníček (talk) 21:05, 5 October 2024 (UTC)
- Which translations would be listed? So far, I am aware of just one English translation we could host. --EncycloPetey (talk) 18:38, 7 October 2024 (UTC)
- The translation page can contain a section listing the translation(s) that we host or could host and a section listing those parts of the work which were translated individually. --Jan Kameníček (talk) 21:11, 7 October 2024 (UTC)
- That does not answer my question. I know what a translation page does. But if there is only a single hostable translation, then we do not create a Translations page. --EncycloPetey (talk) 21:56, 7 October 2024 (UTC)
- Although there might not be multiple hostable translations of the whole work, there are various hostable translations of some (or all?) individual parts of the work, which is imo enough to create a translation page for the work. Something like the above discussed Organon. --Jan Kameníček (talk) 15:05, 8 October 2024 (UTC)
- Organon is a collected work limited in scope to just six of Aristotle's works on a unifying theme. La Comédie humaine is more akin to The Collected Works of H. G. Wells, where we would not list all of his individual works, because that's what an Author page is for. --EncycloPetey (talk) 17:10, 8 October 2024 (UTC)
- Well, this work also has some unifying theme (expressed in the title La Comédie humaine) and so it is not just an exhaustive collection of all the author's works. Unlike The Collected Works of H. G. Wells it follows some author's plan (see w:La Comédie humaine#Structure of La Comédie humaine). So I also perceive it as a consistent work and can imagine that it has its own translation page, despite the large number of its constituents. --Jan Kameníček (talk) 18:56, 8 October 2024 (UTC)
- A theme hunted for can always be found. By your reasoning, should we have a Yale Shakespeare page in the Mainspace that lists all volumes of the first edition and a linked list of all of Shakespeare's works contained in the set? After all, the Yale Shakespeare is not an exhaustive collection. I would say "no", and say the same for La Comédie humaine. The fact that a collection is not exhaustive is a weak argument. --EncycloPetey (talk) 19:16, 8 October 2024 (UTC)
- You pick one little detail from my reasoning which you twist, this twisted argument you try to disprove and then consider all my reasoning disproved. However, I did not say that the reason is that it is not exhaustive. I said that it is not just an exhaustive collection but that it is more than that, that it resembles more a consistent work with a unifying theme. The theme is not hunted, it was set by the author. --Jan Kameníček (talk) 19:54, 8 October 2024 (UTC)
- Then what is your reason for wanting to list all of the component works on a versions / translations page? "It has a theme" is not a strong argument; nor is "it was assembled by the author". Please note that the assemblage, as noted by the Wikipedia article, was never completed, so there is no publication anywhere of the complete assemblage envisioned by the author. This feels more like a shared universe, like the Cthulhu Mythos or Marvel Cinematic Universe, than a published work. I am trying to determine which part of your comments are the actual justification being used for listing all of the component works of a set or series on the Mainspace page, and so far I do not see such a justification. But I do see many reasons not to do so. --EncycloPetey (talk) 20:08, 8 October 2024 (UTC)
- I have written my arguments and they are not weak as I see them. Having spent with this more time than I had intended and having said all I wanted, I cannot say more. --Jan Kameníček (talk) 20:24, 8 October 2024 (UTC)
- There are multiple reasons why it is different from the Cthulhu Mythos or Marvel Cinematic Universe. E.g.
- 1. It is a fixed set, both of those examples are open-ended, with new works being added. Even the authors are not defined.
- 2. It was defined and published as such by the original author. Those are creations of, often, multiple editors meaning that the contents are not necessarily agreed upon.
- 3. It was envisioned as a concept from the original author, not a tying together of works later by others.
- etc.
- The argument, "it wasn't completed" is also not a particularly compelling one. Lots of works are unfinished, I have never heard the argument, we can't host play X as "Play X" because only 4/5 acts were written before the playwright died, or we can't host an unfinished novel as X because it is unfinished. And I doubt that is really a key distinction in your mind anyways, I can't imagine given the comparisons you are making that you would be comfortable hosting it if Balzac lived to 71, completed the original planned 46 novels but not if he lived to 70 and completed 45.5 out of the 46.
- MarkLSteadman (talk) 23:41, 8 October 2024 (UTC)
- Re: "It was defined and published as such by the original author". Do you mean the list was published, or that the work was published? What is the "it" here? --EncycloPetey (talk) 00:54, 9 October 2024 (UTC)
- "It" is the concept, so both. You could go into a book store in 1855 and buy books labeled La Comedie Humaine, Volume 1, just like you can buy books today labeled A Song of Ice and Fire, First Book.
- But that is my general point, having a discussion grounded in the publication history of the concept can at least go somewhere. Dismissing out of hand, "it was never finished" gets debating points, not engagement. I may have had interest in researching the history over Balzac's life, but at this point that seems futile.
- In general, to close out my thoughts, for the reasons I highlighted (fixed set, author intent, enough realization and publication as such, existence as a work on fr Wikisource / WP as a novel series) it seems enough to be beyond a mere list, and a translation page seems a reasonable solution here. MarkLSteadman (talk) 12:50, 9 October 2024 (UTC)
This work is an extract: it is part of the work Khusru and Shirin included inside Sykes's A History of Persia on p. 141 inside Chapter 54. MarkLSteadman (talk) 21:56, 29 September 2024 (UTC)
- Delete per WS:WWI. Cremastra (talk) 19:33, 3 October 2024 (UTC)
Does this Austrian chemist have any known hostable works? All the listed works are in German. --EncycloPetey (talk) 17:32, 30 September 2024 (UTC)
- I could not find any English translations of his works, so I don't see the use of this author page. Ping to @LlywelynII: what was your reason for creating it? Are there translations or something else hostable that I couldn't find? Thanks, Cremastra (talk) 20:59, 2 October 2024 (UTC)
- He is linked to from an Encyclopedia Britannica article where his scientific measurements are reported. MarkLSteadman (talk) 20:42, 3 October 2024 (UTC)
Not in accordance with WS:Translations#Wikisource original translations: A scan supported original language work must be present on the appropriate language wiki. -- Jan Kameníček (talk) 15:41, 4 October 2024 (UTC)
Translation:“Regulations for the Control of Licensed Brothels and Prostitutes” in the Korean peninsula under the Administration of the Empire of Japan
Not in accordance with WS:Translations#Wikisource original translations: A scan supported original language work must be present on the appropriate language wiki. -- Jan Kameníček (talk) 15:41, 4 October 2024 (UTC)
Abandoned incomplete work, containing just a small fragment of Book 1. -- Jan Kameníček (talk) 20:39, 5 October 2024 (UTC)
- Comment It is incomplete and abandoned, but there is a source linked in the notes of the header. --EncycloPetey (talk) 18:35, 7 October 2024 (UTC)
This is a new template modeled on Template:Long s, but which displays a printed v as a u everywhere except pagespace. Thus far, it has only been deployed at Index:Hamlet, Second Quarto, 1603 (Folger STC 22278).djvu, an early edition of Shakespeare. This template breaks our norm of reproducing what was printed in a serious way. And in using it on a Shakespeare Quarto, hides the original orthography, which would be one of the most important features of transcribing such a Quarto.
This is not equivalent to the long-s template. That template renders one printed form of a lower-case s with another printed form of a lower-case s. This new template swaps one printed letter for a different printed letter. --EncycloPetey (talk) 22:54, 9 October 2024 (UTC)
- I disagree that showing the archaic printed orthography of the quarto—with u, v, ſ or ƈ—is useful to the reader. Please see Wikisource:Style guide/Orthography#Phonetically equivalent archaic letter forms.
- Consider the conventional line:
- He smote the sledded Polacks on the ice.
- In this quarto, the line is written:
- He ſmot the ſleaded pollax on the ice.
- As much as I agree the notable spelling “pollax” should be kept, this is not the same as the distracting printed convention of “ſleaded”—which was never pronounced as an F, and is not useful as an F.
- Similarly, the contemporary:
- Why this same strict and most observant watch
- should not use this quarto’s
- Why this ſame ſtrikt and moſt obſeruant watch
- but instead should give the reader:
- Why this same strikt and most observant watch
- In any case, there is no need to jump straight to deleting the template. If a decision is made at the style guide to substitute all {{V as u}} into the simple v, that change can be easily done. HTGS (talk) 23:22, 9 October 2024 (UTC)
- Sorry, I meant to note further that if this typographic decision is made at this printing of the Second Quarto, that does not mean that the template should never be used anywhere, so I think deletion is at least premature, if not totally unnecessary. HTGS (talk) 23:38, 9 October 2024 (UTC)
- As I mentioned in the initial post, there is no disagreement about the long-s. We agreed to allow that many years ago. But we have never before agreed to replace one typographical letter with a different typographical letter. But what is the point of transcribing a Quarto edition if we alter the presented spellings? Modern editions (Oxford, Yale, Arden, etc.) will have the modern orthography. The utility of transcribing a Quarto or Folio is in providing what was printed, and not in presenting an altered edition with modernized orthography. --EncycloPetey (talk) 23:52, 9 October 2024 (UTC)
- Your goal seems to be a very strange carve out, to keep only one particularity of archaic printing, but not all. If the goal were to display all characters as they were printed, that would make sense to me—and I suggest that the style guide should actually allow it in such cases, where consensus holds for it. To suggest that the printed v used as a u is of a different category of orthography though, you would have to convince me that “vs” is just a different spelling of “us”, and not an orthographic oddity. HTGS (talk) 00:11, 10 October 2024 (UTC)
- It is not my goal, in either sense. Long-s contracted to lower-case s is a community decision, and is permitted, but is not mandated. There are transcribed works where the long-s is preserved, and in some works I have so preserved it. In a Folio or Quarto, I would argue it should be preserved as well, because the power of transcribing such a text (as I have said multiple times) is providing the reader with what was printed. If the goal is to provide modern orthography, we should do so in a modernized edition, of which there are many. --EncycloPetey (talk) 00:22, 10 October 2024 (UTC)
- Which of the modern editions use the quarto text? I was under the impression that all of them use the First Folio (and so “Fortinbras”, not “Fortinbrasse”, etc). HTGS (talk) 00:38, 10 October 2024 (UTC)
- The University of Victoria (Canada) maintains both an original and modernized edition of Q1, Q2, and F1. They assert copyright on their texts, but we would not copy secondhand from an online source anyway. But according to the Folger's Shakespeare: "Most such editors have preferred the Second Quarto’s readings in the belief that it was printed either directly from Shakespeare’s own manuscript or from a scribe’s copy of it. A few have, instead, adopted Folio readings in the belief that the Folio was set into type from a theater manuscript, and they wanted to give their readers the play as it was performed on Shakespeare’s stage." --EncycloPetey (talk) 01:55, 10 October 2024 (UTC)
- Also, from our copy of The Yale Shakespeare: "Modern texts are based upon the Quarto of 1604 and the First Folio." --EncycloPetey (talk) 01:57, 10 October 2024 (UTC)
- Thanks. I think ultimately, we should be providing a readable version of texts we transcribe. This ideology fits with our present guidance, to avoid “Phonetically equivalent archaic letter forms”, and it makes most sense to consider the use of V as a U similarly.
- Of course this isn’t a discussion to have in a section about deleting the template, because whatever consensus is reached, there is no reason the template needs to be deleted. If the community prefers print-parity across the board on this, the template can be edited to function in reverse, displaying the printed form for all users except those who have chosen to display it with the modern orthography. HTGS (talk) 02:08, 10 October 2024 (UTC)
- It can become an issue in works with many template calls of this kind. We already have Old English poetry too long to use poem templates, or with too many gaps in the text to use complex templates. But this discussion is happening precisely because it pushes beyond what has been permissible in the past. Allowing this template would be a change to established practice. This is why the deletion discussion is happening. --EncycloPetey (talk) 03:01, 10 October 2024 (UTC)
- Delete While there is an apparent precedent with ſ vs s, this pairing is not a similar case in that search engines are unlikely to recognise "u" and "v" as being equivalent. In the particular work the template is appearing in, I regard it as an annotation. Thus, under our rules for annotated editions, there needs to be a version that has fidelity to the work as printed—including long-s. Once that is done, then an annotated version with modern orthography can be considered. Such would not require this template. Beeswaxcandle (talk) 09:16, 10 October 2024 (UTC)
- Can I add a template I created in good faith, {{vv}}, to this discussion, as I feel the concerns are related? ShakespeareFan00 (talk) 22:09, 11 October 2024 (UTC)
- There are a handful of such templates, most with little usage, and some with inconsistent standards applied to transcription of the work where they are used. I am hoping we'll get enough discussion to set a community norm that will allow us to judge similar templates. --EncycloPetey (talk) 22:19, 11 October 2024 (UTC)
Just a social media post. -- Jan Kameníček (talk) 21:22, 11 October 2024 (UTC)
- This is not just a sole instance of this user's tweet transcriptions; this has been going on for a while. Some examples of more contentless NWS tweets "transcribed" by this editor include NWS Tornado Test Tweet, NWS Dodge City No (literally just the word "No."), NWS – Same tbh. One that at least has some substantial content is X.com/NWS/status/1837308747603136776, but even this would be against our criteria for inclusion.
- @WeatherWriter: Were you aware that bare social media posts, especially ones as non-notable and unsubstantial as many of the ones you've been posting here, are against our criteria for inclusion, at WS:WWI? What would one instance of the word "No" help? If the world needs an archive of federal government tweets, then fine, but maybe we should do it in some automated way, and somewhere other than Wikisource (like the Internet Archive, which happens to be down right now).
- I'm even probably one of the more lenient admins here in regards to the "Internet inclusion criteria", and even your posting of web pages (example) from National Weather agencies would probably generate some controversy here, by itself. A good rule of thumb is, "did it appear on paper in published form?" I'm sure there are plenty of paper weather reports or documentation you could find to transcribe that would be really interesting additions to our library. But, Wikisource simply does not have the infrastructure to include every tweet by a federal employee or agency.
- I do appreciate the clearly dedicated work, but I have to regrettably vote Delete for all the tweets. Though I do hope you stick around through your interest in weather anyway, and target the energy to works that we would accept. SnowyCinema (talk) 22:25, 11 October 2024 (UTC)
- I will remain neutral on the small tweets due to User:SnowyCinema saying they are against Wikisource's criteria of inclusion. However, strong keep for X.com/NWS/status/1837308747603136776 as it literally contains an error posted by the NWS, which was pointed out by others. In that post, NWS said snow would be in the forecast in September. That seems solid enough for criteria of inclusion. WeatherWriter (talk) 22:29, 11 October 2024 (UTC)
- Also to note, the NWS Disclaimer for Photo Use is the subject of a major RFC on the Commons and is very much Wikimedia/Wikiproject related. WeatherWriter (talk) 22:31, 11 October 2024 (UTC)
- When you say "seems solid enough for criteria of inclusion", which criteria are you applying? --EncycloPetey (talk) 22:38, 11 October 2024 (UTC)
- Documentary sources published after 1928. WeatherWriter (talk) 22:41, 11 October 2024 (UTC)
- How is this an "official document" as opposed to simply "[e]xpressions of mere opinion"? --EncycloPetey (talk) 22:52, 11 October 2024 (UTC)
- Everything produced by the NWS direct accounts (as opposed to a meteorologist's personal account) is done during official duty, and is therefore an official document by the United States government. WeatherWriter (talk) 00:57, 12 October 2024 (UTC)
- The question is not whether this is some form of official comment of the agency versus a personal account. The post of "Correct! Great message!" seems to be an opinion on some other object, and not a departmental document. --EncycloPetey (talk) 01:08, 12 October 2024 (UTC)
- X.com/NWS/status/1837308747603136776 (which has been grouped into this discussion) is clearly not an opinion and would clearly qualify as an official document/department post. Same with the "No" post, as explained below. "No" was the formal post by the NWS. Yes, it was their opinion that there was no tornado at the time (we see how that went), but that is a clear departmental statement which bit them in the ass more or less. WeatherWriter (talk) 02:01, 12 October 2024 (UTC)
- Edit conflict - Context on the "No" tweet. NWS publicly denied that a storm chaser saw a tornado, and when they were asked whether they could issue a tornado warning, that "No" post was their reply. They later confirmed that a tornado occurred. That single tweet has thousands of views and reactions which can be seen here. Even TV channels and degreed meteorologists reacted to it (A few: [4][5][6]). That "no" may actually qualify in the criteria of inclusion. Mike Smith, the former senior vice president of w:Accuweather, actually wrote an entire timeline involving that day and tornado: [7]. If you Google "Dodge City" "tornado" "no", you will see several things come up on it. Honestly, that "no" probably does qualify in the criteria for inclusion. WeatherWriter (talk) 22:41, 11 October 2024 (UTC)
- I'm not too convinced. There's no content there—it's a simple English word which could be said by anyone in any number of contexts. And the Google results don't lead me to believe the tweet itself would be such a phenomenon, compare for example Captain Midnight broadcast signal intrusion message which has its own Good article on Wikipedia. SnowyCinema (talk) 22:49, 11 October 2024 (UTC)
- User:SnowyCinema: If File:NWS Miami, Florida post on X at 524 PM on Oct 9, 2024.png was turned into a Wikisource document, would you have objections to it? It is in use on the Hurricane Milton article right now, and you seem to keep suggesting that use elsewhere on the Wikimedia projects is needed for short posts to qualify. WeatherWriter (talk) 02:06, 12 October 2024 (UTC)
- [Edit conflict] The NWS uses Twitter as part of their official duties. Several tweets are published (short in nature). If you look at this NWS webpage ("Timeline" tab), you will see links to a ton of NWS social media posts during the w:2013 Moore tornado. Posts are short like these: [8][9][10]. One of the most critical posts was this five-word tweet, which was one of only about 200 issuances of a w:tornado emergency. NWS tweets clearly can qualify under the documentary sources published after 1928 criteria of inclusion. WeatherWriter (talk) 22:51, 11 October 2024 (UTC)
- Delete We are not a replicant site for conversations on Twitter/X. Bending the "documentary evidence" exception for a series of tweets (regardless of who issued them) is going beyond its original intent. Beeswaxcandle (talk) 23:31, 11 October 2024 (UTC)
- P.S. As a reply to SnowyCinema's question above on whether I knew the criteria for inclusion guidelines: no, I did not. This is so dumb as well. I apologize for cursing, but I think it is warranted. I got directed by another user from the Wikisource:Scriptorium/Help to use the Template:New texts. I decided to give it a shot. First fucking thing I put there is reverted, then proposed for deletion, then others (like yourself) practically gang up and propose several other things I wrote into the deletion discussion. Literally, being directed from the help page got me several proposed deletions within a few hours of trying the "help" out. What a fucking warm welcome to trying new stuff out. I went ahead and tried it again with a different page I wrote. If I am wrong, just propose it for deletion and I'll just quit Wikisource since using the "help" just got stuff deleted... WeatherWriter (talk) 05:55, 12 October 2024 (UTC)