Wikisource:Proposed deletions
- WS:PD redirects here. For help with public domain materials, see Help:Public domain.
No source, no license, no indication of being in the public domain —Beleg Tâl (talk) 17:22, 7 August 2024 (UTC)
- Found the source: [1] — Alien333 (what I did & why I did it wrong) 19:54, 7 August 2024 (UTC)
- The text of the source does not match what we have. I am having trouble finding our opening passages in the link you posted. --EncycloPetey (talk) 19:58, 7 August 2024 (UTC)
- (At least, a sentence matched.) @EncycloPetey: Found it, the content that corresponds to our page starts in the middle of page 44 of that pdf, though the delimiting of paragraphs seems to be made up. — Alien333 (what I did & why I did it wrong) 20:00, 7 August 2024 (UTC)
- That means we have an extract. --EncycloPetey (talk) 00:39, 9 August 2024 (UTC)
- No, it appears that the PDF is a compilation of several different, thematically related documents. His statement (English’d) is one such separate document. TE(æ)A,ea. (talk) 00:53, 9 August 2024 (UTC)
- In which case we do not yet have a source. --EncycloPetey (talk) 00:55, 9 August 2024 (UTC)
- No, that is the source; it’s just that the PDF contains multiple separate documents, like I said. It’s like the “Family Jewel” papers or the “Den of Espionage” documents. TE(æ)A,ea. (talk) 00:58, 9 August 2024 (UTC)
- Sorry, I meant to say that we do not have a source for it as an independently hosted work. To use the provided source, it would need to be moved into the containing work. --EncycloPetey (talk) 01:55, 9 August 2024 (UTC)
- Well, these document collections are a bit messy: they were originally independent documents / works, but they are collected together for release, e.g. because someone filed a FOIA request for all documents related to person X. I don't think it is unreasonable if someone were to extract out the document. I wouldn't object if someone said, "I went to an archive and grabbed document X out of Folder Y in Box Z," but if someone requested a digital version of the file from the same archive they might just get the whole box from the archive scanned as a single file. Something like the "Family Jewels" is at least editorially collected, has a cover letter, etc.; this is more like saying that years 1870-1885 of this magazine are on microfiche roll XXV, so we need to organize by microfiche roll. MarkLSteadman (talk) 11:17, 9 August 2024 (UTC)
- @EncycloPetey since this PDF is published on the DOD/WHS website, doesn't that make this particular collection of documents a publication of DOD/WHS? (Genuine question, I can imagine there are cases -- and maybe this is one -- where it's not useful to be so literal about what constitutes a publication or to go off a different definition. But I'm interested in your thinking.) -Pete (talk) 20:11, 9 August 2024 (UTC)
- Why would a particular website warrant a different consideration in terms of what we consider a publication? How and why do you think it should be treated differently? According to what criteria and standards? --EncycloPetey (talk) 20:23, 9 August 2024 (UTC)
- Your reply seems to assume I have a strong opinion on this. I don't. My question is not for the purpose of advocating a position, but for the purpose of understanding your position. (As I said, it's a genuine question. Meaning, not a rhetorical or a didactic one.) If you don't want to answer, that's your prerogative of course.
- I'll note that Wikisource:Extracts#Project scope states, "The creation of extracts and abridgements of original works involves an element of creativity on the part of the user and falls under the restriction on original writing." (Emphasis is mine.) This extract is clearly not the work of a Wikisource user, so the statement does not apply to it. It's an extract created (or at least published) by the United States Department of Defense, an entity whose publishing has been used to justify the inclusion of numerous works on Wikisource.
- But, I have no strong opinion on this decision. I'm merely seeking to understand the firmly held opinions of experienced Wikisource users. -Pete (talk) 20:42, 9 August 2024 (UTC)
- You misunderstand. The page we currently have on our site is, based on what we have so far, an extract from a longer document. And that extract was made by a user on Wikisource. There is no evidence that the page we currently have was ever published independently, so the extract issue applies here. We can host it as part of the larger work, however, just as we host poems and short stories published in a magazine. We always want the work to be included in the context in which it was published. --EncycloPetey (talk) 20:55, 9 August 2024 (UTC)
- OK. I did understand that to be TEaeA,ea's position, but it appeared to me that you were disagreeing and I did not understand the reasons. Sounds like there's greater agreement than I was perceiving though. Pete (talk) 21:36, 9 August 2024 (UTC)
- I am unclear what you are referring to as a "longer document." Are you referring to the need to transcribe the Russian portion? That there are unreleased pages beyond the piece we have here? Or are you saying the "longer document" is all 53 sets of releases, almost 4000 pages, listed here (https://www.esd.whs.mil/FOIA/Reading-Room/Reading-Room-List_2/Detainee_Related/)? I hope you are not advocating for merging all ~4000 pages into a single continuous page here, so some subdivision I assume is envisioned.
- Re the policy statement: I am not sure that is definitive: if someone writes me a letter or a poem and I paste that into a scrapbook, is the "work" the letter, the scrapbook or both? Does it matter if it is a binder or a folder instead of a scrapbook? If a reporter copies down a speech in a notebook, is the work the speech or the whole notebook? Etc. I am pretty sure we haven't defined "work" with enough precision to point to policy and say one interpretation is clearly wrong, which is why we have the discussion. MarkLSteadman (talk) 05:36, 10 August 2024 (UTC)
- The basic unit in WS:WWI is the published unit; we deal in works that have been published. We would not host a poem you wrote and pasted into a scrapbook, because it has not been published. For us to consider hosting something that has not been published usually requires some sort of extraordinary circumstances. --EncycloPetey (talk) 15:53, 10 August 2024 (UTC)
- From WWSI: "Most written work ... created but never published prior to 1929 may be included"; documentary sources include "personal correspondence and diaries." The point isn't the published works; that is clear. If someone takes the poem, edits it and publishes it in a collection, it's clear. It's the unpublished works sitting in archives, documentary sources, etc. Is the work the unpublished form in which it went into the archive (e.g. separate letters), or the unpublished form currently in the archives (e.g. bound together)? Or is it that if I request pages 73-78 from the archives, those 5 pages in the scan are the work, and if you request pages 67-75, those are a separate work? MarkLSteadman (talk) 17:18, 10 August 2024 (UTC)
- I will just add that in every other context we refer to a work as the physical thing and not a mere scanned facsimile. We don't consider Eighteenth Century Collections Online scanning a particular printed edition and putting up a scan as the "published unit" as distinct from the British Library putting up their scan as opposed to the LOC putting up their scan or finding a version on microfilm. Of course, someone taking documents and doing things (like the Pentagon Papers, or the Family Jewels) might create a new work, but AFAICT in this context it is just mere reproduction. MarkLSteadman (talk) 05:37, 12 August 2024 (UTC)
- In the issue at hand, I am unaware of any second or third releases / publications. As far as I know, there is only the one release / publication. When a collection or selection is released / published from an archive collection, that release is a publication. And we do not have access to the archive. --EncycloPetey (talk) 17:34, 12 August 2024 (UTC)
- We have access, via filing a FOIA request. That is literally how those documents appeared there, they are hosted under: "5 U.S.C. § 552 (a)(2)(D) Records - Records released to the public, under the FOIA," which are by law where records are hosted that have been requested three times. And in general, every archive has policies around access. And I can't just walk into Harvard or Oxford libraries and handle their books either.
- My point isn't that this can't be the interpretation we adopt, or that we can't have stricter policies around archival material. It is just that I don't believe we can point to a statement saying "work" or "published unit" and claim it "obviously" means that pages 1-5 of a ten-page report are hostable as a "complete work" if someone requests just those five pages via FOIA, while the whole report, cut out by someone, now needs to be deleted because it was released as part of a 1000-page document release and hence is now an "extract" of that release. That requires discussion, consensus, pointing to precedent, etc. And if people here agree with that interpretation, go ahead. MarkLSteadman (talk) 03:16, 18 August 2024 (UTC)
- For example, I extracted Index:Alexandra Kollontai - The Workers Opposition in Russia (1921).djvu out of [2]. My understanding of your position is that according to policy the "work" is actually all 5 scans from the Newberry Library archives joined together (or, maybe only if there are works that were previously unpublished?), and that therefore it is an "extract" in violation of policy. But if I uploaded this [3] instead, that is okay? Or maybe it depends on the access policies of Newberry vs. the National Archives? Or it depends on publication status (so I can extract only published pamphlets from the scans but not something like meeting minutes, so even though they might be in the same scan the "work" is different?) MarkLSteadman (talk) 03:45, 18 August 2024 (UTC)
- If the scan joined multiple published items, that were published separately, I would see no need to force them to be part of the same scan, provided the scan preserves the original publication in toto. I say that because there are Classical texts where all we have is the set of smushed together documents, and they are now considered a "work". This isn't a problem limited to modern scans, archives, and the like. The problem is centuries old. --EncycloPetey (talk) 04:21, 18 August 2024 (UTC)
- So if in those thousands of pages there is a meeting minute or letter between people ("unpublished") then I can't? MarkLSteadman (talk) 13:57, 20 August 2024 (UTC)
- This discussion has gone way beyond my ability to follow it. However, I do want to point out that we do have precedent for considering documents like those contained in this file adequate sources for inclusion in enWS. I mention this because if the above discussion established a change in precedent, there will be a large number of other works that can be deleted under similar argument (including ones which I have previously unsuccessfully proposed for deletion). —Beleg Tâl (talk) 13:14, 13 August 2024 (UTC)
- for example, see the vast majority of works at Portal:Guantanamo —Beleg Tâl (talk) 13:15, 13 August 2024 (UTC)
- (@EncycloPetey, @MarkLSteadman) So, to be clear, the idea would be to say that works which were published once and only once, and as part of a collection of works, but that were created on Wikisource on their own, are to be treated as extracts and deleted per WS:WWI#Extracts?
- If this is the case, it ought to be discussed at WS:S because as BT said a lot of other works would qualify for this that are currently kept because of that precedent, including most of our non-scan-backed poetry and most works that appeared in periodicals. This is a very significant chunk of our content. — Alien333 (what I did & why I did it wrong) 09:29, 14 August 2024 (UTC)
- Also, that would classify encyclopedia articles as extracts, which would finally decide the question of whether it is appropriate to list them on disambiguation pages (i.e., it would not be appropriate, because they are extracts) —Beleg Tâl (talk) 13:23, 14 August 2024 (UTC)
- Extracts are only good for deletion if created separately from the main work. As far as I understood this, if someone does for example a whole collection of documents, they did the whole work, so it's fine, it's only if it's created separately (like this is the case here) that they would be eligible for deletion. Editing comment accordingly. — Alien333 (what I did & why I did it wrong) 15:00, 14 August 2024 (UTC)
- We would not host an article from an encyclopedia as a work in its own right; it would need to be part of its containing work, such as a subpage of the work, and not a stand-alone article. I believe the same principle applies here. --EncycloPetey (talk) 15:36, 14 August 2024 (UTC)
- Much of our non-scan backed poetry looks like this A Picture Song, which is already non-policy compliant (no source). For those listing a source such as an anthology, policy would generally indicate they should end up being listed as subworks of the anthology they were listed in. I don't think I have seen an example of a poetry anthology scan being split up into a hundred different separate poems transcribed as individual works rather than as a hundred subworks of the anthology work.
- Periodicals are their own mess, especially with works published serially. Whatever we say here also doesn't definitively answer the question of redirects, links, disambiguation, as we already have policies and precedent allowing linking to sub-works (e.g. we allow linking to laws or treaties contained in statute books, collections, appendices, etc.). MarkLSteadman (talk) 02:57, 18 August 2024 (UTC)
- They are non-policy compliant, but the consensus appears to have been that, though adding sourceless works is not allowed, we do not delete the old ones, which this, if done, would do. — Alien333 (what I did & why I did it wrong) 07:55, 18 August 2024 (UTC)
Looks like transcription of some screenshots of web pages. Not in our scope per WS:WWI#Reference material: "Wikisource does not collect reference material unless it is published as part of a complete source text" ... "Some examples of these include... Tables of data or results".
Besides, the PDF file contains two pages with two tables from two separate database entries, so it is a user-created compilation, which is again not possible per WS:WWI.
(Besides all this, I still believe that our task is not transcribing the whole web, as this creates unnecessary maintenance burden for our small community. But it is not the main reason, though it is important, the main ones are above.)
-- Jan Kameníček (talk) 22:04, 12 January 2025 (UTC)
- Keep – These reports are published specifically by the United States government at least 3 months after a natural disaster and serve as the finalized reports. There is an entire page specifically about these sources. The PDF is Wikipedian-made but the tables are not. The U.S. government divides every report by county and by month. The fire was in a single county, but occurred in April & May 2024, therefore, NOAA published an April 2024 and a May 2024 report separately. The PDF was the combination of the two sources. To note, this is an official publication of the U.S. government as described in that page linked above: "Storm Data is an official publication of the National Oceanic and Atmospheric Administration (NOAA) which documents the occurrence of storms and other significant weather phenomena having sufficient intensity to cause loss of life, injuries, significant property damage, and/or disruption to commerce." Per WS:WWI, this is a documentary source, which qualifies under Wikisource's scope per "They are official documents of the body producing them". There is no way in hell you can argue a collection of official U.S. government documents does not qualify for Wikisource. WeatherWriter (talk) 22:26, 12 January 2025 (UTC)
- The definition of the documentary source in WS:WWI says that "documents may range from constitutions and treaties to personal correspondence and diaries." Pure tables without any context are refused by the rule a bit below, see my quotation above. --Jan Kameníček (talk) 22:33, 12 January 2025 (UTC)
- That is how the National Weather Service, a branch of the United States government, publishes finalized results...Like every single fucking natural disaster in the United States is published in that format. File:Storm Data Document for the 1970 Lubbock, Texas Tornado.jpg is a 1970 publication (pre-Internet) and this is a physical paper that was physically scanned in. That too is in a chart and table. If charts and tables produced by the US government are not allowed, then y'all need to create something saying no U.S. government natural disaster report is allowed because tables is how the U.S. government fucking publishes the information. Yeah, good bye Wikisource. There is literally no use to be here. WeatherWriter (talk) 22:39, 12 January 2025 (UTC)
- That is absolutely OK that they publish tables, but our rule does not accept such screenshot-based material. Being rude or shouting with bold or red letters won't help. Although you have achieved that opposing arguments are less visible, it will not have any impact on the final result. --Jan Kameníček (talk) 22:53, 12 January 2025 (UTC)
- If/when this is deleted, please make a note somewhere that Storm Data is not covered under Wikisource's scope, since both the 2024 wildfire and 1970 tornado document above are from Storm Data and they would not be under the scope. There needs to be some note about that somewhere that the U.S. document series Storm Data is not under Wikisource's scope. WeatherWriter (talk) 22:56, 12 January 2025 (UTC)
- Definitely not, it is not a matter of publisher. Besides, our rules are worded generally, we never make them publisher-specific. Speaking about Storm Data, they publish a monthly periodical, see an example which would definitely be in our scope. Unlike screenshots of their web. --Jan Kameníček (talk) 23:06, 12 January 2025 (UTC)
- So Storm Data is allowed, but screenshots of Storm Data is not allowed? Is that correct? WeatherWriter (talk) 23:09, 12 January 2025 (UTC)
- More or less. We don't accept extracts or user-created compilations, but if you have a government work as a whole, we'll generally take it. Screenshots of works aren't specifically in violation, but it's a horrible way to get a whole work. You can use podman on the HTML, or print it directly from your browser, and that will let the text be copyable.--Prosfilaes (talk) 00:35, 13 January 2025 (UTC)
- I went ahead and requested author-requested speedy deletion on it. No use to try to argue or debate. I know you are an administrator who clearly knows it isn't in scope and needs to be deleted. I don't want to argue or debate it anymore and just want to be done with Wikisource transcribing. I do indeed lack the competence to know what is or is not allowed for Wikisource, despite being a veteran editor. WeatherWriter (talk) 23:18, 12 January 2025 (UTC)
- In general, I would lean towards Keep for reports by federal governments on official events. I know that we keep for example Civil Aeronautics Board / NTSB reports. Presumably, the NTSB dockets could also be added if so inclined. This seems to be the NOAA equivalent where the differences seem to be some level of "lack of narrative / description" and the proper formatting of the sourcing from the DB for structured data. I don't really think the first is particularly compelling to merit deletion, and the second is really about form not content. E.g. it might make sense to download the DB as a csv and then make each line a sub page to be more "official" but this seems fine to me (might make sense to upload the 1 line CSV anyways for posterity). MarkLSteadman (talk) 00:06, 13 January 2025 (UTC)
- On this topic, I want to throw 2024 Greenfield Tornado Finalized Report into the mix. This is a nearly identical format Wikisource collection (and Wikisource validated collection) for the NOAA finalized report on the 2024 Greenfield tornado. I am wanting to throw this into the mix for others to see a better-example of NOAA's finalized report. Also noting the Wikisource document is listed on the EN-Wikipedia article for the tornado (see the top of w:2024 Greenfield tornado#Tornado summary). WeatherWriter (talk) 00:17, 13 January 2025 (UTC)
- It's not the NOAA finalized report; it's a stitched together collection of NOAA reports. It's not entirely transparent which reports were stitched together. It's clearly not Storm Data.--Prosfilaes (talk) 00:35, 13 January 2025 (UTC)
- @Prosfilaes: Every URL is cited on the talk page. See Talk:2024 Greenfield Tornado Finalized Report in the "Information about this edition". To also note, the "Notes" section actually says, "This tornado crossed through four counties, so the finalized report consists of four separate reports, which have been combined together." I do not know how that is not transparent enough to say which reports are in the collection. The reports' "Event Narrative" sections also make it clear for the continuations: For example, one ends with "The tornado exited the county into Adair County between Quince Avenue and Redwood Avenue." and the next starts with "This large and violent tornado entered into south central Adair County from Adams County." NOAA is very transparent when it is a continuation like that. If you have any suggestions how to make it more transparent, I am all ears! WeatherWriter (talk) 00:51, 13 January 2025 (UTC)
- Also quick P.S., this is in fact Storm Data. You can read the Storm Data FAQ page. Everything regarding what is an "Episode" vs "Event" (as seen in the charts aforementioned above) is entirely explained there. WeatherWriter (talk) 00:57, 13 January 2025 (UTC)
- @WeatherWriter: I missed those URLs because they're not listed on the PDF page. Someone should archive completely that Storm Data database, but that's not really Wikisource's job. We store publications, not user-created collections of material from a database. There is no "2024 Greenfield Tornado Finalized Report" from NOAA; there are four separate reports.--Prosfilaes (talk) 04:21, 14 January 2025 (UTC)
- Keep. The nominator misreads the relevant policy. The fact that a document is in tabular form does not mean that it needs must be excluded; this is a good example of that fact. TE(æ)A,ea. (talk) 00:44, 13 January 2025 (UTC)
- ...and besides that it is a user created compilation. --Jan Kameníček (talk) 18:56, 13 January 2025 (UTC)
Upon my request, the two reports compiled in our pdf have been archived by archive.org, see here and here. Archive.org is the service which should be used for web archiving, not Wikisource, where the two screenshot-based tables are now redundant and without any added value. --Jan Kameníček (talk) 15:13, 16 January 2025 (UTC)
- It might make sense to add these to field to Wikidata for storm events, assuming the event itself is noticeable, given that it is built for handling structured data. But that is a question for the Wikidata community. MarkLSteadman (talk) 04:09, 19 January 2025 (UTC)
- It seems to me that the claim that the page is a compilation was not disproved, and so I suggest closing the discussion and deleting the page per Wikisource:WWI#Compilations. --Jan Kameníček (talk) 21:06, 15 September 2025 (UTC)
- You’re out!voted 2–1—in fact, no one even !voted delete. -- —unsigned comment by TE(æ)A,ea. (talk) 19:27, 15 September 2025.
- Well, I am giving an argument that it is a compilation, which is explicitly prohibited here. Neither of the two who voted for keeping disproved this argument. --Jan Kameníček (talk) 09:06, 19 September 2025 (UTC)
- FWIW, Delete on Jan's premises. SnowyCinema (talk) 12:17, 19 September 2025 (UTC)
Comment As I see it, the principal objection is that two separate reports have been compiled into a single page. The logical solution is therefore to split the page into its component reports, and turn the current page into a disambiguation page. --EncycloPetey (talk) 21:43, 16 November 2025 (UTC)
The following discussion is closed and will soon be archived:
Deleted as compilation; Portal:Additional amendments to the United States Constitution created; many inappropriate links removed; some redirected to other targets
Judging by the note at the bottom, it looks like a compilation. -- Jan Kameníček (talk) 18:21, 21 October 2025 (UTC)
- Shouldn't each of the amendments have its own separate page ? Is it worth moving these ? Or start afresh ? -- Beardo (talk) 01:44, 28 October 2025 (UTC)
- I suggest enabling contributors to start afresh. Such important documents deserve to be scanbacked. --Jan Kameníček (talk) 20:40, 28 October 2025 (UTC)
Comment There are a lot of pages that link to this. Is there a Portal that can be used as a replacement link? --EncycloPetey (talk) 21:18, 30 October 2025 (UTC)
- Exactly how bad of an idea would it be to scan back all of these amendments on the same page from different sources? ToxicPea (talk) 19:56, 16 November 2025 (UTC)
- You mean deliberately create a compilation, which is disallowed by policy? Using a Portal makes more sense to me. --EncycloPetey (talk) 21:47, 16 November 2025 (UTC)
- Portal sounds good. It can be created even with red links only, with the hope that the red links will prompt somebody to create the individual amendments' pages. --Jan Kameníček (talk) 15:45, 14 January 2026 (UTC)
- I've created Portal:Additional amendments to the United States Constitution. Those 932 links are essentially
- various amendments and other early constitutional documents having a pseudo-toc hardcoded with links to bill of rights etc and this page—this pseudo-toc arguably should be plain gotten rid of.
- Damn Benchbot! It added links by default to what it imported *facepalm*, broken links by default, which were helpfully "fixed" to now point to this page. Of course, all of it hardcoded. I would be of the opinion of simply deleting all Benchbot's stuff as non-scan-backed copypastes, but that's a discussion for another time. At any rate, I think you'll agree to simply removing these links? There's no point linking "14th amendment" at the start of every court case ever.
- — Alien333 22:58, 22 February 2026 (UTC)
- Agree. Thanks for taking care of all this! --Jan Kameníček (talk) 23:02, 22 February 2026 (UTC)
- Geez. Looking up close has led me to realise how terrible the BenchBot imports are. I've opened Wikisource:Scriptorium#Getting rid of BenchBot imports? about it, and won't finish chasing absurd links in that mess. — Alien333 21:06, 25 February 2026 (UTC)
Comment The page was deleted, but there are dozens of links still pointing to the deleted page that have yet to be cleaned up. --EncycloPetey (talk) 18:02, 21 March 2026 (UTC)
- I have taken care of all links not from BenchBot. There is no point wasting volunteer time to take care of dumps like those. (See also #23k BenchBot pages.) — Alien333 18:10, 21 March 2026 (UTC)
Translations of works by Author:Olavo Bilac
All of these are user translations without a scan-backed source at the Portuguese Wikisource. The first three have pages there but not scan-backed ones, but "Delirium" has no page at all on the Portuguese Wikisource.
I don't dabble in translations too much but my understanding is that, since these were created after the 2013 grandfather rule was established, these are candidates for deletion. SnowyCinema (talk) 14:41, 14 November 2025 (UTC)
Neutral, though I will note that these translations were first uploaded in 2003 (including the last one) before being split to their current location; so the grandfather rule should apply —Beleg Tâl (talk) 15:55, 14 November 2025 (UTC)
- Ah, well then I guess that means there's no rationale available for deletion. I don't love that stuff like this is here, but if there's no formal rationale to use that would work then I guess I can just withdraw the nomination. SnowyCinema (talk) 16:04, 14 November 2025 (UTC)
- Though The Milky Way has two sonnets translated - I can only see one of them in the old history. -- Beardo (talk) 00:10, 18 November 2025 (UTC)
- It's from 2023, apparently. @Beleg Tâl: granted, these are grandfathered from the requirements on translations: but besides that, this completely fails to give any sort of source or original. Even if this weren't a translation, we'd delete it. Do you think it should be kept, and if so, why? — Alien333 20:48, 22 November 2025 (UTC)
- I honestly don't care either way to be honest —Beleg Âlt BT (talk) 20:23, 17 December 2025 (UTC)
Delete per Alien. --Jan Kameníček (talk) 19:42, 6 December 2025 (UTC)
- Surely the grandfather rule applies to all except the most recently added item ? -- Beardo (talk) 16:02, 20 December 2025 (UTC)
- (unclosing) @Beardo: Sorry, I thought the discussion was settled. I'll ask you the same question: regardless of translation status, this completely fails to give any source or original. What's the point keeping it? — Alien333 17:17, 20 December 2025 (UTC)
- Surely the sources are the texts in Portuguese ? Requiring those to be scan-backed is back-dating even further the policy that only became policy earlier this year. -- Beardo (talk) 20:47, 28 December 2025 (UTC)
- I am not talking about scan-backing and the WS:T requirements, but about simply having sources, because the Portuguese pages have no trace whatsoever of source or origin. (forgot to sign last month) — Alien333 22:32, 22 February 2026 (UTC)
Although the scanbacking was started in the Korean Wikisource, it seems to be stuck, see here. The work can be undeleted after/if the proofreading process of the original is finished. -- Jan Kameníček (talk) 23:07, 17 November 2025 (UTC)
- Why are you deleting the Hunminjeongeum??? Do you know you can't do that?? Who are you and why do you want to delete such an important work?? Makes no sense. NO! Resits (talk) 23:24, 17 November 2025 (UTC)
- Wow, such a bad attitude! You're deleting collaboration from users, and not collaborating on anything. You're sabotaging open work. If there is work to be done in proofreading of scans, why are you deleting ALL the translation in full page format?? You can't do that!! So, this page is all good, translated. And you want to delete it, and have bad scans not proofread? Makes no sense. Vote against here. Disrespect for users' collaborations! Resits (talk) 23:28, 17 November 2025 (UTC)
- My understanding is that policy says "present" and "complete" and avoids saying words such as "proofread" or "validated". If we want to require proofreading, understandably, it would be helpful to clarify that is the expectation by setting it out. Right now it is compliant with the terms of the policy AFAICT, MarkLSteadman (talk) 01:22, 18 November 2025 (UTC)
- Good point. "A scan supported original language work must be present on the appropriate language wiki". It looks as if it complies. -- Beardo (talk) 01:32, 18 November 2025 (UTC)
- "Complete" could certainly be taken to mean "validated" or "proofread", I expect that was the intent behind the wording, but especially between "proofread" and "validated" there is a significant difference as we know. I would support an update / clarification to include a quality statement in the policy. MarkLSteadman (talk) 01:58, 18 November 2025 (UTC)
- I interpret "where the original language version is complete at least as far as the English translation" that if the text in the original language has only been partially transcluded, then it should at least cover the part which is translated into English. I can't see what "at least as far as" could mean otherwise. -- Beardo (talk) 03:49, 18 November 2025 (UTC)
- I have always understood that it should be at least "proofread" because the status "not proofread" does not mean anything. But I agree that it would be better to have some quality statement included in the policy. --Jan Kameníček (talk) 12:58, 19 November 2025 (UTC)
- If "Complete" = "Done" = "Validated" i.e. gone through ALL stages of the proofreading process, the policy statement "where the original language version is complete" could be read as "where the original version is validated" as far as the English version (e.g. if the first 10 pages are validated and the next 10 pages are only proofread, the English translation can cover the first 10 pages but not the first 11), but that seems especially strict. MarkLSteadman (talk) 15:52, 19 November 2025 (UTC)
- "Completed" is used to mean different things in different places, so it really does need to be defined here.
- By the way, this proposal really seems to have rattled some cages ! (See the recent change which I undid). -- Beardo (talk) 20:18, 19 November 2025 (UTC)
Comment: as far as I'm aware we almost always take "complete" to mean "proofread" (or for a mainspace page "is a transclusion of a proofread index"). — Alien333 19:52, 6 December 2025 (UTC)
- Agree. --Jan Kameníček (talk) 19:54, 6 December 2025 (UTC)
- And also the other way: we never understand "complete" as "not proofread". --Jan Kameníček (talk) 19:56, 6 December 2025 (UTC)
- {{incomplete}} specifically refers to "are not available on Wikisource in any form, either as text content or page scans." Note "any form", so there is specifically no requirement on whether it was proofread against the scans, or even any quality marker for the text content. We specifically separate completeness from both scan-backing and even {{OCR-errors}}, which is listed as a category of errors separate from completeness. That may be how it is interpreted given how things have evolved, but it is not how it is actually written out in our templates and policies. MarkLSteadman (talk) 03:53, 7 December 2025 (UTC)
- The rule clearly says that the work must be scanbacked. So it is not completeness of the transcription in any form that is required, it is the completeness of the scanbacking process. While we may discuss whether "complete" means "proofread" or "validated" (and I agree it would be better to have it specified in the rule), it definitely does not mean "not proofread". --Jan Kameníček (talk) 13:14, 7 December 2025 (UTC)
- "Completeness of scan-backing process" is ambiguous is my point. Do we consider works that were match-and-split but still-to-be-proofread scan backed?
- Yes, with "scan-backed" meaning all subpages are transcluded from scans while "incomplete scan-backed" means only some subpages have been migrated, while others are still left in main untranscluded. E.g. The Wonderful Wizard of Oz would be considered a scan-backed work.
- Yes, with "scan-backed" meaning all pages of the work in main are transcluded and all pages of the work are transcluded. "Incomplete scan-backed" means that all pages in main are transcluded from scans but not all portions of the work are present yet. E.g. The Wonderful Wizard of Oz would be considered a scan-backed work.
- No, "completely scan-backed" means completing the scan-backing process to either proofread or done. The Wonderful Wizard of Oz would not be considered a "scan-backed" work.
- On here now we recently deleted The Indian Orphan as "redundant to a scan-backed copy" but Forget Me Not/1825/The Indian Orphan is not proofread and hence according to the proposed definition isn't "scan-backed". So is that message wrong? MarkLSteadman (talk) 15:03, 17 December 2025 (UTC)
- "scan-backed" and "scan-backed and complete" are different things. As far as I understand it, "scan-backed" is just your first point: all the text comes from a scan. "scan-backed and complete" implies that the scan-backing is reasonably complete (in which I'd place that "reasonably" somewhere near "at least proofread except for problematic or empty pages").
- (On the specific case of The Indian Orphan: that was a self-published extract from a 1836 reprint, claiming to be the 1824 work; from this perspective being actually based on a scan of the real 1824 work was already enough to make the other one redundant.) — Alien333 15:55, 20 December 2025 (UTC)
Translation that isn't scan-backed in the Norwegian Wikisource. Nighfidelity (talk) 21:16, 19 November 2025 (UTC)
- They are just not linked: https://no.wikisource.org/wiki/Urd/1907/Pianist_Sigvart_H%C3%B8gh-Nilsen --RAN (talk) 18:39, 27 November 2025 (UTC)
- @Richard Arthur Norton: the nows page is not scan-backed. — Alien333 15:49, 28 November 2025 (UTC)
@Alien333: Can you rephrase what "scan-backed" means in this context? The scans are housed at Commons: File:Sigvart Høgh-Nilsen biography in Lordag on July 27, 1907.png Are you requiring that Wikisource must house the scans, rather than Commons? Isn't this something requiring a fix, rather than deletion? --RAN (talk) 17:48, 28 November 2025 (UTC)
- It means that the page on noWS has not been proofread against the scan. This would need to be done at no:Indeks:Sigvart Høgh-Nilsen biography in Lordag on July 27, 1907.png by someone at noWS, before we can host the user translation here at enWS. —Beleg Tâl (talk) 22:36, 28 November 2025 (UTC)
- Why do we need an index page for a magazine article just 5 paragraphs? Aren't indexes for whole books? --RAN (talk) 06:19, 29 November 2025 (UTC)
- People should rather be adding whole works than single articles. I'm not saying you have to finish it all, just that when setting it up for work, instead of working on little parts here and there to be recomposed when there are enough of them, it'd be better to set up the index once and for all, so as to be done with it. (Of course, you are then free to only proofread and transclude a few pages of that.) (Well, that's what we'd do anyhow. But I suspect the NOWS people are probably not huge fans either of disconnected articles.) — Alien333 15:01, 29 November 2025 (UTC)
- This has been discussed multiple times. A magazine article or a newspaper article is a complete work. That is not a reason for deletion. We have over 500 The New York Times articles ranging from 1851 to 1929 and as far as I can tell, the closest to a complete issue is one single day in 1929 with an index, the rest are single articles with no index, because an index is not needed for a single article. --RAN (talk) 19:17, 3 December 2025 (UTC)
- An index is necessary for something to be scan-backed. I might be tolerant of some other way of linking the scan, but this page doesn't have any link to a scan to check it. That's the whole point of the scan-backed requirement, that anyone can check the text versus the scan.
- We permit single articles to be posted, but it's a lot easier when you load scans for the whole work as originally printed. Ideally, it would be nice to have a magazine or newspaper here. I'm more conflicted about newspapers; on one hand, it's unlikely we'll transcribe an entire newspaper, but on the other, so many newspaper clips are tiny texts that were never meant to stand alone; they feel like a much worse violation of our no-excerpt rule than full chapters of a lot of books.--Prosfilaes (talk) 05:38, 4 December 2025 (UTC)
- Note that many magazines come in volumes so we are talking about uploading 6 months or a year, etc. While it might be preferred to upload the whole issue it certainly isn't necessary. The two pages proofread here can easily be moved when the whole work is uploaded to replace it when someone wants to.
- Among other things, it means that now you need to go through and check all 500 pages for copyright status, which may not be easy, so to get this one article I would need to go through 6 months of Norwegian writers to see what their death dates are to determine whether it is suitable for Commons or not. MarkLSteadman (talk) 15:31, 4 December 2025 (UTC)
- Magazines were originally printed in issues, and uploading them as issues would be better than uploading one page. I've found that volumes are sometimes too large to upload. In this case, since this is on no.WS, which presumably only takes life+70, I don't think it would be unreasonable to upload just one page. OTOH, for works on en.WS, I'd rather an issue be uploaded to here for concerns of Commons copyright instead of an excerpt uploaded to save that checking.--Prosfilaes (talk) 01:00, 5 December 2025 (UTC)
- It is reasonable to tradeoff having clearly licensed content on commons serving both NO WS and EN WS to doing localized uploads, and I am not sure what NO WS's policies are around content that isn't PD in Norway / the EU anyways. It would certainly be reasonable on their part for their admins to not want to deal with files / content that put them at risk for infringement. Even going through copyright clearance and redaction for a whole issue might be a pain. MarkLSteadman (talk) 05:28, 5 December 2025 (UTC)
- Does that mean that English Wikisource cannot have user translations of works that are PD in the US but not in their home country ? Or what ? -- Beardo (talk) 14:06, 5 December 2025 (UTC)
- Say pp. 10-15 of a magazine aren't, but like here pp. 1-2 containing the article of interest for translation are in the PD in both countries. The point of discussion is whether to just upload the joint PD parts (pp. 1-2) or whether we need to upload the whole issue to provide scan-backing and create the index file. My main point is that it isn't a trivial ask of the contributor to do so and may cause complications. I personally don't see the value in having an additional 20 pp. of a Norwegian magazine lying around in the index file scan just for completeness in exchange for these additional complications. MarkLSteadman (talk) 15:27, 5 December 2025 (UTC)
- Separately, this causes issues for scientific journals as well; for example, I have held up scan-backing Layered Architecture for Quantum Computing because the scan is of just the article, not the issue, and I don't know what the policy actually is: https://journals.aps.org/prx/abstract/10.1103/PhysRevX.2.031007. I can't reassemble the articles (user compilation), and I can't upload the individual article if we have a whole-issues policy. MarkLSteadman (talk) 15:33, 5 December 2025 (UTC)
- Is anyone arguing against that? Yes, in that case just upload pages 1 and 2. But in general when uploading to enWS, not life+x, I'd rather see the whole magazine uploaded to enWS rather than one section uploaded to Commons.--Prosfilaes (talk) 22:20, 5 December 2025 (UTC)
- That's your choice, but I don't think we should mandate people commit copyright infringement in their own country if they prefer not to. I am fine with a discussion going, "hey I saw you uploaded only pages 5-7, we would prefer if you uploaded the whole issue" and they responding by saying "I'd rather not because it's still copyrighted where I am." MarkLSteadman (talk) 02:01, 6 December 2025 (UTC)
- Then ask someone else to upload it if it's available online.--Prosfilaes (talk) 03:11, 6 December 2025 (UTC)
- I believe www.wikisource.org will take works that are public domain in the US but not acceptable for copyright reasons on their language's Wikisource.--Prosfilaes (talk) 22:20, 5 December 2025 (UTC)
- Imagine requiring someone to load an entire issue of a newspaper, just to load one obituary, just because we want to have the obit to document a particular fact about that one person. If it can be read from start to finish, it is a complete work, whether a news article, a magazine article, a journal article, or an advertisement. We even have several works that are incomplete because parts of the complete document have been lost to history. --RAN (talk) 21:59, 5 December 2025 (UTC)
- Imagine that. We require people to upload the whole History of the United States (Beard) even if they just want, say, "PART IV. THE WEST AND JACKSONIAN DEMOCRACY", and you're complaining because you just want to upload a tiny snippet of one issue of a work. You only want to document a particular fact about that one person; perhaps Wikisource is not the place to do that. Just upload the obituary to Commons, perhaps.--Prosfilaes (talk) 22:39, 5 December 2025 (UTC)
- That is the strawman fallacy. My argument is about magazine articles and newspaper articles, and your argument is about a book. Books are usually read from cover to cover; magazine articles and newspaper articles are meant to be read individually, and are complete works. --RAN (talk) 02:59, 17 December 2025 (UTC)
- It's not a strawman; it's not really a logical argument at all, and neither is your "Imagine requiring...". You're complaining about having to cross the street to pick up your mail to a postal worker who walks 10 miles a day.
- Non-fiction books are often used in parts, often only the subjects or fields that the reader is interested in. Obituaries aren't meant to be read individually; people don't buy the newspaper for just one obituary. They read the obituary page at least, and usually newspaper readers read the headlines and other parts. Particularly "to document a particular fact about that one person" also applies to tiny parts of nonfiction books, and sometimes even tiny parts of fiction books.
- Again it wasn't a logical argument; it was a statement that arguing that you shouldn't have to upload so much since you just want "to document a particular fact about that one person" is actively going to irritate certain of us, and it's not something that Wikisource actively supports.--Prosfilaes (talk) 06:51, 17 December 2025 (UTC)
Dates back to early 2008 (thus interestingly, this work is probably one of our earliest uses of ProofreadPage!). We now use the template {{SIC}} (and similar templates) for noting where typos and errors in the work exist, and all the annotations I can see here are simply noting where the typos are within the work. So, is there a point to having this separate "annotated version" that I'm not seeing, beyond what {{SIC}} can now provide? SnowyCinema (talk) 16:32, 23 November 2025 (UTC)
- @SnowyCinema: Would you be willing to do the work of moving these footnotes to {{SIC}}s in the text? (We'd better do this first if we're going to delete the annotation.) — Alien333 19:35, 6 December 2025 (UTC)
- @Alien333: Yeah, I'll do it. Just give me a little time, though. (I'll announce here when I start on that.) SnowyCinema (talk) 19:42, 6 December 2025 (UTC)
23k BenchBot pages
All pages in Category:Uncategorized United States Supreme Court decision and their subpages (list) (plus redirects of course).
Context: BenchBot was a bot run by slaporte which in 2010-2011 imported 118201 mainspace pages' worth of US Supreme Court cases from http://bulk.resource.org/ (relevant archive here), a website maintained by https://public.resource.org/index.html.
BenchBot imports are a mostly unreviewed&unproofread mess of little wonders:
collapsed for sanity's sake
|
|---|
Things like [[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|[[Additional amendments to the United States Constitution#Amendment XV|Fifteenth Amendment]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
<nowiki>*</nowiki> <nowiki>*</nowiki> <nowiki>*</nowiki> <nowiki>*</nowiki> <nowiki>*</nowiki> 'Sec. 16. [...]
Population
County
Hamilton............ 1682,027 1, Kearney............. 1591,571 1, Finney.............. ---3,350 3, Gray................ ---2,415 1, Ford.............. 3,1225,308 5, Edwards........... 2,4093,600 3, Pawnee............ 5,3965,204 5, Barton........... 10,318 13,172 13, Rice.............. 9,292 14,451 14, Reno............. 12,826 27,079 29, Sedgwick......... 18,753 43,626 44, Sumner........... 20,812 30,271 25, Cowley........... 21,538 34,478 30,156
--------- --------- ---------
104,793 186,552 178,
................ --- which is clear proof of OCR) are legion, with also occasional links here and there to of Amendment & This amendment, capitals for what probably should be smallcaps, etc.
|
Someone took plaintext files and tried to make wiki pages out of them without supervision, which a) tends to be a bad idea and b) ended up quite badly.
These pages are a remnant of older times but are well below standards for formatting and, especially given the sheer volume, impossible to take care of properly. We delete OCR dumps regularly, and this isn't much better.
Roughly 50k of these 118k pages are non-redirect; I estimate only about 1k of these are worth keeping, but the 23k in question are very low-hanging fruit with approximately zero risk: this category was added only by the bot on page creation, and was always removed from pages that had even a tiny bit of content work done. These 23k are still just as the bot dumped them. Therefore I propose we delete them. — Alien333 22:00, 14 March 2026 (UTC)
Delete per my arguments at the last discussion. SnowyCinema (talk) 22:55, 14 March 2026 (UTC)
Delete per nom. --Jan Kameníček (talk) 17:35, 19 March 2026 (UTC)
Delete per nomination. Most of the formatted versions can be recreated if a scan is found. Nighfidelity (talk) 18:35, 19 March 2026 (UTC)
Delete for everything (including redirects) except the ~1k you have identified. I suspect the ~1k is also probably best deleted and started from scratch, but closer checking first seems warranted there. Xover (talk) 10:45, 4 April 2026 (UTC)
This is a user translation created after July 2013 without a scan-backed equivalent at the Chinese Wikisource. The Chinese version is at zh:捕蛇者說, but as it was not scan-backed, I am afraid this is not policy-compliant. SnowyCinema (talk) 17:03, 24 March 2026 (UTC)
- I should ping the creator: @Bousehje: as well as link to our policy page on user translations: Wikisource:Translations. SnowyCinema (talk) 17:05, 24 March 2026 (UTC)
- what does "scan-backed" mean? Bousehje (talk) 17:11, 24 March 2026 (UTC)
- well it was written in the 8th century so I doubt you could get a scan backed copy from the original publisher
- But it's been public domain for centuries Bousehje (talk) 20:03, 24 March 2026 (UTC)
- @Bousehje: What basically needs to be done is that a scan needs to be uploaded, probably to Wikimedia Commons, of an original Chinese-language version of this text. It then needs to be proofread at the Chinese Wikisource with the ProofreadPage system. This process is explained at Help:Proofread and similar sources (zh:Help:校对 at the Chinese Wikisource). Only then, after all of the pages are proofread and transcluded, can a user-generated English-language equivalent exist here at the English Wikisource.
- The original Chinese text does exist at the Chinese Wikisource but was not proofread with this process there, so that is not sufficient for the user translation provided to be hosted here. SnowyCinema (talk) 05:46, 25 March 2026 (UTC)
- Right, maybe there's a museum that has an original Chinese text of this public domain essay, but I doubt any museum would let someone scan it. Bousehje (talk) 15:07, 27 March 2026 (UTC)
- It looks like the tale is included in this scan c:File:SSID-13063005_增批足本古文觀止_卷90.pdf (starts on page 8 of the PDF). Tcr25 (talk) 16:13, 27 March 2026 (UTC)
- It doesn't have to be a scan of the 8th century original but a scan of a book edition which is in public domain in China. -- Beardo (talk) 17:13, 27 March 2026 (UTC)
User-generated translation of a Romanian declaration created today, which is linked to a text at the Romanian Wikisource. Per our user translation policy at Wikisource:Translations, the version at the Romanian Wikisource must be scan-backed, as this user translation was done after July 2013. SnowyCinema (talk) 15:36, 26 March 2026 (UTC)
- Propose: weak Keep, because it is better to have the content in English than not have it at all, especially given the current attention a reunion between Romania and Moldova has been getting in the media since President Maia Sandu's comments saying she would vote for union. The Russians are also spreading disinformation about the previous union. It is better to have the content available as a source for others to see than not to have it at all. Frank0051 (talk) 12:09, 27 March 2026 (UTC)
- @Frank0051: Will you be able to scan-back the original work at the Romanian Wikisource? Here's an example of a text that was scan-backed there: ro:Cerșetorii. SnowyCinema (talk) 12:14, 27 March 2026 (UTC)
"(full text)" sections of An Etymological Dictionary of the German Language
All of these should be deleted as redundant double transclusions. (I would have just speedied them, but since it seems like the project somewhat relies structurally on these pages, I thought I'd do a discussion instead.) SnowyCinema (talk) 16:36, 28 March 2026 (UTC)
- Is there a specific policy this violates? I'd generally find this sort of "all the words in the section" page more helpful to readers than the transclusion of a single word/definition to its own page, though I can see reasons for single-word transclusions too. Absent a policy violation, I'd !vote to keep. —Tcr25 (talk) 17:13, 28 March 2026 (UTC)
- @Tcr25: As far as I understand it, we can either have one or the other but we shouldn't have both. "Redundant—double transclusion" is in the list of speedy deletion rationales in the MediaWiki dropdown list. As far as my personal opinion on it, it should be deleted and this should be formal policy. I wasn't aware that it wasn't, but I agree that I can't find a specific policy this matches either. If it is kept on these grounds, I will bring this to a proposal next. SnowyCinema (talk) 17:55, 28 March 2026 (UTC)
- This particular version is a little complicated because these are annotated full text versions which by definition are not redundant with the unannotated "clean" versions. Is there a reason the "annotated" versions (e.g. An Etymological Dictionary of the German Language/Annotated/H (full text)) are nominated but the "unannotated/clean" ones (e.g. An Etymological Dictionary of the German Language/H (full text)) are not? MarkLSteadman (talk) 02:06, 5 April 2026 (UTC)
Keep Fully compliant with annotation policy: Text implemented with {{Annotation switch}}, clean version exists, "annotated" in the name per policy. If people don't think annotations should exist or that {{Annotation switch}} is the wrong approach, you should say that rather than hide it behind "double transcluded", which our annotation policy requires (i.e. if the claim here is that the user should copy and paste the text into a new index and edit from there). MarkLSteadman (talk) 03:45, 10 April 2026 (UTC)
- As an aside, being able to speedily delete someone's work without comment or discussion is a great power and should be used cautiously. A glance at the note on the first one saying "This annotated version expands the abbreviations in the original entry A (full text)." and at the name An Etymological Dictionary of the German Language/Annotated/A (full text), with "Annotated" right there, should have made this obvious. MarkLSteadman (talk) 03:52, 10 April 2026 (UTC)
Anna Mae Yu Lamentillo self-published speeches
The following discussion is closed and will soon be archived:
Deleted as self-published
These were published on Lamentillo's personal site. So, they seem to be self-published, and not in line with WS:WWI. SnowyCinema (talk) 01:32, 31 March 2026 (UTC)
- These were also published by One Young World, NightOwl AI and Build Initiative. Wonderwoman1991 (talk) 15:56, 31 March 2026 (UTC)
- Aren't those also Lamentillo's personal platforms? —Beleg Tâl (talk) 16:25, 31 March 2026 (UTC)
Delete per nom, these look a lot like self-promotion to me —Beleg Tâl (talk) 16:28, 31 March 2026 (UTC)
Delete. Also, this may be an extension of sockpuppetry on the English Wikipedia a few years ago to promote the author, @Amylamentillo. Nighfidelity (talk) 17:00, 31 March 2026 (UTC)
Alien333 20:05, 11 April 2026 (UTC)
The following discussion is closed and will soon be archived:
Deleted as redundant
Tagged for speedy deletion initially by Nighfidelity but for a case like this, I think it'd be best to send here. To be clear, I do agree with the deletion of this and vote Delete, but I think for extract cases it's best they be sorted out here in case of legitimate disagreement. SnowyCinema (talk) 06:03, 4 April 2026 (UTC)
- Indeed Delete - the text is also at Roughing It/Chapter L -- Beardo (talk) 15:04, 4 April 2026 (UTC)
Alien333 19:02, 11 April 2026 (UTC)
The following discussion is closed and will soon be archived:
Deleted as useless
Created solely to house the long inactive page Wikisource:Books. --EncycloPetey (talk) 13:11, 4 April 2026 (UTC)
Delete. Also, what should be done to Category:PediaPress books and its contents? Nighfidelity (talk) 16:38, 4 April 2026 (UTC)
- These should also just be deleted. They're non-functional and serve no purpose. Xover (talk) 07:59, 5 April 2026 (UTC)
Delete per nom. SnowyCinema (talk) 18:51, 4 April 2026 (UTC)
Keep I propose changing it to a redirect to Category:PediaPress books MarioeMary (talk) 21:37, 4 April 2026 (UTC)
Delete as redundant in name to Category:Books Beeswaxcandle (talk) 00:47, 5 April 2026 (UTC)
Keep the redirect may go to Category:Books MarioeMary (talk) 01:19, 5 April 2026 (UTC)
- (struck extra keep vote from same user) SnowyCinema (talk) 01:20, 5 April 2026 (UTC)
- Sorry 🙏 MarioeMary (talk) 01:33, 5 April 2026 (UTC)
- I propose a redirect to Category:PediaPress books or Category:Books MarioeMary (talk) 01:39, 5 April 2026 (UTC)
- @MarioeMary - what is the point of redirecting a category which is almost empty ? -- Beardo (talk) 18:27, 5 April 2026 (UTC)
- I suspect we will find that this is for a university course of some kind and that if the page creation doesn't remain in some form, then the user won't get credit for the task. Beeswaxcandle (talk) 01:48, 6 April 2026 (UTC)
Delete per nom. --Xover (talk) 07:58, 5 April 2026 (UTC)
Comment Maybe a bit of a tangent, but why do we even have Wikisource:Books? Why did we ever have it? What we offer at Wikisource is "books" almost by definition. Everything listed here is a book already—do we need "books" of...books??? A little bit bizarre to be honest. I wonder if we should just get rid of the whole thing. SnowyCinema (talk) 03:37, 6 April 2026 (UTC)
- People wanted to see their work in print, and some people prefer reading in paper format even now. The nomenclature around the service is simply because it started as a way to create books from arbitrary collections of Wikipedia articles, and expanding to Wikisource would seem an obvious next step. Add in that Wikisource at the start was a lot less discriminating about the texts we hosted, their provenance, and their organization. Creating a "book" that gathered pieces of text that were otherwise spread out over multiple pages on the site with barely a navbox to connect them (individual poems as top-level pages in mainspace, and a custom navbox or manual next/prev links letting you navigate between them) would be a way to recreate the properties that made books valuable in the first place. Over the years the community has tightened up how we select and organize our texts, making that need almost disappear, in parallel with PediaPress (and the technical bits implementing it) slowly dying and ebooks really taking off, which is why the artefacts seem a little absurd now. Xover (talk) 06:45, 6 April 2026 (UTC)
Alien333 20:02, 11 April 2026 (UTC)
This appears to be an excerpt, barely covering the abstract, with just Google Translate links to the ruWS transcription of the Russian original.
It is also, apparently, a user translation (the article hasn't been published anywhere in English that I can tell), although it seems plausible that the original author is also the translator. This text was the subject of a copyvio discussion in 2011 that ended with an OTRS-permission asserting GFDL licensing that I am not going to challenge, but I'll just note that the publisher claims copyright in the published version and the circumstances are suggestive of someone grasping for wider reach (i.e. self-promotion), which would tend to make that OTRS release dubious. Xover (talk) 17:05, 7 April 2026 (UTC)
Delete per nom. SnowyCinema (talk) 19:27, 7 April 2026 (UTC)
This is an excerpt of a webpage at whitehouse.gov, which is itself just a collection of excerpts. SnowyCinema (talk) 21:00, 7 April 2026 (UTC)
- Delete. If this is an excerpt, it can be completed; and the Web page is not a collection of excerpts, but a work in its own right. However, this should be deleted for the separate reason that it is a duplicate of a Web page, which can be archived through the usual means. TE(æ)A,ea. (talk) 01:02, 8 April 2026 (UTC)
Page:Jewish Encyclopedia Volume 9.pdf/1 (and others)
Mass creation of pages as not proofread without any text. TE(æ)A,ea. (talk) 19:16, 8 April 2026 (UTC)
Delete as when I asked @AramaicQueen about them, they stated: "I hope someone picks the pages up later and turns them into articles like in the main Jewish Encyclopedia page. I do not have enough time to do it myself, so I am just starting and creating the page." Nighfidelity (talk) 15:47, 9 April 2026 (UTC)
Comment The file has no text layer, so anyone wishing to work on this would need access to OCR tools. --EncycloPetey (talk) 16:41, 9 April 2026 (UTC)
Comment: Please note that pages 479-503 are already proofread; so at least those pages should not be deleted. Sije (talk) 19:33, 12 April 2026 (UTC)
Subsection of a portal page that hasn't been updated since 2010. The reason why I nominated it is because this page is the only one like it in Wikisource. Nighfidelity (talk) 20:09, 9 April 2026 (UTC)
Delete These are taken from en.wikipedia's DYK, right? That seems a bit odd and without any upkeep low value. —Tcr25 (talk) 21:25, 9 April 2026 (UTC)
Delete. If we're going to have a system like this, it should be consistently applied across many portals at least. But I don't even think I'd want this here even if that were the case. Portals are a place to list out works primarily, not list facts.
- More broadly, every time we try and report information in prose like Wikipedia does it always comes with problems we can't easily deal with at scale. If people disagree with the information presented, the information is out of date, the information is biased in presentation, etc., we just don't have enough community monitoring / mechanisms to deal with the problem. This kind of conflict actually happens occasionally in the Portal namespace and in {{header}} notes sections, and every time I've encountered disagreement about informational sentences I've just truncated them. We're best at what we do best, which is to list works and transcribe them. My philosophy is to try and stay as far away from reporting as possible for this reason. SnowyCinema (talk) 05:02, 10 April 2026 (UTC)
This category contains only Category:United States Supreme Court decisions by court/Courts of Appeals which contains categories for each US Court of Appeals. Since these clearly aren't Supreme Court decisions Category:United States Supreme Court decisions by court should probably be deleted and Category:United States Supreme Court decisions by court/Courts of Appeals should be moved to Category:Courts of Appeals decisions by court. ToxicPea (talk) 23:27, 9 April 2026 (UTC)
- @JoeSolo22 FYI. My impression is that this was meant to categorize SCOTUS decisions by the circuit they were appealed from, rather than cover the actual circuit decisions themselves. MarkLSteadman (talk) 03:35, 10 April 2026 (UTC)
- You'd be correct. I'd be alright if it was deleted though since it's not really being utilized. JoeSolo22 (talk) 03:48, 10 April 2026 (UTC)
Delete as unused. If someone is motivated to pick it up and properly classify things, then make yourself known and this can be kept; otherwise delete as unused and unnecessary. MarkLSteadman (talk) 03:57, 10 April 2026 (UTC)
- Actually Category:United States Supreme Court decisions by court/Courts of Appeals should just be deleted instead of being moved since Category:United States Courts of Appeals exists. ToxicPea (talk) 04:02, 10 April 2026 (UTC)
Not scan backed and redundant to British White Paper of Palestine 1939 ToxicPea (talk) 20:37, 11 April 2026 (UTC)
This work is not in English and is therefore out of scope for English Wikisource. There is a version on German Wikisource but the pages are transcribed here in German and not translated into English. ToxicPea (talk) 02:13, 12 April 2026 (UTC)
- I think that if the user were planning to do a translation, then the Index could stay, but I agree that the Pages don't belong. -- Beardo (talk) 02:51, 12 April 2026 (UTC)
- Pinging @Babbage: for comment. ToxicPea (talk) 03:00, 12 April 2026 (UTC)
Aus Der Bai Von Paranaguá and Index:Aus der Bai von Paranaguá.pdf and associated Pages
Another work by Julius Platzmann (see immediately above), but in this case, I cannot see that the work is on German Wikisource (let alone being scan-backed), and several pages in German have been transcribed and transcluded. I think that all needs to be deleted, as not complying with our policies. -- Beardo (talk) 02:55, 12 April 2026 (UTC)
This work was allegedly written in 1939, but I suspect it has never been published anywhere, which would mean it is not eligible to be hosted here. --Jan Kameníček (talk) 18:35, 14 April 2026 (UTC)
- Well, we have many unpublished letters transcribed here, and even guidance about transcribing manuscripts, so I don’t see why this wouldn’t count. The main issue is that we are lacking a scan, but that’s not a requirement for completed works. TE(æ)A,ea. (talk) 22:07, 14 April 2026 (UTC)
- I am afraid no works which have not been published anywhere can be hosted under current rules. The fact that a work has been published somewhere is a sign of its notability. --Jan Kameníček (talk) 14:13, 15 April 2026 (UTC)
I think this is the same edition as Index:Coverdale.pdf, which is a better scan and has further progress —Beleg Tâl (talk) 00:13, 15 April 2026 (UTC)
