Wikisource:Scriptorium


The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help.

The Administrators' noticeboard can be used where appropriate. Some announcements and newsletters are subscribed to Announcements.

Project members can often be found in the #wikisource IRC channel (webclient). For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 373 active users here.

Announcements

Alumni Oxonienses: the Members of the University of Oxford, 1715-1886 done

Joseph Foster's Alumni Oxonienses: the Members of the University of Oxford, 1715-1886 now has all its entries posted here. It is a standard reference work. The first part (1500-1714) is already digitised online and would be a possible bot project here.

The four index pages were set up in July 2010, and many editors have since worked on this project. I'd like to mention Billinghurst (talk · contribs) and Miraclepine (talk · contribs). The scans present particular difficulties, with varying systematic errors that substitute one digit for another (especially in the third volume).

Integration work is under way: on Author pages here, on enWP for referencing, and in the creation of Wikidata items. I'd particularly like to mention the Topicmatcher tool, Wikisource version, by Magnus Manske. That link is set up for Foster, but can be used for any work here organised in subpage style. Charles Matthews (talk) 16:33, 11 July 2022 (UTC)

@Charles Matthews: Thanks for the ping. I'll go do some work on the Wikidata items as soon as I can. I do want to note, though, that the Topicmatcher hasn't assigned preliminary matches to the recently created items. ミラP@Miraclepine 17:38, 11 July 2022 (UTC)
I can ask Magnus what happens about refreshing that list. Charles Matthews (talk) 17:48, 11 July 2022 (UTC)
@Miraclepine: Done - 8K more automatches. Charles Matthews (talk) 11:07, 18 July 2022 (UTC)

Proposals

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

See also Wikisource:Scan lab

Index:The_Strand_Magazine_(Volume_22).djvu

Despite the name, this appears to be a scan of Volume 23 of The Strand Magazine, rather than 22, so it would be good to get it replaced with the correct volume (for example, https://archive.org/download/TheStrandMagazineAnIllustratedMonthly/TheStrandMagazine1901bVol.XxiiJul-dec.pdf ).

Let me know if there is a better place to post this, as this is my first post that isn't a proofreading edit. Qq1122qq (talk) 14:42, 29 July 2022 (UTC)

At Wikisource:Scan lab. Mpaa (talk) 08:57, 7 August 2022 (UTC)

A Grammar of the Malayalim Language (Peet 1841)

This work is written for English speakers (see what it redirects to), and so the text should be moved here from the Malayalam Wikisource. Pinging @Vis M, @Manojk: who I think are active at that wiki. PseudoSkull (talk) 12:34, 4 August 2022 (UTC)

@PseudoSkull: Yes, agreed. Last year, I requested its import here at Wikisource:Administrators'_noticeboard#Import_help. These books have to be brought over with "Special:Import", which only an enWS admin can use.
One more book would meet the criteria (ml:മലയാള വ്യാകരണ ചോദ്യോത്തരം/Catechism of Malayalam grammar (Hermann Gundert 1897); see Wikisource:Requested_texts#Import_5_books_about_Malayalam_language)
All these 5 books are written for English speakers to learn Malayalam words, and the definitions are all in English. Please import them if you can. Thank you. Vis M (talk) 15:37, 4 August 2022 (UTC)

A dictionary of high and colloquial Malayalim and English

Ditto above. PseudoSkull (talk) 12:37, 4 August 2022 (UTC)

A Malayalam and English dictionary

Ditto above. PseudoSkull (talk) 12:49, 4 August 2022 (UTC)

Malayalam Selections

Ditto above. PseudoSkull (talk)

How to Think Like a Computer Scientist

This is currently transcribed at Wikibooks, but why? Isn't that less in scope for Wikibooks and more in scope for Wikisource? (Idk anything about Wikibooks, though, to be fair...) Also, I did find out that this book apparently exists in paperback form, so it could reasonably be transcribed. So I think we should transcribe the book here, which would get rid of the necessity of the interwiki redirect, and also (hopefully) get rid of the book at Wikibooks so we don't have it in two places. Pinging @Jusjih: who apparently has admin rights both here and there. PseudoSkull (talk) 12:44, 4 August 2022 (UTC)

Don't bother following the link shown on the Wikibooks page. Here's where the book can be found: https://open.umn.edu/opentextbooks/textbooks/80. There are links there for HTML and PDF versions. Rather than import from Wikibooks, why not just bring it here directly from the UMN website? Arbitan (talk) 15:51, 5 August 2022 (UTC)
Just to add, it is within the scope for Wikibooks, but how it'll be used there is different. Wikisource would store an exact representation of the book as it existed when it was published in 2002. Wikibooks can use the same book as the basis for their own version, which can evolve over time with user edits. (The license of the book allows for that.) For example, there's currently a merge proposal there where they would combine features of two different editions. Arbitan (talk) 16:13, 5 August 2022 (UTC)

Portal:Devon

Something's wrong with this page. The header template has the class and subclass1 (IC), but the page renders with a ? where the class would go, and it's included in Category:Unclassified portals. I can't figure out why it isn't working. Arbitan (talk) 04:19, 7 August 2022 (UTC)

@Arbitan: Fixed. Experimenting with the portal header, I found that there was some problem with the spaces; I guess there must have been some invisible characters or something of that kind. --Jan Kameníček (talk) 10:57, 8 August 2022 (UTC)
Thanks! --Arbitan (talk) 11:13, 8 August 2022 (UTC)

Index:Bismarck and the foundation of the German empire (IA bismarckfoundati00head).pdf

There are two images missing from this scan: PRINCESS BISMARCK and EMPEROR WILLIAM I., which should come after pages 88 and 162 respectively. I don’t know if there’s another copy of this same edition elsewhere, but these images are present in other editions. Can they be added? Ciridae (talk) 07:18, 8 August 2022 (UTC)

Here's another copy that seems to have the images: https://archive.org/details/bismarckfoundati0000jame --Arbitan (talk) 09:57, 8 August 2022 (UTC)

Other discussions

Policy on substantially empty works

[This is imported from WS:PD, where it applies to multiple current proposals, and several other works].

We have quite a few cases of works that are "collective" or "encyclopaedic" in that they comprise many standalone articles of individual value, which are basically just "shell pages", with no substantial content of any sort, not even imported scans or Index pages. For example, and this isn't intended to make any statement about these specific works, they're just examples and they may well get some work done soon during their respective WS:PD discussions:

Based on the usual rate of editing for things like that, unless dragged up into a process like WS:PD, they'll remain that way a very, very long time. I think perhaps there might be a case to host a mainspace page for this work, even though there is zero, or almost zero, actual content. Do we want:

  • Mainspace pages where there is a tiny bit of information like header notes, scan links and maybe detective work on the talk page (not in this case). This provides a place for people to incrementally add content. It also gives "false positive" blue links, since there is actually no "real" content from the work itself, or
  • Do not have a mainspace page until there's some content. Only host this in terms of author/portal scan links, much like we do for something like a novel.

Personally, I lean (gently) towards #2, but with a fairly low bar for how much content is needed. Say, Indexes, basic templates, a title page and one example article. Ideally, a completed TOC if practical, especially for periodical volumes/numbers. It is fair to not wish to transcribe entire volumes of these works, it is fair to not want to import dozens of scans when you only wanted one, it is fair to only want an article or two, but it's not fair, IMO, to expect the first person who wants to add an article to have to do all the groundwork themselves, despite having been lured in with a blue link. That onus feels more like it should be on the person creating the top-level page in the first place.

I do see some value in periodical top pages with decent lists of volumes and scans where known, because these are often tricky and fiddly to compile from Google books/IA/Hathi, so it's not useless work, even if there are no imported scans (though imported is better than not).

We currently have a large handful of collective works listed for deletion right now in various levels of "no real content", and, furthermore, every single periodical that gets added can fall into this situation unless the person who adds, so I think we could have a think about what we really want to see here. Inductiveloadtalk/contribs 15:43, 3 July 2020 (UTC)

  • I believe that, if there is no scan as an Index: page, the main-namespace page should not exist unless it is being actively completed or is already mostly completed. A few pages (of the volume itself) are not very helpful, and are entirely useless if there is no scan given. TE(æ)A,ea. (talk) 15:59, 3 July 2020 (UTC).
  • I think such preparatory information would ideally be on more centralized WikiProject pages (for the broad subject), both for clarity and to assist in keeping different efforts consistent -- but that it certainly should be retained as visible to non-admins. I think that the red vs blue link issue is minor (but not totally negligible) and outweighed by the disadvantages of hiding the history of previous efforts. I strongly encourage redirecting such pages to appropriate WikiProject pages (after copying over the details there). JesseW (talk) 18:11, 3 July 2020 (UTC)
  • @JesseW: I agree that history shouldn't be deleted, but I think we should approach this in terms of what we want to see from these works, rather than what to do with the handful of examples at PD. There are hundreds of periodicals we could have but don't, and this applies to those as well. If we can come to a conclusion about what is and isn't wanted, we can make all the deletion requested works conform to that easily enough. Inductiveloadtalk/contribs 20:55, 3 July 2020 (UTC)
  • I think these pages are necessary to list index pages and external scans of multi-volume works (such as encyclopaedias and periodicals) especially if they are wholly or partly anonymous or have many authors or are simply large. I think it makes no difference whether such pages are in the mainspace, the portal space or the project space (except that it is harder to find pages outside the mainspace). The point is that these works often have so many volumes (often dozens or hundreds) that they must have their own page, and cannot be merged into a larger portal or wikiproject. If the community starts insisting on index pages, what will happen is the rapid upload of a large number of scans for the periodicals that already have their own page. Likewise if the community insists on transclusion. I also think it is reasonable to have a contents page in the mainspace, as it allows transclusion of articles. Most importantly, new restrictions should not immediately apply to existing pages that were created before the introduction of the restrictions. This is necessary to prevent a bottleneck. James500 (talk) 23:55, 3 July 2020 (UTC)
move the works to a maintenance category, and i will work them; delete them and i will not: i find your sword of Damocles demotivating. Slowking4Rama's revenge 01:55, 5 July 2020 (UTC)
@User:Slowking4: I am not proposing a sword of Damocles. I agree that the imposition of deadlines is counter-productive. I do not support the deletion of any of these pages. I would prefer to see them improved. James500 (talk) 04:38, 5 July 2020 (UTC)
TEA is on his usual deletion spree. not a fan. will not be finding scans to save texts, any more. he can do it. Slowking4Rama's revenge 00:15, 6 July 2020 (UTC)
The entire point of moving this here, and not staying at WS:PD, is to decouple from the emotions that get stirred up in a deletion discussion. Let's keep deletion out of this. If we come up with some idea of what we do and don't want, then we can go back to WS:PD and decide what to do. I imagine that all that will be needed will be a fairly limited amount of housework to bring those works up to some standard that we can decide on here, and all the collective works there will be easy keeps. Hopefully with some kind of consensus that we can point at to outline a minimum viable product for such works going forward. There are hundreds and thousands of dictionaries, encyclopedias, periodicals and newspapers that we could/will, quite reasonably, have only snippets of. How do we want to present them? What, exactly, is the minimum threshold? Let's head all those future deletion proposals off at the pass, because deletion proposals often cause friction. Inductiveloadtalk/contribs 00:47, 6 July 2020 (UTC)
and yet deletion is the default method to "motivate" quality improvement. i reject your assertion that "emotions get stirred in a deletion discussion", rather, anger is a valid response to a repeated broken process being kicked down on the volunteers. it is unclear that a minimum threshold is necessary, rather a functional quality improvement process is. until we have one, you should expect to see this periodic stirring of emotions, as the non-leaders act out. Slowking4Rama's revenge 11:53, 9 July 2020 (UTC)
@Slowking4: Thank you for presenting this opinion, and I'm sorry if I have not made myself clear. We do need to figure out how to avoid a de-facto process of using WS:PD as an ill-tempered ad-hoc venue for "forcing" improvements on people who have somehow managed to generate works that are so in need of improvement that another user has nominated them for deletion. Please also consider looking at #Re-purpose_WikiProject_OCR_to_WikiProject_Scans for an idea to have a "functional quality improvement process" to which such works could be referred upon discovery rather than kicking them straight to WS:PD. If you have other ideas or you have previously suggested something similar to address these frustrations, you could detail them there. Personally, I think we should always prefer improvement over deletion. Exactly what the remediation is (refer to a putative WP:Scans, WS:Scriptorium/Help, directly WS:PD as now, or something else) is not what this thread is for. This thread is for discussing what, if anything, should be the tipping point for deeming a page "lacking" and doing something about it, whatever "something" is. I don't think I can be much clearer that this is not about deletion. If we also have a better venue for improvements, then that's even better.
For example, my personal feeling and !vote on A Critical Dictionary of English Literature is "keep and improve", despite it lacking scans or even links to scans, having only one article and no other content, not even a title page: in short, failing almost every criterion suggested so far in this thread. The only thing it does have is good text quality of the one entry. I personally do not think this work should be deleted, but I do think it should be improved in specific ways. The first half of that sentence is not the focus of this discussion, the second half is. Inductiveloadtalk/contribs 14:18, 9 July 2020 (UTC)
deletion threat has been an habitual method of communicating by admins since the beginning of the project. and text dumps have been habitual following in the gutenberg example. culture change and process change would be required to change those behaviors. we could make it easier to start scan backed works, but the wishlist was not supported. Slowking4Rama's revenge 21:00, 14 July 2020 (UTC)

I don't think this needs to be much of an issue going forward -- we all agree that it's OK to create Index pages for scans, even if none of the Pages have been transcribed yet; so the only case where this would come up is recording research where no scan has yet been identified as suitable to be uploaded. And for that, I still think a WikiProject page is the right location, not mainspace. (Or, if you must, your userpage.) JesseW (talk) 00:59, 6 July 2020 (UTC) I realized I may not have been clear enough here -- in my view, the ideal process goes like this:

  1. Decide on a work you are interested in (in this case, a periodical/encyclopedic one) -- don't record that anywhere on-wiki (except maybe your user page)
  2. Find and upload (to Commons) a scan of one part/issue/etc of the work.
  3. Create a ProofreadPage-managed page in the Index: namespace for the scan. (You can stop after this point, without worry that your work will later be discarded.)
  4. EITHER
    1. Put further research (on other editions, context, possible wikification, etc.) on that Index_talk page.
    2. Proofread a complete part of the scan (an article from the magazine issue, a chapter from the book, an entry from an encyclopedia, etc.) and transclude it to the mainspace (and create necessary parent pages), and put the further research on the Talk: page of the parent mainspace entry.

If you can't find any scan, and don't want to leave your working notes on your user page, put them on a relevant WikiProject's page.

If you come across such research done by others and misplaced, follow the above process to relocate it to an appropriate place, then redirect the page where you found it to the new location. That's my proposal. JesseW (talk) 01:08, 6 July 2020 (UTC)

@JesseW: It's not clear to me in your above whether when you use the term "index" you refer to a ProofreadPage-managed page in the Index: namespace, or a general wikipage in the main namespace on which an index-like structure (and/or a ToC, or similar) is manually created. Could you clarify? --Xover (talk) 05:14, 6 July 2020 (UTC)
I meant the namespace. Clarified now. JesseW (talk) 05:17, 6 July 2020 (UTC)
  • Hoo-boy. Y'all sure know how to pick the difficult issues…
    My general stance is that: 1) scans and Index: (and Page:) namespace pages have no particular completion criteria to meet to merit inclusion, and can stay in whatever state indefinitely (there may be other reasons to get rid of them, but not this); and 2) the default for mainspace is that only scan-backed complete and finished works that meet a minimum standard for quality should exist there.
    That general stance must be nuanced in two main ways: 1) there must be some kind of grandfather clause for pre-existing pages; and 2) there must exist exceptions for certain kinds of works that meet certain criteria. I won't touch on the grandfather clause here much, except to say I'm generally in favour of making it minimal, maybe something like "No active effort to get rid of older works, but if they're brought to PD for other reasons they're fair game". The design of a grandfather clause for this is a whole separate discussion, and an intelligent one requires analysis of existing pages that would be affected by it. It is always preferable to migrate pages to a modern standard, so a grandfather clause is by definition a second choice option.
    Now, to the meat of the matter: the exceptions…
    We have a clear policy to start from: no excerpts. Works should either be complete as published, or they should not be in mainspace. But quite apart from the historical practices that modify this (which are somewhat subjective and inconsistent, so I'll ignore them for now), there are some fairly obvious cases that suggest a need for more nuance than a simple bright-line rule alone provides. The major ones that come to mind are: 1) massive never-completed projects like EB1911 or the New York Times (EB because it's big; NYT because new PD issues are added every year); 2) compilations or collections of stand-alone works with plausible claim to independent notability.
    For encyclopedias and encyclopedia-like things, we have to accept some subsets due to sheer scale of work. But when that is the grounds for exception, there needs to be some minimum level of completion. I'm not sure I can come up with a specific number of pages/entries or percentage, but it needs to be more than just a single entry (and, obviously, only complete entries). For this kind of exception to apply, I think it needs to be a requirement that the framing structure for it is complete: that is, the mainspace page should give a complete overview of the relevant work even if most of it is redlinks. That includes title pages and other prolegomena when relevant. For a periodical like the NYT, that means complete lists of issues with dates and other such relevant information (e.g. name changes etc.). For preference, these kinds of things should be in Portal: namespace or on a WikiProject page until actually complete, but that will not always be practical (EB1911 and NYT are examples of this). Mainspace or Portal:-space should never contain external links (i.e. to scans) or links to Index: or Page: space (except the implied link of transclusion and the "Source" tab in the MW UI provided by ProofreadPage).
    For exception claimed under independent notability there are a couple of distinct variants.
    Newspaper or magazine articles need to have a certain level of substance in addition to a specific identifiable byline (possibly anonymous or pseudonymous, and possibly identified after the fact by some other source, such as the Letters of Junius) in order to qualify. It is not enough to ipso facto be a newspaper article, a magazine article, a poem, or an encyclopedia entry. On the one hand we have things like dictionaries and thesauri, where an entry could be as little as two words. Or a one-sentence notice without byline in a newspaper. Or two rhymed lines (technically a poem) within a 1000-page scholarly monograph.
    To merit this exception it should be reasonable to argue that the "work" in question should exist as a stand-alone mainspace page (not that we generally want that; but as a test for this exception, it should be reasonable to make such an argument). This would clearly apply to moderately long entries in the EB1911 written by a known author that has their own Wikipedia article. It would apply to short stories or novella-length serialisations in literary magazines by authors that have later become famous (or "are still …"). It would apply to various longer-form journalistic material from identifiable journalists (again, rule of thumb is notable enough for enWP article), including things in magazines that have similar properties. For most periodicals the most relevant atomic (indivisible) part is the issue not the entry or article, but with some commonsense exceptions.
    It would, generally, not apply to things that are works by a single author, like a scholarly monograph that just happens to be arranged in "entries" rather than chapters. It would not apply to things that are essentially lists or tables of data. It would not apply to short entries in something encyclopedia-like or entries that are not by an identifiable author. The OED for example, iirc, is a collective work where entries are by multiple not individually identifiable authors (and each entry is mostly very short too); only the overall editor is usually cited.
    For works claiming this exception too the framing structure should be complete, even if most of it are redlinks. The same general rules about Portal:/WikiProject and no external or Index:-space links apply. An exception would be for periodicals where new issues enter the public domain every year; and we should generally avoid including even redlinks for the non-PD issues here (but may allow them in a WikiProject page). For non-periodical works in multiple volumes where some volumes were published after the PD cutoff, including listings for the non-PD volumes (but not links to scans; those are a copyvio issue) is ok.
    Poems, short stories, and novellas are a special class of works here. A lot of these were first published in a magazine (possibly serialized), and a lot of them exist as multiple editions in substantially the same form. Some exist in multiple versions. These should all primarily exist the same way as chapters as part of their various containing works; but there are some cases where we might want to have, for example, a series of connected pages of the poems of Emily Dickinson. I am significantly ambivalent about this practice, as it amounts to making our own "edition" or "collection" of her poems (in violation of several of our other policies), but I acknowledge that it is an established practice and it is something that has definite value to our readers. It may be that it is actually a practice that should be governed by its own dedicated policy rather than be attempted to be handled within these other general policies.
    For the sake of example; applying this to the works Inductiveload listed at the start of this thread would shake out something like this:
    Auction Prices of Books—This work appears to have no sensible subdivisions and is in any case by a single author. I see no obvious reason to grant this work an exception, except under sheer volume of work and even there I would want to see both a substantial proportion completed and some kind of ongoing effort towards completion (no particular time frame, but definitely not infinite and definitely not as an effectively abandoned project). In a deletion discussion I would very likely vote to delete the mainspace pages here (but, as nearly always, to keep the Index: and Page: namespace artifacts). I don't see this as a reasonable candidate for a Portal:, nor really a good fit for a WikiProject (though I probably wouldn't object to a WikiProject if someone really wanted one).
    Central Law Journal/Volume 1—A single volume is too little, so I would want to see a complete structure for the entire Central Law Journal, with level of detail for each volume similar to the one existing volume. Each article in the journal can be individually considered for a stand-alone work exception; but for the collection I would want to see at minimum a full issue finished to justify having the mainspace structure, and preferably multiple issues (in a deletion discussion I might insist on multiple issues). Index: and Page:-space artefacts can, of course, stay. A Portal: might make sense for selections from the journal, of articles that meet the standalone work exception. A WikiProject to coordinate work and track links to scans etc. might be a decent fit here, if someone wanted that. As it currently stands I would probably vote delete for the mainspace artefacts (with option to move whatever content has reuse value to a non-mainspace page for preservation; and undeleting if someone wants to work on something is a low bar).
    A Critical Dictionary of English Literature—The top level mainspace page has near-zero value, existing only to link to the single transcribed entry. For a credible claim to exception to exist it would need to be a complete framework for the work as a whole, and significantly more than a single entry must be complete. I would probably also want to see ongoing work, unless a substantial percentage of the entries were complete. The single finished entry is eligible to claim a standalone work exception, but I think it probably would not meet my bar for that (I might be wrong; and the rest of the community might judge it differently). In a deletion discussion I would probably vote to delete all the mainspace artifacts here (as always keeping Index:/Page: stuff) but with a definite possibility that I might be persuaded on the one completed entry (an absolute requirement for convincing me would be to scan-back it: as a separate issue, my tolerance for grandfathering of non-scan-backed works is small, and effectively zero for new/non-grandfathered works).
    Bradshaw's Monthly Railway Guide—Would need a full framework and a number of individual issues finished to merit a mainspace page. I see no credible subdivisions for a standalone work exception, but might be persuaded otherwise if, say, one of the train tables was used as a (reliable primary) source in a Wikipedia article (implying some sort of notability beyond just being raw data). In a deletion discussion I would probably vote to delete all mainspace artifacts here. If anyone made the argument, I would entertain the notion that there is value in treating train tables like poems, and hosting a series of train tables like we do Dickinson's poems; but that would require a substantial number of them completed.
    For everything above my stance is nuanced by a willingness to accept temporary exceptions for things that are actively being worked: active being operative, but with no particular deadline to complete the work. We have differing amounts of time available, and some works are so labour-intensive or tedious to do, that my personal threshold for "active" is a pretty low bar to clear. If it's months and years between every time you dip in and do a bit I might start to get antsy, but days or weeks probably won't faze me. And that the projected time to completion is very long at that pace is not particularly a problem so long as it is not infinite. Within those parameters I would always tend to err on the side of letting contributors just get on with it in peace, regardless of any of the policy-like rules sketched above.
    I also want to emphasise that I think this is a very difficult issue to deal with. There are a lot of competing concerns, and a lot of grey areas that will likely take individual discussions to resolve. My balance point on this issue is partly formed by a broader concern about our overall quality (we have waay too many works of plain sub-par quality, and too many not up to modern standards) and a hope that by preventing the creation of these kinds of works (rather than deleting them after creation) we will be able to retain the good and desirable exceptions without dragging down quality, and without the traumatic and stressful events that deletions and proposed deletion discussions are.
    And for that very reason I am grateful this issue was brought up here for discussion, and I hope we can end up with some clear guidance, possibly in the form of a policy page, going forward. And in any case, since it will create de facto policy, this is a discussion that needs to stay open for a good long while (there are several community members that have not yet commented whose opinion I would wish to hear before closing this), and depending on how well we manage to structure the consensus, may also require a formal vote (up in the #Proposals section). --Xover (talk) 09:03, 6 July 2020 (UTC)
  • Oppose. It is becoming clear that a policy on incomplete works in the mainspace is going to place enormous pressure on individual editors. I think it would be more effective to start a wikiproject devoted to scan-backing works that lack scans and so on. James500 (talk) 12:14, 6 July 2020 (UTC)
    • @James500: FYI, this thread was made in order to provide an exception to the current policy of "no excerpts". A literal reading of the policy as it stands has a plausible chance of coming down delete on the mainspace pages over at WS:PD. This thread is a chance to come up with a better way to support such partial collective works. That we have several substantially incomplete and abandoned collective works lolling around in mainspace is actually the result of laxity in respect to stated policy (not to say I think it's a bad thing). The deletion proposals, whatever you may think of them, are actually not in contradiction to policy. That said, as always, there is scope to adjust policy. Which is what this is.
    • Now, in terms of a WikiProject to scan back works, I think that is a good idea. See #Re-purpose_WikiProject_OCR_to_WikiProject_Scans above, which proposed to reboot Wikiproject OCR as a scan-backing Wikiproject. Inductiveloadtalk/contribs 14:40, 6 July 2020 (UTC)Reply[reply]
      • The policy says "When an entire work is available as a djvu file on commons and an Index page is created here, works are considered in process not excerpts." A literal reading of that policy is that no scan-backed work is an excerpt (it is expected to be completed eventually). Further the policy refers to "Random or selected sections of a larger work". A literal reading of that expression is that it does not include lists of scans, or auxiliary content tables, as they are not "sections" (they are not part of the work), and that not every incomplete portion of a work is either "random or selected" (which would not include starting from the beginning and getting as far as you can, with intent to finish later). I could probably argue that an encyclopedia article or periodical article is a complete work. James500 (talk) 15:16, 6 July 2020 (UTC)
  • Nice wall of text, Xover (and I say that with great respect!) -- it generally makes sense and sounds good to me. As another hopefully illustrative example, take The Works of Voltaire, which I've been digging thru lately. I think this would very much satisfy your criteria as a large work, with sufficient scaffolding to justify the mainspace pages that exist for it. I would love to hear others' thoughts on that. JesseW (talk) 16:07, 6 July 2020 (UTC)
    @JesseW: Yeah, apologies for the length. Brevity is just not my strong suit.
    The Works of Voltaire probably qualifies on sheer scale of work, yes. I don't think the current wikipage at The Works of Voltaire is quite it though: as it currently stands it is more WikiProject than something that should sit in mainspace (its contents are for Wikisource contributors, to organise our effort, not our readers, who want to read finished transcriptions). It also mixes a work page with a versions page in a confusing way. So I would probably say… Move the current page to Wikisource:WikiProject Voltaire; create a new The Works of Voltaire as a pure versions page, linking to…; The Works of Voltaire (1906), that is set up as a work page with the cover and title (and other relevant front matter) of the first volume, and an AuxTOC (and possibly also the {{Works of Voltaire}} volume navigation template). I don't know how tightly coupled the volumes of this edition are (does the first volume have a common ToC or index of works for all the volumes?), so some flexibility on format may be needed to make sense. But as a base rule of thumb it should start from a regular works page and deviate only as needed to accommodate this work (mainly the size is different).
    In any case… With a volume or two completed (they're only ~350 pages each) I'd be perfectly happy having something like that sitting around. With less than that I'd possibly be a bit more iffy, but it's hard to put any kind of hard limit on that. And with somebody actively working on it I'd be in no hurry whatsoever regardless of current level of completion.
    PS. I'm pretty sure a large proportion of the contents of these volumes are works that would qualify under "standalone works" that could exist independently in mainspace, regardless of what's done with the The Works of Voltaire page. Even his individual poems and essays can presumably make a credible claim here (because it's Voltaire; less famous authors would have a higher bar). Better as part of the edition, but also acceptable on their own. --Xover (talk) 16:56, 6 July 2020 (UTC)Reply[reply]
  • @JesseW: I personally take no issue with this page's existence (actually I think it's a nice work and a good way to allow an important author's works to be slotted in piece-by-piece). I have some general comments which overlap with this thread (written before Xover's reply, so pardon overlap):
    • First off, I differ with Xover in terms of the scan links: I think they're better than nothing, and I don't see much value in duplicating the volume list onto an auxiliary page just to add scan links. However, I can sympathise with the sentiment that our mainspace shouldn't direct users off-wiki (or at least off-WMF). But if we don't have the scans, and that's what the user wants, they're leaving anyway. Real answer: import moar scans!
    • No scan links are necessary where the volume exists in mainspace and is scan-backed (e.g. v3)
    • Ext scan links should only be used when there is no Index page or imported scan. Use {{small scan link}} or {{Commons link}} when possible (e.g. v2)
    • The first volume list could probably be in an AuxTOC to mark it out as WS-generated content.
    • The "Other editions" section belongs on an auxiliary namespace page (Talk, Portal or Wikisource). I suggest the Talk page is best in this case. Inductiveloadtalk/contribs 17:35, 6 July 2020 (UTC)Reply[reply]
  • @Xover: I am in agreement with the majority of what you say. Particularly, I think a framework around any collective work (be it a single-volume biographical dictionary or a 400-issue literary review spanning 80 years) is the critical prerequisite, plus at least some scans, the more the merrier. Where I think I differ:
    • I am inclined to be a bit more relaxed in terms of how much of a work we need. As long as a single article exists, it's not "trivial" (e.g. only a short advert or some incidental text like a "note to correspondents", as opposed to an actual article), it's well-formatted and scan-backed, and a complete framework exists, including front matter and a TOC, such that it is easy for anyone to slot in new pieces, I'd be fairly happy. Lots of periodicals have all sorts of tricky bits like tables of stocks or weather tables and writing into policy that those must be proofread in order to get the "real" articles into mainspace would be a chilling effect, in my opinion. If you allowed an exception, it would be verbose and tricky to capture the spirit without saying "unless, like, it's totally, like, hard, man".
    • I am not dead against scan links in the mainspace at the top level, when such a top-level page exists. See my comments on Voltaire above. I am against them where they could sensibly be on an Author page and they are the only mainspace content.
    • I am ambivalent on the presence of, e.g., disjointed train timetables. It's not my thing to have a smattering of random timetables, but as long as they're individually presented nicely, it's not too offensive to my sensibilities. I might question the sanity of someone who loves doing tables that much, but whatever floats the boats! Also, I think that this might circle back to "good for export" - a mark which certainly would require completed issues or volumes. If you want to get that box ticked, you have to do it all.
    • Re the "notability" aspect of individual articles, I'm not really bothered by that, as I don't think we'll see a flood of total dross because few people really want to take the time to transcribe 1867 articles about cats in a tree from the Nowhere, Arizona Daily Reporter, and, actually I think some of the "dross" can be quite interesting in a slice-of-life kind of a way (always assuming well-formed and scan-backed). And the real dross is usually so bad (no scans, raw OCR, etc) that it can be dealt with outside of this topic. I think part of the value of WS is the tiny, weird and wonderful, not just in blockbusters like War and Peace and Pulitzers. I think I might like to see more of our articles strung together thematically via Portals, but that's another day's issue. Inductiveloadtalk/contribs 17:35, 6 July 2020 (UTC)
      • @Inductiveload: We appear to be mostly in agreement. But… instead of me dropping another wall of text on the remaining points of disagreement, maybe that means we're in a position to try to hash out a draft guidance / policy type page with the rough framework? Then we could go at the remaining issues point by point. Because I think I'm in with a decent chance to persuade you to my point of view on at least some of them, but this thread is fast getting unwieldy (mostly my fault). It would also probably be easier for the community to relate to now, and much easier to lean on in the future. --Xover (talk) 18:31, 6 July 2020 (UTC)Reply[reply]
        • @Xover: If there are no more comments forthcoming after a couple of days, I think that makes sense. I don't want to railroad it: considering we have at least one !vote for "do nothing", I'd like to see if there are any other substantially different opinions floating about. Inductiveloadtalk/contribs 17:41, 7 July 2020 (UTC)Reply[reply]

The quantity of text here has grown far faster than my ability to absorb it, so rather than continue to put it off, here's my position: I don't see any problem with transcriptions that are scan-backed, even if the transcription only covers a small fraction of the entire scan. If Sally chooses (say) to transcribe a favorite story, that happened to be published in an issue of Harper's back in the 1890s, and goes to the trouble of uploading the full issue, but only creates pages for the one story that interests her, I think that's great. It doesn't matter to me whether she intends to work on the other pages or not. If it's not scan-backed, but it's fairly high quality, I am personally willing to do some work trying to locate a scan and match it up to the text; I'd rather we take that approach, than deletion, though of course deletion is the better option in some cases where the scan is very hard to come by.

If all this has been said above, or if I've misunderstood the topic, my apologies. Please take this comment or leave it, as appropriate. -Pete (talk) 02:00, 8 July 2020 (UTC)

Apologies, I see I had missed the point.

I disagree with Xover's statement that a top-level page for a publication, with a link only to a single article within the publication, has "near-zero value." Such a page can serve an important function linking content together in ways that help the reader (and search engines) find the content they're looking for, or understand the context around it. For instance, A Critical Dictionary of English Literature is linked from the relevant Wikidata entry. The banner on the Wikisource page clearly tells a Wikisource reader that they won't find a full transcription here; and with a simple edit, it could link to a full scan on another site, or (with perhaps a little more effort) even transcription links here on Wikisource. This page has been here since 2010; we don't have any way of knowing what links might have been created elsewhere in the intervening decade. (I do think that new pages like this should not be created without a scan at Commons to be linked to.) -Pete (talk) 02:12, 8 July 2020 (UTC)Reply[reply]

I'm really bad with walls of text, so I have only read a tiny portion of the above discussion. But I want to mention a couple of things that I think are worth considering in this discussion.
  • Most of the time, a mainspace "work" that is only a table of contents, but which has none of the actual content, and is not actively being worked on, can be (and should be) deleted as No meaningful content or history under our deletion policy.
  • A mainspace work that has only a little bit of content, but that content is a work unto itself within the scope of Wikisource, should be kept. Most periodicals are like this. For an example, see the Journal of English and Germanic Philology which only has one hosted article, but that hosted article is scan-backed and firmly within scope.
  • On some occasions, empty mainspace works do have value. I ended up creating the page The Roman Breviary, despite its containing no actual content, mostly because there are a lot of works that link to it, using many different titles, and if someone uploaded a copy of the work under one title then many of the links would remain red because they point to different titles of the work. This could be easily solved by creating redirects to a simple placeholder page, so I did. I tried to make the placeholder page as useful as a placeholder page can be, as it contains useful information about the history and authorship of the work, and links to the Index pages where the transcription will take place.

Anyway those are my 2 cents, sorry if they are redundant —Beleg Tâl (talk) 00:40, 29 July 2020 (UTC)Reply[reply]

Proposal

Since there has been no extra input for a month, and not wanting this section to get archived without at least attempting a proposal, I have started a proposal #Collective work inclusion criteria above. Inductiveloadtalk/contribs 11:00, 25 August 2020 (UTC)

Since the proposal has now slipped off the main page (to here), with vague support for the first part (collective work inclusion criteria) and a fairly consistent opposition to the second (no-content pages), my plan is to transfer the first part, as guidelines rather than policy, to Wikisource:Periodical guidelines. As non-binding guidelines, they can then be worked on further in situ. Sound OK? Inductiveloadtalk/contribs 08:10, 16 April 2021 (UTC)Reply[reply]
The example given in Wikisource:Periodical guidelines might be improved, PSM is and was an exercise that has gone its own way (no offense to @Ineuw:, this is a site under development and that is only one example).CYGNIS INSIGNIS 13:05, 17 April 2021 (UTC)Reply[reply]
@Cygnis insignis: You would be wrong to think that I am offended. Remember that when I started, I knew everything. By now, so much of that knowledge is lost that I am happy to listen. Would you elaborate please? — Ineuw (talk) 19:50, 17 April 2021 (UTC)Reply[reply]

I've created Bradshaw's Monthly Railway and Steam Navigation Guide (XVI) - it couldn't be done on one page, due to the very high number of template transclusions. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:52, 1 September 2020 (UTC)

@Pigsonthewing: The links in the toc on that page appear non-functional. Also, depending on just exactly which templates were the culprit, it is possible that you may be able to put all the content you wanted onto one page now due to some recent technical changes (template code moved to a Lua module which drastically improves performance and prevents hitting transclusion limits until much later). Xover (talk) 11:17, 14 September 2021 (UTC)Reply[reply]
Create the Draft namespace to hold substantially empty works? Then delete if no improvement after months?--Jusjih (talk) 19:22, 1 November 2021 (UTC)
The issue is that the "substantially empty works" can have useful and complete content that stands alone. For example, an article from a scientific journal.
I would not want to see that either shunted into a Draft namespace to rot or deleted a few weeks down the line.
Index and Page namespaces provide our long term staging areas, and works can and do remain unfinished there for years. But what do we do when a self-contained piece of a larger work is ready? Inductiveloadtalk/contribs 20:29, 1 November 2021 (UTC)

Universal Code of Conduct News – Issue 1

Universal Code of Conduct News
Issue 1, June 2021Read the full newsletter


Welcome to the first issue of Universal Code of Conduct News! This newsletter will help Wikimedians stay involved with the development of the new code, and will distribute relevant news, research, and upcoming events related to the UCoC.

Please note, this is the first issue of UCoC Newsletter which is delivered to all subscribers and projects as an announcement of the initiative. If you want the future issues delivered to your talk page, village pumps, or any specific pages you find appropriate, you need to subscribe here.

You can help us by translating the newsletter issues in your languages to spread the news and create awareness of the new conduct to keep our beloved community safe for all of us. Please add your name here if you want to be informed of the draft issue to translate beforehand. Your participation is valued and appreciated.

  • Affiliate consultations – Wikimedia affiliates of all sizes and types were invited to participate in the UCoC affiliate consultation throughout March and April 2021. (continue reading)
  • 2021 key consultations – The Wikimedia Foundation held enforcement key questions consultations in April and May 2021 to request input about UCoC enforcement from the broader Wikimedia community. (continue reading)
  • Roundtable discussions – The UCoC facilitation team hosted two 90-minute-long public roundtable discussions in May 2021 to discuss UCoC key enforcement questions. More conversations are scheduled. (continue reading)
  • Phase 2 drafting committee – The drafting committee for the phase 2 of the UCoC started their work on 12 May 2021. Read more about their work. (continue reading)
  • Diff blogs – The UCoC facilitators wrote several blog posts based on interesting findings and insights from each community during local project consultation that took place in the 1st quarter of 2021. (continue reading)


unsigned comment by SOyeyele (WMF) (talk) 22:37, 10 June 2021‎.

Index:Robert Carter- his life and work. 1807-1889 (IA robertcarterhis00coch).pdf

First run through is done, and it's transcluded. Needs validation. Thanks in advance for any help. Jarnsax (talk) 18:13, 16 June 2021 (UTC)

J3l

The Works of the Late Edgar Allan Poe/Volume 1/The Domain of Arnheim unsigned comment by 202.165.87.161 (talk) 18:52, 25 December 2021 (UTC).

Subscribe to the This Month in Education newsletter - learn from others and share your stories

Dear community members,

Greetings from the EWOC Newsletter team and the education team at Wikimedia Foundation. We are very excited to share that the Education Newsletter (This Month in Education) is in its tenth year, and we invite you to join us by subscribing to the newsletter on your talk page or by sharing your activities in the upcoming newsletters. The Wikimedia Education newsletter is a monthly newsletter that collects articles written by community members using Wikimedia projects in education around the world, and it is published by the EWOC Newsletter team in collaboration with the Education team. These stories can bring you new ideas to try and valuable insights into the successes and challenges of our community members in running education programs in their own contexts.

If your affiliate/language project is developing its own education initiatives, please remember to take advantage of this newsletter to share your stories with the wider movement that shares your passion for education. You can submit newsletter articles in your own language or submit bilingual articles for the education newsletter. For the month of January, the deadline to submit articles is 20 January. We look forward to reading your stories.

Older versions of this newsletter can be found in the complete archive.

More information about the newsletter can be found at Education/Newsletter/About.

For more information, please contact spatnaik at wikimedia.org.


About This Month in Education · Subscribe/Unsubscribe · Global message delivery · For the team: ZI Jony (Talk), Saturday 4:12, 13 August 2022 (UTC)

Deletion of redirects

Hi Wikisource folks. An outside observation from English Wiktionary: I have done an audit of broken links from English Wiktionary to English Wikisource. You can see the list here. As you can see, a significant number of the links were once valid but have since been broken by page moves on this wiki. In particular, chapters of Moby-Dick and Sons and Lovers as well as the Song of Everlasting Regret appear all throughout the list.

It seems that this situation has arisen because of eager deletion of redirects on this project. The administrators who deleted those redirects evidently did not consider the impact this would have on other websites (not just wikis) which link to Wikisource texts. Keeping long-standing URLs functional is a courteous thing for a website to do, especially one such as Wikisource where the content is very stable and drastic changes would not be expected. It's reasonably easy for us on Wiktionary to fix these broken links because of our use of templates, but the same can't be said for everybody who links to this site.

I am curious to understand Wikisource's policy on redirects, how it has come about, and whether there is appetite for keeping certain long-standing redirects even if current naming schemes are not followed. This, that and the other (talk) 14:19, 9 July 2022 (UTC)Reply[reply]

This, that and the other: does Wiktionary have much going on with Wikidata yet? Here {{wdl}} can be used and will prevent this kind of problem from enthusiastic redirect deleters and other problems of inter-wiki linking, as it grabs the current link.--RaboKarbakian (talk) 15:32, 9 July 2022 (UTC)
@This, that and the other: Well, admittedly, we are sometimes a bit too aggressive in pruning top-level redirects that are non-standard (but might be targeted from another wiki). But mainly the short answer is that page moves and deletions happen and we need to use other mechanisms to keep the dead links down (maybe we should look at bot-updating any link whose target has turned into a soft redirect?). For example, as RK says above, adopting linking through Wikidata would catch page moves, and might make it easier to detect page deletions. And some discipline in (i.e. policy for) what to link to: in your list I find links to the Page: namespace here (which is an internal working area you generally shouldn't link to), links to subpages in mainspace (subpages have zero stability guarantees and don't get redirects on page moves), links to one specific edition of a work when it is likely the intent is to link to the work, and so forth.
And I see another significant subset of the pages in your list are pages created before standards for things like page names were set here, and as such have seen a larger than average amount of attrition due to cleanup and standardisation. As a general rule of thumb, top-level pages for works (that is, versions pages) and specific editions do not tend to change much here (when they're done they're done). At worst an edition gets moved to make way for a versions page, but then the old page name still gets you a list of editions of the work. In other words, I think a lot of the current dead links are the inevitable consequence of cleaning up old messes (other projects, like enWP, have done this years ago and are now much more stable); and a lot of the rest can be ameliorated (not eliminated) by more disciplined linking.
But I think a better question to address is how we can enable "deep linking" (for lack of a better term). For parts of works that are themselves works (poems, short stories, some, but not all, newspaper and magazine articles, etc.; stuff that's usually published in some form of collection) we can usually create top-level redirects to the subpage (and you should link to the redirect instead of the subpage). But for, say, a chapter of a novel our standard is to not have redirects. At the same time, Wiktionary and Wikipedia (e.g.) will often want to link to such a sub-part of the work. I also expect both to have a need to link directly to a specific sentence or position (think "To be or not to be"). We currently have no facility to enable this. And both these things are sometimes needed for internal linking on enWS as well, so it's not just our sister projects that need this. Xover (talk) 17:01, 9 July 2022 (UTC)Reply[reply]
One of the problems I can see is the fact that when we move a work we can check what links there only from Wikisource; we cannot check what links there from other Wikiprojects. If we could, it would do much to prevent such things from happening. --Jan Kameníček (talk) 07:43, 10 July 2022 (UTC)
as a part of the process of deleting redirects, should we include a "what links here check" and if not fixing right away, then adding to a list for linking at the other wiki? --Slowking4Farmbrough's revenge 18:11, 10 July 2022 (UTC)Reply[reply]
This sort of thing has even happened here with intrawiki links: see Page:Hero and Leander - Marlowe and Chapman (1821).pdf/36 and The Passionate Shepherd to His Love, both of which were broken because the page to which they both linked (Golden Treasury of English Songs and Lyrics/Book 1/Poem 5) was moved to The Golden Treasury, etc. A redirect was left for the root page in mainspace, but not for all the subpages.
The former of the two broken pages also illustrates the use of {{anchor}}, which is one way—albeit unwieldy—to link to a specific passage in a text. Shells-shells (talk) 20:24, 10 July 2022 (UTC)Reply[reply]
Wiktionary wants to cite a use of a word. Thus Wiktionarians don't want to cite a generic form of the work, or link to the top level; they want to link a page that has the word in question on it in a specifically dateable context. It doesn't strike me as that rare; while there are times you want to link to a generic version, there's times you want to talk about Homer's use of rosy-fingered dawn ("as soon as early rosy-fingered Dawn appeared, then they set sail for the wide camp of the Achaeans") and link not to the Iliad, but the Iliad, book 1, and a translation that faithfully translates that (not Alexander Pope's! apparently many students over the years have been confused by that).--Prosfilaes (talk) 04:14, 11 July 2022 (UTC)Reply[reply]
Thanks all for your input. I am glad to have generated some discussion around this topic. This, that and the other (talk) 09:59, 16 July 2022 (UTC)
It is precisely because of Wiktionary linking that I have endeavored to avoid the need for later moves. We have some editors here who insist that if a work does not have multiple editions hosted now, then the base title is where the edition must be hosted, which means that, if ever we have an additional edition, the work gets moved. We can avoid this issue by more forward-thinking effort. If a work is a translation, or exists in multiple significant editions, we ought to place any edition/translation of that work at a more specific title, and set up a versions page for the main item. This would not only help with Wiktionary linking, but would also assist with connecting to Wikidata and Wikipedia. --EncycloPetey (talk) 18:11, 2 August 2022 (UTC)Reply[reply]

Uploading new versions of files

I have been asking in vain for help with this for some time now and humbly request that my problem be given some attention. I CANNOT upload a new version of any file. This has been the situation for many months now. Any attempt to upload a new version is inevitably corrupted and all I get is Fileicon-pdf.png and no file. Esme Shepherd (talk) 20:40, 17 July 2022 (UTC)

@Esme Shepherd Sorry to hear that. I've been having some trouble with some pdfs recently, it might be a bug. What file are you trying to upload? Languageseeker (talk) 22:13, 17 July 2022 (UTC)

Every file I have tried to upload a new version of for maybe a year now. The latest was Lydia Sigourney 1834.pdf, which is now in Category: Lydia Sigourney Redundant Files, as I had to re-upload it as Lydia Sigourney, 1834.pdf, which I am now working on. Esme Shepherd (talk) 06:05, 18 July 2022 (UTC)

Tech News: 2022-29

22:59, 18 July 2022 (UTC)

Page Preview lacking headers and footers

For some time now I've been noticing that, when editing a page in the Page namespace, when I preview the page it is rendered without the header and footer (and thus as a side effect shows the page as "not proofread"). When the page is published, all is well; it appears to be solely the preview functionality. Have others seen this? Is this a known bug? — Dcsohl (talk) (contribs) 18:48, 19 July 2022 (UTC)

I have never noticed anything of that kind so far… Have you tried different browsers and/or different computers? --Jan Kameníček (talk) 18:59, 19 July 2022 (UTC)
@Dcsohl: This is due to T309451. The workaround for now is to disable "Show previews without reloading the page" in the "Editing" section of the Preferences. Xover (talk) 20:08, 19 July 2022 (UTC)

Second-hand transcriptions

Can second-hand transcriptions be speedied based on Wikisource:What_Wikisource_includes#Second-hand_transcriptions or should they be listed at Wikisource:Proposed deletions? Currently, they are not among the Wikisource:Deletion policy#Speedy deletion criteria, but they are repeatedly proposed for speedy deletion. -- Jan Kameníček (talk) 09:09, 20 July 2022 (UTC)Reply[reply]

Only speedy-able if a sourced version of the same text is hosted, per G4. There are no other valid criteria for speedy deletion of such. Summary deletion of so-called "second-hand" transcriptions without discussion is against the open nature of us as a library that anyone can bring works to. We can encourage people to bring them in a scan-backed form, but at present we don't have a policy that restricts to on-site scan-backing. If we speedy delete a new-comer's contributions we lose the new-comer. Also, the definition of "second-hand" seems quite arbitrary. Why aren't the various Executive Orders treated as second-hand? They are after all, simply brought over from the White House websites with minimal wikification. Yet, I've never seen them proposed for deletion on this ground. Beeswaxcandle (talk) 09:49, 20 July 2022 (UTC)Reply[reply]
once upon a time, we used old Gutenberg transcriptions pasted in the side by side edit box, when the text layer was really bad. (as a part of the migration process) yrmv. --Slowking4Farmbrough's revenge 21:23, 20 July 2022 (UTC)
I believe that they would fall under G5. I don't think that there is any evidence that many of these contributors stay on enWS. Most of them come, copy-and-paste a text (often without formatting), and then leave. It's an extremely fast process for them. Other enWS contributors then have to spend time trying to format it properly. PG texts are especially problematic because they silently correct errata. The entire process is just a time drain. As for the Executive Orders, I would also say that they should be speedied. They are published in the Federal Register and should be scan-backed from there. Languageseeker (talk) 21:38, 20 July 2022 (UTC)
they were useful to me, if you delete them, then i cannot migrate works to scan backed works. increasing the scrap rate does not increase quality. --Slowking4Farmbrough's revenge 22:41, 20 July 2022 (UTC)
Definitely not G5. That is for content that is out of scope. The content of these works is in scope (on the whole), it's just the source that is seen as problematic by those tagging for speedy deletion. Beeswaxcandle (talk) 06:44, 21 July 2022 (UTC)
Second-hand transcriptions are out of scope for enWS, and any newly added second-hand transcriptions are speediable as such (that is, under CSD G5, which is the criterion for all content that does not meet WS:WWI). But the definition of it is inherently a grandfather clause in that it says enWS no longer accepts any new … second-hand transcriptions of any sort (my emphasis). So for anyone pasting in a new Gutenberg text today you can speedy it (presumably while explaining the issue to the contributor on their talk page); but for any similar text that was added in 2021 or earlier it needs to go through a normal deletion discussion. It is also not a given that older second-hand transcriptions will be deleted at WS:PD: the policy only implicitly marks these as undesirable, so absent community consensus to delete the status quo will obtain. There's no strong presumed default "delete" outcome for these. I personally think there should be, but that's not what the policy currently is. --Xover (talk) 06:42, 24 July 2022 (UTC)Reply[reply]
While I agree that such works should not have a place at WS, I am hesitant about their speediness under current deletion policy. I agree with Beeswaxcandle that G5 with its bracketed part "(such as advertisements or book descriptions without text)" does not seem to give way to general speedying of all beyond-scope texts. So if we agreed that it does not apply only to completely blatant cases, we should either make the criterion more general by removing the brackets, or we should explicitly add some less blatant examples, e.g. the second-hand transcriptions.
However, after this discussion and after several current similar nominations at WS:Proposed deletions, it seems to me that listing such cases there is useful, as some contributors sometimes save such works by scanbacking them, which would not be happening if they were speedied. --Jan Kameníček (talk) 12:19, 24 July 2022 (UTC)Reply[reply]
The bracketed stuff are informative examples to illustrate; the criterion itself is Beyond scope: The content … lies outside the scope of Wikisource (i.e. it fails to meet WS:WWI), and the limiting clause is … The content clearly lies outside the scope …. The point there is that if something is borderline or there's a significant possibility of mistake the admin shouldn't unilaterally decide (speedy) and it should go to WS:PD instead for community discussion. The latter is usually exemplified by someone pasting Harry Potter here—which is clearly a copyvio—versus someone proofreading a 1964 book that makes a superficially plausible claim of being {{PD-US-no renewal}}. The latter could still be a copyvio, but a single admin shouldn't decide that based solely on misbelieving the contributor's assertion: it should go to WS:PD where the community can examine it and possibly dig up the evidence (either way) to determine its actual copyright status. Harry Potter, obviously, should be speedied on sight (and preferably before Wizarding World Digital sends its DMCA-wielding Nazgûl after us).
That being said, I absolutely agree our policies are in dire need of tightening and should be written with much greater clarity. Navigating them now is an exercise in frustration for both general contributors and admins trying to apply them. --Xover (talk) 14:43, 24 July 2022 (UTC)
"but a single admin shouldn't decide that based solely on misbelieving the contributor's assertion:" and yet they do. we have very few DMCA, and yet the copyright enforcement is adamant, summary, and non-consensus. we would not need more tl;dr policy if the praxis were reasonable; and what makes you think admins will follow policy? --Slowking4Farmbrough's revenge 19:47, 25 July 2022 (UTC)Reply[reply]

Copyright status of Men, Ships, and the Sea (1962)

I have done some searching in the copyright.gov database and come up empty for a renewal of the first edition of Men, Ships, and the Sea by Alan Villiers, published in 1962 by the National Geographic Society. As far as I can tell it should therefore have lapsed into the public domain (excepting, possibly, licensed photographs and illustrations within it). However, seeing as other works by Villiers have had their copyrights renewed (e.g., the very similarly named Of Ships and Men, also published in 1962), I would like to know the opinion of a more experienced user in judging the copyright status of this work, as I may have missed something important.

On a related note, is there a proper area for discussion about the copyright statuses of works not yet added to WS? I would have put this on WS:Copyright discussions, but that seems to be more about works already on WS than about ones offsite. Shells-shells (talk) 04:31, 23 July 2022 (UTC)Reply[reply]

Do you have the book? If you look at the actual book, you may see a list of copyright notices from other works. With or without them, I'm still concerned that there may be a number of other works that it's copying from.--Prosfilaes (talk) 20:15, 23 July 2022 (UTC)Reply[reply]
@Prosfilaes, @Xover: I'm reasonably confident that at least the text content was written specifically for this book, not copied from another source. I have a copy of the 1973 edition, which explicitly states: "Text by Alan Villiers / with a foreword by Melville Bell Grosvenor / and additional chapters by [several other authors]". The foreword to this edition seems to indicate that the book was written from scratch: "In commissioning him [Villiers] as chief author of Men, Ships, and the Sea, the Society chose the greatest sea writer of our time."
There are, however, a proudly proclaimed "423 illustrations, 294 in full color" in my copy. Most of these are undoubtedly still under copyright (although a few are obviously in the public domain, and some were commissioned specifically for the book). That's slightly less than one illustration per page. I suppose I could redact all the offending images if I wanted to, but it's probably not fruitful enough to spend a great deal of time with. (If I were to do so—assuming all the text content is PD—would it then be suitable to host here?) In any case, thanks to both of you for the help and advice. :) Shells-shells (talk) 17:12, 24 July 2022 (UTC)Reply[reply]
@Shells-shells: You're right that WS:CV is more a workflow for discussing the copyright status of texts already on enWS. But you can certainly raise other copyright issues, such as the one in this thread, there too. It's more a question of what's the best venue for your needs: WS:CV is watched by only a small subset of the community (unfortunately) and is often months and years backlogged (because of insufficient community participation) so as a practical matter you may prefer to post here. On the flip side, for complicated copyright issues WS:CV may be better because the copyright wonks will see it there, and it may get you a more definitive answer (or at least guard against wholly incorrect answers).
Short version: feel free to post such queries either place.
PS. I agree with Prosfilaes: even if the copyright on this work was not renewed, it may contain independently copyrighted works that for our purposes have the same effect as if the whole was in copyright. Xover (talk) 06:53, 24 July 2022 (UTC)

ToC links

I like to style ToC's with the text linking to the transcluded page (unconditionally), and the page number linking to the Page namespace (when viewed from the Page or Index namespace), and to the transcluded page when the ToC is transcluded. This is *mostly* satisfied by {{TOC row 2dot-1 linked}} but it seems to be partially broken; does anyone know of a better choice, or how to fix it? The bug I've observed is that, for multi-level subpages, e.g. The_Works_of_Voltaire/Volume_36, the page number links are broken (they assume a single level, e.g. they link to The Works of Voltaire/The Lisbon Earthquake but the actual page is The Works of Voltaire/Volume 36/The Lisbon Earthquake). I think there may be other bugs, too. But it's really nice to have working links both to the transcluded pages and the Page namespace from the Index page, on the actual ToC, so I'd love to get this fixed. Suggestions? JesseW (talk) 03:50, 24 July 2022 (UTC)

Seems to me that this behaviour is caused by the part #invoke:Filter|CleanParentDirectories in the code of {{TOC link}}. --Jan Kameníček (talk) 11:57, 24 July 2022 (UTC)Reply[reply]
Yeah, but I'm not sure what would break if I took that out. I suppose I could make a separate version... JesseW (talk) 14:03, 24 July 2022 (UTC)Reply[reply]
I took a look, and it looks like {{TOC link}} is broken by design: it has a hard assumption that there is never more than one level of subpage. Unfortunately, people have apparently depended on the broken behaviour for the last decade or so, so fixing it will require going through all extant uses and fixing the broken ones. I'm not sure that's a task that can be reasonably automated either (it'd need a lot of custom coding, not just application of existing tools), so there's no quick fixes here. Xover (talk) 18:34, 24 July 2022 (UTC)Reply[reply]
Cool, that makes creating a {{TOC link multilevel}} much more appealing. I'll see what I can do. JesseW (talk) 21:06, 24 July 2022 (UTC)Reply[reply]
Actually, it looks like {{TOC link}} is fine; it's {{TOC row 2dot-1 linked}} that needs fixing for multi-level subpages. Specifically, {{TOC link|1|Volume 3/Something|link label}} works fine; the trick is that {{TOC row 2dot-1 linked}} breaks up the page link as "The Works of Voltaire/Volume 36" and "The Lisbon Earthquake" (and makes the text link label just the second part), while {{TOC link}} needs the "Volume 36" part explicitly included. I should be able to make a variant of {{TOC row 2dot-1 linked}} that handles this correctly, just by splitting the "The Works of Voltaire/Volume 36" param. JesseW (talk) 22:59, 24 July 2022 (UTC)Reply[reply]
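For readers following along, the splitting described above can be sketched in Python. This is only an illustration of the string handling, not the template's actual wikicode: the function name `toc_link_args` and the page number 5 are hypothetical, and the parameter order is copied from the {{TOC link|1|Volume 3/Something|link label}} example in this thread, with the meaning of each position assumed.

```python
def toc_link_args(parent: str, title: str, page: int) -> str:
    """Build a {{TOC link}} call for a chapter under a (possibly nested) parent page.

    parent -- full name of the transcluding page, e.g. "The Works of Voltaire/Volume 36"
    title  -- chapter title, also used as the link label
    page   -- page number shown in the ToC (assumed to be the first parameter)
    """
    # Everything after the first "/" is the intermediate subpage path.
    # For a single-level work this is the empty string.
    _root, _, subpages = parent.partition("/")
    # Keep the intermediate levels in the target; dropping them is the bug.
    target = f"{subpages}/{title}" if subpages else title
    return f"{{{{TOC link|{page}|{target}|{title}}}}}"


if __name__ == "__main__":
    print(toc_link_args("The Works of Voltaire/Volume 36", "The Lisbon Earthquake", 5))
    print(toc_link_args("The Works of Voltaire", "The Lisbon Earthquake", 5))
```

The single-level case still produces the short target, so existing ToCs that rely on the old behaviour would be unaffected under this assumption.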
Sweet, I figured out a way to add an optional parameter (subpages=) that solves the problem! Yay, going off to fix ToC's now. JesseW (talk) 23:18, 24 July 2022 (UTC)Reply[reply]
Good to see you solved your problem. I'll have to take another look, but I suspect you're still relying on a quirk of the implementation which we'll need to design a proper migration path for at some point. Not a pressing issue, but just so you're aware.
But let me just add the obligatory Please Don't Use These Templates™ rant: none of the current crop of toc templates should be used, because they are technically poor (every single use creates technical debt for us, and it's unsustainably huge already), a large proportion produce really rather horrid output (in a technical sense), their operation is prone to cause confusion (a link that looks blue in Page: may be broken in mainspace, and these templates make that harder to detect), and provide very little actual value for the complexity (linking to the physical page in Page: makes little sense: you'll never use it and it tells you nothing about the link's state in mainspace where it matters). It also doesn't help that we have a myriad inconsistent and incompatible such templates. The alternative is plain old table markup, which admittedly can be a little harder to learn and slightly more complicated to use for simple cases, but which gives you far more control and flexibility for the hard cases without the downsides of the toc templates.
So far, nobody much is listening to me on this, but I live in hope… Xover (talk) 08:19, 25 July 2022 (UTC)
I thought the {{TOC begin}} ones (like {{TOC row 2dot-1}}) were acceptable. And I find the links to the Page namespace very helpful as a way to get from the Index page to the start of the work, without having to bypass thru the transcluded page. I'd be fine with always linking the page number on the ToC to the Page namespace (like the titles on the ToC always link to the transcluded page), but I don't want to abandon all links to the Page namespace. What I want to avoid is repeating the transcluded page name, as that provides opportunities for typos. I'd be delighted if you give me an example of how your preferred way would look on Page:Works of Voltaire Volume 36.djvu/17 and, assuming I can understand it, I'll likely apply it to the others. JesseW (talk) 14:34, 25 July 2022 (UTC)Reply[reply]
@JesseW: My preferred approach would be something like this. If you feel the page number links are important you could easily apply {{TOC link}} to the page number column (I wouldn't, but that's a minor issue). My main point is that it's preferable to use plain table wikimarkup rather than any of the various TOC templates.
But I need to stress that this is my personal opinion, and in terms of things like policy, style guide, community practice etc. there's absolutely nothing wrong with your current approach. It's just that I, from a primarily technical perspective, think they are a bad idea and take every opportunity to say so in the hopes of persuading as many as possible of that. But nobody (myself included) will give you the stink-eye if you still want to use your existing approach.
Oh, and… The TOC-row family of templates are not by far the worst (that honor goes to dtpl), but anything that tries to fake dot leaders is going to be actively problematic. And using a new pseudo-syntax implemented as templates in order to generate wikimarkup which is then used to generate HTML is just a bad idea in general. But, you know, in the grand scheme of things… Xover (talk) 19:51, 28 July 2022 (UTC)Reply[reply]
@Xover: Great, thank you! Losing the dot leaders makes me sad (they are so pretty), but it fundamentally doesn't matter much. As I've said, I really value having links to the Page namespace from the Index, so I would certainly either apply {{TOC link}} or do unconditional links to the Page namespace (but I think some people don't like that). It's good to know about {{Table style}}. I don't think I'm going to go back and re-do any of the ToC's I've made, but I'll likely use your style for the other volumes of The Works of Voltaire as I make them. If I can't bear to leave out the dot leaders, do you have a least-bad preference for making them? JesseW (talk) 20:10, 28 July 2022 (UTC)
@JesseW: I agree dot leaders are both pretty and useful, but sadly they're the kind of thing that can really only properly be implemented with built-in support in browsers (and the CSS WG has been promising a specification for this Real Soon Now™ for going on a decade, iirc). After pouring way way too many hours into a clean(ish) way to fake them I've come to the conclusion that we'll just have to do without until browser support materialises. At which point I hope retrofitting them to existing TOCs will be a fairly easy, possibly automatable, task.
Any of the TOC-row templates with dot leader support should be in the "least bad" camp I think; it's just {{dtpl}} itself that's truly pathological.
Links from mainspace to Page: and Index: namespaces are an actual no-no (we have a draft linking policy floating around somewhere that addresses that specifically), so if you're going to have the page numbers linked it needs to be through something namespace-conditional (like {{TOC link}}). But as I've mentioned, I question the utility for most works (there are always exceptions). Personal preference aside, it's rather a lot of complexity for something that will at best be irrelevant, and at worst actively confusing, for most readers. Xover (talk) 20:45, 28 July 2022 (UTC)
@Xover: Cool, I'll leave out the dot leaders, pending browser support. As for the page number links (which I find particularly valuable for works with LOTS of tiny parts, like Index:Works_of_Voltaire_Volume_36.djvu, because that gives me a sense from the Index of which parts are done), I think I'll go for linking them to Page from the Index and Page namespaces, and leaving them unlinked in mainspace (since the transcluded page will be linked from the title). That has the advantage of not requiring any handling of the transcluded page name. I don't know if there's an existing template implementing that? If not, I'll make one. JesseW (talk) 21:46, 28 July 2022 (UTC)Reply[reply]
I was looking for precisely that! A while back I made Template:LinkedTOC_row_1-1-1 because I couldn't find something that worked.
Example of it working: Social Security Act 2018 (Version 56). You'll see in the transcluded version, you just see links to transcluded pages, but if you click on page 1 or 2 to the page space, the main space links disappear, and now only the page links show. It might not be exactly what you want, but I think you could very easily create another template that meets your needs from this. It's very simple code that either displays links or not depending on whether you're in the main or page namespace. Mine is also very "sub-page" specific so you might need to adjust that too for your uses. Supertrinko (talk) 00:21, 26 July 2022 (UTC)Reply[reply]
Interesting! I'm not sure I like the page numbers disappearing entirely in mainspace, as they are there in the original document. But it's nicely made. JesseW (talk) 12:19, 26 July 2022 (UTC)Reply[reply]
I chose that because I felt page numbers are not valuable in a transcribed document where pages are meaningless, but being a template, it's easily adjusted to act like the main links, where they could instead display as text without a link, or they could just permanently link to the page regardless and not be special at all. Supertrinko (talk) 20:33, 26 July 2022 (UTC)Reply[reply]
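The namespace-conditional behaviour discussed in this thread (page numbers linked to the Page namespace when viewed from Index or Page, and left as plain text in mainspace) comes down to a single branch. The Python sketch below only models that decision; it is not the code of any existing template, the function and constant names are hypothetical, and the printed page number 5 is made up for the example.

```python
# Namespaces in which a link to the scan page is wanted (assumption based on the thread).
PROOFREAD_NAMESPACES = {"Page", "Index"}


def page_number_cell(namespace: str, index_file: str, scan_page: int, printed_number: int) -> str:
    """Return the wikitext for the page-number cell of one ToC row.

    namespace      -- namespace the ToC is being viewed in ("" for mainspace)
    index_file     -- scan file name, e.g. "Works of Voltaire Volume 36.djvu"
    scan_page      -- position of the page within the scan
    printed_number -- page number as printed in the book
    """
    if namespace in PROOFREAD_NAMESPACES:
        # From Index: or Page:, link to the physical page.
        return f"[[Page:{index_file}/{scan_page}|{printed_number}]]"
    # In mainspace, show the number unlinked, so no mainspace-to-Page: link is created.
    return str(printed_number)


if __name__ == "__main__":
    print(page_number_cell("Index", "Works of Voltaire Volume 36.djvu", 17, 5))
    print(page_number_cell("", "Works of Voltaire Volume 36.djvu", 17, 5))
```

Because the mainspace branch never needs the name of the transcluded page, this variant avoids repeating that name in every row, which was the typo risk raised earlier in the thread.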

Let's talk about the Desktop Improvements


Join an online meeting with the team working on the Desktop Improvements! It will take place on 26 July 2022 at 12:00 UTC and 19:00 UTC on Zoom. Click here to join. Meeting ID: 5304280674. Dial by your location.

Read more. See you! SGrabarczuk (WMF) (talk) 16:19, 25 July 2022 (UTC)Reply[reply]

Tech News: 2022-30

19:27, 25 July 2022 (UTC)

Template:ct

This template was recently deleted and all uses of it deleted by user:Billinghurst either manually or automatically by user:SDrewthbot with the reasoning "community has already decided against such a template". The only remains of the template are Template:Ct/doc

I don't see why this had to be done. The text, when compiled, copied as regular ct and not as a ligature; the works could still be searched as normal; and the template made the works that used it that much more authentic, since those ligatures are used in the original texts.

I also haven't seen any discussion against templates like this, and the main reason I was told why we don't see them, is due to them messing with searching and copying, which this template had no issues with. Reboot01 (talk) 16:37, 1 August 2022 (UTC)Reply[reply]

Strong agree. The reason for deleting it in the past was due to text searchability issues, but I re-created it specifically to get around those issues, by using regular ct in a <span> with CSS styling, which meant that it was completely optional. This honestly just felt like an overreaction by someone who didn't bother to look into how the template worked and/or has forgotten why the old version was deleted in the first place. @Billinghurst: please undelete this and revert your mass changes. You've mucked up a lot of hard work. Theknightwho (talk) 16:44, 1 August 2022 (UTC)Reply[reply]
I'm not an active part of English WS, but I strongly agree. This is destruction of meaningful information, without any gain except for (doubt) wikicode readability.--Ignacio Rodríguez (talk) 16:50, 1 August 2022 (UTC)Reply[reply]
Thanks for the responses, guys, I also just found the reasoning from Billinghurst on Theknightwho's talk page, linked here User talk:Theknightwho#template:ct for others to read. Reboot01 (talk) 16:55, 1 August 2022 (UTC)Reply[reply]
I slightly misremembered the issue, as there is no "ct" ligature in Unicode. However, tracing back the supposed consensus, it stems from:
  • This brief discussion about the Unicode st ligature, clearly raising issues about text searchability.
  • This discussion about the ct ligature, where someone had inserted a private use character into a text, because that code point is the ct ligature in the MUFI extension to Unicode. This was (rightly) deleted, but apparently on the basis that we don't include ligatures as it's "what the community discussed several years back", referring to that brief 2011 conversation.
It's pretty obvious that neither of these issues are relevant here. In fact, one of the contributors to the 2011 discussion even said "This kind of ligature is properly implemented by a suitably intelligent font/browser and Unicode, which spots "st" combos and substitutes the ligature.", which is precisely what the new template did. Theknightwho (talk) 17:48, 1 August 2022 (UTC)Reply[reply]
I have unprotected it. Even if you personally disagree with the existence of this template, I think that without super-clear consensus in a previous discussion, and with only one deletion in the past, admin-protecting it at this point is a bit overkill. @Beleg Tâl, @Billinghurst: what is the harm of having this template around? PseudoSkull (talk) 20:57, 1 August 2022 (UTC)
I don't recall where the various previous discussions on this kind of ligature are, but it was consensus that purely orthographic ligatures would not be reproduced here. [Quite possibly it was in the context of a discussion about fi and fl.] Ligatures for an actual letter/glyph/character that represent a distinct phoneme, such as æ and œ were accepted. Whereas those that are publisher's orthography such as fi and ct were not. Yes, the arguments about copy/paste are increasingly irrelevant these days with modern browsers. Searches can still be problematic in some browsers. However, there is still the issue of these not being about reproducing the content of the work—which is our main goal and purpose. They are solely about reproducing the arcana of a font used by a publisher.
Also, the quote about "spotting combos and substituting the ligature" is not replicated by using a template. That is done when the browser sees "st" in the text flow and automatically ligatures them—much like most browsers today do with "fi" and "fl". Beeswaxcandle (talk) 21:19, 1 August 2022 (UTC)Reply[reply]
The consensus was that we don’t want to use specially encoded ligatures, because they muck up searching. That is not relevant to this discussion, because this particular template only uses CSS styles to apply OpenType features in order to enable the ligature in fonts that support it, which is the kind of unobtrusive implementation that we should be fine with, because the text itself remains unchanged. In fact, this problem is exactly why Unicode haven’t encoded any more ligatures after the handful they added right at the beginning. By the way - CSS styles like this are quite literally how you get a browser to automatically substitute the ligature in the way you’re claiming this doesn’t do, so I don’t really follow your point at all. Nobody is suggesting we use the ligature characters, and given that we go to great pains in order to replicate the look of the original work very often, it feels very strange to rule this out. Seems like a post-hoc rationalisation, especially when we’re fine with long S, despite that being no less stylistic; the only difference is that it’s more widely supported in text search, which is the real issue here.
Thank you @PseudoSkull. @Billinghurst please revert your changes. There are too many of them. Theknightwho (talk) 22:35, 1 August 2022 (UTC)Reply[reply]
I have gone ahead and reverted these anyway. There is clearly no consensus for speedy deleting this. Theknightwho (talk) 04:26, 2 August 2022 (UTC)Reply[reply]
Browsers often ligate fl and fi and ffi automatically, and always ligate ي and ة that is, (ية). If the right font is used, it should ligate ct automatically. We don't want to mark up every copy of ct in a work to get it to ligate.--Prosfilaes (talk) 01:13, 6 August 2022 (UTC)Reply[reply]
@Prosfilaes But you don't have to. It's entirely optional, which is the point. Theknightwho (talk) 17:06, 7 August 2022 (UTC)Reply[reply]
Nothing's entirely optional; the {{ct}} would have to be used consistently throughout a work. I'm aware of only two current features that are optional in a similar way, the {{ls}} and the curved versus straight quotes, and both are pains, because you want to maintain consistency over a whole, possibly multivolume work. I don't see the value in adding a third. Again, this should be done automatically over a text, not manually on each and every occurrence of ct.--Prosfilaes (talk) 21:05, 7 August 2022 (UTC)
@Prosfilaes The reason for doing it that way is because it wasn't necessarily used consistently throughout the work, but did seem to be used in the same way between editions. I have no idea why that would be the case, but an example of where we might genuinely want to reproduce it is in a textual extract in which it's used as an intentional archaism, in the same way that blackletter is used. In a small number of contexts it carries semantic value, simply by virtue of the fact that the ligature itself is being discussed (works on printing, for the most part). Theknightwho (talk) 18:38, 9 August 2022 (UTC)Reply[reply]
My personal opinion is that per the Unicode standard, the presence/absence of a "ct" ligature is a mere byproduct of the font chosen by the work's printer, and we do not/should not try to replicate fonts, nor should we encourage editors to do so by making templates available for them to use for this purpose. That's my opinion; I do not pretend that it is or should be anyone's opinion but my own. —Beleg Tâl (talk) 00:39, 2 August 2022 (UTC)Reply[reply]
Wouldn't the same also apply to Long S and the various ae, oe, etc, ligatures that we do use? They're all byproducts of those 1700 typefaces. Reboot01 (talk) 01:01, 2 August 2022 (UTC)Reply[reply]
No. Æ and Œ and their lower case equivalents œ and æ represent distinct vowels in Old English and many of the words they appear in are older than 18th Century typefaces. See the enWP article on ligatures for more. My opinion is that the stylistic ligatures should not be deliberately reproduced here (if a browser does it that's different), while those that were/are distinct letters should remain. The long-S is a separate case and we await the ability for readers to choose which to see, which is why the {{ls}} has been accepted. Beeswaxcandle (talk) 01:31, 2 August 2022 (UTC)
@Beleg Tâl I don't see how that's relevant when this doesn't involve a hard-encoded Unicode ligature. I don't see why this is any more of a problem than including any other stylistic elements from works. We have a vast number of style options already - there is no reason to exclude this one. Theknightwho (talk) 04:18, 2 August 2022 (UTC)Reply[reply]

The template was deleted by community discussion in WS:PD. There is also a long-standing, clear statement in the Wikisource:style guide about our typography. If someone is wishing to go outside the community consensus of those two forums then that person should have the conversation first, not ignore our community's consensus.

@Pseudoskull: The reason why I protected the template was because of exactly what happened. You removed the protection and the user immediately reverted my edits and recreated the template. What is the point of WS:PD discussions and style guides if we so blithely ignore them? — billinghurst sDrewth 10:31, 2 August 2022 (UTC)Reply[reply]

The style guide says that "Typographic ligatures such as ƈt, ff, fi, fl, and ſt should not be used in page text even if they appear in the original source (as they interfere with the searchability of the text)." This template did not insert a ƈt, it used CSS to make ct display as ƈt, and it did not interfere with the searchability of the text whatsoever. Reboot01 (talk) 10:43, 2 August 2022 (UTC)Reply[reply]
Addendum: These discussions were years old, and the old version of the template, from what I can gather, did just insert a ƈt, this new version is nothing like that old version of the template, it just used the same title because, it does the same effect, but far better and in a modern, up to date way. Reboot01 (talk) 10:45, 2 August 2022 (UTC)Reply[reply]
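To make the technical distinction in this thread concrete, here is a small Python sketch contrasting the two approaches. It is not the template's actual code, which is not quoted in this discussion: the helper names are hypothetical, and the CSS declaration (the standard `font-variant-ligatures` property) is an assumption about how such a span could be styled. The hard-coded case uses the Unicode st ligature (U+FB06) mentioned in the 2011 discussion, since ct has no code point outside the private use area.

```python
from html.parser import HTMLParser


def styled_ct(text: str = "ct") -> str:
    """Wrap ordinary letters in a span that asks the font for its optional ligatures."""
    # Assumed styling; a real template may use different CSS.
    style = "font-variant-ligatures: discretionary-ligatures historical-ligatures;"
    return f'<span style="{style}">{text}</span>'


class _TextOnly(HTMLParser):
    """Collects only the character data, roughly what search and copy-paste operate on."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def visible_text(html: str) -> str:
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    styled = "perfe" + styled_ct() + "ion"
    hard_coded = "fa\ufb06"  # "fast" written with the single-character st ligature

    print("perfection" in visible_text(styled))  # the underlying letters are unchanged
    print("st" in hard_coded)                    # a plain substring search misses the ligature
```

Whether the ligature actually appears depends on the reader's font supporting it; if the font does not, the span simply renders as ordinary ct.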
And what is the intent of the community's decisions? That it is years old means what? That it can just be ignored? Our purpose is NOT to produce a facsimile copy; that has never been our purpose. — billinghurst sDrewth 10:50, 2 August 2022 (UTC)
@Billinghurst As has been pointed out numerous times, Typographic ligatures such as ƈt, ff, fi, fl, and ſt should not be used in page text even if they appear in the original source (as they interfere with the searchability of the text). does not apply because this template did not interfere with text searchability. It's obvious you've made no effort to even read the discussion, and you seem to have the idea that consensus is decided once and then never changes, which is never how it's worked. The original version of the template used a private use character, which is obviously not acceptable, but also completely different. I don't see you advocating for removing any other style options, either, so this is inconsistent at best and suggests a complete failure to understand why the original template was deleted.
Deleting the templates while this is clearly a contentious issue, edit warring using your bot, and then abusing your administrator privileges by banning recreation of the template is completely out of line. You have also presented zero reasoning other than pointing at discussions that happened years ago and which are not relevant due to the issues at hand. You are not behaving appropriately at all. Consensus from years ago is not law. Theknightwho (talk) 11:15, 2 August 2022 (UTC)Reply[reply]
@Theknightwho: In a word: chill.
That you're frustrated is understandable, but you're now flinging accusations against one of our longest-standing admins; one who is probably the single person who has put the most overall work into making this project a success, and who has been around as all these policies and practices were established. That's very much uncalled for, and when they speak to these issues one should most certainly listen (disagree, sure, but definitely listen).
In this particular case, there is no question that there is an established consensus and an established practice against the reproduction of these ligatures on the site, irrespective of technical implementation. Billinghurst was implementing that consensus. Now, as you've pointed out, a lot can change on the technical side in the years since that consensus was established, and even absent changed circumstances consensus itself can change over time. It is therefore entirely appropriate to raise such issues anew so the community can assess both any changed circumstances and their position on the issue in general. But unless and until a new consensus is established the old consensus holds sway. And that is indeed how consensus has always worked on Wikimedia projects.
And on that note, let's please not get bogged down in one single detail of this. Whether or not it breaks search is just one aspect. We need to also consider things like whether we want to encourage or merely permit such practice from perspectives like what will that do to the practical aspect of trying to read and modify the wikitext (I've seen some very bad examples of pages drowned in a gazillion ligature templates), or from the perspective of the more abstract purpose of Wikisource (we do not, and cannot, perfectly replicate the original; do we really want to move the line to the other side of reproducing such minute typographic features of a given typeface and printer? How does that affect our general principle of not trying to reproduce artefacts of the printer, unless it reflects the author's intent?). If we allow or encourage the ct-ligature, what about the myriad other ligatures for which the same arguments can be made?
This isn't a straight-forward question, and we should take the time to get it right if we're to reexamine the status quo. Not least to make sure we head off future drama in this area. Xover (talk) 16:20, 2 August 2022 (UTC)Reply[reply]
@Xover While I agree that all of that was a bit harsh on Billinghurst and others, the way they worded their responses did make it seem like they only cared about the outdated template inserting a ƈt, or the fact that 'consensus says this so...'. They didn't even acknowledge the fact that this was updated to not do that, and I guess it led to some flare-ups because of it.
In my opinion, transcribing a work, even with a large amount of wikitext for ligatures in it, is no harder than transcribing any other work, especially with the modern editing tools we have these days. I feel that there should at least have been some kind of discussion before speedy-deleting a template and undoing the large amount of work that I, and other editors using this template, did in older works that used that ligature, and I definitely feel that there needs to be a wider debate and discussion on this policy.
I've seen multiple things, such as Long-s, where it can either be transcribed like it is in the text, or just use s, and I see no reason why this has to be any different from that. Reboot01 (talk) 17:46, 2 August 2022 (UTC)Reply[reply]
I see long s as a different case, as it used to be used as a separate character, although representing the same sound as the short s, and its usage was connected with some rules. On the other hand, I understand the ƈt ligature to be just a feature of the typeface, not distinguished from the common ct characters. --Jan Kameníček (talk) 18:17, 2 August 2022 (UTC)
Then, with it being just part of a typeface, couldn't we say that using different fonts, like blackletter, or using something like template:old style for some numbers, is just a decision of the printer and shouldn't be included since they're just typographical? Reboot01 (talk) 18:51, 2 August 2022 (UTC)Reply[reply]
Precisely. The logic behind that argument in the first place was to say that it was fine not to use the hard-encoded Unicode ligature characters. It does not follow that any attempt to stylise ligatures should be disallowed. PseudoSkull raised an excellent question that nobody has yet been able to answer: what is the harm that is done by this template? I'm also unconvinced by the attempt to argue long S is a separate character - it carries no semantic value, which is precisely why we all view it as optional (i.e. stylistic). The reason it was never disallowed is that it didn't disrupt text searchability. If it did, I have no doubt it would have been treated in the same way.
And yes, @Xover, @Reboot01 is entirely right about why I got so annoyed about this. @Billinghurst has made absolutely no effort whatsoever to be cooperative, has all but ignored this discussion, and has by any measure abused their position as an administrator, regardless of whether anyone agrees with their view on the matter itself. It's extremely clear that they are not interested in engaging in this topic in a constructive way, being perfectly happy to ignore this discussion, but jumping into action immediately if they felt they weren't getting their way. That is not a sign of good faith, and is pretty insulting when they've just undone something that took effort to implement in the first place (a couple of hundred instances). Theknightwho (talk) 01:22, 3 August 2022 (UTC)Reply[reply]
I did not write any argument for or against; at the moment I am still just reading and considering the arguments of others. I only reacted to Reboot01’s argument by explaining the difference between long s and ƈt, nothing more. As for blackletter: that is very different too, as it is often used to make e.g. some titles more prominent/visible, and I am far from sure whether that was the printer’s or the author’s decision. However, you are right that the old-style template for numbers looks like pretty much the same case, so we should decide whether we want to add another similar template: I am still not sure, and Xover is afraid of flooding our texts with them.
I would also like to ask Theknightwho to stop offending others, as they are the only one to blame for breaking an established rule before attempting to establish a new consensus, which resulted in the current drama. If you re-read the discussion at your talk page, you will notice that it was explained to you politely there and that you already started with your offences there, making it such a drama. Let’s leave this to history now and concentrate on factual arguments only. --Jan Kameníček (talk) 06:35, 3 August 2022 (UTC)
Just to be clear: I've taken no particular position on this issue. I'm saying the question is multi-faceted so we can't just argue one aspect of it, and the principles underlying the current (status quo) consensus have implications that go beyond ct-ligatures and which also need to be considered. It is to me also important that if we are to change established practice in this area, we do it properly and document it in a way that removes the potential for future drama as much as possible.
To that end, my suggestion would be to take a step back, and start with a neutral and dispassionate description of the issue. Starting from the assumption that at least some contributors want to reproduce ct-ligatures: what are their motivations (why ct-ligatures, rather than st, ffi, or any number of others), what are the technical options and how do they differ from when this issue was last discussed, what are the potential downsides, and what other issues will be affected by such a change (e.g. those other ligatures)? I would also like to eventually arrive at some kind of actual policy page describing this and related issues, for which purpose some thought about how to formulate either the status quo or a hypothetical uti possidetis would be beneficial. Xover (talk) 07:07, 3 August 2022 (UTC)
Long s is a special case because it appears outside of printed works. Authors such as John Milton and Jane Austen used the long s in their manuscripts. The rest are ligatures to beautify the text and can be added automatically by some ebook reader software. Honestly, I think that such ligatures bog down the proofreading and validation of works without adding anything. Languageseeker (talk) 00:36, 4 August 2022 (UTC)Reply[reply]
@PseudoSkull: Just flagging this to your attention, as it's clear that billinghurst neither understands the issue nor cares about acting cooperatively with other editors. Theknightwho (talk) 11:19, 2 August 2022 (UTC)Reply[reply]
Deleting was in accordance with both the current practice and the rule expressed in Wikisource:Style_guide/Orthography#Ligatures. Although the given reason might not apply anymore, the rule is valid until it is cancelled by new consensus. I understand this discussion as an attempt to reach such consensus, so we will see what comes out of it. The template should definitely not be recreated before reaching an agreement. --Jan Kameníček (talk) 12:02, 2 August 2022 (UTC)Reply[reply]
@Theknightwho: You are speaking out of your hat. I well and truly understand the issue. I definitely do care and have worked cooperatively with people here for 15 years. That I don't agree with your argument and that I follow the consensus of this community should not be grounds for condemnation; they should have you listening. There is ZERO requirement to produce facsimile copies of works and that has been the opinion of this community for years. We have tried not to overburden our works with templates, to keep our proofreading as simple as possible. What does this community or the library gain from a reproduction of a CT ligature-look? Nothing. Read the archives of this page and you will see this argument about facsimile reproduction and the community's higher interest in the written and reproduced word, not a publisher's artefact. — billinghurst sDrewth 12:44, 5 August 2022 (UTC)
Then let's do away with everything stylistic then. Why target this and only this? Theknightwho (talk) 12:50, 5 August 2022 (UTC)Reply[reply]
If you wish to have a conversation then please produce a logical argument about what you are proposing, or the changes that you wish to see undertaken. We do try to represent a work without slavish facsimile reproduction. We do not wish to have no formatting; many styles are clearly pertinent and add definition, clarity or readability to works. We made allowance for "long s", though some of us would still like it gone, and that template is required to display in the main namespace as a standard "s" without imposition of any stylistic aspect. "ct" is only a visual artefact without any benefit and was not even continued as a style by publishers. This community determined that we did not wish to progress with the template and the published look. It is definitely unfortunate that you spent time doing that editing; however, if you had followed the style, you wouldn't have spent that time, so please do not blame us. We have deleted completed works, so we all understand about effort spent ending up being no longer valid. I have had to delete works that I have spent hours upon. C'est la vie. — billinghurst sDrewth 13:16, 5 August 2022 (UTC)
I mean...if anything, this new discussion shows that 'this community' is not all in agreement on this topic, and I still feel that there could be more discussion and voting on this, as @Xover and a few others have said we should. Just because the responses have slowed doesn't mean it's just case closed, it's staying deleted. Reboot01 (talk) 14:22, 5 August 2022 (UTC)Reply[reply]
And on top of that, the whole 'we don't make facsimiles of works' argument would make a page like this one, with its formatting of the tail, completely unnecessary and out of scope, along with so many other works that go beyond it: Page:Lewis Carroll - Alice's Adventures in Wonderland.djvu/57. Reboot01 (talk) 14:29, 5 August 2022 (UTC)
@Billinghurst I have already explained what the "changes" I have proposed are. Why are you against this particular stylistic element and not others? The onus is on you to explain why this one is not justified, given the fact that there are so many others that we do allow, because at this point it feels like a mantra rather than something with any real logic behind it. I see no proposals to delete any other templates which could impede proofreading, of which there are many, either.
And yes - the community is clearly not in agreement about this. "The community" determined that we should not be using private use characters or templates which impede searchability, which do not apply here, and the fact that you keep making that point despite several other people pointing out that we aren't even talking about the same thing is starting to become absurd. Either you don't understand the issue at hand, or you do understand it but choose to ignore the fact that people are explaining why this is different. Neither looks good. At this point, it's not even about the fact we disagree - it's the fact you keep stonewalling the arguments made against your view. Theknightwho (talk) 16:47, 5 August 2022 (UTC)Reply[reply]
No, in a community, the onus is never on one person to explain something. You didn't bother responding to my post below. I think Billinghurst might have been a little quick to delete, but you were quick to recreate and edit war, so let's go beyond that. Whether certain characters are ligatures is part of a font style, and we don't do font styles. Blackletter is more like italics or bold, and Alice in Wonderland's tail ... is one of those idiosyncratic things. If you look at the Penguin Alice, it looks like every other book in the series, and reproduces the tail.--Prosfilaes (talk) 00:56, 6 August 2022 (UTC)
The "ct" is a publishing artefact used in a very small period of publishing time, and was only ever a publishing artefact; it is not a real set of characters and the publishing industry stopped doing it. Other styles that we do have been specifically inserted into works as a clear means of styling. Template:ct, as you had it, displayed in main ns; template:long s does not. You are only presenting half of a story, and solely because you like the look of the "ct" to replicate a publisher. As you have it, it still impedes search, and your statement that it does not is just wrong; you clearly don't know local search, as the span exists in the reproduction and always will. It still makes proofreading more complex and it does nothing to build a good library.

@Reboot01: With regard to your !vote. How many times should we have to re-prosecute a case that has been addressed in multiple ways and by multiple means about facsimile reproduction, just because you have arrived with your really good idea, about which we already had a discussion many years ago? Are we going to go back and discuss replicating fonts as published? Do you want to go back and insert every hyphen we have removed from a work? We have had these conversations, and the results of these conversations come through in Wikisource:Style guide and related pages. So if you want a change to the style guide, the approach should always be to discuss first, not to ignore the guide and impose your own styles that we have determined not to do. — billinghurst sDrewth 02:31, 6 August 2022 (UTC)

Except you're ignoring the multitude of style options that get freely applied to works in the wikitext, such as the one already pointed out to you. Plus it does not disrupt search at all, because the span does not show up in search results. I had already tested this. As you can see here, search results show the final display form if it finds a match there. It doesn't only look in the raw wikitext. Please explain what the actual problem is, because I'm really not seeing one. Theknightwho (talk) 17:03, 7 August 2022 (UTC)Reply[reply]
We are not going to reproduce fi ligatures because they're universal and practically mandatory. The ligated ct and st are likewise mandatory in those font styles, and not at all in more modern font styles. In scripts like Latin and Greek, characters like the s in pre-1800 usage, or sigma in modern Greek usage, that have mandatory positional variation are encoded separately instead of requiring complex shaping. I'm not a fan of using the long-s, but the distinction is a plain text distinction.
The letters æ and œ are more complex. Both are full letters in some languages; æ is used in Old English and Danish, and œ is used in French and various Cameroon languages. (For example wikt:coeliaque is considered a misspelling in French.) Likewise, æ and œ are often but not always converted to e in American English, rather than to the ae or oe that is more common in British English. Even in just English, we can't blindly convert it, which makes it a spelling conversion, which we don't do.--Prosfilaes (talk) 03:05, 4 August 2022 (UTC)
Without wishing to get involved to any major extent in such a fractious topic, it seems that the easy option has been overlooked: add the relevant ct-ligature-enabling CSS in the index CSS for the relevant work. No templates at all are needed to gunk up the editing interface. Regardless of whether ct-ligatures are good or bad, having to replace every "ct" pair in Wikicode is, I think self-evidently, a complete non-starter in terms of workflow or maintenance and has been rendered completely unnecessary by the software anyway. Inductiveloadtalk/contribs 22:12, 10 August 2022 (UTC)

Tech News: 2022-31

21:21, 1 August 2022 (UTC)

No Download Link on Portal:Sherlock Holmes (UK Strand)

No download link appears on the page Portal:Sherlock Holmes (UK Strand) even though it has an AuxToc. Can someone take a look? Languageseeker (talk) 00:02, 3 August 2022 (UTC)

Board of Trustees election 2022 -- Candidates

Hi all,

Affiliate representatives have finished voting.
The following six candidates were selected to continue to the community voting stage of the election:

For more information, see the statistics and the results.

Thanks to the community members who participated. The candidacy for the Board of Directors is not a small decision. The time and diligence that the candidates have shown so far speak of their commitment to the movement. Congratulations to the candidates who have been selected. We also express great respect and gratitude to those candidates who were not selected.

From these six candidates, the community will elect two who will become the new Trustees.
Community voting begins on August 16 and runs until August 30.


Next steps:

Before voting, please take a moment to learn more about the candidates' positions. You can do this in one of the following ways:


Thank you!
--BPipal (WMF) (talk) 13:31, 3 August 2022 (UTC)

OCR button not working?

The "Transcribe text" button in the proofreading toolbar doesn't seem to work for me; I can press it, but it doesn't change the page text. I'm pretty sure it was working fine a month ago, so this seems like a recent development. Are others experiencing this same problem? Fortunately, https://ocr.wmcloud.org itself still works, but it is slightly more inconvenient to use. Does anyone know what's going on? Shells-shells (talk) 19:05, 3 August 2022 (UTC)

Bookreader app

When I was uploading a scan of an English book to Wikisource a few days ago I noticed the bookreader app icon on the index page of the work, and was very impressed. We would like to be able to use the same programme on Wicidestun, the Welsh version of Wikisource. I can't find any information on how we might set up the application ourselves or who to ask to enable it for us. Could anybody point me in the right direction please? AlwynapHuw (talk) 19:35, 5 August 2022 (UTC)

@AlwynapHuw Those app icons are done with mw:Help:Page status indicators, as defined in MediaWiki:Proofreadpage index template. There doesn't seem to be any complex setup involved; it should work as long as the right link is generated. I believe the relevant line of code is
<indicator name="bookreader">[[File:BookReader-favicon.svg|18px|link={{fullurl:toollabs:bookreader/en/{{{n|{{PAGENAMEE}}}}}}}|Open file in BookReader]]</indicator>
I'm not certain, but I think this could be copied right over to cy:MediaWici:Proofreadpage index template, simply changing bookreader/en/ to bookreader/cy/ (and translating the tooltip text). Shells-shells (talk) 00:35, 6 August 2022 (UTC)
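
As a rough sketch of the adaptation described above: assuming the Welsh index template accepts the same markup and that the BookReader tool serves a bookreader/cy/ path (neither is confirmed in this thread), the copied line might look like this. The tooltip text is left in English as a placeholder for a Welsh translation.

```wikitext
<!-- Assumption: only the language segment of the tool path changes (en -> cy). -->
<!-- The tooltip "Open file in BookReader" is a placeholder to be translated. -->
<indicator name="bookreader">[[File:BookReader-favicon.svg|18px|link={{fullurl:toollabs:bookreader/cy/{{{n|{{PAGENAMEE}}}}}}}|Open file in BookReader]]</indicator>
```

If the icon does not appear after saving, the first things to check would be whether the toollabs: interwiki prefix resolves on the Welsh wiki and whether the icon file is available there.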

It would sometimes be helpful if IA-upload didn't make files tiresome to use.

Index:NBS Technical Note 11176 (1983) (IAutilityprogramsf1176dick).djvu The OCR is out by one page. ShakespeareFan00 (talk) 14:07, 6 August 2022 (UTC)

I guess it is the usual bug. I personally stopped using it due to this "one page offset". Mpaa (talk) 09:00, 7 August 2022 (UTC)
I sometimes use IA uploader and do not experience this problem, but it probably occurs from time to time, see the phabricator link. --Jan Kameníček (talk) 09:13, 7 August 2022 (UTC)

How do I split a bilingual text between two Wikisources, so that it's visible to both?

I saw this done with Fitzgerald's Rubaiyat, where the Persian pages are housed on WS-fa, and the facing English translations on WS-en. In the indices, the pages in the other language are colored grey, but are integrated together when you read the book. The same should be done with Whinfield's Rubaiyat, which is all housed on WS-en but with most of the Persian pages marked 'problematic' blue and some without even an attempt at digitization.

Recently I came across Bose's Essential Religion on WS-bn, which is in English for the first half and then in Bengali. I'd like to migrate the English pages here, but keep them visible on WS-bn, and at the same time make the Bengali pages visible here. (Please ping.) Kwamikagami (talk) 20:04, 6 August 2022 (UTC)

@Kwamikagami: For interwiki transclusions you can use the template {{iwpage}}. An example can be seen at Page:The seven great hymns of the mediaeval church - 1902.djvu/86. However, it works well only if the other language wikisource page does not include any local templates or only templates compatible with en.ws templates. --Jan Kameníček (talk) 10:54, 7 August 2022 (UTC)
I noticed that if I download a PDF of your listed example (from https://en.wikisource.org/wiki/The_seven_great_hymns_of_the_mediaeval_church), the resulting PDF doesn't have the transcluded pages. Do you know if this is a bug with iwpage, and if so, whether it is likely to be fixed? Persimmon and Hazelnut (talk) 22:01, 8 August 2022 (UTC)
@Persimmon and Hazelnut: The reason is that the non-English pages are not transcluded into the main namespace, see e. g. The seven great hymns of the mediaeval church/Stabat Mater and Mater Speciosa/Stabat Mater, Lindsay where there are only odd pages with English text and even pages in Latin are omitted. --Jan Kameníček (talk) 22:30, 8 August 2022 (UTC)
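
For illustration, a minimal sketch of how such an interwiki page might be written on en.ws. It assumes that {{iwpage}} takes the language code of the Wikisource holding the same-named Page: as its first parameter; that parameter form is not stated in this thread, so check the template documentation before relying on it.

```wikitext
<!-- Body of a Page: on en.ws whose text is proofread on another-language Wikisource. -->
<!-- Assumption: the first parameter is the language code of that wiki (la = Latin here). -->
{{iwpage|la}}
```

For the Bose example, the same pattern would presumably use bn on the en.ws side and en on the bn.ws side, subject to the template-compatibility caveat mentioned above.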

Copyright of the Bin Laden Files

Are the Bin Laden Files subject to copyright? (unsigned comment by Blahhmosh (talk), 7 August 2022)

You have to write to the Al-Qaeda copyright department to find out. — ineuw (talk) 04:54, 7 August 2022 (UTC)

Be careful before ever trying to contact Al-Qaeda. Meanwhile, w:Al-Qaeda has an external link: "Bin Laden documents at a glance". CBS News. Archived from the original on May 11, 2012.--Jusjih (talk) 13:09, 7 August 2022 (UTC)

The CIA said they redacted all the material they knew to be copyrighted, but looking through the list of items there's a ton of stuff from media outlets and other stuff that at first glance seems like it would be under copyright. I suspect a large proportion of the material (if not all) would be speedy deleted for that reason if uploaded to Commons. --Arbitan (talk) 04:07, 12 August 2022 (UTC)

Copyright of the Pandora Papers

Are they subject to copyright? Blahhmosh (talk) 19:23, 8 August 2022 (UTC)

Tech News: 2022-32

19:49, 8 August 2022 (UTC)

Wikisource:General disclaimer addendum proposal

I propose that, in the licensing section, we include a line about trademark law. Something to the effect of this (feel free to suggest better wording):

While works hosted here are freely licensed, certain names used in works, such as titles of books, names of fictional characters, or product names, may be trademarked in some jurisdictions. This does not impact the copyright status of the work. Care should be taken to ensure that certain usages of a work do not violate local trademark law.

Rationale: We have several project disclaimers for encyclopedias, which have been the subject of a deletion discussion here at WS (recently closed). Those project disclaimers specifically mention the trademarks of those encyclopedia names. In that discussion I noted that our project pages, most notably our general disclaimer, do not mention trademark law at all, and probably should, since this is a more universal issue that doesn't just apply to encyclopedias. PseudoSkull (talk) 22:20, 10 August 2022 (UTC)

Copyright of the Panama Papers

Are the Panama Papers subject to copyright? Blahhmosh (talk) 19:17, 11 August 2022 (UTC)

I am afraid that most of these leaked materials (if not all) are documents of private people or companies and thus are not in the public domain. Besides that, some of the leaked materials also describe legal financial operations of private persons and companies, and in such cases there might be problems with w:Privacy laws of the United States. --Jan Kameníček (talk) 07:15, 12 August 2022 (UTC)