User talk:Xover/Archives/2023

Please do not post any new comments on this page.

This is a discussion archive first created in 2023, although the comments contained were likely posted before and after this date.

See current discussion or the archives index.

Category:Confucianism => {{author}}

We somehow have authors being piled into Category:Confucianism through religion or worldview (P140). Other religious categories have been for works, rather than authors, which have had a different lens applied, and this level of category has been used for works, not authors. For the moment can you please neuter their addition, and we will need to do some work in that space. [All part of the category fixes for authors / works / portals, and we still haven't worked out where portals sit.] — billinghurst sDrewth 23:22, 31 December 2022 (UTC)

@Billinghurst: I haven't looked at the code yet so all kinds of grains of salt, but… I'm guessing it will be equally easy (or possibly even easier) to change it into a distinct category like Category:Confucian authors (or Category:Authors with a Confucian religion or worldview to match the WD property's construction). Would that be a suitable stop-gap until we make a master plan for this?

I'll try to take a look tomorrow, but it might be a day or two depending on what IRL permits. Xover (talk) 01:46, 1 January 2023 (UTC)

@Billinghurst: I had a quick look and as suspected we already have the right logic in place for this, cf. Category:Catholic authors (691), Category:Mormon authors (96), Category:Christadelphian authors (4), Category:Lutheran authors (228), Category:Methodist authors (163). It's just misconfigured for Category:Confucianism (2) and others. We can fix it pretty easily provided the schema "Denomination authors" is the correct one (at least for now; it can be changed en masse later)? Xover (talk) 12:10, 1 January 2023 (UTC)

Thanks. Found and updated. I see that there is more work to do. It never ends. <shrug> — billinghurst sDrewth 07:20, 2 January 2023 (UTC)

The Casebook of Sherlock Holmes

Couldn't help noticing that you're plowing through The Casebook of Sherlock Holmes. If you want, I just added Index:The Strand (Volume 73).pdf, which has the last three stories that are in the PD. They also have the original images. In any case, Happy New Year! Languageseeker (talk) 15:08, 3 January 2023 (UTC)

Noting issues with some forced formatting defeating layers

To note that I am about to start to resolve issues with Index:St. Nicholas, vol. 40.1 (1912-1913).djvu/styles.css forcing a whole lot of formatting on a work that inhibits the use of the toggled layers. I am going to hazard a guess that the contributor has other impositions in other index css, as they seem to find our style guide and our consensus often not to their liking on a regular basis, and as such somewhat discardable. — billinghurst sDrewth 04:41, 10 January 2023 (UTC)

@Billinghurst: Ugh. That is rather an excessive amount of custom styling, yes. Granted St. Nicholas is a formatting heavy work, but /styles.css must be possible to prune quite a bit. Xover (talk) 07:15, 10 January 2023 (UTC)

Bulk revert?

Hi, User:Atyang344 has gone on a validation spree in the past few hours. Most of the pages are in a parlous state (except the ones that I proofread). Could a bulk revert please be done of anything they have validated? Most of their "proofreading" is equally bad, but I'm not sure what to do there as others have validated some of it. Beeswaxcandle (talk) 16:32, 10 January 2023 (UTC)

Doing… Xover (talk) 16:54, 10 January 2023 (UTC)

@Beeswaxcandle: Done. Xover (talk) 17:00, 10 January 2023 (UTC)

As we have been through something similar recently, it smells like some wikicommunity trying to do something on their own in their own lab exercise. Probably time to put out the feelers. — billinghurst sDrewth 22:41, 10 January 2023 (UTC)

Yeah, smells like either a local (non-English) wiki community running an "editathon" for English-language texts, or some kind of school assignment with insufficient preparation and oversight. A few years ago there was a sudden spate of Good Article nominations on enWP for articles on Shakespeare's individual sonnets, all of them badly prepared and with a lot of plagiarised content. Turned out to be a Prof. somewhere simply setting their class that as an assignment, scoring students on whether they achieved GA or not, and doing no prep or communicating with the community. This has kind of that same feel to it. Xover (talk) 05:02, 11 January 2023 (UTC)

author link switch within module:article link?

At the moment {{article link}} only allows for author links through formatting the text in the field of the parameter. I am looking to convert {{Weird Tales link}} to parse Template:article_link, though that would mean that I need to go through and wikilink every author parameter (Template:Sandbox/Wikisource:Sandbox). It seems that it would be easier if there was the means to have a parameter to just wikilink the author and the co-author field when parsing that template. Possible without too much hard slog? — billinghurst sDrewth 12:20, 8 January 2023 (UTC)

Oh and same would apply for coauthor — billinghurst sDrewth 12:21, 8 January 2023 (UTC)

@Billinghurst: Making a parameter auto-link is no great effort in itself, but the existing |author=, |coauthor=, and |editor= are defined as free text (partly to enable multiple authors in a single parameter). If we're to change the behaviour of these we will have to bot-fix all the uses. I haven't checked so I don't know how many there are (if the template is mostly used on Author: pages there probably aren't that many that actually use |author=, much less |coauthor= and |editor=).

Alternately we could add new params with the behaviour that's desired, but we'd need to go full |authorn= to support multiple authors. Xover (talk) 14:56, 8 January 2023 (UTC)

I was more thinking of additional functionality, not a removal of the existing so we can nest it in other forms to continue that move to more standard forms, so the need for multiple authors beyond author 1 and author 2 was not a current requirement for the purpose envisaged. Can always fall back to usage of the base template if truly required. — billinghurst sDrewth 12:03, 9 January 2023 (UTC)

@Billinghurst: {{article link/sandbox}} (which calls Module:article link/sandbox) now supports |author1=, |author2= etc., and automatically links the argument to a corresponding author page. It supports an infinite number of authors, but they must be numbered consecutively and start at 1 (the code stops looking for params if a number in the sequence isn't present). If |author= is given it will ignore the numbered variants. If |author1= is given it ignores |coauthor=. I haven't looked at |editor= and |translator=.

Try it out for your use case and see if that does what you needed. The code is a rough cut intended just to have a concrete example for discussing functionality, so it may need rewriting (and definitely some testing and cleanup) before deploying. Xover (talk) 10:28, 10 January 2023 (UTC)
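The parameter precedence described in this comment can be sketched roughly as follows. This is a Python illustration of the logic only, not the Lua in Module:article link/sandbox; the function name `collect_authors` and the shape of the returned dictionary are assumptions made for the example.

```python
def collect_authors(args):
    """Apply the precedence rules described above to a dict of template arguments."""
    # A plain |author= wins outright: the numbered variants are ignored.
    if args.get("author"):
        return {
            "author": args["author"],
            "coauthor": args.get("coauthor"),
            "numbered": [],
        }

    # Walk author1, author2, ... and stop at the first missing number.
    numbered = []
    n = 1
    while args.get(f"author{n}"):
        numbered.append(args[f"author{n}"])
        n += 1

    # |coauthor= is ignored as soon as |author1= is present.
    coauthor = None if numbered else args.get("coauthor")
    return {"author": None, "coauthor": coauthor, "numbered": numbered}
```

Because the scan stops at the first gap, a call with |author1=, |author2= and |author4= would only pick up the first two authors.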

special:permalink/12911849 It isn't applying the pipe trick to create an active link; same issue I saw when I tried to wrap the plain parameter inside [[Author:{{{1}}}|]] —unsigned comment by Billinghurst (talk) 23:39, 10 January 2023‎ (UTC).

@Billinghurst: Meh. The pipe trick is what in devspeak is called a "pre-save transform", that is it happens before the page is saved. This means that when the user is the one adding the link it gets expanded and saved with the template invocation: {{article link|…|author=[[Author:John Smith (1666-1945)|John Smith]]}}. Neither the template nor the module ever sees the pre-transformed link ([[Author:John Smith (1666-1945)|]]). Scribunto doesn't expose the pre-save transforms anywhere, so there's no real way to manually use the pipe trick from a template or module. (incidentally, the existing production code is broken in this way too, it just doesn't matter because the user is providing the link. But that's why I didn't find this in testing: my sandbox was seemingly producing the same output as the production version.)

The practical effect is that we'll need to use separate link and display parameters to handle this. I've added support for |authorN-display= that will override the displayed link for the given author; and when no -display is given it uses the raw name. I.e. for |author1=Howard Phillips Lovecraft it constructs [[Author:Howard Phillips Lovecraft|Howard Phillips Lovecraft]]; for |author1=William Shakespeare (1564-1616) it constructs [[Author:William Shakespeare (1564-1616)|William Shakespeare (1564-1616)]]; and for |author1=William Shakespeare (1564-1616) + |author1-display=The Bard it constructs [[Author:William Shakespeare (1564-1616)|The Bard]].

Example: "The three hunters" by Howard Phillips Lovecraft, August William Derleth, and The Bard. in Once a Week, Series 1, 10 (1864)
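The link construction described above, with |authorN-display= overriding the displayed text, could look roughly like this. It is a Python sketch under the assumption that the template arguments arrive as a plain dictionary; `author_link` is a hypothetical helper name, and the real module is written in Lua.

```python
def author_link(args, n):
    """Build the wikilink for the n-th numbered author."""
    name = args[f"author{n}"]
    # Fall back to the raw name when no -display override is given.
    display = args.get(f"author{n}-display") or name
    return f"[[Author:{name}|{display}]]"


args = {
    "author1": "William Shakespeare (1564-1616)",
    "author1-display": "The Bard",
    "author2": "Howard Phillips Lovecraft",
}
print(author_link(args, 1))  # [[Author:William Shakespeare (1564-1616)|The Bard]]
print(author_link(args, 2))  # [[Author:Howard Phillips Lovecraft|Howard Phillips Lovecraft]]
```

Keeping the link target and the display text in separate parameters avoids any dependence on the pipe trick, which is applied only when a page is saved.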

In any case, see if it works better now. Xover (talk) 07:27, 11 January 2023 (UTC)

Yes, that will work special:permalink/12911849. I have updated WT_link and then rolled it back awaiting the module upgrade — billinghurst sDrewth 13:12, 11 January 2023 (UTC)

@Billinghurst: Sandbox has been sync'ed to production now so you should be good to go. Xover (talk) 08:34, 16 January 2023 (UTC)

With next set of DNB, the footer templates ...

With the next set of DNB available, and the next list of contributors to start to build, the methodology for building the next set of {{DNB footer initials}} templates becomes even clunkier as more initials are being reused. While I can go and build and gerrymander fix that out, it is becoming an ugly and old-fashioned way to do things. I am thinking that it is probably time for an underlying data file approach, though having a humongous #switch is just as beastly. What I was thinking is to simplify it all in a number of ways.

set up a base template that does author initials, and that can be DNB/EB1911/CE/...
set up respective css that formats initials based on the work
set up datafiles that manage respective editions of works, for DNB that is initials -> the vols, the errata, the supplements -> the authors

I am not confident in my coding ability, though the data management and the migrations I can do. Obviously would want to start with one, then work on. Your thoughts/guidance before I start to think through functional specs? Thanks. — billinghurst sDrewth 22:22, 18 January 2023 (UTC)

@Billinghurst: Hmm. If we're going to go for this type of approach, the way to go is definitely Lua. In Lua you have functions to, literally, "load that data file" and then use it as a lookup table. Ignoring all boilerplate, error correction, etc., your giant #switch in a template becomes lookuptable = mw.loadData(datafile); author = lookuptable['initials'] in Lua. Scribunto is even smart enough that it loads that data file only once per page on which this is called, so you won't waste memory or execution time (and start running into MediaWiki's hard limits on those). Ignore the ugliness that is {{ts}}, but you can see an example of such a data file at Module:Table style/data. The structure for initials would probably be simpler than that, but still with some syntax requirements of course.

Setting up one module that handles loading the data file and spitting out initials wikilinked to the author should be pretty straightforward. CSS could go in several places (Index: styles, the wrapper template, added by the module, etc.) and I think we'd best have something more concrete before we nail down the final answer on that. I also think we'd better reconsider the use of {{DNB GCB}} versus something like {{dnbi|GCB}}. Having one dedicated template per author is pretty insane when all we're doing is looking up the initials and linking to the author page. One author initials template per work seems a reasonable balance. Then we could attach style info there (if we don't punt it to Index: styles) as well as picking the data file to use for looking up initials.

For the data files, I'm not sufficiently familiar with the structure of the works (editions, volumes, supplements, etc.) to be very confident. My starting assumption would be that we have one data file per main work, and not separate data files for things like supplements or per volume. So one datafile for DNB entire, and one for EB1911, and so on. But possibly the differences are large enough that DNB00, DNB01, and DNB12 need separate datafiles.

In any case, I'd be happy to help out with the nerdy bits here if you can supply the expertise on the works, their structure, usage, etc. Xover (talk) 07:20, 19 January 2023 (UTC)
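The "one template per work, one data file per work" idea in this comment can be sketched as below. This is Python rather than Lua, so `load_data` merely stands in for what `mw.loadData` does in Scribunto; the `DATA_FILES` table and its single entry are illustrative assumptions, not the contents of any real data file.

```python
from functools import lru_cache

# Hypothetical stand-in for per-work data files (one table per work).
DATA_FILES = {
    "DNB": {"EC": "Edward Caird"},  # illustrative entry only
}


@lru_cache(maxsize=None)
def load_data(work):
    """Load the lookup table for a work once and reuse it on later calls."""
    return DATA_FILES[work]


def initials_link(work, initials):
    """Return the initials wikilinked to the author page, or bare if unknown."""
    author = load_data(work).get(initials)
    if author is None:
        return initials
    return f"[[Author:{author}|{initials}]]"


print(initials_link("DNB", "EC"))  # [[Author:Edward Caird|EC]]
```

A wrapper such as {{dnbi|GCB}} would then only need to pass the work name and the initials, replacing both the giant #switch and the one-template-per-author scheme.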

The initials are not unique through the complete work, they have runs within the volumes of the first series, sometimes in the first series an author has two sets of initials as they disambiguate themselves. Then from series to series, the initials can be re-used on a same or different author, or an author can have a modified set between series.

So it needs to have the logic to do

initials lookup,
apply logic based on volume/supp, then
select and output initials, and the wikilinked author.

We do have some logic in the Index: pages special:prefixindex/Index:Dictionary of National Biography… So I would say that we could label all the various works in a sequential order based on the name stem suffix, then I would build initials tables, and assign names, and by a means apply the logic back to the run of volumes. I am pretty certain that they are linear runs without chopping and changing back (that I have to check). Here a start and stop index no. per set of initials/name. If there is chop and change that would just mean more lines per initials set and for me to fix.

Variation comes within Vols. 1 to 63 themselves, otherwise as we jump to each supplement set, and we have done that work prior to supplement 3, and I am working on lists of contributors for supp. 3.

Not something that we readily can punt to the Index:....css as they are per Index as a security measure, and that becomes a PITA. Also once we have done the /data for the DNB. —unsigned comment by Billinghurst (talk) 09:02, 19 January 2023‎ (UTC).

@Billinghurst: Hmm. If the initials aren't unique the logic becomes a little more complicated. We can probably look up what volume we're being called from based on page name, but that's a bit of a hacky solution, technically speaking, for several reasons. It would be better if we could find a way to add that as a parameter somewhere that eventually gets passed in to the module, but off the top of my head I can't think of any way that could sensibly be done. If we're to detect it from page name we'll definitely have to exercise some discipline in page/index/file naming, and possibly also move-protect them to prevent massive breakage.

How are these works currently set up in terms of page names and subpages? As I recall, they have relatively custom page structure (made way before current standards) but I'm vague on the details. Is it at all practical to migrate them to something close to current standards (if needed) or is the job too massive / complicated to be worth even considering?

Because I don't think we can (currently) get at the Index/File name from the transcluded page in mainspace; and in the Page: namespace we can't get at the page name in mainspace where we're being transcluded. In order to have manageable logic here we'd need to have pretty strictly regularized page naming in both namespaces. --Xover (talk) 15:33, 19 January 2023 (UTC)

My thinking was something like a data file with a list of indexes/pages => common PAGENAME prefix: Dictionary of National Biography<suffix stem>.djvu

1 = ' volume 01' or 'Dictionary of National Biography volume 01.djvu'

...

63 = ' volume 63'
64 = '. Sup. Vol I (1901)'

...

67 = ', Second Supplement, volume 1'

...

70 = ', Third Supplement'

then saying =>

initials='EC' index='11' author='Edith Coleridge',
initials='EC' index='59' author='Edward Caird',
initials='EC' index='47' author='Ernest Clarke (1856-1923)',
initials='EC' index='67-69' author='Ernest Clarke (1856-1923)'

I am happy to build this as we have the data already gerrybuilt in things like Template:DNB EC. It is not as though they are all replicated, just the odds and sods, though as we get an extra supplement, it means the extra chance for duplication. Well that is how I would have thought it through if I was doing #switch one for the page: then one for the initials. I hope that makes sense. — billinghurst sDrewth 22:45, 19 January 2023 (UTC)
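The initials-plus-index scheme outlined above could be resolved along these lines. This is a Python sketch rather than a Lua data file; the four records are the ones listed in the comment, while the tuple layout and the helper names `parse_index` and `find_author` are assumptions for illustration.

```python
# (initials, index spec, author), taken from the list above.
RECORDS = [
    ("EC", "11", "Edith Coleridge"),
    ("EC", "59", "Edward Caird"),
    ("EC", "47", "Ernest Clarke (1856-1923)"),
    ("EC", "67-69", "Ernest Clarke (1856-1923)"),
]


def parse_index(spec):
    """Turn '11' into (11, 11) and '67-69' into (67, 69)."""
    start, _, end = spec.partition("-")
    return int(start), int(end or start)


def find_author(initials, volume_index):
    """Return the author using these initials in the given volume, if any."""
    for rec_initials, spec, author in RECORDS:
        first, last = parse_index(spec)
        if rec_initials == initials and first <= volume_index <= last:
            return author
    return None


print(find_author("EC", 68))  # Ernest Clarke (1856-1923)
```

If an author's initials turn out to chop and change between volumes, that only means more records for the same initials; the lookup itself stays the same.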

@Billinghurst: I put a quick mockup of this together, just to illustrate roughly how the pieces would go together. {{DNB footer initials/sandbox}} now calls Module:Author initials. The template passes "DNB" to the function in the module, which the module uses to load author data from Module:Author initials/DNB. That data file is where you'd edit initial → author page mappings. The syntax there currently is just about the simplest it can be, but we can make it arbitrarily more complicated if we need more fancy stuff (it is Lua programming code, even if we load it as a data file, so we can even put logic in it). There are also dummy test cases set up on Template:DNB footer initials/testcases just to show usage (they're not really testing anything meaningful just now). All of these are currently just to demonstrate the concept and make it easier to discuss, and not anything final.

On the variable initials... We can certainly set up config along the lines you outline. Details depend on the pattern of non-uniqueness (are they clean runs or switch back and forth, how many authors are affected in absolute numbers, and how many as a proportion of the total, etc.). I am hoping we can maybe set up a default mapping and then just add overrides for the (few) exceptions. Otherwise we'd need a lot of config (1:1 maps for every author and volume combination) that's inefficient and hard to maintain.

But the main issue is that when the template is called, it is called in either a mainspace page or a Page: namespace page. If we're to trigger off that page name we need to be strict about the naming of both sets of pages, and we need to maintain two sets of page name → volume rules. Doable, but not optimal. I've asked Inductiveload whether the Lua interface to Proofread Page has a function to get the associated Index: page name (undocumented), or possibly whether one can be added with a reasonable amount of effort. If so, that would work elegantly no matter where we were called (even in Translation: should that ever become relevant) and let us use just a single mapping. Even more elegant would be having access to the Volume field in the Index: (or the other structured fields in the Index:) so we could trigger off that directly.

But I think we need some example data to reason from. Any chance you could come up with an author or two that has contributions spread over several volumes, that have non-unique initials, and a handful of articles by them? That way I'd get a firmer idea of what we'll have to trigger off. Xover (talk) 06:41, 20 January 2023 (UTC)

Template talk:DNB footer initials/sandbox for some examples of three sets of people sharing initials, with one of those having two different sets of initials. — billinghurst sDrewth 21:56, 25 January 2023 (UTC)

Template:Editnotices/Namespace/Index

This has an unterminated <p> tag. It should be using {{pbr}} instead?

The page is protected, so I can't implement the obvious repair.

Also I'm seeing some LintErrors from PortalHeader here - https://en.wikisource.org/wiki/Special:LintErrors/missing-end-tag?namespace=10 ShakespeareFan00 (talk) 20:17, 1 February 2023 (UTC)

@ShakespeareFan00: Fixed the editnotice. The portal stuff is buried so deep inside twisty template code that I can't untangle it just now. It'll have to wait until I clean up the header stuff fully. Xover (talk) 07:49, 2 February 2023 (UTC)

Latest comment: 1 year ago · 11 comments · 3 people in discussion

This is a mess, and I'd like to have the ability to have an infinite number of labelN,datedN pairs.

see: Page:UKSI1964 (Part 3- Section 1).pdf/1336 for the reason why.

My previous attempts proved too difficult to implement reliably solely in wikitext markup, so the suggestion was that some kind of Lua-based generation be used. I tried to write some skeleton code in Module:Key value table, and realised I was entirely out of my competence. Can you take a look and, if you are feeling very generous, implement the module properly, updating UKSI/header accordingly? Thanks. ShakespeareFan00 (talk) 17:01, 3 February 2023 (UTC)

@ShakespeareFan00: I largely started working on UKSI from noticing your Module:Key value table and this discussion. It looks like you are attempting to recreate an infobox or even Module:TOCstyle (notice your UKSI tables have dot leaders). I recommend you use {{TOCstyle}} in {{uksi/header}} instead of attempting to reinvent your own (if you really need your own just add a new model to it; I was already considering a similar thing for the Appendices section of Page:Report of the Traffic Signs Committee (1963).pdf/8 since there are no page numbers there and I had to add empty ones to the template invocation as a workaround). —Uzume (talk) 12:55, 6 February 2023 (UTC)

The module I wrote wasn't just about that use case to be fair, and it may well be redundant if you have better approach.

Dot leaders will eventually be part of CSS, but that's a long long way off. ShakespeareFan00 (talk) 14:30, 6 February 2023 (UTC)

The issue here is that uksi/header needed to be able to handle a variable number of arguments, whilst suppressing defaults that weren't specified (see the default_idx handling logic). If you can set something up with TOCstyle that would allow the dot leaders as needed, but also allow for the suppression of default labels if not specified, feel free. ShakespeareFan00 (talk) 14:36, 6 February 2023 (UTC)

Please, please, do not add any more ways to fake dot-leaders anywhere. Until we get real support for them in CSS and web browsers all our hacks to fake them are just going to continue to cause problems, and encourage people to use them such that our cleanup job once we get real support will be that much more insurmountable.

Also, when I see pages like Page:UKSI1964 (Part 3- Section 1).pdf/1336 where everything is wrapped in a twisty maze of templates, nested three levels of subpages deep, my "hideously over-engineered" alarm immediately starts blaring at full volume. If our page source with templates looks more like line noise than raw HTML we're doing something wrong somewhere. It's possible that it's unavoidable, but I'd jump through burning hoops looking for simpler solutions and compromise presentations before accepting that as a given.

That little rant out of the way... Supporting an arbitrary number of fooN parameters in Lua isn't normally a problem. So unless there's some particularly hairy logic involved that should be straightforward to add. Xover (talk) 14:47, 6 February 2023 (UTC)

If you have a better approach to do the formatting, I'd welcome suggestions, as the aim here was to have one consistent style. ShakespeareFan00 (talk) 14:52, 6 February 2023 (UTC)

The core is {{uksi/paragraph}} which is also a mess, and may well have been written pre templatestyles, like the {{cl-act-p}} module we don't talk about. ShakespeareFan00 (talk) 15:37, 6 February 2023 (UTC)

@Xover: I've had it with trying to get template syntax to handle certain {{uksi/paragraph}} use cases. I'm considering removing the template entirely and going back to the exceptionally error-prone manual coding of the numbering and anchors, because the template logic to handle these cases is getting overly complicated. This will break a LOT of pages, but perhaps that will focus attention on getting something that is actually maintainable using a SANE syntax. So much for trying to have ONE consistent approach. (Sigh) ShakespeareFan00 (talk) 19:57, 7 February 2023 (UTC)

More positively, would you be willing at some point to look into making this template into a Module which can handle the various nested numberings without the need to have the /1 /2 /3 and other variants? ShakespeareFan00 (talk) 20:05, 7 February 2023 (UTC)

Another limitation is that I am not entirely happy with the use of a split parameter, given that elsewhere /s /e versions of templates are used. ShakespeareFan00 (talk) 20:05, 7 February 2023 (UTC)

The reason for the {{uksi/paragraph}} was to do with including anchors so that specific Regulations, or clauses, could be cross-referenced as more items of this nature become available on Wikisource. ShakespeareFan00 (talk) 20:05, 7 February 2023 (UTC)

File:A Letter on the Subject of the Cause (1797).djvu

I noticed this file is tagged {{do not move to commons}}. Do you remember why? —CalendulaAsteraceae (talk • contribs) 22:55, 5 February 2023 (UTC)

@CalendulaAsteraceae: I have no recollection of that file now (memory like Swiss cheese it seems), but my guess is that it's something I uploaded on behalf of another user and that the do not move is due to the lack of information template. Xover (talk) 06:32, 6 February 2023 (UTC)

Legit; thank you! —CalendulaAsteraceae (talk • contribs) 01:47, 7 February 2023 (UTC)

User:GrafZahl/AMS-style mathematics

Another example where the blanked userspace page approach could be applied. ShakespeareFan00 (talk) 10:41, 20 January 2023 (UTC)

No, just ignore it. It is of zero concern and causing zero harm. — billinghurst sDrewth 21:57, 25 January 2023 (UTC)

I agree it's not a high priority, but its impact is still non-zero. While most of the lint errors are just fussiness, some do matter to us and they do matter to the devs when trying to progress the parser (which is in dire need of improvement). Getting rid of the noise in the linterrors list is certainly good in that sense.

That being said, we have so many linterrors and so many of them are of so little actual impact, that among the many many maintenance backlogs we have those are very far down my list of worries just now. Xover (talk) 07:50, 2 February 2023 (UTC)

It is user: ns, it is irrelevant to readers and is zero to our readability. That the linterrors are checking user ns, and don't allow people to ignore these, seems to be the issue, not that there are errors there. Get them to fix/modify their tools. — billinghurst sDrewth 02:15, 10 February 2023 (UTC)

It would also be nice if the error counts given on Special:LintErrors reflected 'content' namespaces given the view you express. ShakespeareFan00 (talk) 08:35, 10 February 2023 (UTC)

category:Pages transcluding nonexistent sections

Tracked in Phabricator: Task T329432

As per Dictionary of National Biography, 1885-1900/Forshall, Josiah and another couple of hundred pages, do you know what is the issue? The identified page definitely doesn't have other untranscluded sections. A quick check doesn't show me where it is applied, so I am wondering whether it is in the code of the extension. Thanks — billinghurst sDrewth 11:06, 16 February 2023 (UTC)

@Billinghurst: This is probably T329432. Xover (talk) 15:56, 16 February 2023 (UTC)

User contributions for 82.167.152.179

Can you look over these, because I'm not seeing the sort of things that should have triggered the abuse filter which blocked this user?

They were contributing in a positive manner from the edits I reviewed (adding in Arabic transcription for a specific work). ShakespeareFan00 (talk) 08:30, 22 February 2023 (UTC)

@Jan.Kamenicek: Filter 52 is your baby, I believe? Xover (talk) 11:46, 22 February 2023 (UTC)

Corrected, I do apologize for the mistake in the filter. --Jan Kameníček (talk) 08:54, 23 February 2023 (UTC)

User may need some hints..

https://en.wikisource.org/wiki/Special:Contributions/TandangSoraPH

No issues with what is being uploaded, but they seem to need some help figuring out how Proofread page needs complete files. ShakespeareFan00 (talk) 23:37, 23 February 2023 (UTC)

Link curation...

Okay, your views seem to be clear on this. I've therefore abandoned the current effort, as you articulated your concern clearly. I have however moved the template logic I was using to my user space, so it can be reimplemented in the future if needed.

I don't necessarily agree that relying on IA/Hathi/Google Books is "good enough" as I've sometimes found errors in the metadata supplied, such as given dates for serial publication being misleading, and works that are only out of copyright in the US. Unless someone is actively curating the links, then it's impossible to know which links are in fact problematic, and I felt that a proactive effort to address the concern was better than a reactive one.

ShakespeareFan00 (talk) 10:31, 25 February 2023 (UTC)

@ShakespeareFan00: Oh, I'm sorry, I didn't notice that you'd opened a community discussion about this at WS:S. Do, of course, feel free to pursue that if you want to. I didn't mean to implicitly "veto" this (it was just a bold—revert—discuss revert).

To expand on my reasoning… We're talking about links, not hosted files, so the risk here is a possibility of a very weak form of "contributory copyright infringement" in a very small number of cases. Even linking to clear copyvios can sometimes be "no big deal" (not permitted, but not a big issue if it happens by accident). In addition, both Hathi and IA have massively more PD material than non-PD material, and though I have serious concerns about both their policies and implementation of them, they do have a stated aim to avoid copyvios that they sorta kinda follow. And, ultimately, we police uploaded files and added texts which will most likely catch it if one of those links did happen to point to a copyvio.

The issue is different for files we actually host, or that Commons hosts, but in those cases it is actually the file we're concerned about, and not the link as such. And these templates are used far more on Author:, Portal:, and disambiguation pages than on File: pages.

On the other side of the equation, manually verifying all these links is a massive expenditure of community effort, and the visible artefacts (the icon, tracking cats, etc.) are very "in your face" until that manual verification has happened.

In other words, in addition to disagreeing with doing this, my position is that doing so would require at least some public discussion and community consensus before implementing. Xover (talk) 11:57, 25 February 2023 (UTC)

Duly noted. Currently, I'd like to request some assistance in converting archive.org links into a specific templated form. The reasoning is that raw URLs are more prone to link rot, and we have templates like {{IA small link}} and {{IAl}} whereby the URL structure need only be changed in one location. (As an aside, it also makes it far, far easier to add silent verification categories to links as needed.) I'd been using AWB to convert from {{ext scan link}} in singular cases, but would welcome your suggestion on how to handle multi-part usage of that template.

This link conversion should also be done for HathiTrust {{HTl}} and {{HTlink}} and Google Books Google Books that have a standardized format. Of course if someone had the time it would be desirable to have one ext scan link module around which the current disprate templates were wrappers, so as to get the multi-volume behaviour in () in all of them :)

ShakespeareFan00 (talk) 15:01, 25 February 2023 (UTC)
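
The conversion requested above could be sketched roughly as follows. This is a minimal illustration only: it assumes that {{IA small link}} accepts the Internet Archive identifier as its first unnamed parameter (not confirmed in this thread), the function name `templatise_ia_links` is hypothetical, and any link label or page suffix is simply discarded. Multi-part {{ext scan link}} usage is not handled.

```python
import re

# Identifier: letters, digits, "_" and "-", with single dots allowed only
# between such runs, so a sentence-final "." is not swallowed.
_ID = r"[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)*"

IA_LINK = re.compile(
    # Bracketed external link: [https://archive.org/details/ID optional label]
    r"\[https?://(?:www\.)?archive\.org/details/(?P<bracketed>" + _ID + r")"
    r"[^\s\]]*(?:\s+[^\]]*)?\]"
    # Bare URL, optionally followed by a path such as /page/n5
    r"|https?://(?:www\.)?archive\.org/details/(?P<bare>" + _ID + r")"
    r"(?:/[^\s\]|}]*)?"
)


def templatise_ia_links(wikitext, template="IA small link"):
    """Replace raw archive.org/details links with a template call.

    ASSUMPTION: the template takes the IA identifier as parameter 1.
    """
    def repl(match):
        identifier = match.group("bracketed") or match.group("bare")
        return "{{%s|%s}}" % (template, identifier)

    return IA_LINK.sub(repl, wikitext)
```

Because the template name is a parameter, the same function could be pointed at {{IAl}} instead, provided that template takes the identifier in the same position.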

Babel

Would you consider adding Babel information to your user page? It is not mandatory, just useful. --Dan Polansky (talk) 09:05, 27 February 2023 (UTC)

No, sorry, not a big believer in user boxes. Xover (talk) 15:14, 27 February 2023 (UTC)

respective contributor blurb in author description

Supplementary. And as a bit of reflection, do you think that we still need the contributor components of DNB, DMM, EB9, EB1911, etc. on author pages? I was playing to see if there was a way to merge them and have them as a set rather than as repetitive noise blocks; then I wondered whether they still provide value. [Back as we were starting DNB and populating, sure; now, not so sure.] The text in those is getting egregious, and I am not certain of the value any further. There is also a better means to categorise if we properly populate WD using the contributed to creative work (P3919) / Dictionary of National Biography, second supplement (Q16014697) pair and inhale those results.—unsigned comment by Billinghurst (talk) 08:25, 19 January 2023 (UTC).

@Billinghurst: If the Wikidata item for an author contains some property that identifies him as having contributed to the DNB, then it's no problem having {{author}} spit out that blurb automatically. Showing the initials used may be a little more tricky unless there's an actual property for it at WD, but we may be able to hack something up by inverting the data file used for the footer initials (look up initials from name, instead of name from initials). I'm inclined to think it'd make sense to have properties for the initials at Wikidata, but the inconsistency / non-uniqueness you mention above may make that a little complicated. Adding categories based on Wikidata should be straightforward, but I'd need to look into the details to be sure. --Xover (talk) 08:30, 19 January 2023 (UTC)
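
The "inverting the data file" idea above can be sketched like this. It is only an illustration: the real footer-initials data lives in a module on-wiki, and the dictionary shape, the names, and the initials used here are made up. Several sets of initials per name are kept as a list, since the thread notes that initials are not always unique or consistent.

```python
from collections import defaultdict


def invert_initials(initials_to_name):
    """Turn an {initials: name} mapping into {name: [initials, ...]}.

    ASSUMPTION: the source data can be represented as a flat mapping from
    a set of initials to one contributor name.
    """
    name_to_initials = defaultdict(list)
    for initials, name in initials_to_name.items():
        name_to_initials[name].append(initials)
    # Sort each list so the output is stable regardless of input order.
    return {name: sorted(found) for name, found in name_to_initials.items()}


# Hypothetical sample data, not taken from any real data file.
sample = {
    "A. B.": "Example Author",
    "A. B. C.": "Example Author",
    "D. E.": "Another Person",
}
by_name = invert_initials(sample)
```

A lookup such as `by_name.get("Example Author", [])` then yields every set of initials recorded for that contributor, or an empty list when none are known.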

I was actually questioning the purpose of the whole blurb in the description section. That was a very early approach that we took, and the template was a means to standardise and tidy. Our author pages are way more developed and actually now have the biographical articles listed, and the pages are interlinked, plus WD exists. So the basic question is does the contributor to text in desc. add value? With regard to the technical question, yes, there has been the journey of adding the contributor information to WD and an example is Marion Spielmann (Q6765371). That is what I am working through now to properly populate WD for 3rd Supp. Is it complete for early series? No, though that is remediable, and yes it contains initials if needed. ALL THAT SAID, I am more thinking that we ditch the contributor text in desc. and just categorise. — billinghurst sDrewth 21:52, 19 January 2023 (UTC)

@Billinghurst: I don't think I have the foundation to make up a very firm opinion on whether to keep or remove the visible blurb. I've sometimes been annoyed by it (especially the ones with multiple initials), and other times thought it nice to know. The blurb is human readable, which the categories are... only sort of. But bottom line is I have no strong feelings either way. Making the template do just the cats is certainly technically easier than doing both. Xover (talk) 06:11, 20 January 2023 (UTC)

I reckon that we can almost control this from WD. Keep this template as something like {{contributed to}}, then leverage contributed to creative work (P3919) <=> Dictionary of National Biography, 1885–1900 (Q15987216) and pipe in the data from subject named as (P1810). If the data is missing, we don't display initials, and list them in a maintenance category. Do the categorisation straight from WD. So we do have the initial hard work of populating WD; however, that is a separate issue. If there are multiple sets of initials, then we just have to do multiple entries in the CCW property. And if the community decides that it is not needed as a notes field, we can just alter the template to kill the text, and maybe we just build that as a trigger now so it is visible and easily flipped. — billinghurst sDrewth 05:30, 4 February 2023 (UTC)
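
A rough sketch of the lookup proposed above, working on an entity in the standard Wikidata JSON layout. It assumes that "subject named as" (P1810) is stored as a qualifier on each "contributed to creative work" (P3919) statement, which is one reading of the proposal and not something the thread confirms; the function name and the sample entity are hypothetical.

```python
def contributor_initials(entity, work_qid):
    """Return the P1810 qualifier strings on P3919 statements for one work.

    ASSUMPTION: initials are recorded as P1810 qualifiers on the P3919
    statement whose value is the work (e.g. Q15987216).
    An empty list means "no initials recorded", which a template could
    use to add a maintenance category.
    """
    results = []
    for statement in entity.get("claims", {}).get("P3919", []):
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if value.get("id") != work_qid:
            continue
        for snak in statement.get("qualifiers", {}).get("P1810", []):
            named_as = snak.get("datavalue", {}).get("value")
            if named_as:
                results.append(named_as)
    return results


# Hand-built, hypothetical entity fragment in the Wikidata JSON shape.
entity = {
    "claims": {
        "P3919": [
            {
                "mainsnak": {"datavalue": {"value": {"id": "Q15987216"}}},
                "qualifiers": {"P1810": [{"datavalue": {"value": "A. B."}}]},
            },
            {"mainsnak": {"datavalue": {"value": {"id": "Q16014697"}}}},
        ]
    }
}
```

Multiple sets of initials would appear as multiple P3919 statements (or multiple qualifiers), and the function collects all of them in statement order.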

Note: I have created {{contributed to}} and am starting the journey of work-by-work updates to migrate to this as a base template. I will look to see how I go with updating the work level data and see what I can do about migrating data to WD. Hoping that will allow us to see how we can later develop it, or use it as the basis of an approach. Tap me if things are not going to align to your greater plan. — billinghurst sDrewth 21:49, 28 February 2023 (UTC)

Indian Constitution..

User_talk:27.63.1.124#Notes_added_to_Indian_Constitution.. - I hope this was reasonable. We have scanned back versions of the Constitution concerned don't we? ShakespeareFan00 (talk) 19:27, 27 February 2023 (UTC)

@ShakespeareFan00: 2019 notes to a 1949 constitution are obviously incorrect, so in this case it would be entirely appropriate to revert the changes. In your communication to the user you may want to point out that this is the 1949 document as published, and if they want to add the 2019 version they should do so as a separate text. If the note is the user's own explanation then the talk page might be an appropriate place to do so.

I'm not sure whether we have scan-backed versions of these constitutions currently, so your powers of search are as good as mine on that score. :) Xover (talk) 20:17, 27 February 2023 (UTC)

I thought as much, but I was going to let someone with more experience leave an expanded comment for a new user. ShakespeareFan00 (talk) 20:31, 27 February 2023 (UTC)

@ShakespeareFan00: please also consider the application of {{static version}} — billinghurst sDrewth 05:46, 28 February 2023 (UTC)

Author:Mahadev Desai

Hi, Please undelete that page. We now have at least one work from him: Index:Gandhi - The Story of My Experiments With Truth, vol. 1.pdf. Thanks, Yann (talk) 05:37, 28 February 2023 (UTC)

@Yann:

Done. Please note that we prefer undelete requests to be made to the community at WS:PD rather than to an individual. Quicker and for the community to address. — billinghurst sDrewth 05:44, 28 February 2023 (UTC)

Index:A contribution to computer typesetting techniques … .pdf

Okay, I started this in good faith... However, the actual tabular data pages need to be rotated, because the OCR doesn't see them in their current orientation; I get rubbish when I try to OCR them. No objections if you wanted to build a higher quality version of this direct from the original scans, or migrate to DjVu, by the way. Just so long as you LMK that's what you are doing. ShakespeareFan00 (talk) 11:16, 4 March 2023 (UTC)

@ShakespeareFan00: I think (I could be wrong) that the advanced OCR interface in the new OCR tool (the "Transcribe text" button) lets you rotate pages before OCR.

But, looking at this scan... Are you sure you want to tackle this? Those tables are going to be monumentally difficult to do, and a large portion of the pages are going to have to be images (and challenging to align between pages). It looks like the kind of project that would be pure torture and incredibly time-consuming.

That being said, if you want me to I can certainly make you a new DjVu from the scans, at the highest quality I can manage. I don't think we'd want to rotate the pages in the DjVu though, since they appear to be "as published" in the scan, but I could explore ways to make the embedded OCR better by rotating the images before OCR but still adding the unrotated image to the DjVu. That's a bit more complicated (takes custom programming) so I can't promise when I'd have the time for it (busy IRL, and my on-wiki backlog is starting to grow downright epic). In any case, just let me know whether you want me to try. Xover (talk) 11:39, 4 March 2023 (UTC)
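
One piece of the "custom programming" mentioned above would be mapping OCR word boxes from the rotated image back onto the unrotated page, so the text layer lines up with the image actually stored in the DjVu. The following is a geometry-only sketch under stated assumptions: the page was rotated 90 degrees clockwise before OCR, boxes are `(left, top, right, bottom)` with the origin at the top-left, and the function name is hypothetical. It does not call any OCR or DjVu tool.

```python
def unrotate_box(box, original_height):
    """Map a box found on a 90-degree-clockwise-rotated page back to the
    coordinates of the original, unrotated page.

    ASSUMPTION: rotation was clockwise, so an original point (x, y) ended
    up at (original_height - y, x) in the rotated image.
    """
    left, top, right, bottom = box
    return (
        top,                      # original left
        original_height - right,  # original top
        bottom,                   # original right
        original_height - left,   # original bottom
    )


# Example: original page is 50 wide by 100 high, so the rotated image is
# 100 wide by 50 high. A word box found on the rotated image:
rotated_word_box = (10, 5, 30, 20)
original_word_box = unrotate_box(rotated_word_box, original_height=100)
```

The width and height of the box swap, as expected for a quarter turn. A counter-clockwise rotation would need the mirrored formula.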

I've opted to use images instead. It's just easier ShakespeareFan00 (talk) 22:55, 4 March 2023 (UTC)

It's a very long term project, and I could just import the tabular data as images. The actual 'internet' formats available seem to be an encoded version. What would be REALLY nice is if someone made SVG versions, as this is effectively a vector stroke font. ShakespeareFan00 (talk) 13:47, 4 March 2023 (UTC)

https://ocr.wmcloud.org/?engine=tesseract&langs[]=en&image=https%3A%2F%2Fupload.wikimedia.org%2Fwikipedia%2Fcommons%2Fthumb%2Ff%2Ffb%2FA_contribution_to_computer_typesetting_techniques_-_tables_of_coordinates_for_Hershey%2527s_repertory_of_occidental_type_fonts_and_graphic_symbols_%2528IA_contributiontoco424wolc%2529.pdf%2Fpage36-2047px-thumbnail.pdf.jpg&uselang=en No option for scan rotation in the version I am seeing.

Better extraction of tabular data by OCR would be appreciated... (Cary's New Itinerary stalled on this.)

In the interests of simplicity I'll use images as a stopgap measure, until we can come up with a 'better' one. ShakespeareFan00 (talk) 14:09, 4 March 2023 (UTC)

Maps for Index:The Northern Ḥeǧâz (1926).djvu

Hello. Some time ago you helped me with uploading Index:The Northern Ḥeǧâz (1926).djvu. Later I found out that the book is accompanied with 2 maps on separate sheets placed in a pocket of the book, but the file does not contain scans of these maps. I have scanned them and the pdf files are uploaded here. May I ask for adding both of them to the scan? The best place would probably be just before the back cover, as the original has them placed in the pocket of the back cover. Extracting the images from the pdf files is not needed. There is no hurry, proofreading advances very slowly due to numerous and very unusual diacritics. -- Jan Kameníček (talk) 13:29, 6 March 2023 (UTC)

@Jan.Kamenicek:

Done. File:The Northern Ḥeǧâz (1926).djvu. Please check that it's correct. Xover (talk) 06:22, 8 March 2023 (UTC)

Great, thanks! --Jan Kameníček (talk) 12:45, 8 March 2023 (UTC)

Obsolete Center tags.

I've just attempted to clean up the remaining unmatched <center>...</center> tags in Page: namespace.

Any chance of applying a tool-automated conversion of these over to {{center}} or other appropriate templates? ShakespeareFan00 (talk) 13:41, 8 March 2023 (UTC)

@ShakespeareFan00: I have on my todo list to do another pass on these, I just don't know when that'll be. Xover (talk) 06:48, 30 March 2023 (UTC)

Wikisource:WikiProject Transactions NZ Institute

I've recently initiated the above project as an attempt to get help in getting the thousands of articles proofread and into the mainspace so that they can be cross-cited from both enWP and also various NZ scientific works here. Papers Past got wind of the project and have offered us the opportunity to pick up their "corrected text" and import it in some way. They can provide an xml file per volume. They've emailed me a sample file (for Volume 15) to see if it's usable. I'm not able to utilise xml files myself, so am turning to you to see if such are practical for our purposes. If they are, then I'll need to do some negotiations with the National Library and the Royal Society of New Zealand with the help of the Wikimedia Aotearoa New Zealand society before we go ahead with uploading. Beeswaxcandle (talk) 06:33, 30 March 2023 (UTC)

@Beeswaxcandle: Converting XML to wikimarkup and bot-adding them to Page: namespace as "Not Proofread" should in general be doable. It depends on the details of the XML they've used (I'm guessing it's TEI-XML) and how they've coded it, and some things may not be possible to automatically recreate in wikimarkup, but that should mostly be an issue of how much manual fiddling will be needed in order to tag it as "Proofread" afterwards. We'll also need to look at how to match whatever "page" concept they use in the XML to physical pages in the scans so we can automate that as much as possible.

I may have some free time over Easter so if you make the XML file and a scan available somewhere (I can arrange an email if you need it) I can try to take a look. Ideally we should start from individual scan images so we can generate DjVu files ourselves, but we can probably make it work whatever format the scans are in. Xover (talk) 06:47, 30 March 2023 (UTC)
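
To make the XML-to-wikimarkup step above more concrete, here is a minimal sketch using only the Python standard library. The actual Papers Past schema had not been examined at this point in the thread, so every element and attribute name below (`page`, `n`, `p`, `i`, `hi`) is a placeholder assumption, and only italics are converted.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: these tags mark italic text in the source XML.
ITALIC_TAGS = ("i", "hi")


def element_to_wikitext(elem):
    """Flatten one element to wikitext, wrapping italic children in ''...''."""
    parts = [elem.text or ""]
    for child in elem:
        inner = element_to_wikitext(child)
        if child.tag in ITALIC_TAGS:
            parts.append("''" + inner + "''")
        else:
            parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)


def pages_from_xml(xml_text):
    """Return {page label: wikitext}, one entry per hypothetical <page>."""
    root = ET.fromstring(xml_text)
    pages = {}
    for page in root.iter("page"):
        paragraphs = [element_to_wikitext(p).strip() for p in page.iter("p")]
        pages[page.get("n")] = "\n\n".join(paragraphs)
    return pages


sample_xml = (
    '<volume><page n="12">'
    "<p>On the <i>Moa</i> of Otago.</p><p>Second paragraph.</p>"
    "</page></volume>"
)
converted = pages_from_xml(sample_xml)
```

The resulting text per page is what a bot would save to the Page: namespace as "Not Proofread"; tables and other complex structures would still need manual handling.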

I've just forwarded you the file and the associated email trail. Stating such here for transparency. Beeswaxcandle (talk) 07:53, 31 March 2023 (UTC)

@Beeswaxcandle: Thanks. And I agree, keeping stuff visible on-wiki is best whenever possible.

I've had a very quick look and I think it looks promising for being able to do the bulk of the text, with basic formatting (italics etc.). More complicated bits (tables etc.) may not be as easy to deal with, so those'll probably have to be handled manually to begin with. But given the sheer size of this project, we can probably fruitfully get the basics down to start with and then maybe try to iterate on additional improvements.

I'm a little bit concerned about the mapping to physical page numbers in the scan. It doesn't appear to be a straightforward 1:1 mapping, so I'll need to investigate that a little more. Hopefully we can just apply a fixed offset or something.

Anyways, I'll try to dig into this a bit over the Easter holidays and give you a ping if I get to a point that's useful to look at. Xover (talk) 10:43, 31 March 2023 (UTC)
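
The "fixed offset" hope above could be checked with something like the sketch below: given a few hand-verified pairs of printed page number and scan position, it reports the offset if they all agree, and otherwise signals that a single offset will not do. All names and sample numbers are hypothetical.

```python
def infer_offset(known_pairs):
    """Return the constant (scan position - printed page) offset, or None
    if the verified pairs disagree and a single offset cannot be used."""
    offsets = {scan - printed for printed, scan in known_pairs}
    if len(offsets) == 1:
        return offsets.pop()
    return None


def scan_position(printed_page, offset, overrides=None):
    """Map a printed page number to a scan position.

    `overrides` holds explicit exceptions (e.g. unnumbered plates) that
    take priority over the fixed offset.
    """
    if overrides and printed_page in overrides:
        return overrides[printed_page]
    return printed_page + offset


# Hypothetical spot checks: (printed page, position in the scan).
checked = [(1, 25), (100, 124), (350, 374)]
offset = infer_offset(checked)
```

If `infer_offset` returns `None`, the volume probably has inserted plates or unnumbered leaves, and the mapping would need per-range offsets or an explicit override table.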

Personal tools

I'm here using a (complex) set of personal tools. I didn't bother documenting them, but some it.wikisource users are asking me about them, and their interest forces me to add some doc... if you like, you'll find the doc here: User:Alex brollo/PersonalTools. Please consider that it's a raw WIP for now. Alex brollo (talk) 09:46, 5 April 2023 (UTC)

@Alex brollo: Thank you, that's very useful! Xover (talk) 11:41, 6 April 2023 (UTC)

Follow up on book scanning

Hi Xover, I wanted to inform you about my progress in book scanning, after this exchange. My scanner scans at 600x600 DPI max resolution. Thus, I saved the images as TIFF, cropped them with Gimp and saved as "tiff no compression", and used tesseract to produce a text layer that I exported in a PDF file. The result is visible at File:Scalo marittimo - Raffaele Viviani - 10 commedie.pdf.

The other test was using Cam Scanner with my cellphone. The result is File:Circo_equestre_Sgueglia_-_Raffaele_Viviani_-_10_commedie.pdf. I reckon that the first book is better on Mediawiki (on my computer, needless to say, the image is very clear).

Do you think that it's an acceptable result or should I try something different (e.g. creating a djvu as in User:GrafZahl/How_to_digitalise_works_for_Wikisource)? Cheers! -- Ruthven (talk) 10:03, 11 April 2023 (UTC)

@Ruthven: If the resultant OCR is sufficient to proofread the text without too much trouble then it is "good enough". You'll have to be the judge of that.

However, are you sure you scanned this at 300dpi? An A4 sheet of paper (297x210mm) scanned at 300dpi should be 2480 x 3508 pixels, while your PDF is just 956 × 1,352 (less than half). Granted your source (physical book) may be smaller than A4, but from these numbers it looks more like an effective ~150dpi resolution. I also see some compression artefacts in the PDF when I download it (zoom in on, e.g., the "Personaggi" on p. 3 and notice blocking and a halo effect around the letters). These are then exacerbated when MediaWiki reencodes it to JPEG.

The compression is most likely being added when you create the PDF, so you'll want to look there for that. But my guess is that if you can find a way to get higher effective scan resolution the compression artefacts will be much lessened. Could it be that your scanner / scanner software does something weird that causes lower net resolution than what the sensor is physically capable of? Does rotating the (physical) pages 90 degrees when you scan them affect the effective resolution? Xover (talk) 11:32, 11 April 2023 (UTC)
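
The resolution arithmetic above can be reproduced with a few lines. This is just a worked version of the same calculation; the function names are made up, and the physical page width is a parameter because the effective figure depends on the real size of the book.

```python
MM_PER_INCH = 25.4


def expected_pixels(width_mm, height_mm, dpi):
    """Pixel dimensions a page of the given size should have at `dpi`."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def effective_dpi(pixels, size_mm):
    """Resolution implied by a pixel count across a known physical length."""
    return pixels / (size_mm / MM_PER_INCH)


a4_at_300 = expected_pixels(210, 297, 300)   # portrait A4 at 300 dpi
dpi_if_a4 = effective_dpi(956, 210)          # 956 px across an A4 width
```

For an A4 page the 956-pixel width works out to roughly 116 dpi; a narrower physical page gives a correspondingly higher effective figure, which is why the estimate in the comment above is given as approximate.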

Hello,

the format of the book is closer to C5 than to A4 (it's 17x23,5cm). I'll look into what happens right after the scan, e.g. when rotating the page. When you're talking about p. 3, it's the one from this file? Thank you for your support. Ruthven (talk) 12:14, 11 April 2023 (UTC)

Yale - Richard III

When you have time, could you please prepare a DjVu from this IA scan? For File:Richard III (1927) Yale.djvu? I've been waiting for a public scan, and this one finally turned up. It's not the first edition, but the only 1st ed. I could find was a Google scan that wound up on HathiTrust, but which is heavily written upon throughout. --EncycloPetey (talk) 21:40, 8 April 2023 (UTC)

@EncycloPetey:

Done.

I've mostly only done technical checks on it. Speaking of which, the scan resolution is surprisingly low for what claims to be a 2022 scan by IA's scanners, which is going to affect OCR quality somewhat (but not, I think, any worse than others we've seen). Uploaded locally to enWS so we can tweak it, add info, etc. before transferring to Commons.

Regarding the edition, I see no real reason to fetishise the actual first printing of the first edition to the degree certain others do (whom it would be impolite to name without their being present, but you know who you are! 😎) when this printing appears to pretty much exactly reproduce the first edition (modulo possibly the placement of the caption for the frontispiece). If the alternative is a crappy Google scan then that's no alternative at all. Xover (talk) 07:15, 9 April 2023 (UTC)

@EncycloPetey: Oh, btw, I meant to tell you: I found a copy of Index:Venus and Adonis, Lucrece, and the Minor Poems (1927).djvu that I uploaded and have started on. Xover (talk) 09:42, 9 April 2023 (UTC)

The first editions do have a different title page, but yes, they otherwise do not seem to differ from the later reprints. Part of the care is just being sure we're not accidentally getting something that is a later edition. There were second editions for many (all?) of the series. --EncycloPetey (talk) 17:28, 9 April 2023 (UTC)

Note: The end matter for Richard III is proofread, if you care to make your usual pass through the Notes and Appendices. --EncycloPetey (talk) 17:12, 24 April 2023 (UTC)

@EncycloPetey: Thanks. The list is a little long just now, but I'll try to find the time for a pass over it when I can. Xover (talk) 08:53, 25 April 2023 (UTC)

RE: "of of": It happens to all of us. --EncycloPetey (talk) 19:43, 30 April 2023 (UTC)

Indeed. :) Xover (talk) 20:56, 30 April 2023 (UTC)

Template:Wikidata populated category

... started use at Category:Albanian authors, thoughts? Probably need a link at the bottom to help:categorization (or maybe another???) that has some context and direction. Once agreed upon, I will run bot batches through and add it based on the /data page. I will probably need to comment each line of the data page to indicate where created. [Happy for smarter ideas]. I am also looking at the contains interset in items like Category:Albanian writers (Q6102065) to see what smarts might be around to intersect our /data file and categorisation (lower priority) — billinghurst sDrewth 00:51, 25 April 2023 (UTC)

I'm curious how this will work (1) when the country/language isn't the best fit, such as an author in the French language who is from Algeria, or a Scottish author who is marked on Wikidata as being from the UK. Wikidata has several different ways to label "UK" depending upon the relevant dates. Or (2) when authors are from countries that no longer exist, such as Prussia, Austro-Hungary, or the Venetian Republic. --EncycloPetey (talk) 02:33, 25 April 2023 (UTC)

@EncycloPetey: This example is just being overt about what is happening in Module:Author/data; this is not about the categorisation itself, which is a long past community action. As to the existing arrangement, it is not overriding the categorisation that we do, it is complementing it. If someone manually puts Scottish authors, then that is still there as is, it is just not automatic. The leveraging of WD is giving more categorisation, not removing any. — billinghurst sDrewth 11:19, 25 April 2023 (UTC)

I think you misunderstood. I'm wondering whether the categorization will function correctly under these situations, or whether it will generate unsuitable categories. --EncycloPetey (talk) 15:03, 25 April 2023 (UTC)

@Billinghurst: The template looks good. I would suggest going with {{cmbox}} instead of {{ombox}} despite it not currently having a |small=yes variant since there's semantics attached to the type of box, but it's not a major issue. Help:Categorization clearly needs some love, but I'm not familiar enough with the area to offer much help.

In terms of smarter ideas… It depends on what we're trying to achieve. If it's just housekeeping what category pages are created relative what's defined in /data (and tagging cats used for this purpose) then I think your solution is probably about as smart as can be achieved. If we start thinking more ambitiously about Wikidata, then for example we could think about having {{Wikidata populated category}} pull the property IDs from Wikidata for display (instead of adding them manually) and things like that. But that's probably not a near-term issue since we'd need to think through properly how that would function on the Wikidata side (cf. a discussion we had elsewhere). Xover (talk) 12:29, 25 April 2023 (UTC)

Thanks. I originally went cmbox, but I also wanted small, so while it is in "draft" I was happy to cheat. I have flicked it over, so your commentary on whether it should be small or not would be useful. I will see if I can work out #WTF is wrong with the small--when I have a moment to analyse--it is not a straight missing css bit of coding. And yes to auto-populating the auto-population template note. For now I will stick a big fat WD=Y per line comment to signify those that I have done. As usual, when you do these things other tasks show up. <shrug> — billinghurst sDrewth 04:40, 26 April 2023 (UTC)

No small for cmbox is purposeful in design (by enWP) as allowSmall = true, is not configured. So we can leave it (and continue to align with a sometime template at enWP), or not and fork the config file. — billinghurst sDrewth 04:53, 26 April 2023 (UTC)

"Centered" tables...

I was doing a check for a center template wrapped around table syntax.

pwb.py listpages -ns:104 -start:* -grep:"\{\{c(enter)\|(1\=|)\s|{\|"  -intersect -lang:en -family:wikisource -format:"[[{page.loc_title}]]" > Wrapped_table.txt

It's taking a while to run and was wondering if you were able to run it against a dump instead? ShakespeareFan00 (talk) 10:07, 1 May 2023 (UTC)

@ShakespeareFan00: No, sorry. Xover (talk) 10:25, 1 May 2023 (UTC)
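
For anyone wanting to run the check in this thread against locally available page text (for instance text extracted from a dump), the pattern can be expressed as a standalone Python regular expression. This is a sketch, not the command quoted above: it makes the `enter` part optional so that both {{c}} and {{center}} are covered, and it requires the table opener `{|` to follow the first parameter directly, which is an interpretation of what "wrapped around table syntax" means here.

```python
import re

# {{c| or {{center|, an optional "1=", then a wikitable opener "{|".
CENTER_WRAPPED_TABLE = re.compile(
    r"\{\{\s*c(?:enter)?\s*\|\s*(?:1\s*=\s*)?\{\|",
    re.IGNORECASE,
)


def has_center_wrapped_table(wikitext):
    """True if a center template directly wraps wikitable syntax."""
    return CENTER_WRAPPED_TABLE.search(wikitext) is not None


def pages_with_wrapped_tables(pages):
    """Filter an iterable of (title, wikitext) pairs down to matching titles."""
    return [title for title, text in pages if has_center_wrapped_table(text)]


# Hypothetical page texts for illustration.
examples = [
    ("Page:Example.djvu/1", "{{center|\n{|\n|a\n|}\n}}"),
    ("Page:Example.djvu/2", "{{c|1={| class=\"wikitable\"\n|a\n|}}}"),
    ("Page:Example.djvu/3", "{{center|Title}}\n{|\n|a\n|}"),
]
matches = pages_with_wrapped_tables(examples)
```

The third example is deliberately not matched: the table there follows the template instead of sitting inside it.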

About Vasari's adventure

First of all, thanks for your help and suggestions. Lives of the Most Excellent Painters, Sculptors, and Architects is going to reach a decent level 1, my aim has been no more than that result; I hope that other users will appreciate my effort to solve formatting & transclusion issues at my best, even if the English text needs revision.

I'm leaving here my personal tools, feel free to test them if you like and to ask me for any detail. Alex brollo (talk) 09:09, 4 May 2023 (UTC)

Files

More files are at User talk:Xover/Files for speedy deletion. There were a number of anomalies, as well, but I should be able to go through some more batches in the upcoming days. TE(æ)A,ea. (talk) 15:16, 4 May 2023 (UTC)

@TE(æ)A,ea.: Thanks. Deleted. Xover (talk) 18:06, 4 May 2023 (UTC)

Shakespearean Tragedy

Is this work now complete, enough to be listed as a New Text? --EncycloPetey (talk) 17:47, 7 May 2023 (UTC)

@EncycloPetey: Not yet. The play excerpts still need to have their formatting fixed, and a few other tweaks. Xover (talk) 17:58, 7 May 2023 (UTC)

OK. For now, I've linked it at Shakespeare's Author page and the Elizabethan drama Portal. --EncycloPetey (talk) 17:59, 7 May 2023 (UTC)

Template:Dent/s

See, Template_talk:Dent/s for an attempted repair. ShakespeareFan00 (talk) 12:16, 8 May 2023 (UTC)

I also suspect that the {{Portal header}}-related lints are to do with a '''{{{param|}}}''' type construction, which can be solved by using {{bold}} instead. ShakespeareFan00 (talk) 12:16, 8 May 2023 (UTC)

"Stage" scripts...

You might want to look at what I attempted with {{Stagescript/s}} and related templates. If they can be made to be compatible with {{ppoem}} then I don't necessarily need a distinct template :) ShakespeareFan00 (talk) 18:37, 8 May 2023 (UTC)

@ShakespeareFan00: I don't think {{ppoem}} can or should be made to support play scripts in general. It'd make it into a "do everything" template, and those tend to be so complicated that nobody wants to use them and they are effectively unmaintainable. The speech prefixes stuff I added is an experiment to see if we can support a few very simple and typical cases without overly complicating its main purpose (poems). I may at some point investigate the possibility of forking off a {{pplay}} based on the same approach, dedicated to play scripts, but I'm concerned that the variability of play script formatting will make that unworkable.

My main philosophy on templates is to make each template do one simple thing well, and be very restrictive with adding complications. If you feel you need ever more complicated features or boundless extensibility (e.g. |style= or arbitrary margin widths etc.) then the problem is probably not one where a template is a well-suited solution. And if you start fighting with MediaWiki or the skin's formatting or what browsers natively support (drop initials, dot leaders, etc.) then a template is probably not a good idea: it'll be complex, frail (prone to break), encourage users to make hacky (ab)uses of it that you'll then have to support (making changes impossible), all the while giving users the impression that this is a supported functionality and something they should be doing.

This has to be coupled with pragmatism and common sense, of course, but my rule of thumb is to be extremely conservative with adding extensibility or new features to simple templates; or to create new complicated templates. Xover (talk) 07:23, 9 May 2023 (UTC)

{{pplay}} was sort of the aim with the {{stagescript}} family. The intent was to have one common set of templates, but different stylesheets (which could also be defined as an Indexstyle for a given work or use case).

My initial aim developing it was to enable the conversion of older scripts into different formats merely by changing the stylesheet used. ( 'Scene' vs 'Cue' format for example.). ShakespeareFan00 (talk) 07:38, 9 May 2023 (UTC)

I expanded a little on it to support some efforts at Wikiversity. - https://en.wikiversity.org/wiki/Special:AllPages?from=Stagescript&to=&namespace=10

ShakespeareFan00 (talk) 18:37, 8 May 2023 (UTC)

See also https://en.wikisource.org/wiki/Special:AllPages?from=poem+special&to=&namespace=10, the usages of {{poemspecial}} should be migrated over to ppoem usage at some point. ShakespeareFan00 (talk) 18:44, 8 May 2023 (UTC)

@ShakespeareFan00: I'm not sure how useful this would be for the scripts I've worked with, but "dialogue" is misspelled throughout the documentation --EncycloPetey (talk) 21:44, 8 May 2023 (UTC)

pages missing fixed

Thank you very much for fixing: Bulandshahr- Or, Sketches of an Indian District- Social, Historical and Architectural.djvu Stamlou (talk) 20:08, 9 May 2023 (UTC)

@Stamlou: My pleasure. Glad I could help. Xover (talk) 20:12, 9 May 2023 (UTC)

On my files

From “Temporary files”:

File:169630002.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630003.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630004.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630005.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630006.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630007.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630008.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630009.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630010.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630011.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630012.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630013.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630014.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630015.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630016.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630017.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630018.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630019.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630020.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630021.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630022.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630023.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630024.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630025.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630026.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630027.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630028.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630029.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.
File:169630030.tif has been superseded by c:File:A Letter on the Subject of the Cause (1797).djvu.

From “A Dissertation on the Construction of Locks (1785) (images)”:

File:A Dissertation on the Construction of Locks (1785) (images).pdf has been superseded by c:File:A Dissertation on Locks-F.jpg, c:File:A Dissertation on Locks-C.jpg, c:File:A Dissertation on Locks-A.jpg, c:File:A Dissertation on Locks-2.jpg, c:File:A Dissertation on Locks-B.jpg, c:File:A Dissertation on Locks-4.jpg, c:File:A Dissertation on Locks-5.jpg, c:File:A Dissertation on Locks-6.jpg, c:File:A Dissertation on Locks-7.jpg, and c:File:A Dissertation on Locks-8.jpg.

From “Temporary files”:

File:ARABHQImg1.tif has been superseded by c:File:Arthur Rackham, a bibliography-016.jpg.
File:ARABHQImg2.tif has been superseded by c:File:Arthur Rackham, a bibliography-127.jpg.
File:ARABHQImg3.tif has been superseded by c:File:Arthur Rackham, a bibliography-128.jpg.
File:ARABHQImg4.tif has been superseded by c:File:Arthur Rackham, a bibliography-129.jpg.
File:ARABHQImg5.tif has been superseded by c:File:Arthur Rackham, a bibliography-130.jpg.
File:Dotted cell pre change.png was created for the discussion at Template talk:Dotted cell.
File:Dotted cell post change.png was created for the discussion at Template talk:Dotted cell.
File:GrammarPlate35replacement.pdf has been superseded by c:File:A Japanese Grammar-133.jpg.
File:HGWells4C.png was created for the discussion at Page talk:The Works of H G Wells Volume 4.pdf/14.
File:Japanese Literature (Keene) cover cropped.tif has been superseded by c:File:Japanese Literature (Keene) cover.jpg.
File:KeeneAncientCover.tif has been superseded by c:File:KeeneAncientCover.jpg.
File:KeeneAncientFp.tif has been superseded by c:File:KeeneAncientFp.jpg.
File:KeeneAncientp263.tif has been superseded by c:File:KeeneAncientp263.jpg (which has yet to be added to the transcription).
File:KeeneModernCover.tif has been superseded by c:File:KeeneModernCover.jpg.
File:KeeneModernFp.tif has been superseded by c:File:KeeneModernFp.jpg.
File:Pagelist broken.png was created for the discussion at Wikisource:Scriptorium § Pagelist spacing errors.
File:Pagelist working.png was created for the discussion at Wikisource:Scriptorium § Pagelist spacing errors.

From “Missing license and file information”:

File:Martin Luther.tif can be exported, for which see the information at c:File:The Jews and Their Lies.pdf.

Files which have been superseded may be deleted. TE(æ)A,ea. (talk) 23:49, 30 April 2023 (UTC)

@TE(æ)A,ea.: Thank you. I've deleted / exported as directed above. As for the various screenshots, all but File:HGWells4C.png are explicitly tagged as temporary files. Are these still needed? Xover (talk) 10:04, 1 May 2023 (UTC)

@TE(æ)A,ea.: Ping. All the remaining files lack file description and license templates. Could you check which are still needed, and either tag them for speedy or add the missing info? Xover (talk) 08:56, 13 May 2023 (UTC)

What peeves me ...

What peeves me is that you can have the criticism about CommonsDelinker which is working to 95% efficiency, yet the mess that has been made of WS:Works/2023 (and somewhat 2022, and 2021) generates silence. We converted from a manual list that worked fine to this list that now needs to be converted to a formatted file that is totally dependent on specialist knowledge, with no instruction, and essentially depends on one person. I asked and asked about it, to little response, and then abandoned maintaining it in frustration, and went to do something else. — billinghurst sDrewth 06:08, 11 May 2023 (UTC)

@Billinghurst: No, I agree with you. This change should have been discussed by the community before implementing, to make sure we didn't create something that has a worse bus factor than the old way. The mere fact that I can't completely figure out how it works in 5 minutes of looking means it's too complicated for the average non-technical user to grok without some pretty comprehensive guidance (which I didn't see, but that may just mean I'm blind).

It's entirely possible that I would have supported such a change in a community discussion—because the old manual way is very manual and limited and a more advanced technical solution would be better—but the sustainability of any solution is dependent on the community at large's ability to 1) use it and 2) to maintain it. I'd need to spend a lot more time looking at it and figuring out how it works before I'd be able to say anything sensible about its properties along those two axes. Xover (talk) 07:30, 11 May 2023 (UTC)

The WS:Works system was reverted last year, and all existing JSON-based lists converted to old-style plain template transcludable lists, as it clearly wasn't working out. I did say so in the Scriptorium thread about it when I did that, removed all reference to it in the edit notice and updated the quick-access links in the table. To archive, copy from Template:New texts to the relevant pages like Template:New texts/2023/01, which are linked from the table. Some archiving has happened since the structured system was removed. Inductiveload—talk/contribs 20:23, 11 May 2023 (UTC)

Thanks Inductiveload, the change back wasn't sufficiently overt, which says that our dropping a note in a WS:S discussion is not enough. Seems we longer term admins need to look to set a better example in comms strategy. Yes, we need to simplify or keep simple, or set up simple systems, though we know that this is a complex space with multiple inputs and non-standards. Yes we need to manage when we get jaded or tired from having the same repeated battles with contributors with independent ideas that don't sit with pre-existing works. I have taken to picking my battles as the environment in which we work has become more complicated from mediawiki, wikimedia, wikidata, etc. What we probably fail to do is mentor, train, and set the next generation of admins up for success, especially when all we want to do is actually edit and generate good content. <shrug> Possibly the same for our fellow admins who have been here forever. I will ponder. — billinghurst sDrewth 03:10, 12 May 2023 (UTC)

Index:The Confessions of Jean-Jacques Rousseau, 1896, vol. 2.djvu

Hi, There were 3 pages to delete, but not the index. Thanks, Yann (talk) 11:45, 16 May 2023 (UTC)

@Yann: D'oh! My bad. No idea what part of my brain short-circuited there, but it was clearly not firing on all synapses. Thanks for the heads up, and apologies for the trouble! Xover (talk) 14:31, 16 May 2023 (UTC)

Oh. I see. One of the Page: pages tagged for speedy was transcluded onto the Index: as part of the ToC. I should of course have checked better first, but in the moment I just assumed this was one of the obsoleted/replaced indexes and deleted it based on it having the tag. I'm still an idiot, but not quite as blundering as I first feared. :) Either way I appreciate the heads up! Xover (talk) 14:36, 16 May 2023 (UTC)

Page:Chronological_table_of_the_Statutes_(United_Kingdom)(1950).pdf/818

Before I go full tilt at this, can you take a look at this and come up with some stable ways of doing things like the bracing on earlier pages? I do not want to have to go back and forth on a lot of pages. ShakespeareFan00 (talk) 08:31, 17 May 2023 (UTC)

Side note: the existing 1877 effort will need a new disambiguated title. ShakespeareFan00 (talk) 08:31, 17 May 2023 (UTC)

@ShakespeareFan00: Could you explain the problem you’re trying to solve in a bit more detail? Xover (talk) 09:31, 17 May 2023 (UTC)

Page:Chronological table of the Statutes (United Kingdom)(1950).pdf/28 - this has multiple table row bracing. ShakespeareFan00 (talk) 09:35, 17 May 2023 (UTC)

@ShakespeareFan00: The standard way would be as a table, with a separate column for the brace, that has rowspan on the cell containing the brace. But if those tables are as big as I suspect they are then that could quickly hit transclusion limits. If that's the case it's possible we simply don't have any good way to solve it. Xover (talk) 11:30, 17 May 2023 (UTC)
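
A minimal sketch of the rowspan approach described above, written as Python that emits wikitext. The function name, the plain "}" used as the brace, and the sample cell text are all assumptions for illustration; a real page would presumably use whatever brace template or image the work already relies on, and this does nothing about transclusion limits.

```python
def braced_rows(rows, brace="}", note=""):
    """Build a wikitext table in which one brace cell spans every row.

    rows  -- list of rows, each a list of cell strings (illustrative data)
    brace -- placeholder content for the brace cell (assumption)
    note  -- text that the brace groups the rows under
    """
    span = len(rows)
    lines = ["{|"]
    for index, cells in enumerate(rows):
        lines.append("|-")
        row = " || ".join(cells)
        if index == 0:
            # Only the first row carries the brace and note cells;
            # rowspan stretches them down over the remaining rows.
            row += f' || rowspan="{span}" | {brace}'
            row += f' || rowspan="{span}" | {note}'
        lines.append("| " + row)
    lines.append("|}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(braced_rows([["row 1", "a"], ["row 2", "b"], ["row 3", "c"]],
                      note="shared remark"))
```

Each braced group costs two extra cells in its first row only, so the markup stays small; the transclusion overhead comes from the size of the table as a whole rather than from the brace column.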

On the 1877 table I split parts of the table by session and did some tweaks to reduce the transclusion overheads. I can do that again here if needed. ShakespeareFan00 (talk) 11:35, 17 May 2023 (UTC)

Center tags..

Amazing.. The edits I was making were follow-ups to cleanup from unpaired center tags mostly. Keep going :) ShakespeareFan00 (talk) 17:59, 20 May 2023 (UTC)

Template:Citation/core

This was assuming a Wikipedia version of {{date}}. The local version is different, which is causing this template to mishandle accessdate, among other issues.

ShakespeareFan00 (talk) 14:52, 21 May 2023 (UTC)

Timeline needs adjusting

Hi, Wikisource:Administrators/Archives#Timeline is fine in Preview, but isn't showing on the page anymore. I suspect that it's run out of space after extending the width by another month, as my only other change just now was to end date BethNaught. When you've a moment, could you have a look? Beeswaxcandle (talk) 19:16, 1 June 2023 (UTC)

@Beeswaxcandle: It looks fine to me now, both on mobile and desktop. Could it be a local caching issue in your browser? Xover (talk) 21:30, 1 June 2023 (UTC)

<shrug> Yep, it's fine now. Thanks, Beeswaxcandle (talk) 23:17, 1 June 2023 (UTC)

{{fs90}} and {{fs90/s}} are missing some parameters

The template's original line-height:130%;padding-top:0.50em;padding-bottom:0.50em; seems to be missing. Can it be added to this template's code? I have no experience of how to add them into the existing code. Thanks. — ineuw (talk) 14:10, 2 June 2023 (UTC)

P.S: If I am shown, I would like to standardize the rest of the fs family I created originally.

@Ineuw: The templates set all these in their stylesheet. Do you have an example of somewhere they are used but the style rules are not getting applied? --Xover (talk) 16:48, 2 June 2023 (UTC)

No. I just wanted to learn this new style of templates and convert {{fs85}} etc. and others. :-)

— ineuw (talk) 17:05, 2 June 2023 (UTC)

@Ineuw: The short version is that the <templatestyles src="Font-size90%/styles.css"/> is an extension tag provided by the TemplateStyles extension. So it's one of MediaWiki's special "XML-like tags". What it does is import the CSS stylesheet located at Template:Font-size90%/styles.css (it does a whole bunch more, but for this purpose that's the gist). In the imported stylesheet you can use CSS selectors to target anything the template outputs, but you'll usually do it by targeting a class applied to one of the output elements. For the {{fs90}} example that's .wst-font-size-90 {…}.

If you convert more of the templates in the group to use templatestyles you'll want to keep in mind: 1) point all of the templates in the family to the same stylesheet and use different classes to adjust the styling; 2) all template-added CSS classes should have a "wst-" prefix to avoid collisions; 3) templates in the same family should use the same naming convention; 4) the name also serves as a way to identify the template in rendered page output (e.g. when debugging) so should bear a clear relationship with the template's name. Xover (talk) 17:19, 2 June 2023 (UTC)
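
The conventions above can be sketched as a small Python helper that derives the pieces for each member of the family. This is only an illustration: the shared stylesheet name "Font-size/styles.css" and the helper itself are hypothetical, and the generated rule sets only font-size, not the line-height or padding rules the real stylesheets may carry.

```python
def fs_template_parts(percent, stylesheet="Font-size/styles.css"):
    """Return (css_class, templatestyles_tag, css_rule) for one fs template.

    percent    -- the font size the template applies, e.g. 90 for {{fs90}}
    stylesheet -- hypothetical stylesheet shared by the whole family
    """
    # "wst-" prefix avoids collisions; the rest mirrors the template name.
    css_class = f"wst-font-size-{percent}"
    # Every template in the family points at the same stylesheet.
    tag = f'<templatestyles src="{stylesheet}" />'
    # One class per template adjusts the styling within that stylesheet.
    rule = f".{css_class} {{ font-size: {percent}%; }}"
    return css_class, tag, rule


if __name__ == "__main__":
    for size in (85, 90):
        css_class, tag, rule = fs_template_parts(size)
        print(css_class, tag, rule, sep="\n")
```

Keeping the class name mechanically derivable from the template name is what makes the rendered output easy to trace back to its template when debugging.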

Thanks. I always build them in my sandboxes. Success or failure will lead me back here and will add the info to this post. Thanks. — ineuw (talk) 17:53, 2 June 2023 (UTC)

WS:AN#Edit requests for Template:RunningHeader

Speaking of nagging you about technical edit requests/code review, could you take a look at this one? —CalendulaAsteraceae (talk • contribs) 06:02, 3 June 2023 (UTC)

Need your input on a policy impacting gadgets and UserJS

Dear interface administrator,

This is Samuel from the Security team and I hope my message finds you well.

There is an ongoing discussion on a proposed policy governing the use of external resources in gadgets and UserJS. The proposed Third-party resources policy aims at making the UserJS and Gadgets landscape a bit safer by encouraging best practices around external resources. After an initial non-public conversation with a small number of interface admins and staff, we've launched a much larger, public consultation to get a wider pool of feedback for improving the policy proposal. Based on the ideas received so far, the proposed policy now includes some of the risks related to user scripts and gadgets loading third-party resources, best practices for gadgets and UserJS developers, and exemptions requirements such as code transparency and inspectability.

As an interface administrator, your feedback and suggestions are warmly welcome until July 17, 2023 on the policy talk page.

Have a great day!

Samuel (WMF), on behalf of the Foundation's Security team 23:02, 7 July 2023 (UTC)

Importing commons:Module:Roman

I've been thinking about your suggestion of importing an upstream module, and think commons:Module:Roman is a good choice. It has toArabic, and Numeral is equivalent to the current toRoman ({{Roman}} would need to be updated), plus it also categorizes errors into Category:Errors reported by Module Roman. {{Saferoman}} would need to be implemented locally—or just deleted; it's currently unused. —CalendulaAsteraceae (talk • contribs) 03:13, 19 July 2023 (UTC)

Except that commons:Module:Roman doesn't have a function for accessing toArabic with wikicode (: I'll see if I can fix that. —CalendulaAsteraceae (talk • contribs) 03:27, 19 July 2023 (UTC)
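
For reference, the two conversions under discussion can be sketched in a few lines. This is a Python illustration of the general technique only; it is not the code of commons:Module:Roman, and the function names, the 1 to 3999 range, and the strict validation are assumptions made for the example.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]
_VALUES = {symbol: value for value, symbol in _PAIRS if len(symbol) == 1}


def to_roman(number):
    """Convert an integer in 1..3999 to a Roman numeral string."""
    if not 0 < number < 4000:
        raise ValueError(f"out of range: {number}")
    parts = []
    for value, symbol in _PAIRS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def to_arabic(numeral):
    """Convert a Roman numeral string (any case) to an integer."""
    numeral = numeral.upper()
    if not numeral or any(ch not in _VALUES for ch in numeral):
        raise ValueError(f"not a Roman numeral: {numeral!r}")
    total = 0
    for index, ch in enumerate(numeral):
        value = _VALUES[ch]
        # A smaller value before a larger one is subtracted (IV, XC, ...).
        if index + 1 < len(numeral) and _VALUES[numeral[index + 1]] > value:
            total -= value
        else:
            total += value
    # Round-trip check rejects malformed input such as "IIII" or "VX".
    if to_roman(total) != numeral:
        raise ValueError(f"not a canonical Roman numeral: {numeral!r}")
    return total


if __name__ == "__main__":
    print(to_roman(1912), to_arabic("MCMXII"))
```

The round-trip check is one simple way to report bad input instead of silently returning a number, which is the kind of case an error-tracking category would be meant to surface.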

Wikifunctions / Abstract

Have you kept your eyes on the development of m:Abstract Wikipedia? Most recent update they have is m:Abstract Wikipedia/Updates/2023-07-20. Guessing that this will involve the WSes as a community, AND the greater WMF for some things that we use and leverage. — billinghurst sDrewth 23:58, 24 July 2023 (UTC)

@Billinghurst: I wouldn't say I'm keeping my eyes on the project, exactly. I try to check in every now and again to see what the expected damage is, is more like it.

I think the whole concept is fundamentally flawed, and the specifics of how they're approaching it are even worse. See e.g. m:Abstract Wikipedia/Function model. They sum it up as "LISP in JSON" which tells you all you need to know: this is a project designed by computer science theorists for people who actually like LISP as a programming language. Its aim is based on the idea that Wikidata makes Wikipedia literally obsolete, because with Wikifunctions you can just automatically generate any Wikipedia article in any language from the factoids in Wikidata.

What it does not aim to solve is anything to do with coding or maintaining templates, modules, or Gadgets across projects. There have been open Phab tasks for a decade or so asking for "Global Gadgets", "Global Modules", etc. and better tools for building and maintaining them, and Wikifunctions / Abstract Wikipedia makes no effort whatsoever towards addressing these. At best we can hope that some of the infrastructure changes that Wikifunctions needs will be reusable for finally addressing Gadgets and Modules shared across projects.

In short I am pissed off that the WMF is spending scarce developer resources and funds on this expensive boondoggle instead of any of the actual practical stuff that we need. Xover (talk) 07:09, 25 July 2023 (UTC)

Heart of the West

I found a Doubleday edition that claims 1904 as their copyright date [1], but the edition was clearly printed later than the volumes listed on the title page. Do you have an authoritative source for the first publication date? For the time being, I've left a note on the page and adjusted the date on the primary WD item, but do not know whether the Doubleday claim is correct. McClure claims they have copyright in 1907. It's early enough that there won't be a renewal record in the usual places to check. --EncycloPetey (talk) 03:33, 26 July 2023 (UTC)

@EncycloPetey: I can't see the scan you linked (geolocked), but in general the bibliographic data for the early O. Henry collections is atrocious; not helped by several of them not actually printing publication dates. But in general the problem is that Porter wrote the stories for magazines, and the collections list copyright dates for the stories rather than publication date for the collection.

Metadata for scans is a complete mess (just try searching IA for any of these titles), but I am trying to primarily use the first publications of the collections by McClure: these do print publication dates (down to the month, actually, it seems; hallelujah!). You can also reverse-engineer rough publication order by looking at the "Author of [previously published collections]" blurbs on the title pages. For example, The Gentle Grafter (1908), on its title page, lists The Voice of the City (1908) in the "Author of …" blurb, but its title page in turn only lists the previous collections.

But we can also cite Britannica:

In 1902 O. Henry arrived in New York—his “Bagdad on the Subway.” From December 1903 to January 1906 he produced a story a week for the New York Sunday World magazine and also wrote for other magazines. His first book, Cabbages and Kings (1904), depicted fantastic characters against exotic Honduran backgrounds. Both The Four Million (1906) and The Trimmed Lamp (1907) explored the lives of the multitude of New York in their daily routines and searchings for romance and adventure, and the former contained the widely popular story “The Gift of the Magi.” Heart of the West (1907) presented accurate and fascinating tales of the Texas range.
Then in rapid succession came The Voice of the City (1908), The Gentle Grafter (1908), Roads of Destiny (1909), Options (1909), Strictly Business (1910), and Whirligigs (1910).

Digging through the mess of scans I've found nothing to contradict these dates; just a ton of really really poor bibliographic metadata. --Xover (talk) 07:38, 26 July 2023 (UTC)

@EncycloPetey: Let me put it more succinctly: as best as my research has been able to discover, the publication dates I had originally provided are correct (and thus your revised dates are incorrect). As I can't see the scan you linked I can't assess that evidence, so absent more information I'm assuming it's a case of bad bibliographic metadata or of a copyright date for a single story conflated with the publication date for the collection. Xover (talk) 06:52, 28 July 2023 (UTC)

@EncycloPetey: How do we resolve this? Xover (talk) 07:45, 31 July 2023 (UTC)

I'm going with the information you unearthed. --EncycloPetey (talk) 16:05, 31 July 2023 (UTC)

Test page

How temporary is temporary? — billinghurst sDrewth 22:43, 27 July 2023 (UTC)

@Billinghurst: Hmm. It was temporary to use for testing while converting the old PageNumbers.js script into a Gadget (the old script couldn't be turned on and off, it was loaded for everyone always, so the first step was creating a mechanism to disable it on individual pages). That's now done so that's no longer a need. But I don't know / can't recall what Pywikibot uses it for, so we'll have to ask Mpaa about that before nuking it. Xover (talk) 06:04, 28 July 2023 (UTC)

If needed, we should undelete it as needed. I hate permanent non-content in main ns. — billinghurst sDrewth 12:39, 28 July 2023 (UTC)

Aristophanes: The Eleven Comedies

Are you able to download full scans from Hathi, and create DjVu files from them? If so, there is a two-volume scan there now that I have been wanting for years: Aristophanes: The Eleven Comedies. It is the first complete translation of Aristophanes into English, and we've had a woefully incomplete copy since before I started editing here. --EncycloPetey (talk) 17:31, 30 June 2023 (UTC)

@EncycloPetey: No, sorry, it's geolocked so I can't access it. Xover (talk) 20:33, 30 June 2023 (UTC)

That's frustrating. I have a physical copy of this work published more than 100 years ago, but this is the only scan I can find.

It looks like the Google copies of the scans may be accessible: (vol. 1) & (vol. 2). Will those work? --EncycloPetey (talk) 20:44, 30 June 2023 (UTC)

Note: If so, volume II looks like it might have a duplicated pair of pages near the front. The main title page ought to be the 11th page of the scan, not the 13th. --EncycloPetey (talk) 20:47, 30 June 2023 (UTC)

@EncycloPetey: Grr, no, geolocked there too. Xover (talk) 06:43, 1 July 2023 (UTC)

I can't access the Hathi copies, but I may be able to get the Google ones. Let me try, and if so, then I'll upload them locally. --EncycloPetey (talk) 17:56, 1 July 2023 (UTC)

Update: Vol. 1 times out each time I try to upload, and Vol. 2 is "larger than the server is configured" for. The files are 100MB and 130MB, respectively, as PDFs. --EncycloPetey (talk) 20:31, 3 July 2023 (UTC)

@EncycloPetey: Normal upload caps out at 100MB. To upload up to 2GB you need to install c:User talk:Rillke/bigChunkedUpload.js, typically by adding

mw.loader.load( 'https://commons.wikimedia.org/w/index.php?title=User:Rillke/bigChunkedUpload.js&action=raw&ctype=text/javascript' );

to your common.js. Then you open the File:filename.pdf page (not the upload form) and pick "Upload (chunked)" from one of the top menus (under "Create" I think). Add the description before selecting the file to upload (upload starts immediately when you pick the file), and you can probably set the "Chunk size" slider to max (~20MB).

It's really not user friendly, but you should be able to figure it out; and it allows uploads up to ~2GB and can retry uploads even if there is an intermittent failure while uploading. It can't work miracles so if Commons / uploads are really broken it'll still fail, but in my experience it manages to complete uploads far more reliably than the built-in uploader. Sadly Rillke is no longer active, because with some user-friendliness improvements it would be a better option for all uploads (for experienced users; newbies still need handholding with licenses and such). Xover (talk) 20:48, 3 July 2023 (UTC)
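
Why chunking copes better with flaky connections can be shown with a small Python sketch. It is not the bigChunkedUpload.js code: the function names, the `send` callback, and the retry count are hypothetical, and only the roughly 20MB chunk size comes from the description above.

```python
def chunk_ranges(file_size, chunk_size=20 * 1024 * 1024):
    """Return (start, end) byte offsets covering a file of file_size bytes."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size > 0")
    return [
        (start, min(start + chunk_size, file_size))
        for start in range(0, file_size, chunk_size)
    ]


def upload_in_chunks(data, send, chunk_size, max_retries=3):
    """Send data piece by piece, retrying only the piece that failed.

    send -- hypothetical callable(offset, chunk_bytes) that raises OSError
            on a transient failure; stands in for the real upload request.
    """
    for start, end in chunk_ranges(len(data), chunk_size):
        for attempt in range(max_retries):
            try:
                send(start, data[start:end])
                break  # this chunk is done; move on to the next one
            except OSError:
                if attempt == max_retries - 1:
                    raise  # give up after the last allowed attempt


if __name__ == "__main__":
    megabyte = 1024 * 1024
    # A 130MB file at the maximum chunk size is sent as 7 pieces.
    print(len(chunk_ranges(130 * megabyte)))
```

An intermittent failure therefore costs at most one chunk of re-sent data rather than the whole file, which is consistent with the better completion rate described above.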

Both volumes now uploaded locally, as:

If you can please create DjVu files from these PDFs that are suitable for proofreading, with the Google notices stripped and vol. II adjusted (it might have a duplicated pair of pages near the front; the main title page ought to be the 11th page of the scan, not the 13th). I will be so happy to finally have a scan for this set. Aristophanes does not yet have a complete set of his surviving plays on Wikisource, and only four of the eleven have scan-backed copies. It's partly the result of Victorian sensibilities, and partly that most translations of his plays have more footnotes per page than text. Same problem as a lot of Shakespeare from the same time. --EncycloPetey (talk) 16:43, 5 July 2023 (UTC)

File:The Eleven Comedies (1912) Vol 1.djvu and File:The Eleven Comedies (1912) Vol 2.djvu. No real quality control (I just processed them technically), but I can tweak them if anything is off. Xover (talk) 20:54, 5 July 2023 (UTC)

In looking at the title page of Volume 1, some text is missing. Compare this view with what I've put in place at Aristophanes: The Eleven Comedies. All the red text from the title page is missing. --EncycloPetey (talk) 21:20, 5 July 2023 (UTC)

@EncycloPetey: If it's just the one page, if you get me a somewhat decent photo of that page I can patch it in. For this purpose even a phone photo will do; just try to get the page to lie somewhat flat and avoid shadows (put a desk lamp or something to the side and below the camera, throwing light at a 45-ish degree angle). I can crop out any extraneous stuff, rotate if it's not perfectly aligned, etc. But if the camera's sensor plane and the page are not parallel it's a lot harder to correct and the result will look awkward. i.e. probably best to open the book only half way, letting the page in question lie flat and the other half stick straight up, rather than laying it out on its back and getting that characteristic "stylized seagull" shape. Like this rather than like this. Xover (talk) 07:01, 6 July 2023 (UTC)

Note that the red text is present in the PDF I sent to you. It's just missing in the DjVu. --EncycloPetey (talk) 15:13, 6 July 2023 (UTC)

@EncycloPetey: Erm. I must be being dumber than usual. I'm not seeing it in the PDFs, cf. the thumbs on the right there. Xover (talk) 17:30, 6 July 2023 (UTC)

That is very odd. They are visible on Google Books with the red text [2], but it does not appear in the local copy. —unsigned comment by EncycloPetey (talk) 19:34, 6 July 2023 (UTC).

@EncycloPetey: Oh, you didn't really think Google would let you download the best version they have, did you? Oh no, you get the crappy over-compressed version of the already crappy scan. Grr!

But if you can view the pages on GBooks, just make the page as large as possible and take a screenshot. I can patch it in from any image file easily enough. Xover (talk) 17:39, 6 July 2023 (UTC)

Will these work? If not, I can take photos from the copies I own. File:Screen Shot 2023-07-06 Temp 1.png File:Screen Shot 2023-07-06 Temp 2.png --EncycloPetey (talk) 17:48, 6 July 2023 (UTC)

@EncycloPetey: Ah, there we go. --Xover (talk) 19:13, 6 July 2023 (UTC)

Vol. 2 looks good, but in Vol. 1, you put the title page in the wrong location. Scan page 9 should be the title page, and page 10 should be blank. Right now the incorrect title page is still in there (in the correct location) and the corrected title page has been added (but in the wrong location). --EncycloPetey (talk) 22:24, 6 July 2023 (UTC)

@EncycloPetey: Fixed. Xover (talk) 05:23, 7 July 2023 (UTC)

@EncycloPetey: Btw, do you want me to remove a pair of blank pages from Vol. 2 so that its title page is also on index 9 as it is in Vol. 1? Xover (talk) 05:26, 7 July 2023 (UTC)

Thanks! I've already done some transcription, and haven't noticed any other issues. The scan is exceptionally clean (considering it comes from Google) and has a good text layer. I've been wanting a scan of this set for nearly a decade. At this point, I'd leave the scans "as is". And thanks again! I'm glad we'll finally be able to have scan-backed copies for all of Aristophanes' surviving plays. --EncycloPetey (talk) 05:28, 7 July 2023 (UTC)

@EncycloPetey: Are we done with File:Screen Shot 2023-07-06 Temp 1.png and File:Screen Shot 2023-07-06 Temp 2.png so we can delete them? No biggie, just housekeeping to avoid amassing too much cruft in the "no license", "no source", "no description" (etc.) maint. categories. Xover (talk) 07:47, 1 August 2023 (UTC)

If you have no further need of them, neither do I. --EncycloPetey (talk) 15:02, 1 August 2023 (UTC)

Copyright on pages

Hi! I was trying to find a license for File:Weasel 2a.ogg and a few other files. They are in use at Wild Weasel mission 19 April 1967. I would say the files and the text are PD because they were created by the US Army. But my question is: should all pages on Wikisource not specify the copyright? --MGA73 (talk) 11:42, 3 August 2023 (UTC)

@MGA73: Yes, all files and top-level mainspace pages should specify the copyright status. There are modifiers and nuances, but as a rule of thumb…

The Weasel files are untagged because there's quite a bit of research needed to determine their status and figure out what to do about them. I am not necessarily convinced they are legitimately PD-USGov, nor am I convinced they have been legally published, and if they were I am not convinced they are in scope since they a) may have been effectively self-published and b) appear to have been heavily modified. At some point somebody is going to have to do the research and shepherd the necessary community discussions, but I've not found the time and energy to volunteer as yet. Xover (talk) 11:50, 3 August 2023 (UTC)

Thank you. I don't think copyright is a problem because the conversation was between US military persons. But as for scope I do not know. If there are files that are uncertain perhaps they should have a template to inform possible reusers about the doubt and to make sure the files are not forgotten.

Same with temporary uploads. If they were in a category they were easier to find. For example File:MU KPB 034 Parsifal.pdf. It was on my list because the self-template does not exist at WS. So I added the license separately to get the file off my list. --MGA73 (talk) 11:56, 3 August 2023 (UTC)

@MGA73: Note that a lot of those files I have, when doing periodic sweeps through them, deliberately not tagged in any way precisely to keep them on the "this needs to be solved properly" list. And screenshots and temporary files are an example where we just need to set up some policy / guidance for tagging and categories, and when to push stuff to Commons, etc. i.e. it's mostly an administrative issue. Xover (talk) 12:06, 3 August 2023 (UTC)

Aha well in that case I might have messed up your system because I added a license to some files.

Looking at User:MGA73/Sandbox it seems that there are 2 bigger groups of documents. One is the Danforthreport files and the other is The Strand Magazine. The first are PD-USGov and the others are probably PD-US if published before 1928, but whether they are PD in the UK is uncertain. So before I add a license to those perhaps you could check if you agree? --MGA73 (talk) 12:16, 3 August 2023 (UTC)

@MGA73: I'm not sure on the danforth stuff. Whether great chunks of it were written by the US Federal Government depends on contractual issues with the third-party experts. Only works actually written by the USGov are exempt from copyright protection; third-party copyright material included in an otherwise PD-USGov document does not magically become public domain. Unless it can be shown that the expert reports were works for hire they must be presumed to be the author's copyright, used by the USGov under fair use or a "by permission"-type license.

The The Strand issues need a lot of research to determine copyright status; not so much for hosting on enWS, but in order to figure out what's eligible to transfer to Commons. It depends on the death date of the authors of the individual articles, and the uploader punted on the issue. Xover (talk) 12:45, 3 August 2023 (UTC)

Yeah the Strand is very tricky. But the PD-US should be safe to add. If anyone wants to move them to Commons they need to investigate to see if all the articles are PD.

Regarding the danforth stuff, the uploader claims it is a Federal government document. But none of the files seem to be in use so fair use is not possible. So I say either we go with PD-USGov and trust the uploader did the relevant research or we nominate for deletion. The documents need a license to stay. Since the documents are unused perhaps delete? --MGA73 (talk) 15:51, 3 August 2023 (UTC)

Request to restore to user space

Hi, could you please restore this page to my user space? I'll do my best to fix up any offending redirects expeditiously. -Pete (talk) 22:38, 4 August 2023 (UTC)

@Peteforsyth:

Done User:Peteforsyth/Portal:Marcus Whitman. Let me know if you need further assistance.

PS. As best I can tell all the content was cut&paste moved to Author:Marcus Whitman so I don't think there's anything interesting there apart from wikiarcheological stuff (the revision history). Xover (talk) 09:43, 5 August 2023 (UTC)

Lint errors.

Hi, Can you have a look at the remaining Stripped Tag errors in Main and Page Namespace?

I've tried to solve more than a few of them, but I've hit a limit of what I feel confident attempting repairs on.

I was concentrating currently on unclosed italics in Page: namespace, but if you fix any of the remaining ns0 Missing tags, it would be much appreciated. Thanks :) ShakespeareFan00 (talk) 20:15, 6 August 2023 (UTC)

Comics

If you're interested in working on comics, we could probably use the same templates as with Film for the transcripts of them (to make the comics textually usable). And it would actually be easier to work on comics in this way than films.

But yes, we are a long way away from putting text in speech bubbles, lol. Comics would be great, and there's a lot of interesting ones. For example, the original 1977 comics featuring Garfield (called Jon) are in the public domain, as I deduced fairly recently and announced somewhere on Commons. (Although unfortunately we don't have scans of the full newspapers, just the comics themselves). PseudoSkull (talk) 08:43, 7 August 2023 (UTC)

@PseudoSkull: Sigh. To have proper tooling... There're folks on Commons that would love to help convert the drawings to SVG, and combined with our transcriptions it could make an awesome old comics archive. But we'd need dedicated tools to work with it, and some way to display them with a smoothness that at least can be compared with Comixology.

In any case, in the mean time I, personally, think we should actually say such things are out of scope. They waste volunteer effort and do not produce worthwhile results. But I think the concept of harm for the issue is too abstract for most people to relate to. Oh well, we have more pressing issues in the near term. Xover (talk) 08:52, 7 August 2023 (UTC)

Out of scope? No no no, that's going a bit far. We don't need to scope them out; the method I proposed could at least be considered a prototype. If we get the tooling to improve the comic situation from there later, then we can convert what's already there. But to insist on having nothing at all, I think, because it's not perfect, is a bit much. Much like we don't need to scope out films because the technology isn't up to par, we don't need to scope out comics either. One of the main advantages of Wikisource is textual, structural, and semantic usability, something most other platforms don't give, and I'd rather have something that is usable for something than nothing at all, just because it's not pretty enough. PseudoSkull (talk) 08:59, 7 August 2023 (UTC)

@PseudoSkull: I'm saying for these categories of works I think the end result is so bad, and the effort required to get there so wasteful and painful for our contributors, that it crosses the line where it is better that we scope them out. It's not that the results aren't perfect, it's that they, for me, tip over the line from being merely "meh" to being actually harmful. The short version is that when we don't manage contributor expectations well, contributors start out on efforts that can never work and then end up leaving the project in frustration; and on the reader side, having too much junk makes readers avoid our site (and not link to it, so not driving PageRank and more new readers), thereby also reducing the pool of potential future contributors. As I said, these are rather abstract harms, so I don't expect most people will agree.

I'd be entirely sympathetic to an argument that for films the tooling and process—and results—are just good enough to fall on the right line of that scale (not necessarily in agreement, but sympathetic to the argument). And for magazine layouts it is very hard to objectively define where the line goes (cf. e.g. Once a Week where they've found pretty decent solutions), so a WWI scoping rule here would have to be pretty general rather than a bright line. But posters, postcards, and comics I think we can and should brightline out until we have at least rudimentary tools to work with them in a sane way and produce passable results. Whether we need a safety valve there for the exceptions and edge cases where we can produce good results I'm not sure about (whether it needs to be explicit, I mean).

But as I said, we have a lot more pressing problems than this so you won't find me agitating for such changes any time soon. Xover (talk) 09:20, 7 August 2023 (UTC)

I did write out a long-winded response, but I deleted it. All I'll say is I disagree, but it's pointless to debate it here because I don't wanna take up too much talk page space (and I admit I was being slightly uncivil). We can debate the issue when it gets to the community's attention. But I will ask if I do make comic transcriptions, please don't nominate them for deletion. I'll do the best I can to make them usable; if you want to have this debate at a future proposed deletion, I'll go into great detail on my position and my evidence there. PseudoSkull (talk) 10:44, 7 August 2023 (UTC)

I will say what you might consider "passable" sounds arbitrary or subjective. For example, many music sheets that could be transcribed can't be transcribed in a passable way because of the technology being limited, but I wouldn't extend this to having policy explicitly ban that kind of content. Understand that adding more and more policy against types of content is the authoritarian solution, and doesn't need to be used unless absolutely necessary. The way I propose to do comics, I think would be passable enough to read and use. And yes, maybe not by your personal standards, but that's your subjective opinion. Some people are more picky than others, I get it. But, we need to avoid the deletionist mentality wherever possible, so we don't become more like Wikipedia, where there are actually academic studies that show that so much deleting of people's hard work is a primary force driving away Wikipedia's new contributor base. I'd hate to see that happen here. I hope no one's personal, very high-bar version of "passable" becomes law of the land for this reason. And that's all. PseudoSkull (talk) 11:37, 7 August 2023 (UTC)

@PseudoSkull: I think you're reading too much into what I say, and perceive my stance as more categorical than it actually is. Regarding films my main point is that the tools we have—both for transcription and presentation—should be way way better so that contributors could work more effectively and so readers could more easily enjoy more of the value that has already been created but is currently obscured by our suboptimal presentation. My secondary point is that I'm not sufficiently familiar with that area to have a particularly strong opinion on where it would fall on the spectrum we're discussing, beyond it being one class of work that would need to be assessed iff one were to seriously consider scoping out some things on this basis. Which nobody is actually proposing currently: I'm just explaining my current thinking on the issue, and the bits we care about here only as an analogy for the other end of the spectrum (the raw data end, where my stance is quite a bit firmer).

The studies you cite I have probably read (as they were published), and I've got the scars to prove my bona fides from enWP, so please believe me when I assert that this deletionist—inclusionist dichotomy is not only not useful but actively harmful. The issues are much much more nuanced than that, and the answers very very rarely black and white, so that lens pretty much only achieves divisiveness, entrenchment, and balkanisation.

For any collaborative project like this to work we first all have to acknowledge that such issues come on a scale, and that where the line is drawn varies from person to person, and that we have to be able to discuss and disagree on where to draw the line without engendering rancour. When someone proposes deleting something you've invested effort into it probably isn't because they're out to get you (well, we know there are exceptions, but, you know, generally...), or don't appreciate the effort you've put in, but because they're viewing it from a different perspective. The whole point of deletion discussions is to tease out those perspectives and try to come to a consensus that takes into account all of them. Once this gets turned into the parodic "keep everything!" vs. "delete (nearly) everything!" the process breaks down.

That being said, and as a purely personal request, please don't start transcribing lots of comics just because they're not bright-line forbidden by scope. Experiment, sure (in user or project space). See if the tools can be made to give decent results, absolutely. Maybe you'll prove me wrong. Your efforts on films have certainly gone a long way towards doing so in that area. But the reason I, currently, think it would be better to scope them out as a bright-line rule (possibly with some kind of exception clause for special cases) is precisely so that nobody starts putting effort into something I don't think we'll be able to do well at all. Because if the results are as I fear then the inevitable deletion discussion will be perceived by the poor contributor in question just as you demonstrated here: it will feel like someone unfairly and callously coming in to delete all your hard work that you put so much effort into. It's situations like that that make contributors ragequit (been there, done that). Having an upfront rule that says "Sorry, you can't do that here" may be frustrating, but it manages expectations and completely avoids the "wasted effort" aspect.

I feel quite confident in assuming that you agree that there has to be a line somewhere. If there is one stance in this whole discussion I actually hold to strongly it is that it is always better if that line (whatever it is) is visible before someone starts contributing such content, rather than showing up in a deletion discussion after they have put blood, sweat, and tears into it.

But as you say, it is a bit pointless to discuss this in depth: I am not actually proposing any such change to the scope, nor actively agitating for it. I am explaining my current thinking. My thinking has been known to change, on thornier issues than this, especially as discussions of them bring to light new information or highlight previously unseen opportunities. Don't let's dig the trenches before they're actually needed, yeah? Xover (talk) 11:42, 7 August 2023 (UTC)

I would tend to agree with you more on posters than comics, actually; I'm pretty confident comics could be more presentable. So hopefully, seeing what I propose would put a light bulb in your eyes. Yes, posters are a messy issue. I transcribed this poster, where I focused on the semantic and structural transcription of its content rather than making it exactly align with the poster itself. I saw it as more valuable to have the text and sections transcribed, etc., as if it were a single document, rather than recreate the entire board game structure they had on the actual poster. That's because in that scenario, if someone wants the transcription of the poster, they're looking primarily for the text content, not an HTML/CSS Da-Vinci-level work of art (which is what that would have to be if we adequately replicated the poster itself). If they want the poster as it is, they'd just use the image. So there might be different perceptions between us on the purpose of a work here. I'm sympathetic to what you're saying, and definitely would agree that a line needs to be drawn somewhere (like, eliminating most blog posts, and source code READMEs, twitter posts, etc.). But, I maintain there shouldn't be very many lines. PseudoSkull (talk) 11:55, 7 August 2023 (UTC)

And except in the area of copyright, there should always be exceptions IMO. For example, I wouldn't want blog posts to be accepted in 99.9% of cases. But if, for example, there was a Wikipedia article about a blog post, or it was particularly notable in that it was a substantial part of some huge incident that is well known, and happened to be freely licensed, I might support an exception for that particular post. Or if there was some other good argument. But the line on that kind of online material should generally be clear, and exceptions can be worked out where they need to be. PseudoSkull (talk) 12:11, 7 August 2023 (UTC)

@PseudoSkull: Indeed: no rule without exceptions; and in many cases the exceptions should be codified along with the rule (whether specific or vague). For example, I don't think we could reasonably make scan-backing mandatory without at the same time codifying the exceptions (like born-digital content). Xover (talk) 12:41, 7 August 2023 (UTC)

Another factor to consider is that we are not even close to a point where we have to start "cleaning up" where scan-backed, fully proofread works are concerned. We barely have any scan-backed works to begin with. There's a lot of empty Index pages, index pages with only pages that are marked "Not Proofread" but are just entirely OCR dumps, and worst of all, the copypaste dumps in the mainspace from 2007 or whatever. Only 5,551 entire indexes proofread and 6,009 validated, out of how many millions of works that might scope into Wikisource to begin with, and how many thousands of works we have total? So what we really do need to clean up, if anything, is work that did not involve blood, sweat, and tears. And that type of work is most often what takes up Wikisource:Proposed deletions space. Generally a good rule of thumb, I think, is that if anything took blood, sweat, and tears, it probably belongs in some form or fashion. Because if someone's going to put their blood sweat and tears towards something, it's usually from a place of cognizance and care. Not a hard rule, but generally that test applies well. Pretty much anything to which I'd actually agree it should be deleted, are things that took almost no effort at all to build, and we have plenty of that to sift through. Scan-backing and proofreading a work, even where we were to assume it looked subpar, is not something that should be deleted, because again, that in and of itself drives away contributors. There is also a "Problematic" status for a reason.

And the reason I bring up the deletionism thing is not because I think the "inclusionism vs. deletionism" debate has merit in and of itself...but because Wikipedia is (or at least has become) a community where you can barely contribute anything meaningful, because half the people there exist on that site only to try and mindlessly delete as much as possible, especially newcomer contributions. And that was my point with Wikipedia. I wasn't saying that we should fully hop on the "inclusionist" bandwagon either, but that Wikipedia has a particular culture of deletionism, that is a problem with their community. I don't even want to contribute anything to their site, because I know the likelihood is very high that my content will get trashed, just based on someone's subjective opinion on the vague and vacuous concept of "notability", even if the sources I cite are valid. One of the benefits of Wikisource, honest to god, is that we don't have a culture like that, and don't measure inclusion solely based on some ridiculously unmeasurable metric like "notability", and I hope we never do. PseudoSkull (talk) 18:07, 7 August 2023 (UTC)

@PseudoSkull: Ah, I see you've run across the New Page Patrollers on enWP. There you can certainly talk about a culture problem. They've managed to let themselves see the world as if they are the thin red line against the barbarian hordes of vandals, perennially understaffed and unable to cope with the sheer firehose of junk, and so they need to act fast and act hard or the entire project will drown in it. They're the moral equivalent of a local sheriff's deputy in a one-horse town who's equipped with high-end military hardware and thinks they're a one-man SWAT team. Not to paint everyone with the same brush (they're just people, good, bad, and indifferent), but the culture there is really problematic and for the precise reason you outline. That is one of the reasons I don't really contribute to enWP anymore. There are lots of other problems too, and some of them are definitely cautionary tales for enWS.

But, getting back to enWS, I don't think any discussion about raised requirements or narrowed scope should start with the assumption we should mass-delete existing works that do not meet the new standard. So for a scan-backing requirement I really mean requiring scan-backing for new texts. That is, there's a built-in exception for existing texts. And in addition we need a robust set of exceptions for born-digital works where a scan is a meaningless concept. We should also decide where hosting born-digital texts actually makes sense (I think that for a lot of them we add no value, waste contributor effort, and our readers would be better served going elsewhere), but that's an orthogonal discussion.

Now, raising the bar or narrowing the scope may over time lead to individual texts getting proposed for deletion on that basis, and on a long enough time scale may also come to mass-deleting what's left. For example if we require scan-backing for new texts and twenty years down the line we only have a handful of "unfixable" non-scan-backed texts left. But mostly I see such a requirement as having two functions: 1) it prevents new non-scan-backed texts from being added (thus preventing the backlog from growing), and 2) it sanctions migrating such texts to a scan (which some people are opposed to, they don't want anybody touching these texts for any reason) and makes it a common maintenance backlog for everyone to contribute to.

As one rule of thumb among many I generally agree with you on the blood, sweat, and tears. But like any such rule, one cannot apply it blindly, absolutely, or without considering other factors. For example, we have Translation:Manshu (cf. WS:PD#Translation:Manshu). There is no question the contributor has invested blood, sweat, and tears into the project. It's also a valuable and unique resource. But as it stands it violates multiple policies. Nobody wants to make the hard call, so we've been stringing the contributor along for a decade (even though the problems were obvious at a much earlier stage). At some point that text is going to get deleted, but probably not until after the contributor is no longer around to re-host the work somewhere else and hence by trying to ignore it being out of policy here we've only achieved making it disappear for good. Then go dig through the contributions of this user related to this author. There's no question there is blood, sweat, and tears invested here (obsessively, even). But you soon start noticing something funny about the "scans" backing the texts (here's a hint: the author was never published in any collection, the collections would certainly not have been made in Microsoft Word in any case, and the editor would not have shared the real name of the contributor adding them). These texts are also going to get deleted eventually, and in the mean time we're wasting that contributor's time instead of guiding them to making their contributions in a way that makes everyone win and the effort stand the test of time. And then there's all of Translation: space, with examples such as Translation:Shulchan Aruch (quite possibly not the worst example by far), that seems to be a wholly original work combining user interpretations of texts from multiple published sources (not even translation, it's interpretation). Lots of effort, probably valuable (I can't tell), but entirely outside the scope of the project.

We have a similar implicit rule that the older something is, the more valuable it is, and therefore if older than mere days it should never be deleted no matter its atrocious state; and usually should not even be attempted to be cleaned up because "it represents the standards at the time". It's fair enough to give a little slack to older texts, but as with other such rules of thumb it can't be a bright-line black and white rule or we'll never improve and slowly slowly drown in our own garbage.

I understand that you feel under attack when I mention films as one area that I am sceptical about, both because it's an area that you're interested in and because it's one where you've already invested the proverbial "blood, sweat, and tears". But please understand that I am not really proposing anything at all regarding films or the other areas. I am just using them as examples. On my list of worries they are way way down the list (below the stuff I sketched out in the above paras and about a thousand other things), and they are in any case areas that I would want to study and consider more carefully before it's even worthwhile to raise any discussion about them. Not least because you have achieved far better results on films than I thought likely when you first proposed working on them (you may recall I, and several others, were sceptical at the time too). But no matter what one concludes regarding these specific areas, I would assert that we need to do better at managing our scope and our standards, to concentrate effort where we can work somewhat effectively and not waste volunteer effort, and where we can produce value to our readers. And that means we have to have those discussions, either to decide to scope something out or to better specify its inclusion. For example, maybe we'd decide films should be explicitly in scope, but some special rules must be met and they have to follow a specific style guide, use a particular set of templates, etc. Maybe only silent movies are in scope, because they tend to tell the story in text cards, or maybe the rule is that there must be English speech to transcribe, or maybe we need to specify that we can't describe what's happening visually even though important for the story because it would be an annotation, or...

I've written too long, so I'll stop here. What I'm trying to get across is that I never want to be shitting on anybody's efforts, and I am always open to the possibility that there's stuff I don't know or don't understand, and I frequently change my mind when presented with new information or arguments. When I make arguments like these I am invariably thinking at the systemic and long term level, and how can we rig the project and our processes so that ten, twenty, thirty years down the line it is better than it is now. We have time, but the sooner we start the sooner we'll get there. Xover (talk) 09:23, 10 August 2023 (UTC)

Purpose of Category:Pages calling header main block with class

Hi. Trying to understand what this hidden category is doing for us. Thx. — billinghurst sDrewth 08:56, 10 August 2023 (UTC)

@Billinghurst: Coding artefact. I'm planning to do further cleanup on {{header}} and friends and need to see where those parameters are being used and for what. The categories will disappear when no longer needed. Xover (talk) 09:26, 10 August 2023 (UTC)

CharInsert

Re phab:T204201#9127376: No, they won’t conflict – the one is called ext.charinsert.styles, the other is ext.gadget.charinsert-styles. If you go to Special:Gadgets, the problem becomes obvious: it tries to load MediaWiki:Gadget-charinsert-styles.css, which of course it can’t. —Tacsipacsi (talk) 20:26, 30 August 2023 (UTC)

@Tacsipacsi: Thank you! This was bugging the heck out of me (and I was starting to invent ever more elaborate theories as to why). Turns out the page was moved in 2014, so this has been broken (in the dumbest possible way) and undiscovered for very nearly a decade. Sigh. Xover (talk) 05:19, 31 August 2023 (UTC)
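
The naming point in this thread can be sketched in a few lines of Python. This is only an illustration, assuming the usual gadget conventions (modules registered under an "ext.gadget." prefix, source pages under "MediaWiki:Gadget-"); the definition line and the helper `parse_gadget_line` are hypothetical and not copied from the live MediaWiki:Gadgets-definition page.

```python
import re

# Simplified, assumed shape of one line in MediaWiki:Gadgets-definition:
#   * name[option|option=value]|page1|page2
DEFINITION_RE = re.compile(r"^\*\s*([^\[\]|\s]+)\s*(?:\[([^\]]*)\])?\s*\|(.*)$")


def parse_gadget_line(line):
    """Return (module_name, page_titles) for one definition line."""
    match = DEFINITION_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a gadget definition: {line!r}")
    name, _options, pages = match.groups()
    # Gadget modules get the "ext.gadget." prefix, so they cannot collide
    # with an extension module such as "ext.charinsert.styles".
    module_name = "ext.gadget." + name
    # Each listed page is looked up as "MediaWiki:Gadget-<page>".
    page_titles = ["MediaWiki:Gadget-" + p.strip()
                   for p in pages.split("|") if p.strip()]
    return module_name, page_titles


# Hypothetical definition line, for illustration only.
line = "* charinsert-styles[ResourceLoader|hidden|type=styles]|charinsert-styles.css"
module_name, page_titles = parse_gadget_line(line)
print(module_name)
print(page_titles)
```

If a page named in the definition has been moved away, the derived title no longer exists and the gadget has nothing to load, which is the failure described above.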

Misnested tags

https://en.wikisource.org/wiki/Special:LintErrors/misnested-tag?titlecategorysearch=&exactmatch=1&tag=all&template=all

I'd been attempting fixes on some of these, but I'd appreciate someone with more technical expertise taking a look at these (and my recent contributions which I am quite prepared to revert.) with the goal of a more stable fix if desirable. ShakespeareFan00 (talk) 08:47, 6 September 2023 (UTC)

@ShakespeareFan00: My stack is a little too deep to add more to it just now. Sorry. Xover (talk) 08:31, 9 September 2023 (UTC)

Update Module:Message box

w:Module:Message box and w:Module:Message box/configuration have been updated to use SVGs for all the images and provide attribution for images that require it. Could you please re-sync the local module so we can have those changes as well? —CalendulaAsteraceae (talk • contribs) 07:16, 9 September 2023 (UTC)

@CalendulaAsteraceae:

Done. Please note that Module:Message box and its main config is a right royal pain to update due to local patches, so please try to keep syncs to a minimum / when there are significant changes. Xover (talk) 08:30, 9 September 2023 (UTC)

Noted, and thank you. —CalendulaAsteraceae (talk • contribs) 22:13, 9 September 2023 (UTC)

Labeled Section Transclusion and Category:Pages transcluding nonexistent sections

I am seeing that pages using #lst are being falsely captured by whichever logic builds the category, wherever it lives. There are quite a few in that list, noticeably all the {{DNB errata}} pages; getting rid of those will do some good tidying. — billinghurst sDrewth 10:15, 11 September 2023 (UTC)

I will also note the use of a faux section to get the /source\ tab to show, per this — billinghurst sDrewth 10:28, 11 September 2023 (UTC)

@Billinghurst: Is the use of #lst in {{DNB errata}} deliberate (i.e. something that should be preserved) or just "it was something that worked"? I don't immediately see why the template is triggering this category, but one likely necessary change to fix it is to switch to using #pages, so if that's a no-go it'd be good to know before I start digging. Xover (talk) 10:36, 11 September 2023 (UTC)

Deliberate. It stops the errata page being the /source\ tab, and leaves the primary transclusion of the 1885-1900 volumes as that source. The transgressions we have for transcluding two different works, but it was better than transcribing the errata elsewhere and linking to them. — billinghurst sDrewth 10:55, 11 September 2023 (UTC)

We also have {{authority reference}}, which will cause the same issue; I just haven't got that far down the list. — billinghurst sDrewth 11:01, 11 September 2023 (UTC)

@Billinghurst: With PetScan I find only three pages that are both in Category:Pages transcluding nonexistent sections and transclude {{authority reference}}. I haven't checked what's triggering it on those three pages but given the low number I'll bet it's something else and {{authority reference}} doesn't actually have this problem (at least not in general). Xover (talk) 11:20, 11 September 2023 (UTC)
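
The same kind of intersection PetScan computes can be approximated with two standard MediaWiki API list queries (`categorymembers` and `embeddedin`). The sketch below only builds the request URLs and intersects title lists that have already been fetched; the API endpoint path, the helper names and the sample titles are assumptions for illustration, and continuation of long result sets is not handled.

```python
from urllib.parse import urlencode

# Assumed endpoint, derived from the site's index.php path.
API = "https://en.wikisource.org/w/api.php"


def category_members_url(category, limit=500):
    """URL listing the pages in a category (list=categorymembers)."""
    params = {"action": "query", "format": "json",
              "list": "categorymembers",
              "cmtitle": category, "cmlimit": limit}
    return API + "?" + urlencode(params)


def embedded_in_url(template, limit=500):
    """URL listing the pages that transclude a template (list=embeddedin)."""
    params = {"action": "query", "format": "json",
              "list": "embeddedin",
              "eititle": template, "eilimit": limit}
    return API + "?" + urlencode(params)


def pages_in_both(category_titles, template_titles):
    """Titles that appear in both result sets, sorted for stable output."""
    return sorted(set(category_titles) & set(template_titles))


cat_url = category_members_url("Category:Pages transcluding nonexistent sections")
tpl_url = embedded_in_url("Template:Authority reference")

# Hypothetical titles standing in for the two fetched result sets.
in_category = ["Page A", "Page B", "Page C"]
using_template = ["Page B", "Page D"]
print(pages_in_both(in_category, using_template))
```

A small intersection, as reported above, suggests that the template itself is not what triggers the category.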

@Billinghurst: I fixed {{DNB errata}}. It was just a silly little logic error that had gone unnoticed since the beginning because it through sheer happenstance caused no noticeable effects apart from triggering this category (well, it also probably caused observable effects down where the database admins might care about it; but since they never complained I think it's safe to assume it wasn't enough to ever pop up on their radar). Xover (talk) 11:18, 11 September 2023 (UTC)

That appears to have rid us of half of the category's population, along with some iterative work with scripts and view. I'll see if I can make a further hole in the cat, though some of the fixes are ... umm ... unusual. — billinghurst sDrewth 14:13, 11 September 2023 (UTC)

Comment Module:Endnote as used in Page:Immanuel Kant - Dreams of a Spirit-Seer - tr. Emanuel Fedor Goerwitz (1900).djvu/71 may also have a similar issue. — billinghurst sDrewth 22:09, 11 September 2023 (UTC)

@Billinghurst: I rewrote it to prevent that problem, but the only two actual cases I found were legitimate cases of bad arguments. In any case, it will now bail without trying to transclude anything when missing arguments. Xover (talk) 12:26, 13 September 2023 (UTC)

And this Template:ShowTransclude is an ugly throwback (upchuck?) in history that I will look to eliminate. It predates <ref follow=...>. I am not certain that I can botify the fix, though I will see what is possible. — billinghurst sDrewth 22:30, 11 September 2023 (UTC)

Template consolidation(s)

To meet a specific requirement a while back, I wrote {{numbered div}} which no longer necessarily meets the quality standards Wikisource aspires to.

I appreciate you have a long to-do list, but in evaluating other templates in response to update situations this one needs to be carefully re-written. In any event some of its use cases were deprecated by subsequent functionality provided in {{ppoem}} and the {{*!/s}} {{*!/i}} {{*!/e}} family of templates. ShakespeareFan00 (talk) 13:27, 11 September 2023 (UTC)

logic for Category:Pages transcluding nonexistent sections ?

There are false positives in the category, typically due to non-linear transclusions, e.g. placement of an image at a paragraph break. Also some of the uses of #section/#lst. Where is the logic for the categorisation, so I can see what may need tweaking? Thx — billinghurst sDrewth 09:25, 20 September 2023 (UTC)

@Billinghurst: It's added by LabeledSectionTransclusion.php. Look for lst-invalid-section-category. Xover (talk) 17:48, 20 September 2023 (UTC)

De-Lints..

Thanks for looking into this again.

You may wish to check some of my non-project namespace reverts of my own attempted fixes, on the grounds that the edits were contentious. Was the aim to empty the backlog of lints entirely?

I was currently focusing my delinting efforts on proofread pages with 'unpaired' italics, as I'd done a massive cleanup on unterminated tables quite recently, and had attempted to reduce the number of "content" pages with structural lints previously. ShakespeareFan00 (talk) 08:31, 1 November 2023 (UTC)

No, I'm just fixing a few specific issues flagged by the linter that are annoying or that will cause issues now that parser unification is on the horizon. The bogus image options linter warning, for example, is useful for detecting newly created bad image syntax, so getting rid of the old ones lets us more easily catch such cases and assist the users making the mistakes. Xover (talk) 08:37, 1 November 2023 (UTC)

Index:Samuel Gompers - Out of Their Own Mouths (1921).djvu

Missing table won't clear, I checked this manually and can't find WHERE it's missing. ShakespeareFan00 (talk) 19:45, 1 November 2023 (UTC)

@ShakespeareFan00: Sadly my mind-reading is a bit on the fritz just now. What is the problem, and on what page does it manifest? Xover (talk) 22:01, 1 November 2023 (UTC)

https://en.wikisource.org/wiki/Special:LintErrors/missing-end-tag?wpNamespaceRestrictions=106&titlecategorysearch=&exactmatch=1&tag=all&template=all

Has 2 items, but when I check those index pages, I can't find an unterminated table. ShakespeareFan00 (talk) 17:56, 2 November 2023 (UTC)

@ShakespeareFan00: I have also no idea, but I noticed that Module:Proofreadpage index template produces three <table>s even though only the innermost of them contains data that can be said to be tabular if one wants to. I’d start with converting the two non-tabular <table>s (#ws-index-container and #ws-index-main-table) to <div>s (possibly using flexbox) – in addition to improving accessibility, doing so would make it clear whether one of these two tables is found to be unclosed, or the third, real table. For sandbox experiments, you can copy the wikitext of the index page from https://en.wikisource.org/w/index.php?title=Index:Samuel_Gompers_-_Out_of_Their_Own_Mouths_(1921).djvu&action=raw. —Tacsipacsi (talk) 22:50, 2 November 2023 (UTC)
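
For sandbox experiments of this kind, a rough way to see which of the nested tables is left open is to walk the rendered HTML and track `<table>` tags on a stack. This is a minimal sketch using Python's standard `html.parser`; the class and function names are hypothetical, the sample markup is invented, and it works on rendered HTML output rather than on wikitext table syntax. Because matching is stack-based, it reports the outermost unmatched opener, which is not necessarily the tag the linter would blame.

```python
from html.parser import HTMLParser


class TableBalanceChecker(HTMLParser):
    """Track <table> openers that never receive a matching </table>."""

    def __init__(self):
        super().__init__()
        self.open_tables = []   # stack of (line, column, id attribute)
        self.stray_closes = []  # positions of </table> without an opener

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            line, col = self.getpos()
            self.open_tables.append((line, col, dict(attrs).get("id")))

    def handle_endtag(self, tag):
        if tag == "table":
            if self.open_tables:
                self.open_tables.pop()
            else:
                self.stray_closes.append(self.getpos())


def unclosed_tables(html):
    """Return the (line, column, id) of every <table> still open at the end."""
    checker = TableBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.open_tables


# Invented sample: three nested tables, only two closing tags.
sample = ('<table id="ws-index-container">'
          '<table id="ws-index-main-table">'
          '<table><tr><td>x</td></tr></table>'
          '</table>')
print(unclosed_tables(sample))
```

An empty result for the rendered Index page would be consistent with the warning being stale or bogus rather than a real unclosed table.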

@ShakespeareFan00, @Tacsipacsi: Indeed, the most likely culprit for this is Module:Proofreadpage index template. I didn't immediately see how it could trigger this linter warning with that simple an Index (no complicated input to the module), but then the code there is kinda grotty right now as it's in the process of being migrated. That's why there are multiple implementations of the same logical thing in there (and unusual calling patterns). The new version, once done, will indeed ditch the layout table. You can test the effects now by putting any number (e.g. 2) in the "Template version" field in the Index and hitting Preview (you can also see a very ugly and overwrought demo-mode by putting in the value 42 there).

I will probably not spend too much time looking for what's causing the linter warnings in the current output, in favour of picking up the migration to new layout again. That's going to take some community testing before flipping the switch so it's likely you'll have to live with these warnings for a bit. Xover (talk) 06:48, 3 November 2023 (UTC)

That being said, it's interesting that lintHint on the Index page shows green. Special:LintErrors should be getting its data from the same source as the API lintHint is using, but there may be some caching going on there (some special pages are updated by weekly cron jobs and such). My suspicion right now is that this warning is bogus, one way or another. Xover (talk) 08:06, 3 November 2023 (UTC)

Lint repairs.. what constitutes 'meaningful' repairs vs cosmetic?

Latest comment: 8 months ago5 comments2 people in discussion

~~https://en.wikisource.org/w/index.php?title=One_Step_Forward,_Two_Steps_Back_(The_Crisis_in_Our_Party)/Chapter_16&action=history~~

I've come into conflict with another editor (who was working on the above) about where the line falls between meaningful edits (such as repairing structurally missing table closures and DIV terminations) and mostly cosmetic rendering changes (such as unpaired P and italic tags).

So I'd like someone uninvolved to examine my past efforts to reduce the lint backlog, and potentially tell me if I am wasting my time trying to reduce the backlog of lint-errors, when it comes to things like unpaired italics that make up the bulk of the remaining content space Linter concerns. Thanks. ShakespeareFan00 (talk) 13:10, 8 November 2023 (UTC)

Don't worry. I've worked out what I think their concern was. ShakespeareFan00 (talk) 14:28, 8 November 2023 (UTC)

@ShakespeareFan00: I think there are two factors at play there.

First that whenever you step into an area where someone else is working you are going to disrupt them to some degree. It doesn't matter which contributor it is; anybody getting disrupted that way is going to be at least slightly annoyed, it's just that some are better at suppressing their annoyance than others. In this particular case you also happened to disrupt a contributor that was already annoyed with you for this very reason, many times over many years. That's one reason you got such a strong reaction.

The other factor is the perceived value of the edits. I've told you before that while I think it would be nice to clear out the lint errors, I think they are a very very low priority. The contributor you annoyed goes further and thinks the lint errors are entirely unimportant. I happen to disagree, but it's an entirely valid opinion to hold. The reason for both of us is that the actual practical problems caused by these "errors" are effectively nil. When they do not actually cause any problems, but the edits to clear them out do cause problems, the cost-benefit assessment just doesn't add up. For some contributors that means your edits to clear the lint errors are the actual problems, not the lint errors themselves.

I think the lesson to take away from this is that if you want to keep working on these then you need to go slower and spend more effort on making sure you don't step on people's toes when you're working. I know that by inclination you tend to want to go fast and efficient and get through huge backlogs. But since this is a collaborative project, the human factors are as important or more important than the more technical ones (like being efficient). It's a different kind of efficiency, is all. We optimize for not annoying other contributors, because in the long run we need lots and lots of contributors working away in relative harmony more than we need maximum efficiency in any one single task by any one single contributor. Xover (talk) 19:44, 8 November 2023 (UTC)

In any case I've got some other projects to work on, if I am allowed to continue with those (like sorting out the Hoyt Quotations mess).

If interested in clearing lint-errors yourself - https://public-paws.wmcloud.org/4407/ql3.txt

I'll see how I feel in the morning about continued contributions on less contentious efforts. ShakespeareFan00 (talk) 00:41, 9 November 2023 (UTC)

https://en.wikisource.org/wiki/Special:LintErrors/missing-end-tag?wpNamespaceRestrictions=0&titlecategorysearch=One+Step+Forward%2C+Two+Steps+Back+%28The+Crisis+in+Our+Party%29&exactmatch=&tag=p&template=all

The substantive technical concern I had is that an unterminated P inside what is otherwise a SPAN (REF tag) is technically malformed; that the parser and browsers currently insert a paragraph break is of course no guarantee that they will continue to do so. Hence in other works, I'd converted P tags inside references to use {{pbri}}, which is an entirely SPAN-based approach to resolving the technical concern, whilst giving a visually similar (though not necessarily identical) rendering.

I can however fully agree with the concern the other contributor has about 'sniped' edits (even if made in good faith.) getting in the way of other repair efforts, especially if there is an apparent lack of communication between multiple contributors, as to what those edits were trying to do.

ShakespeareFan00 (talk) 07:57, 9 November 2023 (UTC)

Repairs to running headers/centered headers.

Latest comment: 8 months ago3 comments2 people in discussion

In looking through some of the remaining lint errors, I am noticing that a fair chunk of them are related to the use of italics in {{Running header}} or {{center}}'ed text. Do you think it would be possible to come up with an automated script for these otherwise trivial, "insignificant" but repetitive repairs (based on the work you've already been doing for {{rh-c}} related repairs)? ShakespeareFan00 (talk) 14:53, 8 November 2023 (UTC)

@ShakespeareFan00: Possibly. If you give me a small handful of links to pages with a description of the type of problem, I can see if these have potential for automated cleanup.

A lot of these kinds of errors can't really be efficiently automated because by nature they are "irregular" (i.e, it's missing one part of a pair) and automation works best on regular problems. The bot run I'm doing on rh-c, for example, is transforming one entirely standardised form to another standard form. That's what bots are good at, so that's an easy job. Xover (talk) 19:48, 8 November 2023 (UTC)

You wanted some examples?

Page:Harold Lamb--Marching Sands.djvu/100, Page:Harold Lamb--Marching Sands.djvu/101

Page:Quiller-Couch--Old fires and profitable ghosts.djvu/100, Page:Quiller-Couch--Old fires and profitable ghosts.djvu/101 were what I was thinking about.

Here the missing italic would be at the end of the relevant parameter field, i.e. you have a malformation of

{{rh|<left>|''<header>|<right>}} in the first <NOINCLUDE>...</NOINCLUDE> section.

ShakespeareFan00 (talk) 08:09, 9 November 2023 (UTC)
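
The malformation described above is regular enough to detect mechanically. A minimal sketch, assuming the header call contains no nested templates or piped links (so splitting on "|" is safe); the function names and the sample header text are hypothetical and only illustrate the idea, not any existing bot code:

```python
def split_template_arguments(template):
    """Split a simple {{name|a|b|c}} call into (name, [a, b, c]).

    Assumption: no nested templates or piped links inside the call.
    """
    inner = template.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call")
    parts = inner[2:-2].split("|")
    return parts[0], parts[1:]


def has_unpaired_italics(argument):
    """Return True if the italic marker '' occurs an odd number of times."""
    # Drop bold markers (''') first so they are not miscounted as italics.
    without_bold = argument.replace("'''", "")
    return without_bold.count("''") % 2 == 1


def close_unpaired_italics(template):
    """Append a closing '' to every argument that has an unpaired italic marker."""
    name, arguments = split_template_arguments(template)
    fixed = [arg + "''" if has_unpaired_italics(arg) else arg for arg in arguments]
    return "{{" + "|".join([name] + fixed) + "}}"


print(close_unpaired_italics("{{rh|12|''MARCHING SANDS|}}"))
```

Anything that does not fit this one regular shape (an opening marker in the wrong place, markup spanning two arguments) would still need a human look.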

another type was - https://en.wikisource.org/w/index.php?title=Page%3AAstrophel_and_other_poems_%28IA_astrophelotherpo00swiniala%29.pdf%2F220&diff=13586046&oldid=10836528

Something didn't feel right..

Latest comment: 7 months ago12 comments2 people in discussion

https://en.wikisource.org/w/index.php?title=Sikhim_and_Bhutan/Chapter_09&oldid=13586547

It was wrong for over 2 years and I didn't even notice :( (cries)

What worries me more, is how many other conversions are broken in the same way, as I did an extensive batch of these conversions. ShakespeareFan00 (talk) 13:54, 9 November 2023 (UTC)

I'm going to run a check on ALL them if I can. ShakespeareFan00 (talk) 13:55, 9 November 2023 (UTC)

Ouch, yeah, that's kinda annoying. But it's also a good argument, in this particular example, of why we should try to repair the scan rather than try to patch around it with complex transclusions. Xover (talk) 14:00, 9 November 2023 (UTC)

The scan isn't broken. It's a layout issue with an image that would otherwise be inline. The layout here effectively moves the image display location to avoid an ugly break in the middle of a paragraph. ShakespeareFan00 (talk) 14:03, 9 November 2023 (UTC)

If there is an ugly break in the middle of a paragraph in the original as published, then we can live with an ugly break in the middle of a paragraph when proofreading. IMO, of course. Oh well. Xover (talk) 14:05, 9 November 2023 (UTC)

And this was caught by a non-existent section category. As I said, it makes me uncomfortable knowing there might be undetected broken transclusions from a faulty conversion that are just going to sit there. Do you have a fast way to check which ns0 pages I converted in this way, so I can either revert or at least track just how much damage I managed to create? ShakespeareFan00 (talk) 14:06, 9 November 2023 (UTC)

No, sorry. Xover (talk) 14:11, 9 November 2023 (UTC)

I think trying to check 11000 pages is infeasible, and so if you consent, I am just going to be more careful moving forward. If you find mangled transcludes, fix them as you would any other? ShakespeareFan00 (talk) 20:58, 9 November 2023 (UTC)

@ShakespeareFan00: You don't check 11k pages manually; you pick a statistically relevant subset and spot check those. If you check around a hundred of them at random you'll get a sufficiently significant result to tell you whether there's an actual problem there. Whatever the number of errors you've found after checking a hundred, that's going to be your base assumption of the proportion of changes with problems. If that number is "zero" the proportion of errors is going to be small enough not to worry about. And if after checking a hundred you've found no more than one or two then, ok, it's a little more than preferable but still within what one should expect for a human error rate. Xover (talk) 07:17, 10 November 2023 (UTC)
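
The spot-check approach can be sketched in a few lines. This is only an illustration: the page titles are placeholders, and the "rule of three" used for the zero-error case is a standard statistical approximation (roughly a 95% upper bound of 3/n when no errors turn up in n checks), not something stated in the discussion itself.

```python
import random


def pick_spot_check(page_titles, sample_size=100, seed=None):
    """Pick a random subset of pages to check by hand, without repeats."""
    rng = random.Random(seed)
    return rng.sample(page_titles, min(sample_size, len(page_titles)))


def observed_error_rate(errors_found, sample_size):
    """Proportion of sampled pages that turned out to be broken."""
    return errors_found / sample_size


def rule_of_three_upper_bound(sample_size):
    """Approximate 95% upper bound on the error rate when zero errors were found."""
    return 3.0 / sample_size


# Placeholder titles standing in for the roughly 11k converted pages.
converted_pages = [f"Hypothetical work/Chapter {n}" for n in range(11000)]
sample = pick_spot_check(converted_pages, sample_size=100, seed=1)
print(len(sample), observed_error_rate(2, 100), rule_of_three_upper_bound(100))
```

With two problems found in a hundred checks, the working assumption would be an error rate of about 2% across the whole batch.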

Which is what I had been doing. So far I'd only found 2 or 3 pages that I've had to rework (but that's still a little higher than I expected.), and those were ones that had already been detected as having missing sections by other mechanisms. I'm also noting that in some instances, of the 11k pages. ShakespeareFan00 (talk) 11:10, 10 November 2023 (UTC)

BTW I ran a query a while back to find potentially mangled transcludes: https://public-paws.wmcloud.org/4407/Badesctions. It has about 100 entries, but I've only so far found 1 or 2 entries where it was me that broke the transclude on conversion. (In some instances, the breakage seems to have happened after other contributors had converted the page to specific transclusion formats.) I'm going to keep checking until I have a more meaningful sample though. ShakespeareFan00 (talk) 11:10, 10 November 2023 (UTC)

@Xover: Right now I have considerable concerns about my own ability, I even found some basic mistakes in some repairs I'd attempted earlier. The sort of mistakes you'd expect a new user to make, not a contributor of over a decade. I am way too focused, and making far too many bad edits right now. It's not easy to admit you've potentially become 'toxic' to a project you deeply care about, but that's the concern. Other issues I've mentioned are perhaps symptomatic of a potential burn-out, ShakespeareFan00 (talk) 23:35, 9 November 2023 (UTC)

The Swiss Family Robinson (Kingston)

Latest comment: 7 months ago3 comments2 people in discussion

This was validated back in 2019. However, in reviewing my own edits, I note the use of an exclude param to suppress some blank pages following image plates. Subsequent to this being validated, I seem to recall changes being made in the Pagenum script to suppress generation of pagenums for 'blank' pages, meaning that the use of the exclude param here might now be redundant. Much appreciated if you could determine if there are simplifications that can now be made. ShakespeareFan00 (talk) 07:38, 10 November 2023 (UTC)

No, pages "Without text" should still preferably be excluded. They do no real harm, so it's not something worth going back to fix, but when transcluding something for the first time it's worth the extra effort to omit them (typically by using exclude=). Xover (talk) 08:00, 10 November 2023 (UTC)

That's what I thought. ShakespeareFan00 (talk) 08:02, 10 November 2023 (UTC)

User:Inductiveload/ActivePageAlert

Latest comment: 7 months ago3 comments2 people in discussion

I have this installed, and I still came into conflict with other users (despite the warnings). Can the script be upgraded to effectively treat pages it would otherwise warn about as 'locked' (i.e disables any edit facilities)?, (potentially generating an edit-request for a talk page or a pending-change request instead.) The UI for this that would be desirable would be something like the messages I would see when trying to edit a page that's fully (admin) protected, or subject to pending changes (the model used on some Wikipedia pages and at Wikibooks) ShakespeareFan00 (talk) 11:25, 10 November 2023 (UTC)

No technical measure is ever going to give you perfect results for this. You're going to have to navigate this with human judgement. Xover (talk) 12:39, 10 November 2023 (UTC)

Fair point. :) ShakespeareFan00 (talk) 12:43, 10 November 2023 (UTC)

List of Carthusians, 1800–1879/1806

Latest comment: 7 months ago5 comments2 people in discussion

A leaky table I think, but it's a contributor that doesn't like their stuff being 'sniped', and I can't tell from looking at the underlying pages, where the stray tag is coming from. ( Found this whilst checking my own older edits, but this was broken before I made mine). ShakespeareFan00 (talk) 12:06, 10 November 2023 (UTC)

I have a strong hint that there's an odd interaction around the sections going on. ShakespeareFan00 (talk) 12:09, 10 November 2023 (UTC)

List of Carthusians, 1800–1879/1803 confirms what's going on. The other contributor is using multicol, and then fromsection/tosection. This means that when transcluded the {{Multicol-end}} gets transcluded as well. This I don't think was the intent of the contributor concerned but they've asked me not to meddle, so perhaps you have a better approach? ShakespeareFan00 (talk) 12:14, 10 November 2023 (UTC)

It was the older change from onlysection to fromsection. I never can remember the exact semantics of fromsection/tosection, but apparently it means "on the starting page, transclude starting from the section named until the end of the page". This as opposed to "on the starting page, transclude only the section named". Since this transclusion was only two pages we could get away with just switching to onlysection=, but in other cases it's going to be a bit more tricky (would need changes to section markup). Xover (talk) 13:15, 10 November 2023 (UTC)
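
The difference between the two behaviours can be shown with a toy model. This is a sketch built only from the description above, not from the ProofreadPage source: pages are modelled as ordered lists of (section name, text) chunks, the trailing {{Multicol-end}} is an unnamed chunk, and the `transclude` function and sample data are hypothetical.

```python
def transclude(pages, fromsection=None, tosection=None, onlysection=None):
    """Toy model of section-based transclusion over a run of pages.

    pages: list of pages; each page is a list of (section_name, text) chunks.
    """
    result = []
    last = len(pages) - 1
    for index, page in enumerate(pages):
        names = [name for name, _ in page]
        if onlysection is not None:
            # Only the named section, on every page.
            chunks = [chunk for chunk in page if chunk[0] == onlysection]
        else:
            start, stop = 0, len(page)
            # First page: from the named section to the end of the page.
            if index == 0 and fromsection is not None and fromsection in names:
                start = names.index(fromsection)
            # Last page: from the top of the page through the named section.
            if index == last and tosection is not None and tosection in names:
                stop = names.index(tosection) + 1
            chunks = page[start:stop]
        result.extend(text for _, text in chunks)
    return result


pages = [
    [("1805", "entries for 1805"), ("1806", "entries for 1806"), (None, "{{Multicol-end}}")],
    [("1806", "more entries for 1806"), ("1807", "entries for 1807")],
]
print(transclude(pages, fromsection="1806", tosection="1806"))
print(transclude(pages, onlysection="1806"))
```

In this model the first call pulls in the stray {{Multicol-end}} from the end of the first page, while the second call returns only the two 1806 chunks.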

I'm not touching this further, but if you want to check related pages, I can't stop you. The fix is the same. ShakespeareFan00 (talk) 13:17, 10 November 2023 (UTC)

Lint. (Sprint efforts in some areas?)

Latest comment: 7 months ago8 comments2 people in discussion

https://en.wikisource.org/wiki/Special:LintErrors/obsolete-tag?wpNamespaceRestrictions=0%0D%0A104&titlecategorysearch=&exactmatch=1&tag=all&template=all

There seem to only be 2 obsolete tags concerns left in Main namespace! (I'm discounting the main page stuff as that seems to be a talk page in Mainspace). ShakespeareFan00 (talk) 13:34, 10 November 2023 (UTC)

And - https://en.wikisource.org/wiki/Special:LintErrors/misnested-tag?wpNamespaceRestrictions=0%0D%0A104&titlecategorysearch=&exactmatch=1&tag=all&template=all is almost cleaned out as well.

(Aside: One of my batch efforts with AWB a few months back was to replace certain RAW HTML tags with templated equivalents, mostly so that per-work IndexStyles had something to potentially latch onto given the efforts other contributors were putting in. Any chance you could, subject to wider consultation, implement something sensible for semi-automated or bot replacement of common RAW tags with templated (and thus styleable) equivalents? SMALL -> {{smaller}} or {{smaller block}}, or BIG -> {{larger}} or {{larger block}}, comes to mind. I checked and there don't seem to be unpaired tags in content namespaces.)

(Please note that per previous conversations I've limited the above to a narrow 'content' namespace search. )

In terms of missing tags - https://en.wikisource.org/wiki/Special:LintErrors/missing-end-tag?wpNamespaceRestrictions=0%0D%0A104&titlecategorysearch=&exactmatch=1&tag=all&template=all

The overwhelming bulk seems to be unpaired italics on unproofread pages :( (My own efforts to generate a list of concerns on proofread pages were at a link I gave previously. That list is around 1500 entries, for stuff that's eventually transcluded.) For a determined competent editor that's 2 afternoons with AWB! :) ShakespeareFan00 (talk) 13:51, 10 November 2023 (UTC)


The distinction between this and the above multicol stuff is that the multicol stuff caused something to actually break, whereas most of the lint errors are just "stuff on a list". The unclosed italics, for example, has literally zero effect in almost all cases because the parser will pick up and add a close tag at the end of the template argument (etc.). Xover (talk) 14:09, 10 November 2023 (UTC)

Yes, I fully understand that. With the exception of some of the misnesting issues, I was no longer finding 'structural' errors being reported via Linter concerns, and as you stated earlier, most of the remaining Linter concerns are of a low enough priority that there is time to come up with good solutions, if needed.

ShakespeareFan00 (talk) 14:29, 10 November 2023 (UTC)

Something else I was going to ask about in relation to looking for potentially mangled transclusions, Do we currently have a tool that can show what's being transcluded on a page in a visual manner? (A sort of transclusion map as I see it.) This would assist in finding not only potentially out of range transclusions (the issue I mentioned previously turned out to be this.) but also gaps (which as you stated previously might in some cases be intentional). The thought was to show something like the PAGES field on Index: pages but confined to the range of Pages: transcluded?.

for example , if you had a <pages index="TestCase01" from=10 to=20 /> what you would see is a

<< .. 10 11 12 13 14 15 16 17 18 19 20 .. >> where the first set of dots linked to [[Page:TestCase01/9]] the last set of dots to [[Page:TestCase01/21]] and the numerals linking to their respective pages.

I can of course use the text form links given on the edit page, but some kind of visual UI map, would speed up the workflow a bit. ShakespeareFan00 (talk) 14:29, 10 November 2023 (UTC)
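
The proposed map is simple to generate once the range is known. A minimal sketch, assuming the index name and page range have already been read from the <pages> tag; the `transclusion_map` function is hypothetical and emits plain wikitext links rather than any real UI widget:

```python
def transclusion_map(index_name, first, last):
    """Build a wikitext navigation strip for pages first..last of an index.

    The leading and trailing ".." link to the pages just outside the range.
    """
    def link(page, label):
        return f"[[Page:{index_name}/{page}|{label}]]"

    parts = ["<<", link(first - 1, "..")]
    parts += [link(number, str(number)) for number in range(first, last + 1)]
    parts += [link(last + 1, ".."), ">>"]
    return " ".join(parts)


print(transclusion_map("TestCase01", 10, 20))
```

A real tool would also need to handle a range starting at page 1, exclusions, and transclusions that draw on more than one index, none of which this sketch attempts.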

We don't have a tool for that that I am aware of. Would be useful in some scenarios, I agree. Xover (talk) 14:44, 10 November 2023 (UTC)

In order to write a Phabricator ticket to request this functionality, we'd need to identify which workflow(s) are made more difficult by its absence. I think mine is:

Use a list of pages to determine if a transclusion contains missing or out of range portions.

Currently, checking a transclusion requires analysis of text-based links which are placed below the edit window and/or a preview. For a long page (as is typical of some transclusions at English Wikisource), these bulky text links can be extensive, necessitating a lot of scrolling. In comparison, checking a Pagelist on an index can be done visually.

Can you think of others? ShakespeareFan00 (talk) 14:52, 10 November 2023 (UTC)

Disregard noinclude(d) portions when presenting Linter concerns?

Latest comment: 7 months ago1 comment1 person in discussion

https://phabricator.wikimedia.org/T350950

Currently Special:LintErrors has no way of disregarding/ marking concerns in <NOINCLUDE>...</NOINCLUDE> portions when analysing. (My use case/example here is those raised in {{rh}} headers, that barring proofreaders/validators, no-one is likely to ever see. ShakespeareFan00 (talk) 15:22, 10 November 2023 (UTC)

Tracking 'transclusion map' changes?

Latest comment: 7 months ago5 comments2 people in discussion

In going through my past edits, I've not yet found any more mangled transclusions.

However, when filtering for the transclusions I needed to review, I found that I was not aware of a tool that generates a list of revisions where the 'transclusion map' (i.e. what a page includes) had changed. What is the likely complexity of an analysis tool that scans through my edits and identifies where my edit(s) made a cumulative change to the 'transclusion map'? (Would this be something that could run on Labs/Toolforge or whatever it's now called?) ShakespeareFan00 (talk) 13:05, 11 November 2023 (UTC)

Complexity would be very high. I don't think this is a realistic option. Xover (talk) 13:07, 11 November 2023 (UTC)

Thanks. What I want to do from a technical perspective is I think as follows:-

For multiple revisions ("oldid")

Identify a specific revision (oldid ?) in ns0 where I ShakespeareFan00 am the user responsible for that revision.
For that revision generate a set of results of which ns104 pages are transcluded ( This must be possible, because the edit UI, provides a list below the edit box.)
Also generate a set of ns104 pages transcluded for the preceding revision, where I am not the reviser.
Compare the two result sets.
If the 2 result sets are not identical, then the "transclusion map" changed with my edits.
For each changed "transclusion map" add the ns0 pageid/title to a results set listing "transclusion map changes"
Report the "transclusion map changes" (ns0 page title/oldid) result set in an appropriate way.

I know the first part is a relatively simple SQL query.

Generating the 2 links table result sets given a specific revision ID is also a simple SQL query.

The difficult part I think would be how to find the difference between the tables, to flag a "transclusion map" changed item in a potential report. It only needs to find a difference, it doesn't for my use case necessarily need to flag/report comprehensive details of what changed. It also only needs to consider changes for ns104, (although it should be flexible to allow for tracking ns10 changes if felt desirable.) ShakespeareFan00 (talk) 13:31, 11 November 2023 (UTC)
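
Once the two sets of transcluded pages exist for a revision pair, the comparison step itself is a plain set difference. A minimal sketch, assuming each revision pair has already been reduced to two collections of ns104 titles; how to obtain those per-revision sets is not addressed here, and the `changed_transclusion_maps` function and its input shape are hypothetical:

```python
def changed_transclusion_maps(revision_pairs):
    """Report revision pairs whose set of transcluded pages differs.

    revision_pairs: iterable of dicts with keys
      "title"  - ns0 page title
      "oldid"  - revision ID of the edit being checked
      "before" - titles transcluded by the preceding revision
      "after"  - titles transcluded by the checked revision
    """
    report = []
    for pair in revision_pairs:
        before, after = set(pair["before"]), set(pair["after"])
        if before != after:
            report.append({
                "title": pair["title"],
                "oldid": pair["oldid"],
                "added": sorted(after - before),
                "removed": sorted(before - after),
            })
    return report


pairs = [
    {"title": "Work A/Chapter 1", "oldid": 101,
     "before": ["Page:A.djvu/10", "Page:A.djvu/11"],
     "after": ["Page:A.djvu/10", "Page:A.djvu/11"]},
    {"title": "Work A/Chapter 2", "oldid": 102,
     "before": ["Page:A.djvu/12", "Page:A.djvu/13"],
     "after": ["Page:A.djvu/12"]},
]
print(changed_transclusion_maps(pairs))
```

Using sets means the order of transclusion is ignored; a reordering of pages would need a list comparison instead.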

https://phabricator.wikimedia.org/T351019 , You are welcome to comment. ShakespeareFan00 (talk) 14:02, 11 November 2023 (UTC)

Hmm, there must be a means to get what was transcluded on the old version of a page, but I am not seeing an obvious way to do it from the database schema directly. ShakespeareFan00 (talk) 13:40, 11 November 2023 (UTC)

Sometimes a little bit of lateral thinking works wonders..

Latest comment: 7 months ago1 comment1 person in discussion

https://en.wikisource.org/w/index.php?title=Page%3AHarvard_Law_Review_Volume_1.djvu%2F385&diff=13592467&oldid=7967879 https://en.wikisource.org/w/index.php?title=Page%3AHarvard_Law_Review_Volume_1.djvu%2F386&diff=13592468&oldid=7967880 https://en.wikisource.org/w/index.php?title=Page%3AHarvard_Law_Review_Volume_1.djvu%2F387&diff=13592469&oldid=7967881

Sometimes it's the 'clever' way of doing something that is what breaks.

For reference here's a list of pages where something like this might also be happening... https://en.wikisource.org/w/index.php?search=insource%3A%2Flst%5C%3APage%5C%3A%2F&title=Special%3ASearch&profile=advanced&fulltext=1&ns104=1 93 pages.. but it might help resolve a few 'missing section' errors? ShakespeareFan00 (talk) 16:54, 11 November 2023 (UTC)

Gems of Chinese Literature/Liang Ch‘i-ch‘ao-My Country!

Latest comment: 7 months ago2 comments1 person in discussion

~~Showing up as having a missing section, but I checked ALL the names. Suggestions please? ShakespeareFan00 (talk) 21:25, 11 November 2023 (UTC)~~

Don't worry, I solved, it by undoing my attempted repairs to an underlying page, and carefully redoing them again:) ShakespeareFan00 (talk) 21:52, 11 November 2023 (UTC)

Blowout of spurious Speedy Deletes in the category

Latest comment: 7 months ago2 comments2 people in discussion

Hi, suddenly there are 209 pages in the category for speedy delete. I think they're related to Template:..., which was temporarily marked for spam speedy delete until the editor realised that it was being used legitimately. I assume that there's a caching issue for the pages to stick in the category without there being an actual delete template on the page. I've tried purging the category, but to no effect. I'd like to prevent the unthinking deletion that someone might do. Any ideas? Beeswaxcandle (talk) 06:09, 14 November 2023 (UTC)

@Beeswaxcandle: Yeah, it's annoying. Category membership is a scheduled update and it can take "forever" to clear in situations like this. Purging the category page usually does nothing, so the only way to expedite is to purge each of the category members. Which in practice means some form of automation like a bot (like pywikibot's "touch" script). I've run a purge on each of the members so Category:Speedy deletion requests should be empty now. Xover (talk) 06:36, 14 November 2023 (UTC)
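
As an alternative illustration to the pywikibot route mentioned above, purging each member can also be expressed as batched MediaWiki Action API calls. This is a sketch that only builds the request parameters and sends nothing; the batch size of 50 titles and the use of forcelinkupdate are assumptions about typical API limits and behaviour, and the member titles are placeholders:

```python
def purge_requests(titles, batch_size=50):
    """Yield parameter dicts for action=purge, one per batch of titles.

    Each dict is meant to be sent as a POST to the wiki's api.php by
    whatever HTTP client or bot framework is in use.
    """
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        yield {
            "action": "purge",
            "titles": "|".join(batch),
            # Ask for the links tables (including category membership) to be rebuilt.
            "forcelinkupdate": "1",
            "format": "json",
        }


# Placeholder titles standing in for the 209 stuck category members.
members = [f"Hypothetical page {n}" for n in range(209)]
batches = list(purge_requests(members))
print(len(batches), batches[-1]["titles"].count("|") + 1)
```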

Index:SRD5.1-CCBY4.0License.pdf

Latest comment: 7 months ago2 comments2 people in discussion

File was apparently renamed at Commons (would have been nice if they'd told us), which has seemingly broken the Page: and Index: linking. Can you move everything associated with this to ONE common name so it all works as designed? Thanks. ShakespeareFan00 (talk) 17:39, 15 November 2023 (UTC)

@ShakespeareFan00:

Done. Now at Index:Dungeons & Dragons System Reference Document.pdf. Xover (talk) 21:19, 15 November 2023 (UTC)

Page:Völsunga Saga (1888).djvu/304

Latest comment: 7 months ago6 comments2 people in discussion

~~Short query, Why does ppoem not display anything? The previous page is using an identical approach and displays fine.~~ ShakespeareFan00 (talk) 14:16, 16 November 2023 (UTC)

Resolved. Don't know why it broke though. ShakespeareFan00 (talk) 14:46, 16 November 2023 (UTC)

The equals sign in the first line of the second stanza confused MediaWiki: it passed {{ppoem}} one argument named Nought hadst thou to praise … with value gear, Blue-white, well woven … and nothing in |1=. Xover (talk) 15:42, 16 November 2023 (UTC)
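
The effect can be reproduced with a simplified model of how template arguments are classified. This is a sketch of the general rule (an argument containing "=" is treated as named, split at the first "="), not the actual MediaWiki parser; the `parse_template_arguments` function and the sample poem text are hypothetical.

```python
def parse_template_arguments(raw_arguments):
    """Classify raw template arguments as named or positional.

    An argument containing "=" becomes a named argument, split at the
    first "="; anything else is numbered "1", "2", ... in order.
    """
    parsed = {}
    position = 0
    for raw in raw_arguments:
        name, separator, value = raw.partition("=")
        if separator:
            parsed[name.strip()] = value.strip()
        else:
            position += 1
            parsed[str(position)] = raw
    return parsed


poem = "A first line of verse\nsome words = more words in the second stanza"

# Unprotected: the whole poem is swallowed as one oddly named argument.
print(parse_template_arguments([poem]))

# Prefixing with "1=" keeps the poem in the first positional slot.
print(parse_template_arguments(["1=" + poem]))
```

In the unprotected call there is no "1" key at all, which matches a template that renders nothing.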

(rookie oversight) I should know better :( ShakespeareFan00 (talk) 15:43, 16 November 2023 (UTC)

@ShakespeareFan00: You don't want to know how long it took me to spot that equals sign and twig to what was going on. That one was a head-scratcher, even though the cause was obvious enough once finally spotted. Xover (talk) 15:58, 16 November 2023 (UTC)

Does ppoem generate TemplateData the Bambots listing service can pick up on? I'd like to hope this was an isolated incident but wanted to be sure ;) ShakespeareFan00 (talk) 18:35, 16 November 2023 (UTC)

Index: using deprecated FONT tag..

Latest comment: 7 months ago2 comments2 people in discussion

https://en.wikisource.org/w/index.php?title=Special:LintErrors/obsolete-tag&offset=2114360&exactmatch=1&tag=all&template=all&titlecategorysearch=&wpNamespaceRestrictions=106

But nearly all of them seem to be using the font tag to white out or hide a POTM marking. Surely the better way to mark POTM or Monthly Challenge items would be a category?

(Aside) Down to about 500 missing end tag lints on ql3 pages BTW. There are some I can't edit for various reasons, so I'd appreciate someone taking some time to hoover up some of those :) ShakespeareFan00 (talk) 18:34, 16 November 2023 (UTC)

@ShakespeareFan00: That's a technically obsolete way to do it, yes. If Wikisource:Proofread of the Month still uses that method we should find a better way and migrate all uses to that, whatever it is (either a straight category or a template that possibly auto-adds a category). But I'm not familiar with how POTM does things so you'd need to raise the issue there. Xover (talk) 11:22, 19 November 2023 (UTC)

ns0 formatting cleanups..

Latest comment: 7 months ago2 comments2 people in discussion

Before continuing I'd like a second opinion on this sort of cleanup:-

https://en.wikisource.org/w/index.php?title=The_Empty_Sleeve&diff=prev&oldid=13651663 ShakespeareFan00 (talk) 22:35, 18 November 2023 (UTC)

@ShakespeareFan00: I wouldn't start that kind of cleanup at volume. The correct fix for these is to scan-back and proofread them one by one. And in any case we shouldn't be using mediawiki heading markup even if the old version uses html h tags. Xover (talk) 11:17, 19 November 2023 (UTC)

A template that doesn't exist

Latest comment: 7 months ago8 comments2 people in discussion

https://en.wikisource.org/wiki/Page:The_practice_of_typography;_correct_composition;_a_treatise_on_spelling,_abbreviations,_the_compounding_and_division_of_words,_the_proper_use_of_figures_and_nummerals_by_De_Vinne,_Theodore_Low,_1828-1914.djvu/220

Is this a variant on the EB1911 Shoulder heading that's sometimes been used outside EB1911 or is there some other appropriate template to use here? What's needed is a left floated block, around which the body text flows.. This seems to be pushing the limit of what HTML can do? ShakespeareFan00 (talk) 10:41, 19 November 2023 (UTC)

@ShakespeareFan00: Would {{il}} work here? It was a quick hack so it needs some refinement (and migrating to use primarily Index CSS for its styling instead of the margin parameters), but it would seem to be roughly the same use case. Xover (talk) 11:16, 19 November 2023 (UTC)

Yes, I'll certainly look at that. I couldn't remember the name of the template when searching. ShakespeareFan00 (talk) 13:31, 19 November 2023 (UTC)

Looks good... Any chance you could look into migrating the non EB1911 uses of the shoulder heading templates? ShakespeareFan00 (talk) 16:40, 19 November 2023 (UTC)

That sounds like it requires more research time than I have available right now. Xover (talk) 16:42, 19 November 2023 (UTC)

I'll ask around then. I didn't think the Rh consolidation was feasible until it happened :) ShakespeareFan00 (talk) 17:04, 19 November 2023 (UTC)

To be clear, I can run the bot replacements. It's the research to figure out which pages to run it on, double-checking that nothing breaks, etc. that is beyond me just now. Xover (talk) 17:09, 19 November 2023 (UTC)

It was the edge cases I was concerned about, which was why I was going to ask around for potential users. ShakespeareFan00 (talk) 17:16, 19 November 2023 (UTC)

Sukavich Rangsitpol

Latest comment: 7 months ago2 comments2 people in discussion

Special:Contributions/G(x)

I was asking about the first link with the other user that voice to delete the page and lock the page .And now the page is gone ,the same as the two user that come to have the page deleted.

Sukavich Rangsitpol is Thai politician that help rural Thai people by His education reform not the corrupted Politician.And he probably was the one pay to delete 1995 Thailand education reform.All of his voter including my cycle did not know yet that he was not the one.

I was asking about reference for his policies to decrease poverty.I was blocked.

In his page the Health Care program is described as if it was his idea, even though it was mandated by section 52 of the 1997 Thailand Constitution.

Can you unlock Author:Sukavich Rangsitpol page and retrieve what had been deleted.It probably will not have copyright anymore because I Google and found nothing.I wonder what was it that the person only come directly to have it deleted and left.2403:6200:89A7:D762:580C:5941:94E2:36DA 12:52, 20 November 2023 (UTC)

I can, but I won't; for the simple reason that all this fuss seems to be about advocating Rangsitpol, which is not the purpose of Wikisource. Please take whatever mission you're on somewhere else. Xover (talk) 13:21, 20 November 2023 (UTC)

Inline page numbering--

Latest comment: 7 months ago3 comments2 people in discussion

https://en.wikisource.org/w/index.php?title=Miranda_v._Arizona/Dissent_White&diff=prev&oldid=13661154

This is, to me, a reasonably straightforward regex search and replace. Do you know of any tweaks or obvious edge cases that might apply? ShakespeareFan00 (talk) 19:02, 20 November 2023 (UTC)

@ShakespeareFan00: Yeah: don't do it. {{pagenum}} is obsolete with the advent of Proofread Page and should not be added to any new texts. :)

But in purely technical terms, no, that should be a straightforward replacement. Xover (talk) 19:34, 20 November 2023 (UTC)

This was not about New Text, but about formatting tweaks to existing content. But I agree with you about scan-backing being the preferred response. I won't continue this project. ShakespeareFan00 (talk) 19:48, 20 November 2023 (UTC)

The Bartered Bride (1908)/Act first

I was told the layout broke. I can't see where.

So before I mass revert a lot of effort I put in, I'd like a second view as to something being broken technically, as opposed to the rendering between the two versions being out of sync due to limitations in the approach or browser differences. ShakespeareFan00 (talk) 23:25, 20 November 2023 (UTC)

@ShakespeareFan00: I think you are too quick to jump to mass-reverting. When doing mass changes you need to be prepared to do mass reverts of those changes, but only if that is actually decided to be both necessary and the best approach. Reverting the second someone raises an issue just creates more noise and chaos. Instead, put the offer on the table and then spend some time investigating and discussing the best path forward. For example: the issue Jan raised does not so far seem to benefit from a mass revert of changes. It might turn out to need it, but right now it seems more likely that it won't. Xover (talk) 06:48, 21 November 2023 (UTC)

In doing the reverts, and re-application of the delints, I found some mistakes I'd made in the initial pass. ShakespeareFan00 (talk) 09:27, 21 November 2023 (UTC)

Containers for scans in image format

I saw your post on task T345519 about ways to use raw scans in image format. Are you familiar with the en:Comic book archive (CBZ/CBR/etc.) family of containers? They're literally just archives with image files in them (the last character defines what type of archive e.g. CBZ = ZIP), and I know there are some places that distribute scans using that format. Arcorann (talk) 08:06, 30 November 2023 (UTC)
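As a hedged illustration of the point that a CBZ is just a ZIP archive with image files in it, here is a minimal Python sketch that lists the page images in such an archive in name order. The function names `list_cbz_pages` and `build_demo_cbz`, and the set of image extensions, are assumptions made for the example and not part of any standard.

```python
import io
import zipfile

# Assumed set of image extensions; real archives may contain others.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".jp2")


def list_cbz_pages(source):
    """Return the image member names of a CBZ (ZIP) archive, sorted by name."""
    with zipfile.ZipFile(source) as archive:
        names = [
            name
            for name in archive.namelist()
            if name.lower().endswith(IMAGE_EXTENSIONS)
        ]
    return sorted(names)


def build_demo_cbz():
    """Build a tiny in-memory CBZ with placeholder bytes (hypothetical data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("page_002.jpg", b"placeholder")
        archive.writestr("page_001.jpg", b"placeholder")
        archive.writestr("notes.txt", b"not an image")
    buffer.seek(0)
    return buffer
```

Calling `list_cbz_pages(build_demo_cbz())` returns only the two image members, ordered by file name, which is roughly how a reader application would derive the page order.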

@Arcorann: I'm familiar with cbz/cbr, yes. But the whole point of my comment is kinda that it doesn't make sense to rely on wrapping scan images in an external container file format. We should upload the master scan images in whatever format they are in (usually jp2 or tiff) and then provide the container features in software inside MediaWiki. Sort of the inverse of what things like WinZip provide for browsing ZIP files as if they were a folder in the file system. For example a special kind of category, that when it contains 100+ scan images with proper sort keys and a magic word (__THIS_CAT_IS_A_BOOK__ or whatever), the software displays it the way a DjVu or PDF file is displayed now. We also need some way for MediaWiki to store the OCR text layer that PDF and DjVu files provide today. We could do that using revision "slots", so that one image file has two slots: one for image data and one for OCR text data. Somewhat like the old Mac OS filesystem (HFS) file "forks".

The concept isn't all that novel or complicated, but we need to figure out how it can be solved inside the architecture of the MediaWiki platform; and then get the buy-in from the WMF and volunteer developers to actually do it. It's going to take quite a bit of engineering work across multiple components of the platform, so it's a good way away even in the best case. But I think we need to start moving in that direction. Xover (talk) 08:22, 30 November 2023 (UTC)
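To make the idea in the comment above a little more concrete, here is a purely hypothetical Python model of it: a "book" is a category of scan images ordered by sort key, and each image carries two slots, one for image data and one for OCR text. None of the names used (`ScanImage`, `BookCategory`, `is_book`, `min_pages`) correspond to existing MediaWiki classes or APIs; they are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ScanImage:
    """One uploaded master scan with two hypothetical revision slots."""
    title: str
    sort_key: str
    image_slot: bytes = b""  # image data, e.g. jp2 or tiff bytes
    ocr_slot: str = ""       # OCR text layer for this page


@dataclass
class BookCategory:
    """A category that the software could present as a single book."""
    name: str
    # Stands in for a __THIS_CAT_IS_A_BOOK__-style magic word (assumption).
    has_book_magic_word: bool = False
    members: list = field(default_factory=list)

    def is_book(self, min_pages=100):
        """Treat the category as a book only if flagged and large enough."""
        return self.has_book_magic_word and len(self.members) >= min_pages

    def pages(self):
        """Return member images in reading order, by sort key."""
        return sorted(self.members, key=lambda member: member.sort_key)

    def text_layer(self):
        """Collect the OCR slot of every page, in reading order."""
        return [page.ocr_slot for page in self.pages()]
```

In this sketch the page order comes entirely from the sort keys, and the text layer is read from the second slot of each image rather than from a container file, which is the inversion described above.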

DjVu request (Robinson)

Could you prepare a DjVu file from (external scan)? There are multiple later books and films with this title, so I recommend naming the file with (Robinson) and/or 1924 in the filename. Thanks for your help. --EncycloPetey (talk) 03:03, 3 December 2023 (UTC)

@EncycloPetey: Index:The Man Who Died Twice (1924).djvu Xover (talk) 11:00, 3 December 2023 (UTC)

Header not displaying contributor

On The Song of Roland/Note on Technique I've added a contributor= parameter that isn't displaying, and I do not know why. --EncycloPetey (talk) 18:33, 9 December 2023 (UTC)

EncycloPetey: You need something in section= on which contributor= may grab hold; I’ve added the name of the section. TE(æ)A,ea. (talk) 19:03, 9 December 2023 (UTC)
Thanks! --EncycloPetey (talk) 19:04, 9 December 2023 (UTC)

Lint - Tables..

https://en.wikisource.org/wiki/Special:LintErrors/missing-end-tag?wpNamespaceRestrictions=104%0D%0A0&titlecategorysearch=&exactmatch=1&tag=table&template=all

I did not feel confident in making the repairs needed. Perhaps you'll have better luck? ShakespeareFan00 (talk) 16:12, 23 December 2023 (UTC)

Geese

I have started a discussion at the Commons:Administrators' noticeboard regarding the renaming of files. I hope I don't have to launch a similar discussion here. --EncycloPetey (talk) 00:53, 25 December 2023 (UTC)
