User talk:Inductiveload/Archives/2022

From Wikisource
Warning: Please do not post any new comments on this page.
This is a discussion archive first created in , although the comments contained were likely posted before and after this date.
See current discussion or the archives index.

Greek template and serif font display

Hi, any idea why Greek fonts (using the {Greek} template) no longer display with serifs? It started happening a few days ago. DivermanAU (talk) 03:53, 30 November 2021 (UTC)

The serifs disappeared for me a month ago, when the template was altered, and returned when I corrected the template to what it was before. Since they're now displaying for me, but not for you, then I would first suggest clearing your browser cache. --EncycloPetey (talk) 04:43, 30 November 2021 (UTC)
It's also sans for me if I remove my personal CSS, because the first font in the old Template:Greek/fonts.css list that I have installed is "DejaVu Sans" (I imagine I share this with all Linux users; Windows users without special fonts installed will probably get Arial Unicode MS, but I'm not sure). As usual, a knee-jerk reversion as a first act is not particularly constructive. A constructive thing to have done here would have been to say what font your browser was actually using from the "styles.css" CSS, and we could have addressed it properly.
@DivermanAU, please liaise directly with @EncycloPetey to find a font ordering that works for you both and please also bear in mind that most of the fonts in the list are not installed by most users. I have my own CSS anyway, so Works For Me (TM) whatever you do. Inductiveloadtalk/contribs 10:04, 30 November 2021 (UTC)
@EncycloPetey do you intend to address @DivermanAU's problem? Reverting something implies to me that you are willing to take some level of responsibility for it, and I don't want to get in your way if you feel you have a better solution. Inductiveloadtalk/contribs 08:45, 2 December 2021 (UTC)
Any plans to reinstate Polytonic template @EncycloPetey? The way it stands currently the redirect to 'Greek' badly affects the look of Ancient Greek-style text. I've edited hundreds of pages in the past of the 1911 Encyclopædia Britannica that have Greek text to use 'Polytonic' to match the print. DivermanAU (talk) 11:27, 10 December 2021 (UTC)
Originally, we had {{polytonic}} for ancient and pre-20th century Greek. The template {{Greek}} was for modern Greek. It looks as though Inductiveload has changed that, though I do not know why. Perhaps we need to return to the old templates. --EncycloPetey (talk) 22:22, 10 December 2021 (UTC)
This is not correct. {{Greek}} was never for modern Greek, it has always applied the lang code grc. That's ancient Greek. Inductiveloadtalk/contribs 22:34, 22 December 2021 (UTC)
Can we please have the {{polytonic}} reverted to its previous behaviour of showing Ancient Greek text in polytonic format? The current re-direct to {{Greek}} seriously affects the display of ancient Greek text. Thanks DivermanAU (talk) 22:04, 22 December 2021 (UTC)
@User:DivermanAU What do you mean by "polytonic format"? In serifs? There's a three-way conflation here between the encoding, the font and the lang code. Was it working for you before EP reverted it? Because it was originally broken for me, and is broken again after the revert (which is to say, it's using a sans font for me, but AFAICT that's never actually been the intention). As far as I was aware from our discussions before, it had been working for you as well until the recent revert? Inductiveloadtalk/contribs 22:43, 22 December 2021 (UTC)
By "Polytonic format" I mean the font display, currently as shown when using {{Polytonic2}}, serifs and variable width lines (for an example, see User:DivermanAU where I compare). EP didn't revert the {{Polytonic}} template, his last edit to it was in November 2018. I am requesting that {{Polytonic}} be reverted from a {{Greek}} redirect back to the November 2018 version so that Ancient Greek text displays in the Ancient Greek font style again. I don't know why the font display changed, but I know it's not a Windows issue as my ChromeBook has the same problem. DivermanAU (talk) 23:18, 22 December 2021 (UTC)
Yes, but it's not clear what the difference is supposed to be between {{Greek}} and {{polytonic}}. They're both just ways to say "ancient greek" (i.e. grc), but they used to be completely separate for no real reason. "Polytonic" or "not" is an encoding thing, not a language thing, and it sounds to me like we should bind the "serif-y" display of the fonts against the language (i.e. grc), as opposed to the encoding. I.e. all Ancient Greek, polytonic or not, is in the same font.
So then the question is: do you want Ancient Greek to be styled as serif or not? I think yes. If you do, the November revert has broken that for both templates for both me and you, it seems, though apparently(?) EP is fine; perhaps they have one of the "special" fonts at the head of the list installed. For me, I get DejaVu Sans; I don't know what you get.
If the answer is "just make all Ancient Greek serif", then the November revert was wrong because it demonstrably does not work and places many sans fonts first, and should instead have been an adjustment of the fonts in Template:Greek/styles.css to ensure it also covers whatever system and font palette EP has, which I don't know and which hasn't been shared with us.
By the way, the "old status quo" you seek was actually still pretty bad for at least some people (like me), because the default polytonic font was always completely dreadful on this computer (I think it used to be Code2000, but I have changed a few fonts recently for other reasons, so now it's coming up with something else, which is certainly better than it used to be).
Can we at least be clear on what you were seeing before November? Because I understood then that it was working at least for you (and it was working for me, or at least it was in serifs).
It might actually be a simple change: does this work? "Ἀθῆναι"? Inductiveloadtalk/contribs 00:00, 23 December 2021 (UTC)
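A sketch of what "binding the serif display to the language, not the encoding" could look like as a single TemplateStyles rule. The class name is the one the template emits per this discussion; the font list is purely illustrative and not the actual contents of Template:Greek/styles.css:

```css
/* One rule for all Ancient Greek, whether reached via the class the
   template sets or a bare lang attribute. The named fonts are
   illustrative; the generic "serif" keyword at the end guarantees
   some serif face even when none of the named fonts is installed. */
.wst-lang-grc,
[lang="grc"] {
	font-family: "GentiumPlus", "Palatino Linotype", "DejaVu Serif", serif;
}
```

With a rule keyed on the language like this, {{Greek}} and {{polytonic}} cannot drift apart, because there is nothing encoding-specific left to diverge.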
Thanks for looking into this further. Yes, there may be some confusion about terminology — I was under the impression that {{Polytonic}} was supposed to display the 'serif-style' fonts. I'm not focussed on the language encoding, but I understand it makes sense to flag the language as a particular type. Earlier than November, the serif-style fonts were displaying when using the {{Polytonic}} template (which was, and is still currently, a redirect to the {{Greek}} template) but that seemed to change maybe around mid-November. EP tried a change to the Greek template on 28 November — as they had noticed that serif fonts weren't showing either. But your test above is displaying the serif-type fonts for me, so if that can be implemented, that would be great! Thanks again for your efforts. DivermanAU (talk) 03:23, 23 December 2021 (UTC)

┌─────────┘

{{Greek}} and {{Polytonic}} may have set the same lang code, but they set different CSS class names: .wst-lang-grc and .polytonic. Template:Greek/fonts.css has different font lists for the two classes, and uses .grc for the non-polytonic class. That means there are plenty of opportunities for differing behaviour just based on the CSS differences. And since the class name now mismatches and the style is never applied, the current behaviour relies entirely on whatever automagic the webfonts stuff happens to trigger based on @lang=grc. This should be adding a webfont download of GentiumPlus, but it may be modifying that based on what @font-family is being applied through other means (it was designed to look at inline font specifications; how, or whether, that's been adapted to TemplateStyles, or site CSS for that matter, I haven't been able to determine).
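The mismatch described above can be sketched as two stylesheets that no longer meet. The selectors are the ones named in the discussion; the font lists are placeholders, not the real contents of /fonts.css:

```css
/* What Template:Greek/fonts.css targets (placeholder font lists): */
.grc       { font-family: "SomeSerifFont", serif; }  /* non-polytonic list */
.polytonic { font-family: "OtherSerifFont", serif; } /* polytonic list */

/* But the template now emits markup like:
     <span class="wst-lang-grc" lang="grc">Ἀθῆναι</span>
   Neither rule matches .wst-lang-grc, so neither font list ever
   applies, and rendering falls through to whatever ULS decides to
   do based on lang="grc" alone. */
```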

#1 .grc Ἀθῆναι
#2 .polytonic Ἀθῆναι
#3 .grc+@lang Ἀθῆναι
#4 .polytonic+@lang Ἀθῆναι
#5 @lang Ἀθῆναι
#6 {{lang}} Ἀθῆναι
#7 @lang+@font-face (.grc) Ἀθῆναι
#8 @lang+@font-face (.polytonic) Ἀθῆναι
#9 @lang+@font-face (.wst-lang-grc) Ἀθῆναι

I am seeing (MacOS) definite differences between the display under the font lists for .grc and .polytonic. With @lang the difference disappears, and both look like just .grc. With just @lang it looks like it does with just .polytonic. The variants with @lang and an inline @font-face look the same as with @lang and .grc/.polytonic, so I'm guessing ULS reads and adapts to font-face rules present both inline and in TemplateStyles. So the major difference to consider is probably using just @lang (no local styling) vs. @lang + some local styling. And that depends on whether the same display quirks apply for all browsers and platforms. --Xover (talk) 10:25, 23 December 2021 (UTC)

@Xover Greek and Polytonic are the same thing: the latter is now a redirect to the former.
The problem is that the template was reverted carelessly, without actually looking at what it's doing. So the template now doesn't match the CSS at all. The CSS for wst-lang-grc at Template:Greek/styles.css has only one rule, the idea being that all Ancient Greek (grc) would style the same. No template is using the class names at /fonts.css (which is, obviously, why it's not working). In theory, grc shouldn't need any templates other than to set the lang attribute: ULS does the rest and allows user control. But...
There is a glaring deficiency of ULS UI (phab:T289777): there's no obvious way for a user to configure the webfont for any language that's not the system interface language without a faff of changing the interface language to Ancient Greek, then changing the font, and then returning to English. But, we could indeed set Gentium as a default for users who have not set any other preference: the code looks something like this:

Example

	mw.loader.using( 'user.options', function () {

		// note: the fallback must be a JSON *string* ('{}'), or JSON.parse throws
		var prefs = JSON.parse( mw.user.options.get( 'uls-preferences' ) || '{}' );

		if ( !prefs.webfonts ) {
			prefs.webfonts = {};
		}
		if ( !prefs.webfonts.fonts ) {
			prefs.webfonts.fonts = {};
		}

		var changed = false;

		// the user hasn't specifically set a font here, use Gentium
		if ( prefs.webfonts.fonts.grc === undefined ) {
			prefs.webfonts.fonts.grc = 'GentiumPlus';
			changed = true;
		}

		if ( changed ) {
			var val = JSON.stringify( prefs );
			mw.loader.using( 'mediawiki.api', function () {
				var params = {
					action: 'options',
					optionname: 'uls-preferences',
					optionvalue: val
				};
				new mw.Api().postWithToken( 'csrf', params )
					.then( console.log );
			} );

			mw.user.options.set( 'uls-preferences', val );
		}
	} );
Probably needs more thought for logged-out users? Inductiveloadtalk/contribs 10:41, 23 December 2021 (UTC)
Well, my thought was… Do we need to faff about with it? Does using only @lang=grc alone produce reasonable-ish results (through ULS's meddling)? It doesn't have to be perfect, or any given user's most preferred font, just so long as it is acceptable-ish and works consistently. Then we can lobby ULS for a better default if we really need to. I have no idea how this is actually supposed to look, but given the main complaint seems to be serif vs. sans, the presence of several sans fonts in our myriad @font-face rules is rather an odd choice. Hence my theory that we're 1) making this more complicated than it needs to be, and 2) creating weirdness and inconsistencies where none need be. But, hey, Greek is almost entirely Greek to me, whether modern or ancient, so I may just be confused. Xover (talk) 11:46, 23 December 2021 (UTC)
Oh, and none of the above table used the templates (apart from raw {{lang}}). I was referring to the CSS classes, and .grc and .polytonic (and .wst-lang-grc) use different @font-face specifications. Xover (talk) 11:54, 23 December 2021 (UTC)
@Xover I completely get the idea there, and broadly it's my feeling too: setting any fonts in the markup is presumptive and takes away from the user-agent's control - we should set @lang=grc and let it be configured from there (with a ULS polyfill, perhaps). And also setting actual font-families is always fraught unless you're serving the webfonts because there are no guarantees about who has what. But, clearly, that is not a universally-shared feeling.
I suspect that the serif thing is just because lots of people assumed polytonic meant serif, when it actually just happened to mean serif for them. It did come out as serif for me, albeit in a very ugly font.
BTW, one more data point for the pile: on Arch/Firefox and without other ULS shenanigans or classes, just @lang=grc comes out as DejaVu Sans because the body { font-family: sans-serif; } is "winning" for me. Though since I have now set my ULS grc font to Gentium, it does look rather nice if I allow ULS to run (or I could install ttf-gentium-basic). Inductiveloadtalk/contribs 12:06, 23 December 2021 (UTC)
@DivermanAU, @EncycloPetey: Which, if any, of the examples in the table above looks as you expect "polytonic" to look? If none do, is there any among them that look at least minimally acceptable even if not actually good? Xover (talk) 12:53, 23 December 2021 (UTC)
Two issues here: (1) polytonic, @lang, and {{lang}} look different from the other options, but (2) none of them have serifs. The serif option is not supported in this namespace, so no test in this namespace will work properly. --EncycloPetey (talk) 20:35, 23 December 2021 (UTC)
This is completely unrelated to the serif option. That just sets serif on the whole content block. Since the templates use a class or an attribute selector, they'll be more specific and override the global settings (which is why forcing font families on readers causes these issues in the first place). Inductiveloadtalk/contribs 21:28, 23 December 2021 (UTC)
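The override works through CSS inheritance rather than a specificity contest as such: the serif layout option sets a font on an ancestor block, which the Greek span only inherits, while the template's rule matches the span directly, and a directly matched declaration always beats an inherited value. A sketch (selector names assumed for illustration):

```css
/* The site-wide "serif" option styles the whole content block: */
.mw-parser-output { font-family: serif; }

/* A rule matching the Greek span itself overrides the inherited
   value no matter how the outer rule is written: */
[lang="grc"] { font-family: "DejaVu Sans", sans-serif; }

/* Result: serifed Latin text with a sans Greek span on the same
   page, i.e. exactly the kind of mixed display being discussed. */
```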
Thanks Xover for investigating. For me, 5 out of the 8 table examples appear in "polytonic" (i.e. with serifs) — these are: .grc, .grc+@lang, .polytonic+@lang, "@lang+@font-face (.grc)" and "@lang+@font-face (.polytonic)". Or, if it helps, numbers 1,3,4,7,8 in the table appear "polytonic". This was on a Windows 10 Enterprise PC (both using Chrome and Edge browsers), same result for a second Windows 10 Home PC and a work Win10 Enterprise PC. On a MacBook Pro, none of the fonts appear in 'polytonic', but polytonic fonts didn't display before either — presumably because no 'polytonic' style fonts are installed on it. DivermanAU (talk) 21:02, 23 December 2021 (UTC)
Update - I hope you don't mind but I fixed a typo in your table, you had "plytonic" instead of "polytonic" in the second row, so this now means I see 'polytonic' (or serif fonts) in examples 1,2,3,4,7,8. DivermanAU (talk) 21:14, 23 December 2021 (UTC)
For me (without any ULS override), it's like this: phab:F34894469. 2, 4, 8 and 9 end up in serif fonts (which are actually a mixture of P052 and DejaVu Serif, except #9, which is all DejaVu Serif, FWIW). I added one more option at the end (#9), which is what would be displayed if {{Greek}} pointed again to the CSS that matches the class set in the template (it's the same as it was in November, but the last fallback entry is serif now; probably that's the actual fix that should have been made). Inductiveloadtalk/contribs 21:18, 23 December 2021 (UTC)
@Xover: My view of the table is below:
(the ninth recently added looks the same as the eighth to me). DivermanAU (talk) 22:14, 23 December 2021 (UTC)
On my ChromeBook, the display was virtually identical to that of Inductiveload (rows 2,4,8,9 in the table appear with serifs). DivermanAU (talk) 01:30, 24 December 2021 (UTC)
One more: an Android device (who knows what fonts it actually has!) on Firefox, only #9 is serif for me. Inductiveloadtalk/contribs 08:52, 24 December 2021 (UTC)

┌───────────────────┘
Let's see if I've got this correct:

 #   Code                                Rendering  DAU (Win10)  DAU (ChromeBook)  IL     IL & DAU (Android)  EP
 #1  .grc                                Ἀθῆναι     OK
 #2  .polytonic                          Ἀθῆναι     OK           OK                OK[1]
 #3  .grc+@lang                          Ἀθῆναι     OK
 #4  .polytonic+@lang                    Ἀθῆναι     OK           OK                OK[1]
 #5  @lang                               Ἀθῆναι
 #6  {{lang}}                            Ἀθῆναι
 #7  @lang+@font-face (.grc)             Ἀθῆναι     OK
 #8  @lang+@font-face (.polytonic)       Ἀθῆναι     OK           OK                OK[1]
 #9  @lang+@font-face (.wst-lang-grc)    Ἀθῆναι     OK           OK                OK     OK
  • [1]: This is all in serifs but mixes two fonts within a single word: Ἀ and ῆ are in DejaVu Serif, something else for the others

@EncycloPetey: Based on your comment above it sounds like none of these show up in serif for you. Is that correct? And as a separate question: I think I pick up from discussions elsewhere (I could be wrong) that you disagree that the "correct" display is using serifs? If I understood that correctly, are there any of the variants above that display correctly according to how you personally think they should look?

As I see it there are two separate issues to address: 1) getting a template/font setup that displays polytonic Greek correctly in the sense "capable of displaying all five diacritics"; and 2) shows in a typeface that is visually pleasing (a subjective issue). My suspicion is that the two issues are being conflated, and that if we can disentangle them we'll be able to nail down the first issue properly. And if we do that we can more easily try to address the second through community discussion of what the default should be, and which aspects need to be configurable. And let me just say from the outset that any @font-face specification that contains a mix of serif and sans typefaces is a big huge red flag: the results of that in any given web browser are going to be essentially random and unpredictable. --Xover (talk) 10:37, 24 December 2021 (UTC)
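The point about mixed serif/sans @font-face lists can be made concrete: a font stack is consulted left to right, per character, against whatever the reader happens to have installed, so a mixed list gives different classes of typeface to different users, and can even mix faces within a single word when glyph coverage differs. Both lists below are illustrative, not quotes from the actual stylesheets:

```css
/* Fragile: serif and sans faces mixed, so the outcome depends
   entirely on the reader's installed fonts. */
.greek-fragile { font-family: "Palatino Linotype", "DejaVu Sans", "Gentium", serif; }

/* Predictable: every named face is a serif, and the generic "serif"
   keyword is the guaranteed last resort. */
.greek-consistent { font-family: "GentiumPlus", "Palatino Linotype", "DejaVu Serif", serif; }
```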

You are correct that none of the above versions display with serifs for me. The problem is that we have an option for turning serifs "on" or "off", but this functionality does not work for polytonic Greek. When the {{polytonic}} template was redirected to the {{Greek}} template, and that template was changed, any option for serifs was lost for me. The non-serif fonts that I get make it nearly impossible for me to read diacritics. For example, I have extreme difficulty distinguishing between smooth and rough breathing marks in the non-serif fonts. There are many other readability issues for me with no ability to read Greek with serifs. For me, the primary issue is readability.
A second issue is that serifs cannot be turned on or off in the Greek text for me: if I turn on serifs when reading a work, serifs only activate in certain namespaces, but not others. I cannot get serifs in the Page namespace, or the Template namespace, or any other namespace. This is a general issue with the serifs option. But, along with that issue, when I do turn on serifs in the Main namespace for a work, serifs for the Greek text do not turn on, so I get serifed Latin text, but serif-free Greek text on the same page. If a user turns serifs on, it should turn serifs on, not turn some serifs on and not others.
If we're going to tell users "you can turn serifs on or off at your discretion", then that functionality should actually work. --EncycloPetey (talk) 19:17, 24 December 2021 (UTC)
I've done further testing on other devices: for an Apple iPad the only font displayed with serifs was #9. For a Windows 8 laptop, the serif fonts were #2, 4, 8 & 9 (same as for ChromeBook), although #9 was a different font (it looked a little bolder but perfectly acceptable). So, I believe we have enough evidence to change the {Polytonic} template to that of #9 above. This would enable most platforms to display Ancient Greek text in the Ancient Greek style. DivermanAU (talk) 20:57, 4 January 2022 (UTC)

Going back to first principles (Take 2)

Ok, there seem to be multiple issues to figure out here, but to try to tackle them a bit at a time…

@DivermanAU, @EncycloPetey, and @Inductiveload: Does the below box seem to show the Greek in an at least reasonable typeface?

ἀἁἂἃἄἅἆἇὰάᾰᾱᾶᾳᾲᾴᾀᾁᾂᾃᾄᾅᾆᾇᾷ
ἠἡἢἣἤἥἦἧὴήῆῃῂῄᾐᾑᾒᾓᾔᾕᾖᾗῇ
ἰἱἲἳἴἵἶἷὶίῐῑῖῒΐῗ • ἐἑἒἓἔἕὲέ
ὀὁὂὃὄὅὸό • ῥῤ
ὑὓὕὗὺύῠῡὐὒὔὖῦῢΰῧ
ὠὡὢὣὤὥὦὧὼώῶῳῲῴᾠᾡᾢᾣᾤᾥᾦᾧῷ
'ΑἉ'Ε'Ε'ΙἹ'ΟὉὙ'Υ
ᾺἈἉῈἘἙῚἸἹῸὈὉῪὙ
Ἀ Ἐ Ἠ Ἰ Ὀ Ὠ (Unicode pre-composed Greek capital letters with smooth breathing (psili))
Ἀ Ἐ Ἠ Ἰ Ὀ Ὠ (Unicode Greek capital letters with combining comma above)
Ἁ Ἑ Ἡ Ἱ Ὁ Ὑ Ὡ (Unicode pre-composed Greek capital letters with rough breathing (dasia))
Ἁ Ἑ Ἡ Ἱ Ὁ Ὑ Ὡ (Unicode Greek capital letters with combining reverse comma above)
Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω (Unicode Greek capital letters)
α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ σ/ς τ υ φ χ ψ ω (Unicode Greek minuscule letters)
ϐ ϑ ϛ ȣ Ϝ ϝ Ϲ ϲ ϖ ᾽

ᾺἈἉῈἘἙῚἸἹῸὈὉῪὙ — This line using {Polytonic2} template for comparison.

By my calculation the above should show in a serif font that supports all the characters present in the box (ignore the "•", it's just a marker I added for… reasons). It should also all be a single font (no sneaky characters borrowed from a different font). --Xover (talk) 15:01, 5 January 2022 (UTC)

Yes, to me (on a friend's Windows 8 PC), the Greek text appears in a serif font, the different diacritics appear clear. DivermanAU (talk) 20:50, 5 January 2022 (UTC)
The lowercase letters that are shown do display correctly, however the font does not support smooth breathing marks on capital letters. The lowercase sigma looks odd, and there may be additional issues, but not supporting smooth breathing marks on capital letters is a serious enough issue. --EncycloPetey (talk) 23:35, 5 January 2022 (UTC)
@EncycloPetey I added an extra line to the sample text above (using characters from the "Greek" dropdown option). Do these look better? DivermanAU (talk) 06:06, 6 January 2022 (UTC) — I also just added sample text using {Polytonic2} to compare. For me the last two lines of Greek text show curly smooth breathing marks correctly. DivermanAU (talk) 08:07, 6 January 2022 (UTC)
@EncycloPetey: There is (was) no sigma in the above letter specimen? It was just a semi-random selection of letters with diacritics that I grabbed for convenience (i.e. laziness).
But I have now added examples for all the vowels in capitals with rough and smooth breathing marks, using both pre-composed characters (a single Unicode code point that includes the diacritic) and combining characters (the base Greek character and a "combining" diacritic character as two code points that your browser / operating system combines on display). I have also added full-alphabet samples for upper- and lower-case letters without any diacritics for reference.
Could you take another look and see if we're getting close to something usable? The new samples are the last six lines (the ones that have a comment in parentheses after the sample). --Xover (talk) 08:58, 6 January 2022 (UTC)
@EncycloPetey, @Xover, for me (both on Windows 10 and ChromeBook), the new samples look good to me — all diacritics look curly and all characters have serifs. DivermanAU (talk) 11:11, 6 January 2022 (UTC)
Looks good to me on desktop (at least, it's using Gentium, which it's loading via ULS, and it looks like this: phab:F34909344). This doesn't happen on the mobile subdomain, so it just uses DejaVu Serif (I think that is Firefox trying to be clever, since if I remove 'font-family: GentiumPlus;', it uses DejaVu Sans, the normal default). Rough breathings and sigmas look fine to me. Inductiveloadtalk/contribs 14:02, 6 January 2022 (UTC)
I've added one additional line of rare characters that have cropped up in works I've transcribed here. They look good to me, but @Inductiveload: and @DivermanAU: should look at that line too. Everything looks good to me now. --EncycloPetey (talk) 01:40, 7 January 2022 (UTC)
@EncycloPetey: — Those additional rare characters look good to me too (on both Windows 10 and ChromeBook). DivermanAU (talk) 03:23, 7 January 2022 (UTC)
@DivermanAU, @EncycloPetey, @Inductiveload: Awesome. Thanks! I've updated {{greek}} with this approach now, so hopefully it will now produce both more consistent and better results. I'll wait a while before deleting {{polytonic2}} and other cleanup in case other issues crop up, but other than that I believe we're more or less where we need to be for right now (Yay!). But this whole system is really finicky and fragile (it involves interactions between about a gazillion different components, including the details of each user's operating system and web browser, and really esoteric features of them all), so in future, if similar problems crop up, it's probably best to take an analytical approach à la the above rather than reverting or making changes to various bits and pieces: the latter is apt to create both more problems and more confusion. Feel free to ping me, of course, if you think I can help, but I should note that in saying that I am in no way, shape, or form implying any particular expertise, just offering to help. :) Xover (talk) 09:44, 7 January 2022 (UTC)
@Xover thanks very very much for the effort untangling it! Inductiveloadtalk/contribs 09:51, 7 January 2022 (UTC)
@Xover: thanks for sorting this out. One minor query, the new {Greek} template now produces a display which shows a smaller font size than it used to. (See my talk page for a comparison, ninth line down; the {Polytonic2} template shows a 'normal' font size.) DivermanAU (talk) 19:29, 7 January 2022 (UTC)
@DivermanAU: The font size for all these is the same: calc(1rem * 0.85). But the other templates are picking a different font (which one depends on your OS and browser, so I can't check which; on my system it picks either Helvetica or Arial Unicode MS), and those may have both a larger x-height and a heavier stroke. This is generally the sort of variability one has to live with on the web since the reader's web browser is not directly under our control. Xover (talk) 20:10, 7 January 2022 (UTC)
@Xover: thanks for the explanation. But is there a reason why the font size is reduced to 85%? DivermanAU (talk) 20:23, 7 January 2022 (UTC)
@DivermanAU: You'd have to ask the WMF. They set everything to 85% of your base font size. The first thing any designer does is reduce the font size (typically to 80%) and increase the line-height (by 1.2–1.4). I think it's some kind of religious tenet or something. I mean, I'm sure it looks prettier in running text and all, but I've never quite understood why the browser defaults aren't a good enough default for most web sites. But I digress… We inherit that from the MediaWiki skin (and all the MW skins set this roughly the same). Xover (talk) 20:43, 7 January 2022 (UTC)
@Xover: can we adjust the font-size in the {Greek} template to 117% to compensate for the 85% in the MediaWiki skin? That would produce a 'normal' size font. DivermanAU (talk) 21:50, 8 January 2022 (UTC)
@DivermanAU: You misunderstand: it's not the Greek font that is set to 85% of the rest of the site; it's all body text on the site that is set to 85% of whatever font size you have set in your web browser. The Greek text has the exact same font-size specification as the surrounding text. Adjusting just the Greek text would bring it out of line with the surrounding text and make all formatting templates behave unpredictably. Xover (talk) 22:22, 8 January 2022 (UTC)
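The arithmetic behind this exchange, as a sketch (the selector is assumed for illustration; only the 85% figure is from the discussion):

```css
/* The skin, not the Greek template, scales all body text once: */
.mw-body-content { font-size: calc(1rem * 0.85); }

/* The Greek span simply inherits that size. A compensating bump here,
   e.g. font-size: 117%, would give only the Greek ~100% of the browser
   default (0.85 × 1.176 ≈ 1.0) while the surrounding Latin text stayed
   at 85%, so the Greek would look oversized next to its own line. */
```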
@Xover: - thanks again for the explanation. I just saw that text with the new {Greek} template
ΑαΒβΓγΔδ — {Greek} template
 is smaller than before and smaller than using {Polytonic2}
ΑαΒβΓγΔδ — {Polytonic2} template
I guess the font that gets chosen when using {Greek} now (Gentium Plus) is actually smaller; I get "Palatino Linotype" when using {Polytonic2}. DivermanAU (talk) 06:00, 9 January 2022 (UTC)

How we will see unregistered users

Hi!

You get this message because you are an admin on a Wikimedia wiki.

When someone edits a Wikimedia wiki without being logged in today, we show their IP address. As you may already know, we will not be able to do this in the future. This is a decision by the Wikimedia Foundation Legal department, because norms and regulations for privacy online have changed.

Instead of the IP we will show a masked identity. You as an admin will still be able to access the IP. There will also be a new user right for those who need to see the full IPs of unregistered users to fight vandalism, harassment and spam without being admins. Patrollers will also see part of the IP even without this user right. We are also working on better tools to help.

If you have not seen it before, you can read more on Meta. If you want to make sure you don’t miss technical changes on the Wikimedia wikis, you can subscribe to the weekly technical newsletter.

We have two suggested ways this identity could work. We would appreciate your feedback on which way you think would work best for you and your wiki, now and in the future. You can let us know on the talk page. You can write in your language. The suggestions were posted in October and we will decide after 17 January.

Thank you. /Johan (WMF)

18:14, 4 January 2022 (UTC)

proofreadpagesinindex

FYI, I submitted this task T298848. Mpaa (talk) 22:54, 9 January 2022 (UTC)

@Mpaa thanks, I was literally just doing it. Patch pushed. Looks more sensible now; hopefully we can get it into this week's train. Sorry about that. Inductiveloadtalk/contribs 23:03, 9 January 2022 (UTC)

Validation needed

My list of proofread pages is growing, but not enough validation is happening:

Any help would be much appreciated. -- Valjean (talk) 15:57, 11 January 2022 (UTC)

Sorry, I'm not really good at validation, and I don't have time at present anyway. As I said on your talk page, you should not take non-immediate validation of proofread pages as an insult, and you should not expect that anyone does it as a precondition of your continued contribution. That way lies you burning out and leaving Wikisource (this happens). It just means no-one else feels like doing it at this time. Since proofreading is substantially more popular than validation on pretty much every Wikisource, this is rather the rule than the exception, and it's entirely possible that any given proofread page will go a very long time before being validated. I suggested you can nominate to the Monthly Challenge if you'd like assistance, which may include validation. It also may not (depending on how people feel; lots of works expire from the MC without validation, and some don't even get fully proofread), but it's probably more likely.
Remember that everyone is here by choice, and it's a fundamental part of the Wikisource model that it's completely up to others whether they wish to validate something that you yourself chose to proofread in the first place. There is no expectation of a quid pro quo, unless you find someone to make some kind of validation pact with. Inductiveloadtalk/contribs 16:09, 11 January 2022 (UTC)
Okay. It's good to understand how things work here. -- Valjean (talk) 16:12, 11 January 2022 (UTC)
@Valjean by any chance have you got experience of PG Distributed Proofreaders? Sometimes recent converts find the less...structured...workflow here to be disconcerting, though I personally could not imagine enjoying working in that environment and find the lack of urgency the best bit of Wikisource. Inductiveloadtalk/contribs 16:26, 11 January 2022 (UTC)
No, I don't. -- Valjean (talk) 16:29, 11 January 2022 (UTC)

API for labels?

Hi. Is there a PP API to get page labels in index? If not, would it be possible to add it to proofreadpagesinindex? Mpaa (talk) 14:17, 16 January 2022 (UTC)

@Mpaa I don't think that's part of the current state of mw:Extension:ProofreadPage/Index pagination API. There's no obvious reason it can't be added, but I no longer have time to do much work on the PHP side any more, so I won't be able to help you myself any time soon. Inductiveloadtalk/contribs 09:21, 21 January 2022 (UTC)
OK, thanks. Mpaa (talk) 20:34, 21 January 2022 (UTC)

Script error: The function "missing_params_error" does not exist

Hello. Some pages such as 2018 Hong Kong Policy Act Report started displaying a red error notice Script error: The function "missing_params_error" does not exist. Is it possible that it has been caused by your recent changes to the Module:Header? --Jan Kameníček (talk) 12:39, 21 January 2022 (UTC)

Seems to be OK now, probably after this edit. --Jan Kameníček (talk) 13:04, 21 January 2022 (UTC)
The missing error handling has not changed since September, so if it started doing it today, it's not possible, no. I can't quite follow what today's consecutive uncommented reverts were actually trying to achieve, but since the revert was reverted, I'm going to say the offending edit was probably this one, and since it was self-reverted, it's now fixed. Inductiveloadtalk/contribs 13:36, 21 January 2022 (UTC)

Image Licences

Hello,

As part of the title pages of the ISC Russia Report, there are images of a coat of arms and the like (not sure what the other image is) on the first two pages. I have an image of these from the source ready to upload to Wikimedia Commons, but my usual copyright option (pre-1926/7 publication) doesn't apply. Do you have any ideas what I should put for the licensing? Do I use "This publication is licensed under the terms of the Open Government Licence v3.0 except where otherwise stated. To view this licence, visit...", as in the copyright of said document? Or is that for the document itself and not the images on the leading two pages? Or is this a question for Commons (I hoped you might know given that you proofread some of the text)?

Thanks, unsigned comment by TeysaKarlov (talk) .

Sorry for the delay. It looks like they've now been added (though I probably would have gone for a black/white coat of arms).
Specifically, as far as I know (which is not far), the OGL does not necessarily cover images like that. For example, the OGL logo itself is not PD in the UK (it is PD in the US as a wordmark: see commons:Commons:Deletion requests/File:OpenGovernmentLicence.svg). However, the coat of arms and the portcullis are both public domain anyway due to age. Inductiveloadtalk/contribs 09:18, 21 January 2022 (UTC)
Hi,
Thanks for the response, as I presumed the OGL wouldn't cover the images, but wasn't sure if coats of arms and the like were 'special', or still counted as old. Good to know for the future though.
I also had a black and white coat of arms planned, but it looks nice enough now.
Thanks, TeysaKarlov (talk) 19:45, 24 January 2022 (UTC)

Catalog of Copyright Entries..

I appreciate this isn't in your normal format for upload requests, but would it be possible to have hi-res digital scans/DjVu of an entire specific IA collection on Commons, alongside the PDFs Fae uploaded about 2 years ago?

The collection concerned is the 674 Copyright Office record volumes: https://archive.org/details/copyrightrecords?sort=-date. There is an RSS feed of these at https://archive.org/services/collection-rss.php?collection=copyrightrecords, but this doesn't necessarily have all the files. Is there a way of grabbing an entire IA collection?

PDFs of these volumes were uploaded by Fae in response to a suggestion I made at Commons. However, in places the PDF versions are not reliably readable, especially for smaller print.

It would be nice to have these (which, being US Gov works, are clearly public domain) accessible in a high-quality version (vs the PDFs), given that they are very useful in determining which works can be included on Commons/Wikisource and other projects, and in determining the status of other material on Commons. I've used these to check renewals (for works 1927–1950) quite a lot.

Fae also did bulk uploads from other collections such as FEDLINK, again in response to a suggestion/proposal I made that Commons should attempt to mirror works from IA that are in the public domain. Generating hi-res scans of these is not part of this request, but it would be nice to have high-quality versions of as many of the PDFs Fae uploaded as possible. See Commons:Commons:IA books for details of the collections that Fae had contributed or was in the process of uploading when they left Commons.

It would of course be nice to have the copyright records on Commons before October, ahead of any potential disruption that IA may encounter due to its ongoing dispute with publishers. ShakespeareFan00 (talk) 07:35, 27 January 2022 (UTC)

@ShakespeareFan00 certainly it's possible to import from collections. The biggest impediment to me doing so is that 1) I don't have much time right now to write a new import script and 2) the IA metadata is, let us say, suboptimal, so a straight Fæ-style dump of the data leads to the uploader getting harassed at Commons with DRs and other housekeeping requirements.
I will certainly be able to handle a spreadsheet-based upload with curated metadata, since 1) that can just be a fire-and-forget process and 2) I can have more confidence that I won't then have thousands of files under my name with naff data that I might be expected to fix up.
Probably the best thing to do here is have a script to process the RSS feed or similar into a spreadsheet (e.g. CSV), which can then be manually tweaked if needed and ingested normally.
Also ideally these would get their WD items created at the same time, but that would require "someone" to know a suitable data models and implement a script for them. Inductiveloadtalk/contribs 10:20, 27 January 2022 (UTC)
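For illustration, the feed-to-spreadsheet step mentioned above might look something like this minimal Python sketch. The item layout is assumed to be standard RSS 2.0 (one title and link per item), and the sample identifier below is made up; the live feed would be fetched from the collection-rss.php URL given above and the resulting CSV hand-curated afterwards.

```python
# Sketch: flatten an IA collection RSS feed into a CSV suitable for a
# spreadsheet-based upload. Assumes standard RSS 2.0 <item> entries.
import csv
import io
import xml.etree.ElementTree as ET

def feed_to_rows(xml_text):
    """Yield (identifier, title, link) from each RSS <item>."""
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        title = item.findtext("title", default="")
        # IA item links end in the identifier: .../details/<identifier>
        yield link.rstrip("/").rsplit("/", 1)[-1], title, link

# Hypothetical sample item, just to exercise the parser:
SAMPLE = """<rss version="2.0"><channel>
<item><title>Catalog of Copyright Entries 1927</title>
<link>https://archive.org/details/catalogofcopyrig1927libr</link></item>
</channel></rss>"""

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["identifier", "title", "link"])
writer.writerows(feed_to_rows(SAMPLE))
# For real use, fetch the collection-rss.php feed with urllib and pass
# the response body to feed_to_rows() instead of SAMPLE.
```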
IA does have an API, but it wasn't clear to me, at least, how you get a list of all the items in a collection, and Fae didn't leave/post any sources for how their bulk-upload tool worked. ShakespeareFan00 (talk) 10:23, 27 January 2022 (UTC)
Hmm - https://github.com/jjjake/internetarchive/tree/master/internetarchive seems to be in python, but I don't understand any of it :( ShakespeareFan00 (talk) 10:36, 27 January 2022 (UTC)
I have never used "collections" via the IA API in any way. I do use that library for my usual IA uploads, but I don't even know if it supports iterating over collections in the first place. Inductiveloadtalk/contribs 11:02, 27 January 2022 (UTC)
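For what it's worth, the library linked above does document a search interface that can walk a collection; a hedged, untested sketch follows (the function names are mine, and the network-touching import is deferred so nothing runs until you call it):

```python
# Sketch: list an IA collection and grab the DjVu derivative per item,
# using the "internetarchive" PyPI library's search/download calls.

def collection_query(name):
    """Build the IA advanced-search query string for a collection."""
    return f"collection:{name}"

def fetch_collection_djvus(name, destdir="scans"):
    # Third-party dependency: pip install internetarchive
    from internetarchive import search_items, download
    for result in search_items(collection_query(name)):
        identifier = result["identifier"]
        # formats= filters to the DjVu derivative, skipping the jp2 zips
        download(identifier, formats=["DjVu"], destdir=destdir)

# fetch_collection_djvus("copyrightrecords")  # ~674 volumes; long-running
```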
Well the logic seems to be
  1. Obtain a list of items (as JSON?) from a search.
  2. Process that list into a list of files.
  3. For each file in the collection,
    1. determine the filenames for the metadata, DjVu and hi-res scans.
    2. Parse the metadata into a Wikidata/Commons compatible format (or indeed the format your existing upload script uses).
    3. Upload the files to Commons (ideally a DjVu generated from the hi-res scan... or the existing DjVu file from IA, or maybe both :) ).
    4. Generate an appropriate Wikidata item for the volume.
    5. Generate an appropriate Commons {{Book}} template for the volume.
    6. Check to see if the same file was uploaded as PDF, and if so cross-link the alternate format versions.
  4. Prompt the user when the entire upload completes or breaks.

Would you like me to raise a ticket on the Phabricator workboard concerning this? ShakespeareFan00 (talk) 11:23, 27 January 2022 (UTC)

To be quite honest, I do not have an intention to do this work within, say, the next year. So, if you want to do this, you can either find someone who is able to do it, or break it down into phases that can be done with existing tools or processes. For example, if you can generate a spreadsheet of all the volumes, that already covers steps 1, 2, 3 and 5, and I can press the button today.
Cross-linking the PDFs is easy to do after the fact, just look for the same ID on a file ending in ".pdf".
Generating the WD data also can be done independently. This is a wider task that nearly all scans should be involved in. However, WD data models are still deficient in many ways and there is no activity that implies it will improve "soon" (within, say, a year or two), so roadblocking on that is unwise, IMO.
So my advice would be to focus on generating the spreadsheet-style data and not to get hung up on a completely integrated customised system as, to be honest, you will likely not find anyone to do that for you (unless you are waving fistfuls of cash, obviously). Feel free to open an issue for a complete tool, but it won't be accepted onto my workboard. I may consider working on small, tightly-focused tools (e.g. a cross-linker script, assuming that it is actually useful) that could be used as part of a bigger process, but I will not be able to execute the wider process myself in any meaningful timeframe. Inductiveloadtalk/contribs 11:48, 27 January 2022 (UTC)
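The after-the-fact cross-linking check could be sketched as below. The helper names are hypothetical, and the existence test leans on the MediaWiki query API's convention of reporting page id "-1" for missing titles:

```python
# Sketch: given a DjVu filename on Commons, derive the PDF sibling's
# title (same IA identifier, ".pdf" extension) and check whether it
# exists via the standard MediaWiki action API.
import json
import urllib.parse
import urllib.request

COMMONS_API = "https://commons.wikimedia.org/w/api.php"

def pdf_candidate(djvu_filename):
    """'Foo.djvu' -> 'File:Foo.pdf'."""
    stem = djvu_filename[:-5] if djvu_filename.endswith(".djvu") else djvu_filename
    return "File:" + stem + ".pdf"

def title_exists(api_response):
    """The API uses the sentinel page id '-1' for missing titles."""
    return "-1" not in api_response["query"]["pages"]

def pdf_sibling_exists(djvu_filename):
    params = urllib.parse.urlencode(
        {"action": "query", "titles": pdf_candidate(djvu_filename),
         "format": "json"})
    with urllib.request.urlopen(COMMONS_API + "?" + params) as resp:
        return title_exists(json.load(resp))
```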

Summary for this MC

Would you like me to write the summary for this MC? Also, where would I find the stats for the month? Languageseeker (talk) 22:55, 31 January 2022 (UTC)

@Languageseeker thanks for doing it, looks good and was a nice read (maybe add a link to the current challenge to attract people?)
FYI, the stats come from https://phetools.toolforge.org/statistics.php?diff=30 with some manual inferences like subtracting the MC progress. If you're interested, getting this merged and run is the next step towards being able to generate these stats much more easily (you currently have to do some heavy lifting to dig the numbers out of the DB). Inductiveloadtalk/contribs 22:22, 1 February 2022 (UTC)
Thank you! Could you also run your script to add all the Indexes in the MC to the appropriate category? P.S. Hope that commit gets merged. Phe always reminds me how much positive impact one user can make. Languageseeker (talk) 23:27, 1 February 2022 (UTC)

That time when David wore out Goliath before the fight even started…

WS:S#Lua error on the monthly challenge page addressed with diff. But Wikisource:Community collaboration/Monthly Challenge/February 2022 (with 80 works) is currently literally impossible to edit due to hitting the 60 second timeout (as is Jan. with 70 works, but Dec. with 65 squeaks in just under the limit). There may be confounding factors (I see issues elsewhere that may indicate an underlying performance issue down in the infrastructure), but the setup in any case seems to be pushing the limits of what MW/Scribunto/the WMF stack can handle and sometimes tripping over it. Absent massive performance improvements here, I think you'll have to seriously consider bot-generating "static" versions of these pages instead. Xover (talk) 09:48, 1 February 2022 (UTC)

Urgh, one day maybe it'll finally roll over peacefully (and/or I'll change rollover to not be at midnight UTC). I did check when I got up before I got in the car, but it seemed to be not red (I guess either you fixed it or the data appeared, or both). I also need to get a "prepare for the next month" script to run. Thanks for the fix!
So the massive dependency thing is a pain, for sure. I'm waiting to see if the Comm Tech team will take on the mission of actually fixing a long-standing missing feature in core that causes this (meta:Community_Wishlist_Survey_2022/Miscellaneous#Check_if_a_page_exists_without_populating_WhatLinksHere). If they do not, I will definitely be doing something about a lazy mode for these stats. Inductiveloadtalk/contribs 20:56, 1 February 2022 (UTC)

A User Script to automatically run Transcribe Text

It seems that you’re busy IRL. Hope that everything is going well. I don’t know much about scripting, so feel free to say no to this outright. However, I was wondering if it would be possible to write a user script that would run the Transcribe Text function whenever a Page is opened that has not been saved yet. This would make it easier and faster for users to start with an updated OCR when proofreading, which should reduce proofreading time and increase accuracy. Languageseeker (talk) 21:39, 8 February 2022 (UTC)

It's a good busy, but not leaving much time for much else, at least at first. You can just click on the OCR button in the toolbar when it appears. There's no hook for it appearing, but you can listen for the DOM mutation. I wrote it before as an amusement, in fact, here. It would be better if there was some kind of formal API for it built into the Wikisource extension, of course, but wrapping that code with a check for wgAction and namespace will probably work tolerably well. Inductiveloadtalk/contribs 22:21, 8 February 2022 (UTC)
Glad it's a good busy. Take your time and enjoy the adventure. How am I not totally shocked that the powers-that-be did not bother to add a proper API for the OCR tool. Maybe, one day. Languageseeker (talk) 16:22, 9 February 2022 (UTC)
@Languageseeker actually the back-end does have a proper API which you can call quite happily from JS; what we don't have is a solid way to re-use/hijack the existing UI so you don't need to re-invent the wheel. Inductiveloadtalk/contribs 21:53, 14 February 2022 (UTC)

Ppoem

Hi,

This is simply to thank you for writing {{Ppoem}}, and to inform you that I have imported it into the French Wikisource, where several people were interested in its advantages over the standard poem tag. Seudo (talk) 09:42, 12 February 2022 (UTC)

@Seudo thank you for the note: I'm very glad it's useful! Inductiveloadtalk/contribs 21:52, 14 February 2022 (UTC)

Are you sure this was the best idea? The discussion only had two comments in support, and drastically limits the scope of content. I only noticed it myself when it was used as the justification for some proposed speedy-deletions. I certainly oppose this qualification, and I would have opposed it if I knew that the discussion was going to be closed so soon. TE(æ)A,ea. (talk) 16:09, 16 February 2022 (UTC)

@User:TE(æ)A,ea. If by "so soon", you mean "after the discussion was already archived due to lack of further comment", a month since the last comment, with zero actual objections, then yes, I think it's reasonable in terms of process. Ample time was given for comment, the topic was presented correctly, as far as I know, in the very public Scriptorium Proposals sections, as opposed to being lumped in with another thread elsewhere, and no further comments were apparently forthcoming.
Obviously, you may open a new discussion on amending the policy further or rolling back the change. If you want to claim the process is invalid and should be reverted, then also feel free, and, again, the Scriptorium would be the place for that too. Personally, I do not think it was, but that's, just, like, my opinion, man. Inductiveloadtalk/contribs 20:33, 16 February 2022 (UTC)
Also, for the avoidance of doubt, I also do not think the speedies in question would be reasonable, even under the policy as amended, for existing works (though I had missed that these were new and are covered). Inductiveloadtalk/contribs 21:02, 16 February 2022 (UTC)

Block right and right block..

{{Right block}} and {{block right}} are for the most part identical in function.

I can't update the latter to make them functionally equivalent (as it's a high-use template), which would seem sensible, with a redirect set up.

I'd also at some point rewritten {{right block}} to use {{optional style}} to reduce the amount of CSS generated. Would it be feasible for someone with appropriate access rights to update {{block right}} to make it functionally equivalent, and set up an appropriate redirect at {{right block}}? Thanks. ShakespeareFan00 (talk) 10:40, 17 February 2022 (UTC)

Adding fonts to ULS

Hello @Inductiveload: Can you guide me how and where to run compile-font-repo.php as you edited the documentation here. Thanks Zsohl (Talk) 05:00, 19 February 2022 (UTC)

The Case of Charles Dexter Ward - Lovecraft - 1971.pdf

Can you please replace the front and back cover of this file with blank placeholder pages? The cover has artwork by artist Michael Whelan, who is still alive, and this artwork will still be under copyright in the US and elsewhere. Alternatively, you can apply black to the artwork to hide it on the front and back covers, but the file will eventually be deleted from Commons for copyright violation, and we wouldn't be able to host it here either, with copyrighted content. --EncycloPetey (talk) 22:04, 20 February 2022 (UTC)

Turns out it doesn't have a correct copyright notice for a US work of that year. This was discussed at Wikisource:Copyright_discussions#The_Case_of_Charles_Dexter_Ward_-_copyright_of_1971_edition and is also clearly listed on the file info page. Inductiveloadtalk/contribs 22:08, 20 February 2022 (UTC)
OK, thanks. And I see the information appears at Commons as well. --EncycloPetey (talk) 01:23, 21 February 2022 (UTC)

That being the case, I have a paperback with this cover art, and scanned the cover as File:CDWard.jpg (local temporary upload). The lettering of the title is colored differently, but the bottom portion could be grabbed and combined into the original. My image software is crude and cannot incrementally rotate the image, or I would have done more, but someone with good software can use it to repair the image and eliminate the library sticker. --EncycloPetey (talk) 01:57, 21 February 2022 (UTC)

Subwork tag for Pagelist and automated transclusion

I was wondering if it would be possible to create a subwork tag for the pagelist to help automate transclusion of multi-work Indexes such as periodicals or encyclopedias. I'm thinking about something along the lines of

<pagelist 1="Cover" 2to4="–" 5="Title" <s> author='Paul Rosenfeld' title='Charles Martin Loeffler' pagelist=43:48 (could also be something like 44 for a single work or 41s1:48s2 for sectional transclusion) </s> />

A transclusion tool can then use the information in the <s></s> to know to transclude this to Title/Volume/s_title and fill in the header and pagelist automatically. This should save a considerable amount of time when transcluding. Languageseeker (talk) 16:12, 25 February 2022 (UTC)

It would be extremely complex and I do not think the result would be usable, not least because the information is far too complex to universally encode into such a simple tag (and also because the implementation would be very, very hard and would break a lot of things, since it's an API of sorts). Furthermore, this information being stored in the index page is a hack from the start of last decade. What we probably should do is store this stuff in Wikidata (after defining a robust schema, which the above suggestion also needs) then fill the header template from there. Inductiveloadtalk/contribs 22:24, 25 February 2022 (UTC)
Got it. Thanks for responding to all my ideas. Hope you're well. Languageseeker (talk) 02:52, 26 February 2022 (UTC)
Ideas are always good to have. However, if you're angling for something that will improve this kind of thing, the one thing that will have a seismic effect will be to complete and fully document the incomplete and inconsistent Wikidata schemas for bibliographic works. It is fairly clear that if we at WS wish to have such schemas, we will have to do it ourselves. It's a lot of work to plough through the options and figure out how to represent everything and also propose any new properties needed and shepherd them though the system. Once the schema is in place it's then just "simple" iteration to adapt templates to use that data where possible. Inductiveloadtalk/contribs 19:19, 26 February 2022 (UTC)

MC

The good news is that we didn't blow up the front page this month, but, in the yearly summary bar, there's a "Lua error in package.lua at line 80: module 'Module:Monthly Challenge daily stats/data/2022-03' not found." Languageseeker (talk) 00:33, 1 March 2022 (UTC)

@Languageseeker, @Inductiveload: I’ve created the empty Module page that was missing. It’s fixed the issue in the yearly summary bar and I hope the bot can pick up from there. Ciridae (talk) 05:35, 1 March 2022 (UTC)

RunningHeader Gadget

Hi! I’ve been using the RunningHeader Gadget and I’ve encountered an issue while proofreading this index. When I click the link to add the running header, it first adds it correctly from 2 or 4 pages behind, but then immediately overwrites the correct header with an incorrect one from 1 page behind. Anything I’m missing that’s causing this? Ciridae (talk) 09:57, 28 February 2022 (UTC)

@Ciridae probably it's just a quirk with that work, or maybe the script has changed behaviour (or more likely some dependency or hook used by the script has). I'll take a look at some point when able. Sorry for the delay in reply. Inductiveloadtalk/contribs 18:56, 23 March 2022 (UTC)

Jeannette Rankin

I found a request from you for a Wikisource speech by Jeannette Rankin prior to her "No" vote on the U.S. entering World War II. There is a very interesting summary of her political and suffragette career on pp. 36-41 of Women in Congress 1917-2006, https://www.govinfo.gov/content/pkg/GPO-CDOC-108hdoc223/pdf/GPO-CDOC-108hdoc223.pdf (Matthew A. Wasniewski, U.S. House of Representatives, 2006). That document gives the illuminating information (p. 40) that Speaker Sam Rayburn blocked Ms. Rankin from speaking on the war resolution.

She gave her first speech in Congress on 7 August 1917, asking for federal intervention against corporate copper mining interests in Montana, and spoke again in December 1917 in the debate over the war with Austria-Hungary. She was out of office for decades but was reelected in 1940 as a left-of-centre pacifist Republican, and gave speeches in May and June of 1941, which will presumably be in the Congressional Record, opposing the sending of U.S. troops to fight in Europe or Asia.

Another useful source might be this book, which is frequently cited by the author of Women in Congress 1917-2006: Hannah Josephson, Jeannette Rankin, First Lady in Congress: A Biography (Indianapolis, IN: Bobbs–Merrill, 1974). Objectivesea (talk) 22:39, 22 March 2022 (UTC)

@Objectivesea thanks for the background! Does that mean that the speech mentioned at WS:RT (sole dissenting vote, and the speech that accompanied it, against US participation in WWII) doesn't exist? Inductiveloadtalk/contribs 18:55, 23 March 2022 (UTC)
That was my understanding, @Inductiveload, from the Women in Congress history. However, I could well be wrong. I have heard, for example, that sometimes things can be added into the Congressional Record as a courtesy, even if the words were not spoken in the House. I have no idea of how long-standing that practice is. Objectivesea (talk) 20:21, 24 March 2022 (UTC)

Hi-res image mismatch

Can you make some checks on this: Index:Heartbreak House, Great Catherine, and Playlets of the War.djvu? Despite it being the IA-sourced DjVu, the add-on for getting the hi-res page images from IA seems to be very confused, as no offsetting should be required.

What's actually broken if anything? ShakespeareFan00 (talk) 18:47, 27 March 2022 (UTC)

Somewhere, an off-by-one error crept in: the first page at the IA is "n0", not "n1". I guess that's been like that for "some time"! I'm not sure that can be characterised as "very confused" (I'd reserve that for, say, returning pictures of stoats instead of scans), but it's fixed now. Thanks for the report (though it would, as always, be better if you could say what it is exactly you see not working: in this case the offset was always off by exactly 1, which made the error obvious - because you didn't say that I had to figure that out for myself). Inductiveloadtalk/contribs 20:33, 27 March 2022 (UTC)
From DjVu page → offset needed:
5 → 7
7 → 7
8 → 9
11 → 11
13 → 13
14 → 15
15 → 15
16 → 15
Suggesting something else is going on as well. It's a shame that I don't see anything like a manifest list for IA works, like other uploading institutions use.

Looking at the original scans, the page mapping to scans for this work seems to be a bit convoluted anyway.

https://ia802705.us.archive.org/view_archive.php?archive=/25/items/heartbreakhouseg00shaw/heartbreakhouseg00shaw_jp2.zip&file=heartbreakhouseg00shaw_jp2%2Fheartbreakhouseg00shaw_0000.jp2&ext=jpg appears to actually be page 84.

Also the original scans number 471, whereas the edition here only has 360 pages in the index. Are there duplicates in the originals maybe? ShakespeareFan00 (talk) 08:11, 28 March 2022 (UTC)

I also note that - Page:Hegan Rice--Mrs Wiggs of the cabbage patch.djvu/102 is now off by one, when it wasn't previously, but this may be due to a calibration page being removed in the upload process. ShakespeareFan00 (talk) 08:17, 28 March 2022 (UTC)
Turns out the IA 1-indexes some things and 0-indexes others, which is annoying, but I've made a fix, so I think it's working now. For Heartbreak House, you may just have to resign yourself to the offsets being needed, since there are lots of junky images in the normal IA stream. Maybe the addToAccessFormats in the scandata.xml will allow us to deal with it (I had thought it was already handled by the IA, but apparently not). Inductiveloadtalk/contribs 21:05, 2 April 2022 (UTC)

Working fragments of fragmentary work

Any chance you recall what this was about? What was it that didn't work unless the inner pagenumber span was emptied? Xover (talk) 07:01, 31 March 2022 (UTC)

@Xover: IIRC, this was because the "new" spans in #ct-pagenumbers want to use the fragments as IDs and there could be only one. But, don't assume I was having any kind of divine inspiration! Inductiveloadtalk/contribs 19:44, 31 March 2022 (UTC)
Hmm. But .pagenum-inner is the innermost of the spans, so .innerHTML() on it just nukes the text content (the zero-width space). Where do fragment identifiers come into play? (I'm looking for an otherwise unrelated issue where I need to preserve that zero-width space, and want to make sure I don't break anything else in the process.) Xover (talk) 06:44, 1 April 2022 (UTC)
Then I'm afraid I don't really recall. It presumably made sense at the time, and I do vaguely recall that there was an issue where the page fragment links were not working, but I don't remember exactly how it went down. Re the ZWSP, IIRC Chrome and Firefox have different opinions about that somehow. Inductiveloadtalk/contribs 13:21, 2 April 2022 (UTC)
When presented with a regular space and a zero-width space adjacent to each other (modulo intervening markup, like spans), Chrome (but no other browser engine, including Safari's) seems to sometimes collapse both spaces into a black hole, leading to unspacedwordslikethis. Since pagenumbers.js removes the ZWSP on desktop, it only shows up on mobile. Hence why I landed at that diff. Oh, well. Thanks, in any case. Xover (talk) 13:53, 2 April 2022 (UTC)

Poem inside reference inside Poem

Example from: Page:Felicia Hemans in Forget Me Not 1828.pdf/7

There shall be no more snow,2[example 1]



  1. 2 "Wohl ihm, er ist hingegangen

    Wo kein schnee mehr ist"

    Schiller's Nadowessiche Todtenklage

This fails to format properly, in that the smaller formatting should be applied to the whole stanza, and not just a single line. How can it be repaired? ShakespeareFan00 (talk) 12:40, 7 April 2022 (UTC)

@ShakespeareFan00: Stop trying to make point-fixes to whatever lint error or similar you're looking at. Just throwing a "ppoem" or "|1=" in random places is unlikely to achieve any actual net improvement. If you want to fix these pages you need to look at the totality. In particular, if you want to use ppoem for a work you need to change the whole work, or at the very least all of the sequence of the work you're touching. Especially since the reason ppoem exists in the first place is the fundamental problems with the "poem" extension tag. If you drop the magical "|1=" and use ppoem properly for both the outer and the nested poem it works just fine. Xover (talk) 14:13, 7 April 2022 (UTC)
@ShakespeareFan00: Oh, and expending all this effort on reproducing nonsense like a manual <sup>2</sup> just before the automatic reference marker, and {{smaller}} surrounding ref text that is going to be applied automatically by {{smallrefs}} anyways, is not a good use of resources (i.e. your time). If you're going to expend effort on fixing stuff like this then fix it properly instead. The original contributor quite clearly did it this way out of inexperience and unfamiliarity with enWS practices, so bending over backwards to preserve it is not a good use of anyone's time. Xover (talk) 14:28, 7 April 2022 (UTC)
@Xover: Thanks for the vote of confidence. I was trying to make minimal changes in repairs, however as you say fixing all the concerns at once is the best approach. ShakespeareFan00 (talk) 14:32, 7 April 2022 (UTC)

Let them scroll - the missing 11th commandment

After studying and hazily understanding the complexity of your work, I am only asking about the Proofread module's image scrolling feature in page edit mode. Is it still under consideration?

I ask because my situation is akin to that of the ancient Hebrews wandering in the Sinai desert. They had the scroll but couldn't scroll it. In fact, they will be remembering the event this weekend.

I have not seen any movement on the issue recently. As I said before, I do not currently have interest in working on improving the OpenSeaDragon viewer from its fairly raw initial state. It's a shame, but I realised that trying to contribute to Wikimedia software in general was making me unwell, and life is too short to spend it rebasing patches to be ignored (also I don't have time anymore anyway, but that came later). I hope you can find someone to push it through. But it's not going to be me for the foreseeable future, I'm afraid. As always, I claim no ownership of the patches; I will not mind if someone wants to use them as a basis for further work in any way. Inductiveloadtalk/contribs 22:18, 15 April 2022 (UTC)
You've done the smartest thing possible. I understand and support you completely. There will be no more such questions from me. And, enjoy life and proofreading.— Ineuw (talk) 22:23, 15 April 2022 (UTC)
Thanks for understanding. I genuinely did think I could help there, but I just wasn't up to it in the end. Maybe one day I can collect myself to try again, but in the meantime, I much prefer sticking to things locally that don't need to beg to get them through Gerrit. Happy Easter! Inductiveloadtalk/contribs 22:39, 15 April 2022 (UTC)

Raw ocr using userscripts

Hello, I was looking for a way to perform OCR using automated user scripts, a bot, or any other way. I saw that you did the same using a bot once upon a time. Do we have something similar now? Or can you point me to where and how I can access such a tool? On mrwikisource we use the ocr4wikisource tool built by an Indian Wikimedian, but that needs knowledge of Ubuntu, commands and much more. Can't we have a simpler version of it? QueerEcofeminist (talk) 14:22, 14 April 2022 (UTC)

Probably the best way to do OCR from a JS context is to use the Wikimedia OCR tool: https://ocr.wmcloud.org/ There is an API you can use to OCR images hosted at Wikimedia projects. This is what the built-in OCR tool uses, but any other JS can use it too. Inductiveloadtalk/contribs 22:23, 15 April 2022 (UTC)
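As an illustration, a Python caller for that tool might look like the sketch below. The endpoint path and parameter names (engine, langs[], image) and the JSON response shape are my recollection of the tool's interface rather than verified fact; check the tool's own API documentation before relying on them.

```python
# Sketch: call the Wikimedia OCR tool's HTTP API for one image URL.
# Endpoint and parameters are assumptions; see https://ocr.wmcloud.org/
import json
import urllib.parse
import urllib.request

def ocr_request_url(image_url, engine="tesseract", lang="en"):
    """Build the GET URL for a single OCR request."""
    params = urllib.parse.urlencode(
        {"engine": engine, "langs[]": lang, "image": image_url})
    return "https://ocr.wmcloud.org/api.php?" + params

def ocr_text(image_url):
    with urllib.request.urlopen(ocr_request_url(image_url)) as resp:
        return json.load(resp)["text"]  # assumed response shape
```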
Thanks, though that's the problem: I don't know how to make use of an API or JS scripts, as I am not code-literate. QueerEcofeminist (talk) 06:00, 5 May 2022 (UTC)
@QueerEcofeminist if you are not able to use JS or the API, you can use the "Transcribe Text" button (or whatever it's called at mrwikisource). It's built into the ProofreadPage extension and appears in the top right of the page editor: you can see an example at mw:Extension:Proofread_Page/Page_viewer. Inductiveloadtalk/contribs 06:27, 5 May 2022 (UTC)
We have it already, and I have done hundreds of pages like this. I was asking for a tool which will perform a batch: one book at a time, or a set of pages from a book. Do we have something like you did earlier using a bot? Sorry for not being clear about what exactly I was looking for. QueerEcofeminist (talk) 06:33, 5 May 2022 (UTC)
There is no pre-made tool for this, as far as I know, sorry. Inductiveloadtalk/contribs 06:35, 5 May 2022 (UTC)
No issues. Thanks for your time QueerEcofeminist (talk) 14:12, 8 May 2022 (UTC)

Updating Module:ISO 639 and {{ISO 639 name}}

The Wikipedia versions of this template and module are significantly more complete. Could you import w:Template:ISO 639 name and w:Module:ISO 639 name to Wikisource? —CalendulaAsteraceae (talkcontribs) 01:35, 21 April 2022 (UTC)

@CalendulaAsteraceae it can be, but not trivially, because there's actually quite a large number of dependencies like w:Module:Language/data/ISO 639 override, so it's not necessarily just a single copy-paste. It would be nice if there could be a central utility module system, but I haven't heard of such a thing. Inductiveloadtalk/contribs 12:29, 23 April 2022 (UTC)
Makes sense! If you're not up for dealing with all that, I would also appreciate you adding all the languages in Template:ISO 639 name (and I could do the format conversion if needed). —CalendulaAsteraceae (talkcontribs) 22:27, 23 April 2022 (UTC)

How to get consistently good OCR from PDFs converted from TIF images?

I have a number of page scans from a reel of microfilm, which are saved as fairly high-quality TIF images. (I plan to make some more of these in the future, as well.) Right now, I am working with primarily single-page documents, so the conversions are not particularly demanding. However, there are a greater number of multi-page documents for which a more efficient conversion would be useful. One of the most pressing concerns I have is that the TIF-to-PDF converters significantly reduce the quality of the image—so much so as to make the image unsuitable for OCR, thus requiring, in some cases, manual transcriptions. Do you know of a way which would be more useful for these conversions? I would also need some way to combine the individual page images into a full PDF document. TE(æ)A,ea. (talk) 21:15, 16 May 2022 (UTC)

@TE(æ)A,ea. as usual when processing any data (image, video, audio, digital, whatever), always try to feed the highest quality data into each stage as best you can. In this case, you should probably:
  • OCR the TIFFs first and then convert them to PDFs (so you're not adding needless noise to the input of the OCR), along with the text layer from the OCR. It should not matter how, or how much, you compress the image: the OCR ingests the pre-compression image. If you are ever OCRing compressed images when you have uncompressed (or less compressed) images in your pipeline, you are doing it wrong (unless you have carefully concluded you have literally no other choice, or you have checked that the compression is not going to substantially affect the OCR). That said,
  • If you're starting with bitonal TIFF (i.e. each pixel is black or white, nothing in between, which is usual for third-party microfilm scans), you should not use unsuitable compression. In particular, any compression based on image frequency data (e.g. JPG, JP2) is likely to be atrocious in terms of image quality for a given size. Those algorithms are designed for photos, not text, and certainly not bitonal text. There are compression codecs like CCITT which are designed for the purpose. They might not crunch the size as hard as a generic JPX codec cranked up to the highest setting, but they will produce a far better-looking result. CCITT is actually completely lossless.
I don't know much about PDF specifically, and I really don't know about PDF authoring tools (I always use DjVu since the tools are better for my purposes, specifically the OCR text layer generation). Like DJVU (and TIFF actually), PDF is not a compression format, it's a container format, so you can embed pretty much any image data in there with any compression codec. I am 100% sure there is a way to directly embed a TIFF (CCITT-compressed or otherwise) into a PDF, but I do not specifically know how to do it.
The tool to convert a bitonal image to a DjVu is cjb2, which is also a dedicated bitonal codec.
As for combining PDFs, when I do do that, I use stapler, though I'm sure there are lots of similar tools on your platform of choice. When combining DjVus, the tool is djvm -i.
My script for doing this is https://github.com/inductiveload/wstools/blob/master/wstools/make_document.py. This script contains the logic for converting images to DjVus, OCRing the images and combining the OCR text layer and the DjVu. The script and the whole repo are a mess; it's essentially a scratchpad for a larger project that didn't happen (yet? hard to say). It is not designed for public use (though I hope it's useful), and I might not be able to provide any guidance on it at all. I can try, but I'm only around sporadically and with very limited time.
That said, hopefully all you will need to do is ./make_document.py -i /some/directory/of/images and it could literally be that simple. Note that you will need the entire repo, not just that script (it imports other tools). Inductiveloadtalk/contribs 21:43, 16 May 2022 (UTC)
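For the command-line-shy, the core of that pipeline is just a few tool invocations per the discussion above. This sketch only builds the argument lists (each one would be run with subprocess.run); the flags are illustrative, not a tested recipe, so check the DjVuLibre and Tesseract man pages:

```python
from pathlib import Path

def pipeline_commands(tiffs):
    """Build the commands for: OCR each original TIFF first,
    then encode it to bitonal DjVu, then bundle the pages."""
    cmds = []
    for tiff in map(Path, tiffs):
        base = tiff.with_suffix("")
        # Tesseract ingests TIFF natively; "hocr" emits an hOCR text layer
        cmds.append(["tesseract", str(tiff), str(base), "hocr"])
        # cjb2 is DjVuLibre's dedicated bitonal encoder
        cmds.append(["cjb2", "-clean", str(tiff), f"{base}.djvu"])
    # djvm -c bundles single-page DjVus into one document
    cmds.append(["djvm", "-c", "book.djvu"] +
                [str(Path(t).with_suffix(".djvu")) for t in tiffs])
    return cmds

cmds = pipeline_commands(["p001.tif", "p002.tif"])
```

The key property is the ordering: the OCR step always sees the original TIFF, never the lossily re-encoded DjVu.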
The other, time honoured Wikisource way is to prepare the works as zips of images and upload them to the Internet Archive. The IA will OCR them and make PDFs. Then, import them either directly as PDF or as a DjVu (which will re-convert them) with IA Upload. This is slow and a bit frustrating but is conceptually simpler.
If you arrange for this format I might be able to deal with it over the next few weeks/months, but I cannot promise any specific results. The Scan Lab in general might be able to help too if you provide the raw images. Inductiveloadtalk/contribs 22:00, 16 May 2022 (UTC)
  • Inductiveload: As an example, see this index. The images are not bitonal: I am scanning them directly from the microfilm reel in “color” (which, in this case, is just full grayscale). Do you know how to convert color/grayscale TIFFs to DJVU (with DjVuLibre)? Doing that would probably be my best bet, as automation isn’t quite suitable because of all of the clipping I need to do to form the two-page image scans into properly clipped, single-page images. I don’t care about PDF versus DJVU, so long as the result is functional. I may use the Internet Archive method for works when I’m not too interested in proofreading any specific work, but otherwise, for multi-page works, I’d like to do something myself. Similarly, I don’t want to rely on you guys on Wikisource, when I’m essentially dumping thousands of pages of TIFF images and saying, “Thanks for your assistance!” I don’t know an efficient way of generating OCR from TIFFs, preserving the OCR layer, and converting that to a PDF (using on-line tools, when I can’t download decent tools). Because of that, I converted the TIFFs to PDFs, uploaded the PDFs, and generated the OCR therefrom. This, however, generated weak OCR that was in a number of cases not usable. Thanks for your detailed response! TE(æ)A,ea. (talk) 23:55, 16 May 2022 (UTC)
    @TE(æ)A,ea. OK, then you have the following choices:
    • For a non-bitonal DjVu result, the tool is called c44 (also part of DjVuLibre), which is a wavelet compressor (a bit like JPEG2000). The script above this uses it for any colour image (that it wants to stay in colour). You will need to first convert to something c44 can ingest:
      • JPG (a lossy conversion)
      • PNM (which is lossless), but the files are huge and considering that you're about to compress lossily into DjVu, I haven't seen important differences in final image quality using PNM as an intermediate over a fairly-good JPG (say, Q ~= 85+).
      • Note: I rarely convert greyscale TIFFs, normally if I'm feeding c44, it's with Internet Archive scans that are already heavily compressed. You will need to see how your results go.
    • If you want to target a bitonal DJVU, you can convert the TIFF to a bitonal TIFF (at this point you may wish to adjust the threshold as for OCR, see below) and then use cjb2. This gives pretty amazingly small files, but can be quite a harsh thing to do to images. However, because you will be extracting things like images from the TIFF originals, and never from a DjVu, see H:EXTRACT, it's a trade off that might be acceptable to you if you end up with long files that start getting into the hundreds of megabytes. Again, I rarely do greyscale TIFFs, but I do use bitonal mode for Hathi scans, which are already bitonal.
    As for the OCR step, Tesseract natively ingests TIFFs, so you can just run it on the TIFFs. Tesseract internally, AFAIK, binarises the files. You may find you get substantially better results if you binarise the files yourself and choose the threshold carefully (this can be most important if the text is very light or the background very dark). Noise removal can also give a huge step up in OCR quality (dust and streaks can really freak Tesseract out sometimes). Also avoid including borders in the OCR input images. I'm not very good at this step, and I rarely do it other than sometimes adjusting thresholds and cropping margins.
    If you need to process all your files before OCR or converting to PDF/DjVu (e.g. rotating, splitting, trimming margins, dewarping, despeckling, binarisation thresholds and methods, etc.), an excellent tool is Scan Tailor Advanced. If you are doing that step page-by-page in a photo editor, you will find it extremely helpful. Inductiveloadtalk/contribs 06:47, 17 May 2022 (UTC)
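Purely to illustrate the thresholding idea (in practice you would use Scan Tailor or an image library rather than raw pixel loops): binarisation is a per-pixel comparison, and choosing the threshold is what decides whether light text survives or a dark background gets crushed:

```python
def binarize(pixels, threshold=128):
    """Map 8-bit grayscale values to pure black (0) or white (255)."""
    return [[0 if p < threshold else 255 for p in row] for p_row in [0] for row in pixels] if False else \
           [[0 if p < threshold else 255 for p in row] for row in pixels]

# A toy 2x3 grayscale "scan": the faint mark (120) is kept as ink at
# threshold 128 but dropped to background at threshold 100.
page = [[30, 200, 120],
        [250, 90, 180]]
print(binarize(page, threshold=128))  # → [[0, 255, 0], [255, 0, 255]]
print(binarize(page, threshold=100))  # → [[0, 255, 255], [255, 0, 255]]
```

Tesseract does something like this internally with a fixed strategy; doing it yourself lets you tune the threshold per work, which is where the OCR quality gains come from.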

A Bunch of Questions From Discord

These are questions from Discord that I have asked but have not yet received answers to.

  1. Is mul:Wikisource:Shared_Scripts still in use in any way? Are scripts on that page outdated?
  2. Are MediaWiki:Gadget-ocr.js and MediaWiki:OCR.js the same thing?
  3. Template:Option 100% failure rate?
  4. Is MediaWiki:Gadget-WSexport.js still required? (I think Wikisource extension made this obsolete)
  5. Are MediaWiki:OCR.js, MediaWiki:Corrections.js (th:MediaWiki:Corrections.js), MediaWiki:Dictionary.js, and MediaWiki:IndexForm.js still in use in English Wikisource? (this is more like a sanity check for me and Thai Wikisource)

Bebiezaza (talk) 18:19, 28 May 2022 (UTC)

@Bebiezaza I have been away this week.
  • mul shared scripts: I have no idea, I've never seen them before, so at least, probably not much at enWS. enWS has often developed scripts locally rather than deferring to loading scripts at mulWS. It would be nice if working scripts were bubbled up for sharing, but it's a substantial maintenance headache to make changes while making sure scripts don't break on other Wikisources, so I can see why they don't always do so.
  • MediaWiki:OCR.js looks like MediaWiki:Gadget-ocr.js from when gadgets would load from any page, not just ones starting "MediaWiki:Gadget-". This is the "old" OCR script. It's still working, but it should not be necessary any more since the "real" OCR backend exists now. It uses a completely separate OCR backend, though it's still Tesseract.
  • {{option}}: no idea what that is; it doesn't look to be used or useful?
  • Yes, MediaWiki:Gadget-WSexport.js is obsolete, and was removed from the gadget definition list here.
  • OCR.js is still in use (though not the "primary" OCR script any more), the others I guess are not, at least at enWS, since they can't be "gadgetified" if the pages don't start with "MediaWiki:Gadget-" (gadgets can load script from other pages and even other Wikis, but the primary script has to have that name prefix). Not to say they couldn't be repaired and used with enough time, effort and willingness to find a gadget-writing workflow that is tractable.
TL;DR there's lots of junk about. Inductiveloadtalk/contribs 11:22, 29 May 2022 (UTC)
  • (2./5.) We have had MediaWiki:OCR.js in Thai Wikisource for a while now, but no one ever used it until OCR got implemented in the ProofreadPage extension.
  • (3.) I found {{option}} in mul:Wikisource:Shared_Scripts (Template in Base.js), so ... yeah.
  • (5.) To me, it seems like th:MediaWiki:IndexForm.js is not in use anymore, but yes, sanity check :)
Anyways, I am currently going through pages in the MediaWiki namespace in Thai Wikisource because I feel like a bunch of them were botched in place. A bunch of them have roots in English Wikisource, and maybe some more in Multilingual; there are definitely going to be more questions and sanity checks in the future, and I think I'm going to post them here instead of Discord, because they always fall off the table before you come around and see the chat. --Bebiezaza (talk) 12:56, 29 May 2022 (UTC)
@Bebiezaza feel free, but do note that I'm busy these days with the pernicious demands of physical reality. So maybe you could ask this kind of thing at the Scriptorium (you can still @ me) and others might be able to help if I can't get to it quickly. Inductiveloadtalk/contribs 13:43, 29 May 2022 (UTC)

Styles in MediaWiki:Vector.css

Some more code questions for vector.css in Thai Wikisource, which seems to be copied from here.

  1. Do we need MediaWiki:Vector.css#L-14 (th:MediaWiki:Vector.css#L-37)?
  2. What was MediaWiki:Vector.css#L-20 (th:MediaWiki:Vector.css#L-43)? It seems that it is unused now, because that class is now "vector-menu".
  3. I already tried checking MediaWiki:Vector.css#L-8 (th:MediaWiki:Vector.css#L-8) with the up-to-date class name "vector-menu-tabs", but can't seem to find out whether it's working or not.

Bebiezaza (talk) 13:04, 29 May 2022 (UTC)

@Bebiezaza I honestly do not know. Most of that is very old hacks around defects in Vector. Vector has moved on since then (and there's now Vector 2022, which is completely different and I have no idea why anyone thought calling it Vector would be a remotely good idea), as has the primary author of most of that CSS. The skin maintainers are generally quite happy to change things and break stuff like this, so it could well have just stopped working and no-one even noticed. Which leads me to: I would imagine you do not need any of it at all, since Vector is supposed to work even without local CSS hacks. Inductiveloadtalk/contribs 13:49, 29 May 2022 (UTC)
Thanks! I've decided to remove everything in vector.css in Thai Wikisource in the future (my plan is to do it all at the same time), except the "Styling for updated markers on watchlist, history and recent/related changes", though I'll move that to common.css --Bebiezaza (talk) 15:00, 29 May 2022 (UTC)

Where to post a feature request?

Hi,

Perhaps you can tell me who handles feature requests for the edit menu's OCR script? Would it be Phabricator? I am confused who handles what and where. Much appreciated. — ineuw (talk) 04:02, 7 June 2022 (UTC)

It seems this index escaped the batch upload; can you add it? Languageseeker (talk) 14:41, 9 June 2022 (UTC)

North American Review

Not sure what happened here, but…

@Xover just general flaky uploads as per usual. Inductiveloadtalk/contribs 21:47, 18 June 2022 (UTC)

This cannot be used in sidenotes in its current form.

Would it be possible to have an 'inline' version of this that I can use instead of {{sn-year}}? Thanks. ShakespeareFan00 (talk) 07:30, 10 July 2022 (UTC)

Export Function in Portal NS

It seems the export function is disabled in the Portal NS. Is there any way to enable it, so that compilations such as Portal:Sherlock Holmes (UK Strand), or all the volumes of a series, can be exported? Languageseeker (talk) 14:48, 13 August 2022 (UTC)

(this is kind of why maybe these shouldn't be in the portal namespace in the first place...) PseudoSkull (talk) 15:25, 13 August 2022 (UTC)
This is simply a UI thing - the links aren't shown. The export tool will very happily export that (though it's very big, 185MB, due to the images): https://ws-export.wmcloud.org/?fonts=&format=epub-3&lang=en&page=Portal%3ASherlock_Holmes_%28UK_Strand%29.
We should be able to fix the JS to show the links. Certainly mis-placing works into the wrong namespaces because of this would be putting the cart before the horse. WS Export serves WS, not the other way around. Inductiveloadtalk/contribs 20:12, 13 August 2022 (UTC)
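For anyone scripting exports, the URL above can be reproduced with nothing more than percent-encoding. This sketch assumes only the parameters visible in that link (with the usual wiki convention of underscores for spaces in the page title):

```python
from urllib.parse import urlencode

def export_url(page, fmt="epub-3", lang="en"):
    # Parameters as seen in the ws-export.wmcloud.org link above.
    params = {"fonts": "", "format": fmt, "lang": lang,
              "page": page.replace(" ", "_")}
    return "https://ws-export.wmcloud.org/?" + urlencode(params)

url = export_url("Portal:Sherlock Holmes (UK Strand)")
print(url)
```

This reproduces the link quoted above, which is why the missing UI buttons are purely cosmetic: the backend exports Portal pages just fine.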
Well, Xover suggested that I move this into the Portal NS because it is not an actual published work, but a WS-created compilation. If it should be in main, then the redirect can be deleted and it can be moved back. Thoughts? Languageseeker (talk) 20:34, 13 August 2022 (UTC)

No, it should be in Portal, I think. We should just fix the download links. Inductiveloadtalk/contribs 21:18, 13 August 2022 (UTC)

Overview for Indexes in MC

Is there any way to see the per-index change for titles that are in the MC, e.g. Moby Dick Proofread +100, Validated + 24, Problematic + 25? Thanks Languageseeker (talk) 15:15, 28 July 2022 (UTC)

@Languageseeker: I don't see that data anywhere on-wiki, so my guess is IL would have to modify the bot to calculate and update that separately. I imagine that'd raise all sorts of problems related to adding and removing works from the challenge that would make it tricky to get right. Is it "nice to have" or an actual use case? Xover (talk) 20:08, 28 July 2022 (UTC)
@Xover Ah, I see. I was hoping to use this to see what indexes are being actively worked on so that I can give them extensions after 3 months. Otherwise, it's really hard to keep track of 70+ indexes. Languageseeker (talk) 22:30, 28 July 2022 (UTC)
@Languageseeker, broadly, @Xover is right. To get this easily importable for use in a template or Lua module would need bot changes. This is hard enough that I don't really want to do it unless there's a strong need for it. The whole stats bot is a disgusting and fragile hack that I strongly want to migrate to core Wikisource server functionality eventually.
However, the ProofreadPage change tagging system is now active, which will make it much easier for anyone to pull this kind of thing out of the database. The update script to backfill the changes has yet to be run or fully reviewed. It just turned one year old, so maybe it's gone stale. This means change tags cannot be used for any stats gathering that goes back too far (before about the start of this year), but they'll work for recent MCs.
As for what you can do right now, you should be able to pull this data out without too much drama using Pywikibot and maybe some database queries. Basically, you need to gather up all the pages in the index (there is an API call to get this data, and Pywikibot can do this too). Then you get all the revisions for these pages within the last month (or whatever time horizon) and see how many of them have the relevant page-quality status-change change tags. Sorry to say, I am not able to do this for you right now, but it's definitely doable. Inductiveloadtalk/contribs 21:24, 31 July 2022 (UTC)
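A rough sketch of the API half of that approach, assuming the standard MediaWiki action API. The tag name used below is a placeholder, not the real ProofreadPage tag: check Special:Tags for the exact page-quality change-tag names before counting anything:

```python
def status_change_params(page_title, since_iso):
    """Parameters for a MediaWiki action=query call that returns each
    revision of a Page: page along with its change tags."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": page_title,
        "rvprop": "tags|timestamp|user",
        "rvend": since_iso,   # the revisions API walks backwards in time
        "rvlimit": "max",
        "format": "json",
    }

def count_status_changes(revisions, tag):
    """Count revisions carrying a given page-quality change tag."""
    return sum(1 for rev in revisions if tag in rev.get("tags", []))

params = status_change_params("Page:Example.djvu/1", "2022-07-01T00:00:00Z")

# Toy response fragment (shape per the API; the tag name is hypothetical):
revs = [{"tags": ["proofreadpage-quality-change"]}, {"tags": []}]
print(count_status_changes(revs, "proofreadpage-quality-change"))  # → 1
```

Summing that count over every page in an index gives the per-index "Proofread +100"-style numbers, for the period where the change tags exist.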
No worries at all. You're right, it's probably better to implement in PP. Languageseeker (talk) 21:52, 14 August 2022 (UTC)

Too soon?

this. Given there are only one of most of these (the targets selected) present on a page, and the number of DOM nodes traversed is effectively identical (#mw-content-text is, what, three levels down from the root?)… Does this actually have a measurable effect? Commensurate with the added complexity? So far as I know, native querySelectorAll is both wicked fast and efficient, so I'd be surprised if we managed to create any actual examples (pathologically large or complex pages) where the difference was measurable without running into post-include expand size limits and similar. But maybe I'm missing something? Xover (talk) 18:58, 14 August 2022 (UTC)

That was nearly a year ago to the day. I know for sure I didn't actually benchmark that, but probably it was some flavour of "at least pretend to do the right thing". I think I may have had grander designs of modularising things a bit more and abstracting away from the DOM in general, but honestly I can't remember right now. Since I'm not going to be released from the jaws of life any week soon: just do what you think makes most sense! You know me, I will be far more offended by people pussy-footing about "my" stuff and not improving the place than I will be about people getting things done. Have at it! Inductiveloadtalk/contribs 18:58, 15 August 2022 (UTC)
("too soon" as in "premature optimization", btw) Yeah, I generally feel the same: for all this little on-wiki stuff, assume effectively CC0 and do whatever you want. But I'm not yet so monomaniacally arrogant that I assume I know better on stuff like this, so I wanted to check whether this was a moment's obsession or similar, or something triggered by concrete knowledge or to fix an observed issue. The convoluted / non-idiomatic approach took me a bit longer to parse, so I'm inclined to walk it back if I at some point need to do more than tweak a line or two there. You've beaten it into sufficient shape that I no longer itch to tear it apart every time I stumble across it, so that probably won't be until I finally come up with the magic solution that makes dynamic layouts no longer be an ugly hack (real soon now). Xover (talk) 19:21, 15 August 2022 (UTC)

Zarro Boogz!

… or, erm, I mean "LoC". Which at least reduces the probability of boogz, boogers, and other bits and bobs we don't want to a reasonable level.

Anyways: Yay!

We still load way too much junk by default, but this is definitely a milestone, even if the last few bits removed were mostly symbolic. In any case, thank you for all the work you put into cleaning this up! Very very much appreciated. Xover (talk) 06:32, 15 August 2022 (UTC)

Hooray! Good work on finally putting a nail in that coffin! Now if only gadgets could be developed in some sane way, rather than by copy-pasting from an editor into the wiki or with a hacked-up web server, we'd be cooking with (very expensive these days) gas! Inductiveloadtalk/contribs 18:49, 15 August 2022 (UTC)
I'd settle for permitting ES6+ and any half-way sane UI library. Oh well. At least "useful for Gadgets" is now actually listed as a goal for Codex (not without prodding), and there's a dedicated Phab task for it and everything. Whether "half-way sane" will apply remains to be seen, of course. Xover (talk) 19:06, 15 August 2022 (UTC)

Jump to file script no longer working..

https://en.wikisource.org/wiki/Page:Oriental_Scenery_%E2%80%94_One_Hundred_and_Fifty_Views_of_the_Architecture,_Antiquities,_and_Landscape_Scenery_of_Hindoostan.djvu/238

No drop down menu under links tab. ShakespeareFan00 (talk) 07:54, 18 August 2022 (UTC)

Following some diagnosis on the Wikimedia Discord #technical channel:
<quote>
4:27 PM] TheDJ:
ok. 2 issues: 1. the script ILUI.addVectorMenu() checks what skin you have set in your preferences, instead of which skin is actually in USE. mw.user.options.get( 'skin' ) should be mw.config.get('skin') 2. The menu is broken in legacy vector right now, but if you use the vector-2022 variant of it on vector legacy, then it works.
[4:28 PM] TheDJ:
and because my default was vector-2022, but I was using a link with useskin=vector, I accidentally got the 'working' code.
[4:30 PM] ShakespeareFan00:
So what needs updating?
[4:30 PM] TheDJ:
https://en.wikisource.org/wiki/User:Inductiveload/LibParamUi.js#L-51
LibParamUi.js
[4:32 PM] TheDJ:
1. That check is the wrong skin check 2. The checkbox is now needed for both vector and vector-2022 skins 3. The menu structure has been updated a lil bit (for accessibility reasons from the looks of it), so they'll probably want to get the implementation of that menu to be in sync with the skin again.
</quote>
ShakespeareFan00 (talk) 15:38, 18 August 2022 (UTC)
I am also hearing that some "Desktop improvements" arrived this week .. Hmm ShakespeareFan00 (talk) 15:38, 18 August 2022 (UTC)
@ShakespeareFan00: Basic fixes from the above have been applied (thanks TheDJ!). I don't normally use this script, so I'm not sure how it used to behave, but from a superficial look it appeared to be working now for me. @Inductiveload: I just did it the dumb way so you'll probably want to tweak it. Xover (talk) 15:51, 18 August 2022 (UTC)
Thanks, works for me now :) ShakespeareFan00 (talk) 16:10, 18 August 2022 (UTC)
Thanks for the fix. I cannot review that now, but if it works, it works. This is exactly what I predicted would happen, due to there being no skin-independent way to add a menu tab: that basic function is pushed onto all the individual gadget developers to each figure out and fix whenever a skin breaks it. It's not the first time this has happened, and sadly won't be the last. Inductiveloadtalk/contribs 21:56, 18 August 2022 (UTC)

Sloth, obesity, {{ts}}, and other deadly sins…

I have a bunch of pages sitting in Category:Pages with script errors due to exceeding the Lua time limit, with the culprit being a couple of thousand invocations of {{ts}} (lots of borders and per-column formatting leading to essentially every cell having a {{ts}}). And since I haven't found any way to further optimize {{ts}} enough to matter, I am ~planning (for values of "planning" somewhere in the range "vaguely pondering" to "well, that's one way I could do it") to "hard-code" it with CSS classes on each cell and the dumbest possible per-work style sheet (i.e. I will not fix any semantics or other architectural issues; just dumbly replace a {{ts}} short code and its templatestyles stylesheet with a CSS class shortcode and per-work stylesheet). It'll affect somewhere south of fifty pages so it's nothing that can't be handled if we establish a real solution later on. But if you have any thoughts about this (since I know you've given some thought to the general problem), now's the time…

PS. I may give optimizing {{ts}} another try. It looks like it's blowing its wallclock budget on .loadData()ing its /data, but since that should be cached its per-invocation cost should be negligible and yet we're clearly dependent on the number of invocations. Xover (talk) 14:53, 25 August 2022 (UTC)

{{ts}} never ceases to remind me that it was probably a mistake! I don't have any strong feelings about it other than what you already know very well: if you use it for every cell, you are likely in need of "real" CSS applied to semantically meaningful content. Which I know is about as helpful to say as when my manager says "but can't you just make it work better?". It's also easier said than done when Mediawiki doesn't support the COL element. But other than that, I have complete faith that you can't make it any worse! Good luck! Inductiveloadtalk/contribs 11:20, 26 August 2022 (UTC)
Never underestimate the ability of idiots to surprise you. :)
BTW… you do realise col/colgroup are completely useless, right? They support no attributes (except span), and nothing is inherited from them, so no matter how many <col> you add you still need to faff about with un-generalizeable and fragile td:nth-child() (at which point {{ts}} no longer seems so disgusting by comparison). Xover (talk) 13:50, 26 August 2022 (UTC)

MC Bot down?

It seems that the bot doing the tallies for the MC has gone down. Could you take a look when you have a spare moment? Languageseeker (talk) 22:31, 16 September 2022 (UTC)

@Languageseeker looks like a DB schema change or something: Unknown column tl.tl_title in on clause. Will check it out. Good for this to happen on a Friday! Inductiveloadtalk/contribs 22:44, 16 September 2022 (UTC)
Yep: https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/message/U2U6TXIBABU3KDCVUOITIGI5OJ4COBSW/. I actually vaguely remember reading this, but the implication for the bot clearly passed me by at the time. The script is updated, and should be running again. Thanks for the report! Still hoping one day not to need such a DB query at all, although the change-tag-based page-status work is still moribund. Or maybe the schema will not get changed again for a while and it'll keep ticking anyway. Inductiveloadtalk/contribs 00:50, 17 September 2022 (UTC)
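For the record, the change behind that error was the normalisation of templatelinks: tl_namespace/tl_title moved into the new linktarget table, so queries now join through tl_target_id. A toy SQLite stand-in of the fix (real MediaWiki tables have many more columns; the template title here is just an example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
# Minimal stand-ins for the relevant tables after the schema change.
cur.executescript("""
CREATE TABLE templatelinks (tl_from INTEGER, tl_target_id INTEGER);
CREATE TABLE linktarget (lt_id INTEGER PRIMARY KEY,
                         lt_namespace INTEGER, lt_title TEXT);
INSERT INTO linktarget VALUES (1, 10, 'Monthly_Challenge');
INSERT INTO templatelinks VALUES (42, 1);
""")
# The old query joined on tl.tl_title directly; that column is gone,
# so the join now goes via linktarget:
rows = cur.execute("""
SELECT tl.tl_from FROM templatelinks tl
JOIN linktarget lt ON lt.lt_id = tl.tl_target_id
WHERE lt.lt_namespace = 10 AND lt.lt_title = 'Monthly_Challenge'
""").fetchall()
print(rows)  # → [(42,)]
```

Same result set as before the change, just one extra join, which is presumably all the stats bot's query needed.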
Thank you so much! Awesome job as always. Languageseeker (talk) 23:49, 17 September 2022 (UTC)

On Being Ill

Do you think you can extract On Being Ill by Virginia Woolf from [1]? Languageseeker (talk) 12:57, 28 September 2022 (UTC)

@Languageseeker could we not just import the entire volume as it's 1926 so PD-US at least? Inductiveloadtalk/contribs 12:32, 9 October 2022 (UTC)
That would be the best-case scenario, but I feared that it would be too much work. Honestly, if you could import all the volumes of The Criterion that are in the PD, that would be amazing. Languageseeker (talk) 12:53, 9 October 2022 (UTC)
@Languageseeker it's not a huge amount of work if I have a spreadsheet to work from, as it's mostly automatic then. Assuming the spreadsheet is correct, 90% of the pain now is Wikimedia itself failing to handle the uploads. Inductiveloadtalk/contribs 13:05, 9 October 2022 (UTC)
But we would need to redact the '60s preface(s). Inductiveloadtalk/contribs 13:29, 9 October 2022 (UTC)
@Languageseeker good news: the only full volume I found (so far) that's not the reprint is the one you wanted anyway: Index:The Criterion - Volume 4.djvu. Inductiveloadtalk/contribs 18:22, 9 October 2022 (UTC)
Amazing, thank you! Languageseeker (talk) 21:36, 10 October 2022 (UTC)

bikeshed vs. discord

If there is going to be a server with the name "discord", there needs to be a server whose name is "bikeshed".--RaboKarbakian (talk) 17:12, 7 October 2022 (UTC)

I am sorry, I do not understand what you are trying to say here. You may access the Discord server through Discord (web client or app). Alternatively, as I believe is more appropriate for an open project, through the public Libera IRC channel, to which it is bridged (this means you can see Discord messages in IRC and Discord users can see IRC messages). There is also a Matrix bridge to the same IRC channel, so you can access it through a Matrix client if you prefer. Inductiveloadtalk/contribs 12:13, 9 October 2022 (UTC)

Need a little push in the right direction

Hi. I posted an issue some days ago with images. It's about a change in the textbox and header/footer space, which increased in height and which I can't edit. The additional areas I defined are light blue.

  • Perhaps there was a change and the CSS selector names have been changed/upgraded?
  • I thought that the problem stems from the Global/Local CSS, so I tried to separate local from global. [[2]] I compared the CSS of User:IneuwPublic, which has an older version; the problem is the same, and its global settings were not altered.

Can I fix this? I want to learn how to do it by myself. I just need a push in the direction of the new documentation of the changes. — ineuw (talk) 02:26, 29 September 2022 (UTC)

@Ineuw sorry for the long delay, I have been away. I do not know what changed here, but it's probably either one of or an interaction between any number of: the Vector skin (or whichever skin you use), ProofreadPage extension, Wikisource extension, something in the MediaWiki core, a gadget or your personal JS or CSS. I have no personal involvement with any recent changes, so I can't really help any more if you come to me directly than if you just report at WS:S/H. Inductiveloadtalk/contribs 12:23, 9 October 2022 (UTC)
Many thanks for the reply. I know you are no longer involved, so allow me to rephrase the question in its proper context. While you were away, there have been some changes. A key dimension of the ProofreadPage was changed from fixed dimensions to "vh". I understood why, and wondered if it's possible for a user to override the ProofreadPage settings, which are only relevant to me in my editing namespace, assuming that there is a private namespace, i.e. my account name.
I researched the subject and realized that it's up to me to understand and test the existing solutions before getting too deep into HTML and especially CSS. I have been using Vector, since switching from the original Monobook when I signed up. Though, it just occurred to me to practice on the preferences of my alternate account of user:IneuwPublic. This tool may be of interest to you and the community and these are the results of a 24" LED monitor displaying Wikisource Vector with the 16:9 True resolution indicated in this image. — ineuw (talk) 21:02, 10 October 2022 (UTC)
Thanks for listening. The ProofreadPage dynamic layout with its large block of empty spaces in the textboxes is identical in all skins, including the Visual Editor. I will ask the community. — ineuw (talk) 00:32, 13 October 2022 (UTC)

The latest victim

Incidentally, the impetus for kicking off the PageNumbers Next Gen thing was {{authority control}} becoming the latest victim of TemplateStyles getting orphaned. After updating Module:Navbox to a version that replaces global CSS with TemplateStyles (MediaWiki:Gadget-enwp-boxes.css is shrinking nicely now; mostly just Module:Infobox left to migrate), {{authority control}}, which uses it internally, gets bitten by it too. I'm thinking maybe I can wrap its output in a dummy div with a class and move that in PageNumbers so the style elements come along for the ride. The only alternative I came up with was to have PageNumbers manually check for a style-element sibling and hoist that along with the AC div. And, frankly, that's pretty gross.

If you have any brilliant ideas from your previous run-in with this issue I'm all ears.

Then again, I'm not sure I actually understand why the styles do not work just because they're in a different subtree of the DOM. I thought the only scoping was with a CSS selector forcing it to descendants of #mw-content-text, but apparently I'm missing something. At least there used to be dire warnings about TemplateStyles' ability to affect other elements on the page. Xover (talk) 12:27, 16 October 2022 (UTC)

Oh. Wait. .mw-parser-output not #mw-content-text. That would explain why they're not working. I wonder if there are any downsides to simply adding that class to our hoisted / dynlayout-exempt containers… They do contain all parser outputted content after all. Xover (talk) 12:42, 16 October 2022 (UTC)
Indeed, that was the issue. Maybe that'd do to convert {{header}} to TemplateStyles as well? Can you think of any horrible side effects of slapping .mw-parser-output on these? I mean, apart from the general reluctance to appropriate anything with a mw- prefix. Xover (talk) 13:53, 16 October 2022 (UTC)
Not off the top of my head, but I'm sure entropic demons would intervene and make it harder than I might expect at first! Inductiveloadtalk/contribs 11:43, 18 October 2022 (UTC)

Paired italics

In my recent efforts at seriously reducing LintErrors ('Missing tags'), I am running into an issue as follows.

This is content '' that contains 
an italic portion split over two lines'' Or worse '' contains
italics that run for 
multiple lines''

Is there a tool or script that could identify and semi-interactively repair these, without me having to manually search and use LintHint?

I also want to try and avoid giving myself long-term carpal tunnel fixing these relatively minor issues manually, if there is an automated solution available? ShakespeareFan00 (talk) 19:01, 16 October 2022 (UTC)

Basically, you can spot this if you can accurately tokenise each line. Then you can find lines that contain an odd number of '' markups and conclude that line either needs to start or end with another one. You then step through the lines of Wikitext in the page one by one and figure out which it is based on how many unmatched lines you have already found.
The "semi-manual" way may be to just flag up all such lines and ask the user "add italics markup at start or end?" with some kind of helper UI. Inductiveloadtalk/contribs 11:39, 18 October 2022 (UTC)
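To illustrate the parity approach described above, here is a rough Python sketch. The function name and the apostrophe-run counting heuristic are my own invention, not any existing tool; MediaWiki's real quote parser has more edge cases (e.g. around runs of four apostrophes), so this is only an approximation of accurate tokenisation:

```python
import re

def find_unbalanced_italic_lines(wikitext):
    """Flag lines whose '' italic markup doesn't pair up on that line.

    Returns (line_index, fix) pairs, where fix is "append" (the line
    opens an italic span and needs '' added at its end) or "prepend"
    (it closes a span carried over from an earlier line and needs ''
    added at its start). Runs of exactly 2 apostrophes, or 5 and more
    (bold+italic), are treated as containing an italic toggle; runs of
    3 (bold only) are not. This only approximates MediaWiki's parser.
    """
    flagged = []
    open_pending = False  # an italic span is open from a previous line
    for i, line in enumerate(wikitext.splitlines()):
        runs = re.findall(r"'{2,}", line)
        toggles = sum(1 for r in runs if len(r) == 2 or len(r) >= 5)
        if toggles % 2 == 1:
            flagged.append((i, "prepend" if open_pending else "append"))
            open_pending = not open_pending
    return flagged
```

Run over the example in the thread above, this flags lines 0, 1, 2, and 4, alternating between "append" and "prepend" — which is exactly the information a semi-interactive "add italics markup at start or end?" helper UI would need.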

Did you change something recently in relation to load-save actions?

I follow what's on the documentation page here Page:Nostradamus (1961).djvu/139, click preview, and nothing happens; the references do not substitute as they are supposed to.

What broke and why? ShakespeareFan00 (talk) 18:15, 17 October 2022 (UTC)

Well, some random combination of Gadgets/scripts has re-enabled it for now. I don't know what broke it, though. ShakespeareFan00 (talk) 19:09, 17 October 2022 (UTC)
I did modify it recently, but I didn't think I broke it (I made it so it no longer deletes bits of * r1 if it doesn't find the <r1> counterpart). Let me know if you encounter issues. If you do, reporting the raw Wikitext page content that broke it would allow it to be debugged. Inductiveloadtalk/contribs 11:22, 18 October 2022 (UTC)

CSS sanity check.

Being the 'overfloated' finger here - https://en.wikisource.org/wiki/Page:Nostradamus_(1961).djvu/137

I'm currently using Index:Nostradamus_(1961).djvu/styles.css to tweak it, but ideally there should be a cleaner way of doing this. ShakespeareFan00 (talk) 11:28, 18 October 2022 (UTC)

Can you just bump up the ppoem gutter width to accommodate a manicule + space + number? Template:Ppoem#Gutter_width. The default is 2em, which is enough for short numbers. Inductiveloadtalk/contribs 11:42, 18 October 2022 (UTC)
Thanks. I'll have a look at that. ShakespeareFan00 (talk) 11:52, 18 October 2022 (UTC)
Hmm-
https://en.wikisource.org/w/index.php?title=Index:Nostradamus_(1961).djvu/styles.css&oldid=12676880 and https://en.wikisource.org/w/index.php?title=Page:Nostradamus_(1961).djvu/137&oldid=12676878, which looks ugly due to the modified font-size for the manicule. ShakespeareFan00 (talk) 12:00, 18 October 2022 (UTC)

VPN access to enWS

I'm going to be moving to a country that regularly blocks Wikipedia and the entire Wiki world. I would like to keep on contributing to enWS, but I've noticed that it doesn't allow me to edit when I'm on a VPN. Is there any way to enable editing when on a VPN? Pinging @Xover as well. Languageseeker (talk) 04:48, 18 October 2022 (UTC)

@Languageseeker: Depends on the mechanism that's preventing you from editing. There are some hard network-level blocks for some VPN and VPN-like services, and these cannot be circumvented. But the most common variant is a simple block of the IPs added by admins whenever various forms of undesirable behaviour are detected. These blocks are circumventable by simply giving your account the "IP block exempt" flag, which lets an account edit through the block so long as it is logged in. We don't really have a defined process for this, but any admin can add it, so just put a request for it on WS:AN (for transparency purposes) and we'll figure it out. You'll need to request the same on each project you want to participate on (e.g. see c:COM:IPBE and wikt:Wiktionary:IP block exemption), and there is an additional global level where global IP blocks may be placed (see meta:WM:IPBE). Xover (talk) 06:28, 18 October 2022 (UTC)
@Xover Many thanks for the detailed answer. Languageseeker (talk) 01:32, 19 October 2022 (UTC)

TemplateStyles in Index: pages

When you were messing with Module:Proofreadpage index template, did you ever try adding a TemplateStyles stylesheet to it to replace the inline style attributes? Since MediaWiki:Proofreadpage index template is somewhat "special", I'm a little concerned this will break in spectacular ways and am hesitant to just try it out. I think I can treat it as just any other template, modulo the actual #tag:pagelist itself, but given the hacky ways PRP has historically stored things, my paranoia is in full bloom.

PS. Prompted by needing to make a small change to it, and getting offended by all the remaining cruft, I'm toying with the idea of finishing migrating it to Lua, moving formatting into a TemplateStyles stylesheet, and then writing a Gadget to allow inline editing of individual index fields (via the API). If you've thoughts, cautions, ideas, etc.… Xover (talk) 15:11, 18 October 2022 (UTC)

@Xover: I'm afraid I do not recall at all, so... maybe? It would definitely be a good idea. Inline editing would be pretty sweet.
BTW, User:Inductiveload/maintain actually does some fields already through a wizard-ish kind of UI. However, it hasn't yet learned to edit Index pages specifically via the API's JSON content-format, it just regexes the Wikitext. Inductiveloadtalk/contribs 16:21, 18 October 2022 (UTC)
Oh, wow, it hadn't even occurred to me to hit the "Maintain" link on an Index: page. D'oh! Well, now I know where I can crib some code. :)
I was thinking just regexing the wikicode, but I'll take a look at what the API offers. I was more thinking in terms of not needing to reload the page for each field so it feels snappy. Xover (talk) 19:12, 18 October 2022 (UTC)
@Xover FYI, the JSON-aware index field API is actually broken by upstream changes (see phab:T321446). Inductiveloadtalk/contribs 15:36, 29 October 2022 (UTC)
Umherirrender suggests "As replacement it seems prop=revision with the contentformat is usable", but I haven't got around to digging yet. Xover (talk) 15:52, 29 October 2022 (UTC)
I couldn't find a prop=revision for the action=parse, so assuming that that means a query action like https://en.wikisource.org/w/api.php?action=query&format=json&prop=revisions&titles=Index%3ASandbox.djvu&formatversion=2&rvprop=content&rvcontentformat=application%2Fjson, then rvcontentformat is apparently deprecated. Inductiveloadtalk/contribs 16:18, 29 October 2022 (UTC)
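For reference, the query URL above can be assembled like this (a sketch only, not an endorsement: it leans on the deprecated rvcontentformat parameter discussed here, so it may stop working whenever that parameter is removed upstream; the function name is mine):

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikisource.org/w/api.php"

def build_index_json_query(title):
    """Build the URL for fetching an Index: page's latest revision in
    the application/json content-format via prop=revisions. Mirrors
    the query quoted in this thread."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "revisions",
        "titles": title,
        "rvprop": "content",
        # Deprecated upstream; still honoured at the time of writing.
        "rvcontentformat": "application/json",
    }
    return API_ENDPOINT + "?" + urlencode(params)
```

Feeding the resulting URL to any HTTP client should return the Index page's JSON content-format for as long as the deprecated parameter survives.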
Hmm. So it seems the thinking is that a JSON representation should live in a derived MCR slot (that can be requested explicitly) instead of getting transformed on request when the API is hit. Optimising for latency instead of disk space, maybe? Xover (talk) 16:30, 29 October 2022 (UTC)
That would make more sense (hence phab:T291293), but will that ever happen? I am not hopeful. Inductiveloadtalk/contribs 16:37, 29 October 2022 (UTC)
Grmbl. @Tpt, @Samwilson, @Sohom data: Have any of you by any chance looked into making Proofread Page use a MCR slot for storing proofreading status (cf. T291293 and T48724)? And if so, can you comment on the feasibility of replacing the now-broken Index Data API (T321446) with a derived slot (cf. T277203)? Context is that T206253 nerfed contentformat with action=parse, and rvcontentformat with action=query is deprecated (still works, but will go away eventually). Unless someone picks up the MCR work it seems unlikely this is fixable. Xover (talk) 17:00, 29 October 2022 (UTC)
Yes, MCR is doable. The thing blocking me from implementing it is knowing what to do with the header and footer fields. Should we keep them as special areas or just drop special support for them? Tpt (talk) 09:37, 1 November 2022 (UTC)
@Tpt: Having dedicated noinclude fields is necessary for quite a lot of cross-page formatting, so dropping them would necessitate a lot of ugly and confusing literal <noinclude>…</noinclude> tags in the page body. In addition, I think the community is quite attached to reproducing running headers and footers on each page. Even community members with an exceedingly pragmatic view of the header/footer ("Don't put too much effort into them, they won't be transcluded so don't really matter") religiously reproduce them in their own texts. In other words, I think it is important that we preserve this functionality.
Whether that means keeping the existing serialisation, switching to wikitext-embedded-in-JSON, moving the header and footer to separate MCR slots, or some other clever thing, I have no idea. It's going to require special handling for diffs etc. whichever way we go, so I don't think the exact serialisation and persistence method matters much outside its own concerns. Xover (talk) 06:50, 2 November 2022 (UTC)

MC Bot down

It looks as if the MC bot is down again. Can you take a look when you have a moment? Languageseeker (talk) 20:08, 26 October 2022 (UTC)

My tmux session was gone, so maybe the toolforge server was restarted or something and there was a transient outage. I can see some stats got missed; I have refreshed the data and it should update soon. Thanks for the report. Inductiveloadtalk/contribs 22:59, 27 October 2022 (UTC)

Css field still needed?

Given Index: styles are here, do we still need the Css field (and its tracking category) in the index/index data config? Xover (talk) 15:23, 29 October 2022 (UTC)

@Xover no, I do not think we need it any more now that the category is cleared out. Inductiveloadtalk/contribs 15:29, 29 October 2022 (UTC)
Thanks. I'm doing some tidying there so I'll probably nuke it entirely when I get to that point. Xover (talk) 15:33, 29 October 2022 (UTC)

PRP css

In the PRP css there is the statement

.prp-page-image-openseadragon-horizontal {
    height: 33vh;
}

and it gives me a huge whitespace between the top of the tabs and the header/image panel (in monobook). What is its purpose? I have been ignoring it for ages as it is ugly though manageable on my monitors on PC, however, when on a laptop alone it is butt ugly. I can turn it off, though it does need to be addressed IMNSHO. [Noting that I reckon that I am way off any standard js/css display for enWS] — billinghurst sDrewth 02:05, 13 November 2022 (UTC)

@Billinghurst: I am unable to reproduce this in Safari, Firefox, or Chrome. Does it still do this if you open the page when logged out and add &useskin=monobook to the URL? I see nothing obvious that should cause problems in User:Billinghurst/common.js or User:Billinghurst/common.css, but you might want to test after removing the #prp-page-image-openseadragon-vertical {height: 100%;} (this should no longer be needed) and .ext-wikisource-ExtractTextWidget { display: none; } as the most likely to interact some way with this (if you haven't tried that already).
The CSS you quote above should only have an effect when the PRP editor is in horizontal mode. It applies to the container for the image when the editing layout is horizontal, and sets its height to "33 viewport height units" (essentially just 33% of the visible browser window height). When the editing layout is vertical the scripts dynamically add the class ".prp-layout-is-vertical" on that container, and the associated style then sets display: none; which hides it entirely regardless of its configured size. From what you describe it sounds like something is preventing this display:none from having an effect, but there are any number of things that could potentially cause that.
If you want to go delving into the Web Inspector in your browser you might try seeing if you have the class .prp-layout-is-vertical set on any element, and whether that element is the div with .prp-page-image-openseadragon-horizontal set. It might also be worthwhile to check the computed properties for that div to see what its height is set to (and from where) and what its display is set to (and from where). Some of the possible causes of this would leave an error message in the JavaScript Console, so any error messages there might also be relevant. Xover (talk) 09:46, 13 November 2022 (UTC)
The changes to my common.css don't make a difference. Interestingly, when I do it in a private window (so logged out) I see the thick white band appear (things rendered down), and then disappear with everything rendering back up, and that is with Vector or Monobook. Logged in, through web developer tools, when I pick the element (highlight) the whitespace shows as
<div class="prp-page-image-openseadragon-horizontal" id="prp-page-image-openseadragon-horizontal"></div>
when I do the same logged out, it doesn't show. So I may have to outwait the cache somewhat. <shrug> — billinghurst sDrewth 10:38, 13 November 2022 (UTC)
Hmm. Just to double-check: you are using the so-called "vertical" layout, with the scan image to the right of the text box, and not the "horizontal" that has the image above the text box? Because in that case the javascript delivered as part of PRP should have set prp-layout-is-vertical in the @class for that div. There are no error messages in the JavaScript console? What web browser is it?
Provided you are using "vertical" layout, and don't need to switch, you could maybe work around it using .prp-page-image-openseadragon-horizontal {display:none;} in your Common.css, but that's kinda icky and prone to break with future changes. It would be good to be able to pinpoint exactly why it's breaking and fix it properly.
PS. What you see when logged out sounds like roughly correct behaviour. On page load you should be getting a div with just .prp-page-image-openseadragon-horizontal set, and with a stylesheet that sets it to 33vh. Once the javascript has loaded (they load asynchronously) it should detect that you're in vertical layout (which is default for logged-out users) and add the .prp-layout-is-vertical class which changes the style to display:none. If the computer or network is sluggish (or heavily loaded or...) this could look like a big empty block appearing and then getting hidden. I don't see that happening on my computer, but that's probably because it happens fast enough that it gets lost in all the other site chrome and dynamic menus and stuff. It's likely that this would be more visible in a lightweight skin like monobook than in Vector. Xover (talk) 12:52, 13 November 2022 (UTC)

the subpage format used with header=1

With something like Emily of New Moon the header=1 format is ugh, as it inserts a year parameter on all the subpages, and that then bumps down the author field, extending the header, AND adds the publisher detail, extending it further. I believe that header=1 works need a simpler (separate?) subpage structure. — billinghurst sDrewth 23:29, 23 November 2022 (UTC)

PRP just calls a template with pre-filled params when header=1 is present. Once we have that all backed by a Lua module it should, I hope, be fairly straightforward to detect that we're on a subpage and adjust the output accordingly. The tough part would be defining what the logic should be, because we have so many quixotic edge-cases (and a lot of historic uses that hardcodes specific formatting inside template fields to achieve some specific effect). But without actually looking into it in detail I think it's probably doable. Xover (talk) 20:15, 24 November 2022 (UTC)
Hmm. But the author-on-a-new-line thing is actually due to feeding PRP's author info to the |override_author= field in {{header}}. For some reason that always puts it on a new line (I have no idea why). Xover (talk) 20:45, 24 November 2022 (UTC)
I'm not sure what I can helpfully add here, since the general "it's all a bit unsatisfying" conclusion is a long-known feature of header=1. Honestly, the only real answer is probably to put all the bibliographic data on Wikidata, structure it properly and use that. Holding this data on the index pages as freeform wikitext (which is what header=1 is actually doing) will never produce a universally good result. It was the only thing PRP could have done when written. I think it's probably safe to say it was an ambitious feature which never quite took off at enWS due to the inherent limitations. As for using structured data, I have moaned long and hard enough about WD's failure to define any usefully robust and consistent schemas for bibliographic data in the past and so I will refrain from repeating all of that here. What we should do from here in the medium term, honestly, I don't really know. Inductiveloadtalk/contribs 14:30, 29 December 2022 (UTC)

ppoem and speech prefixes

Happy Holidays, and hopefully you're busy stuffing your face with all sorts of traditional delicacies (next year's project: Fruitcake).

cf. Page:Shakespearean Tragedy (1912).djvu/335 and Page:Shakespearean Tragedy (1912).djvu/336, and the speech prefixes that overlap the play text (due to zero padding and italic text). Using <<< to float the speech prefixes in theatrical scripts over to the left of the text seems an obvious approach; but it currently seems to be making a hard assumption that it's being used for a verse number. Thoughts about this use case? Is it overloading the verse number functionality? Or a useful generalisation? What's a sensible interface to allow tweaking it?

PS. the fact that >>''Cor.''<<<And so I am, I am. actually works as intended is pure awesome. I tried it fully expecting it to fail. :) Xover (talk) 11:28, 28 December 2022 (UTC)

@Xover I have been fairly well-behaved with regards to face stuffing, though someone may have caused grievous harm to a rather tasty box of cheese. However, in the spirit of the season, I think it's unseemly to point too many fingers and we should let bygones be bygones.
As for the ppoem thing, I think the <<< syntax is good enough. It might be nice to have a "play script" module or something that wraps the core ppoem logic, but that's a hell of a lot of work and complexity. There exists control of the gutter width with CSS like .ws-poem-left-gutter .ws-poem-line { padding-left: 3em; }, which can also be selected by a per-ppoem or per-stanza class.
By the way, do you have any thoughts about Template_talk:Ppoem#Option_for_manual_centring? It's something that I thought about back in the early days, but the "obvious" solution of hardcoding a width falls apart almost immediately on small screens without a lot of CSS hackery (usually including media queries, which do not work well or at all on a lot of ereaders). Inductiveloadtalk/contribs 14:23, 29 December 2022 (UTC)