The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel webclient. For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 293 active users here.




Template to mark works ineligible for copyright due to lack of human authorship

I propose either creating a new template (perhaps {{PD-machine}}), or altering our current {{PD-ineligible}}, to account for works that are in the public domain due to the absence of human creative expression. This template would make clear, for example, that machine translations of non-English works may be freely hosted here (so long as the foreign-language original works are also eligible).

The question whether machine translations may be hosted here has been discussed at meta: Wikilegal/Copyright for Google Translations, which concludes that Google holds no U.S. copyright in the automatic translations produced by its software.

Guidance from the U.S. Copyright Office also supports this conclusion. Section 306 of the current (2017) Compendium of U.S. Copyright Office Practices states:

The U.S. Copyright Office will register an original work of authorship, provided that the work was created by a human being.

The copyright law only protects “the fruits of intellectual labor” that “are founded in the creative powers of the mind.” Trade-Mark Cases, 100 U.S. 82, 94 (1879). Because copyright law is limited to “original intellectual conceptions of the author,” the Office will refuse to register a claim if it determines that a human being did not create the work. Burrow-Giles Lithographic Co. v. Sarony, 111 U.S. 53, 58 (1884). For representative examples of works that do not satisfy this requirement, see Section 313.2 below.

The cross-referenced Section 313.2 does not expressly mention translations, but it does provide, in part, that “the Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.” This further supports the principle that machine translations contain insufficient human expression to qualify for their own copyright protection.

At present, none of our Category:License templates address the human authorship requirement directly. The closest analog is {{PD-ineligible}}, which addresses works that “consist[ ] entirely of information that is common property and contain[ ] no original authorship.” All the works marked with the present version of {{PD-ineligible}}, however, appear to be the product of human authorship (or at least human transcription), and the {{PD-ineligible}} tag does not appear to address the ineligibility of non-human-originated works as discussed in the above-quoted language from the Compendium.

Because {{PD-ineligible}} in its present form does not quite fit the scenario of works created by machine, I suggest creating a new template that would expressly note that the lack of human authorship disqualifies such works from copyright protection in the United States. Tarmstro99 14:45, 8 September 2018 (UTC)

Even if Google holds no U.S. copyright in the automatic translations produced by its software, how about adding quick links from Wikisource? Perhaps machine translations will be improved, thus evolving.--Jusjih (talk) 01:52, 25 September 2018 (UTC)

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

Orphée aux Enfers

Could someone please strip the Google notice from File:Orphée aux Enfers (Chicago 1868).djvu (on Commons) in preparation for hosting the work on Wikisource? --EncycloPetey (talk) 01:54, 14 October 2018 (UTC)

Done --Mukkakukaku (talk) 18:34, 14 October 2018 (UTC)

Other discussions

Adapting Template:pd/1996 or a new template

As per previous conversation started by Prosfilaes, from next year US-published works published in 1923 will be out of copyright, and progressively year by year others will follow. We need to start working on whether we will adapt Template:pd/1996 to have wording that says that the work is out of copyright, and reconfigure that template to set triggers. Or whether we are going to implement a new template for post 1922 works. (Full coverage at copyright tags.) — billinghurst sDrewth 09:34, 5 August 2018 (UTC)

Doesn't the 1996 series of templates primarily apply to works published outside the US? For works inside the US, we've been using the 1923-series of templates, and I would assume that it's the 1923-series that would need to be adapted to accommodate US-published works from 1923. It would be odd to have "published before 1923" be a reason a work is in PD, if works published before 1924 are the actual set of works in PD. --EncycloPetey (talk) 15:55, 5 August 2018 (UTC)

You are correct that the pd/1996 has been non-US first publications to this point, and there would be complications in updating the template. Template:pd/1923 is set, and incrementing Template:PD/19xx is possible, though becomes a lot of templates. It is why I brought up the issue as we have to get the wording right, and look to the easiest means to progress through the years. As 1977 is the next US copyright milestone, maybe it is something like pd/1978 with both a year of birth AND year of publishing as parameters, where year of publishing flicks between copyright and not copyright.

billinghurst sDrewth 22:46, 5 August 2018 (UTC)

Let us keep watching for the rest of 2018 to be sure that the US copyright term is not extended. Then I may want to introduce "PD-pub-95" to mean "public domain for being published more than 95 years ago". Renaming Pd/1923 will probably be too disruptive, so making a new template may be better.--Jusjih (talk) 04:48, 9 September 2018 (UTC)
It's not going to happen, and part of the reason it's not going to happen is because we are going to rip the hell out of Congress if they try. Being able to say that we are already preparing for this change will only help our case. And when the ball drops in New York, I will be uploading 1923 works, and will need an appropriate tag.
I'd like a single PD tag that takes publication year and author death year (if known), and it shouldn't mention in the name the exact rules, just applying all the rules that can be deduced clearly from publication year and author death year. Maybe just naming it PD-old would be too much?--Prosfilaes (talk) 06:35, 10 September 2018 (UTC)
I support the single-PD template idea. While it would be rather an in-depth template, I don't think it would be particularly difficult to implement (just a series of if-elsif-else conditions). Mukkakukaku (talk) 00:56, 13 September 2018 (UTC)
The US Copyright Office already considered the copyright terms too long. Mid-term election will be soon. Template:Pd/1923 is heavily used, so renaming will be harder than adding new template like "PD-pub-95". I will wait for the ball to drop in Times Square.--Jusjih (talk) 03:00, 18 September 2018 (UTC)
  • Comment: Looking back at this and thinking again, I think that we should be building a template based on a 1978 cutoff, aligning with 1923 and 1996 cutoff usage. We already have our subset of use templates (<1923; 1923<1996) that have a series of #if statements (well, #ifexpr) that get implemented. At this stage we need to have some output templates that work in main ns that cover 1923 to 1977 at least:
  • published in US between 1923-1963 with notice and renewal
  • published in US between 1964-1977 with notice
  • published outside of US between 1923-1977 (two scenarios)

where we will be incrementing per year. So we just use a #if expression for currentyear - 95 > publicationyear where it shows the PD template when true, and copyright violation when it fails. It is a few years until we need to worry about PD-old for post 1923, so we can work that bit out later. If someone uses this new license for a pre-1923 work, we can simply apply the {{pd/1923}} logic.

We still need a licence to display and the wording to use for US users; the bottom half replicates pd/1996. billinghurst sDrewth 10:03, 25 September 2018 (UTC)
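The year test proposed above can be sketched in Python rather than #ifexpr template syntax. This is only an illustration: the function name and the returned labels are hypothetical, and the sketch deliberately ignores the notice, renewal, and place-of-publication conditions listed above. Only the comparison currentyear - 95 > publicationyear and the fallback to the {{pd/1923}} logic come from the discussion.

```python
from datetime import date
from typing import Optional

TERM_YEARS = 95  # publication-based term used in the proposed #ifexpr test


def licence_output(publication_year: int, current_year: Optional[int] = None) -> str:
    """Return a label for what the (hypothetical) template would display."""
    if current_year is None:
        current_year = date.today().year
    # Pre-1923 works fall back to the existing {{pd/1923}} logic.
    if publication_year < 1923:
        return "pd/1923"
    # The test quoted in the thread: currentyear - 95 > publicationyear.
    if current_year - TERM_YEARS > publication_year:
        return "public domain"
    return "copyright violation"
```

With this test a 1923 publication is still flagged during 2018 and flips to public domain in 2019, which matches the "incrementing per year" behaviour described in the comment.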

List of broken links from Wikipedia to Wikisource

On my user page, User:Uziel302, I put a list of 340 broken links from Wikipedia to Wikisource; any help fixing those links is much appreciated. Thanks. Uziel302 (talk) 07:58, 22 August 2018 (UTC)

Some examples:
  1. w:Alabama to Wikisource:Alabama
  2. w:Afghanistan to Wikisource:Afghanistan
  3. w:Azerbaijan to Wikisource:Azerbaijan
  4. w:Ancient_Egypt to Wikisource:Ancient_Egypt
  5. w:Aga_Khan_III to 1922_Encyclopædia_Britannica/Aga_Khan_III
  6. w:Antipope to Dictionary_of_Christian_Biography_and_Literature_to_the_End_of_the_Sixth_Century/Dictionary/Z/Zephyrinus
  7. w:Andrew_Carnegie to 1922_Encyclopædia_Britannica/Carnegie,_Andrew
  8. w:Angles to Ecclesiastical_History_of_the_English_People/Book_2
  9. w:Angles to Historia_Ecclesiastica_gentis_Anglorum_-_Liber_Secundus
  10. w:Angles to The_Ecclesiastical_History_of_the_English_Nation
  11. w:Aswan to Dictionary_of_Greek_and_Roman_Geography/Aswan
  12. w:Andalusia to Estatuto_de_Autonomía_de_Andalucía_2007
  13. w:Beryllium to Beryllium

Thanks, Uziel302 (talk) 10:51, 22 August 2018 (UTC)

There may be some broken links to the Folk-Lore Journal at en.wikipedia, the result of a move with suppressed redirects. — CYGNIS INSIGNIS 11:25, 22 August 2018 (UTC)
Anything that was Wikisource: namespace is now Portal: ns. To the others, there looks to be a collection of never/wishful, or moved. — billinghurst sDrewth 12:15, 22 August 2018 (UTC)
Looking through the list itself, one wonders why it was linked in the first place. Can I suggest for people, that you can use {{wikisource author}} as that was redesigned to utilise Wikidata interwiki links so moved pages are automatically updated. One day I am hoping that Wikipedia is better acclimatised to WD and many of their citation templates will be able to utilise a WD-based citation. — billinghurst sDrewth 12:23, 22 August 2018 (UTC)
billinghurst, is there an option to make Wikisource namespace redirected to Portal namespace? Uziel302 (talk) 16:09, 22 August 2018 (UTC)
No, that would be a cross namespace redirect, and wikis don't do it. Portal is a content namespace, and Wikisource is not. — billinghurst sDrewth 22:54, 22 August 2018 (UTC)
Just found a better query for these broken links, updated my page to include over 5,000 broken links. Uziel302 (talk) 08:02, 24 August 2018 (UTC)
Bunch of those aren't actually intentional links to Wikisource. They're trying to link to articles about Swedish tv shows, churches, etc. that are prefixed using the abbreviation S:t -- eg. "S:t Mikael" (a tv show). The "s:" prefix is forcing a WS interwiki.
Also this seems like something that they should be fixing up over on the enWP side. I fixed up a bunch where it was a clear "page moved" situation, but it's a thankless task. Mukkakukaku (talk) 00:32, 25 August 2018 (UTC)
well, i will thank you. a bunch of those are broken DNB links, and missing transcribed articles that were copied there from IA. (i.e. The New International Encyclopædia/Leutze, Emanuel) we did a 12000 article backlog for EB1911 - NIE and Appletons should be easy, not soul crushing at all. Slowking4SvG's revenge 02:55, 29 August 2018 (UTC)
by the way, some of those may be the em dash versus en dash conflict. Slowking4SvG's revenge 02:52, 3 September 2018 (UTC)
fyi, (a lot of links are malformed template syntax) the english admins are mass reverting my attempts to work this backlog, so i leave it to you. Slowking4SvG's revenge 00:24, 19 September 2018 (UTC)
Hmm. What changes are getting reverted? --Xover (talk) 17:02, 19 September 2018 (UTC)
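Two of the patterns mentioned in this thread lend themselves to automated triage: links into the old Wikisource: namespace (said above to now live in Portal:), and accidental interwiki links caused by the "S:t" abbreviation. The following Python sketch is a rough heuristic only. The function name, the category labels, and the input format (the raw link target as written on Wikipedia, such as "s:Wikisource:Alabama") are assumptions, and a suggested Portal: title would still need checking by hand.

```python
import re
from typing import Optional, Tuple

# Heuristic: "S:t Mikael"-style titles, where "s:" is parsed as an interwiki prefix.
S_T_ABBREVIATION = re.compile(r"^[Ss]:t[ _][A-ZÅÄÖ]")
OLD_NAMESPACE = "Wikisource:"
NEW_NAMESPACE = "Portal:"


def triage_link(raw_target: str) -> Tuple[str, Optional[str]]:
    """Classify a raw link target and, where possible, suggest a replacement."""
    target = raw_target.strip()
    if S_T_ABBREVIATION.match(target):
        # Probably meant as a local article title, not a Wikisource link.
        return ("accidental-interwiki", None)
    if not target.lower().startswith("s:"):
        return ("not-wikisource", None)
    title = target[2:]
    if title.startswith(OLD_NAMESPACE):
        # Per the thread, former Wikisource: content is now in Portal:.
        return ("moved-namespace", "s:" + NEW_NAMESPACE + title[len(OLD_NAMESPACE):])
    return ("needs-manual-check", None)
```

Everything that lands in "needs-manual-check" (moved pages, never-created pages, dash variants) still needs a human, which is most of the list.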

Page numbers borked with hyphenated words

In working on Characters of Shakespear's Plays (which is ready for validation, hint hint :) ) I notice that page number display in mainspace seems to be broken whenever there's a {{hyphenated word start}}, or a quote that crosses page boundaries (using {{margin left}} + {{smaller block/s}}; and closed with 2 x {{div end}}), in the relevant Page:-pages. When these occur the bits of the page highlighted in mainspace when hovering over the page number are truncated in strange ways, and the page numbers corresponding to the relevant pages do not show up (they are in the html source of the page though). Subsequent pages appear correctly numbered and the highlight works as expected; and the actual content of the affected pages is present as expected. A known issue with… the page number display? Something I'm doing wrong? --Xover (talk) 13:12, 26 August 2018 (UTC)

Would you please provide a direct link to a broken subpage. Also it would be worth using {{auxiliary Table of Contents}} on the root page. It is not a known issue that it is broken, and I have been doing plenty of works that way recently. — billinghurst sDrewth 13:14, 26 August 2018 (UTC)
Thanks. Page 18–19 on Characters of Shakespear's Plays/Macbeth (in that there's no page 19 link in the left margin on my browser). AuxToC is coming up. --Xover (talk) 13:21, 26 August 2018 (UTC)
works for me in firefox browser. Slowking4SvG's revenge 16:23, 26 August 2018 (UTC)
Meh. I hadn't even considered the possibility that it might be browser-related. Thanks. I'll try fiddling there to see if anything dawns. --Xover (talk) 17:32, 26 August 2018 (UTC)
Works in Firefox and Internet Explorer 11. Broken in Safari and Chrome. All latest versions, except IE. Logged in or logged out doesn't affect it. --Xover (talk) 17:44, 26 August 2018 (UTC)
Hmm. And when I turn on page numbers inside the text from the Display Options in the sidebar (that I, frankly, had no idea existed until now), page 19 shows up again and is in the right place in the text. But, looking through this with a debugger, I see Safari gives the span element with the ID "18" an .offsetTop of 1225 pixels; page 19 is 976 pixels; and page 20 is 1814 pixels. Which is of course impossible since the spans in question follow one another from top to bottom in the page. The immediate reason page 19 isn't showing up is that the page numbering code checks that there's at least 5 pixels between the current and previous page number, and since 976 - 1225 is in fact less than 5 (as it's a negative number), the script hides the page number with display: none. Now, as to why Safari gives page 19 a nonsensical .offsetTop… I have no idea. Could this really be bugged in Safari? Or, rather, in Webkit, since Chrome also sees this behaviour. Not sure where to dig next with this. --Xover (talk) 19:46, 26 August 2018 (UTC)
Just for completeness, I checked the offsets in Firefox and got: 984 for page 18; 1186 for page 19; and 1436 for page 20. Pretty much as expected, in other words. --Xover (talk) 20:48, 26 August 2018 (UTC)
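The hiding behaviour described above can be reproduced with a small Python model. This is a sketch of the check as described in the thread, not the actual PageNumbers.js code: the function name is hypothetical, and it assumes each marker is compared against the immediately preceding one.

```python
from typing import Dict, List

MIN_GAP_PX = 5  # minimum distance between page numbers, as reported in the thread


def visible_page_numbers(offsets: Dict[str, int]) -> List[str]:
    """Return the page numbers that survive the minimum-gap check."""
    visible = []
    previous_top = None
    for page, top in offsets.items():  # dicts keep insertion (document) order
        if previous_top is None or top - previous_top >= MIN_GAP_PX:
            visible.append(page)
        # Assumption: the comparison is always against the preceding marker.
        previous_top = top
    return visible


# Offsets quoted in the thread for pages 18 to 20.
SAFARI = {"18": 1225, "19": 976, "20": 1814}
FIREFOX = {"18": 984, "19": 1186, "20": 1436}
```

Under these assumptions the Safari offsets leave only pages 18 and 20 visible, while the Firefox offsets keep all three, which matches what was observed.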
Ok, this is getting… weird. It appears there's a bug in Webkit where .offsetTop is calculated incorrectly for empty inline elements (the page numbers are an empty <span>). The bug was reported in 2011 but the Webkit project seems to think it's notabug (I don't understand their reasoning for this, and the conclusion appears nonsensical on the face of it). That explains why page numbers inside the text works: when you toggle this the PageNumbers script puts the link etc. inside the formerly empty <span> (making it non-empty, and avoiding the bug). However, since this only affects certain pages, but all page markers are empty <span> elements, there has to be a further factor at play. And it looks like that factor is this: when {{hyphenated word start}} is used, there is a space character before it, and that shows up in the generated HTML before the page marker <span> (which won't be there for non-{{hws}} pages). That single space character appears to be what triggers the Webkit .offsetTop bug. To wit: I removed the space before {{hws}} on page 19 in the Page:-namespace, and lo and behold, now the page number for page 19 shows up properly in mainspace. I have no idea what to do about this (cry? laugh hysterically? take up fishing?). --Xover (talk) 15:26, 27 August 2018 (UTC)
You might consider opening a ticket on phabricator with your findings? Since it is a known defect in webkit's implementation of offsetTop, then it stands to reason there are workarounds that can be implemented in a cross-browser fashion, if not by browser sniffing then possibly via another mechanism. Mukkakukaku (talk) 00:20, 28 August 2018 (UTC)
yeah- wrap it up with a bow, for a GSoC or hackathon task. display options have only worked since 2016. you could also apply for a grant to fix it. Slowking4SvG's revenge 03:23, 28 August 2018 (UTC)

More details than you wanted

Ok, I've done some more digging into this, and it still boils down to a really weird bug in webkit (Safari, Chrome). Given a construct such as…

1: TEXT <span id="p1"></span>TEXT<br/>
2: TEXT <span id="p2"></span>TEXT<br/>
3: TEXT  <span id="p3"></span>TEXT<br/>
4: TEXT <span id="p4"></span> TEXT<br/>

…which approximates what MediaWiki + ProofreadPage + pagenumbers.js is doing/using, webkit will report the wrong .offsetTop value for lines 3 and 4 (that is, for the page markers for a page 3 and a page 4). In this example it is triggered by the two space characters in front of the <span>-element in line 3, and by the single space character after the <span>-element in line 4. It does this even though all of them report the same .offsetParent (the <body>-element in my test case; on Wikisource it's the <div id="regionContainer">-element).

The extra space character in line 3 is the case that seems to trigger it on Wikisource when {{hws}}/{{hwe}} are in use, because in the final HTML output you will get both the literal space character preceding the {{hws}} on the first page and the character entity reference for the space character that Mediawiki inserts when it transcludes the two pages together.

The webkit folks haven't acted on this bug over the last 7 years (reported in 2011 I think), and seemed then to think that this was intended behaviour, so I'm not holding out much hope that they'll do anything about it any time soon. Instead I'm going to look into whether there is any kind of reasonable workaround that can be implemented here to avoid triggering this bug. Not sure there is one, at least not without major surgery to MediaWiki/ProofreadPage, but it seems more promising than getting Webkit fixed in any case. --Xover (talk) 16:48, 30 August 2018 (UTC)

Ok, it looks like in every case where this is triggered by {{hws}}/{{hwe}} it can be worked around by moving the preceding space character into the template: it is not {{hws|no|normal}} → it is not{{hws| no|normal}}. This gives correct presentation in the Page:-namespace, even if the underlying markup is technically incorrect, and doesn't affect the main namespace (because the mainspace content is taken from the following {{hwe}}).
There are, however, other things that trigger this (or possibly a similar bug), mainly variations of cross-page formatting markup. In my case I have long multi-page quotes offset using {{margin left}} + {{smaller block}}, both in start+end-tag mode ({{div end}} in the footer, and on the page where the quote ends). In these cases my suspicion is that it's actually illogical markup that gets generated. By default a paragraph will be wrapped in <p>…</p> tags; but the formatting templates mentioned above will insert a <div> start tag in one paragraph that won't have its corresponding </div> end tag until a later paragraph. This is either output as is (leading to invalid markup), or some Mediawiki mechanism is doing something magical to compensate for it. In either case, whatever it is that's happening there gives the same symptoms as the Webkit bug, and seems likely to be a different way of triggering the same bug. --Xover (talk) 06:16, 6 September 2018 (UTC)
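The {{hws}} workaround described above (moving the preceding space into the template's first parameter) is mechanical enough to sketch as a text transformation. The Python below is illustrative only: the function name is hypothetical, it handles just a single space directly before {{hws|…}} or {{hyphenated word start|…}}, and it is not an existing gadget or bot.

```python
import re

# A single literal space immediately before the template opening.
HWS_WITH_LEADING_SPACE = re.compile(r" \{\{(hws|hyphenated word start)\|")


def move_space_into_hws(wikitext: str) -> str:
    """Move the space that precedes {{hws|...}} into its first parameter."""
    return HWS_WITH_LEADING_SPACE.sub(r"{{\1| ", wikitext)
```

Running it twice is harmless, since text that has already been converted no longer has a space before the template.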
Ok, it looks like the multi-page quotes are just another variant of the trigger for the same bug. In this case it's the <br/> at the end of the first page rather than the extra space character left behind by {{hws}}, but in both cases it's having extra whitespace before the page span that triggers the Webkit bug. In the multi-page quote (or, probably, more properly multi-page formatting of any kind) case, a workaround is to simply move the <br/> to the start of the following page. It ends up displaying a single extra newline on the following page in the Page: namespace (which you won't notice if you don't have pilcrow paragraph markers enabled), but otherwise gives correct presentation. Both these workarounds are, of course, {{nop}}-level voodoo coding, and tedious as nevermind, but do sort of work. --Xover (talk) 06:41, 7 September 2018 (UTC)
Hmm. Perhaps we could detect this bug by checking if the current page span's .offsetTop is less than the previous page span's, and then temporarily set the current span to display: inline-block or something just to get the correct .offsetTop? Since inline page numbers (which are inline-block and have the page number as content) have the correct .offsetTop it should be possible to do at the cost of some more processing. Combined with ignoring negative offsets to work around the possible Firefox bug mentioned below, that might eliminate a whole bunch of cases of weirdness with the page numbers. Anyone have any thoughts on such an approach? Billinghurst? Anyone? --Xover (talk) 05:31, 7 September 2018 (UTC)
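The detection half of this idea (spotting a page span whose .offsetTop is smaller than that of the span before it) can be sketched as follows. This Python model covers detection only, not the display: inline-block adjustment; the function name is hypothetical, and the sample values are the Safari and Firefox offsets quoted earlier in the thread.

```python
from typing import List, Tuple


def suspect_markers(offsets: List[Tuple[str, int]]) -> List[str]:
    """Return page markers that appear above their predecessor, in document order."""
    suspects = []
    previous_top = None
    for page, top in offsets:
        # Markers follow each other top to bottom, so a smaller offset is suspicious.
        if previous_top is not None and top < previous_top:
            suspects.append(page)
        previous_top = top
    return suspects
```

For the Safari values (1225, 976, 1814) this flags page 19 only; for the Firefox values (984, 1186, 1436) it flags nothing.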
No, testing suggests it's insufficient to simply set the spans to be inline blocks: the core issue in all of these is that both the specifications and browser implementations treat empty inline elements in weird and inconsistent ways. As an example, what is actually happening in the case of multi-page quotes is that there's a block-level element (p or div) that surrounds the quote across pages, and the span that marks the page, because it is empty, is taken out of the normal flow (much like floating elements) and placed at the beginning (top left corner) of the containing block's layout box. The .offsetTop values we get back in those cases aren't so much wrong in themselves as they are a reflection of an underlying behaviour of the layout engine.
That underlying layout engine behaviour seems the best place to tackle this. The construct currently used when two pages are transcluded together is &#32;<span><span class="pagenum ws-pagenum" id="19" data-page-number="19" title="Page:Some_Work.djvu/49"></span></span> The &#32; is the default configured page separator for the ProofreadPage-extension. I'm not entirely clear on where the spans come from, or why there are two of them, but the main problem is that both of them are empty. When inline elements are empty browsers tend to calculate a layout box for them that is 0 pixels high and 0 pixels wide: that is, a dimensionless point. This is then removed from normal flow and placed at the beginning of its containing element's layout box. Any content inside these spans that generates a text box (a text layout box is what its CSS background-color applies to; it's the area with the gray background in the code snippets here) will make them non-empty and will make the layout engine start treating them the same as it would, say, an <i>emphasised</i> word in the text.
Since anything we put in there will affect the rendering of the page, we can't just stuff a random text string in there (it'd show up in the page). Normal whitespace also won't work because the layout engine optimizes it away and ends up treating the span as empty again. However, we have the Unicode Zero-width space character. It is intended to be an invisible character that hints to layout engines about a suitable place to break text between lines (see the example in the enwp article). But it also works fine as a general invisible marker: in the rendered page it is invisible, but gets a text box that's 1px wide and as many px high as the font size.[1] Since it now has dimensions the layout engines treat it as part of the normal flow, and we will avoid all the weirdness that comes with empty inline elements.
Depending on what particular bit of code is inserting those spans, the (proposed, must be tested) solution is either to add a &#8203; in the innermost span there, or to make the PageNumbers.js script add it dynamically before getting their positions. --Xover (talk) 07:30, 8 September 2018 (UTC)
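To make the proposal concrete, here is a Python sketch of the transformation on the generated HTML: put a &#8203; inside every empty page-marker span. It is only a model of the idea, operating on an HTML string with a regular expression; the function name is hypothetical, and the real fix would be made in whatever template or script emits the span rather than by post-processing.

```python
import re

# An opening <span> whose class list contains "pagenum", immediately closed again.
EMPTY_PAGENUM_SPAN = re.compile(
    r'(<span\b[^>]*\bclass="[^"]*\bpagenum\b[^"]*"[^>]*>)(</span>)'
)
ZERO_WIDTH_SPACE = "&#8203;"


def fill_empty_page_markers(html: str) -> str:
    """Insert a zero-width space into empty page-marker spans."""
    return EMPTY_PAGENUM_SPAN.sub(
        lambda m: m.group(1) + ZERO_WIDTH_SPACE + m.group(2), html
    )


SAMPLE = (
    '&#32;<span><span class="pagenum ws-pagenum" id="19" '
    'data-page-number="19" title="Page:Some_Work.djvu/49"></span></span>'
)
```

Applied to the construct quoted above, only the inner span gains content; the attributes and the outer span are left untouched, and applying the function a second time changes nothing.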
@Xover: One possible existing "hook" is MediaWiki:Proofreadpage pagenum template which is the "template" (yes, yes, wrong namespace - you didn't think this stuff ought to make sense, did you?) for the inner of your two nested spans. You may also wish to view the archives - an aborted discussion on a related topic which might have some useful facts: Wikisource:Scriptorium/Archives/2016-09#Making_<pages>_more_flexible? 10:07, 8 September 2018 (UTC)
Thanks! For this particular purpose, it looks like simply adding &#8203; to MediaWiki:Proofreadpage pagenum template would be sufficient. And I can't immediately think of any likely negative side effects of doing so. In other words, it might actually make sense to simply do so and then spot-check a couple hundred works for obvious breakage. The most likely places for that to occur are the very pain points mentioned in that discussion thread: multi-page tables. But since Phrasing content (which text, including character references, are) is allowed anywhere <span> is, the addition should not create any problems that are not already present. The only way to know for sure is to try, I think. --Xover (talk) 07:03, 9 September 2018 (UTC)
Something else to note (official documentation here): {{hws}} is not the source of the extra space between transcluded pages. It is inserted by global mediawiki PHP code beyond the scope of administrative control within local enWS. 02:12, 17 September 2018 (UTC)
@Billinghurst: (or anyone else with relevant knowhow) It looks like MediaWiki:PageNumbers.js is loaded unconditionally by the skin (or somesuch)? That is, there's no actual way for me to disable it in order to experiment with a custom version in my user scripts. Any suggestions for how I might approach trying to implement a workaround for this bug (and the Firefox one below) in the script? --Xover (talk) 06:47, 7 September 2018 (UTC)


  1. The height of the text box is weird: the CSS spec punts on the definition, leaving it up to each browser to figure out, and each browser does it differently. The core issue is that a given bit of text can be in one or more fonts; each font has different inherent size; and "font size" can be measured in many different ways. Is the height measured from the top of the highest ascender (on the "h" say) to the lowest descender (on the "j")? Or from the baseline to the highest ascender? And there's the concept of "x-height", the size of the character "x", and an "em", the width of the character "m". This topic would make for a pretty beefy chapter, or even two, in a book on typography, and I've only barely scratched the surface.

Do you have small tasks for new contributors? It's Google Code-in time again

Hi everybody! Google Code-in (GCI) will soon take place again - a seven-week-long contest for 13-17 year old students to contribute to free software projects. Tasks should take an experienced contributor about two to three hours and can be of the categories Code, Documentation/Training, Outreach/Research, Quality Assurance, and User Interface/Design. Do you have an idea for a task and could you imagine mentoring that task? For example, do you have something in mind that needs documentation, research, some gadget or template issues on your "To do" list but you never had the time, and can imagine enjoying mentoring such a task to help a new contributor? If yes, please check out mw:Google Code-in/2018 and become a mentor! Thanks in advance! --AKlapper (WMF) (talk) 13:51, 9 September 2018 (UTC)

Would fixing the edit page tool bars be a suitable task? There are things that are there that shouldn’t be and things that should be that aren’t e.g. the maths characters are up the wop and ff isn’t on the ligatures and the User choice is at the bottom of the pull-down list, it would be best at the top. Zoeannl (talk) 23:56, 16 September 2018 (UTC)

My first index

I'm about to transcribe my first book index, starting at Page:Bird Haunts and Nature Memories - Thomas Coward (Warne, 1922).pdf/273. Does anyone have any tips, good examples or recommended templates, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:16, 15 September 2018 (UTC)

I generally proof them like extended tables of contents. (I also don't preserve the two-column layout unless there is a good reason for it.) Page numbers can be linked using {{DJVU page link}} if you think it's worthwhile, otherwise they're transcribed as-is. If you run into any ditto marks, there's a template for that.
I like to link topics as close to where they are in the work as is possible. If there's a proper section heading with an anchor, they can be linked to that, otherwise just to the chapter. (Thus it's sometimes useful to have already figured out the TOC/how the work will be organized in the main namespace.)
It's really just like proofreading/transcribing a set of highly ordered pages (eg. standardized layout.) It can get tedious. For complex indices I sometimes copy the content into a text editor which supports useful things like regex find/replace, column editing, etc. --Mukkakukaku (talk) 01:53, 16 September 2018 (UTC)
Thank you. Is there a template that will make each line in the source code appear as a new line in the rendered result, as <poem> does, rather than putting <br> at the end of each line? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:18, 16 September 2018 (UTC)
  • Comment: Index pages are always a little 'urk', and there is no one way to do them, as it depends on how they are indexed. Please don't retain columns as they don't stitch together well. Add {{anchor}}s at each section and when transcluding put {{compactTOCalpha}} into the notes. Generally with spacing I have just put one empty line between each line, and two empty lines b/w each letter. If you need to centre the Index label, I "noinclude" the formatting as it transcludes poorly. I will also often {{block center}} the entire transcluded set as a left aligned column can look "urk". If you are looking for examples, I have had parked quite a few at user talk:Phe over the years. — billinghurst sDrewth 13:34, 16 September 2018 (UTC)

On the other hand, I never block center/center block indices because I use Layout 2 by default and it looks "urk". ;) But to each, his own.
I sometimes use the {{dhr}} template to guarantee the explicit "blank line" between sections (the software collapses whitespace for you otherwise.) I also find {{plainlist}} to come in useful, though I tend to use it more for ingredient lists in cookbooks than for indices. --Mukkakukaku (talk) 18:05, 16 September 2018 (UTC)

Tech News: 2018-38

21:58, 17 September 2018 (UTC)


An editor has proposed at w:Wikipedia:Village pump (miscellaneous)#Vulgate that we need a better and more complete Vulgate here (presumably an English translation). BD2412 T 02:30, 19 September 2018 (UTC)

It wasn't clear to me from the query whether it was a request for the Latin Vulgate or an English translation. The standard English translations of the Vulgate are the Bible (Douay-Rheims) translations. As with many translations into English, there are multiple editions extant. --EncycloPetey (talk) 03:15, 19 September 2018 (UTC)
The main Bible focus here at present is getting the KJV scan-backed. I doubt the current group of enWikisourcerors have the capacity to work on another version/translation right now. Beeswaxcandle (talk) 03:30, 19 September 2018 (UTC)
yeah, if User:Temerarius wants to show up, and get started, i would help him. here is an 1844 edition [5], and a 1852 [6] but i am past patience with the aspirational directive form of collaboration. too many other backlogs on my to do list. Slowking4SvG's revenge 17:29, 19 September 2018 (UTC)
If it is something we should eventually have, then it can't hurt to set up the project. If others want to follow through, that's on them. BD2412 T 22:31, 19 September 2018 (UTC)
Without clarity about what is needed, I am not certain that we can give specific advice. We can give general advice: 1) Straight transcription belongs at laWS. 2) If scans exist then they can be translated in our Page: ns. 3) If you want it done, it is our experience it will require a team of interested and committed people and a project is the best means to coordinate such. — billinghurst sDrewth 22:36, 19 September 2018 (UTC)
I'm sorry I wasn't clear: I was inquiring about the Latin text. There are a number linked at, to which presumably no copyright restrictions apply. Temerarius (talk) 15:40, 22 September 2018 (UTC)
@Temerarius:The Latin text would need to be uploaded to the Latin Wikisource. And some of the Latin editions do have copyright restrictions. The Latin text is itself a translation from the Greek, attributed to Jerome, but the most recent revision was issued in the middle of the 20th century. That edition would still be protected by copyright. --EncycloPetey (talk) 17:28, 22 September 2018 (UTC)

The GFDL license on Commons

18:11, 20 September 2018 (UTC)

WEF framework / Wikidata gadget — confirm that it is again working

The WEF framework gadget has been reconfigured, so it broke here. I have played with the mediawiki configuration, and I believe that it is functional again. I would appreciate if someone can please confirm that it is working. Thanks. — billinghurst sDrewth 23:37, 20 September 2018 (UTC)

Tech News: 2018-39

15:23, 24 September 2018 (UTC)

Infoboxes on categories?

At Commons they have infoboxes that pull Wikidata, similarly to how we pull data for our headers; see {{wikidata infobox}} and an example of use at c:Category:Alfred Odgers. We have been less than active with our labelling of categories, and while some have the use of {{plain sister}}, others have nothing. In a Commons conversation it was asked whether it was of interest to us to utilise their schema for infoboxes. I am not averse to its use:


  • The modules that are utilised have some similar functionality with some of our existing modules
  • Commons coding fraternity is reasonably active, so there is opportunity for access to more lua coders skilled at pulling Wikidata
  • We are not the best categorisers, and generally don't do people categories

Thoughts? — billinghurst sDrewth 23:15, 24 September 2018 (UTC)

Commons has a preference setting that puts the categories at the top. It is easier to have an opinion if you can see what you are making the opinion on.--RaboKarbakian (talk) 00:26, 25 September 2018 (UTC)
Not sure what to do with that comment. Any preference that Commons has, we have. Any gadget that Commons has, we can have, or individually one can run using a configuration line in your Special:mypage/common.js. — billinghurst sDrewth 02:13, 25 September 2018 (UTC)
Our category naming and structure differ significantly from that used on Commons and the Wikipedias. --EncycloPetey (talk) 00:32, 25 September 2018 (UTC)
As with our author and portal namespace, local naming and structure is not particularly pertinent. All this would do is put the data boxes into place, and display Wikidata with matching data as we do in main, author, and portal nss. Presumably (though we should check) if a category is not connected to Wikidata, then its display would be 'hide'. — billinghurst sDrewth 02:13, 25 September 2018 (UTC)
I disagree. Category naming and structure here is vastly different from other projects, and the names in the infobox will match the Wikipedia/Commons model. With the Author namespace, we are dealing with the name of the author and relevant dates. There is no major disconnect in that case. But pulling the names of Categories from Wikidata will more likely confuse users than help, and will encourage the creation of parallel category structures here to match Wikipedia and Commons. Both are strong negative points for me. --EncycloPetey (talk) 02:33, 25 September 2018 (UTC)
I have copied it (and the required modules and their required modules (ad nauseam)) on User:Einstein95/sandbox. I particularly find the "Notable Work" field quite interesting; it might lead people to start adding texts that we don't have if they're fans of authors or genres. -Einstein95 (talk) 06:09, 25 September 2018 (UTC)
commons has a structured data on commons, and much work, standardizing metadata in templates.
we could increase links from wikisource to wikidata; could create items: "Wikisource author page" and "wikisource work" page, they have an index page item [8], with status [9], and wikisource links at footer.
we could increase our use of wikidata at author and work header pages. (will have to relax edition specification) we could modify author and header template to pull from wikidata. Slowking4SvG's revenge 20:28, 25 September 2018 (UTC)

Just a couple of brief comments (background: I wrote and maintain the infobox). It doesn't have to be used in categories - it should work equally well in other namespaces. It currently follows d:Property:P301 to go from category to topic items - if it would make sense to follow a different link to fetch the topic information then that should be possible. The infobox is built on modular code (Module:WikidataIB), so pieces of it (row lines, auto-categorisation, etc.) could be reused in e.g. {{Author}} (if that isn't the case already). I'm happy to provide help with using the infobox and WikidataIB if that's useful (on the parser function and bot editing sides - I don't know Lua). Thanks. Mike Peel (talk) 21:08, 25 September 2018 (UTC)

Comment On WP they use infoboxes in articles much like we use headers. Would it perhaps be more in keeping with the rest of WS if instead of category infoboxes we introduce category headers that perform similar function? —Beleg Tâl (talk) 23:38, 25 September 2018 (UTC)
I'd be mildly opposed to making our categories look like Author, Portal, and Work pages. Also, anyone who uses other projects will arrive with a certain expectation of what categories will look like, and doing something to violate that expectation seems a bad idea to me. --EncycloPetey (talk) 00:53, 26 September 2018 (UTC)
I strongly support this proposal, which will add semantic richness to our category pages. The templates work well, with overwhelming popular support, on Commons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 05:30, 26 September 2018 (UTC)

What is the correct date?

We have had a contributor newly add:

Declaration of education The announcement of Fifty Industrial Community College on 18 June 1997

which appears to be a Google-translated copy of the same work as

Declaration of education The announcement of Fifty Industrial Community College on 18 June 1996

Can we determine the correct date so as to effect a merger? --EncycloPetey (talk) 01:01, 26 September 2018 (UTC)

According to File:ฯพณฯสุขวิช รังสิตพล รัฐมนตรีว่าการกระทรวงศึกษาธิการประกาศจัดตั้งวิทยาลัยการอาชีพ.jpg, it gives the year as B.E. 2540, which corresponds to 1997 C.E. -Einstein95 (talk) 04:50, 26 September 2018 (UTC)
This is also shown on the Thai Wikipedia article:

พ.ศ. 2540 (ค.ศ.1997) - นายสุขวิช รังสิตพลประกาศจัดตั้งวิทยาลัยการอาชีพ 51 แห่ง

-Einstein95 (talk) 04:54, 26 September 2018 (UTC)
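
The year conversion used above can be checked with a few lines of Python. This is only a sketch: it assumes the modern Thai civil calendar, in which the Buddhist Era year is 543 ahead of the Common Era year, and the function name is made up for illustration.

```python
# Offset between the Thai Buddhist Era (B.E.) and the Common Era (C.E.),
# assuming the modern Thai civil calendar (year starting on 1 January).
BE_CE_OFFSET = 543


def be_to_ce(be_year: int) -> int:
    """Convert a Thai Buddhist Era year to a Common Era year."""
    return be_year - BE_CE_OFFSET


print(be_to_ce(2540))  # 1997
```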
Document itself says 1997. I have moved the first added to the 1997 space, though I think that it needs to be moved to the translation namespace, and we need to correct the case and grammar of the title. — billinghurst sDrewth 06:10, 26 September 2018 (UTC)

New Wikidata templates

The templates {{Reasonator}} and {{Scholia}} are available, for linking from Wikisource pages to representations of Wikidata items, where no suitable Wikipedia page is available as a target. They are modelled on {{WikiDark}}, with the same low-contrast links. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 05:26, 26 September 2018 (UTC)

Wikiscan Statistics

For general information: statistics for this site are available here, maintained by Wikimedia France. Hrishikes (talk) 03:31, 27 September 2018 (UTC)

thank you. here are some old format, continuously updated stats -- Slowking4SvG's revenge 13:37, 27 September 2018 (UTC)

hyphenated words

Just FYI: Such usage is broken now due to the phab:T104566 change in ProofreadPage (it was "queen mother" in transclusion before my change). Probably some review of pages ending with a hyphen (minus) is needed. Ankry (talk) 22:27, 29 September 2018 (UTC)

Er, you mean a hyphenated word broken at the hyphen at the page break? Like mother-in-law? You can use the {{hyphenated word start}} and {{hyphenated word end}} templates for that. Eg. If mother-in-law is broken like 'mother-' and 'in-law', then you can do: {{hws|mother|mother-in-law}} and {{hwe|in-law|mother-in-law}}.
So for queen-mother it would be {{hws|queen|queen-mother}} and {{hwe|mother|queen-mother}}.
Unless I've misunderstood what you're getting at? --Mukkakukaku (talk) 23:02, 29 September 2018 (UTC)
@Mukkakukaku: I mean, that the hyphen is now removed by software. So if it is intended to remain, its usage is broken. Ankry (talk) 23:06, 29 September 2018 (UTC)
In most cases, however, the removal makes things work as the editor intended (as most hyphens at the end are missing {{hws}}/{{hwe}}). Ankry (talk) 23:08, 29 September 2018 (UTC)
I'm still not sure what you're getting at? I've looked and it appears to be working as expected, and as we've described it at Help:Formatting conventions (section "hyphenated end of page words"). Do you have an example where it's not working this way? --Mukkakukaku (talk) 03:16, 30 September 2018 (UTC)
@Mukkakukaku: The change is this: say the word is beautiful. On first page end, you can write beauti- and on next page start, -ful. No template required. On transclusion, it will become beautiful. But for rendering actual hyphenated words like mother-in-law, you'll still need template use (hws/hwe). This is an alternative method. The templates still work. Hrishikes (talk) 03:51, 30 September 2018 (UTC)
Ankry is trying to say that originally "queen -" was written on one page and "mother" on the other, which rendered as "queen - mother". However, after the change in the proofreading software it renders as "queen mother", which is wrong, and so he changed it using the template hws. His suggestion that all pages ending with a hyphen should be reviewed because of the change, and possibly also corrected in the way he did here (which should probably have been done immediately after the change), sounds reasonable to me. --Jan Kameníček (talk) 07:12, 30 September 2018 (UTC)
There is also the possibility of trailing hyphens either within or at the end of dialog. We may need to look for those as well to be certain they are not affected. --EncycloPetey (talk) 15:25, 30 September 2018 (UTC)

Words hyphenated across pages in Wikisource are now joined

Hi, this is a message by Can da Lua as discussed here for wikisource communities

The ProofreadPage extension can now join together a word that is split between a page and the next.

In the past, when a page was ending with "concat-" and the next page was beginning with "enation", the resulting transclusion would have been "concat- enation", and a special template like d:Q15630535 had to be used to obtain the word "concatenation".

Now the default behavior has changed: the hyphen at the end of a page is suppressed and in this case no space is inserted, so the result of the transclusion will be: "concatenation", without the need of a template. The "joiner" character is defined by default as "-" (the regular hyphen), but it is possible to change this. A template may still be needed to deal with particular cases when the hyphen needs to be preserved.

Please share this information with your community.

MediaWiki message delivery (talk) 10:28, 30 September 2018 (UTC)
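
The joining behaviour described in the announcement can be pictured with a small Python sketch. It is not the ProofreadPage code; the function name and the whitespace handling are assumptions made for illustration, and only the default joiner "-" comes from the message above.

```python
def join_pages(pages, joiner="-"):
    """Join page texts roughly as the announcement describes.

    If the text so far ends with the joiner, the joiner is dropped and the
    next page is appended without a space; otherwise a space is inserted.
    """
    result = ""
    for index, text in enumerate(pages):
        text = text.strip()
        if index == 0:
            result = text
        elif joiner and result.endswith(joiner):
            # Suppress the trailing joiner and insert no space.
            result = result[: -len(joiner)] + text
        else:
            result = result + " " + text
    return result


print(join_pages(["concat-", "enation"]))  # concatenation
print(join_pages(["queen -", "mother"]))   # queen mother
```

The second call reproduces the case raised in the previous section: a hyphen that was meant to stay is removed as well, which is why a template or a character entity is still needed there.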

So no more {{hws}} except for special cases maybe. This is great! Maybe we can get something done about the em dashes ending at the end of the page also? Make it default join the words with the em dash intact? Jpez (talk) 11:16, 1 October 2018 (UTC)
Except now apparently we'll need to template the opposite use case, right? So instead of the hyphenated word start/end templates which would collapse the hyphen, we'll now need something to preserve the hyphen? (And, more complicatedly, will now have to go find all the places where we were relying on the old behavior to preserve the hyphen.) --Mukkakukaku (talk) 05:37, 3 October 2018 (UTC)
For preserving the hyphen, writing &#x2010 and semicolon will suffice, if the hyphen is not part of a combined word. Template use in case of combined word with hyphen. Hrishikes (talk) 15:44, 3 October 2018 (UTC)
I suspect even typing &#45; will work to preserve the hyphen too, won't it? —Mahāgaja (formerly Angr) · talk 17:01, 13 October 2018 (UTC)
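
For reference, the two entities mentioned above decode to different characters, which a short Python check makes visible. Whether the extension compares the raw wikitext or the decoded character is not established in this thread, so this sketch says nothing about which entity actually preserves the hyphen.

```python
import html
import unicodedata

# Decode each entity and report the code point and Unicode name.
for entity in ("&#x2010;", "&#45;"):
    char = html.unescape(entity)
    print(entity, hex(ord(char)), unicodedata.name(char))
```

The first entity is the character HYPHEN (U+2010); the second is HYPHEN-MINUS (U+002D), the same character as the default joiner.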

See Page:Morris-Jones Welsh Grammar 0125.png and Page:Morris-Jones Welsh Grammar 0126.png for what to do if an italicized word is split across a page boundary. The bottom of the first page requires ''gwar- (no closing double apostrophe) and the top of the second page requires <noinclude>''</noinclude>. Then the italicized word appears correctly in both Page: namespace and mainspace. If you close the double apostrophe at the bottom of the first page, the spell is broken and mainspace will show a hyphen followed by a space; see the bottom of Page:Morris-Jones Welsh Grammar 0079.png and its transclusion at A Welsh Grammar, Historical and Comparative/Phonology#80 for an example. —Mahāgaja (formerly Angr) · talk 17:17, 13 October 2018 (UTC)

Breaks around image-pages

This breaks where there is an intermediate image page, as on The Migration of Birds/Chapter 7. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:17, 9 October 2018 (UTC)

@Pigsonthewing: -- Please check whether it is OK now. Hrishikes (talk) 14:39, 9 October 2018 (UTC)
Thank you. It is, but I was leaving it so others could see the effect. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:47, 10 October 2018 (UTC)

Tech News: 2018-40

17:35, 1 October 2018 (UTC)

Encrypted PDF of PD book

The text of this book: [14] is out of copyright (Author:George Bramwell Evens, died 1943) but is only available as an encrypted PDF to "borrow". Does anyone have suggestions for uploading it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:35, 4 October 2018 (UTC)

Two points:
  • The original work is PD in the UK by the 70 pma rule, but is it PD in the US, as it was first published in 1932?
  • The copy you link to is a 2002 edition, so it's hardly surprising that access is restricted, if it contains copyrighted modern material.
BethNaught (talk) 20:58, 4 October 2018 (UTC)
not renewed here [15]; [16]; [17]; [18]; [19]; [20] and no hits at (after 1978 renewal) = i would say PD-US no renewal - do not see a 1932 scan at Internet Archive; i see there is a copy of 1946 edition at Drew University in New Jersey, and Michigan State University, i can drive down and scan a copy [21] - name your price. Slowking4SvG's revenge 21:44, 4 October 2018 (UTC)
The 2002 edition contains, AFAICT (and I'll check against my paper copy once I can access it), no new material. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:15, 5 October 2018 (UTC)
I see no evidence that it was ever published in the US, so the URAA would have made it publication+95 in the US, or in copyright in the US until 2028.--Prosfilaes (talk) 03:04, 5 October 2018 (UTC)
we can have that discussion on commons. Slowking4SvG's revenge 16:23, 5 October 2018 (UTC)
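
The two dates in this thread follow from simple arithmetic, sketched below in Python. The function names are invented for illustration, and the sketch assumes that both terms run to the end of the calendar year, so the work becomes free on 1 January of the following year. It does not decide which rule actually applies to this book.

```python
def pd_year_life_plus(death_year: int, term: int = 70) -> int:
    """First calendar year a work is public domain under a life + term rule."""
    return death_year + term + 1


def pd_year_publication_plus(publication_year: int, term: int = 95) -> int:
    """First calendar year a work is public domain under a publication + term rule."""
    return publication_year + term + 1


print(pd_year_life_plus(1943))         # 2014 (70 pma, author died 1943)
print(pd_year_publication_plus(1932))  # 2028 (publication + 95, published 1932)
```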

Problematic PDF: The Migration of Birds - Thomas A Coward - 1912

There is a problem with File:The Migration of Birds - Thomas A Coward - 1912.pdf; please see c:Commons:Village pump#Problem with PDF, and advise if you can. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:35, 5 October 2018 (UTC)

Some of the PDFs are overly compressed for display in MediaWiki. I think that either @Mukkakukaku or @Hrishikes has fixed some of these previously; I cannot remember who. We had one in the past couple of months that should be in the archives. We have a section further up to park broken files, for whatever reason. — billinghurst sDrewth 14:32, 5 October 2018 (UTC)
@Pigsonthewing: -- Done. OCR is not there, however. If you insist on an OCR layer, then I'll do some more experiments. Hrishikes (talk) 15:35, 5 October 2018 (UTC)
@Hrishikes: Working well now, thank you. What did you do to fix it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:17, 5 October 2018 (UTC)
The file was in pdf 1.5 format with compression. I resaved it in pdf 1.4 format without compression. Hrishikes (talk) 02:48, 6 October 2018 (UTC)
I guess it was a similar problem as --Jan Kameníček (talk) 09:52, 7 October 2018 (UTC)
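
For anyone wanting to check a file before uploading, the declared PDF version can be read from the first bytes of the file. The Python sketch below uses made-up function names and only looks at the header line; a file may also declare a later version inside its document catalog, which this does not inspect.

```python
import re


def pdf_version(header: bytes):
    """Return the version declared in a PDF header such as b'%PDF-1.5', or None."""
    match = re.match(rb"%PDF-(\d+\.\d+)", header)
    return match.group(1).decode("ascii") if match else None


def pdf_version_of_file(path):
    """Read the start of the file at `path` and return its declared PDF version."""
    with open(path, "rb") as handle:
        return pdf_version(handle.read(16))


print(pdf_version(b"%PDF-1.5\n"))  # 1.5
```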

Licence check: anonymous 1929 Australian article

Please can someone advise what licence should apply to this anonymous 1929 article, published in Australia: Examiner (Launceston, Tasmania)/1929/"A Romany in the Fields"? If there's a URAA issue, should it be moved to the Canadian site? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:51, 5 October 2018 (UTC)

{{PD-anon-1996|1929}} — billinghurst sDrewth 14:26, 5 October 2018 (UTC)
@Billinghurst: Thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:10, 5 October 2018 (UTC)

Index:Telegraphic Code to Insure Privacy and Secrecy in the Transmission of Telegrams.djvu

Text OCR cleaned up, anyone want to Proofread? ShakespeareFan00 (talk) 21:46, 5 October 2018 (UTC)

Telegraphic Code to Insure Privacy and Secrecy in the Transmission of Telegrams/Amounts

Mangled page numbers. It seems someone needs to rethink the module, so it is in fact COMPATIBLE with Proofread page, as currently once you are inside a section generated using {{aligned table}} the automatically generated page numbers aren't displayed correctly (Long term issue). ShakespeareFan00 (talk) 10:01, 7 October 2018 (UTC)

As has been discussed for an extended period, within templates put the table row markers at the beginning, rather than at the end. I have never understood why people close with a row open statement, especially at the end of a table, an extra row marker is like "why?" — billinghurst sDrewth 11:47, 7 October 2018 (UTC)
As an extra comment, the template itself says that this is problematic for page numbering, and it is your choice to continue to use the template. — billinghurst sDrewth 11:52, 7 October 2018 (UTC)
Yes, I know.. Sometimes it would be nice to have long term solutions, (It was me that documented the incompatibility originally). ShakespeareFan00 (talk) 12:38, 7 October 2018 (UTC)
The fix you suggest about row openings would need to be made in Module:Aligned table. All other table handling in the work is based on that template. I'm using it rather than direct table syntax because of concerns about transclude limits for table rows. ShakespeareFan00 (talk) 12:44, 7 October 2018 (UTC)
Yes, which is why I haven't fixed it. I can fix templates, LUA is beyond my capacity, or maybe that is my patience-level. — billinghurst sDrewth 13:07, 7 October 2018 (UTC)
Same here. I'll consider if a different approach might work. The work will need to be split into sections anyway.. ShakespeareFan00 (talk) 13:33, 7 October 2018 (UTC)

RFC: Automating "Wikipedia" link in Header if WD main topic is activated

I have been tromping through transcluding and WD'ing Dictionary of Indian Biography which has been proofread, though predominantly, not transcluded. Quite a number have Wikipedia articles, and it is pretty tiresome to transclude, then add WD, and identify whether they have a main subject link, then have to go back to the biographic article again. Whereas where I have added "main subject" to d:Q57008414, it would be my preference if the database pulled and automagically added the Wikipedia link, rather than the extra edit.

I am trying to identify any downsides to such an approach. Apart from wrong additions (which can equally happen here), about the only one that I can identify is if someone added more than one main subject, where we would be forced to choose one (if rank preferences were used), or maybe choose none and mark the item as problematic and needing resolution. Otherwise, I am unable to identify major stumbling blocks.

@Samwilson, @Mike Peel: from your WD experience, I am guessing that this is a relatively easy data pull. [Mike, this happens through {{plain sister}}, which is embedded within {{header}}, and in the main ns it is an indirect pull as it is a many-to-one relationship, unlike {{author}}, which plain sister does as a straight pull of the interwiki data.]

So I am seeking the community opinion on

  1. their thoughts on automating the linking;
  2. any hurdles for implementation; and
  3. the technical aspects for implementation.

Thanks. — billinghurst sDrewth 11:37, 7 October 2018 (UTC)

(comment) I think you mean this property? --Mukkakukaku (talk) 17:51, 7 October 2018 (UTC)


My first comment is that there are already many links in place for Wikipedia, so, as we have done for other migrated data, where the parameter is implemented within the existing header it overrides any WD data pull. This approach allows projects to work out what they wish to do with their data. It also allows us to identify where we have overrides in place (current situation for images and dates of life). — billinghurst sDrewth 11:43, 7 October 2018 (UTC)

  • Comment I can't see any issues where the data item will have a single "main subject" for biographical articles, but aren't there situations where we would pull information that isn't suitable, say for non-biographical non-dictionary data items? Some books will have a "main subject", but the WP article of primary interest is actually the WP article about the book, and not the article about the book's subject. --EncycloPetey (talk) 19:12, 7 October 2018 (UTC)
    Fully agree about biographical/people. Maybe that is part of our decision-making process. If it is "edition" d:Q3331189 it should follow one path; if it is an article, it should follow another path. Let us try mapping these. — billinghurst sDrewth 03:00, 8 October 2018 (UTC)
    This does not account for editions of articles though, nor articles which themselves have wikipedia articles. In my opinion, we should either a) have the article's wp link override the subject's wp link, or b) have two wp links (like how we have two commons links for gallery and category), or c) not use plain sister but perhaps have a special template for such cases. —Beleg Tâl (talk) 11:27, 8 October 2018 (UTC)
    Not just "editions" but also versions pages and translations pages for works, which will have any of several possible values for "instance of" (novel, poem, short story, etc.) --EncycloPetey (talk) 22:04, 9 October 2018 (UTC)
    Further to this, there are some works with multiple "main subjects", and this will need to be accounted for. —Beleg Tâl (talk) 00:15, 8 October 2018 (UTC)
    I am guessing that multiple main subjects are due to there being no single useful subject. To me, if one is given priority (higher ranking) then we show the preferred; if two are equal, maybe we ignore them, or maybe we flag them for review, and again they are not displayed. — billinghurst sDrewth 03:00, 8 October 2018 (UTC)
    An alternative: create a wikidata item for the group of multiple subjects and link to that from the article. —Beleg Tâl (talk) 11:27, 8 October 2018 (UTC)
    I think that this alternative happens from case 3 of flag as problematic, with a fix eventuating. — billinghurst sDrewth 14:23, 8 October 2018 (UTC)
  • The code for this is demo'd at User:Mike Peel/main topic - for Dictionary of Indian Biography/Aliverdi Khan, {{User:Mike Peel/main topic|qid=Q57008414}} will show Wikipedia, and if used without a QID then it will follow the page's sitelink. It should be straightforward to migrate this to Lua and to embed it into the appropriate templates directly (it's basically a couple of Lua module calls and an if statement - just written in parser functions rather than lua right now). Thanks. Mike Peel (talk) 07:00, 10 October 2018 (UTC)
    But, as noted above, this creates more than a few problems that have yet to be solved. --EncycloPetey (talk) 14:00, 11 October 2018 (UTC)
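
The selection rules floated in this thread (a local header parameter overrides Wikidata, a preferred-rank main subject wins, and ties are flagged rather than displayed) can be sketched in Python as below. This is not the demo at User:Mike Peel/main topic; the function name, the tuple layout and the placeholder item IDs are all hypothetical, and the open questions about editions, versions pages and works with their own Wikipedia article are not addressed.

```python
def pick_wikipedia_target(local_link, main_subjects):
    """Choose what the header's Wikipedia link should point to.

    local_link:    locally entered header parameter, or None if absent
    main_subjects: list of (item_id, rank) pairs, where rank is one of
                   'preferred', 'normal' or 'deprecated'
    Returns (target, needs_review).
    """
    # A locally entered link always overrides any Wikidata pull.
    if local_link:
        return local_link, False

    usable = [(item, rank) for item, rank in main_subjects if rank != "deprecated"]
    preferred = [item for item, rank in usable if rank == "preferred"]
    pool = preferred or [item for item, _ in usable]

    if not pool:
        return None, False  # nothing to show, nothing to fix
    if len(pool) == 1:
        return pool[0], False
    # Several equally ranked subjects: show none and flag for review.
    return None, True


# Placeholder item IDs, not real Wikidata items.
print(pick_wikipedia_target(None, [("Q_A", "normal")]))
print(pick_wikipedia_target(None, [("Q_A", "normal"), ("Q_B", "normal")]))
```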

BHL IDs

The w:Biodiversity Heritage Library is a rich source of out-of-copyright texts, and a good ally for Wikimedia projects. We store BHL author IDs in Wikidata, as P:4081. Can we add these IDs to {{Authority control}}? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:58, 8 October 2018 (UTC)

Looks like a change to Module:Authority control, which seems rather straight-forward. --Mukkakukaku (talk) 23:56, 8 October 2018 (UTC)
@Pigsonthewing: -- Done. It needed a change in the module, not the template. Hrishikes (talk) 15:22, 9 October 2018 (UTC)
Looks like it's working; thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:15, 10 October 2018 (UTC)

Tech News: 2018-41

23:38, 8 October 2018 (UTC)

Google Books PDF

What's the best way to upload the PDF available here - do we have a tool for that, like ia-upload? Note that it includes a Google Books cover sheet. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:48, 9 October 2018 (UTC)

@Pigsonthewing: -- It can be done with url2Commons tool. Hover the cursor over the "Ebook - Free" notice, then right click the pdf option and copy the link address. Use this as the url in the first box of url2Commons. Use the main Google Books address as the source url in the second box. OAuth authorization will be required. OAuth often shows failure in case of this tool. It is false failure. Keep the OAuth screen as it is and go to the tab having the tool window to complete the transfer. If you want to remove the frontsheet, you will need to download, edit and re-upload. Hrishikes (talk) 15:06, 9 October 2018 (UTC)
@Hrishikes: Thank you. The simulation failed, complaining about an invalid URL. I trimmed the "?" and everything after it, and then the simulation worked, but the upload failed with " ERROR: null". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:52, 9 October 2018 (UTC)
@Pigsonthewing: -- c:File:A Discourse on the Emigration of British Birds.pdf -- Hrishikes (talk) 16:19, 9 October 2018 (UTC)

Notable printers

The plate at Page:The birds of Tierra del Fuego - Richard Crawshay.djvu/180 (like others in the same work) was printed by West Newman & Co. I have created a Wikidata item for that company, d:Q57166684. What's the best way to link them? I've used {{Reasonator}}, for now, but am open to counter suggestions. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:25, 9 October 2018 (UTC)

We have created portals for some publishers, no problem for doing that for printers, though that has usually been for complete works. If it is just the images, then maybe Commons alone is sufficient. No need to replicate what is done better elsewhere. — billinghurst sDrewth 22:05, 10 October 2018 (UTC)

Index:Charlesjarrot nytimesarticle1907.jpg

I was just attempting to validate this single page article, but have encountered an issue where the source image text is cropped early. I found a link referencing back to the original source page, where the complete text can be found.

I have added in the missing few lines of the article, would somebody be able to recreate the source image for this page using the above link so that it can be completed?

Thanks Sp1nd01 (talk) 14:30, 9 October 2018 (UTC)

Done -Einstein95 (talk) 22:11, 9 October 2018 (UTC)
Thank you! Sp1nd01 (talk) 07:50, 11 October 2018 (UTC)

Two-page table

The Migration of Birds - Thomas A Coward - table from pages 92 + 93.jpg

I should be grateful if someone could verify the table on Page:The Migration of Birds - Thomas A Coward - 1912.pdf/114, which also includes data from Page:The Migration of Birds - Thomas A Coward - 1912.pdf/115, and advise on formatting. I have created a single image, above, to aid this.

Is this the best way to show a table which runs horizontally over two pages? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:45, 9 October 2018 (UTC)

For maps and tables that spread, it does sound best to handle them on one page, and comment on the second. It is one of the adaptations that makes sense to me. — billinghurst sDrewth 22:02, 10 October 2018 (UTC)

Match and Split bot

As reported by both @Jasonanaggie and @Beleg Tâl on Wikisource:Bot requests, the Match and Split functionality is not currently working. Going to @Phe-bot's page gives the message "match_and_split robot is not running. Please try again later." @Phe has been pinged at least twice about this. -Einstein95 (talk) 20:45, 9 October 2018 (UTC)

He has been active on another wiki recently, I have asked there if he was no longer supporting the tool whether it is something that we can migrate. — billinghurst sDrewth 22:00, 10 October 2018 (UTC)

Descriptions from Wikidata

In {{author}}, can we pull |description= from the English-language description in Wikidata, if there is no locally-entered value? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:14, 10 October 2018 (UTC)

It is my understanding (from an earlier time) that we cannot pull the description from the description field. We decided to not pull the occupation alone and continue to add our own description. <shrug> — billinghurst sDrewth 21:59, 10 October 2018 (UTC)
I am sure that's not (or is no longer) the case; no doubt User:Mike Peel can advise. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:29, 11 October 2018 (UTC)
As Mike is busy, @RexxS: for help, please. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:55, 14 October 2018 (UTC)
Try (for George Orwell, Q3335):
  • {{#invoke:WikidataIB |getDescription |qid=Q3335 |wikidata}} -> English author and journalist
Documentation is at Module:WikidataIB #Function getDescription. You could use it in a template something like this:
  • {{#invoke:WikidataIB |getDescription |qid={{{qid|}}} |{{{desc|wikidata}}} }}
That takes optional parameters |qid= and |desc=. If qid is omitted, then it uses the current page; while desc is a local description, which overrides the wikidata. You can supply |desc=none if you want to suppress the description. HTH --RexxS (talk) 17:51, 14 October 2018 (UTC)
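
The fallback order described for |desc= can be written out as a small Python sketch. It only mirrors the behaviour as stated in the comment above and is not the code of Module:WikidataIB; the function name is invented, and treating an empty value like the default is an assumption.

```python
def resolve_description(desc="wikidata", wikidata_description=None):
    """Return the description an author header would show.

    desc:                 local value; 'wikidata' means "use Wikidata",
                          'none' suppresses the description entirely
    wikidata_description: English description held on the Wikidata item, if any
    """
    if desc == "none":
        return ""
    # Assumption: an empty local value behaves like the default 'wikidata'.
    if desc and desc != "wikidata":
        return desc  # a local description overrides Wikidata
    return wikidata_description or ""


print(resolve_description("wikidata", "English author and journalist"))
print(resolve_description("A locally written description", "English author and journalist"))
```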
I have poked it into template:author/sandbox with an example visible in special:diff/8873313. I haven't done an example where we have no description, and pull from WD. — billinghurst sDrewth 20:29, 14 October 2018 (UTC)
Question, do we wish to track where we have used the WD description? — billinghurst sDrewth 01:53, 15 October 2018 (UTC)
@Pigsonthewing: It is only technically possible, and a fine idea. One of the benefits of having WD fill in blank fields is that the usual names of authors, as given in citations or a user's search, could be spilled out as synonyms of the author page's title here. — CYGNIS INSIGNIS 11:33, 14 October 2018 (UTC)
Additional comment: What would appear in the description field here that is not data or facts, better served at the respective sister sites? Errant content forking across wikimedia is one of the things WD can resolve, and there is no mechanism to address it here if a description or fact is given without references other than bluff. I would prefer that author pages function as a library index card, merely links to sources with all relevant and labelled data providing disambiguating context for the reader. — CYGNIS INSIGNIS 04:22, 15 October 2018 (UTC)
The descriptions at Wikidata are sometimes too brief, sometimes overly verbose. We do often want information in the description such as pseudonyms, pen names, other forms of their names, as well as, in some cases, the names of close colleagues, family members they might be confused with, or information specific to their status as author rather than whatever else they might be known for. I've seen all of these things and more placed in our descriptions, but they are not generally included in the description field at Wikidata. Neither can we wikilink or bold portions of text pulled from the Wikidata description. --EncycloPetey (talk) 04:27, 15 October 2018 (UTC)
I was not very clear on how I think the pages should be configured; it is very different to the individual creation and maintenance of the information by users here. A key point in my upcoming proposals is that labelled data is the solution to untidy workarounds and verbosity. That information is available in other statements at the WD item with a reference; each site and reader would be able to choose a preference for what is displayed by default and an opportunity to gather or access further information. This allows any whim to be fulfilled by being able to create a query across wikimedia: are there incomplete books here or at commons, who are the coauthors, who are the notable collaborators, what was their birth name …? I cannot think of an example of an author page I created or modified here that required a unique and unreferenced description, only those that required me to manually copy paste data from other sites. — CYGNIS INSIGNIS 05:52, 15 October 2018 (UTC)
And you will still be able to do all of those things. Using Wikidata descriptions would be a mere fallback, for where none is provided locally. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:00, 15 October 2018 (UTC)
Comment: There will be some situations where we want more information than is in the description, and there are some authors for whom the description maintained on Wikidata does not meet our needs, but for the majority of situations, I don't see why we wouldn't want to do so. --EncycloPetey (talk) 15:35, 14 October 2018 (UTC)
Yes, I did specifically say "...if there is no locally-entered value". But where we currently have none, surely something from Wikidata is an improvement. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:54, 14 October 2018 (UTC)

Natural History of the Nightingale

Natural History of the Nightingale is ready for a second set of eyes, if anyone has time to kindly check it over. There is a gallery of images of the original publication on the talk page.

It's quite complex, being originally spread over two issues; and a lengthy footnote, that includes a subordinate footnote. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:01, 10 October 2018 (UTC)

Is there a reason this wasn't worked as a transcription project using those images in an index? --Mukkakukaku (talk) 03:05, 15 October 2018 (UTC)
There is potential for improvement, I would be proof reading it now if the images were in an index, but if a knowledgeable user with access to hathi trust were to bring it over … Is there a reason why that would not be the simple solution? — CYGNIS INSIGNIS 06:26, 15 October 2018 (UTC)
A perhaps not-so-elegant solution(?): Create a single PDF file using the available images, upload to IA, then Commons &c. Unless IA is *still* not generating DjVu files? Londonjackbooks (talk) 06:57, 15 October 2018 (UTC)
Another solution is to check if the file is already hosted at IA, but I can't do that and type this message. I'm also limping along on an antique that is allergic to pdf, as am I, limping and allergic. It's a quandary … CYGNIS INSIGNIS 07:25, 15 October 2018 (UTC) P.S. A very enjoyable text, Andy, nicely sourced, transcribed and linked. Any note within a note will be resolvable in the Page: namespace, but I'm wondering if another was missed; a dagger † often refers to the second footnote of a page, following the use of an asterisk * — CYGNIS INSIGNIS 07:44, 15 October 2018 (UTC)
Couldn't find it at IA searching text. It would be doubtful anyway (wouldn't it?) for both sections to be pieced together at IA unless someone had taken the pains to do so. I am limping, but my computer is not; and neither of us are (is?) allergic. I can't get to it this minute, but maybe later today, unless/until someone else gets to working out a solution. Londonjackbooks (talk) 07:54, 15 October 2018 (UTC)
Is the watermarking problematic? I can't do anything about that for most images. Londonjackbooks (talk) 08:35, 15 October 2018 (UTC)
Went ahead and uploaded a file to IA. Don't see a 'regular' djvu file derived (forgive my questionable terminology). I have failed in the past to do the pdf to djvu conversion via Commons (or wherever)... If we can get it to Commons, I can set it up here, but confess I have yet to learn how to do match and split (not without an offer to help). Soup sandwich I am :) Londonjackbooks (talk) 17:11, 15 October 2018 (UTC)
DjVu file is now at Commons. Will create an Index here. Londonjackbooks (talk) 05:50, 16 October 2018 (UTC)
What is the point of this, when the work is already proofread and published? I simply asked for someone to "check it over". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:12, 16 October 2018 (UTC)
I don't regard it as obligatory, although I personally have a strong preference for localised images. The few times I proofread something the old way did not encourage me to do it again (most users cut/paste Gutenberg texts), but I have checked thousands of pages since. This includes a couple of small improvements to this text, that I frankly would not have bothered to do if it were not for the scan. Secondly, verifiability, crucial to the work we do here: confirming that the text matches makes it easier for London, for example, to find all the things I miss. — CYGNIS INSIGNIS 17:13, 16 October 2018 (UTC)
Some even {{ls}}eem to be po{{ls}}{{ls}}e{{ls}}{{ls}}ed of a different {{ls}}ong from the re{{ls}}t, and contend with each other with great ardor.
Proof reading is a bit of a challenge, is the preference that the long esses appear in main? — CYGNIS INSIGNIS 12:18, 16 October 2018 (UTC)
  • Generally preferred not to have long s in mainspace, but it's up to you and the other proofreaders of the work. You can post your discussion and decision here. —Beleg Tâl (talk) 12:33, 16 October 2018 (UTC)
    • Er, as I transcribed and proofread the entire work, isn't it up to me? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:12, 16 October 2018 (UTC)
      • That is the practice, but the guidelines say that deviation from whatever is deemed 'standard'—by consensus or otherwise—is liable to be challenged by others. I've only done one work with long esses, just one, mind; the argument against their display became even more persuasive. — CYGNIS INSIGNIS 17:21, 16 October 2018 (UTC) P. S. the reason I ask is that I will need to replace the template with the character, which is trivial when compared with your investment in applying them. — CYGNIS INSIGNIS 17:26, 16 October 2018 (UTC)
        • "I will need to replace..." You will? Why? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:22, 16 October 2018 (UTC)
          • The shorter version of the protracted chapter in WS history is that the template does not display in mainspace [!] unless you install a script to show them, although there was an idea to put another option in the sidebar (maybe this has been implemented). In my view the existence of Template:ls is unhelpful, a user is either using it or not; my recommendation is to always check any template documentation. — CYGNIS INSIGNIS 21:03, 16 October 2018 (UTC)
            • So, no "need" to remove it, then. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:27, 16 October 2018 (UTC)
              • If that is your decision. Just to be clear, I was not proposing to remove it: "I will need to replace the template with the character …", that is, replace the template with the character itself. — CYGNIS INSIGNIS 21:48, 16 October 2018 (UTC)
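The replacement discussed in this thread, swapping each {{ls}} call for the long s character itself, amounts to a plain string substitution. A minimal Python sketch follows, assuming the template is always written exactly as `{{ls}}`; the function name `replace_long_s` and the `modernise` option are hypothetical.

```python
LONG_S_TEMPLATE = "{{ls}}"
LONG_S_CHAR = "\u017f"  # LATIN SMALL LETTER LONG S


def replace_long_s(wikitext: str, modernise: bool = False) -> str:
    """Replace every {{ls}} call with the long s character.

    With modernise=True an ordinary "s" is substituted instead, which
    corresponds to the alternative of not showing long s at all.
    Assumption: only the exact string "{{ls}}" is handled.
    """
    replacement = "s" if modernise else LONG_S_CHAR
    return wikitext.replace(LONG_S_TEMPLATE, replacement)


sample = (
    "Some even {{ls}}eem to be po{{ls}}{{ls}}e{{ls}}{{ls}}ed of a different "
    "{{ls}}ong from the re{{ls}}t"
)
converted = replace_long_s(sample)
modern = replace_long_s(sample, modernise=True)
```

Applied to the sample line quoted above, `converted` contains "poſſeſſed" and `modern` contains "possessed".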

Big untranscluded works

I am just looking at some of the untranscluded works that we have, and these are some of the biggies that need addressing

Top 5
Migration from text only to image-based

Probably a couple of thousand pages here. — billinghurst sDrewth 04:59, 11 October 2018 (UTC)

The US Statutes are a nightmare because of the weird templates they use in mainspace. Is there a list of untranscluded works somewhere? I've never found one. --Mukkakukaku (talk) 04:53, 12 October 2018 (UTC)
Also, I'll take a stab at A General History... so we don't all go stepping on each other. :) --Mukkakukaku (talk) 04:54, 12 October 2018 (UTC)
We have category:Transclusion check required which is proofread works with something needing to be done. There is also the active list generated at toollabs:phetools, though that is a listing of untranscluded pages, irrespective of the status of the work. — billinghurst sDrewth 05:07, 12 October 2018 (UTC)

Narrow no-break space for contractions?

This is something I've thought about for a long time and would like to hear what people think. In a lot of old books contractions like 'll for will and 's for is are preceded by a narrow space that never breaks for a new line. Until now I've been deleting the space (or following whatever's the trend on projects that are already well advanced, usually an ordinary space), but felt I should really be using u+202f. Anyway, on this page from Oliver Twist there's a good example of why it's important: "Why, a beak 's a madgst'rate; and when you walk by a beak's order, …" where there's a contrast between the possessive "beak's" and the contraction "beak 's". That page I proofed entering the unicode character directly, and the following page I used the entity &#x202f; … Does anyone have advice on which would be better to use? Or should I just use &nbsp; which would be less confusing for validators? — Mudbringer (talk) 01:58, 12 October 2018 (UTC)

I think that any of those options are acceptable so long as it's consistent within a work. I personally would probably use nonbreaking space if I were to use a space at all. I like the idea of narrow nobreaking space if you're willing to put in the effort for it. —Beleg Tâl (talk) 23:22, 12 October 2018 (UTC)
For dialectical speech, I tend to use a full space. I tend to treat such instances of elision differently from contractions. I've come across cases where a half-space is used in the source, but also cases where a full space is inserted. There are also situations where "connecting" the two parts with a non-breaking space would imply a connection not implied in the source text. For example, consider the final paragraph on this page, especially the phrase fellers 'll which is divided between two lines in the final paragraph. Using a non-breaking space when the source text allows for a line break in such a place would not be faithful to the style in which the original was printed. --EncycloPetey (talk) 01:43, 13 October 2018 (UTC)
Thank you both for the comments. The example from Red Badge of Courage is very interesting. In that work I found 'd 'll 'm 'n 're 's 've with elided initial vowel, which all appear both connected with the previous word and with an intervening space, often for the same combination of forms, such as we 're and we're. The only example of one of those appearing following a line break is the case of feller 'll that you pointed out. In Oliver Twist there's also variation between inserting spaces before these forms and joining them to the previous word, but I can find no cases in the three volumes where they appear after linebreak. …… Does anyone directly insert no-break spaces, or is it better to use the html entity? — Mudbringer (talk) 15:29, 13 October 2018 (UTC)
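For anyone weighing the options raised in this thread, here is a minimal Python sketch of how the space before the elided forms listed above ('d 'll 'm 'n 're 's 've) could be converted either to the literal U+202F character or to the &#x202f; entity. The function name `narrow_contraction_spaces` and the `as_entity` flag are hypothetical, and the pattern assumes an ordinary single space in the proofread text.

```python
import re

NARROW_NBSP = "\u202f"       # NARROW NO-BREAK SPACE, entered directly
NARROW_NBSP_ENTITY = "&#x202f;"  # the same character as an HTML entity

# A single space that follows a word character and precedes one of the
# elided forms, written with a straight or a curly apostrophe.
_CONTRACTION_SPACE = re.compile(
    r"(?<=\w) (?=['\u2019](?:d|ll|m|n|re|s|ve)\b)"
)


def narrow_contraction_spaces(text: str, as_entity: bool = False) -> str:
    """Swap the space before 'd 'll 'm 'n 're 's 've for U+202F.

    Joined forms such as "beak's" or "we're" contain no space and are
    left untouched, so the possessive/contraction contrast is preserved.
    """
    replacement = NARROW_NBSP_ENTITY if as_entity else NARROW_NBSP
    return _CONTRACTION_SPACE.sub(replacement, text)


line = "Why, a beak 's a madgst'rate; and when you walk by a beak's order"
with_character = narrow_contraction_spaces(line)
with_entity = narrow_contraction_spaces(line, as_entity=True)
```

Both outputs render identically; the entity form is simply easier for a validator to spot in the edit window. A blanket substitution like this would also join cases such as "fellers 'll" where the source allows a line break, so it is probably best run per work after checking the scans.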
I would look at later editions of the same text for clues to transcription, the fashion for thin spacing to indicate a semantic distinction from a regular space did n't last. I have seen them slipped in and out to aid with the justification of the text block, as if the typesetter was in two minds about the whole business. The trend for justified text (i.e. flush left and right margins), which lasted much longer, was always going to confound the practice; requiring variable width 'full spaces' to be read as distinct from the thinnest space between the type. — CYGNIS INSIGNIS 00:04, 14 October 2018 (UTC)

Portal:Renaissance texts

We were lacking a Portal for the Renaissance period from our list at Portal:Era, so I began one.

Anyone with an interest in texts from this period (c. 1420-1630) please feel welcome to improve the meagre start that I've made. --EncycloPetey (talk) 00:13, 14 October 2018 (UTC)

Cool image

[Image: Proofread heart.jpg]

Kudos to our friends on the Polish Wikisource for creating this image! Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:37, 15 October 2018 (UTC)

Support ends for the 2006 wikitext editor

This toolbar is being removed from MediaWiki.

The 2006 wikitext editor will be officially removed next week, on the normal deployment train (i.e., Wednesday, 24 October 2018 for the Wikisources). This has been discussed since at least 2011, was planned for three different dates in 2017, and is finally happening.

If you are using this toolbar (and most of you aren't), then you will be given no toolbar at all (the 2003 wikitext editor). This default was chosen so that your editing windows will open even faster, and to avoid cluttering the window with the larger toolbars (a particularly important consideration for Wikisource's PagePreviews). Of course, if you decide that you would prefer the 2010 or 2017 wikitext editors (or a gadget like WikEd), then you are free to change your preferences at any time.

Although it is not a very popular script overall, I know that some editors prefer this particular tool. If you are one of its fans, then you might want to know that some long-time editors are talking about re-implementing its best features as a volunteer-supported user script. I believe that any announcements about that project will be made at mw:Contributors/Projects/Removal of the 2006 wikitext editor. Whatamidoing (WMF) (talk) 17:48, 15 October 2018 (UTC)

Tech News: 2018-42

22:40, 15 October 2018 (UTC)

Frankenstein, or the Modern Prometheus (Revised Edition, 1831) chapter links

I changed the fake ToC on the main page to use {{AuxTOC}} and changed the links to point to arabic numeral numbering rather than roman, e.g. "Chapter 4" over "Chapter IV". The only issue now is that the recently made pages are at the roman-numeral titles ("Chapter IV"), but there are already existing, non-scan-backed pages named with the arabic numerals, so I am unable to move the pages over them (due to lacking permissions). Can someone either move the pages or do a mass edit of the non-scan-backed pages (1-24) to basically contain the content from the recently made pages (I-XXIV) and redirect the roman numeral pages to the arabic numeral ones? -Einstein95 (talk) 03:05, 16 October 2018 (UTC)
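To make the request concrete, the 24 source and target titles can be generated rather than typed by hand. This is a minimal Python sketch that only builds the list of title pairs and performs no edits; the "/Chapter N" subpage pattern and the helper names `to_roman` and `chapter_moves` are assumptions for illustration.

```python
from typing import List, Tuple

_ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert a positive integer to an upper-case roman numeral."""
    if number <= 0:
        raise ValueError("roman numerals need a positive integer")
    parts = []
    for value, symbol in _ROMAN:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def chapter_moves(base: str, first: int = 1, last: int = 24) -> List[Tuple[str, str]]:
    """Return (roman-numeral title, arabic-numeral title) pairs.

    Assumption: chapters live at "<base>/Chapter <numeral>".
    """
    return [
        (f"{base}/Chapter {to_roman(n)}", f"{base}/Chapter {n}")
        for n in range(first, last + 1)
    ]


BASE = "Frankenstein, or the Modern Prometheus (Revised Edition, 1831)"
moves = chapter_moves(BASE)
```

Each pair in `moves` gives the roman-numeral page to move (or redirect) and the arabic-numeral title it should end up at; the list could be handed to whoever has the rights to move over existing pages.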