User talk:Billinghurst

A harp which sounds too good to be true is probably a lyre
This user has alternate accounts named SDrewthbot & SDrewth.
billinghurst (talk page)

(Archives index, Last archive) IRC cloak request: I confirm that my freenode nick is sDrewth
Note: Please use informative section titles that give some indication of the message.

Wikisource has a number of active Wikiprojects that could use
your help in tackling these large additions to our library.

Popular Science Monthly Project
Work: Popular Science Monthly

Note to self (export)

(pastes from conversation)

  • mw:API:Parsing wikitext
  • //
  • then it uses the ws-noexport class to strip unneeded material from the HTML; HTML was easier to use as output, since EPUBs are HTML in a zip file, and many tools exist to convert HTML to other formats

billinghurst sDrewth 14:50, 4 December 2012 (UTC)
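
A minimal sketch of the ws-noexport tidy step mentioned in the pasted note, assuming the rendered page HTML has already been fetched (for example through the parse API linked above). The names NoExportStripper and strip_noexport are hypothetical and this is not the export tool's actual code; it only illustrates dropping every element that carries the class, using Python's standard html.parser. Comments and doctype declarations are also dropped in this sketch.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NoExportStripper(HTMLParser):
    """Re-emit HTML while skipping any element marked with skip_class."""

    def __init__(self, skip_class="ws-noexport"):
        super().__init__(convert_charrefs=False)
        self.skip_class = skip_class
        self.out = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def _is_marked(self, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return self.skip_class in classes

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if self._is_marked(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a nesting level.
        if self.skip_depth or self._is_marked(attrs):
            return
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_noexport(html):
    """Return html with every ws-noexport element (and its children) removed."""
    parser = NoExportStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

The sketch assumes reasonably well-formed HTML; unbalanced tags inside a skipped element would throw the depth counter off.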


  • When building Tables of contents and Lists of Illustrations, the title components need to be included within the table |+ ... as otherwise the page breaks after the title and before the table, d'oh!
  • asked Tpt about the attribution page, and how to edit, and to correct a typo
  • epubreader (reasonable in-browser app for FF)

TO DO — DNB footer initials

Obliterating previous claim(s) to authorities.

Please consider using {{authority control|$1}} instead of {{authority control}} when replacing a pre-existing set of imports/additions to this template. It does not interfere in any way with any data "collection" or rendering and the like, but actually helps matters by providing some sense of an "anchor" for robot/gadget/crawl utilization.

Think of this as the last {{{ }}} in a string of {{{ | {{{ | {{{ | }}} | }}} }}} 's but with no " | " in the last one & the resulting behavior it causes. Thanks. -- George Orwell III (talk)

That is an unnecessary level of complication. If it needs it, then code it. Trying to and needing to explain its use in that way seems nonsensical. Let us keep it simple. — billinghurst sDrewth 08:58, 30 March 2014 (UTC)
Deleting the pre-existing, human-applied info is driving the need for this so-called complication, not I, so please reconsider taking that approach. The amount of "work" needed to accomplish the same effect code-wise & on the fly is a galactic waste of time and energy. The next incarnation will soon be upon us (see Module:WikidataF) and "we" are already throwing away huge amounts of localized research just for the sake of what only appears to be 2nd phase progress.

Just tell me where/which script/toolbar you are loading this parameter-less template from and I will modify it for you. Nice and simple. It will most likely be obsolete in a week or two to boot! -- George Orwell III (talk) 09:55, 30 March 2014 (UTC)

What is your issue today? Obliterating, deleting, complication ... can you please move away from the rhetorical to any specific issues and problems that are caused by my editing? From what I am seeing the pages that I leave have more or the same data and links, ie. no loss of functionality. All the data and more is now in Wikidata, and it gets there by a human. Nothing is thrown away. Your statements about localised links don't address the issue of link loss when pages are moved, deleted, or relocated at the other wikis, so that is a specious argument. This is all about making things as simple as possible, and with minimal maintenance. Ideally we should be completely removing the visible aspects of the sister links which can be managed by WD, and data pull. — billinghurst sDrewth 10:11, 30 March 2014 (UTC)
You've changed the point to somehow make this "my issue" because of "my words" when you are the one actually removing stuff that you claim won't matter either way, nor make any difference at the end of the day, so I kind of know where continuing any of this is going to go already (nowhere fast). I give up. I retract my request(s). Delete away.

One thing I must insist you try the next time you need to expand or move an Author: page, however: please locate the 'Wikisource: #######' entry in the existing AuthCont template bar and copy down the URL linkage & number-string within it before you "actually" move anything. Of course, verify the link actually works while you are at it. Finally, check the same URL link and associated-ID aspects after you make your move(s). Before & after should be exactly the same & clicking on it should take you to the same page too. Nobody had to amend or detect anything to accomplish that. Would that have been simple enough for you?

Try to have a good day there anyway Mr. Lost-Links. :) George Orwell III (talk) 13:08, 30 March 2014 (UTC)

Bilingual Swedish-Latin book on spiders

Hi Billinghurst, as briefly mentioned at Wikimania, I would appreciate help with setting up File:Clerck 1757 Svenska Spindlar - Aranei Svecici.pdf for transcription, which I tried here and suggested here and here, all unsuccessfully so far. Thanks and cheers, -- Daniel Mietchen (talk) 02:06, 14 August 2014 (UTC)

Hi @Daniel Mietchen:. It looks like the framework may not be properly set at oldwikisource, definitely something not right in the background. Let me ping @Zyephyrus, Tpt: to see if they can help immediately, otherwise, I will have a look when I get home in a few days (tablet is less than ideal for that comparative work). — billinghurst sDrewth 06:00, 14 August 2014 (UTC)
Created here and here, but I have no text, it must be added. --Zyephyrus (talk) 21:12, 16 August 2014 (UTC)
I was poking around a bit for help on how to do an OCR (especially for Latin/ Swedish) for Wikisource but did not find any. Do you or Zyephyrus or anyone here have some pointers? -- Daniel Mietchen (talk) 20:19, 6 September 2014 (UTC)
@Daniel Mietchen: Must be too obvious … under the "Proofreading tools" button on the toolbar (traditional or advanced) there is a big button labelled "OCR". Press it and it OCRs. — billinghurst sDrewth 00:40, 7 September 2014 (UTC)
Note en passant: Make sure the Preferences/Gadgets/Editing tools for Page: namespace/"Disable OCR button in Page: namespace" checkbox is unticked. AuFCL (talk) 02:59, 7 September 2014 (UTC)
Haven't been in the Page namespace much, so it was not obvious to me. Found it now on the English and Swedish Wikisources, but not on the Latin one (I checked for gadgets to be set in the preferences). Anyway, since the Swedish alphabet contains the Latin one, it would seem possible to just do the OCR on the Swedish Wikisource and then transfer to the Latin one. -- Daniel Mietchen (talk) 22:19, 7 September 2014 (UTC)
It is there in both, I did check prior to stating, maybe just push the page again. The OCR tool is loaded through "Mediawiki:Common.js", and gadgets are just there to offer to turn off the button in the toolbar. I think that it is on for all WS wikis, though you can always load it through your m:Special:MyPage/global.js, see mw:Extension:GlobalCssJs. billinghurst sDrewth 23:12, 7 September 2014 (UTC)
The OCR button is hidden in the Proofread tools group, when you expand it other groups disappear: can you find that too? --Zyephyrus (talk) 23:25, 7 September 2014 (UTC)
OK, found it now, thanks - seems the icon simply hadn't loaded. Is the OCR customized to the respective language version? Results for neither Latin nor Swedish are very promising for this book. -- Daniel Mietchen (talk) 23:59, 7 September 2014 (UTC)
@Phe: are you able to provide better feedback on Daniel's question? — billinghurst sDrewth 03:18, 8 September 2014 (UTC)
The language is customized depending on the site the OCR request comes from. --> OCR in English etc., except for where the OCR uses Italian as the language as there is no Latin package. Actually the language can't be chosen on a per-book basis. — Phe 11:26, 8 September 2014 (UTC)
@Phe: @Daniel Mietchen: I'm guessing that the Swedish package is based on 20th and 21st century Swedish texts. That is not really suitable for an antiqua text from the 18th century, which is very different in many respects: long s, different 'ä' and 'ö' types etc. That makes it hard to get a good OCR anyway, but besides that the facsimile isn't that good. The text from the opposite side has bled through the page, which makes for a lot of artifacts on the page that disrupt the OCRing. --Thurs (talk) 20:07, 8 September 2014 (UTC)
I have been known to save a page image (as jpg) and try one of the external OCR sites for a comparison. Sometimes those sites do better than we do. Sometimes if the work has not been through Internet Archive it is worth putting it through their derivation processes. Sometimes IA's derivation process does a crap job, and asking them to rederive a work can bring different results. Sometimes, it all is a bit too hard for whichever process. — billinghurst sDrewth 01:02, 9 September 2014 (UTC)

Tech News: 2014-36

07:48, 1 September 2014 (UTC)

Author:Grace Granville

Did you notice the discrepancy in birth dates? On here it still says ca. 1667. I'm not sure which one is correct, although 1654 seems the right one. The editor possibly used this source. --Azertus (talk) 09:26, 2 September 2014 (UTC)

Yes, I noticed it, and that is what led me to remove it at WD from what I had added from enWP, which was unsourced. I am going to leave it vague here, just hadn't got there yet, and who knows which is correct. Family history data shows both years, and my quick check of references doesn't show anything definitive. — billinghurst sDrewth 14:36, 2 September 2014 (UTC)

Tech News: 2014-37

09:33, 8 September 2014 (UTC)

Thanks for the welcome

Hello Billinghurst, thanks very much for the warm welcome and guidance. I'm very new to wikis so apologies for the elementary errors. Thanks too for the response to the help request. As mentioned I'm part of a wider group of sociologists who have been working on transcripts of military incidents - we will produce other modified transcripts (we usually work on them for around a year each) but the one mentioned is our most developed. In terms of contributing to discussions on transcripts, how is it best to do that? You can see the way we have done it here. Michaelmair (talk) 13:54, 8 September 2014 (UTC)

@Michaelmair: We were all new at wikis at one point, so we try to do guidance, and hopefully teach, rather than complain. Politely slap us, if we stray or forget that.
We see WS as the source/library, such as "here is the original text that passed contemporary peer review at the time of publication", rather than as the encyclopaedia, or place of analysis. Such that the body of a text is meant to be that source, headers of works can provide some neutral context, scene setting and linkage (even to talk pages). Talk pages themselves can provide further context, and links to pertinent off-site analysis. So with that background, if you are talking about correcting errors in transcriptions, then fix them, and add pertinent comment to the source (as I said before). If you mean discussions about new transcripts, then where you were is a good starting place (we try to keep help simple), and we can direct from there, of course noting Wikisource:What Wikisource includes. If you meant discussions about analysis of the transcript, if it is a published work and in the public domain, then it is ours, otherwise it is not our bailiwick, and may be more relevant for our sister sites wikinews or wikibooks and then we link between them to discuss source <-> news <-> annotation/book; and of course Wikipedia for encyclopaedic review.
Does that answer your questions? Or have I missed your point? — billinghurst sDrewth 00:29, 9 September 2014 (UTC)

A compromise?

Hey, I've been thinking about our discussion about linking in archived scientific works and I think I've reached a compromise that will please both of us. Check out that example page again. If it looks the same as it did before bypass your cache (ctrl+F5 in Firefox or Chrome). Does this satisfy us both? Abyssal (talk) 13:33, 8 September 2014 (UTC)

@Abyssal: Personal opinion: there is more than I would link in a work if I was doing it. There are still words linked that I consider common words, and I believe anyone reading this technical work (the audience) would know those words (eg. cervical, suture, ...). That said, I (generally) would not unlink words if I came to validate the work. So if I came to the work now, my expectations would be … a consistency in linking style/approach, ie. not seeing the odd page in the work having the linking, AND how does a chapter look when it is transcluded to the main namespace? Is it a sea of links that swamp the work, or does it seem appropriate?
In the end, this is your effort, and you are the lead reproducer, and at enWS we have an approach that respects the lead contributor for a work. So I am not going to be (needlessly) critical and will (hopefully) rather provide an observer's reflective critique. Maybe you can get an outsider's view of what is the balance, and as a community that is probably the point of view needed, not managing my expectations. In the end, that you read and consider my thoughts, and seek other opinion, is all that I ask, and what you decide is okay if we are getting the consistency, and the overarching view. Thanks for asking. — billinghurst sDrewth 00:45, 9 September 2014 (UTC)
Do you think the change to darker blue, less conspicuous links was an improvement, though? I was hoping to strike a balance between readability and integration with other Wikimedia content. Abyssal (talk) 01:43, 9 September 2014 (UTC)
No. It isn't our practice to change link colours to suit our needs to lessen the visual impact of overlinking, we have left links as the system defaults. Changing the colours doesn't change the basis of my issue. — billinghurst sDrewth 11:42, 9 September 2014 (UTC)
OK. I modeled the change to dark blue links on Template:Wg. I didn't know that kind of thing was controversial. A philosophical question in response to your distaste for my use of links: what is the drawback of a link if it's not drawing attention to itself by being the default bright blue color? Abyssal (talk) 12:08, 9 September 2014 (UTC)
My philosophical point about links and templates has been addressed as a more general discussion at WS:S. I think that more general discussion is more useful for that component and becomes not my opinion alone, and I keep my responses here relevant to my opinion about the pages in question, that hopefully separates philosophy from specifics. I hope that is okay. — billinghurst sDrewth 00:50, 10 September 2014 (UTC)

" Where's the Beef ? aka short "cuts"

Recently your bot changed three SHSP pages and I am wondering why short cuts were substituted or removed. I looked at Template:RunningHeader and saw that the "shortcut" of the beef (rh) is okay to use rather than RunningHeader. I also saw the following was changed.

 SDrewthbot (Talk | contribs)
m (expand diacritical templates, replaced: {{rh| → {{RunningHeader|, {{hws| → {{hyphenated word start|, {{hwe| → {{hyphenated word end|, œ → {{subst:oe}} using AWB)

I need to know if the short "cuts" are still allowed. Respectfully, —Maury (talk) 23:57, 10 September 2014 (UTC)

Sure, shortcuts are still allowed, whatever gave you the idea that they weren't? I was doing the expansions, and the others are just general maintenance in passing. Not certain what your issue is. Ideally we would subst all the shortcut templates to expand normally, but I have never bothered to go that route. — billinghurst sDrewth 00:06, 11 September 2014 (UTC)
There is no issue now that I have read your statement about the above. However, rules and codes sometimes change fast on wikisource and your bot's changes sparked my question. Don't worry, be happy. Kindest regards, —Maury (talk) 00:20, 11 September 2014 (UTC)
Shortcuts are easy to type, and that is their purpose, whereas the issue is that they are hard for newbies to comprehend, especially if they don't know what is a template in the first place. It is hard to find the right balance, however, if my bot is running through pages doing maintenance on a page, it replaces to the full. Though I don't send it through replacing just for the hell of it. As I said, ideally we would subst: expand on creation, but we haven't done it. That said, something like RunningHeader is easy to add to the header field on the Index: page, so we shouldn't need to overly use a shortcut. — billinghurst sDrewth 00:48, 11 September 2014 (UTC)
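
The shortcut expansion named in the bot's edit summary above can be sketched roughly as follows. This is an illustration only, not AWB's or SDrewthbot's actual rule set: the mapping holds just the three pairs quoted in the summary, the function name expand_shortcuts is hypothetical, and the match is assumed to be case-sensitive.

```python
import re

# Shortcut -> full template name, taken from the edit summary quoted above.
SHORTCUTS = {
    "rh": "RunningHeader",
    "hws": "hyphenated word start",
    "hwe": "hyphenated word end",
}

# Match "{{name" only when the name is followed by "|" or "}}",
# so that templates merely starting with "rh" (e.g. {{rhyme}}) are left alone.
PATTERN = re.compile(
    r"\{\{\s*(" + "|".join(re.escape(name) for name in SHORTCUTS) + r")\s*(?=[|}])"
)


def expand_shortcuts(wikitext):
    """Replace shortcut template names with their full names."""
    return PATTERN.sub(lambda m: "{{" + SHORTCUTS[m.group(1)], wikitext)
```

Because only the template name is rewritten, the parameters pass through untouched, which is why pages render the same before and after such a maintenance run.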


Can you take a look and respond at WS:PD please? Been trying to cleanup some items with missing files. ShakespeareFan00 (talk) 18:01, 12 September 2014 (UTC)

Like most of us, I look at WS:PD when the mood takes me. We are used to long conversations there. — billinghurst sDrewth 09:55, 15 September 2014 (UTC)


does it say that you have to use that ugly template (which policy)? ~ DanielTom (talk) 23:38, 14 September 2014 (UTC)

That template matches many of our other templates that accurately describe a work, its source, and year of publication (where known). Useful and pertinent information is what we are looking to provide. — billinghurst sDrewth 09:53, 15 September 2014 (UTC)

Tech News: 2014-38

08:34, 15 September 2014 (UTC)

Undeletion and transfer here

Please see c:User talk:INeverCry#Request for temporary undeletion. At least he tried (although the re-deletion comment demonstrates low understanding of what I was after). Beeswaxcandle (talk) 04:27, 17 September 2014 (UTC)

Transferred, will be in category in RC header. — billinghurst sDrewth 10:01, 17 September 2014 (UTC)


Hi, for the St Andrews book I've found dates of death for all the authors except these three. When you have time could you see what you can find in your resources? Beeswaxcandle (talk) 20:36, 17 September 2014 (UTC)

  • Thomas Ross Mills (b.1869)
    (parking) can find birth in Berwick, Northumberland, and census records up to 1901, and in St Andrews at the time. Nothing afterwards. There are deaths for "Thomas R Mills" of the age that correspond to year of birth, however, nothing definitive. First check of newspapers shows nothing evident, will need to try variations, and look to Scottish papers, and their record sets.
  • Robert Norrie (Asst. Lecturer Univ Coll Dundee)
    Born 1878, Dundee, Angus (Forfar). Census records to 1911 (single holidaying with mother in England). Son of William and Ann(e) (?Findlay). Family tree info gives year of birth only.
  • William Smith Denham (Ass. to Dept. Chemistry St Andrews)
    Born 1878/9, Glasgow, Lanarkshire, Scotland, can find census records to 1901. No evident death, need to separately try Scottish sources.
    Died 1964. Obituary. The Times (London, England), Tuesday, Jun 09, 1964; pg. 15; Issue 56033. Director of Research, British Silk Research Association (1921-34), aged 85, 1 June 1964, Sutton, Surrey.

Some research done above. — billinghurst sDrewth 01:37, 19 September 2014 (UTC)

Improve Template:header to handle arbitrary number of categories

I posted on the talk page here Template_talk:Header#Change_to_Support_More_than_10_Categories and you can see this change to the template in action here Template:Header/sandbox/sample and here Template:Header-wpoa/sandbox/example.

Edits to Template:Header are currently restricted, but I think this change would be a general improvement for any document with more than 10 categories added via the header template. Let me know if you think this change can be accepted.

-- Mattsenate (talk) 22:54, 18 September 2014 (UTC)

Thanks. Commented there. We probably have numbers of similar improvements elsewhere that we need to find and resolve in a similar manner. [Us non-coders!] — billinghurst sDrewth 00:57, 19 September 2014 (UTC)

Google Front pages/Watermarking

A bit of a blunt approach, but I couldn't think of a better way of flagging files that might need to be 'evacuated' before certain people at Commons decide to go on a 'freedom' purge :(

No objections to reverts of course.

ShakespeareFan00 (talk) 00:31, 21 September 2014 (UTC)

Google watermarking can just be ignored, if it becomes a major issue, we get someone to write a bot to replace the lead page with a blank one. It doesn't change the licensing of the work. Face the issue when it is an issue, and we can have the bitchfight at the time. — billinghurst sDrewth 08:23, 21 September 2014 (UTC)
Actually @Phe:, how hard would it be to write a tool for toollabs that takes a file from Commons, removes and replaces the first page of a djvu/pdf, then puts the file back? Something similar to croptool and rotatebot. — billinghurst sDrewth 08:42, 21 September 2014 (UTC)
That would be a great tool to have but it only addresses the Google disclaimer page part of the "situation". If the recent application of Template:Watermarking is anything to go by, I think Shakes is also worried about the watermark typically found at the bottom of every page. That's an entirely different matter to say the least. -- George Orwell III (talk) 09:53, 21 September 2014 (UTC)
Lots of files have watermarks, and it is just a template c:template:Watermark and that is advisory, not a criterion for deletion. I would rather we show how radical they are becoming, and how problematic their ways are for the broader community. — billinghurst sDrewth 10:06, 21 September 2014 (UTC)
It was partly because the watermarking is a 'quality' issue, vs a licensing one, that I wrote the template. I've also written {{front-sheet}} to flag works that have the front-page issue. I've also reviewed some of my recent Index status changes to make sure that what's in the relevant status category (File to Fix) are actually files with STRUCTURAL issues, which is why you should see a lot of Index pages in Recent changes. ShakespeareFan00 (talk) 10:22, 21 September 2014 (UTC)
First, I'm pretty sure we're all clear by now that the best time and place to correct structural issues is before uploading the file in the first place. After that, the amount of labor already invested should help determine the next course of action for a structurally deficient source file.

There is not much to gain by "patching" a file uploaded 4 or 5 years ago that still only has the dozen or two pages originally created (e.g. a bunch of no-texts, the title page & maybe the ToC) out of the hundreds of the total-remaining done. It's frequently easier to find a replacement file (quality and/or edition permitting of course) or just patch the original source file first and re-derive that as the replacement instead. Folks can take on those kinds of tasks at their leisure.

At the same time, if there are hundreds of Pages already proofread of a source file and two or three dozen left to still "work out", it makes perfect sense to expend the effort to patch what we already know to be nearly vetted in full. The same goes for the old PoTM-turns-out-to-be-flawed-187-pages-into-transcription scenario.

The notion that uploading a bunch of stuff only to wind up "parking" them for months if not years will somehow entice new contributors before you can get to them yourself in the interim has also proven to be a load of made-up malarkey. A freshly derived file will more often than not require "less work" to bring to completion than an older file would because the process itself is constantly improving over that same period of time.

I'm fully in favor of such housekeeping & appreciate knowing of any file issues at a glance but even in the real world - sometimes a new replacement is far easier to maintain than trying to clean up something "old". -- George Orwell III (talk) 11:16, 21 September 2014 (UTC)

I will also note that in removing the front sheet, it might be "useful" if the layout could be patched for files we know (due to pagelist checking) have 'missing pages'. The layout could be fixed temporarily by inserting a suitable "This page is missing in the source scans" page, something I've done myself on a few files after a suggestion by someone here. ShakespeareFan00 (talk) 10:25, 21 September 2014 (UTC)
See my screed above re: placeholders & patching first but if a tool is developed to easily do some of the more minor placeholder-insertions/duplicate-deletions on the fly, I'm all for it too. -- George Orwell III (talk) 11:16, 21 September 2014 (UTC)

Hi, keeping watermark/front page is required by CC-BY and CC-BY-SA licence "attribution – You must attribute the work in the manner specified by the author or licensor" ([37]). emphasis mine. — Phe 18:48, 23 September 2014 (UTC)

Rescanned works don't get to be relicensed by the scanner just by 'sweat of the brow' effort. I would agree with that approach for works that they author and they license, but not where they slap a front page on a work. — billinghurst sDrewth 23:08, 23 September 2014 (UTC)
Ok, we probably have the right, in some countries at least, to remove the front page, and I have little doubt the watermark will need to be removed later. I really find this story idiotic: why the hell are some Commonists making our work more difficult? There is nothing preventing keeping this except an internal policy of Commons; perhaps it's time to use the local repository, as Commons is less and less usable. It would require less work than trying to work around Commons policy. — Phe 00:51, 24 September 2014 (UTC)
Yes, though it means that a Wikisource'd translation will need to be double loaded. :-/ Otherwise I agree about the issue of Commonist-nitpickers and their losing the global picture. — billinghurst sDrewth 01:24, 24 September 2014 (UTC)

One of the toolwriters who runs a croptool has said that he may be able to build a tool, and I have given some examples with which he can play. I asked for djvu and pdf. It might work really well in conjunction with the toollabs:bub. billinghurst sDrewth 10:11, 28 September 2014 (UTC)

Julins Palmer

Actually it's no typo: see e.g. [38]. Charles Matthews (talk) 04:38, 21 September 2014 (UTC)

It is still the published word, so might be worth changing the message to be forthright. I have already created the redirect. — billinghurst sDrewth 08:19, 21 September 2014 (UTC)

Tech News: 2014-39

09:04, 22 September 2014 (UTC)

Tech News: 2014-40

09:44, 29 September 2014 (UTC)