User talk:Charles Matthews


Article on Henry Oatlands of Bewdley - C17th preacher. I believe your information about his marriage is wrong. He was married to Mary Hennezey or Hensey, a widow, in 1660. The Hennezey were Huguenot glass makers. On his ejection he worked for a time in the glass industry. Various sources... he was also in trouble over a claim against him while an agent for the glass makers.

According to the Oxford Dictionary of National Biography, he married Mary Bradley, née Henzey, consistent with what you are saying. I'll add a note: what we do here is to make the first edition DNB available. Charles Matthews (talk) 20:43, 5 January 2016 (UTC)

Congrats on 100k

You passed the 100k edits on enWS this past month. Nice. — billinghurst sDrewth 11:02, 5 January 2013 (UTC)Reply[reply]

So it's about 5 edits per DNB article, really. I'll start to worry when WS edits overtake WP edits, but that will be a while yet. Charles Matthews (talk) 11:06, 5 January 2013 (UTC)

James Wilson link - do you know?

Got a link to a James Wilson at Page:Dictionary of National Biography volume 52.djvu/37 however it is too early for those listed. Can you guess who? Or do I just dead link it? — billinghurst sDrewth 13:22, 13 January 2013 (UTC)Reply[reply]

It will be w:James Wilson (anatomist) - not in DNB, or that article would have been easier to create! An interesting test case. What serves the reader's interests best? There is an argument for having a standardised page with the message "qv in DNB is misleading, but enWP does have an article". Charles Matthews (talk) 15:10, 13 January 2013 (UTC)Reply[reply]
I did a bit of text around blind links, and made a comment at the project page. I think that we have scope to add enWP links under that design. — billinghurst sDrewth 15:30, 13 January 2013 (UTC)Reply[reply]

retitling page

Hi Charles. I moved Seymour, Francis (1590?-1664) (DNB00) from a page title which was inexplicably mentioning 1669, and then I got over-excited and edited the template and the fromsection / tosection etc. This has messed up the way the page appears. Not sure how much it's best to change. Dsp13 (talk) 19:11, 15 January 2013 (UTC)Reply[reply]

Done - needs the transclusion to be updated between the ## and ## markers, if you change it in the other place (which you don't have to).
By the way, while you're here, DNB00 is going well, with just about 1500 articles left to do; and DNB01 is apparently finished. So not so long until we are down to DNB12. Now for that, there is a whole bunch of new authors to identify. There is a list at Wikisource:WikiProject DNB/1912 authors. Also we are quite sophisticated compared to the old days, tagging with authority control. There is a tool for that, inevitably written by Magnus Manske. Bearing down on either or both of these tasks would be very helpful. Charles Matthews (talk) 19:56, 15 January 2013 (UTC)Reply[reply]
I've had a go with Magnus' tool... very nice! As far as redlinks in the 1912 authors are concerned, where the author's already there under a different name, should I just add a redirect? e.g. Author:John E. Sandys to the existing Author:John Edwin Sandys?
Or alter on the list - I suppose either way is good. Charles Matthews (talk) 06:49, 19 January 2013 (UTC)
OK, great. (I wasn't sure whether the list itself was a transcription which should be preserved as surface text.) My next question is about authors for whom there are WP pages, e.g. Mrs Blanco White is w:Amber Reeves. The WS author pages have some standard templates etc, and I don't have a full grasp of the conventions. How should I set one up just to say that she's a 1912 DNB author? Do I need to add what she wrote in the DNB at that point? If I have an example to copy, then I can crack on through that list I think. Dsp13 (talk) 10:27, 19 January 2013 (UTC)
Great. I have just lashed up {{DNB contributor 2ndSupp}}. If you add that to author pages you create for the additional authors, it will do as a start. That is, tracking those pages will be easy. We are still a bit short of infrastructure for DNB12, obviously. But as it is only 5% of the size of the first edition it is not going to be a serious issue to match the work, in time. ({{DNB contributor}} has the code for adding the initials and that can be carried across when someone wants to go through systematically. I think we can cross the bridge when we come to it for listing articles: I was only able to do it for the first edition because you lent me the Fenwick book.) Charles Matthews (talk) 10:40, 19 January 2013 (UTC)Reply[reply]
Great. I think I'll hang out on ws for a while :) Watching Rich F undergo death by 1000 cuts on WP is making me absurdly upset when I go there. Dsp13 (talk) 10:51, 19 January 2013 (UTC)
He'd be very welcome here also. Category:DNB No WP has 11,000+ entries and we really need to fill in links to existing WP articles, as he has done in the past. I hadn't got round to showing you, which is joint work of Magnus and myself, ordering that category by the number of DNB pages a biography draws on. Charles Matthews (talk) 10:57, 19 January 2013 (UTC)
Opened discussion on hard to identify 1912 authors (S to Z so far). Dsp13 (talk) 18:05, 19 January 2013 (UTC)
Got one via ODNB contributors, left a couple of other comments. Charles Matthews (talk) 18:31, 19 January 2013 (UTC)

'Zaphod Beeblebrox'-style

Apart from the first question of "why isn't he in DNB?", I have had my heads smacked together lightly by an AWB hacker, and I have been directed to the way around the 25k category limit. 29037 articles! Then I remembered that we have it embedded in the header, to automatically apply, so the list at Category:DNB No WP will show where there is no parameter, which means that it should be pretty good. I will run a check to make sure that we don't have any empty parameters which may throw the process, and fix accordingly. Not sure how we check the validity of a sister link. I will see if I can find someone to ask. — billinghurst sDrewth 01:21, 8 February 2013 (UTC)Reply[reply]

Did some tests, the list should be complete for work to be done. -- billinghurst sDrewth 01:43, 8 February 2013 (UTC)

Question on CE

Hi. A question on the approach to be followed. For articles like this: [1], Diocese and University are in the same page. On OCE the article is named after the town, with sections inside but the article on WS has been called Catholic_Encyclopedia_(1913)/Diocese_of_Sigüenza, even if University is included.

On the other hand, for Perugia, both on OCE and WS a different choice has been made, and the article split in two. See Catholic Encyclopedia (1913)/Archdiocese_of_Perugia and Catholic Encyclopedia (1913)/University of Perugia. To me both approaches are the same, but I think it should be a consistent choice. I started to change in one direction, but stopped once I realised both choices were in place. Any advice?--Mpaa (talk) 16:20, 28 April 2013 (UTC)

In time I think the articles should be merged so that they have the same structure as the original. In fact since we have the scan here in the Page: namespace, I don't think there is really any choice: the transcluded version should match the original version. Splitting out the universities was just someone's arbitrary idea.
There are other types of examples.
As far as titles are concerned, the Sigüenza article does start Sigüenza, Diocese of and I have been consistently naming these articles "Diocese of X". It is justified from the original, and is also more helpful to the reader, really, to know that the content is primarily ecclesiastical. So this is a convention that I think should be applied. Charles Matthews (talk) 05:03, 29 April 2013 (UTC)Reply[reply]
OK. I will mark articles for merge or will do it directly when I come across such cases. I already split one, too bad ... --Mpaa (talk) 06:47, 29 April 2013 (UTC)
Hi. To cope with the wrong order, I tried the following. I fetched the ordered sequence of articles in OCE (with prev/next, Volume and link to the scanned page). There are (should be) 11487 articles. In WS there are 11648 articles (I think the difference being splitting choices here and there). Comparing them, 2048 do not match (after taking care of Blessed, Venerable, Saints, etc.). There are several reasons for that, like:
  1. e.g. Diocese of ..., Prefecture of ..., not yet moved on WS
  2. OCE has changed convention in some cases (e.g. Apostolic Vicariate instead of Vicariate Apostolic of ..., see Apostolic Vicariate of Kiang-nan and Catholic Encyclopedia (1913)/Vicariate Apostolic of Kiang-nan)
  3. accented characters, ligatures, etc. (by the way, what is the convention to be used? ae-like or æ-like?)
This could be a way forward in dividing articles per volume and fixing prev/next, as OCE should be reliable. Names could be tuned on both sides, articles merged, keeping the scan as reference, until convergence. What do you think? If you want to take a look, I can post the lists (maybe too long?) or send you an Excel file.--Mpaa (talk) 20:55, 29 April 2013 (UTC)
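The title differences listed above (accents, ligatures, and prefixes such as Blessed or Saint) could be neutralised before the two lists are compared. A minimal Python sketch, assuming a made-up ligature table and prefix list; `comparison_key`, `LIGATURES` and `PREFIXES` are hypothetical names, and the key is meant only for matching, not for naming pages:

```python
import unicodedata

# Hypothetical ligature table; extend as needed.
LIGATURES = {"æ": "ae", "Æ": "Ae", "œ": "oe", "Œ": "Oe"}

# Hypothetical list of prefixes to ignore when comparing titles.
PREFIXES = ("Blessed ", "Venerable ", "Saint ", "Sts. ", "St. ")


def comparison_key(title: str) -> str:
    """Return a simplified form of a title, for comparison only."""
    for ligature, plain in LIGATURES.items():
        title = title.replace(ligature, plain)
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", title)
    title = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Strip at most one leading prefix.
    for prefix in PREFIXES:
        if title.startswith(prefix):
            title = title[len(prefix):]
            break
    return title.casefold().strip()
```

Two titles would then count as the same article when their keys are equal, whichever spelling convention is finally chosen for the page names.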
To get serious, lists should be posted as project pages, as subpages from WS:CEU. And that initial page should be reconsidered.
One contribution to mismatches would be that, for example, a beatified person who is now canonised would appear as a Saint in the CE version, because what we have is a mirror of the version on where the editor did such things. But there are also a number of typos to find.
By the way, our version also has some missing articles. There used to be a whole block missing under letter E: I filled that in. But there are isolated instances of skipping. Charles Matthews (talk) 11:25, 1 May 2013 (UTC)Reply[reply]
To give you a better idea of what I am pursuing, I have posted Vol.1 here. In black there are OCE data (and links to scan page), in blue links to WS. Where there is one line, a 1:1 match has been found. Where no perfect match is present, alternatives are presented, obtained with a fuzzy match. Where there is a mismatch, we can change what is wrong, WS title or OCE title in this page. E.g., if accents are OK in the following case:
  • Vol:1 Lucas D'Achery -> Lucas D'Achery [2]
  1. Lucas d'Achéry
OCE reference could be changed as follows and at the next run of the script a perfect match could be obtained:
  • Vol:1 Lucas D'Achery -> Lucas D'Achéry [3]
  1. Lucas d'Achéry
Pros:
  1. OCE as reference (not perfect due to arbitrary splitting of articles or spelling errors) but scans are one click away
  2. changes to OCE references can be done in place and a script can be run to refresh this as needed
  3. a refresh can be done to update WS links after page move
  4. fuzzy match algorithm can be made a bit smarter and show more accurate results
Cons:
  1. OCE inaccuracies might be present
  2. not all WS links are considered, but it could be identified what is not yet matched.
Feedback appreciated, given your expertise on the matter.--Mpaa (talk) 22:18, 1 May 2013 (UTC)Reply[reply]
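The matching step described above (a 1:1 match where titles agree, fuzzy alternatives otherwise) can be sketched with Python's standard `difflib` module. This is only an illustration under assumed inputs, not the script actually used; `match_titles`, the cutoff of 0.6 and the limit of 3 candidates are hypothetical choices.

```python
import difflib


def match_titles(oce_titles, ws_titles, cutoff=0.6, limit=3):
    """Map each OCE title to a list of candidate WS titles.

    An exact match yields a one-element list; otherwise up to `limit`
    fuzzy candidates are returned, best first (possibly none).
    """
    ws_set = set(ws_titles)
    result = {}
    for title in oce_titles:
        if title in ws_set:
            result[title] = [title]
        else:
            result[title] = difflib.get_close_matches(
                title, ws_titles, n=limit, cutoff=cutoff
            )
    return result


if __name__ == "__main__":
    oce = ["Aachen", "Lucas D'Achery"]
    ws = ["Aachen", "Lucas d'Achéry", "Diocese of Sigüenza"]
    for oce_title, candidates in match_titles(oce, ws).items():
        print(oce_title, "->", candidates)
```

Titles that come back with an empty list are the ones that need a human look, for example because an article was split or merged on one side.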

I'll put detailed feedback on that list on its talk page when I have a moment: I'm a bit busy right now. Just to make sure we are communicating well here, you do know that the CE is here in pagespace as an upload? The reason I use the OCE site is that there is an indexation of the articles, so that you can get to the scanned page from an article title usually quite quickly. Until we have the articles here in the correct order, that is a timesaver. In a systematic project it would of course make sense to link from a list to our own pagespace scan. Charles Matthews (talk) 09:59, 3 May 2013 (UTC)Reply[reply]

Yes, I am aware of that. Problem is that there are already articles present (with unsure sorting) and Page ns is almost empty. I was just trying to use OCE article order (available now) to sort articles in WS. It might not be 100% correct but could improve current unreliable status.--Mpaa (talk) 11:27, 3 May 2013 (UTC)

It does look as if reconstructing the volume ToCs linked from Catholic Encyclopedia (1913) is going to be a major step forward. Charles Matthews (talk) 11:41, 3 May 2013 (UTC)

Rees's Cyclopaedia project suggestion

Hi Charles, to continue the correspondence we've been having on your WP talk page about putting Rees's Cyclopaedia on Wikisource, I just read your Wikisource thread - lots of very detailed information to digest. I checked my records and find there are 30,400 pages in Rees, over 39 volumes. The text is in double column with an average length of 1480 words per page, which makes it around 43 million words. The pages are un-numbered, and only the botanical articles are signed. The plates (in separate volumes) are keyed to the articles, and it would be really useful to hyperlink the plates to the text, if you see what I mean. This would be possible with Commons, of course. It will be a very, very long job indeed, and at my age I shall probably not live to see it finished. However that is no reason for not beginning!! The spade work done in organising the logistics of getting the DNB done will make the job far easier. I'll be in touch here later on. Apwoolrich2 (talk) 16:02, 11 May 2013 (UTC)Reply[reply]

Yes, 30,000 pages is the same scale as the DNB. You mentioned that the Hathi trust scan is better. I think you should look carefully at it: with a facing image to look at, it is possible to make changes as you go, provided they are minor. Charles Matthews (talk) 16:51, 11 May 2013 (UTC)
Unless I've mis-read the Hathitrust catalogue, they only have scans of the American edition, apart from vol 39 of the British one. I detected problems in not being able to download more than one page at a time in PDF format. The HathiTrust scan shows the long ESS as a long ESS. I'll re-check this. Apwoolrich2 (talk) 17:53, 11 May 2013 (UTC)
I've just copied and pasted into my text editor a sample page, and it's come over very well with all the lines in the right order. Minimal editing is needed. Some of the OCR'd texts in the IA, however, sometimes conflate the lines across the columns (First line col 1 followed by first line col 2, line 2 followed by 2, 3 followed by 3 etc). I see the Hathi Trust now has the British version as well as the American. I did not see it when I made the listing of digitised copies for the WP article last year. I do have my own copy of Rees, so can proof read from that. A tip - The British edition is dated 1819, and the American 1805 on this site. Apwoolrich2 (talk) 18:29, 11 May 2013 (UTC)
Some further thoughts. Checking through a volume on the HathiTrust site, I see that all formatting of mathematical tables is lost in the OCRd version, so these will need to be re-typeset from scratch. Many of the tables run over the entire page instead of being confined to the columnar format of the text, some orientated sideways as well. I'm sure it can be done, but it will be a fiddle. I must see what other books there might be in WS with similar tables and check out how it was done.
Tho' there are 30,000+ pages, there are far, far more articles than this, when the very short dictionary articles are taken into account. I suppose an index page will need to be created for each volume listing each article in turn. The fact that the original was not paginated has caused problems ever since the work was published. Would there be any profit in adding page numbers in the Wikisource version? I've copied off a number of the WikiProjectDNB documents, and will spend a bit of time digesting them. Also see if there is something similar for WikiprojectEB11, if there is one. Kind regards. Apwoolrich2 (talk) 09:13, 12 May 2013 (UTC)
There is a Britannica 1911 project here, but they don't transclude, I think. The tables would not be impossibly hard to format in wikitext, in place and from scratch.
What we have done for the DNB is to list the main articles in tables of contents, and make the "previous" and "next" fields skip the short articles. In other words the first aim is to have a ToC that covers the essential ground of the content. One can always come back later to interpolate the short articles. You'll find that there is just a bit of such interpolation around for the DNB.
On pagination: once there is an index page, and transclusion from pagespace, page numbers do appear automatically in the left-hand margin. A typical index page is like Index:Dictionary of National Biography volume 58.djvu. You can see there is a certain amount of control there on things: there are some Roman-numbered pages, and skipping. If you click through on page 100 there you get to page 108 of the pagespace version: as we would say, there is an offset +8. At Vanderbank, John (DNB00) you can see, though, that the page displayed in the margin is 100, corresponding to the index page.
Therefore there are two things going on. The scan has been uploaded with a straightforward numbering from 1 to 477. The index page has its own version, basically with 8 front-matter pages. This is a bit technical: the question is whether it is technical enough for Rees! I've never actually edited an index page before, but I see that it reads essentially as
Whatever the specifics are, there is clearly a good deal of control on how the marginal numbers are displayed. So it ought to be the case that you could set up the numbering in a helpful way. Charles Matthews (talk) 09:51, 12 May 2013 (UTC)Reply[reply]
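The offset arithmetic described above can be sketched in a few lines of Python. This assumes, for illustration only, that the first 8 scan pages are front matter shown with lower-case Roman numerals and that numbering then runs on without gaps; the real index page also skips some pages, which this ignores. All function names are hypothetical.

```python
def to_roman(n: int) -> str:
    """Lower-case Roman numeral for a small positive integer."""
    numerals = [(10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = ""
    for value, symbol in numerals:
        while n >= value:
            out += symbol
            n -= value
    return out


def margin_label(scan_page: int, front_matter: int = 8) -> str:
    """Label displayed in the margin for a given scan page (1-based)."""
    if scan_page <= front_matter:
        # Assumption: front matter is numbered i, ii, iii, ...
        return to_roman(scan_page)
    return str(scan_page - front_matter)


def scan_page_for(printed_page: int, front_matter: int = 8) -> int:
    """Scan page that carries a given printed page number (offset +8)."""
    return printed_page + front_matter
```

With these assumptions, printed page 100 sits on scan page 108, which is the +8 offset mentioned for volume 58.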
I've created a sandbox page on my WS talk page and posted in it the entire preface and the first page of the work. The latter is about half a regular page in length because of the depth of the heading. I've edited this page with all the small caps and italics of the original. The Preface does have typos in it, I know. It was not very onerous, and occupied a part of a damp Sunday afternoon. I was glad to find the WS spell checker correctly found most of the 'long esses'. Next phase to post a message on Scriptorium making a proposal? Also to have a play with a header? Apwoolrich2 (talk) 16:28, 12 May 2013 (UTC)
Procedurally I suppose you can just get on with it. Adapting {{DNB00}} to make {{Rees}} is probably less scary than it looks. NB that here on WS documentation tends to be at Template talk:X rather than Template:X/doc. User:Billinghurst is my go-to guy for templates, as for much else.
The Scriptorium - WS:S - is a good place for technical queries. The big technical issue is actually getting vol. 1 uploaded. That needs an admin: I'm one but have never in fact done a bulk upload. Others would, but a case has to be made. Charles Matthews (talk) 18:13, 12 May 2013 (UTC)Reply[reply]

I've just posted the list of long biographical articles from Rees on the WP Rees Page. Quite fun to do, but it was a pig getting the tables to work properly. All is well now though. Can't say the same for the page name. I've written '...ON Rees...' instead of '...IN Rees...' I will try and work out how to change it. I'll be interested to see how long it takes for somebody to Wikify the names. I'm still thinking around the form of the WS Rees project. Get back to you later Apwoolrich2 (talk) 14:27, 18 May 2013 (UTC)Reply[reply]

On my WS Sandbox page, I've posted a draft proposal about Rees for the Scriptorium. I'll be glad of any comments, please. There is a queer glitch I can't resolve. I italicised a book title, yet when saved, the title remains in Roman, and the italic is shifted a few words to the right. I wrote the original on my text editor (NoteTab), then cut'n'pasted it to the sandbox, where I added the markup. All very odd. Apwoolrich2 (talk) 19:01, 19 May 2013 (UTC)Reply[reply]

The format issue was a line break. Fixed now. Charles Matthews (talk) 08:17, 20 May 2013 (UTC)

Hullo Charles. It's been 2 months since I posted my proposal, but apart from you there has been no response, so I am wondering if I am being a bit over enthusiastic with the idea of getting the text of Rees on WS now. In the last two months I've been indexing the Rees biographies on Botany. I took an existing list, but found it was highly inaccurate, so have been through each page of every volume. This is the first time I have ever done it, and am amazed at the wealth of material Rees contains. The musical writings of Charles Burney are very readable and I plan to index these next, since there is a fair amount of academic interest in Burney which does not appear to have made much use of his Rees writings. They provide a wealth of info about the C18 London musical and theatrical scene. I've been looking at the scans of the Rees plates on IA, but find they are very coarsely scanned, as well as being crooked and cropped in some instances so as to be inadequate for WP use. When I've finished Burney I will scan my set of the plates at a better quality resolution and post those on Commons. Once they are there maybe a demand for the texts on WS might arise. I must confess to getting side-tracked looking up botanical biographies on WP of names in Rees. Also musical examples of relevance to Burney on YouTube. I'll keep in touch on progress. Kind regards Apwoolrich2 (talk) 19:43, 20 July 2013 (UTC)

Wikisource User Group

Wikisource, the free digital library, is moving towards better implementation of book management, proofreading and uploading. All language communities are very important in Wikisource. We would like to propose a Wikisource User Group, which would be a loose, volunteer organization to facilitate outreach and foster technical development. Join if you feel like helping out. This would also give a better way to share and improve the tools used in the local Wikisources. You are invited to join the mailing list 'wikisource-l' (English), the IRC channel #wikisource, the Facebook page or the Wikisource Twitter. As a part of the Google Summer of Code 2013, there are four projects related to Wikisource. To get the best results out of these projects, we would like your comments about them. The projects are listed at Wikisource across projects. You can find the midpoint report for developmental work done during the IEG on Wikisource here.

Global message delivery, 23:20, 24 July 2013 (UTC)

Thinking ahead

Hi Charles, I'm thinking ahead to the December Proofread of the Month, which has been tagged with a games theme. I see we already have The Game of Go by Arthur Smith (1908). Is this the best we can do or is there a better PD book on Go that could be dropped in alongside a book on Chess and Association Football? I'm asking you because I've just tripped across your name on the Sensei Library and hope that you've got a better knowledge of the available literature than I do at a rusty 25kyu. Beeswaxcandle (talk) 07:47, 26 July 2013 (UTC)Reply[reply]

Ha. Yes, I know much of the literature in English. I co-wrote "Shape Up!" with a friend and many people assume it's PD; but not so. I also wrote "Teach Yourself Go".
Here's the deal: Wikisource could add plenty of value to the Smith book, by creating diagrams for the problem sets that are given in algebraic notation. There are various kinds of diagram software. Otherwise it seems to me that it would be a reasonable POTM, because it seems well advanced and mostly needs conscientious validation. But the book itself is no longer indicated for beginners. Charles Matthews (talk) 09:27, 27 July 2013 (UTC)Reply[reply]
I've already re-started the stalled validation and plan to get through it over the next couple of weeks. I haven't got to the problems yet, but may come back to you on the diagram software. Since I messaged you above I've found a copy of Cheshire's 1911 work on the Internet Archive. A quick scan through suggests that this is also not indicated for beginners and is probably more of historical interest. Nonetheless, is this worth consideration for a more general proofreading audience or should I just add it to my list?
What surprises me is that Go doesn't seem to have been picked up in the English craze for things Japanese of the 1870s and 1880s. The concept of Ko-Ko singing Tit-willow to Katisha over a goban is intriguing. Beeswaxcandle (talk) 10:31, 27 July 2013 (UTC)Reply[reply]
Now that book I didn't know. The game on p. 36 seems to have been played between top pros on 9 December 1910 (Nozawa Chikucho versus Iwasa Kei). I've never met the notation system on p. 32 before. This book is certainly of historical interest, and as a self-published book must be rare. Not for beginners, as you say. The other games in the book could probably be identified, given that (I imagine) they were published in Japanese newspapers. Charles Matthews (talk) 13:13, 27 July 2013 (UTC)Reply[reply]
The article by Tony Atkins linked from here on the BGA site suggests the posting (from 2010) is not so generally known. These books are sort of test cases for annotation, in my view. Charles Matthews (talk) 13:25, 27 July 2013 (UTC)Reply[reply]
OK, there's now The Game of Go (annotated) with illustrations of game 3 in chapter 5, the missing joseki in chapter 6, the alternate for ending no. 2 in chapter 7, and all the problems and answers in chapter 8. Probably needs some further layout attention, but I think it's usable now. I'll have a break and focus on other projects, but do plan to come back to Cheshire later in the year. Beeswaxcandle (talk) 06:06, 21 August 2013 (UTC)Reply[reply]

Re: Hashes and DNB transclusion

Uh, what are you talking about? Not to be difficult, but it's been a month since my last edits to Wikisource, so I found your message cryptic. If I removed hashes -- & I probably had, I simply don't remember which ones you could be referring to -- it was because of display issues in the browser I was using. (Which browser, I don't remember off the top of my head: I tend to edit from a number of different computers.) So with a little more information I could understand what I did & if there's a problem with the site or just not understanding how things are done here. -- Llywrch (talk) 18:09, 21 September 2013 (UTC)Reply[reply]

I was referring to this diff, indeed from August. What actually happens is that <section begin=""/> and <section end=""/>, that mark out the beginning and end of sections transcluded into the freestanding article, get consolidated by software into a single section marked out by ## and ## at the start of a biography (assuming it's the DNB). So I found some biographies that weren't properly formed, and fixed them. Your edit summary suggested you weren't alert to the mechanism. Charles Matthews (talk) 19:00, 21 September 2013 (UTC)Reply[reply]
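As a rough illustration of that consolidation (not the actual software, whose internals are not described here), the following Python sketch turns explicit section tags into the ## ## form. `to_hash_markers` is a hypothetical name, and the regular expressions assume the tags are written exactly as shown above.

```python
import re

# Assumed tag shapes: <section begin="Label"/> and <section end="Label"/>.
BEGIN = re.compile(r'<section begin="([^"]*)"\s*/>')
END = re.compile(r'<section end="[^"]*"\s*/>')


def to_hash_markers(wikitext: str) -> str:
    """Show each labelled section as a single '## Label ##' line."""
    # The end tags are implied by the next marker, so drop them.
    without_ends = END.sub("", wikitext)
    # Each begin tag becomes a '## Label ##' heading on its own line.
    return BEGIN.sub(lambda m: f"## {m.group(1)} ##\n", without_ends)
```

A biography that lacks one of the two tags would come out malformed under this scheme, which is the kind of case that needed fixing by hand.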

1,000,000th Content Page

The 1,000,000th content page Wilkinson, George Howard (DNB12) new content page origination edit by you took place at 09:10, 29 October 2013 (UTC). It had 408 bytes. Congratulations! This anticipated edit was discussed on the Scriptorium at Wikisource:Scriptorium#Approaching_1_million_content_pages_at_enWS. ResScholar (talk) 04:06, 31 October 2013 (UTC)Reply[reply]

Thanks - through the marvels of Echo I had seen that, via @Prosody:'s edit. Unexpected good news. Charles Matthews (talk) 04:25, 31 October 2013 (UTC)

Catholic Encyclopedia & DNB

You seem to be the person most involved in trying to get the above complete here, from what I have seen. What do we need done yet? To the degree that I can help, I think it would be great to get some of the old reference works completed, because I am also right now developing a rather huge list of the older reference works still counted as useful as per a 1986 book on reference works, and I think having a few such reference works completed might make it more likely that some of the others get attention as well. John Carter (talk) 19:12, 3 November 2013 (UTC)Reply[reply]

The DNB had, as of this morning, 37 articles of the second supplement to post, and then the public domain text should be complete. In other words, it is pretty much done.
The Catholic Encyclopedia here is more complicated to discuss. It was first posted by a bot, in a most unfortunate way. The division of articles didn't exactly match the original; the ordering was somewhat arbitrary, so that you can't easily match the "volumes" with the originals; and the bot skipped in posting. Also the text was apparently scraped from the New Advent digitisation, which often omitted the endnotes (and did worse things ...)
I would guess the Catholic Encyclopedia is about 95% complete here: there was a gap of about 300 articles, mostly with initial E, but I filled that in. The other missing articles are presumably cases where the bot skipped one. They are hard to detect without going through the whole text against a scan. We have a scan uploaded here.
So, frankly, the Catholic Encyclopedia is still a mess from the point of view of completeness. Some progress has been made on author pages, which is one way to cross-check. We won't really know until the text is "migrated to djvu", as the DNB is, how much there is to do, but "plenty" covers it. Charles Matthews (talk) 19:23, 3 November 2013 (UTC)Reply[reply]
Well at least vol. 7 of the CE here is in djvu, maybe some others as well. I've never started pages for works before, and don't think it would be a good idea for me to try one this early, but it might be a start. If you've got a link somewhere indicating what isn't done in the DNB, I can maybe at least look that over and maybe try to start some of them as well. John Carter (talk) 19:35, 3 November 2013 (UTC)

We do have all the Catholic Encyclopedia files we need: Index:Catholic Encyclopedia, volume 1.djvu and so on cover it. If you find a missing CE article, you can post text with the header at Template:CE13, which is the easy way to do it. In the fullness of time the text will go by the scan, as is done for example at Page:Catholic Encyclopedia, volume 1.djvu/25 for Catholic Encyclopedia (1913)/Aachen. If you go to edit Catholic Encyclopedia (1913)/Aachen you can see the syntax that pulls the text into the article. Yes, it's a bit complicated at first sight, and there needs to be markup on the Page: versions to make it work. Anyway that is how we do business here, by preference, these days. ProofReadPage is the name of the system, and it means proofreading is verifiable by anybody.

The DNB remaining redlinks are at Dictionary of National Biography, 1912 supplement, Volume 3. I see six have been done this afternoon, so the end is in sight. If you want to experiment on say Warner, Charles (DNB12), you need to go to Page:Dictionary of National Biography, Second Supplement, volume 3.djvu/605. The DNB12 header to use is like

{{DNB12
 |article=
 |previous=
 |next=
 |volume= 3
 |contributor =
 |wikipedia =
 |extra_notes=
}}
<pages index="Dictionary of National Biography, Second Supplement, volume 3.djvu" from="6" to="6" fromsection="" tosection=""> </pages>

with the relevant page numbers 6.. inserted. fromsection and tosection relate to whatever you put in <section begin=""/> and <section end=""/> to mark out the start and end of the text you want in the article. (When you place <section begin=""/> and <section end=""/> correctly on the same page they resolve to a header like ## Warner, Charles ## if, as I would, the marker is Warner, Charles.)
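For anyone filling in many of these by hand, the header and the pages tag above could be generated from a few values. A minimal Python sketch, assuming the parameter names shown in the header above and the volume 3 index file; `dnb12_wikitext` and its arguments are hypothetical, and the output should be checked against an existing article before use.

```python
# Index file name as given in the example header above (volume 3 only).
INDEX = "Dictionary of National Biography, Second Supplement, volume 3.djvu"


def dnb12_wikitext(article, previous, next_article, first_page, last_page,
                   section, contributor="", wikipedia=""):
    """Build the DNB12 header plus the <pages> transclusion tag."""
    fields = [
        ("article", article),
        ("previous", previous),
        ("next", next_article),
        ("volume", "3"),
        ("contributor", contributor),
        ("wikipedia", wikipedia),
        ("extra_notes", ""),
    ]
    header = "{{DNB12 " + " ".join(f"|{k}= {v}" for k, v in fields) + " }}"
    # The same marker is used for fromsection and tosection, as for a
    # biography that starts and ends under one section label.
    pages = (
        f'<pages index="{INDEX}" from="{first_page}" to="{last_page}" '
        f'fromsection="{section}" tosection="{section}"></pages>'
    )
    return header + "\n" + pages
```

For the Warner example, the page numbers would both be 605 and the section marker would be Warner, Charles; the previous and next fields still have to be read off the volume's table of contents.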

To some extent this is straight in at the deep end, and may look forbidding.

You can get some idea for the Catholic Encyclopedia from another site with scans, e.g. . If you compare with Catholic Encyclopedia (1913)/Quinquagesima you will see that the next article matches, but after that the OCE site has "Quiricus and Julitta" and we apparently don't. That turns out to be because the "next" link from Catholic Encyclopedia (1913)/Agustín Quintana needs to be changed! That is about where we are, and you are of course very welcome to help in checking. Where there really is a missing article, you could use the OCE text to create an article with the CE13 header. Charles Matthews (talk) 20:23, 3 November 2013 (UTC)Reply[reply]
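The kind of check described here, following the "next" links and comparing them with a reference ordering, can be sketched in Python. This is an illustration under assumed data structures: `toc` is an ordered list of article titles taken from the reference, `next_links` maps each title to the "next" value in its header, and `next_link_errors` is a hypothetical name.

```python
def next_link_errors(toc, next_links):
    """Return (article, expected_next, actual_next) for each mismatch.

    `toc` is the reference order of titles; `next_links` maps a title
    to whatever its header currently gives as the next article.
    """
    errors = []
    for current, expected in zip(toc, toc[1:]):
        actual = next_links.get(current)  # None if the article is missing
        if actual != expected:
            errors.append((current, expected, actual))
    return errors


if __name__ == "__main__":
    toc = ["Quinquagesima", "Agustín Quintana", "Sts. Quiricus and Julitta"]
    # The wrong target below is a made-up placeholder, not the real value.
    links = {
        "Quinquagesima": "Agustín Quintana",
        "Agustín Quintana": "(some other article)",
    }
    for article, expected, actual in next_link_errors(toc, links):
        print(f"{article}: next should be {expected!r}, found {actual!r}")
```

An article that is genuinely missing shows up the same way as a wrong link, so each reported mismatch still needs a look at the scan.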

Sorry if I jump in. I have aligned to my best the articles in CE. Volumes TOCs (e.g. Catholic Encyclopedia (1913)/Volume 12) should reflect OCE, and hopefully scans (except articles to be merged). Using your example, you can see that in Vol. 12 TOC the right article is there, and also Catholic_Encyclopedia_(1913)/Sts._Quiricus_and_Julitta points back to the right article. So, work in progress on this side.
IMHO, the next point would be to attack scans but the most important issue to be defined is how to tag sections. If we use CE convention (e.g. "Quintana, Augustine"), it would be cleaner but a bit more challenging to write a bot to automatically transclude links to the current title pages. Any suggestions on this?--Mpaa (talk) 21:24, 3 November 2013 (UTC)Reply[reply]
Now prev/next for articles are aligned with ToCs for all volumes.--Mpaa (talk) 21:30, 5 November 2013 (UTC)Reply[reply]
But that's excellent! Charles Matthews (talk) 16:50, 6 November 2013 (UTC)Reply[reply]

I should review the situation with the Catholic Encyclopedia, then. The DNB posting has about another day in it, and then that project needs to take stock. For the CE, to adapt my past DNB method that used long strips of marked-up text, the first thing is to get the table of contents correct; then use the list of names of articles to generate marked-up text in bulk. I used {{polysect}} and some list manipulations to do about 30 pages at a time, with the page titles serving as the transclusion markers, and had a system for producing templates to minimise work. I used {{DNBset}} for the actual article creation. A serious amount of work, though, and the CE text requires more remedial work. Charles Matthews (talk) 06:40, 4 November 2013 (UTC)


You have new messages
Hello, Charles Matthews. You have new messages at AdamBMorgan's talk page.
You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

Tenth Anniversary Contest

Continuing from the discussion on the Wikimedia UK mailing list, I've started a draft page for this: Wikisource:Tenth Anniversary Contest. Does this look even slightly appropriate to you?

It's only partly done because I haven't had a lot of free time so far this week. I've left space for ten texts, to match ten years of Wikisource, but I've only found a few that seemed appropriate so far (and I uploaded a new one; a WWI work seemed right this close to that anniversary). It might end up being a lot less than ten. I also need to find out how WMUK see themselves being involved in this. - AdamBMorgan (talk) 22:34, 13 November 2013 (UTC)Reply[reply]

Thanks - I've been on holiday and offline, need to catch up. Charles Matthews (talk) 08:25, 17 November 2013 (UTC)Reply[reply]


Hi Charles, I replied to you here. Ed [talk] [en] 07:42, 29 November 2013 (UTC)

Links from CE1913 to en:WP

Hi. Do you think it would be interesting to add such links? The approach would be to look for {{Cite Catholic Encyclopedia}} (any other useful templates) on en:WP, and check that what is linked back to WS does not contain the corresponding WP link.--Mpaa (talk) 21:57, 8 January 2014 (UTC)Reply[reply]

Need just to clarify. This would be from the text here of the CE, not just for the wikipedia= field in the header? Charles Matthews (talk) 08:28, 9 January 2014 (UTC)Reply[reply]
I meant "just for the wikipedia= field in the header".--Mpaa (talk) 11:03, 9 January 2014 (UTC)Reply[reply]
Right. Then there is a great tool for this type of matching. But it is not working right now. And then there is this other tool, which is matching CE pages up to Wikidata pages, whence the enWP page could typically be found. But it would be more up-to-date to work on putting "wikidata=" as a header field anyway, I guess.
So there is some overlapping work to take into account.
The things I'm talking about are
Now that doesn't run - it was ported from the toolserver by Magnus Manske, and it didn't run there either, as I recall. It is probably some relatively trivial thing in the code, I guess, given that the CE pages are subpages and the DNB pages have a suffix, and the DNB version of the tool runs very well for me. Might be worth a few minutes of your time to look into this; I can ask Magnus if this business really isn't transparent. (I tend to assume everyone here has more technical knowledge than me, and it is usually a good guess.)
The wikidata-related tool is This has a CE setting which really does work, and others are working on it, meaning you might be able to reuse.
I'm sure there is nothing wrong with your original idea. Just seemed worthwhile documenting what is out there already. Charles Matthews (talk) 19:51, 9 January 2014 (UTC)Reply[reply]
I am not completely familiar with Wikidata, and as far as I know wikisource is not supported yet. I have no idea of what the effect of the wikidata= field in {{header}} is. I'll try to dig a bit more, otherwise I guess will stick to my original idea.--Mpaa (talk) 22:02, 9 January 2014 (UTC)Reply[reply]
WS:S#Reminder: Wikidata coming on January 14th! Charles Matthews (talk) 22:10, 9 January 2014 (UTC)Reply[reply]
I posted a question on Scriptorium. Let's see.--Mpaa (talk) 22:50, 9 January 2014 (UTC)Reply[reply]
You might be interested in following up the discussion at Scriptorium, as my understanding is that the matching done by the tool you have listed above associates different 'entities'.--Mpaa (talk) 08:50, 10 January 2014 (UTC)Reply[reply]
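For the narrower idea of checking only the wikipedia= field in the header, a minimal Python sketch of the local test might look like the following. It assumes the page wikitext has already been fetched by some other means; the function names and the sample header strings are hypothetical, and only the wikipedia= field itself comes from the discussion above.

```python
import re

# Matches "|wikipedia = value" inside a header template call.
WIKIPEDIA_FIELD = re.compile(r"\|\s*wikipedia\s*=\s*([^|}\n]*)")


def wikipedia_field(wikitext: str):
    """Return the value of the wikipedia= field, '' if it is empty, None if absent."""
    match = WIKIPEDIA_FIELD.search(wikitext)
    return match.group(1).strip() if match else None


def needs_wikipedia_link(wikitext: str) -> bool:
    """True when the header has no wikipedia= value yet."""
    return not wikipedia_field(wikitext)


# Hypothetical sample headers, for illustration only.
empty_header = "{{CE13 |previous = |next = |wikipedia = |volume = 1}}"
filled_header = "{{CE13 |wikipedia = Aachen |volume = 1}}"
print(needs_wikipedia_link(empty_header))
print(needs_wikipedia_link(filled_header))
```

Pages flagged this way would still need a human or a matching tool to decide which enWP title belongs in the field.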

wikimania panel?

a bird was suggesting that User:Moondyne was interested in WikiSource activities at wikimania? how about a WS panel, about the DNB success story. a reception would be nice to recruit, although there aren’t many pubs near the barbican, ? i defer to your local knowledge. Slowking4Farmbrough's revenge 18:50, 20 January 2014 (UTC)Reply[reply]

The sort of panel that would interest me would be around "Digitization and reference material". Charles Matthews (talk) 19:39, 20 January 2014 (UTC)Reply[reply]
hate to keep harping, but i see the wikisource meetup social, but the panel, did it get not accepted? [4] could it be a workshop? WMUK has not been very transparent. i guess in a month things will have settled. Slowking4Farmbrough's revenge 02:35, 7 June 2014 (UTC)Reply[reply]
I've been away, and missed an "availability check". But I have just heard that the panel has the green light. I'll post more to the Scriptorium: the panel is 6031 on Charles Matthews (talk) 15:45, 12 June 2014 (UTC)Reply[reply]


To mention, nothing more, that I have uploaded nine vols of Thomson's A biographical dictionary of eminent Scotsmen. — billinghurst sDrewth 09:59, 27 January 2014 (UTC)

Thanks. I was getting a bit puzzled, given that w:Thomas Napier Thomson mentions another number of volumes. Page:A biographical dictionary of eminent Scotsmen, vol 2.djvu/287 shows a volume start. So I suspect that those volumes were bound as nine, rather than being nine originally. Charles Matthews (talk) 14:22, 27 January 2014 (UTC)Reply[reply]
They are page numbered as 3, though bound as 9, in nine divisions (title pages for each) corresponding to the binding. As it is the 1857 publication date, presumably it was a reprint in a library form of the 1851, maybe it is the second edition of 1851, all a bit hard to tell and maybe I need to do some research on it. All that said, it was the best scan available, the pages seemed to align and be present (though it was mental arithmetic late at night). — billinghurst sDrewth 05:40, 28 January 2014 (UTC)

Automated import of openly licensed scholarly articles

Hello Charles Matthews,

We are putting together a proposal about the automated import of openly licensed scholarly articles, and since you are an active Wikisourceror, we'd appreciate your comments on the Scriptorium. For convenience, I'm copying our proposal here:

The idea of systematically importing openly licensed scholarly articles into Wikisource has popped up from time to time. For instance, it formed the core of WikiProject Academic Papers and is mentioned in the Wikisource vision. However, the Wikiproject relied on human power, never reached its full potential, and eventually became inactive. The vision has yet to materialise.
We plan to bridge the gap through automation. We are a subset of WikiProject Open Access (user:Daniel Mietchen, user:Maximilanklein, user:MattSenate), and we have funding from the Open Society Foundations via Wikimedia Deutschland to demo suitable workflows at Wikimania (see project page).
Specifically, we plan to import Open Access journal articles into Wikisource when they are cited on Wikipedia. The import would be performed by a group of bots intended to make reference handling more interoperable across Wikimedia sites. Their main tasks are:
  • (on Wikipedia) signalling which references are openly licensed, and link them to the full text on Wikisource, the media on Commons and the metadata on Wikidata;
  • (on Commons) importing images and other media associated with the source article;
  • (on Wikisource) importing the full text of the source article and embedding the media in there;
  • (on Wikidata) handling the metadata associated with the source article, and signalling that the full text is on Wikisource and the media on Commons.
These Open Access imports on Wikisource will be linked to and from other Wikimedia sister sites. Our first priority though will be linking from English Wikipedia, focusing on the most cited Open Access papers, and the top-100 medical articles.
In order to move forward with this, we need
  • General community approval
  • Community feedback on workflows and scrutiny on our test imports in specific.
  • Bot permission. For more technical information read our bot spec on Github.

Maximilianklein (talk) 18:27, 20 June 2014 (UTC)Reply[reply]

Template:Collective still needed?

Is this template still needed? Unused, and we seem to have stopped development. — billinghurst sDrewth 02:16, 22 July 2014 (UTC)Reply[reply]

Seems not, and could be reinvented at need. Charles Matthews (talk) 06:30, 22 July 2014 (UTC)Reply[reply]
For the record though: Template talk:Polysect. I used {{polysect}} all the time in DNB work, but only in preview. So there was no reason to save it to a page. Charles Matthews (talk) 06:36, 22 July 2014 (UTC)Reply[reply]
Back. It could do with some simple {{documentation}}. — billinghurst sDrewth 15:08, 22 July 2014 (UTC)Reply[reply]
Mmm. It is part of my method for doing long "strips" of text to paste into pagespace. And I could document that: would be relevant to EB1911 and the Catholic Encyclopedia. I did explain that to Adam B. at a meetup once, over several minutes and much handwaving. The trick is to start with a list of titles/markup choices, use the template, scrape the preview text, and then paste the real text into the gaps between begin and end. Not very intuitive. Charles Matthews (talk) 16:23, 22 July 2014 (UTC)Reply[reply]
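To make the "strip" method a little less hand-wavy, here is a minimal Python sketch of the first step only: turning a list of titles into a run of begin/end markers with gaps for the real text. It emulates the idea in plain Python and is not a description of what {{polysect}} actually outputs; the function name and the sample titles are hypothetical.

```python
def marker_strip(titles):
    """Build a strip of section markers, one begin/end pair per title."""
    lines = []
    for title in titles:
        lines.append(f'<section begin="{title}"/>')
        lines.append("")  # gap: the proofread article text is pasted here
        lines.append(f'<section end="{title}"/>')
    return "\n".join(lines)


# Hypothetical titles, for illustration only.
strip = marker_strip(["Warner, Charles", "Example, Another"])
print(strip)
```

Using the page titles as the marker names keeps the later transclusion mechanical, since fromsection and tosection can be filled from the same list.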

Need edit help

Hi Charles, Would you have the time to look at [5] i have section breaks I can't seem to fix. --Daytrivia (talk) 01:07, 7 August 2014 (UTC)Reply[reply]

If it's not working now (I have moved the reference footer up into the text) it looks like an artefact created by the footnote. Charles Matthews (talk) 04:21, 7 August 2014 (UTC)Reply[reply]
I had to wait until the caching allowed me to see the new effect, but it does seem fixed now. Charles Matthews (talk) 05:43, 8 August 2014 (UTC)
Thanks a million Charles. Daytrivia (talk) 08:57, 8 August 2014 (UTC)Reply[reply]

Edition problem?

Here the text diverges from the image in CÆSAR, Sir THOMAS (1561–1610):

  1. ", and M.P. for Appleby in 1601" added to the text
  2. "His career at the bar was undistinguished" rather than "wholly undistinguished"
  3. "cursitor baron" rather than "puisne or cursitor baron"
  4. "next month" rather than "ensuing month"

I am assuming that these are due to replacing the image with a better one, but with different text!

Rich Farmbrough, 03:10 24 August 2014 (GMT)

@Rich Farmbrough: Looks as though the scan was replaced c:File:Dictionary of National Biography volume 08.djvu in April 2014, and from today's access I cannot see the older djvu version to know about the scan at that page (it may have been a dud image scan). We should proofread to the scan, so feel free to make the appropriate changes, and any addendum will be added. You can always make a comment in the notes section to the transcluded work of the additions in a later edition, presuming that is what we are actually seeing, though history unknown. — billinghurst sDrewth 00:35, 25 August 2014 (UTC)Reply[reply]

Thanks for the corrections. The issue is caused by my use of the DNB text from the ODNB site, which is a later edition. Here they've added in the MP information, and shortened other pieces of the text to fit it in: better for the reader, but worse for the WS "norm" of faithfulness to the scan and the first edition. There will be other examples: my plan in particular is to find examples in DNB00 where later references have been added by searching for 1901, 1902 etc. In any case I tried in proofing the text to pick up on these changes; but I was not completely successful, and if you find more of the same you can just correct them.

I don't believe the replacements of djvu have caused a change away from the first edition, but it did happen once, I think, and was fixed. Charles Matthews (talk) 04:45, 26 August 2014 (UTC)Reply[reply]
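The plan of searching DNB00 for 1901, 1902 and so on can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text is already available as a string, any four-digit number from 1901 to 1999 is treated as a suspect year, and the function name and sample sentences are hypothetical. Numbers that are not years (a page reference, say) would show up as false positives and need a human eye.

```python
import re

# Any standalone four-digit number beginning with 19.
YEAR_19XX = re.compile(r"\b(19\d\d)\b")


def later_years(text, first_edition_end=1900):
    """Return the sorted years in the text that postdate the first edition."""
    found = {int(year) for year in YEAR_19XX.findall(text)}
    return sorted(year for year in found if year > first_edition_end)


# Hypothetical sample text, for illustration only.
suspect = "He died in 1610 (see a note published in 1902, reprinted 1904)."
clean = "Born in 1561, he died in 1610; see the edition of 1900."
print(later_years(suspect))
print(later_years(clean))
```

An article that returns a non-empty list is a candidate for checking against the scan, not proof of a later-edition change.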

UK Wikisource training

Hi - it would be great to hear a bit more about Wikisource and what UK training sessions might involve. I see from your Wikipedia userpage that you're in Cambridge - I don't suppose you're coming to EduWiki this year in Edinburgh? If not, I am coming down to London in the first week of November for what might be a couple of nights. It would make sense to work in a trip to Cambridge while I'm down south to pick your brain on either the 4th or 5th of November if you happen to be available. I'm doing the Train the Trainers session after EduWiki, and I think I'll focus on using that to work through some ideas for a Wikisource training session. It would be great to be able to speak to you about it and get your input. ACrockford (talk) 11:23, 10 October 2014 (UTC)Reply[reply]

Not at EduWiki. I think it is in the nature of an unsolved problem how to do a basic Wikisource training session, so, yes, some discussion would be good. Charles Matthews (talk) 04:38, 11 October 2014 (UTC)Reply[reply]
Would you be free on either 4th or 5th November? If you prefer to arrange something by e-mail let me know ACrockford (talk) 10:28, 15 October 2014 (UTC)Reply[reply]
I've just seen that there might be a Wikidata meetup in London, evening of 5 November. So I might well be at that. Not confirmed yet, though. Charles Matthews (talk) 14:11, 15 October 2014 (UTC)Reply[reply]
I hadn't seen that, but if so it would be good to attend. Let me know if you'll be there or whether 4th/5th would work. Happy to come through to Cambridge! ACrockford (talk) 12:14, 20 October 2014 (UTC)Reply[reply]
OK, you can mail me from the sidebar a bit nearer the time, and we'll firm something up. Charles Matthews (talk) 16:31, 20 October 2014 (UTC)Reply[reply]
So the Wiki Wednesday meetup is set for 6 to 8 pm at Development House, Wednesday 5 November. I'm intending to be there, and so if you are that would do for a date. Charles Matthews (talk) 11:25, 22 October 2014 (UTC)Reply[reply]
Yes, I was just looking at that. I'll be there for sure - I will probably see if I can hotdesk at WMUK HQ that day anyway and have some meetings with other WMUK staff, would you possibly be able to come a bit before the meetup? If not that's fine too ACrockford (talk) 12:41, 22 October 2014 (UTC)Reply[reply]
Probably, for 5 pm anyway. Charles Matthews (talk) 13:21, 22 October 2014 (UTC)Reply[reply]
That would be perfect then - shall we tentatively set that in place? ACrockford (talk) 12:06, 23 October 2014 (UTC)Reply[reply]
That will do fine, barring the unexpected. Charles Matthews (talk) 04:49, 24 October 2014 (UTC)Reply[reply]

Conversation at WD to note

Just wanted to point to you d:Property_talk:P972#Question_on_usage which is probably relevant to note the proposed alternative and probably what we will use to cite DNB to people. I also see that the contributors to DNB will be listed to each volume of the DNB to which they contributed by d:Property talk:P767. Something that we will need to get to when time permits. — billinghurst sDrewth 09:36, 22 October 2014 (UTC)Reply[reply]

Who Killed Wikipedia?

saw your remarks, wish i could be as sanguine. outsiders rehashing insider worry is news, (this is what we call reliable sources). instead of "who killed?" how about "what ever happened to"; a permanent plateau is tantamount to death for the growth obsessed. i agree about Jemielniak, except that there isn’t any better; there is an unfortunate tendency to sweep under rug and forget how we got here. i dislike linear extrapolation too, but the long term trend worries many including Andrew Lih and Lila Tretikov (and me).

it may be easier for experienced editors, but not for newbies. the work going forward will be harder; we cannot rely on subject matter experts (Wadewitz) showing up, they are all being bitten and going away. in my work with GLAM institutions, i find they will edit during editathons only. i worry that wikipedia is becoming wikinews. it is not enough to be better, but the rate of improvement must be fast enough to inspire confidence. it will take a substantial culture change to get enough editors to make a dent in the quality article backlog; given the resistance to change, it will be ugly.

sorry to go on, i would be interested in any ideas you might have going forward. Slowking4Farmbrough's revenge 00:33, 28 November 2014 (UTC)Reply[reply]

I'm currently active on Wikidata, which is going well, and opening up a new GLAM front (see for what I'm involved in). I'm familiar with Andrew Lih's point of view, but have felt for years that it basically misses what is good about WP (e.g. the DNB push). The view from journalism is on the media side of "media vs. pedia", which is an ongoing tension rather than anything else.
Training is a problem. I have done enough to know that editathons are not the answer. I have my own project (Wikisoba), based on the premise that we have to get better at distance education. The Visual Editor will come and help. Wikidata queries can produce much better "redlink lists" to work on. There are issues specific to enWP, and they shouldn't be confused with those relevant to the movement as a whole. I don't think the WMF gets everything right, but there is enough around to work with. Charles Matthews (talk) 07:18, 28 November 2014 (UTC)
i agree, it is risible- "wikipedia is dead" (long live wikipedia). and the horse race reporting, does not do analysis. perhaps Lih has nostalgia, but i wonder why the culture can’t be more like the early days. now, it seems more a power trip, than problem solving. the veterans have spun off to more civil projects, and the newbies are driven away. the gamesmanship at the expense of the project is discouraging.
training is hard; if we had 1000 wadewitz’s then quality progress would be large enough. user:sadads has a digital humanities idea [6], but for now the quality progress is minuscule; the editathons are really about ally building; helping institutions get their digital resources where researchers can find them. also, there are some editors who prefer the group events rather than the solitary efforts.
as you know VE is opt in, so i have to say to every newbie, turn it on; there is more than enough work, high time to increase the newbies to do it. enWP gets confused as a matter of course, but i’m waiting for the civility enforcement, as promised at wikimania. no sign that i can see. Slowking4Farmbrough's revenge 01:37, 1 December 2014 (UTC)Reply[reply]
The constants of the WP community are: dynamism, blind spots, rhetorical excess. The Visual Editor will come along—at Wikimania I was looking over James Forrester's shoulder, and he was fixing up the "hidden comment" beef (comments lodged in the wikitext are not prima facie visible to the VE user). This is kind of fine detail. I suppose the WMF would like it the default for newbies, without the posturing on both sides you get when the WMF tries to strong-arm the community for its own good.
I would feel happier if the WMF showed some grip on the issue of what the newbie editor experience should be. In practice training sessions show they get many captchas and other things running interference.
I had an idea last week for a "Wikidata Article Wizard" that would start stubs in a new nursery space, where the Reasonator could be used to visualise them. Charles Matthews (talk) 15:14, 1 December 2014 (UTC)Reply[reply]

New Proposal Notification - Replacement of common main-space header template

Announcing the listing of a new formal proposal recently added to the Scriptorium community-discussion page, Proposals section, titled:

Switch header template foundation from table-based to division-based

The proposal entails the replacement of the current Header template familiar to most with a structurally redesigned new Header template. Replacement is a needed first step in series of steps needed to properly address the long time deficiencies behind several issues as well as enhance our mobile device presence.

There should be no significant operational or visual differences between the existing and proposed Header templates under normal usage (i.e. Desktop view). The change is entirely structural -- moving away from the existing HTML all Table make-up to an all Div[ision] based one.

Please examine the testcases where the current template is compared to the proposed replacement. Don't forget to also check Mobile Mode from the testcases page -- which is where the differences between current header template & proposed header template will be hard to miss.

For those who are concerned over the possible impact replacement might have on specific works, you can test the replacement on your own by entering edit mode, substituting the header tag {{header with {{header/sandbox and then previewing the work with the change in place. Saving the page with the change in place should not be needed but if you opt to save the page instead of just previewing it, please remember to revert the change soon after you're done inspecting the results.

Your questions or comments are welcomed. At the same time I personally urge participants to support this proposed change. -- George Orwell III (talk) 02:04, 13 January 2015 (UTC)Reply[reply]

Is this category still wanted?

Hi, I've been doing some gnome work and just looked at Special:WantedCategories. One of the higher in the list is Category:Index talk subpages with 62 entries, all of which are for DNB. Now that the DNB has been proofread, are these pages still needed? (I'd rather not create this category, if I don't need to.) Beeswaxcandle (talk) 05:14, 23 August 2015 (UTC)Reply[reply]

No immediate need, certainly. Charles Matthews (talk) 07:11, 23 August 2015 (UTC)Reply[reply]

Venn's site at Cam

Hi Charles. Through your contacts are you able to find out what has happened to CamLib's online version of Venn's Alumni Cantabrigienses. It used to be at and now seems to be blocking access, and I cannot find the entry portal (if there is one). It was a great resource, and it would be a shame if it has been withdrawn. Thanks. — billinghurst sDrewth 03:45, 1 November 2015 (UTC)Reply[reply]

It has been down for a few days now. I once saw the message change, so I'm hoping this is just maintenance. If the outage persists, I'll ask Dsp13, who was once involved, and knows the folk.
BTW, ACAD as it is now called forms one of the mix'n'match datasets, meaning to some extent there is usable (partial) information anyway. If you don't yet use the whole-mix'n'match search, you might find it useful. Search within ACAD itself can be found from the mix'n'match main page.
Speaking of alumni, I have a theory about Alumni Oxonienses part I that may prove of interest. I'm dabbling in AO part II here, in relation to the FRS catalog on mix'n'match, which has just been completed. Much to do in tracking down minor Fellows of the Royal Society. Charles Matthews (talk) 17:49, 1 November 2015 (UTC)Reply[reply]
The word on Venn is that it is migrating, on a time scale of a couple of weeks. Charles Matthews (talk) 08:55, 9 November 2015 (UTC)Reply[reply]


hi, i have been adding cite ODNB links as i go along. however, recently i encountered a mobile wifi wall that prevents getting the metadata of the ODNB article. for example [7]; [8] i can find the id number there, but cannot credit the author properly. i seem to recall the article landing page being a little more friendly, maybe a word would convey that we noticed. Slowking4RAN's revenge 22:18, 31 December 2015 (UTC)Reply[reply]

An alternative way to get the ODNB article authors is via Charles Matthews (talk) 01:21, 1 January 2016 (UTC)Reply[reply]
yes thanks, i see they have "refreshed" the front end. doesn't do anything for me, but i can get it into the template. Slowking4RAN's revenge 03:31, 1 January 2016 (UTC)Reply[reply]

Constructed an index

Found that we had a set of validated index pages, so I have done a construct at the base of Dictionary of National Biography, 1885-1900/Vol 28 Howard - Inglethorpe. It is just some people's renditions of edits with a tidy-up veneer by me. It is there for opinion, of which I have yet to establish anything particular at this point, beyond it needs work. Anyway, there for your opinion too. — billinghurst sDrewth 12:09, 19 March 2016 (UTC)Reply[reply]

Hmmm, thanks, interesting addition. On the theory that lists should be constructed by bot from Wikidata, what are we looking at here?
I assume Listeria could turn out such lists, with a bit of extra processing. What properties would be involved? Page (P304) with a bit of a stretch. Title (P1476) could legitimately be used to place the index title on the Wikidata data page. Possibly number of pages (P1104) belongs there; in a sense "final page" would be better, but that might require a new property.
Then volume (P478) should in any case be placed on the data pages, along with the author name, publication date. There are "follows", "followed by" properties to code up the article ordering. A lot of work to put this all into Wikidata, naturally.
The advantages would be: the generation of complete references for the DNB articles could be automated, given just the item number. The index listing of this kind could be generated from the initial article's item number, and use of "followed by".
Caveat: the "see articles" properly belong in sequence, and we have ducked their role so far. Their "title" can involve a hyperlink. This would require some further thought. Charles Matthews (talk) 12:40, 19 March 2016 (UTC)Reply[reply]
At this point of time, I am simply being true to the published work, and transcluding pages. — billinghurst sDrewth 13:29, 19 March 2016 (UTC)Reply[reply]
I'm aware that the perfect is the enemy of the good. Charles Matthews (talk) 07:22, 20 March 2016 (UTC)Reply[reply]
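The idea of generating an index listing from the initial article's item by walking "followed by" can be sketched in Python as below. This is a toy illustration over an in-memory dictionary, not a query against Wikidata: the item keys, titles and page numbers are made up, and the field names simply stand in for title (P1476), page (P304) and the "followed by" property mentioned above.

```python
def index_listing(items, start):
    """Walk the followed_by chain from start and collect (title, page) pairs."""
    listing = []
    seen = set()  # guards against a chain that loops back on itself
    current = start
    while current is not None and current not in seen:
        seen.add(current)
        entry = items[current]
        listing.append((entry["title"], entry["page"]))
        current = entry.get("followed_by")
    return listing


# Made-up items, for illustration only.
sample_items = {
    "item-a": {"title": "First article", "page": 1, "followed_by": "item-b"},
    "item-b": {"title": "Second article", "page": 3, "followed_by": "item-c"},
    "item-c": {"title": "Third article", "page": 4},
}
for title, page in index_listing(sample_items, "item-a"):
    print(title, page)
```

The "see articles" caveat still applies: entries of that kind would need their own place in the chain before a listing like this matched the printed index.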


Hi Charles, I've sent you an email! Ed Erhart (WMF) (talk) 05:16, 25 May 2016 (UTC)Reply[reply]

Organisations become portals, not authors

Hi. I have just stomped through the headers for The Report of the Iraq Inquiry and converted links to relative links, got rid of year in subpages, and other tidying to style. [Noting that the gadget we have should have done that stuff automagically with the right lead page.] I have also converted the author parameter to use | override_author = by the Iraq Inquiry, chaired by [[Author:John Chilcot|]]. It was a community decision a way back that if we are talking about an organisation that they should be a portal: ns page, not an author: ns page. So either we can have a portal page [[Portal:Iraq Inquiry]] that references each of the people on the commission and houses the links to the work, or we can have it as above with indicative text and a lead author. I hope that my boots are not overly stompy. :-/ — billinghurst sDrewth 02:00, 13 July 2016 (UTC)Reply[reply]

No, no, all helpful. Doing something about the report came up recently on a UK mailing list, and when I realised Tom Morris had actually started over the weekend I decided to do some transclusion, to give a model for others. So better to have it all in house style. Charles Matthews (talk) 05:44, 13 July 2016 (UTC)Reply[reply]

visual editor find and replace

fyi, i see they have find and replace functionality, in the visual editor in page space. give them your feedback, they are relatively responsive. Slowking4RAN's revenge 18:55, 30 July 2016 (UTC)Reply[reply]

Thanks for that. Charles Matthews (talk) 07:30, 31 July 2016 (UTC)Reply[reply]

A biographical dictionary of eminent Scotsmen — no WP articles

Hi CM. Flagging local articles without something corresponding at enWP

moved to Index talk:A biographical dictionary of eminent Scotsmen, vol 1.djvu

billinghurst sDrewth 12:11, 17 February 2017 (UTC)Reply[reply]

I see that you have grouped the apocryphal DNB to an article, and we may wish to look at how we do that more broadly. — billinghurst sDrewth 12:31, 17 February 2017 (UTC)Reply[reply]

I have wanted for a while to tabulate w:Apocryphal biographies in the Dictionary of National Biography with the reasons anyone thought the people existed. The same few names (Bale, Dempster, Tanner) come up. Funnily enough, I was having a conversation in a Cambridge bookshop not long ago, and was told that around the time of WWII it was a donnish game, to get a fake person into the DNB supplements. Not that I believe everything I'm told. Charles Matthews (talk) 15:42, 17 February 2017 (UTC)Reply[reply]

Then how do we manage these in WD? They are not fictional characters, they are not humans. Collective article per work probably isn't sustainable, and we still need to manage the pseudo-person. Hmm, maybe I should start that conversation there to see what they want. — billinghurst sDrewth 22:53, 17 February 2017 (UTC)

As far as I'm concerned they are fictional humans, as in instance of Q15632617. Maybe some qualification of that is required, though. Charles Matthews (talk) 05:40, 18 February 2017 (UTC)Reply[reply]

Asking suggests to use d:q21070568, and that is my addition in the aliases — billinghurst sDrewth 10:59, 18 February 2017 (UTC)Reply[reply]

Well, OK. I think what they really are could be called "artefactual human identity", in the sense that they existed once some scholar validated them in a published work. This anyway seems to be a specialised ontological niche, and Wikidata could cope with having a special subclass.

That said, in order to do anything with the item one has to know how to handle questions of identity. For example, monk A who wrote manuscript M and monk B who wrote manuscript N can sometimes be identified as the same person, by scholars. We then, usually tacitly, just merge the items without more ado, assuming A, B, M and N existed IRL. But things can get a lot more complicated ... and "more ado" is required.

I don't have doctrinaire views here: my initial concern was to prevent Wikipedia articles being created that would propagate DNB errors. The tabulation I mentioned would give case studies, which might help to nail this issue. Charles Matthews (talk) 11:32, 18 February 2017 (UTC)Reply[reply]

Not my area of speciality; I suppose I am looking to those more acquainted with it to have an opinion. (Presumably we can update item labels without much fuss.) Sometimes I am just a simple transcriber. The discussion at d:Project chat has a few curves within it. — billinghurst sDrewth 09:42, 19 February 2017 (UTC)


congratulation on WiR ContentMine ! we need a major push to get published metadata into wikidata. let me know if i can help do some cleanup. Slowking4SvG's revenge 13:29, 20 April 2017 (UTC)Reply[reply]

Thank you. d:Wikidata:WikiProject Source MetaData has been put on my agenda already. There is much to do, and I expect to know more next week. One clear issue is "manual of style": as with much of Wikidata, any conventions about formats to use may remain implicit. We are about at that point where there needs to be discussion, and initially that would be on talk pages of properties, trying to get consensus on how best to add metadata in this area. So, d:Template:Bibliographical properties may summarise the state of the art. Charles Matthews (talk) 05:00, 21 April 2017 (UTC)Reply[reply]

applying text layer for Catholic Encyclopedia

I have started giving Wikisource-bot the task of applying the text layers for the 17 volumes of CE. Should have them done by the end of the week. I have already stuck {{engine}} into the volume template so we can search the vols. As I am hunting through the vols. for author data, I may as well start converting push/pulling those existing articles into transcriptions. — billinghurst sDrewth 13:23, 10 April 2018 (UTC)

That's good. Charles Matthews (talk) 15:22, 10 April 2018 (UTC)

Too late for me to sort this out. We have two authors assigned to the same set of DNB12 works. Do you have a quick and easy answer without me digging through other works? — billinghurst sDrewth 15:29, 5 May 2018 (UTC)

@Billinghurst: I'd assign them all to Horace, the younger man, who already has the EB1911 articles. Henry (his uncle) was distinguished enough, but would have been 80 years old at the time. Charles Matthews (talk) 16:07, 5 May 2018 (UTC)


Wondering if I could ask an odd favor, seeing that you know basic Russian at the Admin page. I am trying to translate a public domain poem into Russian. I have played around with Google translate, and have come up with the following after many attempts:

Mечтать о Великой Мечте

Мечта о Великой Мечте, хотя вы должны мечтать - вы, только,
И следуйте, без друзей, в поисках высокого квеста.
Хотя мечта приведет вас в пустыню, в одиночку,
Или перетащить вас, как буря, без отдыха,
Тем не менее, пытаясь подняться до самого высокого алтаря,
Поместите свой высокий дар туда перед богами:
Человеческое сердце, чья храбрость не дрогнула,
Хотя Он был далеко, как Арктур, когда он мерцал.

Не спрашивайте, видят ли другие люди свечение,
Они не разделяют ваше желание или страсть;
Не обескураживайте, если это определено детьми земли—
Земля - их богиня, и она только посредственная!
Душе нуждается пророк и искупитель:
Ее крылья вытянуты против ее тюремных баров,
Она ждет истины, и правда с мечтателем,—
Непрерывный, как бесчисленный звездный свет!

The original is as follows:

Dream the Great Dream, though you should dream—you, only,
And friendless follow in the lofty quest.
Though the dream lead you to a desert lonely,
Or drive you, like the tempest, without rest,
Yet, toiling upward to the highest altar,
There lay before the gods your gift supreme,—
A human heart whose courage did not falter
Though distant as Arcturus shone the Gleam.

The Gleam?—Ah, question not if others see it,
Who nor the yearning nor the passion share;
Grieve not if children of the earth decree it—
The earth, itself,—their goddess, only fair!
The soul has need of prophet and redeemer:
Her outstretched wings against her prisoning bars,
She waits for truth; and truth is with the dreamer,—
Persistent as the myriad light of stars!

Does the translation seem adequate to you? It need not be perfect, just so that the general sentiment may be understood. If you are unable to assist in correcting any errors, there are no hard feelings! Feel free also to pass the request off to someone else if you know of anyone who might be able to help! Many thanks either way, Londonjackbooks (talk) 16:17, 8 June 2018 (UTC)Reply[reply]

I haven't spotted grammatical errors, so far, which would be about my limit (three years of Russian at school, a long time ago). I would write "ее" as "её" since phonetically it is "yeyo". I wasn't sure about "против ее тюремных баров", where I would expect a form of свои, namely своих.
Really poetry, and permissible forms there, is over my head. I can't pretend to native speaker feeling for it. Charles Matthews (talk) 16:34, 8 June 2018 (UTC)Reply[reply]
Thank you! Eloquence is not sought, just a basic, dry, literal translation so an acquaintance can understand the gist. I appreciate your taking a look :) Londonjackbooks (talk) 16:42, 8 June 2018 (UTC)

Ancestry's 1939 England and Wales Register

Just to note that with the release of the England and Wales Register at Ancestry, it is pretty good for listing DoB, which then allows us to track some further. So can be useful if we know alive at date, and at a location. — billinghurst sDrewth 02:57, 23 June 2018 (UTC)Reply[reply]

Thanks for that. Charles Matthews (talk) 06:40, 26 June 2018 (UTC)

Use of Tabernacle for easier biographical article creations

Asking as it is easier (and lazier). :-)

Using the WEF framework gadget, I have been one by one adding the biographical articles into WD, either as I have been creating them, or retroactively when someone else has been creating them. See Special:RecentChangesLinked/Dictionary of Indian Biography/A. It isn't too bad, though it is repetitive, especially for a number of the items added, though it doesn't allow for the quality tag to be added. Example of creation d:Q57159798.

Is there a means using Tabernacle to create these items as a batch/stream, with the interwiki with quality flag, and the items with the qualifiers? If we can, we have some good ability to go back and backfill heaps of other works. Thanks. — billinghurst sDrewth 00:57, 10 October 2018 (UTC)

I suppose my first thought on this issue would be (a) a category here for the work and (b) Petscan for creation of items. I hadn't thought of Tabernacle, but I guess it could be used on the output of that workflow. Charles Matthews (talk) 03:49, 10 October 2018 (UTC)Reply[reply]
I haven't found Petscan valuable for these sorts of matters as it is just creating too many disparate steps (fwiw, Petscan can create from subpage listing, MM fulfilled that request for me). It seems that nothing does the interwiki quality flag, and it keeps needing to be done individually, all a PITA. — billinghurst sDrewth 10:29, 10 October 2018 (UTC)Reply[reply]

A request

Hello. At the moment, uploading scans and creating index pages for them is beyond what I am capable of. I was wondering if someone could upload the scans for Halsbury's Laws of England (First Edition) and create the index pages for them. If this is done, I can do the individual pages. James500 (talk) 23:27, 12 March 2019 (UTC)Reply[reply]

@James500: I'll take a look at it over the weekend. I've not done the operations before; but I have more of a tech background now than when I joined Wikisource ten years ago. Charles Matthews (talk) 04:41, 13 March 2019 (UTC)Reply[reply]
@James500: I have set up Index:Halsbury Laws of England v1 1907.pdf now for the first volume: it's still in a fairly basic state. There are a couple of hundred pages of front matter, with roman numerals. I've created the first six pages of the text proper, for you to look at. Charles Matthews (talk) 12:19, 17 March 2019 (UTC)Reply[reply]
Thank you. James500 (talk) 13:36, 17 March 2019 (UTC)
(stalker) think i have the index sorted. cheers. Slowking4SvG's revenge 01:11, 18 March 2019 (UTC)


Hello. I was wondering if someone could advise me on which wikisource a particular text belongs. Does the Rolls Series belong on the English Wikisource, or does it belong on the old (multilingual) wikisource, or do parts of it (specifically the English parts and the bilingual parts) belong on both? James500 (talk) 06:27, 9 April 2019 (UTC)Reply[reply]

I suppose texts in w:Law French might be considered for the French Wikisource. Try the Scriptorium for more authoritative advice. Charles Matthews (talk) 06:44, 9 April 2019 (UTC)

How to OCR this type of page

Hi Charles, I want to ask about how you proofread this type of page. In Punjabi Wikisource we also have a book like that. Can you help me with that? --Talk 07:42, 8 October 2019 (UTC)

@Benipal hardarshan: I think you mean pages with two columns of text. It can certainly be a problem for OCR, because in the worst case the software confuses the two columns, and you have to sort it out by hand. Mostly for that DNB work the scanning was OK, though. It is an issue at the beginning of the scanning process, and I don't know about the software involved. Charles Matthews (talk) 07:53, 8 October 2019 (UTC)Reply[reply]

A request

I have been getting messages about page creation from Billinghurst. I fear that he might try to use admin tools to prevent me manually creating pages in the page namespace at some point in the future. I asked him what his intentions are, but he has not answered. I have discussed this matter with Peteforsyth. He agreed that Billinghurst's messages do not seem based in policy and suggested that I approach another admin. James500 (talk) 16:19, 24 May 2020 (UTC)Reply[reply]

@James500: At least at first sight, it appears to me that there may not be too much for you to be concerned about. For some repetitive tasks, a bot could be used. If you requested bot support for such a task, it would be regulated by the bot policy, naturally. You don't have to do that, certainly. It is the sort of thing that could be helpful to those patrolling recent changes. Charles Matthews (talk) 19:04, 24 May 2020 (UTC)Reply[reply]

Request about 1974 Yugoslav Constitution

Hello. I've been trying to edit the 1974 Yugoslav Constitution to clean up the formatting so it will better match the format used on constitutions, but the editor continuously blocks the publication of my changes and says that my action matches Abuse 23. I believe my edits were constructive, however, and in accordance with the error message am reaching out to you to ask: why does the system consider my actions harmful? and what, if anything, can I do? (The Professor (Time Lord) (talk) 23:36, 14 February 2021 (UTC))Reply[reply]

The problem is that there's an apostrophe in the title. I fixed it by adding the "onlysection" parameter with the subpage title inside. There are as of now at most 287 pages with that issue. I'm fixing some now as we speak. ミラP@Miraclepine 03:46, 5 April 2021 (UTC)Reply[reply]

Thanks for the help! Charles Matthews (talk) 05:43, 5 April 2021 (UTC)

oxon preload

Gday. I have added a preload function in the top component of "Alumni Oxon" subpages. So you should be able to click that bit to remove {{header}} and load {{oxon}} with page creations. I do know that often that is not your approach to prepare in situ, though thought it important to say. — billinghurst sDrewth 00:13, 15 September 2021 (UTC)

Mismatch between text, scan and errata in DNB

Hi. I was adding errata from 1904 to articles. I realized that in some cases text does not match the scan, as if the text comes from an older version compared to the scan. See Dictionary_of_National_Biography,_1885-1900/Bankes,_Henry for example and search for 'Woodward / Woodley' in text, scan and errata.

I also came across the opposite, where the text was more updated than the scan and already included the errata changes, as if the text was copied from a more recent source compared to the scan, see Dictionary of National Biography, 1885-1900/Campbell, Guy. In this case I updated the text to reflect the scan content (see this).

Just wanted to highlight this, as you have the history of the project. I am not sure what to do with these mismatches, and with the errata handling. Mpaa (talk) 10:22, 19 September 2021 (UTC)

@Mpaa: Yes, after my early days on the DNB project, I used text from It is proofread text of the DNB, but from an edition later than the first editions. Therefore it has some corrections and additions.
I made a big effort to modify it to match the first edition text, but I wasn't 100% successful. From the point of view of being faithful to the original, clearly, the text should be fixed up with the scan.
I haven't worked with the errata. Charles Matthews (talk) 05:31, 20 September 2021 (UTC)

How we will see unregistered users


You get this message because you are an admin on a Wikimedia wiki.

When someone edits a Wikimedia wiki without being logged in today, we show their IP address. As you may already know, we will not be able to do this in the future. This is a decision by the Wikimedia Foundation Legal department, because norms and regulations for privacy online have changed.

Instead of the IP we will show a masked identity. You as an admin will still be able to access the IP. There will also be a new user right for those who need to see the full IPs of unregistered users to fight vandalism, harassment and spam without being admins. Patrollers will also see part of the IP even without this user right. We are also working on better tools to help.

If you have not seen it before, you can read more on Meta. If you want to make sure you don’t miss technical changes on the Wikimedia wikis, you can subscribe to the weekly technical newsletter.

We have two suggested ways this identity could work. We would appreciate your feedback on which way you think would work best for you and your wiki, now and in the future. You can let us know on the talk page. You can write in your language. The suggestions were posted in October and we will decide after 17 January.

Thank you. /Johan (WMF)

18:14, 4 January 2022 (UTC)

Descendant of Gileon Delaune

I read your article about Gileon Delaune and you mentioned that you are related to him. I too am related. He was my 9th Great Grandfather. Gileon Delaune, III, then Thomas are other Great Grand Fathers of mine. My Grand Mother was Emma Dulaney. The spelling of the name changed when the family came to the United States. If you have any other information about him or the Delaune family, could you please direct me toward it? 17:05, 3 September 2022 (UTC)

Alumni Oxonienses

Hi Charles,

Awesome work on Alumni Oxonienses!

But I have a question… Why'd you pick specifically <div class="leftoutdent"> to format the entries? I ask because, for mainly technical reasons, the leftoutdent class needs to go away, and right now the Alumni Oxonienses entries are the only users of that class (or, well, I think it is; the insource search caps out at 10k hits so it can't cope with the sheer majestic scale of the Alumni Oxonienses :)).

Could we replace it with one of the standard dynamic layouts (Layout 2 should be a generally inoffensive replacement, I think)?

And while I'm bugging you with questions, why did you use the {{#tag:pages|…}} syntax instead of the standard <pages …/> syntax? As far as I can see they are exactly equivalent for this particular use case (the #tag: syntax is only really needed in very narrow circumstances), so the more standard one would be generally preferable. Xover (talk) 17:32, 13 October 2022 (UTC)Reply[reply]

@Xover: On the technical points: I was just copying syntax in use already at the time. Billinghurst was making bot runs replacing the header template anyway. There was no discussion about what the project should be using that I recall.
The outdent is part of the original look. Given that, if someone wants to make the whole 63K consistent and updated with a bot, that would be OK with me. Charles Matthews (talk) 04:52, 14 October 2022 (UTC)Reply[reply]
Ah, thank you. Every single entry that I checked was created by you, so I just figured you were the one that done that part of the work. But then, as mentioned, MediaWiki "just" shows me the first 10k entries. :)
In any case, thanks for clarifying. I know people (myself included) can get attached to a particular way of doing things, so I didn't want to step on any toes. I'll dig a bit further and then run a bot over them to update these two aspects. Xover (talk) 06:33, 14 October 2022 (UTC)Reply[reply]

@Xover: The #tag:pages syntax just becomes a whole lot easier at times, you don't need to plug in underscores, nor wrap in quotes, and it is equivalent. Plus you don't have the issue of subst: not working when doing things. — billinghurst sDrewth 13:16, 31 December 2022 (UTC)Reply[reply]
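
For reference, a minimal sketch of the two forms compared in this thread. The index file name `Example volume.djvu` and the page numbers are hypothetical placeholders, not taken from the Alumni Oxonienses pages; only the general shape of each syntax is meant to be illustrated.

```wikitext
<!-- Tag form: attribute values that contain spaces have to be quoted -->
<pages index="Example volume.djvu" from=10 to=12 />

<!-- Parser-function form: empty content slot, then name=value pairs, no quoting -->
{{#tag:pages||index=Example volume.djvu|from=10|to=12}}
```

Both forms should produce the same transclusion. The parser-function form is mainly of interest where the values come from template parameters or the call has to survive subst:, as mentioned above.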

AO early series

I have uploaded the 4 volumes of the early series. Vol. 2 is not the best, though I could only find the one scan out there. If you know of an alternate, we should look to fix or reload.
Early series: vol. 1, vol. 2, vol. 3, vol. 4
Later series: vol. 1, vol. 2, vol. 3, vol. 4

I will still need to run the bot through to apply all the layers, though thought that I would drop the note. — billinghurst sDrewth 13:11, 31 December 2022 (UTC)

OK, thanks. At w:User:Charles_Matthews/alox I only have the one scan for Vol. 2 of alox1. If I had found another one in use on enWP, I would presumably have added it. I'll pay attention when adding the citation template over there. Charles Matthews (talk) 14:16, 31 December 2022 (UTC)Reply[reply]

Errata DNB

Hi. How is one supposed to validate this Page:Dictionary of National Biography. Errata (1904).djvu/77 with such formatting in Page ns? What is the gain in Main ns? Mpaa (talk) 17:31, 9 May 2023 (UTC)

@Mpaa: Might be a good question. The errata should appear correctly when transcluded to the pages Dictionary of National Biography, 1885-1900/Clarke, Alured (1745?-1832) etc. to which they apply. So the text should be accurate and the markup should be working, for the whole block of articles relevant to that errata page. Charles Matthews (talk) 06:05, 10 May 2023 (UTC)Reply[reply]
I never noticed major disruptions in Main ns with my version (never checked mobile version, only desktop). Some time ago I compared Main ns HTML source code and, if I remember correctly, the difference was just a <p> tag. Have you seen something blatantly wrong? Mpaa (talk) 15:43, 10 May 2023 (UTC)Reply[reply]
I may have misunderstood what was going on, in passing. The common error to fix is just a straightforward typo. Charles Matthews (talk) 15:46, 10 May 2023 (UTC)
Not sure if I got you correctly. With ref to this diff, the typo "be" -> "he" is OK. What I don't get is the need for this kind of change <section begin="Clarke, Alured (1745?-1832)" />|416 because it breaks the formatting in Page ns. and I do not see benefits in Main ns rendering. Mpaa (talk) 15:56, 10 May 2023 (UTC)
So I'm not really understanding the newline business, and how it is apparently interfering with the table format. But I'm not doing such edits at present, and will try to avoid them in future. Charles Matthews (talk) 16:08, 10 May 2023 (UTC)
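
For readers following the section-marker question above, a minimal sketch of how labelled section transclusion is generally laid out. The erratum text and its placement are illustrative assumptions, not the actual markup of the errata page, and the direct `<pages />` call is likewise an assumption about how the section might be pulled in. The index name, page number and section label are the ones mentioned in this thread.

```wikitext
<!-- Page: namespace: the markers delimit the part that belongs to one DNB article -->
<section begin="Clarke, Alured (1745?-1832)" />(illustrative erratum text)<section end="Clarke, Alured (1745?-1832)" />

<!-- Main namespace: transclude only that labelled section from the errata page -->
<pages index="Dictionary of National Biography. Errata (1904).djvu" from=77 to=77 onlysection="Clarke, Alured (1745?-1832)" />
```

Where the markers sit relative to the table row and cell delimiters affects whether the table still renders in the Page namespace, which appears to be the point at issue here.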