Wikisource talk:WikiProject DNB



The supplements

Organizational point: there were three DNB supplements published in 1901, mainly catching up with folk who had died after the relevant volume was completed. It would make sense to handle these in parallel with the 1900 DNB: but how exactly? Charles Matthews (talk) 16:08, 6 May 2009 (UTC)

I believe that we are managing these with the later year of publication, eg. DNB01, and so on. I don't think that there was anything against extending, more focussing on something achievable(???) -- billinghurst (talk) 08:02, 7 September 2009 (UTC)
Coordinating the establishment of DNB01. Will require review to ensure files and index pages fit within existing DNB framework to facilitate tracking, template interoperability and the like. JamAKiska (talk) 17:34, 3 October 2010 (UTC)


I'm just getting up to speed on Category:Problematic, which currently has 17 DNB pages. If that is to be used as a general place for cleanup of scans, it is probably going to need its own subcategory subsystem and a bit of infrastructure. I suggest systematic use of discussion pages, in reference to the various kinds of issues such as are discussed under the previous topic. Charles Matthews (talk) 10:34, 27 October 2009 (UTC)

Sure, however, for specific editions, we can manage it by adding notes to the respective Index_talk pages. We can also look at DNB IndexPages to get an overall picture of where the problematic pages are situated. I also remember someone showing me how you could search for the union of two categories. billinghurst (talk) 11:29, 27 October 2009 (UTC)

Treatment of "redirect" articles?

Have we decided how to treat the "redirect" articles in the original DNB? I just added Osborne, Edward (DNB00). Its predecessor (see Page:Dictionary of National Biography volume 42.djvu/290) is a "redirect." Should I create an article for Osborne, Dorothy (DNB00), or not? If not, what should I use as the "previous" for the Osborne, Edward (DNB00) article? -Arch dude (talk) 02:39, 9 January 2010 (UTC)

This point has not been decided. My view is this: let's not include any redirect pages in the volume ToCs. Let's not include them in the "previous" or "next" fields, either; what is helpful is to go directly to the previous or next full biography. Where they can fit in is as linked from the wikified Volume Index pages. In other words, in creating hyperlinks from the Index pages at the ends of each volume, there is the scope to create a wikilink to a short page for these various types of redirects. Charles Matthews (talk) 10:03, 10 January 2010 (UTC)

Blind DNB links

Have been introduced to our first blind DNB link. Where they did a qv in an earlier volume, and subsequently it would seem that they have decided against the biography.

We need some thinking for how we wish to handle this.

I don't think that we should make the qv wikilinks disappear, and feel that we could look to having a page without any body text, with a standard DNB00 header, though with something in the note field that says something along the lines "while another DNB biography links here, the biography was not undertaken in the published volumes 1-63." — billinghurst sDrewth 00:08, 7 February 2010 (UTC)

Preferably with something added: "The wikilink to this page is a placeholder. Feel free to improve it with an internal or external link". And we should probably have a project page somewhere noting the blind links, and so that such changes can be logged, to avoid duplicated efforts. Charles Matthews (talk) 10:32, 7 February 2010 (UTC)
There it is probably easiest to create a {{maintenance category}}, do we use Category:DNB blind linked page, and slap it onto Category:DNB? If the addition is any more than simple I think that we should template whatever we have there so we can more easily update the words. Should we create a page as a pilot? billinghurst sDrewth 12:48, 7 February 2010 (UTC)
Shouldn't we both keep it simple, and avoid any mission creep within Wikisource? I propose that we simply have one page (a subpage of this project) to which such blind links (and this may be the only one) would direct. If anything more than that is needed, then what am I missing? Jan1naD (talkcontrib) 14:49, 7 February 2010 (UTC)
Like it centurion! Make them redirects to the page, and we can add each bio name to a compile list as we find them. As redirects also easy to undo if we find it in a weird spot. billinghurst sDrewth 16:34, 7 February 2010 (UTC)
OK, can do it all on one project page if the cross-namespace linking isn't considered weirdness. Charles Matthews (talk) 08:41, 8 February 2010 (UTC)
I cannot see why we cannot do it in the main ns, make it a specific subpage of DNB, it keeps the work together. Not perfect perfection, however, it should do. billinghurst sDrewth 12:27, 8 February 2010 (UTC)

Set it up then, can be moved whenever. Charles Matthews (talk) 18:08, 8 February 2010 (UTC)

Please make it clear that these are blind links (i.e. to articles that were never added to the DNB) rather than redlinks (i.e. to articles that WERE added to the DNB but which have not yet been transcribed into the Wikisource DNB). We will likely need to re-emphasize this to new contributors occasionally. I recommend a brief note for readers on the new page itself, and a longer note for contributors on its talk page. -Arch dude (talk) 14:37, 9 February 2010 (UTC)

Blind Link adds from Volume 28 & 33 Found these over the last month.

The following two currently have only Wiki articles available.

JamAKiska (talk) 23:15, 28 May 2010 (UTC) All of these Blind links are now red links. Those with wiki links have been moved to [q.v.] symbol. JamAKiska (talk) 16:42, 30 May 2010 (UTC)

Mitchel, Jonathan (DNB00), Rev. John Cotton (d. 1652). JamAKiska (talk) 13:39, 31 August 2010 (UTC) Included in 1901 edition of the 2nd supplement. JamAKiska (talk) 00:52, 1 October 2010 (UTC) redirected to DNB01

I have created Blind link target page (DNB00) to get moving on this, and linked one page to it. No header so far. Charles Matthews (talk) 10:06, 29 October 2012 (UTC)

I have moved the page to Dictionary of National Biography, 1885-1900/Blind links which makes it a subpage of the original work, rather than sitting as a top level biography. Can I note that it is a subpage of the DNB00, which may or may not prove the best place rather than under the overarching name, and that depends whether there are any blind links in the later supplementary or errata editions. I also believe that we should leave the link as a standard inline link {{DNB lkpl}} and that from that page it can be redirected to the catchall page, e.g.
#redirect [[Dictionary of National Biography, 1885-1900/Blind links#Kearns, Mogue (DNB00)]]
At the catchall page I have added a ToC and added entries with
{{anchor+|Kearns, Mogue (DNB00)}}
so they know exactly where they are landing, plus this gives us an opportunity (if we so choose) to direct the person onwards. After I have finished my other round of article <-> author link maintenance I will try to move to this task. — billinghurst sDrewth 10:23, 10 January 2013 (UTC)
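The redirect and anchor lines described above follow a fixed pattern, so they could be generated rather than typed. A minimal sketch, assuming the catchall page name given in this thread; the function names are hypothetical and only produce wikitext strings, they do not edit any page:

```python
# Sketch only: builds the two wikitext lines shown in the comment above.
# The function names are illustrative and are not existing project tools.

CATCHALL = "Dictionary of National Biography, 1885-1900/Blind links"


def blind_link_redirect(title, catchall=CATCHALL):
    """Wikitext for the page at `title`, redirecting to its anchor on the catchall page."""
    return f"#redirect [[{catchall}#{title}]]"


def blind_link_anchor(title):
    """Wikitext for the matching entry on the catchall page."""
    return "{{anchor+|" + title + "}}"


if __name__ == "__main__":
    print(blind_link_redirect("Kearns, Mogue (DNB00)"))
    print(blind_link_anchor("Kearns, Mogue (DNB00)"))
```

Keeping the title string identical in both lines is the point: the redirect only lands on the right entry if the anchor text matches exactly.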
Thanks. I have noted the new page on Wikisource:WikiProject DNB#Index of project internal links. Charles Matthews (talk) 08:37, 11 January 2013 (UTC)
As discussed elsewhere, James Wilson (1765–1821)James Wilson (anatomist) is added. I think that we could design it into a table (template the rows to make it easy) that includes a link to enWP. The discussion is a little on formatting ... keep it simple, two columns, label and enWP, or space it a little more. <shrug> — billinghurst sDrewth 15:39, 13 January 2013 (UTC)

Wikisource:WikiProject DNB/Wikification

Trying to grasp the nettle of what we mean by saying that articles should be hyperlinked. Couple of points. The "blind links" discussion just above this should be referenced by this new project page, mentioning what is to be done about the qvs that can't be resolved. And there is the possibility that some wikilinks should run to a disambiguation page, to give the reader a choice of references rather than picking out one. This is also something to mention on the page, depending on what we think. Charles Matthews (talk) 09:54, 15 February 2010 (UTC)

The disambig page is a good idea. Better than my (unpublished) thought that something like [Abercorn Earls of. See Hamilton] should simply point to the contents page that lists the Hamilton articles. Jan1naD (talkcontrib) 10:12, 15 February 2010 (UTC)
What is said at Wikisource:Style guide#Disambiguation pages is not particularly well adapted to reference texts (the common experience). In practical terms it doesn't seem to annoy anyone to create a dab page about a given topic, including reference work articles with closely-related titles. Linking to such pages is what I had in mind. Your idea can be refined by using {{anchor}} to set up the "Hamilton" anchor on that page. I actually just don't know where we stand on using dab pages more systematically, for example disambiguating "Hamilton Earls of Abercorn". Discussions on general principles seem to turn out inconclusively. Charles Matthews (talk) 12:02, 15 February 2010 (UTC)
To annotate this discussion, there is a general discussion at WS:S about use of a namespace to collate topic pages (person/place/other). To also note that as I have been working on a couple of biographical works, I have been starting some disambiguation pages to manage such occurrences. An example is Adrain, Robert. — billinghurst sDrewth 15:20, 28 May 2010 (UTC)

The identified issue with M's tool

Wikisource DNB links to Magnus Manske's statistics and maintenance tool. (The detailed readout is now more complicated than in the past, because the /DNB author subpages are causing some unproblematic pages to register in both of the main cleanup lists.)

As an interim measure, why don't we transclude the subpages back to the top level of the author space? It keeps the page smaller and subsidiary, yet brings them back to the front and, hopefully, makes them available to Magnus's toolserver script. At least worth consideration. — billinghurst sDrewth 15:47, 22 February 2010 (UTC)

OK, could try it on a small sample. I should mention the slightly embarrassing (for me) glitch meaning that the tool picks up a couple of Catholic Encyclopedia pages. I was tinkering with {{CE13}} as an adaptation of {{DNB00}} and never finished changing over some of the categories. Charles Matthews (talk) 16:00, 22 February 2010 (UTC)
Hah, that isn't even trying! (wink) The other thing is that we could look to see if we can leverage a tool that ThomasV was building. Discussion at User talk:Beeswaxcandle#Have a look at the presentation of ..., as I think that there is some potential there. — billinghurst sDrewth 06:56, 23 February 2010 (UTC)

Article names in natural order as on WP?

I'm new to DNB. Is there a present or future plan to have article names in natural order as on WP? Otherwise when I Google, for example, for "Robert Murray M'Cheyne" (in quotes) I am never going to find the relevant DNB page unless he happens to be mentioned under that name in the text of the article. I Google for people's names a lot by this method.--PeterR (talk) 10:57, 6 March 2010 (UTC)

I'm not a fan of inverted name order generally, since it doubles (at least) the time for searching any database if one even remembers to do a second search. On the other hand the convention is established here, so that the natural solution is to create redirects. Which could be done systematically at some point: the project is really too big/long to say when we might get round to that, and too laborious to be prescriptive in the sense of saying that "redirects must be created as we go along".
On the bright side, once there is a Wikipedia link on an article, that will be in "natural order", and therefore should be picked up by search engines, so the business isn't hopeless. It is reasonable to ask how articles will actually be found, and the answer is like "they should be created with at least four incoming wikilinks anyway, and in many cases with a link from Wikipedia; and we shall be wikifying the text of the DNB articles so that over time there will be other links in, reflecting the DNB qv structure and other occurrences". I hope that is a fair answer to the concern. Charles Matthews (talk) 11:38, 6 March 2010 (UTC)
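If redirects were ever created systematically, the natural-order name could be derived from the inverted title. A rough sketch, assuming titles of the simple form "Surname, Forenames (DNBnn)"; the function names are hypothetical, and the name a redirect page should actually take is a project decision not settled here:

```python
import re

# Matches only the simple "Surname, Forenames (DNB00)" shape: exactly one comma.
# Anything else (medieval names, titles with extra commas) is left for hand review.
_INVERTED = re.compile(r"^(?P<surname>[^,]+),\s*(?P<forenames>[^,(]+?)\s*\((?P<edition>DNB\d\d)\)$")


def natural_order(title):
    """Return 'Forenames Surname' for a simple inverted title, else None."""
    m = _INVERTED.match(title)
    if m is None:
        return None
    return f"{m.group('forenames')} {m.group('surname')}"


def redirect_for(title):
    """Return (natural-order name, redirect wikitext), or None if the title is not simple."""
    name = natural_order(title)
    if name is None:
        return None
    return name, f"#redirect [[{title}]]"


if __name__ == "__main__":
    print(redirect_for("Osborne, Edward (DNB00)"))
    print(redirect_for("Llywelyn of Llangewydd (DNB00)"))  # no comma, so None
```

Returning None for anything irregular is deliberate: a wrong redirect is harder to spot later than a missing one.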
I think the most compelling argument you made is that our link to the WP article by itself introduces the appropriate searchable text. When a WP article does not yet exist, perhaps we could figure out a way to add the "natural order" name in a non-displayed field for use by web crawlers? Many of our subjects actually have multiple "natural" names (e.g., Duke Wellington) which could also be added as hidden search terms. -Arch dude (talk) 23:06, 8 March 2010 (UTC)
We could use the "extra notes" field to display fuller names, since our title convention is "minimal", while in some cases the useful name might be quite complicated: I was struck by the case of Savile, Thomas (DNB00) which begins as "SAVILE, THOMAS, first Viscount Savile of Castlebar in the peerage of Ireland, second Baron Savile of Pontefract, and first Earl of Sussex". I don't know quite what that proves, though. "Thomas Savile, 1st Earl of Sussex" is the WP title, and very sensible too. Perhaps there is a maintenance task associated with Category:DNB No WP, along the lines you suggest; that category is suddenly becoming big, for various reasons. Charles Matthews (talk) 06:53, 9 March 2010 (UTC)
The correct answer for searches is probably none of the above. Metadata is the appropriate answer, and utilising future renditions of the mediawiki software to organise the work. How we seed works to get extra terms, as indicated above, remains open; it will be interesting to see how that eventuates, and we have brought it to the attention of developers. That said, I don't think that the concept of surname, firstname is that foreign, especially when one considers its relationship to DEFAULTSORT. — billinghurst sDrewth 10:28, 9 March 2010 (UTC)
Worth saying also, in reply to the original query, that the WP onsite search is ahead of the game here: for example Thomas Savile searched does show the WS page in a box. Charles Matthews (talk) 11:52, 9 March 2010 (UTC)

An awkward one

Sion Llywelyn (DNB00) is proving elusive (vol. 52). It is a duplicate of another article to be seen at Page:Dictionary of National Biography volume 34.djvu/28, as Llywelyn of Llangewydd (DNB00), a.k.a. Llywelyn Sion, and for that reason, presumably, occurs in no edition after the first (they caught this by 1904). The page of the unique scan is Page:Dictionary of National Biography volume 52.djvu/327, which is badly corrupted, but you can see the initials J.E.L. above Sion Lleyn (DNB00). The Fenwick handbook, based on the 22 volume edition, denies that this article exists. It's only a short article, but it may need someone with the physical original volume 52 to fill this gap. Charles Matthews (talk) 19:23, 1 April 2010 (UTC) Completed...JamAKiska (talk) 02:55, 1 October 2010 (UTC)

+1. Charles Matthews (talk) 07:10, 1 October 2010 (UTC)

A listings automation issue

It is pleasant to be able to announce that we shall reach the milestone of 5000 DNB articles shortly. Another milestone relates to Category:DNB contributors with incomplete listings, in other words author pages not having a full list of DNB articles: this is down to 100 authors (out of a notional 683 - there are a few author pages for the DNB not yet created, but they are negligible for the listing issue). Barring various kinds of error, we are down to the authors who were prolific (more than 50 articles). I can do some more on this, but the longest lists are many hundreds.

I have been thinking along these lines: with listing complete by author, wouldn't it be possible to have some automatic way to scrape the names from the author pages, put them in alphabetical order, and then create volume ToCs in that way? The listing by volume issue is at most one third done at present. There are some points about the results you'd get (disambiguation is not guaranteed but with a list showing the duplicates can be elucidated by finding which authors link to a name, and also ASCII order isn't exactly right for the DNB ordering by dates); but work by hand on rough lists would be quite reasonable to handle these matters. Technically the author pages should all carry {{DNB contributor}} and related templates, and the relevant list would be enclosed in {{DNB lkpl}} or {{DNB link}}.

I find this attractive not just because if it works it would save a great deal of typing, but actually it could be done in pieces (a few volumes or initial letters at a time). It would be a reason and motivation to get the very long listings done piecemeal. Charles Matthews (talk) 16:04, 11 May 2010 (UTC)
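The scraping idea can be sketched roughly as follows. This assumes that the article name is the first positional parameter of {{DNB lkpl}} and {{DNB link}}, and that the author-page wikitext has already been fetched by some other means; the function names and the input dictionary are hypothetical:

```python
import re

# Assumption: the article name is the first positional parameter of the template.
_LINK = re.compile(r"\{\{\s*DNB (?:lkpl|link)\s*\|\s*([^|}]+?)\s*(?:\|[^}]*)?\}\}")


def scrape_titles(wikitext):
    """Return the article names found in one author page, in page order."""
    return [m.group(1) for m in _LINK.finditer(wikitext)]


def rough_toc(author_pages):
    """Build a rough list from {author: wikitext}.

    Returns (names sorted case-insensitively, {name: authors}) where the second
    item holds only names listed more than once, for disambiguation checks by hand.
    """
    authors_by_name = {}
    for author, text in author_pages.items():
        for name in scrape_titles(text):
            authors_by_name.setdefault(name, []).append(author)
    names = sorted(authors_by_name, key=str.casefold)
    duplicates = {n: sorted(a) for n, a in authors_by_name.items() if len(a) > 1}
    return names, duplicates


if __name__ == "__main__":
    pages = {
        "Author A": "* {{DNB lkpl|Osborne, Edward}}\n* {{DNB link|Savile, Thomas}}",
        "Author B": "{{DNB lkpl|Osborne, Edward}} · {{DNB lkpl|Adrain, Robert}}",
    }
    names, duplicates = rough_toc(pages)
    print(names)
    print(duplicates)
```

The plain alphabetical sort is only a starting point: as noted above, it does not reproduce the DNB's ordering by dates among people of the same name, so the output is a rough list for correction by hand, split by volume or initial letter as convenient.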


I have just updated Wikisource:WikiProject DNB/Statistics for June. There has been a gradual accretion of numbers to track. Now that we are finishing volumes, where should we record progress on completed volumes, and completed letters of the alphabet? These are probably the most conventional measures, for a project such as this, together with the headline number of articles (which is now close to 20% - we should get an accurate number of DNB00 articles, which total around 27,000). Charles Matthews (talk) 09:40, 2 June 2010 (UTC)

The Fenwick handbook says 27,326 articles are DNB00, disagreeing with Sidney Lee's Statistical Account. So we have done 19.8% by that measure. NB that the articles generally get longer, on average, in later volumes. Charles Matthews (talk) 09:45, 2 June 2010 (UTC)
It is also likely that the shorter articles will be done first, so the proportion of text inserted is probably less than the proportion of articles.--Longfellow (talk) 20:35, 2 June 2010 (UTC)
That may be, depending on how people work, but I doubt it is really significant (the very short articles are not so interesting, and may well be ignored by anyone who isn't going through systematically). The average or "normal" (median) article is about one page of DNB; my impression, working through letter S, is that the really scanty articles are fewer, probably because the team of authors by then had enough experts in all the required fields. Anyway as the project progresses, it will become more possible to extract information. Gillian Fenwick calls the DNB "a fascinating subject, barely documented to date". Charles Matthews (talk) 07:02, 3 June 2010 (UTC)
I would also reflect that I typeset pages, not articles, so there will also be lots of part pages waiting for the remaining parts. We could look at the number of proofread pages, though in the earlier period, there was more of a tendency to not use the progress markers. — billinghurst sDrewth 10:12, 3 June 2010 (UTC)
Shouldn't really get hung up on numbers: extrapolation says this is a three-year project now at the current rate of progress, and that is indicative enough. Consolidation will take in already-done text in the natural course of things. Once the articles are there for DNB00, DNB01 is another 5%. Then DNB12 is also possible. I'm interested in tracking the various referencing and cross-linking issues because they form a part of the bigger picture, as well as motivations. Creating all those author pages was part of fighting initial inertia and getting some momentum, too. I was asking about how to display the "headline figures" mainly because the project's front page hasn't up till now made a point of announcing progress, while we are reaching one or two milestones. Charles Matthews (talk) 10:37, 3 June 2010 (UTC)
We can always link to

British Museum

See Wikisource:Scriptorium#British Museum tie-ins for what this is all about; and Wikisource:WikiProject DNB/British Museum where I'm marking the project's card about authors most relevant to us. Basically my list shows that 25 out of 35 author pages for writers who worked at the British Museum are DNB authors. Charles Matthews (talk) 20:03, 5 June 2010 (UTC)

Linking from the Wikipedia page on the DNB

I have left a note on w:Talk:Dictionary of National Biography saying I intend to link from the Wikipedia article to volume ToCs here, as we finish up the volumes. Currently the links run to versions. This is partly prompted by discovering that Dictionary of National Biography here rates only at a lowly #36 in a google search for "Dictionary of National Biography". Our efforts could be more prominent. Charles Matthews (talk) 10:10, 16 June 2010 (UTC)

Where the DNB errs

Following a discussion on my talk page about the DNB's mistakes, I have put together Wikisource:WikiProject DNB/Errors and errata. Charles Matthews (talk) 12:06, 29 June 2010 (UTC)

More on author pages

I'm working through listing the first edition articles on author pages, and should be done with it some time in August. At which point there will be some very long lists around. There have been previous proposals and discussions about the use of subpages. Leaving that aside for the moment - having analytical lists is of interest since it ties up with the "missing article" drive on WP - my current thinking is that for all longer lists we should use a collapsible template on the author page. Such things exist here, e.g. {{British legislation lists}}. Taking "long" to mean 50+, there would be just over 100 to create; if it means 20+ it is more like 160.

Therefore I'm asking the more technically-minded DNBers to look into the syntax issue here. As I understand it, enWS doesn't have a standard off-the-shelf navbox we could use, as WP does. For the author pages, a simple list of article links with {{DNB lkpl}} is most of what is required; we should allow for the possibility of DNB01 and later editions, too, and probably for an alphabetical breakdown for the really long listings.

Thoughts? I'm no expert, but coding up a generic navbox should benefit the site as a whole. Charles Matthews (talk) 07:45, 8 July 2010 (UTC)

Coding a box shouldn't be a killer, though, not anything that I have done. Let us put aside the number to make, and what it takes to do as I see that as least of a concern, compared to 63 volumes! 100s of authors, etc.
Let's explore what you/we want to show.
  • 1 box, or boxes within; thinking possibility of different boxes for different groupings, or different boxes for
  • multiple columns or straight; if replicating the subpage tables, are we wanting wrapped columns? I suppose that is "what data are you wanting to show?"
  • (thought) in a toggle of a collapsed list, I would think that we would not want the length of the toggled space to be larger than the depth of a screen, allowing people to toggle without scrolling.

Other bits

  • always simpler more likely to be more compliant across browsers
  • agree that subpages doesn't really work well for us
billinghurst sDrewth 10:45, 9 July 2010 (UTC)

To clarify a bit: my first thoughts were to use just a fairly standard middot format for most of the boxes. So that if it's a sub-50 author such as Author:James Bass Mullinger, you click and then see a list of the article names with middot separation, several to a line. The assumption is that most readers will scan down to the author they want (probably having been brought there by a search), and click. For the authors with longer lists, I think a single block would be less suitable, and there should be alphabetical division as on Author:Sidney Lee. Charles Matthews (talk) 13:04, 19 July 2010 (UTC)

Grand Program(me) for Infrastructure

I have raised some of this in bits and pieces in previous threads. I now have a target date of September for trying to address some of the remaining big infrastructural issues (i.e. pretty much everything that isn't proofreading or adding links). It seems that tackling what remains to do awaits finishing the listings on author pages. Once that is done, I feel we can move ahead on several fronts:

(a) Definitive format on author pages;
(b) Scraping and sorting the author page listings so we have "rough" ToCs for each volume;
(c) Troubleshooting the "rough" ToCs, which means various things including proper disambiguation checks, catching omissions, and full set of author pages plus (i.e. we'll need to have an Anonymous listing, and a side issue is whether that is in project space or the Author: namespace);
(d) Definitive manual on article titles, which comes down to one main point (sort out which of the small caps wording gets into the page titles, and which doesn't, with the main burden being medieval names).

I think we'll probably need to spread out some detailed ideas over project pages for all this.

But, first, am I making sense? The overview is that we want to get to the situation where proofreaders can create pages that will automatically be linked in, from ToCs and author pages, and will be able to get the "previous" and "next" from the ToCs, with no quibbles, for the easiest possible experience of article creation. There will have to be a big push to get to this point, and I'd like to think that there is consensus about what we're pushing towards.

Charles Matthews (talk) 12:50, 19 July 2010 (UTC)

New DNB WikiProject on Wikipedia

For information: I have set up w:Wikipedia:WikiProject Dictionary of National Biography, since the time has certainly come when there should be a sister-project, and a definite place for collective discussion of the DNB adaptation effort over on WP. Please come and participate. Charles Matthews (talk) 09:35, 9 September 2010 (UTC)

Updating the Manual

I have gone into Wikisource:WikiProject DNB/Style Manual and updated material on titles, to reflect better where we stand. This does need further work. It has been suggested to me that there should be a section on when and how to add rational redirects. Charles Matthews (talk) 07:42, 16 September 2010 (UTC)

I would agree about the redirects, though feel that it would be best framed with a discussion about disambiguation, including within a whole of site context. — billinghurst sDrewth 11:48, 16 September 2010 (UTC)
It's quite a big area, considering that "redirect" also describes the DNB fragments that send you to articles from variant names. The Manual needs some beefing up to deal with all the hypertext issues we are gradually getting to, with volume ToC format, Author page format also now on the agenda as we get a bit more complete. I'll try to spend further time on it, and make the structure more obvious as well. Charles Matthews (talk) 14:21, 16 September 2010 (UTC)
I think that we are better to have the strategic solution, and look to have a bot fix primarily, and then a second semi-auto run through on the non-obvious targets. Phe's scripting for managing author pages indicates that there is scope for validity checking and proper bot cheating. — billinghurst sDrewth 16:07, 16 September 2010 (UTC)
I'm going to want to come back to Author pages, and in particular scraping our article names off them, very shortly. The current position is that there are L and M of Author:Thompson Cooper to add, followed at some point and somewhere by a list of the anonymous articles. And then all the DNB00 and DNB01 articles are listed (somewhere, if you are forgiving about those on subpages which are caught on the Magnus tool, and whatever omissions I need to apologise for in advance). This does open up new fronts, as I have said before. In particular I thought we could look if the three 1901 Supplement volumes could now be posted, because a testbed of scraping and sorting the entire DNB00 listing would be to sort the DNB01 names somehow and create three volume ToCs for the Supplement with much less pain than in the past. Charles Matthews (talk) 21:05, 16 September 2010 (UTC)
So Author:Thompson Cooper is now complete (I think) if messy, at 48K and the longest DNB list at over 1400 biographies. This is the "worst case" for author page format, and I'm going to start a thread on Author talk:Thompson Cooper for those who want to experiment with various format options. Charles Matthews (talk) 08:19, 17 September 2010 (UTC)

Way forward on categories?

I was working on Category:DNB No WP, which is much improved by the bug fix (thanks ThomasV); and I've added a LargeCat ToC (thanks Pathoschild). I found Palmes, Bryan (DNB00), a typical instance of an article for WP usage (twice an MP); how should I categorise it here, though? Category:British politicians is full of author pages, and the same is apparently true for related categories. I don't want to start a big drive to categorise if there is going to be resistance. Am I BOLD? Do I decide that Wikisource:Categories has to be created at last? Or do I start another round of the "topics" discussion at the Scriptorium? I'm certainly not going to start a corner of the category system just for the DNB, having argued in general terms against such things in the past. Charles Matthews (talk) 08:23, 29 September 2010 (UTC)

Why wouldn't we point at Category:Categories, or do you mean something else by the creation of the category? To your former comment, is it that you think that the author ns pages and the main ns articles should not be in the same category? What is it that we are looking to try to separate? To the doing, I don't think that it would need to be anything that would require further conversation, as I would think, looking at the history of WS, that the discussion has been had. — billinghurst sDrewth 11:41, 29 September 2010 (UTC)
Yes, one issue is the tie to namespaces, or at least in the oblique form that "category X contains in practice pages of a certain type". There is more than one way to set up a category system. The "German" or deWP system relies much more than what enWP does on being able to intersect categories. I don't like some of that in detail, but here I think it would help clarify things. It would get us away from having to think about a biography as either by nature a mainspace text or something to be classified via its topic, a person. Category:British politicians is something one might want to search in various ways (also an author page, also a DNB page, also nineteenth century). Charles Matthews (talk) 17:59, 29 September 2010 (UTC)
[mounting his hobby-horse] We have this well covered I think. We already have a subjective, topical, intersecting, method of categorisation, possibly the most elaborate ever devised, it certainly has the greatest number of enthusiastic workers and users. I'm not exaggerating here, I'll give another clue: the DNB project is already populating these categories. Any DNB article, or other biographical text, will meet the 'within two-clicks' criteria* for a huge number of topics and possible paths of enquiry.

*Wise words that are an achievable ideal, with a bit of thought, and never far from my thoughts since I read them here. Here are someone else's: if a method of indexing is redundant it is only likely to cause confusion. cygnis insignis 14:40, 30 September 2010 (UTC)

If you are implying what I think you are, then you advocate the chicken and I the egg (if that is not disrespectful of a bird of your distinction). If we are to rely on WP categorisation, then articles have to be created over there, the process I'd like to promote. So is it egg first or chicken? Charles Matthews (talk) 21:20, 30 September 2010 (UTC)

Using intersects

What are we trying to achieve/present/differentiate? — billinghurst sDrewth 12:56, 1 October 2010 (UTC)

I'm trying to develop the point of view that WS does need big categories such as Category:British politicians. A category such as w:Category:Members of the pre-1707 Parliament of England is interesting because it encodes expert knowledge (something major happened in 1707), and you would not typically do as well with intersecting with birth dates (say). That is the snag with the intersecting approach: expert knowledge does not reside in the category system. Bad for zoology (say). For WS we should be able to fix that by saying "a portal on MPs from a given period would be fine: we don't need to import enWP's category system, but we'll develop another way." This is now where I'd like to head. Charles Matthews (talk) 15:42, 1 October 2010 (UTC)

Master lists

My ideas on these are now set out on a project page: Wikisource:WikiProject DNB/Master lists. It is now a matter of greater urgency, given the starting of Dictionary of National Biography, 1901 supplement, to get the DNB01 articles listed. So that that subproject will start with good ToCs, I mean. So I'm putting in time on scraping the names off the author pages: no need for others to duplicate that. Charles Matthews (talk) 09:49, 5 October 2010 (UTC)

Format on author pages

Further to the welcome appearance of Dictionary of National Biography, 1901 supplement, we need to talk about some format issues:

  • If the DNB01 links are routinely piped as the DNB00 links are, then we presumably should be flagging the edition in author page listings. One way would be to use as standard a semi-colon heading within the DNB section (thus not creating a subsection).
  • We should have a template to do the piping, because at the very least it makes list handling much simpler. Would it be OK to call this {{DNB01 lkpl}}? Also we'd need {{DNB01 link}} as the version displaying the full reference.

Charles Matthews (talk) 07:23, 7 October 2010 (UTC)

The template issue is now handled. And there are volume ToCs up for the supplement: things move on apace. The tracking category for DNB01 text on WP is w:Category:Articles incorporating DNB01 text without Wikisource reference, which currently has just ten articles to create. Charles Matthews (talk) 07:09, 9 October 2010 (UTC)
Volume II articles of 1901 Supplement are complete. Eight articles to go... Volume III text layer missing, Volume I awaiting upload. JamAKiska (talk) 12:54, 19 October 2010 (UTC)
With regard to templates, I would rather have fewer templates with more options than more templates. For example, have {{DNB lkpl}} rather than {{DNB00 lkpl}}. Alternatively, if people do not like that idea, I would prefer to create an underlying template that picks up the options. I would favour more/specific templates where there is value in having the templates, eg. how many works utilise DNB01...? What value can people see in having specific templates, what extra information do we want or what do we want to know about things that are specifically about the individual volumes? — billinghurst sDrewth 09:31, 20 October 2010 (UTC)

An obvious remark

Dates later than 1900 should not appear in DNB00 articles. I'm onto this as a way of checking that should be run. It's particularly relevant to the way I work, but apparently not solely my problem. Very often updates in later editions are adding references appearing in the early twentieth century. Charles Matthews (talk) 10:16, 9 October 2010 (UTC)
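A minimal sketch of the kind of check described above, in Python. It assumes the article text is already available as a plain string; the function name and the regular expression are illustrative only, not an existing project tool. Any four-digit number in range is treated as a year, so a hit still needs a human look.

```python
import re

# Matches four-digit numbers from 1000 to 2099 that stand alone as words.
# Assumption: such a number is a year. A large page number would also match.
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")

def years_after(text, cutoff=1900):
    """Return every apparent year in `text` that is later than `cutoff`."""
    return [int(y) for y in YEAR_RE.findall(text) if int(y) > cutoff]

sample = "He died in 1899; see Notes and Queries, 1903."
suspect = years_after(sample)  # years that should not be in a DNB00 article
```

An empty result means nothing looked later than 1900; a non-empty result flags the article for checking against the scan.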

Matching tool for Category:DNB No WP

This is another Magnus Manske tool: see here for initial Q and edit the browser line to put in any letter to replace Q. The run for Q is mercifully short, and it identifies one hit, for Quarles, John (DNB00), as an existing Wikipedia article. I'm leaving that for demonstration purposes, therefore. For other letters, leave the tool to itself and it will run through DNB biographies not yet matched with WP articles for that letter. As Magnus comments, slow but thorough. Charles Matthews (talk) 06:23, 24 October 2010 (UTC)

I have added it to the Category page for each letter. — billinghurst sDrewth 11:07, 24 October 2010 (UTC)

Seems to be remarkably useful. Also somewhat touchy, though, as the toolserver can be. Smith names are so common it can take a long time on one. It should work better once the backlogs on common letters are cut down. Charles Matthews (talk) 12:09, 24 October 2010 (UTC)

Upgraded: you can now use two initial letters, to narrow down for the longer listings. Charles Matthews (talk) 11:34, 27 October 2010 (UTC)

Further wizardry: tool has been adapted to the Catholic Encyclopedia articles. See Wikisource talk:WikiProject Catholic Encyclopedia Upgrade. Charles Matthews (talk) 08:11, 29 October 2010 (UTC)

DNB redirects: categories and accounting

I mean the DNB's own redirects such as More, Roger (DNB00). It would probably be better if they were not in Category:DNB biographies but in a category of their own. And they show up in counting articles, I believe. They can and should be linked to WP: why not? Charles Matthews (talk) 09:58, 25 October 2010 (UTC)

Agreed: they need to be handled differently. If I recall correctly, we originally decided to not include them at all, except as inline data in the TOC. However, I now believe that we need them as separate tiny little articles, just as in the example, merely as a matter of consistency, and also to enable us to write an automated "coverage" tool for pagespace. I propose that we add a "redirect" parameter to template:DNB00. This would take the page out of that category and the "no author" category. -Arch dude (talk) 15:19, 25 October 2010 (UTC)

I wouldn't say we "need" them now, but they are going to be a part of making the hypertext version. Charles Matthews (talk) 21:34, 25 October 2010 (UTC)

  • I have been manually adding Category:DNB See as per a very early discussion. It is not a major issue to amend {{DNB00}} to allow for an extra parameter to choose between adding Category:DNB biographies and Category:DNB See, nor to convert the pages that use See at this point in time. The tricky bit will be to collect those that have been missed, as we would need to be looking for a (DNB00) biography that is not directly linked from one of the 63 main ns ToC pages. Not sure that we got a lot of benefit, beyond article separation, by adding the parameter, as people will need to know what it means, how to use it on the rare occasions, and they could just as easily manually add. <shrug> Note that it will NOT affect the no contributor and I would have to think about whether we can tie the two parameter tests together functionally.
  • I haven't traditionally listed the Wikipedia = field on the page, though don't see that it matters particularly either way
  • While I don't wikilink to these referring/pointer pages ("redirect" has too many local connotations) from our ToC pages, when we changed to transclusion, I ended up doing them and have them in the general prev/next run. — billinghurst sDrewth 03:29, 26 October 2010 (UTC)
Having a think, we might be able to do something with the contributor = field. Something along the lines of WHERE CONTRIBUTOR = see (or whichever unique keyword) that it puts the DNB See category, this would both move it out of the NO CONTRIBUTOR category, and also put them into the referral category. No impact upon the DNB biographies category. — billinghurst sDrewth 11:29, 27 October 2010 (UTC)
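The contributor-keyword idea can be sketched as follows, in Python rather than template code. The keyword "see" and the function name are placeholders (the keyword is still to be chosen); the category names are the ones used in this discussion.

```python
def dnb_categories(contributor):
    """Pick categories for a DNB00 page from its contributor field.

    Sketch only: "see" stands in for whichever unique keyword is agreed.
    """
    value = (contributor or "").strip()
    categories = ["DNB biographies"]          # unaffected by the keyword
    if value.lower() == "see":
        categories.append("DNB See")          # referral pages
    elif not value:
        categories.append("DNB no contributor")
    return categories
```

With this logic a referring page carrying the keyword lands in DNB See and stays out of DNB no contributor, while an ordinary signed article gets DNB biographies only.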

I'll comment that we are overdue a manual of style for the volume ToCs (what to include, format). I don't create the referring pages, nor do I use them as "previous" or "next"; but the interpolation of such pages in sequence as they are created between articles that have already been created can go on in the background. Charles Matthews (talk) 06:46, 26 October 2010 (UTC)

I added the "redirect" parameter to Template:DNB00. If this parameter is set, the article is added to Category:DNB redirects instead of Category:DNB biographies, and it is not added to certain content-oriented maintenance categories. -Arch dude (talk) 17:51, 31 October 2010 (UTC)
I have undone the change. As I mentioned above, I see misunderstanding arising from the term "redirect", which has an established meaning in the common parlance of Wikimedia, and I am not comfortable with that potential for confusion. Also, as I also mentioned above, I see a better means to manage the direction aspect and the contributor aspect in the one hit. I also think that it is premature to make the decision about the DNB biographies category and to remove the SEE articles from that category without a better understanding of what was the purpose and how we look to manage the corpus of the articles. — billinghurst sDrewth 22:54, 31 October 2010 (UTC)
To continue the discussion. Category:DNB biographies was created to house all articles relating to DNB, not just those that were the articles themselves, noting that as the articles are not subpages that this is the only ready means to produce the articles. If we are going to split out the referring articles then do we need to maintain a complete list, or to have a complete list available somewhere, and how do you propose to have the category hierarchy? Or does having the categories just duplicate what is available from the main ns and the category itself with the dual listing is redundant? — billinghurst sDrewth 08:55, 1 November 2010 (UTC)
Sorry, I did not realize that we had not reached consensus. I am primarily concerned with the fact that the "see" articles are contaminating the maintenance categories, and my modification solves this problem. I suggest that we address your two points as follows:
  • modify the parameter name from "redirect" to "xref." I do not like the word "see" as a parameter name as I think it is confusing. More generally, parameter names should be nouns, not verbs.
  • modify the category to be "DNB cross-references."
  • make both "DNB biographies" and "DNB cross-references" subcategories of "DNB articles." We can also have a subcategory for other DNB mainspace pages such as the TOCs.
Thanks. -Arch dude (talk) 13:28, 1 November 2010 (UTC)
So within Category:DNB, Category:DNB biographies actually stands for "DNB content" or "DNB texts"? In due course, we'd also want to include the index pages in the volumes, and there's a memoir in vol.63 also. So one approach would be to rename that top holding category for actual DNB text content, and have various subcategories within it? Charles Matthews (talk) 13:32, 1 November 2010 (UTC)
Charles, you seem to be reacting to Billinghurst, not to me. My proposal is to create the category "DNB articles" as a subcategory of "DNB." "DNB articles" would be the supercategory of a set of subcategories that will include all mainspace DNB articles, and each mainspace DNB article will be in exactly one of these subcategories, although any article may be in other categories in different hierarchies. The initial two subcategories will be "DNB biographies" and "DNB cross-references." One-off articles such as the memoir in vol.63 and the odd little article at the end of vol. 01 may be placed directly in "DNB articles." -Arch dude (talk) 17:55, 1 November 2010 (UTC)

Five figure milestone

Wikisource:WikiProject DNB/Statistics#Stats 1 November 2010: the project passed 10,000 articles early on Friday (UTC). Charles Matthews (talk) 21:20, 1 November 2010 (UTC)

Congratulations, Charles! (Well, Charles, 95+% and the rest of us, 5-%.) Since there are <30K articles, we (i.e., Charles) are more than one-third done with the most fundamental part of the project. -Arch dude (talk) 22:40, 1 November 2010 (UTC)
ADude, there are some other great performers in that space; so while CM is just like a master machine, we have some great apprentice machines, JamAKiska, there too who need us to dip our lids to them. — billinghurst sDrewth 03:31, 2 November 2010 (UTC)

It's a coming of age, certainly: no cake, but a new project page at Wikisource:WikiProject DNB/FAQ on the way. In numbers, I do about two-thirds of additions, but I'm sure I put in less than half the hours worked on the project. To underline that, I know that whatever the notional article #10000 was, it wasn't one of mine. Charles Matthews (talk) 08:05, 2 November 2010 (UTC)

Congratulations…on hitting the critical mass, and thanks for guiding the trek! JamAKiska (talk) 11:19, 2 November 2010 (UTC)

ODNB ids

It is going to turn out to be useful to have the identifiers on the ODNB site recorded somewhere, for each of our DNB articles. My question is, where and in what form? I suppose there might be objections (non-free) to including this information with the articles? For the WP end of this project, though, we want to get on top of matching articles to ids as well as to biographies here (it seems that Dsp13 has already done plenty in this direction). We are getting into a triangular situation, then, and this is likely to be reinforced by development of w:Template:ODNBweb. What is the right way to go? Charles Matthews (talk) 15:57, 10 November 2010 (UTC)

Personal opinion is to include ODNB = xxxxx as a parameter in the DNB00 header at this point in time. It will do nothing, but it will break nothing. If/when we decide to do something, and how we decide to do something with this data, then we can address that and know that we have prepopulated articles to do so. — billinghurst sDrewth 07:57, 11 November 2010 (UTC)

Flexibility in the plan, I like it ! JamAKiska (talk) 11:32, 11 November 2010 (UTC)

Just to report that w:Template:ODNBweb has been partially upgraded now. This is really a WP issue, naturally; but with a further upgrade it could populate a maintenance category (DNB articles needed to provide a free alternative). The actual business of integrating the existing w:Template:DNBfirst into w:Template:ODNBweb in optimal fashion has been mooted at w:WT:WP DNB. Exactly how to do all this is still up for grabs. Charles Matthews (talk) 07:06, 29 November 2010 (UTC)

An Index page absence

After some searching, I couldn't find the page like Index:Dictionary of National Biography. Sup. Vol II (1901).djvu but for Vol I. All a bit first-day-at-school. Any clues? Charles Matthews (talk) 19:35, 8 December 2010 (UTC)

The Status, found on Talk:Dictionary of National Biography, 1901 supplement reflects the present situation. Only volume II is available to proofread at this time. Vol. III is in the "needing OCR" as the images look letter perfect, but am still not getting any text. Vol. I has neither images nor text. JamAKiska (talk) 23:51, 8 December 2010 (UTC)

Supp. Volume III index page is available for proofreading. Both sources for [Index:Dictionary of National Biography. Sup. Vol I (1901).djvu] had missing pages or unreadable & blurry text. Created a composite pdf document of page images from these on-line sources and attempted upload at IA this morning. Did not see the file in the processing queue after upload or afternoon reload. JamAKiska (talk) 22:07, 28 December 2010 (UTC)

Supp. Volume I index page is available for proofreading. JamAKiska (talk) 03:08, 30 December 2010 (UTC)

1901 Supplement.

All three volumes of 1901 Supplement are available for proofreading. All of the articles previously forwarded have been migrated. JamAKiska (talk) 17:38, 31 December 2010 (UTC)

Author page headers

I have boldly made DNB into a disambiguation page, to reflect the presence of DNB01. Now we should agree on Manual handling of the headers for the DNB sections on author pages. I think where there is more than one edition (typically DNB00 and DNB01 articles) the header should be "Contributions to the DNB", with a subsection as a semi-colon header for the DNB01 articles (could be an actual subsection, but in any case there should be an agreed way to help future automation). Where there is just one edition it can be like "Contributions to the Dictionary of National Biography, 1885-1900", or with the Supplement link. In any case we should move to clear up past tentative efforts to get the author pages under control. Charles Matthews (talk) 07:06, 3 January 2011 (UTC)

To me they are all DNB and I haven't seen a need to differentiate further. I suppose that means that I am more interested in making sure that they are listed, and no particular opinion about drill downs. — billinghurst sDrewth 13:55, 14 January 2011 (UTC)

George Smith Memoir

Expect to have the memoir proofread later this week. My intention is to transclude this memoir, as is, onto the page created from the link found here. This memoir, 39 pages in length, is sub-divided into nine parts using roman numerals. JamAKiska (talk) 13:44, 25 January 2011 (UTC)

Prefatory Note to 1901 Supplement

Would like for it to go here. And would like to add a link to the George Smith memoir from this location as well. JamAKiska (talk) 01:26, 31 January 2011 (UTC)

{{DNB errata}}

In a trial to display the errata from Index:Dictionary of National Biography. Errata (1904).djvu with the respective article, I have mirrored something that I have been doing for A Compendium of Irish Biography and created a means to append the corrected data to the article. It is still in the play zone, so we can evaluate its effectiveness and tweak it further if we like. We have it working for an erratum on one page, Greaves, Edward, and I have coded it for where the erratum goes on to a second page, though that is not yet tested. Issues to resolve: the names of the parameters, which we may wish to change to something shorter like p1, p2, ... When we transclude the text, the references to line numbers are no longer current, so it may be worth having a rider that trails onto the end of the template that explains such. Probably other stuff that I haven't considered. — billinghurst sDrewth 15:29, 4 February 2011 (UTC)

The shorthand used in the Errata to help the reader locate the amendment will require slight adjustments for us made easy by transclusion. By way of example, see 1904 Errata p. 50 at the page bottom of the original, specifically the Campbell, Frederick W. article. The original page only includes the line number entry, 15f.e. for this article, as the page and column were indicated in previous entries found higher on that page. The next errata page 51 contains another entry for this article. JamAKiska (talk) 17:26, 4 February 2011 (UTC)

Greaves makes for a curious example, but it's fortunate in that I didn't have to look any deeper to demonstrate that messing with this can easily go wrong: 'error happens' [1]; the column has some relevance, if someone checks the scan of the source; I see f.e., for example, is not an abbreviation of "for example"; and without counting, I think the line number is accurate (a line ends in a full stop, not where it wraps on the printed page), but I didn't check how this is counted where an article goes onto the next page I checked, it restarts for each page. Any entry with an erratum should link to that section from the notes. cygnis insignis 19:44, 4 February 2011 (UTC)

The first page of each chapter in the Errata has that information - namely l.l. for "last line", and f.e. meaning "from end," as in counting backwards from the column end. I had yet to encounter an Errata entry which applies to more than a single page in any particular volume, until now: see Edwin Sir Humphrey. The absence of those abbreviations indicates counting lines from the top of the column. Perhaps the way forward is to leave a simple word like "Errata" in the notes to help alert the reader to the information contained at the conclusion of the article, including the link to the appropriate errata page which the current template provides.

Your example also brings up a great point. A few months back I was proofing pages in the early volumes when I discovered that I was undoing some Errata adjustments made by another editor. So sheepishly I undid my edit to re-establish the "improved version," as I recognized that while in good faith I was editing the text and hopefully adding value, I was also doing so without all the facts. I now look through the edit history before joining into the editing fray. Which brings up a great topic for discussion. Namely, when is it a "reasonable" time in the edit cycle to include links and Errata adjustments? to help others avoid situations like this in the future. JamAKiska (talk) 00:24, 5 February 2011 (UTC)

The emboldened comment is my point, I saw what it is "not" because I read that information at the errata. And the 'way forward'—the work-around to the work-around—is my solution: don't mess with it, preserve the integrity of what we are transcribing, link back from the notes section of the header.
The example was not mine. If someone incorporated the errata, made corrections, the text no longer matches the scan; that is all that is required, no more. I realise that this leaves little opportunity to be erudite, knowing with the BOH-S, and demonstrate our talents as clever editors of text, it is as boring as bat shit from that perspective and incredibly frustrating for wikipedians, but that is not what libraries do. If the same text appears at wikipedia, which it will, it can be corrected there with another ref to the errata volume. Dig this: there will, eventually, be very few reasons for a reader to use the DNB text here. The timescale is 'when someone gets around to it'. cygnis insignis 21:25, 4 February 2011 (UTC)

This DNB article, Brand, Henry Bouverie William, addresses previous concerns by preserving the original. The note alerts the reader’s attention to the appendage, and there are currently no links. Would like to add the text below to this template to aid the first time viewer.

Dictionary of National Biography, Errata (1904), p.141
N.B.— f.e. stands for from end and l.l. for last line

Page  Col.  Line
37    i     22f.e.   Greaves, Sir Edward: for 279 read 302
            9f.e.    for 225 and 279, i. 18 read 51 and 302
The formatted three columns found nearest the left margin provide the specific location on the DNB biography page where the original text is to be adjusted, in this case page 37 of the volume. There are only two columns from which to choose, and the text line is usually counted from the column top. In this example the counting starts at the bottom of the 1st column and is located 22 or 9 lines f.e. (see also l.l.) These text adjustments can be replacements, omissions, or insertions. In this example there are two text replacements. When there is no indication of the page or column immediately adjacent, use the first value found above in the respective column.

JamAKiska (talk) 05:04, 5 February 2011 (UTC)
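The reading rules above can be put as a short Python sketch. The function names and the tuple layout are invented for illustration, and treating l.l. as "line 1 from the end" is my assumption about how "last line" should be encoded.

```python
import re

def parse_line_ref(ref):
    """Turn an Errata line reference into (line_number, direction).

    '22f.e.' counts from the column end, a bare number from the column top.
    Assumption: 'l.l.' (last line) is encoded as line 1 from the end.
    """
    ref = ref.strip()
    if ref == "l.l.":
        return 1, "from_end"
    m = re.fullmatch(r"(\d+)\s*(f\.e\.)?", ref)
    if m is None:
        raise ValueError("unrecognised line reference: " + ref)
    return int(m.group(1)), ("from_end" if m.group(2) else "from_top")

def fill_down(rows):
    """Resolve blank page/column cells from the nearest value above.

    `rows` is a list of (page, column, line_ref); None marks a blank cell.
    """
    page = col = None
    resolved = []
    for p, c, ref in rows:
        page = p if p is not None else page
        col = c if c is not None else col
        resolved.append((page, col) + parse_line_ref(ref))
    return resolved

# The two Greaves entries from the example above.
greaves = fill_down([(37, "i", "22f.e."), (None, None, "9f.e.")])
```

Both Greaves entries resolve to page 37, column i, counted from the end of the column, which matches the description given for this example.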

We could inhale the erratum into the Notes field, which means that the article retains the initial publishing integrity; however, it being errata rather than an extension of the data, I am reasonably comfortable with a segregated transclusion. If we transclude, we can easily argue that the article shall remain as is. To the wikilink on the page number, at the moment it goes through to the Page:, I think that we should amend that so that it points to the main ns transclusion of the errata pages, and we can poke in the wikilink. So that may be something like [[Dictionary of National Biography. Errata (1904)/Volume 23#141|p. 141]]. — billinghurst sDrewth 05:30, 5 February 2011 (UTC)
Re the key, there are a number of ways to do this, 1) add in key to each transcluded component, 2) add a hover over each code, 3) link through to a page that explains the key as part of the template. All have strengths and weaknesses. — billinghurst sDrewth 05:33, 5 February 2011 (UTC)
Cochrane-Baillie, Alexander Dundas Ross Wishart was an early view of the Notes option, and went the final step to include the errata. The next option, Brand, Henry Bouverie William, includes a brief note to alert the reader of errata(um) YET preserves the integrity of the original AND provides this additional research for those whose interest is accuracy.
While the one line Errata entries fit nicely into the notes section, not all erratum footprints are so easily managed. A hybrid approach that includes hover keys as well as a detailed example, available via link or transclusion, would go a long way to help editors work their way through this problem successfully. JamAKiska (talk) 14:03, 5 February 2011 (UTC)

I have amended the template and you will see that it now has the key embedded after the page break label. It just seemed easier. — billinghurst sDrewth 02:09, 6 February 2011 (UTC)

The template seems to be working well in 5 volumes. The offset of 10 is good through volume 28, thereafter it tapers to 4 in the Supplementals. The link leads to the correct page in the errata volume. George indicated he would take this on eventually. JamAKiska (talk) 12:21, 7 February 2011 (UTC)
Kind of hard to build a range of offsets when the Index: pagelist itself doesn't reflect the actual offsets either! :-ο

Anyway, I swapped in the "fixed" template to the main position already - hopefully nobody even noticed that change - and have since managed to "redo" offset 10, didn't know where exactly offsets lower than 8 started/ended so I just jumped to the end where djvu page 293 & higher have an offset of 4 now. Please respond on the errata template's talk page - this gap thing going on here displays way too weird for me. — George Orwell III (talk) 17:03, 7 February 2011 (UTC)

Nicely done, pages are properly numbered in each of the four offset regions. Howell, Laurence would provide a good example for errata straddling 2 pages if you are looking for one. I’ll add a note to the errata djvu file to give you a heads up when this file gets replaced to fill in the missing pages. JamAKiska (talk) 23:43, 7 February 2011 (UTC)


The following Volumes Errata pages are proofread, have links installed, and are ready for transclusion: 1, 2, 6, 50, 51, 55, 57. The Errata have been included into all written articles in Volumes 23, 28, and the Suppl. vols 1 & 2. MISSING PAGES - The offset dropping by six indicates 6 missing pages, only five of which have text: text pages 170 & 171 (last 2 pages of V. 30); text pages 193 & 196 (1st and last pages of V. 36 - not much text on p. 196); and finally text page 197 (lead page for V. 37). Will see if I can locate a complete volume prior to resuming the effort to reduce subsequent page moves. I need to leave it at this point of progression for a few days as other concerns are more pressing. JamAKiska (talk) 15:11, 11 February 2011 (UTC)
I don't know if this was for my benefit or not, but whatever you folks wind up doing, and whatever change in page ranges it may or may not cause, any adjustment(s) needed to the template would not be major. I did not use coding that only relied on the typical plus or minus the offset # from a base page # but rather wrote it to use equal to or greater than an opening page # and less than the closing page # of the range of vol. pages in question before any offset is calculated (+ or -) for display/link purposes. Carry on & let me know when things need-a-changin' — George Orwell III (talk) 15:29, 11 February 2011 (UTC)
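For anyone following along, the range test described above might look roughly like this in Python. It is a sketch, not the template's actual code, and the ranges shown are made-up placeholders rather than the real page ranges of the Errata file.

```python
def djvu_page(printed_page, ranges):
    """Map a printed page number to a djvu page number.

    `ranges` holds (opening_page, closing_page, offset) tuples. A page
    belongs to a range when opening_page <= page < closing_page; only
    then is the offset applied. Returns None if no range matches.
    """
    for opening, closing, offset in ranges:
        if opening <= printed_page < closing:
            return printed_page + offset
    return None

# Hypothetical ranges for illustration only.
example_ranges = [(1, 200, 10), (200, 400, 4)]
target = djvu_page(170, example_ranges)
```

The point of the design is that a change to the file only means editing the list of ranges; the lookup itself stays the same.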
Found a better scan of 1904 Errata that has all 300 pages of text. This Errata volume will add eight pages to the page count, six of which will include text (the 6th is the final page of the Errata for the 1901 Supplement Volume III). The only down side to this move is the loss of the title page which in the replacement volume is a blank page; however, it does retain the Title page with identical markings on the cover. Upon insertion, all text pages up until page 169 will align. 42 page moves will be required following insertion, beginning with text page 170 or djvu page 180. I will adjust the few pages of links that require adjustment during this move. JamAKiska (talk) 18:51, 14 February 2011 (UTC)
Replaced Errata djvu file with complete volume. Offset is now set at 10 throughout. JamAKiska (talk) 14:01, 15 February 2011 (UTC)
Erratum template adjusted to expanded djvu file…thanks George…initial look is favorable…will poke and prod for a couple of days as I begin to incorporate these links into completed volumes. The first ten volumes are ready (23 & 28 already in use), will prep other volumes based upon DNB progress page. JamAKiska (talk) 23:20, 15 February 2011 (UTC)
Errata template installed in volumes 1, 2, 6, 18, 23, 28, 47-60, 62, Suppl. vol. I and II (written articles in 18, SI & SII) . Discovered a few articles where the editors had already included the errata or had clarifying notes to that effect. These I left in place. Will continue with errata installs. 17 & 22 will be the next group. JamAKiska (talk) 16:11, 11 March 2011 (UTC)

Supplementary list in prefatory - linking

At Page:Dictionary of National Biography. Sup. Vol I (1901).djvu/14 and the following page there is a list of people who died in the first six months of 1901, and were indicated as going to have biographies added. Do we link the list of names? If so, do we link to future articles, or to other places within WS, eg. author pages? — billinghurst sDrewth 03:52, 4 March 2011 (UTC)

John Farmer musician made DNB12. JamAKiska (talk) 04:34, 4 March 2011 (UTC)

1912 Supplement

Have located volumes II and III of DNB12. Have yet to do a full page scan, but initial look seems as favorable as DNB01 scans. Am pausing awaiting location of good volume I from this second and final supplement from Smith, Elder & Co. JamAKiska (talk) 17:41, 8 March 2011 (UTC)

Second Supplement published 1912 in three volumes (DNB12): [Volume I][Volume II][Volume III] Only blemishes I saw were author pages in volume I. The errata are included in the front of each volume, and complete index pages are found at the end. Would organize along the lines of First Supplement (DNB01). The file sizes range from 34 to 42Mb. JamAKiska (talk) 13:56, 9 March 2011 (UTC)

Page check complete on volumes I, II & III of DNB12, all pages present & images are clearly readable. The addition of these three volumes will require an adjustment to the DNB index template. JamAKiska (talk) 19:44, 16 March 2011 (UTC)

Wikilinks to works

Some pointers to works that have been added or are around for wikilinking from the body or ref section.


Also looking at getting Portal:Notes and Queries populated as we have located all the volumes. Next to get done is Gentleman's Monthly. If there are other works that we should be looking to locate, then do let me know. I am thinking through whether this should sit separately as part of a Portal, or as a separate project, or both. Would like to hear your feedback on this. — billinghurst sDrewth 02:59, 5 April 2011 (UTC)

List of contributors

On Wikipedia I created a list of contributors, using the DNB author templates which I transwikied some time ago, at .

It should be possible to cut-and-paste that page straight back here, if it would provide a useful reference, and indication of which author pages are and are not done. Rich Farmbrough, 00:23 21 May 2011 (GMT)

There'll be various comments. I would say that not all the contributors are notable; but exactly which ones are is not so easy to determine. Nor are they all easy to identify. Plenty of work has gone on to remedy that, but I believe (for example) that we don't know whether the J. H. Thorpe is the one who was Jeremy Thorpe's grandfather. Charles Matthews (talk) 07:42, 21 May 2011 (UTC)
Rich, I am hoping that you have already seen Dictionary of National Biography, 1885-1900/List of Contributors and have a look at the talk page to where we have plenty of identification. I cannot say that I have been back to much work there more recently. — billinghurst sDrewth 10:10, 22 May 2011 (UTC)

15,000 milestone

According to Magnus's gadget the 15000th DNB biography here is Wilmot, John (1647-1680) (DNB00). Category:DNB biographies has slightly different ideas; but let's not quibble about numbers. I've thought for a little while that this milestone is the best one we'll have to mark the midpoint of the project—as far as the creation of the biographies goes, that is.

So, a good time for a chat about how it should go from here, in consolidating the project. Certain things about the desired final state are not clarified; and some idea of strategy would probably help. I have various points to bring up. But would someone else like to start? Charles Matthews (talk) 20:01, 22 June 2011 (UTC)


I'm putting up a list of the unsigned (i.e. anonymous) DNB articles at Wikisource:WikiProject DNB/Unsigned. Most of these are in early volumes, and were because Leslie Stephen was being coy, and there are something over 300 in all.

There are a couple of definite uses for the list (checking that all articles are present as a complement to author pages; making up volume ToCs). But there is another issue: we currently have two styles in play, namely linking to Author:Anonymous and placing "no contributor recorded" in the contributor field. A third option would be to create a dedicated page in the author namespace for such a list, and link to that. I wonder what people think: for some purposes having such links would be a plus. Charles Matthews (talk) 09:36, 30 July 2011 (UTC)

I thought that we had decided to identify them as "no contributor recorded". They should appear in the category that we set up, though we can do it differently if required. I probably need to go back and poke it, as we still had a number of those identified on that page that were there solely due to the field being omitted (ie. early transcripts). — billinghurst sDrewth 15:12, 30 July 2011 (UTC)
Category:DNB no contributor has the bits, and there is a subset Category:DNB See, which I also remember chatting about though again was waiting until we had moved all the pages that were not backed by scans to scans. — billinghurst sDrewth 15:34, 30 July 2011 (UTC)

Author pages

I have a couple of points to raise about what we do on author pages, one for now and one for the future.

Firstly, there was my theory that very long lists of DNB articles on author pages would prove unacceptable. Author:Thompson Cooper has, however, had 1400 DNB links now for a while, and I'm not aware of complaints. So it seems my caution was unnecessary, and the proposed solution of subpages of author pages isn't required. So I suggest that those subpages now be dismantled. NB that there is another solution, in fact what the Germans do for the ADB, their equivalent: a dedicated category. I wouldn't enjoy this while there were still redlinks to fill in.

Okay, we can start dismantling them, and putting them back into the root level. I will ask someone to run a sql query to generate a list.
I have repatriated the remaining subpages to the root level. Are we right to kill Template:DNBauthorsubpage, or would you like it to remain? — billinghurst sDrewth 15:29, 17 August 2011 (UTC)

Second, there is the issue of the separation of the DNB00, DNB01 and DNB12 parts of the work on author pages. Having thought about it, in the long term, I believe it would help the reader not to maintain the distinction, but to have a single DNB list, alphabetical by name. There is a small twist to this, in that some amount of disambiguation might need to go on; this actually could be handled by piping, and needn't involve a new layer of disambiguation on the page. But I would see this as part of the final tidying of the work, when all redlinks are filled: the separation currently serves a useful function as we work through the volumes. (Also it comes to mind that the DNB12 effort will need fresh application, to get author pages up and list the articles to create on them.) Charles Matthews (talk) 06:51, 31 July 2011 (UTC)

Probably right, though one wonders whether a numerical count of biographies undertaken by volume would be helpful. Note that we can probably get someone to do a sql run once the vols. are complete. Re 1912, do we have a list of authors? That probably means that we should review the templates against the requirements for all the linking templates. Probably worthwhile running a pilot to get our head around it. Also we can probably grab the author list from the volume, and work upon it. — billinghurst sDrewth 08:52, 31 July 2011 (UTC)

"Messy" lists

In case anyone wonders what is happening at /Messy lists, it is an attempt to do something about what is written at /Master lists by means of a CatScan search. In other words to scrape redlinks off about 450 author pages. I believe this is going to be useful in a few ways; but on the other hand there are also a few caveats to enter. I have been quietly working with some other "messy" lists over at letter P. There is the potential to cut down the time needed to complete the DNB00 volume ToCs, and also to pick up on some dab work in an economical fashion. I can explain more if required. Charles Matthews (talk) 13:36, 16 August 2011 (UTC)

So the longer lists by letter have been put on subpages of that page, to avoid the slowness of manipulating a page of nearly 400K. There are a few tasks that are immediate and in the nature of tidying. Where there are complete letters and volumes, any DNB00 redlinks on these new lists in those areas are caused by a mismatch of an author page redlink and the article title as created (can be a dab issue, typo, title convention point, and so on). So a bit of maintenance to do. Also I noticed that the search throws up some dab issues itself: for example Cooke, Thomas (DNB00) is the top hit, being linked to by five author pages. I can get these done out of the Fenwick handbook easily enough.

The main list should be redone over time, at least in part, to purge bluelinks as they appear, and to feed back the dab work. The main idea has always, in my mind, been to create volume ToCs from a "master list". Given a list for a volume in "messy" form, paging through a volume that currently has no ToC to create the listing is not going to be a very long task: certainly quicker than starting from scratch. This is where the caveats come in, of course. From the "messy" redlink list, you need to:

  • add bluelinks that are available on the volume ToC already;
  • add "unsigned";
  • ASCII-sort the whole list you have;
  • check with the actual volume text on ordering (not exactly ASCII) and proper dab.

There may be a few missing (e.g. errors of omission in author page lists, bluelinks that are not on the ToC). But after the pass through the volume there should be a good-enough volume ToC to post. Charles Matthews (talk) 09:22, 17 August 2011 (UTC)
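As an illustration of the first three steps above (merge in the bluelinks, add the unsigned articles, ASCII-sort), here is a minimal Python sketch. The function name and the sample titles are made up for the example; it assumes the three lists are already available as plain page titles, and it does not attempt the final step of checking order and dabs against the printed volume, which still has to be done by eye.

```python
def draft_volume_toc(messy_redlinks, toc_bluelinks, unsigned):
    """Merge the three sources, drop duplicates, and ASCII-sort the result.

    All three arguments are iterables of page titles (hypothetical input;
    how they are scraped is outside this sketch).
    """
    merged = set(messy_redlinks) | set(toc_bluelinks) | set(unsigned)
    # Python's default string sort compares code points, which is
    # plain ASCII order for ASCII-only titles.
    return sorted(merged)


if __name__ == "__main__":
    # Placeholder titles, not real DNB articles.
    messy = ["Example, Thomas (DNB00)", "Exam, John (DNB00)"]
    blue = ["Example, Edward (DNB00)", "Example, Thomas (DNB00)"]
    anon = ["Examiner, Anon (DNB00)"]
    for title in draft_volume_toc(messy, blue, anon):
        print(title)
```

The output is only a draft listing: the DNB's own ordering is not exactly ASCII, so the pass through the volume remains necessary.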

Stats January 2012

Posted at Wikisource:WikiProject DNB/Statistics: we start the year at 63% done (figure includes the supplements in the 100%, so it is more like two-thirds of the first edition). It ought to be the case that 2012 is the year in which we get the DNB under control.

Anyway this is a reasonable moment to do some looking ahead and planning. I have a straightforward idea or five-year-plan, which I'll be posting and discussing mostly at the WP end of the project: take a volume complete here each month, and do adaptation and checking of that volume over at Wikipedia.

I don't at all suppose that a complete job can be done there, in a month, but new tracking pages can certainly be set up and some of the more serious stuff seen to. Such a VOTM implies a few things at this end also: a focus for validation and the checking of WP lks; definitive format for the volume ToCs; and a completeness check (make sure all the biographies are present). So if there is a task list for DNB VOTM/WS, what should be added to it? Charles Matthews (talk) 20:08, 2 January 2012 (UTC)

I've been slowly working through volume 4. At my current rate of progress, there is no way it will be ready by April. Sorry. -Arch dude (talk)

Not starting at the beginning, though. Letter A has had a lot of attention. I decided to try vol. 21 as more typical. See w:Wikipedia talk:WikiProject Dictionary of National Biography#Volume of the Month for the deal in the other place. Charles Matthews (talk) 20:41, 3 January 2012 (UTC)

Wikisource:WikiProject DNB/Most wanted articles

This page is now active, after a lull (material for a project around w:Nominate reports, by User:James500, prime territory for DNB additions). If anyone would like to tidy up what is there now, that would be great: I'm doing a daily pass at present to create articles. Charles Matthews (talk) 10:41, 6 January 2012 (UTC)

Mc, Mac and messiness

I have just added a note to Talk:Dictionary of National Biography, 1885-1900/Vol 35 MacCarwell - Maltby. Basically my standard method of using lists like on Wikisource:WikiProject DNB/Messy lists/M that are ASCII-ordered to build up the volume ToCs breaks down badly: volume 35 starts

MacCarwell, David
M'Caul, Alexander
McCausland, Dominick
McCheyne, Robert Murray
MacCluer, John

showing the old librarians' convention in action. Meaning that the alphabet goes M - Mc - N and both Mac and M' are sent to the normal forms McCarwell and McCaul before ordering. (If I have that right - User:George Burgess?) Anyway a wee bit of programming would come in handy here. Charles Matthews (talk) 11:24, 18 June 2012 (UTC)

I claim no particular expertise here, but I would normally expect all "M'"s, "Mc"s and "Mac"s to be treated alike. The DNB approach appears to be to treat them all as "Mac"s rather than "Mc"s- see for example "Macclesfield, Earls of" in the index between McCheyne and McCluer.--George Burgess (talk) 19:56, 18 June 2012 (UTC)

Ah, right. In any case if there is a "normal form", then it becomes comprehensible to a machine. My edit to the Vol. 35 ToC is probably not useful as such, and I'll go and revert it now: it will still be in the history. I'm working on letter C, by the way, with the aim of getting A to G solid, as P to Z is solid. Then we have the "endgame" for DNB00. Sorting out the volumes affected by the Macs is a subtask (I mean getting the listings straight) that might appeal to someone methodical who wants a break from proofing, or who would like to automate. Charles Matthews (talk) 07:21, 19 June 2012 (UTC)

I now believe the sorting thing can be done with find-and-replace and a template trick. So you can all relax ... Charles Matthews (talk) 05:35, 20 June 2012 (UTC)
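For anyone who would rather automate the sort than use find-and-replace, a minimal Python sketch of a "normal form" sort key follows. It assumes, following the reading given above, that M', Mc and Mac are all filed as "Mac"; the function name is invented for the example, and the real DNB ordering may have further quirks that this does not capture.

```python
import re

# Leading "Mc" or "M'" (straight or typographic apostrophe) is rewritten to "Mac".
_MC_PREFIX = re.compile(r"^(?:Mc|M['\u2019])")


def mac_sort_key(name):
    """Return a sort key in which M', Mc and Mac names are filed together as 'Mac'."""
    normalised = _MC_PREFIX.sub("Mac", name)
    # Lower-casing makes 'MacCluer' and 'Macclesfield' compare letter by letter.
    return normalised.lower()


if __name__ == "__main__":
    names = [
        "MacCluer, John",
        "McCheyne, Robert Murray",
        "M'Caul, Alexander",
        "MacCarwell, David",
        "McCausland, Dominick",
    ]
    for n in sorted(names, key=mac_sort_key):
        print(n)
```

Sorting with this key reproduces the start of volume 35 as quoted above, and places "Macclesfield, Earls of" between McCheyne and MacCluer.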

Added three volumes of Second Supplement

To note that I have added the first two volumes of the Second Supplement (1901-1911), these being the only volumes that I can locate. I have also updated the substituted template {{DNBset}} that allows for its use for the first and second supplements. Addition of the parameter yr = 01 or yr = 12. If you notice anything wrong with the template then please get back to me. Similarly if you find a later volume. I would also be interested if anyone was able to find the 3rd supplement, as I believe that we can add it too. — billinghurst sDrewth 04:34, 9 September 2012 (UTC)

I believe vol. 3 of the 2nd Supplement is out there: see User:Charles Matthews/DNB referencing data#Adamant at the bottom for the key. NB that I can't read it, because in general UK readers can't get Google's versions of the DNB. But the external link tool shows that on WP there are references to it, at the version of the address. So I would expect our American participants to have access to this edition.
On the 3rd Supplement, I believe we are out of luck as far as public domain is concerned. It covers 1912 to 1921 (deaths) but was published around 1927. Charles Matthews (talk)

Pardon - I wish you guys would just come to me first re: American access to such works - especially for a project as important as DNB. At the same time, there are a few DNB volumes long marked as needing source file fixing/replacement that I'd love to get out of the way (I've swapped in a better volume 60 from one of the lists on scan quality somewhere here or on WP already for example).

And yes, 2nd Supp., Volume 3 is available on GooBoo at least in two forms ...

... and there are probably more than just those 2 out there (as you know - the naming is frequently not spot on for every existing volume Google hosts).

Just give me a task and I'll report back to you with the available options here in the U.S. - even temp upload base files for your review if need be. I'd much rather take the time out to properly prep a file before it gets worked on rather than after ~600 pages are already in place and I have to squeeze in a missing page near the front or something. -- George Orwell III (talk) 23:35, 9 September 2012 (UTC)

The versions that we have looked at and commented upon are at Wikisource:WikiProject DNB/Progress. To this point the second supplement has been far from a priority, until I needed it for an author page reference. :-) — billinghurst sDrewth 08:15, 10 September 2012 (UTC)
The list is OK I guess but most of those scans are coming up on being 5 or 6 years old from the date of conversion. A large part of those have since been re-done or superseded by later incarnations.

Anyway - volume 3 of the 2nd Supplement is also up --> Index:Dictionary of National Biography, Second Supplement, volume 3.djvu -- George Orwell III (talk) 05:45, 11 September 2012 (UTC)

I've made a working list of authors for Supplement II: Wikisource:WikiProject DNB/1912 authors. These are as found in the volumes: many are "usual suspects" from the previous volumes, and so can be blue links by correcting the name. Many should be easy identifications given the biographical clues there (FRS and so on). And some will probably prove tricky. But in any case, a start, and some of the template work can be done. Listing the articles will probably have to be done directly - the Fenwick handbook doesn't go that far.
PS: Why is John Henry Bernard under O? Because he signed John Ossory, as bishop of Ossory. Charles Matthews (talk) 09:29, 11 September 2012 (UTC)
Work on identifying these 1912 authors is now under discussion: Wikisource talk:WikiProject DNB/1912 authors. Charles Matthews (talk) 08:19, 20 January 2013 (UTC)

25,000 up

The November stats show 25,000 articles (actually a little more). By the way, the percentages on the stats page are calculated with 30,000 = 100%, which is enough to give the trend sensibly, but is only a round number. Anyway this milestone is the last before the first edition is done. Charles Matthews (talk) 21:51, 2 December 2012 (UTC)

Standing and clapping. — billinghurst sDrewth 23:15, 2 December 2012 (UTC)

New year report

Well, things are now looking in good shape. There are 26,800 articles created. The 1901 Supplement volumes are apparently complete (i.e. articles done), though I've not had time to check them through. And the major technical issues seem to have been covered. Only about 25% of author pages still need articles (that disregards DNB12, where we are still gearing up).

One point of interest is the relationship between Category:DNB No WP and the articles actually missing on enWP. The ever-helpful Magnus Manske worked out a quick way to order this category (about 11.5K articles) roughly by length. There would be various ways to go about that. This one isn't a tool as such. What Magnus does is to look at the number of pages transcluded into a given biography, and order the biographies by that number, descending. The biographies with more than four such pages already all had WP articles. After these were cleaned out, and I went through those spread over exactly four pages in pagespace, the results looked like w:User:Magnus Manske/dnb ws no wp. Something over 80% of the Category:DNB No WP articles with four pages transcluded did already have a matching WP article. That leaves 30-odd to do: our "longest missing articles". These are for the sister project to worry about.
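The ordering described here can be sketched in a few lines of Python. This is not Magnus's actual code, which is not shown on this page; it assumes the transclusion counts have already been obtained somehow as a mapping from article title to number of Page: pages, and the function name and sample titles are made up.

```python
def longest_first(page_counts, min_pages=1):
    """Order article titles by number of transcluded pages, longest first.

    page_counts: dict mapping title -> number of Page: pages transcluded
    (hypothetical input; obtaining it from the wiki is not shown).
    min_pages: drop articles spread over fewer pages than this.
    """
    kept = [t for t, n in page_counts.items() if n >= min_pages]
    # Sort by descending count, then by title so ties come out in a stable order.
    return sorted(kept, key=lambda t: (-page_counts[t], t))


if __name__ == "__main__":
    counts = {"A (DNB00)": 4, "B (DNB00)": 7, "C (DNB00)": 4, "D (DNB00)": 1}
    print(longest_first(counts, min_pages=4))
```

Page count is only a rough proxy for article length, which is all that is claimed for it above.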

Of course the situation is dynamic: new articles change it. But the percentage suggests that with the DNB match tool the category could be trimmed down now a fair bit. There are a few months of article creation still ahead here, but it doesn't seem premature to try to get a number representing the total number of DNB articles missing on enWP.

There are plenty of other maintenance tasks, of course, most of which will become easier when the articles are all here. Charles Matthews (talk) 21:28, 2 January 2013 (UTC)

Update: I have been doing some work on the "messy lists" and it has given me a number for the biographies still to do in DNB00: it's about 1800 now. Charles Matthews (talk) 10:17, 11 January 2013 (UTC)

Author pages — Contributors in later volumes

Starting to fix the author pages for the post 1900 contributors, as I fiddle through the remaining author components. I am seeking opinions on how we progress on the {{DNB contributor}} vs. {{DNB contributor done}} components. To separate or not? At completion of 1900, is the 'done' aspect necessary, or is that more an internal maintenance marker? I will also need to look at the DNB footers. When I am looking at both these templates I will see if there is a neater way to have both the author page contributor marker, and the article footer components. Our template skills have improved and they probably can be done a lot nicer, even if they have a better underlying template. Comments welcome. — billinghurst sDrewth 16:53, 12 January 2013 (UTC)

Currently, I actively use {{DNB contributor complete}} in a maintenance role, as well as flipping it over to {{DNB contributor done}} when the articles are all there. It should be the case that when the first edition DNB is all posted here, and all the author pages are checked for redlinks (which would indicate wrong titles, therefore), that the distinction between these templates will no longer be necessary. In other words {{DNB contributor done}} could then be redirected to {{DNB contributor}}. I suppose we should be thinking in terms of a complete redesign of the DNB on author pages, anyway. Charles Matthews (talk) 05:15, 13 January 2013 (UTC)

Category:Dictionary of National Biography contributor templates conflicts

As we get further into time and volumes, we are starting to run into issues with {{DNB footer initials}}. Our initial response was to convert the conflicting initials to disambiguation pages, eg.

However, we have not been wholly successful

It got worse with the first supplements, and it will only get worse with the second and third supplements. The solutions as I see it are:

  1. to continue to disambiguate; and with every new edition to create new disambigs and to update all pages as required (a bot run will work); OR
  2. to add a volume parameter to the template that is affected, in essence {{DNB FL}} flips out the right volume. Typically, we would default the names to the initial 63 vol. work so there is no need to amend them now, or we could always bot run them.

Neither solution is perfect; it depends on the community's view of what they want to see and achieve, or have to remember to do. — billinghurst sDrewth 05:25, 24 January 2013 (UTC)

I see what you are saying. But I'd suggest we stick with option 1. Unless there is a sudden shift of 5 years in the base year for US public domain, we are only talking about DNB1912. As far as I'm aware the problem is not overwhelming. Charles Matthews (talk) 21:12, 24 January 2013 (UTC)
The original disambiguation templates looked like {{DNB EC}}. These were intended to tell the editor to replace the template with the appropriate DAB as needed. I thought that I had found all of the dabs for the original 63 volumes, but perhaps not. Use of a volume parameter with a default is probably not a good idea, because it does not warn the editor in any way if there is a mistake. However, there may be a way to pass a parameter from the article header into the contributor template. If so, we can simply alter the header template to seed the volume parameter, and then use that parameter in our ambiguous contributor templates. This is beyond my current wiki-fu, but I seem to recall that such a capability exists. -Arch dude (talk) 23:03, 26 January 2013 (UTC)
Basically, we need to add logic in the {{DNB00}} template and our other header templates (if we are brave and arrogant enough, we can add this to the {{header}} template) that says something like
set the "work" parameter to "DNB00" and set the "volume" parameter to vol
and then, to each ambiguous contributor template, add
if the "work" parameter is "DNB00" and the "volume" parameter is 50, use "Harvery Schmedlap", else use "Herkimer Smoot"
or whatever logic is needed. -Arch dude (talk) 23:22, 26 January 2013 (UTC)
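To make the proposed logic concrete, here is a rough Python model of it. This is not wiki template code and says nothing about how a header template would actually pass parameters down; the initials "H. S." are invented for the example, and the two names are just the placeholders used above.

```python
# Per-initials overrides keyed by (work, volume); everything here is illustrative.
OVERRIDES = {
    "H. S.": {("DNB00", 50): "Harvery Schmedlap"},
}
# Fallback contributor for each set of ambiguous initials.
DEFAULTS = {
    "H. S.": "Herkimer Smoot",
}


def resolve_contributor(initials, work, volume):
    """Pick the contributor for ambiguous initials from the work and volume."""
    by_volume = OVERRIDES.get(initials, {})
    # Use the override for this exact work and volume if there is one,
    # else fall back to the default name for these initials.
    return by_volume.get((work, volume), DEFAULTS[initials])


if __name__ == "__main__":
    print(resolve_contributor("H. S.", "DNB00", 50))  # Harvery Schmedlap
    print(resolve_contributor("H. S.", "DNB00", 12))  # Herkimer Smoot
```

Note that the silent fallback is exactly the weakness raised above about defaults: nothing warns the editor when the volume is wrong or missing.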
This wasn't a criticism, more just a development, especially as we move beyond the original volumes, and that our concatenated source of contributors has proven to have errors. I am not looking to hack DNB00 as it doesn't directly affect the footer initials template, and trying to get that relationship with DNB01 and DNB12 is just asking for trouble when the fixes are simpler, especially when it won't show that way in the Page: ns, and writing that bit to pull the volume information is ... MEH! There are two easy solutions, one requires the mistake approach of "Oh, damn that is disambiguated" make the change, or adding the volume information to every footer template. CM has indicated the former suits better, and that is okay with me. Just need to work through the process to do it. — billinghurst sDrewth 04:20, 27 January 2013 (UTC)

Added anchor = parameter to {{DNB link}}

I have finally got fed up with not having the ability to link easily to a sub-article, and the desire/irritation/need was strong enough today to get around to adding that facility to the template. Added and documented. In short, if you want to get to the sub-article, anchor = whatever is the anchor used. — billinghurst sDrewth 12:53, 29 January 2013 (UTC)

Last big push on DNB00

As of right now there are 28528 DNB00 and DNB01 biographies posted. That leaves something over 500 to go: Wikisource:WikiProject DNB/Messy lists would say about 510 but that underestimates because of non-disambiguated names. In any case the first edition is 98.5% done. (The completions page is not updated - no time to check volumes!) Charles Matthews (talk) 09:25, 1 February 2013 (UTC)

Getting the place tidy for visitors. I came across a biography with {{incomplete}} today: apparently this is unique, though. According to this there are currently 18 instances of biographies carrying {{migrate to djvu}}. Charles Matthews (talk) 11:40, 3 February 2013 (UTC)

I will track down the incomplete and migration and work out what needs to be done. — billinghurst sDrewth 12:21, 3 February 2013 (UTC)
Done "migrate to djvu" and you noted the incomplete. — billinghurst sDrewth 00:19, 4 February 2013 (UTC)

The incomplete one is Lowe, Hudson (DNB00). I guess I'll fix it later today, since it is among the batch I'm working on.

It occurs to me that remaining redlinks on volume ToCs are confusing. But they could be commented out: good reasons not to remove them without discussion. Charles Matthews (talk) 16:00, 3 February 2013 (UTC)

I would prefer to only do what is correct to fix them (after consideration), than just react to redlinks because we are reaching a milestone. So we probably should have that discussion. I am so hating the old header template that we hacked below {{header}}. I just haven't had the priority to fix them. — billinghurst sDrewth 00:19, 4 February 2013 (UTC)

OK, I believe we are talking about material on the volume ToCs placed there in imitation of the ToCs that come on pages at the end of each DNB volume. I suppose, though I don't know that we have discussed this at length, that the volume ToC should eventually offer the reader links comparable to a hyperlinked version of those pages? Vol. 34, for example, seems to have a quite full version of the "auxiliary" information. I fixed a couple of corresponding redlinks there, which were simple changes in the link. Can't be any objection to that. In Vol. 33 as it stands the entry that is in wikitext

*Lemens, Balthazar Van :See: [[Van Lemens, Balthazar (DNB00)|Van Lemens, Balthazar]]

is just a bit puzzling because the actual article is at Van Lemens, Balthasar (DNB00) which is a different spelling. There is an actual "See" page for that one at Lemens, Balthazar Van (DNB00), and to fix the "van Lemens" link on that one I changed the piping on Page:Dictionary of National Biography volume 33.djvu/31 to the "s" spelling, as one would. But the volume ToC currently doesn't link to the "See" page, it tries to reproduce the same effect.

A full case analysis is not beyond our powers, and ideally all the redlinks go away with enough work. We haven't really decided on the status of the "See" articles in the project. For me they add value in hypertext terms, but among hypertext issues are not the highest priority (they certainly can be useful - for example in finding the surname relating to "Lord Hunsdon", Hunsdon, Lords (DNB00) came to my rescue).

So we can tackle all this (there seem to be half-a-dozen types of "auxiliary" volume ToC entries) with enough patience. My point was simply that in cases like Vol. 36, where the articles all exist, there are redlinks that are not explained as such. An outsider might be puzzled when told the volume was complete. An alternative way is not to comment out anything, but to add a template to such pages to the effect that redlinks are in process of being sorted out. I would not actually want to remove such redlinks when they are there for some reason. Charles Matthews (talk) 07:12, 4 February 2013 (UTC)

We must be clear on the distinction between the ToCs and the DNB00 index pages. We have total control over the ToCs: these do not exist in the original 63 volumes. We created these as a rough analogy to the ability to thumb through the physical volumes, and we have total control over their content. We have three distinct issues to address:

  • disambiguation entries in the DNB
  • creation of index articles equivalent to the indexes of each volume
  • disambiguation entries in the ToC
For the first: if there is a "See" entry inline in the DNB, should we have an article?
For the second: should we have a faithful index article for each of the 63 volumes, and what, exactly, should it look like?
For the third: When the DNB has a "disambiguation article" what should the ToC entry look like?

As the project has progressed, I have come to the conclusion that we have not been rigorous enough. If I had to do this all over again, I would do the following:

  • Each disambiguation entry in the DNB would have a separate article
  • The ToC entry for a DAB would simply point to the DAB article
  • There would be an index article for each of the 63 volumes.
    • The format of the index article is fairly complex. It links to both article space and page space, therefore violates the space separation.

All of these refinements are secondary to completion of the basic work of completion of the DNB transcription. I am in awe of the work that has been done. -Arch dude (talk) 03:52, 5 February 2013 (UTC)

Comment Some of the red links become redirects, those "See" that are sub-articles can now be directed to the article since I added an anchor = parameter to {{DNB lkpl}}.
Re disambiguation articles, I disagree, as we need to look at this in the context of all enWS, not just this work. enWS has a specific approach to disambig and it encompasses this, AND we are not a book: we link to articles (internally or externally from enWP or implicitly from search engines), or people type in the search box and meet the ajax look ahead function. The number who will free type a name with the DNB00 suffix would be absolutely minimal.
Reproducing the indexes from the work will come; they are just the lowest priority and are going to be complex. IF we choose to link them, there will be some complexity. They will not link to the Page: ns, as that has been a determination of the community, and I don't see the point of it for the page numbers, as the article links already do that at the top of the article. — billinghurst sDrewth 05:55, 5 February 2013 (UTC)
General point: we are right now on the cusp between "heavy lifting" and "intense curation". Getting the last DNB articles up is a matter of days away. I have been trying to list the remaining "issues" in some sort of orderly way, and there are more than a dozen points. So we shall have to chip away at a long agenda. Perhaps a fitting way to start would be to archive this page next week, sorting out the active threads. Charles Matthews (talk) 07:38, 5 February 2013 (UTC)

The final article in the first edition went up yesterday. Charles Matthews (talk) 11:48, 11 February 2013 (UTC)

Volume 31 update

As of 10 February 2013, volume 31 has been replaced with an improved quality source file on Commons. No bulk-moves or bulk-deletions were needed nor any changes made to existing mainspace articles.

There were, however, a handful of Pages: previously marked Problematic among a dozen or so others marked un-proofRed - all most likely due to the prior thumbnail scans being blurred, cut-off and the like (the content itself looks like it's been processed though). Knocking out the current 20 or so remaining red pages would bring the Index: into the rare Validation phase for someone who likes the taste of low-hanging fruit. Another volume fix in a couple of hours... -- George Orwell III (talk) 09:18, 11 February 2013 (UTC)

Volume 40 update

As of 11 February 2013, volume 40 has been replaced with an improved quality source file on Commons. Two prior duplicate pages no longer exist and the bulk-move to correct for that has been completed. All mainspace articles in the affected Page: range have already had their pages tag-lines adjusted as well.

Dozens and dozens of Pages: previously marked Problematic, among a handful of others marked un-proofRed - all most likely due to the prior thumbnail scans being blurred, cut-off and the like - still exist. The content itself looks like it's been processed from 3rd party sources, however, so the remaining proofreading needed should be "light".

Now that the Project is technically "Live", I'd hate to see any uptick in visitors as a result come across something as unusual to the unfamiliar as the Problematics may be more often than they really should have to, but I leave addressing them up to the members to prioritize. -- George Orwell III (talk) 23:36, 11 February 2013 (UTC)

All Articles? Great!

Charles informs us that all articles in DNB00 are now present. This is a huge milestone: congratulations all. Should I now update the boilerplate in the main article and the boilerplate in the "access to scanned articles" template? -Arch dude (talk) 23:59, 13 February 2013 (UTC)

After reading the announcement at the UK site, I decided to just make the changes. -Arch dude (talk) 01:31, 14 February 2013 (UTC)
And why not. Charles Matthews (talk) 08:19, 14 February 2013 (UTC)

A broken header

Something went wrong HERE. Could someone please fix it? --P. S. Burton (talk) 01:38, 15 February 2013 (UTC)

Le Marchant, John Gaspard (1766-1812)_(DNB00)

Le Marchant, John Gaspard (1766-1812)_(DNB00) is proofread but there seem to be a lot of OCR errors in the text. Also at least one of the page joins needs fixing. I am working on other things at the moment so I am posting here in the hope that someone else has time to fix the errors. -- PBS (talk) 17:04, 22 February 2013 (UTC)

On the bigger problem of validating the work now posted, we basically have no slick solution: dozens of people have contributed to the project, and we AGF all round, naturally. Here's how I see it. There is the digitisation on the ODNB site, of a later edition. It can certainly be used to patch our versions. And we should be doing plenty of volume-by-volume passes through the DNB, for specific tasks such as hyperlinking. So in a piecemeal way many of the problems will get picked up. But can't we be more systematic and smarter? Well, yes, perhaps.
Here's the deal. If there were not the "spurious linebreaks" in the text we have, some semi-automation could be used to display the diff between our text and the ODNB text. For example in this case the linebreaks had been fixed by the proofreader. Also no change had been made from the first edition (ours) to the ODNB edition (in effect 1912, but there may have been a few tweaks later). So it wasn't hard to replace the text, at the cost of redoing a little format.
If the linebreaks are still the "OCR legacy" type, then "Show changes" doesn't display a proper diff, which makes it hazardous to use ODNB text. It is possible to remove said linebreaks by find-and-replace, but then you lose the para breaks that are needed. So marking the paras by tokens, replace the linebreaks (with due allowance for hyphenation), put back the paras by replacing the tokens, is one pass that fixes the essential problem. Then "Show changes" is OK to monitor that nothing improper to the first edition gets added in - it would be a bad idea to correct one load of issues just to introduce others (though generally in the readers' favour, which is why I have used ODNB text in proofing).
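The token pass described in this paragraph can be sketched in Python as follows. It is a minimal sketch under stated assumptions: paragraphs are separated by blank lines, the token string is arbitrary, and the hyphenation rule simply rejoins a lower-case word split by "-" at a line end. That rule will wrongly close up genuine hyphenated compounds that happen to break at a line end, so the output still needs a human eye.

```python
import re

# Arbitrary placeholder that should never occur in the article text.
PARA_TOKEN = "\x00PARA\x00"


def unwrap_ocr_linebreaks(text):
    """Remove OCR-legacy linebreaks while keeping paragraph breaks."""
    # 1. Mark paragraph breaks (one or more blank lines) with a token.
    text = re.sub(r"\n(?:[ \t]*\n)+", PARA_TOKEN, text.strip())
    # 2. Rejoin words hyphenated across a line end, e.g. "edu-\ncated".
    text = re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)
    # 3. Turn every remaining linebreak into a single space.
    text = re.sub(r"[ \t]*\n[ \t]*", " ", text)
    # 4. Put the paragraph breaks back.
    return text.replace(PARA_TOKEN, "\n\n")


if __name__ == "__main__":
    sample = "He was edu-\ncated at Bath\nand joined.\n\nHe died later."
    print(unwrap_ocr_linebreaks(sample))
```

After such a pass the text has one line per paragraph, which is the form in which "Show changes" gives a usable diff.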
To sum up, to get a more systematic validation path, we can either try to get onto the linebreak issue; or (and I think it gets interesting here) is there a technical fix, a more intelligent way to display the diff that can cope with linebreaks that are not after "."? The latter doesn't sound so much out of reach.
Charles Matthews (talk) 14:04, 24 February 2013 (UTC)
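On the "more intelligent diff" idea: one possible approach, sketched here in Python with the standard difflib module, is to compare the two texts word by word, so that it no longer matters where the linebreaks fall. This is only a sketch under that assumption, not an existing tool; the function name is invented, and words hyphenated across a line end would still show up as differences.

```python
import difflib

def word_diff(ours, theirs):
    """List the differences between two texts, ignoring all whitespace layout."""
    a, b = ours.split(), theirs.split()  # split() treats linebreaks like spaces
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(f"{tag}: {' '.join(a[i1:i2])!r} -> {' '.join(b[j1:j2])!r}")
    return changes
```

Two texts that differ only in line wrapping give an empty list, so anything listed is a real difference in wording or punctuation.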

DNB12 authors

To be specific, missing authors from the first volume of the 1912 supplement. There is a page up at Dictionary of National Biography, 1912 supplement/List of Writers which transcludes the pages listing authors by their abbreviations. It would be traditional to put the detailed discussion on Talk:Dictionary of National Biography, 1912 supplement/List of Writers. I have just done a pass to pick up some obvious identifications and typos for the first volume there; it leaves 59 redlinks. Some of those at least are not at all hard.

This is topical because there is some article creation now proceeding for vol. 1. NB that there is an author page template {{DNB contributor 2ndSupp}} and if you look at backlinks to Template:DNB contributor 2ndSupp you'll find quite a number of new author page creations for DNB12 by User:Dsp13. See Dictionary of National Biography, 1912 supplement, Volume 1 for recent action on the biographies. Charles Matthews (talk) 08:40, 9 April 2013 (UTC)

Author abbreviated L. M. M.

How do we know that Miss Middleton abbreviated L. M. M. is Author:Lydia Miller Middleton? Particularly as she married Sir Middleton to get that surname. At Page:A Dictionary of Music and Musicians vol 3.djvu/10 I have a Miss Louisa M. Middleton who is referred to in this publication as the contributor to DNB. Beeswaxcandle (talk) 04:34, 23 May 2013 (UTC)

You appear to have a very good point. It is Lydia Miller Middleton in Gillian Fenwick's Contributor's Guide. But no supporting references there, I believe. Charles Matthews (talk) 08:37, 23 May 2013 (UTC)
Married 1890, so that isn't a help. I'll see what else I can dig up. — billinghurst sDrewth 14:46, 23 May 2013 (UTC)
Actually, it does help us, when I fully reread the statements. The first LMM in the DNB is 1889, which is a year before the marriage. So, I do find Louisa Middleton, b. c1855 Calcutta, who is listed in the 1891 census as having occupation of Literature/Author; and from 1861 census is the daughter of a Calcutta merchant. In school in Scotland in 1871, still looking further. — billinghurst sDrewth 15:24, 23 May 2013 (UTC)
It is worth noting that the ODNB website gives the name as "L. M. Middleton". Charles Matthews (talk) 06:06, 30 May 2013 (UTC)
I think that we should be moving these over to the alternate author as provided by BWC. Many of them are musicians, so the evidence tends to point to the alternate view. — billinghurst sDrewth 12:02, 24 November 2013 (UTC)
Agree, a good catch. Charles Matthews (talk) 08:46, 25 November 2013 (UTC)
Done, and putting a copy at Author talk:Lydia Miller Middleton. billinghurst sDrewth 13:02, 25 November 2013 (UTC)

To note Author:Louisa M. Middleton

DNBmatch on Labs

Magnus Manske is migrating tools to Wikimedia Labs. The DNB match tool is done:

The look is improved - note the query form at the end that includes two other works. But the big plus is that the performance is much better than we've had to put up with on the toolserver. if you have requests for other tools.

Charles Matthews (talk) 18:40, 1 June 2013 (UTC)

The maintenance tool has also been ported:

Charles Matthews (talk) 13:32, 3 June 2013 (UTC)

And a new tool ...

The stability of Labs has made possible something that previously was only a "cunning plan":

shows WS:WP, ie the ratio of lengths (in bytes) of DNB articles here and the linked article on WP. Once again, our thanks to User:Magnus Manske.

The top hits have ratios over 100, and there is a reason for that, namely the "linked article on WP" may currently be a redirect. I think this is useful at the moment as a maintenance feature. We should make the "wikipedia=" field point to the actual title.
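For anyone wanting to reproduce the idea offline, the ratio itself is simple to compute. The sketch below is not the Labs tool's code; it is a minimal Python illustration that assumes the wikitext of both pages is already in hand, and the function name is invented.

```python
def length_ratios(pairs):
    """pairs: iterable of (title, ws_text, wp_text).

    Returns (ratio, title) tuples, highest WS:WP ratio first.
    """
    rows = []
    for title, ws_text, wp_text in pairs:
        ws_len = len(ws_text.encode("utf-8"))  # lengths in bytes, as in the tool
        wp_len = len(wp_text.encode("utf-8"))
        if wp_len == 0:
            continue  # nothing to compare against
        rows.append((ws_len / wp_len, title))
    return sorted(rows, reverse=True)
```

A redirect page is only a few dozen bytes long, which is why a "wikipedia=" field pointing at a redirect pushes an article to the top of the list.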

The WS version used is cached, while the WP version is live. That means that the updated articles will stay at the top of the list until the cached version is refreshed. I'll post more about that when I know more.

As of right now, I have done the top of the list down to Scotus, i.e. the first ten. Charles Matthews (talk) 11:48, 3 June 2013 (UTC)

Update. Re the caching, a tweak has been done, and the refreshing issue has gone away. Charles Matthews (talk) 11:55, 3 June 2013 (UTC)

Further update. After some work at the coalface, the tool is now providing the intended data. See w:Wikipedia talk:WikiProject Dictionary of National Biography#Ratios tool for a run-down. As a by-product maintenance on the "wikipedia=" field here can be done fairly painlessly. Charles Matthews (talk) 07:37, 24 June 2013 (UTC)

Matching pass

I have just completed a pass through the whole alphabet with the DNB matching tool. As a result, Category:DNB No WP now stands at just over 9K articles, i.e. under 70% of what it was when the DNB00 and DNB01 articles were completed. Matching and linking to WP is not only good in itself, it paves the way to further automation for the project as a whole.

The tool has quite a few quirks. Generally it suggests "false negatives", which is OK if one is aware: false positives are much more troublesome. Some remarks:

  • It doesn't cope with hyphens in names. Names that are hyphenated are worth checking by hand.
  • It doesn't cope with apostrophes, e.g. in O'Brien. I have just worked over the Irish names of this kind, by hand.
  • Singleton names, e.g. Osmund, are confusing to the tool.
  • It may transpose, e.g. "Lewis Thomas" when you want "Thomas Lewis".
  • The treatment of disambiguated names is rather uneven, so you may need to find the main dab page from the hit given.
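To make the quirks above concrete, here is a small Python sketch of how a DNB page title might be turned into a candidate Wikipedia title, with a flag for the cases that are worth checking by hand. It is not how the matching tool works internally (that is not documented here); the function name and the flagging rules are assumptions made for illustration.

```python
import re

def wp_candidate(ws_title):
    """Return (candidate WP title, needs_hand_check) for a DNB page title."""
    # Drop the "(DNB00)"-style suffix, then any trailing disambiguator such as "(d.1761)".
    name = re.sub(r"\s*\(DNB\d\d\)$", "", ws_title)
    name = re.sub(r"\s*\([^)]*\)$", "", name)
    # Hyphens, apostrophes and singleton names are the cases listed as troublesome.
    tricky = "-" in name or "'" in name or "\u2019" in name or "," not in name
    if "," not in name:
        return name, tricky
    surname, forenames = [part.strip() for part in name.split(",", 1)]
    # Put the name in natural order: "Lewis, Thomas" becomes "Thomas Lewis".
    return f"{forenames} {surname}", tricky
```

For example, "Osborne, Edward (DNB00)" gives "Edward Osborne" with no flag, while "Osmund (DNB00)" is flagged.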

All in all, my pass will not have picked up everything. There is quite a bit more, and I'm starting a second pass now.

Category:DNB No WP has some pre-made searches and there should be more. While the performance of the tool at Labs is much better, it probably fails when asked to search more than 300 to 400 names. So initial pairs of letters are good.

NB that the Epitome lists on enWP, such as w:Wikipedia:WikiProject Missing encyclopedic articles/DNB Epitome 01, are a separate operation. Reconciling those lists fully with the "wikipedia=" field here would be a good idea, but currently would be labour-intensive. Charles Matthews (talk) 10:31, 27 June 2013 (UTC)

Tracking page on WP

More the business of the other end of this project, but the discussion at


and direct link to


may be of interest. The technology makes tracking a project of about 30K at least something that can be attempted. Charles Matthews (talk) 12:33, 4 July 2013 (UTC)

Fourth Supplement on Internet Archive

Here, and it says it is not in copyright. I have to assume that's a mistake. Charles Matthews (talk) 21:42, 23 August 2013 (UTC)

Tools page

There is now a project page Wikisource:WikiProject DNB/Tools about the special DNB tools. Charles Matthews (talk) 07:42, 24 August 2013 (UTC)

DNB12 remaining authors

The discussion of authors has gone on spread over various pages, in a scatty fashion. There are currently five hard-core redlinks, at Wikisource talk:WikiProject DNB/1912 authors#Remaining authors. Some other author pages, out of the 300-odd needed for DNB12, have been created as "details not available", though.

Disambiguation of the author initials of the DNB12 authors has been proceeding, not quite complete. Charles Matthews (talk) 07:12, 4 September 2013 (UTC)

Really just two tough ones left. Author:A. L. Armstrong is definitely a connection of the Harcourt political family, but I don't know more. Author:D. J. Owen wrote three biographies of mathematicians; there is no reason to connect those to the David John Owen of the London Port Authority, author on ports. So he may be a connection of Author:W. B. Owen who was on the staff in 1912. In which case there is a candidate who was of the same age group of graduates, I believe, but hard to say much more.

There was the Northern Ireland historian [Sir, David John OWEN (1874 Mar 8 – 1941 May 17)] who is a possibility. Re Armstrong I can only find one biography and that usually indicates a personal knowledge, though I cannot find a relationship to this time. — billinghurst sDrewth

These are the last few discussions for around 800 authors. User:Charles Matthews/Companion theorises about one way to make a scholarly work from all the research that has been done, allowing for future changes of mind. This has been suggested before: in 2014 we should get down to "next steps", when all the DNB text has been posted. Charles Matthews (talk)

DNB12 milestone

The last of the DNB12 articles is now posted. User:Slowking4 did the bulk of the second supplement. That's all the public domain DNB articles, for now.

There is of course a large amount of checking still to do. There is a lot of work left round the edges: tables of contents, and "see" articles, in particular. There are hyperlinks to add, and the article start and page end format issues to pursue. There are still scans for DNB00 that should be replaced.

I have thoughts about author pages. I have posted them to get an outside opinion, at User talk:AdamBMorgan, to see if they fit into a sitewide style guide. Charles Matthews (talk) 10:27, 5 November 2013 (UTC)

So, with Adam's help, there is now something to look at, Author:Frank Herbert Brown in a proposed format. Note that the detailed proposal includes the idea of putting the supplements in a single alphabetised list, rather than at the end; and rationalising the contributor template family.

Please weigh in with any views. What I'm suggesting affects other works than the DNB, so if people here are happy, I'll mention it on the Scriptorium. Charles Matthews (talk) 05:51, 6 November 2013 (UTC)

Format dilemma

Not so accurate a title: "format inconsistency" would perhaps be better. I'm doing a pass, which will take a while, formatting the initial words of the articles, which allows me also to troubleshoot a few other things. Some of the supplement volumes don't bold the surnames. I assume that when I get to those, I also don't bold the surnames. Charles Matthews (talk) 10:39, 28 November 2013 (UTC)

IMNSHO just bold them. The intent of the authors hasn't changed, and it would be an oversight by the typographers. — billinghurst sDrewth 12:31, 28 November 2013 (UTC)

Fixed page width

As someone who often reads DNB pages on Wikisource, I think that the new fixed page width is less than helpful.

The format forces a line of text to a specific width and on a wide screen device it is like reading a newspaper column. It involves lots of scrolling because my web window is filled with white space which previously contained text. Conversely if I am reading it on a small screen device that is narrower than the text, instead of adjusting the text to fit the screen, it forces the reader to scroll to the width set by the text formatter.

If this format replicated the layout of the physical DNB pages then there would be some justification for it, but it does not.

As far as I can tell from this discussion page, this change in format was not discussed, so please put the format back to how it has been for many years until it is shown that there is a consensus for the change.

I am against this new fixed width format. -- PBS (talk) 13:15, 30 November 2013 (UTC)

See also Wikisource talk:WikiProject 1911 Encyclopædia Britannica#Fixed page width. It seems that this change affects more than one project and that this change was not trailed on that talk page either. For the same reasons as given here (including not replicating the original layout), I do not think the change to a fixed-width page format is an improvement for EB1911 pages. -- PBS (talk) 13:29, 30 November 2013 (UTC)
Archived Wikisource:Scriptorium/Archives/2014-01#Fixed page width -- PBS (talk) 12:21, 22 February 2014 (UTC)

Standard abbreviation for £ ?

I've been working on some articles in Wikipedia which are based on material from the DNB and have found what appear to be monetary amounts expressed as "200l". Can that be rendered as "£200", or what does it mean? Thank you. SchreiberBike (talk) 01:15, 21 December 2013 (UTC)

In books of this period sterling currency was often referenced in terms of l.s.d. from the Latin librae, solidi, denarii or pounds, shillings and pence. So, yes, you can treat 200l. as typologically equivalent to £200. Beeswaxcandle (talk) 01:28, 21 December 2013 (UTC)
Thanks. Keep up the good work. SchreiberBike (talk) 03:28, 21 December 2013 (UTC)
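For anyone making that conversion in bulk when reusing DNB text elsewhere, a minimal Python sketch follows. It only handles whole pounds written as digits followed by "l." (the pattern and function name are assumptions for illustration), it does not touch shillings or pence, and where the "l." also ends a sentence the full stop is lost, so the output needs a read-through.

```python
import re

# Digits (with optional thousands commas) followed by "l." for librae.
POUNDS = re.compile(r"\b(\d[\d,]*)l\.")

def modernise_pounds(text):
    """Rewrite amounts such as '200l.' as '£200'."""
    return POUNDS.sub(r"£\1", text)
```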

End notes

Here is a crazed idea, which might turn out not to be useless or totally barking. Someone with a bot to extract the end note sections (within [ and ]) from all the DNB articles, parse them at the semi-colons, and then alphabetise the lot.

I was thinking how interesting it would be to know what the major references used in the DNB are. But equally, at this point, it occurs to me that the endnotes are a major source of remaining typos. The reason being fairly obvious: the smaller text may defeat the OCR and human proofreader alike. A typical typo is, say "Diet." for "Dict." abbreviating Dictionary, or just "Hist," for "Hist." with a punctuation error.

Some of the endnotes do have sentence structure after a period, rather than just ";" separators all through. Many, of course, will have ":" for ";" as separator as a typo, and that would show up on a listing. Some endnotes may not be properly within [ and ]. It would be worth working over these issues first, and iterate, to get a cleaner listing. (On a technical note, the wikitext uses a number of different techniques to get the small text: there may be a preliminary issue to solve here.)

So the output would be of the order of 500K entries, I guess? All that is required is a bulleted list, with the bit of endnote followed by a link to the article where it is found. Charles Matthews (talk) 09:58, 22 December 2013 (UTC)
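The extraction step could look something like the following Python sketch. It assumes the article text has already been reduced to plain text (the small-text markup issue mentioned above is not handled), takes the last bracketed section that is not a "q. v." as the endnote, and uses invented names throughout.

```python
import re

BRACKETED = re.compile(r"\[([^\[\]]+)\]")
QV = re.compile(r"q\.\s*v\.", re.IGNORECASE)

def endnote_index(articles):
    """articles: dict of title -> plain article text.

    Returns (reference, title) pairs, alphabetised by reference.
    """
    entries = []
    for title, text in articles.items():
        # Ignore [q. v.] cross-references; the endnote is taken to be the last remaining bracket.
        sections = [s for s in BRACKETED.findall(text) if not QV.fullmatch(s.strip())]
        if not sections:
            continue
        for part in sections[-1].split(";"):
            part = part.strip()
            if part:
                entries.append((part, title))
    return sorted(entries, key=lambda e: (e[0].casefold(), e[1]))
```

In the sorted listing, a typo such as "Diet. of Nat. Biog." lands right next to the correct "Dict. of Nat. Biog." entries, which is the point of the exercise.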

Consideration to moving all DNB works to subpages of respective Dictionary of National Biography editions

This has been something that I have been long considering, however, wanted to leave it until the work was completed. Now it seems with WS ↔ WD coming, it pushes it high up my alert list (that mentioned later).

From early days the DNB was transcribed as pages named Name, Name (DNBnn), and it was a little contentious even at the time, however, we started without scanned volumes and it was simply what eventuated. [Long story and it is in the WS:S archives (2008?) if you really want to know what happened] What I would like for us to now do is move all these biographical works to being subsidiary to their respective publications, i.e.

(Noting that I am comfortable ignoring their respective volumes)

The community has (long) been looking to keep true to the published work, and to have the work at the root level, and any (split) components of the work as subpages. The advent of scans allowed that in easier and logical sense, and redirects allow us to direct to subpages where considered desirable. Other biographical works DMM, DAB, IndianBio, CE, SBDEL, IrishBio were able to be configured that way from their beginnings.

What puts this into TO CONSIDER basket, and with some urgency, is the forthcoming inhalation of Wikisource data to Wikidata, and how and what do we interlink, root pages vs. subpages, etc. At the moment with all DNB pages sitting at root level they will all be inhaled, and that is probably not the neatest and best way to do that with DNB biographies, but it is something that is wanted to be done with the compendium works.

Mechanically, this would mean moving the works to their respective subpages, and leaving redirects in place. There is a little downside with the category listings as they will be the extended/long name (note that there may be a solution for that with mw:Extension:SubPageList, which I am hoping that we can test somewhere; see Bugzilla: 59762). There should not be any requirement to have any other changes (maybe need a defsort put into the header template). Templates are fine as they are; links are okay, and all those bits should work. — billinghurst sDrewth 11:38, 7 January 2014 (UTC) @Charles Matthews, @Beeswaxcandle, @George Orwell III, @Arch dude, @Hesperian, @Mpaa, @JamAKiska:

  • I support this in principle and always have, but wouldn't be comfortable proceeding without Charles' blessing. Hesperian 12:00, 7 January 2014 (UTC)
  • So this is about having everything as subpages, rather than using suffixes? That for me is far worse for the reader. I use the DNB all the time by typing in a surname, in the search box, when I get a prompt in the form of a drop-down list. Very useful for research. So I would argue against the subpage restructuring on those grounds alone.
    • The redirect names will still exist; it is just that they do not show in the type-ahead functionality. I will ask the question of Nik, what capacity there is to have redirects appear in the type-ahead: is it just a config issue, or a current impossibility that would need to progress via a development request. I would argue that while you may search that way by surname first, I am not sure that the common WP researcher is that aware that we have names back-to-front to how they use them there. Also you can just go the next step and hit enter and get a list of results to your query. So I understand your issue, as I use it a lot for disambiguation. In thinking about this, if push comes to shove, I could just ask the bot operator to skip any page with a {{DNBxx}} header. I don't see that they want all the loose subpages; they definitely do not in the first phase, where they are looking at interwikis.
  • Which is to say we need to be looking at this from another angle. I'm aware of Wikidata's needs and there is a DNB (actually ODNB) related tool-based initiative. The drop-down lists are a big plus for humans: the subpage titles may be for machines. For the best of both worlds we surely need a rational policy on redirects, with what goes to what the crux. Charles Matthews (talk) 06:28, 8 January 2014 (UTC)
    We have a redirect approach: mindfully redirect as required where a work could be at the root level and you want to point to its subpage location. Redirects are cheap to the system. The subpages approach would allow for us to build a tool to more easily search within the DNB, either for authors or for content, which is a little hard at this point of time as the works are not structured to be searched as a collective. (he says without having fully explored the new search componentry). — billinghurst sDrewth 09:29, 8 January 2014 (UTC)
  • Could we just step back a bit? We can obviously have a definite policy on 'aliasing', which would be a way of talking about certain navigational options. Which would be good also to have site-wide. I have mentioned a point about what I find user-friendly. There must be other points, such as "disambiguation" and "similar pages", that relate to organising the material on the site better. User:Arch dude actually raised the suffix issue with me a while back. So I know there are others who think the same way as you about it. If there is to be such a policy, I'd like a discussion of what it is enhancing and what it could enhance. The point about redirects being a cheap way to do things is exactly my own point of view, in fact. Charles Matthews (talk) 16:36, 8 January 2014 (UTC)
We do have guidance and it is at Help:Redirects and Help:Disambiguation. Lots of general discussions about policy/guidance/thought bubbles have been had and presumably will continue to be had at WS:S. — billinghurst sDrewth 13:04, 9 January 2014 (UTC)
While I'd usually support something like this in principle as well, I'm not sure what the anticipated end-product would look like here and to what end. Is each mainspace subpage still going to be a single entry found on one or more pages as transcribed in the Page: namespace or is a mainspace subpage going to be based on a range of pages as transcribed in the Page: namespace covering a bunch of entries (say all the "A" last names)? I hope its not the latter though technically that is the type of "division" marker used in the originals if I'm not mistaken. -- George Orwell III (talk) 01:05, 9 January 2014 (UTC)
My proposal is the easiest possible of just moving the existing pages to be subpages clumped under one of the three editions of the existing pages listed at the dot points above. Moving pages and having redirects. eg. "Machin, John (d.1761) (DNB00)" moves to "Dictionary of National Biography, 1885-1900/Machin, John (d.1761)". The parent pages exist, the content pages of the volumes exist; all links would redirect. The prev/next links exist and will redirect. No retooling, no new transcribing, nada. — billinghurst sDrewth 13:04, 9 January 2014 (UTC)
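The title mapping in that proposal is mechanical enough to sketch in a few lines of Python. The function name is invented, and the parent page name given for DNB01 is an assumption modelled on the 1912 supplement's title; only the DNB00 example comes from the proposal itself.

```python
import re

PARENTS = {
    "DNB00": "Dictionary of National Biography, 1885-1900",
    "DNB01": "Dictionary of National Biography, 1901 supplement",  # assumed name
    "DNB12": "Dictionary of National Biography, 1912 supplement",
}

def subpage_title(old_title):
    """Map 'Name (DNBnn)' to 'Parent work/Name'; the old title would remain as a redirect."""
    match = re.fullmatch(r"(.+?)\s*\((DNB\d\d)\)", old_title)
    if match is None or match.group(2) not in PARENTS:
        raise ValueError(f"not a recognised DNB article title: {old_title!r}")
    return f"{PARENTS[match.group(2)]}/{match.group(1)}"
```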
  • So a bit of preliminary research suggests that the drop-down feature that mainly interests me is not particularly well documented. I know now how to turn it off, assuming my skin were Vector (which it isn't). How it handles redirects is something I'd like to clarify, to make progress here. Charles Matthews (talk) 08:46, 9 January 2014 (UTC)
    I sent an email to Nik yesterday about redirects and the typeahead function to see what is possible, and I await a response. I mentioned that they didn't work for long title names, and were horrid for subpages. — billinghurst sDrewth 13:04, 9 January 2014 (UTC)

From the point of view of links from Wikipedia it would be slightly simpler if volume numbers were not incorporated into the paths, because at the moment with the Wikipedia templates if volume is not set the page is still found, but there is a proposal to add a volume parameter value to all instances of the DNB template in article space on Wikipedia that link to Wikisource, so it is a small not large obstacle to overcome.

As to the search string problem would that be simplified by setting up a name space that mapped say "DNB" onto the full name? (This is I think something that billinghurst mooted as a possibility with the Encyclopaedia Britannica 11th edition which currently resides at 1911 Encyclopædia Britannica) -- PBS (talk) 12:43, 22 February 2014 (UTC)

Data item in header

I'm getting ever more interested in the potential of Wikidata, as applied here. As things stand, most articles on enWP have a matching Wikidata item. Of the DNB articles, 73% now have been matched to enWP, a figure that is rising steadily (say 4% a year) now that most of the "legacy" matching has been done, and the growth basically comes from article creation.

It would therefore, quite soon, be possible in theory to automate the process of filling in a "data item=" field in the DNB header, for about three-quarters of the DNB articles. It could be treated as a sister link, though on a different basis from those on author pages.

Here are some possible applications:

  • Matching up data items between the Author: pagespace and the DNB articles, there could be an automatic check of which author pages could have a DNB link and don't.
  • Create a list of duplications within the DNB.
  • Use the data items as part of a larger topical classification.

The last of these is an old issue for WS: how to list/categorise/portalise non-fiction texts here, by subject matter. Thinking just of biographies, there is the potential to create {{similar}} pages in an automated fashion, by doing the same data item indexation on EB1911, Catholic Encyclopedia etc.

It comes down to saying that standardising on Wikidata codes as our underlying "library classification" scheme now looks like an idea whose time has come. Charles Matthews (talk) 08:03, 9 April 2014 (UTC)
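The first two applications listed above amount to simple lookups once the data items are in place. The Python sketch below assumes each DNB article and each Author: page has been given the Wikidata identifier of the person concerned; the function names are invented, and any identifiers used with it would be real QIDs in practice.

```python
from collections import defaultdict

def _titles_by_item(article_items):
    """Group DNB article titles by the data item they carry."""
    by_item = defaultdict(list)
    for title, qid in article_items.items():
        by_item[qid].append(title)
    return by_item

def duplicates(article_items):
    """Data items carried by more than one DNB article."""
    return {qid: sorted(titles)
            for qid, titles in _titles_by_item(article_items).items()
            if len(titles) > 1}

def authors_missing_dnb_link(author_items, article_items, linked_authors):
    """Author pages whose data item matches a DNB article but which lack a DNB link."""
    by_item = _titles_by_item(article_items)
    return {author: sorted(by_item[qid])
            for author, qid in author_items.items()
            if qid in by_item and author not in linked_authors}
```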

Page ends

I've reached the half-way point of my first general pass through the DNB volumes: I have just done vol. 35 of the 69. I have mainly been concerned so far with the article starts, and the links to WP. There are some other matters I feel the need to raise, about the page ends, which will have to be the main area in a second pass. I believe the main issues are:

  • Use {{hws}} and {{hwe}} to deal with word breaks at the page end;
  • Use {{smaller block/s}} and {{smaller block/e}} to deal with page ends falling across the endnotes;
  • Use now {{nop}} to force a newline, when a para ends with a page.

The third of these is relatively new to me. I get the feeling that the relevant behaviour of ProofReadPage has shifted around, with upgrades; but is usually then shifted back to the status quo ante? Though not in the change that made {{nop}} now part of the business?

In any case these matters should now be written up. A couple of recent diffs suggest to me that this all is not quite as well understood (and I include myself) as it could be; and the DNB project should be moving to a final "manual of style". I take it that {{smaller block}} everywhere is the choice for the manual. Charles Matthews (talk) 07:12, 17 April 2014 (UTC)

I don't want to encourage people to validate without a final decision on these matters, so I will remove the "already proofread" link on the statistics page I put there. ResScholar (talk) 10:36, 18 April 2014 (UTC)
None of them are new. HWE is more important than HWS (you can just take the hyphenated word into the footer). The blocks are important so as not to break the text of the notes; and nop is solely for the wiki environment, as it gobbles empty lines.
  • I can probably get a bot run through the volumes looking for terminating hyphens with a pretty good success rate to identify the hws issue. Wikisource:WikiProject DNB/Terminating hyphen (pages identified: done; corrections: not done)
  • I can run the same bot to identify terminating }} and </small>, though the former will have numbers of false positives as we know that it is okay to terminate if it does not continue
  • Not much we can do, it is an eyeball test.
billinghurst sDrewth 14:10, 19 April 2014 (UTC)
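A check of the kind described for terminating hyphens can be sketched in Python as below. This is not the bot's actual code; it assumes the function is given only the body of a Page: (without header or footer) and that a page already ending in a {{hws}} call is fine. The function name is invented.

```python
import re

def ends_with_bare_hyphen(page_body):
    """True if the page body ends in a hyphenated word fragment not wrapped in {{hws}}."""
    body = page_body.rstrip()
    # A page that already ends with {{hws|...}} has been dealt with.
    if re.search(r"\{\{hws\|[^{}]*\}\}$", body):
        return False
    # Otherwise flag a word character followed by a hyphen at the very end.
    return re.search(r"\w-$", body) is not None
```

The flagged pages still need a human look, since a dash ending a page is not always a word break.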

That's not quite the project's practice, though. Validation, piecemeal, has been ongoing. I have never known exactly how we are going to get the whole 69 volumes validated; nor quite what the scope of "finishing" we are intending to aim at is. Seems rather better to take things in stages.

My own view: the patches of green don't always live up to the standards one can hope for; but it is going to be a long job anyway. I have not wanted the "overhang" of validation to cast too long a shadow, and distract from getting things done. We have working text, and it is still quite bad in parts. My typo finder is meant to track down some of those bad parts by means of recurring OCR issues: I'm building it up as I find likely searches.

On another front, there are things like: endashes for hyphens (not always clear in the original, but the ODNB site's transcriptions give a standard); ligatures per the original; apostrophes per the original. These do not really have to be taken care of in validation. I don't like the "gratuitous spaces" that are, in my view, artefacts of the old typesetting and convey no information, so I'd like them removed. The bolding at the article start is not yet uniform across articles, but this pass of mine is intended to include that.

There are some unresolved issues with article titles and disambiguation, also.

As is typical of the DNB, it is hard to confront all the issues at once. Thank you for your interest. I can only suggest an ongoing pattern of passes to try to get the standard up eventually. Charles Matthews (talk) 08:02, 19 April 2014 (UTC)

My major issues, apart from what you have mentioned above, are for where they wrote works to have author links (both ways), and adding the links where we have [q.v.] to works. Plus I would like to identify any redlinks, and resolve them. My methodology is to just bash away on the pages as I can. Typography bothers me less. — billinghurst sDrewth 14:10, 19 April 2014 (UTC)

Apparently broken this morning are {{smaller block/s}} and {{smaller block/e}}: Elderton, William (DNB00), Eldred, John (DNB00). I can't see any syntax problem, so assuming this is a "transient" software issue (i.e. random breakage). Charles Matthews (talk) 08:33, 24 April 2014 (UTC)

Manual draft

I have written down the things that immediately occur to me at:

User:Charles Matthews/DNB manual draft

Retrospective legislation for what is "validated" is not the point here, in fact. We will need a second checking system, probably, e.g. hidden category. Much to do first, though, and a thorough spellchecking pass is to be considered more urgent. Charles Matthews (talk) 09:07, 19 July 2014 (UTC)

Trialling an override, need some brain cells

@Charles Matthews: On a page like Page:Dictionary of National Biography, Second Supplement, volume 2.djvu/171 the author is Henry Stephen and the footer initials are H. S. (ambiguous); however, with how we have done the template it was showing as H. S-n. I have modified Template:DNB HS Stephen so that it has an override function, override=yes, that pushes H. S. as the text and keeps the right link. I am not certain that override is the right parameter name. Seeking feedback for the best/simplest/obvious means to portray this. — billinghurst sDrewth 03:51, 22 July 2014 (UTC)

Resolving [q.v.] versus [q. v.]

@Charles Matthews, @Slowking4, @Mpaa, @Beeswaxcandle, @Arch dude: + others ... I am running through and fixing existing redlinks in pages, and I notice the variety of ways that we have q.v. Do you have a feel for which way we should unify these?

  1. [q.v.] no space
  2. [q. v.] normal space
  3. [q.&nbsp;v.] using &nbsp; to stop line break splits

Noting that we also have these used in Supplements where we have a clear space between the q.v. and Supple...

I will get a bot to go through and to standardise (in a slow replacement) on whatever is the ultimate decision. — billinghurst sDrewth 09:43, 14 September 2014 (UTC)

With the space, as [q. v.] for me. Abbreviates quod vide, so I feel the space is natural and better. Charles Matthews (talk) 10:35, 14 September 2014 (UTC)
yes, agree, i’m afraid i left them as off the OCR (variable); you could run a bot to change, and flag for the template:DNB lkpl insertion. Slowking4Farmbrough's revenge 11:56, 14 September 2014 (UTC)
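The replacement itself is a one-line regular expression. The Python sketch below assumes the spaced form is the agreed target (it is passed as a parameter, so the non-breaking variant could be substituted) and leaves the {{DNB lkpl}} flagging aside; the function name is invented.

```python
import re

# Matches [q.v.], [q. v.], [q.&nbsp;v.] and a literal non-breaking space between the letters.
QV = re.compile(r"\[q\.(?:\s|&nbsp;|\u00a0)*v\.\]")

def normalise_qv(text, target="[q. v.]"):
    """Rewrite every variant of the q.v. marker to a single agreed form."""
    return QV.sub(lambda match: target, text)
```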

Authors in the DNB and their author pages

I'm currently preoccupied with Wikidata, involved in a six month project on the ODNB codes, which might be done in February, say. I thought I'd explain that more fully here later. Many of the DNB biographies are of authors: who could qualify for an Author page therefore. Do we have an idea of the number of those? Charles Matthews (talk) 16:45, 9 January 2015 (UTC)

Died dab extensions

Currently AFAICT the died dab extensions are in a format where the "d." is followed by that year with no intervening space. I think this is a mistake because I think most people would expect a space between the dot and the year, and within the text of the volumes there is one. -- PBS (talk) 17:20, 12 January 2015 (UTC)

The general philosophy for the choice of titles was minimalism: no space is just part of that. Charles Matthews (talk) 06:40, 13 January 2015 (UTC)

Wikidata Project

There is now a sister project at d:Wikidata:WikiProject DNB.

It should help Wikidata absorb the many items for DNB entries. Jura1 (talk) 13:32, 25 March 2015 (UTC)

Many thanks! Charles Matthews (talk) 03:59, 26 March 2015 (UTC)

See articles

The "see article" type of soft DNB redirect has a Wikidata item, d:Q19648608. This is a reason to create them systematically now (anything in Wikisource mainspace can have a Wikidata item).

As we know, there are various kinds, and they need to be handled somewhat differently. d:Q19766142 is an example that points from one surname to a variant. Charles Matthews (talk) 08:10, 7 April 2015 (UTC)

DNB01 and DNB12 main subjects on Wikidata

I have finished going through the "main subject" links for the two DNB supplements, 1901 and 1912. These make up about 10% of the total DNB articles here. On this smaller scale, it is still possible to explain what possibilities are opened up by the matching.

What this means is that the "data item" link on an article here, which leads to the Wikidata metadata page for that particular article, can be followed by another link, to the Wikidata item for the person who is the subject of the biography. While the metadata page is not intended to carry a huge amount of information, the biography page well might.

For example, it may carry a link to enWP, if there is a corresponding English Wikipedia article. These links to enWP ought to be in sync with the header links here. If that is not the case, there are a number of possibilities:

  • What is in the header here is a redirect, which can and should be replaced.
  • There is nothing in the header here, but there is an enWP article that could be.

This is interesting to me, since finding those links is now something that can have an automated component.

  • There is a link in the header here, but no enWP link on the Wikidata item.

One should be alert to what is going on, in this case. It may, for example, indicate that the enWP article has not had a Wikidata link put in yet (which is true in about 1% of cases). Just as likely, the enWP has a valid Wikidata link running to it, but the item from which it comes needs to be merged into the item found starting here.

In other words, starting here at article A, there is the matching Wikidata item B, and a "main subject" link takes us to C. On the other hand, starting from a header link here to enWP, we get to D. We may expect C on Wikidata to link to D. It may simply not do so yet. Or item E on Wikidata links to D on enWP, where E is some other item that is a candidate for merging with C.

Investigating D and E in this situation may throw up interesting checks. For example, it is perfectly possible that some of our links to enWP are not to biographies. Sometimes a link is to a company, while the DNB article here is about its founder.

Therefore, in case we have A -> D -> E, it is going to be worthwhile to check first whether E is "instance of human", or of something else. I expect instances of "duo" (two people), "family", "list article", "fictional human", "fictional character", "company", and odder things such as pubs, ballads and so on that may be named after people.

There may of course also be straight errors and omissions.
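The check described above can be sketched roughly as follows. This is only an illustration on in-memory data, not a query against Wikidata: the function name, the returned labels and the way the "instance of" values are passed in are all assumptions. The one real identifier used is Q5, the Wikidata item for "human".

```python
HUMAN = "Q5"  # Wikidata item for "human"

def check_header_link(subject_item, enwp_item, instance_of):
    """Classify the state of a header link to enWP.

    subject_item: item C, reached from the article's data item via "main subject"
    enwp_item:    item E, to which the linked enWP article D is attached (None if none)
    instance_of:  dict mapping an item ID to the set of its "instance of" values
    """
    if enwp_item is None:
        return "enWP article has no Wikidata item yet"
    if enwp_item == subject_item:
        return "in sync"
    classes = instance_of.get(enwp_item, set())
    if not classes:
        return "unknown type, check by hand"
    if HUMAN in classes:
        return "possible merge candidate"
    return "header link is not to a biography"
```

The point of testing for "human" first is that only then is a merge of E into C worth considering; anything else suggests the header link here goes to a company, family, list and so on.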

This all comes at the beginning of using Wikidata to manage search here by topic. Much checking to do, at the outset. DNB00 is on its way, but there are about 4K links to put in. This query is one way to find those, if you'd like to help.

NB that in principle there is a bot-created item to which the "main subject" link should run, if only you can find it (ODNB access really required); but the caveat is that those items only cover around 98% of the ODNB. By all means leave me a note on Wikidata for any baffling cases. This is another dimension of the project over there, and will end up with a complete old DNB -> ODNB matching that is machine-readable, an independently useful by-product.

Enough said for the moment. Charles Matthews (talk) 17:48, 15 February 2016 (UTC)

DNB00 main subjects

All now filled in, on Wikidata. That means that various possibilities for automatic checking are now open. I intend to spell things out on the Wikidata DNB wikiproject shortly. The one that will have the most impact here is to find the matching Wikipedia articles, as they have Wikidata items created for them. Charles Matthews (talk) 05:52, 13 July 2016 (UTC)

Suggested edit for DNB footer initials

Hi, I thought I'd suggest here an edit for the DNB footer initials to make the initials appear on the same line as the last line of text (as the printed book has it). I made the edit on the 1911 Encyclopædia Britannica version "Template:EB1911 footer initials" a while back to achieve this — use:
 style="float:right;" instead of
 style="clear:both; text-align:right;"
to get the initials on the same line. I don't have rights to edit the source of Template:DNB footer initials. An example of the EB1911 footer initials template in use in EB1911 is here:
 DivermanAU (talk) 19:04, 17 October 2016 (UTC)

A couple of issues. Some here may prefer the template after a newline. And what happens in the book does vary. We need a consensus on what would be best. Charles Matthews (talk) 05:28, 20 October 2016 (UTC)
We tried that initially, however, it was problematic in some situations. What can happen is that it can word wrap the last word(s), and leave the template sitting above the last word. In the end we made the decision to just have it terminate on a new line. Space isn't the issue for us, we didn't need to blindly follow the style of the work. — billinghurst sDrewth 01:23, 21 October 2016 (UTC)

Scanned index pages

I've proofed Page:Dictionary_of_National_Biography_volume_01.djvu/493 and Page:Dictionary_of_National_Biography_volume_01.djvu/494. Each was several hours of quite handraulic cut/paste work, given the poor OCR. Eventually, though, it came to a repetitive routine that I think should be possible to bot-assist.

1. Set up the headers and footers.
2. For each entry indexed on the page, paste one of two template forms as follows:
   2a: For a "See so-and-so" use the form: {{Template:Dotted TOC page listing||Aboyne, second Viscount (''d''.1649). See [[Gordon, James (d.1649) (DNB00)|Gordon, James]].|}}
   2b: For a regular entry use: {{Template:Dotted TOC page listing||Abraham, Robert (1778-1850)|[[Abraham, Robert (DNB00)|66]]|5}}
3. In a separate browser tab, find the target page, either by searching for "Abraham, Robert DNB00" or by clicking the link to the next entry in sequence, for instance as found in the page header at Abney, Thomas (d.1750) (DNB00), which targets e.g. Abraham, Robert (DNB00). Either way, select and copy the target page title e.g. "Abraham, Robert (DNB00)".
4. In the pasted template form, select the part identifying the target page and Ctrl-V to paste the correct title.
5. Repeat for the human-readable first template parameter, "Abraham, Robert". As appropriate repeat the copy/paste for Abraham's lifespan dates "(1773–1850)", from the target page onto the template. Markup italics as needed for b, d, or fl.
6. Edit the page number to match the scan, as confirmed on the target page's left margin.
7. Visually check that the entered data matches the scan image.
8. Repeat 2 through 7 as required.
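The two template forms in step 2 could be generated along these lines once the label, target title and page number are known. This is a sketch only: the function names are hypothetical, and the final "5" parameter is simply copied from the example above without any claim about what it means.

```python
TEMPLATE = "Template:Dotted TOC page listing"

def see_entry(label, target_title, target_label):
    """Form 2a: a "See so-and-so" index line pointing at another article."""
    link = "[[" + target_title + "|" + target_label + "]]"
    return "{{" + TEMPLATE + "||" + label + ". See " + link + ".|}}"

def regular_entry(label, target_title, page, last_param="5"):
    """Form 2b: a regular index line linking the page number to the article."""
    link = "[[" + target_title + "|" + str(page) + "]]"
    return "{{" + TEMPLATE + "||" + label + "|" + link + "|" + last_param + "}}"

if __name__ == "__main__":
    print(see_entry("Aboyne, second Viscount (''d''.1649)",
                    "Gordon, James (d.1649) (DNB00)", "Gordon, James"))
    print(regular_entry("Abraham, Robert (1778-1850)",
                        "Abraham, Robert (DNB00)", 66))
```

Steps 6 and 7, checking the page number and comparing against the scan image, would still need a human.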

The page layout isn't a perfect match, but it suffices. The effort involved doing this manually for four pages each of sixty volumes would be substantial, but not impossible. A bot assistant could make it quite tractable. Is it worth doing? LeadSongDog (talk) 21:13, 8 February 2017 (UTC)

Could be helpful, in checking the article creation, and giving a key to the various "see articles", which we are probably going to create some day. The volume listings haven't reached a definitive form; and these index pages complement them. Thank you for looking at this issue. Charles Matthews (talk) 05:18, 10 February 2017 (UTC)
Botting standard formatted text is pretty easy. Get the text right, and I can run a bot through. — billinghurst sDrewth 09:07, 10 February 2017 (UTC)
@Billinghurst: Thank you. Could we bot or bot-assist the harvest of page titles and lifespans? That would go a long way. A big part of the problem is that the OCR on these index pages is just terrible. The good news is that the entries are in alpha order, so about 90% of the job for one index page is just harvesting all DNB00 entries in a specific alpha range. Only the "See so-and-so" entries vary from this. LeadSongDog (talk) 16:19, 10 February 2017 (UTC)
@LeadSongDog: first step would be to see if our local and improved OCR function (you may need to turn the gadget on to get the button) can do a better job. We may also be able to see if other copies with better scans are available; each volume seems to have variability in its scan quality throughout. So we may be able to scrape text from another copy of the index page, paste and work with that. Running a bot to fix poor OCR is a variable process. — billinghurst sDrewth 00:34, 11 February 2017 (UTC)
Our OCR is worse on those pages. :-/ — billinghurst sDrewth 01:33, 11 February 2017 (UTC)
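On the harvesting idea raised above, a rough sketch of selecting the DNB00 titles that fall in one index page's alphabetical range, assuming a list of page titles is already to hand. The function name is hypothetical, and plain case-insensitive string order is an assumption that may not match the DNB's own ordering in every case.

```python
def titles_in_range(titles, first, last):
    """Return the titles sorting between first and last (inclusive),
    compared case-insensitively, in sorted order."""
    lo, hi = first.casefold(), last.casefold()
    selected = [t for t in titles if lo <= t.casefold() <= hi]
    return sorted(selected, key=str.casefold)

if __name__ == "__main__":
    sample = [
        "Abraham, Robert (DNB00)",
        "Abney, Thomas (d.1750) (DNB00)",
        "Zzz, Example (DNB00)",  # hypothetical title, outside the range
    ]
    for title in titles_in_range(sample, "Abney", "Abraham, Robert (DNB00)"):
        print(title)
```

The "See so-and-so" entries would not be found this way and would still have to be added by hand.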

Index page hovertext

Pages such as Index:Dictionary_of_National_Biography_volume_01.djvu could be significantly improved by some small changes. When one points at "200" on that page, the hovertext/tooltip pops up saying "Page:Dictionary of National Biography volume 01.djvu/214". This is useful, but it omits the key information that that page is Airay-Airay, which could be mined from the header of that page. Similarly, pointing at v.30 on the index page one sees the hovertext "Index:Dictionary_of_National_Biography_volume_30.djvu" rather than the more useful "Dictionary of National Biography, 1885-1900, Vol 30 Johnes - Kenneth". Is there any chance this could be easily fixed? LeadSongDog (talk) 22:45, 10 February 2017 (UTC)

The page links point to the underlying scan page to edit. The id is a hard page link primarily based on the <pagefile> data on the corresponding Index: ns page. To pull the text from the page of the running header would be a complex task, and in many DNB pages that information is not there (early style issue). Also it sits within a <noinclude> tag, so its availability is problematic. What were you trying to achieve through the hover?
With regard to volume information, the (linked) volume data is at the top of the page. When I hover over that I get the pop-up data that shows the vol, and through the redirect the SURNAME to SURNAME. I suppose we could consider the presentation of the surname components within the header, though I am not sure that the extra detail is always relevant to present, and may be busy/noisy on the page. — billinghurst sDrewth 00:48, 11 February 2017 (UTC)
Oh, on the _Index_ pages. We can amend the template:DNB indexes to get hover information for the volumes. That would just take some time, though could be done incrementally.

There is no way to populate the pagelist with names.

If there is a separate list then we can add those below the pagelist, though I am not certain of the value. As a sort of test, I have added the compiled list to the bottom of Index:Dictionary of National Biography volume 56.djvu. billinghurst sDrewth 01:02, 11 February 2017 (UTC)

The issue (that I neglected to clearly specify) was that the current scheme requires the reader to go on an Easter Egg hunt to find the article they're after. A list of volume numbers or of page numbers conveys little to the user. The "through the redirect the SURNAME to SURNAME" approach to finding a volume works only after clicking to follow the link, so requiring attempts on several volumes. That is why, on the physical bound books, the spine showed not just vol number but also the first and last SURNAME. A hovertext/tooltip/popup would allow them to know which page they are after before opening it. While wear and tear on the spines isn't a problem, there are still places where bandwidth is scarce or expensive. Users familiar with the project's scheme may know that they can a)search for the name; b)open the article; c)find the scan page link as a page number on the left side of the displayed article; and d)follow that link to get the scan page. As something of a newbie on s:, however, I certainly took a while to figure it out. That transclusion on the v.56 index page certainly seems to me to be a step forward: a human-readable list of names that are linked to the corresponding articles. It does not, though, make it obvious what the proofing status of the target article is (as does the colour-coded numeric scanpage map). LeadSongDog (talk) 16:30, 13 February 2017 (UTC)

Maintenance of WP links

Big advance via Wikidata: Petscan queries on d:User:Charles Matthews/Queries#Petscan allow one to find rapidly the English Wikipedia articles corresponding to a DNB article here, but not yet linked. This is done separately for DNB00, DNB01 and DNB12; takes just a couple of seconds each time.

Caveat: this approach does of course depend on data being in Wikidata. So English Wikipedia articles have to have a Wikidata item; and that item must be the one to which the data item of the DNB article points under "main subject". If an English Wikipedia article A has an item D1, while the article B here that corresponds has its data item pointing to D2 which is different, the query will only pick it up as and when D1 and D2 are merged.

A few misidentifications of "main subject" are showing up.

All this said, these long-sought one-click queries (to which User:Jheald helped me) are going to be really helpful. Other works here can be treated the same way, if the essential infrastructure is put in place, analogous to Category:DNB No WP here, and a set of "main subject" links. The special situation is that the ODNB property on Wikidata has been maxed out.

Charles Matthews (talk) 05:42, 21 February 2017 (UTC)

Authors here in DNB lacking enWP article

This is quite a neat use of SPARQL: query here. Today it brings up 76 authors with pages here, not having enWP article (according to Wikidata), but being DNB people. Charles Matthews (talk) 16:12, 1 March 2017 (UTC)