Wikisource talk:WikiProject DNB
Table of Contents formatting
Question around ToC (Moved from User talk:Arch dude) Some style questions
- Replicating the OR entries: are we leading with the preferred name, or do we have an entry for both in the ToC? The issue is that some of the OR entries get quite long. Plus, if we use the full form, do we wikilink all of it or just the first component?
- Dates of life. We have used some DoL to disambiguate; however, what guidance are we giving?
-- Billinghurst 03:53, 26 August 2008 (UTC)
- I don't know. Do you have some examples? what seems to work best? The guiding principle is to preserve the look and feel of the original, but the original does not have a ToC in this sense: It's a navigational artifact that we added to replace the original's (physical page-based) navigation. Once we find a workable solution, we can put it in the "Style" section. -Arch dude 14:25, 26 August 2008 (UTC)
-
- 1) Example ToC
- Look at a name like Waad/Wade as an example of alternate names. Gut feel is to enter under the first appearing name. It seems that anything more will get pretty ugly, especially around order and wikilinks. As you mentioned, as long as the transcription replicates the book, that is the important item.
-
- 2) I would think the guidance is to give DoL only where it is needed to disambiguate, though this then requires it in the DNB template for each entry, which is a level of complication. The complexity of wikilinks makes me hesitate to give a definitive statement. Whatever is more foolproof is my preference.
-
- Billinghurst 14:52, 26 August 2008 (UTC)
- Your example is very informative. It's an index from the original DNB, not a ToC from the Wikisource project. We should (eventually) create an "article" that is as close as possible to an exact duplicate of the index, with every single character (including the page numbers) duplicated. By our own rules, we are permitted to link the entries in this article to the relevant articles in our project. This is (theoretically) completely distinct from the ToC. The ToC is a modern navigational construct: a navigational tool that we added to replace the paper navigation of the original. Since the project is still new, we can elect to abandon the ToC and replace it with a faithful reproduction of the DNB index. Alternatively, we can keep our ToC and also reproduce the index. My inclination is to abandon the ToC and use the index instead, but please remember that I am only one of the (currently) three members of this project. If we do elect to use the index as our primary navigational tool, we should first agree on the exact format of each "volume" article. -Arch dude 02:28, 27 August 2008 (UTC)
-
- I have made a start on some trial text in [Abbadaire - Anne], and on the talk page is the example with full dates added. Note that later in the index is
- Avershawe, Louis Jeremiah. see Abershaw.
- While I can understand articles being word for word, I wonder whether an index page that we convert to a ToC should be an exact replica. -- Billinghurst 06:47, 27 August 2008 (UTC)
- It's clear that the "(DNB00)" is not wanted or needed in the ToC. As an initial matter, let's remove it using pipe notation.
- [[Abershaw, Louis Jeremiah (DNB00)|Abershaw, Louis Jeremiah]]
- [[Abershaw or Avershawe, Louis Jeremiah (1773?-1795) (DNB00)|Abershaw or Avershawe, Louis Jeremiah (1773?-1795)]]
- My vote would be for the second article title to be
- Abershaw, Louis Jeremiah (1773?-1795) (DNB00)
- The article title is a Wikisource navigational construct, not a part of the original text, so we are free to choose. -Arch dude 11:41, 27 August 2008 (UTC)
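The pipe notation suggested above can be sketched in a few lines of Python. This is only an illustration: the function name `piped_link` and the idea of stripping a trailing "(DNB00)" programmatically are assumptions, not anything the project has adopted.

```python
def piped_link(title: str, suffix: str = "(DNB00)") -> str:
    """Build a wikilink whose visible label omits a trailing suffix.

    Hypothetical helper: `title` is the full article title and `suffix`
    is the disambiguating tag to hide from the ToC entry.
    """
    label = title
    if title.endswith(suffix):
        # Drop the suffix and the space that precedes it.
        label = title[: -len(suffix)].rstrip()
    if label == title:
        # Nothing to hide, so a plain link is enough.
        return f"[[{title}]]"
    return f"[[{title}|{label}]]"


print(piped_link("Abershaw, Louis Jeremiah (DNB00)"))
# [[Abershaw, Louis Jeremiah (DNB00)|Abershaw, Louis Jeremiah]]
```

Titles that do not end in the suffix are returned as plain links, so the same helper could be run over a whole volume listing without special-casing.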
- Given it a try, have a look at Dictionary_of_National_Biography,_1885-1900/Vol_58_Ubaldini_-_Wakefield, specifically looking at Wadd, William (1776-1829) -- Billinghurst 16:01, 27 August 2008 (UTC)
- This raises a few interesting points. First, I think that if the DNB has an article, even if only a see reference, it should be a link. Second, the John Wadham situation is interesting because he is in the index, but what we have about him is embedded in another article in a way that cannot be easily isolated; perhaps the link to Nicholas Wadham there should be wikified. Third, Arthur and Felix Wakefield have identifiable paragraphs within another article. Would people consider it too much a breach of purity to add headings within the article so that we could have "See Wakefield, William Hayward (DNB00)#Arthur Wakefield"? Eclecticology 18:31, 27 August 2008 (UTC)
- I think that adding headings to create anchors is in fact a "breach of purity." Fortunately, it is also unnecessary, since it is possible to add invisible anchors instead. Now I just need to remember the correct syntax... -Arch dude 00:19, 28 August 2008 (UTC)
- Very spooky, I just happened to do one of those articles today, so I have gone and done as suggested.
- Note that I have wl'd the component of the name after See under rather than the specific name itself. I am not wedded to that methodology. In the end, with many articles being short, I don't think that it will even be noticed. -- Billinghurst 06:53, 28 August 2008 (UTC)
Thanks Billinghurst for joining the project. It takes getting a few heads together to sort out the questions that are being raised. Several points have been raised by both of you that I want to address.
- ToC vs. DNB Index. I think it's important that we are recognizing the importance of Wikisource navigational constructs. Failing to do this can make work awkward. (I've already run into this over works that have quotation marks as part of the title.) I have no complaint about including the DNB index with page numbers as an additional group of pages, but what page numbers should it show? My own hard copy is the (originally) 1921 reprint edition which combined the original 63 volumes into 21. The pages themselves were essentially duplicates of the originals, right down to the same word breaks at the beginning and end of a page. That edition, however, did include the changes from the 1904 errata volume. (See the Internet Archive for this one.) With the 1921 reprint the pagination was changed so the original separately paginated volumes 4, 5 and 6 became the new continuously paginated volume 2. The indexes for the new volumes also had footnotes to take into account those additions made in the First Supplement.
- DoLs. I very much support using these as disambiguators, but only when necessary. I know that Wikipedia favours disambiguation by what made a person famous in life, but that is more likely to require some sort of subjective determination than DoLs. To make it easy for a person who wants to cross-link articles, we should not require that he read the entire linked article just to make the link; the dates used should be exactly as they appear in parentheses at the beginning of each article. (At some point we may need to deal with the same individuals having different dates in another reference work, but I think that that problem can be deferred.)
- Using a pipe to suppress the "DNB00" from what is seen in the ToC is just fine.
- Honorifics. My preference is to suppress these from our article titles. We would, of course, continue to include this material in the article itself. It is to be noted that titled people usually have a see reference at the title, and these "see" articles should be kept as such (e.g. "ALEMOOR, Lord. [See Pringle.]")
- Alternative names. I agree with Arch dude's solution for the situation expressed in the Abershaw example, but without the dates since there is no ambiguity with some other person. We would maintain the see reference at "Avershawe". The alternative name should continue in the ToC since there will be an article, even if it is only a see reference which maintains continuity between other articles. Eclecticology 17:42, 27 August 2008 (UTC)
-
- This looks like a consensus to me. Shall we now add it to the project page's "style" section and begin converting the ToCs? -Arch dude 00:19, 28 August 2008 (UTC)
- I just found the syntax for an invisible anchor: go to the note on purity above. The syntax is {{anchor|my_hidden_anchor}}. -Arch dude 00:50, 28 August 2008 (UTC)
Two questions
- What do we do with junior and senior? See Dictionary of National Biography, 1885-1900/Vol 2 Annesley - Baird: "Armstrong, John, senior (1784-1829)" and "Armstrong, John, junior (1813-1856)".
- Should it be "Armstrong, John, of Gilnockie" or "Armstrong, John (d. 1528)"?
P. S. Burton (talk) 20:04, 19 August 2010 (UTC)
- We use dates of life to disambiguate people. So that would respectively be Armstrong, John (1784-1829) (DNB00) & Armstrong, John (d.1528) (DNB00) for your two examples. That said, do feel welcome to create redirects for Armstrong, John, senior (DNB00), junior and Armstrong, John, of Gilnockie (DNB00). I would also encourage you to create the page John Armstrong using {{disambiguation}} and list/link all the John Armstrong pages on WS. It all helps findability. — billinghurst sDrewth 06:26, 20 August 2010 (UTC)
- Wouldn't it be logical to do what the DNB itself does? That would be as Billinghurst says for the first case but "Armstrong, John, or JOHNIE, of Gilnockie (d 1528)" or maybe "Armstrong, John, of Gilnockie (d 1528)" in the second case. Incidentally, ODNB corrects his date of death to 1530, but of course we follow the original.--Longfellow (talk) 17:12, 22 August 2010 (UTC)
- I think the basic ideas we have worked on come down to this: (i) there is no "official title" to follow in the books, since there is only a "lead sentence"; (ii) we have chosen to make the titles minimal; (iii) we have chosen not to make the titles "informative", relying on the articles themselves to inform. Only (iii) here really requires defending. And that is perhaps only a discussion about how we think people will search the site, or otherwise find the material. I don't think such discussions are ever conclusive: my own first-order search would probably be "John Armstrong"+Gilnockie in Google, so I'd want the name in natural order to appear somewhere (such as a dab page, for example). Since I'm currently typing up the author listings of redlinks, I certainly appreciate the use of short titles. Charles Matthews (talk) 10:18, 1 September 2010 (UTC)
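The minimal-title convention discussed in this thread (name, dates of life only where needed, then the edition tag) can be written down as a short Python sketch. The function name `dnb_title` and its parameters are hypothetical and only illustrate the pattern used in the examples above.

```python
from typing import Optional


def dnb_title(name: str, dates: Optional[str] = None, edition: str = "DNB00") -> str:
    """Compose a minimal article title.

    `dates` should be given only when two people share the same name;
    it is copied as printed at the head of the article, not corrected.
    """
    parts = [name]
    if dates:
        parts.append(f"({dates})")
    parts.append(f"({edition})")
    return " ".join(parts)


print(dnb_title("Armstrong, John", "1784-1829"))
# Armstrong, John (1784-1829) (DNB00)
print(dnb_title("Abershaw, Louis Jeremiah"))
# Abershaw, Louis Jeremiah (DNB00)
```

Qualifiers such as "senior" or "of Gilnockie" are deliberately not inputs here; under the approach described above they would live in redirects and disambiguation pages instead.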
{{DNB00}} & Category:DNB biographies
Noticed that the DNB biographies were all appearing in Special:UncategorizedPages. This happens because they are not subpages of the main work. To alleviate the matter, I have created the hidden category Category:DNB biographies and embedded it into the {{DNB00}} header. It can be accessed via the Category:DNB. -- billinghurst (talk) 11:44, 20 March 2009 (UTC)
The supplements
Organizational point: there were three DNB supplements published in 1901, mainly catching up with folk who had died after the relevant volume was completed. It would make sense to handle these in parallel with the 1900 DNB: but how exactly? Charles Matthews (talk) 16:08, 6 May 2009 (UTC)
- I believe that we are managing these with the later year of publication, eg. DNB01, and so on. I don't think that there was anything against extending, more focussing on something achievable(???) -- billinghurst (talk) 08:02, 7 September 2009 (UTC)
- Coordinating the establishment of DNB01. Will require review to ensure files and index pages fit within existing DNB framework to facilitate tracking, template interoperability and the like. JamAKiska (talk) 17:34, 3 October 2010 (UTC)
Vol 28 skips
Page:Dictionary of National Biography volume 28.djvu/149 is p. 151 of the original, while Page:Dictionary of National Biography volume 28.djvu/150 is p. 154. Charles Matthews (talk) 16:07, 17 May 2009 (UTC)
- As a lash-up I have made Page:Dictionary of National Biography volume 28.djvu/148a, Page:Dictionary of National Biography volume 28.djvu/148b, and Page:Dictionary of National Biography volume 28.djvu/148c, to fill with text for the moment, since I want to work on the Michael Hudson article. I'm not up with how to add djvus yet. As on a previous occasion, I'd be grateful to have bot assistance with sorting this all out. (Shouldn't be hard, given that vol 28 is untouched so far.) Come to think of it, there is plenty to discuss about improving the posted text, also. Charles Matthews (talk) 16:20, 17 May 2009 (UTC)
-
- My understanding is that we would need to pull it down, add the new pages, and then reload. Probably only want to do it once, so we should check first for illegible and other missing pages so we can insert these and do it once. -- billinghurst (talk) 08:05, 7 September 2009 (UTC)
I'm not treating these matters as being of urgency: the work of creating articles can go on, and "copying back" of text is essentially quick and trivial. What I am doing is to log all the glitches on my userpage. Logically there should be a project page that does that, though. And since this is going to be with us for a while, discussion on its talk page. Hey-ho. Yes, before implementing radical change, we should at least go through the volume seeing exactly where all the problems are. And I think that suggests priority for a "mapping subproject", identifying:
- progress with adding text;
- images so corrupt as to be useless and in need of replacement;
- calculation of the "offset" (image number minus page number) which should be constant if there are no glitches;
- progress with the master volume lists;
- progress with adding titles to author listings.
Ouch. Plenty of work. But we really need some of these project management tools to open up areas of work. I have to say that adding decent text where possible is really my priority: once there is text on the page, it starts to show up on search engines, and I find that pretty useful anyway even before any proofing. Adding text can be done by bot, but (returning to the topic) I'm not sure why these bot glitches were there in the first place. Charles Matthews (talk) 19:54, 7 September 2009 (UTC)
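The "offset" check mentioned in the list above (image number minus page number, constant when there are no glitches) is easy to sketch in Python. The two middle samples come from the Vol 28 report in this section; the first and last samples, and the function name `find_offset_breaks`, are made up for illustration.

```python
def find_offset_breaks(samples):
    """Report where the image-minus-page offset changes.

    `samples` is an iterable of (image_number, printed_page) pairs.
    Returns a list of (prev_image, image, prev_offset, offset) tuples,
    one for each place where the offset differs from the previous sample.
    """
    breaks = []
    prev_image = None
    prev_offset = None
    for image, page in sorted(samples):
        offset = image - page
        if prev_offset is not None and offset != prev_offset:
            breaks.append((prev_image, image, prev_offset, offset))
        prev_image, prev_offset = image, offset
    return breaks


# (149, 151) and (150, 154) are the Vol 28 values reported above;
# (148, 150) and (151, 155) are hypothetical neighbours.
vol28 = [(148, 150), (149, 151), (150, 154), (151, 155)]
print(find_offset_breaks(vol28))
# [(149, 150, -2, -4)]
```

A break between two adjacent images points at missing scans; with sparser sampling, it only narrows the gap down to the range between the two samples.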
Done and resolved. Hopefully vol. 28 is all very special now. -- billinghurst (talk) 01:49, 27 September 2009 (UTC)
Progress table
I've now marked up Wikisource:WikiProject DNB/Progress with the most basic information on how we are doing, from the side of having text in place for proofing. A more honest title would be "progress and troubleshooting", since there are quite a number of legacy problems from the bot runs, as well as the inherent difficulties and constraints caused by bad scans. Obviously this page is for everyone to update, and can be expanded to include other issues (one obvious one being the per-volume article listings). Charles Matthews (talk) 15:34, 3 October 2009 (UTC)
Category:Problematic
I'm just getting up to speed on Category:Problematic, which currently has 17 DNB pages. If that is to be used as a general place for cleanup of scans, it is probably going to need its own subcategory subsystem and a bit of infrastructure. I suggest systematic use of discussion pages, in reference to the various kinds of issues such as are discussed under the previous topic. Charles Matthews (talk) 10:34, 27 October 2009 (UTC)
- Sure, however, for specific editions, we can manage it by adding notes to the respective Index_talk pages. We can also look at DNB IndexPages to get an overall picture of where the problematic pages are situated. I also remember someone showing me how you could search for the union of two categories. billinghurst (talk) 11:29, 27 October 2009 (UTC)
access to scans
I would suppose that the most frequently used and important links for this project are those to the djvu files; the only way I have found to reach them is the Progress subpage. I am keen to do the odd article, but I cannot easily find or remember this. Could we put a page in main space with the volumes and indexes? If we don't have the page, the user (and bewildered contributor) can still directly access the scans, a la The Botanical Magazine. The NLA newspaper project offers anyone viewing their scans the opportunity to tag and correct as much of the text as they feel like; apparently this is very successful (several users making around a quarter of a million corrections!). Cygnis insignis (talk) 11:04, 7 November 2009 (UTC)
- I am currently working through and updating each volume (Index: <-> main ns), and will be adding scans links on the volume main ns pages. As each volume lists the available biographies, I believe (at this point) that this is a better solution (for the moment). billinghurst (talk) 12:37, 7 November 2009 (UTC)
-
- I'd be happy to format the information required and put it anywhere considered more prominent. For the "casual" participant, I suppose the questions most needing an answer are: (i) how to locate the djvus relevant to a given biography you have in mind; (ii) how to replace the OCR text with better text, in the frequent case that the bot posting/text layer associated with the djvus is not the best available. The way things are set out on the Progress page recognises that there are various qualifications and caveats likely to be useful to someone approaching all this work, but in a reference format rather than an exposition starting from the basics. Charles Matthews (talk) 10:50, 11 November 2009 (UTC)
Upgrading scanned volumes
I believe that I now have a decent understanding of replacing the File: versions at Commons, and how we can upgrade, and the consequences of such a change.
- Replacing a volume with a better-quality version of the same volume is eminently doable (and actually something we should keep a watch upon for the potential negatives and positives).
- If a complete volume is replaced with another complete volume, then things are okay as long as both have the same start page and the pages correspond thereafter.
- If an incomplete version is replaced with a complete version, while this is great, it may involve a bit of page moving; hence we should resolve these files before we advance much further with fixing the text in these beasts.
Background
A DjVu file has two layers (image and text). When a DjVu file is loaded to Commons, both are available to ThomasV's Proofread Page extension.
- Creating an Index: page locks in the file at Commons, and use of pagelist reveals the images available.
- Creating a Page: page shows the respective image from the Commons file, and grabs the text from what it sees as the corresponding text layer for the page. The text is what is imported to WS, and thereafter stored on WS.
Why do I tell you this?
When our Page: text is of poor quality and we upgrade the Commons file, the image will be upgraded; however, as the text is stored at WS, there is no change to the text displayed. So to grab the text layer from the new file, we need to delete our page (Page:yaddada.djvu/nn), and when we recreate the page the text layer from the upgraded file is imported, if available.
So, for example, in Vol. 57 a number of the scans were poor and I have subsequently replaced the DjVu file, but we already have a number of pages created. Some of the red NOT PROOFREAD pages will have poor OCR, so we may have to delete those existing pages and recreate them. Deleting the pages holus-bolus is not advisable, as some are partially or fully corrected and have just not been advanced in their proofreading status.
Where I do replace a DjVu file at Commons, I will make a prominent note on the corresponding Index: page to alert people to this, and direct them here to ask an admin to undertake the requests identified.
So?
If a page is of poor quality, check to see if we have reloaded the image file (see Index: page), and we may be able to help out. Leave any such request on this page, and an admin will deal with it. Questions? -- billinghurst (talk) 13:17, 7 November 2009 (UTC)
- Your delving into these issues is much appreciated, and, yes, the sooner the better as far as sorting out the obvious page glitches is concerned, since the work involved is only going to increase over time. Coming at it from the end of identifying the cleanup to do, I'm now proactively using Category:Problematic to report dud djvus, and I see there are now 56 there. That is probably only the tip of the iceberg for illegible djvus, though. I assume we're agreed on priorities? Gaps in the continuity of pages are the worst issue, because the knock-on effects of a numbering change are large. Illegible djvus where there is no alternate scan available at all (say case "poor" in the Progress table) are the next worst case, since that means that proofing can really only be done at present from a physical book. Bad text can be got round using the ODNB-hosted scans for those with that access (now includes me), as long as there is any sort of readable djvu.
- We need ... oh, lots of things (the project as whole is fairly complex, as we are finding out) but given the request in the previous topic, can we have a look at just a few? (A) Mapping of the offsets (i.e. number in pagespace − page number in volume) as a way of finding those pesky gaps more systematically, by sampling through all 63 volumes; (B) Documentation that will make sense to those wishing to come in and help out; (C) Central page for requests - I mean a forum that is not so much about these project-level issues, but where people can simply post "I want to create/have created for me the biography of X" and state the issue they have, for an answer and attention. Charles Matthews (talk) 11:13, 11 November 2009 (UTC)
-
- I've done a basic survey on (A) now and entered the findings at Wikisource:WikiProject DNB/Progress; and I'm working on (B), which means writing down for the first time systematically numerous bits of know-how (a couple of pages still need to be created, see redlinks on the main project page). As for (C): I can think of various kinds of requests. There may well be a need for admin recreation of pages that have been deleted only for the sake of sorting out the djvu sequences, and for those you can just contact Billinghurst or me. Either of us will try to handle requests for general help, also. I can imagine people unclear about getting decent scanned text for a particular article or section in pagespace. The answer may still be "difficult to do", but please raise such matters here, so that we can start to log the worst places. Any time you find you are leaving a gap deliberately in creating articles, we really should be aware of the issue. /Most wanted articles is effectively unused, except by me. I don't see that this page is redundant, though. Charles Matthews (talk) 12:11, 13 November 2009 (UTC)
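Once an offset has been recorded for a volume, as in item (A) above (number in pagespace minus page number in volume), finding the Page: title for a printed page is simple arithmetic. The Python sketch below assumes the title pattern seen elsewhere on this page; the function name `page_title` is hypothetical, and the offset of -2 is the pre-repair Vol 28 value implied by the "Vol 28 skips" report.

```python
def page_title(volume: int, printed_page: int, offset: int) -> str:
    """Return the Page: title holding a given printed page.

    `offset` is image number minus printed page number, so the image
    number is recovered by adding the offset back to the printed page.
    """
    image_number = printed_page + offset
    return f"Page:Dictionary of National Biography volume {volume}.djvu/{image_number}"


print(page_title(28, 151, -2))
# Page:Dictionary of National Biography volume 28.djvu/149
```

This only works within a run of pages where the offset is constant; across a gap in the scans, the offset for the later run has to be used instead.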
problem page, better index?
The page Page:Dictionary of National Biography volume 49.djvu/22 was missing a bit at the bottom, I found another scan at dictionaryofnati49stepuoft and corrected it from that. The source of the current index is sometimes problematic, in my experience, the latter may be a better version. Cygnis insignis (talk) 11:16, 18 December 2009 (UTC)
- Thanks, yes, in general the "best" scan is listed at Progress, and the initial postings may not have used it. Charles Matthews (talk) 15:57, 18 December 2009 (UTC)
We need a "HowTo" for new contributors
Thanks for the fantastic work over the last year. I dropped off of Wikipedia and Wikisource about a year ago and I stopped by last week to see how the project is going. I see 2000+ articles, plus major improvements in the infrastructure, particularly the scanned images.
I tried to read the project page with the eyes of a newcomer, and I found it difficult to see how to get started. I think we need a specific "howto" section that enumerates the steps a potential contributor should take. If there are multiple methods of working, we might need a separate howto for each method.
In particular, it appears that the current default method of working is:
- Understand the difference between page space and article space (or whatever we call them)
- see if the article exists in article space (how exactly)?
- Find the correct pages in page space (how exactly?)
- determine if the scans are OK and if the OCRs are OK.
- If not, try to find another source (how?)
- If scans are acceptable, continue
- Proofread and edit the OCR'd pages in pagespace. Start from scratch if the OCR is hopeless but the scan is readable.
- use the manual of style to handle small caps, end-of-line hyphens, greek letters, Italics, ligatures, etc.
- add "section" templates to identify stuff for the "transclusion" step below. (how exactly?)
- check your work.
- advance the state of the page to "proofread." (How, exactly?)
- create your article in article space
- If a redlink for your article exists in the ToC for the volume, use that name exactly (check it for adherence to the style manual and fix if needed, but it's probably OK.) If your article title is not in the ToC, add it now as a redlink.
- use the DNB00 header template, using a "worked example" such as xxxx (How, exactly?)
- Transclude the sections from the pagespace article you created earlier, using the "worked example."
- Add the "previous" and "next" article titles to the ToC (perhaps as redlinks) if they are not already there.
- Ask for help if you need it (Where?)
And Happy New Year! -Arch dude (talk) 20:37, 3 January 2010 (UTC)
- Welcome back AD. A trickle of info, not clasping the whole at the moment. See {{DNBset}} it pulls together the info and makes page creation easy in main ns. Slowly working through upgrading and updating the scans where possible, and better integrating the scans to pages and vols. I would also think that we would want to be smart about how our instructions can blend in with the general instructions for the site, even if we quilt pages together with transclusions. The instructions need to basically say TYPE WHAT YOU SEE. billinghurst (talk) 04:37, 4 January 2010 (UTC)
-
- Yes, welcome back, and I've seen you active on the WP end too. I can work up a "how-to" guide, but I wouldn't want to be off-putting, either by putting in a huge amount of detail, or by prescribing a way of working too closely.
Commenting:
In particular, it appears that the current default method of working is:
- Understand the difference between page space and article space (or whatever we call them)
-
- Yes, there are multiple namespaces involved: main, Page:, Index: and Author: all relevant.
- see if the article exists in article space (how exactly)?
-
- The safest method is the volume listing.
- Find the correct pages in page space (how exactly?)
-
- Several options. Determine the volume first, naturally, from Dictionary of National Biography. These days you should be able to find articles to 'bracket' yours: article A before and article B after in alphabetical order. Where articles are transcluded, you can get to the page from the article (either directly by clicking in the margin, or by editing the article, depending on the transclusion method). With page numbers to 'bracket', bisecting the range gets you there quite quickly. We are, though, moving towards the ability to look pages up. Author subpage listings should have the page number in the original volume added, and knowing the "offset" you can go to the exact page directly. (I have been lent a handbook, but unfortunately it is based on the 22-volume edition, so I can go directly, but at the cost of some mental arithmetic.)
- determine if the scans are OK and if the OCRs are OK.
- If not, try to find another source (how?)
-
- The "Progress" subpage lists the so-called "best" scan at archive.org. On occasion you'd need the full listings of scans. The ODNB option is good for particularly bad text; so I'd recommend adding such pages to cleanup categories, and requesting help for articles urgently wanted.
-
- If scans are acceptable, continue
- Proofread and edit the OCR'd pages in pagespace. Start from scratch if the OCR is hopeless but the scan is readable.
-
- See note before. I think we need to emphasise that 'triage' makes sense in this project. There is plenty to do that is not so tough. Faced with an article you really want and is apparently very hard to do, request help. Otherwise your time is probably better spent in other ways.
-
- use the manual of style to handle small caps, end-of-line hyphens, greek letters, Italics, ligatures, etc.
- add "section" templates to identify stuff for the "transclusion" step below. (how exactly?)
-
- It's the 'section begin', 'section end' tags at the end of the Wikisource-specific line below the editing box.
- check your work.
- advance the state of the page to "proofread." (How, exactly?)
-
- Radio buttons below editing box, advance to status yellow from status pink.
- create your article in article space
- If a redlink for your article exists in the ToC for the volume, use that name exactly (check it for adherence to the style manual and fix if needed, but it's probably OK.) If your article title is not in the ToC, add it now as a redlink.
-
- My method is to go from the author template at the article end to the author page, and add the disambiguated redlink to that page. Disambiguation is easy to check with a full volume listing, otherwise you need to look back to two articles before, two articles ahead, to check disambiguation for the article itself, previous and next. Yes, it's a pain sometimes. I create from the author page, and then check "links to". If the volume listing shows up, you're done. If not, you need to go and add or tweak a link.
-
- use the DNB00 header template, using a "worked example" such as xxxx (How, exactly?)
-
- I now always use DNBset, and I keep it in a text editor. If you are working through sequentially in a volume (which cuts down overheads) you can update the template quickly, including the previous and next links.
-
- Transclude the sections from the pagespace article you created earlier, using the "worked example."
- Add the "previous" and "next" article titles to the ToC (perhaps as redlinks) if they are not already there.
- Ask for help if you need it (Where?)
-
- Here, me, Billinghurst. I'll take queries on text matters, but am not very competent on what happens behind the scenes technically.
Charles Matthews (talk) 10:00, 4 January 2010 (UTC)
Continuing work on this at Wikisource:WikiProject DNB/Walkthrough. Charles Matthews (talk) 10:29, 4 January 2010 (UTC)
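The "bracket and bisect" search described above for finding an article's page can be sketched in a few lines. This is only a rough illustration, not a project tool: `find_page` and `first_headword_on_page` are hypothetical names, the callback stands in for a person opening a scan page and reading its first headword, the sample headwords and page numbers are invented, and plain string comparison is assumed to match the DNB's alphabetical order (it may not in every case).

```python
from typing import Callable


def find_page(target: str, lo: int, hi: int,
              first_headword_on_page: Callable[[int], str]) -> int:
    """Return the last page in [lo, hi] whose first headword sorts <= target.

    lo and hi are the pages of the two 'bracketing' articles: one known to
    sort before the target and one known to sort after it.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2           # upper midpoint, so the loop terminates
        if first_headword_on_page(mid) <= target:
            lo = mid                       # target starts on or after this page
        else:
            hi = mid - 1                   # target starts before this page
    return lo


# Invented sample data: first headword on each of pages 10..17.
SAMPLE = {10: "Oldys", 11: "Oliver", 12: "Onslow", 13: "Orme",
          14: "Osbaldeston", 15: "Osborne", 16: "Oswald", 17: "Otway"}

page = find_page("Osborne, Edward", 10, 17, SAMPLE.__getitem__)
print(page)  # prints 15
```

With eight candidate pages the search looks at three of them, which is the point of bisecting: even a bracket several hundred pages wide needs fewer than ten looks.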
Article space and page space
We seem to have settled on a specific set of scanned volumes to serve as our basic source. We also seem to have settled on transclusion from these sources as our basic method of operation. I think we should expose this to our readers, by adding a pointer to the index of each volume to the volume's ToC header. This will allow a reader to (try to) access the scan itself when the desired article has not been written. This may also entice readers to become editors. Eventually (in approximately the year 2025 at the current rate of progress) we will have the entire DNB00 in both page form and article form, and the ToC can become an easy way to navigate in both forms. -Arch dude (talk) 10:12, 5 January 2010 (UTC)
- Not quite clear about this. I think we are settled on the books (out of the various editions); though there is e.g. still the issue of how to incorporate the 1904 Errata. We are not yet settled on the precise scans, since one scan can replace another, and sometimes should. The numbering of the djvus is settled in many cases, not all (not all volumes can stay as they are for all time, since for example there is missing text). We could indeed try to set up links anchored somewhere and linking to text somewhere else. Though adding 27000 links (or anything) is a major undertaking. I think you are suggesting that the volume ToCs should link directly to the starting djvu of an article, by a link sitting by the article name in that listing. OK, can be considered. I would like to have page numbers (from the book) added to various listings. This is the same concept, except that you are thinking of a clickable page number that takes you to the djvu (in volume v , djvu.(page number + offset)). Which could probably all be put together with the article name in a fancy template. Since I basically approve of a template over a piped link on the ToCs, let's kick this around some more.
- (As far as extrapolating to the project finish goes, seems like a mug's game to me, but we'll discuss this again in 2011.) Charles Matthews (talk) 15:19, 5 January 2010 (UTC)
There are several things called index here. The page in Index: space will display the original page numbers of the document in the djvu file, once adjusted; these also appear in main (or article) space when transcluded. Any other information is manually added, as with the indexes (the listing of existing articles) for the DNB volumes in main space. The original index, several pages at the end of volume, is much more powerful. I fiddled around with some of these, Page:Dictionary of National Biography volume 11.djvu/478 is an example, and found they not only give page numbers for the entries, but also cross reference alternate spellings and people mentioned in others' entries. I also discovered widowed entries in main space (not appearing at the volume listings) from the blue links that appeared and found mis-titled pages with the names given. Advantages to having these linked from (or at) the volume's page include: applying the naming conventions to redlinks; users being able to discover if someone is mentioned, and which volume to "see", rather than looking for one they know exists; having the previous and next entry in one spot; navigation for browsing, q.v. links, and where to look in the scan if the entry is not in main space. This latter index would be useful for users, but even more so for the building and maintenance of the project. Doing this half-done, no format, no proof page was not as interesting as doing an actual entry, but I will try to add some more every now and then.
The errata could be created separately, then linked to and from the entries; making it even more useful than the original format was. Cygnis insignis (talk) 20:52, 5 January 2010 (UTC)
- Yes, I would dearly like to see the indices in the books proofed: because starting from those, other working listings can be created. We have in practice mostly worked from the other end. (Picking up orphans can also be done using Magnus Manske's tool, by the way, if they link to their author page but are not linked back.) Charles Matthews (talk) 22:13, 5 January 2010 (UTC)
- I have added a new project page on navigation at Wikisource:WikiProject DNB/Pagefinding. It lists both types of indexation for convenient reference, and explains what to do with page numbers. It still needs the conversion information on the 22-volume edition added (of particular interest to me, but probably not urgently needed by others). Charles Matthews (talk) 12:39, 6 January 2010 (UTC)
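The "page number + offset" arithmetic discussed in this thread can be written down as a small sketch. Assumptions, loudly: `djvu_page_title` is a hypothetical helper, the title pattern is copied from the Page: links quoted on this talk page (volumes 11, 42 and 63; the pattern for single-digit volumes is not confirmed here), and the offset of 14 in the example is invented purely for illustration. The real offset differs per volume and may change if a scan is replaced.

```python
def djvu_page_title(volume: int, book_page: int, offset: int) -> str:
    """Build the Page: title for a printed page number in a given volume.

    offset is the difference between the djvu position and the printed
    page number for that volume's scan (an assumed, per-volume constant).
    """
    djvu_position = book_page + offset
    return (f"Page:Dictionary of National Biography "
            f"volume {volume}.djvu/{djvu_position}")


# Hypothetical example: printed page 464 of volume 11, with an invented offset of 14.
title = djvu_page_title(11, 464, 14)
print(f"[[{title}|464]]")
```

A clickable page number in a volume listing would then just be a piped link of this shape, which is the sort of thing a template could generate from the volume, the printed page and the offset.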
Treatment of "redirect" articles?
Have we decided how to treat the "redirect" articles in the original DNB? I just added Osborne, Edward (DNB00). Its predecessor (see Page:Dictionary of National Biography volume 42.djvu/290) is a "redirect." Should I create an article for Osborne, Dorothy (DNB00), or not? If not, what should I use as the "previous" for the Osborne, Edward (DNB00) article? -Arch dude (talk) 02:39, 9 January 2010 (UTC)
- This point has not been decided. My view is this: let's not include any redirect pages in the volume ToCs. Let's not include them in the "previous" or "next" fields, either; what is helpful is to go directly to the previous or next full biography. Where they can fit in is as linked from the wikified Volume Index pages. In other words, in creating hyperlinks from the Index pages at the ends of each volume, there is the scope to create a wikilink to a short page for these various types of redirects. Charles Matthews (talk) 10:03, 10 January 2010 (UTC)
Questions
Hi, I have three questions regarding Wordsworth,_Christopher_(1774-1846)_(DNB00) (Page:Dictionary_of_National_Biography_volume_63.djvu/31 to Page:Dictionary_of_National_Biography_volume_63.djvu/33): A) Is there a possibility to avoid the line break in the small text at the bottom of the article? B) The article has a subentry about the subject's son. Is it better to split the article or to leave it intact? C) I have created the article with a hyphen between the years in its name; however, I later noticed link(s) to it which use an en dash instead. Shall I move the article now to the version with the en dash, or shall I redirect the latter to the original version with the hyphen? ~~ Phoe talk ~~ 12:48, 23 January 2010 (UTC)
- Use <div style="font-size:smaller"> and <noinclude></div></noinclude> on the last page, and then <noinclude><div style="font-size:smaller"></noinclude> on the second page. Ensure that the div is inside the tags. I have done it for this article.
- Put an {{anchor}}, and we can link to it (#son's name) from the vol's ToC.
- We have an n-dash in the url? Gee, I normally have hyphens, they give nicer urls. In answer, redirects are cheap for the server, so don't feel too concerned about creating one. billinghurst sDrewth 13:16, 23 January 2010 (UTC)
-
- I think we should stay away from endashes in the titles. At some later stage we could decide to use endashes, but the MoS has always said hyphen? The gain would be small, I believe, while the possibility for confusion is large. (The DNB scans in general use hyphens in the text. In the text I feel it doesn't matter, more important things to worry about. The ODNB transcriptions do use endashes.) I know there are a few titles in the volume ToCs that do use endashes, but the volume ToCs in general may not conform to the MoS (in which case they are wrong, for current purposes). Charles Matthews (talk) 17:11, 23 January 2010 (UTC)
-
- (ec - help! ;-) To 1.: Thanks, I will keep it in mind. To 2.: See 1 :-). To 3 and as explanation.: After I had seen your validation of my first proofread page [1], I inserted also cross-references using hyphens in the next pages. When I had created the mentioned article however, I noticed that the link you had inserted in your validation in fact didn't link to the article. On closer inspection I saw that you had used en dashes, which confused me a little bit since I had assumed that hyphens were correct. Therefore I checked also Category:DNB_biographies, where generally the articles had hyphens, which confused me a little bit more. :-) ...
- By the way there are a few articles with en dashes (for example Hunter,_William_(1755–1812)_(DNB00) and Langley, Thomas (1769–1801) (DNB00)). Furthermore, for information: apparently some articles are categorised under the wrong letter, see [2]. ~~ Phoe talk ~~ 18:00, 23 January 2010 (UTC)
-
-
- Just move articles that are non-compliant titles at present. We don't usually worry so much about it: the title conventions aren't even completely codified in all cases, so we should be reasonably relaxed, tighten up the manual when there is a particular issue, and just admit that some moving is a minor price to pay for having people contribute.
-
-
-
- The other point is a known bug in the underlying software, and will get sorted out by an upgrade some time. Apparently the transclusion by {{DNBset}} gets an illicit defaultsort. Charles Matthews (talk) 19:54, 23 January 2010 (UTC)
-
-
-
-
- Aye, thanks for the information. ~~ Phoe talk ~~ 22:17, 23 January 2010 (UTC)
- To 1) if you want someone else to do it, that is okay, either mark it or leave it and we will get to it. To the rest.
- billinghurst sDrewth 23:08, 23 January 2010 (UTC)
-
-
- For the hyphen, Wikisource:WikiProject DNB/Style Manual is clear: the article title uses a hyphen, not an ndash. Yes, you may add a redirect, but you may also change the article that links to the "wrong" title. Here, the problem is the typography of the original: if in the opinion of one or more proofreaders, an ndash was used in the original, then we need to display an ndash in the linking article, but this has no bearing on conventions we use to create an article's title. Speaking personally, with regard to the text, not the Wikisource article titles, I cannot determine by examination the difference between a hyphen and an ndash in the original in most cases, and as proofreaders we are not supposed to convert from what the typesetter DID do to what the typesetter was SUPPOSED TO do. -Arch dude (talk) 03:42, 24 January 2010 (UTC)
On that last point, take another example, the [q. v.] or [q.v.] references. I believe that both forms 'occur' (i.e. spaced or not spaced) and the reason is that these books were hand typeset by compositors in right-justified lines. Therefore the spacing in the qv's was used as a way to right-justify: it was elastic. It would not surprise me at all to find that both hyphens and endashes were used, in the dates, for exactly the same reason (and there were likely other things of the sort). For me it is a bridge too far to be peering at the hyphen/endash in the text and worrying about it when it may have been arbitrary anyway. Which is not to say that others can't worry. I always put a space in the [q. v.] by the way, for aesthetic reasons. I don't think we should be bothered about these matters in validating text; but if the mission requires it, some post-validation checking can go on. Frankly, if we had better scans to start with, it would make more sense to me. Charles Matthews (talk) 08:59, 24 January 2010 (UTC)
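For anyone wanting to check titles against the hyphen convention in bulk, here is a minimal sketch of the title rule only. It assumes the Style Manual rule as stated in this thread (hyphen in article titles, never an en dash); `normalise_title` and `needs_move` are hypothetical helper names, not existing tools. It deliberately says nothing about the transcribed text, where the typography of the original governs.

```python
EN_DASH = "\u2013"  # the character used in a few existing titles


def needs_move(title: str) -> bool:
    """True if an article title contains an en dash and so breaks the convention."""
    return EN_DASH in title


def normalise_title(title: str) -> str:
    """Return the title with every en dash replaced by a plain hyphen."""
    return title.replace(EN_DASH, "-")


# One of the en dash titles mentioned above.
old = "Hunter, William (1755\u20131812) (DNB00)"
if needs_move(old):
    print(old, "->", normalise_title(old))
```

The output of `normalise_title` is the move target; the old en dash title can be left behind as a redirect, since redirects are cheap.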
Source description pages
I have been bold, or perhaps stupid, and started adding "source description pages." Look at any of the first eight volume pages, (e.g. Dictionary of National Biography, 1885-1900/Vol 1 Abbadie - Anne). You will see a link from the header: "Access scanned source of Volume x," that links to a new page, e.g. Dictionary of National Biography, 1885-1900/Vol 01 source description.
These new pages are generated manually, using a new template: {{DNB volume source description}}. This means we can tweak the wording of all of the source description pages by changing the template.
These pages occupy an uneasy space between the article space (purely intended for readers) and project space (intended for editors.) The audience for these pages is the reader. When our project is complete, the casual reader will not need these pages, but our project is far from complete, so the reader may be forced to use un-transcribed material. These pages are intended to permit the casual reader to access the un-transcribed material in the least painful manner.
Please comment. -Arch dude (talk) 01:14, 24 January 2010 (UTC)
- Thank you for the work put in on these pages; treating them as extra documentation doesn't seem problematic to me. When we finally "complete" a volume in terms of all biographies, perhaps we should then discuss what happens next (completing the front matter, index, the "redirects" if we are doing those, wikilinking the index and rationalising the volume ToCs, and treating these pages as scaffolding of some sort).
- To resume on the "Proposal: Add Volume index "articles" " thread, and while we are looking at categories, there is Category:DNB Add text that gives an older form of "dummy" article. That just has a patch of articles from the start of one volume, and I've added text to a few that were there. I'll get round to the others, if no one else does. When I thought you were talking about "dummy articles that at least link to scanned pages", I was envisaging such articles that also actually made available the specific pages. There is more than one way such pages could operate. In the common case that a given article covers no more than two pages, I see an attractively simple way to create them (<section begin="marker"/> <section end="marker"/>) on a page transcluding nothing but bringing up the page number in the left margin to click. This requires no commenting-out. With what I know about transclusion if there is a whole transcluded page in at least three, the "middle" pages would need to be commented out to give the intended "dummy" effect. But then what I know about transclusion is not much. Charles Matthews (talk) 09:16, 24 January 2010 (UTC)
-
- This is problematic in its existing form.
- The introduced text is not part of the work, it is all commentary, hence it does not belong in the main namespace. Such text belongs on a Talk: page, if anywhere.
- I would argue that the text itself belongs on the Talk page of the Index anyway, not of the main work in the main namespace, as the text is specific to the version of the scan, not to the volume itself.
- the template creates subpages to non-existent parent pages, which is far from ideal
- too much text to get to the information, a table would better present the data
- it duplicates an already existing template called {{edition}} which works perfectly fine
- it removes the direct link to the index pages from the notes field which is eminently useful on each page
- While it was an idea, it really doesn't float with me, and needs to go back to the drawing board. billinghurst sDrewth 22:15, 24 January 2010 (UTC)
- The question here is "who is the customer?" I maintain that the customer is the reader who is looking for information from the DNB. Project members have little or no need for these pages, because we already know all this stuff, but a casual reader has no clue. If we are not trying to help the casual reader, then why are we here? These pages are currently in main-space, not because they are sourced from the DNB, but because they are intended as navigational aids. This is similar to the volume pages themselves. As to the direct links, these are useful to contributors, but are fairly scary for a reader who has never seen our strange deja vu system, and this is precisely why I want to interpose a simple-minded explanation. For a direct link, I propose to add a trivial transclusion. This (apparently) causes the mediawiki software to add the "source" tab to the tab set at the top of the page, and will get that function back for those who want it. I just blew away all the "index to scan" links (less than 25% of the volumes had them) because I did not see your note in time. Sorry. I will go back and add a trivial transclusion to all 63 of my entries. In fact, I will convert my "note =" into a template so we can do a bulk change as needed to meet your objections or even completely revert this effort. -Arch dude (talk) 22:55, 24 January 2010 (UTC)
- [responding to Arch Dude mainly, everyone else has heard this rave.] I'm concerned that efforts are being put into solutions that will beget more problems. Doing the following would be tedious, but it would greatly reduce the need for notes, guidelines and disclaimers: proof, redlink and display the index pages of the original. Readers and contributors may want the same thing when the entry has not been created, the ability to navigate to a scan of the original. If the indexes replace [!] the ToC at the parent page we get the following advantages: a page number in the vol., a cross reference to a different volume (as "Pseudonym, Jane. See Surname, Jane"), the spelling of the subject's name, a pre-disambiguated title, and whether an entry was even written.
- The Index: namespace's page may be confusing for those who haven't used them, but no less so than any other method of organizing the navigation. Full information about the integrity of the scan is evident from that page's display, though it is summarised elsewhere in project space. Bear in mind that this project existed well before the adoption of page scans. For those who don't know, the offset can be noted at these as
<pagelist 15=1 />
- It should be possible to generate metadata from these to a project maintenance page, but reiterating the information seems confusing and redundant to me. The direct link from mainspace to these (Wikisource) Indexes was a better solution. Any improvement to the functionality of these would benefit other projects, so, rather than attempting to compensate for perceived shortcomings, efforts would be better directed toward that. Cygnis insignis (talk) 05:11, 25 January 2010 (UTC)
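To spell out what the pagelist line above encodes: `<pagelist 15=1 />` says that position 15 in the djvu file carries printed page number 1, so the offset for that scan is 14. A rough sketch of reading the offset out of such a tag follows. The function names are hypothetical, and it only handles the single simple `position=number` pair shown above; real pagelist tags can carry more attributes, which this does not attempt to parse.

```python
import re


def offset_from_pagelist(tag: str) -> int:
    """Return djvu position minus printed page number from a simple pagelist tag."""
    match = re.search(r"(\d+)\s*=\s*(\d+)", tag)
    if match is None:
        raise ValueError("no numeric position=page pair found in tag")
    position, printed_page = int(match.group(1)), int(match.group(2))
    return position - printed_page


def djvu_position(printed_page: int, offset: int) -> int:
    """Convert a printed page number to its position in the djvu file."""
    return printed_page + offset


offset = offset_from_pagelist("<pagelist 15=1 />")
print(offset)                      # prints 14
print(djvu_position(100, offset))  # prints 114
```

Because the offset lives in one place on the Index: page, anything derived from it this way stays correct when a scan is replaced and the pagelist is updated.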
-
- A confusing discussion, to be sure. The issue addressed is navigation, and the trouble is coming, forgive me, from self-imposed Wikisource limitations on navigational pages (namespace usage issues). There are also debates going on here on short-term, medium-term and long-term priorities. The index pages from the DNB volumes number around 200 in all; and some of the scans of those suffer from generic bad-scan issues. Proofing them is not a short job, and likely can't be completed in the short term. I like the idea of doing something with the DNB volumes' pages in the Index namespace. This has the advantage of logic (look in that namespace for navigational information, seems intuitive), and perhaps transcluding some project pages into those Index pages could help centralise information on one page that is now spread over several. Charles Matthews (talk) 09:39, 25 January 2010 (UTC)
- I concur: the "page source descriptions" are a poor substitute for the proper navigational solution. Their only justification is that we (I) can complete them in a very short time, and for the unsophisticated reader they are a great deal better than nothing at all. As Charles says, this is a short-term solution. As we complete the transcription of the index pages, we can (and should) add a link to our transcribed index page from within the source description page and of course from the volume page. As we complete the transcription of each volume, we can entirely remove the "page source description" for that volume. To Billinghurst's point about mainspace: I am in no way wedded to this. I will experiment with moving the pages to talk space. I'm thinking about moving each to a sub-page of the talk page of its index page. -Arch dude (talk) 15:31, 25 January 2010 (UTC)
- Update: Please look at the vol 1 ToC. I am now using a template and pointing to a subpage of the talk page. The template lets us make mass changes to the display text as we converge on consensus. I added a small superscript link (SI) for power users. This links directly to the scan index instead of the verbose Source description page. I am now changing all 63 ToCs to use the template, and moving the source description pages to the new locations. -Arch dude (talk) 00:47, 26 January 2010 (UTC)
- I changed all 63 ToCs. This repairs the damage I did by removing the old "links to scanned pages." Instead of 25%, we now have 100% of the ToCs linked to the scan indexes, via the "(SI)" superscript link. All 30 of the "source description" pages I created have been moved out of mainspace. Change the template {{DNB sdp}} to change all 63 ToCs as desired. I will create the remaining 33 source description pages later, now that the damage is repaired. -Arch dude (talk) 02:47, 26 January 2010 (UTC)
(Outdent). I have completed the initial phase of this effort. All 63 "source descriptions" are now created. I have attempted to address Billinghurst's concerns: there is a tiny superscript (si) link on each ToC page for power users, and each "source description page" now starts with a terse summary, also for power users. The bloated verbiage is still present, but is now in a section called "Explanation for new readers." I feel very strongly that this must be present, because it is likely to be the very first exposure the casual reader has to the raw sources, and we need a way to convey this basic information. Comments? -Arch dude (talk) 00:08, 31 January 2010 (UTC)
- If we are trying to address the reader of the DNB, rather than workers on the project, there should be a clear link on the main WikiProject page of the type "if you are here to access the DNB, start by reading this". Maybe a hatnote. It should lead to an expository page explaining what is to be done about reading, given the work in progress. Charles Matthews (talk) 08:21, 31 January 2010 (UTC)
improved scans
I didn't look very deeply so pardon my laziness and delete this post if it is redundant. Has anyone looked into creating new djvu files from the same source data? Most items at archive.org contain much larger files, such as a tif.zip, that might produce a better file (and ocr) than the one made available. The settings used by others to create our current djvu files may have compromised the integrity of the data, such as favouring compression over resolution. Cygnis insignis (talk) 10:34, 29 January 2010 (UTC)
I may be on to something here, compare the line of the index for " Cole, Thomas" in the djvu at Page:Dictionary of National Biography volume 11.djvu/479 with the online view of a jpg from the same scan data. The algorithm of the djvu conversion decided that the 6 was an 8, but the actual dates are evident from the jpg of the second link. Now the bad news, the source data file is huge and pushing them around requires a lot of bandwidth. A work-around is using what we have and referring to the online scan. Resampling and tweaking the conversion may be possible from the smaller flippy.zip that is the source of the 'online viewer'. Cygnis insignis (talk) 11:34, 29 January 2010 (UTC)
- I can imagine this conception being very useful, particularly for WS:CEU (a project that has barely started, but never mind), where the key proof-reading step is for reference listings that are omitted from all the online Catholic Encyclopedia versions. There it is small blocks of tiny type that really need close attention. Charles Matthews (talk) 16:03, 9 February 2010 (UTC)
Author subpages
These seem to be catching on, so this is the moment to move out the basic "template" as Wikisource:WikiProject DNB/Author subpage template. Not all authors need them, but maybe up to 100 DNB authors would benefit from not having their page dominated by the DNB list. I made a list on my userpage.
Also time to explain a little, so that this method gets documentation. The column for page numbers is intended for the page numbers in the paper DNB, not the djvu numbers. For one thing, those djvu numbers may change as the result of maintenance work that still has to happen on the uploads. Also allowing people to give or check those page numbers easily, for citations, is going to be useful for some readers. The fourth column was initially intended as a Y/N for WP links, but it can be for unpiped links (possibly with comment) if we agree on that.
In the third column, and generally to create listings, the "plain link" {{DNB lkpl}} (as opposed to the full citation link {{DNB Link}}) is apparently now the accepted way (certainly by me, since I find a column of those to be more legible than the [[Bloggs, Joe (DNB00)|Bloggs, Joe]]), and it must have real advantages for machine reading also. Charles Matthews (talk) 16:41, 2 February 2010 (UTC)
Blind DNB links
Have been introduced to our first blind DNB link. Where they did a qv in an earlier volume, and subsequently it would seem that they have decided against the biography.
- Machin, John (d.1761) (DNB00) has a qv to Dr. (Alexander) Torriano
We need some thinking for how we wish to handle this.
I don't think that we should make the qv wikilinks disappear, and feel that we could look to having a page without any body text, with a standard DNB00 header, though with something in the note field that says something along the lines "while another DNB biography links here, the biography was not undertaken in the published volumes 1-63." — billinghurst sDrewth 00:08, 7 February 2010 (UTC)
- Preferably with something added: "The wikilink to this page is a placeholder. Feel free to improve it with an internal or external link". And we should probably have a project page somewhere noting the blind links, and so that such changes can be logged, to avoid duplicated efforts. Charles Matthews (talk) 10:32, 7 February 2010 (UTC)
-
- There it is probably easiest to create a {{maintenance category}}, do we use Category:DNB blind linked page, and slap it onto Category:DNB? If the addition is any more than simple I think that we should template whatever we have there so we can more easily update the words. Should we create a page as a pilot? billinghurst sDrewth 12:48, 7 February 2010 (UTC)
-
-
- Shouldn't we both keep it simple, and avoid any mission creep within Wikisource? I propose that we simply have one page (a subpage of this project) to which such blind links (and this may be the only one) would direct. If anything more than that is needed then what am I missing? Jan1naD (talk • contrib) 14:49, 7 February 2010 (UTC)
- Like it centurion! Make them redirects to the page, and we can add each bio name to a compiled list as we find them. As redirects they are also easy to undo if we find it in a weird spot. billinghurst sDrewth 16:34, 7 February 2010 (UTC)
- OK, can do it all on one project page if the cross-namespace linking isn't considered weirdness. Charles Matthews (talk) 08:41, 8 February 2010 (UTC)
- I cannot see why we cannot do it in the main ns, make it a specific subpage of DNB, it keeps the work together. Not perfect perfection, however, it should do. billinghurst sDrewth 12:27, 8 February 2010 (UTC)
-
Set it up then, can be moved whenever. Charles Matthews (talk) 18:08, 8 February 2010 (UTC)
- Please make it clear that these are blind links (i.e. to articles that were never added to the DNB) rather than redlinks (i.e. to articles that WERE added to the DNB but which have not yet been transcribed into the Wikisource DNB). We will likely need to re-emphasize this to new contributors occasionally. I recommend a brief note for readers on the new page itself, and a longer note for contributors on its talk page. -Arch dude (talk) 14:37, 9 February 2010 (UTC)
Blind Link adds from Volume 28 & 33 Found these over the last month.
- Hurd, Richard (DNB00) - William Weston (author) who published book in 1747
- Hunter, William (1718-1783) (DNB00) - Francis Sandys (Georgian architect)
- Hunter, Andrew (DNB00) - contains Robert Walker (1709-1802) but does not match the description found in this version of Walker, Robert (DNB00) article.
- Lindsay, David (1551?-1610) (DNB00) - John Lawson (16th c. educator).
- Hume, Thomas (DNB00) - Gustavus Hume (Irish surgeon) wiki link to Hume street, Dublin.
The following two currently have only Wiki articles available.
- Edward Howard, 9th Duke of Norfolk Howard, Walter (DNB00)
- Sir Edward Ponynges Howard, Edward (1477?-1513) (DNB00)
JamAKiska (talk) 23:15, 28 May 2010 (UTC) All of these Blind links are now red links. Those with wiki links have been moved to [q.v.] symbol.JamAKiska (talk) 16:42, 30 May 2010 (UTC)
- Ibbotson, Henry (DNB00) contains blind link to William Joseph Hooker. The [q.v.] links to vol 27 article on his son, William Jackson Hooker, currently unwritten. JamAKiska (talk) 22:06, 31 May 2010 (UTC)
- Iestin ab Gwrgant (DNB00), blind link to Howel ab Morgan. Should be a peer of Gruffydd ap Rhydderch. JamAKiska (talk) 01:56, 1 June 2010 (UTC)
- Griffin, Benjamin (DNB00), Vanbleek or Van Bluck. JamAKiska (talk) 14:58, 24 June 2010 (UTC)
- Herbert, Philip (1584-1650) (DNB00), John de Critz, a painter. JamAKiska (talk) 13:29, 4 July 2010 (UTC)
- Herbert, St. Leger Algernon (DNB00), Herbert, John Alexander Cameron, war correspondent. JamAKiska (talk) 14:02, 4 July 2010 (UTC)
- Gordon, James Bentley (DNB00), Thomas Neeve, nephew of Richard Bentley (scholar). JamAKiska (talk) 19:06, 8 August 2010 (UTC)
- Mitchel, Jonathan (DNB00), Rev. John Cotton (d. 1652). JamAKiska (talk) 13:39, 31 August 2010 (UTC) Included in 1901 edition of the 2nd supplement.JamAKiska (talk) 00:52, 1 October 2010 (UTC)
- Montagu, Edward (d.1557) (DNB00), John Roper, AG to Henry VIII.JamAKiska (talk) 21:29, 2 September 2010 (UTC)
- Masquerier, John James (DNB00), John Hoffner, fellow painter. JamAKiska (talk) 03:21, 20 September 2010 (UTC)
- Lucas, Charles (d.1648) (DNB00), John Lucas, Lord, older surviving (?) brother. JamAKiska (talk) 20:14, 25 September 2010 (UTC)
- Love, Nicholas (DNB00), Andrew Broughton, regicide. JamAKiska (talk) 02:13, 28 September 2010 (UTC)
Wikisource:WikiProject DNB/Wikification
Trying to grasp the nettle of what we mean by saying that articles should be hyperlinked. Couple of points. The "blind links" discussion just above this should be referenced by this new project page, mentioning what is to be done about the qvs that can't be resolved. And there is the possibility that some wikilinks should run to a disambiguation page, to give the reader a choice of references rather than picking out one. This is also something to mention on the page, depending on what we think. Charles Matthews (talk) 09:54, 15 February 2010 (UTC)
- The disambig page is a good idea. Better than my (unpublished) thought that something like [Abercorn Earls of. See Hamilton] should simply point to the contents page that lists the Hamilton articles. Jan1naD (talk • contrib) 10:12, 15 February 2010 (UTC)
- What is said at Wikisource:Style guide#Disambiguation pages is not particularly well adapted to reference texts (the common experience). In practical terms it doesn't seem to annoy anyone to create a dab page about a given topic, including reference work articles with closely-related titles. Linking to such pages is what I had in mind. Your idea can be refined by using {{anchor}} to set up the "Hamilton" anchor on that page. I actually just don't know where we stand on using dab pages more systematically, for example disambiguating "Hamilton Earls of Abercorn". Discussions on general principles seem to turn out inconclusively. Charles Matthews (talk) 12:02, 15 February 2010 (UTC)
- To annotate this discussion, there is a general discussion at WS:S about use of a namespace to collate topic pages (person/place/other). Note also that, as I have been working on a couple of biographical works, I have been starting some disambiguation pages to manage such occurrences. An example is Adrain, Robert. — billinghurst sDrewth 15:20, 28 May 2010 (UTC)
Creating good pages in "bad" volumes?
I have created or proofread some "bad" pages, getting reasonable results. I leave some of these in the "not proofread" state, since I'm only using a part of the page. The question is: what happens to this work if someone later decided to reload or otherwise re-work the volume? Is there a chance that this work will be lost? What mechanisms are in place to preserve such pages?
A related problem is page number transclusion. We know that some volumes have missing pages, and presumably these pages will be inserted later. But transclusion depends on the scan-index numbering, not the original volume numbering, so if a page is moved to a different scan number, any articles that transclude the page will be messed up, right? How do we address this? -Arch dude (talk) 16:45, 15 February 2010 (UTC)
- It could be worse than just part of a page. Both Charles and I (among others, I'm sure) have created text using various external sources, and taken it at least as far as Proofread, or even Validated. There would be some unhappy bunnies if that work got trashed. Jan1naD (talk • contrib) 17:10, 15 February 2010 (UTC)
-
- This is one of the downsides to a bot extracting the ocr layer; if left as redlinks then 'not proofread' would show any activity and fully proofread sections. The upside is that I can link an entry to a work or author when they get added, because it sometimes turns up in a search of the site, but I have become wary of doing this at DNB for the reasons given above. Expecting others to detect my more trivial edits seems like an unkind burden on those who undertake this complex task; at least I can see the changes and deletions turn up on my watchlist. Strictly speaking, and certainly from the pov of politeness, we should move any valid contributions. I have done this when an error in the file was realised; a dozen or so pages was fiddly enough. This project does valuable work and is a good starting point for newer users, so urgency and caution need to be applied. I don't expect others to leap at the opportunity, but I'll put my hand up to assist in pushing the data around. Cygnis insignis (talk) 17:38, 15 February 2010 (UTC)
-
-
- Various volumes have in fact been rectified, and the fallout hasn't been so great. What I have noticed has included a small percentage of transclusions that have not been modified correctly; and some pages that need to be recreated. This is a tribute to the care with which Billinghurst has implemented changes, in fact - mostly people don't notice much. So, while it is accurately said that the system is not completely robust, our experience has been far from traumatic. Charles Matthews (talk) 18:12, 15 February 2010 (UTC)
- From this response, I infer that Billinghurst is the only(?) person who fixes structural problems in the page scans. Is this correct? If so, then the rest of us need to adhere to whatever rules or guidelines that Billinghurst wished to promulgate in this regard. What are these rules? -Arch dude (talk) 21:23, 15 February 2010 (UTC)
- Billinghurst is the only person who has undertaken that task, he may provide some suggestions or guidelines that derive from his efforts. As it stands, after Charles' reassurance, this is unlikely to affect future contributions. I imagine that the only 'rule' is that if one ignores the red warning about a file needing fixing before proofreading, one is risking making work for the user who fixes it and needlessly sweating over a crummy text layer. Cygnis insignis (talk) 22:02, 15 February 2010 (UTC)
- I believe Billinghurst is the only project member who has done the "behind the scenes" work on ProofreadPage to know how to change over scans, and cure the "bot hiccups" that apparently account for the gaps and duplications in the posted scans. I doubt he would mind training up someone else :-) Apparently it all starts with backing up what is already there, which should be of some reassurance. Other Wikisource people may be the ones who run bots and do the "heavy lifting" with the large djvu files. I just don't have the technical chops to get involved in such things, but there is no reason that others can't be involved if they do. I think much of the sweated labour is checking over 450 pages afterwards. Sooner rather than later, really. Charles Matthews (talk) 22:52, 15 February 2010 (UTC)
- I really need to make myself clear. I think that we need to encourage Billinghurst's efforts and make sure we do not make his work harder. This is a primary goal, because the project now depends fundamentally on improving the scans. I will do whatever it takes to support that goal. Given that, I need some guidance on how to avoid losing the work I am doing on "bad" pages. I do not want Billinghurst to have to worry about trashing a single trivial edit when upgrading a horrible OCR to a good OCR page, but I also do not want to lose an entire proofread page as a side-effect of a multipage upgrade. I want to be able to find the pages I need for an article and proofread them, even if they are problematical, and I want that work preserved. But I also do not want to get in the way of a desperately-needed multi-page upgrade. Tell me what to do. -Arch dude (talk) 02:39, 16 February 2010 (UTC)
- I think it is wrong to say that the project depends "fundamentally" at this time on the remedial work. The approach that has operated (well) so far is like this: get on with the work, carry a machete, help map out the problems. For example, there are pages that are simply missing: my guess is that this may be around 60 out of 30,000, or 0.2%. There is a "bad quartile" of scans that have no alternative at archive.org, and are the Google-sourced ones that are basically a disgrace to the human race. That represents 16 volumes last time I counted. Obviously the latter problem is much more of an obstruction, but there is a way round (for me and those in my position). This being a wiki, we don't have project management as such; given the scale of the project, I'm not surprised that the lack may be felt.
- We do have a consensus that the volumes will be fixed, and many have been, in reverse numerical order. I don't see who could give warranties about the work anyway, but I have said what I can about the actual effects of the upgradings. So people working on the later volumes of the DNB have no reason to be anxious anyway, and the second half of the alphabet has been neglected, so that there is a practical way to avoid hassle. Charles Matthews (talk) 08:56, 16 February 2010 (UTC)
-
-
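The transclusion concern raised at the start of this thread (articles point at scan-index numbers, so inserting a missing page shifts everything after it) can be illustrated with a small sketch. It assumes articles transclude with a ProofreadPage `<pages index=... from=... to=... />` tag; the function name `shift_transclusion` is hypothetical, and this is not the procedure actually used when volumes were rectified.

```python
import re

# A whole <pages ...> tag, and the from/to attributes inside it.
PAGES_TAG = re.compile(r"<pages\b[^>]*>")
FROM_TO = re.compile(r'\b(from|to)=("?)(\d+)("?)')


def shift_transclusion(wikitext, inserted_at, count=1):
    """Add `count` to every from/to scan number >= `inserted_at`.

    Models the case where `count` missing pages are inserted into the
    scan at position `inserted_at`, pushing later pages along.
    """
    def bump_attr(match):
        number = int(match.group(3))
        if number >= inserted_at:
            number += count
        # Keep the attribute name and any surrounding quotes unchanged.
        return f"{match.group(1)}={match.group(2)}{number}{match.group(4)}"

    def fix_tag(match):
        return FROM_TO.sub(bump_attr, match.group(0))

    return PAGES_TAG.sub(fix_tag, wikitext)
```

Run over every article that transcludes from the affected volume, something like this would keep the from/to values in step with the new scan; section markers inside the pages would still need checking by hand.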
The identified issue with M's tool
“Wikisource DNB links to Magnus Manske's statistics and maintenance tool. (The detailed readout is now more complicated than in the past, because the /DNB author subpages are causing some unproblematic pages to register in both of the main cleanup lists.)”
As an interim measure, why don't we transclude the subpages back to the top level of the author space? It keeps the page smaller and subsidiary, yet brings them back to the front and hopefully makes them available to Magnus's toolserver script. At least worth consideration. — billinghurst sDrewth 15:47, 22 February 2010 (UTC)
- OK, could try it on a small sample. I should mention the slightly embarrassing (for me) glitch meaning that the tool picks up a couple of Catholic Encyclopedia pages. I was tinkering with {{CE13}} as an adaptation of {{DNB00}} and never finished changing over some of the categories. Charles Matthews (talk) 16:00, 22 February 2010 (UTC)
- Hah, that isn't even trying!
The other thing is that we could look to see whether we can leverage a tool that ThomasV was building. Discussion at User talk:Beeswaxcandle#Have a look at the presentation of ..., as I think that there is some potential there. — billinghurst sDrewth 06:56, 23 February 2010 (UTC)
Article names in natural order as on WP?
I'm new to DNB. Is there a present or future plan to have article names in natural order as on WP? Otherwise when I Google, for example, for "Robert Murray M'Cheyne" (in quotes) I am never going to find the relevant DNB page unless he happens to be mentioned under that name in the text of the article. I Google for people's names a lot by this method.--PeterR (talk) 10:57, 6 March 2010 (UTC)
- I'm not a fan of inverted name order generally, since it doubles (at least) the time for searching any database if one even remembers to do a second search. On the other hand the convention is established here, so that the natural solution is to create redirects. Which could be done systematically at some point: the project is really too big/long to say when we might get round to that, and too laborious to be prescriptive in the sense of saying that "redirects must be created as we go along".
- On the bright side, once there is a Wikipedia link on an article, that will be in "natural order", and therefore should be picked up by search engines, so the business isn't hopeless. It is reasonable to ask how articles will actually be found, and the answer is like "they should be created with at least four incoming wikilinks anyway, and in many cases with a link from Wikipedia; and we shall be wikifying the text of the DNB articles so that over time there will be other links in, reflecting the DNB qv structure and other occurrences". I hope that is a fair answer to the concern. Charles Matthews (talk) 11:38, 6 March 2010 (UTC)
- I think the most compelling argument you made is that our link to the WP article by itself introduces the appropriate searchable text. When a WP article does not yet exist, perhaps we could figure out a way to add the "natural order" name in a non-displayed field for use by web crawlers? Many of our subjects actually have multiple "natural" names (e.g., Duke Wellington) which could also be added as hidden search terms. -Arch dude (talk) 23:06, 8 March 2010 (UTC)
- We could use the "extra notes" field to display fuller names, since our title convention is "minimal", while in some cases the useful name might be quite complicated: I was struck by the case of Savile, Thomas (DNB00) which begins as "SAVILE, THOMAS, first Viscount Savile of Castlebar in the peerage of Ireland, second Baron Savile of Pontefract, and first Earl of Sussex". I don't know quite what that proves, though. "Thomas Savile, 1st Earl of Sussex" is the WP title, and very sensible too. Perhaps there is a maintenance task associated with Category:DNB No WP, along the lines you suggest; that category is suddenly becoming big, for various reasons. Charles Matthews (talk) 06:53, 9 March 2010 (UTC)
-
- The correct answer for searches is probably none of the above. Metadata is the appropriate answer, and utilising future renditions of the mediawiki software to organise the work. How we seed works to get extra terms as indicated above, it will be interesting to see how will eventuate, and we have brought it to the attention of developers. That said, I don't think that the concept of surname, firstname is that foreign, especially when one considers its relationship to DEFAULTSORT. — billinghurst sDrewth 10:28, 9 March 2010 (UTC)
- Worth saying also, in reply to the original query, that the WP onsite search is ahead of the game here: for example Thomas Savile searched does show the WS page in a box. Charles Matthews (talk) 11:52, 9 March 2010 (UTC)
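The redirect idea floated in this thread could be sketched roughly as follows. This is only an illustration, assuming titles follow the "Surname, Forenames (dates) (DNB00)" pattern seen on this page; the function names `natural_order` and `redirect_page` are hypothetical, and nothing here creates pages on the wiki.

```python
import re

# Trailing parenthesised groups such as "(DNB00)" or "(1718-1783)".
TRAILING_PARENS = re.compile(r"(\s*\([^()]*\))+$")


def natural_order(title):
    """Turn 'Surname, Forenames (DNB00)' into 'Forenames Surname'."""
    base = TRAILING_PARENS.sub("", title).strip()
    if "," not in base:
        # Titles such as 'Iestin ab Gwrgant (DNB00)' are already in natural order.
        return base
    surname, forenames = (part.strip() for part in base.split(",", 1))
    return f"{forenames} {surname}"


def redirect_page(title):
    """Return (redirect title, redirect wikitext) for one DNB article title."""
    return natural_order(title), f"#REDIRECT [[{title}]]"
```

Because the date disambiguator is dropped, two articles can map to the same natural-order name; such collisions would presumably want a disambiguation page rather than a redirect.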
An awkward one
Sion Llywelyn (DNB00) is proving elusive (vol. 52). It is a duplicate of another article to be seen at Page:Dictionary of National Biography volume 34.djvu/28, as Llywelyn of Llangewydd (DNB00), a.k.a. Llywelyn Sion, and for that reason, presumably, occurs in no edition after the first (they caught this by 1904). The page of the unique scan is Page:Dictionary of National Biography volume 52.djvu/327, which is badly corrupted, but you can see the initials J.E.L. above Sion Lleyn (DNB00). The Fenwick handbook, based on the 22 volume edition, denies that this article exists. It's only a short article, but it may need someone with the physical original volume 52 to fill this gap. Charles Matthews (talk) 19:23, 1 April 2010 (UTC) Completed...JamAKiska (talk) 02:55, 1 October 2010 (UTC)
+1. Charles Matthews (talk) 07:10, 1 October 2010 (UTC)
Author subpages: a bit confused
Just for clarity's sake, what is the "master plan" for the author subpages and how should individual articles be handled currently as a best practice?
- If a subpage exists, only ensure that the article we transclude is present on the subpage and do nothing more?
- If a subpage exists, do the above but also add the newly transcluded article to the author's main page?
- Ignore subpages for now and only add to the author page?
- None of the above?
Also, not directly connected, I'm wondering if we couldn't build a template for each individual row for the subpage rather than navigate huge tables that I have trouble to read in wikitext. Is anyone working on something like that? Otherwise I could give it a shot. MLauba (talk) 11:16, 5 May 2010 (UTC)
- It's confused, but not that serious I think. My own current practice is just to add titles to the author pages; that is because I'm doing batches of articles, several dozen a day, and adding each title is an overhead of around a minute. Adding to a table could be an extra minute, and I'm not doing that because I don't have an extra half hour in my schedule. It would be much more efficient to add a group of titles to the subpages, from time to time.
- Table syntax seemed to be a natural way to do this, but that is because I wanted the (original paper) page numbers included. Those page numbers allow us to check and create references to the paper version.
- In normal wiki workflow, I would say that if A adds title T to the author page, that is good, and if B moves T to the author subpage, that is also good. A and B need not be the same person and this doesn't need to happen at the same time.
- There should certainly be a bigger plan, and it looks like this: there are about 650 authors for the DNB. As of now there are just over 300 of them who have a complete DNB listing on their author page already. There will be in the future I think around 100 who should have an author subpage, though only about 50 of those are really long lists. My initial idea was to fill up the author subpages for volume 1, then volume 2, ... etc., ahead of systematically working through. This alphabetical way of working is only one part of the project, though. We have volume ToCs for the first six volumes, so the idea was to create the listings for volume 1, to accelerate the work on volume 1 (and also to correlate with checking on enWP the presence of articles for the first few volumes).
- As it is, volume 1 is about 80% done, but not much is done in volume 2 yet. I have been working to do more complete author listings, because I have the reference work that means I can do that. There are about 250 more to do, before I would start systematically listing for the authors who will need subpages. I'm working currently to do the letter S, which will take at least a month more.
- So ... the point really is that all of these listings (volume ToCs, author page and author subpage listings) are auxiliary work. If they are done in their own right, it is much more convenient after that to proofread and create articles. But if anyone stops working on articles just to do them, there is a cost in the number of articles. I think the way is to have the listings compiled by everyone as part of the process, which means some tolerance of "confusion" is necessary. It is all made more complicated by the interaction of where articles are listed and the Magnus maintenance tool, which finds articles created but not listed on the author page. You could add in that the format on author pages isn't standard (there are some alphabetical and volume subsections, depending on which page you look at).
- I would welcome a discussion on "format", namely how we could be more tidy. I don't think we'll succeed in a discussion of how the work ought to be done (when the real problem is the size of the task). Charles Matthews (talk) 13:14, 5 May 2010 (UTC)
-
- If it is easiest to add them to the author page then do so, it is not a problem to build copy and paste to subpages and to use regex expressions to build the table. Ten names or one thousand take the same time with a regex replacement. As usual, do whatever with which you are comfortable and we can come and fix behind you. The major task is the bios, the rest is wikignome territory. With regard to standardisation, we can get a bot to do such pages, we can readily find the author pages by a number of means, so that is the easy part.— billinghurst sDrewth 13:44, 5 May 2010 (UTC)
-
-
- But I'm thinking, now the question has been asked, whether the subpage is the correct solution. An alternative would be a template that you could collapse, on the author page itself. Charles Matthews (talk) 13:48, 5 May 2010 (UTC)
-
-
-
-
- For clarity's sake, I meant to say that *I* was a bit confused, not the system itself :). Re: subpages vs non subpages, my line of thinking is, with transclusion the point is moot: provided we have a uniform format, any author subpage can be transcluded back to the author's page (and collapsed there if we want it to be).
- Regarding the format proper, while the table listing itself isn't an issue, I was thinking that having a template like '''{{DNB auth|article=XYZ|vol=N|pp=123|pedia=y/n}}''' in lieu of the complete table rows might be easier to handle visually (and for consistency's sake, it would include {{subst:DNB lkpl|XYZ}}).MLauba (talk) 14:10, 5 May 2010 (UTC)
- If you think that would be easier, then I am happy to build it. Do you think that we need to subst: DNB lkpl, or is there benefit in having it as a template? — billinghurst sDrewth
- As an aside, I believe the scan index of Vol I now accurately displays every single page that has an illegible scan (went through them a while ago), so we now know exactly where our gaps are. MLauba (talk) 14:10, 5 May 2010 (UTC)
- Nice! I do need to get back to those, so many balls in the air, so few hands. — billinghurst sDrewth 14:34, 5 May 2010 (UTC)
-
-
(outdent): I've gone ahead and tinkered a bit. The result is at User:MLauba/Sandbox, containing the drafts for DNB auth top, DNB auth and DNB bottom. Feel free not only to comment but to fix. In particular, my syntax is a bit rusty and in User:MLauba/DNB auth, the conditional check on the 'w' parameter (whether a wikipedia article is present or not) is pretty weak atm. And for User:MLauba/DNB auth top, I dimly remember that there's a trick to make the collapsebox header a proper wiki l3 header but I just cannot remember how to do it atm (or perhaps it isn't implemented here, dunno). MLauba (talk) 09:15, 6 May 2010 (UTC)
- Any feedback? MLauba (talk) 10:02, 11 May 2010 (UTC)
- Apologies, was thinking that CM would comment on its style and functionality. Just now I had a fiddle in case no parameters were passed for the first two. I am not sure what you are trying to check with the {{{w}}} at the moment; if you want w being present to say yes, and no value entered to say no, it is not quite there. If you could try
{{#ifeq:{{{w|{{{$4}}}}}}|y|yes|no}}
which says if w=y or parameter 4 = y, then yes, otherwise = no. — billinghurst sDrewth 11:40, 11 May 2010 (UTC)
- Need a conditional {{{volume}}}.
{{#if:{{{volume|}}}|add WIKILINK CODE here}}
which will test for the existence, and if it exists, then wikilink, otherwise fail gracefully. — billinghurst sDrewth
Despite anything written above, I've been busy with author page listings for a couple of days. See next item - there is some interaction with the subpages issue.
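As a rough illustration of the regex approach mentioned earlier in this thread (pasting a plain list of titles and letting a replacement build the table rows), here is a minimal sketch. The input layout (title, volume and page separated by tabs), the column order, and the way {{DNB lkpl}} takes its argument are all assumptions; the sample names used to try it out are made up.

```python
import re

# One listing line: title <TAB> volume <TAB> page (assumed input layout).
ROW = re.compile(
    r"^(?P<title>[^\t\n]+)\t(?P<vol>\d+)\t(?P<page>\d+)$",
    re.MULTILINE,
)


def to_table_rows(listing):
    """Rewrite each matching line as a wiki table row; other lines are left alone."""
    return ROW.sub(
        r"|-\n| {{DNB lkpl|\g<title>}} || \g<vol> || \g<page>",
        listing,
    )
```

The replacement is a single pass over the text, so ten names or a thousand cost about the same, which is the point made above.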
A listings automation issue
It is pleasant to be able to announce that we shall reach the milestone of 5000 DNB articles shortly. Another milestone relates to Category:DNB contributors with incomplete listings, in other words author pages not having a full list of DNB articles: this is down to 100 authors (out of a notional 683 - there are a few author pages for the DNB not yet created, but they are negligible for the listing issue). Barring various kinds of error, we are down to the authors who were prolific (more than 50 articles). I can do some more on this, but the longest lists are many hundreds.
I have been thinking along these lines: with listing complete by author, wouldn't it be possible to have some automatic way to scrape the names from the author pages, put them in alphabetical order, and then create volume ToCs in that way? The listing by volume issue is at most one third done at present. There are some points about the results you'd get (disambiguation is not guaranteed but with a list showing the duplicates can be elucidated by finding which authors link to a name, and also ASCII order isn't exactly right for the DNB ordering by dates); but work by hand on rough lists would be quite reasonable to handle these matters. Technically the author pages should all carry {{DNB contributor}} and related templates, and the relevant list would be enclosed in {{DNB lkpl}} or {{DNB link}}.
I find this attractive not just because if it works it would save a great deal of typing, but because it could actually be done in pieces (a few volumes or initial letters at a time). It would be a reason and motivation to get the very long listings done piecemeal. Charles Matthews (talk) 16:04, 11 May 2010 (UTC)
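The scraping step described above might look something like the sketch below. It assumes the author-page wikitext has already been fetched into a dictionary keyed by author name, and that article titles are the first argument of {{DNB lkpl}} or {{DNB link}}; the function name `rough_listing` is hypothetical. Sorting is plain case-insensitive string order, so DNB ordering by dates would still need fixing by hand, as noted.

```python
import re
from collections import defaultdict

# First argument of {{DNB lkpl|...}} or {{DNB link|...}} (assumed template usage).
LINK = re.compile(r"\{\{\s*DNB (?:lkpl|link)\s*\|\s*([^|{}]+?)\s*(?:\|[^{}]*)?\}\}")


def rough_listing(author_pages):
    """Return [(title, [authors linking to it])], sorted by title.

    `author_pages` maps an author name to that author page's wikitext.
    A title claimed by more than one author is a hint that it may need
    disambiguating.
    """
    found = defaultdict(set)
    for author, wikitext in author_pages.items():
        for title in LINK.findall(wikitext):
            found[title].add(author)
    return [(title, sorted(found[title])) for title in sorted(found, key=str.casefold)]
```

The output is only a rough list: slicing it by initial letter would allow the volume ToCs to be built a few volumes at a time.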
Stats
I have just updated Wikisource:WikiProject DNB/Statistics for June. There has been a gradual accretion of numbers to track. Now that we are finishing volumes, where should we record progress on completed volumes, and completed letters of the alphabet? These are probably the most conventional measures, for a project such as this, together with the headline number of articles (which is now close to 20% - we should get an accurate number of DNB00 articles, which total around 27,000). Charles Matthews (talk) 09:40, 2 June 2010 (UTC)
- The Fenwick handbook says 27,326 articles are DNB00, disagreeing with Sidney Lee's Statistical Account. So we have done 19.8% by that measure. NB that the articles generally get longer, on average, in later volumes. Charles Matthews (talk) 09:45, 2 June 2010 (UTC)
- It is also likely that the shorter articles will be done first, so the proportion of text inserted is probably less than the proportion of articles.--Longfellow (talk) 20:35, 2 June 2010 (UTC)
- That may be, depending on how people work, but I doubt it is really significant (the very short articles are not so interesting, and may well be ignored by anyone who isn't going through systematically). The average or "normal" (median) article is about one page of DNB; my impression, working through letter S, is that the really scanty articles are fewer, probably because the team of authors by then had enough experts in all the required fields. Anyway as the project progresses, it will become more possible to extract information. Gillian Fenwick calls the DNB "a fascinating subject, barely documented to date". Charles Matthews (talk) 07:02, 3 June 2010 (UTC)
- I would also reflect that I typeset pages, not articles, so there will also be lots of part pages waiting for the remaining parts. We could look at the number of proofread pages, though in the earlier period, there was more of a tendency to not use the progress markers. — billinghurst sDrewth 10:12, 3 June 2010 (UTC)
- Shouldn't really get hung up on numbers: extrapolation says this is a three-year project now at the current rate of progress, and that is indicative enough. Consolidation will take in already-done text in the natural course of things. Once the articles are there for DNB00, DNB01 is another 5%. Then DNB12 is also possible. I'm interested in tracking the various referencing and cross-linking issues because they form a part of the bigger picture, as well as motivations. Creating all those author pages was part of fighting initial inertia and getting some momentum, too. I was asking about how to display the "headline figures" mainly because the project's front page hasn't up till now made a point of announcing progress, while we are reaching one or two milestones. Charles Matthews (talk) 10:37, 3 June 2010 (UTC)
- That may be, depending on how people work, but I doubt it is really significant (the very short articles are not so interesting, and may well be ignored by anyone who isn't going through systematically). The average or "normal" (median) article is about one page of DNB; my impression, working through letter S, is that the really scanty articles are fewer, probably because the team of authors by then had enough experts in all the required fields. Anyway as the project progresses, it will become more possible to extract information. Gillian Fenwick calls the DNB "a fascinating subject, barely documented to date". Charles Matthews (talk) 07:02, 3 June 2010 (UTC)
- It is also likely that the shorter articles will be done first, so the proportion of text inserted is probably less than the proportion of articles.--Longfellow (talk) 20:35, 2 June 2010 (UTC)
- We can always link to http://en.wikisource.org/wiki/Special:IndexPages?key=Dictionary+of+National+Biography
British Museum
See Wikisource:Scriptorium#British Museum tie-ins for what this is all about; and Wikisource:WikiProject DNB/British Museum where I'm marking the project's card about the authors most relevant to us. Basically my list shows that 25 out of 35 author pages for writers who worked at the British Museum are DNB authors. Charles Matthews (talk) 20:03, 5 June 2010 (UTC)
[q. v.] or name
It is extremely discouraging to see the controversy regarding [q. v.] or name. I wanted to immediately undo the edit recently done at [5] but I realize the necessity of discussion and ingenuity to reach a superior end result. Personally I prefer to link the name. Further, until we are all on the same page I will probably "undo" future editing of my preference on pages I create. It is clear, however, that links should be made or they could get lost. I just hope a reasonable solution is in the near future. Daytrivia (talk) 08:24, 11 June 2010 (UTC)
- These stylistic things shouldn't be allowed to become a major distraction, firstly. (Every hard-and-fast "rule" becomes a barrier to entry on the work.) I happen to prefer the logic of linking a [q. v.] where it is there: quod vide being the Latin for "click here". The idea of linking the name seems to come from general experience of wikification. There is an argument that the presence of links distracts the reader, so that a shorter link is in some ways better. I haven't myself done much of the linking on the DNB, intending to make later passes at it when there is more to link to. Where we are according to Wikisource:WikiProject DNB/Wikification is simply "No consensus so far on whether to link the name or the [q. v.]". So we need to talk this through. Charles Matthews (talk) 09:05, 11 June 2010 (UTC)
- Controversy? Sheerly a difference of preference, and I don't think that it is worth bringing on discouragement nor worth undoing. A link is a link, and we will sort it out in time.
Importing section that started a conversation, unknown whether there was more elsewhere
Hi according to this edit [6] I will have to redo everything I have done. That's the breaks I guess. Daytrivia (talk) 00:56, 10 June 2010 (UTC)
- Well, ummm, I hyperlink the names, not the [qv], especially as they are too hard to see, and that has been our style from the beginning. Take it to Wikisource talk:WikiProject DNB for discussion. BTW, do not redo. Surprisingly they didn't have hyperlinks in their books, and needed a way to identify that, isn't that just weird and so old-fashioned.
— billinghurst sDrewth 06:45, 10 June 2010 (UTC)
- The way I see it, a name, generally given, is an opportunity to link to a general author page. A "q. v." is specifically a link to another location within the same document, and should be hyperlinked as such. Hesperian 06:55, 10 June 2010 (UTC)
- With this work, it is most likely linking to a non-author, and should link to the respective article, and the link is to put it into context as the work intended not to the author page, plus it is not evident that there are two different links and it will be confusing. [qv] is archaic and redundant when we can link the name, and a damn sight more obvious. If the person is an author, then we hyperlink their introductory name to the author page quite appropriately, as well as other appropriate linking. Plus we are well into the work, and to start raising that as a concern at this point seems an inappropriate reversal.— billinghurst sDrewth 07:03, 10 June 2010 (UTC)
- You make some reasonable points, mate, but the Argument From Inertia ain't one of them. ;-) Hesperian 10:18, 10 June 2010 (UTC)
- (ec) In the context of my talk page, it is not inertia. This was discussed early in the start-up of the project (where? no specific memory, it occurred somewhere), and we have extensively progressed without any specific issue being raised, and there still has been no particular case presented for a change, so the status quo should be maintained until the appropriate consensus otherwise prevails, and one that is not undertaken on my talk page. We should not have a smattering of each way bets through the works. — billinghurst sDrewth 12:00, 10 June 2010 (UTC)
- I touched on this issue at talk:WPDNB/style. If the name is linked to a DNB article, there would be very few links to the author:ns. This isolates DNB from the site's SOP, where a linked name goes to that namespace. If this is the case, that needs to be decided and explained to the user and User: and some consideration given to what other works this will apply to. And do we link articles when there is no [q.v.]? A person as subject might have other legitimate references.
- If we maintain the author:link, the solution is to use [q.v.] for the link — Cygnis insignis (talk) 10:59, 10 June 2010 (UTC)
So eventually we are going to need a pass through all the articles (preferably done volume-by-volume as the work becomes more complete). There will be various things that will need to be done at that time (I can think of transclusion style and artefacts, validation, check WP link+status, categorisation at least). It has just been uncovered to us that the original style at the start of the article is like SMITH, JOHN.
One approach is to say that we shall standardise wikification only at that point; being at that future date in possession of a more worked-out scheme. At present, it seems, we should only actually avoid overlinking, in the WS sense of going "over the top" in adding links. Now we could also work some on the style guide now, to lessen future efforts by getting it right first time. This is laudable as intention, but we should also notice that there has been a "moving target" throughout, with innovations being taken up. I have no profound feelings about the detail of a wikification guide, but it generally (i.e. making Wikisource look more like hypertext in its reference areas) does seem to be a quite knotty discussion, if it is a question of designing hypertext rather than just laying down style guides, work by work. Charles Matthews (talk) 10:37, 11 June 2010 (UTC)
- I have no preference. However, the lack of a consensus causes me to simply not link at all, since we will need to normalize at some point in the future when consensus is reached. If we can reach a consensus, then I might start linking. If I were to start from scratch with no knowledge of other editors' preferences, I would probably include both the name and the [q.v.] in the link text. Perhaps we need a template: {{DNB QV|John Doe|John Doe (fl.1500)}}. Then we can change the effect globally. -Arch dude (talk) 11:04, 11 June 2010 (UTC)
- I think that's a good point: if we agree now to place qv links in a template with the right features, we can postpone the ultimate decision. Also we can track those links, and maybe compile an automated list of redlinks that are wanted for qvs, so there are other advantages. Charles Matthews (talk) 12:58, 11 June 2010 (UTC)
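The tracking idea above (an automated list of redlinks wanted for qvs) can be sketched roughly as follows. This is only an illustration in Python, not anything the project actually runs: the function names, the `pages` dictionary and the `existing_titles` set are hypothetical, and it assumes the first template parameter is the target article title, as in the {{DNB qv|Tone, Theobald Wolfe|Wolf Tone}} example given further down this thread.

```python
import re

# Matches {{DNB qv|target}} or {{DNB qv|target|display}}.
# Assumption: no nested templates or links inside the parameters.
QV_CALL = re.compile(
    r"\{\{\s*DNB qv\s*\|([^|{}]+)(?:\|([^|{}]*))?\}\}", re.IGNORECASE
)


def qv_targets(wikitext):
    """Return the target named by each {{DNB qv}} call, in order of appearance."""
    return [m.group(1).strip() for m in QV_CALL.finditer(wikitext)]


def wanted_qv_links(pages, existing_titles):
    """Map each missing qv target to the sorted pages that reference it.

    pages: hypothetical dict of page title -> wikitext
    existing_titles: hypothetical set of article titles already created
    """
    wanted = {}
    for title, text in pages.items():
        for target in qv_targets(text):
            if target not in existing_titles:
                wanted.setdefault(target, set()).add(title)
    return {target: sorted(sources) for target, sources in wanted.items()}
```

Because every qv link goes through one template name, a single pattern is enough to find them all, which is the advantage of templating the links rather than writing them by hand.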
Now we're talking. One more thing, I have started working on Hugh Burgoyne and am now curious as to the "see" link before him [7]? Daytrivia (talk) 13:57, 11 June 2010 (UTC)
- Or even the next page here [8] name or "see" as hyperlink? Name seems more appropriate than "see" but it's a very similar variable that I presume will eventually need addressed. Daytrivia (talk) 14:55, 11 June 2010 (UTC)
- Caution: Outsider's opinion: Makes sense to me to link the name to the Author: NS page (if we have one, and we should), and the [q. v.] to the DNB page: Jonathan Swift [q. v.]. This means we maintain the WS trend to link names to Author: NS pages, and the original "linking", such as it is, in the work. Inductiveload—talk/contribs 16:21, 11 June 2010 (UTC)
- However, the simple fact is that you are going to see an extended blue link, and two urls together (not neat practice), and unless people can and are watching where the link is they will be on one or the other. The major purpose of the links for the work is to take them to the existing biographical detail; in the qv links they are not looking for Author pages and that should maintain the priority, over our cross namespace links. Linking to the Author page clearly from the biography is perfectly adequate. Now if we were in the references for the works, especially the smaller links at the end and often a Foster Alumni Oxonienses, then that may be a different case and perfectly suitable to link to author pages, and books.
- Examples, with faked links
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. George Berkeley [q.v.] Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo.
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. George Berkeley [q.v.] Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo.
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. George Berkeley [q.v.] Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo.
With these examples, where do you go to click the link, and do you normally look where it is going? — billinghurst sDrewth 16:44, 11 June 2010 (UTC)
- I'm not aware of a prohibition on two links being adjacent; it happens all the time at the other place and is unavoidable here. However, here is an example, with separation: George Berkeley [q.v.]
—Cygnis insignis (talk) 17:13, 11 June 2010 (UTC)
- I also suppose that q.v. is used judiciously, by the author, editor, or indexer, or else it would be appended to every name with an entry. The [see Surname, Jane] format is an explicit reference to something of relevance in another entry. What smallcaps only means seems ambiguous, but it seems to indicate a particular authority is being cited; the entry on Blake uses it twice, only one has a DNB entry. Cygnis insignis (talk) 17:26, 11 June 2010 (UTC)
OK, the new template is Template:DNB qv. It currently makes the whole name+qv into a single link, but anyone is free to change it, or ask me to change it. Here: {{DNB qv|Tone, Theobald Wolfe|Wolf Tone}} yields Template:DNB qv. -Arch dude (talk) 18:59, 11 June 2010 (UTC)
- Further opinion: the original intent of the [q. v.] in the DNB was to reference another DNB article. In my opinion, we should honor this intent. I think we should link name+qv as a single link to the DNB article. Sometimes, but not always, the linked person also happens to be an author: When this is the case we should have a link in the "info" section of the named DNB article to the author page. This is metadata, not wikisource transcription. On the other hand, a [see...] link, or other inline ref, or an entry in the list of authorities, was clearly intended to guide the DNB reader to the correct reference material outside of DNB. In these cases, we should link the author name to the author page, because the original intent is clearly to provide the name of an author, not the name of a DNB article. Frequently, but not always, the reference in the DNB is to an author who has a bio in the DNB: We should treat this by ensuring that our wikisource author page links to the wikisource DNB article for that author. -Arch dude (talk) 01:32, 12 June 2010 (UTC)
- Umm, HOLD UP: this template is superfluous, it performs no different function than the existing {{DNB lkpl}}, so please don't start using it.
- But that's not unknown: {{DNB contributor complete}} and {{DNB contributor done}} show the same text, while having different tracking functions. It's more a question of whether the distinction is likely to prove useful at some point. Charles Matthews (talk) 09:35, 12 June 2010 (UTC)
- That is tail wagging dog stuff. We try to make things as simple as possible for newbies, and this doesn't. If there is complexity required, then let us plan for it and layer it into the templates, not start up something that confuses and replicates. — billinghurst sDrewth 10:41, 12 June 2010 (UTC)
- OK, the Wikification subpage has been up around four months now. It clearly needs an overhaul so that it would be newbie-friendly and also a reference for the project. Perhaps we should adjourn to the talk page. Charles Matthews (talk) 11:34, 12 June 2010 (UTC)
- "This [Author:ns] is metadata, not wikisource transcription" Keeping metadata out of the site's documents is good practice; however, making a link to this metadata is done everywhere, appropriately I think: "... though it has Basire's name affixed, is, on the authority of Stothard, from Blake's hand." [emphasis] A mention of an author is often, in one sense, a reference to their works, especially in the context of a library. Nevertheless, I am swayed by the opinion above that this is a reference work and linking from those, rather than to them, is somewhat incongruous.
- With regard to the current form of the template, I'm reckoning that if another style is adopted the template could not simply be modified because it has appropriated the characters that would make a link to the author:ns. Wouldn't this require a script to fish out the characters, losing the named advantage of templating the link? The only reason I can see to avoid the semantically correct form, James Barry [q.v.], is that someone looking at the text might miss it; I've read hundreds of the articles and think I notice when that form of reference is given. The 'outsider'[?] view is that one expects a name to link to Author:ns, and that q.v. fulfils its function. The inimitable 'insider' view is that we move forward very cautiously, avoid conflicting approaches with a style guide, and focus on the primary task of adding the articles. Cygnis insignis (talk) 07:17, 12 June 2010 (UTC)
Summary so far - Wikisource:WikiProject DNB/Wikification will need editing to reflect this discussion.
- We appear to have semi-settled one issue, namely that a template for the qv links exists and is now the preferred way to create such links, other ways being deprecated by the project, and the form of the template being something that can be discussed further at need.
- Not sure about the "See" links yet. I had assumed they were functionally equivalent to qv.
- Other small-caps names: I believe this syntax is used for all author names that are in parenthesised inline citations. There is no presumption that either the author or the work needs a link, but where enWS has already posted the work it would seem to add value to the reader to link to it. There doesn't seem to be a reason why the format alone would change the normal linking considerations.
- Priority of linking. This hits the broader issue of what the hypertext is trying to do: to lead the reader to specific information, to a navigation page for enWS, or the "best" information (which might for example be a good enWP article). I propose a slogan like "everything within two clicks" as a rule of thumb. A link to a specific (I mean mainspace) page is OK, provided that page is itself provided with links to the relevant Author:, Portal: or Wikipedia pages. If not, some questions should be asked.
Charles Matthews (talk) 08:18, 12 June 2010 (UTC)
- Priority of linking: We are trying to reproduce the original document and the authors' intentions. The authors clearly meant to link to another DNB article, not the WP article. Also, it is possible that there is a reference to some specific info in the DNB article that is not in the WP article.--Longfellow (talk) 18:26, 12 June 2010 (UTC)
I have found myself wandering between linking styles, as apparently did the original authors in their attempt to guide the reader to related material. From the discussion thus far, it would seem that the inclusion of a person's title in the link represents a departure from this group's established "norm." I have three examples I would like the group to consider, hybrid qv, 1st name only, and blind DNB link together with the [q.v.] linked to valid wiki article. The hybrid qv seems rarer than the number of blind links I have seen to date. The links containing only first names are usually found in the discussion of the subject's family. It is my intention to continue providing piped links to DNB articles, and including the title only in those cases when the full name is not given, as this path will facilitate the final edit upon reaching consensus. JamAKiska (talk) 15:13, 21 June 2010 (UTC)
- The discussion is largely related to where the name should link, rather than how, to the author:page or the DNB article. What part of the name would be a finer point of style issue, except that would exclude the possibility of linking the author:namespace. Three users think linking the characters "q.v." is logical, linking the name after see is an explicit reference to another part of the work. Linking the author page is, however, a site wide preference; not linking any part of the name allows that to be implemented. Cygnis insignis (talk) 15:55, 21 June 2010 (UTC)
- There are several occasions when [q. v.] appears in the text that it takes a moment or two, depending on the wording, to figure out who exactly the q.v. is meant for. Once the editor has made this determination it seems they would endeavor to make it a no doubter for "newbies" or general reference and use the name. There have been a number of opinions presented above and they all have valid points; I am amazed at the brainstorming that went on here. The general user, however, when referencing one of the DNB articles and seeing a name linked, should immediately know where it is going. The problem with just the [q. v.] linked is that they would have to notice the "mouse-over" before clicking. Daytrivia (talk) 19:16, 9 July 2010 (UTC)
Here is a page [9] for which the name (actually two names) has to be used as the link not the "see" or the "and." Daytrivia (talk) 00:40, 30 July 2010 (UTC)
Linking from the Wikipedia page on the DNB
I have left a note on w:Talk:Dictionary of National Biography saying I intend to link from the Wikipedia article to volume ToCs here, as we finish up the volumes. Currently the links run to archive.org versions. This is partly prompted by discovering that Dictionary of National Biography here rates only at a lowly #36 in a google search for "Dictionary of National Biography". Our efforts could be more prominent. Charles Matthews (talk) 10:10, 16 June 2010 (UTC)
Author template needed
Hi Billinghurst, I am curious to know if there is an author template for author:Arthur Herbert Church, as I can't seem to get it here [10]. Thanks. Daytrivia (talk) 18:17, 22 June 2010 (UTC)
- Having the same problem here [11]. What's going on, I wonder? Perhaps I'm not doing something? Daytrivia (talk) 18:57, 22 June 2010 (UTC)
I just added both templates. My original semi-systematic effort to add all author templates fizzled out about halfway through the volumes. Others have filled in most of the remaining templates, but about 50 remain. When you find a missing one, feel free to add it. -Arch dude (talk) 19:51, 22 June 2010 (UTC)
- Many thanks. Daytrivia (talk) 21:20, 22 June 2010 (UTC)
- If they are a onesie or a twosie level contributor, it is probably just worth using the underlying template to create the link. — billinghurst sDrewth 10:44, 23 June 2010 (UTC)
- There will be nearly 100 more to create, since there are 685 author pages (not quite complete), and 604 templates with some double counting in there. But they are not hard to create from any existing one. I now log the total done in the monthly stats, just to remind us all that this is one task to bear down on among the many. Charles Matthews (talk) 12:08, 23 June 2010 (UTC)
- There are 23 "dab" templates, so there are 604-23=581 "real" templates for 685 authors. A few authors have two templates, so there are slightly more than 100 missing templates, and they will almost certainly not be for big contributors. However, I recommend that we complete all of the templates, merely as a matter of consistency. -Arch dude (talk) 14:14, 23 June 2010 (UTC)
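The count above can be written out as a small calculation. This is just a sketch in Python using the figures quoted in this thread; the parameter `authors_with_two_templates` is a hypothetical stand-in for the unknown "a few", and the sketch assumes no author has more than two templates and that every real template belongs to an author with an author page.

```python
# Figures quoted in the discussion.
TOTAL_TEMPLATES = 604
DAB_TEMPLATES = 23
AUTHOR_PAGES = 685

# Templates that point at a single real author: 604 - 23 = 581.
REAL_TEMPLATES = TOTAL_TEMPLATES - DAB_TEMPLATES


def missing_templates(authors_with_two_templates):
    """Estimate how many authors still lack a template.

    authors_with_two_templates: hypothetical count of authors who hold two
    templates (each such author uses up one extra template).
    """
    distinct_authors_covered = REAL_TEMPLATES - authors_with_two_templates
    return AUTHOR_PAGES - distinct_authors_covered
```

With no doubled authors this gives 104, and each author holding two templates adds one more, which is where "slightly more than 100" comes from.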
- Working through the second half of the alphabet at Wikisource:WikiProject DNB/Listings#Singletons, checking what links to the author page, should catch most of them. Charles Matthews (talk) 14:35, 23 June 2010 (UTC)
- I have now done a trawl through the singletons, and the number of templates has reached 705. Which should therefore be most of them. (I'm not quite clear about counting the DNB01-authors.) I met a few disambiguation issues as I went through, which I dealt with ad hoc; when we get serious about DNB01, we should probably revisit the cases where ABC is an author abbreviation in DNB00 and also for a different author in DNB01 - it's something to look out for. Charles Matthews (talk) 20:08, 24 June 2010 (UTC)
Update: Category:Dictionary of National Biography contributor templates now has 721. There must be 15 or so cases where the same author has more than one template in different places (which happens), to account fully for the number. Anyway I expect only a handful missing now. Charles Matthews (talk) 07:16, 25 June 2010 (UTC)
- Only one template didn't have the category. All should now be listed. — billinghurst sDrewth 02:58, 1 July 2010 (UTC)
Generic nature
Tommy Jantarek (talk • contribs) brought up an interesting point about these footer templates based on {{DNB footer initials}} in whether they could be used for other encyclopaedic material. The answer is yes, presuming of course they are the same person. What makes the initials specific for the project is that a set of initials (which of course can be widely used) identify specific individuals (or numbers of people as we have found) working for a specific publication. Apart from making it easy and standard to link with the template, the value is that we can use it for audit.
In reality, this underlying template probably should be quite generic so it can be more widely used by all the projects, and that is probably also the case with {{DNB contributor}}. So I will look to see if we can free the base templates from the projects, while still maintaining the project components. Before I do, does anyone envisage or plan for other use for these templates? — billinghurst sDrewth 02:40, 1 July 2010 (UTC)
- The templates generate a specific text in a specific format for a specific reason. Unless another work was originally printed using this particular format (initials, right-justified) these specific templates will not be useful. Furthermore, unless these specific authors use these identical initials in the other work, then these templates will not be useful. Also, the larger the number of works we try to cover with these templates, the higher the chance of ambiguities. Finally, these templates are in the "DNB author templates" category. Given all of these constraints, a non-DNB project will probably find it more useful to create its own templates. The biggest advantage of these templates is that they create a super-easy way to figure out who the author is when transcribing. Much of this advantage will be lost in a new work if we increase the possibility of ambiguity. In general, the other work is likely to have at least a slightly different footer style. If so, then create an analogue of {{DNB footer initials}} and modify it to implement that style. Then, go to the author list of the work in question and start making your templates. I am a slow typist, but I was managing about 30 an hour, even though the situation with the 63-volume DNB is horribly messy. Please feel free to contact me for help. -Arch dude (talk) 12:52, 1 July 2010 (UTC)
- Most of the encyclopaedic works of that period utilise the right hand side initials for contributors, and the prime example of these works is EB1911 where there was a crossover of contributors. Anyway, I was more looking to create something like {{footer initials}} as a base template and convert {{DNB footer initials}} to utilise it with an allocated parameter for DNB. So there should be no extra work required, and no changes to any of the DNB XX templates. Also, if someone does use a DNB footer for EB1911, we will pick it up and should be able to create the other templates as necessary. It shouldn't be a biggy. — billinghurst sDrewth 15:48, 1 July 2010 (UTC)
Where the DNB errs
Following a discussion on my talk page about the DNB's mistakes, I have put together Wikisource:WikiProject DNB/Errors and errata. Charles Matthews (talk) 12:06, 29 June 2010 (UTC)
Replacing crap scans
Hi to all,
I am in a reasonably good time and space to get to and undertake some file replacements at Commons for DNB. Though such a task is going to need more than one head (brain and eyes) to address the tasks. As I see it the processes are:
- Identify type of replacement we are doing
- Volumes that require substitution of problematic pages (i.e. will result in the same number of pages).
- If it is not a straight vol for vol swap, this will require rebuilding the djvu files and uploading them, and we will need to identify the pages that we wish to replace (presumably all marked problematic) and the source that we wish to use to replace them.
- If volumes have a dud page that is rescanned subsequently on the following page, we can more than likely live with that outcome.
- Volumes that require pages added or subtracted and also have poor scans
- here I know that vol. 20 is a candidate
- This will require working out, from the respective Index: pages, the pages that we are looking to keep
- Are we straight reinserting a better copy or do we need to construct a version from components
- upload the respective file (and neuter the old file at Commons)
- deleting (identified) pages that are not worth saving (admin task)
- moving (identified) pages worth saving
- Subsequent fixing of any main ns pages that transclude moved files
Think that is all. — billinghurst sDrewth 05:15, 2 July 2010 (UTC)
- I'd like to see the big gap in vol.23 fixed - there is a better scan anyway. Working from the front, vol. 3 has missing pages and would hold us up. Charles Matthews (talk) 20:40, 2 July 2010 (UTC)
- With volume 3, is the text that bad? Moving that many pages is going to be problematic (mega painful). We can extract the two missing pages and upload those separately if it is those two alone. Otherwise would we just be moving the pages that are NOT transcluded and delete the rest? — billinghurst sDrewth 04:57, 3 July 2010 (UTC)
- Sure, transcluding those two pages from userspace is possible right now. My "roadmap" tends to be built up from known "obstructions" to getting various areas done. Charles Matthews (talk) 08:07, 3 July 2010 (UTC)
- Umm, is that a yes or no for a vol. 3 reload? — billinghurst sDrewth 13:24, 3 July 2010 (UTC)
- You can give other things greater priority. Charles Matthews (talk) 09:19, 4 July 2010 (UTC)
- vol. 26 matches your second description, as it requires several 1:1 swaps, while the majority of the remaining djvu pages are blurry. I suspect vol. 27 is in similar condition. Thanks for your help on 23. JamAKiska (talk) 15:47, 3 July 2010 (UTC)
Vol 25
- I did some work on this volume, fixing several pages by transposing them, as this volume has a numbering problem as it is now. This volume is pretty bad and I also had some discussion about it here. I specifically mentioned that this Google image seems to be far superior to the one we have right now. Maybe we can use that instead. Ww2censor (talk) 16:34, 3 July 2010 (UTC)
More on author pages
I'm working through listing the first edition articles on author pages, and should be done with it some time in August. At which point there will be some very long lists around. There have been previous proposals and discussions about the use of subpages. Leaving that aside for the moment - having analytical lists is of interest since it ties up with the "missing article" drive on WP - my current thinking is that for all longer lists we should use a collapsible template on the author page. Such things exist here, e.g. {{British legislation lists}}. Taking "long" to mean 50+, there would be just over 100 to create; if it means 20+ it is more like 160.
Therefore I'm asking the more technically-minded DNBers to look into the syntax issue here. As I understand it, enWS doesn't have a standard off-the-shelf navbox we could use, as WP does. For the author pages, a simple list of article links with {{DNB lkpl}} is most of what is required; we should allow for the possibility of DNB01 and later editions, too, and probably for an alphabetical breakdown for the really long listings.
Thoughts? I'm no expert, but coding up a generic navbox should benefit the site as a whole. Charles Matthews (talk) 07:45, 8 July 2010 (UTC)
- Coding a box shouldn't be a killer, though, not anything that I have done. Let us put aside the number to make, and what it takes to do as I see that as least of a concern, compared to 63 volumes! 100s of authors, etc.
- Let's explore what you/we want to show.
Other bits
- 1 box, or boxes within; thinking possibility of different boxes for different groupings, or different boxes for
- multiple columns or straight; if replicating the subpage tables, are we wanting wrapped columns? I suppose that is "what data are you wanting to show?"
- (thought) in a toggle of a collapsed list, I would think that we would not want the length of the toggled space to be larger than the depth of a screen, allowing people to toggle without scrolling.
- always simpler more likely to be more compliant across browsers
- agree that subpages doesn't really work well for us
- — billinghurst sDrewth 10:45, 9 July 2010 (UTC)
To clarify a bit: my first thoughts were to use just a fairly standard middot format for most of the boxes. So that if it's a sub-50 author such as Author:James Bass Mullinger, you click and then see a list of the article names with middot separation, several to a line. The assumption is that most readers will scan down to the author they want (probably having been brought there by a search), and click. For the authors with longer lists, I think a single block would be less suitable, and there should be alphabetical division as on Author:Sidney Lee. Charles Matthews (talk) 13:04, 19 July 2010 (UTC)
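As a rough illustration of the format described above, here is a minimal Python sketch that builds the wikitext for one author listing: a single middot-separated block for shorter lists, and an alphabetical breakdown under semi-colon headings for longer ones. The function names, the `split_at` threshold and the plain piped links are assumptions made for the example; the project would presumably use {{DNB lkpl}} instead, whose parameters are not shown in this thread.

```python
from itertools import groupby


def link(title):
    # Plain piped wikilink to the DNB00 article (stand-in for the project's link template).
    return "[[%s (DNB00)|%s]]" % (title, title)


def author_listing(titles, split_at=50):
    """Return wikitext for an author's article list.

    Lists shorter than split_at become one middot-separated block;
    longer lists are divided by initial letter under ';X' headings.
    """
    titles = sorted(titles)
    if len(titles) < split_at:
        return " · ".join(link(t) for t in titles)
    lines = []
    for letter, group in groupby(titles, key=lambda t: t[0].upper()):
        lines.append(";" + letter)
        lines.append(" · ".join(link(t) for t in group))
    return "\n".join(lines)


# "Example, Ann" is a made-up title used only for illustration.
print(author_listing(["Palmes, Bryan", "Example, Ann"]))
```

The threshold is a parameter because the cut-off for "long" (20+ or 50+) is still open in the discussion above.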
Grand Program(me) for Infrastructure
I have raised some of this in bits and pieces in previous threads. I now have a target date of September for trying to address some of the remaining big infrastructural issues (i.e. pretty much everything that isn't proofreading or adding links). It seems that tackling what remains to do awaits finishing the listings on author pages. Once that is done, I feel we can move ahead on several fronts:
- (a) Definitive format on author pages;
- (b) Scraping and sorting the author page listings so we have "rough" ToCs for each volume;
- (c) Troubleshooting the "rough" ToCs, which means various things including proper disambiguation checks, catching omissions, and full set of author pages plus (i.e. we'll need to have an Anonymous listing, and a side issue is whether that is in project space or the Author: namespace);
- (d) Definitive manual on article titles, which comes down to one main point (sort out which of the small caps wording gets into the page titles, and which doesn't, with the main burden being medieval names).
I think we'll probably need to spread out some detailed ideas over project pages for all this.
But, first, am I making sense? The overview is that we want to get to the situation where proofreaders can create pages that will automatically be linked in, from ToCs and author pages, and will be able to get the "previous" and "next" from the ToCs, with no quibbles, for the easiest possible experience of article creation. There will have to be a big push to get to this point, and I'd like to think that there is consensus about what we're pushing towards.
Charles Matthews (talk) 12:50, 19 July 2010 (UTC)
Manually transcribed - a way to tag
We still have a number of manually typed or pasted DNB bios, and I am proposing that if we stumble across them that we at least mark with {{migrate to djvu}} as a means to track them and get to converting them. Crude, but it is better than nothing. — billinghurst sDrewth 04:35, 25 August 2010 (UTC)
- OK. Particularly for volume 6, I have in mind to "batchify" the process of conversion, i.e. to apply a version of my current standard method with pre-prepared text. This will be a time-efficient way to do any longer runs of articles. Given that, I would suggest that others working to convert pick off the isolated examples where there is no great efficiency to be had by more industrial methods. Charles Matthews (talk) 10:06, 1 September 2010 (UTC)
New DNB WikiProject on Wikipedia
For information: I have set up w:Wikipedia:WikiProject Dictionary of National Biography, since the time has certainly come when there should be a sister-project, and a definite place for collective discussion of the DNB adaptation effort over on WP. Please come and participate. Charles Matthews (talk) 09:35, 9 September 2010 (UTC)
Straight text biographies
MAINTENANCE TASK (manual) Produced a list of biographies that are text alone in the main namespace listed at Wikisource:WikiProject DNB/non-transcluded. These are those that need to be migrated to the relevant places in the respective volumes. There is also the need for {{DNB00}} improvements through each. — billinghurst sDrewth 00:59, 14 September 2010 (UTC)
- NB that #6 to #333 on the list come from volume 6. As I mentioned above, this volume deserves a more systematic push. Charles Matthews (talk) 08:39, 14 September 2010 (UTC)
#section transclusion
MAINTENANCE TASK (bot/semi-auto) List of files that utilise #LST, Wikisource:WikiProject DNB/section transclusion and need to be converted to <pages>, and probably need |volume = xx added in {{DNB00}} and to be wrapped in <div class="indented-page"></div>. — billinghurst sDrewth 03:39, 14 September 2010 (UTC)
Done — billinghurst sDrewth 13:42, 14 September 2010 (UTC)
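For anyone repeating this kind of conversion, the following is a minimal Python sketch of the text transformation involved, not the bot that was actually run. It assumes the simple case where an article consists of consecutive {{#lst:Page:...|section}} calls on one index file with one section name; the function name `lst_to_pages` is hypothetical, and anything less regular is left for manual handling.

```python
import re

# Matches {{#lst:Page:<index file>/<page number>|<section name>}}
LST_CALL = re.compile(
    r"\{\{#lst:Page:(?P<index>[^|{}]+?)/(?P<page>\d+)\|(?P<section>[^|{}]+?)\}\}"
)


def lst_to_pages(wikitext):
    """Replace a run of #lst calls with one <pages> tag in an indented-page div."""
    matches = list(LST_CALL.finditer(wikitext))
    if not matches:
        return wikitext  # nothing to convert
    indexes = {m.group("index") for m in matches}
    sections = {m.group("section") for m in matches}
    if len(indexes) != 1 or len(sections) != 1:
        # Assumption of this sketch: mixed cases are converted by hand.
        raise ValueError("mixed index files or section names")
    pages = [int(m.group("page")) for m in matches]
    section = sections.pop()
    tag = '<pages index="%s" from=%d to=%d fromsection="%s" tosection="%s" />' % (
        indexes.pop(), min(pages), max(pages), section, section)
    block = '<div class="indented-page">\n' + tag + "\n</div>"
    # Everything from the first call to the last call is replaced by the block.
    return wikitext[:matches[0].start()] + block + wikitext[matches[-1].end():]
```

Adding |volume = xx to {{DNB00}} is a separate step and is not attempted here.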
Updating the Manual
I have gone into Wikisource:WikiProject DNB/Style Manual and updated material on titles, to reflect better where we stand. This does need further work. It has been suggested to me that there should be a section on when and how to add rational redirects. Charles Matthews (talk) 07:42, 16 September 2010 (UTC)
- I would agree about the redirects, though feel that it would be best framed with a discussion about disambiguation, including within a whole of site context. — billinghurst sDrewth 11:48, 16 September 2010 (UTC)
- It's quite a big area, considering that "redirect" also describes the DNB fragments that send you to articles from variant names. The Manual needs some beefing up to deal with all the hypertext issues we are gradually getting to, with volume ToC format, Author page format also now on the agenda as we get a bit more complete. I'll try to spend further time on it, and make the structure more obvious as well. Charles Matthews (talk) 14:21, 16 September 2010 (UTC)
- I think that we are better to have the strategic solution, and look to have a bot fix primarily, and then a second semi-auto run through on the non-obvious targets. Phe's scripting for managing author pages indicates that there is scope for validity checking and proper bot cheating. — billinghurst sDrewth 16:07, 16 September 2010 (UTC)
- I'm going to want to come back to Author pages, and in particular scraping our article names off them, very shortly. The current position is that there are L and M of Author:Thompson Cooper to add, followed at some point and somewhere a list of the anonymous articles. And then all the DNB00 and DNB01 articles are listed (somewhere, if you are forgiving about those on subpages which are caught on the Magnus tool, and whatever omissions I need to apologise for in advance). This does open up new fronts, as I have said before. In particular I thought we could look if the three 1901 Supplement volumes could now be posted, because a testbed of scraping and sorting the entire DNB00 listing would be to sort the DNB01 names somehow and create three volume ToCs for the Supplement with much less pain than in the past. Charles Matthews (talk) 21:05, 16 September 2010 (UTC)
- So now Author:Thompson Cooper is now complete (I think) if messy, at 48K and the longest DNB list at over 1400 biographies. This is the "worst case" for author page format, and I'm going to start a thread on Author talk:Thompson Cooper for those who want to experiment with various format options. Charles Matthews (talk) 08:19, 17 September 2010 (UTC)
Milestone: Vol 2 articles are done
I completed the last few articles in Vol 2. Other editors had already done almost all of the articles, leaving just a few associated with three problematic pages. Two pages, Page:Dictionary of National Biography volume 02.djvu/202 and Page:Dictionary of National Biography volume 02.djvu/458, are true problems, with one or two characters of each line cut off on the right. The other page, Page:Dictionary of National Biography volume 02.djvu/255, was merely a poor OCR requiring manual input.
For the cut-off pages, I interpolated the missing characters. This was trivial except when the missing character was part of a date. For these, I guessed, and added a [?] to the text. Only two articles are affected. My reasoning is that having an article with very slight damage is better than not having an article, but it would be better if someone can use a paper copy to fix this. -Arch dude (talk) 06:27, 26 September 2010 (UTC)
-
- Proofed first two pages and corrected dates. — billinghurst sDrewth 02:29, 27 September 2010 (UTC)
- It's great to have another volume done. We don't currently have a list of volumes "all articles added"; but we should have a page for listing that and other progress, such as completed letters.
- The "issues" bring up a general point, which is how is the validation effort going to be managed? The page status "traffic light" system ought to be the prime tracking tool, of course: it is better to mark pages as "problematic" status blue if there is any doubt about the proofing. There are paper copies around for the finishing of the validation (I know of one participant with access to all 63 original volumes, not a given in that later editions do have updates). But I think we need to be patient and simply work round the blue pages, currently.
- The other point is that volume 3 is now very much on the agenda. There are two missing pages, as logged on Index:Dictionary of National Biography volume 03.djvu. Now bridging the gap is not itself a major problem, in that I have "patched" other such gaps with transclusion from my userspace, and this will do for the time being. That solution doesn't mix well with the "ribbon" monitoring system, and obviously something has to be done eventually, for validation.
- I'm led to remark that we are short of tools that could handle the maintenance issues, which as I understand it require making mash-up djvu files on Commons, and handling here on Wikisource the preservation of the existing pages of text, for selective restoration once a new djvu is uploaded on WS. The pages may well have to go back to newly-numbered places. (Please anyone correct my understanding here: I've not done this work.) I think we really should be asking if such tools can be written, because we have many volumes to sort out, and they could benefit the work of others too. Charles Matthews (talk) 09:39, 26 September 2010 (UTC)
-
- Two ways to view this. 1) Are we trying to make a copy for the web where the text is complete; or 2) are we trying to build complete djvu files for download. If the former, then we can do some level of mix and match, as has been done in Index:Dictionary of National Biography volume 60.djvu where there are additional pages added (see top left) and transcluded directly. This has been in a couple of other places too. If it is the latter, then all the transcription is not relevant, as the text and images only work in the Page: namespace as that is the only place they are pulled together. — billinghurst sDrewth 03:28, 27 September 2010 (UTC)
-
-
- My priority, certainly, is to have all the DNB biographies available as articles at a standard where they can act as references. It looks like we can get there for the first edition, at current rate of progress, in 2012. Then there is the issue of making the DNB into a piece of hypertext. I'd estimate about 50,000 qv links, so that this is substantial (perhaps the main business will take 150,000 to 200,000 edits). Making the text for the djvus complete requires another substantial chunk of work; probably of the same order (a couple of edits to each page). That would be most of it; but for utility we should do the Errata, and also categorise the pages. Categorisation would be a priority for the needs of the sister project on WP. Then some sort of mopping up, but this gets over the horizon, at least for me. Charles Matthews (talk) 11:30, 27 September 2010 (UTC)
-
I spoke too soon. I realized that my previous "milestone" merely turned all TOC entries blue without verifying that all articles are present in the TOC. I just quickly ran through all pages in the volume using the "next page" link. I did not find any missing articles, but I did find and fix a bunch of bad "next page" links. I also converted all(?) old-style articles to transclusion. I also found one article that depended on yet another problematic "cut-off" page, so I interpolated that page also. It would be helpful if someone with access to an alternate source can please proof-read Page:Dictionary of National Biography volume 02.djvu/336. -Arch dude (talk) 21:27, 26 September 2010 (UTC)
- Checked last page against alternate source. — billinghurst sDrewth 02:45, 27 September 2010 (UTC)
-
- Wikisource:WikiProject DNB/Completions is now the place to flag up volumes and letters for which all the biographies have been created. Charles Matthews (talk) 08:20, 1 October 2010 (UTC)
Volume 3
Have completed review of links on TOC page using online 1885 edition and adjusted lateral links as required. There were a few extras that I removed as I could not find them in the 1885 edition. Should finish page checking the volume in the next couple of days (currently up to page 387). Am replacing text from alternate source which should accelerate the editing process, currently through djvu page 26 (will try to stay about 15 pages in front of current editing). Have replaced text on the two missing pages and included headers to facilitate editing from alternate source (so text is available for every page in this volume through page 387). Have also replaced text on all problematic pages and recategorised page 308 as not proofread. The two missing pages currently contain duplicates of text pages 126 and 127 which should facilitate a swap when these pages become available. While the text replacement is not optimum, it will help until good quality djvu pages become available for the entire volume. It would be beneficial to come to closure on qv links if that has not already been accomplished. JamAKiska (talk) 13:20, 29 September 2010 (UTC)
Volume 3 page review is complete. Found only one more page that needs replacement. I added text to transform existing characters into something more easily recognized. JamAKiska (talk) 12:45, 30 September 2010 (UTC)
- Thanks! What do you mean by "online 1885 edition," and "alternate source?" Please provide links so we can maintain provenance. To the extent that you are using an outside source to repair or help interpret a somewhat poor scan, there is no problem, since the scan should still be usable as the source of record. However, if our existing scan is actually illegible and you are replacing the text from another source, then I think we need some indication of where the material came from. The idea here is that it should be possible for some third party to validate your source: this is equivalent to the "verifiability" requirement at Wikipedia. Of course, we have already asserted that these pages are from the original edition, so a third party is free to go to a library that has the originals, but I feel that it is better to provide a web-accessible source if at all possible. Forgive me if I am being too picky. -Arch dude (talk) 15:18, 29 September 2010 (UTC)
- Am trying to support without intruding...The documents found at Wikisource:WikiProject DNB/Progress are equivalent to other DNB volumes available via the internet. Through trial and a few edit revisions and working with feedback from Charles, discovered it was best to verify the provided DNB text was from an 1885-1900 edition. These "on-line" resources have been scanned in a variety of locations around the globe; those that I reference are being made available through US libraries collaborating with Google in their ongoing effort to mainstream access. The authenticity can be verified using the side-by-side review process with the existing djvu pages. The vagueness of my language was to help distinguish this material from some of the existing fuzzy djvu scans. The volume 3 link on the provided page should help you authenticate the missing pages.JamAKiska (talk) 16:36, 29 September 2010 (UTC)
- w:User:Charles Matthews/DNB scans for the complete list (probably) of the scans as they appear on archive.org. The "Progress" page links to the so-called "best" scan, but it may not be the best for any given page. There probably are many more Google scans than have reached archive.org: I have tried to post as many Google Books "keys" as possible (at User:Charles Matthews/DNB referencing data) for scans of the DNB of many editions that are relevant; but there are two or sometimes three Google scans of the 1885-1900 books. It's actually a fantastically complicated picture. Here in the UK I cannot actually read any of the Google Books postings, one reason why I somewhat obsessively want to replace Google Books DNB links on WP by links to our own versions. In short, Google has raw material above and beyond what is easily apparent to us. Charles Matthews (talk) 20:48, 29 September 2010 (UTC)
- This is not a complaint. I am in awe of the collective progress being made as different individuals take the initiative to attack our various problems in different ways. If everyone just keeps doing whatever seems to be useful, we will continue to make progress. There is no particular reason for anyone else to cater to my obsessions about provenance unless you just want to. Let me summarize the situation as Charles outlined it:
- Scans of many volumes from many different sets of volumes, by Google and others, are available on the web.
- Many of these are in fact the correct edition of the DNB 1885-1900.
- Wikisource editors/contributors/project people/whoever (thanks!) have uploaded the "best" instance of each of the 63 volumes to Commons as .djvu files. Each upload generally has a comment with a link back to the source web site.
- Some of these volumes have subsequently been replaced with better copies, with or without updating the backlink to the new source.
- The "best" volume is not necessarily the best for any particular page.
- Some page-by-page repair work has been done or is being contemplated: this would involve downloading the whole volume, using tools to replace image pages, and uploading the result. There (is or is not?) a system for logging the changes made by this method, including which pages came from which alternate sources.
- Other repair work is being done by modifying the text pages using the alternate sources in conjunction with the normal "pageview" tools. There (is, is not) a system for logging these changes including which alternate sources were used for each page.
- Charles has some information about the locations for alternate sources for each volume.
- Is this a valid summary? -Arch dude (talk) 22:30, 29 September 2010 (UTC)
Not quite it. The initial choices of djvu upload were before my time, but even with charity those weren't always best. The "logging" issue comes in two parts. I think Billinghurst once said that we don't really care about the provenance of text from which we proofread. There are multiple sources at archive.org of the correct edition, and there are sources for later editions. It's all grist to our mill: text layer or external source. We do it patchwork. Changes to the djvu sequence have been more disciplined, and are much more of an effort. It's an admin task. I have not participated. I believe changes have been logged to the Progress page. Charles Matthews (talk) 09:53, 30 September 2010 (UTC)
Way forward on categories?
I was working on Category:DNB No WP, which is much improved by the bug fix (thanks ThomasV); and I've added a LargeCat ToC (thanks Pathoschild). I found Palmes, Bryan (DNB00), a typical instance of an article for WP usage (twice an MP); how should I categorise it here, though? Category:British politicians is full of author pages, and the same is apparently true for related categories. I don't want to start a big drive to categorise if there is going to be resistance. Am I BOLD? Do I decide that Wikisource:Categories has to be created at last? Or do I start another round of the "topics" discussion at the Scriptorium? I'm certainly not going to start a corner of the category system just for the DNB, having argued in general terms against such things in the past. Charles Matthews (talk) 08:23, 29 September 2010 (UTC)
- Why wouldn't we point at Category:Categories, or do you mean something else by the creation of the category? To your former comment, is it that you think that the author ns pages and the main ns articles should not be in the same category? What is it that we are looking to try to separate? As to the doing, I don't think that it would need any further conversation, as, looking at the history of WS, I would think that the discussion has already been had. — billinghurst sDrewth 11:41, 29 September 2010 (UTC)
- Yes, one issue is the tie to namespaces, or at least in the oblique form that "category X contains in practice pages of a certain type". There is more than one way to set up a category system. The "German" or deWP system relies much more than what enWP does on being able to intersect categories. I don't like some of that in detail, but here I think it would help clarify things. It would get us away from having to think about a biography as either by nature a mainspace text or something to be classified via its topic, a person. Category:British politicians is something one might want to search in various ways (also an author page, also a DNB page, also nineteenth century). Charles Matthews (talk) 17:59, 29 September 2010 (UTC)
- [mounting his hobby-horse] We have this well covered I think. We already have a subjective, topical, intersecting, method of categorisation, possibly the most elaborate ever devised, it certainly has the greatest number of enthusiastic workers and users. I'm not exaggerating here, I'll give another clue: the DNB project is already populating these categories. Any DNB article, or other biographical text, will meet the 'within two-clicks' criteria* for a huge number of topics and possible paths of enquiry.
*Wise words that are achievable ideal, with a bit of thought, and never far from my thoughts since I read them here. Here are someone else's, if a method of indexing is redundant it is only likely to cause confusion. cygnis insignis 14:40, 30 September 2010 (UTC)
-
- If you are implying what I think you are, then you advocate the chicken and I the egg (if that is not disrespectful of a bird of your distinction). If we are to rely on WP categorisation, then articles have to be created over there, the process I'd like to promote. So is it egg first or chicken? Charles Matthews (talk) 21:20, 30 September 2010 (UTC)
Using intersects
What are we trying to achieve/present/differentiate? — billinghurst sDrewth 12:56, 1 October 2010 (UTC)
- I'm trying to develop the point of view that WS does need big categories such as Category:British politicians. A category such as w:Category:Members of the pre-1707 Parliament of England is interesting because it encodes expert knowledge (something major happened in 1707), and you would not typically do as well with intersecting with birth dates (say). That is the snag with the intersecting approach: expert knowledge does not reside in the category system. Bad for zoology (say). For WS we should be able to fix that by saying "a portal on MPs from a given period would be fine: we don't need to import enWP's category system, but we'll develop another way." This is now where I'd like to head. Charles Matthews (talk) 15:42, 1 October 2010 (UTC)
Lead required
Hello, I've just stumbled upon the project page and my thoughts were that it needs a lead at the top of the project page outlining the basics of the project. One or two sentences is all that's needed. It should cover such basics as the country (Dictionary of National Biography of New Zealand? England? USA?) and the aim of the project. Just a suggestion :) Schwede66 (talk) 23:41, 30 September 2010 (UTC)
- I've added text to the header, which has a field to describe the project. Charles Matthews (talk) 07:08, 1 October 2010 (UTC)
I added a new lead, because the text in the header is not very prominent. -Arch dude (talk) 17:29, 1 October 2010 (UTC)
Master lists
My ideas on these are now set out on a project page: Wikisource:WikiProject DNB/Master lists. It is now a matter of greater urgency, given the starting of Dictionary of National Biography, 1901 supplement, to get the DNB01 articles listed. So that that subproject will start with good ToCs, I mean. So I'm putting in time on scraping the names off the author pages: no need for others to duplicate that. Charles Matthews (talk) 09:49, 5 October 2010 (UTC)
Format on author pages
Further to the welcome appearance of Dictionary of National Biography, 1901 supplement, we need to talk about some format issues:
- If the DNB01 links are routinely piped as the DNB00 links are, then we presumably should be flagging the edition in author page listings. One way would be to use as standard a semi-colon heading within the DNB section (thus not creating a subsection).
- We should have a template to do the piping, because at the very least it makes list handling much simpler. Would it be OK to call this {{DNB01 lkpl}}? Also we'd need {{DNB01 link}} as the version displaying the full reference.
Charles Matthews (talk) 07:23, 7 October 2010 (UTC)
- The template issue is now handled. And there are volume ToCs up for the supplement: things move on apace. The tracking category for DNB01 text on WP is w:Category:Articles incorporating DNB01 text without Wikisource reference, which currently has just ten articles to create. Charles Matthews (talk) 07:09, 9 October 2010 (UTC)
- Volume II articles of 1901 Supplement are complete. Eight articles to go... Volume III text layer missing, Volume I awaiting upload.JamAKiska (talk) 12:54, 19 October 2010 (UTC)
-
- With regard to templates, I would rather have fewer templates with more options than more templates. For example, have {{DNB lkpl}} rather than {{DNB00 lkpl}}. Alternatively, if people do not like that idea, I would prefer to create an underlying template that picks up the options. I would favour more/specific templates where there is value in having the templates, e.g. how many works utilise DNB01...? What value can people see in having specific templates, what extra information do we want or what do we want to know about things that are specifically about the individual volumes? — billinghurst sDrewth 09:31, 20 October 2010 (UTC)
An obvious remark
Dates later than 1900 should not appear in DNB00 articles. I'm onto this as a way of checking that should be run. It's particularly relevant to the way I work, but apparently not solely my problem. Very often updates in later editions are adding references appearing in the early twentieth century. Charles Matthews (talk) 10:16, 9 October 2010 (UTC)
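A check of this kind could be sketched in Python roughly as follows. This is only an illustration that assumes the article text is already available as a plain string; the function name `years_after` is hypothetical, and any four-digit number in range is treated as a year, so page numbers and similar figures will produce false positives that need a human look.

```python
import re

# Four-digit numbers from 1000 to 2099, taken as candidate years.
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def years_after(text, cutoff=1900):
    """Return the sorted, distinct candidate years in text that are later than cutoff."""
    return sorted({int(y) for y in YEAR.findall(text) if int(y) > cutoff})


sample = "born 1820; see Athenaeum, 1899; Times, 3 Jan. 1904"
print(years_after(sample))
```

An article for which the result is non-empty is a candidate for having been proofread against a later edition, which is the situation described above.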
simplified 'LST'
As this project uses LST, the labelled sections in the Page: namespace, a great deal, I thought I would bring attention to the changes introduced and discussed at Wikisource:Scriptorium#Easy_LST. cygnis insignis 21:37, 19 October 2010 (UTC)
- Thanks, I'm trying to test this as it affects my normal way of working. The idea is that if a section runs over multiple pages, then only the starting and closing section marks are then needed? Didn't work for me first time I tried that. Charles Matthews (talk) 11:40, 20 October 2010 (UTC)
-
- hmm no, that’s not the idea; the internal syntax is unchanged. The script only does preprocessing.
- I fixed Page:Dictionary of National Biography volume 56.djvu/148 because the end mark was missing. Note that this cannot happen when you use the script.
- To use it, reload your javascript. You’ll see that section tags are replaced with ## titles ## during editing.
- ThomasV (talk) 11:44, 20 October 2010 (UTC)
-
-
- Well, I'm going to have to try to understand this. Page:Dictionary of National Biography volume 56.djvu/148 has this diff you made: [12]. You added a section end. But the change in wikitext, for me, is nothing there. Charles Matthews (talk) 12:04, 20 October 2010 (UTC)
- There is a spooky effect caused by what was termed a pseudo labelling. Thomas outlined how defining the end is not needed in our practices. In theory the code is synonymous, it adds the same thing, but only the start of section code 'appears' to the user (as ## section title ##). The section end is 'there', but does not appear, the end is implied by the start of a new section. Does that help conceptualise what is going on? cygnis insignis 12:16, 20 October 2010 (UTC)
- You can continue as you were, and there is an opt out if you don't want that display. cygnis insignis 12:19, 20 October 2010 (UTC)
- I'm going to have to think hard to understand how robust this now is. I can't guarantee not to make mistakes in markup: I certainly make mistakes 10% of the time. My first reaction is that this upgrade is not helpful to me personally. Charles Matthews (talk) 12:38, 20 October 2010 (UTC)
- By the way, if the opt-out is anything to do with the Vector skin (perhaps implied by what is written at the Scriptorium about this), I don't use it and have no intention of doing so (it breaks my way of creating articles with two windows side-by-side, amongst other things). Charles Matthews (talk) 12:47, 20 October 2010 (UTC)
- I will add a gadget to simplify opt-out; currently you need to change your javascript manually.
- However, this upgrade should be useful to you personally, because it guarantees that you cannot make mistakes such as unbalanced or misnamed tags, such as the page that I fixed.
- That was a test of mine, in fact, not an error, and you made the change while I was creating an article to check what happened. A typical mistake I make, in adding text from a long strip in a text editor, is to copy the wrong beginning or end section up or down (I have the correct names at the beginning and ends of sections, but the page break may come in the middle, and so I need to copy). With the new system I suppose I couldn't see if I have copied down the wrong end section tag by mistake. But that is why I said I needed to understand better. Charles Matthews (talk) 13:12, 20 October 2010 (UTC)
- I'd like to understand why it breaks your way of creating articles. Could you clarify this point?
- ThomasV (talk) 12:52, 20 October 2010 (UTC)
- That's purely about dimensions and tabs when I have two windows open: something I found inconvenient when I last tried it. Charles Matthews (talk) 13:12, 20 October 2010 (UTC)
- I understand the upgrade better now: thank you for the time you have put in. A way of switching back to the view without the preprocessing might prove helpful in troubleshooting. Charles Matthews (talk) 07:51, 21 October 2010 (UTC)
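To make the implied-end behaviour discussed above concrete, here is a minimal sketch in Python of that kind of preprocessing. It is not the actual script: the function name is hypothetical, and the exact placement of the generated tags is an assumption. It only shows how a `## name ##` line can stand for a section start, with the end tag supplied automatically when the next section starts or the page ends.

```python
import re

# A line consisting only of "## name ##" marks the start of a section.
HEADING = re.compile(r"^## (.+?) ##$")

def expand_sections(text):
    """Turn '## name ##' lines into paired section begin/end tags."""
    out = []
    current = None  # name of the section that is currently open
    for line in text.split("\n"):
        m = HEADING.match(line)
        if m:
            if current is not None:
                # The start of a new section implies the end of the previous one.
                out.append('<section end="%s"/>' % current)
            current = m.group(1)
            out.append('<section begin="%s"/>' % current)
        else:
            out.append(line)
    if current is not None:
        # The last open section is closed at the end of the page.
        out.append('<section end="%s"/>' % current)
    return "\n".join(out)

print(expand_sections("## A ##\nfoo\n## B ##\nbar"))
```

Because every end tag is generated from the matching start, an unbalanced or misnamed end tag cannot be produced this way, which is the guarantee described in the thread.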
[edit] Matching tool for Category:DNB No WP
This is another Magnus Manske tool: see here for initial Q and edit the browser line to put in any letter to replace Q. The run for Q is mercifully short, and it identifies one hit, for Quarles, John (DNB00), as an existing Wikipedia article. I'm leaving that for demonstration purposes, therefore. For other letters, leave the tool to itself and it will run through DNB biographies not yet matched with WP articles for that letter. As Magnus comments, slow but thorough. Charles Matthews (talk) 06:23, 24 October 2010 (UTC)
- I have added it to the Category page for each letter. — billinghurst sDrewth 11:07, 24 October 2010 (UTC)
Seems to be remarkably useful. Also somewhat touchy, though, as the toolserver can be. Smith names are so common it can take a long time on one. It should work better once the backlogs on common letters are cut down. Charles Matthews (talk) 12:09, 24 October 2010 (UTC)
Upgraded: you can now use two initial letters, to narrow down for the longer listings. Charles Matthews (talk) 11:34, 27 October 2010 (UTC)
Further wizardry: tool has been adapted to the Catholic Encyclopedia articles. See Wikisource talk:WikiProject Catholic Encyclopedia Upgrade. Charles Matthews (talk) 08:11, 29 October 2010 (UTC)
[edit] DNB redirects: categories and accounting
I mean the DNB's own redirects such as More, Roger (DNB00). It would probably be better if they were not in Category:DNB biographies but in a category of their own. And they show up in counting articles, I believe. They can and should be linked to WP: why not? Charles Matthews (talk) 09:58, 25 October 2010 (UTC)
- Agreed: they need to be handled differently. If I recall correctly, we originally decided to not include them at all, except as inline data in the TOC. However, I now believe that we need them as separate tiny little articles, just as in the example, merely as a matter of consistency, and also to enable us to write an automated "coverage" tool for pagespace. I propose that we add a "redirect" parameter to template:DNB00. This would take the page out of that category and the "no author" category. -Arch dude (talk) 15:19, 25 October 2010 (UTC)
I wouldn't say we "need" them now, but they are going to be a part of making the hypertext version. Charles Matthews (talk) 21:34, 25 October 2010 (UTC)
- Thoughts
- I have been manually adding Category:DNB See as per a very early discussion. It is not a major issue to amend {{DNB00}} to allow for an extra parameter to choose between adding Category:DNB biographies and Category:DNB See, nor to convert the pages that use See at this point in time. The tricky bit will be to collect those that have been missed, as we would need to be looking for a (DNB00) biography that is not directly linked from one of the 63 main ns ToC pages. Not sure that we got a lot of benefit, beyond article separation, by adding the parameter, as people will need to know what it means, how to use it on the rare occasions, and they could just as easily manually add. <shrug> Note that it will NOT affect the no contributor and I would have to think about whether we can tie the two parameter tests together functionally.
- I haven't traditionally listed the Wikipedia = field on the page, though don't see that it matters particularly either way.
- While I don't wikilink to these referring/pointer pages ("redirect" has too many local connotations) from our ToC pages, when we changed to transclusion, I ended up doing them and have them in the general prev/next run. — billinghurst sDrewth 03:29, 26 October 2010 (UTC)
- Having a think, we might be able to do something with the contributor = field. Something along the lines of WHERE CONTRIBUTOR = see (or whichever unique keyword) that it puts the DNB See category, this would both move it out of the NO CONTRIBUTOR category, and also put them into the referral category. No impact upon the DNB biographies category. — billinghurst sDrewth 11:29, 27 October 2010 (UTC)
I'll comment that we are overdue a manual of style for the volume ToCs (what to include, format). I don't create the referring pages, nor do I use them as "previous" or "next"; but the interpolation of such pages in sequence as they are created between articles that have already been created can go on in the background. Charles Matthews (talk) 06:46, 26 October 2010 (UTC)
- I added the "redirect" parameter to Template:DNB00. If this parameter is set, the article is added to Category:DNB redirects instead of Category:DNB biographies, and it is not added to most content-oriented maintenance categories. -Arch dude (talk) 17:51, 31 October 2010 (UTC)
- I have undone the change. As I mentioned above, I see scope for misunderstanding in using the term "redirect", which has a common meaning in Wikimedia parlance, and I am not comfortable with that potential for confusion. Also, as I also mentioned above, I see a better means to manage the direction aspect and the contributor aspect in the one hit. I also think that it is premature to make the decision about DNB biographies category and to remove the SEE articles from that category without a better understanding of what was the purpose and how we look to manage the corpus of the articles. — billinghurst sDrewth 22:54, 31 October 2010 (UTC)
- To continue the discussion. Category:DNB biographies was created to house all articles relating to DNB, not just those that were the articles themselves, noting that as the articles are not subpages that this is the only ready means to produce the articles. If we are going to split out the referring articles then do we need to maintain a complete list, or to have a complete list available somewhere, and how do you propose to have the category hierarchy? Or does having the categories just duplicate what is available from the main ns and the category itself with the dual listing is redundant? — billinghurst sDrewth 08:55, 1 November 2010 (UTC)
- Sorry, I did not realize that we had not reached consensus. I am primarily concerned with the fact that the "see" articles are contaminating the maintenance categories, and my modification solves this problem. I suggest that we address your two points as follows:
- modify the parameter name from "redirect" to "xref." I do not like the word "see" as a parameter name as I think it is confusing. More generally, parameter names should be nouns, not verbs.
- modify the category to be "DNB cross-references."
- make both "DNB biographies" and "DNB cross-references" subcategories of "DNB articles." We can also have a subcategory for other DNB mainspace pages such as the TOCs.
- Thanks. -Arch dude (talk) 13:28, 1 November 2010 (UTC)
- So within Category:DNB, Category:DNB biographies actually stands for "DNB content" or "DNB texts"? In due course, we'd also want to include the index pages in the volumes, and there's a memoir in vol.63 also. So one approach would be to rename that top holding category for actual DNB text content, and have various subcategories within it? Charles Matthews (talk) 13:32, 1 November 2010 (UTC)
- Charles, you seem to be reacting to Billinghurst, not to me. My proposal is to create the category "DNB articles" as a subcategory of "DNB." "DNB articles" would be the supercategory of a set of subcategories that will include all mainspace DNB articles, and each mainspace DNB article will be in exactly one of these subcategories, although any article may be in other categories in different hierarchies. The initial two subcategories will be "DNB biographies" and "DNB cross-references." One-off articles such as the memoir in vol.63 and the odd little article at the end of vol. 01 may be placed directly in "DNB articles." -Arch dude (talk) 17:55, 1 November 2010 (UTC)
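The "exactly one subcategory" rule in this proposal can be sketched as a small function. This is only an illustration in Python: the category names are the ones proposed in the thread and were not settled, and the `is_xref` and `one_off` flags are hypothetical stand-ins for whatever template parameter is eventually agreed.

```python
# Category names as proposed in the discussion; none of this is settled.
PARENT = "DNB articles"
SUBCATEGORIES = ("DNB biographies", "DNB cross-references")

def dnb_category(is_xref=False, one_off=False):
    """Return the single category within the 'DNB articles' tree for a page.

    is_xref -- hypothetical flag: the page is a "see" cross-reference
    one_off -- hypothetical flag: a page such as the memoir in vol. 63
    """
    if one_off:
        # One-off pages sit directly in the parent category.
        return PARENT
    return SUBCATEGORIES[1] if is_xref else SUBCATEGORIES[0]

print(dnb_category())              # ordinary biography
print(dnb_category(is_xref=True))  # cross-reference page
```

Returning a single value, rather than a list, mirrors the rule that each mainspace DNB article belongs to exactly one category in this hierarchy, while leaving it free to sit in categories from other hierarchies.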
[edit] archived posts
I have archived a slab of posts, though the page is still big. If you think that a section above is complete and can be archived, then please mark it as such and I will get back to copy and paste them to the archive. Of course feel free to do it yourself. — billinghurst sDrewth 08:49, 1 November 2010 (UTC)
- Yes, when posts from two years ago are still deemed relevant, it feels like time for some FAQ-like material. Charles Matthews (talk) 13:35, 1 November 2010 (UTC)
[edit] Five figure milestone
Wikisource:WikiProject DNB/Statistics#Stats 1 November 2010: the project passed 10,000 articles early on Friday (UTC). Charles Matthews (talk) 21:20, 1 November 2010 (UTC)
- Congratulations, Charles! (Well, Charles, 95+% and the rest of us, 5-%.) Since there are <30K articles, we (i.e., Charles) are more than one-third done with the most fundamental part of the project. -Arch dude (talk) 22:40, 1 November 2010 (UTC)
- ADude, there are some other great performers in that space; so while CM is just like a master machine, we have some great apprentice machines there too, such as JamAKiska, who need us to dip our lids to them. — billinghurst sDrewth 03:31, 2 November 2010 (UTC)
It's a coming of age, certainly: no cake, but a new project page at Wikisource:WikiProject DNB/FAQ on the way. In numbers, I do about two-thirds of additions, but I'm sure I put in less than half the hours worked on the project. To underline that, I know that whatever the notional article #10000 was, it wasn't one of mine. Charles Matthews (talk) 08:05, 2 November 2010 (UTC)
Congratulations…on hitting the critical mass, and thanks for guiding the trek! JamAKiska (talk) 11:19, 2 November 2010 (UTC)
[edit] Change to coding across WS, modified template
I have found that much of the <div> formatting is now redundant on transcluded pages; accordingly I have removed it from its application within {{DNBset}}. At some point I will get a bot to run through and tidy the DNB biographies to remove that redundancy. — billinghurst sDrewth 11:45, 4 November 2010 (UTC)
[edit] Page numbering for volume 33
Can someone please "fix" the numbering for volume 33? For example, in Livingstone, David (DNB00) the first page link says 390 and links to Page:Dictionary of National Biography volume 33.djvu/390, but it is 384 in the physical volume. I would do it myself but I do not know how. So if it can be done please explain it here or please provide a link to a page with an explanation. -- Philip Baird Shearer (talk) 05:14, 6 November 2010 (UTC)
Renumbered the pages of volume 33 to align page one of text with that number. The preamble pages have been renamed, the final two using roman numerals that align with the original text. This statement can be found in the "pages" section while editing the index to this volume. JamAKiska (talk) 12:30, 6 November 2010 (UTC)
- Thanks. Volume 16 seems to have a similar problem. -- Philip Baird Shearer (talk) 01:51, 9 November 2010 (UTC)
That should do it…let me know if you see any others… JamAKiska (talk) 02:31, 9 November 2010 (UTC)
- Volume 6 seems to have a similar problem: page 125 links to 113. Is there any documentation on what you do to solve this? If not could you give an explanation here so that if I find any more I can fix them myself rather than having to trouble others by asking for it to be done. -- Philip Baird Shearer (talk) 22:19, 16 February 2011 (UTC)
- I’ll give it a go…
- When the djvu file images are transformed into an index page (Wikisource namespace Index), the index page reflects the total number of images starting from the first page of the file (can be the front cover), image 1 and also djvu 1, and continues to the last image. As most texts have pages prior to page 1 (prefatory notes and the like) it is possible to align the image for page one with the page: ns (where the edited text is stored). Using your example V. 6 p.113, the image is of page 113 and yet it is the 125th image on the index page. The difference between these numerical values is referred to as the offset, and corresponds to the number of images (pages) prior to the first page of text. If you edit the Index Vol. VI you will observe on the left hand margin a box titled "pages:" that contains a statement <Pagelist…> which allows a renaming (realignment) of the page: ns. In volume 6, page: ns 13 was reset to align with the image for page one of volume 6. Prior to the upgrade yesterday, this statement functioned properly. Since the upgrade, this statement no longer functions as intended, and all pages reflect the djvu image number…which can be a little confusing when trying to locate pages using the index page…as you noticed…when the bugs get worked out of the upgrade this statement should function as intended… JamAKiska (talk) 12:45, 17 February 2011 (UTC)
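The offset arithmetic described above can be written out as a small sketch in Python. The function names are illustrative only, and it assumes a single constant offset for the whole volume, which would not hold where a scan has missing or duplicated pages.

```python
def page_offset(image_number, printed_page):
    """Number of scan images that come before printed page 1."""
    return image_number - printed_page

def image_for_page(printed_page, offset):
    """Djvu image number that shows a given printed page."""
    return printed_page + offset

def page_for_image(image_number, offset):
    """Printed page shown by an image, or None for front-matter images."""
    page = image_number - offset
    return page if page >= 1 else None

# Volume 6 example from the thread: printed page 113 is the 125th image.
offset = page_offset(125, 113)
print(offset)                     # images before page 1
print(image_for_page(1, offset))  # image that carries printed page 1
```

With the volume 6 figures the offset is 12, so printed page 1 falls on image 13, which agrees with the realignment described in the explanation.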
[edit] ODNB ids
It is going to turn out to be useful to have the identifiers on the ODNB site http://www.oxforddnb.com/ recorded somewhere, for each of our DNB articles. My question is, where and in what form? I suppose there might be objections (non-free) to including this information with the articles? For the WP end of this project, though, we want to get on top of matching articles to ids as well as to biographies here (it seems that Dsp13 has already done plenty in this direction). We are getting into a triangular situation, then, and this is likely to be reinforced by development of w:Template:ODNBweb. What is the right way to go? Charles Matthews (talk) 15:57, 10 November 2010 (UTC)
- Personal opinion is to include
ODNB = xxxxxas a parameter in the DNB00 header at this point in time. It will do nothing, but it will break nothing. If/when we decide to do something, and how we decide to do something with this data, then we can address that and know that we have prepopulated articles to do so. — billinghurst sDrewth 07:57, 11 November 2010 (UTC)
Flexibility in the plan, I like it ! JamAKiska (talk) 11:32, 11 November 2010 (UTC)
- Just to report that w:Template:ODNBweb has been partially upgraded now. This is really a WP issue, naturally; but with a further upgrade it could populate a maintenance category (DNB articles needed to provide a free alternative). The actual business of integrating the existing w:Template:DNBfirst into w:Template:ODNBweb in optimal fashion has been mooted at w:WT:WP DNB. Exactly how to do all this is still up for grabs. Charles Matthews (talk) 07:06, 29 November 2010 (UTC)
[edit] Vol. 11 replaced with good version
I have replaced volume 11 with the identified "better version". I have deleted those NOT PROOFREAD (red) pages from the scan where it did not look as though there had been text transcluded from them. I did that by looking at what was proofread and checking pages around the edges. If you find pages deleted that should not have been, they will need to be recovered. Talk to anyone with admin rights, though primarily Charles or myself among the DNBizens.
- Index:Dictionary of National Biography volume 11.djvu
- Dictionary of National Biography, 1885-1900/Vol 11 Clater - Condell
I have found that this introduced version from UofT does not have the listing of biographies at the rear, so we may have to add that somewhere else. Will need to mark that as a TO DO task. — billinghurst sDrewth 05:57, 29 November 2010 (UTC)
[edit] An Index page absence
After some searching, I couldn't find the page like Index:Dictionary of National Biography. Sup. Vol II (1901).djvu but for Vol I. All a bit first-day-at-school. Any clues? Charles Matthews (talk) 19:35, 8 December 2010 (UTC)
The Status, found on Talk:Dictionary of National Biography, 1901 supplement, reflects the present situation. Only volume II is available to proofread at this time. Vol. III is in the "needing OCR" group, as the images look letter perfect but I am still not getting any text. Vol. I has neither images nor text. JamAKiska (talk) 23:51, 8 December 2010 (UTC)
Supp. Volume III index page is available for proofreading. Both sources for [Index:Dictionary of National Biography. Sup. Vol I (1901).djvu] had missing pages or unreadable & blurry text. Created a composite pdf document of page images from these on-line sources and attempted upload at IA this morning. Did not see the file in the processing queue after upload or afternoon reload. JamAKiska (talk) 22:07, 28 December 2010 (UTC)
Supp. Volume I index page is available for proofreading. JamAKiska (talk) 03:08, 30 December 2010 (UTC)
[edit] Site to reOCR a page
I have found the website http://www.free-ocr.com/ very useful to reOCR a page, especially where the centre line seems to have been ignored in our scans.
- To use Wikisource image, click into edit mode and save the image (right click, save as, etc.); alternatively
- To use archive.org online version, find image, zoom in to 100% and right click and save as ...
Then just need to upload the image to the site above. Note that the image needs to be <2MB, and pages of our text tend to be about 1MB. — billinghurst sDrewth 22:19, 17 December 2010 (UTC)
[edit] Volume 44
Volume 44 index page is available for proofreading after switching to the alternative source. All 14 of the problematic pages have been identified and marked accordingly. The djvu pages that correspond to text pages 284-291 need adjusting, as this source has duplicates of pages 282-3 in the first two pages of this sequence and no images for text found on pages 290-291. The text for pages 284-289 are offset from their respective images by two frames (hence the +2 ID on label) If someone is familiar with removing and replacing a small section of the djvu file, that would be preferred to downloading and building up a fresh image file. JamAKiska (talk) 22:33, 28 December 2010 (UTC)
- All volume 44 images and text are now aligned. Pages 290-1 have images, leaving 12 problematic text pages in this volume. JamAKiska (talk) 22:36, 29 December 2010 (UTC)
[edit] Vol. 6 all text transferred
From the many pages that were manually transcribed in the main namespace, I have finally completed the transfer of these pages to the Page: namespace from Index:Dictionary of National Biography volume 06.djvu. A big sea of orange, some splashes of green, and some pockets of redlinks. George Burgess (talk • contribs) has been following through picking off the red pockets. Now that is out of the way, I can look at some of the other problems that people have left me through DNB. %-) — billinghurst sDrewth 15:02, 29 December 2010 (UTC)
- Transfer now complete through volume 11, my load was much lighter:^). JamAKiska (talk) 17:29, 31 December 2010 (UTC)
[edit] Volume 07 djvu file replace.
Replaced volume Index:Dictionary of National Biography volume 07.djvu from 2008 IA source. Will need to append index pages as they come available. With some care, the previous images remained in place for some of them. JamAKiska (talk) 17:31, 31 December 2010 (UTC)
[edit] 1901 Supplement.
All three volumes of 1901 Supplement are available for proofreading. All of the articles previously forwarded have been migrated. JamAKiska (talk) 17:38, 31 December 2010 (UTC)
[edit] Author page headers
I have boldly made DNB into a disambiguation page, to reflect the presence of DNB01. Now we should agree on Manual handling of the headers for the DNB sections on author pages. I think where there is more than one edition (typically DNB00 and DNB01 articles) the header should be "Contributions to the DNB", with a subsection as a semi-colon header for the DNB01 articles (could be an actual subsection, but in any case there should be an agreed way to help future automation). Where there is just one edition it can be like "Contributions to the Dictionary of National Biography, 1885-1900", or with the Supplement link. In any case we should move to clear up past tentative efforts to get the author pages under control. Charles Matthews (talk) 07:06, 3 January 2011 (UTC)
- To me they are all DNB and I haven't seen a need to differentiate further. I suppose that means that I am more interested in making sure that they are listed, and no particular opinion about drill downs. — billinghurst sDrewth 13:55, 14 January 2011 (UTC)
[edit] Volume 6
The Index page shows this volume is close to complete. Around a dozen articles need to be created from proofread text now; and there is just one page marked "problematic" that actually needs attention. Charles Matthews (talk) 10:12, 14 January 2011 (UTC)
- OCR'd text and inserted, back to "not proofread". Others done. — billinghurst sDrewth 13:50, 14 January 2011 (UTC)
I have created the remaining articles. Charles Matthews (talk) 20:03, 20 January 2011 (UTC)
[edit] Volume 24 has been updated to best available scan
All pages moved, page transclusions updated, buffed and polished and paint black on the tyres. — billinghurst sDrewth 16:39, 19 January 2011 (UTC)
[edit] George Smith Memoir
Expect to have the memoir proofread later this week. My intention is to transclude this memoir, as is, onto the page created from the link found here. This memoir, 39 pages in length, is sub-divided into nine parts using roman numerals. JamAKiska (talk) 13:44, 25 January 2011 (UTC)
[edit] Vol. 1 - transcluded the peripheral pages to end
To the end of Dictionary of National Biography, 1885-1900/Vol 1 Abbadie - Anne I have transcluded the lead pages to the work, including the contributors, and added a link to the anchor within the header. For the trailing pages [dinner for George Smith (1894)], I have included a couple of the pages, though I think that they may have been bound in at a later time, and wonder whether they are from another volume. Those pages are incomplete anyway, and we need to work out whether to cull them or what. Note that we can probably dig up the newspaper articles themselves and add them rather than rely on these as standalone. Anyway, it is there as food for thought. — billinghurst sDrewth 12:51, 26 January 2011 (UTC)
[edit] Replaced Vol. 19 scans
I have updated vol. 19 scans as that was the recommendation from the /progress page. I trimmed the version to align precisely, so no page moves are required, and I am now deleting the bot applied pages (by checking validation status and transcluded articles). If there are any that were incorrectly deleted, please get back to me and I will return them. — billinghurst sDrewth 02:58, 27 January 2011 (UTC)
[edit] Seeking opinions on how we manage vols with missing pages.
Volume 30 has two pages missing and is basically held up until we replace them. We can either wait until there is a new scan of the whole volume (for which I am not holding my breath) or we can try and get the two missing pages (p.28&29). Now I have ready access to copies of those pages from a later edition, or someone can go and scan the pages. While the former is the less pure option, it is readily achievable; however, to progress that way is a community decision, not one person's. We can always replace the two pages at a later time as required. — billinghurst sDrewth 23:12, 30 January 2011 (UTC)
- The long term solution is a complete volume with all of the pages intact. The interim solution needs to provide proofread text for article creation. In the past those text pages were archived in another location and made available for transclusion.
- The following reflect preferred options to meet the interim goal.
- (Plan A) If IA or Commons could splice in the missing pages, that would provide completed djvu files for those volumes. Moving the existing pages to match the revised djvu file is fairly straight-forward. Once completed we have achieved the long-term goal.
- Support as preferred methodology, though you want to make that call early, as moving a completed volume is going to be très ugly as it means so much work and surely something breaks.
- (Plan B) Volume 60 created an extra index page for pages 18 & 19 that provides a great location to store these extra pages as an interim step. We could organize this index page along the lines of the Errata volume to provide partitions for only those volumes that need pages stored.
- Support as secondary fix, to be used where there are small blocks of missing pages, eg, 1-2, once or twice through a work, though I don't really feel that it is necessary to retrospectively insert them into the work; see plan A: ugly, for supposed purity of little value.
- (Plan C) Form a small working group that stays in place until all of the problematic or missing pages are validated for volume 30 as a starting position and then complete the remaining volumes as needed. The two of us could finish volume 30 this week…once we decide where to store the proofed text. JamAKiska (talk) 01:18, 31 January 2011 (UTC)
- Not my preference nor how I want my involvement
- (Plan D) Works where there are major problems/gaps in a work, prepare a work with the best effort, insert dummy pages where there are missing pages and when/if they become available then to insert these at a point in time.
Does anyone have access to a library that has the physical volume? If so, it should be acceptable to go to that library and take photographs of the pages. Clearly, this is not a copyright violation. Then, upload the photographs to commons and use them. It is not strictly necessary for all of the pages of the djvu file to be readable as long as we have a data trail of where the proofread data actually came from. In this case, the proofreader could simply place a note on the talk page of the proofread page in pagespace to point to the picture in commons. The volume is (probably) physically available at the Stanford University library. I don't know if it is available at the Library of Congress. -Arch dude (talk) 23:30, 31 January 2011 (UTC)
- OK, the LoC claims to have a copy of the whole DNB00 plus the DNB01 and also the later editions, and they are open on Saturday. I have never been there, even though I have lived in the area since 1970. I am willing to try this myself. Do we have a list of all pages that we need across all volumes? According to their web site, it is also possible to online-order digital images of specified items. The charges are somewhat high, so I'll try to do the bulk of the work in a few in-person visits. I'll try this if everyone agrees that this is an acceptable approach to repairing our problematic pages. -Arch dude (talk) 00:07, 1 February 2011 (UTC)
- Yes, I know that is possible; however, my point was about how pure we were looking to be. There has never really been anyone who said that they have ready access to the volumes, though that was part of my implicit question in case someone was going to say that they did have access. The known problematic pages should be able to be determined, without mega issues, though it may need someone who can better query the API. — billinghurst sDrewth 04:58, 1 February 2011 (UTC)
I downloaded clean copies of volumes 13, 21, 58, & 60 earlier. Is there a preference as to the order in which to begin their replacement? if not will proceed in order and should have these four complete by the weekend if I work alone on this one. Volumes 20 and 41 are complete volumes, but with a few pages where the fringes are ripped and missing small groups of (1-4) characters. The overwhelming majority of these torn pages are completely legible and have the text quality found in volumes 23 & 44. In volume 20 these pages are 95, 96, & 97. If the existing text of these pages were validated using the scans in place (already been proofed), then replaced by text images from this fresh volume, very few if any, would question the authenticity of these rough pages until such time as the replacement pages became available. If time is not an issue, we wait until the images are ready. Volume 41 needs pages 145, 146, 175, 176, 177 & 178. These volumes are all downloads from AI. If Arch dude can work these images we should have a way forward through these tough volumes. I’ll map volumes 27 and 30 tonight to provide their requisite pages. Volume 30 needs at least pages 28 & 29. Our fallback position includes public domain scans from Google to fill our existing void with a text quality currently found in volume 3. JamAKiska (talk) 02:14, 1 February 2011 (UTC)
- I don't see any real value in doing volume 60 (see plan B comment); the others all look ripe for it. I would suggest working first on the volumes where there are the fewest deletions to undertake, so we can work on the deletions separately from the upload on those volumes; they are separate actions. — billinghurst sDrewth 05:15, 1 February 2011 (UTC)
I’ll save 60 for last in case of 2nd thoughts (this is a complete volume)…take another look in the 80s of this index page. Would like to splice in some details from above into my volume 20 request below…believe it should be replaced with new file that is a 99.9% solution (easier on the proofreader) and will only require at most three images to be replaced (as they become available), or linked to a "reference" volume (perhaps in Greenwich :^)
- I'm somewhat confused by all this, so here are some simple questions:
- Should I go to the Library of Congress and get copies of certain individual pages?
- Which pages? (I assume at least pages p28&29 of Vol 30.)
- Should I get the title page of each volume from which I get other pages? ( I assume yes, as part of attesting to the provenance.)
- Which other specific pages should I get as part of this initial effort?
- -Arch dude (talk) 21:10, 1 February 2011 (UTC)
Pages for Volumes 20 & 41 specified above. JamAKiska (talk) 22:44, 1 February 2011 (UTC)
- ArchDuke. There is no exact answer without forensic analysis to compile an exact list from the Index: pages, beyond to say MISSING PAGES and PROBLEMATIC PAGES. With regard to provenance, if you are taking digital photos, having a copy of the title page is probably pretty useful, especially if it becomes a string of files, as that would make it easy to determine where to split. It may even be worthwhile converting the string of images into a PDF file to upload, and then we can either pull it apart or separate out a page to put through an external OCR. We can (quietly) delete the PDF from Commons at a later time. — billinghurst sDrewth 04:18, 2 February 2011 (UTC)
- Thanks. I know that there is not yet an exact answer for the general question of "which pages?" I was asking for guidance on which pages to try for during my preliminary scouting expedition this Saturday. I will use the lists provided above for vols 30, 20, and 41. I also infer from these responses that both you and JamAKiska feel that this approach is worth trying. With regard to formatting: The LoC has a "copying services" department that apparently does the actual copying, and part of this service is e-mailing a .tiff file. I hope they are willing to perform this service (and charge for it) without also charging the research and retrieval fee if I am physically present with the actual books in my hand. The research and retrieval fee is the expensive part of this mess, but cost is not the main objection (I have the money and I'm more than happy to spend it on my hobby.) My reason to go in person is to be able to check the results instantly. Also, I would like to finally lay hands on the physical volumes after more than three years of working with them online. -Arch dude (talk) 14:36, 2 February 2011 (UTC)
- I would think that we can work from TIFF to DJVU, though I don't have one available to play. But no photos of books? Wow that is just bloody rough. Yes, it is worth trying. — billinghurst sDrewth 14:46, 2 February 2011 (UTC)
- I'm back from my scouting expedition. See User:Arch dude/DNB at the US Library of Congress. Short version: I can access the DNB, and I'm going back next week with a camera. -Arch dude (talk) 20:32, 5 February 2011 (UTC)
- Excellent news. Probably we'll see how the first expedition proceeds, and what we can do with the text, and then we can work out what is a reasonable schedule going forward, the number of pages that you can manage, the number of trips that we think that it might take, and our priority list. — billinghurst sDrewth 00:15, 6 February 2011 (UTC)
[edit] Prefatory Note to 1901 Supplement
Would like for it to go here. And would like to add a link to the George Smith memoir from this location as well. JamAKiska (talk) 01:26, 31 January 2011 (UTC)
[edit] Help request...volume 20.
Have replacement volume (contains all pages of text and index with good quality images) ready to upload. Need to validate existing pages 105, 106 & 107 (text pages 95-97) with existing scan prior to upload as new volume has portions of these pages torn off. JamAKiska (talk) 18:14, 31 January 2011 (UTC)
- We should be replacing the pages in the incoming volume with the decent pages from the present volume. Otherwise we may as well just have two volumes at Commons and call the requisite pages individually. There doesn't seem much point in going from an old bad to perpetuating a new less bad. — billinghurst sDrewth 23:36, 31 January 2011 (UTC)
[edit] {{DNB errata}}
In a trial to display the errata from Index:Dictionary of National Biography. Errata (1904).djvu with the respective article, I have mirrored something that I have been doing for A Compendium of Irish Biography and created a means to append the corrected data to the article. Now it is still in the play zone, so we can evaluate its effectiveness and tweak it further if we like. We have it working for an erratum on one page, Greaves, Edward, and I have coded it for where that goes to a second page, though not yet tested. Issues to resolve: names of parameters, which we may wish to change to something shorter like p1, p2, ... When we transclude the text, the references to line numbers are no longer current, so it may be worth having a rider that trails onto the end of the template that explains such. Probably other stuff that I haven't considered. — billinghurst sDrewth 15:29, 4 February 2011 (UTC)
The shorthand used in the Errata to help the reader locate the amendment will require slight adjustments for us made easy by transclusion. By way of example, see 1904 Errata p. 50 at the page bottom of the original, specifically the Campbell, Frederick W. article. The original page only includes the line number entry, 15f.e. for this article, as the page and column were indicated in previous entries found higher on that page. The next errata page 51 contains another entry for this article. JamAKiska (talk) 17:26, 4 February 2011 (UTC)
Greaves makes for a curious example, but it's fortunate in that I didn't have to look any deeper to demonstrate that messing with this can easily go wrong: 'error happens' [13]; the column has some relevance, if someone checks the scan of the source; I see f.e., for example, is not an abbreviation of "for example"; and without counting, I think the line number is accurate (a line ends in a full stop, not where it wraps on the printed page), but I didn't check how this is counted where an article goes onto the next page I checked, it restarts for each page. Any entry with an erratum should link to that section from the notes. cygnis insignis 19:44, 4 February 2011 (UTC)
The first page of each chapter in the Errata has that information - namely l.l. for "last line", and f.e. meaning "from end," as in counting backwards from the column end. I had yet to encounter an Errata entry which applies to more than a single page in any particular volume, until now: see Edwin Sir Humphrey. The absence of those abbreviations indicates counting lines from the top of the column. Perhaps the way forward is to leave a simple word like "Errata" in the notes to help alert the reader to the information contained at the conclusion of the article, including the link to the appropriate errata page which the current template provides.
Your example also brings up a great point. A few months back I was proofing pages in the early volumes when I discovered that I was undoing some Errata adjustments made by another editor. So sheepishly I undid my edit to re-establish the "improved version," as I recognized that while in good faith I was editing the text and hopefully adding value, I was also doing so without all the facts. I now look through the edit history before joining into the editing fray. Which brings up a great topic for discussion. Namely, when is it a "reasonable" time in the edit cycle to include links and Errata adjustments? to help others avoid situations like this in the future. JamAKiska (talk) 00:24, 5 February 2011 (UTC)
- The emboldened comment is my point, I saw what it is "not" because I read that information at the errata. And the 'way forward'—the work-around to the work-around—is my solution: don't mess with it, preserve the integrity of what we are transcribing, link back from the notes section of the header.
- The example was not mine. If someone incorporated the errata, made corrections, the text no longer matches the scan; that is all that is required, no more. I realise that this leaves little opportunity to be erudite, knowing with the BOH-S, and demonstrate our talents as clever editors of text, it is as boring as bat shit from that perspective and incredibly frustrating for wikipedians, but that is not what libraries do. If the same text appears at wikipedia, which it will, it can be corrected there with another ref to the errata volume. Dig this: there will, eventually, be very few reasons for a reader to use the DNB text here. The timescale is 'when someone gets around to it'. cygnis insignis 21:25, 4 February 2011 (UTC)
This DNB article, Brand, Henry Bouverie William, addresses previous concerns by preserving the original. The note alerts the reader’s attention to the appendage, and there are currently no links. Would like to add the text below to this template to aid the first time viewer.
Dictionary of National Biography, Errata (1904), p.141
N.B.— f.e. stands for from end and l.l. for last line
| Page | Col. | Line | |
| 37 | i | 22f.e. | Greaves, Sir Edward: for 279 read 302 |
| | | 9f.e. | for 225 and 279, i. 18 read 51 and 302 |
- The three formatted columns nearest the left margin give the specific location on the DNB biography page where the original text is to be adjusted, in this case page 37 of the volume. There are only two columns from which to choose, and the text line is usually counted from the column top. In this example the counting starts at the bottom of the 1st column, and the lines are located 22 and 9 lines f.e. (see also l.l.) These text adjustments can be replacements, omissions, or insertions. In this example there are two text replacements. When there is no indication of the page or column immediately adjacent, use the first value found above in the respective column.
JamAKiska (talk) 05:04, 5 February 2011 (UTC)
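As a rough illustration of the lookup rules described above (not part of the template), here is a minimal Python sketch. It assumes that blank Page and Col. cells inherit the nearest value above them, that "1 f.e." means the last line of a column, and that the number of lines in the column is known. The names `ErrataEntry`, `fill_down` and `line_from_top`, and the 60-line column used in the example, are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class ErrataEntry:
    page: int   # page of the DNB volume
    col: str    # column, "i" or "ii"
    line: str   # raw locator, e.g. "22f.e.", "l.l." or "15"
    text: str   # the correction itself


def fill_down(rows: Iterable[Tuple[str, str, str, str]]) -> List[ErrataEntry]:
    """Inherit blank Page/Col. cells from the nearest row above."""
    entries: List[ErrataEntry] = []
    page: Optional[int] = None
    col: Optional[str] = None
    for raw_page, raw_col, line, text in rows:
        page = int(raw_page) if raw_page else page
        col = raw_col or col
        if page is None or col is None:
            raise ValueError("first row must give both page and column")
        entries.append(ErrataEntry(page, col, line, text))
    return entries


def line_from_top(locator: str, column_length: int) -> int:
    """Convert a locator to a 1-based line number counted from the column top.

    Assumption: "1 f.e." is the last line, so "n f.e." is column_length - n + 1.
    """
    loc = locator.replace(" ", "")
    if loc == "l.l.":                      # "last line"
        return column_length
    match = re.fullmatch(r"(\d+)(f\.e\.)?", loc)
    if match is None:
        raise ValueError(f"unrecognised locator: {locator!r}")
    n = int(match.group(1))
    return column_length - n + 1 if match.group(2) else n


rows = [
    ("37", "i", "22f.e.", "Greaves, Sir Edward: for 279 read 302"),
    ("", "", "9f.e.", "for 225 and 279, i. 18 read 51 and 302"),
]
for entry in fill_down(rows):
    # 60 is a made-up column length, used only to show the arithmetic
    print(entry.page, entry.col, line_from_top(entry.line, 60), entry.text)
```

The second row has no page or column of its own, so it resolves to page 37, column i, as the explanation above requires.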
- We could inhale the erratum into the Notes field, which means that the article retains the initial publishing integrity; however, it being errata rather than an extension of the data, I am reasonably comfortable with a segregated transclusion. If we transclude, we can easily argue that the article shall remain as is. To the wikilink on the page number, at the moment it goes through to the Page:, I think that we should amend that so that it points to the main ns transclusion of the errata pages, and we can poke in the wikilink. So that may be something like [[Dictionary of National Biography. Errata (1904)/Volume 23#141|p. 141]] — billinghurst sDrewth 05:30, 5 February 2011 (UTC)
- Re the key, there are a number of ways to do this, 1) add in key to each transcluded component, 2) add a hover over each code, 3) link through to a page that explains the key as part of the template. All have strengths and weaknesses. — billinghurst sDrewth 05:33, 5 February 2011 (UTC)
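For anyone scripting these links, the main-namespace target suggested above can be assembled mechanically. This is only a sketch of the string format, written in Python rather than template code; the function name `errata_link` is hypothetical, and it assumes each errata subpage carries an anchor named after the errata page number.

```python
def errata_link(volume: int, errata_page: int) -> str:
    """Build a wikilink to the main-namespace errata subpage for a volume.

    Assumes an anchor on the subpage that matches the errata page number.
    """
    target = (
        "Dictionary of National Biography. Errata (1904)"
        f"/Volume {volume}#{errata_page}"
    )
    return f"[[{target}|p. {errata_page}]]"


print(errata_link(23, 141))
```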
- Cochrane-Baillie, Alexander Dundas Ross Wishart was an early view of the Notes option, and went the final step to include the errata. The next option, Brand, Henry Bouverie William, includes a brief note to alert the reader of errata(um) YET preserves the integrity of the original AND provides this additional research for those whose interest is accuracy.
- While the one line Errata entries fit nicely into the notes section, not all erratum footprints are so easily managed. A hybrid approach that includes hover keys as well as a detailed example, available via link or transclusion, would go a long way to help editors work their way through this problem successfully. JamAKiska (talk) 14:03, 5 February 2011 (UTC)
I have amended the template and you will see that it now has the key embedded after the page break label. It just seemed easier. — billinghurst sDrewth 02:09, 6 February 2011 (UTC)
- Kind of hard to build a range of offsets when the Index: pagelist itself doesn't reflect the actual offsets either! :-ο
Anyway, I swapped in the "fixed" template to the main position already - hopefully nobody even noticed that change - and have since managed to "redo" offset 10, didn't know where exactly offsets lower than 8 started/ended so I just jumped to the end where djvu page 293 & higher have an offset of 4 now. Please respond on the errata template's talk page - this gap thing going on here displays way too weird for me. — George Orwell III (talk) 17:03, 7 February 2011 (UTC)
- Nicely done, pages are properly numbered in each of the four offset regions. Howell, Laurence would provide a good example for errata straddling 2 pages if you are looking for one. I’ll add a note to the errata djvu file to give you a heads up when this file gets replaced to fill in the missing pages. JamAKiska (talk) 23:43, 7 February 2011 (UTC)
[edit] Status
- The following Volumes Errata pages are proofread, have links installed, and are ready for transclusion: 1, 2, 6, 50, 51, 55, 57. The Errata have been included into all written articles in Volumes 23, 28, and the Suppl. vols 1 & 2. MISSING PAGES - The offset dropping by six indicates 6 missing pages, only five of which have text: text pages 170 & 171 (last 2 pages of V. 30); text pages 193 & 196 (1st and last pages of V. 36 - not much text on p. 196); and finally text page 197 (lead page for V. 37). Will see if I can locate a complete volume prior to resuming the effort to reduce subsequent page moves. I need to leave it at this point of progression for a few days as other concerns are more pressing. JamAKiska (talk) 15:11, 11 February 2011 (UTC)
- I don't know if this was for my benefit or not but whatever you folks wind up doing and the resulting change in page ranges it may or may not cause; any adjustment(s) needed to the template would not be major. I did not use coding that only relied on the typical plus or minus the offset # from a base page # but rather wrote it to use equal to or greater than an opening page # and less than the closing page # of the range of vol. pages in question before any offset is calculated (+ or -) for display/link purposes. Carry on & let me know when things need-a-changin' — George Orwell III (talk) 15:29, 11 February 2011 (UTC)
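The range test described above can be sketched in a few lines of Python. This is not the template's actual code, only an illustration of the idea: pick the range whose opening page is less than or equal to the text page and whose closing page is greater than it, then apply that range's offset. The function name `djvu_page` and the ranges and offsets in the example are invented for illustration and do not reflect the real file.

```python
from typing import List, Tuple

# (opening text page, closing text page (exclusive), offset to add)
OffsetRange = Tuple[int, int, int]


def djvu_page(text_page: int, ranges: List[OffsetRange]) -> int:
    """Map a printed page number to a djvu page number via range lookup."""
    for opening, closing, offset in ranges:
        # "equal to or greater than an opening page # and less than the closing page #"
        if opening <= text_page < closing:
            return text_page + offset
    raise LookupError(f"text page {text_page} is not covered by any range")


# Hypothetical ranges: the offset drops wherever scan pages are missing.
example_ranges = [(1, 170, 10), (170, 200, 8), (200, 301, 4)]

print(djvu_page(37, example_ranges))
print(djvu_page(170, example_ranges))
print(djvu_page(300, example_ranges))
```

With this shape, replacing the scan with a complete volume only means collapsing the list to a single range with one offset; the lookup itself does not change.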
- Found a better scan of 1904 Errata that has all 300 pages of text. This Errata volume will add eight pages to the page count, six of which will include text (the 6th is the final page of the Errata for the 1901 Supplement Volume III). The only down side to this move is the loss of the title page which in the replacement volume is a blank page; however, it does retain the Title page with identical markings on the cover. Upon insertion, all text pages up until page 169 will align. 42 page moves will be required following insertion, beginning with text page 170 or djvu page 180. I will adjust the few pages of links that require adjustment during this move. JamAKiska (talk) 18:51, 14 February 2011 (UTC)
- Replaced Errata djvu file with complete volume. Offset is now set at 10 throughout. JamAKiska (talk) 14:01, 15 February 2011 (UTC)
- Erratum template adjusted to expanded djvu file…thanks George…initial look is favorable…will poke and prod for a couple of days as I begin to incorporate these links into completed volumes. The first ten volumes are ready (23 & 28 already in use), will prep other volumes based upon DNB progress page. JamAKiska (talk) 23:20, 15 February 2011 (UTC)
- Errata template installed in volumes 1, 2, 6, 18, 23, 28, 47-60, 62, Suppl. vol. I and II (written articles in 18, SI & SII) . Discovered a few articles where the editors had already included the errata or had clarifying notes to that effect. These I left in place. Will continue with errata installs. 17 & 22 will be the next group. JamAKiska (talk) 16:11, 11 March 2011 (UTC)
[edit] Broken section markers
The sections for the underlying pages seem to be broken for Boyle, Michael (1609?-1702) (DNB00): a couple of articles and a large fragment of another article are appearing on that page. Please could someone fix it and be kind enough to explain here what needed to be done (teach a man to fish). -- Philip Baird Shearer (talk) 22:08, 16 February 2011 (UTC)
- Troubleshooting is underway…believe it to be related to the Mediawiki upgrade which was initiated this morning. Refer to several central discussions related to this topic. If in the course of your editing you experience any additional "difficulties" post them at that location. Thanks…there is a bugzilla report filed this morning that we should amend as we make additional observations in this new environment. JamAKiska (talk) 22:26, 16 February 2011 (UTC)
[edit] Section begin/end problem?
Something's gone wrong here: How (DNB00). Please help. --P. S. Burton (talk) 21:36, 17 February 2011 (UTC)
- See reply directly above 'Broken section markers' — George Orwell III (talk) 21:48, 17 February 2011 (UTC)
[edit] Vol. 35 cleansed
Vol. 35 had the text and page images out of kilter. As the pages were bot applied, I have deleted the pages that were not used in articles or were only showing as /*not proofread*/. — billinghurst sDrewth 14:58, 20 February 2011 (UTC)
[edit] Recent Author pages created.
Author:Philip Norman from supplements was not created and now is, haven't checked what else may have been missing for this bloke. Charles: I also didn't run through your checklist for authors, articles … — billinghurst sDrewth 02:19, 3 March 2011 (UTC)
Antiquarian remembered for his artisan vantage point regarding the buildings and architecture of London. Added links to those works I could find on AI as a means to preview any which should be included in WS collection. Located 2 DNB Supplemental articles, one each in DNB01 and DNB12. The work about him was a brief 1906 review of one of his works that gave some insights into who he was. I would consider it a placeholder pending the substitution of a more suitable alternative [see BAL Biography file]. JamAKiska (talk) 18:57, 10 March 2011 (UTC)
[edit] Supplementary list in prefatory - linking
At Page:Dictionary of National Biography. Sup. Vol I (1901).djvu/14 and the following page there is a list of people who died in the first six months of 1901, and were indicated as going to have biographies added. Do we link the list of names? If so, do we link to future articles, or to other places within WS, e.g. author pages? — billinghurst sDrewth 03:52, 4 March 2011 (UTC)
John Farmer musician made DNB12. JamAKiska (talk) 04:34, 4 March 2011 (UTC)
[edit] 1912 Supplement
Have located volumes II and III of DNB12. Have yet to do a full page scan, but initial look seems as favorable as DNB01 scans. Am pausing awaiting location of good volume I from this second and final supplement from Smith, Elder & Co. JamAKiska (talk) 17:41, 8 March 2011 (UTC)
Second Supplement published 1912 in three volumes (DNB12): [Volume I] [Volume II] [Volume III] Only blemishes I saw were author pages in volume I. The errata are included in the front of each volume, and complete index pages are found at the end. Would organize along the lines of First Supplement (DNB01). The file sizes range from 34 to 42Mb. JamAKiska (talk) 13:56, 9 March 2011 (UTC)
Page check complete on volumes I, II & III of DNB12, all pages present & images are clearly readable. The addition of these three volumes will require an adjustment to the DNB index template. JamAKiska (talk) 19:44, 16 March 2011 (UTC)
[edit] Wikilinks to works
Some pointers to works that have been added or are around for wikilinking from the body or ref section.
- Greyfriars Chron. redirects to Chronicle of the Grey friars of London, and sometimes there are page refs
- Men of the Time (11th edition) (underway)
- Alumni Oxonienses parts available, and happy to create articles as needed, see
- Men-at-the-Bar parts available, and happy to create articles as needed
- Dictionary of Australasian Biography (underway)
- A Compendium of Irish Biography (underway)
+++
Also looking at getting Portal:Notes and Queries populated as we have located all the volumes. Next to get done is Gentleman's Monthly. If there are other works that we should be looking to locate, then do let me know. I am thinking through whether this should sit separately as part of a Portal, or as a separate project, or both. Would like to hear your feedback on this. — billinghurst sDrewth 02:59, 5 April 2011 (UTC)
[edit] Bad scans
Hi folks...
While going through Category:Index - File to fix and attempting to resolve some of the issues with the listed Index files I noticed there are about 5 or 6 DNB volumes in there. After trying to do some back tracking/investigating at Commons, InArc, etc. I get the feeling that most, if not all, of these have been "updated" at GoogleBooks since the originals were scanned for InArc. For instance, Volume 25 on Commons gives the source as InArc and the URL there citing GooBoo was
... with a creation date sometime in 2007 but since updated in 2009.
Upon visiting that URL now, it is obvious the scan has been re-freshed since 2009 -- signified by an additional URL w/ apparently same (now fixed) content as the above URL of...
Now at first I was thinking of slowly removing and inserting the "bad" pages in the existing djvu, but it dawned on me the OCR'd text is also 4+ years old and probably could be better by using today's utilities to (re)extract that too. Somebody just tell me the best option, patch the old or pull down a fresh version, for the project and I will try to fulfill that need. — George Orwell III (talk) 21:26, 5 April 2011 (UTC)
- As I scan the various sources and I find a good file I store a link at Progress. See Managing Vols with missing pages discussion for background as we continue to gather consensus. Have yet to discover the path to inserting specific page images for volumes that are almost complete. Hope to spend time in that direction soon. JamAKiska 23:14, 5 April 2011 (UTC)
- We'd certainly be grateful for help. Charles Matthews 06:54, 6 April 2011 (UTC)
- Sure but as I peel back the proverbial onion one thing is clear -- Google has made a real mess of this series. Each volume has at least 3 separate file names (URLs) but it is hit or miss if the content is the same for all 3 (or even just 2 of 3). Once you compare the 2 or 3 at a glance with each other - hardly any of them have the same number of pages. Upon closer inspection, none of them are perfect. Some of them are slightly bad with double or missing pages while others are FUBAR with half-scans, "folded" pages and faded text throughout. I'm going to start organizing my findings into something that I can make a better informed decision with for best possible results in the meantime. Let me guess - most of you don't have access to the U.S. Google section where most of these are found? — George Orwell III (talk) 08:30, 6 April 2011 (UTC)
- You have clearly identified our issues with the files. We started with a premise of uploading vols; which then became bad vol, let us find another; which became, oh, that is bad too, let's mark the pages problematic, and come back to it. Finally we are at a level of maturity where we can identify volumes sufficiently that we can look to construct them where there is no perfect volume available. And you are correct that from Google most of us (all?) cannot see them.
- I'll gladly up the PDFs not "viewable" to you folks as we determine which volume to address and in what order so that those more familiar with this series than I can determine which revision or version is best to convert to a bundled .djvu (and how). Once this clear path of "what" to address first is laid out before me, I can get my hands dirty almost immediately. — George Orwell III (talk) 22:37, 6 April 2011 (UTC)
- Plus we find that the bot applied pages for a bad volume makes it difficult to clean up as we don't know which page has edited text unless it is completely proofread, and I am trying to get some time to learn some jquery to work out a schema to run a query on the api that will give us that result. — billinghurst sDrewth 13:10, 6 April 2011 (UTC)
- Well that is an issue that goes well beyond just this project and one that I am least likely to have an answer for. My "gut" tells me that some across-the-board type of refinement might be best -- one that lets us toggle? away from the current defaults of list displays of oldest-creation-date first or the straight alphabetical at the top. If we can pull contribution lists indicating "top" for being currently the last one to make an edit, then there must be a way to pull a list of edits made by a User (the bot in this case) that does not show "top" for a range of articles (Pages in this case). Like I said, I'm not that technically advanced but it seems possible for somebody who is. — George Orwell III (talk) 22:37, 6 April 2011 (UTC)
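One possible shape for such a query, sketched in Python rather than jQuery: ask the MediaWiki API for the latest revision of each Page: title (`action=query` with `prop=revisions` and `rvprop=user`), then separate the pages whose last editor is still the bot from those edited by someone since. The sketch makes no network call; it only builds the request parameters and sorts a response of the usual JSON shape. The bot name, the page titles and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def build_query(titles: List[str]) -> Dict[str, str]:
    """Parameters for a query returning the latest revision's user per title."""
    return {
        "action": "query",
        "prop": "revisions",
        "rvprop": "user",
        "titles": "|".join(titles),
        "format": "json",
    }


def split_by_last_editor(response: dict, bot_name: str) -> Tuple[List[str], List[str]]:
    """Return (pages where the bot is still 'top', pages edited since the bot)."""
    untouched: List[str] = []
    edited: List[str] = []
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        revisions = page.get("revisions")
        if not revisions:
            continue  # missing or deleted page: nothing to classify
        if revisions[0]["user"] == bot_name:
            untouched.append(page["title"])
        else:
            edited.append(page["title"])
    return sorted(untouched), sorted(edited)
```

Pages in the first list carry only the bot's text and would be candidates for deletion when a bad volume is replaced; pages in the second list have human edits that should be checked first.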
- For reference: w:User:Charles_Matthews/DNB_scans, my page about what there is at archive.org. The Google scans are by no means the pick of the bunch. Charles Matthews (talk) 17:47, 6 April 2011 (UTC)
- Thanks. That will surely help. I beg to differ though - even if it isn't made clear at IA's description page, a good portion of IA hosted works originated at or in partnership with GooBoo with the occasional dedicated individual tweaking the scans a bit before archiving it at IA. While GooBoo is slow to place copyright-free works properly in the public domain no matter where in the world you are logging on to the internet, once they do (full-view), the work is more prone to being re-freshed/replaced and such is the case with this series from what I have gathered so far. — George Orwell III (talk) 22:37, 6 April 2011 (UTC)
I have created a few DNB articles. They have very good depth of treatment for the most part, which makes them useful for Wikipedia references, but the scans I have run into seem to be very poor quality. I found Hartlib, Samuel (DNB00) the most difficult. Details are on the talk page of that article. I resorted to Google for a scan of one of the pages. It must be the subject of an erratum, because the OCR text I was working with had a significant difference on one item from what I was finding in the Google text image. Judging from the Hartlib article, the text here seems to be the most up to date for v. 25. The Hartlib article has a numbered list of his works, and an item listed as a footnote here as a republication of someone else's work, appears in the list there as one of his own works. Bob Burkhardt (talk) 18:05, 20 April 2011 (UTC)
- Several locations to help locate better scans…DNB Progress has a few files stored in that location to include volume 25. Other times they will be found as a link on the respective volume index page or the source description page. The better scans have been the priority for some time, and as they are located, they are placed in easy to find locations. As previously mentioned, IA and GooBoo have provided most if not all of the readable image files for the project thus far. JamAKiska (talk) 22:08, 20 April 2011 (UTC)
-
- Thank you. These are good to know about. The Google copy is different from the Google copy I was using, but the text appears to be the same, and again slightly different from the text layer of the djvu page I edited which appears to be the result of correcting errata, though not by following the instructions from 1904 Errata which I have now appended to the article. Bob Burkhardt (talk) 21:42, 23 April 2011 (UTC)
-
-
- You are most welcome. Glad you could join us on this lengthy effort. Some of the contributing editors used material from the re-issued 1908 and 1909 edition that incorporated the 1904 Errata. Appending the 1904 Errata to revised articles helps readers verify the authenticity of the original or slightly revised text through 1904. Upon transfer from the original publisher, Smith Elder and Company, to ODNB in 1917, some of the articles underwent revisions based upon additional research. The discussion found at w:Talk:Henry Grey, 10th Earl of Kent provides some insight into the subtleties involved in the interpretation. JamAKiska (talk) 14:58, 25 April 2011 (UTC)
-
- Found an intact volume 41 in Toronto and replaced existing file…unable to locate good replacements from that location for the remaining six files. Volume 25 in that location is actually volume 24 mislabeled by year and volume on the library cover pages. JamAKiska (talk) 22:19, 4 May 2011 (UTC)
End of vol. 3
Looks like there are issues with created articles, from around p. 380 onwards. I have fixed up a couple where the page range was shifted (needed increment of 2). But Bastwick, John (DNB00) might not be so simple. Charles Matthews (talk) 18:39, 10 May 2011 (UTC)
Adjusted links for article alignment. JamAKiska (talk) 23:15, 11 May 2011 (UTC)
List of contributors
On Wikipedia I created a list of contributors, using the DNB author templates which I transwikied some time ago, at http://en.wikipedia.org/wiki/List_of_contributors_to_the_Dictionary_of_National_Biography .
It should be possible to cut-and-paste that page straight back here, if it would provide a useful reference, and indication of which author pages are and are not done. Rich Farmbrough, 00:23 21 May 2011 (GMT)
- There'll be various comments. I would say that not all the contributors are notable; but exactly which ones are is not so easy to determine. Nor are they all easy to identify. Plenty of work has gone on to remedy that, but I believe (for example) that we don't know whether the J. H. Thorpe is the one who was Jeremy Thorpe's grandfather. Charles Matthews (talk) 07:42, 21 May 2011 (UTC)
-
- Rich, I am hoping that you have already seen Dictionary of National Biography, 1885-1900/List of Contributors and had a look at its talk page, where we have plenty of identification. I cannot say that I have been back to do much work there more recently. — billinghurst sDrewth 10:10, 22 May 2011 (UTC)
15,000 milestone
According to Magnus's gadget the 15000th DNB biography here is Wilmot, John (1647-1680) (DNB00). Category:DNB biographies has slightly different ideas; but let's not quibble about numbers. I've thought for a little while that this milestone is the best one we'll have to mark the midpoint of the project—as far as the creation of the biographies goes, that is.
So, a good time for a chat about how it should go from here, in consolidating the project. Certain things about the desired final state are not clarified; and some idea of strategy would probably help. I have various points to bring up. But would someone else like to start? Charles Matthews (talk) 20:01, 22 June 2011 (UTC)
Unsigned
I'm putting up a list of the unsigned (i.e. anonymous) DNB articles at Wikisource:WikiProject DNB/Unsigned. Most of these are in early volumes, and were because Leslie Stephen was being coy, and there are something over 300 in all.
There are a couple of definite uses for the list (checking that all articles are present as a complement to author pages; making up volume ToCs). But there is another issue: we currently have two styles in play, namely linking to Author:Anonymous and placing "no contributor recorded" in the contributor field. A third option would be to create a dedicated page in the author namespace for such a list, and link to that. I wonder what people think: for some purposes having such links would be a plus. Charles Matthews (talk) 09:36, 30 July 2011 (UTC)
- I thought that we had decided to identify them as "no contributor recorded". They should appear on the category that we set up, though we do different if required. I probably need to go back and poke it, as we still had a number of those identified on that page that were there solely due to the field being omitted (ie. early transcripts). — billinghurst sDrewth 15:12, 30 July 2011 (UTC)
- Category:DNB no contributor has the bits, and there is a subset Category:DNB See, which I also remember chatting about though again was waiting until we had moved all the pages that were not backed by scans to scans. — billinghurst sDrewth 15:34, 30 July 2011 (UTC)
Author pages
I have a couple of points to raise about what we do on author pages, one for now and one for the future.
Firstly, there was my theory that very long lists of DNB articles on author pages would prove unacceptable. Author:Thompson Cooper has, however, had 1400 DNB links now for a while, and I'm not aware of complaints. So it seems my caution was unnecessary, and the proposed solution of subpages of author pages isn't required. So I suggest that those subpages now be dismantled. NB that there is another solution, in fact what the Germans do for the ADB, their equivalent: a dedicated category. I wouldn't enjoy this while there were still redlinks to fill in.
- Okay, we can start dismantling them, and putting them back into the root level. I will ask someone to run a sql query to generate a list.
- I have repatriated the remaining subpages to the root level. Are we right to kill Template:DNBauthorsubpage or would you like it to remain. — billinghurst sDrewth 15:29, 17 August 2011 (UTC)
Second, there is the issue of the separation of the DNB00, DNB01 and DNB12 parts of the work on author pages. Having thought about it, in the long term, I believe it would help the reader not to maintain the distinction, but to have a single DNB list, alphabetical by name. There is a small twist to this, in that some amount of disambiguation might need to go on; this actually could be handled by piping, and needn't involve a new layer of disambiguation on the page. But I would see this as part of the final tidying of the work, when all redlinks are filled: the separation currently serves a useful function as we work through the volumes. (Also it comes to mind that the DNB12 effort will need fresh application, to get author pages up and list the articles to create on them.) Charles Matthews (talk) 06:51, 31 July 2011 (UTC)
- Probably right, though one wonders whether a numerical count of biographies undertaken by volume would be helpful. Note that we can probably get someone to do a sql run once the vols. are complete. Re 1912, do we have a list of authors, and that probably means that we should review the templates against the requirements for all the linking templates. Probably worthwhile running a pilot to get our head around it. Also we can probably grab the author list from the volume, and work upon it. — billinghurst sDrewth 08:52, 31 July 2011 (UTC)
"Messy" lists
In case anyone wonders what is happening at /Messy lists, it is an attempt to do something about what is written at /Master lists by means of a CatScan search. In other words to scrape redlinks off about 450 author pages. I believe this is going to be useful in a few ways; but on the other hand there are also a few caveats to enter. I have been quietly working with some other "messy" lists over at letter P. There is the potential to cut down the time needed to complete the DNB00 volume ToCs, and also to pick up on some dab work in an economical fashion. I can explain more if required. Charles Matthews (talk) 13:36, 16 August 2011 (UTC)
So the longer lists by letter have been put on subpages of that page, to avoid the slowness of manipulating a page of nearly 400K. There are a few tasks that are immediate and in the nature of tidying. Where there are complete letters and volumes, any DNB00 redlinks on these new lists in those areas are caused by a mismatch of an author page redlink and the article title as created (can be a dab issue, typo, title convention point, and so on). So a bit of maintenance to do. Also I noticed that the search throws up some dab issues itself: for example Cooke, Thomas (DNB00) is the top hit, being linked to by five author pages. I can get these done out of the Fenwick handbook easily enough.
The main list should be redone over time, at least in part, to purge bluelinks as they appear, and to feed back the dab work. The main idea has always, in my mind, been to create volume ToCs from a "master list". Given a list for a volume in "messy" form, paging through a volume that currently has no ToC to create the listing is not going to be a very long task: certainly quicker than starting from scratch. This is where the caveats come in, of course. From the "messy" redlink list, you need to:
- add bluelinks that are available on the volume ToC already;
- add "unsigned";
- ASCII-sort the whole list you have;
- check with the actual volume text on ordering (not exactly ASCII) and proper dab.
There may be a few missing (e.g. errors of omission in author page lists, bluelinks that are not on the ToC). But after the pass through the volume there should be a good-enough volume ToC to post. Charles Matthews (talk) 09:22, 17 August 2011 (UTC)
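The first three steps of the recipe above can be sketched roughly as follows. This is only an illustration under assumptions: the function name is made up, the three inputs are taken to be plain lists of article titles, and the last step (checking order and dab against the actual volume text) stays manual.

```python
def draft_volume_toc(messy_redlinks, toc_bluelinks, unsigned):
    """Merge the three title lists, drop duplicates, and ASCII-sort.

    Python compares strings by code point, which matches ASCII order
    for plain ASCII titles. The DNB's own ordering is not exactly
    ASCII, so the result still has to be checked against the volume.
    """
    titles = set(messy_redlinks) | set(toc_bluelinks) | set(unsigned)
    return sorted(titles)


# Placeholder titles for illustration only, apart from
# "Cooke, Thomas (DNB00)", which is mentioned above.
draft = draft_volume_toc(
    ["Cooke, Thomas (DNB00)", "Beta, B (DNB00)"],
    ["Alpha, A (DNB00)", "Beta, B (DNB00)"],
    ["Cooke, Thomas (DNB00)"],
)
for title in draft:
    print(title)
```

Using a set means a title that is both on an author page and on the existing ToC appears once in the draft.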
A Plantagenet oddity
Plantagenet, Family of (DNB00) is a non-standard article. Nowhere else have I seen the kind of listing of cross-references that occurs after {{DNB JT-t}} on Page:Dictionary of National Biography volume 45.djvu/407. I haven't transcluded them into the article, partly for reasons of time, and partly because this is like "see" material we usually put on its own page. Charles Matthews (talk) 21:22, 17 August 2011 (UTC)
- I think that it is a case of "it is what it is", and as we are just reproducing the work, we just reproduce it and maintain the integrity, and let others judge it separately. I would think that {{tl|DNB lkpl}} would be sufficient and that if it is not fully transcribed and linked, that it remains non-proofread until otherwise undertaken. — billinghurst sDrewth 23:07, 17 August 2011 (UTC)
Terminating dead [q. v.] links
I cannot remember where/what we decided to do with [q. v.] links that ended up with a dead link, ie. no biography was written for that person. I vaguely remember that we made no decision previously. At this point in time I see that we have three options:
- to leave the created links and create a terminating page under the name and the DNB header that has standard terminating text
- Positives -(not much) keeps all qv links as coloured links, can allow respective onward referencing
- Negatives - almost becomes misleading, and takes us away from the 'true book reproduction'
- to leave the created links but redirect them to a singular (generic) terminating page that says that no biography was created
- Methodology - creates standard landing page that has explanatory text
- Positives - keeps all qv links coloured; enables all names to be tracked; standard explanation, and can point elsewhere
- Negatives - almost misleading
- to undo the links, and use something like {{tooltip}} that underlines the text and has pop up text that says that no biography was created.
- Methodology - create a specific DNB template with standard text based on tooltip
- Positives - we are not creating non-DNB pages; undoing redlinks; we can track pages of the referring biography
- Negatives - we don't know the terminating links, without otherwise referencing the originating pages
I think that I favour option 3. Cleaner and simpler and remains truer to the original publication.
Do others see other options? — billinghurst sDrewth 01:09, 21 August 2011 (UTC)
- There is an old thread, yes. I think I'd favour #2 at least as a first step. I like the flexibility of it, and once the links are there we can reconsider. There are numerous options really for wikifying; it's going to take some time to get a definitive style in place, if we ever do, and sending all the problems to one page initially seems OK. We can for example offer links onwards by doing section-anchored places on the page. Charles Matthews (talk) 09:25, 21 August 2011 (UTC)
Volume 46 duplication
I was looking forward to finishing volume 46 shortly, but there is a glitch in the page images I have just run into (entered on Wikisource:WikiProject DNB/Progress: a couple of duplicate pages not previously logged). So I'll stop working forward at this point, after today, and move to something else (so as not to create extra text that needs to be warehoused when the scan is updated). This is the first really bad alignment problem I have seen in a while. Charles Matthews (talk) 11:29, 24 October 2011 (UTC)
OCR digit errors
It has been pointed out over at Wikipedia that some digits are not being proofread carefully enough, and so errors can be propagated. I'm inviting User:Fram to contribute here with any examples, so we can learn more. Charles Matthews (talk) 10:25, 27 October 2011 (UTC)
- In my experience "5" and "6" are often misinterpreted by the OCR software. -- Philip Baird Shearer (talk) 02:27, 28 October 2011 (UTC)
- Definitely the 5 <-> 6, but also see more broadly S <-> 8 <-> 5; 1 <-> I <-> l <-> !; 7 <-> y, 11 <-> n +++. — billinghurst sDrewth 03:11, 28 October 2011 (UTC)
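As a rough aid to proofreading, something like the sketch below could flag tokens where digits are mixed with the look-alike characters listed above (S, I, l, !, y). It is only an illustration: the function name and the set of characters are assumptions, and it cannot catch digit-for-digit swaps such as 5 for 6 or 8 for 5, which still need checking against the page image.

```python
DIGITS = set("0123456789")
# Letters and punctuation that OCR may produce in place of a digit,
# taken from the confusions listed in the comments above.
LOOKALIKES = set("SIl!y")


def suspicious_tokens(text):
    """Return tokens made up only of digits and look-alike characters
    that contain at least one of each, e.g. 'l8S0' for '1850'."""
    flagged = []
    for raw in text.split():
        token = raw.strip(".,;:()[]")  # drop surrounding punctuation
        chars = set(token)
        if (
            chars
            and chars <= DIGITS | LOOKALIKES
            and chars & DIGITS
            and chars & LOOKALIKES
        ):
            flagged.append(token)
    return flagged


print(suspicious_tokens("born in l8S0, died 1902."))
```

Ordinary words and clean numbers are skipped, so the output is a short list of places to compare with the scan.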
Wildman, John (DNB00)
The article Wildman, John (DNB00) consists of several pages. The joins between the second and third pages and between the fourth and fifth are causing an unintentional paragraph break. Could someone have a look and fix it, and please let me know here how it was done, so that I can fix problems like this myself in future. -- Philip Baird Shearer (talk) 02:30, 28 October 2011 (UTC)
- There is a generic problem recently introduced through all of Wikisource (see Scriptorium) with the addition of a terminating line feed. I am running a bot through several times a week to fix it, so it will get captured and resolved then, and we are trying to hurry WMF to fix the problem. — billinghurst sDrewth 02:58, 28 October 2011 (UTC)
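To illustrate the effect described above (this is not the bot's actual code, which is not shown here): if each page's text ends in a line feed and pages are then joined with another one, the result contains a blank line, which reads as a paragraph break. A minimal sketch of the idea, with a made-up function name and assuming pages are plain strings:

```python
def join_pages(pages):
    """Join page texts so a sentence running across a page boundary
    stays in one paragraph.

    Trailing line feeds are stripped from every page before joining
    with a single space; paragraph breaks inside a page are kept.
    """
    parts = [page.rstrip("\n") for page in pages if page.strip()]
    return " ".join(parts)


naive = "\n".join(["the first half of a\n", "sentence continues.\n"])
fixed = join_pages(["the first half of a\n", "sentence continues.\n"])
print(repr(naive))
print(repr(fixed))
```

The naive join shows the stray blank line; the stripped join does not.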
Index missing from current page images
Vol 7, this is a bit of a nightmare image, here and the previous few pages do not match the text. Not sure what's going on. Rich Farmbrough, 22:12 6 November 2011 (GMT)
- When our scan at commons is bad, you should access the "scans" page for the volume to see if we know of an alternate scan. For volume 7, we know of an alternate scan: see this page. To find it, go to the volume TOC (Dictionary of National Biography, 1885-1900/Vol 7 Brown - Burthogge). From there, go to Access scanned source of Volume 07, and from there, try the alternates. -Arch dude (talk) 00:50, 3 January 2012 (UTC)
Stats January 2012
Posted at Wikisource:WikiProject DNB/Statistics: we start the year at 63% done (figure includes the supplements in the 100%, so it is more like two-thirds of the first edition). It ought to be the case that 2012 is the year in which we get the DNB under control.
Anyway this is a reasonable moment to do some looking ahead and planning. I have a straightforward idea or five-year-plan, which I'll be posting and discussing mostly at the WP end of the project: take a volume complete here each month, and do adaptation and checking of that volume over at Wikipedia.
I don't at all suppose that a complete job can be done there, in a month, but new tracking pages can certainly be set up and some of the more serious stuff seen to. Such a VOTM implies a few things at this end also: a focus for validation and the checking of WP lks; definitive format for the volume ToCs; and a completeness check (make sure all the biographies are present). So if there is a task list for DNB VOTM/WS, what should be added to it? Charles Matthews (talk) 20:08, 2 January 2012 (UTC)
- I've been slowly working through volume 4. At my current rate of progress, there is no way it will be ready by April. Sorry. -Arch dude (talk)
Not starting at the beginning, though. Letter A has had a lot of attention. I decided to try vol. 21 as more typical. See w:Wikipedia talk:WikiProject Dictionary of National Biography#Volume of the Month for the deal in the other place. Charles Matthews (talk) 20:41, 3 January 2012 (UTC)
Wikisource:WikiProject DNB/Most wanted articles
This page is now active, after a lull (material for a project around w:Nominate reports, by User:James500, prime territory for DNB additions). If anyone would like to tidy up what is there now, that would be great: I'm doing a daily pass at present to create articles. Charles Matthews (talk) 10:41, 6 January 2012 (UTC)