User talk:Inductiveload/Archive 1



Can we build a DjVu file

Gday Inductiveload. A Commons expert looking at your contributions. I am wondering whether we could be looking to build DjVu files from Stukeley's works. If we did that, even without the text layers, we could then create an Index: file, and then have the text against the images in the resulting Page: ns, which we can then transclude back to the main ns. A bit more work, however, it may be a better result. Definitely something that we could then nominate as a Featured text. Note that this is not to stop you doing the work that you are currently undertaking, as we can backfill the work into these namespaces once the backend file is completed. More about our Proofreading project. billinghurst sDrewth 22:41, 20 January 2010 (UTC)

You are quite right, I've been working today on a script to automatically compile the images (which I batch downloaded) into a DjVu file, and I've succeeded! The result is at File:Memoirs of Sir Isaac Newton's life.djvu. I've never used djvu before (or scripted Windows like this), so it took a while, but the script is here if you're interested. Inductiveload (talk) 01:53, 21 January 2010 (UTC)
My main concern is the low quality of the DjVu file. Compare page 2 of Memoirs of Sir Isaac Newton's life.djvu and File:Memoirs of Sir Isaac Newton's life - 001.jpg. Notice how much detail is lost in the background of the former, and the noise around the text at the bottom! It's not surprising, given that the style of the writing and the detailed background are not well suited to compression. I will upload the rest of the images to commons as well as the DjVu, and Wikisource can use the DjVu while high quality images will also be available at commons. I will try to change the book over to the DjVu file now, as I'd like to learn how it works, and as you say hopefully be part of a featured text.
Also, if you know how to encode a DjVu with higher quality settings, I can do that easily and reupload a better version. Cheers, Inductiveload (talk) 02:13, 21 January 2010 (UTC)
Truly magnificent. I have created Index:Memoirs of Sir Isaac Newton's life.djvu and will hunt up some earlier discussion about djvu files, or get one of our more learned colleagues in that field to assist. billinghurst sDrewth 03:31, 21 January 2010 (UTC)
Those pages that I can find are pulled together at Help:DjVu files/other pages billinghurst sDrewth 04:12, 21 January 2010 (UTC)


That script doesn't pass any quality options to the c44 executable, so you're getting the default options. If you want better quality, you'll need to ask for it. Personally, I mostly just use "-decibel 48", which specifies a single block of the highest possible quality. That is, change

cmd = '"'+EXEDIR+'c44.exe" ' + '"'+infile+'"' + ' "'+TMPDJVU+'"'

to

cmd = '"'+EXEDIR+'c44.exe" -decibel 48 ' + '"'+infile+'"' + ' "'+TMPDJVU+'"'

Information on other quality options can be found at http://djvu.sourceforge.net/doc/man/c44.html.

If ramping up the quality results in a DjVu file that is larger than our upload limit of 100 MB, then you'll have to fiddle around until you find quality options that yield a small enough file.

Hesperian 04:45, 21 January 2010 (UTC)
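A minimal sketch of the same change with the command built as an argument list rather than a concatenated string, which avoids the manual quoting. The helper names `build_c44_command` and `encode_page` are hypothetical and not taken from the original script; only the `-decibel` option and the input/output argument order come from the discussion above.

```python
import subprocess


def build_c44_command(c44_path, infile, outfile, decibel=None):
    """Return the argument list for one c44 run (hypothetical helper)."""
    cmd = [c44_path]
    if decibel is not None:
        # e.g. "-decibel 48", the quality setting suggested above
        cmd += ["-decibel", str(decibel)]
    cmd += [infile, outfile]
    return cmd


def encode_page(c44_path, infile, outfile, decibel=48):
    """Run c44 on a single image; raises CalledProcessError on failure."""
    subprocess.check_call(build_c44_command(c44_path, infile, outfile, decibel))
```

Passing a list to `subprocess` means paths containing spaces need no extra quote characters.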

Excellent! Just the information I needed! Thank you so much for the quick reply! − Inductiveload (talk) 05:16, 21 January 2010 (UTC)
You're very welcome. While I've got you, the other way to get massive compression whilst maintaining nice looking images, is to discard the colour: most pages are blackish ink on whitish paper, so colour takes up a lot of room for very little information. Use imagemagick to convert the pages to (greyscale) PGM format, then use c44; or, if you really need a lot of compression, convert them to (bitonal) PBM format, then use cjb2. Rock on. Hesperian 05:34, 21 January 2010 (UTC)
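The two routes described above could be sketched as follows. This is only an illustration under assumptions: the ImageMagick executable is taken to be `convert` (newer installs may call it `magick`), the 50% threshold for the bitonal case is an arbitrary starting value, and the function names are hypothetical.

```python
import subprocess


def greyscale_commands(src, stem):
    """Commands for the greyscale route: image -> PGM -> DjVu via c44."""
    pgm = stem + ".pgm"
    return [
        ["convert", src, "-colorspace", "Gray", pgm],
        ["c44", pgm, stem + ".djvu"],
    ]


def bitonal_commands(src, stem):
    """Commands for the bitonal route: image -> PBM -> DjVu via cjb2."""
    pbm = stem + ".pbm"
    return [
        # The threshold value is an assumption; tune it per scan.
        ["convert", src, "-threshold", "50%", pbm],
        ["cjb2", pbm, stem + ".djvu"],
    ]


def run_all(commands):
    """Execute each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.check_call(cmd)
```

For example, `run_all(greyscale_commands("page001.jpg", "page001"))` would leave `page001.djvu` next to the intermediate PGM file.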

Managing annotations

Started a discussion for your consideration at Index talk:Memoirs of Sir Isaac Newton's life.djvu. billinghurst sDrewth 10:20, 21 January 2010 (UTC)

Apples and pears

I liked it very much :-) I'm sure the text scans will turn up sooner or later, I've seen some pages displayed at another site. Cygnis insignis (talk) 06:50, 23 January 2010 (UTC)

Philosophical Transactions of the Royal Society (1665-1886)

Some time ago I started to work on them; I can build a single djvu volume of the Philosophical Transactions from the various pdf parts for each volume available on commons. Is it a deliberate choice to separate the djvu files per issue number? And what about figures: if I upload one djvu per volume, must we keep them in the djvu file? Phe (talk) 16:37, 30 January 2010 (UTC)

Hi Phe, I'm working with PDFs from a different source without that banner at the bottom. They come (as the files at commons do) in sections by the article (many to an issue). I have a script to automatically explode these PDFs into pages and recombine into a djvu to form an issue. I could make a whole volume, but I reckon that single issues are easier to manipulate (and easier for me to do bit by bit when I have time - a whole volume has a lot of constituent files to deal with, and I'm not sure that I can download them all in one go without breaching ToCs). I tried using the existing Commons files, but I couldn't strip off the banner nicely as I can by getting alternatively sourced files.
As for figures, I guess that the best way is to make a separate image file on commons, in a sub-category of the Volume and give a systematic file name (eg Phil Trans Figure 01, Volume 001, Number 01.png) for easy reference later on. I have other projects right now, but I'll be back to this soon.
I already brought the drop-caps out to separate files for inclusion in the text (although the naming is more problematic, as the same one is often used multiple times). Inductiveload (talk) 04:20, 31 January 2010 (UTC)
Having had a look around, maybe I will start collating into complete volumes. Inductiveload (talk) 22:43, 31 January 2010 (UTC)
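A small sketch of the systematic figure naming suggested above, assuming the zero-padding shown in the example (two digits for figure and issue number, three for volume). The function name and the default extension are illustrative only.

```python
def figure_filename(figure, volume, number, ext="png"):
    """Build a Commons file name following the pattern proposed above."""
    return "Phil Trans Figure {:02d}, Volume {:03d}, Number {:02d}.{}".format(
        figure, volume, number, ext
    )
```

Generating the names this way keeps them sortable and makes it easy to reference a figure from its volume and issue.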
 
For figures, yes, I wanted to upload them as separate files (like I did for Commons:Category:Transactions of the Geological Society of London, 1st series, vol 4); my question was ambiguous and meant "must images be uploaded as images and added to the djvu too?" I think yes, even if it makes the <pagelist> more complicated. I removed the banner, so no problem there. I don't understand what you mean by breaching the ToCs. The main issue is splitting: in the short term it's easier, but I think the Phil. Tr. are big enough to allow us to create a mess very easily, and splitting volumes will make our life more difficult in the long term. Anyway, I'll start on volume 2, upload it and create the index in a few days, so you'll have a real example to check how I see the whole thing. First I need to fix the OCR results: long s is very poorly recognized, and I must write a bit of code to fix it, at least for the most common words. Phe (talk) 08:43, 1 February 2010 (UTC)
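One way such a fix for common words might look, as a rough sketch only: the word table below is invented for illustration (it is not Phe's actual list), and it assumes the OCR misreads long s as "f" and that the corrected letter should be written with the {{ls}} template.

```python
import re

# Hypothetical table of frequent OCR misreadings -> corrected wikitext.
LONG_S_FIXES = {
    "fuch": "{{ls}}uch",
    "firft": "fir{{ls}}t",
    "moft": "mo{{ls}}t",
    "alfo": "al{{ls}}o",
    "thefe": "the{{ls}}e",
}

_WORD = re.compile(r"[A-Za-z]+")


def fix_long_s(text):
    """Replace whole lowercase words found in the table; leave the rest."""
    def repl(match):
        word = match.group(0)
        return LONG_S_FIXES.get(word, word)
    return _WORD.sub(repl, text)
```

Restricting the replacement to whole words from a fixed table avoids touching genuine "f" words, at the cost of missing anything not listed.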

Newton

You picked up one extra page in the transclusion at Newton, Isaac (DNB00) (the "Sir" is omitted in the title, by the way), and therefore picked up the wrong author too. All fixed now. Charles Matthews (talk) 14:10, 1 February 2010 (UTC)

Ah, thank you! Just my luck to have two mathematician astronomers next to each other, I didn't spot the difference! Thanks for fixing it, and for the heads up about the naming, I'm new to the DNB project. See you round, hopefully our next discussion won't be about my newbie mistakes :-) − Inductiveload (talk) 17:27, 1 February 2010 (UTC)


Cropping useless things

It's a trick based on the various crop boxes defined inside a pdf and some peculiarities of the image we want to remove. ([1] will give some information about these crop boxes.) Note this code is specific to the pdf we were talking about. Phe (talk) 17:40, 1 February 2010 (UTC)

# -*- coding: utf-8 -*-
# Written against the legacy pyPdf API (PdfFileReader / PdfFileWriter).

import sys
from pyPdf import PdfFileWriter, PdfFileReader

def crop_negative_box_offset(in_file, out_file):
    # Keep the input open until the output has been written: the writer
    # pulls page data through the reader.
    with open(in_file, "rb") as in_stream:
        reader = PdfFileReader(in_stream)
        writer = PdfFileWriter()

        for page_nr in range(reader.getNumPages()):
            page = reader.getPage(page_nr)
            if page_nr == 0:
                # First page only: reset the crop box so that it runs from
                # the origin to the upper-right corner of the media box.
                page.cropBox.lowerLeft = (0, 0)
                page.cropBox.upperRight = page.mediaBox.upperRight
            writer.addPage(page)

        with open(out_file, "wb") as out_stream:
            writer.write(out_stream)

if __name__ == "__main__":
    # Usage: script.py input.pdf output.pdf
    crop_negative_box_offset(sys.argv[1], sys.argv[2])

Volume 2 is uploaded!

Inductiveload, volume 2 is uploaded! I uploaded volume 2 (File:The Mathematical Principles of Natural Philosophy - 1729 - Volume 2.djvu) this morning. Please look it over and make sure we are not missing any pages because of the conversion process. Also, we need to remove the Google page and eventually move it to the Commons. Hopefully, I'll talk to you tonight! --Mattwj2002 (talk) 12:49, 4 February 2010 (UTC)

Transactions of the Linnean Society of London/Volume 20/Some account of Triplosporite, an undescribed fossil fruit

That blue link had me very excited... and then very disappointed to discover the contents worse than non-existent. In my opinion a link should not be turned blue until there is real content there to be read. I see you working around the fringes of this article; are you planning on transcribing it soon? Hesperian 06:00, 9 February 2010 (UTC)

Yes, I will do it tomorrow probably (the Plates are also present in this scan :-)), as I'm about to head off now. If you want to "break" the link temporarily, that's fine by me.
In other news about the Trans. Linn. Soc., I presume you saw the upload of volumes 9-22? I couldn't find 23 onwards at the IA or at JSTOR, but I'll keep an eye open, as it would be nice to have the set. Cheers, -- Inductiveload (talk) 06:07, 9 February 2010 (UTC)
Sweet. You might find it easier to copy-paste the validated version, and use match and split. Not that I've ever tried that trick myself.

The plates are theoretically present in Miscellaneous, but they are in the atlas volume, Volume 3, of which there are no online scans at present.

Nah, as long as it's on your radar I'm not bothered. I was worried that you might be planning on doing that for every article in all 20 volumes, and then walking away!

I uploaded volume 10 ages ago, then 8 and 1 a while later. Then I decided to go crazy and upload the lot, but I ran out of puff. It was great to see you finish the job for me.

Hesperian 06:31, 9 February 2010 (UTC)

Very impressed by the way you uploaded separate images for each figure, then recomposed the plate using a table. For bonus points, note that if you go to http://www.archive.org/details/transactionsofli05linn, then click on "HTTP" in the side bar, you'll find a very big file named ...jp2.zip, which contains jp2 images of every page, at scanned resolution. The images that could be extracted from that would be vastly better than the ones you've uploaded. Or, if you don't want to go quite that far, you could click on "Read online" in the sidebar, navigate to the desired page, zoom to 100%, right click, and save out a jpg that will be vastly better than what you've uploaded. Hesperian 03:39, 10 February 2010 (UTC)

Hi! I was just going to say I had done it. I didn't know that about the IA, I had a look at the file, but I will have to download it tomorrow using a faster connection (260+ Megs)! At least I have done the legwork and created the image files and descriptions, it is easy to upload over them. Robert Brown now also has a Commons creator template at http://commons.wikimedia.org/wiki/Creator:Robert_Brown. − Inductiveload (talk) 03:48, 10 February 2010 (UTC)

Never noticed it before

however we do have Help:Mathematics and Wikisource: fractions and functions that you may wish to peruse, and improve or more successfully tell the community. billinghurst sDrewth 03:20, 12 February 2010 (UTC)

Thank you

Thank you for your work on the page Assistance to Haiti. Much appreciated, Cirt (talk) 19:58, 14 February 2010 (UTC)

No problem. You can copy and paste that layout for other CongRec pages, if you want to make them match. Any questions, come and ask me! − Inductiveload (talk) 20:16, 14 February 2010 (UTC)

Oi!

you got this baby ...

  • "Observations on the Marine Barometer," was published in the Philosophical Transactions of the Royal Society, Part 2 1806 ?

Thx — billinghurst sDrewth 11:52, 15 February 2010 (UTC)

OK, I got that (it's in Philosophical Transactions, 96, pp. 239-266). Lucky for you it's one of the easy-to-collate ones, so I'll have it at the IA today (converting a 500+ page document at 600dpi is not reasonable for my computer), and I'll up the DJVU when it's done. The last one took less than 12 hours to get converted, so I might be quick! If you have access to JSTOR, you can find that particular article at http://www.jstor.org/stable/107194, so you can begin writing if you like, and copy to the pages when they come. − Inductiveload (talk) 17:39, 15 February 2010 (UTC)
Thanks. Not urgent, and I can wait until it is in place. I just like to serially link works. /me has an evil and novel plan to mathematically demonstrate that every document is linked within six degrees of separation. Seriously, it just looked interesting, linked into Matthew Flinders stuff, and helped for your goals. Win win win. — billinghurst sDrewth 00:49, 16 February 2010 (UTC)
Now available at Index:Philosophical Transactions - Volume 096.djvu, starting at Page 287. Inductiveload (talk) 08:08, 16 February 2010 (UTC)

Mars

Hey Inductiveload, I just wanted to let you know that I uploaded File:Mars - Lowell - Page 290.jpg and File:Mars - Lowell - Page 291.jpg. They are located on the Commons as you requested. I'll talk to you soon. Let me know if there is anything more I can do to help. --Mattwj2002 (talk) 21:12, 20 February 2010 (UTC)

One other note. Where is my beer? :P --Mattwj2002 (talk) 21:13, 20 February 2010 (UTC)

Index:EB1911 - Volume 01.djvu

I started to work on this file a few days ago and manually uploaded pages 182 and 183. The average quality of the ocr is good (182 is among the worst 20% of pages, 183 among the best 30%). I added the rh template to the text layout. I haven't yet uploaded the new djvu version, as I'm unsure whether mediawiki will accept such a big text layer until this bug is fixed. I'm now looking at whether I can add <section begin/end= />, but I don't know what section names to use. And what if I try to add links for strings looking like "see <CAPITAL LETTERS>"? I'm unsure whether that is reliable enough. Any thoughts on both questions? Phe (talk) 15:50, 24 February 2010 (UTC)

*sighs*, I tried three times to upload the first volume, three failures ... Phe (talk) 18:05, 25 February 2010 (UTC)
Cool! Thanks for doing that, I'm sure the EB guys will appreciate having the text available (when it comes through, that is...). You could ask Blurpeace (often in IRC) to delete the files for you and then reupload them. He's a Commons admin, so he might be able to do it for you. I'm not sure if that's allowed or not, but I was the uploader, and I have no problem with it. It's a pretty poor workaround, but if it works, it works :-). I'm busy at the mo, just dropped in to check messages, but I should be in IRC this weekend.
I think you should add the section tags if you can get a reasonable reliability. Even if they are occasionally wrong, they will only actually be used when the article is transcluded, and so they will be proofread first. Any existing articles will just overwrite them when they are moved over, so no problem there either. As for the "see STUFF" you can also try, worst thing is a red link. The template to use is {{EB1911 Article Link}}. This would be good, as it will force standardization of new articles' inter-linking. Inductiveload (talk) 20:27, 25 February 2010 (UTC)
Pardon the interjection, helpful or nosey: you will find a link on Files (at commons) that overwrites it with a comment on the changes. This one is at "Upload a new version of this file", like a page you can revert changes if you need to. Cygnis insignis (talk) 21:01, 25 February 2010 (UTC)
Sorry, just read bug report and see this may have been more complicated. Cygnis insignis (talk) 21:08, 25 February 2010 (UTC)
@Cygnis insignis. I used this link; in fact, I didn't even reach the bug. With my poor upload bandwidth, it takes 1h30 to upload the file and I always get a "session lost, retry or logout/login". I'll retry with a higher compression rate to get a smaller file.
@Inductiveload, I'm looking at this template. I'm unsure whether I'll create links, as I have no idea for now of the number of false positives. First I'll upload vol. I without them and discuss possible improvements with Blurpeace.
Phe (talk) 09:51, 26 February 2010 (UTC)
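To get a feel for the false-positive question, the "see <CAPITAL LETTERS>" idea could be prototyped along these lines. This is a sketch under assumptions: the single-parameter form `{{EB1911 Article Link|Title}}` and the conversion of the capitalised run to title case are guesses, not confirmed usage of the template.

```python
import re

# "see" or "See" followed by one or more all-capital words (2+ letters each).
SEE_PATTERN = re.compile(r"\b([Ss]ee) ([A-Z][A-Z]+(?: [A-Z][A-Z]+)*)\b")


def title_case(name):
    """Turn 'ROMAN LAW' into 'Roman Law' (assumed article title form)."""
    return " ".join(word.capitalize() for word in name.split())


def link_see_references(text):
    """Wrap each 'see CAPITALS' target in the article-link template."""
    def repl(match):
        return "%s {{EB1911 Article Link|%s}}" % (
            match.group(1), title_case(match.group(2))
        )
    return SEE_PATTERN.sub(repl, text)
```

Running it over a few pages and counting the red links would give a first estimate of reliability; note that capitalised Roman numerals such as "see XII" would also match, which is one obvious source of false positives.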

Guidance on proofreading Phil Trans.

As a novice I embarked (ambitiously) on the proofread of volume 1 of Phil. Trans. My first concern is how to treat the lower case letter 's'. I would prefer to render these into a standard lower case letter but on looking back to the dedication I see that they have been left as 'long s'. (Named?) I think that the latter look rather ridiculous, particularly in the chosen font. Surely we are not trying to mimic the original fonts? I am happy with preserving archaic capitalisation, italics, spelling etc but we do have to produce a readable final product. I'm also distinctly queasy about using a modern sans-serif font for material of this vintage. (The title page is strikingly ugly). Could we decide on a serif font which is closer to the original? (After all we all know that serif fonts are easier to read). Shan't read another page until you make the decisions! And where will you put any guidelines so that they confront a new editor before he starts on a page? Peter Mercator (talk) 22:53, 6 March 2010 (UTC)

Hello! Glad to see that we have interest in this ambitious project!
Long s. The jury is still out on this one. The template {{ls}} renders as a "long s" in the "page" view to indicate that the text originally used such a letter, but when that is transcluded to the main page, it renders as a normal "s", as you say, to improve readability. I don't know if you noticed before but pages like this one do not have long-s's in them, even though the source pages do, due to this template. Generally we would try to maintain the original in this case. There is a case to be made for not bothering to put the {{ls}} templates in the first place, as this is a minor thing and is rather a pain to do by hand (and could be added later with automated help, as the replacement rules are very simple), but certainly they should not be removed, as they render the same in the end product, and only add informational content to the source. We are not trying to mimic the original fonts - the focus is on the textual content, and it is debatable whether long-s counts or not. If you want to put them in where there weren't any, do so, but if there are some, leave them be!
Serif vs sans-serif. Texts at Wikisource are usually done in the "default" sans-serif font, as this is usually easier to read on a screen, as opposed to printed matter, where serifs would be preferable. It would be better to leave the text as "default" and if you really care, you can alter your preferences to render as serif just for you. The title page still needs work, and there is a {{serif}} template to allow you to do this on a small scale. Remember that you can make no assumptions about the fonts people will have, or the browser settings, so you can never specify a specific font to use, not even a "basic" one like "Arial" or "Times New Roman". However, the main body of text should generally not be made serif. The primary aim of Wikisource is the textual content, not the formatting. We try to give the general idea of the original layout and formatting, but at the end of the day, if you want the original font and layout, go and read the scan!
I will lay out a simple Manual of Style for the project, as this has not been done yet, but as you say needs doing.
If you want to discuss these options, continue here, and remember you can also come and chat in the IRC channel - the WS channel is usually very friendly (if not always busy).
I hope we can recruit you long-term into this project, as it is huge and is in serious need of help! See you around! Inductiveload (talk) 00:00, 7 March 2010 (UTC)
Manual of Style now here. Feel free to add to it or adjust what I've put.
I changed the {{ls}} templates back on that one page. I hope you don't mind, but I strongly feel that they only add to the informational content, and aesthetically speaking it makes no difference to the end product, as in the main page, they are rendered as "s". However, I am really impressed with your proofreading diligence in moving punctuation in and out of the italic areas! I hadn't even considered that. That's a good eye for detail you have there!
I had another thought about the serif font issue. It would be possible to have a special style class for the Philosophical Transactions, which would allow users to set the appearance specially for those pages. I will look into it, and get back to you on it. Best, Inductiveload (talk) 01:13, 7 March 2010 (UTC)
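As an illustration of adding the {{ls}} templates with automated help, here is a deliberately simplified sketch. It assumes one rule only (a lowercase "s" that is not the last letter of its word was printed as a long s) and ignores the exceptions found in real typography, so its output would still need checking against the scan. It is also meant for plain text only, since it would mangle existing templates or links.

```python
import re

# A lowercase "s" immediately followed by another lowercase letter,
# i.e. an "s" that is not at the end of a word.
_NON_FINAL_S = re.compile(r"s(?=[a-z])")


def add_long_s(text):
    """Replace every non-final lowercase s with the {{ls}} template."""
    return _NON_FINAL_S.sub("{{ls}}", text)
```

Because the templates render as a normal "s" when transcluded, a pass like this changes only the page view, not the end product.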


Thanks for the welcome to the Phil Trans project. I have been editing physics matter in various forms over my career and it would be interesting to work on this project in time of retirement. Don't be too impressed by adjustments of punctuation for on the edit page the changes were obvious to cope with one or two incorrectly italicized words: I'm not that fanatic. (I trust you don't mind going back to the left margin: I'll leave the indent free for you. Otherwise we creep ever rightwards.)

 Several points. I think the title page looks much smarter, even if it needs later tweaks. I like the option of being able to switch to a serif font for the main text. (I hadn't realised that I could set a font preference so you will realise that I'm still just scratching the surface as a wiki editor.)

 That long 's'. I'm afraid I have to differ. Ok, I now understand that the main article loses the long s but I can see no reason for including it in page view. In the long run it will be only the main page that is accessed (and possibly also the original scans). The page view will be seen very little after validation. So, only editors will work on the page view and the question is whether the long s makes their work easier or more difficult. I suspect the latter if you were to add them in their hundreds using a clever macro which leaves alone the 's' extant in original. (How? Presumably the ocr marks them in some way.) Personally I would read the text and compare with the scan in page view and only then open the edit window. Remember how big this project is going to be. The long s will not make it simpler to recruit proof readers. I can see that it adds a spurious authenticity but, as you say, in the end we only require that the main page be as good as possible rendering of the original textual content. What happens in between is irrelevant. And what about that eszett on line 8 of page 1? Do other editors have opinions?

 I foresee another problem. I strongly believe in the value of first line indentation (other than the first para of each section). No indentation is a modern perversion promulgated by management consultants (not printers) who think it looks smart and modern. I am sorry it is the standard on wiki. For example, your first reply to me shows how difficult it is for the eye to seize upon breaks and assess the structure of the text (especially when the last line of a para is 'full'). (Of course, in mathematical work indents are essential because without them one cannot assess the logical status of text after a displayed equation). So, looking ahead in Phil Trans I see lots of indented paras. We should be faithful to them. (Aside: I looked at the Newton biography and I think the readability is poorer without the indents of the original.)

 Do you have a policy for page breaks at hyphens? My preference would be to put the whole word on the first page and expunge it from the second. I have seen this on some pages I have proofread (not Phil Trans) but I am aware that there is a rather heavy template to cope with this. Should numbered lists be marked up as such? (Yes for me). And what happens to a list over a page break? Final point. Should a note on old style dates go into the style manual. Not every editor in the future will know these conventions. Given the size of the project it would be good to get these points clear before the production run!

 Enough for now! Peter. (Peter Mercator (talk) 22:29, 7 March 2010 (UTC))

  • Long 's': I know they are a hassle, and I personally will have no problems with editors ignoring them for now. Feel free to not put them in. Phe, another editor, is very good at OCR tailored to a specific work, and his OCR even gets most of the long-s's and converts them to ls templates. "Normal" OCR doesn't expect long-s's and fails every time to recognise them, especially as they look a lot like f's. The scan/printing quality isn't high, so the general OCR we have will not make a neat job of it anyway, Phe's custom jobs will be significantly better. I don't think he's doing the Phil Trans right now (there is a lot of text waiting for this treatment), but you can always discuss it further with him. He has made a huge input to Newton's Principia which has more of the benighted things. You can leave them out with no problems, some enterprising individual may put them in later if he likes them, or it may never get done, who knows? Eventually it would be nice (IMO) if editors can choose to see the long-s in the main view or not (using some preferences settings), using template magic, in which case using the ls template would be required to add the required semantic content. Exactly this discussion has been started at the {{ls}} talk page before.
  • The eszett: Leave it for now, it could conceivably have a similar treatment to long-s, with a lss template, or something. I don't find this or the long-s issue particularly pressing. I'm having words with others about a generic treatment for this kind of thing.
  • Indents: we have a way to deal with this. In the overarching div tag around each main page, there is a style attribute. Using style="text-indent:1.5em" will indent each first line. See here for an example in use. However, this breaks the drop cap alignment. I will take a look at it later. For now, don't bother, it is a single edit to change a whole issue's styling, so it can easily be done later on. I am still working on the Newton thing, the layout hasn't been sorted properly yet. Text first, layout second!
  • Thought: add that styling option to your personal options, and you may be able to make indenting default for you for all pages.
  • Reverse indent: I haven't thought a lot about this kind of reverse indenting of the first line. There may be a CSS thing we can do. Leave it for now, I will look at it in time.
  • Hyphenation: we have a system for that. {{hyphenated word start}} and {{hyphenated word end}}, abbreviated {{hws}} and {{hwe}} are used. The first one provides the first half of the word on the page view, and shows nothing in the main page, the latter shows the second half of the word in the page view, and includes the entire word on the main page.
  • Lists: Generally, yes I prefer to have a real numbered list, but over page breaks there will be a problem. There is a way to start an HTML list at a certain point, but it is ugly and there is no neat wiki markup for it. Since the text is fixed, I find it acceptable to just use hardcoded numbers. This also sidesteps any issue we might have with how the software renders numbered lists, eg. 1. vs 1)., and if we can rely on that rendering being fixed for all time.
  • Dates: Yes, feel free to add a note about the old/new style, which is I believe caused by the shift between Gregorian and Julian calendars (or something like that). If there are any other points you want to mention, put them in, at least on the talk page, so we can have a think about them!
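For the hyphenation system described in the list above, a helper like the following could generate the two template calls for a word split across a page break. The argument order (displayed part first, whole word second) is an assumption about the templates, and the function name is hypothetical.

```python
def split_hyphenated(word, split_at):
    """Return ({{hws}} markup, {{hwe}} markup) for a word broken at split_at."""
    first, second = word[:split_at], word[split_at:]
    start = "{{hws|%s|%s}}" % (first, word)   # goes at the end of the first page
    end = "{{hwe|%s|%s}}" % (second, word)    # goes at the start of the next page
    return start, end
```

For instance, `split_hyphenated("Wikisource", 4)` gives the markup for "Wiki-" on one page and "source" on the next.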

Best, I will have a think about all this stuff, try some ideas out and have a chat with some others and get back to you. In the meantime, I would suggest working on importing raw text and heavy lifting and not to worry overly about fine formatting until we have some better solutions. The text is the important thing after all! Have a good one! Inductiveload (talk) 00:57, 8 March 2010 (UTC)

Update I have made a template {{custom substitution}} that will let users see the long-s as "ſ" in the main space too, should they wish. It is a general template, and there now also exists {{eszett}} with the same aim. The default behaviour is what happens now: it is shown in the page namespace, but not in the main space. I have the settings in my monobook.css to reverse that and allow me to also see it in the main space. The {{ls}} template hasn't actually been changed over yet, as it is a large change to a commonly used template, so I'll make sure it will work properly first. Inductiveload (talk) 02:43, 8 March 2010 (UTC)
Update on reverse indentation. See here on how to do that reverse indent. It's a combination of "text-indent:-2em" which pulls the first line left, and "padding-left:2em", which pushes the whole block right. Inductiveload (talk) 18:35, 14 March 2010 (UTC)
News – I created class=leftoutdent to do this. — billinghurst sDrewth 09:04, 19 March 2010 (UTC)
  • More news, during template farming yesterday, I saw {{hanging indent}}, which may help simplify things if it is just a single paragraph to format.

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
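The reverse indent discussed in this thread combines a negative text-indent with an equal left padding. A small sketch that builds that inline style, using the 2em value quoted above as the default; the helper names and the choice of a plain div wrapper are illustrative assumptions.

```python
def hanging_indent_style(amount_em=2):
    """Inline CSS: pull the first line left, push the whole block right."""
    return "text-indent:-{0}em; padding-left:{0}em".format(amount_em)


def wrap_div(text, amount_em=2):
    """Wrap a paragraph in a div carrying the hanging-indent style."""
    return '<div style="%s">%s</div>' % (hanging_indent_style(amount_em), text)
```

Keeping the two values equal means the first line starts flush with the surrounding text while the remaining lines are indented.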

You ready for the catwalk?

Now you can say no if you really want to and are prepared to be a major disappointment to me, however, I am wondering whether you are interested in becoming an administrator at English Wikisource. You are totally capable, have been doing the mop and bucket job, and it would make some of that work easier if you were given admin access. I am prepared to submit your name for nomination if you are okay with the suggestion. — billinghurst sDrewth 05:06, 16 March 2010 (UTC)

  • Sure, I'd love to! Thanks, I'm honoured! Inductiveload (talk) 16:38, 17 March 2010 (UTC)
The deed is done. The convention is to accept if you so wish to continue. — billinghurst sDrewth 09:00, 19 March 2010 (UTC)

PSM image uploads

Many thanks for your guidance. I was losing hope. I've got another ~50 images already cropped and ready for upload and this will complete 99% (sure that some were missed) of the images in Volumes 1 to 10. Afterwards, I will use IA. — Ineuw (talk) 11:10, 19 March 2010 (UTC)

New York Times

Hey Inductiveload,

I just wanted to let you know that I have moved the New York Times Collaboration Page to the Wikisource namespace. I have announced it on WS:S. I just thought I would let you know. --Mattwj2002 (talk) 12:04, 22 March 2010 (UTC)

PSM image uploads continued

Hi. Having been occupied, I have only now got around to dealing with your suggestion regarding image copying directly from IA. I created a little gallery of three different sources of the same image, each yielding different resolutions, and the file names explain the source. All three were copied and pasted into the IrfanView (Windows) image editor and saved as is.

The best resolution by far is gained from the way I've been doing it, which is copying the image from the Wikisource .djvu page when in edit mode. All uploaded images on the commons were processed this way. Below are the links to the results:

  1. First attempt directly from IA as recommended. deleted
  2. Copied from the commons copy of the djvu. deleted
  3. The most detailed result. deleted

This copy on the Commons is the one used. It originates from #3 and the resolution was lowered to about 1024x1024. I found this size is sufficiently clear for hand drawn illustrations. If this was wrong, I will replace it.

Thanks again,

— Ineuw (talk) 00:03, 28 March 2010 (UTC)




The most important reason was forgotten. The volume, the .djvu page (not the printed page number), and the Fig. numbers are used to construct the filename. Part of the process when the .djvu page is opened in edit mode is to define the name and proofread the image description for use in the upload template. Only then is the image pasted into the editor for possible cropping or resizing and saving as a jpg file. Since I am prone to typos, this is my preferred way of reducing them.

Added on: — Ineuw (talk) 18:53, 28 March 2010 (UTC)
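As an illustration of the naming step described above, here is a minimal sketch of building a file name from the volume, the .djvu page and the Fig. number. The function name and the "PSM V.. D... Fig .." layout are assumptions made for this example; the exact pattern used for the uploads is not stated in this thread.

```python
def build_image_filename(volume: int, djvu_page: int, fig: int, ext: str = "jpg") -> str:
    """Assemble a file name from volume, .djvu page and Fig. number.

    The layout below is a hypothetical pattern chosen for illustration;
    zero-padding keeps the names sorting in page order.
    """
    return f"PSM V{volume:02d} D{djvu_page:03d} Fig {fig}.{ext}"


if __name__ == "__main__":
    # Volume 1, .djvu page 45 (not the printed page number), Fig. 3
    print(build_image_filename(1, 45, 3))
```

Generating the name from the three numbers, rather than typing it by hand, is one way to cut down on typos.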

Hi! Sorry for the delay, I was on holiday! If you look carefully at the image derived from the DJVU (#3) you will see regions destroyed by the compression. Are you sure you're zooming into the IA jpg far enough? You need to go to 100% for full detail. This image here was derived like that and slightly sloppily cleaned (it may need more work, but you can see the extra detail lost in the DJVU). However, the hatching is even and clean, without that bad block-wise compression caused by DJVUification (DJVU is a heavily compressed format - that is the whole point). Plain resolution isn't the aim here, the quality of the image is - DJVU destroyed the details but kept the resolution! Inductiveloadtalk/contribs 22:06, 3 April 2010 (UTC)
Thanks for the reply, and yes, I see what you mean. I am still experimenting with the cleanup, as I want the images to be rendered as clearly as possible. The heartening fact is that most of the illustrations are fairly simple and there is no loss of clarity in detail. The big mistake was that until recently I didn't know how to remove the yellow decently, and uploaded 1000+ images as is (a.k.a. rushing it). Now, I already have some good results with IrfanView, which can batch the conversion process for future volumes. Additionally, I am trying to set up IrfanView as the external image editor in Wikimedia (it's possible), but so far I can only retrieve text but not the image. Gradually, I wish to revisit and fix my previous uploads. — Ineuw (talk) 23:23, 3 April 2010 (UTC)
Cool. The hard bit is uploading the first versions, with the metadata, linking them, categorising, etc. New versions are quite easy to upload later, over the old ones, and as long as the source is listed, anyone can see the original jpg at IA for now. If you get IrfanView working in batches, I'd love to see it, as de-yellowing is a very common task for myself and many other Wikisourcerers! Inductiveloadtalk/contribs 00:37, 4 April 2010 (UTC)
For sure, the great thing about IrfanView is that their .ini text file keeps all settings used and can be posted here and shared. Your interest encourages me to have results for you as soon as the weekend is over. We are having unseasonably beautiful weather (Montreal), and I wish to take advantage of it. — Ineuw (talk) 01:04, 4 April 2010 (UTC)
I didn't forget, but so far I have had no luck configuring Wikimedia images for external editing, and no one has been able to help on Commons or on MediaWiki, where I left several posts. So, I'll just keep on trucking as is for now but will post a link in the Scriptorium just in case. — Ineuw (talk) 00:08, 10 April 2010 (UTC)
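For anyone curious what "removing the yellow" can mean numerically, below is a rough sketch of one common approach: rescaling each colour channel so that the paper colour becomes pure white. This is not IrfanView's method or the settings discussed above; the function names and the sample paper colour (240, 225, 180) are assumptions for illustration only.

```python
def deyellow_pixel(pixel, paper):
    """Rescale one RGB pixel so that the paper colour maps to pure white.

    Each channel is multiplied by 255 / paper_channel (integer arithmetic,
    rounded to nearest) and clipped to the 0-255 range.
    """
    return tuple(
        min(255, (c * 255 + p // 2) // p) if p else 255
        for c, p in zip(pixel, paper)
    )


def deyellow(pixels, paper=(240, 225, 180)):
    """Apply the rescaling to a list of RGB tuples.

    `paper` is a hypothetical sample of the yellowed page background;
    in practice it would be measured from a blank margin of the scan.
    """
    return [deyellow_pixel(px, paper) for px in pixels]


if __name__ == "__main__":
    # The paper colour becomes white; black ink stays black.
    print(deyellow([(240, 225, 180), (0, 0, 0)]))
```

Because every channel is scaled independently, dark ink stays dark while the yellow cast of the background is neutralised.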

moved something

Hey, I moved The Mathematical Principles of Natural Philosophy (1729), but as I checked the 'what links here' I saw (and vaguely remembered) it is an example in a template. Shall I undo what I did? Cygnis insignis (talk) 14:11, 3 April 2010 (UTC)

Hi, I saw the move, the rationale for moving it is fine but relative links can't work for this page, as this code is transcluded in the index page. Phe (talk) 14:14, 3 April 2010 (UTC)
It also created a bunch of double redirects. Shall we undo for now? Cygnis insignis (talk) 14:18, 3 April 2010 (UTC)
I've fixed some of them, but I prefer to wait for Inductiveload's comment before continuing. No hurry to revert. Phe (talk) 14:38, 3 April 2010 (UTC)
No complaints about the move from me, feel free to rejig as needed! Phe is right about the relative links - I used those to have a table of contents in the Index page. Seemed the only way to do it. Inductiveloadtalk/contribs 22:07, 3 April 2010 (UTC)

Adminship

You now have the sysop flag. Good luck and feel free to ask if you have any questions.--BirgitteSB 11:55, 5 April 2010 (UTC)

Congratulations! -- Cirt (talk) 01:58, 10 April 2010 (UTC)

re Sklar v. Commissioner of Internal Revenue

Thank you very much for doing this! I see that there is some text missing - can you add the full text of the opinion? Thank you again for your time, -- Cirt (talk) 01:57, 10 April 2010 (UTC)

Congrats on completion

I am presuming that it is your first solo work. Nice and congratulations. From looking it would seem that you have done paragraph line breaks from first principles … very heroic. Hesperian and I have been farting around with monobook.js scripts based on the regex gadget. We have a nice script there called cleanup() which does a good fix of lots of basic bits. — billinghurst sDrewth 02:30, 12 April 2010 (UTC)

Leadin indents in PSM

Hi. Many thanks for placing the indent in the PSM layout template. How do I negate this auto indent where there shouldn't be one? There are such cases. :-) - Ineuw (talk) 06:51, 3 May 2010 (UTC)

Hi, I was about to leave you a message. The template {{nodent}} will do that for you (wrap the first paragraph) See here for an example. In the case where a non-indented paragraph spans a page break, you can use {{nodent/s}} and {{nodent/e}}, at the start and end of the paragraph. See here for an example. Inductiveloadtalk/contribs 06:54, 3 May 2010 (UTC)

Is table reconstruction entirely manual?

Please see this discussion, and kindly reply there, if you will. I'll keep a look-out for your response, if any. I'm kind of addicted to the "Kohs Block", as it has special meaning for me personally. If table-building is 100% a manual task, then I will still perform that task. I just don't want to try to build a garden bridge with plywood, nails, and the sole of a loafer shoe if a hammer happens to be available. -- Thekohser (talk) 14:47, 5 May 2010 (UTC)

I tried my first Wikisource table. Only if you have the time and inclination, I would like to learn how to "outline" the cells of the table, so that it better matches the original source. If you do so here, I'll observe how you do it, and then will hopefully have been "taught how to fish". -- Thekohser (talk) 14:46, 6 May 2010 (UTC)
I gave it a light scrub to show how I might handle it. I created a rudimentary class tablecolhdborder that does the vertical lines, as I have found it a reasonably popular format in older books; see some info at Wikisource:Style guide/Tables. — billinghurst sDrewth 15:18, 6 May 2010 (UTC)

À Eugène Lefébure - Monday 27 May 1867

Hi Inductiveload and thank you for your feedback. This project has just started and will need contributions from many. However, the translation of someone who wrote words more than ideas is bound to fail when it has given hopes of perfection. For this reason I chose to remain literal in its realization since the author himself could not do otherwise. It would also be contrary to the author's view to attribute plain clarity to the text, even in prose. The debate can continue, so I shall accept (but maybe not without discussion) any modification since they are ultimately possible. Francois Jacob. unsigned comment by Francois Jacob (talk) 14:39, 8 May 2010.

Hi, I'll continue the discussion here to keep it together. It wasn't a criticism or anything like that, I was just wondering if there were certain levels of meaning that were missed and so on (since the text as it stands makes almost no sense to me). However, if it is equivalently poetic in French, then of course we will lose a great deal of that in translation unless done "freely", which may distort the meaning. It is, of course, your call, and we can host a literal and free translation separately if needed.
The standard procedure with homebrew translations (as far as there is one, see WS:TRANS), as opposed to an already published translation, is to assign the "translator" field in the header as "wikisource", to indicate that that translation is a collaborative process, rather than straight transcription from a fixed source like the rest of Wikisource. We have quite a few French speakers here, and you may know some of them from frWS, and I'm sure they would gladly help you should you so wish.
One more thing: are you François Jacob, or do you just share his name? Just curious, as we don't have many 89-year-olds here. :-) Cheers, and please ask if you need anything! Also please sign your posts with four tildes (~~~~), it helps us keep track of conversations. Inductiveloadtalk/contribs 14:51, 8 May 2010 (UTC)
The same question has been asked by a fellow user on the French Wikisource, and I had the same regret when I informed him that François Jacob was a namesake. I also apologized to him, after a lengthy text, for the absence of a scientific mind in our project. I wish to thank you for your help and advice. Francois Jacob (talk) 16:44, 18 May 2010 (UTC)

Illuminated initials from DJVU

Thanks for the note. Just a quick question: how do I find the original PDF? Is it (always) available through Wikisource for scanned books of this type?

IncognitoErgoSum (talk) 17:29, 8 May 2010 (UTC)

Hi! Generally speaking, the DJVU has its source listed on the file page, but sometimes you need to look for it. I found it at Google books, and sometimes the Internet Archive (archive.org) has a nice copy.
The best quality from the IA can be found in the "JP2" files they provide in a zip file (follow the "HTTP" link). This is a large file, and you may need special software to convert the files, so it might be hard for you (it's hard for me). Failing that, you can view each page as a JPG by selecting "Read online", saving the image to disk, and editing it.
If the worst comes to the worst and the DJVU is the only thing you can find, then use that and drop me a note, and I'll have a look for it. If no other source exists (it happens) then we just have to use the DJVU, but it is rare that that happens. Best, Inductiveloadtalk/contribs 17:58, 8 May 2010 (UTC)

Unfair block

Dear Inductiveload – I have become fairly active on the Wikisource:Possible copyright violations discussion page in recent times. You are probably familiar with some of the issues I have raised there (including what is meant by an “Edict of Government”). On 2 May 2010 I made a number of edits. Most of these edits related to me “tagging, hiding and listing for discussion” works that were labeled as “Edicts of Government” (e.g. South African political speeches, a national anthem and other works). The same day Administrator Billinghurst blocked me. I cannot say precisely why – as he did not give precise reasons – but the general heading he gave was that “Okay, that is too rampant” (i.e. I was being too active in “tagging, hiding and listing for discussion”).

I have disagreed with Billinghurst on a number of copyright points of late – basically, I would like the same standard to be applied to all works. The same high standard that is – even if that means that a lot of works need to be listed for discussion etc - but his approach is different. I think Billinghurst views me as ‘trouble’. In contrast, I think I have made a worthwhile contribution, prompting interesting discussions, greater clarity and the removal of some works. Indeed, the works I “tagged, hid and listed for discussion” on 2 May 2010 have led to interesting copyright discussions on the copyright violations discussion page. I would like Billinghurst to apologise for blocking me and somehow “expunge” my record.

I would appreciate any contribution you would like to make on my talk page where my block is being discussed. I am sending this message to all persons who have participated on the same copyright violation discussions as me. I do not know how else to generate further participation in the discussion concerning my block save direct messages – as I cannot list this matter (a personal one) on the copyright violations page. The discussion is at User talk:Formosa. Given my treatment, I admit to feeling a bit disheartened about my continuing involvement in the copyright violations project. Thanks. Formosa (talk) 13:08, 9 May 2010 (UTC)

Thanks - I responded to you on my page. Formosa (talk) 18:43, 11 May 2010 (UTC)

Talkback

You have new messages
Hello, Inductiveload. You have new messages at Template talk:Overfloat left.
Message added 14:31, 15 May 2010 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

some thoughts with regard to a sidenotes usage — billinghurst sDrewth 14:31, 15 May 2010 (UTC)

Page move

[2] Is there a policy on this? Elsewhere, the policy seems to be to name articles with the commonly used name, which in this case is certainly "C. J. Dennis". Author:Banjo Paterson isn't at Andrew Barton Paterson. Moondyne (talk) 01:39, 23 May 2010 (UTC)

Not particularly, as far as I know. I thought we put the author at the fullest name we have, and redirect to there. I could be wrong. Inductiveloadtalk/contribs 03:10, 24 May 2010 (UTC)
There has been Scriptorium discussion in the past, and it came about due to the need to disambiguate pages, and also because some authors were writing under multiple names, including pseudonyms, variations of spellings, variations of parts of names, and variations of titles. So it was agreed (in the lackadaisical WS way) that names were to be fully expanded, and we would liberally use redirects for the Author: namespace. There was no expectation that the naming convention would apply in the main ns, where either the full name or the work's use is acceptable. — billinghurst sDrewth 03:33, 24 May 2010 (UTC)
Thanks and noted. Moondyne (talk) 03:50, 24 May 2010 (UTC)

Script

I forgot to ask you how to use the script I copied to my monobook the other day. Since I'm not very good with any type of code, I can't seem to figure it out :-) Maximillion Pegasus (talk) 19:04, 25 May 2010 (UTC)

You should be able to just paste it in and reload your JS (restarting your browser should work if you are not sure). Then, when editing a page, you will see the script links appear as grey text below the "toolbox" on the left. That's about the sum of my knowledge. Billinghurst and Pathoschild know more than me! Inductiveloadtalk/contribs 19:31, 25 May 2010 (UTC)
Ensure that you have your CustomRegex gadget turned on too. — billinghurst sDrewth 00:03, 26 May 2010 (UTC)

Florence Earle Coates Mine and Thine (1904)

Thank you very much for the touch-ups to the completed book... However, the images you used are from the 1905 reprint edition (note the "MDCCCCV" on the Title Page). I used a first edition copy, and would prefer the use of the first edition. I am able to scan my copy, and was in the process of doing so, if you'd like. I am able to save the images in djvu format... Please advise... Thanks again, Londonjackbooks (talk) 02:04, 9 June 2010 (UTC)

Also, there are some textual differences (although very few--yet noteworthy) between the 1904 and 1905 editions which I made reference to when transcribing (see poem "Autumn," for example). Londonjackbooks (talk) 02:16, 9 June 2010 (UTC)
Damn, I thought that was the 1904 one (I saw "1904" on the page after the title page, and didn't read the roman numerals). Go ahead and upload your copy. I can't see another edition on the Internet Archive, so it looks like your scan will be the first one of the 1904 version. All you need to do is upload the DJVU to Commons (upload to "File:Mine and Thine, Coates, 1904": I will get the current file moved to "...1905.djvu"). I will leave the index page and all the pages written so far in place for you to use (the file will slot in nicely against the pages if you use that name). You may need to copy and paste a little to line the right pages up, but mostly it should be easy. If in doubt, see what I did before for the 1905 version, or ask here. Good work, by the way, you don't do anything by halves from what I've seen! Inductiveloadtalk/contribs 02:20, 9 June 2010 (UTC)
Thank you! Not a problem! I am slow-going on the scanning, but faithful and thorough...trying to do justice to my favorite poet... I save my scans in .jpg format then convert them to djvu online. Do you know an easier way without my having to acquire software? Also, for future reference, most of Mrs. Coates' collections (page images) available online are reprint editions. For example, on Google Books, the following are reprints:

And the following are first edition:

Poems (1916) Vol. I does not seem to be available online at all, but I have first edition copies of all her collections, and will eventually upload them...

Thanks again for all, and I hope you can perform your "magic" to the other books once I've completed them(?)--or even teach this layman how! Londonjackbooks (talk) 02:33, 9 June 2010 (UTC)

Well, I just uploaded some scans I found at the IA (IA has some Google scans and some higher quality ones from other sources; the Google ones are pretty low quality and sloppily done in many cases, such as being able to see the hand of the scanner, etc). You can check to see if they are correct or not. If not, tell me and I can move the files appropriately (it is easier before there are content pages). As for conversion, the easiest way (less uploading/downloading of large files) is to download djvuLibre, Python and my conversion script, and then you can convert batches of files yourself. The other easy option is to scan to a PDF (most scanning software permits this) and upload to the IA, where it will be converted to DJVU for you in about 1 hour, more for big files. The other way is to zip up the jpgs and put them online (rapidshare, dropbox, etc) and I can do it for you, and upload the DJVU. This takes as long as I take to get round to it.
I recommend the first option, and you can come on IRC and I'll help you set it up if you have trouble. Inductiveloadtalk/contribs 02:46, 9 June 2010 (UTC)
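To give an idea of what the first option involves, here is a minimal sketch (not the conversion script mentioned above, whose contents are not shown in this thread). It assumes the DjVuLibre command-line tools `c44` (photo encoder) and `djvm` (bundler) are installed and on the PATH; the function name and file names are made up for illustration. The function only builds the command lines, so they can be inspected before anything is run.

```python
from pathlib import Path


def djvu_commands(jpg_paths, output="book.djvu"):
    """Return the command lines that turn page images into one DjVu file.

    Step 1: encode each JPG as a single-page DjVu with DjVuLibre's c44.
    Step 2: bundle the pages, in order, into one document with djvm -c.
    """
    page_files = [str(Path(p).with_suffix(".djvu")) for p in jpg_paths]
    commands = [["c44", str(jpg), page] for jpg, page in zip(jpg_paths, page_files)]
    commands.append(["djvm", "-c", output, *page_files])
    return commands


if __name__ == "__main__":
    # Hypothetical scan names; print the commands instead of running them.
    for cmd in djvu_commands(["page_001.jpg", "page_002.jpg"], "Mine and Thine.djvu"):
        print(" ".join(cmd))
```

Each command list could then be passed to `subprocess.run(cmd, check=True)` to do the actual conversion. `c44` suits greyscale or colour scans; purely black-and-white pages are usually smaller when encoded with DjVuLibre's `cjb2` instead.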
I'm pretty much bungling things up now...being unfamiliar with the process. I have a request: Can all of the 1905 images found at Mine and Thine (1904) be returned to the original (1904) for now (i.e., "undo" 1905 additions)? I also moved the "Index:Mine and Thine, Coates, 1904.djvu" to "Index:Florence Earle Coates Mine and Thine (1904).djvu" so it is similar to the format name of her previous collection that I (albeit incorrectly--it needs to be fixed) started a few days ago at: "Index:Florence Earle Coates Poems (1898)". Also, I am unsure what/where "IA" is for viewing your scans. Lastly, the online image converter (jpg to djvu) that I use is this one. My apologies...You are dealing with an amateur here! Londonjackbooks (talk) 04:07, 9 June 2010 (UTC)
  • First off, you can't separate an index page and the underlying DJVU file. They must have the same name. Don't worry too much about what the index is called - the end user doesn't see that. So long as it isn't called 2132342.djvu, it is OK. Make sure you upload the new file with the same name as the index page, or they won't get linked.
  • Second, I have moved all the subpages of the 1905 book from the incorrectly named 1904 pages to correspond to the file at commons now (File:Mine and Thine, Coates, 1905.djvu). I have transcluded these into the main page for now until you get the 1904 version online, then you can change it to pages from the 1904 version when you have content there. It won't hurt to use the 1905 pages for a short while until the 1904 ones are ready.
  • The IA is the Internet Archive, a very large collection of book scans (from Google, Microsoft and elsewhere). Have a look at archive.org. They are a good first port of call when looking for a scan of a book.
  • If your converter works, don't sweat it, use that one. If it ain't broke don't fix it.
  • Also check out Index:Poems, Coates, 1898.djvu - I uploaded this DJVU earlier. Inductiveloadtalk/contribs 04:27, 9 June 2010 (UTC)
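The first point above (the index page and the DJVU file must share a name) can be written down as a tiny check. This is only an illustrative sketch; the helper names are invented here and are not part of any Wikisource tool.

```python
def index_title_for(file_title: str) -> str:
    """Return the Index: page title that pairs with a given file title."""
    prefix = "File:"
    name = file_title[len(prefix):] if file_title.startswith(prefix) else file_title
    return "Index:" + name


def is_linked(index_title: str, file_title: str) -> bool:
    """True when the index page name matches the file name exactly."""
    return index_title == index_title_for(file_title)


if __name__ == "__main__":
    file_title = "File:Mine and Thine, Coates, 1905.djvu"
    print(index_title_for(file_title))
    # A renamed index no longer matches the file it was created for.
    print(is_linked("Index:Florence Earle Coates Mine and Thine (1904).djvu", file_title))
```

So renaming only the index page, or uploading a file under a different name, breaks the pairing until the two names agree again.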
Question for you at Talk:Mine and Thine (1904) Londonjackbooks (talk) 13:15, 9 June 2010 (UTC)