Wikisource:Scriptorium/Help

From Wikisource
The Scriptorium is Wikisource's community discussion page. This subpage is especially designated for requests for help from more experienced Wikisourcers. Feel free to ask questions or leave comments. You may join any current discussion or start a new one. Project members can often be found in the #wikisource IRC channel (webclient).

This page is automatically archived by Wikisource-bot

Have you seen our help pages and FAQs?


Original translation of foreign-language public domain work probably never translated into English

Hi! A fellow Wikipedian emailed me and asked if I could help translate some material once I have appealed my ban; under the terms of a TBAN, I am not allowed to touch it on English Wikipedia at the moment. I suggested that I could also either do it for him off-wiki or perhaps put it on Wikisource. I looked around and found WS:T#Wikisource original translations, which appears to allow for this. But I was wondering about a few things:

  1. How much research am I expected to do to verify that a work has "never before been translated into English"?
  2. Are original translations allowed when we know there is a pre-existing one but it is difficult to access or can't be used the way we want to for copyright reasons? (The user I was in contact with is concerned that a Wikipedia article they wrote is pushing the limits of fair use by quoting published translations of nine Japanese poems.)
  3. Does "works that are incomplete and abandoned for long periods" necessarily include translations of several poems in a sequence or anthology that could be taken as complete works in themselves, or would that be dealt with on a case-by-case basis?

Hijiri88 (talk) 04:40, 16 December 2016 (UTC)

There's no requirement that a work has never before been translated into English; we do allow for "new, complementary translations that may improve on existing versions in some way." Generally speaking, poems can be counted as stand-alone works.--Prosfilaes (talk) 06:57, 16 December 2016 (UTC)
@Prosfilaes: Thank you. Any idea what kind of license we are supposed to put on Wikisource original translations or how we are supposed to mark them as such? I've been trying to find an example, but... Hijiri88 (talk) 01:56, 23 December 2016 (UTC)
If they are Wikisource-generated translations (ie. volunteer), then they belong in the Translation: ns where we put the original licence, then either {{cc-by-3.0}} or {{cc-by-4.0}}. I would have hoped that it was covered at Wikisource:Translation. — billinghurst sDrewth 14:19, 23 December 2016 (UTC)
One question I have though: Are the original works still under copyright in the original language? If they are still under copyright, then we won't host translations of them, because that would constitute infringement of the copyright. Also, to host a user-created English translation, the original language published text must be hosted on the Wikisource for that language, with a supporting scan. So, if you plan to do original translations here of Japanese poetry, then the original Japanese poems (a) must have been published, (b) must have a scan of the original at Commons, (c) must be transcribed and housed at the Japanese Wikisource. Unless these criteria are met, we would not host an English translation here that was created by an editor. --EncycloPetey (talk) 14:31, 23 December 2016 (UTC)
@EncycloPetey: Well, I was hoping to worry about that when I actually start working, but they were published more than a thousand years ago and have been in circulation pretty much ever since, and I was planning on using the text on Japanese Wikisource as my translation source. All I've got done so far is most of the table of contents at User:Hijiri88/Shūi Wakashū, which is translated from ja:拾遺和歌集. (Actually this isn't the work I was originally asked to translate, the Zashiki Hakkei songs, which are more obscure and probably not on ja.wikisource yet, and are somewhat newer -- only around 200 years old -- so I'd need to check their copyright status with someone more learned before posting them.) Hijiri88 (talk) 11:36, 24 December 2016 (UTC)

New user - request for early feedback

I am still finding my way here. I have been working on Index:The Golden Hamster Manual.djvu and I am writing to request some initial feedback.

  1. Page:The Golden Hamster Manual.djvu/9 - here is a typical page. Is everything here formatted correctly, and is it right for me to tag this one for proofreading? Is the page number at the bottom correct, and do I need to repeat this page numbering in the footer for every page or is there an automatic numbering system?
  2. Page:The_Golden_Hamster_Manual.djvu/10 - Can I mark a page as needing proofreading if I am using raw images instead of processed images? When there is text in an image, should I transcribe that somewhere, or should it only remain in the image?
  3. Page:The_Golden_Hamster_Manual.djvu/6 - There are lists in various places in this book, such as this table of contents. Can someone provide an example on 1 line for how I can address this, so that I can copy the process elsewhere?

Thanks - I am feeling comfortable with this and am hoping to get through this book. Blue Rasberry (talk) 15:59, 21 December 2016 (UTC)

  1. This looks great. If you put {{center|—{{{pagenum}}}—}} in the "Footer" field of the index page, it will automatically put the formatted page number in the footer for you on each newly created page. This page can be marked proofread.
  2. If you have raw images instead of processed images, mark the page as problematic. You don't have to transcribe the text in the image, but you can if you want; useful templates for that are {{figure}} and {{overfloat image}}.
  3. The simplest way to do TOC entries like this one is with {{Dotted TOC line}}: {{Dotted TOC line||Adult Male|Cover}}
You're doing great! —Beleg Tâl (talk) 16:39, 21 December 2016 (UTC)
Thanks for the encouragement, guidance, and for doing an example line.
Can you say more about how the pagenum template works? I inserted it at Index:The_Golden_Hamster_Manual.djvu but it is returning a red link. It seems that template:pagenum does not exist, so I could not find the documentation. Blue Rasberry (talk) 00:19, 23 December 2016 (UTC)
It's not a template, but rather a magic word (I think). Note that there are three curly braces instead of two; {{{pagenum}}} instead of {{pagenum}}. I don't know if it can reasonably be used beyond what I have described, using it on the Index page in the header or footer field to automatically populate the value in the Page namespace. —Beleg Tâl (talk) 01:58, 23 December 2016 (UTC)
I edited the page for you so you can see the minor edit that I made. You had the right idea. The values of parameters which can be fed to a template are represented in the template code by items enclosed between triple braces. Just a tip, don't get hung up over formatting, as long as it's consistent throughout the text it should be fine. Rochefoucauld (talk) 02:02, 23 December 2016 (UTC)
@Bluerasberry: As Rochefoucauld correctly says, it is a magic-word that is built-in to ProofreadPage, and as also stated it doesn't comply with standards <shrug> but it is that or nothing at this point of time. From memory it is documented over at mul:Wikisource:ProofreadPage. AND I never remember what it is, so I have it recorded on my user page and I go and look. <more shrug> — billinghurst sDrewth 14:26, 23 December 2016 (UTC)

Roman History confusion

So, many years ago I put up a copy of Ammianus' Roman History, as an index. I worked on it, and I am coming back to it now due to recent edits being done on it. But I am in need of some help: we have a Tertullian project work of it on here, and the text is the same, but I would like it if we could scrap the Tertullian page and its subpages, and let me build up the work from the Index. This way we can hopefully declutter the pages on WS of this book, and have a unified namespace for the index. - Tannertsf (talk) 19:51, 21 December 2016 (UTC)

It would be easier to assess and advise if you provide wikilinks to the relevant locations. --EncycloPetey (talk) 20:41, 21 December 2016 (UTC)
Index:Roman History of Ammianus Marcellinus.djvu and Roman History - Tannertsf (talk) 01:25, 22 December 2016 (UTC)
While these appear to be different printings in different years, and from different publishers, there is nothing I can see to indicate that they are different editions. In fact, it looks as though someone did a "match-and-split" to fill the pages in the index file, rather than relying on the OCR. I'd agree that replacing the unsourced copy with the sourced one would be a good idea, but the work will need a lot of formatting before it can be transcluded. We have a new editor, who is eagerly helping, but who does not seem to know about headers, nor how to format footnotes for transclusion. Those issues will need to be attended to before the current copy can be transcluded over. --EncycloPetey (talk) 02:15, 22 December 2016 (UTC)
Yeah the recent edits are not too great. But, I understand when someone is new here. But why can't we just erase the old work and then when the index is proofread transclude it? - Tannertsf (talk) 13:21, 22 December 2016 (UTC)
What we normally do is proofread the scans, and then put them up in place of the older ones. Taking the old one down first means depriving users of access to a work, and there is no telling how long it would be before the new copy was ready to be used. --EncycloPetey (talk) 14:34, 23 December 2016 (UTC)
Okay sounds like a good plan. Thanks for the help! - Tannertsf (talk) 16:30, 23 December 2016 (UTC)

Patching Works with Lacuna

I was going to take a look at some of the source file needs fixing works, having fixed up a work recently.

Would it be acceptable for Wikisource purposes to insert "placeholder" pages until someone can find the relevant missing pages, while upgrading the works concerned to "To be Proofread" status?

ShakespeareFan00 (talk) 21:31, 21 December 2016 (UTC)

Creating a template

I am requesting some help from template creators. I am working in Bengali Wikisource, on a big dictionary. In this work, Western diacritics like breve, tilde, caron etc. have been used above Bengali characters to show pronunciation. I am stumped about how to achieve this effect. Using Unicode for "combining breve above" and the like results in white boxes or invisible display. So a template is required. Something like {{dual line}} or {{sfrac nobar}}, with the lower line in alignment with the rest of the text, with normal font size & height, and with the upper line floating above it. Say, {{Float above}}. Can anyone help? Thanks, Hrishikes (talk) 02:50, 29 December 2016 (UTC)

Are you sure such a thing doesn't already exist? Something like an IPA for Bengali used by linguists? Seems like you're trying to reinvent the wheel. (Also, whatever you do, please remember your poor mobile users!) --Mukkakukaku (talk) 03:46, 29 December 2016 (UTC)
@Mukkakukaku: In the page you cited, diacritics like tilde are shown in-between Bengali characters, not above. I am trying to achieve the above effect, not between. Hrishikes (talk) 04:33, 29 December 2016 (UTC)
To me this is one of the cases where I (the curmudgeon) would have been looking to be doing what is practicable and usable, not looking to have a facsimile copy of a work. The templates that I have seen are artefacts of putting decorative aspects over or under. — billinghurst sDrewth 08:07, 29 December 2016 (UTC)
So you are saying that where the book shows ã, I should transcribe a~, in case of Bengali characters? Hrishikes (talk) 10:31, 29 December 2016 (UTC)
Short of using a custom font, I can't think of a solution that wouldn't be terrible from a usability perspective for your mobile users, incomprehensible to your screen reader/visual assist users, and a nightmare to maintain. It seems like the modern convention is to put the diacritics between characters rather than above. I think this is analogous, in English, to the ligature characters (which we no longer use in transcription, eg the attached ct.). --Mukkakukaku (talk) 13:08, 29 December 2016 (UTC)
I have solved the issue, to some extent, with some advice from another friend. There is a special characters (diacritical marks) box accessible from the advanced toolbar. Clicking the mark from there adds it, even to a Bengali character. But the result is appreciable in Firefox only, not in Chrome. Anyway, why are you thinking that a custom font will be required? If a template can be created as I said, that is likely to solve the issue for all browsers, including mobiles, I think. I don't have knowledge about the technicality involved, else I would have tried. Hrishikes (talk) 13:56, 29 December 2016 (UTC)
Historically, our templates are not a very good experience on mobile, as the average editor tends not to be a web developer. A custom font, or images, is the best way to implement a consistent experience across multiple browsers and multiple types of users (screen readers, mobile, tablet, desktop) with the least amount of necessary technical knowledge. Either way, I would advocate the modern solution of a horizontal display of character and diacritic, which appears to be how these sorts of letters are displayed today. However, if the special characters diacritic box javascript fails gracefully to a horizontal appearance, I'd say go for that instead. --Mukkakukaku (talk) 16:51, 29 December 2016 (UTC)
This is as close as I can come to a solution on short notice: ˘. I can't speak to how well it'll work on mobile, and it doesn't degrade gracefully for screen readers (the diacritic is written left of the letter instead of to the right), and you need to wrap the paragraph in <div></div>, but it's a start, I suppose. And it looks the same in Chrome. --Mukkakukaku (talk) 17:15, 29 December 2016 (UTC)
Thanks a lot, but there is a space on both sides: সি ~ ক্‌ল্ Hrishikes (talk) 17:48, 29 December 2016 (UTC)
The left and right margin/padding will need to be adjusted. Bengali letters are placed snugly side by side, unlike Latin characters.
That being said, it's going to be a spectacularly complicated template because it's going to have to be very sensitive to the diacritic being used. The vertical displacement I used was perfectly acceptable for the breve I was using as a test case, but the tilde you used in yours is practically lying on top of its parent letter. The template would need a switch and logic such that the diacritic mapped to an appropriate amount of vertical space. Furthermore, I think that the appropriate amount of horizontal spacing also varies with the diacritic and base letter used, which is more complication.
I really do think you're better off putting the diacritic horizontally alongside the Bengali letter as per the modern convention. --Mukkakukaku (talk) 18:34, 29 December 2016 (UTC)
The Unicode Combining diacritics are the technically correct solution, and should in theory display right. If they don't, I don't think any HTML magic is going to do better.--Prosfilaes (talk) 19:32, 30 December 2016 (UTC)
@Mukkakukaku, @Prosfilaes: Problem solved beautifully with {{Letter position}}: সি~ক্‌ল্ Hrishikes (talk) 03:59, 31 December 2016 (UTC)
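For anyone following up on the Unicode approach mentioned above, here is a minimal Python sketch of what "combining diacritics" means at the code point level. It only illustrates the encoding; whether a browser or font actually renders the mark above a Bengali letter is a separate question, as this thread shows. The helper name add_mark and the sample letters are chosen purely for illustration and do not come from the dictionary being transcribed.

```python
import unicodedata

COMBINING_TILDE = "\u0303"  # COMBINING TILDE
COMBINING_BREVE = "\u0306"  # COMBINING BREVE


def add_mark(base, mark):
    """Append a combining mark so that it attaches to the last character of base."""
    if unicodedata.combining(mark) == 0:
        raise ValueError("mark is not a combining character")
    return base + mark


# Latin example: "a" + combining tilde has a precomposed form (U+00E3).
latin = add_mark("a", COMBINING_TILDE)

# Bengali example: BENGALI LETTER SA (U+09B8) + combining breve.
# No precomposed form exists, so the text stays as two code points
# and the font has to position the mark itself.
bengali = add_mark("\u09B8", COMBINING_BREVE)

for sample in (latin, bengali):
    composed = unicodedata.normalize("NFC", sample)
    names = [unicodedata.name(ch) for ch in composed]
    print(len(composed), names)
```

The point of the comparison is that the Latin pair collapses to a single precomposed character under NFC, while the Bengali pair cannot, which is one plausible reason rendering quality depends so heavily on the browser and font.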

Older Projects Needing Validation

I'm looking for guidance on better practices on these relatively abandoned projects. If the html "big" "/big" pair are used, would it be better to replace, even on validated pages, with the corresponding "larger" templates?

Also, ran into the double "the the" situation. This has probably been asked in past years but would like today's answer. I used SIC template, marking the second as "duplicate". Is there perhaps a better way to handle this? Humbug26 (talk) 20:49, 29 December 2016 (UTC)

For double "the", I would use this: the the, rather than adding the text "duplicate". When using {{SIC}}, simply mark the incorrect text with the corrected text and nothing else, whenever possible. --EncycloPetey (talk) 21:30, 29 December 2016 (UTC)
Thanks, page adjusted. Humbug26 (talk) 18:28, 30 December 2016 (UTC)

Adding a bilingual Swatow-English dictionary (a Minnan dialect)

Hi, I'm new to Wikisource and I have a few questions about a bilingual dictionary I found on the web. It's this document Archive.org/A pronouncing and defining dictionary of the Swatow dialect, arranged according to syllables and tones.

  • Is it legal to add this document to Wikisource? The author died in 1916.
  • In which language edition of Wikisource should I add it? Minnan? English? The author is already mentioned in the English edition of Wikisource.

I apologize if this question is not asked at the correct place. Assassas77 (talk) 11:07, 3 January 2017 (UTC)

In terms of the file itself, the work was published in 1883 and the author died in 1916. This would make it {{PD-old}} (published pre-1923, author dead 100 or more years), which means it is hostable on WS.
Looking through the book, it looks to me that this is primarily a Swatow-to-English dictionary. That is, each entry shows a word in Swatow, followed by a pronunciation guide, an English definition or definitions, and then some example usages with accompanying English translations. Is that an accurate assessment? --Mukkakukaku (talk) 14:29, 3 January 2017 (UTC)
Yes, it is correct. So, can I start reading the help pages to understand exactly how Wikisource works, and then add its DjVu?
  • PS: are dictionaries accepted in Wikisource?
  • I just checked in the document and it says: "Digitized for Microsoft Corporation by the Internet Archive in 2008. May be used for non-commercial, personal, research, or educational purposes, or any fair use. May not be indexed in a commercial service." Does it mean it can't be used on Wikisource? Assassas77 (talk) 14:39, 3 January 2017 (UTC)
Yes, this can be hosted here. To address your concerns:
Beleg Tâl (talk) 16:10, 3 January 2017 (UTC)
You're welcome to start it here, or on zh-min-nan.wikisource if you would prefer and they accept it. Either way, a cross-note should be added (though that's up to zh-min-nan if they want one).--Prosfilaes (talk) 19:55, 3 January 2017 (UTC)
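As a worked version of the date check described above for this dictionary, here is a small Python sketch. It simply encodes the rule of thumb quoted in the thread ("published pre-1923, author dead 100 or more years") at calendar-year granularity. The function name and the year thresholds as parameters are illustrative assumptions; this is not the actual logic behind {{PD-old}} and not legal advice.

```python
def meets_pd_old_rule_of_thumb(publication_year, death_year, current_year,
                               publication_cutoff=1923, years_dead=100):
    """Rough check of the rule quoted in the thread, using whole calendar years.

    The cutoff values are taken from the comment above and are assumptions
    about how the tag was being applied at the time of the discussion.
    """
    published_early_enough = publication_year < publication_cutoff
    dead_long_enough = (current_year - death_year) >= years_dead
    return published_early_enough and dead_long_enough


# The Swatow dictionary: published 1883, author died 1916, discussed in 2017.
print(meets_pd_old_rule_of_thumb(1883, 1916, 2017))
```

For the dates given here, 2017 - 1916 = 101 years and 1883 is before 1923, so both conditions in the quoted rule are met.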

Image/text placement

Can the placement of image and text be improved upon at this page, or is what I have sufficient? Thanks, Londonjackbooks (talk) 13:32, 4 January 2017 (UTC)

I shifted the text up some by adding a negative bottom margin to the image. --Mukkakukaku (talk) 14:56, 4 January 2017 (UTC)
That is better, thank you! Londonjackbooks (talk) 17:42, 4 January 2017 (UTC)

Page:Instruments of the Modern Symphony Orchestra.djvu/22

Recently converted this to sentence case, using an external tool. Is there a script for doing it internally? ShakespeareFan00 (talk) 20:21, 6 January 2017 (UTC)

Not a script that I know of, but you could use {{lc}}, then preview, copy the displayed text, then paste it back into the edit window. Not the neatest of solutions, but it would work in a pinch. --EncycloPetey (talk) 20:55, 6 January 2017 (UTC)
I think it would be 5-6 lines of Javascript, but it would make things a lot easier:)

ShakespeareFan00 (talk) 21:05, 6 January 2017 (UTC)
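To give a sense of how small such a conversion is, here is a rough sketch of the logic in Python (not the Javascript gadget suggested above, and not an existing Wikisource script). The function name to_sentence_case is hypothetical. It lowercases everything, so proper nouns would still need fixing by hand, and it only recognises ASCII letters after ".", "!" or "?".

```python
import re


def to_sentence_case(text):
    """Lowercase the text, then capitalise the first letter of each sentence."""
    lowered = text.lower()
    # Group 1: start of the text (plus any leading whitespace), or sentence-ending
    # punctuation followed by whitespace. Group 2: the letter to capitalise.
    pattern = re.compile(r"(^\s*|[.!?]\s+)([a-z])")
    return pattern.sub(lambda m: m.group(1) + m.group(2).upper(), lowered)


print(to_sentence_case("THE VIOLIN. IT HAS FOUR STRINGS."))
```

The same regular expression could be ported to Javascript almost unchanged if someone wanted to turn it into an in-browser tool.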

Index:Instruments of the Modern Symphony Orchestra.djvu

Got the images done on this, but have no idea how to insert the musical score sections as adapting the help examples resulted in stuff that would not render.

Attempting to do the score portions as extracted images was also a non-starter because of limitations in my image editor/DjVu viewer. ShakespeareFan00 (talk) 17:20, 7 January 2017 (UTC)

Does formatting have to be fully faithful to the original?

I ask because I've already done partial digitization of a book elsewhere. I want to continue it here but my work reformats the original slightly to be more legible (IMO) for the 21st century: link vs. link. Suzukaze-c (talk) 21:10, 8 January 2017 (UTC)

Yes and No:
  • When we transcribe in the Page namespace, as you are currently doing, we want to match the formatting of the original as closely as is possible and reasonable. The faithful transcriptions are then transcluded into the Main namespace to provide a text that matches the original.
  • However, we do allow for "annotated" versions (policy details are still under review), and I think it would be reasonable to create a second "annotated" version using the friendlier formatting. This would not use the transcription in the Page namespace directly, but would be copy-pasted to a new location in the Main namespace as a second copy of the work. The second copy would have notes in the header of each page explaining exactly how (and why) the formatting differs from the original.
  • Also: Please note that we strip the Google notice page from the front of works which we host. The notice is neither part of the original, nor presents any supportable claim of copyright. It is better to have this change made before editing individual pages, since it will affect page numbering.
--EncycloPetey (talk) 21:54, 8 January 2017 (UTC)


Re: EncycloPetey's "we match the formatting of the original as closely as is possible", I would add that we try to capture the intent of the author while ignoring the contribution of the typographer. We don't care what font was chosen, we don't care about line breaks, etc. Hesperian 23:52, 8 January 2017 (UTC)
Legibility and faithfulness to the text are our drivers, so we allow the user's browser, on the user's viewing platform (wide monitor or small telephone), to do the work! Within that, we may use a template like {{blackletter}} (blackletter) to represent some part of the work, but we wouldn't force a change from how a user has set their default browser typeface or size, with all our text sizing being relative and proportional. — billinghurst sDrewth 02:50, 9 January 2017 (UTC)
Thank you for the answers. I suppose I'll try to match the original for now. (also, what software should I use to remove the Google notice...) Suzukaze-c (talk) 01:42, 10 January 2017 (UTC)
I've never learned how to manipulate a DjVu file myself, and unless you know how to edit DjVu files properly (it can be tricky), then it's best to post a request for someone's assistance. We have a few skilled hands here who can help with that. Learning how to do this might make for a good talk at a Wikisource gathering. --EncycloPetey (talk) 02:15, 10 January 2017 (UTC)
I notice there's a section for requests of this sort at Help:Internet_Archive#Google_Books that seems to be rather neglected. Suzukaze-c (talk) 02:50, 10 January 2017 (UTC)
Vast bulk of people here are neither net nor image geeks, they are interested in the magic of the word, and its transcription. Predominantly none of us has worried about the image scans having the Google component, and as it doesn't affect our transcription and presentation, we politely ignore it. Noting that I do have a little search and replace script that deletes that component on the occasions that it gets reproduced by the OCR which I use as part of my general text clean-up when proofreading. — billinghurst sDrewth 03:03, 10 January 2017 (UTC)
(forest and trees <sigh>) Oh, you mean the front notice page. I blank its text, and mark it as a blank page. I have been gently asking for some time for someone to build a labs tool that could excise the front page, and replace it with a blank page. No-one has considered it of sufficient importance to progress. — billinghurst sDrewth 04:47, 10 January 2017 (UTC)
If you're willing to try it, the easiest way to do it is with DjView, see Help:DjVu files#Removing a copyright page, though that description could do with being fleshed out a bit. The best place to ask for someone to do it for you is probably the Repairs (and moves) section of the Scriptorium. Also: while we don't really care if the page is present here on Wikisource, I have seen the folks on Wikimedia Commons flag files for deletion if they contain this page, so that is also something to keep in mind. —Beleg Tâl (talk) 03:13, 10 January 2017 (UTC)
@Beleg Tâl: I think that I have complained sufficiently that they have stopped deleting those works. I went through the argument that an author's work that is in the public domain should not be prevented due to Google slapping a generic page on the front. The deletions and nominations seem to have stopped. — billinghurst sDrewth 04:50, 10 January 2017 (UTC)
Apropos removing the Google frontsheet. The most practical way is to ignore it and treat it as without text. But some editors may still want its removal. This should not hinder proofreading, because it can be later replaced with a blank. Some of us can do it, e.g. User:Mpaa, User:Jpez and myself. I suggest putting up a list of such books at some place, so that when someone capable wants to do it, he/she can gradually progress through the list. Hrishikes (talk) 06:56, 10 January 2017 (UTC)
Personally, I prefer having the page removed rather than replaced. Its presence affects the numbering such that odd page numbers in the work appear on even page numbers in the file (and vice versa), and this becomes confusing in a work of any length. --EncycloPetey (talk) 13:37, 10 January 2017 (UTC)
The quickest method (IMO) is to use the command line DjVuLibre tools, as so: [1]. Download the tools, put djvm.exe in the same folder as the DjVu file, and run djvm.exe -d foo.djvu 1 in a command prompt, where foo.djvu is the name of your file. (Edit: ah, this exact method is laid out at Help:DjVu files#Removing a copyright page.) clpo13(talk) 23:44, 11 January 2017 (UTC)
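For editors who would rather script the step described above than type it each time, here is a hedged Python wrapper around the same djvm -d call. It assumes DjVuLibre is installed and djvm is on the PATH; the function names are made up for this sketch. As far as I know, djvm -d edits the file in place, so it is safer to run it on a copy.

```python
import shutil
import subprocess


def build_delete_page_command(djvu_path, page=1):
    """Return the argument list for 'djvm -d <file> <page>' (pages count from 1)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return ["djvm", "-d", djvu_path, str(page)]


def delete_page(djvu_path, page=1):
    """Delete one page from a DjVu file by calling the DjVuLibre djvm tool."""
    if shutil.which("djvm") is None:
        raise FileNotFoundError("djvm not found on PATH; install DjVuLibre first")
    # check=True raises CalledProcessError if djvm reports a failure.
    subprocess.run(build_delete_page_command(djvu_path, page), check=True)


# Example (foo.djvu is the placeholder file name used in the comment above):
# delete_page("foo.djvu", 1)
print(build_delete_page_command("foo.djvu", 1))
```

Splitting the command construction from the call makes it easy to check what will be run before touching any file.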


What about Page:An alphabetic dictionary of the Chinese language in the Foochow dialect.djvu/31, where Chinese text couldn't be placed inline due to technical difficulties during printing and had to be moved to the footer? Suzukaze-c (talk) 08:43, 18 January 2017 (UTC)

Yes, I would preserve this formatting; it should be straightforward using footnotes. —Beleg Tâl (talk) 13:18, 18 January 2017 (UTC)

Adding a work without a scan

Hi, I want to add the poem "The Lark Ascending" by George Meredith. It's public domain, since Meredith died in 1909. Here's a link to the text. Do I need to find a paper copy to scan, or can I work off of the online version? Thanks, Icebob99 (talk) 03:01, 9 January 2017 (UTC)

It is permitted to add works without a scan. However, scans of "The Lark Ascending" are readily available, so I strongly urge you to use one of them. The link you point to is from A Victorian Anthology, 1837–1895 ed. Edmund Clarence Stedman, which has several scans: [2] [3]. You may find this collection more manageable :) —Beleg Tâl (talk) 03:51, 9 January 2017 (UTC)
... and I've started off Index:Poems and lyrics of the joy of earth.djvu so that we can get a scan-backed copy. —Beleg Tâl (talk) 04:01, 9 January 2017 (UTC)
... and here's where you can add the poem: Page:Poems and lyrics of the joy of earth.djvu/80 —Beleg Tâl (talk) 04:13, 9 January 2017 (UTC)

Cassell's Illustrated History of England vol 5

I and two others cannot fix the 1st page marked in Problematic-Purple on Index:Cassell's Illustrated History of England vol 5.djvu . It is very short, probably needs something like text float over the image. Will someone smarter than us please assist with this? —Maury (talk) 21:50, 12 January 2017 (UTC)

I want to personally THANK whoever fixed the above problem. It is very kind of you and you also aided Wikisource itself. I had thought nobody would help with it. Kindest regards, Maury —Maury (talk) 13:30, 13 January 2017 (UTC)

Help with MediaWiki:PageNumbers.js

@George Orwell III: @Hesperian: Hi! I've imported dynamic layouts to Ukrainian Wikisource and they mostly work, but I have the following problem. A dynamic layout is being applied to the Ukrainian version of Help:Beginner's guide to reliability. I understand that the layout is applied because this page includes Template:PageStatus, which creates the table used as the indicator that a dynamic layout can be applied, but on English Wikisource there is no dynamic layout in the Help: namespace. Could you please explain how the unexpected dynamic layout is disabled in Help here? Just a note: on Ukrainian Wikisource we have the same namespace with the same index, and the code was mostly copy-pasted (and localized for Ukrainian). Artem.komisarenko (talk) 14:01, 21 January 2017 (UTC)