User talk:So9q


Welcome

Hello, So9q, and welcome to Wikisource! Thank you for joining the project. I hope you like the place and decide to stay. Here are a few good links for newcomers:

You may be interested in participating in

Add the code {{active projects}}, {{PotM}} or {{CotW}} to your page for current Wikisource projects.

On your user page you can put a brief description of your interests and of your contributions to other Wikimedia projects, such as Wikipedia and Commons.

Have questions? Then please ask them at either


I hope you enjoy contributing to Wikisource, the library that is free for everyone to use! In discussions, please "sign" your comments using four tildes (~~~~); this will automatically produce your IP address (or username if you're logged in) and the date. If you need help, ask me on my talk page, or ask your question here (click edit) and place {{helpme}} before your question.

Again, welcome! —Beleg Tâl (talk) 04:57, 20 November 2018 (UTC)

Complete English-Jewish Dictionary, 6th ed. - Harkavy - 1910

I'm interested in how well the newly updated OCR transcription features work. Check these two pages, title and p. 2, where I tried the Tesseract "Transcribe text" feature after first setting the advanced options.

When you go to p. 3 you'll see "Transcribe text" above and to the right of the edit area. Click the "ˇ". Then click "Advanced Options".

On that page make sure "Tesseract OCR" is selected. Then, within "Languages (optional)", make sure "en-English" is selected *and* scroll down and (holding the Control key down) also select "yi – ייִדיש". Since the dictionary has both English and Yiddish, we want the OCR to look for and understand both.

Then hit the blue button "Transcribe whole page". After a short time the text area on the right ought to fill up with both the English and Yiddish text. It amazes me every time anything like this works!

You'll have to copy-n-paste the transcribed text back into the original "Creating page" window.
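
For anyone who wants to try the same two-language OCR outside the web tool, here is a rough sketch using a locally installed Tesseract command line. It is only an illustration under assumptions: the mapping from the tool's "en" and "yi" codes to the traineddata names `eng` and `yid`, the helper names, and the file name `page3.png` are mine, not anything the Wikisource tool exposes. Check `tesseract --list-langs` to see which languages your install has.

```python
import shutil
import subprocess

# Assumed mapping from the codes shown in the OCR tool ("en", "yi")
# to Tesseract traineddata names; verify with `tesseract --list-langs`.
TESSERACT_LANGS = {"en": "eng", "yi": "yid"}


def build_ocr_command(image_path, ui_langs):
    """Build a tesseract command line that reads one page with several languages."""
    # Tesseract accepts multiple languages joined with "+", e.g. "eng+yid".
    lang_arg = "+".join(TESSERACT_LANGS[code] for code in ui_langs)
    # Using "stdout" as the output base prints the text instead of writing a file.
    return ["tesseract", image_path, "stdout", "-l", lang_arg]


def transcribe(image_path, ui_langs=("en", "yi")):
    """Run tesseract if it is installed; return the text, or None if it is missing."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_ocr_command(image_path, ui_langs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `transcribe("page3.png")` would then return the recognized text for a hypothetical page image, which still needs the same proofreading as the web tool's output.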

It isn't perfect, and here it is having trouble understanding which parts are left-to-right and which right-to-left, but oh so much better than the default scan! Have fun. Shenme (talk) 05:40, 17 October 2021 (UTC)

Oh, and I had to find pages like w:Yiddish orthography and w:Hebrew punctuation to figure out the Hebrew 'dash' used on the title page (second line) ('maqaf'?). The OCR didn't understand, but guessed that it was a hyphen. You'll probably know about that already. Shenme (talk) 05:45, 17 October 2021 (UTC)
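
If the hyphen guess turns up often, a small clean-up pass might help. The sketch below swaps an ASCII hyphen for the maqaf (U+05BE) only when it sits directly between two Hebrew-script characters, so English hyphens such as the one in "English-Jewish" are left alone. The function name and the character range are my own choices for illustration. Whether a given mark really should be a maqaf is still an editorial call against the scan, so this is only meant for pages where that has been confirmed.

```python
import re

MAQAF = "\u05be"  # HEBREW PUNCTUATION MAQAF

# Approximate set of Hebrew-block characters (points, letters, Yiddish digraphs),
# with the maqaf itself left out of the range.
_HEBREW = "\u0591-\u05bd\u05bf-\u05c7\u05d0-\u05f2"

# An ASCII hyphen with a Hebrew-script character on both sides.
_HYPHEN_BETWEEN_HEBREW = re.compile(f"(?<=[{_HEBREW}])-(?=[{_HEBREW}])")


def hyphen_to_maqaf(text):
    """Replace hyphens that join two Hebrew-script characters with a maqaf."""
    return _HYPHEN_BETWEEN_HEBREW.sub(MAQAF, text)
```

Hyphens next to Latin letters, digits, or spaces are not touched, so mixed English and Yiddish lines can be passed through as they are.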