From Wikisource

On Other Projects...

Present usernames on other Wikimedia projects

UserName      Wiki
SQL-DB        Spanish Wikipedia
SQL           Simple English Wikipedia
SQL           English Wikipedia (Admin)
SXT40         English Wikipedia (Unprivileged account for public computers)
SXT-404Bot    English Wikipedia (Bot)
SQL           Meta
SQL           Old Wikisource
SQL           Wikisource

OCR'ing the PNG Scanset

  1. Download the page that you wish to convert.
  2. Open it in your image editor and split the page in half, so that each image contains only one column (Tesseract can only handle one column at a time).
  3. Save each half as a BMP.
  4. Run Tesseract on each half (tesseract inputfile.bmp outputfile -l eng).
  5. Manually correct mistakes and remove extra linefeeds.
  6. Merge the results into a single article.
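
Steps 4 to 6 above could be partly scripted. The following is only a minimal Python sketch, not the method actually used here: the function names (tesseract_command, run_tesseract, remove_extra_linefeeds, merge_columns) and the file names in the usage note are hypothetical, and it assumes the tesseract binary is on the PATH and that the page has already been split and saved as one BMP per column. Manual proofreading is still needed afterwards.

```python
import re
import subprocess
from pathlib import Path


def tesseract_command(image_path, output_base, lang="eng"):
    """Build the step 4 command line: tesseract inputfile.bmp outputfile -l eng."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def run_tesseract(image_path, output_base, lang="eng"):
    """Run Tesseract on one column image and return the recognized text."""
    subprocess.run(tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")


def remove_extra_linefeeds(text):
    """Join the lines inside each paragraph; blank lines stay as paragraph breaks."""
    text = text.replace("\r\n", "\n")
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


def merge_columns(left_text, right_text):
    """Merge two cleaned columns into one article body.

    Assumption: the right column starts a new paragraph. If a paragraph
    runs across the column break, it has to be rejoined by hand.
    """
    return remove_extra_linefeeds(left_text) + "\n\n" + remove_extra_linefeeds(right_text)
```

With hypothetical file names, a page split into left.bmp and right.bmp could then be processed as merge_columns(run_tesseract("left.bmp", "left"), run_tesseract("right.bmp", "right")). The sketch does not rejoin words hyphenated across line ends, so those still need correcting in step 5.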