User:Inductiveload
If you can suggest improvements to my own work, tell me. Don't let any poor quality work hang around!
Rarely do great beauty and easy proofreading dwell together.
Tools and scripts
User preferences and custom javascripts:
- common.js
- monobook.css
- Regexp toolbar.js
- Running header.js - completes a {{rh}} template by copying the header from the last-but-one page and incrementing the number.
- Custom toolbar buttons.js
- InlinePagenums.js - toggle display of inline pagenums
- ColourBackground.js - shade text boxes and background in edit mode to avoid eye strain.
- Visibility.js - custom visibility switching
- index preview.js - preview index page thumbnails with alt-click on the page-list
- Commons scripts
- commons:User:Inductiveload/basic upload templates.js: add buttons for preloading book templates on the Basic Upload Form
- Popups Reloaded
Popups, but way better.
mw.loader.load('//en.wikisource.org/w/index.php?title=User:Inductiveload/popups_reloaded.js&action=raw&ctype=text/javascript');
mw.loader.load('//en.wikisource.org/w/index.php?title=User:Inductiveload/popups_reloaded.css&action=raw&ctype=text/css', "text/css");
- Quick Access
Keyboard-driven tool access
mw.loader.load('//en.wikisource.org/w/index.php?title=User:Inductiveload/quick_access.js&action=raw&ctype=text/javascript');
- Preview markup
mw.loader.load('//en.wikisource.org/w/index.php?title=User:Inductiveload/show_markup.js&action=raw&ctype=text/javascript');
mw.loader.load('//en.wikisource.org/w/index.php?title=User:Inductiveload/show_markup.css&action=raw&ctype=text/css', "text/css");
- Maintenance Wizard and Replacer
Perform maintenance without going into edit mode.
// The maintenance script has no purpose in the Special namespace
if (mw.config.get("wgCanonicalNamespace") !== "Special") {
    mw.loader.using(['ext.gadget.utils-difference', 'mediawiki.util', 'mediawiki.api',
        'oojs-ui-core', 'oojs-ui-windows', 'oojs-ui-widgets']).done(function() {
        mw.loader.load("/w/index.php?title=User:Inductiveload/maintain.js&action=raw&ctype=text/javascript");
        mw.loader.load("/w/index.php?title=User:Inductiveload/maintain-ws-tools.js&action=raw&ctype=text/javascript");
    });
}
- Jump to file
Add a button to go to the book file at Commons from the Index or Page namespace, and to the transcluding page from the Page namespace
mw.loader.load('//en.wikisource.org/w/index.php?title=User:Inductiveload/Jump to file.js&action=raw&ctype=text/javascript');
Tweaks
- Show an indicator when a script loads
I use this to check my local script is loading
$(function() {
    $(".mw-indicators").append($("<img src=\"https://upload.wikimedia.org/wikipedia/commons/thumb/c/c4/OOjs_UI_icon_chem.svg/20px-OOjs_UI_icon_chem.svg.png\">"));
});
- Add nocache=1 to WS-export sidebar links
$(function() {
    $("#n-epubExport a,#n-pdfExport a,#n-toolExport a").each(function(i, a) {
        $(a).attr("href", $(a).attr("href") + "&nocache=1");
    });
});
Some extra functions that might be handy for other scripts:
- Roman numerals.js - a couple of Roman numeral helper functions
Maintenance and reports
Below are lists of pages in Wikisource which are useful for various purposes. All of these could be out of date. If you really need up-to-date reports, just leave me a note, and I will do it as soon as I can.
- templates: A list of all templates in use on enWS, along with links and usage counts.
- wikisource pages: A list of all Wikisource namespace pages.
- portals: A list of all Portal pages.
- ws-portal redirects: A list of all Wikisource pages which redirect to Portal pages. No pages should link to these.
- ws-wp no backlink: A list of Wikisource pages linking to Wikipedia pages which do not link back here.
- false root pages: A list of pages that should be subpages but aren't.
- site-css-js: in-progress CSS tidying-up - look here for CSS and/or JS moved out of MediaWiki namespace (rather than being deleted)
- SPARQL: Useful SPARQL queries.
- HTML processing: Useful HTML transforms for extracting data
Bot activities
I operate a bot, InductiveBot, which performs minor maintenance tasks. It is based on pywikipedia and is quite flexible. If you have a specific request, please let me know on my talk page, and I'll see what I can do!
- InductiveBot information: details of the custom scripts running over pywikipedia, etc.
Works I contributed to
- Memoirs of Sir Isaac Newton's life: uploaded illustrations and converted to DjVu. The text is handwritten, so OCR is worse than useless.
- Philosophical Transactions: collated from single-article PDFs and converted to DjVu. This is part of the WikiProject Royal Society Journals, and is a huge job (200+ volumes of ~500 pages each). The primary aim is to get them all uploaded, so articles can be added as they are linked from around Wikisource.
- Wonderful Balloon Ascents (1870), a history of early hot air ballooning and "aerostation"
- The Perfumed Garden (1886)
- Of the Gout
Scripts
These are some useful scripts I have hacked together. I guarantee nothing! They are certainly not always neatly coded or structured, but they work for quick and dirty jobs.
- Script development: general notes on JS script development
- Universal batch image to DjVu converter: this script takes JPG, GIF, PNG, TIFF and anything else that ImageMagick can convert to PPM.
- DjVu OCR script: uses Tesseract to OCR and insert a text layer into a DjVu file.
- Pagewise DjVu OCR extractor
- Index page tabulator: creates tables of individual files for use in collecting files into an index page. See, for example, Index:The Complete Collection of Pictures & Songs by Randolph Caldecott.jpg.
- Template usage tabulator: generates a table of all templates, along with the number of uses. Results can be found at User:Inductiveload/templates. Ask me if you want it regenerated, but bear in mind that it makes a lot of requests to the server.
- Page namespace editor: a simple script to decompose Page: namespace pages, perform operations on the header, footer, and body separately, and re-upload.
- Page shifter: a script to shift a set of Page: pages within the same index, or move them to a different index.
- PDF page converter: a shell script to convert a PDF to images (threaded, and it doesn't run out of memory and die after a few hundred pages like convert can).
- Image splitter: a Python script to split images in two. This is useful if you have books scanned as two-page spreads.
- Page concatenator: a Python/Pywikipedia script to grab a bunch of pages and string them together. Good for assembling a complete text out of many chapter subpages prior to match and split.
- Archive.org API: how to use the Internet Archive S3-like API to upload large files, instead of the flaky web client.
Header script
This is a tiny script that adds the pywikipedia directory to Python's module search path (sys.path) at runtime, so you can run scripts from outside the PW directory without messing around.
pw_script_header.py
#!/usr/bin/env python
import sys
PW_PATH = '/home/user/src/pywikipedia'  # this is the directory containing the pywikipedia files
sys.path.append(PW_PATH)
General Python scripts
- Integer to Roman numerals converter (e.g. 11 -> XI)
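As an illustration of the kind of conversion meant here, the following is a minimal sketch and not the linked script itself. The function name int_to_roman and the 1 to 3999 range limit are assumptions made for this example.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def int_to_roman(number):
    """Convert an integer in the range 1..3999 to upper-case Roman numerals."""
    if not 0 < number < 4000:
        raise ValueError("number must be between 1 and 3999")
    result = []
    for value, symbol in ROMAN_PAIRS:
        # Take as many copies of this symbol as fit into what is left.
        count, number = divmod(number, value)
        result.append(symbol * count)
    return "".join(result)


print(int_to_roman(11))  # XI
```

Page numbering in front matter usually wants lower case, which is just int_to_roman(n).lower().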
GIMP scripts
- Remove-paper-texture.scm: GIMP script to remove the paper background from a scan of a black and white image.
- Remove-background-colour.scm: GIMP script to remove a flat background colour from an image.
How to split a table across many Page: pages so they transclude neatly into one
- Page 1
{| table styling
| col1 || col2 || col3
<noinclude>|}</noinclude>    <--- this is the footer of the page
- Page 2
<noinclude>
{| table styling (same as page 1)</noinclude>    <--- this is the header
{{nopt}}
| col1 || col2 || col3
<noinclude>|}</noinclude>    <--- this is the footer of the page
- Page 3
<noinclude>
{| table styling (same as page 1)</noinclude>    <--- this is the header
{{nopt}}
| col1 || col2 || col3
|}
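The pattern above can be generated mechanically. The following is a rough Python sketch under stated assumptions: the function name split_table_pages and its arguments are hypothetical, and each row is passed in as ready-made wikitext. It only builds the text for each page and does not talk to the wiki.

```python
def split_table_pages(style, pages):
    """Build the wikitext for a table spread over several Page: pages.

    style: the table styling, repeated on every page.
    pages: one list of row wikitext strings per page, in page order.
    """
    last = len(pages) - 1
    result = []
    for index, rows in enumerate(pages):
        parts = []
        if index == 0:
            # The first page opens the table for real.
            parts.append("{| " + style)
        else:
            # Later pages open it only for the Page: view, then use {{nopt}}.
            parts.append("<noinclude>\n{| " + style + "</noinclude>")
            parts.append("{{nopt}}")
        parts.extend(rows)
        # Only the last page closes the table in the transcluded text.
        parts.append("|}" if index == last else "<noinclude>|}</noinclude>")
        result.append("\n".join(parts))
    return result


for text in split_table_pages("table styling", [["| a || b"], ["| c || d"], ["| e || f"]]):
    print(text)
    print("----")
```

On the wiki itself the noinclude parts would normally go into the header and footer fields of the Page: editor rather than the body.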
One touch template wrapping with AutoHotkey
If you use AutoHotkey (and you should), the following is a useful function that lets you wrap the current selection in a template, which saves you having to type the template and paste the contents in by hand.
F2 & s :: wrapTemplate("sc") ; small caps

wrapTemplate( name ) {
    front := "{{}{{}" . name . "|"
    back := "{}}{}}"
    wrapTags( front, back )
    return
}

wrapTags( front, back ) {
    AutoTrim Off ; Retain any leading and trailing whitespace on the clipboard.
    ClipSaved := ClipboardAll ; Save the entire clipboard so we can restore it when we're done
    clipboard = ; clear the clipboard
    SendInput ^x ; cut the selection to the clipboard
    ClipWait ; wait for the clipboard to contain something
    SendInput %front%%clipboard%%back% ; Output what was selected, surrounded by front and back
    Clipboard := ClipSaved ; Restore the original clipboard
    ClipSaved = ; Free the memory in case the clipboard was very large.
    return
}
Regular expressions
Function | Search pattern | Replacement pattern |
---|---|---|
Remove single newlines. Useful for OCR'd text. | /([^\n])\n([^\n])/g | '$1 $2' |
Convert relative links to static links. Useful when putting a TOC in the Page: namespace. | /\[\[\/(.*)\/\]\]/g | '\[\[$1\|$1\]\]' |
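For use outside the browser, the same two substitutions can be sketched with Python's re module. This is an approximation rather than a literal port: the function names are made up for this example, the newline pattern uses lookarounds so that runs of very short lines are all joined, and the link pattern is non-greedy so that two links on one line are handled separately.

```python
import re

# A newline with a non-newline character on both sides, i.e. a single
# line break inside a paragraph. Blank lines between paragraphs are kept.
SINGLE_NEWLINE = re.compile(r"(?<=[^\n])\n(?=[^\n])")

# A relative link of the form [[/Name/]].
RELATIVE_LINK = re.compile(r"\[\[/(.*?)/\]\]")


def unwrap_lines(text):
    """Join hard-wrapped lines with spaces, keeping paragraph breaks."""
    return SINGLE_NEWLINE.sub(" ", text)


def relative_to_static(text):
    """Rewrite [[/Name/]] as [[Name|Name]]."""
    return RELATIVE_LINK.sub(r"[[\1|\1]]", text)


print(unwrap_lines("one\ntwo\n\nthree"))
print(relative_to_static("[[/Chapter 1/]]"))
```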
Technical wishlist
Some things I'd like to see done (that aren't actual proofreading). Some are unimportant, some may be controversial and undiscussed, but all would be nice to address and tighten up.
- Get dynamic layouts to work for non scan-backed works and scrap {{prose}} and other hard-coded formatting.
- Fix poem tags: only by having all lines as p or span tags can we have hanging-indented continuation lines, as 95% of all printed poems do. Stanzas should be divs. This might need a whole new tag in the poem extension, but might not be that hard?
- Train Tesseract specifically for 1700s-style printing esp. with long-s
- Allow match-and-split to match to PDFs (since there are now ~1m PDFs on Commons)
- Get para breaks working in OCR loading: phab:T230415
- Add common fonts:
- Cursive: phab:T166138
- JUnicode: phab:T173573
- Sans Outline (and remove hacks like ℕ𝔼𝕎 𝕐𝕆ℝ𝕂)
- Serif Outline
- Maybe a better polytonic Greek?
- Move MediaWiki:Proofreadpage_index_template to a module
- Move Template:Header to module
Half-done
- Ebook review process leading to categorisation, then... Category:Ready for export
- Improve index autofill to fetch author links from Commons creator templates. Getting there: MediaWiki:Gadget-Fill Index.js
- Fix {{FI}}, which is invoking full-size images every single time. Merge with and/or deprecate {{large image}}.
- Fix headers on mobile: the tabular structure is unfriendly on narrow screens (main header done, other namespaces pending main header module-ification). See {{header/main block}}
Done
- Tool to convert and import IA page list JSON: MediaWiki:Gadget-ImportPagelist.js
- Fix the print CSS: centre is broken. Fixed: diff
- Fix page numbers in {{TOC begin}}, cf. phab:T232477
- OPDS catalogue of "exportable" ebooks for integration into e-readers: phab:T270387. See Category:Ready for export for link