InductiveBot is based on pywikipedia, plus a few scripts that run on top of pywikipedia to perform more complex edits that the out-of-the-box scripts cannot handle.
Generally I name my scripts "aa-xxxxx.py" so that they float to the top of the file list. Text input files I name "ab-xxxx.txt".
One class of script I have found very useful is the "handler script": a script that in turn calls a pywikipedia script. For common operations, you don't have to type in a complex command-line argument list; just load up the script from last time, change the relevant parameters, and away you go.
|Replace.py handler script|
import subprocess

THROTTLE = 10
FIX = 'cat'  # the relevant fix in user-fixes.py
FILE = r'c:\ab-filelist.txt'  # file containing the list of pages to apply the fix to

cmd = ['python', 'replace.py',
       '-fix:' + FIX,
       '-file:' + FILE,
       '-pt:%d' % THROTTLE]
subprocess.call(cmd)
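If you end up with several handler scripts of this kind, the command-building part can be pulled into a small function. The following is only a sketch: `build_replace_cmd` and `run_replace` are hypothetical names of my own choosing, not part of pywikipedia, and the flags are simply the ones used in the handler above. The `dry_run` option prints the command without running it, which is handy for checking the parameters first.

```python
import subprocess
import sys


def build_replace_cmd(fix, page_list, throttle=10, script='replace.py'):
    """Return the argument list for one replace.py run (hypothetical helper)."""
    return [
        sys.executable,          # run with the same interpreter as this script
        script,
        '-fix:' + fix,           # name of the fix in user-fixes.py
        '-file:' + page_list,    # text file listing the pages to process
        '-pt:%d' % throttle,     # throttle value, passed exactly as in the handler above
    ]


def run_replace(fix, page_list, throttle=10, dry_run=False):
    """Print the command, then run it unless dry_run is set.

    Returns the exit code of replace.py, or None for a dry run.
    """
    cmd = build_replace_cmd(fix, page_list, throttle)
    print(' '.join(cmd))
    if dry_run:
        return None
    return subprocess.call(cmd)


if __name__ == '__main__':
    # Dry run only: shows the command that would be executed.
    run_replace('cat', r'c:\ab-filelist.txt', throttle=10, dry_run=True)
```

Using `sys.executable` instead of a bare `'python'` is a design choice of this sketch: it avoids picking up a different Python installation from the PATH.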
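A useful pair of functions for pages in the Page: namespace: decomposePage splits the wikitext into header, body and footer (the header and footer being the parts wrapped in noinclude tags), and composePage puts the three parts back together.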
|Decompose/compose function script|
import re

def decomposePage(wikiText):
    # Header and footer are wrapped in <noinclude> tags; the body sits between them.
    regex = re.compile(r'(?ms)^<noinclude>(.*)</noinclude>(.*?)<noinclude>(.*)</noinclude>$')
    m = regex.search(wikiText)
    if m:
        header = m.group(1)
        body = m.group(2)
        footer = m.group(3)
        return header, body, footer
    else:
        print("Can't find header, body, footer")
        return None

def composePage(header, body, footer):
    # Inverse of decomposePage: rebuild the full page text.
    return '<noinclude>%s</noinclude>%s<noinclude>%s</noinclude>' % (header, body, footer)
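A typical use of this pair is to change only the body of a page while leaving the header and footer untouched. The sketch below repeats compact versions of the two functions so that it runs on its own; `editBody` is a hypothetical helper added for illustration, and the sample page text is made up rather than taken from a real page.

```python
import re

PAGE_RE = re.compile(
    r'(?ms)^<noinclude>(.*)</noinclude>(.*?)<noinclude>(.*)</noinclude>$')


def decomposePage(wikiText):
    """Split page text into (header, body, footer), or return None if no match."""
    m = PAGE_RE.search(wikiText)
    if m:
        return m.group(1), m.group(2), m.group(3)
    return None


def composePage(header, body, footer):
    """Rebuild the page text from its three parts."""
    return '<noinclude>%s</noinclude>%s<noinclude>%s</noinclude>' % (
        header, body, footer)


def editBody(wikiText, transform):
    """Apply transform to the body only (hypothetical helper, not in the original)."""
    parts = decomposePage(wikiText)
    if parts is None:
        return wikiText  # leave pages with an unexpected layout untouched
    header, body, footer = parts
    return composePage(header, transform(body), footer)


# Made-up sample text standing in for a real page.
sample = '<noinclude>HEADER</noinclude>teh body text<noinclude>FOOTER</noinclude>'
fixed = editBody(sample, lambda body: body.replace('teh', 'the'))
print(fixed)
```

Because composePage is the exact inverse of decomposePage for text that matches the pattern, decomposing and recomposing a page without changes gives back the original text, so a bot built this way should not produce spurious edits.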