User:Inductiveload/InductiveBot information

From Wikisource

InductiveBot is based on pywikipedia, plus a few scripts that run on top of it to perform more complex edits than the out-of-the-box scripts can manage.

I generally name my scripts "aa-xxxxx.py" so that they float to the top of the list of files, and my text input files "ab-xxxx.txt".

Custom scripts

Handler scripts

One class of script I have found very useful is the "handler script": a script that in turn calls a pywikipedia script. For common operations, you then don't have to type in a complex command-line argument list; just load up the script from last time, change the relevant parameters, and away you go.

Replace.py handler script
import subprocess

THROTTLE = 10  # minimum wait between page saves, in seconds (-pt)

FIX = 'cat'  # the relevant fix in user-fixes.py
FILE = r'c:\ab-filelist.txt'  # file containing the list of pages to apply the fix to

# Argument list passed to replace.py
cmd = ['python', 'replace.py',
       '-fix:' + FIX,
       '-file:' + FILE,
       '-pt:%d' % THROTTLE,
       ]

subprocess.call(cmd)
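
The same idea can be wrapped in a function, so that one handler file covers several fixes. This is only a sketch: the names build_replace_command and run_replace and the dry_run switch are my own illustration, not part of pywikipedia. The -fix, -file and -pt arguments are the ones used in the script above.

```python
import subprocess


def build_replace_command(fix, page_list, throttle=10):
    """Return the argument list for one replace.py run (hypothetical helper)."""
    return [
        'python', 'replace.py',
        '-fix:' + fix,            # name of the fix in user-fixes.py
        '-file:' + page_list,     # file listing the pages to work on
        '-pt:%d' % throttle,      # put throttle, as in the handler script
    ]


def run_replace(fix, page_list, throttle=10, dry_run=True):
    """Print the command when dry_run is set, otherwise run it."""
    cmd = build_replace_command(fix, page_list, throttle)
    if dry_run:
        print(' '.join(cmd))
        return 0
    return subprocess.call(cmd)


# With dry_run left on, this only shows what would be executed.
run_replace('cat', r'c:\ab-filelist.txt')
```

Leaving dry_run on by default means a parameter can be checked before the bot touches any pages; pass dry_run=False to actually start the run.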

A useful pair of functions for Page: namespace

The following functions decompose a page in the Page: namespace into its header, body, and footer, and recompose them neatly after modification. The noincludes around the header and footer must be well formed, or the JavaScript that presents them in the edit page will break.

Decompose/compose function script
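import re


def decomposePage(wikiText):
    """Split Page: namespace wikitext into (header, body, footer)."""

    # The header and footer are matched lazily and the body greedily, so that
    # <noinclude> sections inside the body are not absorbed into the header.
    regex = re.compile(r'(?ms)^<noinclude>(.*?)</noinclude>(.*)<noinclude>(.*?)</noinclude>$')
    m = regex.search(wikiText)
    if m:
        header = m.group(1)
        body   = m.group(2)
        footer = m.group(3)
        return header, body, footer

    else:
        print("Can't find header, body, footer")
        return None


def composePage(header, body, footer):
    """Rebuild the wikitext from the three parts."""

    return '<noinclude>%s</noinclude>%s<noinclude>%s</noinclude>' % (header, body, footer)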
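
A rough sketch of how the pair might be used together: decompose the page, change only the body, and put it back together. The two functions are restated compactly so that the snippet runs on its own. The editBody wrapper and the sample page text are made up for illustration.

```python
import re

# Header and footer lazy, body greedy, so that <noinclude> sections
# inside the body stay in the body.
PAGE_RE = re.compile(
    r'(?ms)^<noinclude>(.*?)</noinclude>(.*)<noinclude>(.*?)</noinclude>$')


def decomposePage(wikiText):
    """Return (header, body, footer), or None if the page is malformed."""
    m = PAGE_RE.search(wikiText)
    return m.groups() if m else None


def composePage(header, body, footer):
    """Rebuild the page wikitext from its three parts."""
    return '<noinclude>%s</noinclude>%s<noinclude>%s</noinclude>' % (
        header, body, footer)


def editBody(wikiText, transform):
    """Apply transform to the body only (hypothetical wrapper).

    The text is returned unchanged if it cannot be decomposed.
    """
    parts = decomposePage(wikiText)
    if parts is None:
        return wikiText
    header, body, footer = parts
    return composePage(header, transform(body), footer)


# Invented sample page: fix a typo in the body, leave header and footer alone.
page = ('<noinclude><pagequality level="1" user="Example" /></noinclude>'
        'Teh quick brown fox.'
        '<noinclude><references/></noinclude>')
fixed = editBody(page, lambda body: body.replace('Teh', 'The'))
print(fixed)
```

Returning the text unchanged when decomposition fails keeps a malformed page from being saved with a mangled header or footer.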


vorbis=1

Original replacement:

python pwb.py replace -search:"insource:/vorbis=\"?1\"?/" '[^%]vorbis="1"' "%vorbis=\"1\"%%T257066%" -summary:"Comment out vorbis=1 while T257066 (score extension disabled) is resolved." -namespace:0

Inverse:

python pwb.py replace -search:"insource:/vorbis=\"?1\"?/" "%vorbis=\"1\"%%T257066%" 'vorbis="1"'