Talk:Literary Research Guide

From Wikisource
Information about this edition
Edition: Sixth edition (2014) plus revisions through December 2017, when it was taken offline
Contributor(s): czar
Level of progress:


Wanted to dump the code I used to convert from DocBook XML to wikitext. The pandoc call, with the regex cleanup I ran before and after it:

pandoc XML/LRGfixed.xml -f docbook -t mediawiki -s -o XML/LRG.mediawiki

import re  # `text` below is assumed to hold the whole file as one string

# pre pandoc: run on the DocBook XML
text = re.sub(r'<emphasis[\s]*role=\"(roman|tight)\"[\s]*>([^<]+)<\/emphasis>',r'\2',text) # need to remove roman/tight type so citetitles can process properly without nested tags
text = re.sub(r'<emphasis[\s]*(role=\"italic\")?[\s]*>([^<]+)<\/emphasis>',r"''\2''",text) # only italic and unspecified emphasis are left at this point; convert both to wiki italics
text = re.sub(r'<citetitle[\s]*(pubwork=\"(?!article)[\w]+\")?(role=\"(periodical|other|other2)\")?[\s]*>([^<]+)<\/citetitle>',r"''\4''",text) # italicize citetitles, skipping pubwork="article"

# post pandoc: run on the MediaWiki output
text = re.sub(r'=+\s\s=+\n\n',r'',text) # rmv erroneous (empty) headers
text = re.sub(r'\[{3}',r'&#91;[[',text) # escape the outer bracket of [[[ as an HTML entity so the inner [[ still works in wikicode
text = re.sub(r'\]{3}',r']]&#93;',text) # same for the closing ]]]
text = re.sub(r'\[{2}#((\w)\d+)\|\1\]{2}',r'[[../\2#\1|\1]]',text) # repair links for wikisource: same-page anchors become links to the sibling subpage named by the entry's first character
text = re.sub(r'=\s([^=]+)\s=\n',r'{{title|\1}}\n',text) # format top-level titles for wikisource
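
A minimal sketch of how the substitutions above could be wrapped into two functions and checked on small inputs before running them over the whole book. The function names `pre_pandoc` and `post_pandoc` and all sample strings (the title, the entry number G45) are made up for illustration; only the regexes come from the snippet above.

```python
import re


def pre_pandoc(text: str) -> str:
    """Flatten DocBook inline markup before the XML goes to pandoc."""
    # Unwrap roman/tight emphasis so it does not nest inside citetitle.
    text = re.sub(r'<emphasis[\s]*role=\"(roman|tight)\"[\s]*>([^<]+)<\/emphasis>', r'\2', text)
    # Remaining emphasis (italic or no role) becomes wiki italics.
    text = re.sub(r'<emphasis[\s]*(role=\"italic\")?[\s]*>([^<]+)<\/emphasis>', r"''\2''", text)
    # Citetitles become wiki italics, except pubwork="article".
    text = re.sub(
        r'<citetitle[\s]*(pubwork=\"(?!article)[\w]+\")?(role=\"(periodical|other|other2)\")?[\s]*>([^<]+)<\/citetitle>',
        r"''\4''",
        text,
    )
    return text


def post_pandoc(text: str) -> str:
    """Clean up pandoc's MediaWiki output for Wikisource subpages."""
    text = re.sub(r'=+\s\s=+\n\n', r'', text)           # drop empty headers
    text = re.sub(r'\[{3}', r'&#91;[[', text)           # [[[ -> entity + [[
    text = re.sub(r'\]{3}', r']]&#93;', text)           # ]]] -> ]] + entity
    text = re.sub(r'\[{2}#((\w)\d+)\|\1\]{2}', r'[[../\2#\1|\1]]', text)  # anchor -> sibling subpage
    text = re.sub(r'=\s([^=]+)\s=\n', r'{{title|\1}}\n', text)            # level-1 header -> {{title}}
    return text


# Hypothetical sample inputs, not taken from the actual book.
xml_sample = '<citetitle pubwork="book">Moby <emphasis role="roman">Dick</emphasis></citetitle>'
wiki_sample = "= Databases =\nSee [[[#G45|G45]]].\n"

print(pre_pandoc(xml_sample))
print(post_pandoc(wiki_sample))
```

The order of the substitutions matters: the roman/tight unwrapping has to run before the citetitle rule, because that rule only matches titles with no nested tags, and the triple-bracket escaping has to run before the link repair so that the inner `[[#...]]` is what gets rewritten.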

czar 14:25, 24 June 2018 (UTC)


This book was converted from XML, so while I get that the idea is usually to stay true to the original layout, there's no stylistic reference here. I'm open to ideas on how best to format the content. I'll look back to the print editions. I can't find a copy of the sixth edition (perhaps it was online-only), but I have a print version of the fifth edition en route, which might give some layout/markup ideas. czar 17:08, 24 June 2018 (UTC)