User:Pathoschild/Standardization

From Wikisource
Regex standardization

This page describes a script that uses the Regex menu framework to comprehensively standardize, normalize, and update any page on Wikisource. The code can be obtained from User:Pathoschild/monobook.js. Note that it is highly experimental and very likely to have glitches.

Changes

All namespaces

  • Syntax
    • Templates: remove msg: and template modifiers, and remove parameter whitespace.
    • Headers: normalize whitespace (no space between syntax and header text; no blank lines between header and following text; one blank line above header, unless it follows another header).
    • Lists: normalize whitespace (one space between item syntax and text).
    • Categories: normalize capitalization and parameter whitespace.
    • Links: normalize pipe whitespace, remove redundant link text ([[foo|foo]] to [[foo]]), and replace underscores with spaces.
  • Sorting
    • Group license templates, categories, and interlanguage links at the bottom of the page in separate lists by type, with a single blank line between lists.
    • Sort each list alphabetically (case-insensitive).
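
The link and sorting rules above can be sketched roughly as follows. This is only an illustration, not the code from User:Pathoschild/monobook.js: the function names (normalizeLinks, sortCaseInsensitive, sortFooter) are hypothetical, the sketch assumes each footer element sits on its own line, it recognizes license templates only by a "PD-" prefix, and it assumes the order licenses, categories, interlanguage links.

```javascript
// Hypothetical helpers for illustration; not taken from monobook.js.

// Trim whitespace around the pipe, replace underscores in the target with
// spaces, and drop link text that merely repeats the target.
function normalizeLinks(text) {
  const linkPattern = /\[\[([^\[\]|]+?)(?:\|([^\[\]]*?))?\]\]/g;
  return text.replace(linkPattern, (match, target, label) => {
    const cleanTarget = target.replace(/_/g, ' ').trim();
    if (label === undefined) {
      return `[[${cleanTarget}]]`;
    }
    const cleanLabel = label.trim();
    return cleanLabel === cleanTarget
      ? `[[${cleanTarget}]]`
      : `[[${cleanTarget}|${cleanLabel}]]`;
  });
}

// Return a copy of the list sorted alphabetically, ignoring case.
function sortCaseInsensitive(lines) {
  return [...lines].sort((a, b) => {
    const x = a.toLowerCase();
    const y = b.toLowerCase();
    return x < y ? -1 : x > y ? 1 : 0;
  });
}

// Move license templates, categories, and interlanguage links to the bottom
// of the page, one sorted list per type, separated by single blank lines.
// Assumption: each such element occupies a line of its own.
function sortFooter(text) {
  const licenses = [];
  const categories = [];
  const interwikis = [];
  const body = [];
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (/^\{\{PD-[^{}]*\}\}$/.test(trimmed)) {
      licenses.push(trimmed);
    } else if (/^\[\[Category:[^\[\]]+\]\]$/i.test(trimmed)) {
      categories.push(trimmed);
    } else if (/^\[\[[a-z]{2,3}:[^\[\]]+\]\]$/.test(trimmed)) {
      interwikis.push(trimmed);
    } else {
      body.push(line);
    }
  }
  const lists = [licenses, categories, interwikis]
    .filter((list) => list.length > 0)
    .map((list) => sortCaseInsensitive(list).join('\n'));
  return [body.join('\n').trim(), ...lists]
    .filter((part) => part !== '')
    .join('\n\n');
}
```

For example, normalizeLinks("[[foo_bar | foo bar]]") returns "[[foo bar]]", and sortFooter leaves the page text in place while appending the three sorted lists beneath it.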

Author namespace

  • Normalize {{author}} parameter layout.
  • Convert deprecated {{author}} parameters and add new parameters.
  • Remove categories added automatically by {{author}}.
  • Update redirects from {{author-PD-*}} to {{PD-*}}.
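
As a rough illustration of the last item, the redirect update could be a single replacement. The function name updateAuthorLicenseTemplates is hypothetical and the pattern is an assumption rather than the script's actual regex. It relies on the MediaWiki convention that the first letter of a template name is case-insensitive.

```javascript
// Hypothetical sketch; not taken from monobook.js.
// Rewrite {{author-PD-*}} template calls to {{PD-*}}, keeping the rest of
// the template name and any parameters unchanged.
function updateAuthorLicenseTemplates(text) {
  return text.replace(/\{\{\s*[Aa]uthor-PD-/g, '{{PD-');
}
```

For example, updateAuthorLicenseTemplates("{{author-PD-old}}") returns "{{PD-old}}".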

Main namespace

Known glitches

Author pages

  • When no date categories are present and a full date is given in the deprecated {{author}} 'date' parameter, the script mistakenly uses the first number instead of the year for the birthyear (e.g., for "June 18, 1986" it takes 18 rather than 1986).
    1. Cause: the regex pattern matches the first number. This is difficult to resolve; the script may need to collect all numbers and use the largest, which will work in most cases.
    2. Workaround: correct manually.
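
The possible fix mentioned above, taking the largest number in the date, might look like the sketch below. The function name extractYear is hypothetical and this is not code from the script. As noted, the heuristic only works in most cases: it fails whenever the year is smaller than the day of the month.

```javascript
// Hypothetical sketch; not taken from monobook.js.
// A first-number match such as "June 18, 1986".match(/\d+/)[0] yields "18".
// Collecting every number and keeping the largest yields the year instead.
function extractYear(dateText) {
  const numbers = dateText.match(/\d+/g);
  if (numbers === null) {
    return null; // no digits at all, so no year can be extracted
  }
  return Math.max(...numbers.map(Number));
}
```

For example, extractYear("June 18, 1986") returns 1986, and extractYear("unknown") returns null.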

All namespaces

  1. Occasionally, an extra blank line will be inserted above the sorted elements. This is a non-issue.
    • Causes: unknown, possibly related to the end whitespace regex.
    • Workaround: remove the extra line, or ignore it (no significant effect on the output).