User:Alex brollo/WIP

Proofreading and book tool

I just applied a trick to get a good pdf book from a proofread work. Take a look at User:Alex brollo/Books/Equitation.

The trick is that the book tool reads plain transclusion perfectly, while it is confused by Template:Page. So I took some of the empty talk pages of Equitation and posted there the code for normal transclusion of the index file, Index:Equitation.djvu. The code is extremely similar to the version using Template:Page. As you can see by editing my ebook, I used Talk:Equitation/Preface, then Talk:Equitation/Chapter 1 to Talk:Equitation/Chapter 4.

The normal code is:

{{Page|Equitation.djvu/25}}
{{Page|Equitation.djvu/26}}

while the plain transclusion code is:

{{Page:Equitation.djvu/25}}
{{Page:Equitation.djvu/26}}

So I launched the book tool on the Talk pages, and the result is pretty good. --Alex brollo (talk) 19:08, 20 January 2010 (UTC)
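The rewrite from Template:Page calls to plain transclusion is mechanical, so it could be scripted. Here is a minimal sketch in Python, assuming each call has exactly the one-parameter form shown above; the function name to_plain_transclusion is hypothetical and is not part of any bot framework.

```python
import re

# Matches {{Page|Some file.djvu/25}}: one unnamed parameter, no nested templates.
PAGE_TEMPLATE = re.compile(r"\{\{\s*Page\s*\|\s*([^{}|]+?)\s*\}\}")


def to_plain_transclusion(wikitext):
    """Rewrite {{Page|X}} template calls as {{Page:X}} plain transclusions."""
    return PAGE_TEMPLATE.sub(r"{{Page:\1}}", wikitext)
```

Calls with extra parameters are deliberately left untouched, since the notes above only cover the simple form.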

#titleparts: what a lovely function!

ROOTPAGENAME: an example of "normal" use of #titleparts

Here is the engine of a new "variable", {{ROOTPAGENAME}}. It is really a normal template without parameters, but I used upper-case characters just to underline how similar its output is to that of other page variables.

{{#switch:{{NAMESPACE}}
|Page=Index:{{BASEPAGENAME}}
|Page talk=Index talk:{{BASEPAGENAME}}
|{{#titleparts:{{FULLPAGENAME}}|1|1}}
}}

I'm using this to build useful templates that transclude something from the "root page" in ns0, in nsPage and in their talk namespaces, and at present to use the root page in ns0 as a "self-template" for the whole work, with an appropriate use of tags to select the needed code. Something like {{:{{ROOTPAGENAME}}}} transcludes the content of the root page from any subpage...
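To make the logic of the #switch explicit, here is a rough Python model of it. It is only a sketch: the function name root_page_name is hypothetical, and the namespace is guessed by splitting at the first colon, which is an assumption that holds for the cases in these notes but is not how MediaWiki resolves namespaces.

```python
def root_page_name(full_page_name):
    """Model of the ROOTPAGENAME template for a given full page name."""
    namespace, sep, title = full_page_name.partition(":")
    if sep and namespace in ("Page", "Page talk"):
        # BASEPAGENAME: the title without its last subpage segment.
        base = title.rsplit("/", 1)[0]
        prefix = "Index:" if namespace == "Page" else "Index talk:"
        return prefix + base
    # Default branch, like {{#titleparts:{{FULLPAGENAME}}|1|1}}:
    # keep only the first "/"-separated segment.
    return full_page_name.split("/", 1)[0]
```

For example, root_page_name("Equitation/Chapter 1") gives "Equitation", and root_page_name("Page:Equitation.djvu/25") gives "Index:Equitation.djvu".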

#titleparts as a "string parser"

But the beloved #titleparts can also be used as a "mini-array" for a large variety of strings, elegantly replacing lots of heavy #switch statements.

Take a look at this "number to Roman" converter:

{{#titleparts:I secolo/II secolo/III secolo/IV secolo/V secolo/VI secolo/VII secolo/
VIII secolo/IX secolo/X secolo/XI secolo/XII secolo/XIII secolo/XIV secolo/XV secolo/
XVI secolo/XVII secolo/XVIII secolo/XIX secolo/XX secolo|1|{{{1|20}}}}}

Two limitations and a warning:

  1. no more than 25 elements;
  2. the whole string to parse can't be longer than 255 characters;
  3. the function returns the first element (and only the first one) capitalized. This was the cause of a bug that was terribly difficult to fix in a complex template.

#titleparts also works pretty well in cooperation with #section, the next topic in these notes.
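The "mini-array" idea and its limits can be modelled in a few lines of Python. This is a simplified sketch of the behaviour described in these notes, not the real parser function: the name titleparts_lookup is hypothetical, and raising an error when a limit is exceeded is my assumption (the notes do not say what the parser function does in that case).

```python
MAX_PARTS = 25     # limit on the number of elements, as noted above
MAX_LENGTH = 255   # limit on the length of the whole packed string


def titleparts_lookup(packed, index):
    """Return element number `index` (1-based) of a '/'-separated string."""
    if len(packed) > MAX_LENGTH:
        raise ValueError("packed string is longer than 255 characters")
    parts = packed.split("/")
    if len(parts) > MAX_PARTS:
        raise ValueError("more than 25 elements")
    if not 1 <= index <= len(parts):
        raise IndexError("no such element")
    # The quirk from the warning: only the first element gets capitalized.
    parts[0] = parts[0][:1].upper() + parts[0][1:]
    return parts[index - 1]
```

With the packed string "uno/due/tre", element 1 comes back as "Uno" while element 3 stays "tre", which is exactly the kind of asymmetry that can hide a bug in a complex template.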


Simple bidimensional arrays with #section

I posted at User:Alex brollo/Data a two-dimensional 4 x 4 array whose cells are named 0.0, 0.1 ... 1.0 ... 3.3

The content of any of those cells can be retrieved with this extremely simple code:

{{#section:NameOfThePage|x.y}}

so that this code:

{{#section:User:Alex brollo/Data|2.1}}

gives this result: content of cell 2.1

You can build large arrays (thousands of cells) without any server delay; section search is really fast and effective, since it's simply a substring search (IMHO). There is one limitation and one warning.

  1. the limitation is: the cells can't contain functions (so you can't build an "array of template codes" :-( ) nor other sections
  2. the warning is: use quotes as string delimiters if the name of the section contains spaces!
    1. this runs: <section begin=1.2 />
    2. this runs too: <section begin="1 2" />
    3. this gives unpredictable results: <section begin=1 2 />
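The "substring search" view of #section can be sketched in Python. This is a toy model under stated assumptions, not the extension's code: the function name get_section is hypothetical, each cell is assumed to be wrapped in a matching <section begin=... /> and <section end=... /> pair, and names containing spaces are only recognized when quoted, following the warning above.

```python
import re


def get_section(page_text, name):
    """Return the text between the begin and end markers of a named section."""
    escaped = re.escape(name)
    forms = ['"' + escaped + '"']
    if not any(ch.isspace() for ch in name):
        forms.append(escaped)  # unquoted form only for names without spaces
    name_pattern = "(?:" + "|".join(forms) + ")"
    begin = r"<section\s+begin=" + name_pattern + r"\s*/>"
    end = r"<section\s+end=" + name_pattern + r"\s*/>"
    match = re.search(begin + r"(.*?)" + end, page_text, flags=re.DOTALL)
    return match.group(1) if match else ""
```

A two-dimensional "array" is then just a page full of such cells, and a lookup like get_section(data_page, "2.1") plays the role of {{#section:User:Alex brollo/Data|2.1}}.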

Main page of a text as a "self-template"

Your en.source header template is really simple, so this trick is not so useful here, but perhaps the idea can suggest something useful.

Imagine enclosing the header template in onlyinclude tags (I'm using the Equitation header as an example):

<onlyinclude>{{Equitation style}}
{{header
 | title      = Equitation
 | author     = Henry L. de Bussigny
 | section    = 
 | previous   = 
 | next       = [[/Preface|Preface]]
 | notes      = See also the [[Index:Equitation.djvu|original scanned version]] (page images with accompanying text)
}}</onlyinclude>

If you add this tag, you can use the whole main page as a template for subpages and call it directly with {{:Equitation}} or, much better, with the earlier {{:{{ROOTPAGENAME}}}} syntax, which is general. You can pass parameters too if you edit the code of the main header:

<onlyinclude>{{Equitation style}}
{{header
 | title      = Equitation
 | author     = Henry L. de Bussigny
 | section    = {{{1|}}}
 | previous   = {{{2|}}}
 | next       = {{{3|[[/Preface|Preface]]}}}
 | notes      = See also the [[Index:Equitation.djvu|original scanned version]] (page images with accompanying text)
}}</onlyinclude>

Then you can call it, e.g. from Equitation/Preface, with:

{{:{{ROOTPAGENAME}}|Preface|[[Equitation{{!}}Cover page]]|[[Equitation/Chapter_1{{!}}Introduction]]}}

or better

{{:{{ROOTPAGENAME}}|Preface|[[../{{!}}Cover page]]|[[../Chapter_1{{!}}Introduction]]}}

I'm going to test this on Equitation (bugs are dangerous beasts...;-)

Yes it runs. :-)

As you can see, the last code never mentions the name of the main page; this means that the goal of "relativizing" the links is fully achieved.

A new toolbar button for a useful js function

Happy to share this button with you (image: Button ocr fix.png). It is connected to a function that fixes small, common OCR scannos with a single click (spaces near punctuation, words broken at the end of a line, accents and apostrophes common in Italian). It's simple to customize, so you can add other regexes for specific scans. You can find the js engine in my it.source monobook.js; it is derived from scripts by User:Pathoschild, with an original contribution by it:User:FiloSottile.

My monobook is a little heavy and a work in progress; tell me if you need help understanding how that mess works ;-)
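To give an idea of how such a fixer is organized, here is a minimal sketch of the same approach in Python rather than in the JavaScript of the real button. The three regexes are illustrative assumptions of mine, not the rules actually used in monobook.js, and the name fix_scannos is hypothetical.

```python
import re

# (pattern, replacement) pairs, applied in order; add more for specific scans.
SCANNO_RULES = [
    (re.compile(r" +([,;:.!?])"), r"\1"),        # stray spaces before punctuation
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),       # word broken at the end of a line
    (re.compile(r"(\w)' +(\w)"), r"\1'\2"),      # space after an Italian apostrophe
]


def fix_scannos(text):
    """Apply every rule in SCANNO_RULES to the text, one after another."""
    for pattern, replacement in SCANNO_RULES:
        text = pattern.sub(replacement, text)
    return text
```

Keeping the rules in a plain list is what makes customization simple: a new scan only needs a new pair appended.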

A simple python word parser

I sometimes use a simple python script to parse any text and extract its "words". It simply converts any string into a list of "words" and "other", using a free list of characters to distinguish them (the user can pass this parameter by listing either the "word characters" or the "non-word characters", with deeply different results). The resulting list has the very interesting feature that it can be joined again to give back exactly the source text, and it can be scanned very simply to search for a specific word, classify and count words, replace specific words and so on.

For example, I used the script to parse the pages of an old text in ancient Portuguese (I don't know Portuguese...), then loaded the sorted list of distinct words into the Page talk ns: see Page 40 of that proofread work and its talk page, and the following pages up to page 70. My project is to build a dictionary of all the words, find the wrong ones, and fix the whole text with my bot.
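The script itself is not reproduced in these notes, so what follows is only a minimal sketch of the idea, under my own assumptions: the names parse_words and count_words are hypothetical, only the "word characters" variant of the parameter is modelled, and by default a word character is anything alphanumeric.

```python
from collections import Counter
from itertools import groupby


def parse_words(text, word_chars=None):
    """Split text into (is_word, run) pairs of alternating word/other runs."""
    if word_chars is None:
        is_word_char = str.isalnum
    else:
        allowed = set(word_chars)

        def is_word_char(ch):
            return ch in allowed

    return [(flag, "".join(run)) for flag, run in groupby(text, key=is_word_char)]


def count_words(tokens):
    """Count the word runs only, ignoring case."""
    return Counter(run.lower() for flag, run in tokens if flag)
```

Because every character lands in exactly one run, "".join(run for _, run in tokens) gives back the source text unchanged, so single words can be replaced in the list without disturbing spacing or punctuation.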

Inside poem "tag"

Poem is really tricky, with its double nature as a "div class" and a "compiler directive". Nevertheless its effect is really simple. Take a look at this, then browse the wikicode.

How poem works

The following verses don't use the poem tag:

Lasciate ogni speranza voi ch'entrate,
    Per me si va tra la perduta gente
    per me si va nell'eterno dolore

The resulting html code is the following:

<div class="poem">
<p>Lasciate ogni speranza voi ch'entrate,<br>
&nbsp;&nbsp;&nbsp;&nbsp;Per me si va tra la perduta gente<br>
&nbsp;&nbsp;&nbsp;&nbsp;per me si va nell'eterno dolore</p>
</div>

By contrast, here I used the poem tag:

Lasciate ogni speranza voi ch'entrate,
    Per me si va tra la perduta gente
    per me si va nell'eterno dolore

As you can see by browsing the source html of this page, the resulting html code is absolutely identical:

<div class="poem">
<p>Lasciate ogni speranza voi ch'entrate,<br>
&nbsp;&nbsp;&nbsp;&nbsp;Per me si va tra la perduta gente<br>
&nbsp;&nbsp;&nbsp;&nbsp;per me si va nell'eterno dolore</p>
</div>

So the poem tag, as a compiler directive, does the following:

  1. opens and closes a div "class=poem";
  2. opens and closes a single p;
  3. converts new lines into br tags;
  4. converts any space at the beginning of the rows into &nbsp; html entities.

No more than this.
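The four steps are simple enough to be imitated in a few lines of Python. This is only a sketch of the behaviour listed above, not the code of the poem extension: the function name render_poem is hypothetical, and the sketch does no html escaping and no wikitext parsing inside the verses.

```python
def render_poem(source):
    """Imitate the html produced for the content of a poem tag."""
    rendered = []
    for line in source.split("\n"):
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        # Step 4: each leading space becomes a non-breaking space entity.
        rendered.append("&nbsp;" * indent + stripped)
    # Step 3: new lines become br tags.
    body = "<br>\n".join(rendered)
    # Steps 1 and 2: one div with class poem around a single p.
    return '<div class="poem">\n<p>' + body + "</p>\n</div>"
```

Feeding it the three verses above, with four spaces in front of the second and third, reproduces the html block shown earlier.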