User:Inductiveload/Sandbox/Formatting for export

Preparing for export

How to prepare works for export to e-book formats

Preparing works for export

Certain things must be checked before marking a work as "ready for export".

Headers are not exported

The {{header}} template is not exported[1]. This includes the notes field. There should be no content in that field that is necessary for the navigation of the ebook. For example, do not put a Table of Contents in that field if there is no TOC also in the main text body.

Also, do not rely on a single "next" link as the only way to begin navigating the book. That link is part of the header, so it is lost on export. Use a TOC on the front page of the work.

Listing pages for export

The export tool looks for links to subpages on the top-level page and uses them in the order in which they appear. Usually this works well, as most works are either on a single page or have a TOC on the top page that lists all subpages in order.
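A front-page TOC of the kind described above might look like the sketch below. The chapter titles are hypothetical; the [[/Subpage/]] relative-link form is standard MediaWiki syntax, but any ordinary list of links to the subpages, in reading order, serves the same purpose.

```wikitext
<!-- Hypothetical front page of a work with three subpages -->
{{center|Contents}}
* [[/Chapter 1/]]
* [[/Chapter 2/]]
* [[/Chapter 3/]]
```

Because the list sits in the main text body rather than in the {{header}} notes field, it is available both to the export tool and to readers of the ebook.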

If a work does not have such a TOC (e.g. it has multiple TOCs on subpages), you must add a TOC that WS-export can read, using class="ws-summary". (TODO: this needs an example in use)

<div class="ws-summary" style="visibility:hidden;">
 ....
</div>
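Pending a real example, the skeleton above could be filled in along the following lines. This is only a sketch: the page and chapter names are hypothetical, and it assumes that WS-export follows the links inside the ws-summary element in the order given.

```wikitext
<!-- Hypothetical work whose visible TOCs live on the subpages -->
<div class="ws-summary" style="visibility:hidden;">
* [[Example Work/Part 1/Chapter 1|Chapter 1]]
* [[Example Work/Part 1/Chapter 2|Chapter 2]]
* [[Example Work/Part 2/Chapter 1|Chapter 3]]
</div>
```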

Formatting for export

Shortcut:
H:EXPFORM

Some formatting that works well on a device with a large screen and a feature-rich browser, such as a computer, does not work so well on less capable devices like e-readers. You can do several things to make the EPUB and MOBI exports look and function better on e-readers. When formatting a work with export in mind, the main points to consider are:

  • E-reader devices generally have much smaller screens
  • E-reader devices, apps or the ebook export tools may not support all formatting features that work in browsers
  • Some content visible on Wikisource is excluded from the exported formats

Formatting for small screens

Smartphones often have an effective pixel width[2] of around 350px. For a "normal" font size, this is about 23em. Because e-readers can adjust the font size, you should be cautious when making assumptions about screen width in relative terms such as "em". If the user has set a large font (perhaps due to their vision), they may have a page only 10em wide.

Avoid fixed-width formatting

Content spilling off a narrow page.

Any formatting that uses a "fixed width" is at risk of not fitting on a mobile device screen, especially if the width ends up over about 350px.

In the following examples, the red box is a simulation of a small screen, and any content that spills out is either not visible at all or must be scrolled to be seen. Blue boxes are examples of a larger screen.

Here, we have a fixed-width {{block center}} template that is wider than the screen. Everything outside the red box will spill off the screen on an e-reader of that size:

{{block center|width=500px|{{lorem ipsum}}}}

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

The solution here is to use the max-width CSS property. This means that the content will not grow larger than the stated size, but it can shrink if needed, for example when the screen is narrower than the stated 500px. If the screen is wider than 500px, the content will be limited to 500px.

{{block center|max-width=500px|{{lorem ipsum}}}}

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Avoid wide fixed-width images

Images are another common cause of content spilling off the page, as their sizes are specified in pixels and are frequently wider than 350px:

[[File:Frontispiece, What Katy Did at School, 1876.png|500px]]

Some ebook readers (and the mobile Wikisource site) provide extra logic to ensure images fit the screen, so you may or may not run into this issue in testing.

An alternative is a template like {{img float}} or {{large image}} that provides CSS that prevents the image being larger than its container, but still allows the image to expand up to the given pixel size, if there is space to do so:

{{large image|[[File:Frontispiece, What Katy Did at School, 1876.png|500px]]}}

On a 350px screen, the image will not spill out:

On a 600px screen, the image goes up to the specified 500px:

The {{FI}} template is similar, but can also be a problem, because it loads the full-size image. This can inflate the size of the ebook by 10–50 times, resulting in a file over 100MB. Some e-readers have only 1–2GB of storage, so this is quite limiting. {{large image}} allows the same shrink-to-fit as {{FI}}, but will not request the full-sized image.

Avoid fixed indenting

Indenting by a large amount with the following construction (sometimes used to simulate right alignment) can spill off the page:

:::::::::::::::Indented content
Indented content

Depending on what you are trying to achieve, one of the following might be more suitable:

{{right|Right aligned}}
{{right|offset=2em|Right aligned with offset}}
{{center|Centered}}
Right aligned
Right aligned with offset

Centered text

Avoid fixed columns

Fixed column layouts look fine on a computer, but they can become very squeezed on e-readers:

{{div col|3}}
{{lorem ipsum}}
{{div col end}}

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

You can specify the minimum width of the columns, so that the number of columns is reduced on narrow screens. The correct minimum width may well depend on the content, but around 12em is generally a good lower bound; below that, columns tend to start looking very squeezed.

{{div col|3|width=20em}}
{{lorem ipsum}}
{{div col end}}

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Table-based column templates like {{multicol}} cannot do this, and they are very likely to produce ebook content that is difficult to read due to extremely narrow columns, especially if there are more than 2 columns.

Sometimes, for things like side-by-side translations (as is common in bilingual treaties, for example), there might not be much you can do about this.

Complex or Wikitext-only markup

Obsolete tags

Obsolete HTML tags like <center> are not understood by the ebook formatters. Do not use them; prefer templates like {{center}} instead. Such tags are often lint errors too, so they should be removed anyway.
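For instance, a centred line written with the obsolete tag can be converted as sketched below. The sample text is arbitrary, and the comments are only there to label the two forms.

```wikitext
<!-- Obsolete: not understood by the ebook formatters -->
<center>THE END</center>

<!-- Preferred: template equivalent -->
{{center|THE END}}
```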

Table alignment

The following construct for centring a table on the page does not export, and the table will be aligned to the left of the page:

{| align=center
|...
|}

Instead, use the CSS margin:auto; style to do this in a modern and portable manner:

{| style="margin:auto;"
|...
|}

Dot leaders

Table dot-leaders generally do not export well, as they are generated by a complex "hack" that some ebook readers do not understand.

Other considerations

Page breaks

The {{page break}} template should be used to force page breaks in ebooks. It contains special CSS that ebook readers can use to paginate content. This is often useful in the front matter of books where the content should not flow together:

{{c|Page 1}}
{{page break|label=}}
{{c|Page 2 - this will be a new page in an e-reader}}

Page 1

Page 2 - this will be a new page in an e-reader

Testing

You can test e-book formatting in two ways:

  • Viewing the online page in a browser's "mobile view".
  • Downloading an EPUB or MOBI file and viewing it on an e-reader or in an e-reader app. Only this method allows you to check for issues like missed sections.

Online viewing

You can test how a page looks in a mobile browser (which is broadly similar to most e-reader devices) by using the "Responsive Mode" in your browser. In Firefox, this is Ctrl-Shift-M. In Chrome it is also Ctrl-Shift-M, but the developer tools have to be open first.

As a rule of thumb, if the work looks OK in both Layout 1 (full-screen width) and Layout 2 (constrained central column), it will generally be OK on mobile. However, Layout 2 is still about 50% wider than a phone screen, so you could miss some issues.

Using an e-reader or e-reader program

You can test e-reader compatibility by downloading the EPUB or MOBI file as normal and opening it on an e-reader device or with an e-reader program or simulator.

Native desktop programs that aren't dedicated simulators generally use fully-capable HTML renderers (like browsers do) so they may do better than real devices at rendering content.

Examples of e-reader programs

Examples of simulators that attempt to render an ebook as on a device:

Issues to address

Wikisource issues

Some site-wide issues lead to problems in ebooks. Not all of them may be tractable.

  • Dot-leader tables: as mentioned, these do not look good due to the hacks used to format them. There is probably not a lot that can be done about this, other than simply not using them.
  • {{block center}} with fixed width: perhaps these can be changed to max-width across the board. Is there any case where a block center must not shrink to fit its container?
  • Sidenotes rarely work in ebooks. Generally they are simply inlined with the surrounding text, usually with a fairly acceptable result. Again, there is probably not much that can be done here.
  • {{sfrac}} does not work well: the line ends up spanning the whole page. Some usages can be replaced with Unicode fractions, but not all.
  • {{overfloat image}} is hardcoded to use pixels for sizes. This is pretty much guaranteed to break if the image is rescaled (e.g. on mobile).
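To illustrate the {{sfrac}} point in the list above: where a precomposed Unicode character exists, a usage can be swapped as sketched below. This assumes the two-parameter numerator/denominator form of the template, and the surrounding text is made up. Fractions with no precomposed character (7/16, for example) cannot be replaced this way.

```wikitext
<!-- Before: the fraction bar may span the whole page in an ebook -->
{{sfrac|1|2}} inch

<!-- After: Unicode "vulgar fraction one half" character -->
½ inch
```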

E-reader issues

These might indicate issues in Wikisource HTML output (in which case they belong above), in ebook conversion (open a WS-export bug) or in the apps or devices themselves (open an issue with those projects):

  • {{block center}} doesn't seem to render correctly in MoonReader+; it ends up left-aligned.
  • {{small caps}} doesn't work in MoonReader+

  1. Because it has the CSS class="ws-noexport".
  2. Modern devices often have HD displays of over 1000 pixels' width, but a scaling factor is applied to make the text readable. Usually this factor is between 2 and 4, depending on the device's physical size and resolution.