Wikisource:Scriptorium/Archives/2008-05

Announcements

HotCat

After (again) feeling guilty that Wild Wolf was doing all of the categorisation of Author pages that I had created, I have set up gadget "HotCat", also used on Wikipedia and Commons. John Vandenberg (chat) 01:26, 30 May 2008 (UTC)

Proposals

Wikimedia Radio

I see a few names of users I recognise above, people who'll know me from various mailing lists and know that my home project is Wikinews. We have what I think is an interesting proposal here on the Wikinews Water Cooler. The initial proposal was for a Wikinews Internet Radio station: not realistic, but a thought-provoking suggestion. I've put a little work into raising a number of points on this for discussion, and I was gently reminded that I'd omitted Wikisource, then informed that you guys have audio books that could be used.

The idea has, as the heading explains, morphed into Wikimedia Radio and would involve material from all projects (news from Wikinews, "did you know?" and featured article synopses from Wikipedia, recorded workshops and lessons from Wikiversity (imagine running a Spanish course for 2-3 months before Wikimania 2009), material from Wikibooks and Wikisource that's packaged to a radio format, Quote of the Day (with background) from Wikiquote, similarly Word of the Day from wiktionary, and free music from Commons with the option to do a long slot on a particular work of a composer with a bio of the composer and during the interlude a backgrounder on the work.) Put like that it starts to sound like enough to fill eight hours and repeat three times per day.

I've quietly gone about putting out feelers on this to smaller projects (i.e. avoided the 800 lb gorilla that is Wikipedia). This is because, from discussion on IRC, some Wikinewsies are concerned the whole project would get hijacked by WP and we'd get sidelined.

So, now it comes to the obvious point: are there any Wikisource contributors (what do you call yourselves? Wikisourceologists?) who see potential here? Wikiversity has been kicking around a similar idea since sometime last year called Wiki Campus Radio, and when I fleshed out the proposal in an email to the ComCom list, the proposal was described as "WikiRadio4"; this is a bow to the BBC's Radio 4 channel which is heavyweight, serious, and... well, look up the Wikipedia article, "gravitas" springs to mind.

If you're interested in working on how Wikisource could contribute then there's a section on [[Wikinews:Water_cooler/proposals#Radio|]] to sign up to. Even if you're not going to add your name there feel free to chip in comments and detail any ways you can see Wikisource material being used. I can't think of anything else that would involve so much cross-project collaboration, to the extent that I'm seriously considering taking the whole thing to meta as a proposal for the first project predominantly serving up content without using MediaWiki. --Brian McNeil /talk 09:12, 23 April 2008 (UTC)

Some quick thoughts

  • Personally, I would be much more interested in a "journal" run by Wikiversity, but that takes nothing away from the usefulness of an audio-based approach.
  • Wikisource is yet to have a "daily" feature, so we are unlikely to be able to feed content into a stream that is up tempo. That said, we could probably push one larger work out per month, and that could be broken up into daily segments, so listeners end up listening to the entire work over the period of the month. Our audio usually comes from other sources (I haven't seen any audio files created by Wikisource users), so it would take a significant effort for us to ensure that we have an audio stream for each featured text. It is hard enough for us to find good pagescans to accompany our Featured texts, so I doubt we are able to also require an audio stream before texts are promoted.

John Vandenberg (chat) 10:15, 23 April 2008 (UTC)

It may not be necessary for texts to be featured in order to be used for audio output. Of course, volunteers are still needed to read and record the texts. I gave up for English texts, but I might be interested to do it for French texts. Yann 10:24, 23 April 2008 (UTC)
  • As to what we call ourselves I have preferred the term "Wikisourcerors" from the very beginning of the original Wikisource. :-) I believe that being able to laugh at ourselves is always a good principle to follow.
  • I have certainly been intrigued by the project as presented on the mailing lists, and have no reason to oppose it. Nevertheless, I see it as something for other people to do. I share the same kind of concerns expressed by John and Yann. Most of us will see these initiatives as good ideas, but unless someone is ready, willing and able to maintain them they ain't gonna happen. Note that on this Scriptorium page the "Wikisource News" feature at the top has not been changed since October.
  • All that being said, we certainly do not lack for material, such as long lost stories, which would be perfectly suitable for radio presentation. If we were to record this for radio we would also prefer broadcast quality, and not everyone is equipped to do that. In the short term there's a lot of public domain Old Time Radio (OTR) material available on CD for very low cost. Running an Inner Sanctum series could be attractive to many people. Eclecticology 17:21, 25 April 2008 (UTC)
For a daily effort, how about if we found a book with an audio recording and then WMF Radio had a chapter read out each day? That way there is no additional strain on Wikisource to meet a daily demand, but if somebody would like to hear their favourite poem read aloud on radio, they can record a version and insert it into the queue. After the current book is done in two weeks, the poem will be read one day, and then it's on to a new 21-chapter book, etc. Sherurcij Collaboration of the Week: Author:Percival Lowell 17:02, 13 May 2008 (UTC)

Site redesign

How much flexibility does Wikisource have from WMF on general site layout? I'm just curious, because I wouldn't mind getting a bunch of talents together and trying to make us "stand out" a little more - lends an air of legitimacy, a community-building project and all of that great stuff. For example,

[Image: Ionic Wikisource.png]

Wouldn't that be neat? I say so! And I'm not aware of any WMF policies that would prevent us from fooling around with the default skin. After all, aren't we the new w:Library of Alexandria? :) Sherurcij Collaboration of the Week: William Lyon Mackenzie King 07:19, 3 May 2008 (UTC)

I don't mind if radical ideas are explored for the main page; however, I do not want pillars taking up screen real estate on content pages. John Vandenberg (chat) 06:17, 6 May 2008 (UTC)
The idea of Greek pillars is worth turning into a full-fledged window manager skin. ThomasV 06:49, 6 May 2008 (UTC)
  • This idea sounds like a good idea, but it may not render so well in Opera or Firefox. I use those browsers a lot more than Internet Explorer, and a large number of people do as well. The CSS would have to be changed so the font-width is narrower, and that may make it hard to read. Thanks, AP aka --Kelsington 10:15, 6 May 2008 (UTC)
Way to go Sherurcij, you totally jinxed us there. Now the Wikisource servers are going to burn down.  ;^)

How about Flash animations? I want to see the Titanic smash into the iceberg and sink. With accompanying audio - "SOURCE DOCUMENT! DEAD AHEAD!" and the foghorn blasting.

Just kidding, of course, but a redesign sounds interesting. Maybe the pillars could be in the background somehow? And I'm thinking there could be some watermarked ss and ffis and æs that figure largely in the design... *makes devil horns sign behind Eclecticology's back, then runs away* --❨Ṩtruthious ℬandersnatch❩ 21:32, 11 May 2008 (UTC)

Nothing in the world an embedded MIDI can't make better! Excellent planning! :Þ Sherurcij Collaboration of the Week: Author:Percival Lowell 17:00, 13 May 2008 (UTC)
I can't think of any reason Opera or Firefox wouldn't recognise CSS or any other formatting - do you have a specific concern in mind? Sherurcij Collaboration of the Week: Author:Percival Lowell 17:00, 13 May 2008 (UTC)

Standardization run

Pathosbot has been transitioning pages from {{header}} to {{header2}} over the last few days, and is almost done. Next we'll replace {{header}} with {{header2}} and switch pages back to {{header}}, completing the upgrade. For the most part, this will simply involve changing "header2" to "header".

Jayvdb pointed out the opportunity for other expensive standardization and page analysis, since the bot will be making the edits anyway. If there's any standardization you think should be done, please comment.

A few ideas:

  • Synchronize all pages with new {{header2}} parameters (like 'translator');
  • generate a list of possible poems without <poem> formatting (based on the proportion of line breaks per line);
  • generate a list of possible translations without the 'translator' parameter (based on the pattern 'translat(ion|ed|[eo]r)' in the 'notes' parameters).

{admin} Pathoschild 05:34:44, 06 May 2008 (UTC)
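
The two list-generating heuristics above could be sketched roughly as follows. This is only an illustration, not Pathosbot's actual code: the function names, the 0.5 threshold, the case-insensitive matching, and the reading of "line breaks per line" as the share of non-empty lines ending in a <br /> tag are all assumptions.

```python
import re

# Pattern quoted in the list above; IGNORECASE is an added assumption.
TRANSLATION_RE = re.compile(r"translat(ion|ed|[eo]r)", re.IGNORECASE)

# Assumption: a "line break" means a trailing <br> or <br /> tag.
BR_RE = re.compile(r"<br\s*/?>\s*$", re.IGNORECASE)


def looks_like_translation(notes, translator=""):
    """True when the notes mention a translation but 'translator' is empty."""
    return not translator.strip() and bool(TRANSLATION_RE.search(notes))


def looks_like_poem(wikitext, threshold=0.5):
    """True when a page without <poem> has a high share of <br>-ended lines."""
    if "<poem>" in wikitext.lower():
        return False  # already uses <poem> formatting
    lines = [ln for ln in wikitext.splitlines() if ln.strip()]
    if not lines:
        return False
    breaks = sum(1 for ln in lines if BR_RE.search(ln))
    return breaks / len(lines) >= threshold
```

Both functions only flag candidates for a list; a human would still need to review each page before it is changed.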

The migration from header to header2 has been painful, but I don't see that we need to migrate back to header as quickly, as the two templates will be equivalent, making it much less painful to explain why there are two. Some contributors may copy existing usage of header2, however that can be addressed by patrolling. So I think we can take our time; simply altering all 100,000 pages replacing "header2" with "header" is not worth doing.
One of the expensive changes I think would be beneficial is for subpages to use a template which is mostly automated; see Template talk:Header/Archive 1#Subpages. {{subpage-header}} achieves this for simple cases, except that it doesn't display the author. I see three solutions:
  1. add more params to {{subpage-header}}; I don't like this idea
  2. the author value on the main page is defined as a labeled section (i.e. LST) and all subpages slurp that section into the header.
  3. a header template is created for each work which acts as the header, and it is transcluded into the subpages
    this is already done with larger works, such as EB1911, NSRW, etc.
    for many works, the specific template for each work would call a generic template which has the appropriate logic. i.e. for all works which have Arabic numbered chapters, the template would be something like {{header-arabic-chapter|author=...}}
On subpages, some thought needs to be given to prefaces by someone other than the author, as that is a common occurrence.
Another change is that I would like the header on every work to include more detail, like the German Wikisource template de:Vorlage:Textdaten. I am torn on how much detail we want to present (see here), but what we have is definitely insufficient, and putting {{textinfo}} on the talk page has never struck a chord with me. I would like to see more detail on the main page for our works, but do not like the unsightly German box on the right hand side - that is too much. My preference would be to display a minimal subset like we currently do, and add a "display option" to show the additional edition information. However I am not suggesting that all of the textinfo fields should be in the header; the source and contributor information is not valuable in identifying our edition, except where our edition is not uniquely identified and needs to be investigated.
As we move text to the "Page" namespace, the main-namespace page for works will become more like a bibliographic entry. Long term, I would like each work to have COinS, MARC records, and/or an International Standard Bibliographic Description. Then our pages can be indexed in WorldCat as an Internet resource for any specific edition.
For example, a "wikipedia" parameter would be very beneficial, as we already have {{wikipedia}} or {{wikipediaref}} on most works. Year of publication is another param that has been asked for often, and can often be determined automatically using the categories or header notes field. See Template talk:Header/Archive 1#Date_param. Also, and perhaps a little less expected, is an "oclc" parameter, which is a unique identifier for every work. John Vandenberg (chat) 08:18, 6 May 2008 (UTC)
While some of these are good ideas, most either cannot be automated by a text processing bot (like adding links to metadata databases), or require human supervision and should be done separately (like placing {{subpage-header}} or adding data in the header). Do you have suggestions for a short-term automated run with little human supervision?
It would be better to discuss each of these ideas separately, then update pages according to the decisions reached. —{admin} Pathoschild 18:35:56, 06 May 2008 (UTC)
Parsing "[[category:dddd works]]" into a "year" parameter is simple. Filling a "wikipedia" parameter can also be automated for most of the simple cases. Replacing "{{header2....}}" with "{{subpage-header}}" is dead simple, and subpages are the larger part of our page count. (What is the point of changing tens of thousands of subpages from header2 to header if the entire header block can be replaced with a one-liner?)
More importantly, if we are going to modify all pages to replace "header2" with "header", the least that we can do is add placeholders for some new params in the header block of the pages, in order to encourage humans to add the values.
Yes, I agree that we need to discuss these ideas a little more, either here or on Template talk:Header, but my intention is that we should find all of the ideas that have already been discussed to death (and put on hold because of the header->header2 conversion), and roll as many as possible in now, so that any automated update of header2->header is helping us catch up. John Vandenberg (chat) 00:59, 7 May 2008 (UTC)
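
As a rough illustration of the first point above, a "[[category:dddd works]]" link could be parsed into a year along these lines. The function name and the decision to return nothing when a page carries several different year categories are assumptions, not an agreed design.

```python
import re

# Matches e.g. [[Category:1911 works]] or [[category:1911 works|sort key]].
YEAR_CATEGORY_RE = re.compile(
    r"\[\[\s*category\s*:\s*(\d{4}) works\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def year_from_categories(wikitext):
    """Return the year as a string, or None if absent or ambiguous."""
    years = {m.group(1) for m in YEAR_CATEGORY_RE.finditer(wikitext)}
    # Assumption: conflicting year categories are left for a human to resolve.
    return years.pop() if len(years) == 1 else None
```

Pages where this returns None would simply be skipped, leaving an empty placeholder for a human to fill in.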
I don't disagree. Some of your ideas are easy to implement, as you've noted, and I'm willing to perform them simultaneously.
However, some of those ideas are not so easily automated. For example, placing {{subpage-header}} would require determining whether the page is a subpage or not (and dealing with cases like "The 1/2 baked apple" which look like subpages but aren't) and comparing the header on each subpage to the main header to find discrepancies (like different authors, notes about a particular chapter, et cetera). This would be a complex transition that is much better to perform separately, so that it is easier to review changes. We could iterate through Category:Subpages easily enough.
If you want to minimize duplicate edits, we could place {{subpage-header}} first (once we achieve a consensus on its use), which would remove them from the list of pages needing a header2→header transition. —{admin} Pathoschild 01:57:09, 07 May 2008 (UTC)
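
The two checks described above (telling real subpages apart from titles like "The 1/2 baked apple", and comparing a subpage header against the main header) might look something like the sketch below. It assumes the bot already has a set of existing page titles and has parsed each header into a dictionary; both helpers and their names are hypothetical.

```python
def is_subpage(title, existing_titles):
    """A title counts as a subpage only if some prefix before a '/' is an
    existing page, so a slash inside an ordinary title is not enough."""
    parts = title.split("/")
    return any("/".join(parts[:i]) in existing_titles for i in range(1, len(parts)))


def header_discrepancies(main, sub, fields=("author", "translator")):
    """Return {field: (main value, subpage value)} for fields that differ."""
    return {
        f: (main.get(f, ""), sub.get(f, ""))
        for f in fields
        if main.get(f, "").strip() != sub.get(f, "").strip()
    }
```

A subpage with any discrepancy (a different author, for instance, as with a preface by someone else) would be left alone and listed for manual review rather than converted.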
I'd recommend we start the header2->header migration by renaming template:header2, then gradually resolve the redirects. -Steve Sanbeg 16:40, 7 May 2008 (UTC)
Pathosbot won't only resolve redirects; it also standardizes parameter layout, performs validation (invalid parameters, invalid or deprecated parameter use, formatting in metadata parameters, redundant symbols, broken syntax, and so forth), and detects incorrect conversions (bot or human), so there is value in a one-time conversion. Since we encourage users to copy & paste from other pages, it would be beneficial to ensure that pages are using the header correctly, but we cannot easily do this manually or during other bot tasks. I think it would be better to do this in a single run of one or two days and be done with it.
So if we do that, the question is what more can we do while we're editing these pages anyway? —{admin} Pathoschild 21:21:31, 07 May 2008 (UTC)
For starters, it can add new optional parameters to be filled in. Now is the time to make these changes. Let's discuss what new optional parameters we want first, and then tackle the second pass of moving "header2" usage back to "header". Does anyone have a problem with "year" or "wikipedia"; can we think of others? John Vandenberg (chat) 01:22, 8 May 2008 (UTC)
I split off a separate thread below, so we can discuss each issue separately. I think this will also invite more comment, since not everyone will be following the above discussion with much fascination. —{admin} Pathoschild 01:57:44, 08 May 2008 (UTC)

For the most part the header template is already too rigid and its operation incomprehensible. It should be simplified rather than having more added to it. Why do we need "override author" when simply leaving the slot blank would be enough? If "previous" and "next" are irrelevant to the page in question, we should be able to remove them completely from the template. If we want to include a lot of additional material in the headers, the burden of adding that material should not be foisted upon the average contributor. Such detailed requirements only work to discourage contributors. While I appreciate that some want formats that will be more consistent with standard bibliographic practice, let them be the ones to add that, perhaps in a separate template.

Enough information to establish the freeness of the text is still important, but it can be modularized out of the header. Eclecticology 08:21, 14 May 2008 (UTC)

It's precisely because header usage should be simple that parameters should not be removed. This way, editors can copy and paste from any page and simply change the values, without needing to dredge through documentation. Usage is simple: fill in what is relevant, ignore what is not.
The "override_author" parameter is for rare special cases where we need to change the author line, for example to say "edited by Billy Joe" instead of "by Billy Joe". It's not needed at all if you just want to remove the author line entirely (just leave the author parameter blank).
I agree that we should minimize parameters, which is why each parameter should be discussed thoroughly to decide the benefits of including it. —{admin} Pathoschild 19:00:25, 14 May 2008 (UTC)

Add new parameters to {{header}}

comment split from above discussion.

Let's discuss what new optional parameters we want first, and then tackle the second pass of moving "header2" usage back to "header". Does anyone have a problem with "year" or "wikipedia"; can we think of others? John Vandenberg (chat) 01:22, 8 May 2008 (UTC)

Perhaps forcing the {{edition}} template in the header, that way users will have to provide their sources in the talk page to avoid a red link in the work page, and we wouldn't need to include a template link in the notes section. And I admit I'm one who forgets to add this template, although I always provide the sources for the work in the talk page. Also, by "Wikipedia", I assume you mean to integrate the {{wikipediaref}} template? It would be very useful when adding notes, since it would separate the background notes about the work and notes about the particular edition shown. They could look something like this:
| Wikipedia info =
| Wikipedia article = (link)
If the article link is inserted by the user but no wikipedia info is provided, could the header default to {{wikipedia}}? Just throwing ideas around. :) - Mtmelendez 13:07, 8 May 2008 (UTC)
My suggestion would be to move the licensing parameter into the header also, to make clear that this is a mandatory piece of information about every work. Subpages could inherit this item from their respective base pages. A lot of the work that winds up getting hashed out in WS:COPYVIO could be avoided if the original contributors had provided licensing information or PD status up front. (I also like John’s idea, expressed above, about including an OCLC parameter, or other WorldCat identifiers, in the header, although those should be optional. The PD/license info ought to be required, IMHO.) Tarmstro99 13:37, 8 May 2008 (UTC)
Some of the licensing templates are very large and would cause the header to take up the entire screen - something I'm against. When people click a link, they should see the text of their story immediately, not have to scroll to find it. Sherurcij Collaboration of the Week: Wikisource:Confucianism 14:32, 8 May 2008 (UTC)
Maybe we can put the licence into a show/hide box?--GrafZahl (talk) 15:17, 8 May 2008 (UTC)
I like the sounds of that. Sherurcij Collaboration of the Week: Wikisource:Confucianism 17:51, 8 May 2008 (UTC)
I think we can find a compromise to this. Perhaps a single, simple line saying: "This work is in the Public Domain worldwide.", or "This work is in the Public Domain outside the United Kingdom.", or "This work is licensed under Creative Commons 2.0.", and then either provide a link to the talk page with detailed information or the copyright template, or simply continue with the normal format of the copyright template serving as a footer to the work. - Mtmelendez 17:10, 8 May 2008 (UTC)
I definitely like that idea. I'll add it to the template if there are no objections. —{admin} Pathoschild 18:09:44, 08 May 2008 (UTC)
I have been thinking that the license displayed on the page should be a single line (I was thinking it should remain at the bottom of the page), with a "license" tab at the top of every page which shows a more detailed explanation. This license "tab" could be driven by JavaScript; if the user has JavaScript disabled, we could have the more detailed explanation appear on the page (which is why it would be better at the bottom of the page). A show/hide box would also work for me.
I'm not in favour of license information being sent to the talk page; it is a "talk" page. We could add a license namespace attached to each page, but I think that is overkill - users are more likely to add a license template if they can quickly add it to the page they are creating.
Please don't alter the template until we have agreement. John Vandenberg (chat) 23:24, 8 May 2008 (UTC)
I definitely think a "year" parameter (or maybe more specific like "year_published"--to avoid those situations where something was published decades after it was actually written) would be great. That way we can automatically do categorization for the Category:YYYY works and even many of the licensing templates like PD-1923, PD-old, PD-70. This would significantly reduce the amount of work the contributor has to do as well as help ensure that our works are not copyvios.—Zhaladshar (Talk) 15:18, 8 May 2008 (UTC)
"year_published" or "publication_year" works for me. Thinking long term, I would prefer that our header/bibliographic template had enough raw data in it that we don't need license tags. The license can be inferred in most situations; either with a bit of LST magic, or a bot which reviews new pages (after a day to avoid annoying users?), checking the author pages, etc. John Vandenberg (chat) 23:37, 8 May 2008 (UTC)
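
To make the idea above concrete, here is a minimal sketch of what could be derived from a publication year alone. The function name is hypothetical, and the cut-off assumes that {{PD-1923}} covers works published before 1923; templates such as PD-old or PD-70 depend on the author's death year, which this sketch does not attempt to handle.

```python
def derive_from_year(year_published):
    """Return (category name, license hint or None) for a publication year."""
    category = f"Category:{year_published} works"
    # Assumption: PD-1923 applies to works published before 1923.
    license_hint = "PD-1923" if year_published < 1923 else None
    return category, license_hint
```

For later works the hint is empty, so a license would still have to be supplied by the contributor or inferred from the author page.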
Why do half measures - how about we port the entirety of the Wikipedia {{citation}} template in here and incorporate it into the header so we cover most of the bibliographic data anyone could want? If the aesthetics of page real estate are a concern I'm extremely handy with javascript and CSS and I'd be happy to whip up and cross-browser-test some code that would elide most of the data (and/or format it completely differently) unless the visitor clicks a twisty or something, à la CategoryTree, and work on the license tab thingie John mentions too. (Maybe a tab for the bibliographic data too, huh? With pre-formatted {{citation}} code ready to cut-and-paste into Wikipedia articles? Excuse me, I'm drooling. But check out what I did recently with Men of 1914 - follow the link in the citation in the Wikipedia article John A. M. Adair, for example - I think there could be much better two-way integration between Wikipedia and Wikisource.)

Also, whether or not we go full bore with bibliographic data, it seems like we ought to use or at least alias the same parameter names that {{citation}} utilizes. --❨Ṩtruthious ℬandersnatch❩ 22:00, 11 May 2008 (UTC)

Absolutely. We should seriously include as much bibliographical information as possible. If you do not do that, then the articles lose their potential to be a good citable source. When you click the "Cite this" link on the side, it should present as much info as possible. A while ago I was working on all the possible parameters that would be of use in a WP citation and started a test template here. Feel free to play around with it (I've gone into self-imposed exile on WP) as the template is still incomplete. 52 Pickup 07:18, 16 May 2008 (UTC)

License parameter

Discussing these separately, {{header3}} is a working proof of concept for integrated mini license texts. The text is transcluded from User:Pathoschild/Licenses, which uses a simple syntax to define all the templates. This page can contain the full texts without displaying them in the header, so we can link to the appropriate section in a "for more information" link for long license texts. If the license text isn't defined, it returns an unknown-template error.

For example (substituted):

{{header3
 | title      = Sandbox
 | author     = 
 | translator = 
 | section    = 
 | previous   = 
 | next       = 
 | license    = PD-release
 | notes      = 
}}
Sandbox
This work is in the public domain worldwide because it has been so released by the copyright holder.

This also allows us to perform license validation (except on subpages): output a warning and categorize the page if no license is specified, or if a license not applicable to the US is specified without a US-applicable alternative. —{admin} Pathoschild 05:29:08, 09 May 2008 (UTC)
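
The validation rules described above could be expressed roughly as follows. This is a sketch under stated assumptions, not the logic of {{header3}} itself: the LICENSES table, the placeholder name "Example-non-US", and the warning strings are all invented for illustration.

```python
# Hypothetical table: license name -> whether it applies in the US.
# "Example-non-US" is a made-up placeholder, not a real template.
LICENSES = {
    "PD-release": True,
    "Example-non-US": False,
}


def validate_licenses(names, is_subpage=False):
    """Return a list of warnings for the 'license' values on a page."""
    if is_subpage:
        return []  # validation is not performed on subpages
    names = [n.strip() for n in names if n.strip()]
    if not names:
        return ["no license specified"]
    warnings = [f"unknown license: {n}" for n in names if n not in LICENSES]
    known = [n for n in names if n in LICENSES]
    if known and not any(LICENSES[n] for n in known):
        warnings.append("no US-applicable license specified")
    return warnings
```

Each warning would correspond to a visible notice plus a maintenance category on the page.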

That looks really good. But, there's an issue regarding licenses with detailed documentation, such as {{UK-Crown-waiver}}, and with works that have dual licenses. Therefore, adding a show/hide tab for license messages in that same license sentence seems necessary. The current templates or a text description may appear when the user wants to know more. John's suggestion of placing the license line at the bottom of the page, just before the footer, may also work.- Mtmelendez 14:01, 12 May 2008 (UTC)
This will probably be a very unpopular proposal, but I was thinking that it might be better to have the entire page within a template. The actual page content would have its own field. There would be a number of advantages to this:
  • It would be easier to set some global page standards, if necessary.
  • You can edit the header AND footer from within one template. Use the same parameters as currently used and simply include references to Template:Header/Header2/Footer/.. so this template does not become too unwieldy.
  • Licenses could still be placed at the bottom (where they belong, IMO), using a separate licence field, allowing for validation.
  • If all pages (not subpages) use this, it would be easy to set up any future field validation checks that may become useful.
Thoughts? 52 Pickup 07:30, 16 May 2008 (UTC)
There are some semantic metadata extensions that do something like that (example), but there are some technical problems with doing that using regular templates. For example, we cannot use wiki tables in template parameters, and there are some issues with whitespace removal. If we wanted to go that route, it would be much better to go with a full-fledged compartmentalized interface. —{admin} Pathoschild 07:51:38, 16 May 2008 (UTC)
I had not yet seen any articles here that used tables, so I assumed that they were not used. My mistake. It is possible to use tables within a template if you replace all instances of "|" with "{{!}}", although I admit that that might be creating more problems than it solves. The compartmentalised interface is new to me and I like the idea. Definitely worth considering, since it also hides most of the code that would otherwise confuse many casual editors. 52 Pickup 08:24, 16 May 2008 (UTC)
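
The "{{!}}" workaround mentioned above amounts to a plain text substitution. A deliberately naive sketch, with a hypothetical function name, is shown below; it replaces every pipe, including any inside links, which is part of why this approach may create more problems than it solves.

```python
def escape_pipes_for_template(table_wikitext):
    """Replace every '|' with '{{!}}' so the text can be passed as a
    template parameter. Naive: it does not distinguish table syntax
    from pipes used elsewhere in the text."""
    return table_wikitext.replace("|", "{{!}}")
```

For example, escape_pipes_for_template("a|b") gives "a{{!}}b".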
In my estimation John's proposal above to have JavaScript that creates an additional tab within the page to display license data (and another one for bibliographic data!) is the best approach for this. Such a solution would not require the Semantic MediaWiki extensions nor tables code. --❨Ṩtruthious ℬandersnatch❩ 23:56, 16 May 2008 (UTC)
It would, however, require JavaScript and not be visible when printing (which it needs to be). Making the information visible by default on the page and hiding it with JavaScript might work, since it would be visible when printing and to those with JavaScript disabled. —{admin} Pathoschild 00:04:24, 17 May 2008 (UTC)
If the template output only JavaScript that might be true, but it could simply output both the minimum required for a printable copyright status notice and the detailed info for a full tab.
But the Semantic MediaWiki stuff is cool, that could make a great solution for this too. In principle I'm definitely in favor of using the semantic extensions as much as possible, and I'm guessing by the transcription index pages you've already got all that installed? --❨Ṩtruthious ℬandersnatch❩ 02:28, 18 May 2008 (UTC)
The "Index:" pages are not using semantic mediawiki; the HTML form is constructed using JavaScript. John Vandenberg (chat) 02:49, 18 May 2008 (UTC)

I wouldn't mind a link to the source document/PDF/ogg/djvu being included with a little note that if you're concerned about the veracity, you can check it against the original text yourself and make any necessary corrections. Sherurcij Collaboration of the Week: Author:John Masefield 21:35, 30 May 2008 (UTC)

Other discussions

How to create a DJVU file

Hello, I started this basic Howto. Please complete and correct. Yann 17:51, 5 May 2008 (UTC)

A great addition. Was surprised it wasn't written yet. I don't have much experience, but most admins do. I'm sure they'll take the time to expand it. - Mtmelendez 18:17, 5 May 2008 (UTC)

Gutenberg copyright releases

Just now I came across this (huge page) listing Gutenberg IP information collated by David Price, UK Project Gutenberg Coordinator, including release numbers, which I assume to be similar to our OTRS IDs. What caught my eye was that Categories is a "release" rather than "copyright cleared", which means it isn't by nature PD. I think we need to understand this Gutenberg release process, and perhaps add a template to indicate that a work we host is {{PD-Gutenberg-release}}, and perhaps note the number. John Vandenberg (chat) 13:22, 6 May 2008 (UTC)

Are you sure that's what it means? On [1] it says "not copyrighted in the US". I'm not sure, but could it be that "copyright cleared" means the PD status was officially determined, and "released" means that the text is actually published by PG? So that they have a two-stage process: first copyright clearance, then release of the copyright-cleared material to the public?--GrafZahl (talk) 14:06, 6 May 2008 (UTC)
(ec) My mistake; the numbers are the Gutenberg etext numbers. It seems that the page I linked to doesn't include the "Copyright cleared" date for any etext that Gutenberg has released. Anyway, it is a great big lovely list of PD works. Here is what it has under Confucius:

The Analects of Confucius by Confucius, trans. James Legge - released 3330, Project Gutenberg

The Chinese Classics, Vol. 1: Confucian Analects by Confucius, James Legge - released 4094, Project Gutenberg

Chinese Literature contrib. to Confucius, ed. Epiphanius Wilson - released 10056, Project Gutenberg

Sacred Books of the East Vol. 16 - The Sacred Books of China - Part II - The Texts of Confucianism: The Yi King (I Ching) by Legge, James - Copyright cleared 12 Aug 2003 by Confucius, James Legge

The Sayings Of Confucius by Confucius, trans. Leonard A. Lyall - released 24055, Project Gutenberg

John Vandenberg (chat) 14:13, 6 May 2008 (UTC)

CIA World Fact Book, 2004 image evolution creep

The CIA World Fact Book, 2004 relies heavily on maps which are stored on the Commons. The problem is that Commons users may update these maps with newer versions (for example, compare [2] against [3], watch the Serbian border change in particular), inadvertently destroying the faithful reproduction of the 2004 situation. I became aware of this happening through the Wikisource:CommonsTicker. I don't know whether such evolution creep has happened to other works, too.

The solution is, I think, to re-upload the 2004 versions of these maps to the Commons with a "2004" somewhere in the file name. There should also be a warning on the image description page that Wikisource requires a faithful copy. Maybe we should update our image guidelines to make editors aware of possible evolution creep.

--GrafZahl (talk) 15:00, 7 May 2008 (UTC)

Good idea. I doubt Commons users will oppose. It's best to name them by date, then leave a note or custom template at the image. - Mtmelendez 04:06, 8 May 2008 (UTC)
I uploaded "2004" editions of all of the CIA WFB flag images; one of them has since been deleted as a duplicate. I don't recall what I did; maybe I just restored it.
There is an existing Commons process for renaming files, which is mostly automated these days. Also, there will soon be native MediaWiki support for moving images (I saw the code checked in a few days ago), so it could be worth waiting, as that will make it even easier. Moving the existing images to a more specific name will also help Wikipedians realise when a map is probably out of date, and also realise that they need to create a new copy rather than replace the 2004 image. John Vandenberg (chat) 05:09, 8 May 2008 (UTC)
Here is the deleted 2004 flag. John Vandenberg (chat) 00:10, 12 May 2008 (UTC)
Can we not put some sort of "date-sensitive: do not mark duplicate" tag on these for Commons? The only solution to the issue is to reupload everything to Commons with 2004 in the file name, but if they delete these as duplicates it is pointless.--BirgitteSB 19:57, 12 May 2008 (UTC)

A simple Commons template certainly seems like the most certain, and simple, solution. Sherurcij Collaboration of the Week: Author:Percival Lowell 20:10, 12 May 2008 (UTC)

See commons:Image:Somalia-CIA WFB Map (2004).png for an example of the template that I use when taking care of this problem. --Spangineerwp (háblame) 00:13, 17 May 2008 (UTC)

Way to go University of Hong Kong Libraries![edit]

I just struck a rich vein of public domain document-ore: in the course of some Wikipedia research I found the domain ebook.lib.hku.hk. These ebooks seem to each be composed of a bunch of single-page .pdfs, so I assume they're actually original scans and not re-publications from elsewhere on the net. The distribution of titles amongst genres is also distinctly different from Project Gutenberg, Internet Archive, or Google Books: there's a higher percentage of technical works (because China was especially interested in Western tech pre-1923, I'm guessing?), learning English, and Hong Kong publications. (Curiously, I do not seem to have come across anything written in Chinese.)

I can't seem to find a main index or search page for all the books; if someone else finds it be sure to post it here. So I have been using a domain-scoped Google search. To get the proper ebook interface you have to trim down the URL to the last slash; so if your search result takes you to something like

http://ebook.lib.hku.hk/CADAL/B31413456V2/toc.html

trim it down to

http://ebook.lib.hku.hk/CADAL/B31413456V2/

And then click the "Go" button in your browser (or pressing "Enter" on the keyboard usually works too).

Unfortunately the PDFs do not seem to be OCRed so that means any given text would probably still need to be transcribed. Windows users, see my notes on the free ConcatPDF tool which will allow you to agglomerate all these one-page .pdfs into a single big .pdf for conversion to .djvu. N.B. that you have to install those two Microsoft libraries before you try to install ConcatPDF.

If anyone figures out how to just download the entire book at once let the rest of us know... --❨Ṩtruthious ℬandersnatch❩ 11:46, 15 May 2008 (UTC)

Examining the JavaScript at the HKU site (http://ebook.lib.hku.hk/res/js/heading.js) reveals that the PDF files are stored in a “pdf” subdirectory under each title’s directory. So, for The Art of Cross-Examination (http://ebook.lib.hku.hk/CADAL/B31423735/index.html), the images are in http://ebook.lib.hku.hk/CADAL/B31423735/pdf/. Files are named nnnnnnnn.pdf, where nnnnnnnn is the page number (padded out with zeroes to make an 8.3 filename, so 00000001.pdf, 00000002.pdf, etc.). Although you can’t browse the contents of the pdf directory, you can request individual files. If you know that the last page number is (for instance) 289, you should be able to write a short script that calls wget 289 times and get them all that way. Tarmstro99 00:15, 16 May 2008 (UTC)
And if you were to do such a thing and concatenate the scans, you might end up with something like Image:The Art of Cross-Examination.djvu. :-) Tarmstro99 01:39, 16 May 2008 (UTC)
Sweet. It's alive! It's alive! --❨Ṩtruthious ℬandersnatch❩ 04:05, 16 May 2008 (UTC)
It sounds like an interesting book: here is column one of a two-column NYT book review. John Vandenberg (chat) 07:48, 17 May 2008 (UTC)
People are seeing a high percentage of "dud" .pdf files in these ebooks, but I have gotten through to some clear pages. Hopefully there are some entire books that work, or perhaps there's a particular .pdf reader you have to use... I'm going to keep looking into it. --❨Ṩtruthious ℬandersnatch❩ 23:25, 15 May 2008 (UTC)
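The URL trimming and the zero-padded file naming described in this thread can be sketched in Python as follows. This is only an illustration under the assumptions stated in the posts above (a "pdf" subdirectory under each title's directory, eight-digit page numbers starting at 1); the function names and the "pages" output directory are made up for the example, and the last page number still has to be found by hand.

```python
import os
import urllib.request


def book_root(url):
    """Trim a search-result URL down to its last slash (the ebook's base URL)."""
    return url[: url.rfind("/") + 1]


def page_urls(url, last_page):
    """List the per-page PDF URLs, assuming <book>/pdf/00000001.pdf and so on."""
    base = book_root(url) + "pdf/"
    return [f"{base}{page:08d}.pdf" for page in range(1, last_page + 1)]


def download_book(url, last_page, out_dir="pages"):
    """Fetch every page PDF into out_dir (hypothetical helper, one request per page)."""
    os.makedirs(out_dir, exist_ok=True)
    for pdf_url in page_urls(url, last_page):
        target = os.path.join(out_dir, pdf_url.rsplit("/", 1)[-1])
        urllib.request.urlretrieve(pdf_url, target)
```

Calling download_book("http://ebook.lib.hku.hk/CADAL/B31423735/index.html", 289) would then request the 289 pages one by one; the resulting files still need to be concatenated and converted to .djvu separately.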

Proposed a Mythbusters Episode[edit]

The U.S. television program MythBusters investigates mechanical and technological myths and mysteries through experimentation. (I think it's franchised internationally, isn't it? I've seen clips of what appeared to be a Norwegian version of the show on YouTube.)
Recently jayvb and I worked on the 1908 New York Times article Fatal fall of Wright airship which is a first-hand account of the crash of an early airplane, the Wright Mark I Military Flyer, as it was being demoed for the U.S. Army at Fort Myer, VA. A number of different theories about the crash appeared in the article, and the MythBusters guys usually like a project with a historical context, so I went and proposed an episode exploring the crash on the MythBusters forums. --❨Ṩtruthious ℬandersnatch❩ 10:10, 21 May 2008 (UTC)

Internet Exploder 8 Beta[edit]

To my tech brothers (and sisters?) - I have just been playing around with the IE8 Beta and it looks promising. Microsoft may have finally boarded the browser standards train to some degree.
The nice thing is that it appears this release will greatly broaden the percentage of the W3C CSS standards that are cross browser: in particular the CSS 2.1 counters, which are going to come in handy for me in an upcoming text I'll be working on which is basically a 300-page indented outline. And check out the CSS 2.1 "content" property - you will not believe the kinds of things you can do with it. --❨Ṩtruthious ℬandersnatch❩ 17:01, 23 May 2008 (UTC)

Broken redirects[edit]

Admins, we have a huge quantity of broken redirects, which, per CSD, should be deleted: Special:BrokenRedirects. If someone could go along and clear those out, it'd be great. Thanks. ---- Anonymous DissidentTalk 02:58, 25 May 2008 (UTC)

I deleted some of them. Yann 08:50, 25 May 2008 (UTC)

Questions[edit]

The license on licenses[edit]

A bit of a confusing concept, but if you have a text on a license, would the text be under copyright or licensed under the license it describes? For instance, Nupedia Open Content License has no indication of its copyright. Would it follow the basic copyright or the NOCL licensing? Another example would be the GNU Free Documentation License. Is it under the GFDL? Although the GFDL says "Copyright (C) 2000,2001,2002 Free Software Foundation, Inc. ... Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed", the license is situated on Wikipedia, which is a bit ironic since it effectively releases everything into GFDL. Very confusing. Although the text (GFDL) has been deleted as being "under copyright", I feel that Wikisource's view on these "license on licenses" isn't clear. BTW, another interesting license involved is the GNU General Public License, which has been salted. —Dark (talk) 09:50, 16 March 2008 (UTC)

The GFDL and other FSF licenses were deleted due to the discussion at Wikisource:Possible_copyright_violations/Archives/2006-11#FSF_Licenses. More licenses can be found at Wikisource:License_documents. John Vandenberg (chat) 10:10, 16 March 2008 (UTC)
Oh right. I guess the licensing of Nupedia follows the pattern of that discussion (discussion with GFDL and such...)? —Dark (talk) 06:25, 17 March 2008 (UTC)

Here is another one: GNU GPL John Vandenberg (chat) 03:11, 29 March 2008 (UTC)

At the bottom of Free Art license/1.3, it says permission to copy, but no derivatives. Free Art license may have a similar problem. John Vandenberg (chat) 18:30, 30 March 2008 (UTC)

Here is another: BitTorrent Open Source License --John Vandenberg (chat) 13:29, 1 May 2008 (UTC)

Extremely confusing. We need to set a precedent before any more licenses are created/deleted, otherwise the situation will become harder to sort out. —Dark talk 11:58, 2 May 2008 (UTC)

And another : Apache License Version 2.0 John Vandenberg (chat) 18:56, 5 May 2008 (UTC)

DjVu Already OCRed...[edit]

Okay, thanks to John Vandenberg pointing me to the wonderful Any2DjVU tool, I have my scanned text in DjVu already OCRed and containing a TOC, which I have uploaded. So all of the text is already contained in the file. Is there some automated way to generate the Index: and all the Page: pages, or do I have to do that manually?

And while I'm at it, is there any solution to the page number discrepancy problem? I.e., if you look at what the DjVu plugin (or a desktop DjVu viewer) considers to be page 65 the actual page number in the image is 60, due to the title page and various other pages that precede the text of the book. --Struthious Bandersnatch 10:26, 19 March 2008 (UTC)

John has already done the Index page for you, but I am unaware of any tool that allows you to automatically generate the text for the "Page:". I suppose you could ask for a bot to be created if you know what text is on what page though... —Dark (talk) 10:52, 19 March 2008 (UTC)
Yes, which text is in which page is information that is automatically contained in the DjVu file (because internally that file format splits everything up by page). So if someone wanted to write a bot for that it seems like it would be worthwhile. But in my case I'll just cut and paste.

(P.S. if anyone goes to write a bot like this the DjVuLibre library is what you'll need.) --Struthious Bandersnatch 11:00, 19 March 2008 (UTC)

Gosh, now you tell me; after I just typed page 2 :P
A bot to extract the text out of a DjVu file and place the text into Pages would be good, but I don't think anybody has written one.
I saw ThomasV fix the page numbering problem you mention on one of the Index files, but I can't find which one he altered. I'll ask him to comment here. John Vandenberg (chat) 11:07, 19 March 2008 (UTC)
Sorry 'bout that :D but y'know it was the Any2DjVu tool you pointed me to that did the OCRing automatically, I didn't even ask it to... :P :^) --Struthious Bandersnatch 11:15, 19 March 2008 (UTC)
Manually typing things can be very annoying ;) I think I can get someone to write a bot script for this though...but that might take some time. —Dark (talk) 11:12, 19 March 2008 (UTC)


about the syntax : sorry, I did not write a doc about this feature. no time, really atm.
here are a few examples : fr:Livre:Hugo - Les Misérables Tome I (1890).djvu, fr:Livre:Baudelaire Les Fleurs du Mal.djvu
ThomasV 12:06, 19 March 2008 (UTC)

In order to create an index with the book pagination you can use the following syntax (if you are renumbering from the very first page of DjVu file):

  • 1=5 : 5, 6, 7, ...
  • 1="1;roman" : i, ii, iii, ...
  • 1="1;highroman" : I, II, III, ...
  • 1="1;char" : char1, char2, char3, ... (char can be any character string you need).

You can begin a new numbering in any page of the DjVu file (<pagelist 1=5 3=12 /> : 5, 6, 12, 13, 14, ...), even to number several pages with the same number (<pagelist 1=5 2=5 3=5 />).

--LaosLos 14:28, 27 March 2008 (UTC)

This worked well, BTW. --❨Ṩtruthious ℬandersnatch❩ 16:33, 23 May 2008 (UTC)
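To make the numbering rules above concrete, here is a rough Python model of how the attributes map DjVu positions to printed page labels. It is a sketch of the behaviour as described in this thread, not the extension's actual code; the function names are invented, and it assumes that a new rule without a style falls back to plain arabic numbers.

```python
def to_roman(n):
    """Convert a positive integer to lower-case Roman numerals."""
    numerals = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
                (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
                (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def page_labels(rules, total_pages):
    """Return the label for each DjVu position 1..total_pages.

    rules maps a DjVu position to "start" or "start;style", mirroring
    the attribute values quoted above, e.g. {1: "5", 3: "12"}.
    """
    labels = []
    number, style = 1, ""  # with no rule, the label equals the position
    for position in range(1, total_pages + 1):
        if position in rules:
            start, _, style = str(rules[position]).partition(";")
            number = int(start)  # assumption: a rule restarts the count here
        if style == "roman":
            labels.append(to_roman(number))
        elif style == "highroman":
            labels.append(to_roman(number).upper())
        elif style:
            labels.append(f"{style}{number}")  # any other string is a prefix
        else:
            labels.append(str(number))
        number += 1
    return labels
```

For example, page_labels({1: "5", 3: "12"}, 5) models <pagelist 1=5 3=12 />, and a rule such as {1: "1;roman", 7: "1"} would cover the common case of front matter in roman numerals followed by the body text restarting at 1, which is one way to deal with the page number discrepancy raised at the top of this section.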

Is DoubleWiki broken?[edit]

I can't find a page (or a browser) for which the DoubleWiki <=> works. The right panel comes up empty. Am I the only one? --Mccaskey 22:03, 29 March 2008 (UTC)

It depends on what skin you are using. Eclecticology 06:44, 30 March 2008 (UTC)
It's blank for me too with the MonoBook skin; it's not the JavaScript component (that part still works fine), so a recent change in MediaWiki might have broken the extension itself. I left a note on ThomasV's talk page, since he developed the extension. —{admin} Pathoschild 08:43:27, 30 March 2008 (UTC)
[[Yes, DoubleWiki is broken, I am sorry to say. And we cannot fix this problem for a while. Please do not get upset by this; we are trying our best to get it up and running again. Thank you for your time reading this note and once again we are trying to do the best we can to fix everything.]] unsigned by 125.236.44.47 20:29, 2 April 2008.
Who is included in we? —{admin} Pathoschild 20:54:43, 02 April 2008 (UTC)

Footer problem[edit]

The normal footer isn't appearing on Talmud/Seder Zeraim/Tractate Berakhot/2a; any suggestions? John Vandenberg (chat) 02:02, 19 April 2008 (UTC)

Text copying from textbooks[edit]

Would A-Level English/Science/History textbooks be allowed on here?? --84.45.219.185 11:55, 23 April 2008 (UTC)

To determine that, we need to know the title of the book, the name of the author(s), and it would also be helpful to know the year it was published in. John Vandenberg (chat) 12:42, 23 April 2008 (UTC)
In other words, such textbooks are subject to the same copyright restrictions as any other book. Being such a textbook is not in itself an impediment. Eclecticology 15:36, 23 April 2008 (UTC)

Chess games[edit]

Is it possible to add chess game notation here (like games from world chess championships)? --Conspiration 20:22, 23 April 2008 (UTC)

Hi, Wikisource does not include reference material. However, if you have a free publication of commented games, you're welcome to add it!--GrafZahl (talk) 07:58, 24 April 2008 (UTC)
I believe as long as the game is "notable" (Kasparov vs. Deep Blue, woohoo!), I'd love to see you use {{Chess diagram}} to do up a proper game transcript of each move. On the other hand, I see User:Forestfarmer/sandbox and Morphy's Games/Anderssen/Game I. Sherurcij Collaboration of the Week: John Gould 08:15, 24 April 2008 (UTC)
At the moment it's the games of the world chess championship 1986 since on de they don't want it to be included in the article but linked somewhere. :P --Conspiration 11:02, 24 April 2008 (UTC)
I think this is probably the best way to set up the layout for a match - being sure to throw a NOTOC in there somewhere before the end. I've asked the template designers at WP if they can't help us out a bit, and create an easy way to "shade" a piece that has just moved - so that "non-chess fanatics" can still easily follow the play of the game. Sherurcij Collaboration of the Week: John Gould 11:48, 24 April 2008 (UTC)
I don't think that it is appropriate to dismiss this out-of-hand by a simplistic link to a policy about reference material. The idea is worth more serious consideration than that. On the other hand, I do not support getting into a debate about when games are "notable". To whatever extent that that may be an issue it should be sufficient that the game has been verifiably published.
The transcripts of the games themselves are not copyrightable, but the published annotations are, as are non-predictable presentations of those games, or particular non-predictable compilations of them. Anyone who sets out to do things as Sherurcij suggests may just be giving himself more work than he can handle. If he tries this for only one complete game it may take an unrealistic amount of time to get it done.
I don't hold much hope for a massive inclusion of raw data from chess games, except as a part of otherwise includible published works, though I would be open to some arguments in favour. For the most part I think that this kind of project is more suited to Wikibooks. Eclecticology 18:55, 24 April 2008 (UTC)
To be honest, I don't think it would take more than two hours to chart up an entire game. For a chess fan, they could easily add each of the World Grandmaster tournaments for the past twenty years, or Kasparov/Fischer's games, etc. Sherurcij Collaboration of the Week: John Gould 19:07, 24 April 2008 (UTC)
For the 2007 championship tournament alone, with 8 players in a double round-robin, that's 112 hours. Only a patzer would want to have every move diagrammed. A serious player will only care to have key situations diagrammed. Eclecticology 04:34, 25 April 2008 (UTC)
Have you seen Corset-guy? We just need to find an OCD Chess-lover ;) But nah, I just meant the defining games of the matches or whatever. I'm not that big on Chess, but I assume it comes down to a face-off between the two top contenders? Sherurcij Collaboration of the Week: John Gould 06:25, 25 April 2008 (UTC)
I believe, all games from a real world championship (not FIDE world cup and how they called their pseudo-wch) are notable. So let me do those 48 hours of work and just tell me, where. ;) --Conspiration 12:38, 25 April 2008 (UTC)
The last thing I would want is a protracted debate about which chess games are notable. :-P A more realistic debate would be whether such a project is better suited to Wikibooks. For someone near Toronto, I remember something from when I was living there, and having more interest in chess than I do now. I would spend a lot of time going through the chess books at the Central Branch of the Toronto Public Library (then on College Street). One book that I called up was on the famous London tournament of 1851. It turned out to be miscatalogued. It wasn't a "published" book, but a unique bound-together set of manuscript score sheets for half the tournament in various hands, presumably in the hands of the participants. A scan of that would be a welcome addition. Eclecticology 17:44, 25 April 2008 (UTC)

I can see how this is more than just reference data, as it is a static and unabridged record of an event; however, it isn't a transcript of a work in any language. It is a game transcript. If it belongs on Wikisource, wouldn't it be best placed on Multilingual Wikisource? I'm not sure Wikibooks would want a standalone transcript, unless a lot of annotations were added to give it some educational value. See b:Opening theory in chess.

Before there were restrictions on the type of file that could be uploaded, games in w:Portable Game Notation were uploaded to Wikipedia; there are three left. It would be good if Wikimedia Commons accepted PGN files, or a similar format, and even better if the mediawiki software provided the ability to render these (as a gif, or as a multipage image like it does with djvu).

There is a meta project to address this at meta:WikiProject Chess.

As an aside, the extension that would provide Musical score support can also provide chess markup[4]. bugzilla:189 is a request to add mw:Extension:WikiTeX support.

John Vandenberg (chat) 09:01, 26 April 2008 (UTC)

p.s. There is a chess wikia here. If all else fails, they will probably be very keen to accept a game transcript. John Vandenberg (chat) 09:07, 26 April 2008 (UTC)

Webster's first dictionary[edit]

I have a copy of Webster's first dictionary published in 1806, which date is noted in Wikipedia's first page of text on Noah Webster. He taught in my town of Glastonbury, Connecticut. I believe my late father, Dr. Lee Jay Whittles, M.D., received it in the 1930's from a grateful patient during the Depression. Does anyone know how many copies of this 1806 edition were printed by Goodwin in New Haven, Connecticut? The printing total of the subsequent 1828 edition was 2,500 copies. Thank you, Bruce Whittles, P.O. Box 959, Glastonbury, CT 06033-0959. Email: brwhits@cox.net.

Non-specific license[edit]

I'd like Wikisourcians' opinions on whether this is includable. The report states:

© 2005 Author(s). This work is licensed under a Creative Commons License.

Is it acceptable to default this to mean CC-2.0 and therefore make it uploadable, or does the non-specific nature of the license render this text unable to be included in Wikisource? Opinions much appreciated.

Thanks, Daniel (talk) 01:09, 25 April 2008 (UTC)

Click, speak to them, get clarification Sherurcij Collaboration of the Week: John Gould 01:27, 25 April 2008 (UTC)
Would it be better to contact the author directly? Daniel (talk) 01:47, 25 April 2008 (UTC)
I have emailed the source. Daniel (talk) 12:57, 25 April 2008 (UTC)
The author doesn't know, so I'm contacting NHESS. Daniel (talk) 11:34, 3 May 2008 (UTC)

Bottom-headers[edit]

(Posting here per Pathoschild's suggestion) Does anyone know why The Clansman/Book I/Chapter II has a bottom header and yet The Clansman/Book I/Chapter I doesn't? Thanks, Daniel (talk) 00:55, 4 May 2008 (UTC)

I had similar problems on #Footer problem. It is controlled by the JavaScript function "DisplayFooter" in MediaWiki:Common.js (down the bottom). --John Vandenberg (chat) 05:00, 4 May 2008 (UTC)
I cannot read Javascript, but it looks like pages which don't have values for both the "next" and "previous" parameters don't generate a footer. Look at Anna Karenina/Part One and Anna Karenina/Part Two (as well as Anna Karenina/Part One/Chapter 1 and Anna Karenina/Part One/Chapter 2). Would this be possible to fix?—Zhaladshar (Talk) 18:16, 4 May 2008 (UTC)
Well, looks like you've found the problem. The footers look fine now. -Steve Sanbeg 23:09, 6 May 2008 (UTC)

The footer still doesn't appear on Talmud/Seder Zeraim/Tractate Berakhot/2a. I think the footer should be displayed whenever there is a value in next or previous, and it is probable that the top header is off the screen when the reader is at the bottom of the page.

We can use dynamic properties in IE to have it appear whenever the pageContent height is more than marginally higher than the window frame, and Mozilla has similar functionality available via XBL. As a fallback for other browsers, we can listen to resize events.

If we can agree to when the footer should appear, I can do the coding. John Vandenberg (chat) 02:29, 7 May 2008 (UTC)

There are a couple of cases where previous/next are perhaps not desirable; one complaint was raised regarding the footer appearing on NYT. Perhaps the footer should only be displayed if the previous/next values are part of the same "work", which we could approximate to avoid the problem cases. One way would be to look for "Wikisource:" in the previous/next. John Vandenberg (chat) 02:52, 7 May 2008 (UTC)

Talmud/Seder Zeraim/Tractate Berakhot/2a looks OK to me; maybe it was still cached before, or is in your cache. It would be trivial to limit it to only display if there was a next or previous entry, although it may be better to make it unobtrusive enough that we can display it everywhere, for consistency, and manually disable it on a few pages where we don't want it. -Steve Sanbeg 16:36, 7 May 2008 (UTC)

It displays properly for me now. Thanks. John Vandenberg (chat) 01:17, 8 May 2008 (UTC)
It is now also being displayed on empty pages; I noticed it on Design [5]. John Vandenberg (chat) 01:43, 8 May 2008 (UTC)
Would it be possible to change the code so that it only displays if either "previous" or "next" actually have a value and is hidden otherwise? Disambiguation pages are showing the footers and there really isn't any need to do so; it just makes that page look odd.—Zhaladshar (Talk) 18:59, 8 May 2008 (UTC)
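The display rule being requested in this thread can be written down in a few lines. The following is a Python sketch of the decision logic only, not the DisplayFooter function in MediaWiki:Common.js (which is JavaScript and is not quoted here); the function names are hypothetical, and the "Wikisource:" check is the rough same-work approximation suggested above.

```python
def has_value(link):
    """True when a header parameter holds something other than whitespace."""
    return bool(link and link.strip())


def is_same_work(link):
    """Rough heuristic from the thread: project-namespace links are not part of the work."""
    return has_value(link) and "Wikisource:" not in link


def should_show_footer(previous, next_, same_work_only=False):
    """Decide whether a page gets a bottom navigation footer."""
    check = is_same_work if same_work_only else has_value
    return check(previous) or check(next_)
```

Under this rule a first chapter with only a "next" value still gets a footer, while disambiguation pages and empty pages, where both parameters are blank, do not.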

Acceptable Article?[edit]

Hi everyone! Well, first of all, I am very new to this website (but not Wikipedia), and I am really not sure if I should be typing this a different way or not. But what I would like to know is if my relative's autobiography would be acceptable content on Wikisource? I know that most autobiographies (or self-written articles) aren't usually uploaded to Wiki sites. The main reason that I would like to upload this is that my relative's story is the story of an immigrant and, I might add, comes from a very good writer. So, can anybody help me with this inquiry? Any help would be much appreciated. unsigned comment by TheGreatAwno (talk) 17:05, 5 May 2008.

Has it been published? If so, where was it published; a journal, magazine, or a book? John Vandenberg (chat) 08:42, 5 May 2008 (UTC)

Shakespeare[edit]

Aren't Shakespeare's works under eternal copyright in the UK? Anonymous101 18:28, 5 May 2008 (UTC)

Who has the right to exercise the copyright? - Mtmelendez 19:43, 5 May 2008 (UTC)

No. John Cross 22:51, 24 May 2008 (UTC)

semi-protection request[edit]

Hi, I don't know if semi-protection works the same way on Wikisource that it does on Wikipedia, but I was wondering if I could request semi-protection of some texts that I transcribed. Specifically I was wondering if I could get all of the texts by Ross Winn semi-protected. There are several reasons I'm requesting this. Firstly, I have painstakingly transcribed and proofread most of these texts from the only known copies kept at the Labadie Collection at the University of Michigan. So if someone edited one of these texts, it would take a Herculean effort to verify the accuracy of the changes, and since several of the texts have strange spellings or typos that are from the original texts it is likely that someone someday will try to "correct" them. Since I only get on WikiSource extremely rarely, it isn't practical for me to monitor these texts for changes myself, unfortunately. I would like to upload scans of the original texts so that other people can verify changes easily (in fact I have done this already for a couple of the texts), but many of them are too old and delicate to undergo scanning (according to the archive staff). Let me know if this is a possibility or not. Thanks! Kaldari 02:19, 9 May 2008 (UTC)

Hi, our Wikisource:Protection policy outlines when we use protection, but if this doesn't seem to cover your request, we can still discuss whether it is appropriate in this case. I see you have uploaded two pages of Winn's Firebrand:
Personally, I would prefer not to protect these pages, as there are still many good changes that can be made to a text without altering the transcribed text. There are ways to ensure undesirable corrections do not occur.
You can now enable email notifications in your Special:Preferences, so that it will inform you when there are modifications to pages that you have placed on your watchlist.
Another way to reduce the chance of incorrect changes is to add to each talk page that the text has been transcribed from the original held at the archive, perhaps even recording the archival identifier. The regulars here on Wikisource will request that newcomers prove that changes are faithful, and will revert any changes that are not explained. John Vandenberg (chat) 02:45, 9 May 2008 (UTC)
See also Help:Patrolling - for the most part, every single change by users other than an admin will be inspected by a Wikisource regular. The current changes that need to be reviewed can be seen here. --John Vandenberg (chat) 02:50, 9 May 2008 (UTC)
I concur with John that there are better alternatives to protection at this time. Incidentally John, doesn't your bot have a patrol function, or am I dreaming? giggy (:O) 03:05, 9 May 2008 (UTC)
Yes, it automatically approves user contributions to certain pages, defined by User:JVbot/patrol whitelist. At present, the whitelist can be edited by anyone. It has patrolled 20396 changes, so it is catching up to its master who has patrolled 20768 changes :-) John Vandenberg (chat) 03:15, 9 May 2008 (UTC)

{{Page}} Template[edit]

We currently have a {{Page}} template that transcludes text from our Page: namespace. That is, typing {{Page|Foo}} causes the system to look for a page named Page:Foo. If it finds it, it replaces the {{Page|Foo}} tag, wherever it appears, with the text of Page:Foo.

Isn’t this unnecessary? Typing {{Page:Foo}} accomplishes exactly the same thing, without the intervening step of loading the template. Is there something I’m missing? Are we somehow saving processor cycles or server read/writes by using a template rather than just transcluding the page directly by typing its name inside curly brackets? Tarmstro99 15:58, 13 May 2008 (UTC)

Yes, you are. The Page template also displays JavaScript links to the scanned pages, in the margin (assuming it is used within a <div class=lefttext> section). ThomasV 19:44, 13 May 2008 (UTC)
On my monitor, in Firefox 2.0.0.14, the page links don't make it over into the margin; they show up underneath the text of the article (underneath z-order-wise, not underneath vertically). --❨Ṩtruthious ℬandersnatch❩ 07:02, 15 May 2008 (UTC)
This is due to recent changes in the template, which were made to suit The Wind in the Willows/Chapter 1. Further improvement is needed. John Vandenberg (chat) 07:31, 15 May 2008 (UTC)

(unindent) With The Wind in the Willows/Chapter 1 as a formatting exemplar, I used {{Page}} to create a couple of new pages with the text transcluded from scans in the Page: namespace. The results are available at The Records of the Federal Convention of 1787/Volume 1/Preface and The Records of the Federal Convention of 1787/Volume 1/Proceedings of Convention, May 14-25, 1787. I’m pretty happy with how they look, although it would be nice if the {{Page}} logic were smart enough to (1) assign the correct page number in the left margin based on the corresponding item from the <pagelist> on that particular scan’s Index: page, rather than requiring the user to pass a num= parameter to the template, and (2) allow an adequate left margin for the link to the scanned pages without requiring the user to place all the calls to {{Page}} within a table. I’m no programmer, and am perfectly willing to concede that what I want to see may be beyond the capabilities of any template, but that’s how {{Page}} would work in a perfect world, IMHO. Tarmstro99 01:00, 28 May 2008 (UTC)

The Z problem[edit]

Mtmelendez (talkcontribs) and I have been trying to find the correct character for the Z used in Index:Sacred Books of the East 3.djvu. e.g. Page:Sacred Books of the East - Volume 3.djvu/315 currently has "Zo Kwan". pengo suggested on IRC that the Z may be w:Open-mid central unrounded vowel. John Vandenberg (chat) 14:40, 14 May 2008 (UTC)

It looks like ℨ to me. Eclecticology 19:43, 14 May 2008 (UTC)
I know I am commenting quite a bit late, here, but the example you referenced (Page:Sacred Books of the East - Volume 3.djvu/315) looks like a Unicode Character 'BLACK-LETTER CAPITAL Z' (U+2128), see [6]. It looks like this: ℨ and the HTML entity is &#8488;. I think the existing character should be replaced with this. Earthsound (talk) 21:52, 13 October 2009 (UTC)

There's also an issue regarding an unidentified Chinese character on pg. 309. - Mtmelendez 18:22, 14 May 2008 (UTC)

Is this not the same as 書, or am I miscounting the strokes in the middle? Eclecticology 19:43, 14 May 2008 (UTC)
Awww, c'mon, if you're going to call me in pick something challenging! There's an entire Wikipedia article on Legge's personal romanization of Chinese dialects. (I'm just kidding, I think I was able to track that down because I've taken a little Mandarin, plus I vaguely remember reading something like that when I read about Wade-Giles versus Pinyin romanization.)
So we're lookin' at fancified Fraktur-font versions of З / &#x0417; and з / &#x0437;.

And Eclecticology is right, the page 309 character is 書 / &#x66F8; ― that footnoted text is referring to the Shūjīng, the topic of the preceding chapter.--❨Ṩtruthious ℬandersnatch❩ 02:23, 15 May 2008 (UTC)
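For anyone who wants to double-check the candidates named so far, here is a minimal Python sketch that looks up the official Unicode name of each code point with the standard `unicodedata` module. The names `CANDIDATES` and `describe` are purely illustrative, and the list only contains code points already mentioned in this thread; it says nothing about which one Legge's printer actually intended.

```python
import unicodedata

# Code points mentioned in this thread so far (written as escapes
# so the source stays ASCII-only).
CANDIDATES = ["\u2128", "\u0417", "\u0437", "\u66F8"]

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))

for ch in CANDIDATES:
    print(describe(ch))
```

The same lookup works for any further candidate: add its code point to the list and compare the printed name against the scan.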

Thank you. Hopefully this one is harder: Page:Sacred Books of the East - Volume 3.djvu/329 has what appears to be another, different, Fraktur Z. John Vandenberg (chat) 05:11, 15 May 2008 (UTC)
Are you talking about the one in "Măng-зze"? I think that's just the way they're doing the lower-case; it looks the same as the one on page xi in "Зăng-зze". I know it has that little loopy tail, and there's a fairly similar-looking character ʓ / &#x0293; called "LATIN SMALL LETTER EZH WITH CURL" but the "ezh" is not mentioned in the Wikipedia article on Legge's romanization, so I think the little loopy tail is font theatrics on the lowercase Cyrillic "ze". Or it might specifically correspond to ҙ / &#x0499; "CYRILLIC SMALL LETTER ZE WITH DESCENDER".
Not to beat a completely dead horse, but if you throw it in a template and insert it that way, if at any point in the future you find you guessed wrong you can do the switcheroo. (Or you could do the same thing with your bot-minion, it doesn't matter.)
It's awesome you guys are doing Legge's text, anyways. Here's some code that does a fairly good facsimile of those left-hand floating boxes in case you haven't tackled that yet:
<div style="float:left; font-size:.8em; line-height:1em; text-align:center; width:7.5em; padding:.5em;">The writers of the odes.</div>
--❨Ṩtruthious ℬandersnatch❩ 06:49, 15 May 2008 (UTC)
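The template suggestion above amounts to a single point of indirection: pages carry a placeholder, and the actual character is defined in exactly one place. Below is a rough Python sketch of that idea. It is not MediaWiki's template engine, the template names `Legge Z` and `Legge z` are hypothetical, and the characters assigned to them are only the guesses floated in this thread.

```python
import re

# Hypothetical template names mapped to the current best guess.
# Changing a value here changes every page that uses the placeholder.
TEMPLATES = {
    "Legge Z": "\u0417",  # assumption: Cyrillic capital ze
    "Legge z": "\u0437",  # assumption: Cyrillic small ze
}

def expand(text, templates=TEMPLATES):
    """Replace {{Name}} placeholders with their defined text."""
    def substitute(match):
        name = match.group(1).strip()
        # Unknown templates are left untouched.
        return templates.get(name, match.group(0))
    return re.sub(r"\{\{([^{}|]+)\}\}", substitute, text)

page = "M\u0103ng-{{Legge z}}ze and {{Legge Z}}\u0103ng-{{Legge z}}ze"
first_guess = expand(page)

# If the guess turns out to be wrong, only the definitions change;
# the page text itself is never edited.
revised = expand(page, {"Legge Z": "\u2128", "Legge z": "\u0293"})
```

The page text stays identical between the two calls, which is the whole point of the "switcheroo": a later correction touches one definition rather than every page.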
I can see some value in using a template to present these, esp. as Legge's personal romanization of Chinese dialects is ..., for want of a better term, notable. Researchers might like to see where each character was used. I'll wait to see what others think.
Regarding the left-hand floating boxes, we are currently using normal section headers. Do you want to do a demo of your voodoo on one page, so we can see how it looks and discuss? We need to keep in mind that when this text is displayed in a logical structure, these floating boxes may appear too close together because the typical screen permits more text per line than the printed page where they were placed. Still, we could use template logic to display these as either normal text or floating boxes depending on which namespace the template is called from. John Vandenberg (chat) 07:33, 15 May 2008 (UTC)
Sure; here ya go.
I'm not proposing that you change the way you're doing things, though, I just noticed that in the discussion page on 329 there Mtmelendez said he was looking for a way to do those boxes. --❨Ṩtruthious ℬandersnatch❩ 09:16, 15 May 2008 (UTC)

Do we really need the fancy "div style" formatting here for what is essentially a talk page? Some of it above only manages to screw up the text wrapping. I was tempted to remove the superfluous formatting, but thought that exercising a little wikiquette for now would be better appreciated.

The Wikipedia article on Legge's romanization mentions the Cyrillic/Fraktur characters, but does this without any reference. I've added a "fact" tag there, since mere resemblance does not imply verifiability.

Going through those of Legge's contributions that I do have gives me the impression that Legge did not use the two special symbols throughout in all his writings. My first impression is that the characters were abandoned for the later works. A table of transliterations for oriental alphabets at the end of The Texts of Taoism, Part I does not even mention ℨ, but does show the other character with and without a dot beneath as variants of "z". These two it uses to distinguish between the Arabic ص and ض. For me this supports the notion that "ℨ" is nothing more than the capital form of "ʓ". Let's disabuse ourselves of the notion that they have anything to do with Cyrillic. I don't think that a template for these would be helpful. They could be added to the special characters, or instructions could be given in the notes for the head page of a work that uses them.

The floating box is an unrelated issue. How is your proposal different from what there is at Wikisource:WikiProject 1911 Encyclopædia Britannica/Style Manual#Shoulder headings? Eclecticology 22:02, 16 May 2008 (UTC)

What do you mean by the "fancy div-style formatting"? Are you talking about the dashed-bordered box that appears around text when you place a space at the beginning of a line? That screws up line wrapping, but it's wiki "code", it has nothing to do with HTML div elements. But yes, editing other people's comments for the sake of making them look pretty on your computer's screen wouldn't be really great wikiquette.
As far as the character being something other than the Cyrillic "ze" - that's certainly a possibility. That's exactly why I'm suggesting that they add a template instead of any particular hard-coded character - so that it can be changed effortlessly in the future.

Also, the capital "ezh" is Ʒ / &#x01B7;. The character you're suggesting - ℨ / &#x2128; - is named "BLACK-LETTER CAPITAL Z" in my copies of the charts and it's in the "Letterlike Symbols" block next to ™ and № and ℞ and fractions and roman numerals. (Not that that precludes it from being a match for what we're seeing in the text, but the fact that it isn't accompanied by a lower-case makes it look to me as though it's a special-purpose symbol. That symbol may be in the Fraktur font on your computer, that could be why it looks especially similar to you.)

The snippet of code I offered above is not a proposal, again it was simply in response to Mtmelendez's comment about finding some way to do those boxes. --❨Ṩtruthious ℬandersnatch❩ 00:35, 17 May 2008 (UTC)
What dashed box? What I see in the comment section beginning with "Are you talking about ..." is extra wide text that requires side-scrolling to be read. The purpose of Scriptorium is to allow issues to be discussed; gratuitously idiosyncratic formatting does not serve that purpose. I have no intention of editing your comments, but then I don't consider the special formatting that has been included "for the sake of making them look pretty" to be a part of the comments.
I'm not in a position to determine whether the ℨ is an exclusive part of the Fraktur font. The Unicode Standard Version 3.0 book simply states, "The Unicode Standard simply uses black letter forms as archetypes." (p.298) Having a special template for one symbol that's unlikely to have much use beyond the Sacred Books of the East does not strike me as constructive. When more distinct fonts are necessary for entire texts, a template might then be more useful than for single isolated characters. Eclecticology 01:38, 17 May 2008 (UTC)
I'm pretty sure that what you're seeing is the wiki code effect produced by beginning a line with a space, which generates an HTML <pre/> tag for its formatting effects and consequently causes line wrapping problems. It's a standard way of presenting code snippets within wiki text. There's really no need to be bitchy about it. If you don't like the scroll bars it produces in your browser you should complain to the developers of MediaWiki.
I explained the purpose of using a template here. If you're both complaining that someone might get the text wrong, but at the same time complaining that using a template to avoid the problems of getting the text wrong might be unconstructive, I think you're being a bit cavilous. "Constructive" or not, it would solve the problem and it's not like there's some sort of risk or major cost to using a template for a single text, nor does the number of characters a template produces have any bearing on how "constructive" or appropriate it is to employ such a template - I refer you to Wikipedia templates like {{!}}, {{=}}, {{((}}, and {{))}}. I would also note that you yourself referred to the style guide of EB1911 above, to a section that describes how to use {{EB1911 Shoulder Heading}}, which is - drum roll please - a special template that appears to be intended for use with a single text.
If you wouldn't go with my suggestions yourself, say so. There is no need to contrive crappy hypocritical arguments asserting that my ideas are completely without merit or try to condemn the way I format my talk page comments.
In any case, I don't care what the people working on this text do. Questions were put to me and I tried to respond in a manner as helpful as possible. If you want to talk about being unconstructive, examining the nature of your own responses here might be a good place to start. --❨Ṩtruthious ℬandersnatch❩ 02:17, 18 May 2008 (UTC)
I realized that underneath all the nasty posturing about propriety might be what someone else would express as a request for help in making the talk page look better on your monitor. I took a crack at replacing the wiki formatting with HTML code that would allow the lines to wrap. I hope it makes viewing this page more comfortable for you. --❨Ṩtruthious ℬandersnatch❩ 04:46, 18 May 2008 (UTC)
At least the identified passage is now readable without side-scrolling. Thank you. Eclecticology 17:16, 19 May 2008 (UTC)

Shakespeare edition

I was not able to ascertain what edition was used as the source for any of the plays I looked at. Aside from issues of textual transmission, this raises serious questions about whether the particular edition sourced is under copyright. What should be done about this? Would a Fidelity template at the head of each play be an appropriate way to proceed? Webbbbbbber 22:57, 15 May 2008 (UTC)

Cowardly Lion (talk · contribs) has spent a bit of time looking at our works of Shakespeare; I've left a note on his talk page in case he has time to throw in some ideas.

{{Fidelity}} templates would work, but I would place them at the top to attract attention. A better approach would be to set up transcription projects for each, like Romeo and Juliet, and adding a {{migrate to djvu}} tag to the page being migrated.

As an example, this is at least PD in the US, so it can be uploaded to Wikisource - if you can work out who Author:W. Osborne Brigstocke is, it might even be PD in the UK, which means it can be uploaded to commons:

On archive.org, there are 860 sets of page scans of works created by William Shakespeare. John Vandenberg (chat) 23:46, 15 May 2008 (UTC)

Project Gutenberg seems to have done a highly creditable job transcribing directly from the first folio (where available). Could we use their transcriptions? Webbbbbbber 23:54, 15 May 2008 (UTC)
Most PG etexts come from editions that can be found on archive.org. If you can match up a PG etext with a set of page scans, the transcription project is simply a matter of copy and paste, with a bit of reformatting and a quick final proofreading. John Vandenberg (chat) 00:09, 16 May 2008 (UTC)
Just because someone has used a modern printing of a Shakespeare work does not imply that that edition acquired a new copyright when it was issued. Minor corrections to the text would not generate a new copyright. What would be copyright in those editions would be any new introductory material, typesetting and layout. It would be quite alright for me to use a modern text version in a wiki text, but a scanned version should be avoided.
To say that PG seems to have done a creditable job of transcribing the first folio is not enough. Unless they specifically state that they have taken their version from the first folio we cannot assume that they have. Eclecticology 01:05, 16 May 2008 (UTC)
They specifically state that they have taken their version from the first folio. (-: At least they have for Hamlet: http://www.gutenberg.org/catalog/world/readfile?fk_files=89&pageno=7 The reasons that I say "seems" are: I do not have access to the first folio, and I am by no means a Shakespeare scholar.
By the way, I'm not arguing that the first folio is necessarily the best way to go; just that it would be quite easy to copy & paste from PG. My main concern is that the edition used as the source text be cited. Which edition is chosen I will happily leave to others more knowledgeable (or opinionated) than myself.
Webbbbbbber 15:51, 16 May 2008 (UTC)
It's worth noting this from the page you cite: "Another thing that you should be aware of is that there are textual differences between various copies of the first folio." This is really an avenue to a wide range of interesting problems, whether or not we are individually Shakespeare scholars. We sometimes need to ask what is "our" canonical Shakespeare text, and how do we reconcile other variations to that? I had enough of an argument with a Rudyard Kipling poem that had only two clearly identifiable versions. The two versions now have separate pages. For a Shakespeare play the situation is incredibly more complicated.
My bigger concern was with your initial comments that suggested imputing copyrights to modern editions of older works that are otherwise clearly in the public domain. One simply needs to avoid reading too much into a publisher's copyright statement. Eclecticology 18:06, 16 May 2008 (UTC)
"For a Shakespeare play the situation is incredibly more complicated." I couldn't agree more! There is two-volume work on the textual difficulties of Hamlet alone. I get the impression that no modern editions will use a single source for any given play. So it may be best to leave the editing to the scholars, and choose a modern edition, rather than going with a pure first folio transcription. This is especially true of plays where they really have to do a lot of cutting & pasting from different sources to come up with a coherent text.
Yes, and any scholarly modern edition will do (not expurgated school editions). This can be determined on a play by play version. Primary proofreading could then be based on that "canonical" edition. Other editions could show their difference from that one.
My apologies about any miscommunication in my initial statement. I didn't mean to imply that the text of the plays themselves would be under the same copyright as the commentary, etc; just that the text of the plays *might* be under copyright, and that this would be extremely difficult to verify (one way or the other) if the source isn't cited. Webbbbbbber 18:34, 16 May 2008 (UTC)
I don't think that apologies are needed. Copyright law can be a little confusing. The essential text of the plays would not be under copyright at all, even for a modern edition. A modern text would need a strong element of originality to be protected ... but then (like Macbird) it would no longer be Shakespeare. Eclecticology 22:54, 16 May 2008 (UTC)

I added a textinfo box to The Merchant of Venice, since the discussion page there mentioned the source. Would anyone like to have a look and let me know if I've filled out everything properly? Thanks! Webbbbbbber 22:07, 16 May 2008 (UTC)

That looks good, but you may get objections from those who believe that that box should be on the talk page. I would vote for having it where you put it, except perhaps after the header; maybe that whole issue needs to be revisited in the light of the header discussion on this page. Eclecticology 22:54, 16 May 2008 (UTC)
I have moved the textinfo block to the talk page, as that is where it is intended to go, and that is where it will be expected to be. John Vandenberg (chat) 02:52, 17 May 2008 (UTC)
Those passive voices explain nothing. It would certainly be more useful on the project page, perhaps with some redesign to make it look less intrusive. Eclecticology 08:34, 17 May 2008 (UTC)
I am all for a redesign; but that is being dealt with in a separate thread and won't be implemented immediately, as it is complex enough that it can't be pushed through piecemeal. Until we have a new working solution, the old practises should be maintained so that when we do come to implement the redesign, there are fewer oddballs to trip up the bots. John Vandenberg (chat) 09:06, 17 May 2008 (UTC)
In some ways it is a different question; I see the header and textinfo as worthy of separate templates. Fitting both of these concepts into the same template just makes that one great template more complicated. Eclecticology 17:01, 17 May 2008 (UTC)
Just a naive question. If a text is backed by scans of a reference edition (see WS:TP), and made faithful to that edition, then there are two sources that should be distinguished. The first source is the scanned edition. The second source is the source of the e-text (whether it was copy-pasted from Gutenberg, from another website, from an OCR of the scanned edition, or an OCR of another edition).
I suppose that the second source is not very relevant to copyright. The source that wikisource is reproducing is the reference edition, therefore the only thing that matters is whether the scanned edition is free. The secondary source (the provenance of the e-text) might be acknowledged, but I doubt that it should be considered as a copyright issue. Am I wrong ?
ThomasV 09:01, 17 May 2008 (UTC)
It's an important question, and not at all an easy one to answer. How do we determine which edition to use as the reference (or canonical) edition? The WS:TP page shows links to two conflicting texts of The Wind in the Willows from 1908 and 1913. There may be some merit to including both editions, but that would strike me as a waste of resources. If we choose to have only one whole version, how do we reconcile the two versions which we know to have a long range of small differences? What differences are important enough? I have no problem with the Oxford edition of Merchant of Venice, but when I compared the first speech with my paper copy of the Yale edition I found two punctuation differences. This alone may not be a significant difference, but it is a warning about what might be found in the rest of the play.
I have been working with some hundred year old magazines which were the first publication sites for many authors whose works were later collected into books. That was the way things were done then. The Conan Doyle short stories are only one example of these. The differences between the two can be quite striking.
I agree that it's not generally a copyright issue. The scanned version can be an older one where copyright is not a factor at all. The reference version may be more recent, and more reliable. It may be protected by copyrights on layout, typography and added footnotes, but the corrected text itself will be in the public domain. Eclecticology 17:01, 17 May 2008 (UTC)
Hoo boy. So, now I'm wondering if the {{Textinfo}} template ought to be expanded to offer a slot for scanned source and e-text source. It wouldn't always be necessary to have both: (e.g., PG is pretty good about citing the scanned source). But should there be a slot in {{Textinfo}} for cases where the e-text doesn't clearly indicate the scanned source? Or would that just confuse matters? Webbbbbbber 02:57, 18 May 2008 (UTC)
There's no need to feel limited by the existing textinfo template. If you feel that an existing template is inadequate to your needs (and you feel that you have enough geekish skills), then by all means go ahead and develop your own version. Eclecticology 17:08, 19 May 2008 (UTC)

Tax documents

I've been doing some cleanup of PDF files on Wikipedia, and I've found an editor who has uploaded a large number of tax documents relating to the Church of Scientology in PDF format [7]. Would they be better off here? Hut 8.5 15:59, 18 May 2008 (UTC)

I've superficially skimmed the page titles and description pages; here are some problems I see:
  • The documents are tagged as US government public domain. That's OK for the IRS stuff, I believe, but others are from the Californian judiciary, so at least the tag is wrong. I'm no copyright expert, though.
  • Wikisource doesn't include mere lists or compilations of data, as e.g. w:Image:List-Organizations-CoS-1993.pdf appears to be (I haven't read it). Lists which are part of a larger publication are OK, though.
  • Wikisource only includes texts which were previously published in a forum with some kind of peer-review. So those files which are internal Scientology docs don't qualify. Official documents of a court case where everything is included in a nice roundup is a different story, however.
--GrafZahl (talk) 09:10, 19 May 2008 (UTC)
  • Californian documents, etc would likely be covered under {{PD-GovEdict}}
  • If the "published document" itself is a list, I believe policy dictates it could be included.
  • Wikisource was intended to be a repository for "source" documents. Tax documents such as these are unlikely to be reproduced in a published work; however, it may be that some or all of these documents (or others like them) are referenced, or reproduced in part, in published works. That being the case I believe Wikisource should include (where someone is willing to add it) the entire document, not just the partial reproduction or reference. In short, I believe a document should pass the "importance" requirement if it has been published with some kind of peer review, or if it is referenced by a document that has been published with some kind of peer review.--T. Mazzei 17:02, 19 May 2008 (UTC)
While I don't see a lot of problems working around the first two objections there are potentially important copyright problems with modern source texts that have never been legally published elsewhere. The peer-review point is not that relevant here, since it is primarily designed to prevent vanity publication. That might be more in question if someone was disputing that Scientology was in fact the source for these documents, or we were talking about someone's personal rant for or against Scientology. Eclecticology 17:31, 19 May 2008 (UTC)

Traditional

Is it useful to create an author's page for "Traditional"? There are many old folk songs that have no real author, in most of the cases the composer is billed as "trad." or "public domain". Songbird 16:27, 18 May 2008 (UTC)

Hi Songbird, and welcome to Wikisource. When the author is not known, you can use the override_author parameter of {{header}} to say "traditional" or "author unknown" or some such. You may want to add the work to the anonymous texts index (there's an ongoing discussion as to how this page should be organised). You can also add your folk song to Category:Folk songs and Category:Anonymous texts. The former contains only six songs, so it would be really nice if you could add some more.--GrafZahl (talk) 08:45, 19 May 2008 (UTC)
Thank you very much, I'll check it out. Songbird 13:46, 19 May 2008 (UTC)
But perhaps it's useful to create a list of such traditionals, in this case an author's page would be useful, I think. What do you think? Songbird 14:50, 19 May 2008 (UTC)
Categories, or a page in the Wikisource namespace, are preferred. Consensus in the past has been to keep the author namespace pure and Aryan :^P --T. Mazzei 17:13, 19 May 2008 (UTC)
Wikisource:Song lyrics has a place specifically for traditional folk songs (as well as sea shanties, Christmas carols, religious hymns, national anthems, etc) Sherurcij Collaboration of the Week: Author:Percival Lowell 18:27, 19 May 2008 (UTC)

Encyclopedic articles

The following discussion is closed: See the discussion "Composite encyclopedia pages" above for a portion of this.
I use the term "encyclopedic article" to mean not only articles in books named "Encyclopedia", but also articles in other compendia with a more limited selection of topics. These are more commonly in alphabetical order, but other orders, such as date of death order are also common. If two succeeding articles are related to each other it is purely a matter of coincidence.

Many of these articles are short, and articles on the same subject can appear in several encyclopedic works. Including them on the same page makes a comparison of these articles easier.

Using the encyclopedia's name as a primary title item provides little if any benefit. A benefit is conceivable in such a scheme with a diligently applied system of categorization, but that would require considerable work that nobody seems prepared to do. Such a categorization or indexing system must also be prepared to account for the different names by which the topic is known, and for the vagaries of different alphabetization systems, which may be either word-by-word or letter-by-letter.

In working through a revised approach for this material I have worked with the entries in the 1911 Encyclopædia Britannica, the 1906 Americana encyclopedia, the late 19th century original set of the Dictionary of National Biography and its first supplement from 1901, the 1701 edition of Jeremy Collier's Great Historical Dictionary, and the 1903 Cyclopædia of American Biographies, all of which I have in hard copy. I took note of the 1913 Catholic Encyclopedia, but out of respect to the work of others, did not move any of these articles since that work has been essentially all included under its old titles. By starting work with Volume 19 of the EB, (beginning with "Mun") I was able to avoid causing damage to the efforts of others.

I have entered the pages according to the names shown in the particular work with a disambiguator based on which work was being extracted. (I admit that I am not completely satisfied with these abbreviations as they could become a bigger problem as the number of works used increases.) If that was the only article available on that topic I put it directly on that page, keeping in mind that it could be moved at a later time if another article is discovered. If the article was present in more than one work it was redirected to a common page with a suitable name. There, all the articles found are on the same page, each with its separate header using a usual format. In the template the "article" parameter should show the article name as it appears in the original work, adding only a disambiguator where necessary. The same can be done with the "previous" and "next" parameters, which should resolve as links to those articles in that encyclopedia. A similar approach was taken with links found in the text.

There were five articles about Kentigern, a sixth century Irish saint. Some sources show him as Kentigern; others as Mungo. These have been brought together in one page. Another matter to be resolved had to do with Müntzer, Thomas whose name is sometimes spelled "Münzer" or "Muncer". These are only a few of the details to be resolved.

We generally need to be more flexible about the way we present material. Any credible approach should be permitted. If someone else's work has a different appearance, or uses different templates that should be no problem. If that presentation is not viable, nobody else will use it. Eclecticology 07:05, 20 May 2008 (UTC)

Pathoschild raised a similar discussion at #Composite encyclopedia pages. John Vandenberg (chat) 07:24, 20 May 2008 (UTC)
You have not caused any "damage" to others, however we are often having to make exceptions for the way you are working, which is against the style guidelines (see #Statistics). That is a high cost, and only so that you can champion a new approach. It has been about six months now? I hope we can roll any good ideas into our standard practises, and that you will abandon any practises which do not have community approval. We need to move on in order to tighten up our standards. I do not agree we should be more flexible, as tighter standards help new users pick up the ropes sooner, and allow tools to be written to better manage this enormous collection. John Vandenberg (chat) 07:55, 20 May 2008 (UTC)
What high costs? What's so wrong about championing new approaches? Is it really an exception requiring a martyr's barnstar to allow others to develop different visions? Where is the community approval for the notion of "standard practises" as something more enforceable than mere guidelines? Where is the community approval that determines just how tightly your standards should be adopted? The whole notion of community approval leaves much to be desired, given the small size of the regular community. Giving the new editor a set of guidelines with which to start is fine, but he also needs to feel free to deviate when these arbitrary standards are inadequate to his efforts. As for the management tools, it is important to remember that tools should serve the community, not the other way around. Eclecticology 07:18, 21 May 2008 (UTC)
My tool is trying to find an accurate algorithm to answer the question "how many works are on Wikisource?"; that is serving the community.
I don't mind having to build better tools, but it is insane to require a developer to build tools to cater for each person's different approach. That does not scale. We need to have minimal standards in order to answer questions like that. Having only text from a single work on each page, and using a common prefix for each collection of pages: those are the minimal standards that are used throughout our collection, with the only exceptions being really old pages, really new pages, and your recent work. The common practise is the sign of community approval. This is not my imposition on the community; this is my interpretation of the community best practises. John Vandenberg (chat) 07:41, 21 May 2008 (UTC)
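As a rough illustration of why a common prefix matters to a counting tool, here is a minimal Python sketch. It assumes that every page of a work shares the root page's title as the part before the first slash; the page titles are sample data, and `group_by_work` and `count_works` are hypothetical helpers, not the tool discussed above.

```python
from collections import defaultdict

def group_by_work(titles):
    """Group page titles by the root page name before the first '/'."""
    works = defaultdict(list)
    for title in titles:
        root = title.split("/", 1)[0]
        works[root].append(title)
    return dict(works)

def count_works(titles):
    """Count distinct works, assuming one root prefix per work."""
    return len(group_by_work(titles))

# Sample titles only; a real run would read the full page list.
sample = [
    "The Time Machine",
    "The Time Machine/Chapter I",
    "The Time Machine/Chapter II",
    "The Wind in the Willows/Chapter 1",
]
print(count_works(sample))
```

A page that carries text from several works under one title would still be counted as a single root by this heuristic, which is the kind of case such a simple count cannot resolve on its own.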
So your reason for tight standards is only so that you can count articles? How is that serving the community? Content is for the benefit of users, and I doubt that any of them give a damn about the number of works. If my work is making such a chore of your counting algorithm it doesn't bother me that you don't count it. To say that community approval is determined by inertia does not strike me as very sound. Without getting into semantic distinctions between "imposition" and "interpretation" there remains no community decision to the effect that your way is the only way. Eclecticology 08:44, 22 May 2008 (UTC)
Not to count articles, but to count distinct works; not because I particularly care how many works we have, but because if we can't count them, we have no clear definition of a "work". Shall we have a poll? John Vandenberg (chat) 09:28, 22 May 2008 (UTC)
I don't think there are many old pages not using the subpage format; most of those were corrected a long time ago. —{admin} Pathoschild 03:03:54, 22 May 2008 (UTC)
"Corrected" is your POV" Eclecticology 08:44, 22 May 2008 (UTC)
Indeed, mine and that of the various community members who agreed to enable subpage syntax on Wikisource and to implement a subpage convention, and who then manually renamed pages to conform to the standard.
There are several reasons for the subpage convention we use, which the approach you advocate blocks. A few off the top of my head:
  1. The slash-delimited hierarchy is part of the Internet-wide URL standard, so that the convention gives us highly readable, logical, and standard URLs. For example, "en.wikisource.org/wiki/United_States_Code/Title_31/Chapter_69/Section_13" is a much better URL than "en.wikisource.org/wiki/Section_13_%28USC_31/69/13%29".
  2. As such, MediaWiki explicitly supports a subpage hierarchy like the one we use. For example, [[../]] on any page links to its parent page, which allows us to use location-independent links. We could rename The Time Machine without needing to make a single link correction to its pages.
  3. The resulting URLs are highly stable, since they are based on objective criteria. The page names for The Time Machine will not change in the future based on differing preferences or conditions.
  4. The convention makes it easy to find content without searching, so that I know I can find chapter I of The Time Machine at "The Time Machine/Chapter I".
  5. The simple standardized hierarchical structure makes Wikisource very easy to contribute to, because it is obvious how pages should be named and organized. (Where should I put Chapter 2 of Work? At "Work/Chapter 2".)
  6. The convention makes Wikisource very easy to parse for scripts, which has a whole range of benefits including generating statistics, extracting content automatically, performing services such as printing all pages of the given work, et cetera.
  7. The structure makes it very easy to manipulate and duplicate content at will. For example, we could very easily create composite encyclopedia pages that are updated automatically when their source is edited, using simple transclusion.
{admin} Pathoschild 02:27:12, 23 May 2008 (UTC)
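Points 2 and 6 above can be illustrated with a minimal Python sketch. It is not MediaWiki's own implementation; the function names `split_work` and `resolve_relative` are hypothetical, and it assumes that the first path segment of a title is always the work's name (a work whose own title contains a slash would need special handling).

```python
def split_work(title):
    """Split a page title into (work name, list of subpage segments)."""
    work, *rest = title.split("/")
    return work, rest


def resolve_relative(current, link):
    """Resolve a subpage-style relative link against the current page title.

    Simplified model: "/Name" points to a subpage of the current page,
    "../" climbs one level, anything else is treated as an absolute title.
    """
    if link.startswith("/"):
        return current + "/" + link.strip("/")
    if not link.startswith("../"):
        return link
    parts = current.split("/")
    rest = link
    while rest.startswith("../"):
        if len(parts) <= 1:
            raise ValueError("no parent page for " + current)
        parts.pop()          # climb one level in the hierarchy
        rest = rest[3:]
    rest = rest.strip("/")
    if rest:
        parts.append(rest)   # e.g. "../Chapter II" names a sibling page
    return "/".join(parts)


# The same relative link keeps working after the work is renamed.
print(resolve_relative("The Time Machine/Chapter I", "../"))
print(resolve_relative("Time Machine, The/Chapter I", "../"))
print(split_work("United States Code/Title 31/Chapter 69/Section 13"))
```

Because the links are resolved against whatever the current title happens to be, renaming the top-level page does not invalidate them, which is the location-independence described in point 2.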
 :P--T. Mazzei 04:17, 23 May 2008 (UTC)
Done. —{admin} Pathoschild 06:15:22, 23 May 2008 (UTC)
  • It appears that the merger was only done selectively. Eclecticology 18:12, 30 May 2008 (UTC)

Copyright on new Islamic contributions

We have around 300 poor-quality new page contributions from Mohammed 2010 (talk · contribs) and 84.36.148.242 (talk · contribs), and they are all waiting to be patrolled. Are these worth keeping? Can anyone say with 100% assurance that these are not under copyright? John Vandenberg (chat) 08:05, 20 May 2008 (UTC)

They're significant texts and certainly "worth keeping". A quick glance suggests most of them are centuries old, so the text itself is certainly not in copyright, although we need to hunt down translator information. M2010 has not responded to talk page comments, and does not have an eMail address listed. Sherurcij Collaboration of the Week: Author:Percival Lowell 08:22, 20 May 2008 (UTC)
These seem to be parts of larger works; if so, they need to be organized into subpages. Pathosbot can add the header and navigation (if we know the navigation). —{admin} Pathoschild 08:46:16, 20 May 2008 (UTC)
"Pages that link here" leads to the parent page; it simply seems to be a case of M2010 not realising/remembering the [[/ notation. Sherurcij 18:52, 20 May 2008
Sunan Abi Da'ud is w:Sunan Abi Dawood.
It looks like this is the source, or is a copy of the source. It makes no mention of a translator. Without a translator, these pages will all need to be deleted. John Vandenberg (chat) 10:33, 20 May 2008 (UTC)
Without a translator, they'll need to be tagged with the proper tag for a number of months while we wait for someone to identify the translator/publisher - seems to be the customary route. I don't think they're any more dangerous than the Confession of Saint Patrick which is similarly tagged. I imagine finding a way to get m2010 to acknowledge us would be the most efficient route. Sherurcij Collaboration of the Week: Author:Percival Lowell 11:13, 20 May 2008 (UTC)
I agree that we should avoid hasty deletions before there is opportunity for a proper investigation, especially when there is no copyright dispute about the original language version. That being said for the general case, I would not be optimistic for this particular batch of texts.
The pattern of omitted paragraphs in Purification (Kitab Al-Taharah) supports John's contention about the source. But this suggests not only a copyvio, but a biased selection of paragraphs. The translation appears to be a modern one by Prof. Ahmad Hasan. The introduction suggests that he supports a non-commercial licence. Perhaps someone more interested in this kind of thing can take it further; he could be offered a convincing argument that a completely free licence makes commercial exploitation less profitable. Eclecticology 19:59, 20 May 2008 (UTC)
  • Praise belongs to God, Whose Glory lies beyond, appears to be a 1984 translation by "Ali-Ibne-Abu Talib, Mohammad Askari Jafery" - though he does refer to them being translated earlier, so I'm not certain he is the translator himself - just that he published this translation.

New Jersey Plan

Hello, I am currently working on the article on the Philadelphia Convention in the French-speaking Wikipedia. I would like to link the text of the New Jersey Plan on that article, but I have doubts about its content. I have not found the first part of the text, before the articles, anywhere else; and the article does not say where the text was taken from. Can there be a mistake? --patapiou 86.212.86.207 17:27, 20 May 2008 (UTC)

The various “whereas…” clauses do not appear in any of the versions of the New Jersey Plan printed in Farrand’s The Records of the Federal Convention of 1787, which is generally taken as a canonical reference point for documentation of the Philadelphia Convention. It looks to me as if they are transcriptions of some remarks Patterson delivered in support of the Plan on June 16, 1787, the day after it was introduced before the Convention. Compare the “whereas…” clauses with this excerpt from Madison’s notes of June 16, 1787:

Mr. Patterson. said 〈as〉 he had on a former occasion given his sentiments on the plan proposed by Mr. R. he would now avoiding repetition as much as possible give his reasons in favor of that proposed by himself. He preferred it because it accorded 1. with the powers of the Convention. 2 with the sentiments of the people. If the confederacy was radically wrong, let us return to our States, and obtain larger powers, not assume them of ourselves. I came here not to speak my own sentiments, but 〈the sentiments of〉 those who sent me. Our object is not such a Governmt. as may be best in itself, but such a one as our Constituents have authorized us to prepare, and as they will approve. If we argue the matter on the supposition that no Confederacy at present exists, it can not be denied that all the States stand on the footing of equal sovereignty. All therefore must concur before any can be bound. If a proportional representation be right, why do we not vote so here? If we argue on the fact that a federal compact actually exists, and consult the articles of it we still find an equal Sovereignty to be the basis of it. He reads the 5th. art: of Confederation giving each State a vote — & the 13th. declaring that no alteration shall be made without unanimous consent. This is the nature of all treaties. What is unanimously done, must be unanimously undone. It was observed (by Mr. Wilson) that the larger State gave up the point, not because it was right, but because the circumstances of the moment urged the concession. Be it so. Are they for that reason at liberty to take it back. Can the donor resume his gift Without the consent of the donee. This doctrine may be convenient, but it is a doctrine that will sacrifice the lesser States. The large States acceded readily to the confederacy. It was the small ones that came in reluctantly and slowly. N. Jersey & Maryland were the two last, the former objecting to the want of power in Congress over trade: both of them to the want of power to appropriate the vacant territory to the benefit of the whole. If the sovereignty of the States is to be maintained, the Representatives must be drawn immediately from the States, not from the people: and we have no power to vary the idea of equal sovereignty. The only expedient that will cure the difficulty, is that of throwing the States into Hotchpot. To say that this is impracticable, will not make it so. Let it be tried, and we shall see whether the Citizens of Massts. Pena. & Va. accede to it. It will be objected that Coercion will be impracticable. But will it be more so in one plan than the other? Its efficacy will depend on the quantum of power collected, not on its being drawn from the States, or from the individuals; and according to his plan it may be exerted on individuals as well as according that of Mr. R. a distinct executive & Judiciary also were equally provided by this plan. It is urged that two branches in the Legislature are necessary. Why? for the purpose of a check. But the reason of the precaution is not applicable to this case. Within a particular State, when party heats prevail, such a check may be necessary. In such a body as Congress it is less necessary, and besides, the delegations of the different States are checks on each other. Do the people at large complain of Congs.? No: what they wish is that Congs. may have more power. If the power now proposed be not eno’. the people hereafter will make additions to it. With proper powers Congs. will act with more energy & wisdom than the proposed Natl. Legislature; being fewer in number, and more secreted & refined by the mode of election. The plan of Mr. R. will also be enormously expensive. Allowing Georgia & Del. two representatives each in the popular branch the aggregate number of that branch will be 180. Add to it half as many for the other branch and you have 270. members coming once at least a year from the most distant parts as well as the most central parts of the republic. In the present deranged State of our finances can so expensive a system be seriously thought of? By enlarging the powers of Congs. the greatest part of this expense will be saved, and all purposes will be answered. At least a trial ought to be made.

Unfortunately, this is a portion of The Records of the Federal Convention of 1787 that we haven’t yet transcribed, but the quoted text appears in Volume 1, pp. 250–52 (and can be seen online here). Volume 3, Appendix E (also available online) contains Farrand’s notes about the various versions of the New Jersey Plan that were floating around during and after the Convention—apparently the records are not all in agreement as to the precise language of the Plan, but I have not seen any that includes the “whereas…” clauses. Tarmstro99 18:38, 20 May 2008 (UTC)
Thanks for answering so quickly. I can only agree with you. As a consequence, shouldn't this part of the text be removed? By the way, the work you are doing on The Records... is great, thanks. --patapiou 86.212.80.100 12:21, 21 May 2008 (UTC)

Help

Hi. Can someone fix the name of this page? I can't because I'm a new user. I forgot the i in William. Wlliam the Silent's Apology Red4tribe 13:12, 23 May 2008 (UTC)

Done. See William the Silent's Apology. I am not sure if this text meets Wikisource rules for inclusion. See WS:WWI. Yann 13:16, 23 May 2008 (UTC)
It was in 1580, so that's before 1923. Should I add a source to where I got it from? Red4tribe 13:19, 23 May 2008 (UTC)
Yes, and the name of the translator, and the date of the translation. Yann 13:21, 23 May 2008 (UTC)
Ok thanks. Also, there is still one other problem. On William the Silent's Apology up top it says written by William the Silent, but for some reason it says the page does not exist. Do you know why? Red4tribe 13:24, 23 May 2008 (UTC)
Because it didn't exist. ;o) But I have just created it. Yann 13:27, 23 May 2008 (UTC)
I added in the translation. But why are there two pages on him now? Are there supposed to be? Red4tribe 13:42, 23 May 2008 (UTC)

Collector's works of Robert W. Chambers

Hi, I am new to Wikisource. Hopefully someone can assist me. I have an original book of short stories titled 'The Mystery of Choice', written by Robert W. Chambers, copyright 1897. Despite its age, the book is in good condition. I would like to know if it is a collector's item. If so, where can I go to get information? unsigned comment by Bukwurm (talk) 11:06, 24 May 2008.

I don't know the worth of it, but we can help you estimate that, and perhaps also predict its future value. This edition is listed on WorldCat as OCLC 2077228 (all editions); there are other editions, including a microfilm edition of the 1897 edition, and it appears to be well held in libraries (e.g. New York).
Also, I have just now uploaded Index:The Mystery of Choice - Chambers.djvu, which is a digital edition from the University of California Libraries that should be exactly the same as your 1897 edition. In my opinion, the value of all rare books will slowly go down over time as digital editions are placed online. John Vandenberg (chat) 02:16, 24 May 2008 (UTC)
As a person who buys a lot of books on eBay I often use abebooks.com for a reality check to make sure that I don't let my bids get out of hand. For The Mystery of Choice, not counting the more recent reprints, I found 16 copies available. The prices ranged between $35 and $314, with a median price of $195. Like many other collectibles, good quality rare books should hold their value, and move up or down with the economy. Prices may drop somewhat in poor economic times since non-necessities are the first things to be given up. Condition will be a big factor in the book's value. Eclecticology 07:34, 30 May 2008 (UTC)

Microsoft will shut down Live Search Books next week

See http://archiv.twoday.net/stories/4946163/ Where MSN digitized public domain content in cooperation with the Open Content Alliance, download links to the Internet Archive (IA) were given in MSN Live Search Books. After the end of Live Search, the books will still be downloadable at the IA, but there will be no full-text search.

There were no download links for the proprietary MSN public domain content, especially the books from the Cornell library cooperation. If Cornell doesn't make these books available, this content will disappear next week. It is possible to circumvent the MSN digital rights management with simple means. I recommend doing so, to save as many books as possible for the public domain. --Histo 11:45, 24 May 2008 (UTC)

Wow, this is important. Perhaps it would be worth mentioning this on the (barely used) mailing list. ThomasV 13:08, 24 May 2008 (UTC)
MSN announced two days ago that they were withdrawing their funding of the Internet Archive scanning projects - you can read the "upbeat" announcement by my employer here. Sherurcij Collaboration of the Week: Author:Percival Lowell 19:34, 24 May 2008 (UTC)
I'm a bit confused; as far as I can see, there is nothing to "save" if it was archived at IA. Histo's announcement suggests that there are MSN digitised books that were not archived on IA, and these need to be "saved".
IA says that MSN "donated" 300,000 books; the MS press release says that they archived 750,000 books. Does that mean there are 450,000 books that need to be saved?
Do we know how to find these, in order to save them? When will the MSN site go offline? John Vandenberg (chat) 00:54, 25 May 2008 (UTC)
Can't help you with your current problem, but you'd asked on Commons a few months ago about using the DJVU files with the Microsoft watermark - so I threw w:Brewster Kahle an eMail following this last announcement asking if we'd be removing the watermarks, and he confirmed that the DJVUs will be reprocessed without the watermark - so that even existing books will no longer bear the copyrighted logo...so that's good news. Sherurcij Collaboration of the Week: Author:Percival Lowell 06:43, 25 May 2008 (UTC)
That is great news. Thinking out loud.. I think it would be beneficial if the reprocessed books have the same number of pages, and perhaps the prior notice is replaced with a new notice to indicate that there are no further restrictions. John Vandenberg (chat) 12:45, 25 May 2008 (UTC)
I'll toss him a follow-up suggesting it if you can give me the reasoning why it's important to keep the same number of pages. I don't know enough about it all to know why I should be pitching it to him. Sherurcij Collaboration of the Week: Author:Percival Lowell 00:57, 26 May 2008 (UTC)
My reasoning is so that what was on the 100th page is still on the 100th page. If the pages move around, our editions become out of sync. If we upload the new djvu, we will either need to manually add blank pages in, or move all of our pages of text to suit. We probably don't have many MSN-contributed djvu files, so it isn't a big deal. John Vandenberg (chat) 02:16, 26 May 2008 (UTC)
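The synchronisation problem described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `remap_pages` is hypothetical, and it assumes we know which page numbers of the old file (for example a leading notice page) are dropped in the reprocessed file.

```python
def remap_pages(page_count, removed):
    """Map old 1-based page numbers to their new numbers after some pages are dropped.

    page_count: number of pages in the old file (assumed known).
    removed: old page numbers missing from the reprocessed file (assumed known).
    Removed pages do not appear in the returned mapping.
    """
    removed_set = set(removed)
    mapping = {}
    new_number = 0
    for old_number in range(1, page_count + 1):
        if old_number in removed_set:
            continue
        new_number += 1
        mapping[old_number] = new_number
    return mapping


# If only the first page of a 200-page file is dropped, every later page
# shifts down by one, so transcribed text for old page 100 would have to move.
mapping = remap_pages(200, [1])
print(mapping[100])
```

Either every page of text is moved according to such a mapping, or blank pages are re-inserted at the removed positions so that the mapping becomes the identity; keeping the page count unchanged avoids both.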

The number of books to be saved seems to be 75,000. These are the Cornell cooperation books. But Cornell has agreed to donate them to IA and is considering its own repository. Thus it would be best to save only those books that one needs and for which one cannot wait for these solutions. --Histo 15:22, 25 May 2008 (UTC)

The speed with which Archive.org churns out books (currently 1000 a day) is going to be diminished by a fair margin as several U.S. centres face extinction because of this - though as the sole Canadian centre, Toronto will remain open and operating at full capacity. Sherurcij Collaboration of the Week: Author:John Masefield 17:23, 27 May 2008 (UTC)

Translation of the Encyclopédie

Hello, Is this license acceptable here? [8] Thanks, Yann 16:14, 25 May 2008 (UTC)

I'm not a lawyer nor otherwise a copyright expert but by my reading of that it wouldn't be eligible for inclusion in Wikisource. This seems to be the pertinent part:
Articles are posted to the site monthly, with attribution, but without compensation. They are freely available to all users for personal or course use without cost or copyright restrictions. Translators retain copyright over their work for any other purposes, such as reproduction in print or digital media.
To me, that states that free use is only permitted to other users of that web site for non-commercial use and outside of such uses the work will be a mixture of each individual author retaining standard rights.
However, this appears to me to create a complex and encumbering situation - anyone who ever wanted to use that work, even by paying the authors royalties, would have to go and negotiate rights with every single author beforehand. It might be worth pointing that out to them and suggesting that they use a more comprehensive license, one that would be compatible with Wikisource. ;^) Point them to the Creative Commons website (non-Wikimedia), which is all standardized, legally reviewed stuff accompanied by plain-English explanations.
(Many of those licenses would be compatible with Wikisource, but I believe the sticking point would be that, like on Wikimedia Commons, they would have to choose a CC license that permits commercial use and derivative works. Am I right about that, guys?) --❨Ṩtruthious ℬandersnatch❩ 18:52, 25 May 2008 (UTC)
That site's statement appears self-contradictory. I think that a more pro-active approach to copyright would be more suitable for us. We could be doing our own as a wiki, and inviting them to join us....for the same rate of pay. Machine translations are not copyrightable because they are mechanical and lack originality. They are also bloody awful. Taking an article from the Encyclopédie, starting the translation mechanically, then editing it into sensibility would be a perfectly wiki way of doing things. Eclecticology 03:51, 30 May 2008 (UTC)
I wrote to them asking if a CC-BY-SA license would be acceptable for these translations. Yann 10:35, 30 May 2008 (UTC)

Want to start this book

The Catholic Dogma: Extra Ecclesiam Nullus Omnino Salvatur http://www.traditionalcatholic.net/Tradition/Information/The_Catholic_Dogma

How do I start a bot to carry out the copy-paste? It's out of copyright (published in the late 19th century). Agonzaga 17:02, 26 May 2008 (UTC)

With only five chapters, it's probably easier to do by hand than to configure a bot. Create The Catholic Dogma: Extra Ecclesiam Nullus Omnino Salvatur and then include links like [[/Chapter I|Chapter I]] for the five chapters, and create them. Don't forget to list it at Wikisource:Roman Catholicism when you're done! Welcome to Wikisource :) Sherurcij Collaboration of the Week: Author:Percival Lowell 18:16, 26 May 2008 (UTC)
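For anyone who would rather generate the link list than type it, here is a small Python sketch. It is only an illustration: the function name `chapter_links` is hypothetical, the chapters are assumed to be named "Chapter I" through "Chapter V", and the leading "*" bullet is an assumed layout choice, not a requirement.

```python
def chapter_links(numerals):
    """Build one relative subpage link per chapter, in the [[/Chapter I|Chapter I]] form."""
    return "\n".join(
        "* [[/Chapter {0}|Chapter {0}]]".format(numeral) for numeral in numerals
    )


# Assumed chapter names for the five chapters of the work.
print(chapter_links(["I", "II", "III", "IV", "V"]))
```

The printed lines can be pasted into the top-level page; each link then points at a subpage of whatever that page is called.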