Wikisource:Scriptorium/Archives/2003-09

Warning: Please do not post any new comments on this page. This is a discussion archive first created in September 2003, although the comments contained were likely posted before and after this date. See current discussion or the archives index.

Note: This archive was originally copied from ps.wikipedia.org. These are the very earliest Wikisource discussions from the first days of the project!

The contents of this archive have been reformatted for easier reading, though the text itself has not been changed. To see the original format, go to Talk:Main Page/archive 0 at wikisource.org, or look at a previous version of this page. Please do not edit this page any further!

Wikisource is up

Cool! This got set up.

Actually, "ps" must be a language code for some obscure (to me) language... We've added a bunch of those lately. --Larry Sanger
In an unusual fit of synchronicity, ps is the language code for Pashto, the language of the Pashtuns, so in the news as the Muslim Afghans from whom the Taliban derive their popular support. --TheCunctator

Pi

Keep it on the lowdown... but I can get you pi to 1.24 trillion places. I didn't tell you... Hfastedge

But in SI terms that will be 1.24 terabytes.
On another note, why do we want a link to Pi to 1,000,000 places, when that article doesn't exist? Keep in mind that, if it existed, it would be a hundred times as large as Pi to 10,000 places. -- Toby Bartels
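
The sizes quoted in this thread can be checked with a little arithmetic. The sketch below assumes one byte per decimal digit (plain ASCII text, uncompressed), which is an assumption and not something stated in the discussion; the variable names are illustrative only.

```python
# Rough size check, assuming one byte per decimal digit (uncompressed ASCII).
DIGITS_OFFERED = 1_240_000_000_000  # 1.24 trillion places, as quoted above
BYTES_PER_DIGIT = 1                 # assumption, not stated in the thread

size_bytes = DIGITS_OFFERED * BYTES_PER_DIGIT
size_terabytes = size_bytes / 10**12  # an SI terabyte is 10^12 bytes

# How much larger a 1,000,000-place page is than a 10,000-place page.
ratio = 1_000_000 // 10_000

print(f"{size_terabytes} TB; ratio {ratio}x")
```

Under that assumption the 1.24 trillion places do come to 1.24 SI terabytes, and the million-place page is a hundred times the size of the 10,000-place one.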

Scope

So I take it this is for .. what? Any public domain text/information not found elsewhere? -- Sam

It's for archival copies of public domain (or GFDL) documents that shouldn't be edited in the wiki manner but might well be linked to from a Wikipedia article. For example, the Constitution of the United States. It's not really being used, however -- and still has Phase I software! -- Toby
So anything in the public domain? Even things that are already online? I suppose that's a stupid question... I definitely see the use: having a reliable source that we can do things like correct typos and add our own notes to, for example. -- Sam

Name and subdomain

I think the name should be changed and it should have its own domain name, as it is a sister project to Wikipedia and Wiktionary. Maybe it could be called {languagecode}.wikisource.org -fonzy

I agree with that. -- Toby
Yes, I think that's a good name. Yann from http://fr.wikipedia.org/
Much better name - anon
Hm. Wikisource. I kinda like that - although I'm not thrilled enough to register it myself. It is available for 20 bucks US here. Besides I bought wiktionary.org already - it is now somebody else's turn. We need to bug Brion about upgrading this site but we should have a URL for him to work with (but then again I see nothing wrong with http://wikisource.wikipedia.org for a short-term solution). --maveric149
Yes, great name. Let's get Brion to set up the temporary address so that some of us can start this sister project properly.
If I had the money I would buy the domain. - anon

How about Sourcewiki? It breaks up the monotony of the wikithis, wikithat... Stevertigo

Eureka! How about Wiklibrary.org, Wikilib.org, Libwiki.org or even Bibliowiki? (all based on Wiki + library). Any of these is a far clearer name than anything with "source" in it. Wikisource and Sourcewiki both give the impression that they are the "Source of the Wiki" or otherwise deal with Wiki.--maveric149
Hmmm, to me it sounds like a site with loads of books, where I can rent them or something. -fonzy
Maybe Wikisourcedata.org. A bit long, though, but it says to me source data that includes speeches, conversion tables and more. -fonzy
But we do want to have loads of books here. That is why the above suggestions make more sense. --mav
OK then, erm, I vote for wikilib.org -fonzy
Sourcewick.org, Sourcerary.org, Sickywarts, WikiSourcerary.org, er... too much wiki, not enough sexi. - Stevertigo
I still prefer Wikisource.org
I second mav. The meaning of wikilib.org is much easier to recognize.

In light of this, let me amend my agreement with fonzy above to say: {language code}.{primary name}.org. But I won't get involved in debates over the primary name. -- Toby Bartels

I'm sure I don't belong in this discussion, if it's still happening, but ... I get the feeling a "wiki" is something freely editable for anybody. Clearly, these sources shouldn't be, since they are someone's original text. So, as unwieldy as it is, Project Sourceberg might be better than wikilib, which otherwise would be a much better word. Atorpen
I like "Wikisource.org" - Wooloomooloo

Just thought I'd weigh in on this whole thing...

First off, I agree with maveric149: the name of the site should definitely include "library" or something similar.

Secondly, as w:User:Atorpen said, the "anyone-can-edit" nature of Wiki is probably not the best option for something that is going to be considered, primarily, a library. Once a work on the site is considered "finished", it could be potentially damaging if someone modified it into a form that the work did not originally have.

My solution to this would be to have a change in the code where a page can be "locked" once it has passed review by enough people. An entry starts out editable, so copy editing/layout/etc. can be done by anyone interested. Once it has hit a point where no modifications have been made for a short time and/or a few editors have posted an "OK" on the Talk page, it is set to a "locked" status. If there is later found to be a problem, an appropriate Talk page could be set up to allow discussion of whether or not unlocking is needed to allow a revision.
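
The locking workflow proposed above could be sketched roughly as follows. This is only an illustration: the class name `WorkPage`, its methods, and the threshold `REQUIRED_APPROVALS` are hypothetical (the proposal says only "a few editors"), the quiet-period condition is left out for brevity, and nothing here reflects how the wiki software actually works.

```python
from dataclasses import dataclass, field

# Assumption: the proposal says "a few editors"; 3 is an arbitrary stand-in.
REQUIRED_APPROVALS = 3


@dataclass
class WorkPage:
    """Hypothetical model of a source text that locks once reviewed."""
    title: str
    text: str
    locked: bool = False
    approvals: set = field(default_factory=set)

    def edit(self, new_text: str) -> None:
        # Locked pages reject edits; changes must first be discussed.
        if self.locked:
            raise PermissionError(f"{self.title!r} is locked; discuss on its Talk page")
        self.text = new_text
        # Assumption: any change invalidates earlier sign-offs.
        self.approvals.clear()

    def approve(self, editor: str) -> None:
        # Each editor counts once; enough distinct "OK"s lock the page.
        self.approvals.add(editor)
        if len(self.approvals) >= REQUIRED_APPROVALS:
            self.locked = True

    def unlock(self) -> None:
        # Called after Talk-page discussion agrees a revision is needed.
        self.locked = False
        self.approvals.clear()


page = WorkPage("Constitution of the United States", "We the Peopel...")
page.edit("We the People...")      # copy editing while still open
for editor in ("A", "B", "C"):
    page.approve(editor)           # third approval locks the page
```

Clearing approvals on every edit is one possible design choice; it means a page can only lock in the exact form the reviewers saw.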

Third, there are many forms that the works can come in, and I personally would like to see conversions of works into all of the appropriate forms. Perhaps having each page on a work list a brief summary of the work, and then have links to all the different versions available would be the best way to deal with this. Additionally, this would have the advantage of allowing different revisions of a work to all be included under one main page - for example, the earlier versions of The Hobbit had a different tale of Bilbo's finding of The Ring - it would be very cool to be able to see both versions available for download on one page (once they are out of copyright, naturally) for those who are interested in the revisions.

I think a good list of filetypes in which all works should (ideally) be made available is as follows:

  • The two basic forms that should be supported are: 80-column wrapped Plain Text (.txt), and Rich Text Format (.rtf) - these two filetypes will be readable on 99.9% of computers. The former is universally cross-platform (although perhaps different MS-DOS, UNIX and ANSI versions would need to be provided), and the latter allows most of the simple formatting needed to make it look nice for printing.
  • Two advanced forms that should also be included are: Adobe Acrobat (.pdf) and HTML (both single- and multi-page versions). HTML is especially good here, as it can be used to allow browsing the book directly from the Wiki-Library website, with no need to download. PDF is, I think, a very good option for the distribution of works formatted as closely as possible to the originals (you can include pictures on the pages, full formatting features, transparency support) and for added functionality in the work (hyperlinked indexes, etc.)
  • One format I would avoid: MS Word Document (.doc) is OK, but probably redundant as .rtf will provide all the same functionality needed for most of these files, and .pdf is better than .doc for the ones that are too complex for .rtf. Additionally, MS Word is not universally cross-platform, and .doc files can contain macro viruses.
  • One more filetype to consider including is the compressed archive: ZIP is probably the best option, as it is almost universally known by internet users, and is supported on the most platforms. I would suggest providing ZIPped versions of all of the above files to decrease bandwidth expenses for the site - it would require more storage to have works available in both formats, but in my experience storage space is cheaper than bandwidth.
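
As a rough illustration of two of the formats in the list above, the sketch below wraps a text to 80 columns and then packs it into a ZIP archive, using only the Python standard library. The helper names `make_plain_text` and `zip_bytes` and the sample text are hypothetical, and the size saving will vary with the text being compressed.

```python
import io
import textwrap
import zipfile


def make_plain_text(text: str, width: int = 80) -> str:
    """Wrap each blank-line-separated paragraph to at most `width` columns."""
    paragraphs = text.split("\n\n")
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


def zip_bytes(filename: str, data: bytes) -> bytes:
    """Return a ZIP archive (as bytes) holding one deflate-compressed file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(filename, data)
    return buffer.getvalue()


# Hypothetical sample: a repeated sentence standing in for a long work.
sample = "We the People of the United States, in Order to form a more perfect Union. " * 200
wrapped = make_plain_text(sample)
zipped = zip_bytes("sample.txt", wrapped.encode("utf-8"))

print(len(wrapped.encode("utf-8")), "bytes as .txt;", len(zipped), "bytes zipped")
```

Serving the zipped copy trades extra storage for lower bandwidth, which is the trade-off described in the last bullet.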

Another thing to consider for this site is the inclusion of non-text works. There are many pictures, audio, and video that could be included to enrich the site. Many important works, from 16th century woodcuts to radio broadcasts, could be included.

The one danger of this, however, is the possibility of the storage & bandwidth requirements of the site skyrocketing. Also, careful control would have to be exercised to keep the site from becoming short-term storage for warez d00dz and other pirates.

If a good control structure could be put in place, I believe it would definitely be worth it to include these types of works - as far as I know, there is currently no site that is doing this, and I believe it is a niche that needs to be filled.

Another question that needs to be asked, is what types of freely-available works are going to be included on the site?

Works that are old enough to have fallen out of copyright are naturally expected, as are newer works that the owner has prematurely released from copyright. However, what about still-copyrighted works that are freely (or near-freely) available? For example, works under an LGPL/copyleft/etc. style of "license"? What about works that the author has released for free to the internet for distribution, but maintains restrictions on publication? (Such works could be included on this site with author's permission.) What about authors who are willing to have all, or even just part of, their works available on this site but prohibit them from being included on other sites? These are important subjects to consider now, before they come up later and a policy has to be decided on-the-fly as to what to include.

The last thing I will mention is that a standard method should be devised for labelling any copyright information attached to a work, as well as by what means this site is permitted to post it. (i.e. "Public Domain", "Author's Permission", etc.) Additionally, such a notice should include information as to whether others can repost the materials on their own sites, and if permission is needed, who/where to contact for that permission.
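
One way such a standard label could be represented is sketched below. The record type `CopyrightNotice`, its field names, and the rendered wording are all hypothetical; they simply cover the three pieces of information asked for above (basis for posting, whether reposting is allowed, and whom to contact).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CopyrightNotice:
    """Hypothetical per-work label; field names are illustrative only."""
    basis: str                                # e.g. "Public Domain", "Author's Permission"
    may_repost: bool                          # may others repost on their own sites?
    permission_contact: Optional[str] = None  # who/where to ask, if permission is needed

    def render(self) -> str:
        """Return the notice as plain text for display on the work's page."""
        lines = [f"Posted here under: {self.basis}"]
        if self.may_repost:
            lines.append("Reposting elsewhere: permitted")
        elif self.permission_contact:
            lines.append(f"Reposting elsewhere: ask {self.permission_contact}")
        else:
            lines.append("Reposting elsewhere: not permitted")
        return "\n".join(lines)


public_domain = CopyrightNotice("Public Domain", may_repost=True)
by_permission = CopyrightNotice("Author's Permission", may_repost=False,
                                permission_contact="the author")
print(public_domain.render())
print(by_permission.render())
```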

-- The Rizz

Many of The Rizz's concerns above are about preempting problems that may -- or may not -- appear in the future. There's nothing wrong with planning ahead by brainstorming ideas, but we shouldn't start implementing any of these restrictions until we need them. On wikiwiki editing in particular, I oppose any locking mechanism until we see a need for it, since an introductory paragraph should continue to be edited. If people start editing the text itself, then we can revisit this.
Everybody here should be familiar with Wikipedia, the parent project of this site, and its experiences -- warez d00dz, for example, have been dealt with there. This project will almost certainly be moved to that software eventually too. As for copyleft material, anything compatible with the GNU FDL can be placed here -- possibly more if we work things carefully. You can find recent discussion on the prospect of moving "Phase I wikis" (like this) to "Phase III" (the current Wikipedia software) on the intlwiki-L and textbook-L Wikipedia mailing lists. -- Toby Bartels (2003 July 10)
It should also be noted that the newly-created Wikibooks project is probably going to also act as a public domain text repository as part of its mission (we want to annotate those texts as well). See our temp url at http://textbook.wikipedia.org/ --maveric149
Given recent discussions on intlwiki-L, it seems likely that this will be moved to http://wikisource.org/ (or maybe http://wikisource.wikimedia.org/), with Phase III software.
Both Wikibooks and Wikipedia will be able to quote from Wikisource for their own purposes; possibly automatically with special software, but this hasn't been written yet. -- Toby Bartels (2003 September 2)