Wikisource talk:What Wikisource includes

Please note that this discussion began at the Proposed deletions talk page. Earlier comments can be found there.

Math, various data, computer code, etc.

Finding the right project

Hi. As I wrote earlier, I will be away until the end of the week. Here are a few thoughts on the topic of what Wikisource includes, or should or shouldn't include. It seems to me that for the most part it won't be hard to agree on a pretty wide consensus, and once we do things will be clearer.

To understand how we got here, remember that this project started at ps.wikipedia.org. In other words, "Project Sourceberg" was initially thought of as a repository not exactly of "source texts" but of "sources" for wikipedia articles. If someone read an article on Mark Twain that mentions something from chapter ten of Huckleberry Finn, he could click on a link to the actual chapter and find it at Project Sourceberg.

The same thing was true of math. Someone could read an article on pi at Wikipedia, and "pi to 10000 places" would be stored for reference at ps.wikipedia.

It seems to me that this has changed. While our texts here at the "modern" Wikisource can certainly still serve Wikipedia and the other projects, we are no longer just a source of data for them. We are trying to build an independent library, and we have to decide what such a library should (or should not) include.

That is why I think the initial question of whether "data" like math, cryptography, etc. should be stored here is a legitimate one. I'm not against deleting it. I'm also not for deleting it. I'm simply not sure where we should draw the lines of our library. I just think we should all collectively think about it some more.

One thing I am against is unilaterally deleting something that did have a home here without first figuring out where it does best belong. I think we would all agree that lists of data should have a home somewhere, within a Wikimedia that supports giving "every single person... free access to the sum of all human knowledge". The question becomes stronger when it is recalled that until now, that place was here, and was agreed upon from the very beginning of the project. So before deleting, we should first figure out what that home should be.

It seems to me that data like this is arguably appropriate for any one of three projects: Wikipedia, which could legitimately include data related to its articles; Wikibooks, arguing that data is supporting material for a course of study in the subject at hand; Wikisource, arguing that in a library one would expect to be able to look up various kinds of data. I personally lean towards the last option, but in principle they are all OK. The main thing is - we have to decide which one, and it has to be a clear decision by whichever project saying: Yes, we agree in principle that this type of material belongs on our website, whichever one it is.

I also think we should get feedback from other Wikisource languages about this.

Source texts

A separate argument on principle might be that collections of data and computer source code are not "source texts" and that the latter are actually "original." This also requires some history.

There were two reasons why Project Sourceberg was defined as "source texts":

  • So that people wouldn't write their own novels and post them here.
  • In order to differentiate Wikisource from Wikibooks. The basis of the differentiation was that Wikibooks are "wiki-creations" - collaborations just like Wikipedia articles - while Wikisource stuff is unchanging texts that were produced at a certain point in time by one or more authors.

I propose that neither of these criteria applies here, and that furthermore we would do better by starting to define ourselves more as a "Library" like our new slogan says, and less as "a collection of source texts."

The first argument obviously doesn't apply to math data and computer code. Even computer code, while somewhat original, is just trying to supply examples of algorithms mentioned in Wikipedia articles. While a personal creation, it is not the same as writing your own novel.

The second argument also doesn't apply. The truth is that textbooks are part of a library, and so the whole of Wikibooks could be included within Wikisource. This won't happen in reality because the nature of the two things is so different: Writing your own textbooks from scratch in collaboration with others is an entirely different experience than organizing other people's books into a library. That's why I think these projects will always remain separate, even though the distinction between them is somewhat artificial. Nevertheless, it seems to me that the "data and source code" experience fits better into the Wikisource experience (archiving, classifying, editing) than the Wikibooks experience (creating instructional texts from scratch).

Conclusion

It seems to me that all the data and related stuff can legitimately be kept here or moved to a different project (if that project agrees). When all is said and done, it doesn't really matter which project it is kept on; it just isn't that important. But it also shouldn't just be lost.

Regardless of whether these texts are kept here or put elsewhere, we really should start thinking of ourselves more in terms of "Free Library" and less in terms of technical "source texts." This includes:

  • How to add value through the wiki system to the texts that we host. (Doing so will also mute the frequently and recently asked question: "Why do we need Wikisource if there is already Project Gutenberg?")
  • Defining criteria for when even an original text is "good enough" to be included here (in the past, for instance, academically certified papers like a master's thesis were uploaded).

Hopefully I will be back at the end of the week. Dovi 08:40, 3 October 2005 (UTC)

Drawing a line

Thanks Dovi for your POV. That's a good start for a discussion.

My problem with source code and math tables is where to draw the line about what we accept and what we don't. For texts, the line is easy: a text has to be free and published somewhere else before (except for translations). For math tables and source code, it is very difficult to draw such a line. You can't put copyright on a math table (so essentially any math table is free) and IMO, math tables are useful only if they are difficult to calculate (i.e. there is no point storing a table of even numbers), so if they are available somewhere else or easily calculated, there is no use in adding them here. So math tables look more like new material to me, and they should be hosted on Wikibooks or Wikipedia if needed.

You don't read a table of prime numbers or the source code of the Linux kernel like you read a text from Shakespeare. Math tables and source code are not useful in themselves, but as a base for learning, further programming or calculations.

Also I don't believe that Wikimedia should host everything which can be found on the Internet. There are formats that are much more useful for some data than a wiki, and math tables and source code are in this category. Should we host all source code which is free? The Linux kernel? etc. You see my point. So what is interesting is short programs explaining or demonstrating an algorithm. That looks more like teaching material, so it belongs on Wikibooks. Yann 18:22, 3 October 2005 (UTC)

Thanks for the feedback, that was interesting.
First of all, a basic question: Has Wikibooks actually agreed to take this stuff? If so, with what conditions? In other words, will they take all pages of this type, or will they say that some meet their needs as "instructional materials" but others don’t? (It is for this reason that I personally lean towards the library option, because if they belong in a library then all of them belong in a library!)
Secondly, the following is an alternative to deletion, that I would really like to get feedback on from others. (The idea is borrowed from the deletion reform debate going on at en.wikipedia, but I think it might actually serve our needs here even better than theirs.) The idea is based on the notion that Yann mentioned, namely that reading a math table (if it is read at all) is completely different than reading Shakespeare (even if you could find both of them in a library).
The idea is to remove all data, source code, etc., from the main namespace (better: to remove it from the main bookshelf reserved for "regular" texts) and instead keep it in its own special namespace: "Data:" namespace, "Source code:" namespace, etc. It would not be "counted" with the regular texts of normal books, nor listed amongst them. All such texts would be stored in a strictly separate space that clearly defines them for exactly what they are. You get the advantages of deletion through a process less radical than deletion. The texts are still there for those who want to use them or link to them from other projects, but are clearly demarked as a class unto themselves, and somewhat separate from the main project.
Would love to get feedback on the namespace idea. As to the question of whether everything needs to be on a wiki, I agree with Yann that it is not the most important thing in the world. Nevertheless, many times the value of things being on a wiki only becomes evident later on when projects go in a certain direction. Remember that in the beginning, many people said there was no value to having "source texts" on a wiki! Dovi 09:38, 6 October 2005 (UTC)
Hm, that's a good idea, Dovi. While I still agree with Yann, and would personally love to see it all removed from WS, if we could have a new namespace created for math tables (would we put source code and cryptography under those namespaces or create new ones?), then they would not be counted as articles, and I would be willing not to raise any more issues with their inclusion (as long as they don't raise any problems I'd have with a "regular" text). Also, after some thinking, physical libraries do have a reference section, and these new namespaces would sort of serve that purpose--people reference lists of constants (we'll still need to talk about constants and if we are going to include "Phi/pi/e to the Nth place") and source code, etc.
I'm assuming for this that "Data:" or whichever we decide would actually be a true namespace? If that's the case, can we have "Author:" created as a new namespace as well?—Zhaladshar (Talk) 14:07, 6 October 2005 (UTC)
Are you saying that "Author:" is still only a pseudo-namespace? I didn't know that. If so, we should get that fixed pronto! Let's ask around about how to request namespaces (maybe ThomasV knows, and actually Yann probably knows, too). Things like "Author:" and "Title:" should be requested immediately. That's far more important than 5,000 places of pi... :-)
As to the matter at hand, my initial hunch would be that each of the distinct things we have been discussing would get a distinct namespace: "Math:", "Cryptography:", "Source code:" etc. "Data:" for plain old lists of basically anything... Dovi 14:40, 6 October 2005 (UTC)
Yeah, "Author:" is just a pseudo-namespace. Having "Math:", "Cryptography:", etc. is fine with me. Now, we just have to get consensus and go about getting the namespaces created.—Zhaladshar (Talk) 14:59, 6 October 2005 (UTC)
I do not know if there is a specific procedure for requesting a namespace. I suppose you have to ask developers. But before that, I believe someone should install mediawiki on his/her computer, and play with it a little bit (configuring namespaces, etc), in order to get an idea of what it is possible to do with namespaces, and what has to be requested. ThomasV 15:12, 6 October 2005 (UTC)
I know nothing about php, but if no one else does it, I'll give it a try.—Zhaladshar (Talk) 15:44, 6 October 2005 (UTC)
See m:Help:Custom_namespaces. Since constants are language generic, maybe we should try to get these created on the main wikisource namespace instead of here. --CSN 22:09, 6 October 2005 (UTC)

Mathematics has a right to be in Wikisource

I just stumbled on this discussion (I normally hang out at English Wikipedia) and I must say I am appalled at the notion that a major field of human knowledge is proposed for deletion from Wikisource. To begin with, Wikisource:What_is_Wikisource? plainly states:

" What do we include?

Some things we include are:

  • 1. Source texts previously published by any author
  • 2. Translations of original texts
  • 3. Historical documents of national or international interest
  • 4. Mathematical data, formulas and tables
  • 5. Statistical source data (such as election results)
  • 6. Bibliographies of authors whose works are in Wikisource
  • 7. Source code (for computers) that is in the public domain or compatible with the GFDL

Contributions are not limited to this list, of course."

I believe this statement is a commitment, not just to Wikisource, but to the entire Wikimedia community, for which Wikisource is an important resource. If it is to be materially changed, it should receive community-wide attention and discussion, the same attention that, say, dropping minor languages from Wikipedia would receive.

Also, I do not see the point of creating namespaces for mathematical material. As I understand it, namespaces are already being used for different human languages. Much mathematics source material will have accompanying explanatory text, which would have to be categorized by language. Hopefully at some point important math texts and papers will begin to appear, mathematical writing going back thousands of years in many languages. What is wrong with the category mechanism?

The fact that the mathematics category is sketchily filled in at present is no excuse for deleting what little material has been submitted. On the contrary, it only discourages future contributions. If there is need for further clarification about what material is appropriate for the mathematics category, there should be an effort to involve contributors to the mathematics sections of other Wikipedia projects. From what I have seen, that discussion is premature at the moment. In any case the deletions that have eviscerated (cut-down) this category, should be reversed, pending a widely discussed and accepted policy change. --ArnoldReinhold 21:36, 6 October 2005 (UTC)

First off, about the comments in your first paragraph: this debate's been going on for months now. Only a select few want to take part in coming to a resolution. Others want to complain about any action or possible action, but not help in the actual discussion. This has been a community-wide discussion coming over from the main Wikisource.
I'm unsure about what you are saying when you say "namespaces are already being used for different human languages." I can say that that is not the case here. The only language to be found on this wiki is English. Note that there is also a difference between a mathematics table and a source. If a mathematics paper or selection from a mathematics journal ends up here, that's great! What the discussion centers around is a page that goes to the 100,000th place of pi or e. Or lists of prime factors. This is only what our discussion centers around.
About creating namespaces, is there any reason why you oppose them? As they are more reference material than articles, they should not be in the main namespace, as they should not be counted as articles Wikisource includes.—Zhaladshar (Talk) 23:15, 6 October 2005 (UTC)

Has any effort been made to involve the mathematics or cryptography communities on Wikipedia? Both have active portals. I suspect you would get a lot more interest. I am relieved to hear that mathematical texts are welcome, but I would note that the entire category of mathematics is currently marked for deletion.

Thanks for correcting me on the (non) use of namespaces for languages. I withdraw that objection. Here is my remaining concern about creation of namespaces for mathematical data. In Wikipedia, at least, namespaces are used exclusively for "behind the scenes" information used in constructing the encyclopedia: users, categories, images, talk, etc. All content sought by end users is in the main namespace. End users who are simply seeking information usually have no reason to know namespaces exist, much less learn their special syntax. I see no justification for adding such a burden to people looking for mathematical formulas or data. A school child should be able to go to wikisource and simply type in "Pi", not "Number:Pi."

I'd be happy to help in coming up with a solution, but I don't see what the problem is. Perhaps you could explain your concern about data tables "counting" as articles. Why shouldn't they? I would note that all 154 Sonnets of Shakespeare count as separate articles. What is the big deal? --ArnoldReinhold 12:59, 7 October 2005 (UTC)

Hi Arnold. First of all, you should be aware that this discussion already began earlier at Wikisource talk:Proposed deletions, so my comments above about the possible use of namespaces (providing the community agrees to it) are somewhat out of context. Actually, at that time I seem to have been nearly the only person to wonder whether this stuff should really be deleted (not opposing, just questioning), and thus the "namespace" proposal was actually an idea for a compromise, not an attempt on my part to get rid of mathematics! As Zhaladshar pointed out, the namespaces don't create any language problems. Nor is there really such a problem with "Number:Pi" - "Pi" can always be a redirect.
From the point of view of those who favour deletion, the advantage of namespaces has to do with the origins of the text - see Yann's comments above. Yann is concerned that math tables, for instance, are not the original creations of an author which he published in his own personal format in one particular edition, but something that is by definition not copyrightable and has no single printed "edition" that we are basing our text on. Since it is such a different kind of text from Charles Dickens, he wonders whether it should be here at all. A namespace would at least make it absolutely clear, from his point of view, that this is a completely different kind of text.
Zhaladshar, you wrote: "This has been a community-wide discussion coming over from the main Wikisource." I looked a little bit, and I'm embarrassed to say that it seems I was so focused on he: and on the language domains business that I didn't even pay any attention to this, and somehow missed the whole thing. Sorry.
I still think that for the Math/Data/Code question we are discussing here, as well as other questions that are likely to arise in the future, we should be framing the question in terms of: "What should our library include and how should it be classified?" with an attempt to create a library that is as comprehensive a resource as possible, and less in terms of "What exactly is a source text?" though we should still be avoiding things that are entirely wiki-creations with no relationship at all to a real source-text (like somebody's own novel). Dovi 15:11, 9 October 2005 (UTC)
I agree with the last comment of Dovi. In my own field of technical history there is data like historic screw thread, wire and sheet gauge tables which are pretty well non-existent on the web, and Wikisource would be an admirable place to place them. Similarly, data like historic sizes of hand-made paper. But I would not wish to spend time on such a project only to find that I was having a battle with editors who wished to delete them since they were not 'literary texts' or some such argument. Where might unpublished transcripts of historic documents (in paper form) fit? Many scholars might welcome the possibility of having an on-line home for this kind of thing. Apwoolrich 16:13, 9 October 2005 (UTC)
Things like historic screw thread can easily be covered by digitizing some really old edition of Machinery's Handbook. It's not like these things have never been written down before. I have a problem with the inclusion of "source-less" facts or new translations in "Wikisource". Facts, such as a list of historic paper sizes, can be stored in Wikipedia. But if you want to author a free alternative to Machinery's Handbook, or write a free modern Polish translation of Beowulf, this can be done within Wikibooks. Because any such effort is indeed subjective authoring, not merely an objective representation of eternal facts. --LA2 17:07, 12 October 2005 (UTC)
I also agree with Dovi. I am beginning to see this debate as a way to kill two birds with one stone. One big question that I'm sure is asked a lot is how we are different from Gutenberg. Including "non-literary" articles (preferably under a different namespace) is one way that would make us different. As Gutenberg does not contain this information, inclusion here (especially if we amassed a great number of these kinds of articles) can help distinguish us from Gutenberg. Also, we can reach an agreement of some sort with non-Wikisource editors who constantly publish this stuff here and create a large stir, by finally giving them no reason to fight with us: we just let them post it. And to satisfy the WS editors who do not feel that this is the place for that information (for various reasons), such pages will not be counted as articles, and we can at least say that we carry "N number of literary articles." Of course, even if such data is allowed, there must be guidelines that must be followed for their inclusion here, just as there are guidelines for the "literary" articles (e.g., no self-publication).—Zhaladshar (Talk) 16:50, 9 October 2005 (UTC)
I have no particular objection to a separate namespace. But I don't see much benefit to it either. I view Wikisource primarily as a collection of reference materials. Part of the mission is to support other projects. What's wrong with having a table of Mersenne primes, for example? They are genuinely useful to know for some purposes, and are essentially impossible to calculate in finite time. At any rate, it seems to me that a compelling case should affirmatively be made for moving this stuff elsewhere. I don't know all the arguments, but from reading the above, I'm not convinced that there is such a case. In short, Wikisource:What_is_Wikisource, referenced above, lays out the mission, I presume as stated by the Wikimedia board. What do we know that the board didn't? Wolfman 20:00, 9 October 2005 (UTC)
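
A side note on the Mersenne prime example: the small entries of such a table are cheap to regenerate, while the largest known ones came out of very large computing efforts, which is arguably where a stored table earns its keep. The following is a minimal sketch, not anyone's actual contribution, that lists the exponents p for which 2^p - 1 is prime using the Lucas-Lehmer test. The function names are made up for illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small exponents used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime, for a prime exponent p."""
    if p == 2:
        return True  # 2**2 - 1 = 3; the recurrence below assumes an odd p
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def mersenne_exponents(limit: int) -> list[int]:
    """Exponents p <= limit for which 2**p - 1 is a Mersenne prime."""
    return [p for p in range(2, limit + 1) if is_prime(p) and lucas_lehmer(p)]


if __name__ == "__main__":
    print(mersenne_exponents(127))
```

The cost grows quickly with the exponent, so this only covers the start of the table.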

Guidelines for mathematical data, formulas, and tables

I recognize that using separate namespaces for Math was proposed as a compromise, but it feels like an "ok, we'll let you on the bus, but you have to sit in the back" type of compromise. From the discussion, the use of separate namespaces is proposed as an answer to some who question the suitability of the material and how needed decisions about it can be made, not a technical need to avoid naming conflicts, which is the problem namespaces are intended to solve.

I'd like to focus on the broader question of which "mathematical data, formulas, and tables" are appropriate for inclusion in Wikisource. The basic criterion in Wikisource is that material has been published elsewhere. There may be some cut-off for really obscure publications, but mainstream publications certainly belong there. All of the mathematical reference material I would expect to appear in Wikipedia would have been published before. With rare exceptions, we should be able to cite books or peer-reviewed papers that contain the same information.

There are many mathematical reference books that have been published. Unfortunately, scanning technology, as far as I know, is not up to the task of accurately transcribing either data tables or formulas. Nor is it necessary to import entire reference works, though a few would be nice to have. The information in them generally represents a consensus and will typically be found in multiple books.

Even if we did import a single reference, its contents should be spread over many articles, just as The Complete Works of William Shakespeare are split into separate articles on Wikisource. No brick and mortar library has a separate book for each of his Sonnets, but Wikisource has separate articles for them. It's more convenient for the reader and we don't have the physical problems traditional libraries have shelving and cataloging one or two sheets of paper.

Then there is the question of how to digitize data tables and formulas. Numerical tables are best re-created using computers. It would be ideal if the computer program used to create a table were published along with the data table itself, either in the article or in the talk page, or some new validation page that we could dream up. Formulas can be typed in manually using a TeX editor (Wikimedia accepts TeX markup) or, in some cases (perhaps a table of integrals), computer-generated. Contributors should be free to make editorial judgements as to format, precision and organization, subject to the usual Wiki revision process.
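
To make the idea of publishing the generating program next to the table concrete, here is a minimal sketch of what such a program could look like. The choice of table (sines of angles in degrees), the step, the precision and the name `sine_table` are all illustrative assumptions, not a proposal for a particular page.

```python
import math


def sine_table(step_degrees: int = 15, places: int = 4) -> list[tuple[int, str]]:
    """Return (angle in degrees, sine formatted to `places` decimals) rows."""
    rows = []
    for degrees in range(0, 91, step_degrees):
        value = math.sin(math.radians(degrees))
        rows.append((degrees, f"{value:.{places}f}"))
    return rows


if __name__ == "__main__":
    # Print the table in a plain two-column layout.
    for degrees, value in sine_table():
        print(f"{degrees:>3}  {value}")
```

Double-precision arithmetic is enough for a few decimal places; a table with many more digits would need an arbitrary-precision approach, and the program shown with the table should say which one was used.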

That brings us to the copyright issue. Current copyright law, at least in the US, makes expanding Wikisource's literary collection beyond the first two decades of the 20th century problematic. Mathematics has an advantage in this respect. As I understand U.S. law (and I am not a lawyer) the content of mathematical data, formulas, and tables are ideas which cannot be protected by copyright. However, slavishly copying the selections and layout of a copyrighted source is potentially a problem.

This suggests three possible strategies:

  • Use reference works whose copyright has expired
  • Put together reference pages based on information in more modern reference works still under copyright in ways that do not raise copyright concerns, e.g. not following any one (i.e. particular) work. Still more recent material may have to be gleaned (i.e. collected in small quantities) from scholarly publications.
  • Find relatively recent works that are in the public domain.

One particular example in the last category is w:Abramowitz and Stegun's Handbook of Mathematical Functions published by the U.S. National Bureau of Standards (NBS) in 1964 and recently called "perhaps the most successful work of mathematical reference ever published." [1]. It is still available at Amazon.com. Most of its 1046 large format pages are devoted to numerical tables, but each chapter begins with a couple of dozen pages of formulas and graphs. A lot of arguments about whether material is appropriate can be settled using this book alone.

In the related field of cryptography, there is quite a bit of material published by the U.S. Government in the FIPS Pub series. Again it may be best for reference purposes to extract sections rather than simply mirror the publications, especially if we don't want to include PDF format files.

So here is a proposed set of guidelines, to serve as a starting point toward developing a policy:

  1. Content should be in the main Wikisource namespace
  2. Information on provenance, methods, validations etc. should be recorded. This could be done in the article, but the talk page or a separate page created for the purpose might be a better approach.
  3. All mathematical material must have appeared in a recognized, published source.
  4. Public domain or open-licensed sources are preferred.
  5. Accuracy, reliability and verifiability are major goals.
  6. Where possible numerical data should be recreated algorithmically, rather than being scanned or manually input.
  7. What data tables to include, and their precision and format are subject to editorial discretion, taking into account both the desire to preserve history and the needs of modern users.
  8. Data and formulas should be accompanied by a source and, where possible, the program used to create data tables should be exhibited.
  9. Data and formulas should be validated against published sources by the original contributor. Additional efforts at verification should be recorded, even if no discrepancies were found.

Again this is intended as a first cut. I'm also trying to put together a draft table of contents, showing the kind of material I think should be included.--ArnoldReinhold 15:59, 12 October 2005 (UTC)

Concur. Slavishly restricting ourselves to exact reproductions of published works is needless, pointless, and, in this case, difficult. The important part is that the information contained be well-referenced and verifiable. This poses no more obvious difficulties than ensuring that a reproduction is exact. In the case of easily computable tables, e.g. 'sin', I think we're better off just reproducing (with cites) any common formulae to calculate it to arbitrary precision. No need for tables there. Anyone who simply wants to know the value for ordinary purposes can find it much more quickly and accurately using commonly & freely available software.
Such material does not belong in wikibooks; that is a place for explaining how to do things at length. This is reference material, the purpose of wikisource.
Btw, some attempt to include at least the more important parts of Abramowitz and Stegun, formulae & graphs etc, would be phenomenal. I'd pitch in, if anyone wanted to start such a project. Wolfman 17:42, 12 October 2005 (UTC)
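
As an illustration of "a common formula to calculate it to arbitrary precision", here is a minimal sketch that sums the Taylor series sin x = x - x^3/3! + x^5/5! - ... using Python's `decimal` module. The function name `sin_taylor`, the number of guard digits and the stopping rule are assumptions made for this example; it does no range reduction, so it is only meant for modest values of x.

```python
from decimal import Decimal, localcontext


def sin_taylor(x: str, digits: int) -> Decimal:
    """Approximate sin(x), x in radians given as a decimal string,
    rounded to `digits` decimal places."""
    with localcontext() as ctx:
        ctx.prec = digits + 10  # extra guard digits for the summation
        x_dec = Decimal(x)
        term = x_dec            # first term of the series: x
        total = x_dec
        n = 1                   # exponent of the current term
        threshold = Decimal(10) ** -(digits + 5)
        while abs(term) > threshold:
            # Next term: multiply by -x^2 / ((n + 1)(n + 2)).
            term = -term * x_dec * x_dec / ((n + 1) * (n + 2))
            total += term
            n += 2
        # Round the sum to the requested number of decimal places.
        return total.quantize(Decimal(1).scaleb(-digits))


if __name__ == "__main__":
    print(sin_taylor("0.5", 15))
```

A page built this way could give the formula with its citation and leave the number of digits to the reader.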
Hi,
The discussion is interesting and I think we are going to find a consensus. Separate namespaces can help in this regard. My concern is that we set clear guidelines as Arnold has done above, to avoid a future situation where someone adds something like
#!/usr/bin/my-favorite-language
print "My own novel (ten pages or more)";
end
People will claim that it was published before on their personal web pages (how is it different than the 1,000,000 decimals of pi, which were never published on paper), and it can therefore be included in Wikisource. So in brief, we should be clear in order to avoid any future conflict, as far as we can. Yann 18:21, 12 October 2005 (UTC)
I still believe different namespaces should be used to remove them from the main namespace, as these are reference materials and not pre-published, literary works. However, by doing this, I do not think they are relegated to "the back of the bus" but are set apart from our normal articles. (Of course, this is all assuming that we actually do get namespaces; we might have to rely on pseudo-namespaces).
I do agree with Arnold, however, about having a set of guidelines, and aside from the first, I have no problem with them. I think we should add another that will...limit the extent of works that are published. That is, we need guidelines on how long these tables/constants/etc. should be. Specifically, I'm referring to a constant like pi or phi. Why does Wikisource need "pi to 20,000 places," "pi to 30,000 places," or "pi to 40,000 places" when it already has one that goes to 50,000 places? This sort of thing should be excluded.—Zhaladshar (Talk) 23:09, 12 October 2005 (UTC)
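
The redundancy point can be checked mechanically: a shorter expansion of pi is just the beginning of a longer one, provided both are truncated rather than rounded. Below is a minimal sketch using Machin's formula, pi = 16 arctan(1/5) - 4 arctan(1/239), in integer arithmetic. The function names and the fixed ten guard digits are assumptions for this small demonstration and would need more care for very long expansions.

```python
def arctan_inverse(n: int, scale: int) -> int:
    """Return arctan(1/n) * scale, summing the alternating series in integers."""
    power = scale // n          # scale / n**(2k + 1), starting at k = 0
    total = power
    n_squared = n * n
    k = 1
    while power:
        power //= n_squared
        term = power // (2 * k + 1)
        total += -term if k % 2 else term
        k += 1
    return total


def pi_decimals(places: int) -> str:
    """Return pi truncated (not rounded) to `places` decimal places."""
    guard = 10                  # guard digits absorb integer truncation error
    scale = 10 ** (places + guard)
    pi_scaled = 16 * arctan_inverse(5, scale) - 4 * arctan_inverse(239, scale)
    digits = str(pi_scaled // 10 ** guard)
    return digits[0] + "." + digits[1:]


if __name__ == "__main__":
    short, longer = pi_decimals(20), pi_decimals(50)
    print(longer)
    print(longer.startswith(short))
```

So a single long page, plus a note on how it was produced and checked, would seem to cover every shorter variant.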

Is there any danger of our being held to run foul of the WP No Original Research guidelines in any of this? I am thinking of the use of the term 'pre-published' in —Zhaladshar (Talk)'s contribution above. Apwoolrich 15:15, 13 October 2005 (UTC)

I interpret the "no original research" policy as meaning everything in Wikisource should have first been published somewhere else. Self-publication would seem less of a problem for math and science than it is for literary works. For math and science "published" should normally mean in a book by an established publisher or a peer reviewed journal, paper or electronic. Exceptions might be made in unusual circumstances, as when a result is particularly notable and easily verified, e.g. factoring an important integer, where one can simply multiply the two factors.
[Illustration: Equilateral triangle]
The thornier problem for me is results that have been published in hundreds of reference works, e.g. common formulas for area and volume. Should we find a pre-1920 reference work and copy that, or simply make our own selection and arrangement of the material, citing multiple references? The former might be fun, but the latter approach seems in keeping with the original Wikisource charter. I also see no problem in adding original illustrations; see the one at right, for example.
By the way, The Joy of Pi, ISBN 0802713327, published in the USA by Walker & Company and in the UK and Overseas by Penguin Books, contains the first million digits of Pi after the decimal point, printed on actual paper, according to the book's web site http://www.joyofpi.com/pi.html.

--ArnoldReinhold 14:02, 17 October 2005 (UTC)

It seems like the conversation has died down a bit on this page, yet we have come to no real resolution. We need to finish this once and for all. - I can see the benefits of having a reference section here at Wikisource. Unlike the main section here, the "pre-published" guidelines might not be best, unless we allow information to be compiled from numerous sources.
These are the guidelines mentioned above:
  1. All mathematical material must have appeared in a recognized, published source.
  2. Public domain or open-licensed sources are preferred.
  3. Accuracy, reliability and verifiability are major goals.
  4. Where possible numerical data should be recreated algorithmically, rather than being scanned or manually inputted.
  5. What data tables to include, and their precision and format are subject to editorial discretion, taking into account both the desire to preserve history and the needs of modern users.
  6. Data and formulas should be accompanied by a source and, where possible, the program used to create data tables should be exhibited.
  7. Data and formulas should be validated against published sources by the original contributor. Additional efforts at verification should be recorded, even if no discrepancies were found.
The first question is "What to include?" - Formulae, constants, tables, lists of numbers, etc.? Let's try to get a consensus on this before we proceed further.—Zhaladshar (Talk) 20:12, 5 November 2005 (UTC)

I have no problem with the criteria set out above. WS exists as a service and we cannot really exclude whole chunks of reference material on the grounds that it might not be "literary". We should not exclude material of value. I do have difficulties with some of the literary material we accept, simply because I am not familiar with it and never want to read it. But who am I to deny it a place?

One point that I have been having difficulty with is actually finding stuff on WS. If we adopted a proper library classification as the basis of sorting out material, there should be slots for the sort of reference material we are discussing here. I know that the "search" function should find everything, but that presupposes that the right question is asked in the first place. - It is my impression that we do not make sufficient use of "see also" links within Wikisource at the bottom of the pages. In other words, editors are posting text with no links to other WS articles by the same writer or genre. This is perhaps the wrong place to pursue this, so it might be aired elsewhere. Apwoolrich 20:50, 5 November 2005 (UTC)

For additional texts by the same author, there should be an author byline-link at the top of the page. But I do agree about classification and believe that categories are under-utilized. However, I assume that most users of Wikipedia are not professional librarians and probably wouldn't always do a very good job trying to put things in order. —Mike 23:20, 5 November 2005 (UTC)
What to include? Mostly formulae & algorithms. Lists of numbers only when computational software is not ubiquitous -- no sin, exp, arctan, etc. For those, we can display the algorithm ... both infinite precision and the rational polynomial 16, 32, & 64 bit precision approximations.
Mersenne primes & the like go in -- essentially uncomputable. Sequences such as the Fibonacci up to some reasonable limit (a hundred maybe) ... it's easily computed, but not ubiquitous. Anyone who needs more than a hundred is serious enough that they'll use software.
One possibility is to actually include functional java or javascript programs to calculate common tables e.g. of erf. I'm not sure of that, but we do have a section for software ... so why not just make it browser functional?
In short, I agree with all of Zhaladshar's guidelines. I like formulae. I'm not a big fan of tables, no one will use them. I think, maybe some functional software would be helpful, but that's a stretch. Wolfman 02:13, 6 November 2005 (UTC)
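As an illustration of the guideline that tables be recreated algorithmically and that the generating program be exhibited, here is a minimal Python sketch for the Fibonacci case mentioned above. It is not from the discussion itself; the function name `fibonacci_table` and the convention of starting at F(0) = 0 are assumptions made for the example.

```python
def fibonacci_table(count):
    """Return the first `count` Fibonacci numbers, starting from F(0) = 0, F(1) = 1."""
    table = []
    a, b = 0, 1
    for _ in range(count):
        table.append(a)
        # Python integers are arbitrary precision, so large terms stay exact.
        a, b = b, a + b
    return table


if __name__ == "__main__":
    # Print an index/value table of the first hundred terms.
    for index, value in enumerate(fibonacci_table(100)):
        print(index, value)
```

A page built this way could show both the table and the few lines that produced it, so that a reader can regenerate and check the values independently.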

I admit I feel swamped by the sheer volume of discussion that has taken place here (in this section and below). It's great, but it is impossible to address everything. So I'll simply state that I think Zhaladshar's guidelines are excellent, and I hope they (or some variation of them) are adopted.

As for his question, "What to include?", maybe the best thing would be to make sure we have some educated Math-enthusiasts working on this, so that they themselves can make reasonable decisions as to what sorts of materials should reasonably be kept at Wikisource. Dovi 08:39, 8 November 2005 (UTC)

Computer Code

I would like to add my thoughts about computer code. Suppose code is submitted that purports to be a valid implementation of some algorithm when in fact the code is defective. It seems that the Wikisource community should not be in a position where it becomes the arbiter of correctness in such a case. In fact there are code libraries that exist in the public domain where this sort of issue can be better addressed. I have no objection to the inclusion of code of historical interest (perhaps with a disclaimer) or code which might be found in published texts. In these cases and, perhaps, in other cases the Wikisource community would only be quoting an existing document and thus would avoid the above-mentioned difficulty.

I believe that if wiki could at some future time implement a repository for verified (e.g. proofed code) it would be a meritorious project. However, the field of Computer Science is vast and growing daily.

I would be glad to hear other opinions on this issue.

Dave R

User-compiled, well-referenced lists

Lists such as my List of victims of the 1913 Great Lakes storm, which survived VFD, are basically just selected source material combined together into one document, to save other people countless hours of research on their own. These lists are just like lists of numbers that are generated on a Wikipedian's computer (in terms of who created the list and how verifiable it is). --Brian0918 05:03, 6 November 2005 (UTC)

In support of inclusion of mathematical material, even in a broader sense

I think Wikisource is an appropriate place to store mathematical material. By material, I mean not only numerical tables/lists, but also definitions and theorems. So, it can host the collection of current accepted mathematical knowledge.

I agree with the suggested guidelines.

My view is like this.

The definitions and theorems shouldn't be original, rather they should match those from some peer-reviewed publications. The publication must be referenced. The researchers that were first to suggest the definition or prove the theorem must be specified.

Also, famous hypotheses can be included, also with references to where they were published and discussed, and who suggested them.

The theorems can be accompanied by proofs (also originating from peer-reviewed publications). There may be several proofs of a single theorem. Analogously, definitions can be accompanied by comments (giving equivalent definitions and explaining why they are equivalent). Yes, probably, comments are appropriate for theorems as well. Additionally, examples can be supplied.

By match, I mean that the wordings may or may not be the same as in the sources but the formal mathematical content must be exactly the same.

I suggest that the mathematical material is presented in an exact and formal fashion. Nothing relevant can be omitted in favor of better presentation.

(Benefits of the wiki.) Every occurrence of a formal concept must be linked to the page that defines it. And every use of a known mathematical result in a proof must be linked to the page that formulates it (and hosts the proofs).

(Difference from Wikipedia or Wikibooks.) The mathematical material at Wikisource would be a collection of exact mathematical material, split into small individual pages, which might be difficult to read and understand quickly, and it might be of some sense only to specialists, whereas the corresponding articles about the mathematical results at Wikipedia try to present them to a reader in a way that is best for getting the idea of it (plus all other non-formal information about the history etc.), and Wikibooks try to explain some theory by re-ordering the material, giving informal high-level comments etc. So, Wikisource can really serve as the source for the other two projects.

There might be alternative formal concept systems in mathematics. Well, let them all exist under Wikisource, as long as they are not original research and are accepted by peers. Use some kind of disambiguation to resolve the conflicts.

Concluding remarks on this view

So, my view of mathematics under Wikisource approaches some characteristics of the Books by Bourbaki,

  • with the additional benefit of wiki-technology for linking and presenting the material,
  • with the additional benefit of Wikimedia project for collaborative work on maintaining it, checking it, and extending with the constantly growing mathematical knowledge,
  • with the additional benefit of not requiring the difficult and controversial decision on the linear order of the presentation of the mathematical material, as well as not requiring the enormous effort to first fill in the pages on the more basic areas of mathematical theory,
  • with the difference in that Wikisource could host alternative mathematical "systems" side-by-side, whereas Bourbaki present only a single point of view.

Also, there is an example of a similar sort of work in linguistic morphology by w:Igor Mel'čuk: Cours de morphologie générale. The wiki-format would be wonderful for it, if it once could be put to Wikisource, since it consists of linked formal definitions accompanied by examples and comments (probably, now it cannot because of the copyright).

Probably, both of the works have a wonderful introduction about the organization of the works, which could serve as a basis for Wikisource guidelines concerning the organization of such material.

I think that nothing about this view is against the general Wikisource idea and guidelines. The formal content (formal relations between concepts, either given/invented by humans, such as definitions, or derived, such as theorems) is the source in the case of mathematical material. In other words, in mathematical material, (to a certain extent) the matter is not about how something was said (with which words), but is about what was said (what is the formal meaning), and that formal meaning should become the content stored under Wikisource, but, in a "human-oriented" form ;) (so that the formalistics is not exaggerated).

Of course, obtaining high quality of mathematical material placed under Wikisource this way would require participation/review of corresponding specialists (but there will be some out there, won't they?).--Imz 22:15, 6 November 2005 (UTC)

Example

  • The books by Bourbaki are near to meet, after imaginary wiki-reformatting, distribution over many individual pages and additions of appropriate references to each individual page, the general requirements that could be imposed upon mathematical material in Wikisource.--Imz 23:40, 6 November 2005 (UTC)

Copyright issue

I wonder whether the major reorganization of the referenced publication (into individual linked pages) and "simplification" of the wordings (cutting out informal things) would solve the copyright issue. The mathematical ideas are not subject to copyright, and as long as the references are OK, the presentation of the mathematical content at Wikisource in a relevant wiki-reformatted fashion would be OK, wouldn't it?--Imz 23:00, 6 November 2005 (UTC)

Yes, as long as we don't present something exactly the same way someone else did (i.e., we make the presentation a full Wikisource one), we should not run afoul of copyright infringement.—Zhaladshar (Talk) 03:03, 8 November 2005 (UTC)

Kinds of mathematical material

I've been thinking further, if the general idea of inclusion of mathematical material in this way is accepted, what might be the useful set of kinds of such material (kinds of individual pages) and the useful namespace divisions.

The main kinds are definitions and theorems. Theorems are supplied with proofs. Is there really a need for commentaries/remarks to definitions or theorems, like "def1 is equivalent to def2"? Actually, this is a theorem, but perhaps an obvious one. So, commentaries as a separate kind shouldn't be included perhaps.

Some additional things are hypotheses, axioms, and named properties (that are neither true nor false; perhaps "formulas" is a better, although unusual, name). Probably, axioms are not a good kind (definitions + named formulas would replace them).

Examples are a reasonable kind (although being probably a subkind of theorems).

So, these kinds might be reasonable namespaces for mathematical material. The discussed numerical tables and other such data can belong either to examples or theorems.--Imz 02:34, 7 November 2005 (UTC)

Are you maybe over-thinking things a little? I thought the original discussion was whether to include numbers like Pi, Phi, etc. and tables of numbers like factors, constants, etc. If I wanted to know about the Pythagorean theorem, I could look at the Wikipedia article. And if you were adding some kind of mathematical textbook to Wikisource that describes the theorem, you would include that with the other content here and not use any kind of namespace.
If we want to describe the documents that can be found in Wikisource, that is one thing, but we should leave the original writing about mathematical topics to Wikipedia and Wikibooks. —Mike 03:02, 8 November 2005 (UTC)
I was hoping we could just keep this initially about numbers/lists of factors and such. This is a good discussion, but I think we need to nail a few things down first. I think it would be best to talk about only lists or math information, lists of constants, etc. As, historically, these have caused the most problems, I think they should be addressed first.
So, for the time being, let's just try to get first things first, and restrict our discussion to what Mike mentioned above. After that, we can bring up what Imz raised.—Zhaladshar (Talk) 03:18, 8 November 2005 (UTC)
Yes, I agree, I was over-thinking things, and probably, in that form, that's a topic for another discussion. Let me just make a few remarks. It seemed interesting and useful to me to think on what can really be assumed a "source" in the case of mathematical material. So, I thought that it is the formal content without any presentational/explanational and so on "toppings", and I thought that's nice if there is such a collection (perhaps, at Wikisource).
As to the tables, lists, etc. -- they are, of course, just a particular case of that kind of content.
As to the relation to other places you mention (Wikipedia, Wikibooks), I meant that the form of mathematical material that can be stored at Wikisource is different from those places. For instance, in the case of Pythagoras theorem, Wikipedia's article on it presents the idea of the theorem and tells about the relation of the theorem to other parts of human knowledge, Wikibooks teach the theorem presenting it in a way suitable for learning and understanding it. Wikisource can have a much shorter entry on it, with only the bare formulation of the theorem (one sentence), but formally exact and grounded on other Wikisource entries for the concepts used in the theorem (triangle, etc.); and with a list of references to articles/books where this formulation was used. It's not the place one looks for if one wants to know what the Pythagoras theorem is (one looks in Wikipedia or Wikibooks in that case), but one looks here if one wants to see what the place of the theorem was in the suggested formal mathematical systems, if one wants to work with the formally complete mathematical systems. (There might be several alternative formulations stored, if there are different formal systems of mathematical concepts.)
Ok, I'm sorry for writing another long passage on a slightly different topic, but I think there is a task for Wikisource (or perhaps a separate "Wikimaths") project, and not only Wikipedia or Wikibooks, in this area.--Imz 22:58, 8 November 2005 (UTC)
I remember from my distant past having to use little handbooks of mathematical tables. I assume they still exist. Now on the one hand, perhaps the availability has made something like a table of logarithms obsolete. On the other hand, the more esoteric tables, like tables of probability for the t statistic or some such, which can of course be calculated by the user with a computer, on the other hand most people would need to look up the formula, on the other hand.... ? Gzuckier 18:45, 9 November 2005 (UTC)

Consensus?

Just to make it clear, have we reached a consensus where we will allow mathematical tables (such as prime numbers, Pascal's triangle, etc.), and constants (as these are what initially started this debate, I think we just need to get this down now--we can move on from there) to a limited extent? Please say whether you support or oppose their inclusion. I think we should do it this way, because right now, we're just doing a lot of talking and nothing is actually getting done.—Zhaladshar (Talk) 17:10, 9 November 2005 (UTC)
Support
Apwoolrich 18:39, 9 November 2005 (UTC)
Mike 02:04, 10 November 2005 (UTC)
Wolfman 04:08, 10 November 2005 (UTC)
Dovi 04:28, 10 November 2005 (UTC)
surueña 13:51, 10 November 2005 (UTC)
Imz 22:14, 10 November 2005 (UTC)
--ArnoldReinhold 20:08, 11 November 2005 (UTC)
Zhaladshar (Talk) 19:32, 13 November 2005 (UTC)
Oppose


Well, seeing as nobody is opposed to this, I say we create a new page at Wikisource:Mathematical and scientific guidelines or a similar such page, where we draft a set of guidelines for the inclusion of mathematical/scientific material (I'm including science as well, because it has numerous tables and such that will probably end up being added over time). Of course, we should use the talk page to begin formulating these rules. But this page should be a focused discussion about the rules mathematical data should follow to be accepted here.—Zhaladshar (Talk) 19:32, 13 November 2005 (UTC)

I urge this goes as part of the Help corpus. Since I have not been involved in the discussions I will have a stab at making a first draft unless anyone else is volunteering:-) Apwoolrich 13:57, 16 November 2005 (UTC)
Please, take a stab at it. This is something that should be in a help page, and I'll do my best to contribute (after Thanksgiving I'll have more time). If you don't get around to drafting something I'll begin the work then. Otherwise, I'll just help edit what you've created.—Zhaladshar (Talk) 17:14, 16 November 2005 (UTC)

Not only math and source code, but also music, chess, diagrams, and more

Having found this discussion accidentally, I would like to point out that there are even more types of material that should be put here at Wikisource. In my POV, mathematical content should be allowed, both mathematical proofs and tables ("lists of prime numbers", "astronomical coordinates",...). And much more.

Have you ever read anything about WikiTeX? Please read about this MediaWiki extension on its home page. It's really, really awesome! It brings a lot of potential, especially to Wikisource (I can't wait!). Think of Wikisource hosting…:

  • Musical scores: like the Mutopia project, a library of sheet music of the greatest composers of all times.
  • Chess games: all the moves of a historical game, such as the Kasparov vs. Deep Blue match.
  • Schematics: the diagrams of the Apollo Lunar Module
  • Historical source code: why not have the code of Minix V1 or the first FORTRAN I compiler (this one doesn't need any MediaWiki extension).

Maybe namespaces could be used to have special editors in each of them (e.g. an editor for musical scores), as proposed in a WikiTeX usability review. But I don't know whether namespaces are necessary for that.

Of course, not everything has a place at Wikisource (only "published" material), but IMHO we cannot create a new project every time we want to host a new kind of published material. Hope this helps--surueña 13:51, 10 November 2005 (UTC)

Wikitex WOW, Apwoolrich 15:50, 10 November 2005 (UTC)
I think all of this is great. After all, we are to be "The Free Library." Dovi 16:00, 10 November 2005 (UTC)


Since this whole wiki idea is based on inclusion, not exclusion, this discussion is really absurd.

I say the more the merrier, and if there is a problem with the mechanics of the specific venue, then we need a new venue.

CORNELIUSSEON 06:20, 11 November 2005 (UTC)

Include Everything Useful

I'm tempted to say, "include everything", but then someone would take that to an absurd extreme. But I think no limit should be set on topic. If it is factual; an existing part of our real knowledge store; then why exclude it? No argument against inclusion meets my approval. Xiong 17:32, 11 November 2005 (UTC)


The initial purpose of Wikisource has been fulfilled. The value of Wikisource has progressed past the initial purpose of Wikisource. A purpose is only valuable as long as it precedes the implementation. If you were traveling from Warsaw to New York and you were crossing the Atlantic you would not say "My purpose is to reach Paris". When the implementation exceeds the purpose and you try to use the purpose as a guide, you are lost. It would be similar to putting the horse behind the cart instead of in front of the cart. The initial purpose of Wikisource is irrelevant. Update the purpose!

The objective of all references is to improve the ability of mankind to predict the future. This started with predicting the seasons for harvesting crops, but quickly advanced to predicting human interactions. Eventually the predictions became good enough that we could predict, in part, the results of interactions between the very small, an atom, and the very large, a galaxy. A big change occurred for mankind when we could quantify precisely what we knew and to what degree we knew it. This required mathematics.

To throttle the knowledge of mathematics in any way is to decide that only a small group should have access to THE TOOL of accurately predicting the future of constrained systems. Hence, the proposal is not about reducing or preventing the flow of mathematical information. It is about deciding to constrain access to the tool that enabled mankind to design almost everything invented in the last two centuries.

Where should we draw the line? Never accept proposals that would limit mankind's ability to predict the future.

Now, pragmatically, how do we prioritize what is included in Wikisource? This is obvious: 1. Include anything that helps mankind to predict the future. Obviously, if you do not know the past it is almost impossible to predict the future. 2. Include ideas that might help predict the future. These ideas could be from past literature or current literature.

The mathematical works because of their direct importance to mankind are included in level 1.

Distinction between Wikibooks and Wikisource

I think this whole discussion gets into the heart of what the distinction between Wikibooks and Wikisource should be. I have been involved with moving a couple of Wikibooks over to Wikisource (in fact, just did so today). Bizarrely, somebody deleted text of the Bible from Wikisource and put it on Wikibooks. Go figure, and try to understand what other purpose Wikisource would have than to hold the actual text of the Christian Bible so that it is not duplicated elsewhere.

In regard to mathematical material on Wikisource, I think the general rule of thumb should lean toward whether it has been previously published. If the content is to be a scholarly review of mathematical formula that contains original content or commentary, it should go on Wikibooks. If you are doing a copy of material that has been previously published elsewhere, such as Principia Mathematica by Isaac Newton or Einstein's original paper on Relativity, that should go to Wikisource instead. Translations of these classical papers should clearly stay on Wikisource as well.

The larger issue would then be for things like a list of formulas like b:Calculus:Tables of Integrals or for computer software source code. The two issues do need a little bit of separation, however.

Computer software source code is a huge issue on Wikibooks right now because of the GFDL/GPL conflicts. In short, you can't have GPL'd source code in GFDL documents, or the other way around. That is something that IMHO needs to be fixed in the GFDL, but it is relevant here as well. Many Wikibooks projects are now instead placing the source code on SourceForge, because it is a public repository that allows GPL'd software. I consider that to be a loss for Wikimedia projects, but then again MediaWiki software doesn't do as good a job of software versioning as a good CVS system. Perhaps a SourceForge instance on Wikimedia servers would be a better alternative here?

Classical pieces of software, such as the source for the original Crowther & Woods Colossal Cave Adventure or Weizenbaum's ELIZA, would be clearly of interest to Wikisource and something that should stay here. Some other source code for classical programs, such as id Software's Wolfenstein 3D (having been placed in public domain), is also available. The question then becomes whether it makes sense to put stuff of that nature on Wikisource when it can chew up a huge amount of server space for questionable utility to anybody as an HTML page of source code. It is previously published information, which should be one test to see if it belongs on Wikisource.

As far as tables of mathematical values go, such as logarithmic tables or similar sorts of classical tables that are now done through calculators or math CPUs, the utility of those can be questioned quite a bit. There is no real practical use for a book of the first million digits of Pi, but some people do find it interesting. Other irrational constants like e or the square root of two are of a similar nature. On the positive side, there are relatively few of these numbers to worry about. As long as they don't get too far out of hand (like over 1 million digits) I don't see the harm of having them sitting around on Wikisource. They are constants, and don't ever change after being verified for accuracy. Wikisource is better equipped with policies to deal with information of that nature than any other Wikimedia project. --Robert Horning 08:28, 13 November 2005 (UTC)

I'm not sure this really does get to the heart of the distinction. The discussion on this page has been more or less about overall subject areas, not specific texts, and the conclusion seems to be that we will welcome texts on most topics, among them mathematics.

As for the distinction: "If the content is to be a scholarly review of mathematical formula that contains original content or commentary, it should go on Wikibooks" - I question whether that is correct. Wikibooks does not host scholarly reviews. It hosts instructional resources, such as a guide to learning and applying mathematical formulas. It is not a place for scholarship per se. This, to my mind, is an entirely separate topic, which is very important but not appropriate for this page. For those interested, please see:

I initially wrote both pages, but my thinking on this changed a bit over time and I currently have a slight preference for the way things are described on the Wikibooks page. One important suggestion on this whole topic is that slight overlap is preferable to falling through the cracks (i.e. that some kinds of text projects with source-texts can be reasonably put either on Wikisource or on Wikibooks, and we should respect the gray area here). In any case, this is an important topic, and if people are interested in discussing it the best place would be on the above two pages. Dovi 09:25, 13 November 2005 (UTC)

In or out debate

In replying to the invitation to join in the mathematics debate I encountered the above statement, which as a joke clearly highlights why Mathematics should be included in Wikisource, but only if it IS SOURCE MATERIAL ONLY, i.e. historical formulae, hypotheses, data etc. Wikisource is not paper, so why not, unless contributed material drains resources. My usual handle on Wiki projects is the NORWIKIAN, but in attempting to register I am no longer recognised as a contributor! Best not take any of these meta-projects TOO seriously in the larger scheme of things, tho' I do not quite subscribe to the views of the above! NORWIKIAN

Mathematics has as much of a place here as does anything else. Physics, Astronomy, Chemistry, Astrology, Theology, Cryptography, Geography, Geology, Chronology, the study of everything should be included. I don't believe in Greek Mythology or Christianity, but I find it interesting, helpful, and enlightening to study both. Wikisource has the potential to be the ultimate compilation of information, thoughts, ideas, methods, a 411 at the center of the universe, and I certainly do not see why closed-mindedness should prevent you, or us rather, from accomplishing such a goal. By the same token, I do not believe that it is appropriate to censor material found here - hate is perhaps the worst thing in the world, but that doesn't mean we should prevent material from surfacing about skinheads or kamikaze or anything else. If this is truly to be an open-source community, we must omit nothing and censor nothing.

The issue is not "math or not math". The works of Newton or Fermat would be very welcome here. The question is "mathematical tables". Yann 17:05, 15 November 2005 (UTC)
Please see Wikisource:What is Wikisource? --ArnoldReinhold 11:18, 18 November 2005 (UTC)
I don't understand your comment. The point here is to discuss what should be in Wikisource. Yann 20:42, 22 November 2005 (UTC)
My point is that there is already a policy in place and, as the above discussion shows, there is no consensus to change it, quite the opposite. At some point this question must be considered settled. --ArnoldReinhold 04:59, 7 December 2005 (UTC)
Yes, you are right, there is no consensus to change it. It looks like it has been agreed that "tables" and "constants" are to be allowed. What we now need is to formulate a set of guidelines for the addition of such tables. Such as, how many digits to take a constant to, do we have pages containing a constant to the 10,000th decimal, to the 20,000th decimal, and to the 100,000th decimal all at the same time. Devising that will be the most productive discussion at this point.—Zhaladshar (Talk) 22:10, 10 December 2005 (UTC)

Meta Proposal

I noticed there is a recent proposal for m:Wikilists which will apply to some of the issues in this discussion.--BirgitteSB 03:24, 21 November 2005 (UTC)

It's nice that there is a proposal for this sort of thing, but unfortunately, few of the proposals ever become reality.—Zhaladshar (Talk) 15:54, 22 November 2005 (UTC)

[edit] Guidelines for numerical table precision

As suggested by Zhaladshar, here is a proposal for guidelines on the level of precision in mathematical tables and constants to be included in Wikisource. Many of the parameters I am proposing are judgement calls on my part and, obviously, open to discussion. Whatever we come up with should be included in a guidelines page, along with other issues discussed above, and with the understanding that the guidelines are not absolutely rigid and that contributors who wish to deviate from the guidelines should propose changes or exceptions on the discussion page.

There are several types of tables and constants:

  • Historical tables. For centuries people who did calculations relied on mathematical tables. These range from the simple tables of logs, sines, cosines and tangents found in the back of high school math texts, to the elaborate tables used by navigators. Exhibiting such tables, either completely or as sample pages, has clear historical value. For these, the precision question is easily answered: reproduce what was originally done in some published exemplar.
  • Cultural artifacts. Certain numbers have a special place in mathematics and the popular imagination. Pi and e are clearly at the top of the list. A second tier might comprise the square root of 2, the en:golden ratio, etc. Pi deserves its own categories, which might include:
    • First million decimal digits. As noted above, there is precedent for this in a published book. These expansions also have some utility in cryptography, see w:nothing up my sleeve number. Note that the space required for a million digits is about that needed for one megapixel resolution jpeg image.
    • Values in other bases, from, say, 2 to 20, to, say, 1000 digits with perhaps a much longer expansion in base 16, for various reasons
    • Related values, e.g. 2Pi, Pi/2, Pi/4, 1/Pi, Pi/180 (one degree in radians), 180/Pi (one radian in degrees), same for Grads
    • Historical values (in particular the famous and partially erroneous William Rutherford calculation of Pi to 208 digits); see w:History of Pi
    • Samples from more extreme calculations, e.g. the 100 digits starting at the billionth, trillionth, quadrillionth etc. positions (in base 16 if necessary)
    • continued fraction expansions
    • Famous formulae for computing the values

Second tier constants might be limited to, say, 100000 digits

Third tier constants, e.g. volumes of the unit n-spheres, might be limited to, say, 100 digits

Number series, such as factorials or Bernoulli numbers, should stop when a single entry takes up most of an 800 × 600 resolution screen, or sooner.

Values that are difficult to compute and only known to modest precision (e.g. w:Euler's constant) should be shown to full precision
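
To make two of the Pi categories above concrete (values in other bases and continued fraction expansions), here is a minimal Python sketch. It is only an illustration, not a proposed tool: the names `PI_30`, `fractional_digits` and `continued_fraction` are made up for this example, and Pi is hard-coded to just 30 decimal places, so only the leading digits and terms it produces can be trusted.

```python
from fractions import Fraction

# Pi truncated to 30 decimal places, hard-coded purely for illustration.
# Anything derived from it is only reliable in its leading digits/terms.
PI_30 = Fraction("3.141592653589793238462643383279")

def fractional_digits(x, base, count):
    """Return the first `count` digits after the point of x in `base` (2..20)."""
    symbols = "0123456789ABCDEFGHIJ"
    frac = x - (x.numerator // x.denominator)  # drop the integer part
    digits = []
    for _ in range(count):
        frac *= base
        digit = frac.numerator // frac.denominator
        digits.append(symbols[digit])
        frac -= digit
    return "".join(digits)

def continued_fraction(x, terms):
    """Return up to `terms` partial quotients of the positive rational x."""
    quotients = []
    for _ in range(terms):
        a = x.numerator // x.denominator
        quotients.append(a)
        x -= a
        if x == 0:
            break
        x = 1 / x
    return quotients

print(fractional_digits(PI_30, 16, 8))   # 243F6A88
print(continued_fraction(PI_30, 5))      # [3, 7, 15, 1, 292]
```

A real table would of course start from a far longer, independently verified expansion; exact rational arithmetic is used here so that the base conversion introduces no rounding error of its own.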

  • Current utility. Some important mathematical functions are not available on scientific calculators nor in most subroutine libraries, e.g. statistical integrals, Bessel functions, etc. For these, a useful limit might be the maximum accuracy attainable with w:Quad precision floating point under the w:IEEE 754r standard, which is 113 bits or about 34 digits. Note that any table that displayed 34 digits would have to be computed at a higher precision to achieve full accuracy. Obviously we should never display tables to a precision that is greater than their accuracy.
  • Validation of computer algorithms. It is potentially useful to have selected values of common functions computed to high accuracy. For example, values of the common trigonometric functions computed for every whole degree to, say, 100-digit accuracy.
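
The digit count that goes with a given number of significand bits can be checked directly: each bit carries log10(2), roughly 0.301, decimal digits. The short Python sketch below does that arithmetic; the function name `decimal_digits` is just an illustrative choice, and 53 bits (ordinary double precision) is included only for comparison.

```python
import math

def decimal_digits(significand_bits):
    """Approximate number of decimal digits carried by a binary significand."""
    return significand_bits * math.log10(2)

print(round(decimal_digits(113), 2))  # quad precision: 34.02
print(round(decimal_digits(53), 2))   # double precision: 15.95
```

This is why a table meant to be fully accurate at quad precision would display about 34 digits and would need to be computed with some extra guard digits beyond that.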

Proposals to include other constants to very high accuracy, say, more than 100 digits, would be subject to advance discussion and consensus formation.

Sorry, I just noticed that the above comment, which I posted 15:12, 12 December 2005, was not signed. Sometimes I get logged out but do not notice. Anyway, what do we need to do to move this discussion toward a conclusion? --User:ArnoldReinhold 21:42, 18 January 2006 (UTC)

[edit] TV Listings

I'm a newbie to Wikisource and I was wondering if putting cable TV channel listings here would fall under the regulations of Wikisource. Sam916 02:08, 22 December 2005 (UTC)

[edit] User namespaces

I'm responding to the addition of the bit under self-contributions about the user namespace. While it seems like a good idea to allow people to submit their works in their user namespace, this really seems no different than having them put it in the main namespace. They can still link their websites to those pages, and people can still read them. Entire libraries of bad and unimportant fiction can be added if this is allowed. I think it would be better (and safer) if this weren't allowed. Unless, of course, I'm misunderstanding the terms "with reason." And if I am, then let's please get a good grasp of what's reasonable and what isn't.—Zhaladshar (Talk) 20:33, 28 January 2006 (UTC)

I agree that this could be abused with people unloading huge amounts of material into user namespaces. I was thinking more along the lines of limited quality text as in Apwoolrich's example, that could provide a personal version of a text in the main namespace.
As to the idea itself, I was thinking along the lines of Wikinews: NPOV doesn't allow for editorials reflecting a personal vision there. The solution has been for people to "editorialize" within their personal namespace. But I agree as above with Zhaladshar that it could be abused. That might not be a technical problem since we are not paper, but something (I'm not sure exactly what?) seems wrong with letting people unload 1000 page novels to their userspace. Any other ideas or opinions from others? Dovi 21:24, 28 January 2006 (UTC)
Perhaps we can put a size limit on User space. --BirgitteSB 21:43, 28 January 2006 (UTC)
If that's possible, I'd rather steer away from that. Some users might have (or want to have) large amounts of material on their namespaces (subpages for their own projects/to-do's/etc.) for valid reasons; that shouldn't be discouraged.
Dovi, could the problem that you can't quite identify be one of self-publication? Publishing a small amount seems fine, but if they take it to uploading entire novels, this is essentially a self-pub, just not where (probably most) people will ever find it. I think that User spaces are a prime place for a person to "editorialize," and I don't want to limit anything like that, so we should come up with a guideline concerning what is and is not acceptable to put in the user namespace (in terms of adding texts--this would circumvent the whole 1000-page novel problem).—Zhaladshar (Talk) 21:49, 28 January 2006 (UTC)

[edit] What about letters?

I have added the text of some old (1800-1950s) letters and papers to WikiSource in the past. These letters have not been published (except on my website...[2]). However, I feel that at least some are of general interest. Do they belong here, somewhere else, or nowhere? I published one here today as supporting information for the page on w:Carl Friedrich Gauss at Wikipedia. Among other things, I have a journal that my great great grandfather kept from the early 1800s until his death in 1899.

Yes, we can host letters, provided they are of historical significance, as yours appear to be. We list them at Wikisource:Letters by sender. I believe we should also be able to accept journals/diaries/etc though offhand I'm not sure where to put them, perhaps Wikisource:Non-Fiction, Wikisource:Biography or possibly a new section such as autobiographical material or diaries. AllanHainey 14:36, 17 March 2006 (UTC)
I do think a new rule or exception should be made regarding this. If original media such as letters, diaries, maps are previously unpublished, but have historical significance and they belong on wikisource, then this should be stated in some way on the project page. At the moment it leaves this area in a bit of confusion by saying that all items must be previously published in some kind of area that would invite peer review. However for these sorts of items, it would appear that wikisource is the publisher of first venue. This needs more clarification on the project page. Wjhonson 16:12, 9 July 2006 (UTC)
I have made a slight clarification in the Original Contributions section to ensure that it reads as discussing *your OWN* original contributions. This would then allow an old letter, written by someone else (obviously) to be part of wikisource. I hope everyone agrees with this interpretation, or maybe can clarify it further. Wjhonson 16:18, 9 July 2006 (UTC)
Zhaladshar deleted my changes, which were based on discussion in this section. The policy of unpublished letters, diaries, other manuscripts of historical interest should be explicitly stated on the policy page to make it clear. That's what I did. I see no reason to revert. Wjhonson 20:56, 14 July 2006 (UTC)

What about family trees (Henrietta Lacks) or birth or death certificates (Adella Wotherspoon)? And how should we categorize pages which contain different kinds of sources (Middlebush Giant)? --82.212.68.183 17:45, 17 March 2006 (UTC)

In my view we should be including letters as well as other kinds of sources. There is an enormous amount of similar stuff like diaries available which never gets published in regular historical journals, but has value. There are two kinds, letters which are already in print, and unpublished ones. The latter might fall foul of our rules for acceptability, so perhaps these will need tweaking, as we have recently done for original translations. The texts cited by the previous writer are both genealogical, so maybe we ought to have a category for these as well. Apwoolrich 18:51, 17 March 2006 (UTC)
Family trees (or pages which are only family trees) seem to be outside the range of WS's purview. We should not accept genealogical information if it is all by itself. Genealogies have no real sources and are a lot of user-contributed research. That seems like something which should be incorporated into Wikipedia articles.—Zhaladshar (Talk) 19:44, 17 March 2006 (UTC)

Well, I am confused. I gather that 'categories' are more than just something like 'Letter', 'Author', or something like that. After doing some reading in Wikimedia, it looks like The Charles Henry Gauss Family Papers could be a category. If so, and if it would be relevant, how do I do it? Also the letters I have submitted seem to have mysteriously been put on the page, Wikisource:Letters. How did this happen? Was it somehow automatic, or did one of the roaming editors do it? Also, I think it would be nifty to have some sort of style guide for letters, e.g., how should they be titled, etc. Mathsinger 03:25, 18 March 2006 (UTC)

I figured out the category thing...added Category:The Charles Henry Gauss Family Papers, and a subcategory to it Category:John Jay Johns Journal. I guess this stuff would be called transcriptions of primary source material. I wish someone would look over what I have done, and give me some feedback. Mathsinger 00:02, 20 March 2006 (UTC)

I know this discussion has been dormant for a long while, but I'd just like to clarify whether transcripts of birth, marriage and death certificates are wanted on Wikisource. It seems to me that, although not really 'published', these are very valuable source documents and would be good to have here. Any thoughts, or pointers to where this has already been discussed? Thanks. —Sam Wilson contrib's | talk 00:25, 29 April 2008 (UTC)
These could be considered acceptable under "documentary sources". The difficulty is determining whose transcripts we want, as I don't want to be accepting these documents about just anyone. John Vandenberg (chat) 02:24, 29 April 2008 (UTC)
My feeling is that "standard" family documents (birth and marriage certificates, as well as death certificates where the cause of death is described in just two or three words) don't belong in Wikisource. But I understand your point: we want some of these for added information, such as when there is a clear reference to an author or a biography character. The question of reference data springs up time and again. I believe the problem is that there is no single sister project where all these reference data can be dumped. Compare the situation with author or character quotes which have been made off-the-cuff, i.e. not as part of a publication. We want those, too, for added information, but we'd never include them in Wikisource. Instead we link to Wikiquote through the {{author}} header or in the notes section of a text. There may be isolated solutions for certain types of references (I've done some searching, and w:Wikipedia:Persondata might be partially adequate for family data, and there are other unimplemented ideas such as m:GlobalFamilyTree) but it appears to be about time this WikiData thing gained some momentum.

[edit] What about Transcripts?

I've got an idea about creating a collection of transcripts. It's more than that, but that's the simple way to put it.

WikiSource seems to exclude random transcripts that JoeUser creates when he hears someone say something he thinks is interesting, historic or newsworthy.

Please check out Transcript project goals at wikinews to see the discussion. The upshot is that the wikinews guys are arguing that transcripts belong on WikiSource, but what I read on this page makes me think that Wikisource wouldn't want transcript material. Mattks 09:27, 1 April 2006 (UTC)

We certainly do host political speeches. However it seems to me you are talking about more than just public speeches (interviews, TV appearances, etc.) and the other stuff will be a copyright problem. The Broadcaster retains the copyright to such things in most cases. The other issue is you seem to want to focus on one individual's comments. Transcripts of things where there are several participants would have to be complete and inclusive of everything said. Editing of the material is frowned upon, because it can introduce bias. You will probably have to give us some specific examples for a definitive answer, but I suspect there will be a problem with some of the material you want to add. --BirgitteSB 12:06, 1 April 2006 (UTC)
Thanks for the prompt reply BirgitteSB. I imagine 4 ways of getting material:
  1. Direct copy of material that is freely distributable
  2. Obtaining permission for stuff that is not freely distributable.
  3. Creating original work where the wikisourcian (wiksourcerer?) is a first person witness to the spoken words and the words are not otherwise copyrighted by the speaker. This might be a big category; Matts law: "For every person or subject, one out of N citizens will schlep over hill and dale to hear the words themselves rather than trust the mainstream media to give a faithful report".
  4. Copy a link to the material and provide an NPOV abstract.
Editing of the material would indeed be a bad thing; the goal is to provide a body of work that is as close to dispute free as possible so that citizens and scholars alike can have a trusted place to turn to for the truth. Zero dispute is the goal and the only way to get close to that would be to leave out all interpretation and include only the context that is dispute free. For example, date and time of the spoken word, persons present, reporter's name and time of transcription, method of transcription (from notes, memory, personal recording etc).
Yes, all the speakers present would have to be included in the transcript to make the transcript meaningful.
I don't have any examples yet; when I think I have the right idea, I'll just start doing it and let the wiki community make of it what they will. -Mattks 22:22, 1 April 2006 (UTC)
My opinion on your examples is that of course #1 is fine and acceptable. #2 is also acceptable, but it is unlikely we would be given permission for all the things you would want. #3 is questionable as there is a problem of verifiability; it would be best in such cases if recordings could be uploaded along with the transcripts. However the copyright on recordings may be even more stringent, I am not sure. #4 would not be acceptable. Wikisource is not a collection of links and we would not accept contributor-written abstracts. I think you will be able to put some of the material you want on Wikisource. If your only goal is to compile the complete remarks of person X, I do not believe we will be able to reach that goal. --BirgitteSB 00:38, 2 April 2006 (UTC)
My goal at the moment is to start doing something besides nothing.  :) A limited version anywhere would be useful if the data could be easily copied when a more suitable environment pops up. A table of public domain speeches would be a good start. Maybe Wikipedia is a better place for that, with links that point to the actual text here on wikisource? For that matter, maybe wikinews would be the place for news hounds to document the fact that some words have been spoken, those events tabulated on Wikipedia, followed by other researchers who obtain legal access (or permission) to the text and copy it to wikisource and/or Commons (Amgine on wikinews said something about 'Commons'). Hmmm, maybe a combined effort across wiki* is the right way to do the complete project?--Mattks 18:04, 2 April 2006 (UTC)
I'm mulling the issue of completeness, it has dimensions (in the sense of categories to be populated that are independent of other categories):
  1. a table of all the times a person spoke within earshot of witnesses is one useful dimension. (WikiNews--->WikiPedia?)
  2. a table of all the people that have spoken on a given subject is another.(WikiPedia?)
  3. a complete transcript of a given spoken-word-event(WikiSource, assuming copyrights and verifiability satisfied)
  4. an abstract of what each spoken-word-event was about(NOT WikiSource. Maybe WikiPedia?)
  5. a histogram of words from each spoken-word-event. (I include this because it's an entirely objective way to summarize the content of the text without risk of bias or violating copyrights. Probably too weird for Wikipedia)
Is item 5 too weird for wikisource?--Mattks 19:33, 2 April 2006 (UTC)
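
For what it's worth, the word histogram in item 5 is simple to produce mechanically. Here is a minimal Python sketch under stated assumptions: the function name `word_histogram` is hypothetical, the sample sentence is invented rather than taken from any real transcript, and "word" is taken to mean a run of letters or apostrophes after lower-casing, which is only one of several reasonable definitions.

```python
import re
from collections import Counter

def word_histogram(transcript):
    """Count how often each word occurs, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return Counter(words)

# Invented sample text, not a real transcript.
sample = "The budget is the budget, and the vote is tomorrow."
histogram = word_histogram(sample)
print(histogram.most_common(3))  # [('the', 3), ('budget', 2), ('is', 2)]
```

Because the counts are derived from the text by a fixed rule, two people running the same procedure on the same transcript should get the same table, which is the sense in which such a summary is objective.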
And what about transcripts of historical documents? I recall a discussion about this somewhere on WS where it was felt that this was OK, providing images of the original MS were added. This will cause mega-problems, for most record offices I know are very picky about allowing images of their documents to appear on the web, because of reproduction-fee loss. I personally feel that WS ought to be accepting transcripts of this sort, as a service to Scholarship, but there is no way I can see of ensuring that the text placed is not corrupt or without page images as well. Apwoolrich 15:24, 1 April 2006 (UTC)
I don't see any problem hosting complete transcripts that are old enough to be beyond copyright restrictions. Mattks is wanting to be able to hold current public figures accountable for what they say by keeping a public record on Wikisource. I don't know exactly what kinds of transcripts you are talking about when you say "historical", but I don't know why we would refuse them simply because they are not widely available. They are still verifiable even if it isn't easy to do so. If something seems unlikely and is not substantiated by anything else we can remove that on a case-by-case basis.--BirgitteSB 16:06, 1 April 2006 (UTC)
"WikiSource seems to exclude random transcripts that JoeUser creates when he hears someone say something he thinks is interesting, historic or newsworthy." This sounds to me like the transcript project is going to become nothing but a compilation of excerpts by public officials (I don't even know on whose sayings this project will focus—is it politicians or any kind of public person?) that will be archived on some wiki. If I'm misinterpreting this, please tell me, because compiled works are expressly excluded from WS.
Personally, I'm most interested in politicians, but public officials often have something to say that serves to document the state of the world at a given time.
I'm not talking about excerpts, I'm talking about complete transcripts. One person's "nothing but a collection of transcripts" is another person's invaluable reference material. I don't know how to differentiate between the two; does either belong on WikiSource? --Mattks 22:55, 1 April 2006 (UTC)
Like Birgitte, I don't have any problem hosting complete transcripts that are out of copyright protection. It comes down to the more current day transcripts that concern me. And this is where a number of questions must be asked:
  1. Upon what people will this project focus? Politicians? celebrities? athletes? all of them?
  2. Is the transcript complete? We will not take transcripts that have omissions in them. This is because WS is an archive of source texts. We do not push any kind of agenda, and any omissions of transcripts that involve any of the groups of people from question 1 could very easily become POVed.
  3. How verifiable is this transcript? There might be problems later on down the road if there is just no way to verify that the transcript is accurate (especially if the content of the transcript in question is not consistent with other transcripts). This might result in the transcript being deleted, which would be a phenomenal waste of time for the transcriber, since transcription takes so much time.
I would like to ask that maybe one or two examples be presented (or just give a detailed explanation of what you are aiming at with this project). As has been said before, this project seems like it would be beneficial, but some considerations must be addressed first.—Zhaladshar (Talk) 16:58, 1 April 2006 (UTC)
Answers:
  1. I'm interested in whoever says something that documents their view of the world at a given moment.
  2. Yes, the transcript should be complete to be useful.
  3. This could be an issue for WikiSource: a transcript standing alone is precisely biased towards the speaker's view, and of course the particular transcript I choose to put on wikisource will reveal my bias; I can't instantly upload everything a pol has said so I'm going to pick the things that seem most important to me. The only thing I can promise to do is to faithfully transcribe a particular contiguous body of work with complete context. Hopefully, the rest of the community is watching and will point out errors. For a given politician, I'd hope that some other partisan would provide other transcripts that will serve to broaden the picture