Warning: Please do not post any new comments on this page. This is a discussion archive first created in March 2009, although the comments contained were likely posted before and after this date.
See current discussion or the archives index.



Wikisource:Possible copyright violations

I have done a lot of cleanup and closing over at Wikisource:Possible copyright violations, but there are some that I don't feel comfortable closing, having either been too involved in the discussion or not feeling that the discussions have run full circle. Please take this opportunity to jump over and see if you can assist in bringing these discussions to closure. Thanks :) Jeepday (talk) 00:26, 5 February 2009 (UTC)

Thanks for your work. That's not always fun, but much needed. Yann (talk) 13:14, 5 February 2009 (UTC)

Vote of confidence for Eclecticology

The following discussion is closed: discussion closed
Eclecticology has the minimum number of objections needed for a vote of confidence on Wikisource:Administrators. His continued adminship will be decided by simple majority, as explained on that page. Please consider commenting there if you have not done so.

(The restricted access policy requires three votes to initiate a vote of confidence, to make sure that no active administrator is removed simply because his supporters were not aware it was in question.) —Pathoschild 01:15:04, 07 February 2009 (UTC)

I've informed Eclecticology by e-Mail as they appear to have been inactive the last few weeks.
I must admit I was a bit surprised by the wording "vote of confidence". In Wikisource vernacular, I've come to distinguish the yearly "reconfirmation vote" from the unscheduled "vote of confidence". Now, reading the restricted access policy I realized that both processes appear to be technically identical, with proposal votes becoming confidence votes once three established editors vote "Desysop". Did I get that right?
When a reconfirmation vote is called, I vote only if I've got enough information to come to a decision. If I don't know a contributor well enough, I won't start digging deeply in their edit histories and logs to get a detailed image. Due to the nature of my Wikisource activity I may be a bit more isolated from everyday activity than other admins but with ever growing numbers of users with privileged access (we've got more than 30 admins now) this behaviour will become the norm. The typical outcome of a confirmation vote is either large or even unanimous support, or removal of access due to inactivity, where "inactivity" is clearly defined by policy. Hence, every established editor rummaging in the edit histories of every candidate would be a waste of time.
If, however, something has the tag "vote of confidence" attached to it, I'd be much more willing to spend some time to see what is going on. This is automatic when the vote takes place out of schedule as it's obvious that something out of the ordinary is going on. But when an important vote is disguised as the yearly confirmation, I must rely on the courtesy of involved editors to notify the rest of the community. (Thanks Pathoschild, BTW.)
Therefore I would like to see a clearer distinction between reconfirmations and votes of confidence, with mandatory notification of the community in the scriptorium and the affected admin by talk page or e-Mail.
--GrafZahl (talk) 15:42, 9 February 2009 (UTC)
I concur with GrafZahl in principle; I was also confused by the process. Jeepday (talk) 18:24, 9 February 2009 (UTC)
There is no difference between these processes. The reason three people need to agree before one is called out of schedule is simply to avoid having everyone's time wasted with a case where there is overwhelming confidence in an admin. The scheduled votes are not merely a formality on activity; they are designed to trigger the community to reflect on an admin and to bring up any issues they may have. As someone who is required to look over this process very closely, I can assure you that the "typical outcome" described above is common, but it is not the only outcome I come across. Overall my thoughts on the concerns expressed above are that these things are features rather than bugs. Regarding notification, since these are open for so long and scheduled so far in advance, I doubt anything could sneak by a community member; however, I can start to put a notice here around the middle of the month about how things are progressing on each confirmation.--BirgitteSB 00:46, 11 February 2009 (UTC)
The original proposal included the intention of annual votes . . . creating a process for the removal of administrator access with minimal conflict in potential future cases of abuse . . .--BirgitteSB 00:53, 11 February 2009 (UTC)


Wiki Money

UPDATE: See bottom of thread, idea implemented at Wikisource:Purchases
it's 4am, if this idea is insane, I hide behind the "omg, so tired" excuse

A thought occurred to me today, while browsing the w:Wikipedia:Reward board; it sure would be nice if I could say "I'll give fifty cents for every text added to this week's COTW", or "I'd give $5 to have somebody copy/paste over this 100-chapter book".

So then the idea snowballed in my mind a bit: "WS is a fairly small community...why couldn't we create a Paypal account specifically for WS - and only the Bureaucrats would have access to its funds? It'd just be nickels and dimes, never more than $100 I doubt -- where people pay in to get little chores done. And then, when somebody sees an uncommon and interesting book on w:eBay that's $14.50 that would be great to have on Wikisource, they can throw up a proposal on the Scriptorium or something, and if four admins vote to buy the book, it's purchased with WS money and shipped out to some admin who has either a high-res scanner or digital camera, etc."

We could have a little template in the top corner of the Scriptorium;

Wikisource currently has $23.15, suggest ways to spend it?

Anyways, how crazy is the idea? It would only be Bureaucrats (Zhaladshar and Birgitte) who had access to the money; and since we're dealing in tens of dollars, it doesn't seem like a risk any of us would be averse to taking -- if we agreed to chip in $5 to WS as thanks for somebody's help with a task. Sherurcij Collaboration of the Week: Nikola Tesla. 09:23, 29 November 2008 (UTC)

  • I remember a similar proposal at Wikipedia a year or two ago, but I just searched for the history and could not find it. Jeepday (talk) 11:30, 29 November 2008 (UTC)
    You weren't looking back far enough. This came up before Wikimania Frankfurt in 2005. During the year before that there were experiments of the kind to kick-start Wikipedias in Ossetian and Bambara. Nothing more was heard about these after the funds ran out. Eclecticology (talk) 19:00, 29 November 2008 (UTC)
  • I'll pitch in a $10 donation to the fund if we do this. I think it would be great to get some text that Project Gutenberg, Google Books, or Internet Archive don't have. That is my 2 cents (or should I say $10 :P ). --Mattwj2002 (talk) 12:23, 29 November 2008 (UTC)
  • While I would not vote against this, I'm a pessimist about its success. Work out a budget. That will give you a much better idea about the viability of the scheme. On the revenue side, if (optimistically) each of our admins were to give $10 that would still only total $400. On the expense side, after the eBay and shipping costs, how much would it take to make it worthwhile to do the work? A person with good high-speed equipment probably doesn't need the money, so if we estimate on the basis of one page per minute and a minimum wage of $10.00 an hour we would need to pay 17 cents per page. With the pittance involved there is little need for any kind of formality for managing the proposal. As long as the pool is less than $1,000 the person who volunteers to do this should have a free hand in its administration, but it would be good if he makes regular reports. Eclecticology (talk) 19:00, 29 November 2008 (UTC)
Presumably the person who says "Wow, facsimile copies of a bunch of personal letters housed in the w:Smithsonian by famous people...that's definitely worth $18, can we use the WS money to buy it?" would also be the one to volunteer his time transcribing them afterwards (and if they didn't get transcribed six months later, we'd hardly trust him in the future to suggest uses for our money - "finish those letters first") -- though others may interject to say "I support buying this book, and if you need someone to help you scan its pages, I'd be willing to do it". I'm not really sure why you think we'd be paying people to transcribe works they asked us to buy. Like most of my WS views, it would try to be focused on "interesting, little-known texts/letters/books" that would be of wide-ranging interest -- so hopefully nobody proposes buying something so boring that the rest of the project isn't willing to help them transcribe/scan/OCR it. But I've often seen things on eBay that would make a great addition to WS - but the thought process is "Well, I don't have $20 to spare on some facsimiles of letters" (Amusing story, Author:Leo Tolstoy has already cost me over $100 in fees from international libraries and universities sending me facsimiles of unpublished manuscripts and essays). This just fosters some more community "collaboration", while also opening a doorway to convince Jay to program his bot for me ("Come on, I'll give WS $20 if you do it this afternoon"), or for me to go help fill out texts related to some subject I don't particularly care about ("Hey guys, I'll pay $1 for every Scientology-related text added this week"). Sherurcij Collaboration of the Week: Nikola Tesla‎.
Your revision to only pay for buying the books on eBay is certainly more workable than paying people to do the work. A budget would still be needed with reasonable funding projections. The Wikipedia Reward Board depends on individuals who make private offers that do not draw from a pooled fund. Eclecticology (talk) 23:24, 29 November 2008 (UTC)
I think you misread something in the original proposal, I never suggested we pay people to do work. I said that, for example, I could ask Pathos to run his bot for me and I'd donate $1 to the treasury, or I could say that for every work on Author:Leo Tolstoy added this month, I'm paying $4 into the treasury. The treasury does not, and was never suggested to, pay members. It is used solely for things like "purchasing a text on eBay" or something similar. Sherurcij Collaboration of the Week: Nikola Tesla‎. 23:31, 29 November 2008 (UTC)
Wiki Money sounds like a great idea, Sherurcij. I would contribute $10 to this.
While the Bureaucrats attempt to set up a Paypal account, a budget, funding projections, etc., I encourage you to be bold, go ahead and start a project at a wikipedia: fund and release website -- perhaps or -- and post a link here to your project there. --DavidCary (talk) 02:02, 6 January 2009 (UTC)
Initially I must confess I was reluctant, but now I think I'm being won over. I'd like to see something like this happen. We could get some really interesting books out of this (and if we shop smartly, we can get them for not exorbitant fees, either). Why not just use an address like Or What service money would we even use for this?—Zhaladshar (Talk) 15:20, 6 January 2009 (UTC)
Paypal seems to be the most widely-accepted currency for online purchases, so I'd suggest we get a (avoid hotmail? technically it's allowed) account going -- though I think we need somebody to front a new bank account, or credit card or something similar to get the account set up to make purchases...not absolutely certain on that. I could go to my bank in a month or so and see if they'll let me open a new account without overdrafts Sherurcij Collaboration of the Week: Author:Nostradamus‎. 15:45, 6 January 2009 (UTC)
Somebody would need to run the idea past the foundation, and get approval, and probably a foundation email. I suspect there are multiple tax and legal considerations to accepting donations. Jeepday (talk) 00:09, 7 January 2009 (UTC)
Right, so we just call it "the Zhaladshar fund" instead and make it clear that no, you don't get tax receipts for your donations. Voila, problem solved - as long as it officially doesn't have "Wikisource" in the name (or eMail address), we should just handle this internally among ourselves. WMF isn't going to mire itself in liability for $40. Sherurcij Collaboration of the Week: Author:Nostradamus‎. 04:39, 7 January 2009 (UTC)
I've already stated that I'm neither a supporter nor an optimist about this proposal. On this last point though acting boldly on the plan is most likely to bring success. Bogging the proposal down in search of permissions would be far more effective in bringing it to a grinding halt than any negative expressions on my part. Eclecticology (talk) 07:34, 7 January 2009 (UTC)
  • (unindent) What if you create a list of books you want, and set up some process so the granter can purchase online and send the book to someone with a scanner? That way, instead of collecting "donations" without offering a receipt, you have two volunteers contributing to the project. One donates the book, the other donates the time and tools. The book can then be donated to a local library in the name of Wikisource or something. Jeepday (talk) 11:45, 10 January 2009 (UTC)
      • Having the purchaser send you the book, and then shipping it to the person doing the scanning, involves two sets of shipping charges. Better to have the seller send it directly to the person who needs it. Eclecticology (talk) 07:56, 11 February 2009 (UTC)
    • Until/Unless we come up with better, might be a nice incentive for people to volunteer to photograph/scan if they got to keep the book unless otherwise specified? Sherurcij Collaboration of the Week: Author:Bahá'u'lláh. 15:02, 29 January 2009 (UTC)
      • Concur "nice incentive for people to volunteer to photograph/scan if they got to keep the book unless otherwise specified" Jeepday (talk) 01:09, 30 January 2009 (UTC)
  • Alright, I've created Wikisource:Purchases - Put on watchlist and request you all check it out; add books you see for sale anywhere online (not just eBay) that you'd like to see some collaborative interest on, and sign up to help on existing listings. Sherurcij Collaboration of the Week: Author:Bahá'u'lláh. 15:02, 29 January 2009 (UTC)
    • I object to a link that automatically puts itself on someone's watchlist without warning. Eclecticology (talk) 07:56, 11 February 2009 (UTC)

Remove last_initial parameter from Template:Author

I would like to propose that the last_initial parameter be removed from Template:Author. The author's last name is already provided in the lastname field, and we can use wikicode to extract the first letter of the last name without an additional parameter. Take Author:Nikola Tesla for example. The code:

[[{{padright:Wikisource:Authors-|20|Tesla}}|{{padright:Author Index: |19|Tesla}}]]

produces Author Index: T. The secret is that the padright function uses only the first letter of its third parameter, and the first letter is exactly what we are looking for.
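
The padding behaviour this trick relies on can be modelled outside the wiki. What follows is a minimal Python sketch, not MediaWiki's implementation: the function borrows the name padright from the parser function, and it assumes that padding repeats the third parameter and truncates it to fit the requested length. Since "Wikisource:Authors-" is 19 characters long and the requested length is 20, exactly one character of the pad string ends up being used.

```python
def padright(text: str, length: int, pad: str = "0") -> str:
    """Rough model of the {{padright:}} parser function (an assumption, not MediaWiki code)."""
    missing = length - len(text)
    if missing <= 0 or not pad:
        # Nothing to add: the text already fills the requested length.
        return text
    # Repeat the pad string often enough, then cut it to the missing length.
    repeats = missing // len(pad) + 1
    return text + (pad * repeats)[:missing]


# One character is missing, so only the "T" of "Tesla" is appended.
page = padright("Wikisource:Authors-", 20, "Tesla")
print(page)
```

Seen this way, the first-letter effect comes from the choice of target length rather than from any special handling of the third parameter.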

This method prevents users from making a mistake when copying the last initial, and also means less menial work in general. —Remember the dot (talk) 05:41, 29 December 2008 (UTC)

If other people agree with this, I would propose making the change. Making things as automated as possible I think is the best approach, and one less parameter we have to add is preferable.—Zhaladshar (Talk) 15:20, 29 December 2008 (UTC)
Only issue is that I typically use "al-Hami" as the "last name" for an Arabic author, and that would automatically categorise as A, not H - which isn't ideal. And adding the al- onto the "First name" would seem weird...any chance you could code an exception to ignore al-? Sherurcij Collaboration of the Week: Author:Nostradamus‎. 17:33, 29 December 2008 (UTC)
This is an important point. Not all names are as well behaved as modern western names. This is normally not a difficult parameter to fill in, and it's just as easy to fix an error here when it does happen. It's also my understanding that the "padright" template would not work properly with non-US-ASCII characters such as in Author:Ælfric. I do think that "Last Initial" is misnamed and thus misleading, and would prefer something like "Filing Initial(s)" to better account for other possibilities, but there is no immediacy about that. Eclecticology (talk) 20:52, 29 December 2008 (UTC)
OK, those are very good points. We'd want to leave a parameter available to override the default categorization if necessary. Also, thanks for pointing out Ælfric. I just tried it and including {{padright:Wikisource:Authors-|20|Ælfric}} anywhere on a page blanks the entire page. Not very helpful. That bug would have to be fixed before this could be deployed. —Remember the dot (talk) 06:42, 30 December 2008 (UTC)
I filed bug 16852, please vote for it if you're interested. —Remember the dot (talk) 22:26, 31 December 2008 (UTC)
The bug has been fixed, so this capability will work well now. I have an improved version of Template:Author sitting at User:Remember the dot/Sandbox. If the last_initial parameter is not specified, it substitutes the first letter of the last name, and if there is no last name then it substitutes the first letter of the first name. It even supports ligatures: Ælfric will by default link to Wikisource:Authors-A. —Remember the dot (talk) 04:54, 28 January 2009 (UTC)
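
The fallback order described above can be sketched in Python. This is an illustration of the stated rules only, not the code in the sandbox template: the function names, the keyword arguments, and the ligature table are all hypothetical, and the only ligature case taken from the discussion is Ælfric filing under A. Treating Ælfric as a first name with no last name is likewise an assumption.

```python
# Hypothetical ligature table; only the Ælfric -> A case comes from the discussion.
LIGATURES = {"Æ": "A", "æ": "A"}


def filing_initial(firstname: str = "", lastname: str = "", last_initial: str = "") -> str:
    """Pick the index letter: explicit override, else last name, else first name."""
    if last_initial:
        # An explicit last_initial always wins, which covers names like "al-Hami".
        return last_initial
    name = lastname or firstname
    if not name:
        return ""
    first = name[0]
    return LIGATURES.get(first, first.upper())


def index_page(**author: str) -> str:
    """Build the author index page title from the template-style parameters."""
    return "Wikisource:Authors-" + filing_initial(**author)


print(index_page(firstname="Nikola", lastname="Tesla"))
print(index_page(firstname="Ælfric"))
print(index_page(lastname="al-Hami", last_initial="H"))
```

Keeping last_initial as an optional override means the default only has to be right for well-behaved names, while cases such as the "al-" prefix raised earlier can still be filed by hand.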

And the Wikipedia link?

If we are modifying the Author template to not enforce that field, could we also have something done for the WP link for when it is left blank. Its current default when left blank is ungraceful. Not all our authors are going to have WP articles. -- billinghurst (talk) 11:05, 31 December 2008 (UTC)

What's wrong when the author template has a blank WP parameter? I can't see anything wrong with it.—Zhaladshar (Talk) 17:49, 24 January 2009 (UTC)
Uh, yeah, I don't see anything wrong with the author template when it lacks a Wikipedia link. What's the problem? EVula // talk // // 20:59, 24 January 2009 (UTC)
Umm, err. (Drew buries his head in his hands). It is behaving and I cannot replicate. Apologies. -- billinghurst (talk) 05:27, 25 January 2009 (UTC)
Heh, there are worse things than everything behaving well. No worries. :) EVula // talk // // 22:01, 28 January 2009 (UTC)
The prompt helped me. It isn't the {{Author}} it is the replacement {{DNB00}} which was snaffled from {{EB1911}} that plays up. Coding there is too complex for me. -- billinghurst (talk) 23:45, 28 January 2009 (UTC)

Tweak main page title

I suggest that we change the title bar on the Main Page to say "Wikisource, the free library" instead of "Main Page - Wikisource". This would make it easier to find Wikisource in search engines, and be generally prettier overall. See for example wikipedia:Main Page, which does something similar. We can make the adjustment by creating MediaWiki:Pagetitle-view-mainpage with "Wikisource, the free library". —Remember the dot (talk) 06:50, 27 January 2009 (UTC)

Agreed, good idea. Yann (talk) 09:40, 27 January 2009 (UTC)
Done EVula // talk // // 22:02, 28 January 2009 (UTC)
I noticed a similar epithet on French Wikisource a while ago and I assume it's possible to change the system settings to place "the free library" on every page, just like Wikipedia has "the free encyclopedia" on every page. But then again aren't most libraries free? — Blue-Haired Lawyer 00:07, 29 January 2009 (UTC)
Not usually free-as-in-freedom, no. —Remember the dot (talk) 06:42, 29 January 2009 (UTC)
In many countries not free-as-in-beer either (you may have to pay a fee to get a library card). Also university libraries (the most important ones) usually don't provide full accommodations to the public for free.
This seems like a nice idea, but I looked at the main page and couldn't see any difference. What exactly has been changed? Dovi (talk) 10:29, 29 January 2009 (UTC)
The top of your browser (way at the top, to the far left of the close button) shows some text about the page. That is what has been changed. Psychless 00:14, 30 January 2009 (UTC)

Other discussions

CrankyLibrarian project

CrankyLibrarian has kindly decided to help us pull his collection into Wikisource. I have created a page listing all of the books with links (the links don't work yet) to the pages on the crankylibrarian website.


We will probably be slurping these pages in via bots, so to help the bots get it right the first time we will need author pages to be created, page names disambiguated, and copyright checked.

If we already have an edition, it would be good to spot check that they are the same, and that our edition is better quality - any that we don't want imported can be removed from the list. John Vandenberg (chat) 03:12, 6 December 2008 (UTC)

Wow, this is quite the collection we'll be getting! (Too bad there aren't pagescans to go with it, but oh well. :) ). It's going to take forever to do those author pages, I must say.—Zhaladshar (Talk) 16:52, 6 December 2008 (UTC)
Would it help if I rebuilt the page as a table? Then we could have a "notes" column, or an "action" column to record whether we want to import it or not, or whether it needs to be manually merged into our copy. John Vandenberg (chat) 22:37, 7 December 2008 (UTC)
I don't think it matters much, one way or the other. Simple notes after each entry should suffice. I presume that this is all happening because he wants to get out of the text hosting business. It would, in either case, help to add letter headers for ease of navigation. Are we working to a time limit? Comparing two editions can involve a whole raft of problems; if we can't be sure of the source of either we can't know which is the better. Eclecticology (talk) 23:23, 7 December 2008 (UTC)
I corrected this wikiproject page a bit. I think most of these works were copied from Gutenberg. That way, I even found an error where Gutenberg attributes a work to the wrong author, with Cranky most probably copying the error along with the text. I have found a few works in the list which are copyrighted in the USA and were deleted from WS, notably The Great Gatsby, so copyright has to be carefully checked. Yann (talk) 20:09, 29 December 2008 (UTC)
To the extent that the Cranky list includes material copied over from Gutenberg with the usual lack of sourcing, we would do better to remove them from the list because we can copy them directly from Gutenberg if we want. The list would then be left with only those works that are relatively unique to the Cranky site, and these could be given greater priority in our efforts. Eclecticology (talk) 21:15, 29 December 2008 (UTC)
I'm thinking, to help "prioritise" this project - we should remove from the list those works we already have. I'm wary about removing works we don't have that are copied from Gutenberg, since Cranky seems to have an easier set-up for a bot to parse however. Sherurcij Collaboration of the Week: Author:Joseph McCabe. 06:13, 16 February 2009 (UTC)

Since the blatantly obvious has been stated regarding the source of the Cranky content, I'll reiterate the implication of the Cranky interface. If Wiki's intention is simply raw content and is not usability of content, the CL will be of little value. The implied CL interface would be a subset of the book list presentation which, if a CL copy was available, would route the user to CL for portrayal. There would be a "Please Convert" link for unconverted manuscripts in Gutenberg, et al. This is the only interface that makes sense. The inherent nature of cost-efficiency is the re-use of information and its packaging into a user-friendly format. The inherent costs of scanning, vetting, legal clearing, and other numerous trivialities have prevented a usability philosophy in public domain content. Consistent formatting and portrayal enables future exploitation without re-architecting; just re-implementing. But in the conversations on this microscopic issue I detect the inherent creep of bureaucratic mentality and stagnation. This is exhibited by a mind-set that focuses on established process and ignores potential interoperability. At one point I envisioned being able to help with architecture and inter-activity of Wiki systems and external providers, but I detect the same "built-here" mentality and technical naivete that pervades the public domain providers. The best to you in your endeavors. Ghost Out.

Overriding default sort on Author pages

Author:Virgil is now showing a lovely bright warning due to using {{DEFAULTSORT:Virgil}}.

Warning: Default sort key "Virgil" overrides earlier default sort key "Vergilius Maro,_Publius".

The defaultsort is intended to be overridden in the Author template param defaultsort.

In the case of Virgil, Eclecticology added the DEFAULTSORT: in November[2] and I am guessing that the error has been added to the software since then. I've seen other contributors also using DEFAULTSORT.

IMO this warning could be turned off, or ... we need to go and find all the uses of DEFAULTSORT: and replace them with the {{Author}} param defaultsort, or we need to tweak this system message to tell the user how to fix it. John Vandenberg (chat) 00:25, 24 January 2009 (UTC)

Yes, this was added to the software recently. We can add an explanation of how to fix it to MediaWiki:Duplicate-defaultsort, probably with a {{NAMESPACE}} switch. Or, this CSS will hide the error in the author namespace (although I'm not fond of applying it to such a vague class name).
.ns-102 .error { display:none; }
Pathoschild 08:40:04, 24 January 2009 (UTC)
Is there an easy means to finding where the DefaultSort magic word is used? It isn't a major issue (I hope) to convert the author pages over as long as there is a means to identify said pages and then happy to step through them. -- billinghurst (talk) 10:10, 24 January 2009 (UTC)
FWIW AWB can suck in an XML database dump file of the relevant pages. If I need to find out more, then I can do so. NADA! AWB doesn't take Author Namespace, so I will have to see if I can get an amendment before I fulfil that promise. -- billinghurst (talk) 10:38, 24 January 2009 (UTC)
Good news. Found my AWB problem. Yannf said in IRC that "not easily" so I am trawling through the pages, and will get it complete over the next while. [Author pages -> rm {{DEFAULTSORT...}}; add |defaultsort... ] -- billinghurst (talk) 12:22, 24 January 2009 (UTC)
All {{DEFAULTSORT}} replaced with header | defaultsort = ...' in existing Author pages. It would be good to have a ready means to identify new additions when they occur. -- billinghurst (talk) 12:51, 25 January 2009 (UTC)
Adding a category to MediaWiki:Duplicate-defaultsort should work. —Pathoschild 00:34:01, 26 January 2009 (UTC)
Category:Authors with DefaultSort error prepared and awaiting your addition to aforementioned page. -- billinghurst (talk) 05:44, 26 January 2009 (UTC)
Done and tested; works fine. —Pathoschild 07:27:03, 26 January 2009 (UTC)
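
For the earlier question about finding where the DEFAULTSORT magic word is used, a scan over page text is one possible approach. The Python sketch below is an assumption-laden illustration rather than the method actually used in this thread: it presumes the page texts have already been loaded into a dictionary keyed by title (for example from a database dump), and the function name, the "Author:" prefix filter, and the sample pages are all hypothetical.

```python
import re

# Matches the magic word as written on a page, e.g. {{DEFAULTSORT:Virgil}}.
DEFAULTSORT_RE = re.compile(r"\{\{\s*DEFAULTSORT\s*:\s*([^{}|]*?)\s*\}\}")


def pages_with_defaultsort(pages: dict[str, str], prefix: str = "Author:") -> dict[str, str]:
    """Return {title: sort key} for pages under `prefix` that use the magic word."""
    hits = {}
    for title, text in pages.items():
        if not title.startswith(prefix):
            continue
        match = DEFAULTSORT_RE.search(text)
        if match:
            hits[title] = match.group(1)
    return hits


# Hypothetical sample data standing in for a real dump.
sample = {
    "Author:Virgil": "{{author|lastname=Vergilius Maro}}\n{{DEFAULTSORT:Virgil}}",
    "Author:Nikola Tesla": "{{author|lastname=Tesla|defaultsort=Tesla, Nikola}}",
    "Main Page": "{{DEFAULTSORT:Main}}",
}
print(pages_with_defaultsort(sample))
```

The returned sort key is the value that would move into the defaultsort parameter of the author template once the magic word is removed from the page.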

Themes on "New texts"

Last time I completed a transcription, I popped over to Template:New texts to give it some Main Page time, only to find that New texts was running with a Christmas theme, and was scheduled to move straight from that theme into a New Years Day theme. I have just now finished another work, the largest and most important work I've ever transcribed, and again I cannot get it on the main page, because New texts is running with an Obama theme. I note that it also ran with a Martin Luther King Day theme in January. The current Obama theme is presently quite stale, and surely overdue for removal, but I can't do it because I only have one new text to post, not eight.

Personally, it is quite frustrating to come to the end of a 580-page, three-month effort, and not be able to get it onto the main page because New texts has gone thematic. And personally, these themes do nothing for me.

I am wondering what other people think. Is there consensus for running with themes at New texts? If so, is there consensus for running with themes so often as at present?

It occurs to me that the crux of the matter is that a single template is being used to present both new texts and topical texts. If both are desirable, then could they be split? Would there be room on the Main Page to display both new texts (which need not be topical) and topical texts (which need not be new)?

Hesperian 13:11, 6 February 2009 (UTC)

I agree with your general concern, Hesperian; I recently completed a work and would like it to see some main page time. As for the best solution, I'm not sure that splitting the template (and thereby increasing the maintenance load) is a good idea, but perhaps creating a proposal page for Template:New texts (not unlike w:Template talk:Did you know) could work. That way, if a theme is running, as soon as it is stale and an admin sees that there are a few works in the queue ready to go, it's easy to update the main page. I was considering getting rid of the Obama theme myself for the reasons you mention, but one of the things that stopped me was that I didn't know of too many good texts to add. A proposal page might make that easier. --Spangineerwp (háblame) 14:11, 6 February 2009 (UTC)
I think there is a place for both a box for strictly new works, and a box for works on a theme, and I would give precedence to new works over a theme. Yann (talk) 15:18, 6 February 2009 (UTC)

I must agree. I find the constant use of themes extremely annoying. With the global focus these themes take, it wouldn't be hard to constantly find a theme for every week, thereby leaving out the non-thematic work that our contributors work so hard on. I would like to see a more restrictive use of themes (like, one per month) so that: (1) we don't cycle through all the major world holidays/events so quickly, and (2) we can make sure that the hard work of other contributors who don't have any interest in adding works about the inauguration, Christmas, or other themes still gets recognized.—Zhaladshar (Talk) 15:49, 6 February 2009 (UTC)

It's just a mistake - the Obama theme was not supposed to run that long. Anyone can feel free to update with other stuff now. Cirt (talk) 18:05, 6 February 2009 (UTC)
The themes seem interesting. According to the notes on the template, etc, they only run for a week, and the template hasn't been edited for two, so it looks like it's just overdue for an update. Maybe it should be written more clearly somewhere; i.e. The current theme, until January 30, is Barack Obama or something to encourage people to disrupt the theme at the right time. -Steve Sanbeg (talk) 18:17, 6 February 2009 (UTC)
I think at least part of the issue is that it's tough to come up with a completely new list of works, which is what the person has to do who does the switching from "theme" to "new works". Normally it's a piece of cake to add one or two to the already existing list, but looking through recent changes for 6-8 of them isn't as easy. That's why I'm thinking a staging area of some sort would be a good idea when these themes are running, so that when the week or two weeks or whatever is over, there's a list of works ready to go on the main page. --Spangineerwp (háblame) 20:18, 6 February 2009 (UTC)
Easier process - when the theme is over, simply remove the theme formatting, but keep the list of themed documents. Then, they will get pushed down by random new documents during the "normal" new documents period, before the next theme. Cirt (talk) 00:10, 7 February 2009 (UTC)
Personally I wouldn't want to dispose of the theme aspect. There is plenty of opportunity to market the site based on a theme. Certain events will have people searching and looking for information. Being the Source, we can highlight both new texts or thematic texts per events. That said, themes don't have to be new texts, they also present the opportunity to present anew our treasures. Cirt's process seems worthwhile to try, and it can be used to book in certain dates -- billinghurst (talk) 01:22, 7 February 2009 (UTC)

One of the issues for me is that I no longer feel particularly welcome to update that template — I feel like it has become the domain of a few editors, and that I must temper my boldness with care not to upset their plans. (This is not a criticism of them) The creation of a staging area would, I think, institutionalise that. Therefore, if the template is to continue much as it has been, then I much prefer Cirt's suggestion that we simply remove the theme banner and rotate the thematic texts out as per normal.

I still think there is merit in the idea of providing separate Main Page areas for New texts and Topical texts. Per Billinghurst, there are good reasons for us to maintain a list of topical texts on the main page: the problem is that doing it via New texts defeats the original point of that template. Spangineer suggested there would be an additional maintenance load; but I suspect that the same people who update the themes at New texts would willingly update them at Topical texts instead, and that this would allow New texts to continue as it has done for years, which is virtually maintenance-free. And finally, the Main Page appears remarkably uncluttered on my screen: like it would benefit from some more material.

Hesperian 04:01, 7 February 2009 (UTC)

Typically I just revert to the "New texts" that were displayed before the theme was added -- but you're right, like Song of the Day, Collab of the Week and others, we do need to try and get more people involved in updating it on time. However a "staging area" seems dumb, just use the talk page to say "Hey, when X is over, can somebody remember to add Y?" Sherurcij Collaboration of the Week: Author:Charles Sheldon. 04:35, 7 February 2009 (UTC)
Sure; I didn't mean anything complicated, and indeed, the talk page is the easiest place to put that kind of stuff. --Spangineerwp (háblame) 05:23, 7 February 2009 (UTC)

I really like Hesperian (talkcontribs)'s idea - of having the New texts section of the Main Page just be for literally new texts, and have a new section on the Main Page for themed Topical texts. Cirt (talk) 07:39, 7 February 2009 (UTC)


Testimony from a British courtroom[edit]

The transcript of testimony from a British anti-terrorism case is released...what license would that fall under? Are we able to host it here? Sherurcij Collaboration of the Week: Author:Nostradamus‎. 01:43, 1 January 2009 (UTC)

I imagine it's crown copyright, but it should be ok to republish it here once you acknowledge this and indicate the source. Blue-Haired Lawyer (talk) 16:52, 1 January 2009 (UTC)
{{PD-EdictGov}} seems to apply to the judgments, not the testimony, based on the wording of it currently. Sherurcij Collaboration of the Week: Author:Nostradamus‎. 16:55, 1 January 2009 (UTC)
One first needs to determine whether this material is copyrightable in the first place. If not, the question of licensing doesn't matter. The other issue for our hosting is one of verifiability. Where can a reader go to verify that we have an accurate report of the testimony. Eclecticology (talk) 18:40, 1 January 2009 (UTC)
Well I'm asking the question about the licensing, on the assumption that verifiability is a separate issue. I'm not worried about verifiability, and I would assume testimony is copyrightable unless there's a clause I'm forgetting -- so it comes down to a question of what exemption we use. Sherurcij Collaboration of the Week: Author:Nostradamus‎. 19:08, 1 January 2009 (UTC)
I agree that verifiability is a separate issue, but it does need to be kept in mind. Your "assumption" that it is copyrightable does not make it so. The section to look at is the general one that makes anything copyrightable. If the testimony cannot fit into those provisions in the first place one need not look into the exceptions. Eclecticology (talk) 19:25, 2 January 2009 (UTC)
...which would be why I'm asking people for help. Not for "You have to find out if it's copyrighted", I'm asking if it's copyrighted/ableToUseHere/etc. Anyways, court transcripts are hardly a concern to be verifiable; shouldn't really be an issue on any front -- just the outstanding question of copyright. Sherurcij Collaboration of the Week: Author:Nostradamus‎. 21:25, 2 January 2009 (UTC)
Of course if it's copyright in the first place is the most important question. Your neologism "copyrighted/ableToUseHere/etc." would also depend on it. It seems quite clear to me that courtroom testimony cannot qualify as an "original work of authorship". A tag that says "PD-not copyrightable" would likely be appropriate. Eclecticology (talk) 07:50, 3 January 2009 (UTC)
I would definitely agree that in principle, courtroom testimony ought not to be copyrightable, and I certainly hope that there is some British law that rules so, but I feel compelled to point out that there seem to be many precedents of transcriptions of things in general being copyrighted. The various forces that pursue establishment and enforcement of copyright law in general, which have successively and continuously been getting laws changed all over the world in the past several decades, have unfortunately been extremely invasive into the fabric of international society and law. They have been successful in shifting what was once the practical (and legal) default assumption of "not copyrighted unless otherwise stated" to instead be "copyrighted unless otherwise stated". (Which really sucks and amounts to a major demolition of property within the public trust - basically the ultimate grand theft of intellectual property in history - but that is what has happened.)
So it seems to me that we would need to be able to point to some very firm legal precedent or code to declare any category of works as not being copyrightable as well as establish what criteria is necessary to document that a work is within a given non-copyrightable category. --❨Ṩtruthious ℬandersnatch❩ 14:18, 8 January 2009 (UTC)
But here's an idea if we were going to skip that: we could describe a courtroom transcript as a "derivative work of a creative act of speech that occurred in a public venue." --❨Ṩtruthious ℬandersnatch❩ 14:36, 8 January 2009 (UTC)

(unindent) It is nearly impossible to base a successful strategy on the premise of self-defeat. Copyright is a creation of statute, and, at least in common law countries, that means that if copyrightability is challenged it is easy to shift the burden of proof to the person claiming the copyright. Hoping that a direct British precedent will come out of the woodwork is not likely to be helpful if none exists. Speaking of "many precedents of transcriptions of things in general" is exactly what makes this approach defeatist. It is not the quantity of such precedents that matters, but their direct relevance to the matter at hand. I agree that vested interests have become extremely invasive in recent decades in protecting what they consider to be their rights. Commercial interests can afford to put money into protecting their economic rights; for those of us who act in the public interest the legal costs can be disproportional. Those who would normally be defendants in a copyright case also do not go around starting cases; there would be no benefit to doing so. A lot of these precedents don't exist because they involve processes which, until recently, were not practical for a member of the general public. Before the internet age, transcribing, printing and freely distributing a thousand-page court transcript was not economically feasible; now, only the first of those three steps remains a significant barrier.

Of themselves, speeches are not copyright, because of the lack of fixation; only their later fixation generates copyright. (There have been cases dealing with this.) Viva voce courtroom testimony comes down to a series of speeches which do not become "fixed" until a transcript has been published. The transcriber may receive a limited copyright in the compilation, but that does not give him a copyright in the individual speeches; if it exists at all it likely belongs to the witnesses themselves. Perhaps there, only perjured testimony may meet the test of originality.

Thus, waiting for a definitive precedent on this and many other copyright issues is a loser's strategy. It capitulates to the very activities you purport to oppose, and ensures their perpetuation. Eclecticology (talk) 19:42, 8 January 2009 (UTC)

I guess I don't see this as a matter of strategy at all; in response to the question posed I attempted to determine the copyright status of the work, not play some game or fight some battle.
If declaring that courtroom transcripts are non-copyrightable is some sort of strategy on your part to accomplish a goal related to the way you think copyright law should work, rather than a genuine effort to assess the actual legal copyright status of the work under current law, I guess we just had different interpretations of Sherurcij's question. But in any case we're talking about different things.
As you say, "The transcriber may receive a limited copyright in the compilation"; this would appear to completely controvert your assertion that this kind of thing is not copyrightable. It's not defeatism or any sort of gambit, it's an attempt to factually assess the claim that it's not copyrightable.
You yourself are the one who said above, "Your "assumption" that it is copyrightable does not make it so." By the same token, our desire that it should belong to a category of works which are impossible to copyright does not make it so. --❨Ṩtruthious ℬandersnatch❩ 17:22, 10 January 2009 (UTC)
I'm not the one engaging in political rants about what copyright law "ought" to be. Strategic considerations are about how we can best deal with the copyright law that exists, or how we can best deal with things that are not clearly addressed in the law. "May receive a limited copyright" is not contradictory at all because a compiler does not receive any copyright in the individual items he puts into his compilation; he only gets it on the way he puts them together.
The reality of law in general is that very little is black and white. With copyright law in particular electronic communications has raised a large range of questions that were unimaginable in a purely dead-tree era. In that era it was impossible to legislate about what could not be imagined. In court, particularly when there are no significant factual disputes, it is customary for each party to propose the most favorable interpretation of the law as it exists. In the absence of clear legislation the interpretation gap is especially wide.
My argument in this case is simple. The testimony is not an "original work of authorship", and it has not been "fixed" under the authority of the putative copyright owners, the witnesses giving the testimony. What is your counter-argument to that? Eclecticology (talk) 19:49, 10 January 2009 (UTC)
If, as you say, very little is black and white in law (which I would agree with), then that would seem to support exactly what I'm saying: declaring a category of works to be non-copyrightable is an extremely absolute statement. We ought not to make such a declaration entirely upon our own cognizance if very little in law is black and white. As I said above, we ought to establish a firm basis external to the Wikisource project before declaring a category of works non-copyrightable.
I wouldn't object to some language that says we think something hasn't been copyrighted, but declaring it non-copyrightable (without prominently stating that this is our own personal conclusion) seems to me beyond our purview and not especially honest. I think that AllanHainey has the right sort of idea trying to track down analogous situations. --❨Ṩtruthious ℬandersnatch❩ 16:16, 12 January 2009 (UTC)
Stating that something is in fact copyright can be just as absolute. Non-copyrightability is only one of several possible reasons why some writing is in the public domain. I have no problem with stating that we have taken a position that something is not copyright because it is not copyrightable; that would certainly be consistent with the non-declaratory position that I have been taking all along. With this issue and many others the "firm external basis" simply does not exist. If you know of one, let us know too. Eclecticology (talk) 09:18, 13 January 2009 (UTC)
I'm having difficulty parsing that last post, but as long as we're going to avoid making the definite statement "This cannot be copyrighted" in a footer copyright notice without any firm external basis for such a statement, but would instead clearly indicate that any assertion of non-copyrightability isn't backed up in the same way that, say, the {{PD-1923}} footer notice is backed up with a firm external basis, I think we agree. --❨Ṩtruthious ℬandersnatch❩ 02:26, 17 January 2009 (UTC)
To make it absolutely clear, this applies equally to "This can be copyrighted." Eclecticology (talk) 19:38, 19 January 2009 (UTC)
This seems like the essence of extemporaneous speech, with the exception of exhibits and speeches and other prepared material. I think it would fall under the existing rules Wikisource has for stuff like that.--Prosfilaes (talk) 17:48, 10 January 2009 (UTC)
I agree. Sherurcij's original question was phrased in terms of "What licence ...?" If something is in the public domain no licence is needed. Eclecticology (talk) 19:49, 10 January 2009 (UTC)
I've done a little bit of research on UK copyright law, I can't find a definitive answer but some info indicates that the transcript is non-copyright (unless copyrighted by the transcriber). [[3]] doesn't include court records in the list of types of work which can be copyrighted by UK law. This site [[4]] states
"“A work can only be original if it is the result of independent creative effort. It will not be original if it has been copied from something that already exists. If it is similar to something that already exists but there has been no copying from the existing work either directly or indirectly, then it may be original.”
UK Intellectual Property
This would seem to rule out claiming of copyright on transcripts. It would certainly be ruled out in USA law because the only grounds would be "sweat of the brow" which has been deprecated. Again, of course, any commentary added would be an original work."
To obtain a transcript what seems to happen is that you contact the Court concerned with details of the case, parties, dates etc. You then select a transcriber, the Court sends the tapes to the transcriber and you pay a fee. It may well be that the transcriber creates a copyright over the work (though I think this is unlikely from the above comments), though they could have a copyright over the format/style of their transcript if not the actual text taken from the text. AllanHainey (talk) 13:48, 12 January 2009 (UTC)
I thought it'd be helpful to see if those UK court transcripts already on-line are noted as copyrighted. Unfortunately, those at the British and Irish Legal Information Institute [[5]] all seem to be noted as crown copyright, and the site notes:
The copyright in the text of legislation and judgments displayed on BAILII's website may belong to courts, other government bodies, judges, and/or to commercial publishers. BAILII cannot authorize any copying of such material, and users of BAILII's website are referred to the copyright policy of the relevant copyright owners. BAILII endeavours to indicate the existence of third party copyright on the pages of databases and individual judgments, but users remain responsible for checking whether their use of the materials is authorized. AllanHainey (talk) 14:10, 12 January 2009 (UTC)
Looking at that further leads to this, where the third paragraph discusses Crown Copyright in a way that we would likely find acceptable. Eclecticology (talk) 08:42, 13 January 2009 (UTC)
I hate to say it (well ok not really) but this is more or less what I said in the beginning. As far as I know similar terms of use apply in commonwealth countries, Ireland and the European Union. — Blue-Haired Lawyer 00:15, 29 January 2009 (UTC)

Copyright renewals[edit]


I could not find a renewal for this book: wikilivres:Image:Plato's theory of knowledge.djvu. Can I import it to WS? I also would like a confirmation regarding The Origins of Totalitarianism by Hannah Arendt. Sherurcij wrote here that the renewal is for the 1952 edition. However the Registration Date is 22 March 1951. Can someone explain this further? Thanks, Yann (talk) 15:17, 10 January 2009 (UTC)

I'm a bit confused myself what I was using at the time, I would assume pagescans of a 1950 edition, though I can't find that online right now; so I can't really "back up" my November claim since I don't remember what my reasoning was at the time for believing "the renewel is specifically for the 1952 "enlarged" edition, not the original 1950 edition. IA is erroneously hosting the 1962 copy of the 52 edition, which would not appear to be PD". Sherurcij Collaboration of the Week: Author:Nostradamus‎. 16:02, 10 January 2009 (UTC)
Plato's theory of knowledge was originally published in 1935. This included US publication by Harcourt-Brace. Cornford died in 1943, so maybe there was no-one there to renew it. This should be good to import.
The original 1948 Origins of totalitarianism was renewed, R620089. It does not appear to be usable here. Eclecticology (talk) 18:41, 10 January 2009 (UTC)
From the Geneva University library, I got the first edition from 1951. Eclecticology, the renewal you mentioned is for a publication “In Partisan review, July 1. 1948.” My book specifically mentioned that “scattered passages of this book have appeared in the pages of the following magazines: ..., Partisan Review, ...” It looks like this is not the same text. Furthermore, this renewal is not mentioned in the Stanford database. Yann (talk) 15:32, 12 February 2009 (UTC)
I usually make use of the Rutgers database, which seems a little more comprehensive than the one at Stanford. From what you say the suggested answer is that those pages which appeared in Partisan Review would be governed by that renewal, and that the pages from other magazines may be restricted on the basis of what happened with those magazines. We may end up with a patchwork of copyright and public domain material. Eclecticology (talk) 22:19, 12 February 2009 (UTC)

On the same subject, here is a study [6] by Peter B. Hirtle, from Cornell University: "Copyright Renewal, Copyright Restoration, and the Difficulty of Determining Copyright Status" which has far-reaching consequences for WS. Dozens of works may have to be deleted. Yann (talk) 14:33, 12 January 2009 (UTC)

I've just read through this very interesting article, and also checked to see what the announced abridgements were that he mentioned. See [7]. Hirtle gives a lot of fine intellectual and theoretical analysis of the law involved, but next to nothing in the way of court decisions one way or another. The average judge may have little patience for some of his hair-splitting arguments, absent real courtroom precedent.
Perfectionistically "doing the right thing" comes with a cost. Not the least of that cost is the time and effort required to properly investigate each individual situation. We really need to analyze the point where we transition from a pure adherence to the letter of the law, however absurd, to a risk management approach. A risk management approach accepts that there is a remote possibility that bad things could happen, but that the costs of protecting oneself are greater than the probable costs of harm.
The omitted portions of the essay were from the risk management section. Thus:
"Settling such an unlikely suit might be less expensive than conducting the incredibly thorough analysis needed to establish copyright status with the highest degree of certainty. The experience of the Internet Archive in this regard may be instructive. It, and in particular the Universal Digital Library found in the Archive, contains some titles that may have had their copyrights restored. Yet to date there have been no reported actions against the Internet Archive for copyright infringement of restored works, nor have there been any actions for contributory infringement reported against a library that provided the volumes. An institution might decide, therefore, that while the issues described in this paper are a theoretical possibility, they are unlikely to be an issue in practice. After careful analysis, the institution might conclude that digitization of some works can be risked even when it cannot be established with 100% certainty that the work is in the public domain."
What remains is to know where to cross over between absolute certainty and sane risk management. Eclecticology (talk) 01:55, 13 January 2009 (UTC)
I, too, would favor such an approach. It is very sad that many Commons admins adopt a prove-or-delete standpoint as a matter of fundamental principle, which means that these scans can't be hosted on Commons. I have several times proposed a more pragmatic approach there, without any results so far. Yann (talk) 12:14, 14 January 2009 (UTC)

In regards to Plato's theory of knowledge by w:Francis Macdonald Cornford, the author is British and it was first published in the UK, so it fails {{PD-1996}}. John Vandenberg (chat) 00:14, 13 February 2009 (UTC)

That depends on whether it was simultaneously published in the US. Eclecticology (talk) 10:44, 14 February 2009 (UTC)

In regards to The Origins of Totalitarianism, by Hannah Arendt, it does not appear to have been published in Partisan Review. I have added a complete list of articles she published[8]; she wrote many reviews as well - I've not added those. I've also consulted the 1948 volume, and it really isn't in there, so I can't figure out what R620089 is referring to. Since it isn't easy to access the Rutgers search results, here is what they say about that renewal:

AUTH: Hannah Arendt. (In Partisan review, July 1. 1948)
TITL: The Origins of totalitarianism.
ODAT: 1Jul48; DREG: 18Dec75 RREG: R620089. RCLM: B152365. Hannah Arendt (A)

The 1951 edition appears to be the first (I don't have access to the first edition; I have in my hands a copy of the 1962 Meridian edition which says (c) 1951; Worldcat agrees), and it is covered by copyright Renewal RE035306. I assume that the typed edition we have from Congress (File:The Origins of Totalitarianism.djvu) is the same as this first edition, which means it is protected by copyright. I can't find a renewal for the 2nd enlarged edition of 1958, which is quite different from the text of File:The Origins of Totalitarianism.djvu. New matter in a 1966 edition was renewed. Is it possible for the second edition to be PD while the first is not? John Vandenberg (chat) 06:06, 13 February 2009 (UTC)

It is bizarre that she would renew the copyright on something that did not exist in the first place. The Project Gutenberg entry says,
The Origins of totalitarianism. By
Hannah Arendt. (In Partisan review, July
1. 1948) © 1Jul48; B152365. Hannah
Arendt (A); 18Dec75; R620089.
The absence from the Stanford base may be accounted for by the original registration being a "B" series registration, used for serials. It is also remarkable that the renewal was done by herself, as evidenced by the "(A)", 14 days after her death. Eclecticology (talk) 10:02, 13 February 2009 (UTC)
From what I understand, if any underlying work is copyrighted, that copyright also protects the derivative parts of any later work; that is, the copyright on the first edition, while it lasts, protects any parts of later editions that aren't new. I've seen court cases where a movie was held to be in copyright since the original play was, for example. However, despite this, the Superman cartoons are widely treated as public domain despite the fact that the first appearance of Superman, and hence the character himself, is still under copyright; either DC Comics doesn't care or has a different understanding of the law. It's long been an issue that left me scratching my head.--Prosfilaes (talk) 00:35, 14 February 2009 (UTC)
I believe that there are still squabbles going on in the courts about Superman, and whether his creator had legally given up his rights. Wide reputation is not enough to establish public domain. Eclecticology (talk) 10:44, 14 February 2009 (UTC)

File:The Origins of Totalitarianism.djvu is not the same as the first edition. It is much shorter, and apparently an unpublished manuscript. The text in Partisan review is only an extract, but I can't figure out which part of the book: none of the chapters is called The Concentration Camps. Yann (talk) 10:11, 13 February 2009 (UTC)

This is yet another twist in the case, as is the fact that the manuscript has been published by the Library of Congress. Apart from this, whether we accept 1948 or 1952 as the valid publication date for starting the copyright clock leaves us with the practical question of whether it goes into the public domain at the beginning of 2044 or 2048. That's not a very immediate problem. Under what authority did the Library of Congress republish the (unfixed) typescript? Would they engage in copyright violations? Eclecticology (talk) 10:44, 14 February 2009 (UTC)
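
The two dates in the comment above can be reproduced with a short sketch. It assumes a flat 95-year term counted from the publication year and running to the end of that calendar year, which is the usual rule of thumb for renewed US works of this period; the function name is hypothetical, and the sketch ignores every other factor raised in this thread (validity of the renewal, which text was actually published, foreign publication).

```python
# Assumption: 95 years of protection from the publication year,
# with the term running through December 31 of the final year.
US_TERM_YEARS = 95


def public_domain_year(publication_year, term_years=US_TERM_YEARS):
    """Return the year on whose January 1 the work would become public domain."""
    # Protection lasts through the end of (publication_year + term_years),
    # so the work is free from the start of the following year.
    return publication_year + term_years + 1


if __name__ == "__main__":
    for year in (1948, 1952):
        print(year, "->", public_domain_year(year))
```

Under that assumption, a 1948 publication gives 2044 and a 1952 publication gives 2048, matching the figures quoted in the comment.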

Texts and proofreading[edit]

The idea is to validate texts, not books, to be sure that the text is in accordance with the edition of reference, not with everything in the edition of reference, only the text.

I will explain what I mean: on we have validated a novel of Honoré de Balzac, the title is Eugénie Grandet. We have proofread it from a book where there were three novels, this one and two other ones. Of course the two other ones are not Eugénie Grandet.

Same thing here: we will validate the Lays of Marie de France, from a book where there are two texts which are neither of Marie de France nor Lays so I think it is right to validate this page without the two texts at the end of the page. We could create a template for this, and the template would explain things and would create the right categories. We could put this template on the Discussion page of the text.

What do you think of this idea? -Zyephyrus (talk) 13:46, 14 January 2009 (UTC)

If the text you're checking it against is the same version (publication date, etc) as has been uploaded here then I don't see a problem with it having been included in a folio of other books. AllanHainey (talk) 13:00, 21 January 2009 (UTC)
Is it better to have two different lists: 1. Indexes proofread by many users. 2. Texts proofread by many users. Or is what I have done sufficient? I have added a proofread text under "Finished projects" in the Transcription Project.---Zyephyrus (talk) 10:27, 26 January 2009 (UTC)

Category intersections[edit]

I just discovered that basic category intersection is working both on it.source and here.

Try this in the search box:

+incategory:"Early modern works" +incategory:"Poems"

You'll get works that are tagged with category:poems and with category:Early modern works. This means that really well-designed and fully populated "basic" categories (i.e., categories with only one "significant datum") can be used as keywords right now. But you have to break the guideline "don't categorize an article into a general category if a more detailed category exists".

The advantage is that a low number of well-designed categories can produce a very high number of possible intersections; just imagine, for authors, how many categories you'll need if you want to categorize by genre(2); nationality(say 25); profession (say 15) ; century (say 25)... You need 18750 categories to cover every possibility (the product of such numbers), where with category intersection you need only 67 "basic" categories (the sum of such numbers). --Alex brollo (talk) 00:14, 18 January 2009 (UTC)
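
The arithmetic in the comment above can be checked with a short sketch. The dimension sizes (2 genres, 25 nationalities, 15 professions, 25 centuries) are the illustrative figures from that comment; the function and variable names are hypothetical, and the query builder only reproduces the `+incategory:"…"` search-box syntax quoted earlier in this thread, not any official API.

```python
from math import prod

# Illustrative dimension sizes taken from the comment above.
DIMENSIONS = {"genre": 2, "nationality": 25, "profession": 15, "century": 25}


def combined_categories(dimensions):
    """Categories needed if every combination gets its own category (product)."""
    return prod(dimensions.values())


def basic_categories(dimensions):
    """Categories needed if each value is its own 'basic' category (sum)."""
    return sum(dimensions.values())


def intersection_query(*categories):
    """Build a search-box string that intersects the given categories."""
    return " ".join(f'+incategory:"{name}"' for name in categories)


if __name__ == "__main__":
    print(combined_categories(DIMENSIONS))  # product of the four sizes
    print(basic_categories(DIMENSIONS))     # sum of the four sizes
    print(intersection_query("Early modern works", "Poems"))
```

The product grows multiplicatively with every new dimension while the sum grows only by the size of that dimension, which is the whole case for keeping categories "basic" and intersecting them at search time.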

I agree with the principle that you outline here. It would just take a bit of work to make things more compliant. "Early Modern works" should probably be dumped in favour of something precise like "19th century works".
We should also make a distinction between the way we categorize authors and the way we categorize the articles themselves.
The subdivision of categories is still valid. Biology can subdivide into zoölogy and botany; Entomology and herpetology are only two of the valid subdivisions of zoölogy. Eclecticology (talk) 19:27, 19 January 2009 (UTC)
You're right; tree categorization is excellent but limited. Keyword search is sometimes needed, but it is hard to adopt because end users must be taught how to use it and special software is required. "Granular" categorization combined with category intersection search has, IMO, some of the interesting features of keyword search, and it can be searched with a tool the wiki already has. --Alex brollo (talk) 13:13, 26 January 2009 (UTC)

Getting Lives and subpages moved[edit]

I have just had a look at Lives and it has many subpages. From looking at the links, it would seem that not all the links relate to this work, and it may be more realistic for Lives to be a disambig page. Would someone with the powers please look to relocate the work. (Note: it is not evident where to request page moves.) Thanks. -- billinghurst (talk) 01:03, 22 January 2009 (UTC)

I'll go ahead and work on the disambiguation stuff.—Zhaladshar (Talk) 13:33, 22 January 2009 (UTC)
I've done about what I can. The pages that link to Lives from the Short Biographical Dictionary of English Literature confuse me. I don't know quite what to make of those links. Any ideas would be helpful.—Zhaladshar (Talk) 14:40, 22 January 2009 (UTC)
It looks like "Lives" from the Short Biographical Dictionary is shorthand to refer to old works that were used as references or are suggested further reading. For example, Google books has the 2nd edition of "The life of Jonathan Swift" by Henry Craik (published 1894), which according to the preface was published 11 years after the first edition. SBDEL/Swift, Jonathan refers to "Lives" by Craik (1882), which must refer to the first edition of Craik's work. The links should be updated to the actual works by those authors (perhaps linking the author rather than Lives; for example, [[The Life of Jonathan Swift|Craik]]). --Spangineerwp (háblame) 15:01, 22 January 2009 (UTC)
Thanks a bunch. That clears up some confusion. Now I just need to do a ton of research to be able to update all those links.—Zhaladshar (Talk) 15:08, 22 January 2009 (UTC)
It was from the Short Bio... that I came across the issue, so am happy to lend some assistance to sort that out. I just couldn't do anything until we had done the move, hence the request. That publication has also indicated a few other (little) issues that will push some cleanup. -- billinghurst (talk) 22:30, 22 January 2009 (UTC)
Zhaladshar. Working through these pages further, I have come to the conclusion that Cousins in Short Bio uses the terms Lives and Works liberally, more seemingly to reflect the type of work produced rather than to necessarily reflect the actual title. With a number of these I have wikilinked the author(s), and unlinked the Lives, especially where the reference has been a generic reference to publications by multiple authors. -- billinghurst (talk) 01:54, 3 February 2009 (UTC)
That's good to know. I've been following a similar practice when I come up to SBDEL entries that need disambiguation.—Zhaladshar (Talk) 14:27, 3 February 2009 (UTC)

Publisher on Author pages[edit]

I tripped over this page Portal:P. J. Kennedy & Sons and it doesn't fit nicely on Author template, and I wasn't sure if there had been some previous thoughts about this sort of subject matter? -- billinghurst (talk) 09:26, 25 January 2009 (UTC)

Yep, with no consensus; see the linked category. —Pathoschild 11:31:39, 25 January 2009 (UTC)
Heh, didn't know that - I created Author:William Blackwood back in December, then deleted it when I realised it was the name of the book's publisher, not author. Sherurcij Collaboration of the Week: Author:Bahá'u'lláh. 00:29, 27 January 2009 (UTC)
There are really two kinds of entries in that category: Corporate authors, which should be kept, and publishers where treating them as authors is misleading. The P. J. Kennedy page suggests some authors such as "Rev. L. L." That was that time's counterpart of our using pseudonyms here ourselves. The author should read as "L. L." Of course there are likely to be other authors with the same initials that could go on the same page with a note to the effect that they cannot be distinguished. Eclecticology (talk) 09:30, 11 February 2009 (UTC)

Wikisource:WikiProject United States Executive Orders

Wikisource:WikiProject United States Executive Orders has been created as a common area to discuss and improve this content on Wikisource. — MrDolomite | Talk 16:18, 27 January 2009 (UTC)

line numbers

Has Wikisource ever considered putting line numbers into texts? I was just trying to cite some lines from Romeo and Juliet and noticed that <gasp> there aren't any. It would help a bunch, not just for that article, but for all poems and/or works organized into lines. flaminglawyer 23:59, 27 January 2009 (UTC)

We've got plenty of works that have line numbers. The problem is, to do them well requires additional formatting which contributors might not know of or want to put in the effort to do. But there is no rule forbidding them, so you are more than welcome to add some.—Zhaladshar (Talk) 13:34, 28 January 2009 (UTC)
Are there any templates that would help in adding line numbers? flaminglawyer 14:15, 28 January 2009 (UTC)
Not that I know of, sorry. Maybe someone else can give an answer to this question.—Zhaladshar (Talk) 14:22, 28 January 2009 (UTC)
Using something like the {{verse}} template would be a good way to handle it—it makes it really easy to link directly to a line. I don't know how easy it would be to modify for lines as opposed to Bible verses. --Spangineerwp (háblame) 16:17, 28 January 2009 (UTC)
Note that just willy-nilly putting line numbers in doesn't help a lot; they have to match a reference edition. (One volume I worked on for Project Gutenberg[9] had three sets of reference numbers in the margin.) In Romeo and Juliet's case...why are you citing this version in the first place? Many of Shakespeare's plays, including R&J, have a complex enough history that you've got to include which version in your citation, and line numbers aren't meaningful without the exact version labeled. WS's R&J comes from an unknown source and isn't worth citing anywhere you actually want to cite something with line numbers.--Prosfilaes (talk) 15:39, 28 January 2009 (UTC)
I assume "WS" in the above comment refers to Wikisource, not to William Shakespeare. Angr 18:49, 29 January 2009 (UTC)
There is a similar discussion above and Ec mentioned a template used on a transwiki. Supposedly worked as long as it wasn't transcluded. -- billinghurst (talk) 21:04, 28 January 2009 (UTC)
I figured out that the {{verse}} template works—it's used on Paradise Lost. If you notice note 22 of this work, you'll see the link to line 795 of Book XI. Not the most elegant solution, but functional. --Spangineerwp (háblame) 18:58, 29 January 2009 (UTC)
I believe you'll find Paradise Lost (1667)/Book I to be a prettier (and more complicated) solution. And thank you for reminding me to disambiguate the two editions. Psychless 00:38, 30 January 2009 (UTC)

Project of the Month


Is the Proposal page the place to propose a PotM or is it better to propose it on the Scriptorium? ---Zyephyrus (talk) 23:52, 30 January 2009 (UTC)

End-of-page wordspace issue

As a brand-new contributor, I immediately noticed a regularly-occurring apparent omission of wordspaces in texts such as that in the first line under page 30 here. I jumped in, went to the previous page and added a space to the first one I saw. Then I noticed, hmmm, that the anomaly was very profuse in occurrence, hence officially tolerated, eh?? A search of Help failed to provide an answer, so, what's with it, folks? Should I insert the spaces or not, please? Bjenks (talk) 00:00, 31 January 2009 (UTC)

Hrm, interesting problem. I haven't seen that before (I haven't played with the Page namespace much) but I'd say that this needs to be addressed and my vote would be to take care of it on Page:Wind in the Willows (1913).djvu/30, not in the source of The Wind in the Willows/Chapter 1. Perhaps start the line with a space, assuming that doesn't cause the software to kick into "pre" mode. --Spangineerwp (háblame) 00:57, 31 January 2009 (UTC)
Thanks--so the problem's real. I still think a page-end wordspace is the answer (but only, of course, when there is a broken sentence). Surely this could be handled by someone with bot skills (not me, for sure!). Bjenks (talk) 01:32, 31 January 2009 (UTC)
In my experience, imported texts have had the additional space at the end of the texts. The issue is that often at the end of these texts there is gmph, and one has to delete that and remember to leave an ending space. Possibly the technical solution is, when transcluding each Page text, to add a space, knowing that if a space already exists it will be ignored by the HTML when constructed. As an interim measure, try to add a space between the transcluded pages in the Main NS where they are constructed, rather than on the Page: constructions. In the end, whichever is done improves the work. :-) billinghurst (talk) 03:17, 31 January 2009 (UTC)
I don't even see what's wrong. What's the issue? I ask because I transclude a lot of pages and want to make sure I'm not replicating the issue.—Zhaladshar (Talk) 14:22, 31 January 2009 (UTC)
The issue is that when we transclude consecutive pages, we need to include a space somewhere so that we don't have two words without a space between them. In the example given by Bjenks, there was no space after the last word on Page:Wind in the Willows (1913).djvu/29, so when it and the following page were transcluded on The Wind in the Willows/Chapter 1#30, the line contained "heremarked" instead of "he remarked". Bottom line seems to be that we need to make sure that we include word spaces at the end of pages, but as billinghurst points out, there's more than one way to handle it. --Spangineerwp (háblame) 14:53, 31 January 2009 (UTC)
That's weird, because I don't have that problem at all with my texts. I never added spaces at the end of each Page: and when they transclude the page shows up. (See The Study of a Country for an example.)—Zhaladshar (Talk) 15:05, 31 January 2009 (UTC)
That's because you have spaces between transclusions—i.e., the fact that you have a line break between {{Page|Historical Lectures and Addresses.djvu/300}} and {{Page|Historical Lectures and Addresses.djvu/301}} in the source creates the space (at least I think that's what's happening). In Wind in the Willows, there's no space in the source between transclusions, so a space is necessary at the end of the proof page. Your way seems to work too, though, so I don't think it makes any difference. --Spangineerwp (háblame) 15:24, 31 January 2009 (UTC)
I too do like Zhaladshar, and I think that's a better solution. It also makes the transcluded page much more readable if there is a line break between each page. Yann (talk) 16:41, 31 January 2009 (UTC)
Good gets! I also noticed that some have a better (in my opinion) means of partial transclusion than the instructions given at Help:Side by side image view for proofreading. I think that we can update the instructions at that page and get a better result. I have started a conversation at that talk page to progress this matter. -- billinghurst (talk) 00:53, 1 February 2009 (UTC)
Oddly enough, I've always had the opposite problem - when I have a hyphenated word between two pages I can never get rid of the space between the halves. But possibly it's different in my case because I usually put the HTML paragraph tags in myself rather than letting the MediaWiki software do it.
Fortunately there are the templates {{Hyphenated word start}} and {{Hyphenated word end}} for this. --❨Ṩtruthious ℬandersnatch❩ 17:29, 3 February 2009 (UTC)
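
The word-space behaviour described in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: join_pages is a hypothetical helper, not MediaWiki or ProofreadPage code, the page texts are made up, and the sketch assumes that consecutive transclusions are simply concatenated and that runs of whitespace (including a line break between two transclusion calls) render as a single space.

```python
def join_pages(pages, separator=""):
    """Model how consecutive transcluded pages are assumed to render.

    `pages` is a list of page texts; `separator` stands for whatever sits
    between two transclusion calls in the source ("" for nothing, "\n" for
    a line break). Whitespace runs collapse to one space, as in HTML.
    This is a hypothetical helper for illustration only.
    """
    joined = separator.join(pages)
    return " ".join(joined.split())


page_29 = "Then he"          # made-up page text with no trailing space
page_30 = "remarked calmly"  # made-up continuation on the next page

print(join_pages([page_29, page_30]))        # no space anywhere: words run together
print(join_pages([page_29 + " ", page_30]))  # space added at the end of the Page: text
print(join_pages([page_29, page_30], "\n"))  # line break between the transclusions
```

Under these assumptions either fix (a trailing space on the page, or a line break between the transclusions) gives the same rendered text, which matches what the thread reports.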

Public domain books - how to upload?

I have some public domain books, which I'd like to somehow make available here. How do I go about doing so? There doesn't seem to be an obvious page saying how to get the process started...

Specifically, I have some books from the start of the 20th century, some publications by NASA, and three books from the MIT Radiation Laboratory, published in the late 1940s and stating "The publishers have agreed that ten years after the date on which each volume of this series is issued, the copyright thereon shall be relinquished, and the work shall become part of the public domain."

Thanks. Mike Peel (talk) 18:36, 31 January 2009 (UTC)

Hi Mike. Nice. We upload files to Commons, though they need to be no larger than 100 MB (if bigger, they need splitting). Ideally you want DJVU files with a text layer. From there, create an Index:Filename.djvu page (ensure that the Commons filename.djvu and Index:filename.djvu match). Here is an example of one of mine: Index:List of Carthusians 1800-1879.djvu. We have a couple of people with bots who can apply text to match the images, and a page to cover those requests. -- billinghurst (talk) 11:17, 1 February 2009 (UTC)

Question regarding some US Govt/Department of State "Foreign Media Reaction" reports.

Hi, I'm an editor over on I had a question about whether certain texts would be eligible for inclusion on Wikisource: Specifically, the US Department of State released something online called the "Foreign Media Reaction Issue Focus" briefing, for many years, up until 2005, when it mysteriously disappeared from the US State Dept's website. Basically the product was a weekly translation, excerpting, summarizing, and commenting on non-English newspaper editorials about US foreign policy.

There is a mirror of the (2005 and earlier) Foreign Media Reaction reports online at: so that the editors can get a feeling for what exactly these reports look like.

I am wondering if: 1. This has a compatible copyright status to be included on Wikisource? Since it is a work of the USG, it is public domain - but there are excerpts of foreign newspapers included in the work. 2. This is a kind of material that Wikisource would like to include? These reports are interesting to say the least, but I don't know if it meets the inclusion criteria.

If this could be looked into, I would appreciate it, as this is a valuable resource that needs a permanent home, and I would be willing to work on wikifying it. Katana0182 (talk) 00:24, 2 February 2009 (UTC)

My initial thought is that there would be copyright problems, with excerpts from published and copyrighted news articles. Though it is something we would like to have if there was not a copyright issue. Jeepday (talk) 00:26, 4 February 2009 (UTC)
This is a situation where the proper application of fair use should prevail. Each quotation seems to be no longer than a single paragraph from what is likely a much longer article. In one page I find, "EDITOR'S NOTE: This analysis is based on 57 reports from 29 countries." The extracts are for the purpose of research, and do not infringe upon the newspaper publishers' rights to profit. If anything, they could inspire the reader to seek out the entire article where he could see the quote in context. If there is a copyright problem it's not because of the law, but one of our own making arising from a too rigid interpretation of the idea of fair use. One point that I would make if this source is to be used: The site is a private site, and not the State Department's own site; that indirect sourcing would need to be acknowledged. Eclecticology (talk) 18:54, 11 February 2009 (UTC)


User: has created Template:PD-USGov-POTUS and has started to replace Template:PD-USGov with it in appropriate places. There's no legal point in having a separate template for actions of the President versus actions of the US Government as a whole, so I don't see why it should exist. Comments?--Prosfilaes (talk) 16:19, 2 February 2009 (UTC)

Many specialized templates exist on Commons, including this one, so I suspect that that's what caused the anon to think that it might be helpful here. But I agree that there's no reason for it here—works by the president should be indicated as such through categories and Author pages; there's no need for a redundant template. --Spangineerwp (háblame) 17:27, 2 February 2009 (UTC)
From looking at the changes, it would seem that the difference is that it adds a Category, so it seems to be a convenience issue. It would probably be better if we kept the original template, and added a pipe (|POTUS) that allowed the addition of the required category. -- billinghurst (talk) 01:23, 3 February 2009 (UTC)
The problem with categorization via templates is that it's not intelligent. For example, if we create a "Barack Obama" category or a "19th century presidents" category (two bad examples, but hopefully the point gets across), a work with the POTUS template ends up in category:Barack Obama (because that's where it belongs) and in category:Presidents of the United States (because that's where the template puts it). Redundant. Much simpler (and I'd argue, more "convenient") is to use templates for licensing (including license organization via categories) and categories for content organization, and ne'er the twain shall meet. --Spangineerwp (háblame) 02:31, 3 February 2009 (UTC)
The only category this template adds pages to is a license category. I see no reason to have the unnecessary vagueness of using one template for all U.S. Government works that has no explanation as to its actual origin. This template is also short, adding origin information in two sentences, as opposed to the vague template's one sentence. Also, there are many pages on which this template could be used, which I believe may justify its existence. I don't really see how you can say that it is that much more redundant than even PD-USGov, because one could just as easily have a template that says, "This work is in the public domain." without any explanation, although it may be assumed that if it appears here, it is public domain/ineligible for copyright. This template seems to provide what a license template should provide: an explanation of the text's origin and why it is public domain. If the consensus is in favor of more general license tags, I will undo my actions. Thank you.
-- 14:03, 3 February 2009 (UTC)
It's established convention here to have PD templates that state the "reason" that the work is PD—so many years after the author's death, published in the US prior to 1923, etc. A template for the federal government does the same thing—it identifies why a work is in the public domain. A template for presidents specifically doesn't add any additional information, because all works by presidents are licensed exactly like works by the rest of the federal government. It's unnecessarily specific, and if carried to its logical conclusion, it means that we'll have dozens of templates to maintain. Works by the CIA, works by the FBI, works by attorney generals, works by secretaries of state, etc. etc. Tons of templates, with nothing gained. I think it's much better to simply create categories when we need to keep things organized—the combination of template ("this work is PD because it's a work of the federal government") combined with the appropriate category ("US presidential inauguration speeches") tells the reader exactly what the license status is, and what kind of work it is. No additional templates required. --Spangineerwp (háblame) 18:02, 3 February 2009 (UTC)
Yeah; what Spangineer said at 18:02, 3 February 2009, above. Jeepday (talk) 00:23, 4 February 2009 (UTC)

Gadget preloading header


I asked for 2 changes in this script (MediaWiki talk:Gadget-TemplatePreloader.js) 6 months back, and nobody answered. :o( Can someone help please? In addition, I would like to know why it doesn't work the same way on Wikilivres. Thanks, Yann (talk) 19:33, 4 February 2009 (UTC)

It is a good source

--Wmrwiki (talk) 06:41, 6 February 2009 (UTC)

It's a mediocre source; raw OCR is pretty lousy. It doesn't compare to some of Project Gutenberg's recent stuff, or anything that's been properly proofread.--Prosfilaes (talk) 00:47, 7 February 2009 (UTC)

Popular Science Monthly Project

Guys, I started a new project that contains issues of the popular magazine Popular Science Monthly. This magazine is now known as simply Popular Science. Internet Archive has about 90 volumes which we can add. In addition, Google has a bunch of issues as well, but we can't download a lot of them. :( We might want to purchase some old editions off of ebay or Amazon to get some of the issues we don't have. Any help with this project would be greatly appreciated. So far I have uploaded the first 10 volumes. --Mattwj2002 (talk) 04:51, 7 February 2009 (UTC)

The project is certainly a worthwhile one, and the oldest (and public domain) volumes were clearly more informative than what the modern dumbed-down magazine has become. Still I wonder what added value we contribute when we just copy over scanned files from the Internet Archive. A single volume of about 800 pages would be a lot more useful if everything there could be proofread, but the sheer bulk of the scanned material that is in our files already exceeds what we are capable of proofreading over the next few years. A mildly famous person has recently stated that perfection is the enemy of the good, and we are well on the way to substantiating that truth.
A more practical issue about the scans. Would it not be possible to have the page numbers of the scans match the page numbers in the volumes themselves? It seems that in the past the added prefatory material was numbered with Roman numerals. Eclecticology (talk) 19:59, 11 February 2009 (UTC)


Jack makes a good point in this edit, on nomenclature for Inactive admins. Any thoughts on a friendlier naming convention for Admins who are retired? Jeepday (talk) 13:21, 7 February 2009 (UTC)

promote to "administrator emeritus". :-D Hesperian 05:02, 8 February 2009 (UTC)
Former admin is nicely factual. —Pathoschild 06:26:37, 08 February 2009 (UTC)
I don't think we should be making a big deal out of being an admin. Or losing the flag. I don't think we need different names for "Desysop". Just explain further in your reason that it is for inactivity. --BirgitteSB 03:22, 9 February 2009 (UTC)

how to produce small footnotes?

Does anyone know how to make footnotes smaller? If you use


the footnotes themselves are shrunk, but the footnote numbers stay big, and it looks awful; see, for example, Page:Miscellaneousbot01brow.djvu/29.

On Wikipedia this is done with {{reflist}}, and I note that someone has imported that over here. But our version doesn't actually do anything because this template appears to work by invoking css class "references-small", and I guess our global style-sheet does not define it. Me, I'm border-line too stupid to fool around with site-wide style-sheets, so I'm hoping someone else might volunteer or come up with some other solution for me.

Hesperian 05:09, 8 February 2009 (UTC)

You need to wrap the <references /> in a block element (i.e., a <div>). {{smaller}} doesn't work well because it's an inline element (specifically a <span>). For example, the following would work fine (and {{reflist}} could easily be changed to this):
<div style="font-size:smaller;"><references /></div>
Pathoschild 06:32:07, 08 February 2009 (UTC)
Sweet! Thanks Pathoschild. Hesperian 10:14, 8 February 2009 (UTC)
When we use different percentages as for instance
<div style="font-size:80%;"><references /></div>
<div style="font-size:90%;"><references /></div>
is it accepted? It would let us decide whether we want very small or less small footnotes. Or is it better to have always the same size?---Zyephyrus (talk) 11:02, 8 February 2009 (UTC)
Consistency is nice, unless there's a particular reason to change the size (say, to match the format in a book). —Pathoschild 11:08:12, 08 February 2009 (UTC)
{{smallrefs}} produces small references; you can change the size using {{smallrefs|80%}}. —Pathoschild 19:54:44, 12 February 2009 (UTC)
Thanks again. Hesperian 22:20, 12 February 2009 (UTC)

Can we record this?

Wikileaks Publishes $1B of Public Domain Research Reports,download.--Wmrwiki (talk) 11:49, 9 February 2009 (UTC)

Based on this "Wikileaks is a website that publishes anonymous submissions and leaks of sensitive governmental, corporate, or religious documents, while attempting to preserve the anonymity and untraceability of its contributors.", I would say no, as it would not be possible to verify that documents received by anonymous submission were actually U.S. Government products. Jeepday (talk) 18:31, 9 February 2009 (UTC)

Request community permission for bot for SDrewthbot

I am looking to use w:WP:AWB to do some tidying, Author pages for DNB project. It would be semi-automated in that I would review each change, though have the software select, and do. To do this, I would like to utilise the account User:SDrewthbot, have it identified as a bot and apply my existing permissions. I have reasonable experience with AWB with ~1.5k edits on WP. There will be ZERO unacceptable usage changes. Rate will be well within guidelines, max. 30-60 edits per hour. After successful slow testing, I would be looking to apply for a bot flag for the account. If all works well, I would be looking to utilise AWB for the community's tasks. Thanks for your consideration -- billinghurst (talk) 12:17, 9 February 2009 (UTC)

  • Support Good contributor. Yann (talk) 12:54, 9 February 2009 (UTC)
  • Support --Zyephyrus (talk) 15:19, 9 February 2009 (UTC)
  • Support.Zhaladshar (Talk) 16:13, 9 February 2009 (UTC)
  • No probs; by "my existing permissions" I assume you mean the permissions on that account at present; i.e. you're not seeking to get an admin bit on that account. Hesperian 22:33, 9 February 2009 (UTC)
With AWB it will be handling textual changes, not deleting or moving. So whatever is the norm for bots undertaking such tasks. At a point in time, I would like to also add upload texts to pair with images at Page:, however, not asking for that at this point. I work openly, I hasten slowly. -- billinghurst (talk) 23:06, 9 February 2009 (UTC)

Copyright: The Incoherence of the Incoherence


I would like opinions from other contributors about this work, regarding the date of publication and therefore the copyright status of this work. I was ready to copy it, but after some research I was unable to find a date of publication earlier than 1954. However the translator published works as early as 1907. I was also unable to find any date of birth or death for this translator, and as this link mentions Simon van den Bergh, Jr., there might even be several authors by the same name. What do you think? Yann (talk) 18:30, 9 February 2009 (UTC)

Thanks to Sherurcij, we now have his date of birth: Simon van den Bergh‎. But this isn't sufficient... Yann (talk) 21:56, 9 February 2009 (UTC)
I also found several publishing in 1954 and none earlier. I found a hint that "Simon Van den Bergh, birth Oss 16 Nov 1882, died Monte Carlo 1978", but this is only a Google memory as the page it links to does not mention him any longer. With there being a second publishing in 1987 I don't think we are going to find a PD version. Jeepday (talk) 23:42, 9 February 2009 (UTC)
OK, thanks for the info. The problem is that we have no proof that he is our translator. This seems to be a common name [10]. I find it strange that his profession is not mentioned [11] for a translator of several quite important books, when most other people in this family have their profession mentioned, and some times even a detailed biography. Yann (talk) 04:49, 10 February 2009 (UTC)
This 1958 translation of Tahafut says in its opening foreword that the translator was thankful to Simon van der Bergh for his 1954 translation -- so it seems that is the definitive year. Sherurcij Collaboration of the Week: Author:Charles Sheldon. 02:12, 10 February 2009 (UTC)
OK, so I will delete that page. Thanks for the help. Yann (talk) 18:59, 10 February 2009 (UTC)

Using an icon to indicate a spoken work

I'm curious about other people's opinions on an author page. On Author:Robert E. Howard, I have added the following to show all works to which a spoken version is attached:-

Speaker Icon.svg Includes a spoken word version of this text.

Is this acceptable or a bad idea? On a related subject, is that author page too visually cluttered or is it OK? - AdamBMorgan (talk) 13:14, 11 February 2009 (UTC)

I really like the icon, and against each relevant text it is neat. May I suggest that rather than repeat the text with the icon, that in the body that you use the icon, and then in the DESCRIPTION field place it as a key and paste the icon with the text
Speaker Icon.svg identifies that the work includes a spoken word version.
-- billinghurst (talk) 23:36, 11 February 2009 (UTC)
If an author wrote that many works you can't be blamed for the clutter. I do find your use of {{Semi-copyvio author}} preferable to the usual wallpaper at the bottom of the page. Eclecticology (talk) 00:40, 12 February 2009 (UTC)
Thanks for help. Cheers, AdamBMorgan (talk) 13:04, 13 February 2009 (UTC)

Copyright status of symbol on title page

Page:Marie_de_France_Lays_Mason.djvu/9 has an anchor which seems to have been used as publisher J. M. Dent's symbol for many years. There were various versions of the emblem. I created an image of the logo and embedded it in the page and then it was mentioned by Zyephyrus that that might be covered by copyright. I fear the assessment is correct, but am looking for other opinions. That leaves the question in my mind about hosting the page image, and I will request deletion if it needs to be removed. --Mkoyle (talk) 19:38, 11 February 2009 (UTC)

For how many years? If it was published before 1923, it'd be fine, copyright-wise. As it is, I'd remove the cut image, but I wouldn't worry about the page.--Prosfilaes (talk) 20:10, 11 February 2009 (UTC)
I guess I can answer my own question... after 1923 Google Books Search. I am guessing that this logo was picked up in the 40's or 50's... Of course since Dent was purchased by another publisher in the 80's, it will be pretty hard to ask them, too. I'll go ahead and remove it. --Mkoyle (talk) 21:04, 11 February 2009 (UTC)
It's a 1911 text, so by definition any symbol in it was published before 1923, and is therefore in the public domain. It is important not to confuse copyright issues with trademark issues. This may well be a trademark, in which case there are restrictions on how it can be used, but those restrictions certainly do not extend so far as to prevent us from making a faithful copy of a public domain work. Hesperian 22:46, 11 February 2009 (UTC)
Page 10 says it was last reprinted in 1966, so we don't have any indication if the image was published before 1923. Jeepday (talk) 23:16, 11 February 2009 (UTC)
Oh. Then the years quoted at Index:Marie de France Lays Mason.djvu and File:Marie de France Lays Mason.djvu are wrong. Hesperian 23:49, 11 February 2009 (UTC)
The question of the anchor is a question of trademarks and not copyrights; since the logo is only a part of the background, we are not using it in an infringing manner. A more important question with a 1966 reprint is copyrights on layouts and typography. Unless the layout is an exact copy of the 1911 version the scanned images (but not the wiki text) could be an infringement. Eclecticology (talk) 00:59, 12 February 2009 (UTC)
An image can be covered by both copyright and trademark. The inquiries are different, but trademark is not something we will ever need to worry about, as we are not making any use of marks in commerce. I would question whether this image embodies sufficient originality to merit copyright protection, however. BD2412 T 04:39, 12 February 2009 (UTC)
I agree with Eclecticology (re: "a more important question"). I would never upload page scans of a 1966 print (with the possible exception of facsimiles, and then only with great care), because there is every reason to believe that the cover page, layout, typography, logos, etcetera, are copyrighted. I think this file would have been deleted at Commons by now, if not for the fact that it is incorrectly identified as a 1911 work. Only the text itself is from 1911! Hesperian 05:25, 12 February 2009 (UTC)
There are no copyrights on layout and typography in the US.--Prosfilaes (talk) 08:18, 12 February 2009 (UTC)
Sections 102(a)(5) and 103(b) say differently. Eclecticology (talk) 09:47, 13 February 2009 (UTC)
What they actually say is:
102(a) Works of authorship include the following categories: [...] pictorial, graphic, and sculptural works; [12]
103(b) The copyright in a compilation or derivative work extends only to the material contributed by the author of such work, as distinguished from the preexisting material employed in the work, and does not imply any exclusive right in the preexisting material. The copyright in such work is independent of, and does not affect or enlarge the scope, duration, ownership, or subsistence of, any copyright protection in the preexisting material. [13]
103(b) says absolutely nothing about the matter; it only talks about what is true for works that do have a new copyright. 102(a) must be read in view of section 101:
“Pictorial, graphic, and sculptural works” include two-dimensional and three-dimensional works of fine, graphic, and applied art, photographs, prints and art reproductions, maps, globes, charts, diagrams, models, and technical drawings, including architectural plans. Such works shall include works of artistic craftsmanship insofar as their form but not their mechanical or utilitarian aspects are concerned; the design of a useful article, as defined in this section, shall be considered a pictorial, graphic, or sculptural work only if, and only to the extent that, such design incorporates pictorial, graphic, or sculptural features that can be identified separately from, and are capable of existing independently of, the utilitarian aspects of the article.[14]
Given that alone, it's at best questionable whether you're right; certainly the body text of most books would not have any "pictorial, graphic, or sculptural features that can be identified separately from, and are capable of existing independently of, the utilitarian aspects of the article." Given that fonts alone aren't copyrightable, and the fact that in the decade I've been beating around public domain text groups online no one has even claimed otherwise, I'm going to stick with my original statement.--Prosfilaes (talk) 00:54, 14 February 2009 (UTC)
103(b) for our purposes has the effect of allowing graphic representation to be treated separately from text. Even if we determine that certain scans are a copyvio, it would allow us, where applicable, to reproduce the wikitext version.
102(a)(5) does indeed depend on the definitions section as you point out. As a rule though, when the principal verb is "include" rather than "is" it leaves a degree of open-endedness to the question. One can easily treat layout as graphical representation, and separate identification is easy; whether independent existence is possible may depend on the particular circumstances. In the absence of any legal authority I would not be so quick to treat the non-copyrightability of a font as a given. We still have ongoing discussions about the need for public domain fonts to cover the more obscure areas of Unicode. If a publisher produces a new fancy font for our standard alphabet, why should that be any less protected under copyright law than the Dent logo?
All that being said, and acknowledging that my position may be just as questionable as yours, the fair use argument is still available. I would not be too quick to advocate for the deletion of scans of recent reprints of otherwise public domain material. It would still be up to the person making a take-down claim to prove that his rights have been infringed. What I do support is awareness of the legal environment and the acceptance of certain low-level risks. Eclecticology (talk) 10:20, 14 February 2009 (UTC)
Fonts don't have copyright protection in the US;[15] you can patent them, and you can copyright the TrueType(etc.) programs that produce them, but not the typefaces themselves. A quick search finds those legal authorities.--Prosfilaes (talk) 14:22, 14 February 2009 (UTC)
That may indeed be the trend in the US, even if that trend is contrary to the rest of the world. The interesting article that you cite is informative, but not conclusive; it raises doubts about whether the US is fulfilling its GATT obligations. We'll just have to agree to disagree about fonts. The publication that led to this thread is more than 20 years old, so if design patents were to apply they would have expired anyways. The layout question would probably be the more important one, as would the addition of any new headings to make the material more readable. This first came up in relation to using an updated edition of The Book of Mormon and I have also raised it before in relation to Bartleby's(?) section headings for the Cambridge History of American Literature. When we use these more recent printings, we don't know what has been changed. The vast majority are probably quite safe to use, but we can never be certain of that without relying on fair use. Eclecticology (talk) 23:21, 14 February 2009 (UTC)

Peer reviewed journal articles with open copyrights[edit]

A number of peer reviewed journals (a small number, but there are some) have begun to either use various open copyrights or to leave the copyright with the author. Are such articles considered valid for inclusion on wikisource even if they were published after 1922, provided that they are either copyrighted in such a way that such posting would be legal or the author releases the work for copying in the public domain? Mad2Physicist (talk) 07:05, 12 February 2009 (UTC)

Each would have to be considered on its own; check Wikisource:Copyright policy to see if the work you have meets the criteria. Jeepday (talk) 23:45, 13 February 2009 (UTC)
is one such journal; almost everything they "publish" carries a CC-BY-3.0 license. Would it fit within our mission to copy all the articles with that license? --Spangineerwp (háblame) 00:36, 14 February 2009 (UTC)
Copyright and mission are separate issues. If we assume that the contributing authors have done so there with full knowledge of the CC licence, it should be safe to add it here. They have, however, stated in their mission that they also plan to include translations. It might still be safe that the consent of the translator is there, but we can't be sure about the original author.
In terms of Wikisource's mission, there should be no problem with including the content, given that it has previously been published. There is, nevertheless, a moral and ethical dimension that is breached when we talk about copying all the articles. If we are going to copy everything they have, this diminishes their value to a place of first publication that exists only to fulfill one of our own requirements. We want these sites to thrive; we want the googling public to find those sites without having them overwhelmed by the monopolistic elephant in the little duck pond. Eclecticology (talk) 09:27, 14 February 2009 (UTC)
That sounds like a reasonable compromise. I too feel like copying everything would be unfair to them, in some sense, but I'm glad they're making it possible for us to add some of their best works. --Spangineerwp (háblame) 03:58, 17 February 2009 (UTC)

Last broken word of a proofread page: a solution[edit]

Here is the code of a simple but, IMO, useful template:pt (pt as in page-text) to solve the problem of "the last broken word on a Page: page":

{{#ifeq:{{NAMESPACE}}|Page|{{{1|text shown in Page:}}}|{{{2|text shown otherwhere}}}}}

Its logic: "show parameter 1 if the namespace is Page, show parameter 2 if it isn't".

See an example at it:Pagina:Olanda.djvu/24, which contains the code {{pt|Comincia-|Cominciarono}}.--Alex brollo (talk) 07:40, 14 February 2009 (UTC)

Thanks; we already have {{hyphenated word start}} and {{hyphenated word end}}. Hesperian 13:12, 15 February 2009 (UTC)
Thanks for the reply! OK, once more I have rediscovered something not new at all. ;-) --Alex brollo (talk) 17:37, 15 February 2009 (UTC)

Pages that disambiguate multiple editions or republications of the same work[edit]

Is there a standard header and layout for pages that lay out the various editions of a text, and/or the various works in which a text has been published? For now, I've used {{header2}} and common sense at General remarks, geographical and systematical, on the botany of Terra Australis. But it seems unlikely that I'm the first to encounter this issue. Hesperian 13:16, 15 February 2009 (UTC)

How did it look with a {{disambiguation}} template? It would have been where I would have started, though I am not quite sure what you are after, especially as it has no author field. -- billinghurst (talk) 14:16, 15 February 2009 (UTC)
I don't think that the disambiguation template is sufficient to the task. It works fine when we want to distinguish unrelated works, but its functionality deteriorates as the number of versions and editions increases. For the General remarks ... mentioned above the problem is not that great. It is a relatively long title, with only four versions noted, and it is not likely to conflict with the title of anyone else's work. For popular authors, especially the ones old enough to be in the public domain, the number of variations can be huge. Short popular poems are frequently anthologized. Different printings of the same edition are not necessarily identical, and very few reprints are going to give any indication at all about what has changed. In the short term the solution suggested is OK, but to be a bit more far-sighted there is a need to move to a more comprehensive approach. Eclecticology (talk) 19:54, 15 February 2009 (UTC)
To Billinghurst: the header looked okay, but it forced me to start with "General remarks... may refer to" which isn't really appropriate for the presentation of editions of a single work. Hesperian 22:28, 15 February 2009 (UTC)

Managing bilingual translations[edit]

I would like to start a bilingual French-English Wikisource page for a book that is currently available only as a DjVu in the French Wikisource. I plan to start transcribing that and translating it at the same time. Is there a bilingual template that will automatically pull the transcription of the original French text as it grows and allow the community to build the translation without continually copying and pasting the original? (This is the original text I'm talking about: [16]) Thanks. Aldebrn (talk) 21:47, 15 February 2009 (UTC)

Not that I know of, we have copy-pasted our texts for translations on fr.wikisource when needed. A tool for that would be very useful. ---Zyephyrus (talk) 20:28, 17 February 2009 (UTC)

Applying the text advancement to other wikis[edit]

Hello, I'm a user on the Arabic Wikisource and was wondering how I could add the text advancement button to the edit menu there. This would help a lot in knowing the quality of pages and is badly needed. Thanks a lot.--Diaa abdelmoneim (talk) 13:29, 17 February 2009 (UTC)

Hi, this isn't exactly what you were asking for, but it accomplishes the same thing and more. At Hebrew Wikisource we have added the Flagged Reviews extension, which among other things allows rating the quality of a particular version of the text for completeness ("text advancement"), proofreading, and aesthetic formatting. It is still new and has some quirks, but overall we are extremely pleased with it. Dovi (talk) 14:08, 17 February 2009 (UTC)

Copyvio @ State of the Union Opposition Speeches?[edit]

See here for more. Referencing here as well b/c the State of the Union is coming up pretty soon. --Philosopher Let us reason together. 21:59, 17 February 2009 (UTC)

Duplicate works of Dryden[edit]

Came across Alexander's Feast and Alexander's Feast; or, the Power of Music, my first experience of finding duplicates. A little tricky to compare them, though there seems to be some minor variance. Is it just a matter of adding a link to both variations from the author page? A Disambig page? Adding {{similar}} that points to the alternative? Thx -- billinghurst (talk) 23:47, 17 February 2009 (UTC)

This raises all sorts of issues for me:
  1. Why would an accurate transclusion exclude "OR, THE POWER OF MUSIC./ A SONG IN HONOUR OF ST. CECILIA: /1697." from the top of the first page?
  2. When do we include the subtitle as a part of the Wikisource title?
  3. How many versions of a poem can we sensibly keep? For a 1697 poem there must be hundreds of versions available. It's one thing to say that Wikisource is not paper, but a hundred unexplained versions of a poem is bound to confuse a reader who just wants the "correct" version of the poem.
  4. If we keep them all the significance of Billinghurst's question is magnified. The "similar" template becomes more awkward as the number of variants increases. Sometimes a disambiguation page will be necessary, but when there are only a small number of included variants multiple links from the author page should be enough.
  5. If we keep only some, what are the criteria for deciding which are to be kept? What are the criteria for assigning levels of credibility to these versions?
  6. How does one trace the tree of variants so that one can document not only the differences, but also the provenance of certain variants?
Eclecticology (talk) 01:33, 18 February 2009 (UTC)

double pages in djvu[edit]

With the help of Help:DjVu files I now managed to create a djvu file from my png scans. But one issue remains: My scans were scans of double pages, always a left side and a right side on one png. So my scan of a 180 page book results in a djvu of 90 pages. Is there any convenient way to split the original pngs or the pages in the djvu so I will get a djvu with 180 pages? Does anybody know how to solve this problem? --Slomox (talk) 15:08, 18 February 2009 (UTC)

I do something like this all the time, but under Linux. Given files labeled 001.png through 999.png that are 3500 pixels across and 300 DPI:
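mkdir Output
for i in `seq -w 1 999`
do
    # Convert the double-page scan to PNM
    pngtopnm "$i".png > temp.pnm
    # Left-hand page: keep the columns up to 1750
    pnmcut -right 1750 temp.pnm > temp1.pnm
    cjb2 -dpi 300 temp1.pnm "$i"a.djvu
    # Right-hand page: keep the columns from 1750 onward
    pnmcut -left 1750 temp.pnm > temp1.pnm
    cjb2 -dpi 300 temp1.pnm "$i"b.djvu
    rm temp.pnm temp1.pnm
done
# Bundle the single-page files into one multi-page DjVu
djvm -c book.djvu [0-9][0-9][0-9][ab].djvu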

If they aren't even pages, half the width (1750, in this case) may not work, and you may want to cut a bit off the edges, too. If the scans aren't totally even, you may need to change that value part way through the book. Probably less than helpful, but that's how I do it.--Prosfilaes (talk) 16:47, 18 February 2009 (UTC)
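The fixed value 1750 in the loop above is simply half of the 3500-pixel width. What follows is a minimal sketch, assuming the PNM files come straight from pngtopnm (so the header carries no comment lines), of how the split column could be read from each file instead of being hard-coded. The names pnm_width and split_column are hypothetical helpers, not part of Netpbm.

```shell
#!/bin/sh
# pnm_width FILE: print the pixel width of a raw PNM file.
# The header is ASCII ("P4 <width> <height> ..."), so the width is the
# second whitespace-separated token. Assumes no comment lines in the header.
pnm_width() {
    head -c 64 "$1" | tr -s ' \t\r\n' '\n\n\n\n' | sed -n '2p'
}

# split_column FILE: print half the width, i.e. the column at which a
# double-page scan would be cut if the gutter sits in the middle.
split_column() {
    echo $(( $(pnm_width "$1") / 2 ))
}

# Possible use inside the loop (not executed here):
#   half=$(split_column temp.pnm)
#   pnmcut -right "$half" temp.pnm > temp1.pnm
#   pnmcut -left  "$half" temp.pnm > temp1.pnm
```

This only helps when the scans differ in width; it does not detect a gutter that is off-centre, so uneven scans may still need a manual value.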

  • The unpaper utility, which I generally try to use when cleaning up scanned pages, will optionally convert a single scanned image of two side-by-side pages into two separate output files (see the --input-pages and --output-pages options in the documentation). It locates the proper content for each page semi-intelligently by searching for margins consisting of mostly white space. I have been happy with its output so far. Tarmstro99 (talk) 17:15, 18 February 2009 (UTC)
Unpaper looks good, but I couldn't find a pre-compiled download. Although personally I like GUI programs most, I'm fine with command-line tools. But if I even have to compile the program, that's a bit too much for me ;-) Is there a pre-compiled Windows version available for unpaper? --Slomox (talk) 17:56, 18 February 2009 (UTC)
Google is your friend! :-) See Tarmstro99 (talk) 18:26, 18 February 2009 (UTC)
Thank you. I still have one problem: If I provide a multi-page pbm as input, it will only handle the first page. Is there any special parameter I have to provide to handle all pages? --Slomox (talk) 20:44, 18 February 2009 (UTC)
I don’t believe so. The solution is to split document.pbm into doc0000.pbm, doc0001.pbm, doc0002.pbm, ... doc0099.pbm with pamsplit, then feed the resulting files into unpaper (which will accept an input parameter such as doc%04d.pbm to automatically start processing multiple files starting from doc0000.pbm). If you want to start processing at, say, page doc0004.pbm instead of counting from 0, just give unpaper the parameters -si 4 doc%04d.pbm. E-mail me if you have further problems with unpaper; I’ve used it for quite a few projects now, and the time spent mastering its idiosyncrasies is well worth it given the quality of its output. Tarmstro99 (talk) 21:02, 18 February 2009 (UTC)
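The two steps described above can be sketched as a small wrapper. This is only a sketch under assumptions: split_book and run are hypothetical names, the page%04d.pbm output pattern is made up for illustration, and the pamsplit -padname option and the unpaper --layout double and --output-pages 2 options are given as I understand those tools' documentation, so check them against the installed versions. With DRY_RUN set, the commands are printed rather than executed.

```shell
#!/bin/sh
# run CMD ARGS...: execute the command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# split_book INPUT [START]: hypothetical wrapper around the two steps.
#   INPUT  multi-page PBM file, e.g. document.pbm
#   START  first input index handed to unpaper (default 0)
split_book() {
    input=$1
    start=${2:-0}
    # Step 1: one file per scanned image: doc0000.pbm, doc0001.pbm, ...
    run pamsplit -padname=4 "$input" 'doc%d.pbm' || return 1
    # Step 2: treat each image as a double page and write two output pages.
    run unpaper --layout double --output-pages 2 -si "$start" \
        'doc%04d.pbm' 'page%04d.pbm'
}

# Example: DRY_RUN=1 split_book document.pbm 4
```

Running it with DRY_RUN=1 first makes it easy to review the exact command lines before any files are written.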

Wikimania 2009[edit]

Wikimania 2009, this year's global event devoted to Wikimedia projects around the globe, is accepting submissions for presentations, workshops, panels, posters, open space discussions, and artistic works related to the Wikimedia projects or free content topics in general. The conference will be held from August 26-28 in Buenos Aires, Argentina. For more information, check the official Call for Participation. Cbrown1023 (talk) 18:25, 22 February 2009 (UTC)