

There are two candidates for administrator currently being decided. I would like to see some more input before closing the discussions. Please take a moment to comment if you haven't already.--BirgitteSB 01:50, 12 June 2010 (UTC)

Gadget: HotCat, new version in test

I have put the upgraded version of HotCat into test (find it in the Development section). If users wish to try that version, please turn off the existing version and switch on the new one. If you have feedback on its suitability for WS, or identify any issues, please leave a comment at Mediawiki talk:Gadget-HotCat-test. Thanks. — billinghurst sDrewth 09:49, 1 July 2010 (UTC)

Thank you very much, new gadgets and updates are always fun stuff! ;) -- Cirt (talk) 15:52, 1 July 2010 (UTC)


January 1, 2019 is upon us: Non-Public Archiving


I know I've been a bit inactive, but I was dwelling on the archival side of Wikimedia the other day, ironically after coming across a complete dossier of the w:Charles Manson murders' crime-scene photos which has perhaps never really been properly archived anywhere online before. I was saddened knowing they were not PD, and that by the time the "copyright" of the police department lapsed, they may well be destroyed and forever forgotten...and that got me thinking about the many books I hated "discarding" during my work at the w:Internet Archive; because if we/I didn't "save" them...they may be inaccessible and a Lost Work within decades.

After all of this pondering, I came to the conclusion that it would be fantastic if we expanded our scope to allow for the non-public archiving of works. At its simplest, this would mean creating a {{StillInCopyright}} template with the already-used <div id="copyvio" style="display:none;"> script that hides the text. Essentially, we are not publishing this work; we are archiving it in our source code so that one day when it is PD, all that needs to be done is for the template to be removed. We would obviously set such pages to protected so that only administrators could do this - and we could even add the work, then delete it, then recreate the page with just the header info and template...if we were so worried that somebody might read the book through the "edit" page.

This would only be done for works set to expire on January 1, 2019 (only 8.5 years away, it seems very likely WS will survive until then) - there would be no "Let's add Harry Potter and just wait!" crap or anything.

Is the idea insane, or insanely awesome? Sherurcij Collaboration of the Week: Author:Thomas Carlyle. 23:58, 16 April 2010 (UTC)

For a film archive to make copies of old movies is reasonable. For us to make copies of old books, less so. If I thought this was actually going to preserve works that we can't host that might actually disappear--say Adventures in Humanity, by William L. Stidger--then I might be for it. I think it much more likely we'd be looking at Wodehouse and Christie and Wells, which won't disappear in 8.5 years even under a nuclear onslaught. What are we going to work on that Google won't have digitized and isn't held by a hundred libraries, and is it really worth taking away the time from posting material that can be read for the next 8.5 years?--Prosfilaes (talk) 01:17, 17 April 2010 (UTC)
I don't see that it falls within the site's expressed scope for us to hold others' copyrighted material, whether it is visible or not. I would think that this may be something that gets kicked upstairs to WMF to see what scope exists. Surely there are abilities somewhere outside of WS to archive material successfully in a non-public environment. — billinghurst sDrewth 10:21, 17 April 2010 (UTC)
And yet, ironically, we already do this. If Sherurcij posts copyrighted material, it will be deleted... in which case it is in the database but invisible, and can be made visible any time simply by undeleting it. Hesperian 12:04, 17 April 2010 (UTC)
Yes, but on January 1, 2019 we will have no record of pages to go back and "undelete" for example, and my route has the added benefit of including the info header, and we can work on it before 2019, it just won't be published until then. User:Sherurcij
Our standard practice these days for uploading works is to put the djvu on commons and the text in the page namespace here, because the text by itself is less valuable than the scans to back it up. Considering that, I don't see how Wikisource would be the best place to archive a work that will soon be lost. On the contrary, if there were a way to upload files to Commons and then hide them for 8.5 years, that would (it seems to me) be the best way to handle it. On the other hand, why not scan these works, put them on a few external hard drives, and stick them in safes for the next 8.5 years? —Spangineerwp (háblame) 12:12, 17 April 2010 (UTC)
-Because you or I may be dead in 8.5 years, or at least have lost interest in this project...
-Most works on WS do not have DJVUs, and while they may have their place, I would hate to think we hope to limit ourselves to two new works weekly in the future when we gain dozens. Commons is unrelated. User:Sherurcij

The only way we could "work on" new texts without publishing them is to archive them without reading them. It's not fair use to copy off a bunch of copyrighted texts and digest them with a group of people with the intent of "improving" them. And I think it's a good idea to point out here, now that it's been brought up, that we shouldn't read texts deleted at Wikisource for any other purpose than for carrying out administrative tasks. This kind of chumminess with deleted texts can only foster a bad reputation that could serve outside parties who may not share our goal of building a free library. ResScholar (talk) 10:56, 2 May 2010 (UTC)

Wikimedia is non-profit, but cc-by-sa-3.0 and GFDL do not forbid downstream commercial uses, so posting copyright-restricted orphaned works here, deleting them here, and viewing them in the non-public area by administrators may not fully comply with our copyright policy, even if ever done in compliance with Section 104 of the Sonny Bono Copyright Term Extension Act. If any works are still copyright-restricted in the USA in the last 20 years due to the Copyright Term Extension Act while also orphan and thus eligible for non-public and non-profit archiving, I would like to suggest using Canadian Wikilivres if already in the public domain in Canada. For example, if a work was published in 1923 in the USA with copyright notice and renewal and the author died in 1959, Canadian Wikilivres can already show it to the public, then it can be brought here in 2019 if American copyright term remains the same. How would WMF think of our administrators doing non-public archiving here? Is it really in compliance with Section 104 of the Sonny Bono Copyright Term Extension Act?--Jusjih (talk) 04:14, 18 June 2010 (UTC)

Phase out TextQuality templates and system

With some of the built in functionality that ThomasV has put into the PR extension, I think the {{TextQuality}} boxes added in "page" tab (and the corresponding buttons on the edit pages) are a bit redundant and should be done away with.

The whole point of them was to indicate how close they are to being completed (not proofread, proofed by one person, proofed by 2) which would be roughly 50%, 75%, and 100% completed.

But if we are transcluding works from the Page: namespace, that information is already carried over in the form of a colored stripe at the top of the page that gives much more information than the little box created by TextQuality. We know how much of a given text has been proofed by none, one, or two users, so we get a better indication of how much work is still necessary to put into that text. The current TextQuality implementation doesn't give us that kind of information.

I propose we begin phasing out this system since we have a built-in way to do the exact same thing.—Zhaladshar (Talk) 15:24, 18 May 2010 (UTC)
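
Illustrative note: the difference described above can be sketched in a few lines of Python. This is only a rough model, not the ProofreadPage extension's actual code; the status labels, the `ribbon` function and the sample data are all hypothetical names chosen for the example. It shows how a stripe built from per-page statuses reports a share for each stage, where a single TextQuality box can only carry one value for the whole work.

```python
from collections import Counter

# Hypothetical status labels; the discussion names only three stages
# (proofed by none, one, or two users).
LEVELS = ("not proofread", "proofread", "validated")


def ribbon(statuses):
    """Return the percentage of pages at each stage for one work."""
    counts = Counter(statuses)
    total = len(statuses)
    if total == 0:
        # A work with no transcluded pages has nothing to report.
        return {level: 0.0 for level in LEVELS}
    return {level: 100.0 * counts[level] / total for level in LEVELS}


# Made-up four-page work, one status per page in the Page: namespace.
pages = ["validated", "validated", "proofread", "not proofread"]
summary = ribbon(pages)
```

Because the summary is recomputed from the pages themselves, it changes whenever a page's status changes, which is the self-updating property the proposal relies on.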

For transcluded pages, I agree. There was a (brief) previous discussion now here. I definitely think that we could do away with the radio buttons in the main ns pages, and if people wish to manually add the TQ components through a template in main ns, then so be it. — billinghurst sDrewth 01:28, 19 May 2010 (UTC)
I agree. It is confusing to have two systems. ThomasV (talk) 06:06, 19 May 2010 (UTC)

It seems to me that we could also phase out the {{Textinfo}} template (for works that use ProofreadPage, that is). The "Source" field is replaced by the "source" tab on the document (between "page" and "discussion"); the "Level of progress" field is replaced by the coloured stripe; and the "Contributors" and "Proofreaders" fields are no longer useful because any given work could be proofread by many different users working on different pages. Every time I fill in a {{Textinfo}} I feel like I'm redundantly duplicating information that's already visible elsewhere. - Htonl (talk) 15:31, 19 May 2010 (UTC)

It is a bit redundant and confusing to have both, but they are two overlapping approaches. I still use the previous system in main space to indicate the progress of subpages and their parent, because the coloured stripe may misrepresent the progress for many reasons. I also prefer the coloured boxes, "" displays the information in two ways, and their placement on the tab. The information we get from the ribbon appearing above the header is a report on the status in the Page name-space, but how is the reader able to interpret this information? If we are indicating the development of the text to the reader, we should create a modifiable display, maybe in the header, with text and/or a link.
As pointed out with {{textinfo}}, useful info in a strange place, we seem to duplicate information and have parallel systems. I make use of them as best I can. I don't think the older system is outmoded, so a Luddite's oppose for the moment. I'll try to put some thought into a better way. Cygnis insignis (talk) 18:45, 19 May 2010 (UTC)

I support the proposal. The various interfaces and namespaces here are incredibly confusing to new users and overlapping systems like this are just another disincentive to participate. Moondyne (talk) 02:05, 20 May 2010 (UTC)

 Oppose - Please keep the free-standing {{TextQuality}} template untouched & leave the way it is currently applied for non-transcluded works alone. Also, can someone please point me to the discussion where the template was deemed "deprecated" by the community? George Orwell III (talk) 02:40, 20 May 2010 (UTC)

The only other discussion I'm aware of is the one linked above, Scriptorium/Archives/Feb, the proposal is to phase it out, then deprecate it. I presume that would involve removing the buttons in edit mode. I don't think the issue of non-transcluded works has been covered, a good point, they would still be the majority of page type. Cygnis insignis (talk) 04:05, 20 May 2010 (UTC)
Please visit the {{TextQuality}} template's home page. It states: << This template has been deprecated. Text matched against scans are preferred methodology, and they have a superior, objective rating scheme >>.
I understand the February 2010 discussion raised the point of {{TextQuality}}'s use as being somewhat redundant with the new features being added since. Where was the discussion to deprecate the template's use across all other types of works is what I was actually hoping to review. George Orwell III (talk) 04:32, 20 May 2010 (UTC)
I removed it while discussion continues. Cygnis insignis (talk) 14:24, 20 May 2010 (UTC)

It's definitely worth keeping {{TextQuality}}, because some texts don't have pages and often the main page is unique (for example, see Lectures on Modern History—the bar says 100% validated, but the actual status is incomplete, as indicated by the little purple box next to "Page"). That said I don't have a problem with removing the radio buttons from the main ns to prevent confusion. —Spangineerwp (háblame) 21:23, 20 May 2010 (UTC)

  • As a new user on Wikisource, I'll just say that there are a lot of very confusing instructions spread over a lot of different pages, and it is not always clear what the "correct" or "preferred" ways to do things are. Any move that reduces redundancy should make WS easier to learn for new users, and I think that's a good thing. As to how it's done, whether TextQuality is still needed for some pages, I'll leave to others to decide - I still can't say I have much of an understanding of how things are supposed to work here. Cmadler (talk) 02:59, 21 May 2010 (UTC)

 Oppose - So far, every new page I've created has used a printed book (still in copyright hence I can't upload scans).--Longfellow (talk) 10:51, 21 May 2010 (UTC)

I have been asked to clarify my opinion. If every new page were created by transcription from djvus followed by transclusion, I would agree that the current system is redundant. However, for the way I work, which relies on in-copyright sources that cannot be uploaded as djvus, I feel that this system is still needed.--Longfellow (talk) 17:10, 5 June 2010 (UTC)
Out of copyright is out of copyright, and there needs to be further artistic merit to create new copyright, so it is hard to see that we cannot get scans. Sweat of brow on a public domain work doesn't particularly add copyright. Plus how about we look to hunt around for an image of out-of-copyright so we can do both. For validation purposes, we need to be pushing harder on image source, rather than aim at a lower standard validation, and if retaining TQ becomes that purpose, then it is tail wagging the dog stuff. — billinghurst sDrewth 00:53, 6 June 2010 (UTC)
The book I am currently using (The Illustrated Victorian Song Book) does indeed have sufficient additions to merit copyright (it adds introductions and marginal notes) though conceivably these could be cropped. Another book I intend to use next (The Edwardian Song Book) also adds notes, and offers transcriptions rather than facsimiles, so the lay-out might be copyright.--Longfellow (talk) 10:07, 6 June 2010 (UTC)

 Oppose removing {{TextQuality}}. I really dislike the ProofreadPage quality system, the one that breaks works into tiny one-per-printed-page wiki pages, and relies on Page: namespace. Thank you for the notification at the top of wiki pages. --Dan Polansky (talk) 10:58, 21 May 2010 (UTC)

We have had this discussion before: the TQ system doesn't rely on a ready evidence base, and tends to leave works not progressing beyond a non-verifiable, and very arguable, version of a work. — billinghurst sDrewth 00:53, 6 June 2010 (UTC)

 Oppose removing {{TextQuality}}. I think the ProofreadPage system is very useful, and certainly the more available the source images are the better, but the best isn't always at hand, and TQ is a useful substitute when the situation isn't quite so ideal and allows for a greater diversity of sources. Bob Burkhardt (talk) 12:08, 21 May 2010 (UTC)

  • Comment many of these are all about the NOW, and not the direction or the future. I wish to turn the question around: I am against maintaining the TQ as the primary means to classify a work, and I am for the removal of the system in the long term. We are not at the point where we can just remove it from our system, though we are at the position where we do not need it to be inviting methodology and measure by use of radio buttons, especially where many works are now being transcluded, making TQ redundant. The PP ribbon is better as an objective measure and, importantly, a self-updating system; that said, PP is not sufficiently mature at this point to adequately cover all aspects. The current ribbon only portrays the transcluded pages, which is perfect for chapters of works. For a work, the preferred ribbon would be that displayed as on Special:IndexPages which provides complete feedback, and it may also be one that could assist to magic switch an INCOMPLETE template, rather than our existing management system. We should be striving for a managed system that automates the trivial tasks, and to get away from manually intensive and subjective assessments; these do not lead to a system that produces quality works. — billinghurst sDrewth 12:34, 21 May 2010 (UTC)
  • Remove the TQ radio button scoring system, retain the TextQuality system though with the ongoing and long term aim to make it redundant. — billinghurst sDrewth 12:34, 21 May 2010 (UTC)

I guess I just don't see the benefit of this system. It doesn't seem like it adds any new or relevant information. I've gone through and chosen some of the pages that use {{TextQuality}} (omitting the works that use it AND the PR extension) and the works fall into two main areas:
  1. Use of the TQ template but no proper textinfo:
  2. Use of TQ template as well as proper textinfo:
In the first group, we are given no important textual information. The TQ system does not work without that information, and no one knows how much proofreading went into it or what sources were used. For works like Whiskey Speech we don't even have a transcription of the actual speech to work with. At this stage for these works, we do not have any reliable way to say how close they are to being done, and the TQ system works against us because it tries to give a certain reliability to these texts.
The second group is very reliable because we do have proper source information. However, much of the sources were taken from Gutenberg or other text repositories. I would hazard a guess that most of the properly sourced works were taken from text repositories (only because of how long we've been using this system). Most of these works have DJVUs floating around on or can be made from Google Book scans.
I think by now we should be discouraging the use of Gutenberg (or any text repository) as a source for our texts. They don't do internal linking and can't use foreign alphabets for many of their works. They also don't link to the scans they used for the proofreading, so we still don't have any ability to verify that anything is accurate. I believe that WS should make a push to move as much of our works to DJVU as possible (excepting the rare cases, of course), meaning that the second group won't need to have the TQ templates because they merely reproduce information that's already been put into the system.
And this is why I think we should fully deprecate the system (or use it in only those 1% of cases for which we cannot legitimately get scans): because we either make statements regarding the reliability of works whose reliability we can't comment on, or we are doing it for works which should actually be moved to page scans. Even with the 1% we would need this for, I don't think it still justifies having the radio buttons hard coded into the system.—Zhaladshar (Talk) 12:38, 21 May 2010 (UTC)
 Support I agree that the TextQuality templates should be phased out, since ProofreadPage is more accurate in that we can see how much of the page has been proofread/validated, whereas with TextQuality it's hard to see whether the entire page has been proofread, or just half of it. But then it would be difficult to know whether non-transcription projects that are not used in the main namespace have been proofread or not. I suggest that we only keep the TextQuality templates for non-transcription works.--Angelprincess72 (talk) 16:14, 21 May 2010 (UTC)
 Support Long term phase out of TextQuality templates for transcluded pages. While they can serve as a feedback tool for transcluded pages, my initial impression was that they replicated the ProofreadPage system. It was also unclear, until explained in this session using the PfP scale as reference, when to advance the TQ indicator. I suspect that with the sparse number of talented folks currently working this project, TQ is redundant. JamAKiska (talk) 15:42, 28 May 2010 (UTC)
 Support The radio buttons should be removed as soon as possible. Their presence sends a wrong message to contributors, namely that Wikisource's quality system is based on self-assessment of one's contributions. The other idea it conveys is that a contributor's work should be blindly trusted, even if they do not provide scanned sources of the texts they add to the site. The continued presence of this system is only going to slow down the transition of en.wikisource from its current status to a trustable library. ThomasV (talk) 12:24, 3 June 2010 (UTC)
 Oppose Maybe they can go when the scan-supported texts (along with index pages) are in common enough use. I use scans that are copyrighted, for example those provided by the New York Times, so they aren't available for use with the new system that I can see since they are external. I would support disabling the radio buttons so if they are activated in a context where the other system is in use, they can just flash a warning to that effect. And if an attempt is made to save edited text with the TQ template present unnecessarily, it can pause to ask for a confirmation and advise removal like it does now in an opposite sense for headerless texts. The radio buttons actually did disappear a year and a half ago for a month or so, and I got into the habit of putting in the template manually, so I hardly ever use the buttons now, but I think they are good for newbies. I think a text without a scan doesn't have the level of trust of one with a scan, but it is obvious which is which, and someone wishing to filter out the scanless ones to get a more trustable library is certainly welcome to. I think accepting texts without scans gives more flexibility without degrading the higher trust one can put in texts with scans. Bob Burkhardt (talk) 00:59, 4 June 2010 (UTC)
Although NYTimes can certainly put a copyright notice on it, I don't think they have any current protection (w:Copyfraud). The ruling in w:Bridgeman Art Library v. Corel Corp. was that the key factor in copyright was originality, so when a reproduction does nothing more than accurately convey the underlying image, there is no new copyright protection. Wikimedia Foundation seems to follow this interpretation (as they generally follow US law); see for example w:National Portrait Gallery copyright conflicts. So unless the work is still under copyright, a scanned image of it can not be under copyright in US law, despite what NYTimes or others might claim. cmadler (talk) 09:46, 4 June 2010 (UTC)

I agree with ThomasV that we should remove the radio buttons from the edit screen. However I think it is too soon to remove the templates and indicators. We could remove the templates from works where we have scans, like Jane Eyre, and delete the templates when we have only a few hundred uses of it. John Vandenberg (chat) 09:25, 11 June 2010 (UTC)

ok, given the opinions expressed above, I have removed the quality buttons from ns-0, and I also disabled the template on pages that have the transclusion status indicator. The PageQuality template remains functional on pages that do not use transclusion from the Page namespace. ThomasV (talk) 16:06, 11 June 2010 (UTC)

Expand scope of Author namespace

One of my favorite features of Wikisource is that we can easily handle internal linking, making it possible to turn a basic text file into something that allows the reader to probe deeper into subjects that interest him. For example, I just completed wikifying The Huguenots and the League, and in the process of doing so, I gained a much better understanding of the material than I would have had I just read Acton's essay by itself.

In wikifying, my general practice has been to link Wikipedia, unless it's a public domain document (which we have or should have) or an author of public domain works. However, as our library has increased, I've begun to wonder if perhaps it would make more sense to take advantage of everything Wikisource has to offer. Instead of sending a reader off to w:Mary, Queen of Scots, what if we created Author:Mary, Queen of Scots and put links like Catholic Encyclopedia (1913)/Mary Queen of Scots and The New Student's Reference Work/Mary Queen of Scots on it, as well as the Wikipedia article? This would give additional visibility to our work, but would also give the reader/researcher more options. Instead of being taken directly to a modern, dynamic article on the person in question, the reader would more easily be able to access historical viewpoints as well.

The primary problem with this that I see is that the namespace is called "Author," not "Person" or the like. I agree that that is confusing, but we already include editors and translators in the Author namespace, even though they technically don't belong there. And for a long time all files on Commons were in the "Image" namespace. The change to "File" came only after it became obvious that there were a lot more than just images on Commons. The same might happen here—if the idea is popular, we might eventually say that it's worth renaming the namespace.

Another objection might be that we're adding an extra hurdle for readers who just want to get to the Wikipedia article (which in many cases will be the most complete and current article on the person). That's a valid point, but perhaps this issue can be alleviated by continuing to link to Wikipedia unless at least some number (two? three?) of works about the person exist on Wikisource.

Already in the Author namespace we collect works about the author: for just one example, see Author:Charles_Dickens#Works_about_Dickens. I think it'd be great to give this visibility to our works on non-authors as well, Mary, Queen of Scots being just one example. Thoughts? —Spangineerwp (háblame) 19:04, 20 May 2010 (UTC)

I think that this is a great idea. The expansion of possibilities and potential for usage could be really valuable. -- Cirt (talk) 23:31, 20 May 2010 (UTC)
This is a difficult problem that I have been thinking about for years.

There is already the possibility of creating Wikisource index pages for people who are not authors; e.g. Wikisource:Yagan. But I have long felt that such pages ought not be in our project namespace, as they are content. We ought to have a Topic: namespace. The big difficulty with this is that we don't really want separate pages at Author:Charles Dickens and Topic:Charles Dickens, else we won't always know which to link to. Hesperian 00:18, 21 May 2010 (UTC)

I like the way Author pages are now—they have both the works by the author and about the author. I don't know if it was originally foreseen that that would end up happening when the namespace was created, but I think it's a positive development. In that sense, what started as a strictly "Author" namespace is already treated more as a "Person" namespace, because it doesn't treat its entries strictly as authors—it treats them as notable persons who wrote and who have been written about. I don't think there's value in attempting to draw a line separating the two types of works by using different namespaces. Partly for the reason you say (which would you link to?) and partly because it would not be consistent with what the Author namespace has evolved into.

Incidentally, I'm not a fan of the Wikisource index pages as a rule—that stuff belongs in the portal namespace, I would argue, except when the topic is a person, in which case I think it fits nicely into the Author namespace—perhaps not the original concept of the Author namespace, but again, what the Author namespace has evolved into. —Spangineerwp (háblame) 01:04, 21 May 2010 (UTC)

Isn't it more appropriate to use categorization for that? In the example, rather than creating a new page (confusingly) called Author:Mary, Queen of Scots, and rather than creating a whole new class of pages such as Person:Mary, Queen of Scots, I think the solution is to link to Category:Mary, Queen of Scots, which 1) already exists in some cases, 2) works for topics, not just people, and 3) makes additions intuitive. That is, from the point of view of someone who's spent a fair amount of time contributing to Wikipedia and Commons, but is new here, as new works are added, it seems fairly intuitive to categorize them, but it seems much less intuitive to go look for such a page and list the new work. It's also then easy enough to add a brief summary of the person/topic (which might often be the summary/lead from the Wikipedia article) on the Category page, see for example commons:Category:Eastern Michigan University. In any event, if a new class of pages is created, I'd suggest that it be "Topic" rather than "Person". Cmadler (talk) 03:15, 21 May 2010 (UTC)
That's worth considering, but not my preferred method for handling this for two reasons. First, it creates the problem that Hesperian and I don't want to create, that is, setting a precedent for splitting works by an author and works about an author onto two pages (Author and Category). Second, because categories are so limited. It's virtually impossible to give context in a category: for example, the category for works about Charles Dickens would have a work in it called "Charles Dickens," with no additional details (like that it was written by G. K. Chesterton in 1906) and no way to tie it to other works on Dickens except by alphabetization of publisher or author.

I think too that it's indicative of the limitations of categories that we even have the Author namespace in the first place, that Portals were developed on Wikipedia, and that many categories have corresponding pages in the main namespace on Commons. It does require more effort than simply putting a category on a work, but the usefulness for the reader is so much greater. As always, those who want to simply add a category or do nothing are free to do so; division of labor between those who post new texts and those who make them more accessible is permissible and even beneficial.

Also note, my proposal does not involve the creation of any namespaces. I simply want to expand our view of what fits into the Author namespace. Given the way the Author namespace is used now, it's my contention that it's not a big jump. —Spangineerwp (háblame) 04:17, 21 May 2010 (UTC)

There are four cases to consider:

  1. works about a non-person topic
  2. works about a non-author person
  3. works about an author
  4. works by an author

It does seem to me that the fundamental division here is "about" versus "by". However, Spangineer and I agree that 3 and 4 go together naturally: neither of us like the idea of having separate pages for works by Dickens and works about Dickens. Spangineer's proposal is to expand the Author: namespace to encompass case 2 as well. But this leaves case 1 high and dry, which is odd because 1 and 2 are so intimately related as to be barely distinguishable. Hesperian 05:32, 21 May 2010 (UTC)

For comparison I would note 5. works about works. These were mentioned during recent discussion and are currently linked by Author ('about') and Wikisource ('topic, subject') name-spaces and from the notes and versions pages in main ('works'). Cygnis insignis (talk) 06:50, 21 May 2010 (UTC)
I would say that this issue will come up again, until there is actually a "proper" use of namespaces. I don't see that it really helps, on a long-term view, to promote "incorrect" use of the Author: namespace while deprecating incorrect use of project pages in the Wikisource: namespace, and neglecting the Portal: namespace, and leaving the Category system in some sort of limbo with no serious guidelines. It does all depend what one takes as priority. I would take the line that navigation for the outside reader is the issue really needing to be addressed: the person faced only with the search box who would like to type in "Charles Dickens" and get answers. That doing that and hitting return gets to the disambiguation page "Charles Dickens" seems to me a good thing, and not something to be deprecated. In the case of Mary, Queen of Scots there is not yet a dab page (and should be). Wikifying by linking to non-author dab pages will be the best option in a number of cases. Charles Matthews (talk) 07:34, 21 May 2010 (UTC)
As a concrete proposal, I would "sacrifice" Charles Dickens by allowing that to become a cross-namespace redirect to Author:Charles Dickens, and do the same wherever there is an honest author page (parsimony with pages, profligacy with redirection to get good navigation); but stick to dab pages in the main namespace wherever there is no genuine reason to have an author page, adhering strongly to the idea that namespaces are there to keep different kinds of content separate. There will be a huge number of necessary biographical disambiguations when the EB1911 and DNB projects catch up with the Catholic Encyclopedia and other things already in place, running into thousands. For that reason alone the Author: namespace could become swamped with non-authors if it was decided to do it the other way. Charles Matthews (talk) 08:52, 21 May 2010 (UTC)

Following on from Charles' comment about Charles Dickens, one could make a strong case for moving all works into a Works: namespace, and retaining the null namespace (deliberately avoiding calling it the 'mainspace' here) for navigational purposes. i.e. the null namespace exists solely for routing search and link terms to the authors, works and subjects that they refer to, through redirects and navigation pages. Hesperian 11:05, 21 May 2010 (UTC)

Hesperian, I like that idea for its logical consistency, but not for the amount of work that it would involve, and the confusion that would reign while everything was getting sorted out. That's a big change, and I'm not sure what all the implications would be. Off the top of my head I know there would be thousands of incoming links to fix on Wikipedia. Plus we'd have the same issue we have now with people typing "Charles Dickens" into the search box and not seeing a list of works by Dickens, except that people would type "A Tale of Two Cities" and get search results. —Spangineerwp (háblame) 13:05, 21 May 2010 (UTC)

While not a perfect solution, I would like to propose that stylistic disambiguation pages in the main namespace possibly offer a solution here. As we have works that will include these people, especially with all our biographical work, we can set up Mary, Queen of Scots as a disambiguation page, and can link to the specific pages, to any author page, and again to WP. In fact, I have already been doing some of this and you can see it at Whateley, Richard (which also has a redirect to it from Richard Whately). It will be a slower process to build the topic matters. It addresses points #2, 3, 4 and maybe with some tweaking picks up #1 & #5; however, at the same time, the latter two points do seem to be covered in the Wikisource: namespace.

Although the Wikisource: namespace is already used to develop a subject hierarchy, I think a separate Subject: namespace might be preferable so Wikisource: is used for more administrative sorts of pages like this one. Perhaps redirects can be used when the Subject: is an Author: since I think it is desirable to keep the Author: namespace as it is. Bob Burkhardt (talk) 12:28, 21 May 2010 (UTC)

I like the main thrust of this proposal a lot. I am not so much a fan of using the author space for non-author people, but coming up with an alternative to deal with non-author people who are frequently written about (and there are tons) is a great idea. I have not had much time to think on this subject, but I have come up with one thing we would need to overcome (depending on how we solve this)
Using a Subject: namespace or similar is I think a good start, since that really does reflect what is going on: people who look up Mary, Queen of Scots are looking her up as if she were a research topic. But the nice thing about using Subject for non-author people is that it allows us to do similar things for non-people to begin with. So, people can look up World War I or American Literature, as well. We've got works written about those categories and by people in those categories. The ability to aid research is very large in this case.
Here's the drawback. We've currently got 4 different namespaces (Wikisource, Author, Portal, Category) devoted to doing a piece of this. If we are to do it well, we need to find a way to bring all of these together so that we aren't reduplicating effort or hurting ourselves in the long run. We need to ask if the way that Wikipedia splits things up (by having a main namespace supplemented by categories and portals) works for us. Currently it does not seem that this is the case. We might need to find a different way to bring all this information together in a way that is more relevant to a library and our own goals.—Zhaladshar (Talk) 12:56, 21 May 2010 (UTC)
If this is where we end up, we would use the Portal: ns, and hopefully/maybe we can create an alias for it, which would be Subject:. I wouldn't want more and more namespaces; we have enough. — billinghurst sDrewth
Just thinking "out loud": for us to have a structure similar to Wikipedia we would have a main namespace for works, a category namespace for generic listings of works, and a portal (or subject, etc.) namespace for authors, non-author people, and non-people. In that case, I think Hesperian's categories #1-4 and Cygnis's #5 would all be in the portal namespace. I'm not opposed to this, but there would be a lot of bot-level work to make this come together. —Spangineerwp (háblame) 13:25, 21 May 2010 (UTC)
I'm also just thinking out loud, too. Do we even want WP's model? (I'm not asking this with any predispositions, I just want to know if others think it's a good formula.) I know we're constrained by the software, but barring human creativity, do we think this is the best way of organizing our information both now and as we continue to grow (do we think this is ultimately scalable so that we don't have to revisit this topic in a few years)?
It also sounds like we would be getting rid of the WS namespace as a categorizing mechanism? If we have the tripartite main-category-Subject, would we be moving all those indices to the Subject and once again devote the WS namespace to what it was originally meant to be?—Zhaladshar (Talk) 13:46, 21 May 2010 (UTC)
The Wikipedia solution is a consistent solution, but it's a messy one for us in terms of getting there. I'm still trying to wrap my head around the implications.

As for Wikisource indices, as far as I'm concerned, the answer to your question is an emphatic yes. I'd love to see those things out of the project namespace. —Spangineerwp (háblame) 23:56, 21 May 2010 (UTC)

Bob and Charles are both suggesting using disambig pages in the main namespace to handle non-authors. My concerns:
  • It's not a very clean solution either. We add cross-namespace redirects, and have to move pages from the main namespace to the author namespace when we find a work that that person wrote. For example, Beethoven.
  • It's a concept that I don't feel "adher[es] strongly to the idea that namespaces are there to keep different kinds of content separate," because the main namespace will get cluttered with dabs and redirects. Under my proposal, a reader who lands in the main namespace is virtually guaranteed to get a work, not a navigational aid, unless there are multiple works with the same name. —Spangineerwp (háblame) 13:05, 21 May 2010 (UTC)
Well, redirects are in no namespace. Dogmatically one would like the situation where it would be true to say that type of content predicts namespace; and also vice versa. But from the point of view of navigation and search it is the first part that matters more. Charles Matthews (talk) 14:36, 21 May 2010 (UTC)
Yes, you're right about navigation and search being more important in this case. But it seems like cross-namespace redirects are the only "solution" to the search problem... and I'm not a fan. I think that's the basis of our disagreement—given "cross namespace redirects are ok", your solution is reasonable, but given "cross namespace redirects are bad", I think mine or a derivative of it (using Author and/or Portal/Subject for all organized navigational pages except dabs) is preferable. —Spangineerwp (háblame) 18:40, 21 May 2010 (UTC)
@Spangineer. I would say that initially unless they are an author, that they would get a redirect that points to a work, and as my gut feel is we will get at least a bio on them at some point, be it EB1911, Catholic, Appleton's DNB, IrishBio, ... that when the bio appears it becomes a disambig page, and many will be needed almost certainly for the people at some point. FWIW I do not advocate cross-namespace redirects, I would much prefer a substantive landing page within the namespace. — billinghurst sDrewth 15:06, 21 May 2010 (UTC)
If I understand you correctly, you're saying that Henry II of France should redirect to 1911 Encyclopædia Britannica/Henry II Of France (a red link, obviously) until said EB1911 article is created, at which point Henry II of France is changed from a redirect and populated with links (including 1911 Encyclopædia Britannica/Henry II Of France)? I don't prefer this, as while we wait for the EB1911 article to be created, there's no functioning link and ease of use suffers. Furthermore we don't know if the EB1911 article will actually be the first biographical/informational article on the subject. Or did I misunderstand? —Spangineerwp (háblame) 18:40, 21 May 2010 (UTC)
  • I think we could find a solution to make the Category idea work. Right now we have a separate section of the Author: page that holds the "about" works. If we could populate that section from the Category system then we wouldn't have to maintain two separate pages. I imagine these Category pages will exist whether we decide to link to them or not. So the existence of two separate pages isn't the question.--BirgitteSB 23:35, 21 May 2010 (UTC)
But look at how we handle the works by the author. We break them out by type, we give the publication year and related notes, we organize works by collection. Using the category system, we do not have any of these capabilities—we are limited to a mere alphabetical list. We don't use categories for "by" works for good reason, and for those same reasons we shouldn't use them for "about" works. —Spangineerwp (háblame) 23:50, 21 May 2010 (UTC)
Although I suggested it earlier, after giving the matter more thought, I agree with Spangineer that using categories in this way would be a suboptimal solution. It really should be done through some sort of editable page. cmadler (talk) 00:01, 22 May 2010 (UTC)

We all seem to agree that there should be a single one-stop-shop page for Charles Dickens, listing both works by and works about. No-one seems to dispute Spangineer's original premise that a page that lists works about Charles Dickens is fundamentally the same as a page that lists works about Mary, Queen of Scots, and these ought not to be in different namespaces. And no-one has argued against my observation that a page that lists works about Mary, Queen of Scots is fundamentally the same as a page that lists, for example, works about the French Revolution, and these, too, ought not to be in different namespaces. It follows that we should have a single namespace for all our meta-content.

The difficulty lies in what such a namespace should be called. Author is obviously inappropriate. The proposed Person doesn't go far enough. Subject and Topic overlook the authorship aspect, which surely comprises the majority of our meta-content. Meanwhile, Charles observes that the Portal namespace is neglected.

So why don't we simply move (or alias) all our Author pages, and all our Wikisource index pages, into the Portal namespace? Portal:Charles Dickens seems like an eminently suitable title for a page that purports to be a one-stop-shop for works both by and about the man. Portal:Mary, Queen of Scots ditto. Portal:French Revolution ditto. Hesperian 01:40, 22 May 2010 (UTC)

I like this. I didn't think we'd want to bite this much off all at once, but to me this is the logically consistent approach that wreaks the least amount of havoc on our existing content. Granted, the perennial "search problem" is not addressed, but I don't think there is a way to address it without either thousands of cross-namespace redirects or a software update. This will also require a sizable maintenance undertaking. But in the end, we'll have unified navigational pages that display everything Wikisource has to offer on that subject. As such pages develop, it will be natural to "build the web" for each of those pages, so that they receive incoming links from throughout the library. The combination of organized lists of works about a subject and existing tools like What Links Here will really make our library easier to use, and therefore more valuable to readers and researchers. —Spangineerwp (háblame) 02:26, 22 May 2010 (UTC)
The main problem I have with this is that if we put all this in the same namespace we will no longer be able to search the Author namespace and only come up with results that are authors. That is a feature I actually use all the time. I don't understand what is the objection to redirecting Portal to Author namespace where applicable to preserve this functionality. What is the functional advantage to having authors mixed in the same namespace as topical titles?--BirgitteSB 03:06, 22 May 2010 (UTC)
Perhaps I don't understand how you are using the search functionality, but how often do a topic and author have similar names? It seems to me that entering a search term that you hope gives you an author will rarely be similar enough to a topic name that the author is displaced from the search results list. —Spangineerwp (háblame) 17:12, 24 May 2010 (UTC)
Although we do have naming conventions on authors, you still can't be sure how the person who may have created an author page named it. So when you do not find an author page under the expected name, it is best to search the author namespace for the last name to make sure nothing turns up (or double check the ones that do to make sure they really aren't who you are looking for). Also, sometimes you do not know an author's full name and are trying to track them down from just a surname. As many authors share last names with non-authors, consolidation without redirection will just add more false positives to those searches. It will also kill our random author feature, which I also like and use (that use is purely entertainment). And it will kill these things I like and use for no functional reason that I can determine. Is there any functional reason that you comprehend for consolidating this all into one namespace without redirection? Or to rephrase that, would using redirection to preserve the Author namespace result in any loss of function to the schema you are imagining?--BirgitteSB 03:53, 25 May 2010 (UTC)
I like the idea of having, for example, Portal:Mary, Queen of Scots to collect everything by, about or otherwise related to Mary, Queen of Scots. I think moving the indices to corresponding portals is a good idea too. However, I wouldn't want to move the authors as well. Portals should be used as higher levels of a topic, whereas Author is very specific. While there could be author-based portals, Portal:Charles Dickens or Portal:Shakespeare seem likely, I don't see a need for a portal for every author on Wikisource. Keep them to just those authors who require multiple forms of indexing (those that count as subjects in their own right, such as Dickens and Shakespeare). These portals may duplicate information on the author page, or may just link back to it, in addition to further links (which I guess would be along the lines of biographies, reviews, critiques, analyses, inspirations, derivative works and so forth).
If by "alias" you mean create a redirect in the portal space for every author that does not yet have a portal, that would be OK but I don't think it will really be that necessary. If that does go ahead, it will need a bot creating the new portal-redirects whenever a new author page is created (as a casual user will not necessarily be aware of the practice and gaps will grow over time). - AdamBMorgan (talk) 15:43, 26 May 2010 (UTC)
I (too) like best the idea of using the Portal namespace for expanding WS capabilities. Portal:John Milton, Portal:Charles Dickens, Portal:Shakespeare and Portal:Bible are some I would very much like to see. The last one is desperately needed, as I have a very hard time finding and cross-checking the various historical translations of the Christian Bible, which are used for citations to support Wiktionary entries. Translations of the same text from different dates are valuable, but it's hard to find the appropriate texts unless you're already aware of the publication dates, authors, and idiosyncratic page names of the several versions. Likewise, Shakespeare is in need of a portal if there are ever to be copies of the First Folio editions of his plays (the current texts are all modernized editions without source information). There is so much information on the dating of the texts, and information on changes in modern editions, comparisons between early copies of some plays (e.g. King Lear), etc. that cannot be succinctly summarized in an author page but which are critical to the understanding of the source material. Any work that exists in multiple editions or multiple English translations will have similar problems that can't be handled easily through an author page. --EncycloPetey (talk) 06:01, 30 May 2010 (UTC)
This can be done now, and we would encourage you to utilise the Portal: namespace for such work; it is just not particularly utilised by people on-site. The discussion originated with the linking of texts and how we might do this in our main namespace, and where to link. Truth be known, I would much prefer the Portal: namespace be used more that way, with plenty of freedom and fewer controls; that said, if we moved Author: pages over they would not fit well, as those pages need a higher level of controlled formatting. — billinghurst sDrewth 08:33, 30 May 2010 (UTC)
  • @AdamBMorgan Spangineer was suggesting moving all Author:Foo to Portal:Foo. I think it is better to merely create Portal:Foo as a redirect to Author:Foo in order to preserve certain functions that require authors to be in a discrete namespace. Whether all authors are given such a redirect, or only those with works written about them, or those meeting some other criteria, I am ambivalent about.--BirgitteSB 01:34, 31 May 2010 (UTC)
  • I would like 'Author' to be renamed 'Person', however we have already allowed non-author Author pages. I would also like the topical indexes to be separated from the project namespace, and Page and Index merged. John Vandenberg (chat) 09:11, 11 June 2010 (UTC)

Attempt at a summary

Reading this through again it looks to me like there isn't too much support for a) putting non-authors in the author namespace or b) putting all authors in the portal namespace. However, there is significant support for building the portal namespace with information about topics and people (including, perhaps, some authors). Given this, it sounds like we are talking about something similar to the status quo, but perhaps with a few changes in emphasis:

  1. We want to see an increase in the use of the Portal namespace
  2. As a corollary to (1), we encourage the replacement of links to Wikipedia with links to pages in the portal namespace
  3. The portal namespace should primarily contain pages on topics and non-author people, and should not attempt to replace the author namespace
  4. Despite (3), in some cases, we may find it useful to have a page about an author or work in the portal namespace, in order to provide background information or specialized organization in a way not typically used in the author namespace
  5. In the case of (4), we encourage the co-existence of two pages (Author:Foo and Portal:Foo) and links between them
  6. Existing index pages in the Wikisource namespace should be moved to the Portal namespace
  7. The judicious use of cross-namespace dab pages (such as Charles Dickens) should continue

Any disagreements with these points? —Spangineerwp (háblame) 14:15, 31 May 2010 (UTC)

  • Agree with above, move to accept/formalize as the standard. JeepdaySock (talk) 15:16, 1 June 2010 (UTC)
  • Not entirely sure about point 5; strongly agree with everything else. Hesperian 00:51, 2 June 2010 (UTC)
  • Thank you for making the effort to summarize this all! I can agree with all the points you have laid out. I would suggest these be emphasized more as provisional experiments than firm rules. After putting these concepts into greater practice we may discover other options or inherent pitfalls that we have not yet imagined. But most likely they will just become "the way things are done." On a side note we also need to consider the state of archaic Portals which were originally envisioned more as mini-Main Pages (see Portal:Speeches) instead of indices. I am not sure how many are out there in this archaic state and I think they are largely unmaintained, but it would be best to reach out to those stakeholders at this point.--BirgitteSB 03:15, 2 June 2010 (UTC)
  • Accept, need to convert to overarching principles. 1) yes, an informational and exploratory space; 2) yes; 3) author: ns is as a directory; 4) yes, beyond listing (see 1 & 2); 5) absolutely, and we should be looking how we do that neatly and explicitly within or around {{header}}; 6) yes, though clarity will be needed (see below); 7) dab pages in main ns should focus on dab'ing works, and then have the additional feature of pointing to relevant pages, noting that we already have some dab author: ns pages (otherwise that becomes a separate discussion). — billinghurst sDrewth 04:28, 2 June 2010 (UTC)

Things that we therefore need to address (and once we have the questions, probably should be moved to a new section) — billinghurst sDrewth 04:28, 2 June 2010 (UTC) ...

  • clarity b/w Wikisource: ns and Portal: ns, actually we need to give clarity around all our ns, and I think I prattled about this at an earlier point of time
    Help:Namespaces has been added to the mix
  • A structure of portal ns, how loose? encourage the use of {{process header}}?
    I propose that all Portal: ns pages utilise {{process header}} a continuation of the position of the Wikisource: namespace.
  • Allowing x-ns links for an extended period (exemption from the speedy delete for existing pages moved)
  • Reviewing the default search ns parameters, so that portal is among them.
    I am told that this is a change we cannot access directly [wgNamespacesToBeSearchedDefault needs to be changed, requires a bugzilla request] default search
  • Alignment with similar portals and projects and the like at sister wikis
  • Might we need to dab in the portal ns? If we find a topic/author/... that has multiple meanings, how are we going to deal with it? (unsigned)

Seems to move in the right direction. Charles Matthews (talk) 07:29, 2 June 2010 (UTC)

Does Item (6) include the 'index' Wikisource:Authors? Cygnis insignis (talk) 05:42, 3 June 2010 (UTC)

This seems to summarize things very well. But as an extension to Cygnis' question, would 6 include the whole Category:Wikisource index pages? Maximillion Pegasus (talk) 20:43, 3 June 2010 (UTC)
I would say a definite no to Wikisource:About, Wikisource:Index, Wikisource:Index/Tools and scripts, Wikisource:Index/Community, and Help:Contents. Those stay where they are. Not passionate either way about Wikisource:Authors, but I think it would fit nicely in the portal namespace, or even the author namespace. All the rest (geography, religions, plants, types of documents, etc.) should go to the portal namespace. —Spangineerwp (háblame) 21:39, 3 June 2010 (UTC)
I have submitted a bot request to this effect. —Spangineerwp (háblame) 23:28, 10 June 2010 (UTC)

To do

Things that will need to be addressed as a result of the updated approach. — billinghurst sDrewth 13:18, 10 June 2010 (UTC)

  • Template:Indexes will need to be updated to point to Portal: namespace, though will need to wait until after all pages moved.
  • Review search parameters, especially default
  • In WS: ns, redirects for a period of time, convert to dated soft redirects when all have been moved, and suggest that this be the time to update {{indexes}}

I've submitted a bot request that all be moved more or less simultaneously, and redirects made soft immediately. Does this align with what you are thinking? —Spangineerwp (háblame) 23:28, 10 June 2010 (UTC)

  • {{categorybrowsebar}} looks to be a historic (hysteric?) artefact, and needs a rethink as we go forward into Portals

Vector skin and interwikis

Proposal on the central wikisource project as this affects all Wikisource projects: oldwikisource:Wikisource:Scriptorium#Vector_skin_and_interwikis.

I think interwikis are a core part of our interface, and we should not adopt the new Vector skin until Wikisource projects are configured to always display them. John Vandenberg (chat) 02:17, 4 June 2010 (UTC)

Other discussions


I have just noticed, which provides a list of wikisource pages discussed on It is an interesting mixed bag. Sadly, almost all of the items on reddit have no pagescans or provenance information. John Vandenberg (chat) 04:10, 24 June 2010 (UTC)


Bot flag request for SKbot

I request a flag for SKbot. It is based on my own class library and written in Object Pascal / Delphi. The code has full Unicode support. I plan to use the bot for various ad-hoc tasks where a complex algorithm is needed and a standalone bot is not effective. The test task is completed (see [Special:Contributions/SKbot bot's contributions]).

In the near future I want to write a Wikisource-specific interwiki bot. Another task is analysis; e.g. last autumn the Russian domain finished the complete works of Chekhov, and via my bot I listed his works at en: which have no ru: iwiki. In any case my bot can do tasks that do not require deep knowledge of the en-wikisource structure.

Personally, I'm a professional DB programmer with 15 years' experience. -- Sergey kudryavtsev (talk) 13:00, 9 May 2010 (UTC)

PS: Currently SKbot has a bot flag in Russian Wikisource and Wiktionary (most bot experience), German and Polish Wikisource. -- Sergey kudryavtsev (talk) 13:08, 9 May 2010 (UTC)

Gday Sergey. You talk about the interwiki tasks, and I am very comfortable with the work that you have been undertaking and thank you for that. I am comfortable with your bot continuing that work. With other tasks, would you be looking to help with things identified at Wikisource:Bot requests or initiatives of your own? I wouldn't want to give an open approval to do any task, though if you were comfortable with introducing the non-interwiki tasks at Bot requests prior to their being undertaken, as a community checking mechanism, then that sounds like it could be a beneficial arrangement. At such times, it would be useful to know whether they are automatic, or semi-automatic. As bot edits are generally not seen unless someone is watching a page, it is useful to know what is being done and when.
In fact, I think that our good practice should be, when we are letting our bots do ad hoc tasks, that we make a one-line note of what is being done, and when. We have been fairly casual about letting the community have this information, and I am just as at fault as others with that. — billinghurst sDrewth 01:01, 12 May 2010 (UTC)
Currently, all undertaken tasks are my own initiatives. I run them in a semi-automatic manner (under the IDE debugger, to stop work if anything goes wrong). I will publicise at Wikisource:Bot requests those tasks which I plan to launch, and will record the completed tasks and run logs at the bot's user page. -- Sergey kudryavtsev (talk) 06:51, 12 May 2010 (UTC)
While I'm pretty sure that you'll be fine and your bot will be fine, I am interested to know what other tasks you think it would do: you have been a bit vague, so some other information would be great. For example, what tasks, other than interwiki, has it been doing on ru.wikisource and ru.wiktionary, etc? Jude (talk) 02:53, 12 May 2010 (UTC)
Well, I clearly understand your distrust. ;-)
First, look at the "Chekhov's Translations" section of User:Sergey kudryavtsev/Sandbox. This information was collected by my bot in order to bring to light Chekhov's works which have no Russian iwikis. Such a task I call "an analysis".
Second, look at this diff. This is a typical "ad-hoc task" with iwikis. I run this task on the Russian, German, Czech, Spanish, Latin and other Wikisource domains.
Third, in Russian Wiktionary my bot replaced generic template calls with more specific ones when a verb's conjugation is known. E.g. see this diff. Another task was replacing the deprecated template "падежи de" with "сущ de/m", "сущ de/f" or "сущ de/n" depending on the noun's gender, e.g. like this. This is a typical "ad-hoc task" too. -- Sergey kudryavtsev (talk) 07:58, 12 May 2010 (UTC)
Hi, this looks good. Are you confident with our policy at Wikisource:Bots? Also, it would be nice if you published your bot scripts, so others can learn from you.--GrafZahl (talk) 17:56, 14 May 2010 (UTC)
I have read Wikisource:Bots, of course. One day in the future I shall publish my code library. But for now it lacks too many important functions, and I have too young a son (a year and five months) to devote myself seriously to this task. :-) On the other hand, I may publish the code of accomplished tasks to inform or teach interested programmers. -- Sergey kudryavtsev (talk) 22:52, 15 May 2010 (UTC)

Flagged.--BirgitteSB 02:47, 2 June 2010 (UTC)

Thanks! I hope that my bot will benefit English Wikisource. -- Sergey kudryavtsev (talk) 07:34, 2 June 2010 (UTC)

Legality of scanning PD sections from a non-PD publication...?

Does anyone know how legal it is to scan pages from a book (or any other publication) if the text on those pages is in the public domain? There are a few texts I own in recently published books that are in the public domain but the rest of the book (including other texts, covers, introduction etc) is still under copyright. Scans are preferred here, especially with the possible phasing out of TextQuality, but public domain publications can be expensive.

Take "The People of the Black Circle" — a Conan story printed in three issues of the pulp magazine Weird Tales in the mid-1930s. (It isn't as must-have as the Works of Shakespeare but nevertheless where my interest lies.) The copyright of the Weird Tales version was not renewed, so it is in the public domain. Buying these three issues, however, would cost over a thousand dollars and no one else has scans available. On the other hand, I have multiple copies of this story in various books printed over the last few decades. Can I scan the pages from one of these books, upload it to Commons, and proofread on Wikisource?

(As it happens, I do mean to scan some issues of Weird Tales in the near future, which is part of why I started indexing the magazine here. It will just be some of the other, cheaper, issues and I've hit some delays anyway.) - AdamBMorgan (talk) 13:57, 21 May 2010 (UTC)

IANAL, but my guess is that it might be permissible, due to the Bridgeman Art Library, Ltd. v. Corel Corp. decision (a U.S. District Court ruling, not a SCOTUS one), which found that "slavish copying," although doubtless requiring technical skill and effort, does not qualify for copyright protection. Only "a distinguishable variation", something beyond technical skill, will render the reproduction original. But there may be other considerations applying to a compilation, page layout, etc., which do give copyright protection to the reprint. cmadler (talk) 14:49, 21 May 2010 (UTC)
I am curious about copying old stuff which is owned by a library. Let's say a book which is contained in a "rare books" or microfilm collection. I've seen works of art reproduced with a comment something to the effect of "with permission of the such-and-such gallery", which leads me to believe that the owner of the work is claiming some sort of property right of reproduction. Would anyone claim such a property right in a rare book from the 17th century? I could understand if someone would be upset if their entire microfilm collection, something which they've put a lot of time and effort in producing, were copied and distributed for free. TomS TDotO (talk) 15:05, 21 May 2010 (UTC)
Such a right is often claimed (see w:Copyfraud) but appears not to exist under US law. Current widespread practice on Wikimedia Foundation projects is to ignore such claims as invalid (see w:National Portrait Gallery copyright conflicts). See also [1] which gives a pretty good explanation of the issue. cmadler (talk) 15:26, 21 May 2010 (UTC)
Going back to the original question, the more I read about it, the more it seems that a faithful reproduction of a PD work remains PD. However, some (many?) reprints of PD works are re-edited which might (though not necessarily - I think it depends on the degree of change and the amount of originality; a few minor typographical corrections might not be sufficient) allow copyright, but only on the changed/new portions. To the extent that it is not changed, it remains PD. cmadler (talk) 15:32, 21 May 2010 (UTC)
That's the big problem, and a reason I'm concerned about a lot of our Lovecraft. The original publication can be free and clear, but the new edition, if they edited the text or undid editing done in the original printing, would have a new copyright. Spelling and punctuation aren't a big deal, but it frequently goes beyond that.--Prosfilaes (talk) 03:46, 22 May 2010 (UTC)
I think, regarding Howard, I can solve that problem. I know that the Lancer (1970s) editions were edited (and a separate copyright filed) while the Del Rey (2000s) editions have had all edits removed, going back to original manuscripts, which is probably just as bad. However, the Fantasy Masterworks editions are supposed to match the original magazine publications (the purest source available without the original type-written pages). If I do scan any books, if no other problems come up, the Fantasy Masterworks seem to be the best choice. There is one other source, a late 1940s magazine called Avon Fantasy Reader which doesn't seem to have had its copyrights renewed either. It reprinted selected pulp stories, including a few Howard and Lovecraft pieces. It might be the next best choice after Weird Tales itself (and I already own two issues). - AdamBMorgan (talk) 11:35, 24 May 2010 (UTC)
Incidentally, since typing "no one else has scans available" I have found almost half a dozen places that do have such scans, including one of the installments of The People of the Black Circle (I swear I searched beforehand for months without finding anything). The available scans are still not complete, so I am continuing with plans to scan more recent books, but I'll hold off on that for the moment. I think I will try scanning my own pulps first, then use others and fill whatever gaps remain as necessary. - AdamBMorgan (talk) 23:19, 4 June 2010 (UTC)

Dash-like marks

A paper in The Condor I've been proofreading has some long dash-like marks on pages such as Page:Condor25(6).djvu/8, which I don't know how to represent. Any ideas? —innotata 14:42, 22 May 2010 (UTC)

2 em dashes? It is what I normally do. — billinghurst sDrewth 15:22, 22 May 2010 (UTC)
I just went to take a look and I think it looks good with 3 em dashes on the first, 2 on the second, 1 on the third, en dash on the fourth. See what you think. cmadler (talk) 15:26, 22 May 2010 (UTC)
Dashes are what to use? The gaps don't look very good, but at least to me, this doesn't appear with en dashes; I just can't yet figure out how many en dashes to use in each case. —innotata 15:45, 22 May 2010 (UTC)
It's not using any en dashes. It's all em dashes and a hyphen. I agree with cmadler: 3, 2, 1, hyphen. I changed one of the en dashes to become an em to keep with the consistency.—Zhaladshar (Talk) 16:32, 22 May 2010 (UTC)
En dashes don't have spaces between them, at least on my browser and skin preferences. I tried using en dashes, but didn't save the page, since I can't figure out how many to use. —innotata 20:51, 22 May 2010 (UTC)
Both vary according to browsers and user preference, there is a way to avoid a space, box drawing characters:
I just use emdashes,
"tip———tip——tip—ip-prrrrr, ..."
I think this conveys what the author and editor intended. Cygnis insignis (talk) 21:16, 22 May 2010 (UTC)
I really think lines without breaks would be good, but there doesn't seem to be any way to represent this. —innotata 18:33, 24 May 2010 (UTC)
Thought-bubble, have you thought about cheating ?                        Cheating ...
— billinghurst sDrewth 07:33, 25 May 2010 (UTC)
That looks just right. Brilliant of you. —innotata 15:49, 26 May 2010 (UTC)
Belated suggestion for future reference: <s>{{loop|20|&nbsp;}}</s> ->                      Inductiveloadtalk/contribs 00:35, 11 June 2010 (UTC)
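For anyone preparing such markup offline, the suggestion above can be sketched in Python. This is only an illustration: it assumes that {{loop|20|&nbsp;}} simply repeats its second argument twenty times, and the function name continuous_line is made up for the example.

```python
def continuous_line(width: int = 20) -> str:
    """Return strike-through markup over `width` non-breaking spaces.

    Assumption: this mirrors what <s>{{loop|20|&nbsp;}}</s> expands to,
    i.e. the loop template just repeats its argument `width` times.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    # The struck-through spaces render as one unbroken horizontal rule.
    return "<s>" + "&nbsp;" * width + "</s>"


if __name__ == "__main__":
    print(continuous_line(3))
```

Changing the width argument lengthens or shortens the line, which is the same knob as the first parameter of the loop template.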

Jammed web page

Could an administrator please erase the contents of THIS WEB PAGE (but not delete the page itself), because it's jammed up and is no longer accessible. It contains 10,000+ transcluded lines and I must re-design it. This is an original page by User:Mattwj2002 who passed management of it to me. Thanks. - Ineuw (talk) 13:48, 31 May 2010 (UTC)

Done. The difficulty to load the page is there for all of us, so I loaded an early version from the page history and was able to delete its contents. — billinghurst sDrewth 14:07, 31 May 2010 (UTC)

Thanks. . . . again. - Ineuw (talk) 21:41, 31 May 2010 (UTC)


I simplified access to the index pages and wanted to add additional navigation to the notes of the header which would facilitate direct access between Indexes without having to open each TOC. My attempt to insert a 3 column 1 row table into the existing header (notes =) (shown below), didn't work:

Volume 1 Index | Scans | Volume 3 Index
Did you look to see if {{RunningHeader}} was of use? It works out well in other places. — billinghurst sDrewth 16:29, 5 June 2010 (UTC)
Addendum. Why not add the prev/next volumes to the prev/next links, and let the body text cover the specific links. Also why add the link to scans? It is provided in the SOURCE tab at the top of the page. — billinghurst sDrewth 16:32, 5 June 2010 (UTC)

The other idea is to create (an eventual 92) transcluded index pages with their own headers, but this seems wasteful. I welcome any idea from the community. - Ineuw (talk) 16:11, 5 June 2010 (UTC)

fwiw... The only way I know of to overcome the rigidity of the notes parameter found within the basic header template is to use straight HTML (EXAMPLE). Of course the longer the text/link, the more problematic spacing, padding and wrapping become - but for what you need, it seems to look fine. Other than that, the next best thing, for me, has been {{RunningHeader}} as was suggested above. George Orwell III (talk) 02:18, 6 June 2010 (UTC)

Hmmm; This is a lot of good food for thought. Thanks for the pointers. The 'Scans' link is a leftover from my first attempt, copied from elsewhere; I never paid attention to it. Thanks for pointing it out. Obviously, the {{Running header}} is the attractive option. - Ineuw (talk) 02:43, 6 June 2010 (UTC)

It seems like the logical center link should be to the WikiProject - "List of Indexes" subpage but I'm not sure if quick navigation across all the volumes is the intended "goal" here or not. George Orwell III (talk) 03:31, 6 June 2010 (UTC)

Yes, the intended goal is exactly that. A quick navigation or sequential access. Placing the access to the project Index pages in the center is a very good idea, thank you! Before this, I also considered merging/splitting the index entries and creating a page for each letter of the alphabet. It would have been great for research, but technically is not feasible for the same reason that the original web page had to be dismantled. Based on what has been collected so far, the completed project would need an average of 1350 links per page. So, the running header navigation is the best choice. - Ineuw (talk) 05:30, 6 June 2010 (UTC)

British Museum tie-ins

I took part in yesterday's British Museum Backstage Pass event. I'm now interested in any tie-in results that can be shown on Wikisource. There are in fact numerous author pages here for writers who also worked for the British Museum. Those pages could be improved themselves: or works by those authors could be added. In quite a few cases it would be possible to add works about those authors. Would it be appropriate to set up Portal:British Museum for all related pages, in the light of recent discussion about use of various namespaces? In any case I'm looking out for concrete contributions that can be added to the list of knock-on achievements of the day, with a time scale of a month. Charles Matthews (talk) 13:07, 5 June 2010 (UTC)

I would have said yes, and possibly a subpage for the employee/author listing, and I would have thought that we could make such a task {{CotW}}plus. I know that I have done lots of pages at Dictionary of National Biography, 1885-1900/List of Contributors and there are resulting obits for a number. — billinghurst sDrewth 16:20, 5 June 2010 (UTC)

I have done a quick list based on searching a couple of ways, including 35 existing author pages where there is the direct connection of working at the Museum. Charles Matthews (talk) 16:57, 5 June 2010 (UTC)

From a rough scan, I get 56 results for British Museum in the Author:. At this point, I don't have access to other tools to refine or compare. — billinghurst sDrewth 00:58, 6 June 2010 (UTC)
Now tarted up and moved out. Charles Matthews (talk) 20:50, 7 June 2010 (UTC)

Do you want to include works like Notes on Cookery of the ancients, which lists the source as the British Museum? - AdamBMorgan (talk) 10:04, 10 June 2010 (UTC)

I wouldn't mind a section on works taken from manuscripts in the BM. It really depends whether it helps the portal, which is supposed to be a user-friendly way to navigate towards what you want, however precisely or vaguely you knew a connection with the BM. Charles Matthews (talk) 11:27, 10 June 2010 (UTC)

Since until 5 minutes ago we have had no indication to users on the uses of each of our many namespaces, and given the introduction of a new one (Portal), I have created a page listing the namespaces we have here at Wikisource along with a brief explanation of the use of each. I invite everyone to have a look and tinker with it. Hopefully it will be a useful page to help users decide where to put new material. Inductiveloadtalk/contribs 02:41, 8 June 2010 (UTC)

I love it. John Vandenberg (chat) 03:09, 12 June 2010 (UTC)

I am proposing that we rename Collaboration of the Week to Current Collaboration. We are not getting weekly updates, nor even close to weekly updates, and it is a pretty quiet space anyway. I would suggest that

  1. It is renamed so it does not give false expectations
  2. It needs active volunteers to invigorate the beast, and most likely someone who is prepared to step up and coordinate

Noting that this is not one that I wish to oversee. Once we have done that, see how it can be salvaged, then we can review again. — billinghurst sDrewth 03:02, 9 June 2010 (UTC)

 Support The last one seems to have been in February.--Longfellow (talk) 17:09, 9 June 2010 (UTC)
Though we do rotate that monthly, so it does add pertinence, though do see that alignment.— billinghurst sDrewth
Suggested for the opposite reason, because of its success the rotation is greater than monthly :) Cygnis insignis (talk) 07:58, 12 June 2010 (UTC)
Though we do rotate on the first of the month, and we do have some works that are not complete and still rotated out. Plus I do like my little cheat that closes the PotM and puts in little works for validation, it has been quite successful so far. :) It sometimes comes down to the works under consideration. Depends on what we are trying to achieve, as I know that we do get people who seem to return early in the month to look at the PotM on offer. — billinghurst sDrewth 10:52, 12 June 2010 (UTC)
... perhaps a title that describes the focus, authors. Cygnis insignis (talk) 12:00, 10 June 2010 (UTC)
Personally I would like to not constrain to authors, and would like to see the interests of both Authors and Portals to have that ability to spruik. — billinghurst sDrewth 13:07, 10 June 2010 (UTC)
The other option is that we could utilise an existing word of Featured, so it becomes
  • Featured text
  • Featured proofread
  • Featured collaboration
Though it somewhat becomes a word play. — billinghurst sDrewth 13:07, 10 June 2010 (UTC)
That might be a little confusing compared to how Wikipedia uses the term and how many users we get that were brought up in the WP system.--BirgitteSB 02:06, 12 June 2010 (UTC)
  • I've notified Sherurcij; maybe he plans to revive this in the future. I would prefer it to be left at the current name because the concept is based around a weekly cycle: short bursts to spruce up a narrowly defined topical area. We also have Wikisource:Song of the day; I'm sure someone will come along and want to revive it. Maybe we can tag them as inactive projects. unsigned comment by John Vandenberg (talk) 2010-06-12T03:03:20.

Has anyone had a look at this? I think it's pretty impressive.

A fully searchable edition of 240,000 manuscripts from eight archives and fifteen datasets, giving access to 3.35 million names.

P. S. Burton (talk) 18:07, 10 June 2010 (UTC)

Someone care to update Samuel Mudd Documents?

Looking at this page and the subsidiary page, it would do well to move Samuel Mudd Documents to Portal:Samuel Mudd and then to redistribute the subsidiary pages from being subpages to where they actually sit within the hierarchy. If someone has the time to do, that would be great. — billinghurst sDrewth 13:05, 12 June 2010 (UTC)

Some of the material is in this published selection, which has an authority. Providing access to source material regarding a person whose name is ... err, sullied!, is an editorial decision that can already be done 'judiciously' at the place from whence it came. Cygnis insignis (talk) 07:55, 14 June 2010 (UTC)

PSM authors page

Insight is sought from those with experience with splitting dynamically growing pages, because THIS PAGE is beginning to slow down when editing. Rough calculations indicate that this list will grow to ~3,000 entries by the time all 92 volume titles have been collected. It currently has 712 names as of the end of volume 23. - Ineuw (talk) 15:13, 12 June 2010 (UTC)
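As a rough check on that estimate, here is a minimal sketch. It assumes names keep accumulating at the same average rate per volume as in the first 23 volumes; that linear assumption is not stated in the post, and the function name is made up for illustration.

```python
def project_entries(current_entries: int, volumes_done: int, total_volumes: int) -> int:
    """Linearly extrapolate the final list size from progress so far."""
    if volumes_done < 1:
        raise ValueError("need at least one completed volume")
    # Multiply before dividing to keep the arithmetic as exact as possible.
    return round(current_entries * total_volumes / volumes_done)


if __name__ == "__main__":
    # 712 names after 23 of 92 volumes.
    print(project_entries(712, 23, 92))
```

With the figures quoted above this gives 2848, which is consistent with the "~3,000 entries" estimate.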

Put section breaks into your table, so you only edit part; or split it up into multiple pages (subpages?). If you want to have a long running list, then transclude the pages back into one holding page. — billinghurst sDrewth 15:20, 12 June 2010 (UTC)

Thanks for the reply. I was studying the implementation of the Author pages index since this is nothing more than a transcluded variation. Then, I remembered that the purpose of the list is to avoid duplication of author pages and to indicate contributions of the most recently harvested article titles. Eventual splitting to subpages is the best solution. - Ineuw (talk) 14:59, 13 June 2010 (UTC)


Author page was split to A to K and L to Z - Ineuw (talk) 16:48, 15 June 2010 (UTC)

Standard editing toolbar

Is there a way for a user to remove some buttons from the toolbar that comes with the editor? e.g: The horizontal line, Level 2 header, external links, and signature are of no use in 19th century text editing. - Ineuw (talk) 14:41, 14 June 2010 (UTC)

Assuming you're using Monobook style, add this to Special:Mypage/monobook.css:
#mw-editbutton-extlink { display: none !important; }
#mw-editbutton-headline { display: none !important; }
#mw-editbutton-signature { display: none !important; }
#mw-editbutton-hr { display: none !important; }
Prosody (talk) 00:54, 17 June 2010 (UTC)

Many thanks. I use Vector and it works fine. Could you point me to the pages where I can find references to the rest of the buttons? I assume it's Wikimedia. - Ineuw (talk) 03:35, 17 June 2010 (UTC)

Can't find any canonical listing. Here are all the current ones; just replace the bit after the # and before the { with the corresponding name:
  • Bold: mw-editbutton-bold
  • Italic: mw-editbutton-italic
  • Internal link: mw-editbutton-link
  • External link: mw-editbutton-extlink
  • Level 2 headline: mw-editbutton-headline
  • Embedded file: mw-editbutton-image
  • File link: mw-editbutton-media
  • Mathematical formula (LaTeX): mw-editbutton-math
  • Ignore wiki formatting: mw-editbutton-nowiki
  • Signature: mw-editbutton-signature
  • Horizontal line: mw-editbutton-hr
And if they ever add any new ones, to get the name, view the source of the edit page (usually under Edit or View menus in graphical browsers), search (usually Ctrl+F) for "addButton", you'll see a list of lines like addButton([a bunch of things in quotation marks separated by commas]). Each of the lines represents a button. The second of the comma separated things is the alt-text which explains what the button does. The last is the name. Prosody (talk) 18:47, 17 June 2010 (UTC)
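The search described above can also be sketched in Python. The sample lines below are made up to match that description (quoted, comma-separated values, with the alt-text second and the name last); the image file names and texts are placeholders, and real page source may be shaped differently.

```python
import re

# Hypothetical excerpt of an edit page's source, shaped as described above.
SAMPLE_SOURCE = r'''
addButton("button_sig.png","Your signature with timestamp","--~~~~","","","mw-editbutton-signature");
addButton("button_hr.png","Horizontal line (use sparingly)","----","","","mw-editbutton-hr");
'''

CALL = re.compile(r'addButton\((.*)\);')      # one call per line
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')   # one double-quoted value


def find_buttons(source: str) -> list[tuple[str, str]]:
    """Return (alt_text, name) pairs, one per addButton(...) line."""
    buttons = []
    for call in CALL.finditer(source):
        args = QUOTED.findall(call.group(1))
        if len(args) >= 2:
            # Second value is the alt-text, last value is the button name.
            buttons.append((args[1], args[-1]))
    return buttons


def hide_rule(name: str) -> str:
    """Build a CSS rule in the same form as the ones posted earlier."""
    return "#" + name + " { display: none !important; }"


if __name__ == "__main__":
    for alt_text, name in find_buttons(SAMPLE_SOURCE):
        print(alt_text, "->", hide_rule(name))
```

Pasting the saved source of an edit page in place of SAMPLE_SOURCE would list whatever buttons that page defines, together with a ready-made rule for each.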

Thanks again. This gives more space for my custom buttons. - Ineuw (talk) 04:08, 18 June 2010 (UTC)

Need comment/advice on image cleaning

Uploaded the same image in two formats, both of which I tried cleaning offline before uploading to the commons. They are File:PSM V01 D113 Bracelets and hairpins background with reflection.jpg and File:PSM V01 D113 Bracelets and hairpins.jpg.

  1. I tried cleaning the first image in various ways, using IrfanView, Inkscape and Gimp. One can see that the shadow of the previous page is imprinted in the background like a watermark. (Appleton's the cheapskates used thin paper). All three softwares gave the same results, all unsatisfactory, although I am very weak with Gimp. Also, it's a single layer grayscale image.
  2. With the 2nd image, I erased the background manually, a task that has taken nearly an hour and this alone suffices to explain the purpose of this post. Can anyone suggest some pointers/softwares, as to how one can clean the background quickly? - Ineuw (talk) 20:40, 19 June 2010 (UTC)
It took a minute or two, including finding the original, but this is sure to be quicker than "Erased background - painfully long". I was, perhaps, heavy handed with the amount of black (ink), but this is the simplest method for good results: File:Popularscience test a.png The method is in the file's notes. Cygnis insignis (talk) 21:47, 19 June 2010 (UTC)
I tried to improve Help:Adding images a while ago. Please add any other useful information, such as the names of commands in various applications, to that page. Cygnis insignis (talk) 22:11, 19 June 2010 (UTC)

This is fantastic! I am sure it will meet the requirements of all critics. Thanks so much. Now for the $64K question: What software did you use? :-) - Ineuw (talk) 22:52, 19 June 2010 (UTC)

I am a luddite, I prefer technology that doesn't suck - it took 15 minutes to work out how to do everything I should ever need here. The answer is Preview, cost: bundled, so nothing; ease of use: very easy; requirement to fathom an elaborate application that does a hundred things I will never use: none; stability: bulletproof; speed: lightning; underlying R&D and technology: 20 years of, 3 years in advance of other common platforms; requirement for new computer: none. Cygnis insignis (talk) 23:35, 19 June 2010 (UTC)

Hmmmm, so, you're a Mac user. :-) In the meanwhile I figured it out with IrfanView (Windows), as well. It works fine. Should I post the instructions on the Image page? - Ineuw (talk) 00:30, 20 June 2010 (UTC)

I'm guessing that the native application would not do it, if that one is as simple and intuitive then instructions and another advert would be okay. Apart from the MacSmugness, the point I would hope to get across is that creating a reasonably good image is pretty simple when you have done it once - a user doesn't need to upload, run and become an expert on sophisticated software like GIMP. Cygnis insignis (talk) 01:16, 20 June 2010 (UTC)

I like your answer and agree completely. In between our posts, I already cleaned up some 50 images offline after a couple of false starts, and hope to replace all the 1st volume images by tomorrow night (EST) weather permitting. (Nice weather = biking). Until now, I avoided dealing with image manipulation because I am far more drawn to text. I just didn't know where to start, but then, a challenge is a challenge. :-).

I also didn't care for the quality of my uploads and knew that sooner or later it has to be dealt with. Then, the Commons was intimidating and slow to respond to requests because of its size. Now, in the past 24 hours everything fell into place over there, and then you gave me the key to the kingdom. :-)

As for GIMP, I gave up because the documentation lacks detail and I got nowhere. IrfanView is very simple and built for batch image management. Also intend to write a step by step instruction about how to clean up images quickly using that software. Thanks again. - Ineuw (talk) 03:49, 20 June 2010 (UTC)

Help files also suck, there are usually very few things that 95% of users want to know - the instructions for that are buried in pages of stuff you don't need or want to know. Regarding our help pages, many are either out of date, don't reflect the current SOP, or have been inadequately improved 'on-the-fly' by meself and others. If you want to get attention at Commons, start questioning the featured pictures or categories (I'm kidding, don't try this at home). GIMP is open-software, which is cool, but it seems geared toward those who already know how to use high-end IMPs. There are pared down, 'lite' versions of the same thing, I've used one called Seashore which seems okay. Cygnis insignis (talk) 05:21, 20 June 2010 (UTC)

The Thirty-Six Dramatic Situations

Hi. Related to the article at wikipedia:The Thirty-Six Dramatic Situations, I'd like to add the full source material (c.1921) here, copying it from

Is that appropriate material for this project? Which specific license would I tag it with? and which of the formats that offers would you recommend I start from? (I'm a regular wikipedian, but an inexperienced wikisourceror) Thanks for any advice. Quiddity (talk) 23:36, 19 June 2010 (UTC)

Yes. PD-US. DjVu. Rather than confound you with elaborate explanations, I went ahead and demonstrated: the index is being prepared at Index:The thirty-six dramatic situations (1921).djvu Cygnis insignis (talk) 23:55, 19 June 2010 (UTC)
And what Cygnis insignis was too polite to say is "You have our sympathies for being a regular wikipedian, and we are happy to help you amend your sick ways and become a regular here instead" ;) billinghurst sDrewth 03:09, 20 June 2010 (UTC)
Ha! Thank you kindly.
I can see that I'll be puttering away at the text for quite a while. (I've proofread the introduction thus far).
If I have minor questions about formatting (such as: should I use {{smaller block}} around the {{running header}} in each instance?), should I ask here, or at a different helpdesk, or just bug one of you two personally? Thanks again. Quiddity (talk) 05:06, 23 June 2010 (UTC)
Putting the question here will be helpful to others, we don't have any separate notice boards except for the admin one, and may prompt someone to improve the help files (cheers for that, btw). Anyone is welcome to ask me on my talk, though demonstrating is usually easier than explaining. Investigating other works in edit mode and using the template documentation is another way to grasp what is going on, although some things are workarounds and might be mysterious - like preserving the end of a paragraph with a nop. Make sure you check your preferences to watch what you edit, I usually try to explain what the diff is showing. The running header can contain different formatting for each part, or the whole thing as a larger or smaller font for the block of text. The effect on regular text is font-size and line spacing when displayed, larger for title pages and smaller for quotes for example.
Billinghurst's kidding comments aside, this part of wikimedia has an inherent difference deriving from its objective purpose - not an imposed culture that seeks to isolate one from the other. Cygnis insignis (talk) 08:06, 23 June 2010 (UTC)
Sheesh Cyg, I struggled to interpret that. I think what he is saying is that we have the one swimming pool — no biting, no scratching, no weeing — and there is no disagreement about that approach. Naturally talk pages work, though, the answers (and the answerers) will probably be the same. unsigned comment by Billinghurst (talk) 11:39, 23 June 2010 (UTC).
I'm not sure why you created this tangent, I read a non sequitur and I assumed you were kidding with your "too polite to say ...". The fact is I would not say or think anything like that, because this is simply another part of wikimedia: cultural differences derive from its scope, not from suggestions, jocular or otherwise, that there is an 'us and them' mentality predominating here. Cygnis insignis (talk) 07:30, 28 June 2010 (UTC)
IIRC, the preferences for "Add pages I edit to my watchlist" and "Add pages I create to my watchlist" are ticked by default (but "Add pages I move to my watchlist" is not). Usually, almost the first thing I do with a new site or piece of software, is go through (and poke at) the preferences menus to see how (and how well) the whole thing is put together, from the creator's POV. A wonderful habit that has brought me no end of joy and frustration. Thank goodness for software/sites that have a "reset to default" button!
I do a lot of work with the meta/help pages at Wikipedia, so I should be able to flail my way towards most answers with an appropriate namespace search or example examination. :) Quiddity (talk) 22:51, 23 June 2010 (UTC)
Belated responses. Thanks for letting me know about the preference, I had a vague recollection that 'watch what I edit' was off by default at some sites. Cygnis insignis (talk) 07:30, 28 June 2010 (UTC)
Correction: You're right, "Add pages I edit to my watchlist" is OFF by default. (I took the plunge and reset my prefs...) 21:31, 29 June 2010 (UTC)
Thanks again. Cygnis insignis (talk) 02:25, 30 June 2010 (UTC)

Do you think that your work is quirky or of greater interest?

More works are required to be nominated for Wikisource:Featured texts, so please consider nominating validated works you have created, or works at Category:Index Validated that you believe we should be showing to the extended community. All the details required are at Wikisource:Featured text candidates.— billinghurst sDrewth 13:58, 26 June 2010 (UTC)


We are making a kind of inventory of the content of svws. Now, I have found this text. As You understand, this is not within the scope of svws, but maybe of enws. Texts from the Swedish government are normally free. The problem is, that the text has no source. Will You accept the text as it is? -- Lavallen (talk) 18:22, 26 June 2010 (UTC)

Image caption alignment

Whenever someone has the chance, would they edit this page to demonstrate how to place the caption below the image when it's not centered? Thanks in advance. - Ineuw (talk) 16:42, 27 June 2010 (UTC)

Done. Hesperian 23:29, 27 June 2010 (UTC)

Many thanks for the sample. Now I know what to do with the rest, :-) - Ineuw (talk) 01:03, 28 June 2010 (UTC)

A totally new experience wrapped in its usual ignorance

Uploaded a new .djvu file to the Commons and prepared the index page here on Wikisource, but I can only see the text if I access a page by page number from that index. Otherwise it's empty. Fully aware that I did something wrong but have no clue as to what? - Ineuw (talk) 02:52, 28 June 2010 (UTC)

Hello Ineuw. The index works fine for me; there may have been a temporary issue caching or building it. (If you mean the end output, you need to use the <pages> tag to display it on the work pages; see Pensées/I for an example.) —Pathoschild 04:36:53, 28 June 2010 (UTC)
The PSM indexes you have worked around had a bot run through them, editing then saving each page as 'not proofread'. This creates a longer path to edit-mode, more clicks and page loading. The usual, if not preferred, method is to click a bunch of redlinks, proofread them, then save it. Another thing that differs from PSM is that best practice is to transclude the pages from the Page: name-space (our workspace) when they are improved and useful, the uncorrected text layer is already available elsewhere; the 'main' namespace is for the presentation of clean, formatted, and complete texts found in index. An example of 'useful' for mainspace would be a corrected article in an incomplete volume, compared to the pointless display of a single chapter in a novel. Cygnis insignis (talk) 07:11, 28 June 2010 (UTC)
Sorry Pathoschild, the prize goes to Cygnis insignis. :-) Besides, yesterday I went to the Montreal meet-up and hoped to meet you. But, since everyone there was anonymous, I missed the opportunity.
Cygnis insignis: Thanks for the explanation. I suspected that it was a bot that did the work on PSM and I shall make a request for it to be executed on that Index page. Due to my preferred working style I prefer the index pages to be saved as "not proofread". I used this small project to acquaint myself with the process of uploading and managing .djvu files from the beginning to the end, as well as for personal reasons of great interest in Mexico and its history. The main namespace page was prepared as part of the process. - Ineuw (talk) 14:58, 28 June 2010 (UTC)
Thanks for the no-prize, but you attempted to garner a response and received two that were freely given. The first was a helpful response to a 'new-user', the second was based on experience of your approach. If you read between the lines, I am trying to emphasise that doing everything but proofreading, and drawing attention to that, is not of especial interest to anyone but yourself. When you create a title in mainspace, it then becomes a google hit: the reader doesn't find what they are looking for, only a suggestion that it will perhaps, one day, provide a reasonable transcript of the same. The only likely outcome is that the disappointed end-user would rank Wikisource as yet another internet site that sucks. I'm about to delete your empty page, please focus on making the text presentable and then transclude it. Cygnis insignis (talk) 15:54, 28 June 2010 (UTC)
We did cross paths at the meeting about Wikimedia Québec; I was the dark-haired & -clothed one sitting on the left from your viewpoint, next to the column. I had a nametag with "Pathoschild" on it, but it wasn't visible from your seat. If you attend another meetup in Montréal, we can meet directly. :) —Pathoschild 23:56:26, 28 June 2010 (UTC)


Hey grasshopper (Cygnis insignis);

What is this about my proofreading?

What's your problem???

From your response I infer that it must be the end of your day, you're tired, exceedingly unpleasant and irritating. If not, then take time out and go to your corner for a while. If the above are not the cases, then leave your prejudices and mood swings offline, and stop your silly threats.

I find that you are threatening me again, and frankly, I am tired and fed up with your childish behavior. You are annoying, arbitrary, and make demands which are not Wikisource policy, they just happen to be your preferences.

If one doesn't want to do the work according to your preferences, so be it since this is not your personal fief. Wikisource is not a profit making entity in which one must produce according to criteria set out by an overseer, while you certainly try to act like one.

To begin with, you viciously and arbitrarily edited my proofreading manual and eliminated information which assumed me to be a liar. I truly resent that. You have absolutely no clue as to my background, experience, and observations on which I act. Just because you have more Wikipedia experience, that does not mean that you possess more life experience, which from our interaction you seem sorely to lack.

If you cannot control your behavior, I will lodge a complaint against you with Wikimedia. Not that it will make a difference, but it will be on record. However, I have a feeling that this has occurred on previous occasions.

I also know that the administrators' group is a cabal. Regardless of whether a user's viewpoint is correct, the administrators close ranks to protect their own group, an easily predictable reaction in an organization. I am also sure that I am not the only one to bring this issue up about Wikimedia, because such instances abound on the web, and certainly more often than people googling for 19th century travelogues about Mexico.


To answer your most recent complaint, I created the main space title for A Study of Mexico BY FOLLOWING THE GUIDELINES GIVEN ON WIKISOURCE, only to find that the index pages were missing and must be created first from the .djvu by a bot. Then last night, being tired, I quit to continue later while being confident (again) that ±48 hours will not make much difference to the hundreds of thousands of people Googling for 19th century travelogues on Mexico.

My seemingly erratic efforts are "busman's" holidays from my main interest. The 292 pages of proofreading in the above mentioned project is a joke compared to 800+ pages of a single PSM volume, and is a welcome break from PSM, while I mull the issues that confront my main work. I also think that my past efforts speak well for themselves: I continually return to improve my work as my knowledge and experience improve.

I started by proofreading most of volume 1 by myself to learn what this is all about. Then I went through every page of the first 25 volumes several times to create the TOC's. This allows for proofreading a complete article of interest, rather than haphazardly proofreading a page or two here and there.

The index pages at the end of each PSM volume, which you pointed out to be incorrect, I corrected. However, I don't plan to create custom links at this time, because ~200 x 2 custom links need to be defined per volume. When the proofreading is completed, a bot will be able to create these. I have thoughts on how to go about it.

The answer to my deleting the running headers relates to an early post of mine about this, which you conveniently ignored.

Every editor/proofreader defined the running header differently, and few did so correctly. Since there are ~30,000 identical running headers containing even numbers, they can easily and uniformly be recreated by a bot. As for the ~30,000 odd-numbered pages containing the article titles, those titles exist in the TOC, which I am continuously working on and for which I can already provide a text file for volumes 1 to 25. However, there are more pressing issues left over from my past efforts with which I must deal.

The ugly yellow images are in the process of replacement with higher quality grayscale images. You can check, if you haven't already done so, in volumes 1 to 3. The 4th volume is currently being uploaded.

I just spent a month researching and creating 700+ authors, their pages, and their contributions. If I hadn't done this at this point, it would have become completely unmanageable. Every volume adds about 40 new authors/contributors and ~25 additional articles by existing contributors. For the aforementioned reason of management, these must be added at the end of each volume from here on.

On the other hand, this process slows down the creation of new TOC's. The TOC and the main page article, even without being proofread, allow for reading an article on a single page without distractions, so that a correct category can be assigned out of the 152 possible categories currently selected.

To create the TOC's one must page through each volume because there is no error-free source. I separately downloaded the IA text files for the volumes and tried to build the TOC's more quickly from there but had to return to the Index pages because the page numbers are not clear and must be recalculated into .djvu numbers, a very risky process as each volume is differently laid out. So, for the time being, the Wikisource Index: pages are my only reliable source for TOC's, as long as the displayed page numbers are accurately defined.

All of the above retard serious proofreading.

Now about proofreading in general. Due to the personal preferences of the administrator-programmers, the paragraph lead-in indents (gap) were replaced with the CSS text-indent, without examining the effect on the documents. So far, the proposed solution is more cumbersome, more time-consuming, and at far greater risk of breaking than the original solution.

By habitually collecting and analyzing statistical data on the work ahead, I knew that there are approximately 4,000 indented paragraphs in a volume, out of about 7,000+. The rest are titles, images followed by non-indented paragraphs, and paragraphs with hanging indents. They are all affected by this new solution, and indented inconsistently.

Thus my past efforts of proofreading were destroyed, without a solution to fix it. I don't hear you complaining about that, or demanding a solution within 24 hours. What is your policy, or Wikisource's, about that?

I didn't join Wikisource to proofread a page or two and then disappear, as so many contributors do. I want to see the PSM project through. Proofreading PSM is an enormous task; it will take years to get any meaningful results. My intent is to provide some meaningful articles on the histories of various academic disciplines as already stated previously, which is another post you ignored or forgot. - Ineuw (talk) 21:01, 28 June 2010 (UTC)

Please take this passive-aggressive garbage off this page, and go away until you have calmed down. Cygnis did not threaten you, but you have now threatened him. Cygnis was not rude to you, but you have now personally attacked and abused him. Cygnis's deletion of the empty page A Study of Mexico was perfectly reasonable. I'll have nothing more to say since you've already pre-empted these and any further objections to your behaviour, by accusing all admins of "closing ranks" and cabalotry. Hesperian 23:53, 28 June 2010 (UTC)
Although Ineuw's reaction seems disproportionate, Cygnis was abrupt in deleting the page and dismissing a valid approach to editing — I would not agree that the main namespace should not contain works in progress. Ineuw seems to have taken offense at Cygnis' dismissal, and while I agree that Ineuw's response was aggressive, we should not try to solve conflicts by telling one editor to go away. Most conflicts are caused by a simple disagreement, not by one side being foolish or corrupt. Cygnis and Ineuw, I don't know if there is bad history between the two of you; can you try talking it out on one of your talk pages? —Pathoschild 00:19:25, 29 June 2010 (UTC)
The deletion lists "G1, G5, A2, M1, M2, M3, and maybe G3". I am struggling to see how any of those apply. I think this should be undeleted. John Vandenberg (chat) 00:37, 29 June 2010 (UTC)
(The codes can change, so for future reference that's no meaningful content (G1), beyond scope (G5), not peer-reviewed or not published in a significant edition (A2), deletion as part of a page move or history merge (M1), unneeded redirect (M2), cross-namespace redirect (M3), and banned contributor (G3). —Pathoschild 00:45:23, 29 June 2010 (UTC))
Pages like that are routinely deleted under "no meaningful content". And I can certainly see how a page that is nothing more than a header and a link to where the scans are, might be seen as an unneeded cross-namespace redirect. Hesperian 00:52, 29 June 2010 (UTC)
In due course, we are going to have a page called A Study of Mexico, and it is going to have the same, or similar, content in it. That isn't useless content. Would you delete it if I created a page with just a header on it? John Vandenberg (chat) 01:05, 29 June 2010 (UTC)
No, because I don't think anything created by an editor in good standing should be deleted without discussion. But yes, I would think that it merited deletion. Hesperian 01:18, 29 June 2010 (UTC)
Ineuw looks to me like an editor in good standing. Could you explain why you think otherwise? —Pathoschild 02:47:04, 29 June 2010 (UTC)
I don't think otherwise. To be clear, I consider Ineuw an editor in good standing. Or at least I did, before I read the vicious personal attack above.

I have my doubts, though, whether this can fairly be considered a case of something being "deleted without discussion". This situation fits into a larger historical pattern—a pattern in which Ineuw does odd things that the majority of us strenuously disagree with, and then ignores both guidance and objections, until someone realises that the discussion is not going to go anywhere, and takes matters into their own hands; and then we get complaints about high-handed unilateral abuse of authority. If we grant that these discussions have already been had many times, then it is not easy to fault Cygnis for what he did, which was to simultaneously explain the problem and take action to fix it. Hesperian 03:30, 29 June 2010 (UTC)

I would think that a more productive discussion would be to extract the principles, discuss them without reference to the individuals. — billinghurst sDrewth 04:12, 29 June 2010 (UTC)
Good call. To get the ball rolling,

Discussion points

I submit that the page namespace is where we editors prepare works for publication, and the mainspace is where we publish them for our readers' benefit. Therefore, in page namespace, our actions are governed by whatever we find most convenient (that is, in page namespace, we do whatever the hell we want), but when it comes to the mainspace, our actions must be guided by what benefits our readers, and this means not publishing works before they exist on the site. Hesperian 04:33, 29 June 2010 (UTC)

I disagree, we should not exclude works in progress from the main namespace. Wikisource is a wiki; we flourish by encouraging edits and collaboration. Many editors are attracted to the project by fixing errors and expanding incomplete works (or at least I was), and much work gets done this way. There is no incentive for new users to edit if the only things to edit are hidden away in a corner only the core contributors know about. —Pathoschild 05:42:24, 29 June 2010 (UTC)
I wouldn't want to exclude works in progress from the main namespace. I routinely add such works myself. But I think works have to reach a certain threshold of startedness before we add them. At the very least, a front page with a contents page of redlinks, so that a skeleton of the work exists. I am opposed to creating a mainspace page for a work not a single page of which has been transcribed. Cygnis put it best above: "The only likely outcome is that the disappointed end-user would rank Wikisource as yet another internet site that sucks." Hesperian 05:52, 29 June 2010 (UTC)
I fall somewhere in between in that I am not absolutist either way. I don't like works hidden away as it doesn't encourage new users, and I don't like some of the scrappy works that are there that do not declare that they are works in progress and how to assist.
  • I have seen quite a few works that sit with the bulk of them processed, and yet hardly anything in the main namespace, and no indication that they are being worked upon, or how to assist. This is especially the case when we make it difficult to sort and clarify the Index: ns.
  • I also don't particularly like large slabs of ugly unOCR'd work, unformatted, not in context, or Tables of Contents with no backing text, nor an indication where to find the text or even if scans are available.
  • We have projects that can only happen if they fall into the {{incomplete}}; DNB and EB1911 being two prime examples.
It sounds like this is a case of managing expectations and a sense of balance between the two operations, and how we can give guidance to how it can be managed, and to what extent we present the unfinished, or let others know how and what. — billinghurst sDrewth 06:00, 29 June 2010 (UTC)
I've given a lot of consideration to the incentive factor, but the fact is we have thousands of opportunities for users to immediately contribute to main space.
  • We have POTM and so forth, to collaborate on a larger complete text.
  • They can select an article, poem, entry, news item, and so on, for a quick contribution.
  • And we have hundreds of prepared Indexes with any number of 'complete' texts; a user can easily gain the satisfaction of adding something useful from an incomplete volume.
  • If the subject matter is not to their taste, they can model the skeleton structure of a prepared scan to create something they are passionate about.
The problem, if there is one, seems to be making users aware of the possibilities for 'completing' a little or a lot here. And more experienced users are actively helping others to make those skeleton structures available. Anyone who has done it a few times can assist others to get started, that and keeping it simple are the advantages of being an open wiki. The help files still need lots of work.
  • A title with mostly redlinks is still relevant to its complete parts, but not as a soft redirect to the workspace: that is an advertisement to something we don't have; the invitation to contribute to the site is in the Page: namespace. Another pitfall is that it then lights every incoming link. Readers already have access to page scans and uncorrected OCR at other archives and sources, with better presentation and access. Cygnis insignis (talk) 06:44, 29 June 2010 (UTC)

Distributed Proofreaders and Wikisource

Andre Engels posted a long description of how PGDP works:

Thoughts on how we could coordinate work with them? In theory, I would hope this could benefit both projects.

Sj (talk) 09:48, 30 June 2010 (UTC)

I suppose the first point would be that to produce a similarly detailed account of WS workflow would be, well, impossible, because there isn't a single model that everyone follows (as far as we know). And the second point would be that, given a summary of how the workflow operates duly qualified, it would become apparent that the WS model is very different. And so we have to assume the goals are different, also. At which point it should become evident that finding common areas of interest would be an exercise in exploration of what we could say to each other, i.e. from our side this would be the sort of outreach where "strategic alliance" seems too pretentious. We are actually experiencing the bracing and rigorous effects of introducing ProofreadPage, so that proofing goes on alongside scans. It seems that this might be a long discussion, in terms of understanding what each institution adds to the concept "repository of texts". Charles Matthews (talk) 10:04, 30 June 2010 (UTC)
Put bluntly, we are a tiny project in comparison to the PG family, producing a negligible number of validated etexts in comparison to PGDP.
I did have a plan to submit a batch of our validated etexts into PG, and have working code to convert wikisource code into PG format. This would be a nice way to introduce Wikisource as a credible project in the etext ecosystem, but I lost interest due to an unresolved issue with folk at PG.
It would be lovely if PGDP made the scans+texts available; until they do, we need to reunite the PG text with a djvu, which is a laborious task. See The Wind in the Willows for an example of that.
John Vandenberg (chat) 10:33, 30 June 2010 (UTC)


Can anyone recommend a scanner? Alan (unsigned comment)