Wikisource:WikiProject Validate/Noticeboard

WikiProject Validate Noticeboard
This is a discussion page for coordinating and discussing validation/proofreading tasks on Wikisource. Although only users who are logged in can validate pages, any user is welcome to make a request.

New work for validation

Done December 2018 --kathleen wright5 (talk) 23:45, 17 March 2019 (UTC)

For the next work for validation, I suggest Rabindranath Tagore’s Mashi and Other Stories (transcription project). It contains just under 200 pages for validation; however, they contain a relatively small amount of text per page. TE(æ)A,ea. (talk) 12:43, 7 October 2018 (UTC).

Comment from Jan Kameníček

As Czechia is celebrating the 100th anniversary of independence from Austria-Hungary, I suggest to validate the following works:

--Jan Kameníček (talk) 17:28, 25 October 2018 (UTC)

Jan Kameníček, I recommend formally joining the project here. I will get started on some of your works as soon as I can. ―Matthew J. Long -Talk- 18:38, 26 January 2019 (UTC)

New work for validation

For the next work for validation, I suggest Arthur Clark Kennedy's Pictures In Rhyme (transcription project). It's a short poetry book. --Level C (talk) 02:29, 25 March 2019 (UTC)

@Level C: I definitely support that! If you could possibly take on a chapter or two of the current featured task, we could get that one done sooner. It's mostly been me validating it so far. I look forward to validating something shorter, to be honest... haha –MJLTalk 03:03, 25 March 2019 (UTC)

Merging with Wikisource:Validation of the Month

It seems to me that the community would be better served if both community-validation efforts were working in tandem. WS:VotM is a spin-off of WS:PotM; see Wikisource talk:Proofread of the Month#Validation of the Month logistics. —Beleg Tâl (talk) 13:16, 27 March 2019 (UTC)

@Beleg Tâl: I would not be against merging Wikisource:Validation of the Month into this WikiProject, but I do cringe at the idea we will have to accept some of their processes. Just looking at Wikisource:Validation of the Month/validation works for a bit, and I see 62 works have been queued (not counting the queued PSM volumes). For me, I find just doing one at a time is more than sufficient. We have not completed our featured task yet, but it has been up there since January. That is not to say we are incapable of doing long works; we finished our first one which was pretty extensive.
Further, I don't like the idea of queuing works without consensus on here. People seem to just dump whatever projects they're working over there, and then it's first-come-first-serve. I slightly prefer the model worked out here in that contributors make a pitch for them to be featured, and we all essentially select what interests us. Finally, I really want to echo something that @EncycloPetey: said in response to Londonjackbooks, This can be one reason to establish a separate Validation group, who can support each other and train newer arrivals in Validation practices. I really feel that is one of the primary roles of this WikiProject (besides approving featured tasks). Having multiple featured works validated at a time would probably dissuade that from occurring. Let's do one right rather than have a bunch we do so-so. –MJLTalk 16:08, 27 March 2019 (UTC)
@MJL: The procedures of Wikisource:Validation of the Month could probably be replaced wholesale with those of this WikiProject, provided that the works already queued there are somehow prioritized among potential works for validation (for example, consider them all proposed for consideration and proceed accordingly). Assuming the rest of the community is okay with this also. —Beleg Tâl (talk) 01:40, 1 April 2019 (UTC)
i find there are some editors who prefer proofreading and some prefer validating. a separate taskflow for the latter, seems useful. if we had more editors, we could have more task flows, and a landing page directing to each. Slowking4SvG's revenge 17:41, 1 April 2019 (UTC)
@Slowking4: I agree, if we had more editors, we could have more task flows. However, given that we do not have more editors, I do not think we should have more than one task flow for validating; hence this proposal. —Beleg Tâl (talk) 20:01, 1 April 2019 (UTC)
Well, at the moment it seems that both of them are running quite well. --Jan Kameníček (talk) 23:04, 1 April 2019 (UTC)
if you merge, and the validators do not like the work, they will stay away. it seems a directive form of collaboration, based on your priorities, not the other editors. Slowking4SvG's revenge 01:30, 3 April 2019 (UTC)

The Yale Shakespeare—Histories

Since MJL so rashly suggested I post it here… :)

The Yale Shakespeare is an early 20th-century series of critical editions of the works of William Shakespeare with relatively light critical apparatus, and thus relatively reader-friendly. EncycloPetey has put in a tremendous effort to get this up and running, and all the history plays that are out of copyright have now been proofread (King John, Richard III, and Henry VIII are still in copyright for a few more years). Having these validated would go a long way towards finally having a complete set of critical editions of Shakespeare on Wikisource.

The individual entries in the series are:

(For those unfamiliar, this is the complete list of reigning monarchs of England between 1377 and 1471—excluding Edward IV (–1483)—and covering most of the period of the Wars of the Roses; a source of inspiration for Martin in writing A Song of Ice and Fire / Game of Thrones, for those looking for modern topicality.)

For those who've not yet found occasion to read Shakespeare's histories this would be a great way to get into them, and, though dramatised, the plays are based fairly extensively on the chronicle histories of Holinshed and Hall, so you'll learn something about 14th/15th-century England and the Wars of the Roses along the way.

The scans EncycloPetey set up are pretty darn clean, leading to good OCR and surprisingly few scannos. But numbers are a particular problem for OCR in this series (I thought I had a reasonable track on spotting those in the pages I proofread, but I've since been disabused of that notion) and when validating special attention needs to be paid to these, especially in the various indices. --Xover (talk) 05:43, 10 April 2019 (UTC)

@Xover: As The Tragedy of King Richard the Second has already been validated, would you mind clarifying which one you specifically would like us to consider? I never explicitly stated it at the time, but for the last task I simply picked the first work from the request. I think requesting a single transcription project at a time should probably be a hard and fast rule. I will also playfully point out that Level C had a wonderful short little poetry book they put in a request for. I don't think this is first come first serve, so it is up to the group what to decide. –MJLTalk 01:28, 11 April 2019 (UTC)
Well, all of them, really, but if you want just one then pick `em off in regnal order: 1 Henry IV first. And I'd certainly be happy to propose each individually as their proofreading is complete, but I'd worry that'd just generate a lot of noise. For those who read-read as they validate, these are also roughly of a unit—the plays together tell a somewhat coherent story of the Wars of the Roses—so some might prefer to see them as a unit. *shrug* I'm easy any way you want it. --Xover (talk) 05:58, 11 April 2019 (UTC)

Index:Wonder Book.djvu for validation

A Wonder Book is a shorter work that I proofread recently. The text quality was rather high, and as such, I don’t believe that validating the work will be very difficult. TE(æ)A,ea. (talk) 21:45, 14 April 2019 (UTC).

Suggested instructions of the project

I suggest to add the following instructions:

  • Adding works to validate
    • Add works that you wish to be validated to the end of the QUEUED list. There should not be more than 4 works in the queue. If the queue is full, consider helping with validation instead of adding another work to the queue.
    • If you want to add a group of related works (e.g. several plays by Shakespeare), create a separate subsection for them.
  • Validating
    • Validate as carefully as possible. The aim is not QUICK validation, but QUALITY validation. Nobody is perfect and it can happen that the validator does not notice a mistake, but it should happen only very rarely.
    • Check not only typos, but also compliance with Wikisource rules and the way of transcluding into the main namespace.
    • Do not change formatting which is against your preference but not directly against Wikisource rules or at the borderline. One work is often validated by several people with different preferences.

--Jan Kameníček (talk) 09:55, 15 April 2019 (UTC)

As for the queuing: Alternatively we can have three queues: 1) Long works (up to 4 in a queue), 2) Short works <20 pages (could be a higher number), 3) groups of related works (up to 2 in a queue) --Jan Kameníček (talk) 10:36, 15 April 2019 (UTC)
@Jan Kameníček: My preference is for no queues because I would prefer if we all voted on the featured tasks every time it opens up. What would you say to that option? –MJLTalk 19:09, 15 April 2019 (UTC)
I understand your opposition to queues, as long queues are very discouraging both to those who wish to add a work and partly also to those who perform validation. However, the voting system does not seem very flexible and so I suggested queues of very limited length. For example now, a work was finished and because there was a gap of several days without a new featured task, TE(æ)A,ea. simply added new works to do from those mentioned at the noticeboard. I found myself unable to help with this work, so I simply added another work. It was quick, easy, flexible. The reason why I suggested the queues was that the current system of asking for help at the noticeboard is quite chaotic. Having queues does not mean that the first work from the queue has to be chosen. Any work can be chosen and we can suppose that if a validator chooses a work, s/he is also willing to work on it. If somebody wishes to validate something different, s/he simply adds another task from the queue. Maybe we can call them to-do lists instead of queues. No matter what their name is, they should definitely be short; slowly moving long lists drive people away. What do you think? --Jan Kameníček (talk) 19:39, 15 April 2019 (UTC)
One more argument against voting: This system is bad for people whose taste does not comply with the taste of majority. If somebody "loses" in the voting and a work which they do not like is chosen, they will probably not help, and if it happens more frequently to them, they may leave the project. In the system I suggest anybody can choose a featured work, and those who like it will help. Those who don't can start alternative task, as I have done today. --Jan Kameníček (talk) 19:56, 15 April 2019 (UTC)
@MJL, @TE(æ)A,ea.: Thinking about it again, I have a compromise suggestion: What about having three parallel tasks: Task 1 would contain a work chosen for validation by voting; work for task 2 could be added freely by any contributor who does not feel like working on task 1, or if choosing the work for task 1 is delayed for some reason; task 3 would be a long-term task containing e. g. a group of related works such as the Shakespeare plays and it can be chosen by voting again. --Jan Kameníček (talk) 09:52, 19 April 2019 (UTC)
@Jan.Kamenicek: I actually really like that suggestion!! :D –MJLTalk 03:39, 22 April 2019 (UTC)
  • The addition of works, like I had done, based off of various non-conclusive talk-page discussions, is the common practice at WS:PotM. I also believe that there should be general rules for validating in general, and specific rules for individual works, should any problems arise. I do believe, however, that the text should, in general, match the original as closely as possible. TE(æ)A,ea. (talk) 19:11, 18 April 2019 (UTC).
    There are several reasons why I suggest the rule of limited interference into the original contributor's preferred way of formatting:
    1) The original contributor was most probably consistent in the way he formatted the work. If multiple validators interfere with their preference, the consistency will be broken. I think EncycloPetey expressed it well, although I would definitely not mark such behaviour as vandalism, as I am absolutely aware of the fact they are good-faith edits. However, I agree with the main points of his contribution and I believe it is better not to interfere that much.
    2) We should focus on validating the contents of the work. We should not spend time in fruitless discussions on formatting issues and so the validators should interfere in the formatting only in clear and non-controversial cases. I agree that the work should resemble the original as closely as possible, but different contributors may have different views, what the "as closely as possible" actually means.
    3) Overinterference in formatting issues and the need to defend their way of formatting against validators may deter some contributors from applying for validation. --Jan Kameníček (talk) 08:30, 19 April 2019 (UTC)
  • Jan Kameníček, I agree with your treble-task proposal; I would like to add that the second task should generally only be a shorter work, so as to prevent the needless addition of extremely long validation projects. This could be accomplished by a limit on the number of pages requiring validation, the number of pages total, or the number of words (either requiring validation or total). My desire for general formatting constraints was to avoid the concerns you mentioned in “2),” by enabling all WikiProject contributors to have a universal set of guidelines for validation of works and the standardisation of formatting. I would also generally agree with your appraisal of his expression, although it seems his accusation of vandalism stems from a pattern of accusing my edits as being in bad faith. Separately, I have a question for future discussion: “What should be done when, as in the case of Pictures in Rhyme currently, the work is completely validated, but the text is not fully transcluded?” Should the text be considered fully validated and removed from the featured task listing, or should it remain until the transclusion is completed? I would generally prefer the latter, as my actions indicate, although I would like some input from other contributors as well. TE(æ)A,ea. (talk) 12:38, 19 April 2019 (UTC).
    I think the application for validation of a non-transcluded work will happen only rarely; in the case of Pictures In Rhyme it was caused by the fact that the contributor does not know how to transclude it. I started working on it but have not finished it yet due to a lack of time. This case shows that the validation process cannot be considered complete until the work is transcluded, because the transclusion revealed some problems that needed to be fixed and were not visible in the Page namespace. See my suggestion of one of the rules "Check not only typos, but also ... the way of transcluding into the main namespace." --Jan Kameníček (talk) 10:32, 21 April 2019 (UTC)
    @TE(æ)A,ea.: Originally I also thought that the second task should be for shorter works, but after the change that the second task is for works chosen by any contributor (unlike the first task chosen by voting), I am not sure about it anymore. What maximum number of pages to validate would you suggest as the condition for this task? --Jan Kameníček (talk) 10:11, 5 May 2019 (UTC)
    • I think that the number of pages should be more dependent on the amount of text per page, of course; but, as a general rule, assuming a normal amount of text per page, I think that 150–200 pages should be a rough upper limit. This may be exceeded when, e. g., another editor has stated that they will validate a large part of the work themselves. I think that such a limit will be a good reflection of the purpose of the second choice, to be for an individual contributor’s validation “project.” TE(æ)A,ea. (talk) 13:22, 5 May 2019 (UTC).
      Sounds reasonable. Rules should be simple so I would probably not complicate it with additional condition of another contributor to raise the number of pages, and would just give a slightly higher but single number, let's say 250. What do you think? --Jan Kameníček (talk) 13:46, 5 May 2019 (UTC)
    • I understand your meaning, in regards to my reference to the rules; however, I only meant it as a rare exception, to be applied on a case-by-case basis. Overall, I agree with a singular number, rather than a range; I think 250 pages would be acceptable, although somewhat higher than what I would like. How do you think the rule should apply to works that have a larger amount of text per page? For example, the appendix to this month’s WS:PotM work has about twice the amount of words per page in relation to the rest of the work, without counting the frequent and lengthy footnotes. That section is only ~70 pages, but is roughly equivalent to 150 pages, if you standardise the font size and footnotes. However, I don’t think that this will be too large of a problem, so it can be ignored until such a work is suggested. In addition, I would like to thank you for redesigning the format of the WikiProject. TE(æ)A,ea. (talk) 17:05, 5 May 2019 (UTC).
      You are right that pages of some works are filled with a very dense text, but I do not think such a work would be added too often. So let's try it this way and if it does not work, we can change it later. --Jan Kameníček (talk) 17:49, 5 May 2019 (UTC)
    • I agree; however, I have another question: should there be a limit on the number of works in a group of related works submitted under that relevant section? Personally, I have recently proofread the cases from volume 586 of the U. S. Reports, the official record of U. S. Supreme Court cases. As you can see, there are a large number of pages which would need to be validated across all of the cases, although the individual cases often have a very small number of pages themselves. As I would like to suggest these works for validation, I would like the WikiProject to consider the matter on an exceptional basis, and as part of a general rule. TE(æ)A,ea. (talk) 21:41, 5 May 2019 (UTC).
      Generally: There are several reasons why I suggested the limit on the number of works, one of them is that the limit lets all contributors know that our time and forces are limited and if they help to empty the list they will have better chance to get their works on the list. We can suppose that these groups of works will take a really long time to validate and so imo there is no real reason to have there more groups of works than 2. After one is chosen for validation, there will be plenty of time to replace it in the waiting list by another one.
      As for the mentioned United States Reports, technically the rule does not create any obstacle to them, as they all would be dealt with together as one group = one of two items of the list. However, I am not really sure whether there will be enough contributors wishing to work on such juristic texts... Nevertheless, we can try and see. --Jan Kameníček (talk) 22:50, 5 May 2019 (UTC)
    • My reference on work limits was on those within a group, and not the number of works total. However, I believe that, as the individual cases are not too long, it would not be out of place to suggest a smaller number as a group. I would do this myself, but as I am the user who proofread all of the cases, I would not be able to aid in their validation. Another item to note is that the text layer from the files is exact to the text on the page, and the only user-added material would be the formatting, which I have worked diligently to represent. However, that should be the decision of the WikiProject, and not merely myself. TE(æ)A,ea. (talk) 23:57, 5 May 2019 (UTC).
      I see. Yes, you can add it to the list of group works, no problem. As for the formatting: unless it breaks some important rules, I believe that validators should interfere in the way of formatting as little as possible. --Jan Kameníček (talk) 08:00, 6 May 2019 (UTC)

New suggested works

Although I do not believe that this should become the whole purpose of this project, I believe that it would be a good idea to validate works that have been proofread by WS:PotM. The list of works from there that have not been fully validated is as follows:


I am not listing any that are from this month or the month prior. I believe that any of these works would be a good choice; in addition, a number of them are very short. TE(æ)A,ea. (talk) 20:02, 21 April 2019 (UTC).

Other suggestions

Another group of works that should be proofread and validated are the Tenth Anniversary Contest works; those not fully validated are listed below:


I have redesigned the main page WikiProject Validate and the subpage Featured Task according to what has been discussed above. I have also founded subpages Candidates and Featured Task Archive. Is it OK like that? --Jan Kameníček (talk) 10:19, 5 May 2019 (UTC)

New featured task No. 1

I have suggested to replace the featured task no. 1, see Wikisource:WikiProject Validate/Candidates. --Jan Kameníček (talk) 10:26, 22 May 2019 (UTC)