Wikisource talk:WikiProject Fast Proofreading

Suggestions

@PseudoSkull -- Hello, I'm an on-again, off-again user here on WS. I have used PG in the past. I experience a lot of frustration here at WS even after many years because I can't find things, can't remember things, and it always feels like I'm having to reinvent the wheel. I am not able to devote a lot of time or energy here, which contributes to my problem of not being able to remember how to do things. (For example, I thought using the at-sign would automatically send you a notification, can't remember how to do that. Ditto my links below with hash marks that don't seem to go automatically to the anchors.)

I think this new project is a GREAT idea. Thank you for starting it.

Back in 2016 I posted a rant/essay that compared how we do things here at WS with how they do them at PG. I listed several specific things we could change or adapt to make things easier for beginners and for casual users who don't know wiki-markup.

You can read my essay here: [Comparison to Project Gutenberg Distributed Proofreading]

You can also read a bit about the "Wikisource Mono" font, based on the PGDP's "ugly proofreading font" which I use here at WS on my talk page. I think that it makes basic textual proofreading much, much faster. [WikisourceMono font]

Let me know what you think! Laura1822 (talk) 15:20, 9 October 2021 (UTC)

@Laura1822: Hey, thanks so much for taking notice of this little "project" I've started! I appreciate your input on that page; the perspective of a Gutenberg proofreader makes it more interesting. It is very true that our proofreading process is rather convoluted, and I think this is a large factor in why we have an extremely low active editor base compared to other WMF wikis. This certainly explains the (relatively) low number of proofread texts on this wiki. However, I am glad that we at least have a handful of very dedicated editors here.
There are quite a lot of things that could be improved as far as our interface and functionality go; some things I admit to having trouble imagining offhand. A lot of the tricks I'm using for fast proofreading are exactly that (tricks), and it would be better if those weren't needed to proofread so efficiently. In the long term, I'd support looking into turning these off-server coding tricks into creative interactive editing features, or interface improvements, that Wikisource has to offer. Feel free to make a subpage of Wikisource:WikiProject Fast Proofreading for an essay or something if you'd like. PseudoSkull (talk) 21:56, 9 October 2021 (UTC)
You can copy/paste my essay to a subpage here on your project if you like.  :)
I think I said all I wanted in that essay.
Of all the things I suggested there, the one I would personally find the most useful would be the addition of another status radio button (or more) for "text only proofread" with no formatting required (or very basic, such as bold/italic and the headers, footers, page numbers, nop, etc.). I can't tell you how many times I've wanted to proofread a page or two of a work, but the formatting defeats me. Why can't I just proofread the text and leave the formatting to someone who knows what they're doing? And then they have less to do, because the text is already proofread. With a new status/category, it would be easier for other editors to find pages that need some medium formatting but that aren't full of other defects like missing images or Greek letters which are all labelled "problematic."
I haven't looked at the Proofread of the Month and similar projects to see if any of them have improved since I ranted about them. Laura1822 (talk) 17:18, 10 October 2021 (UTC)
I'd also love to have a "nop" bot that would just add in all those nops I forget about. Or better yet, integrate nops into the overall coding so that proofreaders don't need to worry about them anymore! Laura1822 (talk) 17:20, 10 October 2021 (UTC)
@Inductiveload: Perhaps you would be interested in this. PseudoSkull (talk) 21:57, 9 October 2021 (UTC)
@Laura1822: Thank you for your thoughtful essay and I agree that there is much that enWS could do better. I created the Monthly Challenge because I felt the same frustrations that you did/do. Here is my opinion on the two projects.
  1. In comparison to PGDP, enWS is indeed missing the proofreaders. I think there are three main causes for this.
    1. First, a lack of overall quality. Yes, enWS has many great scan-backed works, but it also has way too many works copy-pasted from other websites or random sources. Since anyone can edit enWS, this makes it difficult to trust a work. It's also impossible to distinguish a scan-backed work from a non-scan-backed one unless you click into it. Too often enWS can seem like nothing more than a copy/paste of another website.
    2. A lack of advertisement. Even on Wikipedia, enWS is underadvertised. More work needs to be done to spread awareness of enWS.
    3. As you mention, a lack of guidance for new proofreaders. Most users want to be presented with a selection of texts to work on and not told to think of one. This is why I started the Monthly Challenge.
  2. enWS has one major advantage over PGDP: scan-backing. On Project Gutenberg, you can never check to ensure the accuracy of the transcription. Even with the multi-round process of PGDP, errors still creep in. Even worse, PGDP is behind a registration wall and after around 3 months a project is archived and no longer visible to non-administrators. On enWS, you can always check and correct a text. This is the key advantage of enWS and what I believe will bring it long-term success.
  3. Templates are a blessing and a curse. They provide an intermediate language that describes the intention behind the formatting and leaves the implementation to a separate part. This means that it's possible to change the code behind an implementation and update all its usages in one go. On PG, this needs to be done manually for every text. As the web evolves, keeping up with updating the code will probably become a major burden for PG.
  4. Works created by PGDP have a major flaw. Between F2 and posting on PG, the editor usually silently corrects all printing errors and errata. This makes it a new edition rather than an accurate transcription of the original text.

Ultimately, I think that enWS needs to find ways of spreading awareness and trying to recruit more users. It also needs to do more to guide new users. However, with time and more scan-backed works, I think that enWS will catch up to PG. The tragedy is that while PGDP can borrow from enWS, enWS cannot borrow from PGDP. Therefore, every text that is scan-backed through PGDP will need to be redone on enWS. That situation makes me very sad. Languageseeker (talk) 01:04, 23 October 2021 (UTC)

I agree with all of your points, especially the advantages of WS over PGDP. Your reasons are why I'm here and not there, after all. The scan-backed books are the most important thing of all. Most, if not all, of the Jane Austen texts, for example, are copy-pasted from PG but the IA texts we have here are the first editions. There are differences on almost every single page. Spelling and punctuation mostly. That sort of thing is important to scholars.
I wish I had more time and energy to spend here. What time I have, I would like to spend improving works like Austen's to correspond precisely with the first editions (wish I had energy to develop and promote a Project for that), as well as adding Western Canon texts that are either missing or have the same problems as Austen's works. (Then there's my original project/focus on early 19th century periodicals.) But I don't have a lot of time or energy to spend here, so when I do come, I like to focus on just proofreading. Hence all my frustrations.
It sounds like we're on the same page, so to speak! Laura1822 (talk) 13:35, 23 October 2021 (UTC)

PGDP proofreading guide

Hi, I’ve been working on a WS version of the Distributed Proofreaders Proofreading guide here. It’s messy. I was doing post-processing at DP because I like working on the whole book, but the WS experience is superior and much less daunting. I would like to write a standard, streamlined project guideline that takes a book from archive.org to transclusion. I think this would help people who want to focus on, or feel more comfortable with, proofreading, while upskilling them in how WS works.

Re Fast proofreading: I think everyone should go to DP and learn at least P1 and F1 levels. It only takes a few days to upskill there and the standardised program there would mean everyone would be on the same page (PUN, not sorry). I don’t think it’s worth reinventing the wheel. I think making the WS process reflect the DP standards would ease the transition for DP proofreaders coming to work at WS, so they would pick things up quickly. I suspect there are a lot of DP proofreaders who get bored with the limited scope there and would welcome the opportunity to spread their wings here, if the format was familiar.

I am not digital savvy and have painfully acquired what skills I have. I would really like to discuss how to emulate the DP standards and process here. Where to start? Cheers, Zoeannl (talk) 07:51, 23 October 2021 (UTC)

That looks amazing!! I have only glanced/skimmed it so far but I will take a deeper look when I can. That looks like exactly the document/set of help docs that I have been wishing for over and over again but never had what it takes to develop myself. Thank you for creating it!
We need more people who are familiar with both systems. Part of my frustration that led to my rant linked above was that when I went looking for help, I didn't find anyone who was familiar with PGDP. You'd think there'd be a large cohort here. But it seems we are finally finding each other. Laura1822 (talk) 13:41, 23 October 2021 (UTC)
@Zoeannl: Your walkthrough is amazing. Would you mind if I use it for the Monthly Challenge? Languageseeker (talk) 15:26, 23 October 2021 (UTC)
Thanks, it is a copy paste of DP's guideline that I then butchered and keep dropping in new tips as I go. I want to work out some issues I have with it and think assessing it from a newbie's perspective would be very helpful. I really want to develop a process as close as possible to DP to encourage DP proofreaders to come to WS and also because I do believe that it would be fastest/easiest/most productive if we direct new (and not so new) people to DP to learn basic proofreading skills. I anticipate conflict over this (being exclusive/prescriptive) but see no reason why we can't have designated PG style projects parallel to the current idiosyncratic WS mode.
I really like the Monthly Challenge structure. I suppose I'm suggesting a PG sub-challenge/category. I think it would be helpful to have sub-challenge/category instructions, so the PG category label would link to the Proofreading guide. Basic books could have a text-only label with basic instructions, but books with illustrations would be labeled images, with a link to how to handle images. Maybe also, missing image/table/language pages could be posted separately so people could practise specific skills.
I really want to get exemplars and step-by-step walkthroughs into the Proofreading guide. But there are inconsistencies and basic principles I want sorted first. Consistency and standardisation are key, at least at the beginning level, and I think everything should be able to be done with templates. And they should be on the pull-down menu.
Can't we do without <span> and </div>? I know it's the wiki way, but can we do without markup like italics and the colon indent? Why is there no poem template?!!
Why does {{bar}} have no unit but {{gap}} does? Why do I get told off for using gap?
I'd like all formatting templates to have a similar format to {{ts}} table style using shorthand to do indenting, italics, colour, font, alignment etc.
Why does the editing menu flick from top to bottom of the edit page? So annoying.
So who are the people who will take pity on the digitally illiterate and accommodate us? It has been so frustrating, as Laura says. But it hasn't been in vain; it has taken me this long to figure out how WS does and doesn't work for me, to be able to ask the questions to find a solution. Cheers, Zoe Zoeannl (talk) 21:44, 23 October 2021 (UTC)
@Languageseeker I have a proposal. If we categorize books when they are put up on the monthly challenge according to the skill level required then proofreaders can choose as they feel able or inclined. Also if the project manager expects DP proofreading standards then this could be a category too. Can the category be in Template:MC-Cover? Can we have sections according to categories?
This is relevant to fast proofreading because proofreaders from DP will work much faster if the expectations reflect the DP Proofreading guidelines, and some may prefer to do text-only proofreading. Zoeannl (talk) 21:37, 24 October 2021 (UTC)
@Zoeannl: Take your time with the guide. If there is any section that you would like help with, let me know.
There is <poem></poem> for formatting poetry. For indentation, {{text-indent}} and {{dent}} are preferred. enWS is developed by volunteers so sometimes, things don't quite match up.
I'm hoping to add a label to the Monthly Challenge that will give an indication of the difficulty level/what to expect. This will require some additional programming from Inductiveload. Let's see how things go and then a discussion could be had about creating separate sections. My intent is not to feature too many hard texts in the MC. Instead, I want to use it to help newcomers become accustomed to enWS and to increase the number of key works on enWS. This is why there is an "Ask for Help" section (newly renamed from Discussion).
Creating an option for no-formatting proofreading will require a major rewrite of the software used to run proofreading and I don't think that it's the wisest solution. In reality, most pages have either very little or no formatting. Think about the millions of pages from novels that have bold/italic or no formatting at all. I would rather users learn to do light formatting or save a page as problematic than be discouraged from formatting at an early stage. In the end, it's only possible to learn by doing. That requires good mentorship more than software restrictions. I hope that the MC will provide a section for such mentorship.
I'm really glad that you're doing this project and I really appreciate your help and ideas. Your experience and knowledge with PGDP is invaluable. Hopefully, with your help, we can reproduce the learning experience from there here. Languageseeker (talk) 01:40, 25 October 2021 (UTC)