CROWDSOURCING AND OPEN ACCESS: COLLABORATIVE TECHNIQUES FOR DISSEMINATING LEGAL MATERIALS AND SCHOLARSHIP
Timothy K. Armstrong
This short essay surveys the state of open access to primary legal source materials (statutes, judicial opinions and the like) and legal scholarship. The ongoing digitization phenomenon (illustrated, although by no means typified, by massive scanning endeavors such as the Google Books project and the Library of Congress’s efforts to digitize United States historical documents) has made a wealth of information, including legal information, freely available online, and a number of open-access collections of legal source materials have been created. Many of these collections, however, suffer from similar flaws: they devote too much effort to collecting case law rather than other authorities, they overemphasize recent works (especially those originally created in digital form), they do not adequately hyperlink between related documents in the collection, their citator functions are haphazard and rudimentary, and they do not enable easy user authentication against official reference sources.
The essay explores whether some of these problems might be alleviated by enlarging the pool of contributors who are working to bring paper records into the digital era. The same “peer production” process that has allowed far-flung communities of volunteers to build large-scale informational goods like the Wikipedia encyclopedia or the Linux operating system might be harnessed to build a digital library. The essay critically reviews two projects that have sought to “crowdsource” proofreading and archiving of texts: Distributed Proofreaders, a project frequently held up as a model in the academic literature on peer production; and Wikisource, a sister site of Wikipedia that improves on Distributed Proofreaders in a number of ways. The essay concludes by offering a few illustrations meant to show the potential for using Wikisource as an open-access repository for primary source materials and scholarship, and considers some possible drawbacks of the crowdsourced approach.
The digital era has exposed the limitations of paper as an archival medium. Although paper (like other forms of hard-copy) makes an excellent tool for transmitting knowledge across lengthy spans of time, it makes a poor tool for transmitting knowledge across lengthy spans of distance. A wealth of knowledge, including legal knowledge, remains effectively trapped inside paper records, where it can be used only by those with access to the physical medium in which it is contained.
The movement to digitize paper records and make them freely available online promises to liberate information, including legal information, from these physical constraints and make it accessible around the globe. The scope of the task, however, is massive and daunting. Even the best organized (and best funded) efforts, such as the Google Books project (currently the subject of copyright litigation) and the Library of Congress’s efforts to scan American historical documents, can only scratch the surface. Indeed, the Library of Congress recently estimated that, at its present pace, it will take almost two thousand years to digitize the nine billion text records it presently holds in its collection.
Wikis and other collaborative tools change this picture in potentially important ways. Just as other informational projects have benefited by opening themselves to participation by a distributed community of volunteers, the means now exist to harness the efforts of legal professionals, students, and even interested members of the public at large to improve access to legal information, court decisions, statutes and regulations, and legal scholarship. In 2008, for example, the participants in one such project (initiated by the present author) succeeded in making crucial portions of the legislative history of the landmark Copyright Act of 1976 freely available online for the first time. The online version of the Copyright Act’s legislative history improves access not only by duplicating the text of the original report, but—perhaps more importantly—by making it possible for other online works that cite the report to hyperlink to it. This creates a seamless web of knowledge that improves upon the practical experience of using reference sources in paper form. If we multiply this isolated example by dozens, hundreds, or thousands of interested online users of legal texts, the possibility of a transformative moment in access to legal knowledge begins to appear ever closer.
This essay begins with a review of the open access imperative, which may be normatively grounded in considerations of transparency, democratic legitimacy, and the fulfillment of the university’s public service mandate. It then surveys the current status of a number of projects aimed at improving public access to legal materials and scholarship, and explores whether “crowdsourced,” Wiki-centered efforts may achieve comparable results at lower cost. It concludes with an assessment of some of the drawbacks and limitations of the “crowdsourced” approach.
II. Policy Background: The Open Access Imperative
A. Open Access to Scholarship
“Open access,” in the sense of making documentary materials available over the Internet for reading and copying without charge, is an emerging phenomenon in the legal academy. In the legal academic community, the “open access” label is associated primarily with free distribution of scholarly works. The discussion has revolved around whether to improve access to faculty scholarship, how best to do so, and what it might mean for the traditional legal publishing paradigm.
At one level, enlisting faculty support for scholarly open-access initiatives consists merely of fostering personal and institutional self-interest. Inaccessible scholarship is unpersuasive scholarship, and studies have tended to suggest that opening access to scholarly works correlates with greater scholarly impact (as measured by citation counts). Researchers’ growing reliance on the Internet as a complement—and perhaps, one day, a successor—to proprietary databases or library hard copies feeds the demand for open access to scholarly works. Furthermore, the same technologies that enable open access to traditional legal scholarship also give scholars new forms to express themselves, creating forms of scholarly discourse that would have been uneconomical to produce in the pre-Internet era.
The movement to assure open access to scholarship is more advanced outside the legal academy. The difference is partly explained by differing market dynamics: University libraries, driven by eye-popping increases in subscription costs for specialized research journals, responded by dropping subscriptions, creating a risk that scholars working in those specialized fields would find it more difficult both to remain abreast of developments and to ensure dissemination of their own work to their peers. Open access publishing solves both problems by making current scholarship available worldwide at little expense. For that reason, faculty at several influential research institutions have voted to authorize archiving and distribution of their scholarship on open-access terms. Harvard University’s Faculty of Arts and Sciences did so (by unanimous vote) early in 2008, and the Harvard Law School faculty unanimously followed suit a few months later. The Massachusetts Institute of Technology adopted a university-wide open access mandate in early 2009, and similar measures are pending or have been adopted by other universities.
The adoption of open-access mandates by university faculty has led to the creation of institutional electronic repositories of scholarly works. Duke Law School’s faculty scholarship repository includes faculty papers dating back over half a century. Harvard’s new DASH repository may be unique in including student-authored papers alongside faculty scholarship. Nor is the push for scholarly open access confined to elite institutions: the Oklahoma City University School of Law, for example, maintains a repository of faculty scholarship extending back four decades. Cross-institutional repositories such as SSRN and BEPress hold even larger collections of faculty scholarship from universities worldwide.
Some law journals have also committed to publishing on an open-access model. The Science Commons organization (an affiliate
- Associate Professor of Law, University of Cincinnati College of Law. B.A. 1989, M.P.Aff. 1993, J.D. 1993, The University of Texas at Austin; LL.M. 2005, Harvard Law School. This work began as a series of presentations which I delivered at the summer 2008 and 2009 CALI Conferences for Law School Computing, and an updated version was presented at the meeting of the Section of Internet and Computer Law at the 2010 Annual Meeting of the Association of American Law Schools (AALS). I appreciate the thoughtful comments of the attendees at each of these sessions. Research support from the Harold C. Schott Foundation is gratefully acknowledged, as is the research assistance of Ron Jones. Copyright © 2010, Timothy K. Armstrong.
This work is licensed under the Creative Commons Attribution-Share Alike 3.0 United States license. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/, or send a letter to Creative Commons, 171 2nd Street, Suite 300, San Francisco, California, 94105, USA. For purposes of Paragraph 4(c) of said license, proper attribution must include the name of the original author and the name of the Santa Clara Computer & High Technology Law Journal as publisher, the title of the Article, the Uniform Resource Identifier, as described in the license, and, if applicable, credit indicating that the Article has been used in a derivative work.
- See, e.g., Authors Guild, Inc. v. Google Inc., No. 05 CV 8136(DC), 2009 WL 5576331 (S.D.N.Y. Nov. 19, 2009) (preliminarily approving proposed amended settlement agreement).
- See infra note 69.
- See Katie Hafner, History, Digitized (and Abridged), N.Y. Times, Mar. 11, 2007, § 3 (Magazine), at 1.
- See, e.g., Yochai Benkler, The Wealth of Networks 68–90 (2006) (collecting examples).
- See infra note 142 and accompanying text.
- See Wikisource, Pages that link to “Copyright Law Revision (House Report No. 94-1476)”, at http://en.wikisource.org/wiki/Special:WhatLinksHere/Copyright_Law_Revision_(House_Report_No._94-1476)) (accessed Apr. 15, 2010); Wikisource, Pages that link to “Copyright Law Revision (Senate Report No. 94-473)”, at http://en.wikisource.org/wiki/Special:WhatLinksHere/Copyright_Law_Revision_(Senate_Report_No._94-473)) (accessed Apr. 15, 2010).
- As one writer put it: “[M]any nerds believe that a billion readers can reliably weave together the pages of old books, one hyperlink at a time. Those with a passion for a special subject, obscure author or favorite book will, over time, link up its important parts. Multiply that simple generous act by millions of readers, and the universal library can be integrated in full, by fans for fans.” Kevin Kelly, Scan This Book!, N.Y. Times, May 14, 2006, § 6 (Magazine), at 42, 45.
- See Peter Suber, Open Access Overview, http://www.earlham.edu/~peters/fos/overview.htm (last visited Sept. 29, 2009) (“Open-access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions.”). There is no single settled definition of “open access,” although most conventional understandings of the term share common traits (the most important being the relative ease and low cost of access as compared with the traditional proprietary publication paradigm). See generally John Willinsky, The Access Principle: The Case For Open Access to Research and Scholarship App. A (2006) (cataloging “ten flavors of open access”); Lawrence B. Solum, Download It While It’s Hot: Open Access and Legal Scholarship, 10 Lewis & Clark L. Rev. 841, 856–57 (2006). The open access movement is a global phenomenon guided and informed by a number of declarations of principles issued by international groups, a full cataloging of which lies beyond the scope of the present essay. See, e.g., Richard A. Danner, Applying the Access Principle in Law: The Responsibilities of the Legal Scholar, 35 Int’l J. Leg. Info. 355, 359–66 (2007) (summarizing several of the pertinent declarations); David W. Opderbeck, The Penguin’s Paradox: The Political Economy of International Intellectual Property and the Paradox of Open Intellectual Property Models, 18 Stan. L. & Pol’y Rev. 101, 107–09 (2007) (recounting pertinent history).
By focusing on issues involving the legality of access to the underlying content, most discussions of open access elide related issues such as the openness of the software platforms used in creating and reading the content or the openness of the networks over which the content flows. See, e.g., Access Denied: The Practice and Policy of Global Internet Filtering (Ronald Deibert et al., eds., 2008) (surveying state actors’ controls over Internet information flows); Stephen Murgatroyd, Access to Knowledge in an e-Connected World, in The E-Connected World: Risks and Opportunities 79 (Stephen Coleman, ed., 2003) (acknowledging interrelationships among these concerns). This essay adheres to convention in focusing on the question of open access to content, while recognizing that other issues may carry greater force in particular circumstances.
- See, e.g., Joseph Scott Miller, Foreword: Why Open Access to Scholarship Matters, 10 Lewis & Clark L. Rev. 733 (2006); Nicholas Bramble, Preparing Academic Scholarship for an Open Access World, 20 Harv. J.L. & Tech. 209 (2006).
- See Willinsky, supra note 8, at 22 ( “[O]pen access is associated with increased citations for authors and journals, when compared to similar work that is not open access”); id. at 22–24 (summarizing research). For a look at some of the methodological pitfalls of studies of this type, as well as some possible solutions, see Bernard S. Black & Paul L. Caron, Ranking Law Schools: Using SSRN to Measure Scholarly Performance, 81 Ind. L.J. 83, 92–95 (2006).
- See Solum, supra note 8, at 859 (“There will come a day when the saying, ‘If it isn’t on the net, it doesn’t exist,’ is true. Open access legal scholarship will be the only legal scholarship that is actually read. Closed access legal scholarship will be the tree that falls with no one in the forest.”); Carol A. Parker, Institutional Repositories and the Principle of Open Access: Changing the Way We Think About Legal Scholarship, 37 N.M. L. Rev. 431, 431 (2007) (suggesting that “open access to legal scholarship will soon be adopted and implemented by every law school in the United States”); Richard A. Danner et al., The Twenty-First Century Law Library, 101 Law Libr. J. 143, 146 (2009) (“[T]he fact that young people are going to Google and to Wikipedia first is a call to arms in a way”) (comments of Richard A. Danner).
- See, e.g., Marci Hoffman & Katherine Topulos, Tyranny of the Available: Under-Represented Topics, Approaches, and Viewpoints, 35 Syracuse J. Int’l L. & Com. 175, 188–90 (2008); Paul L. Caron, Bloggership: How Blogs Are Transforming Legal Scholarship, 84 Wash. U.L. Rev. 1025 (2006).
- See Willinsky, supra note 8, ch. 2; Dan Hunter, Walled Gardens, 62 Wash. & Lee L. Rev. 607, 613–17 (2005). The effect of subscription costs as a driver of open access is surely greater in technical fields, where subscription rates for specialty publications may run into the thousands (or even tens of thousands) of dollars a year, than in law. But see Danner, supra note 8, at 377 (“[B]ecause they enjoy unlimited (and apparently cost-free) access to law journals and other information through Westlaw, LexisNexis, Hein Online, and other databases, it might be hard for law students and faculty to appreciate the impacts of access costs on researchers outside the U.S. legal education environment.”); Solum, supra note 8, at 863 (“[A]s you move from major research universities to regional universities to local colleges, the access of faculty and students to closed electronic databases (Westlaw, LexisNexis, JSTOR, etc.) begins to become very sketchy. In the least-developed countries, such access is virtually nonexistent.”).
- See, e.g., Michael J. Madison et al., The University as Constructed Cultural Commons, 30 Wash. U. J.L. & Pol’y 365, 399–400 (2009).
- See Harvard Law faculty votes for ‘open access’ to scholarly articles, http://www.law.harvard.edu/news/2008/05/07_openaccess.html (last visited Oct. 6, 2009).
- See Natasha Plotkin, MIT Will Publish All Faculty Articles Free In Online Repository, The Tech, Mar. 20, 2009, available at http://tech.mit.edu/V129/PDF/N14.pdf.
- On the other hand, the news is not uniformly favorable. In April 2009, the faculty of the University of Maryland defeated a resolution encouraging (but not requiring) that faculty members make their scholarship available in open-access repositories.
- See Duke Law School, Faculty Scholarship Repository, http://www.law.duke.edu/scholarship/repository (last visited Oct. 6, 2009). The earliest work presently found in the collection is Robinson O. Everett, Securing Security, 16 Law & Contemp. Probs. 49 (1951), available at http://eprints.law.duke.edu/365/. See generally Danner, supra note 8, at 393–94.
- The DASH repository is online at http://dash.harvard.edu/ (last visited Oct. 6, 2009).
- See Oklahoma City University School of Law, Faculty Scholarship Repository, http://www.okcu.edu/law/facultyandadministration/publications/index.php (last visited Oct. 6, 2009).
- See, e.g., Parker, supra note 11, at 431–32; Jessica Litman, The Economics of Open Access Law Publishing, 10 Lewis & Clark L. Rev. 779, 791–92 (2006); Black & Caron, supra note 10.