Timothy K. Armstrong[†]


This short essay surveys the state of open access to primary legal source materials (statutes, judicial opinions and the like) and legal scholarship. The ongoing digitization phenomenon (illustrated, although by no means typified, by massive scanning endeavors such as the Google Books project and the Library of Congress’s efforts to digitize United States historical documents) has made a wealth of information, including legal information, freely available online, and a number of open-access collections of legal source materials have been created. Many of these collections, however, suffer from similar flaws: they devote too much effort to collecting case law rather than other authorities, they overemphasize recent works (especially those originally created in digital form), they do not adequately hyperlink between related documents in the collection, their citator functions are haphazard and rudimentary, and they do not enable easy user authentication against official reference sources.

The essay explores whether some of these problems might be alleviated by enlarging the pool of contributors who are working to bring paper records into the digital era. The same “peer production” process that has allowed far-flung communities of volunteers to build large-scale informational goods like the Wikipedia encyclopedia or the Linux operating system might be harnessed to build a digital library. The essay critically reviews two projects that have sought to “crowdsource” proofreading and archiving of texts: Distributed Proofreaders, a project frequently held up as a model in the academic literature on peer production; and Wikisource, a sister site of Wikipedia that improves on Distributed Proofreaders in a number of ways. The essay concludes by offering a few illustrations meant to show the potential for using Wikisource as an open-access repository for primary source materials and scholarship, and considers some possible drawbacks of the crowdsourced approach.

I. Introduction

The digital era has exposed the limitations of paper as an archival medium. Although paper (like other forms of hard copy) makes an excellent tool for transmitting knowledge across lengthy spans of time, it makes a poor tool for transmitting knowledge across lengthy spans of distance. A wealth of knowledge, including legal knowledge, remains effectively trapped inside paper records, where it can be used only by those with access to the physical medium in which it is contained.

The movement to digitize paper records and make them freely available online promises to liberate information, including legal information, from these physical constraints and make it accessible around the globe. The scope of the task, however, is massive and daunting. Even the best organized (and best funded) efforts, such as the Google Books project (currently the subject of copyright litigation[1]) and the Library of Congress’s efforts to scan American historical documents,[2] can only scratch the surface. Indeed, the Library of Congress recently estimated that, at its present pace, it will take almost two thousand years to digitize the nine billion text records it presently holds in its collection.[3]

Wikis and other collaborative tools change this picture in potentially important ways. Just as other informational projects have benefited by opening themselves to participation by a distributed community of volunteers,[4] the means now exist to harness the efforts of legal professionals, students, and even interested members of the public at large to improve access to legal information, court decisions, statutes and regulations, and legal scholarship. In 2008, for example, the participants in one such project (initiated by the present author) succeeded in making crucial portions of the legislative history of the landmark Copyright Act of 1976 freely available online for the first time.[5] The online version of the Copyright Act’s legislative history improves access not only by duplicating the text of the original report, but—perhaps more importantly—by making it possible for other online works that cite the report to hyperlink to it.[6] This creates a seamless web of knowledge that improves upon the practical experience of using reference sources in paper form. If we multiply this isolated example by dozens, hundreds, or thousands of interested online users of legal texts, the possibility of a transformative moment in access to legal knowledge begins to appear ever closer.[7]

This essay begins with a review of the open access imperative, which may be normatively grounded in considerations of transparency, democratic legitimacy, and the fulfillment of the university’s public service mandate. It then surveys the current status of a number of projects aimed at improving public access to legal materials and scholarship, and explores whether “crowdsourced,” Wiki-centered efforts may achieve comparable results at lower cost. It concludes with an assessment of some of the drawbacks and limitations of the “crowdsourced” approach.

II. Policy Background: The Open Access Imperative

A. Open Access to Scholarship

“Open access,” in the sense of making documentary materials available over the Internet for reading and copying without charge,[8] is an emerging phenomenon in the legal academy. In the legal academic community, the “open access” label is associated primarily with free distribution of scholarly works. The discussion has revolved around whether to improve access to faculty scholarship, how best to do so, and what it might mean for the traditional legal publishing paradigm.[9]

At one level, enlisting faculty support for scholarly open-access initiatives consists merely of fostering personal and institutional self-interest. Inaccessible scholarship is unpersuasive scholarship, and studies have tended to suggest that opening access to scholarly works correlates with greater scholarly impact (as measured by citation counts).[10] Researchers’ growing reliance on the Internet as a complement—and perhaps, one day, a successor—to proprietary databases or library hard copies feeds the demand for open access to scholarly works.[11] Furthermore, the same technologies that enable open access to traditional legal scholarship also give scholars new forms to express themselves, creating forms of scholarly discourse that would have been uneconomical to produce in the pre-Internet era.[12]

The movement to assure open access to scholarship is more advanced outside the legal academy. The difference is partly explained by differing market dynamics: University libraries, driven by eye-popping increases in subscription costs for specialized research journals, responded by dropping subscriptions, creating a risk that scholars working in those specialized fields would find it more difficult both to remain abreast of developments and to ensure dissemination of their own work to their peers.[13] Open-access publishing solves both problems by making current scholarship available worldwide at little expense. For that reason, faculty at several influential research institutions have voted to authorize archiving and distribution of their scholarship on open-access terms. Harvard University’s Faculty of Arts and Sciences did so (by unanimous vote) early in 2008,[14] and the Harvard Law School faculty unanimously followed suit a few months later.[15] The Massachusetts Institute of Technology adopted a university-wide open access mandate in early 2009,[16] and similar measures are pending or have been adopted by other universities.[17]

The adoption of open-access mandates by university faculty has led to the creation of institutional electronic repositories of scholarly works. Duke Law School’s faculty scholarship repository includes faculty papers dating back over half a century.[18] Harvard’s new DASH repository may be unique in including student-authored papers alongside faculty scholarship.[19] Nor is the push for scholarly open access confined to elite institutions: the Oklahoma City University School of Law, for example, maintains a repository of faculty scholarship extending back four decades.[20] Cross-institutional repositories such as SSRN and BEPress hold even larger collections of faculty scholarship from universities worldwide.[21]

Some law journals have also committed to publishing on an open-access model. The Science Commons organization (an affiliate

^ Associate Professor of Law, University of Cincinnati College of Law. B.A. 1989, M.P.Aff. 1993, J.D. 1993, The University of Texas at Austin; LL.M. 2005, Harvard Law School. This work began as a series of presentations which I delivered at the summer 2008 and 2009 CALI Conferences for Law School Computing, and an updated version was presented at the meeting of the Section of Internet and Computer Law at the 2010 Annual Meeting of the Association of American Law Schools (AALS). I appreciate the thoughtful comments of the attendees at each of these sessions. Research support from the Harold C. Schott Foundation is gratefully acknowledged, as is the research assistance of Ron Jones. Copyright © 2010, Timothy K. Armstrong.
This work is licensed under the Creative Commons Attribution-Share Alike 3.0 United States license. To view a copy of this license, visit, or send a letter to Creative Commons, 171 2nd Street, Suite 300, San Francisco, California, 94105, USA. For purposes of Paragraph 4(c) of said license, proper attribution must include the name of the original author and the name of the Santa Clara Computer & High Technology Law Journal as publisher, the title of the Article, the Uniform Resource Identifier, as described in the license, and, if applicable, credit indicating that the Article has been used in a derivative work.

