Accessibility, sustainability, excellence: how to expand access to research publications/The Research Communications Revolution

From Wikisource

3. The Research Communications Revolution

3.1. The ways in which the published findings of research are produced, disseminated, managed, consumed and preserved have changed fundamentally over the past twenty years. The activities, roles and responsibilities of the various players in the research communications system—researchers, universities and other research institutions, research funders, publishers, learned societies, libraries, aggregators and secondary publishers, as well as readers—have been transformed. For all the organisations that act as intermediaries between authors and readers, the last two decades have brought unprecedented changes in the nature and scope of their activities, and continuing uncertainties as to the boundaries between their specific roles.

3.2. These changes are but part of a wider context of developments in the digital world: jockeying for position on a global scale between content providers, device companies, packagers, aggregators, delivery platforms, bandwidth suppliers and so on, all seeking a competitive edge. And change continues apace. Mobile access anywhere and at any time to content of all kinds, tagged with metadata, fully searchable, and interwoven with a rich array of other multimedia, is becoming a general expectation; and interactivity and interrelationships with social media are developing fast. All these developments bring the need to reconceptualise working patterns and practices. But few individuals or organisations have a clearly-defined vision as to what the research communications landscape will look like in ten or twenty years’ time.

3.3. In this context, it is important to understand where we have come from; what has changed, why and how; and the key factors that are likely to drive change into the future. We consider in this section the nature of and the drivers for change under three main heads: economic, technological, and social.

Economic factors

3.4. Research and its outputs. There are some six million researchers in the world, and their number has been growing fast. That growth has reflected significant increases in expenditure on research and development (R&D), particularly by Governments. Across the 34 members of the OECD, for example, gross expenditure on R&D increased by over 60% in real terms in the ten years to 2008, and in major research countries it has tended to exceed the rate of growth in GDP. Up to 2008, therefore, across OECD countries as a group, R&D grew as a proportion of the economy as a whole: from 1.9% in 1981 to 2.3% in 2008.[1]

3.5. Of course, much of the expenditure on R&D is devoted to the development of products, processes or services, relatively little of which results in the kinds of research findings and outputs that are reported in books and journals. Governments tend to be the major funders of the basic and applied research that results in such findings; and they have increased—or at least sought to protect—their budgets for investment in research because they see such investment as an essential underpinning for a successful modern economy and society. In the US, for example, the Federal budget for basic research increased by 28% in real terms between 2000 and 2009, including the stimulus provided by the American Recovery and Reinvestment Act.[2]

3.6. The result has been a sustained increase in the amount of research being undertaken, and in the outputs of that research. The number of articles published in journals has been growing in recent years at nearly 4% a year, so that in 2010 over 1.9 million articles were published, alongside an unknown number of research reports, conference presentations, working papers and so on.[3] Although expenditure on research has been constrained in some countries since the financial crisis of 2008, there is no sign that the rates of increase in global research publications will fall in the foreseeable future.

3.7. Globalisation. Within the context of these increases in research activity and outputs, there have been dramatic shifts in the global research landscape in recent years. Strong economic growth in countries such as Brazil, China and India has driven large increases in investment in R&D, which have in turn brought huge rises in the volume of research outputs. Between 2006 and 2010, the annual growth rate in articles with authors from Brazil was 9.8%, from China 12.3%, and from India 13.7%. Chinese authors accounted for 17.1% of the global total of articles published in 2010, and they are now second only to researchers in the USA in the number of articles published. Some countries starting from a lower base have seen even higher rates of growth: for Iran it was 25.2% between 2006 and 2010, for Malaysia 35.4%.[4]

3.8. This global shift in the production of research outputs has been accompanied by a rise in international collaboration among researchers. Research is increasingly being undertaken in a distributed way that blurs the distinctions between countries, making it more and more difficult to attribute research inputs and outputs unequivocally to specific countries. But collaborations are increasingly focused in a core of countries (including the UK) which collaborate with each other as well as with others in the periphery: collaborations in the periphery itself are relatively rare.[5]

3.9. In this context of globalisation and collaboration, the UK itself sustains, as we shall see in Section 4, a world-leading position in both the productivity and the quality of its research base.

3.10. Prices and Costs. The steady growth in the volumes of research publications presents a series of challenges. Between 2006 and 2010, the global total of journal articles alone increased by a fifth, alongside much larger increases in other forms of output, especially research data. Responsibilities for disseminating, preserving and providing access to research publications—in the interests of both authors and readers—are shared between publishers, aggregators, libraries and other intermediaries; and in fulfilling those responsibilities they incur significant costs. Publishers—both commercial and not-for-profit—must seek to recoup those costs, and generate surpluses for investment, for distribution to shareholders, or for transfer to support other activities. Subscription-based journals do so in the main through their charges for licences, the largest proportion of which are met by academic and other libraries. Open access journals secure most of their revenues through article processing or publishing charges (APCs), paid by authors once an article has been accepted for publication. Some journals operate as hybrids, generating their revenues partly from subscriptions and partly from APCs for open access articles. For all categories of journal, costs and prices vary, depending critically on the number of manuscripts submitted to them, and the numbers they publish:[6] the more articles submitted, the more must be rejected and this increases the cost per article published.
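
The relationship between rejection rates and cost per published article can be illustrated with a small worked calculation. The sketch below is not a model drawn from this report: the function name and every cost figure are hypothetical, and it assumes a fixed handling cost for each submitted manuscript plus a production cost for each accepted one.

```python
def cost_per_published_article(submitted, accepted, review_cost, production_cost):
    """Average cost per published article under a simple two-part cost model.

    Hypothetical assumptions: every submission incurs `review_cost`
    (peer review and editorial handling), and every accepted article
    incurs a further `production_cost`.
    """
    if accepted <= 0 or accepted > submitted:
        raise ValueError("accepted must be between 1 and submitted")
    total = submitted * review_cost + accepted * production_cost
    return total / accepted


# Two illustrative journals that each publish 100 articles a year,
# one rejecting 90% of submissions and the other rejecting 50%.
selective = cost_per_published_article(1000, 100, review_cost=200, production_cost=800)
inclusive = cost_per_published_article(200, 100, review_cost=200, production_cost=800)
print(selective, inclusive)
```

Under these assumed figures the more selective journal carries the handling cost of ten submissions for every article it publishes, which is why its cost per published article is higher even though both journals publish the same number of articles.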

3.11. Academic libraries have faced financial pressures arising from expansion both in the numbers of staff and students they are required to serve and in the volumes of books and journals they are expected to provide. A seemingly-inexorable rise in expenditure on journals has put pressure on all other elements in their budgets. Most libraries have achieved significant savings by streamlining their operations, driven in part by budgetary pressures. Thus the expansion of the HE sector and of research has not been accompanied by commensurate increases in library budgets, at least in Europe and North America. In the US, for example, gross expenditure on basic research rose by over 54% in real terms between 1999 and 2009,[7] but the budgets for members of the Association of Research Libraries (representing universities where the majority of US basic and applied research is carried out) fell from over 3.5% of university expenditure in the 1980s to under 2.0% in 2009.[8] The UK experience has been similar: while library expenditure in UK universities rose in real terms between 1999 and 2009, as a proportion of total expenditure in universities, it fell from 3.3% to 2.7%.[9]

Technological issues

3.12. The digital revolution in publishing. We have now reached a position where the current contents—and in most cases the back-runs—of nearly all journal titles are available online. This has brought a key shift in the relationship between libraries and publishers. Where libraries formerly purchased physical copies of journals, they now purchase licences under the terms of which publishers provide access to content that is held on their platforms.[10]

3.13. This shift has been accompanied by a huge increase in the number of journal titles made available through university libraries. That has been the result of so-called big deals under which publishers sell licensed access to a broad range (sometimes all) of their journal titles for a fixed period of three years or more. The pricing of such deals is complex: for while the price of individual titles is discounted deeply, publishers are in effect expanding their market by shifting libraries from highly-selective to larger all-encompassing collections. Taken together, the internet and the rise of big deals have brought a fundamental shift in research communications, particularly in relation to journals.[11]

3.14. The changes have been welcomed by researchers across all disciplines. For in their capacities both as producers and as consumers of research outputs, researchers see articles in journals as the dominant channel for communicating the results of research; and that dominance has been enhanced in the last decade.[12] Numerous surveys have shown how researchers have welcomed and embraced easy 24/7 access to unprecedented amounts of content.[13] Tenopir and King’s studies of researchers in the US[14] indicate that the number of articles read each month by university faculty has increased by over 80 per cent since the late 1970s.

3.15. The form in which articles are read has not changed as much as some would wish. Most papers are downloaded in the PDF format that mimics the form of the printed page; and a high proportion are printed for reading offline. Nearly all content is produced and also made available, however, in XML and HTML format; and there are increasing moves towards the use of more sophisticated semantic mark-up with more extensive linking and interactive features that cannot be accommodated in PDFs. Publishers are also addressing the demands for making their content available on mobile devices including smartphones, tablets and e-book readers, where PDF formats are not appropriate. In this way they are responding to the growing demand for the content they publish to be delivered through a range of devices, at any time or place.

3.16. Publishers, libraries, aggregators and others, including the general search engines such as Google, have also invested heavily to ensure that researchers and others can easily discover and navigate their way around the huge volumes of research content that are now available online. Readers can thus discover and gain access to content through a wide range of ‘gateway’ services, as well as through publisher platforms; and services such as citation linking and chaining are underpinned by the allocation of persistent identifiers (in the form of digital object identifiers (DOIs)) managed by the CrossRef organisation.[15]
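
As a brief illustration of how a persistent identifier underpins linking, the sketch below turns a DOI into a link through the public DOI proxy at doi.org. It is a minimal sketch under stated assumptions: the function name is hypothetical, the syntax check is deliberately loose, and the DOI used is an example only.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI proxy server


def doi_to_url(doi):
    """Turn a DOI string into a resolvable link.

    Performs only a light sanity check: a DOI starts with a "10." prefix
    and has a "/" separating the prefix from the suffix.
    """
    doi = doi.strip()
    prefix, sep, suffix = doi.partition("/")
    if not prefix.startswith("10.") or not sep or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_url("10.1000/182"))
```

Because the identifier, rather than the address of a publisher's platform, is what gets cited, a link built this way can continue to work if the content moves, provided the DOI record is kept up to date.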

3.17. These developments have been accompanied by huge investment in systems to manage the flows of information along the various supply chains in the research communications system: between authors, publishers, aggregators, subscription agents, libraries, end-users and so on. Developing systems and standards to facilitate effective and more open flows of metadata continues to be the focus of much effort, along with systems to generate consistent and more sophisticated information about users and usage. Access under licence has also required considerable investment in systems to manage such access; libraries and publishers have joined in establishing systems to authenticate and authorise users so that they can gain access to the published content they are entitled to read; and to ensure that they are not denied access free at the point of use when that is indeed what they are entitled to. Libraries have also invested considerable sums in systems to identify and track the digital resources for which they have purchased licences. And both libraries and publishers are investing considerable sums in systems to track levels and patterns of usage. All the infrastructural costs associated with licensing regimes are reflected in the prices charged by publishers, and also in the costs borne by libraries not only in subscriptions but in operating expenses.

3.18. Recently there have also been moves by some publishers—along with much experimentation from members of the research community—towards using Web and Semantic Web technologies to enhance journal articles in ways which some have termed ‘semantic publishing’. This has included enriching the text by providing interactive figures and ‘semantic lenses’ which turn a table into a graph, or animate a diagram; providing links to definitions of terms or concepts, or to additional information about such terms, or about relevant people or organisations; direct links to all cited references; access to the data within the article in actionable form, and links to the full datasets that underlie the article; and machine-readable metadata. The aim of enriching articles in such ways is to render the information and knowledge contained in and relating to the article easier to discover, analyse, extract, combine and re-use.

3.19. Related to such moves has been a growth of interest in exploiting the potential of text-mining tools to analyse and process the information contained in collections or corpora of journal articles and other documents in order to extract relevant information, to manipulate it, and to generate new information. The use of such techniques is not yet widespread, not least because arrangements for making publications available for text mining can be complex, and because the entry costs are high for those who lack the necessary technical skills. But text mining offers considerable potential to increase the efficiency, effectiveness and quality of research, to unlock hidden information, and to develop new knowledge.[16] The Government recently consulted upon the proposal in the Hargreaves Review of Intellectual Property to remove one of the barriers to wider adoption of text mining by introducing a new exception to copyright. This would allow whole copyright works to be copied for the purposes of text-mining and data-mining for non-commercial research.[17] We note that publishers of open access and hybrid journals can generally take a more relaxed view about the rights of users to analyse and manipulate the contents of their journals; but we have not repeated in our own work any investigation of the issues covered by the Hargreaves Report.
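
To give a sense of what the simplest form of text mining involves, the sketch below counts how many documents in a small collection mention each word. It is an illustrative sketch only: the three-sentence corpus and the function name are invented for the example, and real text mining would run over thousands of full-text articles with far more sophisticated language processing.

```python
import re
from collections import Counter

# A tiny stand-in corpus, invented for illustration.
corpus = [
    "Open access journals charge article processing charges.",
    "Repositories provide open access to research articles.",
    "Text mining extracts information from research articles.",
]


def term_frequencies(documents):
    """Count how many documents each lower-cased word appears in."""
    counts = Counter()
    for text in documents:
        # Use a set so that a word is counted once per document.
        words = set(re.findall(r"[a-z]+", text.lower()))
        counts.update(words)
    return counts


freq = term_frequencies(corpus)
print(freq["articles"], freq["open"], freq["mining"])
```

Even this simple count requires a local copy of every document in the corpus, which is why the arrangements under which publications are made available for copying matter so much for text mining.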

3.20. The data deluge. Computational and remote sensing technologies have in recent years created new ways of doing science. They have led to what some have referred to as a data deluge, and a new era of data-driven research. The business of both the public and commercial sectors is increasingly driven by the gathering and progressively more sophisticated analysis of data from a range of sources. It has been estimated that by 2020 35 zettabytes (a zettabyte is 10^21 bytes) of digital data will be created each year. Linked data and semantic web technologies promise the creation of new information by deep integration of an increasing number of datasets of growing complexity, and finding new ways of re-using them. It is not our purpose to examine all the consequences of the huge growth in the volume and scope of the data that researchers gather, create and use. Many of the implications are considered in the Royal Society’s report on Science as an Open Enterprise referred to earlier.[18] We note, however, that data is increasingly important in its own right as an output of research; and that there is increasing interest in how to support researchers in managing their data more effectively, and in making it available for others to use in their own research and for other purposes.[19] For the infrastructure and services through which data are made available and readily-usable are now seen as an essential underpinning for successful research.

3.21. The key challenge for publishers as well as for others concerned in the effective communication of research is how to handle the increasingly complex relationships between the books, articles and other publications on the one hand, and the data that underlies the findings that those publications present on the other; and how to ensure that they are presented and made accessible in an integrated way.

3.22. Most scholarly publishers accept that data and publications belong together. The relationship between them is sometimes presented as a pyramid with a broad base of raw data and data sets, on the basis of which researchers construct a smaller set of structured data collections and databases, then processed data and data representations, and topped off with the relatively small amount of data (typically in the form of small tables and charts) that is contained within the publication itself.[20] Journal publishers increasingly link from articles to relevant data stored elsewhere, and some enable readers to interact with and edit data presented in the article itself. Journals have also seen a dramatic increase in the past five years in the amount of supplementary material presented to them along with articles in the traditional format. For some this has become a growing problem, with the supplementary material exceeding in volume the articles themselves, and presenting problems in peer review and quality assurance.[21]

3.23. Publishers have an important role to play in making more of the data that researchers produce more readily available for others to peruse and re-use. Some are already introducing stricter policies requiring authors to make underlying data available, along with advice on reliable and trustworthy data archives. Some are also enhancing articles to provide better integration with underlying data; ensuring that data have persistent identifiers to underpin effective two-way links between data and publications; and helping to promote guidelines for the proper citation of data. There is also scope for much more effective co-operation between publishers and data centres to facilitate integration between data and publications, including support for full interactivity when readers wish to re-use data; and for the publication of data journals that describe data sets and data methods. In an ideal world, there would be closer integration between the text and the data presented in journal articles, with seamless links to interactive datasets; a consequent fall in the amount of supplementary material; and two-way links, with interactive viewers, between publications and relevant data held in data archives. The availability of, and access to, publications and associated data would then become fully integrated and seamless, with both feeding off each other.

Social, political and behavioural issues

3.24. Openness and transparency. The technological developments outlined above have enabled the creation of a wide range of new services. Together they have brought a new age of abundance in the provision and availability of information resources. As information of all kinds has become more readily available, members of the research and academic community have become increasingly used to operating in a complex information environment of data, information and ideas; and they have changed their workflows accordingly. They have also come increasingly to expect that information and the services surrounding it are, and should be, available free at the point of use, at any time and wherever they are. Such notions are underpinned by the widespread availability of research content provided via academic libraries: researchers are often unaware of the routes through which content is provided to them, and the extent to which they rely on licences paid for by the library.

3.25. Some researchers, as well as librarians and others, have also become active in movements to promote access to data, information and other forms of content that people are free to use, re-use and redistribute without any legal, technological or other restriction. In this context, any restrictions on access are seen as barriers against realising the full potential of information whether formally published or not—as an essential component of social and economic welfare, and as the raw materials for the development of innovative tools and services.

3.26. Similar motivations underlie the Government’s commitment to openness and transparency in enhancing access to data generated by public bodies. It intends through its open data initiative to facilitate accountability; improve outcomes and productivity in key services through informed comparison; enhance social relationships; and drive dynamic economic growth by making data available for use in the market. Again, there are legal and ethical constraints, but such objectives are readily transferable to the research domain. As we noted earlier, Governments across the world are concerned to maximise the social and economic benefits that they gain through the investments they make in research; and it is therefore not surprising that they are increasingly interested in how to ensure that publicly-funded research findings are readily available not only across the research community itself, but more widely.

3.27. Disintermediation and the disruption of established roles. Over the past two decades, all intermediaries—publishers, aggregators, abstract and indexing services, libraries and so on—have had continually to re-assess and redefine their roles, in a world where authors can in principle communicate direct with their readers: for they can readily broadcast information direct via a blog or a website. Readers no longer have to visit a library to find material relevant to their work; for they can discover and gain access to relevant material whenever and wherever they have access to the internet. The central position that libraries once occupied in the research environment has now shifted to other sources.

3.28. Reducing the role of intermediaries in such ways is sometimes referred to as ‘disintermediation’. But these changes have not eliminated the need for intermediaries, for a variety of reasons including the continuing need for quality assurance of content, and for effective search and navigation systems to guide readers to the content they want. Intermediaries develop and invest in such services, and they need to operate under business models that provide the revenues that enable them to do so. But all are operating in an environment where they face repeated questioning of the value of the services they provide. They also face insistent demands for greater customer focus, even as many of the services they provide are increasingly less-visible to authors and readers. The digital revolution has also brought the need for new services in areas such as digital preservation: the role of research libraries in ensuring the long-term preservation of print does not readily transfer to digital content, and while services such as Portico and the edepot at the Koninklijke Bibliotheek in the Netherlands have made considerable progress, we are still some way from a position where there are robust arrangements in place for the long term preservation of digital copies of all issues of all journal titles so that they remain accessible for future generations.[22] Further investment is likely to be needed in this area.

3.29. Behaviours and expectations. We have already noted that researchers now read many more articles than they did twenty years ago. They also make extensive use of journals and other material to which they did not have access in the print era. But how they read and navigate has changed too. They read on screen as well as in print, bouncing from one site to another, ‘power-browsing’ through content and spending less time reading individual items. But researchers are now more likely to navigate to the content they want through use of a gateway service or search engine rather than by browsing through the tables of contents of individual journals.[23] And they expect that when they discover material that looks relevant to their work, they will be able to access the full text immediately without charge: one of the key frustrations they express is when that expectation is thwarted. A growing minority, as we have seen, also want to use a variety of tools to organise and manipulate the content they find.

3.30. On the whole, however, researchers operate in an environment where information is abundant, and face challenges in dealing with that abundance. In the research communications landscape, as elsewhere, there is thus growing interest in ideas surrounding what has been termed the economy of attention.[24] This is based on the insight that the consumption of information requires investments of time and attention. Since those are limited resources, however, as more information is produced, each item must compete for the limited attention of readers. Such competition underlines the need for all those concerned in the research communications landscape to pay close heed to issues such as ease of search and navigation, branding, and to systems that provide effective signals of trust and authority.

3.31. Social Media. Over recent years, researchers have made increasing use of social media—blogs, wikis, podcasts, online videos, Twitter feeds, RSS feeds, comments on online articles and so on. Recent studies indicate that around a half of the members of academic staff in the UK make use of some form of social media at least occasionally in the course of their work. They do so, however, for the most part on an irregular basis, and much more as readers than as creators: only a minority are frequent users and creators of social media content. Thus while researchers are generally supportive in their attitudes towards social media as a means of sharing ideas and collaborating with other members of the research community, they are wary of the lack of quality assurance, and see them as a supplement to—not a replacement for—traditional publications: they ‘cannot at any point replace high-quality peer-reviewed journal articles’.[25] Nor do they as yet form a key part of researchers’ general workflows. In terms of our remit, they are not peer-reviewed publications.

3.32. Some services with social media aspects do, however, show signs that they might become more generally embedded in research workflows. Mendeley, for example, provides a web-based service which allows researchers to manage and annotate their bibliographies, but also to connect with colleagues and share papers and annotations with them.[26] It also provides a means to discover papers as well as other researchers and research groups working in specific fields. It now has nearly two million registered users worldwide.

Open Access

3.33. The development of the open access movement can be traced back to the 1990s, when the earliest e-print repositories[27] (initially called archives) and open access journals (that is, journals that make their contents available free of charge upon publication) began to appear. These initiatives were stimulated by the rapid development of the internet, by concerns about the increasing cost of subscriptions to journals, and by the growth of the view that the results of publicly-funded research should be in the public domain. In that context, the Scholarly Publishing and Academic Resources Coalition (SPARC)[28] was launched in 1998 by the Association of Research Libraries (ARL) in North America, with a mission to correct what it saw as imbalances in the research communications system that had driven up the cost of journals and thereby inhibited access to information and thus the advancement of scholarship.

3.34. The open access movement began to take off in a significant way in the years immediately after 2000, with the launch of what are still the two biggest open access publishers, BioMedCentral[29] in the UK, and the Public Library of Science (PLoS) in the US.[30] Three key statements on open access were launched in 2002 and 2003: the Budapest Open Access Initiative[31] at a meeting organised by the Open Society Institute in February 2002; the Bethesda Statement on Open Access Publishing,[32] drafted at a meeting organised by the Howard Hughes Medical Institute in April 2003; and the Berlin Declaration[33] at a meeting organised by the Max Planck Society in October 2003. All three stress that open access implies that authors should grant free access and rights to use published works, subject only to proper attribution of authorship. Each also acknowledges two complementary routes to open access—publishing in open access journals, and providing access by depositing material in open access repositories—and the need to develop appropriate financial as well as legal frameworks to support the moves to make the published findings of research more widely available via the internet.

3.35. The open access movement is clearly an international one, and UK representatives have played a significant role in it. The SHERPA[34] project was established at the University of Nottingham in 2002, funded by JISC, to support the development of institutional repositories and to facilitate the rapid dissemination of research. It soon established the Romeo online database of publishers’ policies relating to the deposit of published articles in repositories, followed by the Juliet database of funders’ policies on open access, and the OpenDoar database of open access repositories. The latter complemented the Directory of Open Access Journals[35] established by the University of Lund in 2003.

Repositories

3.36. Repositories are now a familiar way to facilitate open access. There are now over two thousand repositories worldwide, the great majority of them based in universities and other research institutions. They vary hugely in size and scope. Some have fewer than a hundred items, while the CERN repository in Geneva has more than a million; and the kinds of records they contain include reports and working papers, conference papers and posters, dissertations and theses, designs, exhibition materials, performances and so on. They vary also in the amount of material that is available in full text, as distinct from simply metadata records. In many of the larger institutional repositories, the majority of items are recorded only as metadata.

3.37. Some of the largest repositories are not institutionally-based, but operate as a service to specific subject communities across the globe. Among the most notable of these are ArXiv,[36] for e-prints mainly in physics, and PubMedCentral (PMC),[37] which is run by the U.S. National Institutes of Health's National Library of Medicine (NIH/NLM). The nature and scale of repositories such as these will be considered further in Section 7.

Open access journals

3.38. The number of open access journals has risen rapidly since they first began to emerge in the 1990s. There are currently over 7,600 open access journals listed in the Directory of Open Access Journals (DOAJ), published in 117 countries. The three countries with the most journals are the US (1360), Brazil (690) and the UK (533). There have been some criticisms of the DOAJ statistics, but it is clear that open access journals now represent a significant proportion of the journals published globally. They are highly heterogeneous in nature and scope, and like all journals they vary considerably in editorial standards and in the quality of peer review.[38] Most are relatively new journals which have been open access from the start, many of them founded by individual scholars on tailor-made platforms, often with a business model based on voluntary labour and the use of a university’s web server free of charge; others are older-established journals that have converted to open access; while new open access publishers such as BioMedCentral and PLoS have established a large-scale presence in the market, with their operations funded by charging APCs to authors.

3.39. In addition to the fully open access journals, nearly all the large scholarly publishers now offer the hybrid option for at least some of their journals: that is, in return for the payment of an APC, they will make an article in an otherwise subscription-based journal accessible immediately on publication, without any reader having to pay a subscription or PPV charge.[39]

3.40. The proportion of the global total of articles published each year which are published in open access or hybrid journals is not easy to calculate. A recent study estimated that over 190,000 articles were published in open access journals in 2009, about 7.7% of all peer-reviewed journal articles published that year.[40] The EU-funded Study of Open Access Publishing (SOAP) produced a slightly higher estimate, that 8-10% of all peer-reviewed articles were published open access.[41] Such figures should be set in the context where the total number of articles in all kinds of peer-reviewed journals worldwide is rising at the rate of around 4% a year.

3.41. Most publishers providing fully open access journals operate on a small scale, with only one title, publishing fewer than one hundred articles a year. A recent study[42] suggests that two-thirds of open access articles are published by 10% of publishers, and that fourteen publishers are responsible for around 30% of open access articles. Science, technology and medicine account for two-thirds of journals and more than three-quarters of articles. Social science and humanities, on the other hand, account for a third of journals but only 16% of articles.[43]

3.42. Take-up of the open access option in hybrid journals is relatively low, at around 2% on average.[44] Some publishers have seen higher levels of take-up in certain disciplines: Oxford Journals have seen 10% of authors in the life sciences selecting the open access option across 16 participating journals, as against approximately 5% in medicine and public health and 3% in the humanities and social sciences. Nature Communications reports take-up of the open access option at over 40%.

3.43. Overall, recent studies suggest that the growth of open access articles has been much faster than for peer-reviewed articles as a whole. This has been the result both of the creation of new ‘born open access’ journals and the switch of established journals either to open access or to the hybrid model. The recent development of what have been termed ‘repository’ journals[45] such as PLoSOne (where the peer review process focuses solely on whether the findings and conclusions are justified by the results and methodology presented, rather than on assessment of the relative importance of the research or perceived level of interest it will generate) has stimulated further growth. Established publishers such as the American Institute of Physics, Nature Publishing Group, the BMJ (British Medical Journal) Group, and SAGE Publications in the social sciences, have all launched similar journals in the past couple of years. PLoSOne is now by some counts the largest journal in the world. Such journals play a role different from the highly-selective journals which seek to present only the best and most significant research in their fields.

Funders’ policies

3.44. Major funders of research began from 2005 to introduce policies to promote open access to the published findings of the research they fund. The National Institutes of Health (NIH) in the US introduced a policy requiring that scientists should submit final peer-reviewed journal manuscripts arising from NIH funding to PubMed Central upon acceptance for publication; and that they should be accessible to the public no later than 12 months after publication.[46] In the UK, the House of Commons Science and Technology Committee issued a report in 2004[47] recommending that research funders should require that published findings should be deposited in institutional repositories, and that there should be a further study of the funding of open access journals. In response to that report, Research Councils UK (RCUK) produced in 2005 and 2006 position statements[48] outlining a requirement that articles should be deposited in repositories, but recognising that access would depend on copyright and licensing arrangements relating, for example, to embargo periods. The Wellcome Trust introduced a policy requiring that published outputs of the research that it funds should be made available through PubMedCentral within six months of publication; and it complemented that policy with arrangements to meet the costs of the APCs charged by open access publishers.[49]

3.45. Similar policies were introduced from 2006 onwards by a range of organisations including the Deutsche Forschungsgemeinschaft (DFG)[50] in Germany, the Centre National de la Recherche Scientifique (CNRS)[51] in France, and the Canadian Institutes of Health Research.[52] The European Union’s interest in open access was reinforced by its funding of initiatives to support the development of Europe-wide research infrastructures, and the introduction of open access policies for part of the Framework 7 programme and by the European Research Council.[53]

3.46. These policies and initiatives varied as between encouraging and requiring open access, in the extent to which any requirement for deposit and access via repositories was mitigated by embargo periods, and in whether or how they were backed up by the provision of funding to meet the costs of publishing in open access journals. They also vary in the extent to which they have been policed or enforced. Even the Wellcome Trust, which has been the most generous in its arrangements for funding for open access publishing, has seen compliance with its policies requiring deposit of articles in the UK PubMedCentral repository reach only around 55 per cent.

Institutional policies

3.47. Policies from individual universities and other research institutions to promote or require open access have been somewhat slower to emerge. In the US, Harvard University’s Faculty of Arts and Sciences introduced in 2008 a policy under which its staff grant the university a nonexclusive, irrevocable right to distribute their articles for any non-commercial purpose, and articles are stored, preserved, and made freely accessible in digital form in Digital Access to Scholarship at Harvard (DASH), the University’s open access repository.[54] Other US universities have followed with similar policies. In the UK, universities from across the sector—including University College London, and the Universities of Leicester, Salford and Abertay Dundee[55]—have introduced policies to require deposit of publications in their institutional repositories. But the policies are qualified by such terms as ‘copyright permissions allowing’ and ‘where publisher agreements permit’. As with funders’ policies, it is not clear how extensively the policies are policed, and rates of compliance are as yet not high. These issues are considered further in Section 4.

Publisher and learned society concerns

3.48. When funders and institutions began to develop policies to promote open access, especially access via repositories, both commercial and learned society publishers that publish subscription-based journals tended to see them as a threat. Many such publishers saw the prospect of a requirement that articles should be made available through institutional and subject-based repositories, after what was seen as a relatively short embargo period, as a threat to their revenues and even to the survival of their journals, with the prospect of sales falling as swift, free access became available via repositories. Learned societies saw a threat to the publishing income that sustains many of their charitable scholarly and public engagement activities; and also to their income from members who are often attracted by society publications as a membership benefit. Some learned societies have also expressed concerns that allowing use and re-use of research results on open access terms might limit the UK’s ability to exploit those results commercially.

3.49. The reaction of many publishers and learned societies to the policies introduced by funding agencies and others was therefore to put restrictions around what could be deposited in repositories, and the rights associated with it. Thus many publishers insisted that only the manuscript submitted to them by the author or, more commonly, the manuscript accepted for publication after peer review, could be made available, rather than the ‘version of record’ copy-edited and marked up by the publisher. And in addition to embargo periods, publishers sought to restrict the rights of readers to re-use material deposited in repositories. These issues are considered more fully in Sections 4 and 7.

3.50. Subscription-based publishers’ reactions to the development of open access journals were more mixed. Many were initially hostile, suggesting that the new journals represented a lowering of standards, or that they were not sustainable without heavy subsidy. Others including Oxford University Press and the Institute of Physics responded by launching their own open access journals alongside their existing subscription-based ones, or by developing the hybrid model. Most of the larger scholarly publishers now provide a mix of options in this way.


  1. See OECD Science, Technology and R&D Statistics, available at http://dx.doi.org/10.1787/strd-data-en .
  2. National Science Board. 2012. Science and Engineering Indicators 2012. Arlington VA: National Science Foundation (NSB 12-01), Chapter 4.
  3. International Comparative Performance of the UK Research Base 2011, a report prepared by Elsevier for the Department of Business, Innovation and Skills, 2011. The figures are based on analysis of the SCOPUS database.
  4. Ibid.
  5. Boshoff, N. (2009) “South–South research collaboration of countries in the Southern Development Community (SADC)” Scientometrics 84(2) pp. 481–503
  6. For a discussion of the literature on the drivers of journal costs and prices, see Activities, costs and funding flows in the scholarly communications system in the UK, RIN 2008
  7. National Science Board (2012), Science and Engineering Indicators 2012. National Science Foundation, Appendix Table 4-4
  8. http://www.arlstatistics.org/about/series/eg
  9. RIN, Trends in the finances of UK higher education libraries, 1999-2009, 2010
  10. It is important to note, however, that for a range of reasons, many libraries purchase both physical copies and online access, even though this adds to both libraries’ and publishers’ costs, not least in relation to VAT. See E-only Scholarly Journals: overcoming the barriers, RIN, Publishing Research Consortium, JISC and Research Libraries UK, 2010.
  11. As we shall see below, the shift to online access for monographs, however, has been much slower to take off.
  12. Communicating Knowledge: how and why UK researchers publish and disseminate their findings, RIN, 2009.
  13. E-journals: their use, value and impact: final report, RIN 2011.
  14. Tenopir, C., D.W. King, Sheri Edwards, and Lei Wu. “Electronic Journals and Changes in Scholarly Article Seeking and Reading Patterns.” Aslib Proceedings: New Information Perspectives, vol. 61 (2009): 5. A recent parallel study of researchers in the UK indicates that they read on average 267 articles a year (298 if humanities researchers are excluded from the calculation). See Carol Tenopir and Rachel Volentine, UK Scholarly Reading and the Value of Library Resources, JISC Collections, 2012
  15. http://www.crossref.org/ .
  16. McDonald, D et al, The Value and Benefits of Text Mining, JISC, 2012.
  17. I Hargreaves, Digital Opportunity: A Review of Intellectual Property and Growth, Intellectual Property Office, 2011; Consultation on Copyright, Intellectual Property Office 2012.
  18. Royal Society, Science as an Open Enterprise, forthcoming 2012
  19. See, for example, the OECD’s Principles and Guidelines for Access to Research Data from Public Funding. OECD Publications. Paris. 2007; and the guidance produced in the UK by JISC, the Digital Curation Centre, the Research Councils and others. For an example of Research Council guidance, see the Biotechnology and Biological Sciences Research Council, BBSRC Data Sharing Policy, June 2010.
  20. See, for example, Susan Reilly et al Report On Integration of Data and Publications, 2011, available at http://www.stm-assoc.org/2011_12_5_ODE_Report_On_Integration_of_Data_and_Publications.pdf
  21. Such difficulties led the Journal of Neuroscience to decide in 2010 that it would no longer accept any supplementary material along with the articles submitted to it.
  22. For Portico see http://www.portico.org/digital-preservation/ ; for the e-depot at the Koninklijke Bibliotheek in the Netherlands see http://www.kb.nl/hrd/dd/index-en.html . Other services include LOCKSS (Lots of Copies Keep Stuff Safe) (http://www.lockss.org/) and CLOCKSS (Controlled LOCKSS) (http://www.clockss.org/clockss/Home).
  23. Nicholas, D. (2009). The information-seeking behaviour of the virtual scholar: from use to users. Serials 21(2), 89-92; Carol Tenopir and Rachel Volentine, UK Scholarly Reading and the Value of Library Resources, JISC Collections, 2012
  24. See, for example, Fang Wu and Bernardo A. Huberman, ‘Novelty and Collective Attention’, Proceedings of the National Academy of Sciences. 105, 17599, 2007; and Gonçalves, B., et al, ‘Modeling users' activity on twitter networks: validation of Dunbar's number’, PLoSOne, vol 6 (8), 2011.
  25. If You Build it, Will They Come? How Researchers Perceive and Use Web 2.0, RIN 2010; Carol Tenopir and Rachel Volentine, UK Scholarly Reading and the Value of Library Resources, JISC Collections, 2012.
  26. http://www.mendeley.com/
  27. The ArXiv repository for e-prints in physics was founded by Paul Ginsparg in 1991, and was followed by the Social Science Research Network (SSRN) in 1994 and Research Papers in Economics (RePEc) in 1997.
  28. http://www.arl.org/sparc/
  29. http://www.biomedcentral.com/
  30. http://www.plos.org/
  31. http://www.soros.org/openaccess/read
  32. http://www.earlham.edu/~peters/fos/bethesda.htm
  33. http://oa.mpg.de/lang/en-uk/berlin-prozess/berliner-erklarung
  34. http://www.sherpa.ac.uk/
  35. http://www.doaj.org/
  36. http://arxiv.org/
  37. http://www.ncbi.nlm.nih.gov/pmc/
  38. The qualification for entry in the Directory is that the journal has in place a ‘quality control system to guarantee the content’. But as with subscription-based journals, standards vary. http://www.doaj.org/doaj?func=loadTempl&templ=about&uiLanguage=en
  39. There are, however, some variations as to rights of use and re-use.
  40. The study also charted rapid growth from 19,500 in 2000 to 191,850 in 2009. Laakso et al, The development of OA journal publishing 1993-2009, PLoS ONE 6(6): http://www.plosone.org/article/info:doi/10.1371/journal.pone.0020961#pone.0020961-Morris1
  41. Suenje Dallmeier-Tiessen et al, First results of the SOAP Project: Open Access Publishing in 2010. http://arxiv.org/ftp/arxiv/papers/1010/1010.0506.pdf . Analysis of the SCOPUS database by Elsevier, however, suggests a lower figure of around 4-5%.
  42. Suenje Dallmeier-Tiessen et al, op cit.
  43. It is also notable that while APCs and membership subscriptions are the most important sources of income for STM publishers, sponsorship and print subscriptions are favoured in social sciences and humanities. Dependence on APCs is also characteristic of publishers with large numbers of journals, and less common among small publishers.
  44. Suenje Dallmeier-Tiessen et al, op cit.
  45. House of Commons Science and Technology Committee Peer review in scientific publications, HC 856, 2011
  46. http://publicaccess.nih.gov/policy.htm
  47. http://www.publications.parliament.uk/pa/cm200304/cmselect/cmsctech/399/39902.htm
  48. http://www.rcuk.ac.uk/documents/documents/2006statement.pdf. The 2005 statement also made explicit reference to provision for the payment of APCs under the full economic costing regime then being introduced: http://www.rcuk.ac.uk/documents/documents/2005statement.pdf
  49. http://www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTD002766.htm . The Medical Research Council also introduced a policy requiring deposit within six months, but did not follow the Wellcome Trust in its policies relating to the payment of APCs.
  50. http://www.dfg.de/en/research_funding/programmes/infrastructure/lis/digital_information/open_access/index.html
  51. http://www.ccsd.cnrs.fr/support/content/PDF/DGauxDU_060621.pdf
  52. http://www.cihr-irsc.gc.ca/e/32005.html
  53. For the DRIVER programme to develop the infrastructure of repositories, see http://www.driverrepository.eu/ ; for the open access policy for Framework Programme 7, see http://ec.europa.eu/research/science-society/index.cfm?fuseaction=public.topic&id=1300&lang=1 ; and for the European Research Council policy, see http://erc.europa.eu/sites/default/files/press_release/files/erc_scc_statement_2006_open_access_0.pdf The ROARMAP service (www.roarmap.eprints.org ) indicates that some fifty funders worldwide have instituted policies to promote open access.
  54. For a full statement of the policy, and a list of the schools which have now adopted it, see http://osc.hul.harvard.edu/policies . It should be noted, however, that only a small proportion of the articles and other publications published by Harvard authors are as yet available in DASH. See also Amy Brand, ‘Beyond Mandate and Repository, toward sustainable faculty self-archiving’, Learned Publishing, 25(1), 2012.
  55. For UCL, see http://www.ucl.ac.uk/library/publications-policy.shtml ; for Leicester, see http://www2.le.ac.uk/offices/researchsupport/policyandstrategy/open-access/pubpolicy ; for Salford, see http://www.salford.ac.uk/__data/assets/pdf_file/0006/58722/USIRPolicy.pdf ; and for Abertay Dundee see https://portal.abertay.ac.uk/portal/page/portal/abertayknowledge/research/Self-Archiving-and-ResearchRepository-PolicyV1-1.pdf