Copyright Office letter affirming refusal to register the "Prancer DNA Sequence" (2014)
by Robert J. Kasunic

United States Copyright Office

Library of Congress 101 Independence Avenue SE Washington DC 20559-6000

February 11, 2014


Attn: Howard Simon

1140 O'Brien Drive, Suite A

Menlo Park, CA 94025

Correspondence ID: 1-K1L4R4

Original Correspondence ID: 1-DH5IGE

Dear Mr. Simon:

I am writing in response to your request for reconsideration of the Registration Program's refusal to register the DNA sequence entitled the "Prancer DNA Sequence." I apologize for the delay in responding, but this request was an issue of first impression for the U.S. Copyright Office and as such, was given significant consideration prior to rendering a decision. After carefully reconsidering the registration materials and the arguments contained in your request for reconsideration, the Office affirms the refusal of registration.

The Office's decision is based primarily on three principles of copyright law. First, to be copyrightable, a claim must be based on an "original work of authorship" falling within the congressionally established categories of authorship in title 17. 17 U.S.C. § 102(a) and H.R. Rep. 94-1476, 51–53. Second, copyright protection does not "extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work." 17 U.S.C. § 102(b). Third, to be copyrightable, an original work of authorship must include a sufficient quantum of copyrightable authorship. The Office finds that the Prancer DNA Sequence fails on all three of these principles.

I. DNA falls outside of the categories of copyrightable authorship established by Congress

In your argument in support of copyright protection for the Prancer DNA Sequence, you emphasize that this is a human-engineered, synthetic DNA sequence rather than a naturally-occurring DNA sequence. Were this a naturally-occurring DNA sequence that was "discovered" by your client through a process of isolation, the matter would be simply resolved by § 102 as further clarified by the Feist decision: "one who discovers a fact is not its 'maker' or 'originator.' The discoverer merely finds and records." Feist Publications v. Rural Telephone Service Co., 499 U.S. 340, 347 (1991). Your assertion that this work is created rather than discovered will, for purposes of this review, be accepted as such. Yet, the extent to which any genetic sequence can truly be entirely human-created is unclear, particularly with respect to copyright creativity. Since the operation of DNA is dictated by laws of biology, at least some aspects of DNA sequences are controlled by those laws. Moreover, to the extent that any sequence involves the isolation of any naturally-occurring sequence, or a derivation thereof, it may not be deemed created as opposed to discovered. It is of great concern to the Office that neither this critical distinction nor the degree of creative human authorship can be established through the examination of the deposit. The inability of the Office to independently discern new creative authorship suggests that a claim in a DNA sequence may be far better suited for the realm of patent, where a heightened standard of novelty, nonobviousness, and an examination of prior art would be considered, rather than the originality standard of copyright.

However, because this claim has been submitted for copyright registration, the Office applies the principles of copyright law. Assuming arguendo that this sequence is not a discovery or the isolation of a naturally-occurring sequence, it nevertheless must qualify as an original work of authorship to support a claim of copyright.

The first question to address is whether this sequence qualifies as copyrightable subject matter. You argue this sequence is fixed because "DNA is composed of stable chemical nucleotides" and because "DNA possesses definite sequences of nucleotides that can easily be determined, copies known to last for at least many thousands of years with its nucleotide sequence intact." First Request for Reconsideration, at 2. You further argue that "[u]sing synthetic biological techniques, it is routine to design and construct new, human-designed DNA sequences. Because a synthetic biologist designs particular DNA sequences, and 'writes,' or fixes, them when she synthesizes those sequences, she is an author." Id.

The fixation of chemical nucleotide sequences designed and constructed by a biologist does not fall within any of the congressionally established categories of authorship specified in § 102(a). You argue that those categories are illustrative and not limitative according to the language of § 102(a) and the legislative history. Coupled with the 1976 Act's expansive definition of fixation, you find support for the Copyright Act's ability to include synthetic DNA sequences within its subject matter.

In 2012, the U.S. Copyright Office thoroughly analyzed the relationship between the § 102(a) categories and the Office's discretion to identify new categories of authorship or to expand the scope of existing categories. In a statement of policy, the Office stated:

This passage suggests that Congress intended the statute to be flexible as to the scope of established categories, but also that Congress [] intended to retain control of the designation of entirely new categories of authorship. The legislative history goes on to state that the illustrative nature of the section 102 categories of authorship was intended to provide "sufficient flexibility to free the courts from rigid or outmoded concepts of the scope of particular categories." Id. at 53 (emphasis added). The flexibility granted to the courts is limited to the scope of the categories designated by Congress in section 102(a). Congress did not delegate authority to the courts to create new categories of authorship. Congress reserved this option to itself.

If the federal courts do not have authority to establish new categories of subject matter, it necessarily follows that the Copyright Office also has no such authority in the absence of any clear delegation of authority to the Register of Copyrights.

Statement of Policy; Registration of Compilations, 77 Fed. Reg. 37,605, 37,607 (2012).

This policy statement clarified the U.S. Copyright Office's interpretation of congressional intent based on the language of the statute together with a complete reading of the applicable legislative history. The Office found that Congress intended to avoid exhausting only its own power to create new categories of authorship:

In using the phrase "original works of authorship," rather than "all writings of an author" now in section 4 of the statute, the committee's purpose is to avoid exhausting the constitutional power of Congress to legislate in this field, and to eliminate the uncertainties arising from the latter phrase. Since the present statutory language is substantially the same as the empowering language of the Constitution, a recurring question has been whether the statutory and the constitutional provisions are coextensive. If so, the courts would be faced with the alternative of holding copyrightable something that Congress clearly did not intend to protect, or of holding constitutionally incapable of copyright something that Congress might one day want to protect.

H.R. Rep. 94-1476 at 51 (1976).

Congress [chose] to provide the courts with the authority to interpret the scope of existing categories, but it retained for itself the authority to create new categories of authorship in the future, as it did with architectural works. This was further clarified in the House Report:

In some of these cases the new expressive forms—electronic music, filmstrips, and computer programs, for example—could be regarded as an extension of copyrightable subject matter Congress had already intended to protect, and were thus considered copyrightable from the outset without the need of new legislation. In other cases, such as photographs, sound recordings, and motion pictures, statutory enactment was deemed necessary to give them full recognition as copyrightable works.


The Office finds that synthetic DNA sequences do not fit within any of the existing categories of copyrightable authorship listed in section 102(a) and are not an extension of copyrightable subject matter that Congress already intended to be protected by copyright. Even if the Office came to the opposite conclusion, based on its prior interpretation of the statute and the legislative history, the Office would not find it prudent to interpret the scope of existing categories in a wholly new manner. While the legislative history suggested that courts did have the flexibility to interpret the scope of existing categories beyond their present limits, the Office does not find similar support for its own authority. Moreover, neither the courts nor the Office [has] authority to create new categories of authorship; this prerogative resides with Congress. The Office finds a claim in synthetic DNA sequences to be a claim in a new category of copyrightable subject matter that is presently precluded from copyright protection until such time as Congress decides it should become copyrightable subject matter.

In your letter, you argue that a synthetic DNA sequence is analogous to a computer program such that it should qualify for registration as a literary work. The Office disagrees for several reasons.

You state that prior to the CONTU Commission's Final Report and Congress's subsequent amendment to the Copyright Act adding § 117 and § 101's definition of computer programs, the U.S. Copyright Office began registering certain computer programs under a "rule of doubt." However, the CONTU Report clarifies that the Office's issuance of qualified registrations was contingent upon the presence of observable authorship ("to the extent that they incorporate authorship of the programmer's expression of original ideas, as distinguished from the ideas themselves.")[1] and the deposit of human-readable copies. National Commission on New Technological Uses of Copyrighted Works, Final Report at 15 (1979).

The Copyright Office's Circular 31D contains additional conditions precedent for the registration of computer programs at that time:

(a) The elements of assembling, selecting, arranging, editing, and literary expression that went into the compilation of the program are sufficient to constitute original authorship.
(b) The program has been published, with the required copyright notice; that is, "copies" (i.e., reproductions of the program in a form perceptible or capable of being made perceptible to the human eye) bearing the notice have been distributed or made available to the public.
(c) The copies deposited for registration consist of or include reproductions in a language intelligible to human beings. If the first publication was [in] a form (such as machine-readable tape) that cannot be perceived visually or read by humans, something more (such as a print-out of the entire program) must be deposited along with two complete copies of the program as first published.
(d) An application for registration is submitted on Form A as a "book." Detailed instructions for registration are included in the application forms.
(e) The applicant also submits a brief explanation of the way in which the program was first made available to the public, and the form in which the copies were published. This explanation is not an essential requirement in every case, but it will generally facilitate examination of the required application, copies, and fee.

See Copyright Office Circular 31D (1967). In addition, as quoted above from the House Report, Congress suggested that computer programs were regarded as an extension of copyrightable subject matter that Congress already intended to protect, H.R. Rep. 94-1476 at 51, and were a form of literary work. Id. at 54.

DNA sequences are fundamentally different from computer programs. As you stated, DNA sequences are fixed because "DNA is composed of stable chemical nucleotides." DNA sequences, whether naturally occurring or synthetic, are the result of biology or biological techniques, respectively. Biological creations do not fit within any of the existing categories of authorship. Indeed, the Copyright Office Review Board upheld a similar decision in a reconsideration of a denial of registration for genetically modified plants. The fact that Congress has chosen to provide delineated patent protection for certain biological processes, such [as] plant patents for newly invented strains of asexually reproducing plants, while precluding protection for tuber-propagated plants or wild uncultivated plants, provides strong justification for leaving such decisions to Congress. Moreover, the Supreme Court's recent decision in Association for Molecular Pathology v. Myriad Genetics, Inc. also provides reason to question whether synthetic or cDNA sequences are proper subject matter for copyright, since they are eligible for patent protection. Such concerns about the potential overlap between copyright and patent protection only strengthen the Office's conclusion that a synthetic DNA sequence does not fall within any of the existing § 102(a) categories of authorship.

Your alternative argument is that the synthetic DNA sequence is analogous to a computer program because the Prancer sequence is comprised of a set of statements or instructions. You argue: "there is nothing in copyright law that would justify treating a set of instructions directed towards a computer any differently than a set of instructions directed towards some other machine capable of receiving and acting upon the instructions, including a biological machine such as a recombinant microorganism." First Request for Reconsideration at 4.

The Office disagrees. The definition of a computer program is "a set of statements or instructions to be used directly or indirectly in a computer in order to bring about a certain result." 17 U.S.C. § 101. Congress added this definition after the CONTU Report to address computer programs, a form of authorship that it had previously suggested fell within the scope of literary works. The 231 codons that make up the Prancer DNA Sequence are not statements or instructions that are "used directly or indirectly in a computer in order to bring about a certain result." While an organism may be analogized to a machine, it clearly is not one, and as a result falls outside of the category enumerated in the statute. Additionally, although the deposit submitted with the application for registration contains a notation of the Prancer DNA Sequence comprised of a specific series of four letters, the use of letters does not transform this sequence into a literary work. Every gene sequence is represented by some sequence of these four letters to identify the sequence of the chemical compounds that these letters represent. This sequence is not literary authorship, but rather a form of notating the biological sequence. Copyright does not protect DNA sequences whether naturally occurring or synthetic.

II. Copyright does not protect processes or systems

In addition to failing to fall within the scope of a congressionally-recognized category of authorship, the U.S. Copyright Office finds that this claim is precluded from copyright protection under section 102(b), which states:

(b) In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.

The Prancer DNA Sequence is a genetic formula for a biological system. It does not describe, explain, or illustrate anything except the genetic markers that comprise this biological organism. Therefore, there is no copyrightable expression, but rather the claim simply records the formula for this biological system or process.

III. Copyright requires sufficient creative authorship

The sequence of nucleotides in a given gene is commonly represented as a list of characters comprising the letters A, C, G and T.[2] These letters correspond to the compounds adenine, cytosine, guanine and thymine, respectively. In the case of a synthetic gene, the specific sequence of nucleotides is the result of some person's choices, but those choices are not made for the purpose of artistic expression. They are made to create a specific gene that produces a "particular polypeptide." Properly understood, the nucleotide sequence of a synthetic gene, inasmuch as it could be conceived as a form of expression at all, is a "form of expression dictated solely by functional considerations."[3] A synthetic gene's nucleotide sequence, be it fixed in a writing or a biological cell, does not bear the "stamp" of any author.[4] It is a mechanistically composed term that describes the gene's physical structure in a simple and unembellished way. The sequence simply serves to identify the gene, much as any name identifies the object to which it is commonly applied.

You state, "for most of the 20 amino acids encoded by the standard genetic code, there are anywhere from two to six alternate codons specifying the same amino acid." However, even if most of the codons are redundant, that redundancy does not transform a biological sequence into an expressive work of authorship. Similarly, the mere fact that the sequence is comprised of a sequence of four letters also does not transform this biological process into a literary work. The letters are not used expressively, but rather for a functional end—to produce an encoded protein that is fluorescent, which as you state, is "a useful functional attribute in biology," and "is interpretable by most living biological systems." First Request for Reconsideration at 5. It is not interpretable by humans and is not used directly or indirectly in a computer to bring about a certain result. The sequence of codons is used to represent the synthesis of a protein comprising 231 amino acids that are linked together in a specific order to be used to produce a functional result in a biological organism. In the cases where letters and [symbols] are used to create a copyrightable work of authorship, they are combined into a form that is readable and conveys meaning to a human, including with the aid of a machine or device, or, to cause a certain result in a computer. The Prancer DNA Sequence does not possess any such expression beyond its functional representation.

For the reasons stated above, the U.S. Copyright Office affirms its conclusion that the Prancer DNA Sequence does not support a claim for copyright registration.


Robert J. Kasunic

Associate Register of Copyrights
and Director of Copyright Policy and Practices

  1. H.R. Rep. 94-1476 at 54.
  2. The letters A, C, G, and T are technically only used to represent gene sequences in DNA. Gene sequences in RNA are represented by the letters A, C, G, and U, where the "U" represents the nucleotide uracil.
  3. Nimmer on Copyright § 2.01[B].
  4. See Harper & Row, Publishers, Inc. v. Nation Enters., 471 U.S. 539, 547 (1985) ("The copyright is limited to those aspects of the work—termed 'expression'—that display the stamp of the author's originality.").