US: Google Books is “fair use” of 20+ million copyrighted works; class action dismissed

By Dori Ann Hanswirth & Nathaniel Boyer on November 20, 2013

On Thursday, a New York-based federal judge (Chin, J.) ruled that Google Books, a repository of over 20 million books available to be text-searched by any internet user for free, is a non-actionable “fair use.” The Authors Guild, Inc. v. Google, Inc., No. 1:05-cv-08136-DC, ___ F. Supp. 2d ___ (S.D.N.Y. Nov. 14, 2013). Reasoning that Google Books uses “words in books . . . in a way they have never been used before,” the Court threw out a proposed class-action lawsuit that sought an injunction and unspecified damages from Google, which scanned and saved the books in their entirety without obtaining permission from the millions of proposed class members.

Defendant Google, Inc., the creator of Google Books, partnered with thousands of libraries to make scans of over 20 million books. But these are not ordinary “image” scans. Rather, Google uses advanced optical character recognition (“OCR”) scanning technology to rapidly and accurately detect the words in the books. From that, Google has created “a comprehensive word index that helps readers, scholars, researchers, and others find books,” which is searchable by anyone with internet access.

Google’s creation is “highly transformative,” the Court found, because Google was not using the published words for their original expressive purpose; rather, it “uses snippets of text to act as pointers directing users to a broad selection of books.” For example, Google Books has enabled researchers to engage in “data mining” projects that had never been done before, like analyzing changes in the usage of language over time. Also weighing in favor of the Court’s fair use determination was that (a) Google “does not engage in direct commercialization of the copyrighted works,” and (b) the vast majority (over 93%) of the books are non-fiction, which is entitled to less protection than more “creative” works of fiction.

The Court was equally dismissive of the plaintiffs’ claim that their market for the sale of books has been harmed. Google does not make entire books freely available for public viewing. Rather, users of Google Books can access only “snippets” of books when they run text searches, and even then, 10% of all pages are not viewable, and portions of the pages that are viewable are “blacklisted.” It would be highly unlikely “that someone would take the time and energy to input countless searches to try and get enough snippets to comprise an entire book,” a task that, presumably, can be done effectively only if the person already possesses a copy of the book anyway. Rather, “a reasonable factfinder could only conclude that Google Books enhances the sales of books to the benefit of copyright owners” by, for example, enabling underserved libraries to identify which books they want to purchase with their limited resources.

This is a major ruling. It suggests that, with appropriate safeguards to prevent viewers from accessing works in full, aggregating copyrighted data and making it publicly available via searchable indexes may pass muster under U.S. copyright law. The ruling is therefore relevant to any industry concerned about the copyright implications of disseminating aggregated data.

However, the Court’s opinion is unlikely to be the last word in this case: The class-action plaintiffs plan to appeal the ruling to a three-judge panel of the influential United States Court of Appeals for the Second Circuit (also based in New York City).