108 research outputs found

    Orphan Works as Grist for the Data Mill

    The phenomenon of library digitization in general, and the digitization of so-called “orphan works” in particular, raises many important copyright law questions. However, as this Article explains, correctly understood, there is no orphan works problem for certain kinds of library digitization. The distinction between expressive and non-expressive works is already well recognized in copyright law as the gatekeeper to copyright protection—novels are protected by copyright, while telephone books and other uncreative compilations of data are not. The same distinction should generally be made in relation to potential acts of infringement. Preserving the functional force of the idea-expression distinction in the digital context requires that copying for purely non-expressive purposes (also referred to as non-consumptive use), such as the automated extraction of data, should not be regarded as infringing. The non-expressive use of copyrighted works has tremendous potential social value by making search engines possible, and by providing an important data source for research in computational linguistics, automated translation, and natural language processing. Furthermore, the macro-analysis of text is being increasingly used in fields such as the study of literature itself. So long as digitization is confined to data processing applications that do not result in infringing expressive or consumptive uses of individual works, there is no orphan works problem because the exclusive rights of the copyright owner are limited to the expressive elements of their works and the expressive uses of their works.

    The New Legal Landscape for Text Mining and Machine Learning

    Now that the dust has settled on the Authors Guild cases, this Article takes stock of the legal context for TDM research in the United States. This reappraisal begins in Part I with an assessment of exactly what the Authors Guild cases did and did not establish with respect to the fair use status of text mining. Those cases held unambiguously that reproducing copyrighted works as one step in the process of knowledge discovery through text data mining was transformative, and thus ultimately a fair use of those works. Part I explains why those rulings followed inexorably from copyright's most fundamental principles. It also explains why the precedent set in the Authors Guild cases is likely to remain settled law in the United States. Parts II and III address legal considerations for would-be text miners and their supporting institutions beyond the core holding of the Authors Guild cases. The Google Books and HathiTrust cases held, in effect, that copying expressive works for non-expressive purposes was justified as fair use. This addresses the most significant issue for the legality of text data mining research in the United States; however, the legality of non-expressive use is far from the only legal issue that researchers and their supporting institutions must confront if they are to realize the full potential of these technologies. Neither case addressed issues arising under contract law, laws prohibiting computer hacking, laws prohibiting the circumvention of technological protection measures (i.e., encryption and other digital locks), or cross-border copyright issues. Furthermore, although Google Books addressed the display of snippets of text as part of the communication of search results, and both Authors Guild cases addressed security issues that might bear upon the fair use claim, those holdings were a product of the particular factual circumstances of those cases and can only be extended cautiously to other contexts.
Specifically, Part II surveys the legal status of TDM research in other important jurisdictions and explains some of the key differences between the law in the United States and the law in the European Union. It also explains how researchers can predict which law will apply in different situations. Part III sets out a four-stage model of the lifecycle of text data mining research and uses this model to identify and explain the relevant legal issues beyond the core holdings of the Authors Guild cases in relation to TDM as a non-expressive use.

    Internet Safe Harbors and the Transformation of Copyright Law

    This Article explores the potential displacement of substantive copyright law in the increasingly important online environment. In 1998, Congress enacted a system of intermediary safe harbors as part of the Digital Millennium Copyright Act (DMCA). The internet safe harbors and the associated system of notice-and-takedown fundamentally changed the incentives of platforms, users, and rightsholders in relation to claims of copyright infringement. These different incentives interact to yield a functional balance of copyright online that diverges markedly from the experience of copyright law in traditional media environments. More recently, private agreements between rightsholders and large commercial internet platforms have been made in the shadow of those safe harbors. These “DMCA-plus” agreements relate to automatic copyright filtering systems, such as YouTube’s Content ID, that not only return platforms to their gatekeeping role, but encode that role in algorithms and software. The normative implications of these developments are contestable. Fair use and other axioms of copyright law still nominally apply online, but in practice, the safe harbors and private agreements made in the shadow of those safe harbors are now far more important determinants of online behavior than whether that conduct is, or is not, substantively in compliance with copyright law. Substantive copyright law is not necessarily irrelevant online, but its relevance is indirect and contingent. The attenuated relevance of substantive copyright law to online expression has benefits and costs that appear fundamentally incommensurable. Compared to the offline world, online platforms are typically more permissive of infringement, and more open to new and unexpected speech and new forms of cultural participation. However, speech on these platforms is also more vulnerable to overreaching claims by rightsholders. 
There is no easy metric for comparing the value of noninfringing expression enabled by the safe harbors to that which has been unjustifiably suppressed by misuse of the notice-and-takedown system. Likewise, the harm that copyright infringement does to rightsholders is not easy to calculate, nor is it easy to weigh against the many benefits of the safe harbors. DMCA-plus agreements raise additional incommensurable potential costs and benefits. Automatic copyright enforcement systems have obvious advantages for both platforms and rightsholders: they may reduce the harm of copyright infringement; they may also allow platforms to be more hospitable to certain types of user content. However, automated enforcement systems may also place an undue burden on fair use and other forms of noninfringing speech. The design of copyright enforcement robots encodes a series of policy choices made by platforms and rightsholders and, as a result, subjects online speech and cultural participation to a new layer of private ordering and control. In the future, private interests, not public policy, will determine the conditions under which users get to participate in online platforms that adopt these systems. In a world where communication and expression is policed by copyright robots, the substantive content of copyright law matters only to the extent that those with power decide that it should matter.

    Copyright Trolling, An Empirical Study

    This detailed empirical and doctrinal study of copyright trolling presents new data showing the astonishing rate of growth of multi-defendant John Doe litigation in United States district courts over the past decade. It also presents new evidence of the association between this form of litigation and allegations of infringement concerning pornographic films. Multi-defendant John Doe lawsuits have become the most common form of copyright litigation in several U.S. districts, and in districts such as the Northern District of Illinois, copyright litigation involving pornography accounts for more than half of new cases. This Article highlights a fundamental oversight in the literature on copyright trolls. Paralleling discussions in patent law, scholars addressing the troll issue in copyright have applied status-based definitions to determine who is, and is not, a troll. This Article argues that the definition should be conduct based. Multi-defendant John Doe litigation should be considered copyright trolling whenever it is motivated by a desire to turn litigation into an independent revenue stream. Such litigation, when initiated with the aim of turning a profit in the courthouse as opposed to seeking compensation or deterring illegal activity, reflects a kind of systematic opportunism that fits squarely within the concept of litigation trolling. This Article shows that existing status-based definitions of copyright trolls do not account for what is now arguably the most prevalent form of trolling. In addition to these empirical and theoretical contributions, this Article shows how statutory damages and permissive joinder make multi-defendant John Doe litigation possible and why allegations of infringement concerning pornographic films are particularly well-suited to this model.

    Copyright Trolling, An Empirical Study

    This Article proceeds as follows: Part II locates multi-defendant John Doe (MDJD) suits within the broader context of the IP troll debate. It explains why attempts to define copyright trolls in terms of status—i.e., in terms of the plaintiff’s relationship to the underlying IP—are ultimately flawed and suggests a conduct-focused approach based on identifying systematic opportunism. Part II explains why MDJD lawsuits have all of the hallmarks of copyright trolling, and it explores the basic economics of MDJD litigation. It then presents empirical data documenting the astonishing rise of MDJD lawsuits over the past decade. Part III explores the role of statutory damages and permissive joinder in MDJD lawsuits in terms of the economic model developed in Part II. Part III also explains why the economics of this type of litigation are so well-suited to allegations of infringement concerning pornography and presents new data on the prevalence of pornography-related MDJD lawsuits. Part IV presents concrete proposals for copyright reform designed to make copyright trolling less attractive. This Part explains how, even in the absence of legislative reform, district court judges can exercise their discretion over joinder and early discovery to ensure that statutory damages are not excessive and to insist on various procedural safeguards.
