    Understanding Organizational Responses to Innovative Deviance: A Case Study of HathiTrust.

    This thesis traces the emergence and evolution of HathiTrust as a way of generating deeper insights into the processes of sociotechnical transformation. HathiTrust emerged from the groundbreaking and legally contentious Google mass digitization project as an organization operated by the University of Michigan. It grew into a partnership of more than 100 research institutions that support a shared digital repository, oversee a digital library of more than thirteen million volumes, and run a research center for non-consumptive computational research. This dissertation combines traditional legal research and analysis with social scientific approaches. Primary data for the case study were generated from in-depth interviews and review of relevant documents such as contracts, judicial opinions, press releases, and organizational reports. The dissertation develops an analytic framework blending the sociological concept of innovative deviance with organizational sensemaking theories and copyright doctrine. It describes and explains how and why organizations make sense of, and make decisions about, risk and opportunity under conditions of uncertainty, ambiguity, and disequilibrium. This in turn explains how slow-moving institutions such as laws and academic research libraries change and adapt in response to changes in technology and social practice. It describes the dynamic, non-linear, and mutually constitutive relationships among technology, social practice, and law that shaped and were shaped by HathiTrust. In so doing, it offers insights into the processes of sociotechnical transformation.
    PhD, Information. University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133351/1/acentiva_1.pd

    Workset Creation for Scholarly Analysis and Data Capsules (WCSA+DC): Laying the foundations for secure computation with copyrighted data in the HathiTrust Research Center, Phase I

    The primary objective of the WCSA+DC project is the seamless integration of the workset model and tools with the Data Capsule framework to provide non-consumptive research access to HathiTrust’s massive corpus of data objects, securely and at scale, regardless of copyright status. That is, we plan to surmount the copyright wall on behalf of scholars and their students. Notwithstanding the substantial preliminary work that has been done on both the WCSA and DC fronts, both are still best characterized as being in the prototyping stage. We intend for this proposed Phase I of the project to devote an intense two-year burst of effort to moving the suite of WCSA and DC prototypes from the realm of proof-of-concept to that of a firmly integrated, at-scale deployment. We plan to concentrate our requested resources on making our systems as secure and robust at scale as possible. Phase I will engage four external research partners. Two of the external partners, Kevin Page (Oxford) and Annika Hinze (Waikato), were recipients of WCSA prototyping sub-awards; we are very glad to propose extending and refining aspects of their prototyping work in the context of WCSA+DC. Two other scholars, Ted Underwood (Illinois) and James Pustejovsky (Brandeis), will play critical roles in Phase I as active participants in the development and refinement of the tools and systems from their particular user-scholar perspectives: Underwood, Digital Humanities (DH); Pustejovsky, Computational Linguistics (CL). The four key outcomes and benefits of the WCSA+DC, Phase I project are: 1. The deployment of a new Workset Builder tool that enhances search and discovery across the entire HathiTrust Digital Library (HTDL) by complementing traditional volume-level bibliographic metadata with new metadata derived from a variety of sources at various levels of granularity. 2. The creation of Linked Open Data resources to help scholars find, select, integrate, and disseminate a wider range of data as part of their scholarly analysis life-cycle. 3. A new Data Capsule framework that integrates worksets, runs at scale, and does both in a secure, non-consumptive manner. 4. A set of exemplar pre-built Data Capsules that incorporate tools commonly used by both the DH and CL communities and that scholars can then customize to their specific needs.
    Andrew W. Mellon Foundation, grant no. 41500672
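    The Data Capsule framework described above can be pictured as a sandbox in which analysis code sees the full text, while only aggregate, non-consumptive results are allowed out. The following minimal Python sketch illustrates that export-policy idea; the names (run_capsule, export_check) and the toy policy itself are hypothetical illustrations, not the actual HTRC implementation.

    from collections import Counter
    from typing import Callable

    def word_frequencies(pages: list[str]) -> Counter:
        """Example analysis: token counts across a volume's pages."""
        counts: Counter = Counter()
        for page in pages:
            counts.update(page.lower().split())
        return counts

    def export_check(result: object) -> bool:
        """Toy export policy: allow aggregates, refuse long runs of text
        that could reproduce substantial passages of the source."""
        if isinstance(result, str) and len(result.split()) > 20:
            return False
        if isinstance(result, dict):
            return all(export_check(k) and export_check(v)
                       for k, v in result.items())
        return True

    def run_capsule(analysis: Callable[[list[str]], object],
                    pages: list[str]) -> object:
        """Run an analysis inside the 'capsule'; release only results
        that pass the non-consumptive export policy."""
        result = analysis(pages)
        if not export_check(result):
            raise PermissionError("export blocked: result is not non-consumptive")
        return result

    volume = ["it was the best of times", "it was the worst of times"]
    print(run_capsule(word_frequencies, volume).most_common(2))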

    Data Mining Research with In-copyright and Use-limited Text Datasets: Preliminary Findings from a Systematic Literature Review and Stakeholder Interviews

    Text data mining and analysis has emerged as a viable research method for scholars, following the growth of mass digitization, digital publishing, and scholarly interest in data re-use. Yet the texts that comprise datasets for analysis are frequently protected by copyright or other intellectual property rights that limit their access and use. This paper discusses the role of libraries at the intersection of data mining and intellectual property, asserting that academic libraries are vital partners in enabling scholars to effectively incorporate text data mining into their research. We report on activities leading up to an IMLS-funded National Forum of stakeholders and discuss preliminary findings from a systematic literature review, as well as initial results of interviews with forum stakeholders. Emerging themes suggest the need for a multi-pronged, distributed approach that includes a public campaign for building awareness and advocacy, development of best-practice guides for library support services and training, and international efforts toward data standardization and copyright harmonization.
    Institute of Museum and Library Services (LG-73-17-0070-17)
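    One recurring pattern in this space is sharing derived, non-expressive features (for example, n-gram counts) rather than the in-copyright text itself. The short Python sketch below is an illustrative assumption of that practice, not code from the study:

    from collections import Counter

    def ngram_counts(text: str, n: int = 2) -> Counter:
        """Count word n-grams; the counts can typically travel where
        the underlying copyrighted text cannot."""
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    sample = "copyright limits access but derived features can be shared"
    print(ngram_counts(sample).most_common(3))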

    Workset Creation for Scholarly Analysis: Prototyping Project

    Scholars rely on library collections to support their scholarship. Out of these collections, scholars select, organize, and refine the worksets that will address their particular research objectives. The requirements for those worksets are becoming increasingly sophisticated and complex, both as humanities scholarship has become more interdisciplinary and as it has become more digital. The HathiTrust is a repository that centrally collects image and text representations of library holdings digitized by the Google Books project and other mass-digitization efforts. The HathiTrust's computational infrastructure is being built to support large-scale manipulation and preservation of these representations, but it organizes them according to catalog records that were created to enable users to find books in a building or to make high-level generalizations about duplicate holdings across libraries. These catalog records were never meant to support the granularity of sorting and selection of works that scholars now expect, much less page-level or chapter-level sorting and selection out of a corpus of billions of pages. The ability to slice through a massive corpus consisting of many different library collections, and out of that to construct the precise workset required for a particular scholarly investigation, is the “game changing” potential of the HathiTrust; understanding how to do that is a research problem, and one of keen interest to the HathiTrust Research Center (HTRC), since we believe that scholarship begins with the selection of appropriate resources. Given the unprecedented size and scope of the HathiTrust corpus, in conjunction with the HTRC's unique computational access to copyrighted materials, we propose a project that will engage scholars in designing tools for exploration, location, and analytic grouping of materials so they can routinely conduct computational scholarship at scale, based on meaningful worksets. “Workset Creation for Scholarly Analysis: Prototyping Project” (WCSA) seeks to address three sets of tightly intertwined research questions regarding 1) enriching the metadata in the HathiTrust corpus, 2) augmenting string-based metadata with URIs to leverage discovery and sharing through external services, and 3) formalizing the notion of collections and worksets in the context of the HathiTrust Research Center. Building upon the model of the Open Annotation Collaboration, the HTRC proposes to release an open, competitive Request for Proposals with the intent to fund four prototyping projects that will build tools for enriching and augmenting metadata for the HathiTrust corpus. Concurrently, the HTRC will work closely with the Center for Informatics Research in Science and Scholarship (CIRSS) to develop and instantiate a set of formal data models that will be used to capture and integrate the outputs of the funded prototyping projects with the larger HathiTrust corpus.
    Andrew W. Mellon Foundation, grant no. 21300666
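    To make the workset notion concrete, the sketch below models a workset as a named, citable selection of volume identifiers drawn from a catalog by metadata criteria. The data model and toy catalog are assumptions for illustration, not the formal models being developed with CIRSS.

    from dataclasses import dataclass, field

    @dataclass
    class Workset:
        name: str
        creator: str
        criteria: dict[str, str]
        volume_uris: list[str] = field(default_factory=list)

    def build_workset(catalog: list[dict], name: str, creator: str,
                      **criteria: str) -> Workset:
        """Select volumes whose metadata match every criterion."""
        uris = [rec["uri"] for rec in catalog
                if all(rec.get(k) == v for k, v in criteria.items())]
        return Workset(name, creator, criteria, uris)

    catalog = [
        {"uri": "hdl:2027/mdp.001", "language": "eng", "genre": "fiction"},
        {"uri": "hdl:2027/mdp.002", "language": "fre", "genre": "fiction"},
    ]
    ws = build_workset(catalog, "C19 English fiction", "a.scholar",
                       language="eng", genre="fiction")
    print(ws.volume_uris)  # ['hdl:2027/mdp.001']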

    The “non-consumptive research uses” of digital resources

    This note is not a legal opinion but an analysis of the legal situation surrounding the question of non-consumptive research uses. Non-consumptive research is research in which the only use the researcher makes of a digital resource is to run computational analysis over it, rather than to read and “humanly” comprehend substantial portions of it so as to intellectually assimilate its content. Examples of non-consumptive research uses include text extraction, automated text analysis, machine translation, automatically generated statistical summaries and reports, and automatic indexing. Legal consequences of this notion in Anglo-American law: since 2009, non-consumptive research has been treated as fair use, that is, a use requiring no specific authorization. This point is contested by rightsholders’ representatives but is widely relied upon by university libraries to authorize their researchers to use digital resources for non-consumptive research. Adaptation of the notion in the context of French/European copyright law: les non-consumptiv
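    Automatic indexing, one of the examples listed above, shows why such uses are called non-consumptive: the index records where words occur without reproducing any substantial passage. A minimal Python sketch, with hypothetical names:

    from collections import defaultdict

    def build_inverted_index(docs: dict[str, str]) -> dict[str, set[str]]:
        """Map each token to the set of document ids containing it;
        no running text is retained."""
        index: dict[str, set[str]] = defaultdict(set)
        for doc_id, text in docs.items():
            for token in set(text.lower().split()):
                index[token].add(doc_id)
        return dict(index)

    docs = {"d1": "fair use permits computational analysis",
            "d2": "computational research reads nothing humanly"}
    print(sorted(build_inverted_index(docs)["computational"]))  # ['d1', 'd2']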

    Les "non-consumptive research uses\u27\u27 des ressources numériques

    Get PDF
    Cette note n’est pas un avis juridique mais une analyse de situation légale autour de la question des non-consumptive research uses. La non-consumptive research est la recherche où le seul usage que le chercheur fait d’une ressource numérique est d’y appliquer une analyse informatique (computationnelle), et non d’en lire et comprendre « humainement » des parties substantielles afin d’assimiler intellectuellement son contenu. Exemples de non-consumptive research use : extraction de texte, analyse automatisée de texte, traduction automatique, synthèse et rapport statistique généré automatiquement, indexation automatique… Conséquences légales de cette notion dans le droit anglo-saxon : depuis 2009, la non-consumptive research est assimilée à un fair use, c’est-à-dire un usage ne nécessitant pas d’autorisation spécifique. Ce point est contesté par les représentants des ayants-droits, mais largement utilisé par les bibliothèques universitaires pour autoriser leurs chercheurs à utiliser les ressources numériques à des fins de non-consumptive research. Adaptation de la notion en contexte de droit d’auteur français/européen : les non-consumptiv

    Toward open computational communication science: A practical road map for reusable data and code

    Computational communication science (CCS) offers an opportunity to accelerate the scope and pace of discovery in communication research. This article argues that CCS will profit from adopting open science practices by fostering the reusability of data and code. We discuss the goals and challenges related to creating reusable data and code and offer practical guidance to individual researchers for achieving this. More specifically, we argue for integration of the research process into reusable workflows and for recognition of tools and data as academic work. The challenges and road map are also critically discussed in terms of the additional burden they place on individual scholars, which culminates in a call to action for the field to support and incentivize the reusability of tools and data.
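    As a concrete illustration of the reusable-workflow advice (an assumed structure, not the article's own code), analysis logic can live in an importable, documented function with explicit inputs and outputs, so that the script entry point only wires in a particular dataset:

    from collections import Counter
    from typing import Iterable

    def hashtag_frequencies(messages: Iterable[str]) -> Counter:
        """Count hashtags across messages. Keeping the step pure
        (data in, data out, no hard-coded paths) lets other
        researchers rerun it on their own corpora."""
        tags: Counter = Counter()
        for msg in messages:
            tags.update(tok for tok in msg.lower().split()
                        if tok.startswith("#"))
        return tags

    sample = ["replication matters #openscience", "#openscience #ccs tools"]
    print(hashtag_frequencies(sample).most_common(2))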

    An Economic Theory of Infrastructure and Commons Management

    In this article, Professor Frischmann brings together a number of current debates across many disciplinary lines, all of which examine from different perspectives whether certain resources should be managed through a regime of private property or through a regime of open access. Frischmann develops and applies a theory demonstrating that there are strong economic arguments for managing and sustaining openly accessible infrastructure. The approach he takes differs from conventional analyses in that he focuses extensively on demand-side considerations and fully explores how infrastructure resources generate value for consumers and society. As a result, the theory brings into focus the social value of common infrastructure and strongly suggests that the benefits of open access (and the costs of restricted access) are significantly greater than current debates reflect. Frischmann's infrastructure theory ultimately ties together different strands of legal and economic thought pertaining to natural resources such as lakes, traditional infrastructure such as road systems, what antitrust theorists describe as essential facilities, basic scientific research, and the Internet. His theory has significant potential to reframe a number of important debates within intellectual property, cyberlaw, telecommunications, and many other areas. Note: Professor Lawrence Lessig will publish a Reply, titled Re-Marking the Progress in Frischmann, in the same edition of the Minnesota Law Review.