6 research outputs found

    Workset Creation for Scholarly Analysis: Recommendations and Prototyping Project Reports

    Get PDF
    This document assembles and describes the outcomes of the four prototyping projects undertaken as part of the Workset Creation for Scholarly Analysis (WCSA) research project (2013 – 2015). Each prototyping project team provided its own final report. These reports are assembled together and included in this document. Based on the totality of results reported, the WCSA project team also provide a set of overarching recommendations for HTRC implementation and adoption of research conducted by the Prototyping Project teams. The work described here was made possible through the generous support of The Andrew W. Mellon Foundation (Grant Ref # 21300666).The Andrew W. Mellon Foundation (Grant Ref # 21300666)Ope

    Workset Creation for Scholarly Analysis and Data Capsules (WCSA+DC): Laying the foundations for secure computation with copyrighted data in the HathiTrust Research Center, Phase I

    Get PDF
    The primary objective of the WCSA+DC project is the seamless integration of the workset model and tools with the Data Capsule framework to provide non-consumptive research access HathiTrust’s massive corpus of data objects, securely and at scale, regardless of copyright status. That is, we plan to surmount the copyright wall on behalf of scholars and their students. Notwithstanding the substantial preliminary work that has been done on both the WCSA and DC fronts, they are both still best characterized as being in the prototyping stages. It is our intention to that this proposed Phase I of the project devote an intense two-year burst of effort to move the suite of WCSA and DC prototypes from the realm of proof-of-concept to that of a firmly integrated at-scale deployment. We plan to concentrate our requested resources on making sure our systems are as secure and robust at scale as possible. Phase I will engage four external research partners. Two of the external partners, Kevin Page (Oxford) and Annika Hinze (Waikato) were recipients of WCSA prototyping sub-awards. We are very glad to propose extending and refining aspects of their prototyping work in the context of WCSA+DC. Two other scholars, Ted Underwood (Illinois) and James Pustejovsky (Brandeis) will play critical roles in Phase I as active participants in the development and refinement of the tools and systems from their particular user-scholar perspectives: Underwood, Digital Humanities (DH); Pustejovsky, Computational Linguistics (CL). The four key outcomes and benefits of the WCSA+DC, Phase I project are: 1. The deployment of a new Workset Builder tool that enhances search and discovery across the entire HTDL by complementing traditional volume-level bibliographic metadata with new metadata derived from a variety of sources at various levels granularity. 2. The creation of Linked Open Data resources to help scholars find, select, integrate and disseminate a wider range of data as part of their scholarly analysis life-cycle. 3. A new Data Capsule framework that integrates worksets, runs at scale, and does both in a secure, non-consumptive, manner. 4. A set of exemplar pre-built Data Capsules that incorporate tools commonly used by both the DH and CL communities that scholars can then customize to their specific needs.Andrew W. Mellon Foundation, grant no. 41500672Ope

    Modeling Worksets in the HathiTrust Research Center

    Get PDF
    Report formally defining the notion of workset both generally and specifically within the context of the HTRC. See executive summary for full details.Mellon Reference Number 21300666Ope

    Workset Creation for Scholarly Analysis: Prototyping Project

    Get PDF
    Scholars rely on library collections to support their scholarship. Out of these collections, scholars select, organize, and refine the worksets that will answer to their particular research objectives. The requirements for those worksets are becoming increasingly sophisticated and complex, both as humanities scholarship has become more interdisciplinary and as it has become more digital. The HathiTrust is a repository that centrally collects image and text representations of library holdings digitized by the Google Books project and other mass-digitization efforts. The HathiTrust's computational infrastructure is being built to support large-scale manipulation and preservation of these representations, but it organizes them according to catalog records that were created to enable users to find books in a building or to make high-level generalizations about duplicate holdings across libraries, etc. These catalog records were never meant to support the granularity of sorting and selection or works that scholars now expect, much less page-level or chapter-level sorting and selection out of a corpus of billions of pages. The ability to slice through a massive corpus consisting of many different library collections, and out of that to construct the precise workset required for a particular scholarly investigation, is the “game changing” potential of the HathiTrust; understanding how to do that is a research problem, and one that is keenly of interest to the HathiTrust Research Center (HTRC), since we believe that scholarship begins with the selection of appropriate resources. Given the unprecedented size and scope of the HathiTrust corpus—in conjunction with the HTRC’s unique computational access to copyrighted materials—we are proposing a project that will engage scholars in designing tools for exploration, location, and analytic grouping of materials so they can routinely conduct computational scholarship at scale, based on meaningful worksets. “Workset Creation for Scholarly Analysis: Prototyping Project” (WCSA) seeks to address three sets of tightly intertwined research questions regarding 1) enriching the metadata in the HathiTrust corpus, 2) augmenting string-based metadata with URIs to leverage discovery and sharing through external services, and 3) formalizing the notion of collections and worksets in the context of the HathiTrust Research Center. Building upon the model of the Open Annotation Collaboration, the HTRC proposes to release an open, competitive Request for Proposals with the intent to fund four prototyping projects that will build tools for enriching and augmenting metadata for the HathiTrust corpus. Concurrently, the HTRC will work closely with the Center for Informatics Research in Science and Scholarship (CIRSS) to develop and instantiate a set of formal data models that will be used to capture and integrate the outputs of the funded prototyping projects with the larger HathiTrust corpus.Andrew W. Mellon Foundation, grant no. 21300666Ope

    Cultural Heritage on line

    Get PDF
    The 2nd International Conference "Cultural Heritage online – Empowering users: an active role for user communities" was held in Florence on 15-16 December 2009. It was organised by the Fondazione Rinascimento Digitale, the Italian Ministry for Cultural Heritage and Activities and the Library of Congress, through the National Digital Information Infrastructure and Preservation Program - NDIIP partners. The conference topics were related to digital libraries, digital preservation and the changing paradigms, focussing on user needs and expectations, analysing how to involve users and the cultural heritage community in creating and sharing digital resources. The sessions investigated also new organisational issues and roles, and cultural and economic limits from an international perspective
    corecore