5 research outputs found

    Modeling Worksets in the HathiTrust Research Center

    Report formally defining the notion of workset both generally and specifically within the context of the HTRC. See executive summary for full details. Mellon Reference Number 21300666

    Proposal for Persistent & Unique Entity Identifiers

    This proposal argues for the establishment of persistent and unique identifiers for page-level content. The page is a key conceptual entity within the HathiTrust Research Center (HTRC) framework. Volumes are composed of pages, and pages are the size of the portions of data that the HTRC’s analytics modules consume and execute algorithms across. The need for infrastructure that supports persistent and unique identity for page-level content is best described by seven use cases: 1. Persistent Citability: Scholars engaging in the analysis of HTRC resources have a clear need to cite those resources in a persistent manner independent of those resources’ relative positions within other entities. 2. Point-in-time Citability: Scholars engaging in the analysis of HTRC resources have a clear need to cite resources in an unambiguous way that is persistent with respect to time. 3. Reproducibility: Scholars need methods by which the resources that they cite can be shared so that their work conforms to the norms of peer review and reproducibility of results. 4. Supporting “non-consumptive” Usage: Anonymizing page-level content by disassociating it from the volumes that it is conceptually a part of increases the difficulty of leveraging HTRC analytics modules for the direct reproduction of HathiTrust (HT) content. 5. Improved Granularity: Since many features that scholars are interested in exist at the conceptual level of a page rather than at the level of a volume, unique page-level entities expand the types of methods by which worksets can be gathered and by which analytics modules can be constructed. 6. Expanded Workset Membership: In the near future we would like to empower scholars with options for creating worksets from arbitrary resources at arbitrary levels of granularity, including constructing worksets from collections of arbitrary pages. 7. Supporting Graph Representations: Unique identifiers for page-level content facilitate the creation of more conceptually accurate and functional graph representations of the HT corpus. There are several ways...

    Workset Creation for Scholarly Analysis and Data Capsules (WCSA+DC): Laying the foundations for secure computation with copyrighted data in the HathiTrust Research Center, Phase I

    The primary objective of the WCSA+DC project is the seamless integration of the workset model and tools with the Data Capsule framework to provide non-consumptive research access to HathiTrust’s massive corpus of data objects, securely and at scale, regardless of copyright status. That is, we plan to surmount the copyright wall on behalf of scholars and their students. Notwithstanding the substantial preliminary work that has been done on both the WCSA and DC fronts, they are both still best characterized as being in the prototyping stages. It is our intention that this proposed Phase I of the project devote an intense two-year burst of effort to move the suite of WCSA and DC prototypes from the realm of proof-of-concept to that of a firmly integrated at-scale deployment. We plan to concentrate our requested resources on making sure our systems are as secure and robust at scale as possible. Phase I will engage four external research partners. Two of the external partners, Kevin Page (Oxford) and Annika Hinze (Waikato), were recipients of WCSA prototyping sub-awards. We are very glad to propose extending and refining aspects of their prototyping work in the context of WCSA+DC. Two other scholars, Ted Underwood (Illinois) and James Pustejovsky (Brandeis), will play critical roles in Phase I as active participants in the development and refinement of the tools and systems from their particular user-scholar perspectives: Underwood, Digital Humanities (DH); Pustejovsky, Computational Linguistics (CL). The four key outcomes and benefits of the WCSA+DC, Phase I project are: 1. The deployment of a new Workset Builder tool that enhances search and discovery across the entire HTDL by complementing traditional volume-level bibliographic metadata with new metadata derived from a variety of sources at various levels of granularity. 2. The creation of Linked Open Data resources to help scholars find, select, integrate and disseminate a wider range of data as part of their scholarly analysis life-cycle. 3. A new Data Capsule framework that integrates worksets, runs at scale, and does both in a secure, non-consumptive manner. 4. A set of exemplar pre-built Data Capsules that incorporate tools commonly used by both the DH and CL communities that scholars can then customize to their specific needs. Andrew W. Mellon Foundation, grant no. 41500672

    Magnetic Skyrmions and Topological Domain Walls

    Whether in the compass in the early ages of mankind or in the hard disk drives that provide the vast quantities of memory that today's information technology demands, magnetism is a ubiquitous phenomenon with numerous applications in our everyday life. The comparatively young skyrmion, in turn, adds a new flavor to this extensively studied field with its topologically non-trivial structure. This cumulative dissertation explores the diverse phenomena that occur in magnets in which non-collinear magnetization textures are stabilized as a consequence of broken mirror or inversion symmetries of the material. These textures include long-ranged spirals, lattices of skyrmions, and polarized structures