
    Defining Textual Entailment

    Textual entailment is a relationship that obtains between fragments of text when one fragment in some sense implies the other. The automation of textual entailment recognition supports a wide variety of text-based tasks, including information retrieval, information extraction, question answering, text summarization, and machine translation. Much ingenuity has been devoted to developing algorithms for identifying textual entailments, but relatively little to saying what textual entailment actually is. This article reviews the logical and philosophical issues involved in providing an adequate definition of textual entailment. We show that many natural definitions of textual entailment are refuted by counterexamples, including the most widely cited definition of Dagan et al. We then articulate and defend the following revised definition: T textually entails H =df typically, a human reading T would be justified in inferring the proposition expressed by H from the proposition expressed by T. We also show that textual entailment is context-sensitive, nontransitive, and nonmonotonic.

    Modeling Worksets in the HathiTrust Research Center

    Report formally defining the notion of workset both generally and specifically within the context of the HTRC. See the executive summary for full details. Mellon Reference Number 21300666.

    Exploring the Benefits for Users of Linked Open Data for Digitized Special Collections: Google Analytics data summary

    In addition to one-on-one user interactions and a planned focus group, two further assessment methods, site traffic data gathered from Google Analytics and test queries using Google’s search engine, were used to produce supplementary benchmark data. The sections below summarize the facts observed from these two data collections.

    To Map or Not to Map: Rethinking Crosswalk Agendas

    In the two decades since their publication, the Functional Requirements for Bibliographic Records and succeeding standards such as the Library Reference Model have had a marked impact on discourse concerning descriptive theory and practice. The BIBFRAME model, which began as an effort to replace MARC with a linked data-capable modeling format, offers an alternate view of the bibliographic universe with three principal entities rather than four. Differences between BIBFRAME and LRM are based on competing intuitions about the nature of creative works, and at first the two approaches appear to compete for the same intellectual space. BIBFRAME offers a less constrained model of bibliographic descriptions than the FRBR models, and if interoperability between BIBFRAME and WEMI-aligned standards like Resource Description and Access requires translation of RDA records both to and from BIBFRAME descriptions, then the latter’s flexibility poses problems for mapping between the models. Proposed solutions to those problems reveal as much about different modeling philosophies as they do about different views of creative works and their relationships to texts and copies. Linked data protocols are intended to support resources and scenarios far too diverse for either a single account of creative works or a subsumption-based taxonomy of models. But a need for descriptions flexible enough to include them all does not require us to retreat from modeling commitments to either reductionism or operationalism. BIBFRAME can instead be seen as reaching toward a descriptive domain that plays a role complementary to the IFLA standards.

    When conceptual models collide: aggregates in IFLA's Library Reference Model

    IFLA’s Library Reference Model defines manifestations as sets of carriers sharing relevant physical and intentional properties, and aggregates as manifestations that embody multiple expressions. Taken together, these accounts pose consistency problems for some manifestation-level properties, and for the constraint that an item exemplifies exactly one manifestation.

    Enhancing Cultural Heritage Collections by Supporting and Analyzing Participation in Flickr

    Cultural heritage institutions can enhance their collections by sharing content through popular web services. Drawing on current analyses from the Flickr Feasibility Study, we report on the pronounced increase in use of the IMLS DCC Flickr Photostream in the past year, trends in how users are engaging with the content, and data provider perspectives on participation in Flickr through the DCC. In addition to users providing comments and tags for images, they are increasingly integrating historical images from libraries and museums into new digital objects and special collections. Intermediary services can fill a key role in lowering the burden for institutions to engage in Web 2.0 initiatives and broadening public access to cultural heritage content. To extend the scope of the current DCC services, we propose a feedback framework for transferring user-generated information to institutional data providers. IMLS LG-06-07-0020. Published or submitted for publication; peer reviewed.

    Exploring the Benefits for Users of Linked Open Data for Digitized Special Collections: Benchmark case studies of two digital library websites

    This report presents the results from a pair of case studies conducted as part of the Exploring the benefits for users of Linked Open Data for digitized special collections project. Each case study was produced from a series of interviews with users of digital special collections. The case studies compare the Motley Collection of Theatre & Costume Design (Motley) to the Harvard Theatre Collection and the Kolb-Proust Archive for Research (KPA) to the Bovary Manuscript Archive, respectively. Each of the users was a volunteer and was asked to compare two digital collection websites to one another during the course of completing a series of user tasks, which included assessing the overall layout and utility of each digital collection’s interface, searching for a specific resource, and characterizing how they might employ the collections in their research. Andrew W. Mellon Foundation, Award No. 31500650.

    Disambiguating Descriptions: Mapping Digital Special Collections Metadata into Linked Open Data Formats

    In this poster we describe the Linked Open Data (LOD) for Digital Special Collections project at the University of Illinois at Urbana-Champaign and discuss some of the particular challenges that legacy metadata poses for representation in LOD formats. LOD formats are primarily based on the World Wide Web Consortium’s Resource Description Framework standard, which demands both that entities be named by opaque universal identifiers whenever possible and that metadata descriptions for entities be as unambiguous as possible. The challenges for disambiguating those descriptions are illustrated through examples drawn from digital special collections based at four different digital libraries.

    Conceptualizing worksets for non-consumptive research

    The HathiTrust (HT) digital library comprises 4 billion pages (composing 11 million volumes). The HathiTrust Research Center (HTRC) – a unique collaboration between University of Illinois and Indiana University – is developing tools to connect scholars to this large and diverse corpus. This poster discusses HTRC’s activities surrounding the discovery, formation, and optimization of useful analytic subsets of the HT corpus (i.e., workset creation and use). As part of this development we are prototyping an RDF-based triple store designed to record and serialize metadata describing worksets and the bibliographic entities that are collected within them. At the heart of this work is the construction of a formal conceptual model that captures sufficient descriptive information about worksets, including provenance, curatorial intent, and other useful metadata, so that digital humanities scholars can more easily select, group, and cite their research data collections based upon HT and external corpora. The prototype’s data model is being designed to be extensible and to fit well within the Linked Open Data community.
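The idea of recording workset membership, provenance, and curatorial intent as triples can be sketched in a few lines of plain Python. This is a minimal illustration of the triple-store pattern only; the namespace, predicate names, and URIs below are hypothetical and do not reflect the HTRC prototype's actual schema.

```python
# Hypothetical workset described as a set of (subject, predicate, object)
# triples, in the style of an RDF triple store.
WS = "http://example.org/worksets/ws1"  # hypothetical workset URI

triples = {
    (WS, "rdf:type", "htrc:Workset"),
    (WS, "dcterms:creator", "A. Scholar"),
    (WS, "dcterms:description", "Volumes selected for topic modeling"),
    # Provenance and curatorial intent ride along as ordinary triples.
    (WS, "prov:wasDerivedFrom", "http://example.org/ht/corpus"),
    (WS, "htrc:member", "http://example.org/ht/volumes/vol123"),
    (WS, "htrc:member", "http://example.org/ht/volumes/vol456"),
}

def members(graph, workset):
    """Return the bibliographic entities collected in a workset."""
    return sorted(o for s, p, o in graph if s == workset and p == "htrc:member")

print(members(triples, WS))
```

Because every statement is an independent triple, the model stays extensible: new provenance or descriptive predicates can be added without altering existing data, which is what makes the approach a natural fit for Linked Open Data.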

    Proposal for Persistent & Unique Entity Identifiers

    This proposal argues for the establishment of persistent and unique identifiers for page-level content. The page is a key conceptual entity within the HathiTrust Research Center (HTRC) framework: volumes are composed of pages, and pages are the units of data that the HTRC’s analytics modules consume and execute algorithms across. The need for infrastructure that supports persistent and unique identity for pages is best described by seven use cases:
    1. Persistent Citability: Scholars engaging in the analysis of HTRC resources have a clear need to cite those resources in a persistent manner independent of those resources’ relative positions within other entities.
    2. Point-in-time Citability: Scholars engaging in the analysis of HTRC resources have a clear need to cite resources in an unambiguous way that is persistent with respect to time.
    3. Reproducibility: Scholars need methods by which the resources that they cite can be shared so that their work conforms to the norms of peer review and reproducibility of results.
    4. Supporting “non-consumptive” Usage: Anonymizing page-level content by disassociating it from the volumes that it is conceptually a part of increases the difficulty of leveraging HTRC analytics modules for the direct reproduction of HathiTrust (HT) content.
    5. Improved Granularity: Since many features that scholars are interested in exist at the conceptual level of a page rather than at the level of a volume, unique page-level entities expand the types of methods by which worksets can be gathered and by which analytics modules can be constructed.
    6. Expanded Workset Membership: In the near future we would like to empower scholars with options for creating worksets from arbitrary resources at arbitrary levels of granularity, including constructing worksets from collections of arbitrary pages.
    7. Supporting Graph Representations: Unique identifiers for page-level content facilitate the creation of more conceptually accurate and functional graph representations of the HT corpus.
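One way such page-level identifiers could be minted is by hashing a (volume, page, version) triple into an opaque string. This is a sketch under stated assumptions only: the identifier scheme, the volume identifier format, and the version string below are hypothetical, not the HTRC's actual design.

```python
# Minimal sketch of minting opaque, persistent page-level identifiers.
# The scheme and the example volume ID are hypothetical illustrations.
import hashlib

def page_identifier(volume_id: str, page_seq: int, version: str) -> str:
    """Build an opaque, unique identifier for one page of one volume.

    Hashing the (volume, page, version) triple yields an identifier that
    is deterministic (the same inputs always map to the same identifier,
    supporting persistent and point-in-time citation) and opaque (the
    string itself reveals nothing about the page's position in a volume).
    """
    key = f"{volume_id}|{page_seq}|{version}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()[:16]

pid = page_identifier("mdp.39015012345678", 42, "2013-06-01")
print(pid)
```

Including a version component in the hash is what gives point-in-time citability: if a page image is rescanned or its OCR is corrected, the new version mints a new identifier while the old one continues to denote the earlier state.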