42,196 research outputs found

    EntiTables: Smart Assistance for Entity-Focused Tables

    Tables are among the most powerful and practical tools for organizing and working with data. Our motivation is to equip spreadsheet programs with smart assistance capabilities. We concentrate on one particular family of tables, namely, tables with an entity focus. We introduce and focus on two specific tasks: populating rows with additional instances (entities) and populating columns with new headings. We develop generative probabilistic models for both tasks. For estimating the components of these models, we consider a knowledge base as well as a large table corpus. Our experimental evaluation simulates the various stages of the user entering content into an actual table. A detailed analysis of the results shows that the models' components are complementary and that our methods outperform existing approaches from the literature. Comment: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17), 201

    Cardinal Virtues: Extracting Relation Cardinalities from Text

    Information extraction (IE) from text has largely focused on relations between individual entities, such as who has won which award. However, some facts are never fully mentioned, and no IE method has perfect recall. Thus, it is beneficial to also tap contents about the cardinalities of these relations, for example, how many awards someone has won. We introduce this novel problem of extracting cardinalities and discuss the specific challenges that set it apart from standard IE. We present a distant supervision method using conditional random fields. A preliminary evaluation results in precision between 3% and 55%, depending on the difficulty of relations. Comment: 5 pages, ACL 2017 (short paper)

    Drag it together with Groupie: making RDF data authoring easy and fun for anyone

    One of the foremost challenges towards realizing a “Read-write Web of Data” [3] is making it possible for everyday computer users to easily find, manipulate, create, and publish data back to the Web so that it can be made available for others to use. However, many aspects of Linked Data make authoring and manipulation difficult for “normal” (i.e., non-coder) end-users. First, data can be high-dimensional, having arbitrarily many properties per “instance”, and interlinked to arbitrarily many other instances in many different ways. Second, collections of Linked Data tend to be vastly more heterogeneous than in typical structured databases, where instances are kept in uniform collections (e.g., database tables). Third, while highly flexible, reducing all structures to a graph makes them verbose: even simple structures can appear complex. Finally, many of the concepts involved in Linked Data authoring, for example, the terms used to define ontologies, are highly abstract and foreign to regular citizen-users. To counter this complexity we have devised a drag-and-drop direct manipulation interface that makes authoring Linked Data easy, fun, and accessible to a wide audience. Groupie allows users to author data simply by dragging blobs representing entities into other entities to compose relationships, establishing one relational link at a time. Since the underlying representation is RDF, Groupie facilitates the inclusion of references to entities and properties defined elsewhere on the Web through integration with popular Linked Data indexing services. Finally, to make it easy for new users to build upon others’ work, Groupie provides a communal space where all data sets created by users can be shared, cloned and modified, allowing individual users to help each other model complex domains, thereby leveraging collective intelligence.

    PLuTO: MT for online patent translation

    PLuTO – Patent Language Translation Online – is a partially EU-funded commercialization project which specializes in the automatic retrieval and translation of patent documents. At the core of the PLuTO framework is a machine translation (MT) engine through which web-based translation services are offered. The fully integrated PLuTO architecture includes a translation engine coupling MT with translation memories (TM), and a patent search and retrieval engine. In this paper, we first describe the motivating factors behind the provision of such a service. Following this, we give an overview of the PLuTO framework as a whole, with particular emphasis on the MT components, and provide a real-world use case scenario in which PLuTO MT services are exploited.