EntiTables: Smart Assistance for Entity-Focused Tables
Tables are among the most powerful and practical tools for organizing and
working with data. Our motivation is to equip spreadsheet programs with smart
assistance capabilities. We concentrate on one particular family of tables,
namely, tables with an entity focus. We introduce and focus on two specific
tasks: populating rows with additional instances (entities) and populating
columns with new headings. We develop generative probabilistic models for both
tasks. For estimating the components of these models, we consider a knowledge
base as well as a large table corpus. Our experimental evaluation simulates the
various stages of the user entering content into an actual table. A detailed
analysis of the results shows that the models' components are complementary and
that our methods outperform existing approaches from the literature. Comment: Proceedings of the 40th International ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR '17), 201
Cardinal Virtues: Extracting Relation Cardinalities from Text
Information extraction (IE) from text has largely focused on relations
between individual entities, such as who has won which award. However, some
facts are never fully mentioned, and no IE method has perfect recall. Thus, it
is beneficial to also tap contents about the cardinalities of these relations,
for example, how many awards someone has won. We introduce this novel problem
of extracting cardinalities and discuss the specific challenges that set it
apart from standard IE. We present a distant supervision method using
conditional random fields. A preliminary evaluation results in precision
between 3% and 55%, depending on the difficulty of relations. Comment: 5 pages, ACL 2017 (short paper
Drag it together with Groupie: making RDF data authoring easy and fun for anyone
One of the foremost challenges towards realizing a "Read-write Web of Data" [3] is making it possible for everyday computer users to easily find, manipulate, create, and publish data back to the Web so that it can be made available for others to use. However, many aspects of Linked Data make authoring and manipulation difficult for "normal" (i.e., non-coder) end-users. First, data can be high-dimensional, having arbitrarily many properties per "instance", and interlinked to arbitrarily many other instances in many different ways. Second, collections of Linked Data tend to be vastly more heterogeneous than in typical structured databases, where instances are kept in uniform collections (e.g., database tables). Third, reducing all structures to a graph, while highly flexible, leads to verbosity: even simple structures can appear complex. Finally, many of the concepts involved in Linked Data authoring (for example, terms used to define ontologies) are highly abstract and foreign to regular citizen-users. To counter this complexity we have devised a drag-and-drop direct manipulation interface that makes authoring Linked Data easy, fun, and accessible to a wide audience. Groupie allows users to author data simply by dragging blobs representing entities into other entities to compose relationships, establishing one relational link at a time. Since the underlying representation is RDF, Groupie facilitates the inclusion of references to entities and properties defined elsewhere on the Web through integration with popular Linked Data indexing services. Finally, to make it easy for new users to build upon others' work, Groupie provides a communal space where all data sets created by users can be shared, cloned and modified, allowing individual users to help each other model complex domains, thereby leveraging collective intelligence
Weather, climate, and hydrologic forecasting for the US Southwest: A survey
As part of a regional integrated assessment of climate vulnerability, a survey was conducted from June 1998 to May 2000 of weather, climate, and hydrologic forecasts with coverage of the US Southwest and an emphasis on the Colorado River Basin. The survey addresses the types of forecasts that were issued, the organizations that provided them, and techniques used in their generation. It reflects discussions with key personnel from organizations involved in producing or issuing forecasts, providing data for making forecasts, or serving as a link for communicating forecasts. During the survey period, users faced a complex and constantly changing mix of forecast products available from a variety of sources. The abundance of forecasts was not matched in the provision of corresponding interpretive materials, documentation about how the forecasts were generated, or reviews of past performance. Potential existed for confusing experimental and research products with others that had undergone a thorough review process, including official products issued by the National Weather Service. Contrasts between the state of meteorologic and hydrologic forecasting were notable, especially in the former's greater operational flexibility and more rapid incorporation of new observations and research products. Greater attention should be given to forecast content and communication, including visualization, expression of probabilistic forecasts and presentation of ancillary information. Regional climate models and use of climate forecasts in water supply forecasting offer rapid improvements in predictive capabilities for the Southwest. Forecasts and production details should be archived, and publicly available forecasts should be accompanied by performance evaluations that are relevant to users
PLuTO: MT for online patent translation
PLuTO (Patent Language Translation Online) is a partially EU-funded commercialization project which specializes in the automatic retrieval and translation of patent documents. At the core of the PLuTO framework is a machine translation (MT) engine through which web-based translation services are offered. The fully integrated PLuTO architecture includes a translation engine coupling MT with translation memories (TM), and a patent search and retrieval engine. In this paper, we first describe the motivating factors behind the provision of such a service. Following this, we give an overview of the PLuTO framework as a whole, with particular emphasis on the MT components, and provide a real-world use case scenario in which PLuTO MT services are exploited