EntiTables: Smart Assistance for Entity-Focused Tables
Tables are among the most powerful and practical tools for organizing and
working with data. Our motivation is to equip spreadsheet programs with smart
assistance capabilities. We concentrate on one particular family of tables,
namely, tables with an entity focus. We introduce and focus on two specific
tasks: populating rows with additional instances (entities) and populating
columns with new headings. We develop generative probabilistic models for both
tasks. For estimating the components of these models, we consider a knowledge
base as well as a large table corpus. Our experimental evaluation simulates the
various stages of the user entering content into an actual table. A detailed
analysis of the results shows that the models' components are complementary and
that our methods outperform existing approaches from the literature.
Comment: Proceedings of the 40th International ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR '17), 201
Topic modeling for entity linking using keyphrase
This paper proposes an entity linking system that applies topic-modeling-based ranking. We apply a novel approach to supply the model with new relevant elements: keyphrases related to the queries, gathered from a large Wikipedia-based knowledge resource.
Peer Reviewed. Postprint (author’s final draft)
Relation Discovery from Web Data for Competency Management
This paper describes a technique for automatically discovering associations between people and expertise from an analysis of very large data sources (including web pages, blogs and emails), using a family of algorithms that perform accurate named-entity recognition, assign different weights to terms according to an analysis of document structure, and assess distances between terms in a document. My contribution is to add a social networking approach called BuddyFinder, which relies on associations within a large enterprise-wide "buddy list" to help delimit the search space and also to provide a form of 'social triangulation' whereby the system can discover documents from your colleagues that contain pertinent information about you. This work has been influential in the information retrieval community generally, as it is the basis of a landmark system that achieved overall first place in every category in the Enterprise Search Track of TREC 2006.
Digital Creativity Support for Original Journalism
The decline in circulations and revenues resulting from the digitalization of news production and consumption has led to a crisis in journalism. Journalists have less time to research, investigate and write original stories, leading to problems for our democratic processes and for holding the powerful to account. This paper reports the architecture, features and rationale for a new digital creativity support tool designed to help journalists discover more original angles on stories. It also summarises the evaluation of the tool’s use in 3 newsrooms.
WikiM: Metapaths based Wikification of Scientific Abstracts
In order to disseminate the exponential extent of knowledge being produced in
the form of scientific publications, it would be best to design mechanisms that
connect it with an already existing rich repository of concepts -- Wikipedia.
Not only does it make scientific reading simple and easy (by connecting the
involved concepts used in the scientific articles to their Wikipedia
explanations) but also improves the overall quality of the article. In this
paper, we present a novel metapath based method, WikiM, to efficiently wikify
scientific abstracts -- a topic that has been rarely investigated in the
literature. One of the prime motivations for this work comes from the
observation that wikified abstracts of scientific documents help a reader to
decide better, in comparison to the plain abstracts, whether (s)he would be
interested to read the full article. We perform mention extraction mostly
through traditional tf-idf measures coupled with a set of smart filters. The
entity linking relies heavily on the rich citation and author publication
networks. Our observation is that various metapaths defined over these networks
can significantly enhance the overall performance of the system. For mention
extraction and entity linking, we outperform most of the competing
state-of-the-art techniques by a large margin arriving at precision values of
72.42% and 73.8% respectively over a dataset from the ACL Anthology Network. In
order to establish the robustness of our scheme, we wikify three other datasets
and get precision values of 63.41%-94.03% and 67.67%-73.29% respectively for
the mention extraction and the entity linking phases.
Linked open graph: Browsing multiple SPARQL entry points to build your own LOD views
A number of accessible RDF stores are populating the linked open data world. Navigating the reticular relationships among data is becoming more relevant every day. Several knowledge bases present relevant links to common vocabularies, while many others are going to be discovered, increasing the reasoning capabilities of our knowledge base applications. In this paper, the Linked Open Graph (LOG) is presented. It is a web tool for collaborative browsing and navigation on multiple SPARQL entry points. The paper presents an overview of the major problems to be addressed, a comparison with state-of-the-art tools, and some details about the LOG graph computation needed to cope with the high complexity of large Linked Open Data graphs. The LOG.disit.org tool is also presented by means of a set of examples involving multiple RDF stores, highlighting the newly provided features and advantages using dbPedia, Getty, Europeana, Geonames, etc. The LOG tool is free to use, and it has been adopted, developed and/or improved in multiple projects, such as ECLAP for social media cultural heritage, Sii-Mobility for smart city, ICARO for cloud ontology analysis, and OSIM for competence/knowledge mining and analysis.