8,286 research outputs found

    Topic modeling for entity linking using keyphrase

    Get PDF
    This paper proposes an Entity Linking system that applies a topic modeling ranking. We apply a novel approach in order to provide new relevant elements to the model. These elements are keyphrases related to the queries and gathered from a huge Wikipedia-based knowledge resourcePeer ReviewedPostprint (author’s final draft

    Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams

    Get PDF
    The name entity disambiguation task aims to partition the records of multiple real-life persons so that each partition contains records pertaining to a unique person. Most of the existing solutions for this task operate in a batch mode, where all records to be disambiguated are initially available to the algorithm. However, more realistic settings require that the name disambiguation task be performed in an online fashion, in addition to, being able to identify records of new ambiguous entities having no preexisting records. In this work, we propose a Bayesian non-exhaustive classification framework for solving online name disambiguation task. Our proposed method uses a Dirichlet process prior with a Normal * Normal * Inverse Wishart data model which enables identification of new ambiguous entities who have no records in the training data. For online classification, we use one sweep Gibbs sampler which is very efficient and effective. As a case study we consider bibliographic data in a temporal stream format and disambiguate authors by partitioning their papers into homogeneous groups. Our experimental results demonstrate that the proposed method is better than existing methods for performing online name disambiguation task.Comment: to appear in CIKM 201

    Socially-Aware Distributed Hash Tables for Decentralized Online Social Networks

    Full text link
    Many decentralized online social networks (DOSNs) have been proposed due to an increase in awareness related to privacy and scalability issues in centralized social networks. Such decentralized networks transfer processing and storage functionalities from the service providers towards the end users. DOSNs require individualistic implementation for services, (i.e., search, information dissemination, storage, and publish/subscribe). However, many of these services mostly perform social queries, where OSN users are interested in accessing information of their friends. In our work, we design a socially-aware distributed hash table (DHTs) for efficient implementation of DOSNs. In particular, we propose a gossip-based algorithm to place users in a DHT, while maximizing the social awareness among them. Through a set of experiments, we show that our approach reduces the lookup latency by almost 30% and improves the reliability of the communication by nearly 10% via trusted contacts.Comment: 10 pages, p2p 2015 conferenc

    Computer Aided Aroma Design. I. Molecular knowledge framework

    Get PDF
    Computer Aided Aroma Design (CAAD) is likely to become a hot issue as the REACH EC document targets many aroma compounds to require substitution. The two crucial steps in CAMD are the generation of candidate molecules and the estimation of properties, which can be difficult when complex molecular structures like odours are sought and when their odour quality are definitely subjective whereas their odour intensity are partly subjective as stated in Rossitier’s review (1996). In part I, provided that classification rules like those presented in part II exist to assess the odour quality, the CAAD methodology presented proceeds with a multilevel approach matched by a versatile and novel molecular framework. It can distinguish the infinitesimal chemical structure differences, like in isomers, that are responsible for different odour quality and intensity. Besides, its chemical graph concepts are well suited for genetic algorithm sampling techniques used for an efficient screening of large molecules such as aroma. Finally, an input/output XML format based on the aggregation of CML and ThermoML enables to store the molecular classes but also any subjective or objective property values computed during the CAAD process

    USFD at KBP 2011: Entity Linking, Slot Filling and Temporal Bounding

    Full text link
    This paper describes the University of Sheffield's entry in the 2011 TAC KBP entity linking and slot filling tasks. We chose to participate in the monolingual entity linking task, the monolingual slot filling task and the temporal slot filling tasks. We set out to build a framework for experimentation with knowledge base population. This framework was created, and applied to multiple KBP tasks. We demonstrated that our proposed framework is effective and suitable for collaborative development efforts, as well as useful in a teaching environment. Finally we present results that, while very modest, provide improvements an order of magnitude greater than our 2010 attempt.Comment: Proc. Text Analysis Conference (2011
    • …
    corecore