DHLP 1&2: Giraph based distributed label propagation algorithms on heterogeneous drug-related networks
Background and Objective: Heterogeneous complex networks are large graphs
consisting of different types of nodes and edges. The knowledge extraction from
these networks is complicated. Moreover, the scale of these networks is
steadily increasing. Thus, scalable methods are required. Methods: In this
paper, we introduce two distributed label propagation algorithms for heterogeneous
networks, namely DHLP-1 and DHLP-2. Biological networks are one type
of the heterogeneous complex networks. As a case study, we have measured the
efficiency of our proposed DHLP-1 and DHLP-2 algorithms on a biological network
consisting of drugs, diseases, and targets. The subject we have studied in this
network is drug repositioning but our algorithms can be used as general methods
for heterogeneous networks other than the biological network. Results: We
compared the proposed algorithms with their non-distributed counterparts,
namely MINProp and Heter-LP. The experiments showed good performance of
the algorithms in terms of running time and accuracy. Comment: Source code available for Apache Giraph on Hadoop
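To make the idea of label propagation on a network with several node types concrete, here is a minimal single-machine sketch. It is not DHLP-1, DHLP-2, MINProp, or Heter-LP, whose update rules are not given in the abstract; it assumes a generic iterative rule in which each node's score becomes a mix of its neighbours' weighted average and its own seed label. The function name `propagate_labels`, the parameter `alpha`, and the toy drug/target/disease nodes are all hypothetical.

```python
def propagate_labels(edges, seeds, alpha=0.5, iterations=50):
    """Generic label propagation (an assumed rule, not the paper's):
    f <- alpha * (row-normalised W) * f + (1 - alpha) * y
    """
    # Build a symmetric weighted adjacency map.
    neighbors = {}
    for u, v, w in edges:
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w

    # Initial labels: seed nodes carry their score, all others start at 0.
    y = {node: seeds.get(node, 0.0) for node in neighbors}
    f = dict(y)

    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            total = sum(nbrs.values())
            # Weighted average of the neighbours' current scores.
            spread = sum(w * f[n] for n, w in nbrs.items()) / total
            updated[node] = alpha * spread + (1 - alpha) * y[node]
        f = updated
    return f


# Hypothetical toy network; the type prefix on each name marks the node type.
edges = [
    ("drug:A", "target:T1", 1.0),
    ("target:T1", "disease:D1", 1.0),
    ("drug:A", "target:T2", 1.0),
    ("target:T2", "disease:D2", 1.0),
]
scores = propagate_labels(edges, seeds={"drug:A": 1.0})
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

In this sketch the score decays with distance from the seed drug, so nodes can be ranked as candidates for it. A distributed version would have to partition the per-node update across workers, which is the part the abstract attributes to Giraph.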
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
Most successful information extraction systems operate with access to a large
collection of documents. In this work, we explore the task of acquiring and
incorporating external evidence to improve extraction accuracy in domains where
the amount of training data is scarce. This process entails issuing search
queries, extraction from new sources and reconciliation of extracted values,
which are repeated until sufficient evidence is collected. We approach the
problem using a reinforcement learning framework where our model learns to
select optimal actions based on contextual information. We employ a deep
Q-network, trained to optimize a reward function that reflects extraction
accuracy while penalizing extra effort. Our experiments on two databases -- of
shooting incidents, and food adulteration cases -- demonstrate that our system
significantly outperforms traditional extractors and a competitive
meta-classifier baseline. Comment: Appearing in EMNLP 2016 (12 pages incl. supplementary material)
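The reward described above, one that reflects extraction accuracy while penalizing extra effort, can be illustrated with a small sketch. The abstract does not give the exact formula, so this assumes a simple form: the change in the number of correctly extracted fields after an action, minus a fixed per-step cost. The function `step_reward`, the `step_penalty` value, and the field names are hypothetical.

```python
def count_correct(values, gold_values):
    """Number of fields whose extracted value matches the gold value."""
    return sum(1 for field, gold in gold_values.items()
               if values.get(field) == gold)


def step_reward(prev_values, new_values, gold_values, step_penalty=0.1):
    """Assumed reward shape: accuracy gain from this step minus a fixed cost."""
    gain = count_correct(new_values, gold_values) - count_correct(prev_values, gold_values)
    return gain - step_penalty


# Hypothetical record from a shooting-incident database.
gold = {"shooter": "J. Doe", "killed": "3", "city": "Springfield"}
before = {"shooter": "J. Doe", "killed": "2"}
after_good_query = {"shooter": "J. Doe", "killed": "3"}

useful = step_reward(before, after_good_query, gold)   # one field corrected
wasted = step_reward(before, before, gold)             # query changed nothing
print(useful, wasted)
```

Under this assumed shape, a query that fixes a field earns a positive reward, while a query that changes nothing costs the step penalty, which gives the agent a reason to stop once further evidence no longer helps.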
Learning Relatedness Measures for Entity Linking
Entity Linking is the task of detecting, in text documents, relevant mentions to entities of a given knowledge base. To this end, entity-linking algorithms use several signals and features extracted from the input text or from the knowledge base. The most important of such features is entity relatedness. Indeed, we argue that these algorithms benefit from maximizing the relatedness among the relevant entities selected for annotation, since this minimizes errors in disambiguating entity-linking.
The definition of an effective relatedness function is thus a crucial point in any entity-linking algorithm. In this paper we address the problem of learning high-quality entity relatedness functions. First, we formalize the problem of learning entity relatedness as a learning-to-rank problem. We propose a methodology to create reference datasets on the basis of manually annotated data. Finally, we show that our machine-learned entity relatedness function performs better than other relatedness functions previously proposed, and, more importantly, improves the overall performance of different state-of-the-art entity-linking algorithms.
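Casting relatedness as a learning-to-rank problem can be sketched as follows. The abstract does not name the ranking model or the features, so this assumes a simple pairwise perceptron: for each source entity, every related candidate should score above every unrelated one, and the weights are nudged whenever a pair is misranked. The function names, the two features, and the toy data are hypothetical.

```python
def score(weights, features):
    """Linear relatedness score for one candidate entity."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise_ranker(queries, n_features, epochs=20, lr=0.1):
    """Pairwise perceptron (an assumed model, not the paper's):
    push each related candidate above each unrelated one."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for candidates in queries:
            positives = [x for x, label in candidates if label == 1]
            negatives = [x for x, label in candidates if label == 0]
            for pos in positives:
                for neg in negatives:
                    # Update only when the pair is misranked or tied.
                    if score(weights, pos) - score(weights, neg) <= 0:
                        for i in range(n_features):
                            weights[i] += lr * (pos[i] - neg[i])
    return weights


# Each inner list is one source entity with its candidate entities.
# Hypothetical features: [shared in-links, shared categories]; label 1 = related.
queries = [
    [([0.9, 0.2], 1), ([0.1, 0.8], 0), ([0.2, 0.3], 0)],
    [([0.7, 0.1], 1), ([0.3, 0.9], 0)],
]
weights = train_pairwise_ranker(queries, n_features=2)
print(weights)
```

The learned `score` can then stand in for a hand-crafted relatedness function inside an entity-linking pipeline, which is the substitution the abstract evaluates.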