Relation Discovery from Web Data for Competency Management
This paper describes a technique for automatically discovering associations between people and expertise by analysing very large data sources (including web pages, blogs and emails). It uses a family of algorithms that perform accurate named-entity recognition, weight terms according to an analysis of document structure, and assess distances between terms in a document. My contribution is to add a social networking approach called BuddyFinder, which relies on associations within a large enterprise-wide "buddy list" to help delimit the search space and to provide a form of "social triangulation", whereby the system can discover documents from your colleagues that contain pertinent information about you. This work has been influential in the information retrieval community generally, as it is the basis of a landmark system that achieved overall first place in every category in the Enterprise Search Track of TREC 2006.
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks
We propose a distantly supervised relation extraction approach for
long-tailed, imbalanced data which is prevalent in real-world settings. Here,
the challenge is to learn accurate "few-shot" models for classes existing at
the tail of the class distribution, for which little data is available.
Inspired by the rich semantic correlations between classes at the long tail and
those at the head, we take advantage of the knowledge from data-rich classes at
the head of the distribution to boost the performance of the data-poor classes
at the tail. First, we propose to leverage implicit relational knowledge among
class labels from knowledge graph embeddings and learn explicit relational
knowledge using graph convolution networks. Second, we integrate that
relational knowledge into the relation extraction model via a coarse-to-fine
knowledge-aware attention mechanism. Experiments on a
large-scale benchmark dataset show that our approach significantly
outperforms other baselines, especially for long-tail relations.
Comment: To be published in NAACL 201
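
The idea of passing knowledge from data-rich head classes to data-poor tail classes through a graph over class labels can be illustrated with a minimal sketch. The abstract does not specify the model, so everything here is an assumption: the propagation rule is the commonly used symmetric-normalised graph convolution, ReLU(D^{-1/2}(A + I)D^{-1/2} H W), and the three-node label graph, the embeddings, and the names `normalize_adjacency`, `matmul`, and `gcn_layer` are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weight):
    """One graph convolution: ReLU(A_norm @ features @ weight)."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weight)]


# Hypothetical label graph: node 0 is a coarse parent class,
# node 1 a head relation, node 2 a tail relation. Both relations
# are linked only to the shared parent.
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]

# Hypothetical class embeddings; the tail relation starts at zero
# to mimic a class with almost no training signal.
features = [[1.0, 0.0],
            [0.0, 2.0],
            [0.0, 0.0]]

# Identity weights keep the example easy to follow by hand.
weight = [[1.0, 0.0],
          [0.0, 1.0]]

one_hop = gcn_layer(adj, features, weight)   # tail sees the parent
two_hop = gcn_layer(adj, one_hop, weight)    # tail sees the head via the parent
print(two_hop[2])
```

After one layer the tail node has absorbed the parent's embedding; after two layers its second coordinate, which originated only at the head relation, is also non-zero. This is the sense in which stacked graph convolutions let tail classes borrow from head classes in this toy setting.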