Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs
Many popular knowledge graphs such as Freebase, YAGO or DBPedia maintain a
list of non-discrete attributes for each entity. Intuitively, these attributes
such as height, price or population count are able to richly characterize
entities in knowledge graphs. This additional source of information may help to
alleviate the inherent sparsity and incompleteness problems that are prevalent
in knowledge graphs. Unfortunately, many state-of-the-art relational learning
models ignore this information due to the challenging nature of dealing with
non-discrete data types in the inherently binary-natured knowledge graphs. In
this paper, we propose a novel multi-task neural network approach for both
encoding and prediction of non-discrete attribute information in a relational
setting. Specifically, we train a neural network for triplet prediction along
with a separate network for attribute value regression. Via multi-task
learning, we are able to learn representations of entities, relations and
attributes that encode information about both tasks. Moreover, such attributes
are not only central to many predictive tasks as an information source but also
as a prediction target. Therefore, models that are able to encode, incorporate
and predict such information in a relational learning context are highly
attractive as well. We show that our approach outperforms many state-of-the-art
methods for the tasks of relational triplet classification and attribute value
prediction.
Comment: Accepted at CIKM 201
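As a rough illustration of the multi-task idea described in the abstract above, the sketch below combines a triplet-classification loss and an attribute-regression loss over one shared entity embedding. The DistMult-style score, the linear regression head, and the weighting parameter `alpha` are assumptions made for illustration; the abstract does not specify the authors' architecture, and this is not it.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triplet_logit(head, rel, tail):
    # DistMult-style trilinear score: an assumed stand-in for the paper's network.
    return sum(h * r * t for h, r, t in zip(head, rel, tail))

def triplet_loss(head, rel, tail, label):
    # Binary cross-entropy on sigmoid(logit); label is 1 (true triplet) or 0 (false).
    p = 1.0 / (1.0 + math.exp(-triplet_logit(head, rel, tail)))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def attribute_loss(entity, attr_weights, target):
    # Squared error of a linear regression head applied to the same entity embedding.
    return (dot(entity, attr_weights) - target) ** 2

def joint_loss(head, rel, tail, label, attr_weights, target, alpha=0.5):
    # Both terms depend on `head`, so its gradient mixes signals from both tasks.
    # `alpha` is a hypothetical task-weighting hyperparameter.
    return (triplet_loss(head, rel, tail, label)
            + alpha * attribute_loss(head, attr_weights, target))
```

Because the head entity's vector appears in both terms, minimising `joint_loss` pushes that vector to be useful for link prediction and for predicting the numeric attribute at the same time.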
Knowledge-based Biomedical Data Science 2019
Knowledge-based biomedical data science (KBDS) involves the design and
implementation of computer systems that act as if they knew about biomedicine.
Such systems depend on formally represented knowledge in computer systems,
often in the form of knowledge graphs. Here we survey the progress in the last
year in systems that use formally represented knowledge to address data science
problems in both clinical and biological domains, as well as on approaches for
creating knowledge graphs. Major themes include the relationships between
knowledge graphs and machine learning, the use of natural language processing,
and the expansion of knowledge-based approaches to novel domains, such as
Chinese Traditional Medicine and biodiversity.
Comment: Manuscript 43 pages with 3 tables; Supplemental material 43 pages
with 3 table
Universal schema for entity type prediction
Categorizing entities by their types is useful in many applications, including knowledge base construction, relation extraction and query intent prediction. Fine-grained entity type ontologies are especially valuable, but typically difficult to design because of unavoidable quandaries about level of detail and boundary cases. Automatically classifying entities by type is challenging as well, usually involving hand-labeling data and training a supervised predictor. This paper presents a universal schema approach to fine-grained entity type prediction. The set of types is taken as the union of textual surface patterns (e.g. appositives) and pre-defined types from available databases (e.g. Freebase), yielding not tens or hundreds of types, but more than ten thousand entity types, such as financier, criminologist, and musical trio. We robustly learn mutual implication among this large union by learning latent vector embeddings from probabilistic matrix factorization, thus avoiding the need for hand-labeled data. Experimental results demonstrate more than 30% reduction in error versus a traditional classification approach on predicting fine-grained entity types. © 2013 ACM
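To make the factorization idea in the abstract above concrete, here is a minimal sketch in which each entity (row) and each type (column, whether a surface pattern or a database type) has a latent vector, and a cell's probability is the sigmoid of their dot product. The logistic loss, the learning rate, and the function names are assumptions for illustration; the abstract only says "probabilistic matrix factorization" and does not give the exact model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score(entity_vec, type_vec):
    # Probability that the entity has the type, under the assumed logistic model.
    return sigmoid(sum(a * b for a, b in zip(entity_vec, type_vec)))

def sgd_step(entity_vec, type_vec, label, lr=0.1):
    # One stochastic gradient step on the log-loss for a single matrix cell.
    # The gradient with respect to the logit is (p - label).
    g = score(entity_vec, type_vec) - label
    new_entity = [e - lr * g * t for e, t in zip(entity_vec, type_vec)]
    new_type = [t - lr * g * e for e, t in zip(entity_vec, type_vec)]
    return new_entity, new_type
```

A step on an observed cell (label 1) raises that cell's score, and a step on a sampled unobserved cell (label 0) lowers it. Since surface patterns and database types share the same column space, types that co-occur on the same entities end up with similar vectors, which is one way implications between them could be picked up without hand-labeled data.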
Embedding Cardinality Constraints in Neural Link Predictors
Neural link predictors learn distributed representations of entities and
relations in a knowledge graph. They are remarkably powerful in the link
prediction and knowledge base completion tasks, mainly due to the learned
representations that capture important statistical dependencies in the data.
Recent works in the area have focused on either designing new scoring functions
or incorporating extra information into the learning process to improve the
representations. Yet the representations are mostly learned from the observed
links between entities, ignoring commonsense or schema knowledge associated
with the relations in the graph. A fundamental aspect of the topology of
relational data is the cardinality information, which bounds the number of
predictions given for a relation between a minimum and maximum frequency. In
this paper, we propose a new regularisation approach to incorporate relation
cardinality constraints into any existing neural link predictor without affecting
its efficiency or scalability. Our regularisation term aims to impose
boundaries on the number of predictions with high probability, thus
structuring the embedding space to respect commonsense cardinality assumptions,
resulting in better representations. Experimental results on Freebase, WordNet
and YAGO show that, given suitable prior knowledge, the proposed method
positively impacts the predictive accuracy of downstream link prediction tasks.
Comment: 8 pages, accepted at the 34th ACM/SIGAPP Symposium on Applied
Computing (SAC '19
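One simple way to picture a cardinality regulariser like the one described in the abstract above is a hinge penalty on the expected number of predicted links for a given subject and relation. This is a deliberately simplified stand-in, not the paper's actual term: the function name, the use of the sum of probabilities as the expected count, and the hinge form are all assumptions.

```python
def cardinality_penalty(probs, lower, upper):
    """Penalty for violating cardinality bounds on one (subject, relation) pair.

    probs: predicted probabilities for each candidate object.
    lower, upper: assumed minimum and maximum number of true objects.
    """
    # Sum of probabilities = expected number of predicted links.
    expected = sum(probs)
    # Zero inside [lower, upper]; grows linearly with the size of the violation.
    return max(0.0, lower - expected) + max(0.0, expected - upper)
```

For a relation that should hold for at most one object per subject, calling `cardinality_penalty(probs, 0, 1)` gives a positive penalty as soon as the model assigns high probability to several candidate objects. Adding such a term to the training loss would leave the scoring function itself unchanged.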