Universal Knowledge Graph Embeddings
A variety of knowledge graph embedding approaches have been developed. Most
of them obtain embeddings by learning the structure of the knowledge graph
within a link prediction setting. As a result, the embeddings reflect only the
semantics of a single knowledge graph, and embeddings for different knowledge
graphs are not aligned, e.g., they cannot be used to find similar entities
across knowledge graphs via nearest neighbor search. However, knowledge graph
embedding applications such as entity disambiguation require a more global
representation, i.e., a representation that is valid across multiple sources.
We propose to learn universal knowledge graph embeddings from large-scale
interlinked knowledge sources. To this end, we fuse large knowledge graphs
based on the owl:sameAs relation such that every entity is represented by a
unique identity. We instantiate our idea by computing universal embeddings
based on DBpedia and Wikidata yielding embeddings for about 180 million
entities, 15 thousand relations, and 1.2 billion triples. Moreover, we develop
a convenient API to provide embeddings as a service. Experiments on link
prediction show that universal knowledge graph embeddings encode better
semantics compared to embeddings computed on a single knowledge graph. For
reproducibility purposes, we provide our source code and datasets open access
at https://github.com/dice-group/Universal_Embeddings
Comment: 5 pages, 3 tables
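The owl:sameAs-based fusion described above can be sketched as a union-find over sameAs links, so every cluster of interlinked IRIs collapses to one canonical identity. The code below is a minimal illustration of that idea only; the names and example IRIs are hypothetical and not taken from the released implementation.

```python
# Sketch: fuse entities across knowledge graphs by collapsing
# owl:sameAs links into one canonical identity per entity.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # path halving keeps trees shallow
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def fuse_entities(same_as_pairs):
    """Map every IRI to a canonical identity shared by its sameAs cluster."""
    uf = UnionFind()
    for a, b in same_as_pairs:
        uf.union(a, b)
    return {iri: uf.find(iri) for iri in uf.parent}

pairs = [
    ("dbpedia:Berlin", "wikidata:Q64"),
    ("wikidata:Q64", "yago:Berlin"),
]
ids = fuse_entities(pairs)
assert ids["dbpedia:Berlin"] == ids["yago:Berlin"]
```

After this step, embeddings are learned once over the fused graph, so the two source graphs share one vector per real-world entity.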
Augmenting Knowledge Transfer across Graphs
Given a resource-rich source graph and a resource-scarce target graph, how
can we effectively transfer knowledge across graphs and ensure a good
generalization performance? In many high-impact domains (e.g., brain networks
and molecular graphs), collecting and annotating data is prohibitively
expensive and time-consuming, which makes domain adaptation an attractive
option to alleviate the label scarcity issue. In light of this, the
state-of-the-art methods focus on deriving domain-invariant graph
representation that minimizes the domain discrepancy. However, it has recently
been shown that a small domain discrepancy loss may not always guarantee a good
generalization performance, especially in the presence of disparate graph
structures and label distribution shifts. In this paper, we present TRANSNET, a
generic learning framework for augmenting knowledge transfer across graphs. In
particular, we introduce a novel notion named trinity signal that can naturally
formulate various graph signals at different granularity (e.g., node
attributes, edges, and subgraphs). With that, we further propose a domain
unification module together with a trinity-signal mixup scheme to jointly
minimize the domain discrepancy and augment the knowledge transfer across
graphs. Finally, comprehensive empirical results show that TRANSNET outperforms
all existing approaches on seven benchmark datasets by a significant margin.
REGAL: Representation Learning-based Graph Alignment
Problems involving multiple networks are prevalent in many scientific and
other domains. In particular, network alignment, or the task of identifying
corresponding nodes in different networks, has applications across the social
and natural sciences. Motivated by recent advancements in node representation
learning for single-graph tasks, we propose REGAL (REpresentation
learning-based Graph ALignment), a framework that leverages the power of
automatically-learned node representations to match nodes across different
graphs. Within REGAL we devise xNetMF, an elegant and principled node embedding
formulation that uniquely generalizes to multi-network problems. Our results
demonstrate the utility and promise of unsupervised representation
learning-based network alignment in terms of both speed and accuracy. REGAL
runs up to 30x faster in the representation learning stage than comparable
methods, outperforms existing network alignment methods by 20 to 30% accuracy
on average, and scales to networks with millions of nodes each.
Comment: In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM), 2018
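The matching step REGAL describes, nearest-neighbor search over learned node representations, can be sketched as follows. This is a toy illustration that assumes the two graphs are already embedded in a shared space (e.g. by xNetMF); the function name and data are hypothetical.

```python
import numpy as np

def align(emb1, emb2):
    """Match each row of emb1 to its most cosine-similar row of emb2.

    Assumes both graphs were embedded into the same vector space, so
    cross-graph similarity is meaningful.
    """
    a = emb1 / np.linalg.norm(emb1, axis=1, keepdims=True)
    b = emb2 / np.linalg.norm(emb2, axis=1, keepdims=True)
    sim = a @ b.T                 # pairwise cosine similarities
    return sim.argmax(axis=1)     # index in graph 2 for each node in graph 1

# Toy example: node 0 of graph 1 should match node 1 of graph 2, and vice versa.
matches = align(np.array([[1.0, 0.0], [0.0, 1.0]]),
                np.array([[0.0, 1.0], [1.0, 0.0]]))
```

In practice an approximate nearest-neighbor index (e.g. a k-d tree) replaces the dense similarity matrix, which is what makes the alignment scale to millions of nodes.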
Relational learning on temporal knowledge graphs
Over the last decade, there has been an increasing interest in relational machine learning (RML), which studies methods for the statistical analysis of relational or graph-structured data. Relational data arise naturally in many real-world applications, including social networks, recommender systems, and computational finance. Such data can be represented in the form of a graph consisting of nodes (entities) and labeled edges (relationships between entities). While traditional machine learning techniques are based on feature vectors, RML takes relations into account and permits inference among entities. Recently, performing prediction and learning tasks on knowledge graphs has become a main topic in RML. Knowledge graphs (KGs) are widely used resources for studying multi-relational data in the form of a directed graph, where each labeled edge describes a factual statement, such as (Munich, locatedIn, Germany).
Traditionally, knowledge graphs are considered to represent stationary relationships, which do not change over time. In contrast, event-based multi-relational data exhibits complex temporal dynamics in addition to its multi-relational nature. For example, the political relationship between two countries would intensify because of trade fights; the president of a country may change after an election. To represent the temporal aspect, temporal knowledge graphs (tKGs) were introduced that store a temporal event as a quadruple by extending the static triple with a timestamp describing when this event occurred, i.e. (Barack Obama, visit, India, 2010-11-06). Thus, each edge in the graph has temporal information associated with it and may recur or evolve over time.
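The quadruple representation described above is straightforward to encode; a minimal, library-agnostic sketch:

```python
from collections import namedtuple
from datetime import date

# A temporal KG fact extends the static (head, relation, tail) triple
# with a timestamp stating when the event occurred.
Quadruple = namedtuple("Quadruple", ["head", "relation", "tail", "timestamp"])

fact = Quadruple("Barack Obama", "visit", "India", date(2010, 11, 6))

# Dropping the timestamp recovers the static KG view of the same fact.
static_triple = (fact.head, fact.relation, fact.tail)
```

Because the same (head, relation, tail) triple can appear with many timestamps, facts on a tKG may recur or evolve, which static triple stores cannot express.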
Among various learning paradigms on KGs, knowledge representation learning (KRL), also known as knowledge graph embedding, has achieved great success. KRL maps entities and relations into low-dimensional vector spaces while capturing semantic meanings. However, KRL approaches have mostly been developed for static KGs and lack the ability to utilize the rich temporal dynamics available on tKGs. In this thesis, we study state-of-the-art representation learning techniques for temporal knowledge graphs that can capture temporal dependencies across entities in addition to their relational dependencies. We develop representations for two inference tasks, i.e., tKG forecasting and completion. The former is to forecast future events using historical observations up to the present time, while the latter predicts missing links at observed timestamps. For tKG forecasting, we show how to make the reasoning process interpretable while maintaining performance by employing a sequential reasoning process over local subgraphs. In addition, we propose a continuous-depth multi-relational graph neural network with a novel graph neural ordinary differential equation. It allows for learning continuous-time representations of tKGs, especially in cases with observations at irregular time intervals, as encountered in online analysis. For tKG completion, we systematically review multiple benchmark models. We thoroughly investigate the significance of the proposed temporal encoding technique in each model and provide the first unified open-source framework, which gathers the implementations of well-known tKG completion models. Finally, we discuss the power of geometric learning and show that learning evolving entity representations in a product of Riemannian manifolds can better reflect geometric structures on tKGs and achieve better performance than Euclidean embeddings while requiring significantly fewer model parameters.
Leveraging Pre-trained Language Models for Time Interval Prediction in Text-Enhanced Temporal Knowledge Graphs
Most knowledge graph completion (KGC) methods learn latent representations of
entities and relations of a given graph by mapping them into a vector space.
Although the majority of these methods focus on static knowledge graphs, a
large number of publicly available KGs contain temporal information stating the
time instant/period over which a certain fact has been true. Such graphs are
often known as temporal knowledge graphs. Furthermore, knowledge graphs may
also contain textual descriptions of entities and relations. Both temporal
information and textual descriptions are not taken into account during
representation learning by static KGC methods, and only structural information
of the graph is leveraged. Recently, some studies have used temporal
information to improve link prediction, yet they do not exploit textual
descriptions and do not support inductive inference (prediction on entities
that have not been seen in training).
We propose a novel framework called TEMT that exploits the power of
pre-trained language models (PLMs) for text-enhanced temporal knowledge graph
completion. The knowledge stored in the parameters of a PLM allows TEMT to
produce rich semantic representations of facts and to generalize on previously
unseen entities. TEMT leverages textual and temporal information available in a
KG, treats them separately, and fuses them to get plausibility scores of facts.
Unlike previous approaches, TEMT effectively captures dependencies across
different time points and enables predictions on unseen entities. To assess the
performance of TEMT, we carried out several experiments including time interval
prediction, both in transductive and inductive settings, and triple
classification. The experimental results show that TEMT is competitive with the
state-of-the-art.
Comment: 10 pages, 3 figures
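TEMT's exact scoring function is not spelled out in the abstract; the sketch below only illustrates the general late-fusion idea it describes (separate textual and temporal features combined into a single plausibility score). Every name here, as well as the choice of a sinusoidal time encoding and a linear scorer, is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def time_encoding(t, dim=8):
    """Sinusoidal encoding of a scalar timestamp (an illustrative choice)."""
    freqs = 1.0 / (10000 ** (np.arange(dim // 2) / (dim // 2)))
    return np.concatenate([np.sin(t * freqs), np.cos(t * freqs)])

def plausibility(text_vec, t, w):
    """Late fusion: concatenate text and time features, score linearly."""
    feats = np.concatenate([text_vec, time_encoding(t)])
    return float(feats @ w)

text_vec = rng.normal(size=16)   # stand-in for a PLM embedding of the fact's text
w = rng.normal(size=16 + 8)      # stand-in for learned scoring weights
score = plausibility(text_vec, 2010.85, w)
```

Because the textual features come from descriptions rather than entity-specific lookup tables, a scheme like this can score entities never seen during training, which is what enables the inductive setting.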
Investigating learners’ meta-representational competencies when constructing bar graphs
Current views in the teaching and learning of data handling suggest that learners should create graphs of data they collect themselves and not just use textbook data. It is presumed real-world data creates an ideal environment for learners to tap from their pool of stored knowledge and demonstrate their meta-representational competences. Although prior knowledge is acknowledged as a critical resource out of which expertise is constructed, empirical evidence shows that new levels of mathematical thinking do not always build logically and consistently on previous experience. This suggests that researchers should analyse this resource in more detail in order to understand where prior knowledge could be supportive and where it could be problematic in the process of learning. This article analyses Grade 11 learners' meta-representational competences when constructing bar graphs. The basic premise was that by examining the process of graph construction and how learners respond to a variety of stages thereof, it was possible to create a description of a graphical frame or a knowledge representation structure that was stored in the learner's memory. Errors could then be described and explained in terms of the inadequacies of the frame, that is: 'Is the learner making good use of the stored prior knowledge?' A total of 43 learners were observed over a week in a classroom environment whilst they attempted to draw graphs for data they had collected for a mathematics project. Four units of analysis are used to focus on how learners created a frequency table, axes, bars and the overall representativeness of the graph vis-à-vis the data. Results show that learners had an inadequate graphical frame as they drew a graph that had elements of a value bar graph, distribution bar graph and a histogram all representing the same data set.
This inability to distinguish between these graphs and the types of data they represent implies that learners were likely to face difficulties with measures of centre and variability, which are interpreted differently across these three graphs but are foundational in all statistical thinking.
Learning representations of entities and relations
Learning to represent factual knowledge about the world in a succinct and accessible
manner is a fundamental machine learning problem. Encoding facts as representations
of entities and binary relationships between them, as learned by knowledge graph representation models, is useful for various tasks, including predicting new facts (i.e. link
prediction), question answering, fact checking and information retrieval. The focus of
this thesis is on (i) improving knowledge graph representation with the aim of tackling
the link prediction task; and (ii) devising a theory on how semantics can be captured
in the geometry of relation representations.
Most knowledge graphs are very incomplete and manually adding new information is
costly, which drives the development of methods which can automatically infer missing
facts. This thesis introduces three knowledge graph representation methods, each applied to the link prediction task. The first contribution is HypER, a convolutional model
which simplifies and improves upon the link prediction performance of the existing convolutional state-of-the-art model ConvE and can be mathematically explained in terms
of constrained tensor factorisation. Drawing inspiration from the tensor factorisation
view of HypER, the second contribution is TuckER, a relatively straightforward linear
knowledge graph representation model, which, at the time of its introduction, obtained
state-of-the-art link prediction performance across standard datasets. With a specific
focus on representing hierarchical knowledge graph relations, the third contribution
is MuRP, the first multi-relational graph representation model embedded in hyperbolic
space. MuRP outperforms all existing models and its Euclidean counterpart MuRE in
link prediction on hierarchical knowledge graph relations whilst requiring far fewer dimensions. Since their publication, all the above-mentioned models have influenced a range
of subsequent developments in the knowledge graph representation field.
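As a concrete anchor for the tensor-factorisation view mentioned above, TuckER's published scoring function contracts a shared core tensor with the head-entity, relation, and tail-entity embeddings. The sketch below uses toy dimensions and random data purely for illustration.

```python
import numpy as np

def tucker_score(W, e_h, w_r, e_t):
    """TuckER-style bilinear score: contract the shared core tensor W
    with head entity, relation, and tail entity embedding vectors."""
    return float(np.einsum("ijk,i,j,k->", W, e_h, w_r, e_t))

rng = np.random.default_rng(1)
de, dr = 4, 3                       # toy entity / relation embedding sizes
W = rng.normal(size=(de, dr, de))   # core tensor shared by all triples
score = tucker_score(W,
                     rng.normal(size=de),   # head entity embedding
                     rng.normal(size=dr),   # relation embedding
                     rng.normal(size=de))   # tail entity embedding
```

Because the core tensor is shared across all relations, parameter count grows linearly with the number of entities and relations, which is part of what makes this linear model competitive.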
Despite the development of a large number of knowledge graph representation models
with gradually increasing predictive performance, relatively little is known of the latent
structure they learn. We generalise recent theoretical understanding of how semantic
relations of similarity, paraphrase and analogy are encoded in the geometric interactions
of word embeddings to how more general relations, as found in knowledge graphs, can
be encoded in their representations. This increased theoretical understanding can be
used to aid future knowledge graph representation model design, as well as to improve
models which incorporate logical rules between relations into their representations or
those that jointly learn from multiple data sources (e.g. knowledge graphs and text).