SSP: Semantic Space Projection for Knowledge Graph Embedding with Text Descriptions
Knowledge representation is an important, long-standing topic in AI, and there has been a large amount of work on knowledge graph embedding, which projects symbolic entities and relations into a low-dimensional, real-valued vector space. However, most embedding methods merely concentrate on data fitting and ignore explicit semantic expression, leading to uninterpretable representations. Thus, traditional embedding methods have limited potential for many applications such as question answering and entity classification. To this end, this paper proposes a semantic representation method for knowledge graphs (\textbf{KSR}), which imposes a two-level hierarchical generative process that globally extracts many aspects and then locally assigns a specific category in each aspect for every triple. Since both aspects and categories are semantics-relevant, the collection of categories in each aspect is treated as the semantic representation of this triple. Extensive experiments justify that our model outperforms other state-of-the-art baselines substantially. Comment: Submitted to AAAI.201
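A toy illustration of the two-level representation this abstract describes: each triple gets one category per globally shared aspect, and the set of per-aspect categories is its interpretable representation. All aspect and category names below are invented stand-ins; KSR learns them through a generative process rather than by hand.

```python
# Hand-picked stand-in representations for two triples (illustrative only).
rep_a = {"domain": "geography", "entity_type": "city", "relation": "part-of"}
rep_b = {"domain": "geography", "entity_type": "river", "relation": "part-of"}

def aspect_agreement(x, y):
    """Similarity = fraction of shared aspects whose categories agree."""
    shared = x.keys() & y.keys()
    return sum(x[a] == y[a] for a in shared) / max(len(shared), 1)

print(aspect_agreement(rep_a, rep_b))  # 2/3: they agree on domain and relation
```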
Approach for Semi-automatic Construction of Anti-infective Drug Ontology Based on Entity Linking
Ontology can be used for the interpretation of natural language. To construct an anti-infective drug ontology, one needs to design and deploy a methodological step to carry out entity discovery and linking. Medical synonym resources have been an important part of medical natural language processing (NLP), but they suffer from problems such as low precision and low recall. In this study, an NLP approach is adopted to generate candidate entities. An open ontology is analyzed to extract semantic relations. Six word-vector features and word-level features are selected to perform the entity linking. The extraction results for synonyms with a single feature and with different combinations of features are studied. Experiments show that our selected features achieve a precision of 86.77%, a recall of 89.03% and an F1 score of 87.89%. This paper finally presents the structure of the proposed ontology and its relevant statistical data.
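A hedged sketch of the kind of feature-based synonym linking this abstract outlines: a word-vector cosine feature and simple word-level features combined by a linear scorer. The vectors, the specific word-level features, and the weights are illustrative stand-ins, not the study's trained model.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def link_score(vec_a, vec_b, term_a, term_b, weights=(0.6, 0.2, 0.2)):
    f_vec = cosine(vec_a, vec_b)                          # word-vector feature
    f_len = 1.0 - abs(len(term_a) - len(term_b)) / max(len(term_a), len(term_b))
    f_chr = len(set(term_a) & set(term_b)) / len(set(term_a) | set(term_b))
    # Linear combination of one vector-level and two word-level features.
    return weights[0] * f_vec + weights[1] * f_len + weights[2] * f_chr

rng = np.random.default_rng(6)
v1, v2 = rng.normal(size=(2, 50))       # stand-in word vectors
print(link_score(v1, v2, "amoxicillin", "amoxycillin"))
```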
Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion
Human-curated knowledge graphs provide critical supportive information to various natural language processing tasks, but these graphs are usually incomplete, motivating automatic completion. Prevalent graph embedding approaches, e.g., TransE, learn structured knowledge by representing graph elements as dense embeddings and capturing their triple-level relationships with spatial distance. However, they hardly generalize to elements never visited in training and are intrinsically vulnerable to graph incompleteness. In contrast, textual encoding approaches, e.g., KG-BERT, resort to a graph triple's text and triple-level contextualized representations. They are generalizable and robust to incompleteness, especially when coupled with pre-trained encoders. But two major drawbacks limit their performance: (1) high overheads due to the costly scoring of all possible triples during inference, and (2) a lack of structured knowledge in the textual encoder. In this paper, we follow the textual encoding paradigm and aim to alleviate its drawbacks by augmenting it with graph embedding techniques -- a complementary hybrid of both paradigms. Specifically, we partition each triple into two asymmetric parts as in translation-based graph embedding approaches, and encode both parts into contextualized representations with a Siamese-style textual encoder. Built upon these representations, our model employs both a deterministic classifier and a spatial measurement for representation and structure learning, respectively. Moreover, we develop a self-adaptive ensemble scheme that further improves performance by incorporating triple scores from an existing graph embedding model. In experiments, we achieve state-of-the-art performance on three benchmarks and a zero-shot dataset for link prediction, with inference costs reduced by 1-2 orders of magnitude compared to a textual encoding method. Comment: 12 pages, WWW'21, April 19-23, 2021, Ljubljana, Slovenia
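A minimal sketch of the asymmetric partition this abstract describes: a triple (h, r, t) is split into the text of (h, r) and the text of t, both encoded by the same ("Siamese") encoder and scored by spatial distance. The encoder below is a hash-seeded stand-in, not the paper's pre-trained transformer.

```python
import numpy as np

def toy_encode(text, dim=16):
    """Stand-in for a contextualized text encoder (hash-seeded toy)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

def siamese_translation_score(head_text, rel_text, tail_text):
    u = toy_encode(head_text + " " + rel_text)   # encodes the (h, r) part
    v = toy_encode(tail_text)                    # encodes the t part
    return -np.linalg.norm(u - v)                # higher = more plausible

print(siamese_translation_score("Ljubljana", "capital of", "Slovenia"))
```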
TransA: An Adaptive Approach for Knowledge Graph Embedding
Knowledge representation is a major topic in AI, and many studies attempt to represent the entities and relations of a knowledge base in a continuous vector space. Among these attempts, translation-based methods build entity and relation vectors by minimizing the translation loss from a head entity to a tail entity. In spite of their success, translation-based methods suffer from an oversimplified loss metric and are not competitive enough to model the diverse and complex entities/relations in knowledge bases. To address this issue, we propose \textbf{TransA}, an adaptive metric approach for embedding, which utilizes metric learning ideas to provide a more flexible embedding method. Experiments are conducted on the benchmark datasets, and our proposed method makes significant and consistent improvements over the state-of-the-art baselines.
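A minimal sketch of an adaptive-metric translation score in the spirit of TransA, assuming the weighted form f_r(h, t) = |h + r - t|^T W_r |h + r - t| with a symmetric non-negative matrix W_r per relation; dimensions and values below are toy stand-ins.

```python
import numpy as np

def transa_score(h, r, t, W_r):
    """Adaptive Mahalanobis-style distance; lower means more plausible."""
    d = np.abs(h + r - t)          # element-wise absolute translation error
    return d @ W_r @ d             # relation-specific weighting of each axis

rng = np.random.default_rng(0)
k = 8                              # embedding dimension (toy)
h, r, t = rng.normal(size=(3, k))
W_r = np.eye(k)                    # identity W_r recovers a plain squared norm
print(transa_score(h, r, t, W_r))
```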
From One Point to A Manifold: Knowledge Graph Embedding For Precise Link Prediction
Knowledge graph embedding aims at offering a numerical knowledge representation paradigm by transforming entities and relations into a continuous vector space. However, existing methods cannot characterize the knowledge graph at a fine-grained level, which precludes precise prediction, for two reasons: they form an ill-posed algebraic system and apply an overly strict geometric form. As precise prediction is critical, we propose a manifold-based embedding principle (\textbf{ManifoldE}), which can be treated as a well-posed algebraic system that expands the position of golden triples from one point in current models to a manifold in ours. Extensive experiments show that the proposed models achieve substantial improvements over the state-of-the-art baselines, especially on the precise prediction task, while maintaining high efficiency. Comment: arXiv admin note: text overlap with arXiv:1509.0548
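A toy rendering of the manifold principle in its sphere form: instead of forcing h + r to coincide with t at one point, a golden triple is expected to lie on the manifold ||h + r - t||^2 = theta_r^2, and the score penalizes deviation from that radius. The per-relation parameter theta_r below is an assumed stand-in.

```python
import numpy as np

def manifolde_sphere_score(h, r, t, theta_r):
    m = np.sum((h + r - t) ** 2)        # manifold function M(h, r, t)
    return (m - theta_r ** 2) ** 2      # zero anywhere on the sphere

rng = np.random.default_rng(1)
h, r, t = rng.normal(size=(3, 8))
print(manifolde_sphere_score(h, r, t, theta_r=1.0))
```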
Deep Feature Learning via Structured Graph Laplacian Embedding for Person Re-Identification
Learning the distance metric between pairs of examples is of great importance for visual recognition, especially for person re-identification (Re-Id). Recently, the contrastive and triplet losses have been proposed to enhance the discriminative power of deeply learned features, and have achieved remarkable success. However, either the contrastive or the triplet loss is just one special case of the Euclidean distance relationships among training samples. Therefore, we propose a structured graph Laplacian embedding algorithm, which can formulate all these structured distance relationships in graph Laplacian form. With the constructed complete graph, the proposed method takes full advantage of the structured distance relationships among training samples. Besides, this formulation makes our method easy to implement and highly effective. When the proposed algorithm is combined with the softmax loss for CNN training, our method obtains much more robust and discriminative deep features, with inter-personal dispersion and intra-personal compactness, which is essential to person Re-Id. We illustrate the effectiveness of our proposed method on top of three popular networks, namely AlexNet, DGDNet and ResNet50, on four widely used Re-Id benchmark datasets. Our proposed method achieves state-of-the-art performance. Comment: 9 pages, 4 figures
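A small sketch of the graph Laplacian formulation referenced above: pairwise distance relationships among training samples become edge weights W on a complete graph, and the embedding loss becomes tr(F^T L F) with L = D - W. Using negative weights to push different identities apart is a common convention assumed here, not necessarily the paper's exact weighting.

```python
import numpy as np

def laplacian_embedding_loss(F, W):
    """F: (n, d) deep features; W: (n, n) symmetric relationship weights."""
    D = np.diag(W.sum(axis=1))     # degree matrix
    L = D - W                      # graph Laplacian
    # tr(F^T L F) equals 0.5 * sum_ij W_ij * ||f_i - f_j||^2.
    return np.trace(F.T @ L @ F)

rng = np.random.default_rng(2)
F = rng.normal(size=(4, 16))                 # toy features for 4 samples
labels = np.array([0, 0, 1, 1])              # two identities
W = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
np.fill_diagonal(W, 0.0)                     # no self-edges
print(laplacian_embedding_loss(F, W))
```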
Legal Question Answering using Ranking SVM and Deep Convolutional Neural Network
This paper presents a study of employing Ranking SVM and a Convolutional Neural Network for two tasks: legal information retrieval and question answering in the Competition on Legal Information Extraction/Entailment. For the first task, our proposed model uses a triple of features (LSI, Manhattan, Jaccard) and operates at the paragraph level instead of the article level as in previous studies; each single-paragraph article corresponds to a particular paragraph in a large multiple-paragraph article. For the legal question answering task, additional statistical features from the information retrieval task, integrated into the Convolutional Neural Network, contribute to higher accuracy. Comment: 15 pages, 2 figures, Tenth International Workshop on Juris-informatics (JURISIN 2016) associated with JSAI International Symposia on AI 2016 (IsAI-2016)
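A minimal sketch of the (LSI, Manhattan, Jaccard) feature triple over a toy bag-of-words representation. The LSI similarity here is cosine in a rank-k SVD subspace of a tiny term-document matrix; everything below is illustrative, not the paper's implementation.

```python
import numpy as np

def jaccard(a_tokens, b_tokens):
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def manhattan(a_vec, b_vec):
    return np.abs(a_vec - b_vec).sum()

def lsi_cosine(term_doc, i, j, k=2):
    U, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    docs = term_doc.T @ U[:, :k]        # project documents into LSI space
    u, v = docs[i], docs[j]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

vocab = ["court", "law", "tax", "fine"]
docs = [["court", "law"], ["court", "tax", "fine"]]
td = np.array([[1.0 if w in d else 0.0 for d in docs] for w in vocab])
print(jaccard(docs[0], docs[1]), manhattan(td[:, 0], td[:, 1]), lsi_cosine(td, 0, 1))
```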
Large-Scale Visual Relationship Understanding
Large-scale visual understanding is challenging, as it requires a model to handle the widely spread and imbalanced distribution of <subject, relation, object> triples. In real-world scenarios with large numbers of objects and relations, some are seen very commonly while others are barely seen. We develop a new relationship detection model that embeds objects and relations into two vector spaces where both discriminative capability and semantic affinity are preserved. We learn both a visual and a semantic module that map features from the two modalities into a shared space, where matched pairs of features must discriminate against unmatched ones while also maintaining close distances to semantically similar ones. Benefiting from this, our model can achieve superior performance even when the visual entity categories scale up to more than 80,000, with an extremely skewed class distribution. We demonstrate the efficacy of our model on a large and imbalanced benchmark based on Visual Genome that comprises 53,000+ objects and 29,000+ relations, a scale at which no previous work has ever been evaluated. We show the superiority of our model over carefully designed baselines on the original Visual Genome dataset with 80,000+ categories. We also show state-of-the-art performance on the VRD dataset and the scene graph dataset, which is a subset of Visual Genome with 200 categories.
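A hedged sketch of the two-module matching idea above: a visual map and a semantic map project features into a shared space, and a margin loss makes matched pairs closer than unmatched ones. Both projections and all dimensions are random toy stand-ins for the learned modules.

```python
import numpy as np

def triplet_match_loss(vis, sem_pos, sem_neg, margin=0.2):
    d_pos = np.linalg.norm(vis - sem_pos)   # matched pair distance
    d_neg = np.linalg.norm(vis - sem_neg)   # unmatched pair distance
    return max(0.0, margin + d_pos - d_neg) # hinge: pull matched, push unmatched

rng = np.random.default_rng(3)
Wv = rng.normal(size=(32, 64))              # visual module (toy linear map)
Ws = rng.normal(size=(32, 50))              # semantic module (toy linear map)
x_vis = rng.normal(size=64)                 # CNN feature (stand-in)
e_pos, e_neg = rng.normal(size=(2, 50))     # word embeddings (stand-ins)
print(triplet_match_loss(Wv @ x_vis, Ws @ e_pos, Ws @ e_neg))
```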
LogicENN: A Neural Based Knowledge Graphs Embedding Model with Logical Rules
Knowledge graph embedding models have gained significant attention in AI research. Recent works have shown that the inclusion of background knowledge, such as logical rules, can improve the performance of embeddings in downstream machine learning tasks. However, so far, most existing models do not allow the inclusion of rules. We address the challenge of including rules and present a new neural-based embedding model (LogicENN). We prove that LogicENN can learn every ground truth of encoded rules in a knowledge graph. To the best of our knowledge, this has not been proved so far for the neural-based family of embedding models. Moreover, we derive formulae for the inclusion of various rules, including (anti-)symmetry, inversion, irreflexivity, transitivity, implication, composition, equivalence and negation. Our formulation allows us to avoid grounding for implication and equivalence relations. Our experiments show that LogicENN outperforms the state-of-the-art models in link prediction.
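A hedged sketch of rule-aware neural scoring: a shared network phi(h, t) produces entity-pair features that are dotted with per-relation weights, and an implication rule r1 => r2 is softly enforced with a hinge penalty so that score_r2 >= score_r1. This is a simplification for illustration, not LogicENN's exact grounding-free construction.

```python
import numpy as np

def phi(h, t):
    """Stand-in for the shared neural feature of an entity pair."""
    return np.tanh(np.concatenate([h, t, h * t]))

def score(h, t, w_r):
    return phi(h, t) @ w_r              # relation-specific readout

def implication_penalty(h, t, w_r1, w_r2):
    # Soft constraint for r1(h, t) => r2(h, t) on a sampled pair.
    return max(0.0, score(h, t, w_r1) - score(h, t, w_r2))

rng = np.random.default_rng(4)
h, t = rng.normal(size=(2, 8))
w_r1, w_r2 = rng.normal(size=(2, 24))   # 24 = 3 * entity dimension
print(implication_penalty(h, t, w_r1, w_r2))
```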
A Framework for Deep Constrained Clustering -- Algorithms and Advances
The area of constrained clustering has been extensively explored by researchers and used by practitioners. Constrained clustering formulations exist for popular algorithms such as k-means, mixture models, and spectral clustering, but they have several limitations. A fundamental strength of deep learning is its flexibility, and here we explore a deep learning framework for constrained clustering, and in particular how it can extend the field of constrained clustering. We show that our framework can not only handle standard together/apart constraints (without the well-documented negative effects reported earlier) generated from labeled side information, but also more complex constraints generated from new types of side information such as continuous values and high-level domain knowledge. Comment: Updated for ECML/PKDD 201
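A small sketch of together/apart constraints on soft cluster assignments: for a must-link pair the loss rewards agreement of the two assignment distributions, and for a cannot-link pair it penalizes agreement. The dot product of the distributions is a common surrogate assumed here, not necessarily the paper's exact loss.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def pairwise_constraint_loss(p_i, p_j, must_link):
    # p_i @ p_j = probability both samples pick the same cluster
    # (under independence of the two soft assignments).
    agreement = float(p_i @ p_j)
    if must_link:
        return -np.log(agreement + 1e-12)        # reward agreement
    return -np.log(1.0 - agreement + 1e-12)      # reward disagreement

rng = np.random.default_rng(5)
p_i, p_j = softmax(rng.normal(size=5)), softmax(rng.normal(size=5))
print(pairwise_constraint_loss(p_i, p_j, must_link=True),
      pairwise_constraint_loss(p_i, p_j, must_link=False))
```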