SSP: Semantic Space Projection for Knowledge Graph Embedding with Text Descriptions
Knowledge representation is an important, long-standing topic in AI, and there has been a large amount of work on knowledge graph embedding, which projects symbolic entities and relations into a low-dimensional, real-valued vector space. However, most embedding methods merely concentrate on data fitting and ignore explicit semantic expression, leading to uninterpretable representations. Thus, traditional embedding methods have limited potential for many applications such as question answering and entity classification. To this end, this paper proposes a semantic representation method for knowledge graphs (\textbf{KSR}), which imposes a two-level hierarchical generative process that globally extracts many aspects and then locally assigns a specific category in each aspect for every triple. Since both aspects and categories are semantics-relevant, the collection of categories in each aspect is treated as the semantic representation of this triple. Extensive experiments justify that our model outperforms other state-of-the-art baselines substantially. Comment: Submitted to AAAI.201
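A toy illustration of the two-level representation this abstract describes: each triple gets one category per globally shared aspect, and the set of per-aspect categories is its interpretable representation. All aspect and category names below are invented stand-ins; KSR learns them through a generative process rather than by hand.

```python
# Hand-picked stand-in representations for two triples (illustrative only).
rep_a = {"domain": "geography", "entity_type": "city", "relation": "part-of"}
rep_b = {"domain": "geography", "entity_type": "river", "relation": "part-of"}

def aspect_agreement(x, y):
    """Similarity = fraction of shared aspects whose categories agree."""
    shared = x.keys() & y.keys()
    return sum(x[a] == y[a] for a in shared) / max(len(shared), 1)

print(aspect_agreement(rep_a, rep_b))  # 2/3: they agree on domain and relation
```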
Approach for Semi-automatic Construction of Anti-infective Drug Ontology Based on Entity Linking
Ontology can be used for the interpretation of natural language. To construct an anti-infective drug ontology, one needs to design and deploy a methodological step to carry out entity discovery and linking. Medical synonym resources have been an important part of medical natural language processing (NLP), but they suffer from problems such as low precision and low recall. In this study, an NLP approach is adopted to generate candidate entities. An open ontology is analyzed to extract semantic relations. Six word-vector features and word-level features are selected to perform the entity linking. The extraction results for synonyms with a single feature and with different combinations of features are studied. Experiments show that our selected features achieve a precision of 86.77%, a recall of 89.03% and an F1 score of 87.89%. This paper finally presents the structure of the proposed ontology and its relevant statistical data.
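A hedged sketch of the kind of feature-based synonym linking this abstract outlines: a word-vector cosine feature and simple word-level features combined by a linear scorer. The vectors, the specific word-level features, and the weights are illustrative stand-ins, not the study's trained model.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def link_score(vec_a, vec_b, term_a, term_b, weights=(0.6, 0.2, 0.2)):
    f_vec = cosine(vec_a, vec_b)                          # word-vector feature
    f_len = 1.0 - abs(len(term_a) - len(term_b)) / max(len(term_a), len(term_b))
    f_chr = len(set(term_a) & set(term_b)) / len(set(term_a) | set(term_b))
    # Linear combination of one vector-level and two word-level features.
    return weights[0] * f_vec + weights[1] * f_len + weights[2] * f_chr

rng = np.random.default_rng(6)
v1, v2 = rng.normal(size=(2, 50))       # stand-in word vectors
print(link_score(v1, v2, "amoxicillin", "amoxycillin"))
```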
Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion
Human-curated knowledge graphs provide critical supportive information to various natural language processing tasks, but these graphs are usually incomplete, motivating automatic completion. Prevalent graph embedding approaches, e.g., TransE, learn structured knowledge by representing graph elements as dense embeddings and capturing their triple-level relationships with spatial distance. However, they hardly generalize to elements never visited in training and are intrinsically vulnerable to graph incompleteness. In contrast, textual encoding approaches, e.g., KG-BERT, resort to a graph triple's text and triple-level contextualized representations. They are generalizable and robust to incompleteness, especially when coupled with pre-trained encoders. But two major drawbacks limit their performance: (1) high overheads due to the costly scoring of all possible triples during inference, and (2) a lack of structured knowledge in the textual encoder. In this paper, we follow the textual encoding paradigm and aim to alleviate its drawbacks by augmenting it with graph embedding techniques -- a complementary hybrid of both paradigms. Specifically, we partition each triple into two asymmetric parts as in translation-based graph embedding approaches, and encode both parts into contextualized representations with a Siamese-style textual encoder. Built upon these representations, our model employs both a deterministic classifier and a spatial measurement for representation and structure learning, respectively. Moreover, we develop a self-adaptive ensemble scheme that further improves performance by incorporating triple scores from an existing graph embedding model. In experiments, we achieve state-of-the-art performance on three benchmarks and a zero-shot dataset for link prediction, with inference costs reduced by 1-2 orders of magnitude compared to a textual encoding method. Comment: 12 pages, WWW'21, April 19-23, 2021, Ljubljana, Slovenia
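A minimal sketch of the asymmetric partition this abstract describes: a triple (h, r, t) is split into the text of (h, r) and the text of t, both encoded by the same ("Siamese") encoder and scored by spatial distance. The encoder below is a hash-seeded stand-in, not the paper's pre-trained transformer.

```python
import numpy as np

def toy_encode(text, dim=16):
    """Stand-in for a contextualized text encoder (hash-seeded toy)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

def siamese_translation_score(head_text, rel_text, tail_text):
    u = toy_encode(head_text + " " + rel_text)   # encodes the (h, r) part
    v = toy_encode(tail_text)                    # encodes the t part
    return -np.linalg.norm(u - v)                # higher = more plausible

print(siamese_translation_score("Ljubljana", "capital of", "Slovenia"))
```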
TransA: An Adaptive Approach for Knowledge Graph Embedding
Knowledge representation is a major topic in AI, and many studies attempt to represent the entities and relations of a knowledge base in a continuous vector space. Among these attempts, translation-based methods build entity and relation vectors by minimizing the translation loss from a head entity to a tail entity. In spite of their success, translation-based methods suffer from an oversimplified loss metric and are not competitive enough to model the diverse and complex entities/relations in knowledge bases. To address this issue, we propose \textbf{TransA}, an adaptive metric approach for embedding, which utilizes metric learning ideas to provide a more flexible embedding method. Experiments are conducted on the benchmark datasets, and our proposed method makes significant and consistent improvements over the state-of-the-art baselines.
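A minimal sketch of an adaptive-metric translation score in the spirit of TransA, assuming the weighted form f_r(h, t) = |h + r - t|^T W_r |h + r - t| with a symmetric non-negative matrix W_r per relation; dimensions and values below are toy stand-ins.

```python
import numpy as np

def transa_score(h, r, t, W_r):
    """Adaptive Mahalanobis-style distance; lower means more plausible."""
    d = np.abs(h + r - t)          # element-wise absolute translation error
    return d @ W_r @ d             # relation-specific weighting of each axis

rng = np.random.default_rng(0)
k = 8                              # embedding dimension (toy)
h, r, t = rng.normal(size=(3, k))
W_r = np.eye(k)                    # identity W_r recovers a plain squared norm
print(transa_score(h, r, t, W_r))
```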
From One Point to A Manifold: Knowledge Graph Embedding For Precise Link Prediction
Knowledge graph embedding aims at offering a numerical knowledge representation paradigm by transforming entities and relations into a continuous vector space. However, existing methods cannot characterize the knowledge graph at a fine-grained level, which precludes precise prediction, for two reasons: they form an ill-posed algebraic system and apply an overly strict geometric form. As precise prediction is critical, we propose a manifold-based embedding principle (\textbf{ManifoldE}), which can be treated as a well-posed algebraic system that expands the position of golden triples from one point in current models to a manifold in ours. Extensive experiments show that the proposed models achieve substantial improvements over the state-of-the-art baselines, especially on the precise prediction task, while maintaining high efficiency. Comment: arXiv admin note: text overlap with arXiv:1509.0548
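A toy rendering of the manifold principle in its sphere form: instead of forcing h + r to coincide with t at one point, a golden triple is expected to lie on the manifold ||h + r - t||^2 = theta_r^2, and the score penalizes deviation from that radius. The per-relation parameter theta_r below is an assumed stand-in.

```python
import numpy as np

def manifolde_sphere_score(h, r, t, theta_r):
    m = np.sum((h + r - t) ** 2)        # manifold function M(h, r, t)
    return (m - theta_r ** 2) ** 2      # zero anywhere on the sphere

rng = np.random.default_rng(1)
h, r, t = rng.normal(size=(3, 8))
print(manifolde_sphere_score(h, r, t, theta_r=1.0))
```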
Deep Feature Learning via Structured Graph Laplacian Embedding for Person Re-Identification
Learning the distance metric between pairs of examples is of great importance for visual recognition, especially for person re-identification (Re-Id). Recently, the contrastive and triplet losses have been proposed to enhance the discriminative power of deeply learned features, and have achieved remarkable success. However, either the contrastive or the triplet loss is just one special case of the Euclidean distance relationships among training samples. Therefore, we propose a structured graph Laplacian embedding algorithm, which can formulate all these structured distance relationships in graph Laplacian form. With the constructed complete graph, the proposed method takes full advantage of the structured distance relationships among training samples. Besides, this formulation makes our method easy to implement and highly effective. When the proposed algorithm is combined with the softmax loss for CNN training, our method obtains much more robust and discriminative deep features, with inter-personal dispersion and intra-personal compactness, which is essential to person Re-Id. We illustrate the effectiveness of our proposed method on top of three popular networks, namely AlexNet, DGDNet and ResNet50, on four widely used Re-Id benchmark datasets. Our proposed method achieves state-of-the-art performance. Comment: 9 pages, 4 figures
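A small sketch of the graph Laplacian formulation referenced above: pairwise distance relationships among training samples become edge weights W on a complete graph, and the embedding loss becomes tr(F^T L F) with L = D - W. Using negative weights to push different identities apart is a common convention assumed here, not necessarily the paper's exact weighting.

```python
import numpy as np

def laplacian_embedding_loss(F, W):
    """F: (n, d) deep features; W: (n, n) symmetric relationship weights."""
    D = np.diag(W.sum(axis=1))     # degree matrix
    L = D - W                      # graph Laplacian
    # tr(F^T L F) equals 0.5 * sum_ij W_ij * ||f_i - f_j||^2.
    return np.trace(F.T @ L @ F)

rng = np.random.default_rng(2)
F = rng.normal(size=(4, 16))                 # toy features for 4 samples
labels = np.array([0, 0, 1, 1])              # two identities
W = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
np.fill_diagonal(W, 0.0)                     # no self-edges
print(laplacian_embedding_loss(F, W))
```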
Legal Question Answering using Ranking SVM and Deep Convolutional Neural Network
This paper presents a study of employing Ranking SVM and a Convolutional Neural Network for two tasks: legal information retrieval and question answering in the Competition on Legal Information Extraction/Entailment. For the first task, our proposed model uses a triple of features (LSI, Manhattan, Jaccard) and operates at the paragraph level instead of the article level as in previous studies; each single-paragraph article corresponds to a particular paragraph in a large multiple-paragraph article. For the legal question answering task, additional statistical features from the information retrieval task, integrated into the Convolutional Neural Network, contribute to higher accuracy. Comment: 15 pages, 2 figures, Tenth International Workshop on Juris-informatics (JURISIN 2016) associated with JSAI International Symposia on AI 2016 (IsAI-2016)
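A minimal sketch of the (LSI, Manhattan, Jaccard) feature triple over a toy bag-of-words representation. The LSI similarity here is cosine in a rank-k SVD subspace of a tiny term-document matrix; everything below is illustrative, not the paper's implementation.

```python
import numpy as np

def jaccard(a_tokens, b_tokens):
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def manhattan(a_vec, b_vec):
    return np.abs(a_vec - b_vec).sum()

def lsi_cosine(term_doc, i, j, k=2):
    U, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    docs = term_doc.T @ U[:, :k]        # project documents into LSI space
    u, v = docs[i], docs[j]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

vocab = ["court", "law", "tax", "fine"]
docs = [["court", "law"], ["court", "tax", "fine"]]
td = np.array([[1.0 if w in d else 0.0 for d in docs] for w in vocab])
print(jaccard(docs[0], docs[1]), manhattan(td[:, 0], td[:, 1]), lsi_cosine(td, 0, 1))
```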
Large-Scale Visual Relationship Understanding
Large-scale visual understanding is challenging, as it requires a model to handle the widely spread and imbalanced distribution of <subject, relation, object> triples. In real-world scenarios with large numbers of objects and relations, some are seen very commonly while others are barely seen. We develop a new relationship detection model that embeds objects and relations into two vector spaces where both discriminative capability and semantic affinity are preserved. We learn both a visual and a semantic module that map features from the two modalities into a shared space, where matched pairs of features must discriminate against unmatched ones while also maintaining close distances to semantically similar ones. Benefiting from this, our model can achieve superior performance even when the visual entity categories scale up to more than 80,000, with an extremely skewed class distribution. We demonstrate the efficacy of our model on a large and imbalanced benchmark based on Visual Genome that comprises 53,000+ objects and 29,000+ relations, a scale at which no previous work has ever been evaluated. We show the superiority of our model over carefully designed baselines on the original Visual Genome dataset with 80,000+ categories. We also show state-of-the-art performance on the VRD dataset and the scene graph dataset, which is a subset of Visual Genome with 200 categories.
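A hedged sketch of the two-module matching idea above: a visual map and a semantic map project features into a shared space, and a margin loss makes matched pairs closer than unmatched ones. Both projections and all dimensions are random toy stand-ins for the learned modules.

```python
import numpy as np

def triplet_match_loss(vis, sem_pos, sem_neg, margin=0.2):
    d_pos = np.linalg.norm(vis - sem_pos)   # matched pair distance
    d_neg = np.linalg.norm(vis - sem_neg)   # unmatched pair distance
    return max(0.0, margin + d_pos - d_neg) # hinge: pull matched, push unmatched

rng = np.random.default_rng(3)
Wv = rng.normal(size=(32, 64))              # visual module (toy linear map)
Ws = rng.normal(size=(32, 50))              # semantic module (toy linear map)
x_vis = rng.normal(size=64)                 # CNN feature (stand-in)
e_pos, e_neg = rng.normal(size=(2, 50))     # word embeddings (stand-ins)
print(triplet_match_loss(Wv @ x_vis, Ws @ e_pos, Ws @ e_neg))
```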
LogicENN: A Neural Based Knowledge Graphs Embedding Model with Logical Rules
Knowledge graph embedding models have gained significant attention in AI research. Recent works have shown that the inclusion of background knowledge, such as logical rules, can improve the performance of embeddings in downstream machine learning tasks. However, so far, most existing models do not allow the inclusion of rules. We address the challenge of including rules and present a new neural-based embedding model (LogicENN). We prove that LogicENN can learn every ground truth of encoded rules in a knowledge graph. To the best of our knowledge, this has not been proved so far for the neural-based family of embedding models. Moreover, we derive formulae for the inclusion of various rules, including (anti-)symmetry, inversion, irreflexivity, transitivity, implication, composition, equivalence and negation. Our formulation allows us to avoid grounding for implication and equivalence relations. Our experiments show that LogicENN outperforms the state-of-the-art models in link prediction.
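A hedged sketch of rule-aware neural scoring: a shared network phi(h, t) produces entity-pair features that are dotted with per-relation weights, and an implication rule r1 => r2 is softly enforced with a hinge penalty so that score_r2 >= score_r1. This is a simplification for illustration, not LogicENN's exact grounding-free construction.

```python
import numpy as np

def phi(h, t):
    """Stand-in for the shared neural feature of an entity pair."""
    return np.tanh(np.concatenate([h, t, h * t]))

def score(h, t, w_r):
    return phi(h, t) @ w_r              # relation-specific readout

def implication_penalty(h, t, w_r1, w_r2):
    # Soft constraint for r1(h, t) => r2(h, t) on a sampled pair.
    return max(0.0, score(h, t, w_r1) - score(h, t, w_r2))

rng = np.random.default_rng(4)
h, t = rng.normal(size=(2, 8))
w_r1, w_r2 = rng.normal(size=(2, 24))   # 24 = 3 * entity dimension
print(implication_penalty(h, t, w_r1, w_r2))
```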
A Framework for Deep Constrained Clustering -- Algorithms and Advances
The area of constrained clustering has been extensively explored by researchers and used by practitioners. Constrained clustering formulations exist for popular algorithms such as k-means, mixture models, and spectral clustering, but they have several limitations. A fundamental strength of deep learning is its flexibility, and here we explore a deep learning framework for constrained clustering, and in particular how it can extend the field of constrained clustering. We show that our framework can not only handle standard together/apart constraints (without the well-documented negative effects reported earlier) generated from labeled side information, but also more complex constraints generated from new types of side information such as continuous values and high-level domain knowledge. Comment: Updated for ECML/PKDD 201
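A small sketch of together/apart constraints on soft cluster assignments: for a must-link pair the loss rewards agreement of the two assignment distributions, and for a cannot-link pair it penalizes agreement. The dot product of the distributions is a common surrogate assumed here, not necessarily the paper's exact loss.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def pairwise_constraint_loss(p_i, p_j, must_link):
    # p_i @ p_j = probability both samples pick the same cluster
    # (under independence of the two soft assignments).
    agreement = float(p_i @ p_j)
    if must_link:
        return -np.log(agreement + 1e-12)        # reward agreement
    return -np.log(1.0 - agreement + 1e-12)      # reward disagreement

rng = np.random.default_rng(5)
p_i, p_j = softmax(rng.normal(size=5)), softmax(rng.normal(size=5))
print(pairwise_constraint_loss(p_i, p_j, must_link=True),
      pairwise_constraint_loss(p_i, p_j, must_link=False))
```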