Information Extraction from Scientific Literature for Method Recommendation
As a research community grows, more and more papers are published each year.
As a result, there is increasing demand for better methods for finding
relevant papers, automatically understanding their key ideas, and recommending
potential methods for a target problem. Despite advances in search engines, it
is still hard to identify new technologies that match a researcher's needs.
Due to the large variety of domains and the extremely limited annotated
resources, there has been relatively little work on leveraging natural
language processing for scientific recommendation. In this proposal, we aim to
make scientific recommendations by extracting scientific terms from a large
collection of scientific papers and organizing the terms into a knowledge
graph. In preliminary work, we trained a scientific term extractor using a
small amount of annotated data and obtained state-of-the-art performance by
leveraging a large amount of unannotated papers through multiple
semi-supervised approaches. We propose to construct a knowledge graph with
minimal use of hand-annotated data, relying only on the extracted terms,
unsupervised relational signals such as co-occurrence, and structured external
resources such as Wikipedia. Latent relations between scientific terms can be
learned from the graph. Recommendations will be made through graph inference
for both observed and unobserved relational pairs.
Comment: Thesis Proposal. arXiv admin note: text overlap with arXiv:1708.0607
Text Generation from Knowledge Graphs with Graph Transformers
Generating texts which express complex ideas spanning multiple sentences
requires a structured representation of their content (document plan), but
these representations are prohibitively expensive to manually produce. In this
work, we address the problem of generating coherent multi-sentence texts from
the output of an information extraction system, and in particular a knowledge
graph. Graphical knowledge representations are ubiquitous in computing, but
pose a significant challenge for text generation techniques due to their
non-hierarchical nature, collapsing of long-distance dependencies, and
structural variety. We introduce a novel graph transforming encoder which can
leverage the relational structure of such knowledge graphs without imposing
linearization or hierarchical constraints. Incorporated into an encoder-decoder
setup, we provide an end-to-end trainable system for graph-to-text generation
that we apply to the domain of scientific text. Automatic and human evaluations
show that our technique produces more informative texts which exhibit better
document structure than competitive encoder-decoder methods.
Comment: Accepted as a long paper in NAACL 201
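In drastically simplified form, the key idea, attention restricted to the knowledge graph's edges rather than to a linearized sequence, can be sketched as follows. This is a single-head illustration with no learned projections, not the paper's architecture:

```python
import numpy as np

def graph_attention(node_feats, adj):
    """One self-attention step restricted to graph edges: every node
    attends only to itself and its neighbours, so the graph is never
    linearized into a sequence. Single head, no learned projections."""
    d = node_feats.shape[1]
    scores = node_feats @ node_feats.T / np.sqrt(d)
    mask = adj + np.eye(len(adj))             # allow self-attention
    scores = np.where(mask > 0, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ node_feats

# A 3-node path graph 0 - 1 - 2, with one-hot node features.
adj = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
out = graph_attention(np.eye(3), adj)
# Node 0 has no edge to node 2, so its output gets exactly zero weight
# from node 2's features.
```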
A Comprehensive Survey of Graph Embedding: Problems, Techniques and Applications
Graphs are an important data representation that appears in a wide diversity
of real-world scenarios. Effective graph analytics gives users a deeper
understanding of what is behind the data, and thus can benefit many useful
applications such as node classification, node recommendation, and link
prediction. However, most graph analytics methods suffer from high computation
and space costs. Graph embedding is an effective yet efficient way to solve the
graph analytics problem. It converts the graph data into a low-dimensional
space in which the graph structural information and graph properties are
maximally preserved. In this survey, we conduct a comprehensive review of the
literature in graph embedding. We first introduce the formal definition of
graph embedding as well as the related concepts. After that, we propose two
taxonomies of graph embedding which correspond to what challenges exist in
different graph embedding problem settings and how existing works address
these challenges in their solutions. Finally, we summarize the applications
that graph embedding enables and suggest four promising future research
directions in terms of computation efficiency, problem settings, techniques and
application scenarios.
Comment: A 20-page comprehensive survey of graph/network embedding covering
over 150 papers up to 2018. It provides a systematic categorization of
problems, techniques and applications. Accepted by IEEE Transactions on
Knowledge and Data Engineering (TKDE). Comments and suggestions are welcome
for continuously improving this survey.
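As a minimal illustration of one family the survey categorizes (matrix-factorization methods), node embeddings can be obtained from a truncated SVD of the adjacency matrix: the top singular vectors give the best low-rank approximation of the graph's structure in Frobenius norm. The toy graph below is invented for the example.

```python
import numpy as np

def embed_graph(adj, dim):
    """Node embeddings from a truncated SVD of the adjacency matrix,
    a classic matrix-factorization embedding. The top-`dim` singular
    vectors preserve as much of `adj` (in Frobenius norm) as any
    rank-`dim` factorization can."""
    u, s, _ = np.linalg.svd(adj, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])  # rows are node embeddings

# A 6-node graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    adj[i, j] = adj[j, i] = 1.0

emb = embed_graph(adj, dim=2)
print(emb.shape)  # (6, 2): one low-dimensional vector per node
# Nodes 0 and 1 play identical structural roles in this graph, so the
# factorization assigns them (numerically) identical embeddings.
```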
Entity Embeddings with Conceptual Subspaces as a Basis for Plausible Reasoning
Conceptual spaces are geometric representations of conceptual knowledge, in
which entities correspond to points, natural properties correspond to convex
regions, and the dimensions of the space correspond to salient features. While
conceptual spaces enable elegant models of various cognitive phenomena, the
lack of automated methods for constructing such representations has so far
limited their application in artificial intelligence. To address this issue, we
propose a method which learns a vector-space embedding of entities from
Wikipedia and constrains this embedding such that entities of the same semantic
type are located in some lower-dimensional subspace. We experimentally
demonstrate the usefulness of these subspaces as (approximate) conceptual space
representations by showing, among others, that important features can be
modelled as directions and that natural properties tend to correspond to convex
regions.
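The claim that important features can be modelled as directions can be sketched in a few lines: rank entities by their projection onto a feature direction. The 2-D entity vectors and the "budget" direction below are invented for illustration, not learned from Wikipedia as in the paper.

```python
import numpy as np

# Hypothetical 2-D embeddings of three film entities, with one direction
# assumed to capture the salient feature "budget".
entities = {
    "blockbuster_a": np.array([0.9, 0.2]),
    "blockbuster_b": np.array([0.8, 0.4]),
    "indie_film":    np.array([0.1, 0.7]),
}
budget_direction = np.array([1.0, 0.0])  # assumed feature direction

# Modelling a feature as a direction: rank entities by their projection
# onto that direction.
ranked = sorted(entities, key=lambda e: entities[e] @ budget_direction,
                reverse=True)
print(ranked)  # higher-budget films project further along the direction
```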
Cross-lingual Entity Alignment via Joint Attribute-Preserving Embedding
Entity alignment is the task of finding entities in two knowledge bases (KBs)
that represent the same real-world object. When facing KBs in different natural
languages, conventional cross-lingual entity alignment methods rely on machine
translation to eliminate the language barriers. These approaches often suffer
from the uneven quality of translations between languages. While recent
embedding-based techniques encode entities and relationships in KBs and do not
need machine translation for cross-lingual entity alignment, a significant
number of attributes remain largely unexplored. In this paper, we propose a
joint attribute-preserving embedding model for cross-lingual entity alignment.
It jointly embeds the structures of two KBs into a unified vector space and
further refines it by leveraging attribute correlations in the KBs. Our
experimental results on real-world datasets show that this approach
significantly outperforms the state-of-the-art embedding approaches for
cross-lingual entity alignment and could be complemented with methods based on
machine translation.
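Once the two KBs are embedded into one unified vector space, the alignment step itself can be as simple as nearest-neighbour search. The sketch below assumes pre-computed unit vectors (the entity names and vectors are hypothetical) and omits the attribute-preserving refinement the paper adds.

```python
import numpy as np

def align(emb_kb1, emb_kb2):
    """Nearest-neighbour alignment in a shared embedding space.

    emb_kb1, emb_kb2: dicts mapping entity names to unit vectors that a
    joint embedding model has placed in one space. Returns the most
    similar KB2 entity for every KB1 entity.
    """
    names2 = list(emb_kb2)
    mat2 = np.stack([emb_kb2[n] for n in names2])
    result = {}
    for name1, v in emb_kb1.items():
        sims = mat2 @ v  # cosine similarity, given unit vectors
        result[name1] = names2[int(np.argmax(sims))]
    return result

# Hypothetical unit vectors for an English and a French KB.
en = {"Paris": np.array([1.0, 0.0]), "London": np.array([0.0, 1.0])}
fr = {"Paris_fr": np.array([0.96, 0.28]), "Londres": np.array([0.28, 0.96])}
print(align(en, fr))  # each English entity maps to its French counterpart
```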
From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge
In this paper we propose the construction of linguistic descriptions of
images. This is achieved through the extraction of scene description graphs
(SDGs) from visual scenes using an automatically constructed knowledge base.
SDGs are constructed using both vision and reasoning. Specifically, commonsense
reasoning is applied on (a) detections obtained from existing perception
methods on given images, (b) a "commonsense" knowledge base constructed using
natural language processing of image annotations and (c) lexical ontological
knowledge from resources such as WordNet. Amazon Mechanical Turk (AMT)-based
evaluations on Flickr8k, Flickr30k and MS-COCO datasets show that in most
cases, sentences auto-constructed from SDGs obtained by our method give a more
relevant and thorough description of an image than a recent state-of-the-art
image-captioning approach. Our Image-Sentence Alignment Evaluation results
are also comparable to those of recent state-of-the-art approaches.
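The reasoning step over inputs (a)-(c) above can be caricatured in a few lines: pair up detected objects and keep a relation only when the commonsense knowledge base licenses it for those types. The KB entries and detections below are invented for illustration, not the paper's automatically constructed KB.

```python
# Hypothetical commonsense KB of (subject type, relation, object type).
KB = [("person", "rides", "horse"), ("person", "holds", "umbrella")]

def scene_description_graph(detections):
    """Propose SDG triples: keep a KB relation only when both the
    subject and object types were detected in the image. A toy
    stand-in for the paper's commonsense reasoning step."""
    triples = []
    for subj, rel, obj in KB:
        if subj in detections and obj in detections:
            triples.append((subj, rel, obj))
    return triples

print(scene_description_graph(["person", "horse", "tree"]))
# [('person', 'rides', 'horse')]
```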
node2bits: Compact Time- and Attribute-aware Node Representations for User Stitching
Identity stitching, the task of identifying and matching various online
references (e.g., sessions over different devices and timespans) to the same
user in real-world web services, is crucial for personalization and
recommendations. However, traditional user stitching approaches, such as
grouping or blocking, require quadratic pairwise comparisons between a massive
number of user activities, thus posing both computational and storage
challenges. Recent works, which are often application-specific, heuristically
seek to reduce the amount of comparisons, but they suffer from low precision
and recall. To solve the problem in an application-independent way, we take a
heterogeneous network-based approach in which users (nodes) interact with
content (e.g., sessions, websites), and may have attributes (e.g., location).
We propose node2bits, an efficient framework that represents multi-dimensional
features of node contexts with binary hashcodes. node2bits leverages
feature-based temporal walks to encapsulate short- and long-term interactions
between nodes in heterogeneous web networks, and adopts SimHash to obtain
compact, binary representations and avoid the quadratic complexity for
similarity search. Extensive experiments on large-scale real networks show that
node2bits outperforms traditional techniques and existing works that generate
real-valued embeddings by up to 5.16% in F1 score on user stitching, while
taking only up to 1.56% as much storage.
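The SimHash step that gives node2bits its compact binary codes can be sketched as follows: project each feature vector onto random hyperplanes and keep only the signs, so that Hamming distance between codes approximates angular similarity. The feature vectors below are invented stand-ins for the temporal-walk feature histograms the paper actually uses.

```python
import numpy as np

def simhash(features, n_bits, seed=0):
    """SimHash: sign of random projections, giving binary codes whose
    Hamming distance approximates the angle between feature vectors."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((features.shape[1], n_bits))
    return (features @ planes > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.sum(a != b))

# Three toy node-context feature vectors; the first two are similar.
feats = np.array([[1.0, 2.0, 0.0, 1.0],
                  [1.1, 1.9, 0.1, 0.9],
                  [0.0, 0.1, 3.0, 0.0]])
codes = simhash(feats, n_bits=64)
# Similar nodes end up with close hashcodes, so similarity search needs
# only cheap Hamming comparisons, not quadratic real-valued ones.
assert hamming(codes[0], codes[1]) < hamming(codes[0], codes[2])
```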
Knowledge Graph Alignment using String Edit Distance
In this work, we propose a novel knowledge graph alignment technique based
upon string edit distance that exploits the type information between entities
and can find similarity between relations of any arity.
Comment: Position Paper
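A minimal sketch of the edit-distance signal: match every entity label in one KG to its closest label in the other by Levenshtein distance. The labels below are invented, and the type information the paper also exploits is omitted.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def align_entities(kg1, kg2):
    """Match every KG1 entity label to the closest KG2 label by edit
    distance: a toy version of the alignment signal, without types."""
    return {e1: min(kg2, key=lambda e2: edit_distance(e1, e2)) for e1 in kg1}

kg1 = ["New_York_City", "United_Kingdom"]
kg2 = ["NewYork_City", "United_Kingdom_(UK)", "Tokyo"]
print(align_entities(kg1, kg2))
# {'New_York_City': 'NewYork_City', 'United_Kingdom': 'United_Kingdom_(UK)'}
```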
Collaborative Adversarial Learning for Relational Learning on Multiple Bipartite Graphs
Relational learning aims to make relation inference by exploiting the
correlations among different types of entities. Exploring relational learning
on multiple bipartite graphs has been receiving attention because of its
popular applications such as recommendation. The main challenge on multiple
bipartite graphs is making efficient relation inference with few observed
links. Most existing approaches attempt to solve the sparsity problem by
learning shared representations to integrate knowledge from multi-source data
for shared entities. However, they merely model the correlations from one
aspect (e.g. distribution, representation), and cannot impose sufficient
constraints on different relations of the shared entities. One effective way of
modeling the multi-domain data is to learn the joint distribution of the shared
entities across domains. In this paper, we propose Collaborative Adversarial
Learning (CAL) that explicitly models the joint distribution of the shared
entities across multiple bipartite graphs. The objective of CAL is formulated
from a variational lower bound that maximizes the joint log-likelihoods of the
observations. In particular, CAL consists of distribution-level and
feature-level alignments for knowledge from multiple bipartite graphs. The
two-level alignment acts as two different constraints on different relations of
the shared entities and facilitates better knowledge transfer for relational
learning on multiple bipartite graphs. Extensive experiments on two real-world
datasets have shown that the proposed model outperforms the existing methods.
Comment: 8 pages. It has been accepted by IEEE International Conference on
Knowledge Graphs (ICKG) 202