Representation Learning for Attributed Multiplex Heterogeneous Network

Author: Bhagat Smriti
Bojchevski Aleksandar
Hamilton Will
Huang Xiao
Kingma Diederik P
Lin Zhouhan
Mikolov Tomas
Tang Lei
Taskar Ben
Thomas
Yang Cheng
Yang Zhilin
Zhang Hongming
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/05/2019

Network embedding (or graph embedding) has been widely used in many real-world applications. However, existing methods mainly focus on networks with single-typed nodes/edges and cannot scale well to handle large networks. Many real-world networks consist of billions of nodes and edges of multiple types, and each node is associated with different attributes. In this paper, we formalize the problem of embedding learning for the Attributed Multiplex Heterogeneous Network and propose a unified framework to address this problem. The framework supports both transductive and inductive learning. We also give a theoretical analysis of the proposed framework, showing its connection with previous works and proving its better expressiveness. We conduct systematic evaluations of the proposed framework on four different genres of challenging datasets: Amazon, YouTube, Twitter, and Alibaba. Experimental results demonstrate that with the learned embeddings from the proposed framework, we can achieve statistically significant improvements (e.g., 5.99-28.23% lift by F1 scores; p<<0.01, t-test) over previous state-of-the-art methods for link prediction. The framework has also been successfully deployed on the recommendation system of a worldwide leading e-commerce company, Alibaba Group. Results of the offline A/B tests on product recommendation further confirm the effectiveness and efficiency of the framework in practice.
Comment: Accepted to KDD 2019. Website: https://sites.google.com/view/gatn

arXiv.org e-Print Archive

Crossref

Stochastic Answer Networks for Machine Reading Comprehension

Author: Duh Kevin
Gao Jianfeng
Liu Xiaodong
Shen Yelong
Publication date: 01/01/2018

We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension. Compared to previous work such as ReasoNet, which used reinforcement learning to determine the number of steps, the unique feature is the use of a kind of stochastic prediction dropout on the answer module (final layer) of the neural network during training. We show that this simple trick improves robustness and achieves results competitive with the state-of-the-art on the Stanford Question Answering Dataset (SQuAD), the Adversarial SQuAD, and the Microsoft MAchine Reading COmprehension Dataset (MS MARCO).
Comment: 11 pages, 5 figures, Accepted to ACL 201
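As a rough illustration of the idea of stochastic prediction dropout in the abstract above, the sketch below drops whole per-step answer distributions at random during training and averages the ones that remain. The abstract does not state how the step predictions are combined, so the averaging, the rule of keeping at least one step, and the function name `average_step_predictions` are all assumptions made for this example, not the authors' implementation.

```python
import random


def average_step_predictions(step_probs, drop_rate=0.0, training=False, rng=None):
    """Average per-step answer distributions, optionally dropping whole steps.

    step_probs: list of T probability lists, one per reasoning step.
    drop_rate:  probability of discarding each step's prediction (training only).
    """
    rng = rng or random.Random()
    if training and drop_rate > 0.0:
        # Dropout acts on entire step predictions, not on individual units.
        kept = [p for p in step_probs if rng.random() >= drop_rate]
        # Assumption: always keep at least one step so the average is defined.
        if not kept:
            kept = [rng.choice(step_probs)]
    else:
        # At inference time every step contributes to the average.
        kept = step_probs
    n = len(kept)
    return [sum(column) / n for column in zip(*kept)]


# Toy distributions over three candidate answers from three reasoning steps.
steps = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.6, 0.3, 0.1],
]
inference = average_step_predictions(steps)
training = average_step_predictions(
    steps, drop_rate=0.4, training=True, rng=random.Random(0)
)
```

Because each surviving step is itself a probability distribution, the averaged output still sums to one whichever steps are dropped.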

arXiv.org e-Print Archive

Crossref

Joint Entity Extraction and Assertion Detection for Clinical Text

Author: Bhatia Parminder
Celikkaya Busra
Khalilia Mohammed
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

Negative medical findings are prevalent in clinical reports, yet discriminating them from positive findings remains a challenging task for information extraction. Most existing systems treat this task as a pipeline of two separate tasks, i.e., named entity recognition (NER) and rule-based negation detection. We consider this as a multi-task problem and present a novel end-to-end neural model to jointly extract entities and negations. We extend a standard hierarchical encoder-decoder NER model and first adopt a shared encoder followed by separate decoders for the two tasks. This architecture performs considerably better than the previous rule-based and machine learning-based systems. To overcome the problem of increased parameter size, especially for low-resource settings, we propose the Conditional Softmax Shared Decoder architecture, which achieves state-of-the-art results for NER and negation detection on the 2010 i2b2/VA challenge dataset and a proprietary de-identified clinical dataset.
Comment: Accepted at the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019

arXiv.org e-Print Archive

Crossref