33,257 research outputs found
Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks
Heterogeneous information networks (HINs) are ubiquitous in real-world
applications. In the meantime, network embedding has emerged as a convenient
tool to mine and learn from networked data. As a result, it is of interest to
develop HIN embedding methods. However, the heterogeneity in HINs introduces
not only rich information but also potentially incompatible semantics, which
poses special challenges to embedding learning in HINs. With the intention to
preserve the rich yet potentially incompatible information in HIN embedding, we
propose to study the problem of comprehensive transcription of heterogeneous
information networks. The comprehensive transcription of HINs also provides an
easy-to-use approach to unleash the power of HINs, since it requires no
additional supervision, expertise, or feature engineering. To cope with the
challenges in the comprehensive transcription of HINs, we propose the HEER
algorithm, which embeds HINs via edge representations that are further coupled
with properly-learned heterogeneous metrics. To corroborate the efficacy of
HEER, we conducted experiments on two large-scale real-words datasets with an
edge reconstruction task and multiple case studies. Experiment results
demonstrate the effectiveness of the proposed HEER model and the utility of
edge representations and heterogeneous metrics. The code and data are available
at https://github.com/GentleZhu/HEER.Comment: 10 pages. In Proceedings of the 24th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, London, United Kingdom,
ACM, 201
Learning the Joint Representation of Heterogeneous Temporal Events for Clinical Endpoint Prediction
The availability of a large amount of electronic health records (EHR)
provides huge opportunities to improve health care service by mining these
data. One important application is clinical endpoint prediction, which aims to
predict whether a disease, a symptom or an abnormal lab test will happen in the
future according to patients' history records. This paper develops deep
learning techniques for clinical endpoint prediction, which are effective in
many practical applications. However, the problem is very challenging since
patients' history records contain multiple heterogeneous temporal events such
as lab tests, diagnosis, and drug administrations. The visiting patterns of
different types of events vary significantly, and there exist complex nonlinear
relationships between different events. In this paper, we propose a novel model
for learning the joint representation of heterogeneous temporal events. The
model adds a new gate to control the visiting rates of different events which
effectively models the irregular patterns of different events and their
nonlinear correlations. Experiment results with real-world clinical data on the
tasks of predicting death and abnormal lab tests prove the effectiveness of our
proposed approach over competitive baselines.Comment: 8 pages, this paper has been accepted by AAAI 201
Structural Deep Embedding for Hyper-Networks
Network embedding has recently attracted lots of attentions in data mining.
Existing network embedding methods mainly focus on networks with pairwise
relationships. In real world, however, the relationships among data points
could go beyond pairwise, i.e., three or more objects are involved in each
relationship represented by a hyperedge, thus forming hyper-networks. These
hyper-networks pose great challenges to existing network embedding methods when
the hyperedges are indecomposable, that is to say, any subset of nodes in a
hyperedge cannot form another hyperedge. These indecomposable hyperedges are
especially common in heterogeneous networks. In this paper, we propose a novel
Deep Hyper-Network Embedding (DHNE) model to embed hyper-networks with
indecomposable hyperedges. More specifically, we theoretically prove that any
linear similarity metric in embedding space commonly used in existing methods
cannot maintain the indecomposibility property in hyper-networks, and thus
propose a new deep model to realize a non-linear tuplewise similarity function
while preserving both local and global proximities in the formed embedding
space. We conduct extensive experiments on four different types of
hyper-networks, including a GPS network, an online social network, a drug
network and a semantic network. The empirical results demonstrate that our
method can significantly and consistently outperform the state-of-the-art
algorithms.Comment: Accepted by AAAI 1
Multimodal Classification of Urban Micro-Events
In this paper we seek methods to effectively detect urban micro-events. Urban
micro-events are events which occur in cities, have limited geographical
coverage and typically affect only a small group of citizens. Because of their
scale these are difficult to identify in most data sources. However, by using
citizen sensing to gather data, detecting them becomes feasible. The data
gathered by citizen sensing is often multimodal and, as a consequence, the
information required to detect urban micro-events is distributed over multiple
modalities. This makes it essential to have a classifier capable of combining
them. In this paper we explore several methods of creating such a classifier,
including early, late, hybrid fusion and representation learning using
multimodal graphs. We evaluate performance on a real world dataset obtained
from a live citizen reporting system. We show that a multimodal approach yields
higher performance than unimodal alternatives. Furthermore, we demonstrate that
our hybrid combination of early and late fusion with multimodal embeddings
performs best in classification of urban micro-events
- …