Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks
Heterogeneous information networks (HINs) are ubiquitous in real-world
applications. In the meantime, network embedding has emerged as a convenient
tool to mine and learn from networked data. As a result, it is of interest to
develop HIN embedding methods. However, the heterogeneity in HINs introduces
not only rich information but also potentially incompatible semantics, which
poses special challenges to embedding learning in HINs. With the intention to
preserve the rich yet potentially incompatible information in HIN embedding, we
propose to study the problem of comprehensive transcription of heterogeneous
information networks. The comprehensive transcription of HINs also provides an
easy-to-use approach to unleash the power of HINs, since it requires no
additional supervision, expertise, or feature engineering. To cope with the
challenges in the comprehensive transcription of HINs, we propose the HEER
algorithm, which embeds HINs via edge representations that are further coupled
with properly-learned heterogeneous metrics. To corroborate the efficacy of
HEER, we conducted experiments on two large-scale real-world datasets with an
edge reconstruction task and multiple case studies. Experiment results
demonstrate the effectiveness of the proposed HEER model and the utility of
edge representations and heterogeneous metrics. The code and data are available
at https://github.com/GentleZhu/HEER.
Comment: 10 pages. In Proceedings of the 24th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, London, United Kingdom,
ACM, 201
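The idea of coupling edge representations with heterogeneous metrics can be illustrated with a minimal sketch. It assumes, purely for illustration, that an edge representation is the element-wise product of its two node vectors and that each edge type has its own metric vector that reweights dimensions; the abstract does not spell out these details, and all names and numbers below are hypothetical.

```python
import math

def edge_representation(u, v):
    """Assumed edge representation: element-wise product of the two node vectors."""
    return [a * b for a, b in zip(u, v)]

def typed_score(edge_vec, metric):
    """Score an edge under one edge type by weighting each dimension
    with that type's metric vector, then squashing with a sigmoid."""
    z = sum(m * e for m, e in zip(metric, edge_vec))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 3-dimensional node embeddings.
author = [1.0, 0.5, 0.0]
paper = [1.0, 0.0, 2.0]

# Hypothetical per-type metrics: each edge type attends to different dimensions,
# so semantics that are incompatible across types do not have to share one space.
metrics = {
    "writes": [2.0, 0.0, 0.0],
    "cites": [0.0, 0.0, 2.0],
}

edge = edge_representation(author, paper)
scores = {edge_type: typed_score(edge, m) for edge_type, m in metrics.items()}
```

In this toy setup the same node pair scores differently under each edge type, which is the intuition behind using a separate metric per type.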
LATTE: Application Oriented Social Network Embedding
In recent years, many research works propose to embed the network structured
data into a low-dimensional feature space, where each node is represented as a
feature vector. However, because the embedding process is detached from
external tasks, the embeddings learned by most existing models
can be ineffective for application tasks with specific objectives, e.g.,
community detection or information diffusion. In this paper, we propose to study
the application oriented heterogeneous social network embedding problem.
Unlike existing works, this problem requires not only preserving the network
structure but also incorporating the objectives of external
applications in the objective function. To resolve the problem,
we propose a novel network embedding framework, namely the "appLicAtion
orienTed neTwork Embedding" (Latte) model. In Latte, the heterogeneous network
structure is used to compute the node "diffusive proximity" scores,
which capture both local and global network structures. Based on these computed
scores, Latte learns the network representation feature vectors by extending
the autoencoder model to the heterogeneous network scenario, which can
also effectively unite the objectives of network embedding and external
application tasks. Extensive experiments have been done on real-world
heterogeneous social network datasets, and the experimental results have
demonstrated the outstanding performance of Latte in learning the
representation vectors for specific application tasks.
Comment: 11 Pages, 12 Figures, 1 Tabl
BL-MNE: Emerging Heterogeneous Social Network Embedding through Broad Learning with Aligned Autoencoder
Network embedding aims at projecting the network data into a low-dimensional
feature space, where each node is represented as a unique feature vector and
the network structure can be effectively preserved. In recent years, more and more
online application service sites can be represented as massive and complex
networks, which are extremely challenging for traditional machine learning
algorithms to deal with. Effectively embedding the complex network data into a
low-dimensional feature representation can both save data storage space and
make traditional machine learning algorithms applicable to the network
data. Network embedding performance will degrade greatly if the networks have
a sparse structure, like emerging networks with few connections. In this
paper, we propose to learn the embedding representation for a target emerging
network based on the broad learning setting, where the emerging network is
aligned with other external mature networks at the same time. To solve the
problem, a new embedding framework, namely "Deep alIgned autoencoder based
eMbEdding" (DIME), is introduced in this paper. DIME handles the diverse links
and attributes in a unified analytic based on broad learning, and introduces the
multiple aligned attributed heterogeneous social network concept to model the
network structure. A set of meta paths is introduced in the paper, which
define various kinds of connections among users via the heterogeneous link and
attribute information. The closeness among users in the networks is defined as
the meta proximity scores, which will be fed into DIME to learn the embedding
vectors of users in the emerging network. Extensive experiments have been done
on real-world aligned social networks, which have demonstrated the
effectiveness of DIME in learning the emerging network embedding vectors.
Comment: 10 pages, 9 figures, 4 tables. Full paper is accepted by ICDM 2017,
In: Proceedings of the 2017 IEEE International Conference on Data Mining
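Meta-path-based closeness among users can be sketched as follows. This is a minimal illustration, not the paper's definition: it assumes a single meta path of the form user -> attribute <- user (e.g., two users sharing a check-in location), counts path instances with a matrix product, and normalizes the counts with an assumed Dice-style formula. The toy data and function names are hypothetical.

```python
# Hypothetical toy data: 3 users, 2 locations.
# checkin[u][l] = 1 if user u has checked in at location l.
checkin = [
    [1, 0],
    [1, 1],
    [0, 1],
]

def transpose(m):
    """Return the transpose of a dense matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def meta_path_counts(user_to_attr):
    """Count instances of the meta path user -> attribute <- user for every user pair."""
    return matmul(user_to_attr, transpose(user_to_attr))

def proximity(counts):
    """Normalize path counts into [0, 1] with an assumed Dice-style formula."""
    n = len(counts)
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = counts[i][i] + counts[j][j]
            if i != j and denom > 0:
                scores[i][j] = 2 * counts[i][j] / denom
    return scores

counts = meta_path_counts(checkin)
scores = proximity(counts)
```

A matrix of scores like this one, computed per meta path, is the kind of input that could be fed to an autoencoder to learn user embeddings.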
Exploring Student Check-In Behavior for Improved Point-of-Interest Prediction
With the availability of vast amounts of user visitation history on
location-based social networks (LBSN), the problem of Point-of-Interest (POI)
prediction has been extensively studied. However, much of the research has been
conducted solely on voluntary check-in datasets collected from social apps such
as Foursquare or Yelp. While these data contain rich information about
recreational activities (e.g., restaurants, nightlife, and entertainment),
information about more prosaic aspects of people's lives is sparse. This not
only limits our understanding of users' daily routines but, more importantly,
the modeling assumptions developed based on characteristics of recreation-based
data may not be suitable for richer check-in data. In this work, we present an
analysis of education "check-in" data using WiFi access logs collected at
Purdue University. We propose a heterogeneous graph-based method to encode the
correlations between users, POIs, and activities, and then jointly learn
embeddings for the vertices. We evaluate our method against previous
state-of-the-art POI prediction methods, and show that the assumptions made by
previous methods significantly degrade performance on our data with dense(r)
activity signals. We also show how our learned embeddings could be used to
identify similar students (e.g., for friend suggestions).
Comment: published in KDD'1