14,046 research outputs found

    Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks

    Full text link
    Heterogeneous information networks (HINs) are ubiquitous in real-world applications. In the meantime, network embedding has emerged as a convenient tool to mine and learn from networked data. As a result, it is of interest to develop HIN embedding methods. However, the heterogeneity in HINs introduces not only rich information but also potentially incompatible semantics, which poses special challenges to embedding learning in HINs. With the intention to preserve the rich yet potentially incompatible information in HIN embedding, we propose to study the problem of comprehensive transcription of heterogeneous information networks. The comprehensive transcription of HINs also provides an easy-to-use approach to unleash the power of HINs, since it requires no additional supervision, expertise, or feature engineering. To cope with the challenges in the comprehensive transcription of HINs, we propose the HEER algorithm, which embeds HINs via edge representations that are further coupled with properly-learned heterogeneous metrics. To corroborate the efficacy of HEER, we conducted experiments on two large-scale real-words datasets with an edge reconstruction task and multiple case studies. Experiment results demonstrate the effectiveness of the proposed HEER model and the utility of edge representations and heterogeneous metrics. The code and data are available at https://github.com/GentleZhu/HEER.Comment: 10 pages. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, United Kingdom, ACM, 201

    LATTE: Application Oriented Social Network Embedding

    Full text link
    In recent years, many research works propose to embed the network structured data into a low-dimensional feature space, where each node is represented as a feature vector. However, due to the detachment of embedding process with external tasks, the learned embedding results by most existing embedding models can be ineffective for application tasks with specific objectives, e.g., community detection or information diffusion. In this paper, we propose study the application oriented heterogeneous social network embedding problem. Significantly different from the existing works, besides the network structure preservation, the problem should also incorporate the objectives of external applications in the objective function. To resolve the problem, in this paper, we propose a novel network embedding framework, namely the "appLicAtion orienTed neTwork Embedding" (Latte) model. In Latte, the heterogeneous network structure can be applied to compute the node "diffusive proximity" scores, which capture both local and global network structures. Based on these computed scores, Latte learns the network representation feature vectors by extending the autoencoder model model to the heterogeneous network scenario, which can also effectively unite the objectives of network embedding and external application tasks. Extensive experiments have been done on real-world heterogeneous social network datasets, and the experimental results have demonstrated the outstanding performance of Latte in learning the representation vectors for specific application tasks.Comment: 11 Pages, 12 Figures, 1 Tabl

    BL-MNE: Emerging Heterogeneous Social Network Embedding through Broad Learning with Aligned Autoencoder

    Full text link
    Network embedding aims at projecting the network data into a low-dimensional feature space, where the nodes are represented as a unique feature vector and network structure can be effectively preserved. In recent years, more and more online application service sites can be represented as massive and complex networks, which are extremely challenging for traditional machine learning algorithms to deal with. Effective embedding of the complex network data into low-dimension feature representation can both save data storage space and enable traditional machine learning algorithms applicable to handle the network data. Network embedding performance will degrade greatly if the networks are of a sparse structure, like the emerging networks with few connections. In this paper, we propose to learn the embedding representation for a target emerging network based on the broad learning setting, where the emerging network is aligned with other external mature networks at the same time. To solve the problem, a new embedding framework, namely "Deep alIgned autoencoder based eMbEdding" (DIME), is introduced in this paper. DIME handles the diverse link and attribute in a unified analytic based on broad learning, and introduces the multiple aligned attributed heterogeneous social network concept to model the network structure. A set of meta paths are introduced in the paper, which define various kinds of connections among users via the heterogeneous link and attribute information. The closeness among users in the networks are defined as the meta proximity scores, which will be fed into DIME to learn the embedding vectors of users in the emerging network. Extensive experiments have been done on real-world aligned social networks, which have demonstrated the effectiveness of DIME in learning the emerging network embedding vectors.Comment: 10 pages, 9 figures, 4 tables. Full paper is accepted by ICDM 2017, In: Proceedings of the 2017 IEEE International Conference on Data Mining

    Exploring Student Check-In Behavior for Improved Point-of-Interest Prediction

    Full text link
    With the availability of vast amounts of user visitation history on location-based social networks (LBSN), the problem of Point-of-Interest (POI) prediction has been extensively studied. However, much of the research has been conducted solely on voluntary checkin datasets collected from social apps such as Foursquare or Yelp. While these data contain rich information about recreational activities (e.g., restaurants, nightlife, and entertainment), information about more prosaic aspects of people's lives is sparse. This not only limits our understanding of users' daily routines, but more importantly the modeling assumptions developed based on characteristics of recreation-based data may not be suitable for richer check-in data. In this work, we present an analysis of education "check-in" data using WiFi access logs collected at Purdue University. We propose a heterogeneous graph-based method to encode the correlations between users, POIs, and activities, and then jointly learn embeddings for the vertices. We evaluate our method compared to previous state-of-the-art POI prediction methods, and show that the assumptions made by previous methods significantly degrade performance on our data with dense(r) activity signals. We also show how our learned embeddings could be used to identify similar students (e.g., for friend suggestions).Comment: published in KDD'1
    • …
    corecore