3,905 research outputs found

    Semi-supervised Embedding in Attributed Networks with Outliers

    In this paper, we propose a novel framework, called Semi-supervised Embedding in Attributed Networks with Outliers (SEANO), to learn a low-dimensional vector representation that systematically captures the topological proximity, attribute affinity and label similarity of vertices in a partially labeled attributed network (PLAN). Our method is designed to work in both transductive and inductive settings while explicitly alleviating noise effects from outliers. Experimental results on various datasets drawn from the web, text and image domains demonstrate the advantages of SEANO over state-of-the-art methods in semi-supervised classification under transductive as well as inductive settings. We also show that a subset of the parameters in SEANO is interpretable as outlier scores and can significantly outperform baseline methods when applied to detecting network outliers. Finally, we present the use of SEANO in a challenging real-world setting, flood mapping of satellite images, and show that it is able to outperform modern remote sensing algorithms for this task. Comment: in Proceedings of the SIAM International Conference on Data Mining (SDM'18)

    Cognitive satellite communications and representation learning for streaming and complex graphs.

    This dissertation covers two topics. The first studies a dynamic spectrum access (DSA) algorithm that improves the throughput of satellite communication (SATCOM) under uncertainty. The second investigates distributed representation learning for streaming and complex networks. DSA allows a secondary user to access spectrum that is not occupied by primary users. However, uncertainty in SATCOM causes more spectrum sensing errors. This dissertation addresses the uncertainty by formulating the DSA decision-making process as a Partially Observable Markov Decision Process (POMDP) that optimally determines which channels to sense and access. Large-scale networks have attracted much attention as a means of discovering hidden information in big data. In particular, representation learning embeds a network into a lower-dimensional vector space while maximally preserving the similarity among nodes. I propose a real-time distributed graph embedding algorithm (RTDGE) that embeds a streaming graph in a distributed fashion by combining a novel edge partition approach with an incremental negative sampling approach. Furthermore, a platform is prototyped on Kafka and Storm, so that real-time Twitter network data can be retrieved, partitioned and processed for state-of-the-art tasks. For knowledge graphs, existing works cannot capture complex connection patterns and do not consider the impact of complicated relations, because the relationships are unquantifiable. A novel embedding algorithm is proposed that hierarchically measures structural similarity and the impact of relations by constructing a multi-layer graph. An advanced representation learning model is then designed based on an entity's context generated by random walks on the multi-layer content graph.
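
The abstract above mentions generating an entity's context by random walks on a graph. As a rough illustration only, the sketch below runs a uniform random walk on a small single-layer toy graph and collects (center, context) pairs within a window. The function names `random_walk` and `context_pairs`, the toy graph, the uniform transition rule and the window size are all assumptions made for this example; the abstract does not specify the multi-layer construction or the transition weights used in the dissertation.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Return a uniform random walk of at most `length` nodes from `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency.get(walk[-1], [])
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def context_pairs(walk, window):
    """Return (center, context) pairs for nodes within `window` steps in the walk."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Hypothetical toy graph (undirected, stored as adjacency lists).
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

rng = random.Random(0)  # seeded so repeated runs give the same walk
walk = random_walk(graph, "a", 6, rng)
pairs = context_pairs(walk, window=2)
print(walk)
print(pairs[:5])
```

Pairs like these are the usual training input for skip-gram style embedding models; a multi-layer variant would additionally need a rule for moving between layers, which is not described in the abstract.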

    Network Representation Learning: A Survey

    With the widespread use of information technologies, information networks are becoming increasingly popular for capturing complex relationships across various disciplines, such as social networks, citation networks, telecommunication networks, and biological networks. Analyzing these networks sheds light on different aspects of social life such as the structure of societies, information diffusion, and communication patterns. In reality, however, the large scale of information networks often makes network analytic tasks computationally expensive or intractable. Network representation learning has recently been proposed as a new learning paradigm that embeds network vertices into a low-dimensional vector space while preserving network topology structure, vertex content, and other side information. This allows the original network to be easily handled in the new vector space for further analysis. In this survey, we perform a comprehensive review of the current literature on network representation learning in the data mining and machine learning fields. We propose new taxonomies to categorize and summarize the state-of-the-art network representation learning techniques according to the underlying learning mechanisms, the network information intended to be preserved, as well as the algorithmic designs and methodologies. We summarize evaluation protocols used for validating network representation learning, including published benchmark datasets, evaluation methods, and open source algorithms. We also perform empirical studies to compare the performance of representative algorithms on common datasets, and analyze their computational complexity. Finally, we suggest promising research directions to facilitate future study. Comment: Accepted by IEEE Transactions on Big Data; 25 pages, 10 tables, 6 figures and 127 references