Semi-supervised Embedding in Attributed Networks with Outliers
In this paper, we propose a novel framework, called Semi-supervised Embedding
in Attributed Networks with Outliers (SEANO), to learn a low-dimensional vector
representation that systematically captures the topological proximity,
attribute affinity and label similarity of vertices in a partially labeled
attributed network (PLAN). Our method is designed to work in both transductive
and inductive settings while explicitly alleviating noise effects from
outliers. Experimental results on various datasets drawn from the web, text and
image domains demonstrate the advantages of SEANO over state-of-the-art methods
in semi-supervised classification under transductive as well as inductive
settings. We also show that a subset of parameters in SEANO is interpretable as
outlier score and can significantly outperform baseline methods when applied
for detecting network outliers. Finally, we present the use of SEANO in a
challenging real-world setting, flood mapping of satellite images, and show
that it is able to outperform modern remote sensing algorithms for this task.
Comment: in Proceedings of SIAM International Conference on Data Mining
(SDM'18)
Cognitive satellite communications and representation learning for streaming and complex graphs.
This dissertation covers two topics. The first studies a promising dynamic spectrum access (DSA) algorithm that improves the throughput of satellite communication (SATCOM) under uncertainty. The second investigates distributed representation learning for streaming and complex networks. DSA allows a secondary user to access spectrum that is not occupied by primary users. However, uncertainty in SATCOM causes more spectrum sensing errors. In this dissertation, the uncertainty is addressed by formulating the DSA decision-making process as a Partially Observable Markov Decision Process (POMDP) that optimally determines which channels to sense and access. Large-scale networks have attracted much attention as a means of discovering hidden information in big data. In particular, representation learning embeds a network into a lower-dimensional vector space while maximally preserving the similarity among nodes. I propose a real-time distributed graph embedding algorithm (RTDGE) that embeds a streaming graph in a distributed manner by combining a novel edge partition approach with an incremental negative sampling approach. Furthermore, a platform is prototyped based on Kafka and Storm, through which real-time Twitter network data can be retrieved, partitioned and processed for state-of-the-art tasks. For knowledge graphs, existing works cannot capture complex connection patterns and do not consider the impact of complicated relations, because the relationships are unquantifiable. A novel embedding algorithm is proposed that hierarchically measures structural similarity and the impact of relations by constructing a multi-layer graph. An advanced representation learning model is then designed based on an entity's context generated by random walks on the multi-layer content graph.
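The random-walk context generation mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes a plain single-layer graph stored as an adjacency dictionary, uniform transition probabilities, and a fixed context window. The names `random_walk`, `context_pairs` and `toy_graph` are hypothetical.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Return a uniform random walk of at most `length` nodes from `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk


def context_pairs(walk, window):
    """Return (center, context) pairs for nodes within `window` steps in the walk."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Toy undirected graph (assumption: every node has at least one neighbor).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

rng = random.Random(0)  # seeded so the walk is reproducible
walk = random_walk(toy_graph, "a", 6, rng)
pairs = context_pairs(walk, window=2)
```

Pairs of this kind are what a skip-gram style embedding model would typically be trained on. Walks over a multi-layer graph would additionally need a rule for moving between layers, which is not sketched here.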
Network Representation Learning: A Survey
With the widespread use of information technologies, information networks are
becoming increasingly popular to capture complex relationships across various
disciplines, such as social networks, citation networks, telecommunication
networks, and biological networks. Analyzing these networks sheds light on
different aspects of social life such as the structure of societies,
information diffusion, and communication patterns. In reality, however, the
large scale of information networks often makes network analytic tasks
computationally expensive or intractable. Network representation learning has
been recently proposed as a new learning paradigm to embed network vertices
into a low-dimensional vector space, by preserving network topology structure,
vertex content, and other side information. This allows the original
network to be easily handled in the new vector space for further analysis. In
this survey, we perform a comprehensive review of the current literature on
network representation learning in the data mining and machine learning field.
We propose new taxonomies to categorize and summarize the state-of-the-art
network representation learning techniques according to the underlying learning
mechanisms, the network information they are intended to preserve, as well as the
algorithmic designs and methodologies. We summarize evaluation protocols used
for validating network representation learning including published benchmark
datasets, evaluation methods, and open source algorithms. We also perform
empirical studies to compare the performance of representative algorithms on
common datasets, and analyze their computational complexity. Finally, we
suggest promising research directions to facilitate future study.
Comment: Accepted by IEEE Transactions on Big Data; 25 pages, 10 tables, 6
figures and 127 references