Outlier Detection from Network Data with Subnetwork Interpretation
Detecting a small number of outliers from a set of data observations is
always challenging. This problem is more difficult in the setting of multiple
network samples, where computing the anomalous degree of a network sample is
generally not sufficient. In fact, explaining why the network is exceptional,
expressed in the form of subnetwork, is also equally important. In this paper,
we develop a novel algorithm to address these two key problems. We treat each
network sample as a potential outlier and identify subnetworks that best
discriminate it from nearby regular samples. The algorithm is developed in the
framework of network regression combined with the constraints on both network
topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus
goes beyond subspace/subgraph discovery and we show that it converges to a
global optimum. Evaluation on various real-world network datasets demonstrates
that our algorithm not only outperforms baselines in both network and
high-dimensional settings, but also discovers highly relevant and interpretable
local subnetworks, further enhancing our understanding of anomalous networks.
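As a rough illustration of why L1-norm shrinkage leads to sparse selections (and hence small subnetworks), the sketch below applies soft-thresholding, the standard proximal operator of the L1 norm, to a vector of coefficients. This is a generic example and not the algorithm of the paper above; the function name `soft_threshold` and the parameter `lam` are assumptions made for this sketch.

```python
def soft_threshold(coeffs, lam):
    """Shrink each coefficient toward zero by lam; values within lam of zero become zero.

    Hypothetical helper for illustration only: the proximal operator of
    lam * ||x||_1, not the network-regression solver described in the abstract.
    """
    shrunk = []
    for c in coeffs:
        magnitude = max(abs(c) - lam, 0.0)  # reduce the magnitude, floor at zero
        sign = 1.0 if c > 0 else -1.0       # keep the original sign
        shrunk.append(sign * magnitude)
    return shrunk


# Edge weights with small magnitude are zeroed out, leaving a sparse set.
weights = [3.0, -0.5, 1.0, -2.0]
sparse = soft_threshold(weights, 1.0)
selected = [i for i, w in enumerate(sparse) if w != 0]
```

With a threshold of 1.0, only the first and last coefficients survive, which is the sense in which L1 shrinkage selects a small subset of edges or features.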
Community Detection by L0-penalized Graph Laplacian
Community detection in network analysis aims at partitioning nodes in a
network into disjoint communities. Most currently available algorithms
assume that the number of communities K is known, but choosing a correct K is
generally very difficult for real networks. In addition, many real networks
contain outlier nodes not belonging to any community, but currently very few
algorithms can handle networks with outliers. In this paper, we propose a novel
model-free tightness criterion and an efficient algorithm to maximize this
criterion for community detection. This tightness criterion is closely related
to the graph Laplacian with L0 penalty. Unlike most community detection
methods, our method does not require a known K and can properly detect
communities in
networks with outliers.
Both theoretical and numerical properties of the method are analyzed. The
theoretical result guarantees that, under the degree corrected stochastic block
model, even for networks with outliers, the maximizer of the tightness
criterion can extract communities with small misclassification rates even when
the number of communities grows to infinity as the network size grows.
Simulation study shows that the proposed method can recover true communities
more accurately than other methods. Applications to college football data and
yeast protein-protein interaction data also reveal that the proposed method
performs significantly better.
Comment: 40 pages, 15 Postscript figures
In-Network Outlier Detection in Wireless Sensor Networks
To address the problem of unsupervised outlier detection in wireless sensor
networks, we develop an approach that (1) is flexible with respect to the
outlier definition, (2) computes the result in-network to reduce both bandwidth
and energy usage, (3) uses only single-hop communication, thus permitting very
simple node failure detection and message reliability assurance mechanisms
(e.g., carrier-sense), and (4) seamlessly accommodates dynamic updates to data.
We examine performance using simulation with real sensor data streams. Our
results demonstrate that our approach is accurate and imposes a reasonable
communication load and level of power consumption.
Comment: Extended version of a paper appearing in the Int'l Conference on
Distributed Computing Systems 200
Evidential Label Propagation Algorithm for Graphs
Community detection has attracted considerable attention across many areas
as it can be used for discovering the structure and features of complex
networks. With the increasing size of social networks in the real world,
community detection approaches should be fast and accurate. The Label
Propagation Algorithm (LPA) is known to be one of the near-linear solutions and
is easy to implement, thus it forms a good basis for efficient community
detection methods. In this paper, we extend the update rule and propagation
criterion of LPA in the framework of belief functions. A new community
detection approach, called Evidential Label Propagation (ELP), is proposed as
an enhanced version of conventional LPA. The node influence is first defined to
guide the propagation process. The plausibility is used to determine the domain
label of each node. The update order of nodes is discussed to improve the
robustness of the method. The ELP algorithm converges once the domain labels
of all the nodes stop changing. Finally, the mass assignments are calculated
as the memberships of nodes. The overlapping nodes and outliers can be detected
simultaneously through the proposed method. The experimental results
demonstrate the effectiveness of ELP.
Comment: 19th International Conference on Information Fusion, Jul 2016,
Heidelberg, Germany
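For orientation, the conventional LPA that ELP extends can be sketched as follows. This is a minimal sketch of plain label propagation, not of ELP itself: it does not use belief functions, node influence, or plausibility. Standard LPA visits nodes in random order and breaks ties at random; to keep the example reproducible, this sketch visits nodes in sorted order and breaks ties by taking the largest label, which is an assumption made here. The function name `label_propagation` and the adjacency-dict input are likewise illustrative.

```python
from collections import Counter


def label_propagation(adj, max_iters=100):
    """Plain asynchronous label propagation on an undirected graph.

    adj maps each node to a list of its neighbors. Returns node -> label.
    Deterministic variant: sorted visiting order, ties go to the largest label.
    """
    labels = {v: v for v in adj}  # every node starts in its own community
    for _ in range(max_iters):
        changed = False
        for v in sorted(adj):
            if not adj[v]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            new_label = max(l for l, c in counts.items() if c == best)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:  # converged: no label changed during a full sweep
            break
    return labels


# Two triangles joined by the single edge 2-3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
communities = label_propagation(graph)
# → {0: 2, 1: 2, 2: 2, 3: 5, 4: 5, 5: 5}
```

The stopping rule mirrors the convergence condition stated in the abstract: the loop ends once a full pass leaves every label unchanged.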
Outlier detection techniques for wireless sensor networks: A survey
In the field of wireless sensor networks, those measurements that significantly deviate from the normal pattern of sensed data are considered as outliers. The potential sources of outliers include noise and errors, events, and malicious attacks on the network. Traditional outlier detection techniques are not directly applicable to wireless sensor networks due to the nature of sensor data and specific requirements and limitations of the wireless sensor networks. This survey provides a comprehensive overview of existing outlier detection techniques specifically developed for the wireless sensor networks. Additionally, it presents a technique-based taxonomy and a comparative table to be used as a guideline to select a technique suitable for the application at hand based on characteristics such as data type, outlier type, outlier identity, and outlier degree.
Outlier Detection Techniques For Wireless Sensor Networks: A Survey
In the field of wireless sensor networks, measurements that
significantly deviate from the normal pattern of sensed data are
considered as outliers. The potential sources of outliers include
noise and errors, events, and malicious attacks on the network.
Traditional outlier detection techniques are not directly
applicable to wireless sensor networks due to the multivariate
nature of sensor data and specific requirements and limitations of
the wireless sensor networks. This survey provides a comprehensive
overview of existing outlier detection techniques specifically
developed for the wireless sensor networks. Additionally, it
presents a technique-based taxonomy and a decision tree to be used
as a guideline to select a technique suitable for the application
at hand based on characteristics such as data type, outlier type,
outlier degree.
A similarity-based community detection method with multiple prototype representation
Communities are of great importance for understanding graph structures in
social networks. Some existing community detection algorithms use a single
prototype to represent each group. In real applications, this may not
adequately model the different types of communities and hence limits the
clustering performance on social networks. To address this problem, a
Similarity-based Multi-Prototype (SMP) community detection approach is proposed
in this paper. In SMP, vertices in each community carry various weights to
describe their degree of representativeness. This mechanism enables each
community to be represented by more than one node. The centrality of nodes is
used to calculate prototype weights, while similarity is used to guide the
partitioning of the graph. Experimental results on computer-generated and
real-world networks clearly show that SMP performs well for detecting
communities. Moreover, with the help of prototype weights, the method provides
richer information about the inner structure of the detected communities than
existing community detection models.
Semi-supervised Embedding in Attributed Networks with Outliers
In this paper, we propose a novel framework, called Semi-supervised Embedding
in Attributed Networks with Outliers (SEANO), to learn a low-dimensional vector
representation that systematically captures the topological proximity,
attribute affinity and label similarity of vertices in a partially labeled
attributed network (PLAN). Our method is designed to work in both transductive
and inductive settings while explicitly alleviating noise effects from
outliers. Experimental results on various datasets drawn from the web, text and
image domains demonstrate the advantages of SEANO over state-of-the-art methods
in semi-supervised classification under transductive as well as inductive
settings. We also show that a subset of parameters in SEANO is interpretable as
outlier score and can significantly outperform baseline methods when applied
for detecting network outliers. Finally, we present the use of SEANO in a
challenging real-world setting -- flood mapping of satellite images and show
that it is able to outperform modern remote sensing algorithms for this task.
Comment: in Proceedings of SIAM International Conference on Data Mining
(SDM'18)
Outlier Edge Detection Using Random Graph Generation Models and Applications
Outliers are samples that are generated by different mechanisms from other
normal data samples. Graphs, in particular social network graphs, may contain
nodes and edges that are made by scammers, malicious programs or mistakenly by
normal users. Detecting outlier nodes and edges is important for data mining
and graph analytics. However, previous research in the field has merely focused
on detecting outlier nodes. In this article, we study the properties of edges
and propose outlier edge detection algorithms using two random graph generation
models. We found that the edge-ego-network, which can be defined as the induced
graph that contains two end nodes of an edge, their neighboring nodes and the
edges that link these nodes, contains critical information to detect outlier
edges. We evaluated the proposed algorithms by injecting outlier edges into
some real-world graph data. Experimental results show that the proposed
algorithms can effectively detect outlier edges. In particular, the algorithm
based on the Preferential Attachment Random Graph Generation model consistently
gives good performance regardless of the test graph data. Furthermore, the
proposed algorithms are not limited to the area of outlier edge detection. We
demonstrate three different applications that benefit from the proposed
algorithms: 1) a preprocessing tool that improves the performance of graph
clustering algorithms; 2) an outlier node detection algorithm; and 3) a novel
noisy data clustering algorithm. These applications show the great potential of
the proposed outlier edge detection techniques.
Comment: 14 pages, 5 figures, journal paper
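The edge-ego-network defined in the abstract above (the induced graph on the two end nodes of an edge, their neighbors, and the edges linking these nodes) can be built as in the sketch below. It covers only the construction of that subgraph, not the paper's outlier scores or random graph models; the function name `edge_ego_network` and the adjacency-set representation are assumptions made for this sketch.

```python
def edge_ego_network(adj, u, v):
    """Return (nodes, edges) of the subgraph induced by u, v and their neighbors.

    adj maps each node to the set of its neighbors in an undirected graph.
    Edges are returned as frozensets so that (a, b) and (b, a) coincide.
    """
    nodes = {u, v} | adj[u] | adj[v]
    edges = {frozenset((a, b)) for a in nodes for b in adj[a] if b in nodes}
    return nodes, edges


# A small undirected graph: a 4-cycle 1-2-4-3-1 with a tail 4-5-6.
graph = {1: {2, 3}, 2: {1, 4}, 3: {1, 4},
         4: {2, 3, 5}, 5: {4, 6}, 6: {5}}
ego_nodes, ego_edges = edge_ego_network(graph, 1, 2)
# ego_nodes → {1, 2, 3, 4}; the edge 4-5 is excluded because node 5 is
# not a neighbor of either end node.
```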