5,409 research outputs found
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
applications domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework
Spectral Embedding Norm: Looking Deep into the Spectrum of the Graph Laplacian
The extraction of clusters from a dataset which includes multiple clusters
and a significant background component is a non-trivial task of practical
importance. In image analysis this manifests for example in anomaly detection
and target detection. The traditional spectral clustering algorithm, which
relies on the leading eigenvectors to detect clusters, fails in such
cases. In this paper we propose the {\it spectral embedding norm} which sums
the squared values of the first normalized eigenvectors, where can be
significantly larger than . We prove that this quantity can be used to
separate clusters from the background in unbalanced settings, including extreme
cases such as outlier detection. The performance of the algorithm is not
sensitive to the choice of , and we demonstrate its application on synthetic
and real-world remote sensing and neuroimaging datasets
Enhance density peak clustering algorithm for anomaly intrusion detection system
In this paper proposed new model of Density Peak Clustering algorithm to enhance clustering of intrusion attacks. The Anomaly Intrusion Detection System (AIDS) by using original density peak clustering algorithm shows the stable in result to be applied to data-mining module of the intrusion detection system. The proposed system depends on two objectives; the first objective is to analyzing the disadvantage of DPC; however, we propose a novel improvement of DPC algorithm by modifying the calculation of local density method based on cosine similarity instead of the cat off distance parameter to improve the operation of selecting the peak points. The second objective is using the Gaussian kernel measure as a distance metric instead of Euclidean distance to improve clustering of high-dimensional complex nonlinear inseparable network traffic data and reduce the noise. The experimentations evaluated with NSL-KDD dataset
- …