A taxonomy of unsupervised anomaly detection algorithms comprising four main groups.
<p>Note that CMGOS can be assigned to two groups: it is a clustering-based algorithm that also estimates a subspace for each cluster.</p>
The AUC results of the remaining unsupervised anomaly detection algorithms.
<p>Four different strategies for keeping the components have been used for rPCA, while for HBOS the number of different bins was altered.</p>
The AUC values for the large kdd99 dataset for 0 < <i>k</i> < 100.
<p>It can easily be seen that the performance of local anomaly detection algorithms is poor for this global anomaly detection challenge.</p>
The AUC values for the nearest-neighbor based algorithms on the breast-cancer dataset.
<p>It can be seen that <i>k</i> values smaller than 10 tend to result in poor estimates, especially when considering local anomaly detection algorithms. Please note that the AUC axis is cut off at 0.5.</p>
A visualization of the results for the uCBLOF algorithm.
<p>The anomaly score is represented by the bubble size, whereas the color corresponds to the clustering result of the preceding <i>k</i>-means clustering algorithm. Local anomalies are clearly not detected by uCBLOF.</p>
Comparing COF (top) with LOF (bottom) using a simple dataset with a linear correlation of two attributes.
<p>It can be seen that the spherical density estimation of LOF fails to recognize the anomaly, whereas COF detects the non-linear anomaly (<i>k</i> = 4).</p>
The 10 datasets used for comparative evaluation of the unsupervised anomaly detection algorithms from different application domains.
<p>A broad spectrum of size, dimensionality and anomaly percentage is covered. The datasets also differ in difficulty and cover both local and global anomaly detection tasks.</p>
Comparing the computation time of the different algorithms shows huge differences, especially for the larger datasets.
<p>The unit of the table is seconds for the first nine columns and minutes for the last dataset (kdd99).</p>
The results of the clustering-based algorithms showing the AUC and the standard deviation for different initial <i>k</i> (10 ≤ <i>k</i> ≤ 50).
<p>The last row shows a comparison with the best nearest-neighbor method for the dataset.</p>
A visualization of the results of the <i>k</i>-NN global anomaly detection algorithm.
<p>The anomaly score is represented by the bubble size, whereas the color shows the labels of the artificially generated dataset.</p>
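The idea behind a global <i>k</i>-NN anomaly score and its AUC evaluation can be illustrated with a minimal sketch. It assumes the score is the mean Euclidean distance to the <i>k</i> nearest neighbors (one common variant; the source does not state which definition was used), and the function names `knn_scores` and `auc` as well as the toy data are hypothetical, not taken from the original work.

```python
import math


def knn_scores(points, k):
    """Assumed score: mean Euclidean distance to the k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def auc(scores, labels):
    """ROC AUC: probability that a random anomaly (label 1) receives a
    higher score than a random normal record (label 0); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((a > n) + 0.5 * (a == n) for a in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four clustered normal points and one distant anomaly.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
labels = [0, 0, 0, 0, 1]

scores = knn_scores(points, k=2)
print(scores)
print(auc(scores, labels))
```

In a bubble plot such as the one described, each score would set the marker size of its point, so the isolated record would be drawn with the largest bubble.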