The Trace Criterion for Kernel Bandwidth Selection for Support Vector Data Description
Support vector data description (SVDD) is a popular anomaly detection
technique. The SVDD classifier partitions the whole data space into an inlier
region, which consists of the region near the training data, and an outlier
region, which consists of points away from the training data. The computation
of the SVDD classifier requires a kernel function, for which the Gaussian
kernel is a common choice. The Gaussian kernel has a bandwidth parameter, and
it is important to set the value of this parameter correctly for good results.
A small bandwidth leads to overfitting such that the resulting SVDD classifier
overestimates the number of anomalies, whereas a large bandwidth leads to
underfitting and an inability to detect many anomalies. In this paper, we
present a new unsupervised method for selecting the Gaussian kernel bandwidth.
Our method exploits a low-rank representation of the kernel matrix to suggest a
kernel bandwidth value. Our new technique is competitive with the current state
of the art for low-dimensional data and performs extremely well for many
classes of high-dimensional data. Because the mathematical formulation of SVDD
is identical with the mathematical formulation of one-class support vector
machines (OCSVM) when the Gaussian kernel is used, our method is equally
applicable to Gaussian kernel bandwidth tuning for OCSVM.
Comment: some text overlap with arXiv:1708.05106 because common background material is covered in both papers.
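
The bandwidth sensitivity described in the abstract can be illustrated with a minimal sketch. This is not the paper's trace criterion, which the abstract does not spell out. It only shows how the Gaussian kernel matrix degenerates at the two extremes. The kernel parameterization exp(-||x - y||^2 / (2 s^2)), the helper names `gaussian_kernel_matrix` and `mean_off_diagonal`, and the toy points are all assumptions made for illustration.

```python
import math


def gaussian_kernel_matrix(points, bandwidth):
    """Return K with K[i][j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2)).

    The parameterization is an assumption; other conventions differ by a
    constant factor in the exponent.
    """
    two_s2 = 2.0 * bandwidth ** 2
    return [
        [
            math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / two_s2)
            for q in points
        ]
        for p in points
    ]


def mean_off_diagonal(K):
    """Average similarity between distinct points (hypothetical helper)."""
    n = len(K)
    total = sum(K[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


# Toy training set: four corners of the unit square plus its center.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]

for s in (0.01, 0.5, 100.0):
    K = gaussian_kernel_matrix(points, s)
    print(f"bandwidth={s:g}  mean off-diagonal similarity={mean_off_diagonal(K):.4f}")
```

With a very small bandwidth the matrix approaches the identity, so every training point looks unrelated to every other point (the overfitting regime). With a very large bandwidth the matrix approaches the all-ones matrix, which has rank one, so all points look alike (the underfitting regime). A useful bandwidth lies between these extremes.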