Distribution matching for transduction

Authors: James Petterson, Novi Quadrianto, Alex Smola
Publication venue: Curran Associates, Inc.
Publication date: 01/01/2009

Many transductive inference algorithms assume that the distributions over training and test estimates should be related, e.g. by providing a large margin of separation on both sets. We use this idea to design a transduction algorithm that can be used without modification for classification, regression, and structured estimation. At its heart, we exploit the fact that for a good learner the distributions over the outputs on the training and test sets should match. This is a classical two-sample problem, which can be solved efficiently in its most general form by using distance measures in Hilbert space. It turns out that a number of existing heuristics can be viewed as special cases of our approach.
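The two-sample idea in this abstract can be illustrated with a small sketch. It assumes the Hilbert-space distance is the (biased) squared maximum mean discrepancy under a Gaussian RBF kernel, which is one common choice and not necessarily the exact estimator used in the paper. The function names, the `gamma` bandwidth, and the synthetic "outputs" are all hypothetical.

```python
import numpy as np

def _as_samples(a):
    # Treat a 1-D array as n scalar samples, i.e. shape (n, 1).
    a = np.asarray(a, dtype=float)
    return a.reshape(-1, 1) if a.ndim == 1 else a

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2).
    a, b = _as_samples(a), _as_samples(b)
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2(train_outputs, test_outputs, gamma=1.0):
    # Biased estimate of the squared distance between the kernel mean
    # embeddings of the two samples (assumed choice of distance).
    k_tt = rbf_kernel(train_outputs, train_outputs, gamma).mean()
    k_ss = rbf_kernel(test_outputs, test_outputs, gamma).mean()
    k_ts = rbf_kernel(train_outputs, test_outputs, gamma).mean()
    return k_tt + k_ss - 2.0 * k_ts

# Synthetic stand-ins for a learner's outputs (illustrative data only).
rng = np.random.default_rng(0)
train_out = rng.normal(0.0, 1.0, size=200)
matched_test_out = rng.normal(0.0, 1.0, size=200)   # same distribution
shifted_test_out = rng.normal(3.0, 1.0, size=200)   # mismatched distribution

print("matched:", mmd2(train_out, matched_test_out))
print("shifted:", mmd2(train_out, shifted_test_out))
```

A transduction method in this spirit would presumably add such a mismatch term to the training loss, so that the learner is penalized when its test-set outputs drift away from its training-set outputs.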

CiteSeerX

Sussex Research Online

Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features

Authors: A Zimek, CC Chang, D Baehrens, DM Tax, EJ Candès, FE Grubbs, G Montavon, J Kim, JJ Hull, R Chalapathy, S Shalev-Shwartz, SM Erfani, T Schlegl, V Barnett, V Chandola, Y Bengio
Publication date: 14/10/2018

The one-class support vector machine (OC-SVM) has long been one of the most effective anomaly detection methods and is widely adopted in both research and industrial applications. Its biggest remaining limitation is scaling to large, high-dimensional datasets, owing to the complexity of the optimization. This can be mitigated by dimensionality reduction techniques such as manifold learning or autoencoders. However, previous work often treats representation learning and anomaly prediction separately. In this paper, we propose the autoencoder-based one-class support vector machine (AE-1SVM), which brings the OC-SVM into the deep learning context: random Fourier features approximate the radial basis kernel, the OC-SVM is combined with a representation learning architecture, and both are trained jointly end-to-end with stochastic gradient descent. Interestingly, this also opens up the possible use of gradient-based attribution methods to explain the decision making for anomaly detection, which has long been challenging because of the implicit mapping between the input space and the kernel space. To the best of our knowledge, this is the first work to study the interpretability of deep learning in anomaly detection. We evaluate our method on a wide range of unsupervised anomaly detection tasks, in which our end-to-end training architecture performs significantly better than previous work that uses separate training.

Comment: Accepted at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
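As a rough illustration of the two ingredients named in this abstract, the sketch below builds random Fourier features that approximate the RBF kernel exp(-gamma * ||x - y||^2) and evaluates the standard linear one-class SVM primal objective on those features. It is a minimal sketch, not the authors' implementation: the autoencoder and the joint SGD training are omitted, and the names `make_rff` and `ocsvm_objective`, the feature count, and the toy data are all assumptions.

```python
import numpy as np

def make_rff(input_dim, n_features, gamma, rng):
    # For k(x, y) = exp(-gamma * ||x - y||^2), frequencies are drawn from
    # N(0, 2 * gamma * I) and phase offsets from Uniform[0, 2 * pi].
    weights = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(input_dim, n_features))
    offsets = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

    def transform(x):
        # Explicit feature map z(x) with z(x) . z(y) approximating k(x, y).
        return np.sqrt(2.0 / n_features) * np.cos(x @ weights + offsets)

    return transform

def ocsvm_objective(w, rho, z, nu):
    # Standard one-class SVM primal on explicit features:
    # 0.5 * ||w||^2 - rho + (1 / (nu * n)) * sum_i max(0, rho - w . z_i)
    hinge = np.maximum(0.0, rho - z @ w)
    return 0.5 * w @ w - rho + hinge.mean() / nu

# Toy data (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 5))
gamma = 0.1
transform = make_rff(input_dim=5, n_features=5000, gamma=gamma, rng=rng)
z = transform(x)

# Compare the approximate kernel z z^T with the exact RBF kernel matrix.
sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
exact = np.exp(-gamma * sq_dists)
max_err = np.abs(z @ z.T - exact).max()
print("max kernel approximation error:", max_err)
```

Because the feature map is explicit, the objective is an ordinary differentiable function of `w`, `rho`, and the inputs, which is what makes gradient-based training and gradient-based attribution possible in principle.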

arXiv.org e-Print Archive

Queen's University Belfast Research Portal

Crossref