326 research outputs found
Feature Selection for Linear SVM with Provable Guarantees
We give two provably accurate feature-selection techniques for the linear
SVM. The algorithms run in deterministic and randomized time respectively. Our
algorithms can be used in an unsupervised or supervised setting. The supervised
approach is based on sampling features from support vectors. We prove that the
margin in the feature space is preserved to within -relative error of
the margin in the full feature space in the worst-case. In the unsupervised
setting, we also provide worst-case guarantees of the radius of the minimum
enclosing ball, thereby ensuring comparable generalization as in the full
feature space and resolving an open problem posed in Dasgupta et al. We present
extensive experiments on real-world datasets to support our theory and to
demonstrate that our method is competitive and often better than prior
state-of-the-art, for which there are no known provable guarantees.Comment: Appearing in Proceedings of 18th AISTATS, JMLR W&CP, vol 38, 201
Improved Subsampled Randomized Hadamard Transform for Linear SVM
Subsampled Randomized Hadamard Transform (SRHT), a popular random projection
method that can efficiently project a -dimensional data into -dimensional
space () in time, has been widely used to address the
challenge of high-dimensionality in machine learning. SRHT works by rotating
the input data matrix by Randomized
Walsh-Hadamard Transform followed with a subsequent uniform column sampling on
the rotated matrix. Despite the advantages of SRHT, one limitation of SRHT is
that it generates the new low-dimensional embedding without considering any
specific properties of a given dataset. Therefore, this data-independent random
projection method may result in inferior and unstable performance when used for
a particular machine learning task, e.g., classification. To overcome this
limitation, we analyze the effect of using SRHT for random projection in the
context of linear SVM classification. Based on our analysis, we propose
importance sampling and deterministic top- sampling to produce effective
low-dimensional embedding instead of uniform sampling SRHT. In addition, we
also proposed a new supervised non-uniform sampling method. Our experimental
results have demonstrated that our proposed methods can achieve higher
classification accuracies than SRHT and other random projection methods on six
real-life datasets.Comment: AAAI-2
PMLR Press
Recently there has been significant interest in training machine-learning models at low precision: by reducing precision, one can reduce computation and communication by one order of magnitude. We examine training at reduced precision, both from a theoretical and practical perspective, and ask: is it possible to train models at end-to-end low precision with provable guarantees? Can this lead to consistent order-of-magnitude speedups? We mainly focus on linear models, and the answer is yes for linear models. We develop a simple framework called ZipML based on one simple but novel strategy called double sampling. Our ZipML framework is able to execute training at low precision with no bias, guaranteeing convergence, whereas naive quanti- zation would introduce significant bias. We val- idate our framework across a range of applica- tions, and show that it enables an FPGA proto- type that is up to 6.5 × faster than an implemen- tation using full 32-bit precision. We further de- velop a variance-optimal stochastic quantization strategy and show that it can make a significant difference in a variety of settings. When applied to linear models together with double sampling, we save up to another 1.7 × in data movement compared with uniform quantization. When training deep networks with quantized models, we achieve higher accuracy than the state-of-the- art XNOR-Net
Feature-Guided Black-Box Safety Testing of Deep Neural Networks
Despite the improved accuracy of deep neural networks, the discovery of
adversarial examples has raised serious safety concerns. Most existing
approaches for crafting adversarial examples necessitate some knowledge
(architecture, parameters, etc.) of the network at hand. In this paper, we
focus on image classifiers and propose a feature-guided black-box approach to
test the safety of deep neural networks that requires no such knowledge. Our
algorithm employs object detection techniques such as SIFT (Scale Invariant
Feature Transform) to extract features from an image. These features are
converted into a mutable saliency distribution, where high probability is
assigned to pixels that affect the composition of the image with respect to the
human visual system. We formulate the crafting of adversarial examples as a
two-player turn-based stochastic game, where the first player's objective is to
minimise the distance to an adversarial example by manipulating the features,
and the second player can be cooperative, adversarial, or random. We show that,
theoretically, the two-player game can con- verge to the optimal strategy, and
that the optimal strategy represents a globally minimal adversarial image. For
Lipschitz networks, we also identify conditions that provide safety guarantees
that no adversarial examples exist. Using Monte Carlo tree search we gradually
explore the game state space to search for adversarial examples. Our
experiments show that, despite the black-box setting, manipulations guided by a
perception-based saliency distribution are competitive with state-of-the-art
methods that rely on white-box saliency matrices or sophisticated optimization
procedures. Finally, we show how our method can be used to evaluate robustness
of neural networks in safety-critical applications such as traffic sign
recognition in self-driving cars.Comment: 35 pages, 5 tables, 23 figure
Explicit Learning Curves for Transduction and Application to Clustering and Compression Algorithms
Inductive learning is based on inferring a general rule from a finite data
set and using it to label new data. In transduction one attempts to solve the
problem of using a labeled training set to label a set of unlabeled points,
which are given to the learner prior to learning. Although transduction seems
at the outset to be an easier task than induction, there have not been many
provably useful algorithms for transduction. Moreover, the precise relation
between induction and transduction has not yet been determined. The main
theoretical developments related to transduction were presented by Vapnik more
than twenty years ago. One of Vapnik's basic results is a rather tight error
bound for transductive classification based on an exact computation of the
hypergeometric tail. While tight, this bound is given implicitly via a
computational routine. Our first contribution is a somewhat looser but explicit
characterization of a slightly extended PAC-Bayesian version of Vapnik's
transductive bound. This characterization is obtained using concentration
inequalities for the tail of sums of random variables obtained by sampling
without replacement. We then derive error bounds for compression schemes such
as (transductive) support vector machines and for transduction algorithms based
on clustering. The main observation used for deriving these new error bounds
and algorithms is that the unlabeled test points, which in the transductive
setting are known in advance, can be used in order to construct useful data
dependent prior distributions over the hypothesis space
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector
Machine Applications
- …