33,660 research outputs found
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector
Machine Applications
A Bayesian approach to bandwidth selection for multivariate kernel regression with an application to state-price density estimation.
Multivariate kernel regression is an important tool for investigating the relationship between a response and a set of explanatory variables. It is generally accepted that the performance of a kernel regression estimator largely depends on the choice of bandwidth rather than the kernel function. This nonparametric technique has been employed in a number of empirical studies including the state-price density estimation pioneered by Aït-Sahalia and Lo (1998). However, the widespread usefulness of multivariate kernel regression has been limited by the difficulty in computing a data-driven bandwidth. In this paper, we present a Bayesian approach to bandwidth selection for multivariate kernel regression. A Markov chain Monte Carlo algorithm is presented to sample the bandwidth vector and other parameters in a multivariate kernel regression model. A Monte Carlo study shows that the proposed bandwidth selector is more accurate than the rule-of-thumb bandwidth selector known as the normal reference rule according to Scott (1992) and Bowman and Azzalini (1997). The proposed bandwidth selection algorithm is applied to a multivariate kernel regression model that is often used to estimate the state-price density of Arrow-Debreu securities. When applying the proposed method to the S&P 500 index options and the DAX index options, we find that for short-maturity options, the proposed Bayesian bandwidth selector produces an obviously different state-price density from the one produced by using a subjective bandwidth selector discussed in Aït-Sahalia and Lo (1998).Black-Scholes formula, Likelihood, Markov chain Monte Carlo, Posterior density.
Undermining User Privacy on Mobile Devices Using AI
Over the past years, literature has shown that attacks exploiting the
microarchitecture of modern processors pose a serious threat to the privacy of
mobile phone users. This is because applications leave distinct footprints in
the processor, which can be used by malware to infer user activities. In this
work, we show that these inference attacks are considerably more practical when
combined with advanced AI techniques. In particular, we focus on profiling the
activity in the last-level cache (LLC) of ARM processors. We employ a simple
Prime+Probe based monitoring technique to obtain cache traces, which we
classify with Deep Learning methods including Convolutional Neural Networks. We
demonstrate our approach on an off-the-shelf Android phone by launching a
successful attack from an unprivileged, zeropermission App in well under a
minute. The App thereby detects running applications with an accuracy of 98%
and reveals opened websites and streaming videos by monitoring the LLC for at
most 6 seconds. This is possible, since Deep Learning compensates measurement
disturbances stemming from the inherently noisy LLC monitoring and unfavorable
cache characteristics such as random line replacement policies. In summary, our
results show that thanks to advanced AI techniques, inference attacks are
becoming alarmingly easy to implement and execute in practice. This once more
calls for countermeasures that confine microarchitectural leakage and protect
mobile phone applications, especially those valuing the privacy of their users
Ratings and rankings: Voodoo or Science?
Composite indicators aggregate a set of variables using weights which are
understood to reflect the variables' importance in the index. In this paper we
propose to measure the importance of a given variable within existing composite
indicators via Karl Pearson's `correlation ratio'; we call this measure `main
effect'. Because socio-economic variables are heteroskedastic and correlated,
(relative) nominal weights are hardly ever found to match (relative) main
effects; we propose to summarize their discrepancy with a divergence measure.
We further discuss to what extent the mapping from nominal weights to main
effects can be inverted. This analysis is applied to five composite indicators,
including the Human Development Index and two popular league tables of
university performance. It is found that in many cases the declared importance
of single indicators and their main effect are very different, and that the
data correlation structure often prevents developers from obtaining the stated
importance, even when modifying the nominal weights in the set of nonnegative
numbers with unit sum.Comment: 28 pages, 7 figure
- …