Hacking Smart Machines with Smarter Ones: How to Extract Meaningful Data from Machine Learning Classifiers
Machine Learning (ML) algorithms are used to train computers to perform a
variety of complex tasks and improve with experience. Computers learn how to
recognize patterns, make unintended decisions, or react to a dynamic
environment. Certain trained machines may be more effective than others because
they are based on more suitable ML algorithms or because they were trained
through superior training sets. Although ML algorithms are known and publicly
released, training sets may not be reasonably ascertainable and, indeed, may be
guarded as trade secrets. While much research has been performed about the
privacy of the elements of training sets, in this paper we focus our attention
on ML classifiers and on the statistical information that can be unconsciously
or maliciously revealed from them. We show that it is possible to infer
unexpected but useful information from ML classifiers. In particular, we build
a novel meta-classifier and train it to hack other classifiers, obtaining
meaningful information about their training sets. This kind of information
leakage can be exploited, for example, by a vendor to build more effective
classifiers or to simply acquire trade secrets from a competitor's apparatus,
potentially violating its intellectual property rights.
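The attack idea above can be illustrated with a minimal sketch, which is not the paper's actual pipeline: train "shadow" models on synthetic datasets that do or do not have some hidden property, then train a meta-classifier on the shadow models' learned parameters to recognize that property in a victim model. All names and the toy centroid "classifier" here are hypothetical stand-ins for real models and weights.

```python
import random

random.seed(0)

def train_target(dataset):
    """A toy 'classifier': its learned parameters are just the
    per-class feature means (a stand-in for real model weights)."""
    sums, counts = {0: 0.0, 1: 0.0}, {0: 0, 1: 0}
    for x, y in dataset:
        sums[y] += x
        counts[y] += 1
    return (sums[0] / counts[0], sums[1] / counts[1])

def make_dataset(has_property):
    """Shadow training data: the hidden property shifts class-1 samples."""
    shift = 2.0 if has_property else 0.0
    data = [(random.gauss(0, 1), 0) for _ in range(200)]
    data += [(random.gauss(1 + shift, 1), 1) for _ in range(200)]
    return data

# Meta-training set: (shadow model parameters, property bit) pairs.
meta = [(train_target(make_dataset(p)), p) for p in [0, 1] * 50]

# Toy meta-classifier: a threshold on the class-1 parameter,
# learned from the shadow models alone.
with_p = [w[1] for w, p in meta if p]
without_p = [w[1] for w, p in meta if not p]
threshold = (min(with_p) + max(without_p)) / 2

def infer_property(target_params):
    """Decide, from parameters only, whether the training set had the property."""
    return int(target_params[1] > threshold)
```

The point of the sketch is that the meta-classifier never sees the victim's training data, only its parameters, mirroring the paper's observation that model parameters leak statistical facts about the training set.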
PlaceAvoider: Steering First-Person Cameras away from Sensitive Spaces
Cameras are now commonplace in our social and computing landscapes and embedded into consumer devices like smartphones and tablets. A new generation of wearable devices (such as Google Glass) will soon make ‘first-person’ cameras nearly ubiquitous, capturing vast amounts of imagery without deliberate human action. ‘Lifelogging’ devices and applications will record and share images from people’s daily lives with their social networks. These devices that automatically capture images in the background raise serious privacy concerns, since they are likely to capture deeply private information. Users of these devices need ways to identify and prevent the sharing of sensitive images. As a first step, we introduce PlaceAvoider, a technique for owners of first-person cameras to ‘blacklist’ sensitive spaces (like bathrooms and bedrooms). PlaceAvoider recognizes images captured in these spaces and flags them for review before the images are made available to applications. PlaceAvoider performs novel image analysis using both fine-grained image features (like specific objects) and coarse-grained, scene-level features (like colors and textures) to classify where a photo was taken. PlaceAvoider combines these features in a probabilistic framework that jointly labels streams of images in order to improve accuracy. We test the technique on five realistic first-person image datasets and show it is robust to blurriness, motion, and occlusion.
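The joint stream-labeling idea can be sketched as a standard Viterbi pass: per-image evidence from a fine-grained and a coarse-grained classifier is multiplied (an independence assumption), and a sticky transition prior rewards consecutive images sharing a label. This is a generic hypothetical reconstruction, not PlaceAvoider's actual probabilistic model; the score dictionaries stand in for real classifier outputs.

```python
import math

def smooth_stream(fine_probs, coarse_probs, stay=0.9):
    """Jointly label a stream of images as 'safe'/'sensitive'.
    Per-image evidence = fine score * coarse score; a sticky
    transition prior (probability `stay` of keeping the label)
    smooths isolated misclassifications. Standard Viterbi."""
    labels = ('safe', 'sensitive')

    def trans(a, b):
        return math.log(stay if a == b else 1.0 - stay)

    # Initial log-scores from the first image's evidence.
    prev = {l: math.log(fine_probs[0][l] * coarse_probs[0][l]) for l in labels}
    back = []
    for f, c in zip(fine_probs[1:], coarse_probs[1:]):
        cur, ptr = {}, {}
        for l in labels:
            best = max(labels, key=lambda k: prev[k] + trans(k, l))
            cur[l] = prev[best] + trans(best, l) + math.log(f[l] * c[l])
            ptr[l] = best
        back.append(ptr)
        prev = cur
    # Backtrack the best path.
    last = max(labels, key=lambda l: prev[l])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

With a strong enough stickiness, a single noisy frame inside a run of sensitive images is still labeled sensitive, which is the accuracy gain the abstract attributes to joint labeling.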
Privacy-preserving comparison of variable-length data with application to biometric template protection
The establishment of cloud computing and big data in a wide variety of daily applications has raised some privacy concerns due to the sensitive nature of some of the processed data. This has promoted the need to develop data protection techniques, where the storage and all operations are carried out without disclosing any information. Following this trend, this paper presents a new approach to efficiently compare variable-length data in the encrypted domain using homomorphic encryption, where only encrypted data is stored or exchanged. The new variable-length-based algorithm is fused with existing fixed-length techniques in order to obtain increased comparison accuracy. To assess the soundness of the proposed approach, we evaluate its performance on a particular application: a multi-algorithm biometric template protection system based on dynamic signatures that complies with the requirements described in the ISO/IEC 24745 standard on biometric information protection. Experiments have been carried out on a publicly available database and a free implementation of the Paillier cryptosystem to ensure reproducibility and comparability to other schemes. This work was supported in part by the German Federal Ministry of Education and Research (BMBF); in part by the Hessen State Ministry for Higher Education, Research, and the Arts (HMWK) within the Center for Research in Security and Privacy (CRISP); in part by the Spanish Ministerio de Economia y Competitividad / Fondo Europeo de Desarrollo Regional through the CogniMetrics Project under Grant TEC2015-70627-R; and in part by Cecaban.
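The property such schemes rely on is Paillier's additive homomorphism: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so comparison scores can be accumulated without decrypting. Below is a pure-stdlib toy sketch of that property only, with deliberately tiny primes; it is not the paper's scheme, and real deployments use large primes and a vetted library.

```python
import random
from math import gcd

def _lcm(a, b):
    return a * b // gcd(a, b)

def keygen():
    # Toy primes for illustration only (real use: ~2048-bit primes).
    p, q = 293, 433
    n = p * q
    lam = _lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)        # valid because we fix g = n + 1
    return (n,), (lam, mu, n)   # (public key, private key)

def encrypt(pub, m):
    (n,) = pub
    n2 = n * n
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    # c = (1 + n)^m * r^n  mod n^2
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(priv, c):
    lam, mu, n = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    (n,) = pub
    # Ciphertext product decrypts to the plaintext sum.
    return (c1 * c2) % (n * n)
```

A holder of only the public key can therefore aggregate encrypted partial scores; only the key owner can decrypt the final comparison result.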
Abstract Hidden Markov Models: a monadic account of quantitative information flow
Hidden Markov Models, HMM's, are mathematical models of Markov processes with
state that is hidden, but from which information can leak. They are typically
represented as 3-way joint-probability distributions.
We use HMM's as denotations of probabilistic hidden-state sequential
programs: for that, we recast them as `abstract' HMM's, computations in the
Giry monad D, and we equip them with a partial order of increasing
security. However, to encode the monadic type with hiding over some state X
we use DX -> D^2X rather than the conventional X -> DX that suffices for
Markov models whose state is not hidden. We illustrate the
construction with a small Haskell prototype.
We then present uncertainty measures as a generalisation of the extant
diversity of probabilistic entropies, with characteristic analytic properties
for them, and show how the new entropies interact with the order of increasing
security. Furthermore, we give a `backwards' uncertainty-transformer semantics
for HMM's that is dual to the `forwards' abstract HMM's - it is an analogue of
the duality between forwards, relational semantics and backwards,
predicate-transformer semantics for imperative programs with demonic choice.
Finally, we argue that, from this new denotational-semantic viewpoint, one
can see that the Dalenius desideratum for statistical databases is actually an
issue in compositionality. We propose a means for taking it into account.
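The central semantic object here, a distribution over posteriors, can be sketched concretely. The following is a small Python stand-in (rather than the paper's Haskell) for pushing a prior over hidden states through a leakage channel to obtain the resulting "hyper-distribution", plus the expected Shannon entropy over it as one example uncertainty measure; the representation as nested dicts is an assumption of this sketch, not the paper's encoding.

```python
import math

def hyper(prior, channel):
    """Push a prior over hidden states through a channel
    (state -> distribution over observations), yielding for each
    observation y its probability p(y) and the posterior it induces."""
    joint = {}
    for x, px in prior.items():
        for y, pyx in channel[x].items():
            post = joint.setdefault(y, {})
            post[x] = post.get(x, 0.0) + px * pyx
    out = {}
    for y, jy in joint.items():
        py = sum(jy.values())
        out[y] = (py, {x: p / py for x, p in jy.items()})
    return out

def expected_entropy(h):
    """Expected Shannon entropy of the posteriors: one instance of the
    uncertainty measures the abstract discusses."""
    return sum(py * -sum(p * math.log2(p) for p in post.values() if p)
               for py, post in h.values())
```

On a uniform prior and a channel that leaks nothing about one state but fully reveals the other on observation 'b', the hyper-distribution makes the residual uncertainty explicit.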
Radar and RGB-depth sensors for fall detection: a review
This paper reviews recent works in the literature on the use of systems based on radar and RGB-Depth (RGB-D) sensors for fall detection, and discusses outstanding research challenges and trends related to this research field. Systems that reliably detect fall events and promptly alert carers and first responders have gained significant interest in the past few years in order to address the societal issue of an increasing number of elderly people living alone, with the associated risk of them falling and the consequences in terms of health treatments, reduced well-being, and costs. The interest in radar and RGB-D sensors is related to their capability to enable contactless and non-intrusive monitoring, which is an advantage for practical deployment and users’ acceptance and compliance, compared with other sensor technologies, such as video-cameras or wearables. Furthermore, the possibility of combining and fusing information from these heterogeneous types of sensors is expected to improve the overall performance of practical fall detection systems. Researchers from different fields can benefit from the multidisciplinary knowledge and awareness of the latest developments in radar and RGB-D sensors that this paper discusses.
Location Privacy in Usage-Based Automotive Insurance: Attacks and Countermeasures
Usage-based insurance (UBI) is regarded as a promising way to provide accurate automotive insurance rates by analyzing the driving behaviors (e.g., speed, mileage, and harsh braking/accelerating) of drivers. The best practice that has been adopted by many insurance programs to protect users' location privacy is the use of driving speed rather than GPS data. However, in this paper, we challenge this approach by presenting a novel speed-based location trajectory inference framework. The basic strategy of the proposed inference framework is motivated by the following observations. In practice, many environmental factors, such as real-time traffic and traffic regulations, can influence the driving speed. These factors provide side-channel information about the driving route, which can be exploited to infer the vehicle's trace. We implement our discovered attack on a public data set in New Jersey. The experimental results show that the attacker has a nearly 60% probability of obtaining the real route if he chooses the top 10 candidate routes. To thwart the proposed attack, we design a privacy-preserving scoring and data auditing framework that enhances drivers' control over location privacy without affecting the utility of UBI. Our defense framework can also detect users' dishonest behavior (e.g., modification of speed data) via a probabilistic auditing scheme. Extensive experimental results validate the effectiveness of the defense framework.
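The side-channel idea can be illustrated with a toy ranking step, which is not the paper's actual inference framework: each candidate route has an expected speed profile (e.g., derived from speed limits, stop signs, and typical traffic), and candidates are ranked by how closely the observed speed trace matches it. Route names and profiles below are hypothetical.

```python
def rank_routes(observed, candidates, k=3):
    """Rank candidate routes by mean squared error between the
    observed speed trace and each route's expected speed profile;
    return the top-k best-matching route names."""
    def mse(profile):
        n = min(len(profile), len(observed))
        return sum((a - b) ** 2 for a, b in zip(observed, profile)) / n
    return sorted(candidates, key=lambda name: mse(candidates[name]))[:k]
```

A stop at a known traffic light (the zero in the trace) is exactly the kind of environmental signature that discriminates between otherwise similar routes, which is why the attacker's top-k candidate list becomes so effective.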
ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale
Incremental learning is one paradigm to enable model building and updating at
scale with streaming data. For end-to-end automatic speech recognition (ASR)
tasks, the absence of human annotated labels along with the need for privacy
preserving policies for model building makes it a daunting challenge. Motivated
by these challenges, in this paper we use a cloud based framework for
production systems to demonstrate insights from privacy preserving incremental
learning for automatic speech recognition (ILASR). By privacy preserving, we
mean the use of ephemeral data that are not human annotated. This system is a
step forward for production-level ASR models for incremental/continual learning
that offers near real-time test-bed for experimentation in the cloud for
end-to-end ASR, while adhering to privacy-preserving policies. We show that the
proposed system can improve the production models significantly (3%) over a new
time period of six months even in the absence of human annotated labels with
varying levels of weak supervision and large batch sizes in incremental
learning. This improvement is 20% over test sets with new words and phrases in
the new time period. We demonstrate the effectiveness of model building in a
privacy-preserving incremental fashion for ASR while further exploring the
utility of having an effective teacher model and the use of large batch sizes.
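The teacher-driven weak supervision described above follows the generic pseudo-labeling pattern, sketched here with a toy nearest-centroid "student" standing in for an ASR model; the function names, the confidence threshold, and the centroid update are all assumptions of this sketch, not the ILASR system's implementation.

```python
def pseudo_label_update(student, teacher, batch, threshold=0.8):
    """One incremental-learning step: the teacher assigns (label,
    confidence) pairs to an unannotated batch; only confident examples
    update the student's per-class running sums (a stand-in for
    gradient-based training). Returns how many examples were kept."""
    kept = 0
    for x in batch:
        label, conf = teacher(x)
        if conf < threshold:
            continue                      # drop weakly-supervised noise
        s, n = student.get(label, (0.0, 0))
        student[label] = (s + x, n + 1)   # running sum -> centroid
        kept += 1
    return kept

def predict(student, x):
    """Classify by nearest class centroid."""
    return min(student, key=lambda l: abs(student[l][0] / student[l][1] - x))
```

Raising the threshold trades coverage for label quality, which is the same tension the abstract describes between weak-supervision levels and model improvement.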