Exploring Connections Between Active Learning and Model Extraction
Machine learning is being increasingly used by individuals, research
institutions, and corporations. This has resulted in the surge of Machine
Learning-as-a-Service (MLaaS) - cloud services that provide (a) tools and
resources to learn the model, and (b) a user-friendly query interface to access
the model. However, such MLaaS systems raise privacy concerns such as model
extraction. In model extraction attacks, adversaries maliciously exploit the
query interface to steal the model. More precisely, in a model extraction
attack, a good approximation of a sensitive or proprietary model held by the
server is extracted (i.e. learned) by a dishonest user who interacts with the
server only via the query interface. This attack was introduced by Tramer et
al. at the 2016 USENIX Security Symposium, where practical attacks for various
models were shown. We believe that better understanding the efficacy of model
extraction attacks is paramount to designing secure MLaaS systems. To that end,
we take the first step by (a) formalizing model extraction and discussing
possible defense strategies, and (b) drawing parallels between model extraction
and the established area of active learning. In particular, we show that recent
advancements in the active learning domain can be used to implement powerful
model extraction attacks, and investigate possible defense strategies.
Active Mining of Parallel Video Streams
The practicality of a video surveillance system is adversely limited by the
number of queries that can be placed on human resources and their vigilance in
response. To transcend this limitation, a major effort under way is to include
software that (fully or at least semi) automatically mines video footage,
reducing the burden imposed on the system. Herein, we propose a semi-supervised
incremental learning framework for evolving visual streams in order to develop
a robust and flexible track classification system. Our proposed method learns
from consecutive batches by updating an ensemble each time. It tries to
strike a balance between performance of the system and amount of data which
needs to be labelled. As no restriction is considered, the system can address
many practical problems in an evolving multi-camera scenario, such as concept
drift, class evolution, and varying lengths of video streams, which have not been
addressed before. Experiments were performed on synthetic as well as real-world
visual data in non-stationary environments, showing high accuracy with fairly
little human collaboration.
GOOWE: Geometrically Optimum and Online-Weighted Ensemble Classifier for Evolving Data Streams
Designing adaptive classifiers for an evolving data stream is a challenging
task due to the data size and its dynamically changing nature. Combining
individual classifiers in an online setting, the ensemble approach, is a
well-known solution. It is possible that a subset of classifiers in the
ensemble outperforms others in a time-varying fashion. However, optimum weight
assignment for component classifiers is a problem which is not yet fully
addressed in online evolving environments. We propose a novel data stream
ensemble classifier, called Geometrically Optimum and Online-Weighted Ensemble
(GOOWE), which assigns optimum weights to the component classifiers using a
sliding window containing the most recent data instances. We map vote scores of
individual classifiers and true class labels into a spatial environment. Based
on the Euclidean distance between vote scores and ideal-points, and using the
linear least squares (LSQ) solution, we present a novel, dynamic, and online
weighting approach. While LSQ has been used for batch-mode ensemble classifiers,
this is the first time it is adapted and used for online environments, by providing
a spatial modeling of online ensembles. In order to show the robustness of the
proposed algorithm, we use real-world datasets and synthetic data generators
using the MOA libraries. First, we analyze the impact of our weighting system
on prediction accuracy through two scenarios. Second, we compare GOOWE with 8
state-of-the-art ensemble classifiers in a comprehensive experimental
environment. Our experiments show that GOOWE provides improved reactions to
different types of concept drift compared to our baselines. The statistical
tests indicate a significant improvement in accuracy, with conservative time
and memory requirements.
Comment: 33 Pages, Accepted for publication in The ACM Transactions on
Knowledge Discovery from Data (TKDD) in August 201
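The weighting idea in the GOOWE abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each component classifier emits a vote-score vector over the classes, that the "ideal point" for an instance is the one-hot vector of its true label, and that the weights minimize the summed squared Euclidean distance between the weighted score combination and the ideal points over a sliding window. The names `solve_linear`, `lsq_weights`, and the small `ridge` term (added only to keep the system solvable) are hypothetical.

```python
from collections import deque


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without mutating the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def lsq_weights(window, ridge=1e-9):
    """Least-squares weights for the classifiers seen in `window`.

    `window` holds (scores, label) pairs, where scores[j] is the
    vote-score vector of classifier j and label is the true class index.
    Solves the normal equations A w = d with
      A[q][j] = sum over instances of scores[q] . scores[j]
      d[q]    = sum over instances of scores[q][label]
    (the dot product of scores[q] with the one-hot ideal point).
    """
    window = list(window)
    k = len(window[0][0])  # number of component classifiers
    a = [[0.0] * k for _ in range(k)]
    d = [0.0] * k
    for scores, label in window:
        for q in range(k):
            d[q] += scores[q][label]
            for j in range(k):
                a[q][j] += sum(x * y for x, y in zip(scores[q], scores[j]))
    for q in range(k):
        a[q][q] += ridge  # assumption: tiny regularizer, not from the abstract
    return solve_linear(a, d)
```

In use, a `deque(maxlen=n)` can serve as the sliding window: append one `(scores, label)` pair per labelled instance and call `lsq_weights(window)` to refresh the weights. A classifier whose scores track the true labels receives a weight near 1, while one that is consistently wrong is pushed toward 0.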
Spot: An accurate and efficient multi-entity device-free WLAN localization system
Device-free (DF) localization in WLANs has been introduced as a value-added
service that allows tracking indoor entities that do not carry any devices.
Previous work in DF WLAN localization focused on the tracking of a single
entity due to the intractability of the multi-entity tracking problem whose
complexity grows exponentially with the number of humans being tracked. In this
paper, we introduce Spot as an accurate and efficient system for multi-entity
DF detection and tracking. Spot is based on a probabilistic energy minimization
framework that combines a conditional random field with a Markov model to
capture the temporal and spatial relations between the entities' poses. A novel
cross-calibration technique is introduced to reduce the calibration overhead of
multiple entities to linear, regardless of the number of humans being tracked.
This also helps increase the system accuracy. We design the energy
minimization function so that it can be solved efficiently. We
show that the designed function can be mapped to a binary graph-cut problem
whose solution has a linear complexity on average and a third order polynomial
in the worst case. We further employ clustering on the estimated location
candidates to reduce outliers and obtain more accurate tracking. Experimental
evaluation in two typical testbeds, with a side-by-side comparison with the
state-of-the-art, shows that Spot can achieve a multi-entity tracking accuracy
of less than 1.1m. This corresponds to at least 36% enhancement in median
distance error over the state-of-the-art DF localization systems, which can
only track a single entity. In addition, Spot can estimate the number of
entities correctly to within one difference error. This highlights that Spot
achieves its goals of having an accurate and efficient software-only DF
tracking solution of multiple entities in indoor environments.
Comment: 14 pages, 24 figure
Joining Sound Event Detection and Localization Through Spatial Segregation
Identification and localization of sounds are both integral parts of
computational auditory scene analysis. Although each can be solved separately,
the goal of forming coherent auditory objects and achieving a comprehensive
spatial scene understanding suggests pursuing a joint solution of the two
problems. This work presents an approach that robustly binds localization with
the detection of sound events in a binaural robotic system. Both tasks are
joined through the use of spatial stream segregation which produces
probabilistic time-frequency masks for individual sources attributable to
separate locations, enabling segregated sound event detection operating on
these streams. We use simulations of a comprehensive suite of test scenes with
multiple co-occurring sound sources, and propose performance measures for
systematic investigation of the impact of scene complexity on this segregated
detection of sound types. Analyzing the effect of spatial scene arrangement, we
show how a robot could facilitate high performance through optimal head
rotation. Furthermore, we investigate the performance of segregated detection
given possible localization error as well as error in the estimation of number
of active sources. Our analysis demonstrates that the proposed approach is an
effective method to obtain joint sound event location and type information
under a wide range of conditions.
Comment: Accepted for publication in IEEE/ACM Transactions on Audio, Speech,
and Language Processin
Active Anomaly Detection via Ensembles
In critical applications of anomaly detection including computer security and
fraud prevention, the anomaly detector must be configurable by the analyst to
minimize the effort on false positives. One important way to configure the
anomaly detector is by providing true labels for a few instances. We study the
problem of label-efficient active learning to automatically tune anomaly
detection ensembles and make four main contributions. First, we present an
important insight into how anomaly detector ensembles are naturally suited for
active learning. This insight allows us to relate the greedy querying strategy
to uncertainty sampling, with implications for label-efficiency. Second, we
present a novel formalism called compact description to describe the discovered
anomalies and show that it can also be employed to improve the diversity of the
instances presented to the analyst without loss in the anomaly discovery rate.
Third, we present a novel data drift detection algorithm that not only detects
the drift robustly, but also allows us to take corrective actions to adapt the
detector in a principled manner. Fourth, we present extensive experiments to
evaluate our insights and algorithms in both batch and streaming settings. Our
results show that in addition to discovering significantly more anomalies than
state-of-the-art unsupervised baselines, our active learning algorithms under
the streaming-data setup are competitive with the batch setup.
Comment: 14 page
Active Speakers in Context
Current methods for active speaker detection focus on modeling short-term
audiovisual information from a single speaker. Although this strategy can be
enough for addressing single-speaker scenarios, it prevents accurate detection
when the task is to identify which of many candidate speakers is talking. This
paper introduces the Active Speaker Context, a novel representation that models
relationships between multiple speakers over long time horizons. Our Active
Speaker Context is designed to learn pairwise and temporal relations from a
structured ensemble of audio-visual observations. Our experiments show that a
structured feature ensemble already benefits the active speaker detection
performance. Moreover, we find that the proposed Active Speaker Context
improves the state-of-the-art on the AVA-ActiveSpeaker dataset achieving a mAP
of 87.1%. We present ablation studies that verify that this result is a direct
consequence of our long-term multi-speaker analysis.
Active Decision Boundary Annotation with Deep Generative Models
This paper is on active learning where the goal is to reduce the data
annotation burden by interacting with a (human) oracle during training.
Standard active learning methods ask the oracle to annotate data samples.
Instead, we take a profoundly different approach: we ask for annotations of the
decision boundary. We achieve this using a deep generative model to create
novel instances along a 1d line. A point on the decision boundary is revealed
where the instances change class. Experimentally we show on three data sets
that our method can be plugged into other active learning schemes, that human
oracles can effectively annotate points on the decision boundary, that our
method is robust to annotation noise, and that decision boundary annotations
improve over annotating data samples.
Comment: ICCV 201
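The boundary-query idea in the abstract above (instances generated along a 1d line, with the boundary revealed where the class changes) can be illustrated with a small sketch. It rests on several assumptions that are not in the abstract: the deep generative model is left out, so points on the line are used directly instead of being decoded into instances; the human oracle is replaced by a callable `oracle` that returns a class label; and the change point is located by bisection, which is a stand-in for the human annotation step rather than the paper's procedure. The names `interpolate` and `find_boundary` are hypothetical.

```python
def interpolate(z0, z1, t):
    """Point at fraction t along the straight line from z0 to z1."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def find_boundary(z0, z1, oracle, tol=1e-6):
    """Locate where the oracle's label changes on the segment z0 -> z1.

    Returns a point near the class change, or None if both endpoints
    receive the same label (no boundary is revealed on this line).
    Assumes the label changes only once along the segment.
    """
    lo, hi = 0.0, 1.0
    label_lo = oracle(interpolate(z0, z1, lo))
    if oracle(interpolate(z0, z1, hi)) == label_lo:
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if oracle(interpolate(z0, z1, mid)) == label_lo:
            lo = mid  # still on the starting side of the boundary
        else:
            hi = mid  # already crossed the boundary
    return interpolate(z0, z1, (lo + hi) / 2.0)
```

For example, with a simulated oracle such as `lambda z: int(z[0] + z[1] > 1.0)`, calling `find_boundary([0.0, 0.0], [2.0, 0.0], oracle)` returns a point close to where the line crosses `z[0] + z[1] = 1`. Each returned point is one boundary annotation that a downstream active learning scheme could consume.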
Active Betweenness Cardinality: Algorithms and Applications
Centrality rankings such as degree, closeness, betweenness, Katz, PageRank,
etc. are commonly used to identify critical nodes in a graph. These methods are
based on two assumptions that restrict their wider applicability. First, they
assume the exact topology of the network is available. Secondly, they do not
take into account the activity over the network and only rely on its topology.
However, in many applications, the network is autonomous, vast, and
distributed, and it is hard to collect the exact topology. At the same time,
the underlying pairwise activity between node pairs is not uniform and node
criticality strongly depends on the activity on the underlying network.
In this paper, we propose active betweenness cardinality, as a new measure,
where the node criticalities are based not on the static structure but on the
activity of the network. We show how this metric can be computed efficiently by
using only local information for a given node and how we can find the most
critical nodes starting from only a few nodes. We also show how this metric can
be used to monitor a network and identify failed nodes. We present experimental
results to show effectiveness by demonstrating how the failed nodes can be
identified by measuring active betweenness cardinality of a few nodes in the
system.
AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies
Speech activity detection (or endpointing) is an important processing step
for applications such as speech recognition, language identification and
speaker diarization. Both audio- and vision-based approaches have been used for
this task in various settings, often tailored toward end applications. However,
much of the prior work reports results in synthetic settings, on task-specific
datasets, or on datasets that are not openly available. This makes it difficult
to compare approaches and understand their strengths and weaknesses. In this
paper, we describe a new dataset, which we will release publicly, containing
densely labeled speech activity in YouTube videos, with the goal of creating a
shared, available dataset for this task. The labels in the dataset annotate
three different speech activity conditions: clean speech, speech co-occurring
with music, and speech co-occurring with noise, which enable analysis of model
performance in more challenging conditions based on the presence of overlapping
noise. We report benchmark performance numbers on AVA-Speech using
off-the-shelf, state-of-the-art audio and vision models that serve as a
baseline to facilitate future research.
Comment: Interspeech, 201