51,429 research outputs found

    Exploring Connections Between Active Learning and Model Extraction

    Full text link
    Machine learning is being increasingly used by individuals, research institutions, and corporations. This has resulted in the surge of Machine Learning-as-a-Service (MLaaS) - cloud services that provide (a) tools and resources to learn the model, and (b) a user-friendly query interface to access the model. However, such MLaaS systems raise privacy concerns such as model extraction. In model extraction attacks, adversaries maliciously exploit the query interface to steal the model. More precisely, in a model extraction attack, a good approximation of a sensitive or proprietary model held by the server is extracted (i.e. learned) by a dishonest user who interacts with the server only via the query interface. This attack was introduced by Tramer et al. at the 2016 USENIX Security Symposium, where practical attacks for various models were shown. We believe that better understanding the efficacy of model extraction attacks is paramount to designing secure MLaaS systems. To that end, we take the first step by (a) formalizing model extraction and discussing possible defense strategies, and (b) drawing parallels between model extraction and established area of active learning. In particular, we show that recent advancements in the active learning domain can be used to implement powerful model extraction attacks, and investigate possible defense strategies

    Active Mining of Parallel Video Streams

    Full text link
    The practicality of a video surveillance system is adversely limited by the amount of queries that can be placed on human resources and their vigilance in response. To transcend this limitation, a major effort under way is to include software that (fully or at least semi) automatically mines video footage, reducing the burden imposed to the system. Herein, we propose a semi-supervised incremental learning framework for evolving visual streams in order to develop a robust and flexible track classification system. Our proposed method learns from consecutive batches by updating an ensemble in each time. It tries to strike a balance between performance of the system and amount of data which needs to be labelled. As no restriction is considered, the system can address many practical problems in an evolving multi-camera scenario, such as concept drift, class evolution and various length of video streams which have not been addressed before. Experiments were performed on synthetic as well as real-world visual data in non-stationary environments, showing high accuracy with fairly little human collaboration

    GOOWE: Geometrically Optimum and Online-Weighted Ensemble Classifier for Evolving Data Streams

    Full text link
    Designing adaptive classifiers for an evolving data stream is a challenging task due to the data size and its dynamically changing nature. Combining individual classifiers in an online setting, the ensemble approach, is a well-known solution. It is possible that a subset of classifiers in the ensemble outperforms others in a time-varying fashion. However, optimum weight assignment for component classifiers is a problem which is not yet fully addressed in online evolving environments. We propose a novel data stream ensemble classifier, called Geometrically Optimum and Online-Weighted Ensemble (GOOWE), which assigns optimum weights to the component classifiers using a sliding window containing the most recent data instances. We map vote scores of individual classifiers and true class labels into a spatial environment. Based on the Euclidean distance between vote scores and ideal-points, and using the linear least squares (LSQ) solution, we present a novel, dynamic, and online weighting approach. While LSQ is used for batch mode ensemble classifiers, it is the first time that we adapt and use it for online environments by providing a spatial modeling of online ensembles. In order to show the robustness of the proposed algorithm, we use real-world datasets and synthetic data generators using the MOA libraries. First, we analyze the impact of our weighting system on prediction accuracy through two scenarios. Second, we compare GOOWE with 8 state-of-the-art ensemble classifiers in a comprehensive experimental environment. Our experiments show that GOOWE provides improved reactions to different types of concept drift compared to our baselines. The statistical tests indicate a significant improvement in accuracy, with conservative time and memory requirements.Comment: 33 Pages, Accepted for publication in The ACM Transactions on Knowledge Discovery from Data (TKDD) in August 201

    Spot: An accurate and efficient multi-entity device-free WLAN localization system

    Full text link
    Device-free (DF) localization in WLANs has been introduced as a value-added service that allows tracking indoor entities that do not carry any devices. Previous work in DF WLAN localization focused on the tracking of a single entity due to the intractability of the multi-entity tracking problem whose complexity grows exponentially with the number of humans being tracked. In this paper, we introduce Spot as an accurate and efficient system for multi-entity DF detection and tracking. Spot is based on a probabilistic energy minimization framework that combines a conditional random field with a Markov model to capture the temporal and spatial relations between the entities' poses. A novel cross-calibration technique is introduced to reduce the calibration overhead of multiple entities to linear, regardless of the number of humans being tracked. This also helps in increasing the system accuracy. We design the energy minimization function with the goal of being efficiently solved in mind. We show that the designed function can be mapped to a binary graph-cut problem whose solution has a linear complexity on average and a third order polynomial in the worst case. We further employ clustering on the estimated location candidates to reduce outliers and obtain more accurate tracking. Experimental evaluation in two typical testbeds, with a side-by-side comparison with the state-of-the-art, shows that Spot can achieve a multi-entity tracking accuracy of less than 1.1m. This corresponds to at least 36% enhancement in median distance error over the state-of-the-art DF localization systems, which can only track a single entity. In addition, Spot can estimate the number of entities correctly to within one difference error. This highlights that Spot achieves its goals of having an accurate and efficient software-only DF tracking solution of multiple entities in indoor environments.Comment: 14 pages, 24 figure

    Joining Sound Event Detection and Localization Through Spatial Segregation

    Full text link
    Identification and localization of sounds are both integral parts of computational auditory scene analysis. Although each can be solved separately, the goal of forming coherent auditory objects and achieving a comprehensive spatial scene understanding suggests pursuing a joint solution of the two problems. This work presents an approach that robustly binds localization with the detection of sound events in a binaural robotic system. Both tasks are joined through the use of spatial stream segregation which produces probabilistic time-frequency masks for individual sources attributable to separate locations, enabling segregated sound event detection operating on these streams. We use simulations of a comprehensive suite of test scenes with multiple co-occurring sound sources, and propose performance measures for systematic investigation of the impact of scene complexity on this segregated detection of sound types. Analyzing the effect of spatial scene arrangement, we show how a robot could facilitate high performance through optimal head rotation. Furthermore, we investigate the performance of segregated detection given possible localization error as well as error in the estimation of number of active sources. Our analysis demonstrates that the proposed approach is an effective method to obtain joint sound event location and type information under a wide range of conditions.Comment: Accepted for publication in IEEE/ACM Transactions on Audio, Speech, and Language Processin

    Active Anomaly Detection via Ensembles

    Full text link
    In critical applications of anomaly detection including computer security and fraud prevention, the anomaly detector must be configurable by the analyst to minimize the effort on false positives. One important way to configure the anomaly detector is by providing true labels for a few instances. We study the problem of label-efficient active learning to automatically tune anomaly detection ensembles and make four main contributions. First, we present an important insight into how anomaly detector ensembles are naturally suited for active learning. This insight allows us to relate the greedy querying strategy to uncertainty sampling, with implications for label-efficiency. Second, we present a novel formalism called compact description to describe the discovered anomalies and show that it can also be employed to improve the diversity of the instances presented to the analyst without loss in the anomaly discovery rate. Third, we present a novel data drift detection algorithm that not only detects the drift robustly, but also allows us to take corrective actions to adapt the detector in a principled manner. Fourth, we present extensive experiments to evaluate our insights and algorithms in both batch and streaming settings. Our results show that in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms under the streaming-data setup are competitive with the batch setup.Comment: 14 page

    Active Speakers in Context

    Full text link
    Current methods for active speak er detection focus on modeling short-term audiovisual information from a single speaker. Although this strategy can be enough for addressing single-speaker scenarios, it prevents accurate detection when the task is to identify who of many candidate speakers are talking. This paper introduces the Active Speaker Context, a novel representation that models relationships between multiple speakers over long time horizons. Our Active Speaker Context is designed to learn pairwise and temporal relations from an structured ensemble of audio-visual observations. Our experiments show that a structured feature ensemble already benefits the active speaker detection performance. Moreover, we find that the proposed Active Speaker Context improves the state-of-the-art on the AVA-ActiveSpeaker dataset achieving a mAP of 87.1%. We present ablation studies that verify that this result is a direct consequence of our long-term multi-speaker analysis

    Active Decision Boundary Annotation with Deep Generative Models

    Full text link
    This paper is on active learning where the goal is to reduce the data annotation burden by interacting with a (human) oracle during training. Standard active learning methods ask the oracle to annotate data samples. Instead, we take a profoundly different approach: we ask for annotations of the decision boundary. We achieve this using a deep generative model to create novel instances along a 1d line. A point on the decision boundary is revealed where the instances change class. Experimentally we show on three data sets that our method can be plugged-in to other active learning schemes, that human oracles can effectively annotate points on the decision boundary, that our method is robust to annotation noise, and that decision boundary annotations improve over annotating data samples.Comment: ICCV 201

    Active Betweenness Cardinality: Algorithms and Applications

    Full text link
    Centrality rankings such as degree, closeness, betweenness, Katz, PageRank, etc. are commonly used to identify critical nodes in a graph. These methods are based on two assumptions that restrict their wider applicability. First, they assume the exact topology of the network is available. Secondly, they do not take into account the activity over the network and only rely on its topology. However, in many applications, the network is autonomous, vast, and distributed, and it is hard to collect the exact topology. At the same time, the underlying pairwise activity between node pairs is not uniform and node criticality strongly depends on the activity on the underlying network. In this paper, we propose active betweenness cardinality, as a new measure, where the node criticalities are based on not the static structure, but the activity of the network. We show how this metric can be computed efficiently by using only local information for a given node and how we can find the most critical nodes starting from only a few nodes. We also show how this metric can be used to monitor a network and identify failed nodes.We present experimental results to show effectiveness by demonstrating how the failed nodes can be identified by measuring active betweenness cardinality of a few nodes in the system

    AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies

    Full text link
    Speech activity detection (or endpointing) is an important processing step for applications such as speech recognition, language identification and speaker diarization. Both audio- and vision-based approaches have been used for this task in various settings, often tailored toward end applications. However, much of the prior work reports results in synthetic settings, on task-specific datasets, or on datasets that are not openly available. This makes it difficult to compare approaches and understand their strengths and weaknesses. In this paper, we describe a new dataset which we will release publicly containing densely labeled speech activity in YouTube videos, with the goal of creating a shared, available dataset for this task. The labels in the dataset annotate three different speech activity conditions: clean speech, speech co-occurring with music, and speech co-occurring with noise, which enable analysis of model performance in more challenging conditions based on the presence of overlapping noise. We report benchmark performance numbers on AVA-Speech using off-the-shelf, state-of-the-art audio and vision models that serve as a baseline to facilitate future research.Comment: Interspeech, 201
    corecore