944 research outputs found

    Online learning and detection of faces with low human supervision

    The final publication is available at link.springer.com. We present an efficient, online, and interactive approach for computing a classifier, called Wild Lady Ferns (WiLFs), for face learning and detection with little human supervision. On the one hand, WiLFs combine online boosting and extremely randomized trees (Random Ferns) to progressively compute an efficient and discriminative classifier. On the other hand, WiLFs use an interactive human-machine approach that combines two complementary learning strategies to considerably reduce the degree of human supervision during learning. The first strategy is query-by-boosting active learning, which requests human assistance on difficult samples as a function of the classifier confidence. The second strategy is memory-based learning, which uses κ Exemplar-based Nearest Neighbors (κENN) to assist the classifier automatically. A pre-trained Convolutional Neural Network (CNN) is used to perform κENN with high-level feature descriptors. The proposed approach is therefore fast (WiLFs run at 1 FPS with code that is not fully optimized), accurate (detection rates over 82% on complex datasets), and labor-saving (human assistance percentages below 20%). As a byproduct, we demonstrate that WiLFs also perform semi-automatic annotation during learning: while the classifier is being computed, WiLFs discover face instances in input images, which are subsequently used to train the classifier online. The advantages of our approach are demonstrated on synthetic and publicly available databases, showing detection rates comparable to offline approaches that require larger amounts of handmade training data. Peer reviewed. Postprint (author's final draft).

    Non-sparse Linear Representations for Visual Tracking with Online Reservoir Metric Learning

    Most sparse linear representation-based trackers need to solve a computationally expensive L1-regularized optimization problem. To address this problem, we propose a visual tracker based on non-sparse linear representations, which admit an efficient closed-form solution without sacrificing accuracy. Moreover, in order to capture the correlation information between different feature dimensions, we learn a Mahalanobis distance metric in an online fashion and incorporate the learned metric into the optimization problem for obtaining the linear representation. We show that online metric learning using proximity comparison significantly improves the robustness of the tracking, especially on those sequences exhibiting drastic appearance changes. Furthermore, in order to prevent the unbounded growth in the number of training samples for the metric learning, we design a time-weighted reservoir sampling method to maintain and update limited-sized foreground and background sample buffers for balancing sample diversity and adaptability. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker. Comment: Appearing in IEEE Conf. Computer Vision and Pattern Recognition, 201
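
    The abstract does not spell out the time-weighted reservoir sampling scheme, so the following is only a minimal sketch of one standard way to keep a bounded sample buffer that is biased toward recent frames. It swaps in weighted reservoir sampling in the style of Efraimidis and Spirakis, with a weight that grows exponentially with the frame index. The class name `TimeWeightedReservoir`, the `growth` parameter, and the exponential weighting are assumptions made for illustration, not the authors' method.

```python
import heapq
import math
import random


class TimeWeightedReservoir:
    """Fixed-capacity buffer that favours recent samples (illustrative only)."""

    def __init__(self, capacity, growth=0.05, seed=None):
        self.capacity = capacity
        self.growth = growth          # assumed: weight at time t is exp(growth * t)
        self._rng = random.Random(seed)
        self._heap = []               # min-heap of (log_key, tie_breaker, sample)
        self._count = 0

    def add(self, sample, t):
        # Weighted reservoir key is u ** (1 / w). Working with its logarithm,
        # log(u) / w = log(u) * exp(-growth * t), avoids overflow for large t.
        u = 1.0 - self._rng.random()  # lies in (0, 1], so log(u) is finite
        log_key = math.log(u) * math.exp(-self.growth * t)
        entry = (log_key, self._count, sample)
        self._count += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif log_key > self._heap[0][0]:
            # The new key beats the smallest kept key: evict that entry.
            heapq.heapreplace(self._heap, entry)

    def samples(self):
        return [sample for _, _, sample in self._heap]
```

    In a tracker along the lines described above, one such buffer could be kept for foreground samples and another for background samples, each fed once per frame; a larger `growth` trades sample diversity for faster adaptation.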

    Interactive multiple object learning with scanty human supervision

    © 2016. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ We present a fast and online human-robot interaction approach that progressively learns multiple object classifiers using scanty human supervision. Given an input video stream recorded during the human-robot interaction, the user only needs to annotate a small fraction of frames to compute object-specific classifiers based on random ferns that share the same features. The resulting methodology is fast (complex object appearances can be learned in a few seconds), versatile (it can be applied to unconstrained scenarios), scalable (real experiments show we can model up to 30 different object classes), and minimizes the amount of human intervention by leveraging the uncertainty measures associated with each classifier. We thoroughly validate the approach on synthetic data and on real sequences acquired with a mobile platform in indoor and outdoor scenarios containing a multitude of different objects. We show that, with little human assistance, we are able to build object classifiers robust to viewpoint changes, partial occlusions, varying lighting, and cluttered backgrounds. (C) 2016 Elsevier Inc. All rights reserved. Peer reviewed. Postprint (author's final draft).

    What Makes a Place? Building Bespoke Place Dependent Object Detectors for Robotics

    This paper is about enabling robots to improve their perceptual performance through repeated use in their operating environment, creating local expert detectors fitted to the places through which a robot moves. We leverage the concept of 'experiences' in visual perception for robotics, accounting for bias in the data a robot sees by fitting object detector models to a particular place. The key question we seek to answer in this paper is simply: how do we define a place? We build bespoke pedestrian detector models for autonomous driving, highlighting the necessary trade-off between generalisation and model capacity as we vary the extent of the place we fit to. We demonstrate a sizeable performance gain over a current state-of-the-art detector when using computationally lightweight bespoke place-fitted detector models. Comment: IROS 201

    Efficiently learning a detection cascade with sparse eigenvectors

    Real-time object detection has many computer vision applications. Since Viola and Jones proposed the first real-time AdaBoost-based face detection system, much effort has been spent on improving the boosting method. In this work, we first show that feature selection methods other than boosting can also be used for training an efficient object detector. In particular, we introduce greedy sparse linear discriminant analysis (GSLDA) for its conceptual simplicity and computational efficiency; and slightly better detection performance is achieved compared with. Moreover, we propose a new technique, termed boosted greedy sparse linear discriminant analysis (BGSLDA), to efficiently train a detection cascade. BGSLDA exploits the sample reweighting property of boosting and the class-separability criterion of GSLDA. Experiments in the domain of highly skewed data distributions (e.g., face detection) demonstrate that classifiers trained with the proposed BGSLDA outperform AdaBoost and its variants. This finding supports the argument that AdaBoost and similar approaches are not the only methods that can achieve high detection results for real-time object detection.

    Semi-supervised tensor-based graph embedding learning and its application to visual discriminant tracking

    An appearance model adaptable to changes in object appearance is critical in visual object tracking. In this paper, we treat an image patch as a second-order tensor, which preserves the original image structure. We design two graphs for characterizing the intrinsic local geometrical structure of the tensor samples of the object and the background. Graph embedding is used to reduce the dimensions of the tensors while preserving the structure of the graphs. Then, a discriminant embedding space is constructed. We prove two propositions for finding the transformation matrices that map the original tensor samples to the tensor-based graph embedding space. In order to encode more discriminant information in the embedding space, we propose a transfer-learning-based semi-supervised strategy that iteratively adjusts the embedding space, transferring into it discriminative information obtained from earlier times. We apply the proposed semi-supervised tensor-based graph embedding learning algorithm to visual tracking. The new tracking algorithm captures an object's appearance characteristics during tracking and uses a particle filter to estimate the optimal object state. Experimental results on the CVPR 2013 benchmark dataset demonstrate the effectiveness of the proposed tracking algorithm.

    Facial Expression Recognition

    Steady-State movement related potentials for brain–computer interfacing

    An approach for brain-computer interfacing (BCI) by analysis of steady-state movement related potentials (ssMRPs) produced during rhythmic finger movements is proposed in this paper. The neurological background of ssMRPs is briefly reviewed. Averaged ssMRPs represent the development of a lateralized rhythmic potential, and the energy of the EEG signals at the finger tapping frequency can be used for single-trial ssMRP classification. The proposed ssMRP-based BCI approach is tested using the classic Fisher's linear discriminant classifier. Moreover, the influence of the current source density transform on the performance of the BCI system is investigated. The averaged correct classification rates (CCRs) as well as averaged information transfer rates (ITRs) for different sliding time windows are reported. Reliable single-trial classification rates of 88%-100% accuracy are achievable at relatively high ITRs. Furthermore, we have been able to achieve CCRs of up to 93% in classification of the ssMRPs recorded during imagined rhythmic finger movements. The merit of this approach lies in the application of rhythmic cues for BCI, the relatively simple recording setup, and straightforward computations that make real-time implementation plausible.
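
    The abstract does not state which definition of ITR it uses. A common choice in the BCI literature is Wolpaw's formula, which depends only on the number of classes and the classification accuracy; the sketch below assumes that definition. The function names and the `trial_seconds` parameter are illustrative, and no class count or window length should be read into them from the abstract.

```python
import math


def wolpaw_itr_bits_per_trial(n_classes, accuracy):
    """Bits conveyed per trial under Wolpaw's formula (assumed definition)."""
    if n_classes < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_classes >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_classes)
    # The p * log2(p) terms are taken as 0 when p == 0.
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        error = 1.0 - accuracy
        bits += error * math.log2(error / (n_classes - 1))
    return bits


def itr_bits_per_minute(n_classes, accuracy, trial_seconds):
    """Scale the per-trial rate by the number of trials per minute."""
    return wolpaw_itr_bits_per_trial(n_classes, accuracy) * 60.0 / trial_seconds
```

    Under this definition a two-class system at chance level (50% accuracy) conveys zero bits per trial, and shorter time windows raise the per-minute rate only if accuracy does not drop too far, which is why reporting CCR and ITR together across window lengths is informative.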

    Boosted Random ferns for object detection

    © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. In this paper we introduce the Boosted Random Ferns (BRFs) to rapidly build discriminative classifiers for learning and detecting object categories. At the core of our approach we use standard random ferns, but we introduce four main innovations that let us bring ferns from an instance to a category level while retaining efficiency. First, we define binary features on the histogram of oriented gradients domain (as opposed to the intensity domain), allowing for a better representation of intra-class variability. Second, both the positions where ferns are evaluated within the sliding window and the locations of the binary features for each fern are not chosen completely at random; instead, we use a boosting strategy to pick the most discriminative combination of them. This is further enhanced by our third contribution, which is to adapt the boosting strategy to enable sharing of binary features among different ferns, yielding high recognition rates at a low computational cost. Finally, we show that training can be performed online, for sequentially arriving images. Overall, the resulting classifier can be trained very efficiently, can be densely evaluated at all image locations in about 0.1 seconds, and provides detection rates similar to competing approaches that require expensive and significantly slower processing. We demonstrate the effectiveness of our approach by thorough experimentation on publicly available datasets, in which we compare against the state of the art, for tasks of both 2D detection and 3D multi-view estimation. Peer reviewed. Postprint (author's final draft).
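
    To make concrete the "standard random ferns" that BRFs build on, here is a minimal sketch of a single fern operating on a flat feature vector: a small set of pairwise comparisons yields a binary code, and that code indexes a per-class histogram that can be updated one sample at a time. It deliberately leaves out the HOG-domain features, the boosting-based selection, and the feature sharing described in the abstract. The names `RandomFern` and `n_tests`, and the use of Laplace smoothing, are assumptions made for illustration.

```python
import math
import random


class RandomFern:
    """One fern: n_tests pairwise comparisons index a per-class histogram."""

    def __init__(self, n_features, n_tests, n_classes, seed=None):
        rng = random.Random(seed)
        # Each binary test compares two distinct, randomly chosen positions.
        self.pairs = [tuple(rng.sample(range(n_features), 2))
                      for _ in range(n_tests)]
        # Counts start at 1 (Laplace smoothing) so no likelihood is zero.
        self.counts = [[1] * (2 ** n_tests) for _ in range(n_classes)]

    def index(self, x):
        # Concatenate the test outcomes into an integer in [0, 2 ** n_tests).
        code = 0
        for a, b in self.pairs:
            code = (code << 1) | (1 if x[a] > x[b] else 0)
        return code

    def update(self, x, label):
        # Online training: one histogram increment per arriving sample.
        self.counts[label][self.index(x)] += 1

    def log_likelihoods(self, x):
        # log P(code | class) for every class, from the smoothed histograms.
        i = self.index(x)
        return [math.log(c[i] / sum(c)) for c in self.counts]
```

    A full classifier would typically sum `log_likelihoods` over many ferns and pick the class with the largest total; in BRFs, as the abstract describes, the ferns and their features are chosen by boosting rather than purely at random.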