2,314 research outputs found

    Effective classifiers for detecting objects

    Get PDF
    Several state-of-the-art machine learning classifiers are compared for the purposes of object detection in complex images, using global image features derived from the Ohta color space and Local Binary Patterns. Image complexity in this sense refers to the degree to which the target objects are occluded and/or non-dominant (i.e. not in the foreground) in the image, and also the degree to which the images are cluttered with non-target objects. The results indicate that a voting ensemble of Support Vector Machines, Random Forests, and Boosted Decision Trees provide the best performance with AUC values of up to 0.92 and Equal Error Rate accuracies of up to 85.7% in stratified 10-fold cross validation experiments on the GRAZ02 complex image dataset

    Online learning and detection of faces with low human supervision

    Get PDF
    The final publication is available at link.springer.comWe present an efficient,online,and interactive approach for computing a classifier, called Wild Lady Ferns (WiLFs), for face learning and detection using small human supervision. More precisely, on the one hand, WiLFs combine online boosting and extremely randomized trees (Random Ferns) to compute progressively an efficient and discriminative classifier. On the other hand, WiLFs use an interactive human-machine approach that combines two complementary learning strategies to reduce considerably the degree of human supervision during learning. While the first strategy corresponds to query-by-boosting active learning, that requests human assistance over difficult samples in function of the classifier confidence, the second strategy refers to a memory-based learning which uses ¿ Exemplar-based Nearest Neighbors (¿ENN) to assist automatically the classifier. A pre-trained Convolutional Neural Network (CNN) is used to perform ¿ENN with high-level feature descriptors. The proposed approach is therefore fast (WilFs run in 1 FPS using a code not fully optimized), accurate (we obtain detection rates over 82% in complex datasets), and labor-saving (human assistance percentages of less than 20%). As a byproduct, we demonstrate that WiLFs also perform semi-automatic annotation during learning, as while the classifier is being computed, WiLFs are discovering faces instances in input images which are used subsequently for training online the classifier. The advantages of our approach are demonstrated in synthetic and publicly available databases, showing comparable detection rates as offline approaches that require larger amounts of handmade training data.Peer ReviewedPostprint (author's final draft

    Boosting for Generic 2D/3D Object Recognition

    Get PDF
    Generic object recognition is an important function of the human visual system. For an artificial vision system to be able to emulate the human perception abilities, it should also be able to perform generic object recognition. In this thesis, we address the generic object recognition problem and present different approaches and models which tackle different aspects of this difficult problem. First, we present a model for generic 2D object recognition from complex 2D images. The model exploits only appearance-based information, in the form of a combination of texture and color cues, for binary classification of 2D object classes. Learning is accomplished in a weakly supervised manner using Boosting. However, we live in a 3D world and the ability to recognize 3D objects is very important for any vision system. Therefore, we present a model for generic recognition of 3D objects from range images. Our model makes use of a combination of simple local shape descriptors extracted from range images for recognizing 3D object categories, as shape is an important information provided by range images. Moreover, we present a novel dataset for generic object recognition that provides 2D and range images about different object classes using a Time-of-Flight (ToF) camera. As the surrounding world contains thousands of different object categories, recognizing many different object classes is important as well. Therefore, we extend our generic 3D object recognition model to deal with the multi-class learning and recognition task. Moreover, we extend the multi-class recognition model by introducing a novel model which uses a combination of appearance-based information extracted from 2D images and range-based (shape) information extracted from range images for multi-class generic 3D object recognition and promising results are obtained

    Boosted Random ferns for object detection

    Get PDF
    © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.In this paper we introduce the Boosted Random Ferns (BRFs) to rapidly build discriminative classifiers for learning and detecting object categories. At the core of our approach we use standard random ferns, but we introduce four main innovations that let us bring ferns from an instance to a category level, and still retain efficiency. First, we define binary features on the histogram of oriented gradients-domain (as opposed to intensity-), allowing for a better representation of intra-class variability. Second, both the positions where ferns are evaluated within the sliding window, and the location of the binary features for each fern are not chosen completely at random, but instead we use a boosting strategy to pick the most discriminative combination of them. This is further enhanced by our third contribution, that is to adapt the boosting strategy to enable sharing of binary features among different ferns, yielding high recognition rates at a low computational cost. And finally, we show that training can be performed online, for sequentially arriving images. Overall, the resulting classifier can be very efficiently trained, densely evaluated for all image locations in about 0.1 seconds, and provides detection rates similar to competing approaches that require expensive and significantly slower processing times. We demonstrate the effectiveness of our approach by thorough experimentation in publicly available datasets in which we compare against state-of-the-art, and for tasks of both 2D detection and 3D multi-view estimation.Peer ReviewedPostprint (author's final draft
    corecore