
    Object Detection at the Optimal Scale with Hidden State Shape Models

    Hidden State Shape Models (HSSMs) [2], a variant of Hidden Markov Models (HMMs) [9], were proposed to detect shape classes of variable structure in cluttered images. In this paper, we formulate a probabilistic framework for HSSMs that provides two major improvements over the previous method [2]. First, while the method in [2] required the scale of the object to be passed as an input, the method proposed here estimates the scale of the object automatically. This is achieved by introducing a new term for the observation probability that is based on an object-clutter feature model. Second, a segmental HMM [6, 8] is applied to model the "duration probability" of each HMM state, which is learned from the shape statistics in a training set and helps obtain meaningful registration results. Using a segmental HMM provides a principled way to model dependencies between the scales of different parts of the object. In object localization experiments on a dataset of real hand images, the proposed method significantly outperforms the method of [2], reducing the incorrect localization rate from 40% to 15%. The improvement in accuracy becomes more significant if we consider that the method proposed here is scale-independent, whereas the method of [2] takes as input the scale of the object we want to localize.
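    The duration-modeling idea behind segmental HMMs can be illustrated with a generic explicit-duration Viterbi decoder, in which each state's dwell time is scored by a Gaussian whose parameters would be learned from training statistics. This is a minimal sketch of the general technique, not the authors' HSSM; the uniform start distribution and the Gaussian duration model are assumptions made for the example.

```python
import numpy as np

def log_gauss(d, mu, sigma):
    # Log of a Gaussian duration score (unnormalized over integer durations).
    return -0.5 * ((d - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def segmental_viterbi(log_obs, log_trans, dur_mu, dur_sigma, max_dur):
    """Explicit-duration Viterbi decoding.

    log_obs[t, j]  : log-likelihood of observation t under state j
    log_trans[i, j]: log transition probability from state i to j
    Returns the best segmentation as a list of (state, duration) pairs.
    """
    T, S = log_obs.shape
    delta = np.full((T + 1, S), -np.inf)
    delta[0, :] = 0.0                      # uniform (unnormalized) start
    back = {}
    # cum[t, j] = sum of log_obs[0:t, j], so each segment sum is O(1).
    cum = np.vstack([np.zeros((1, S)), np.cumsum(log_obs, axis=0)])
    for t in range(1, T + 1):
        for j in range(S):
            for d in range(1, min(max_dur, t) + 1):
                seg = cum[t, j] - cum[t - d, j]
                dur = log_gauss(d, dur_mu[j], dur_sigma[j])
                if t - d == 0:
                    score, prev = seg + dur, None
                else:
                    cand = delta[t - d] + log_trans[:, j]
                    score = np.max(cand) + seg + dur
                    prev = (t - d, int(np.argmax(cand)))
                if score > delta[t, j]:
                    delta[t, j] = score
                    back[(t, j)] = (prev, d)
    # Backtrack from the best final state.
    t, j, segs = T, int(np.argmax(delta[T])), []
    while t > 0:
        prev, d = back[(t, j)]
        segs.append((j, d))
        if prev is None:
            break
        t, j = prev
    return list(reversed(segs))
```

    On a toy sequence whose observations favor state 0 for three frames and state 1 for the next three, with mean duration 3 for both states, the decoder recovers the segmentation [(0, 3), (1, 3)].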

    A Survey on Joint Object Detection and Pose Estimation using Monocular Vision

    In this survey we present a complete landscape of joint object detection and pose estimation methods that use monocular vision. Descriptions of traditional approaches that involve descriptors or models and various estimation methods are provided. These descriptors or models include chordiograms, the shape-aware deformable parts model, bag of boundaries, distance transform templates, natural 3D markers, and facet features, whereas the estimation methods include iterative clustering estimation, probabilistic networks, and iterative genetic matching. Hybrid approaches that use handcrafted feature extraction followed by estimation with deep learning methods are outlined. We have investigated and compared, wherever possible, pure deep-learning-based approaches (single-stage and multi-stage) for this problem. Comprehensive details of the various accuracy measures and metrics are given. To provide a clear overview, the characteristics of relevant datasets are discussed. The trends that have prevailed from the infancy of this problem until now are also highlighted. Comment: Accepted at the International Joint Conference on Computer Vision and Pattern Recognition (CCVPR) 201

    Point Pair Feature based Object Detection for Random Bin Picking

    Point pair features are a popular representation for free-form 3D object detection and pose estimation. In this paper, their performance in an industrial random bin picking context is investigated. A new method to generate representative synthetic datasets is proposed. This allows us to investigate the influence of a high degree of clutter and the presence of self-similar features, which are typical of our application. We provide an overview of solutions proposed in the literature and discuss their strengths and weaknesses. A simple heuristic method to drastically reduce the computational complexity is introduced, which results in improved robustness, speed, and accuracy compared to the naive approach.
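    The point pair feature itself, as popularized by Drost et al., describes two oriented surface points by one distance and three angles. The helper below is an illustrative sketch of that four-dimensional descriptor; the function name and argument conventions are ours, not the paper's.

```python
import numpy as np

def point_pair_feature(p1, n1, p2, n2):
    """F(m1, m2) = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)),
    where d = p2 - p1 and n1, n2 are unit surface normals."""
    d = p2 - p1
    dist = np.linalg.norm(d)
    d_hat = d / dist

    def angle(u, v):
        # Clip guards against rounding slightly outside [-1, 1].
        return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

    return (dist, angle(n1, d_hat), angle(n2, d_hat), angle(n1, n2))
```

    In a typical pipeline, these features are quantized and hashed into a model table offline, then matched at runtime by a voting scheme over scene point pairs.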

    Localization from semantic observations via the matrix permanent

    Most approaches to robot localization rely on low-level geometric features such as points, lines, and planes. In this paper, we use object recognition to obtain semantic information from the robot’s sensors and consider the task of localizing the robot within a prior map of landmarks, which are annotated with semantic labels. As object recognition algorithms miss detections and produce false alarms, correct data association between the detections and the landmarks on the map is central to the semantic localization problem. Instead of the traditional vector-based representation, we propose a sensor model that encodes the semantic observations via random finite sets and enables a unified treatment of missed detections, false alarms, and data association. Our second contribution is to reduce the problem of computing the likelihood of a set-valued observation to the problem of computing a matrix permanent. It is this crucial transformation that allows us to solve the semantic localization problem with a polynomial-time approximation to the set-based Bayes filter. Finally, we address the active semantic localization problem, in which the observer’s trajectory is planned in order to improve the accuracy and efficiency of the localization process. The performance of our approach is demonstrated in simulation and in real environments using deformable-part-model-based object detectors. Robust global localization from semantic observations is demonstrated for a mobile robot, for the Project Tango phone, and on the KITTI visual odometry dataset. Comparisons are made with traditional lidar-based geometric Monte Carlo localization.
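    The matrix permanent at the heart of this reduction is #P-hard in general, but the per-observation association matrices are often small enough to evaluate exactly. A standard Ryser inclusion-exclusion implementation is sketched below; note that the paper's polynomial-time approximation is a separate, more involved algorithm.

```python
from itertools import combinations

import numpy as np

def permanent(A):
    """Permanent of an n x n matrix via Ryser's inclusion-exclusion
    formula, in O(2^n * n^2) time; exact for small n."""
    n = A.shape[0]
    total = 0.0
    for r in range(1, n + 1):
        for cols in combinations(range(n), r):
            # Row sums restricted to the chosen column subset.
            row_sums = A[:, list(cols)].sum(axis=1)
            total += (-1) ** r * np.prod(row_sums)
    return (-1) ** n * total
```

    For a 2 x 2 matrix [[a, b], [c, d]] this yields ad + bc, i.e. the determinant with the minus sign flipped to a plus.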

    Performance Assessment of Feature Detection Algorithms: A Methodology and Case Study on Corner Detectors

    In this paper we describe a generic methodology for evaluating the labeling performance of feature detectors. We describe a method for generating a test set and apply the methodology to the performance assessment of three well-known corner detectors: the Kitchen-Rosenfeld, Paler et al., and Harris-Stephens corner detectors. The labeling deficiencies of each of these detectors are related to their ability to discriminate between corners and the various features that comprise the class of non-corners.
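    For reference, the Harris-Stephens detector scores each pixel by R = det(M) - k * trace(M)^2, where M is the structure tensor of image gradients summed over a local window. The plain-NumPy sketch below uses the conventional k = 0.04 and a 3 x 3 window; these are textbook choices, not parameters taken from this paper.

```python
import numpy as np

def harris_response(img, k=0.04, win=3):
    """Harris-Stephens corner response R = det(M) - k * trace(M)^2."""
    img = img.astype(float)
    Iy, Ix = np.gradient(img)                    # central differences
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def box_sum(a):
        # Sum over a win x win neighborhood (zero padding at borders).
        pad = win // 2
        a = np.pad(a, pad)
        out = np.zeros_like(img)
        for dy in range(win):
            for dx in range(win):
                out += a[dy:dy + img.shape[0], dx:dx + img.shape[1]]
        return out

    Sxx, Syy, Sxy = box_sum(Ixx), box_sum(Iyy), box_sum(Ixy)
    return Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2
```

    On a step image with a single right-angle corner, the response peaks at the corner and goes negative along the straight edges, which is exactly the corner/non-corner discrimination the evaluation methodology probes.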

    Director Field Model of the Primary Visual Cortex for Contour Detection

    We aim to build the simplest possible model capable of detecting long, noisy contours in a cluttered visual scene. For this, we model the neural dynamics in the primate primary visual cortex in terms of a continuous director field that describes the average rate and the average orientational preference of active neurons at a particular point in the cortex. We then use a linear-nonlinear dynamical model with long-range connectivity patterns to enforce the long-range statistical context present in the analyzed images. The resulting model has substantially fewer degrees of freedom than traditional models, and yet it can distinguish large contiguous objects from the background clutter by suppressing the clutter and by filling in occluded elements of object contours. This results in high-precision, high-recall detection of large objects in cluttered scenes. Parenthetically, our model has a direct correspondence with the Landau-de Gennes theory of nematic liquid crystals in two dimensions. Comment: 9 pages, 7 figures

    SIFTing the relevant from the irrelevant: Automatically detecting objects in training images

    Many state-of-the-art object recognition systems rely on identifying the location of objects in images in order to better learn their visual attributes. In this paper, we propose four simple yet powerful hybrid ROI detection methods (combining both local and global features) based on frequently occurring keypoints. We show that our methods achieve competitive performance on two different types of datasets, the Caltech101 dataset and the GRAZ-02 dataset, where the pairs-of-keypoints bounding box method achieved the best accuracies overall.
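    As a toy illustration of ROI detection from keypoint locations (a generic robust bounding box, not the paper's hybrid local/global methods; the MAD threshold z is an assumed parameter), one can box the dominant cluster of keypoints while rejecting isolated background responses:

```python
import numpy as np

def keypoint_roi(points, z=2.0):
    """Bounding box around the dominant cluster of 2D keypoints.

    Keeps points within z median-absolute-deviations of the
    coordinate-wise median, then boxes the survivors."""
    pts = np.asarray(points, dtype=float)
    med = np.median(pts, axis=0)
    mad = np.median(np.abs(pts - med), axis=0) + 1e-9  # avoid zero MAD
    keep = np.all(np.abs(pts - med) <= z * mad, axis=1)
    inliers = pts[keep]
    (x0, y0), (x1, y1) = inliers.min(axis=0), inliers.max(axis=0)
    return x0, y0, x1, y1
```

    Given a tight cluster of keypoints around an object plus one distant clutter response, the returned box covers the cluster and excludes the outlier.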