
    Classification problems in object-based representation systems

    Conference with proceedings and peer review. Classification is a process that consists of two dual operations: generating a set of classes and then classifying given objects into the created classes. The class generation may be understood as a learning process and object classification as a problem-solving process. The goal of this position paper is to introduce and make precise the notion of a classification problem in object-based representation systems, e.g. a query against a class hierarchy, to define a subsumption relation between classification problems, and to analyze the way a classification problem can be solved with respect to a class hierarchy.

    Toward a perceptual object recognition system

    [1] demonstrated that humans are easily able to recognize an object in less than 0.5 seconds. Unfortunately, object recognition remains one of the most challenging problems in computer vision. Many algorithms based on local approaches have been proposed in recent decades. Local approaches can be divided into four phases: region selection, region appearance description, image representation and classification [2]. Although these systems have demonstrated excellent performance, some weaknesses remain. The first limitation is in the region selection phase. Many existing techniques extract a large number of points/regions of interest. For instance, dense grids contain tens of thousands of points per image, while interest point detectors often extract thousands of points. Furthermore, some studies have demonstrated that these techniques were not designed to detect the most pertinent regions for object recognition. There is only a weak correlation between the distribution of extracted points and eye fixations [3]. The second limitation mentioned in the literature concerns the region appearance description phase. The techniques used in this phase typically describe image regions using high-dimensional vectors [4]. For example, SIFT, the most popular descriptor for object recognition, produces a 128-dimensional vector per region [5]. The main objective of this thesis is to propose a pipeline for an object recognition algorithm based on human perception which addresses the complexity of object recognition systems, namely query run time and memory allocation. In this context, we propose a filter based on a visual attention system [6] to address the problem of existing region selection techniques extracting a large number of points of interest. We chose to use bottom-up visual attention systems that encode attentional fixations in a topographic map, known as a saliency map. 
This map serves as the basis for generating a mask to select salient points according to human interest, from the points extracted by a region selection technique [7]. Furthermore, we addressed the problem of the high dimensionality of descriptors in the region appearance description phase. We proposed a new hybrid descriptor representing the spatial frequency of some perceptual features extracted by a visual attention system (color, texture, intensity) [8]. This descriptor consists of a concatenation of energy measures computed at the output of a filter bank [9], at each level of the multi-resolution pyramid of perceptual features. This descriptor has the advantage of being lower dimensional than traditional descriptors. The test of our filtering approach, using the Perreira da Silva system [10] as a filter on VOC2005, demonstrated that we can maintain approximately the same performance of an object recognition system by selecting only 40% of the extracted points (using Harris-Laplace [11] and Laplacian [12]), while substantially reducing complexity (40% reduction in query run time). Furthermore, evaluating our descriptor with an object recognition system using Harris-Laplace and Laplacian interest point detectors on the VOC2007 database showed a slight decrease in performance (5% reduction of average precision) compared to the original system based on the SIFT descriptor, but with a 50% reduction in complexity. In addition, we evaluated our descriptor using a visual attention system as the region selection technique on VOC2005. The experiment showed a slight decrease in performance (3% reduction in precision), but a drastically reduced complexity of the system (with 5% reduction in query run-time and 70% in complexity). In this thesis, we proposed two approaches to manage the problems of complexity in object recognition systems. 
In the future, it would be interesting to address the problems of the last two phases of an object recognition system, image representation and classification, by introducing perceptually plausible concepts such as deep learning techniques.
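
The saliency-based filtering step described in this abstract can be illustrated with a minimal sketch. It assumes the saliency map is a 2D grid of values indexed as `saliency[y][x]` and that the mask is obtained by a simple threshold; the function name `select_salient_points` and the `threshold` parameter are hypothetical and are not taken from the thesis, which may build its mask differently.

```python
def select_salient_points(points, saliency, threshold):
    """Keep only the interest points that land on salient pixels.

    points    -- iterable of (x, y) integer pixel coordinates
    saliency  -- 2D list of saliency values, indexed as saliency[y][x]
    threshold -- assumed cut-off; pixels at or above it count as salient
    """
    kept = []
    for x, y in points:
        # The binary mask is implicit: a pixel is "on" when its saliency
        # reaches the threshold.
        if saliency[y][x] >= threshold:
            kept.append((x, y))
    return kept


# Toy 3x3 saliency map with two salient pixels.
toy_map = [
    [0.1, 0.2, 0.9],
    [0.0, 0.8, 0.3],
    [0.1, 0.1, 0.2],
]
toy_points = [(2, 0), (0, 0), (1, 1), (2, 2)]
salient = select_salient_points(toy_points, toy_map, threshold=0.5)
```

In this toy case only the points at `(2, 0)` and `(1, 1)` survive, so later description and matching stages handle half as many points.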

    Class Representation of Shapes Using Qualitative-codes

    This paper introduces our qualitative shape representation formalism that is devised to overcome, as we have argued, the class abstraction problems created by numeric schemes. The numeric shape representation method used in conventional geometric modeling systems reveals difficulties in several aspects of architectural designing. Firstly, numeric schemes strongly require complete and detailed information for any simple task of object modeling. This requirement of information completeness makes it hard to apply numeric schemes to shapes in sketch level drawings that are characteristically ambiguous and have non-specific limitations on shape descriptions. Secondly, Cartesian coordinate-based quantitative shape representation schemes show restrictions in the task of shape comparison and classification that are inevitably involved in abstract concepts related to shape characteristics. One of the reasons why quantitative schemes are difficult to apply to the abstraction of individual shape information into its classes and categories is the uniqueness property, meaning that an individual description in a quantitative scheme should refer to only one object in the domain of representation. A class representation, however, should be able to indicate not only one but also a group of objects sharing common characteristics. Thirdly, it is difficult or inefficient to apply numeric shape representation schemes based on the Cartesian coordinate system to preliminary shape analysis and modeling tasks because of their emphasis on issues, such as detail, completeness, uniqueness and individuality, which can only be accessed in the final stages of designing. Therefore, we face the need for alternative shape representation schemes that can handle class representation of objects in order to manage the shapes in the early stages of designing. We consider shape as a boundary description consisting of a set of connected and closed lines. 
Moreover, we need to consider non-numeric approaches to overcome the problems caused by quantitative representation approaches. This paper introduces a qualitative approach to shape representation that contrasts with conventional numeric techniques. This research is motivated by ideas and methodologies from related studies such as qualitative formalism ([4], [6], [19], [13], [31]), qualitative abstraction [16], qualitative vector algebra ([7], [32]), qualitative shapes ([18], [23], [21]), and coding theory ([20], [25], [26], [1], [2], [3], [22]). We develop a qualitative shape representation scheme by adopting propitious aspects of the above techniques to suit the needs of our shape comparison and analysis tasks. The qualitative shape-encoding scheme converts shapes into systematically constructed qualitative symbols called Q-codes. This paper explains how the Q-code scheme is developed and applied.

    DART: Distribution Aware Retinal Transform for Event-based Cameras

    We introduce a generic visual descriptor, termed distribution aware retinal transform (DART), that encodes the structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-features classification framework and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101). (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) To overcome the low-sample problem in the one-shot learning of a binary classifier, statistical bootstrapping is leveraged with online learning; (ii) To achieve tracker robustness, the scale and rotation equivariance property of the DART descriptors is exploited for the one-shot learning. (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to obtain a high intersection-over-union score with augmented ground truth annotations on the publicly available event camera dataset. (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain. Comment: 12 pages, revision submitted to TPAMI in Nov 201
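
To give a sense of what encoding context on a log-polar grid can look like, here is a minimal sketch of a generic log-polar histogram around a reference event. It is not the DART descriptor itself: the function name `log_polar_histogram`, the parameters `n_rings`, `n_wedges` and `r_max`, and the normalization are all assumptions made for illustration.

```python
import math


def log_polar_histogram(center, events, n_rings=3, n_wedges=4, r_max=10.0):
    """Count neighbouring events in log-spaced rings and angular wedges.

    center -- (x, y) of the reference event
    events -- iterable of (x, y) neighbouring event coordinates
    Returns a normalized histogram of length n_rings * n_wedges.
    """
    cx, cy = center
    hist = [0.0] * (n_rings * n_wedges)
    for x, y in events:
        dx, dy = x - cx, y - cy
        r = math.hypot(dx, dy)
        if r < 1.0 or r > r_max:
            continue  # skip the center itself and events outside the grid
        # Ring index grows with log(r), so inner rings are finer than outer ones.
        ring = min(int(n_rings * math.log(r) / math.log(r_max)), n_rings - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        wedge = min(int(n_wedges * theta / (2 * math.pi)), n_wedges - 1)
        hist[ring * n_wedges + wedge] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Two events fall in the innermost ring, one in the middle ring;
# the event at the center and the one beyond r_max are ignored.
toy_events = [(1, 0), (-3, 0), (1, 1), (0, 0), (20, 0)]
toy_hist = log_polar_histogram((0, 0), toy_events)
```

The logarithmic ring spacing gives finer spatial resolution near the reference event than far from it, which is the usual motivation for log-polar layouts.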