152,040 research outputs found

    Object recognition using shape-from-shading

    This paper investigates whether surface topography information extracted from intensity images using a recently reported shape-from-shading (SFS) algorithm can be used for the purposes of 3D object recognition. We consider how curvature and shape-index information delivered by this algorithm can be used to recognize objects based on their surface topography. We explore two contrasting object recognition strategies. The first of these is based on a low-level attribute summary and uses histograms of curvature and orientation measurements. The second approach is based on the structural arrangement of constant shape-index maximal patches and their associated region attributes. We show that region curvedness and a string ordering of the regions according to size provide recognition accuracy of about 96 percent. By polling various recognition schemes, including a graph matching method, we show that a recognition rate of 98-99 percent is achievable.
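To make the shape-index and curvedness attributes concrete, here is a minimal sketch using one common convention for both quantities, computed from the two principal curvatures. The abstract does not give its definitions, and sign conventions for the shape index differ between papers, so the formulas, the function names, and the binning are assumptions for illustration only.

```python
import math

def shape_index(k1, k2):
    """Shape index in [-1, 1] from two principal curvatures (assumed convention)."""
    kmax, kmin = max(k1, k2), min(k1, k2)
    # atan2 avoids a division by zero at umbilic points (kmax == kmin).
    # For a perfectly flat point (both zero) it returns 0, although the
    # shape index is strictly undefined there.
    return (2.0 / math.pi) * math.atan2(kmax + kmin, kmax - kmin)

def curvedness(k1, k2):
    """Curvedness: root-mean-square of the principal curvatures."""
    return math.sqrt((k1 * k1 + k2 * k2) / 2.0)

def shape_index_histogram(curvatures, bins=8):
    """Normalised histogram of shape index over (k1, k2) pairs."""
    counts = [0] * bins
    for k1, k2 in curvatures:
        s = shape_index(k1, k2)
        # Map [-1, 1] onto bin indices 0 .. bins-1, clamping the upper edge.
        idx = min(int((s + 1.0) / 2.0 * bins), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts
```

A histogram like this is one possible form of the "low-level attribute summary" the abstract mentions; two objects could then be compared by any histogram distance.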

    Image segmentation with adaptive region growing based on a polynomial surface model

    A new method for segmenting intensity images into smooth surface segments is presented. The main idea is to divide the image into flat, planar, convex, concave, and saddle patches that coincide as well as possible with meaningful object features in the image. Therefore, we propose an adaptive region growing algorithm based on low-degree polynomial fitting. The algorithm uses a new adaptive thresholding technique with the L∞ fitting cost as a segmentation criterion. The polynomial degree and the fitting error are automatically adapted during the region growing process. The main contribution is that the algorithm detects outliers and edges, distinguishes between strong and smooth intensity transitions and finds surface segments that are bent in a certain way. As a result, the surface segments corresponding to meaningful object features and the contours separating the surface segments coincide with real-image object edges. Moreover, the curvature-based surface shape information facilitates many tasks in image analysis, such as object recognition performed on the polynomial representation. The polynomial representation provides good image approximation while preserving all the necessary details of the objects in the reconstructed images. The method outperforms existing techniques when segmenting images of objects with diffuse reflecting surfaces
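The region-growing idea can be sketched in one dimension: extend a segment along a scanline while a low-degree polynomial still fits it within a maximum-residual (L∞) tolerance. This sketch departs from the abstract in several labelled ways: it works on a 1D signal rather than an image, fixes the degree at 1 and the tolerance `tol` instead of adapting them, and fits the line by least squares rather than by minimising the L∞ cost directly. All function names are hypothetical.

```python
def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    if n == 1:
        return 0.0, float(ys[0])
    mx = (n - 1) / 2.0
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def max_residual(ys):
    """L-infinity error of the least-squares line on ys."""
    slope, intercept = fit_line(ys)
    return max(abs(y - (slope * x + intercept)) for x, y in enumerate(ys))

def grow_segments(signal, tol):
    """Split signal into half-open (start, end) segments that a line fits within tol."""
    segments = []
    start = 0
    n = len(signal)
    while start < n:
        end = start + 1
        # Try to absorb the next sample; stop at the first one that breaks the fit.
        while end < n and max_residual(signal[start:end + 1]) <= tol:
            end += 1
        segments.append((start, end))
        start = end
    return segments
```

For example, a ramp followed by a plateau with a jump between them is split at the jump, which corresponds to the edge-detecting behaviour described above.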

    Detecting and Grouping Identical Objects for Region Proposal and Classification

    Often multiple instances of an object occur in the same scene, for example in a warehouse. Unsupervised multi-instance object discovery algorithms are able to detect and identify such objects. We use such an algorithm to provide object proposals to a convolutional neural network (CNN) based classifier. This results in fewer regions to evaluate, compared to traditional region proposal algorithms. Additionally, it enables using the joint probability of multiple instances of an object, resulting in improved classification accuracy. The proposed technique can also split a single class into multiple sub-classes corresponding to the different object types, enabling hierarchical classification. Comment: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Workshop Deep Learning for Robotic Vision, 21 July, 2017, Honolulu, Hawaii.
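One simple way to use the joint probability of several instances is to treat the per-instance classifier outputs as independent and multiply them. The abstract does not say how the joint probability is actually computed, so the independence assumption, the function name `joint_class_posterior`, and the `eps` floor below are illustrative only.

```python
import math

def joint_class_posterior(instance_probs, eps=1e-12):
    """Combine per-instance class probabilities for a group of identical objects.

    instance_probs: list of probability vectors, one per detected instance,
    all over the same classes. Assumes the instances are independent.
    """
    n_classes = len(instance_probs[0])
    log_scores = [0.0] * n_classes
    for probs in instance_probs:
        for c, p in enumerate(probs):
            # Sum logs instead of multiplying to avoid underflow; floor at eps.
            log_scores[c] += math.log(max(p, eps))
    # Subtract the peak before exponentiating for numerical stability.
    peak = max(log_scores)
    weights = [math.exp(s - peak) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]
```

With three instances scored `[0.6, 0.4]`, `[0.6, 0.4]` and `[0.4, 0.6]`, the combined posterior favours the first class even though one instance disagrees.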

    An Image Based Feature Space and Mapping for Linking Regions and Words

    We propose an image based feature space and define a mapping of both image regions and textual labels into that space. We believe the embedding of both image regions and labels into the same space in this way is novel, and makes object recognition more straightforward. Each dimension of the space corresponds to an image from the database. The coordinates of an image segment (region) are calculated based on its distance to the closest segment within each of the images, while the coordinates of a label are generated based on its association with the images. As a result, similar image segments associated with the same objects are clustered together in this feature space, and should also be close to the labels representing the object. The link between image regions and words can be discovered from their separation in the feature space. The algorithm is applied to an image collection and preliminary results are encouraging.
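A toy version of this embedding might look as follows. The region coordinates follow the description above (distance to the closest segment in each database image). The label coordinates are an assumption, since the abstract only says they are based on the label's association with the images: here a label gets 0 for images it annotates and a fixed `far` value otherwise. Segment descriptors are plain feature vectors compared by Euclidean distance, and all names are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def embed_region(region, database):
    """One coordinate per database image: distance to that image's closest segment.

    database: list of images, each a non-empty list of segment descriptors.
    """
    return [min(euclidean(region, seg) for seg in image) for image in database]

def embed_label(label, annotations, far=1.0):
    """Assumed label embedding: 0 where the label annotates the image, else `far`.

    annotations: list of label sets, one per database image.
    """
    return [0.0 if label in labels else far for labels in annotations]

def closest_label(region, database, annotations, vocabulary):
    """Return the vocabulary word nearest to the region in the shared space."""
    point = embed_region(region, database)
    return min(vocabulary,
               key=lambda w: euclidean(point, embed_label(w, annotations)))
```

In practice the segment distances would need to be scaled to a range comparable with `far`; the sketch leaves that out.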

    Action intention recognition for proactive human assistance in domestic environments

    The current Master’s Thesis in Automatics, Control and Robotics covers the development and implementation of an Action Intention Recognition algorithm for proactive human assistance in domestic environments. The proposed solution is based on the use of data provided by a real-time RGBD Object Recognition process which captures object state changes inside a defined region of interest of the domestic environment setup. A background analysis is performed to review state-of-the-art approaches to both real-time RGBD object recognition and action intention recognition methods. The preliminary analysis serves as the base for the proposal of a new volume descriptor for object categorization and an improved formalism for Activation Spreading Networks in the context of action intention recognition. Several tests are performed to study the performance of the proposed solution, and the results are analyzed to draw the conclusions of the project and propose future work. Finally, the project budget and environmental impact as well as the project schedule are presented and briefly discussed.

    Simultaneous Object Recognition and Segmentation from Single or Multiple Model Views

    We present a novel Object Recognition approach based on affine invariant regions. It actively counters the problems related to the limited repeatability of the region detectors, and the difficulty of matching, in the presence of large amounts of background clutter and particularly challenging viewing conditions. After producing an initial set of matches, the method gradually explores the surrounding image areas, recursively constructing more and more matching regions, increasingly farther from the initial ones. This process covers the object with matches, and simultaneously separates the correct matches from the wrong ones. Hence, recognition and segmentation are achieved at the same time. The approach includes a mechanism for capturing the relationships between multiple model views and exploiting these for integrating the contributions of the views at recognition time. This is based on an efficient algorithm for partitioning a set of region matches into groups lying on smooth surfaces. Integration is achieved by measuring the consistency of configurations of groups arising from different model views. Experimental results demonstrate the stronger power of the approach in dealing with extensive clutter, dominant occlusion, and large scale and viewpoint changes. Non-rigid deformations are explicitly taken into account, and the approximate contours of the object are produced. All presented techniques can extend any viewpoint-invariant feature extractor.

    Toward a perceptual object recognition system

    [1] demonstrated that humans are easily able to recognize an object in less than 0.5 seconds. Unfortunately, object recognition remains one of the most challenging problems in computer vision. Many algorithms based on local approaches have been proposed in recent decades. Local approaches can be divided into four phases: region selection, region appearance description, image representation and classification [2]. Although these systems have demonstrated excellent performance, some weaknesses remain. The first limitation is in the region selection phase. Many existing techniques extract a large number of points/regions of interest. For instance, dense grids contain tens of thousands of points per image while interest point detectors often extract thousands of points. Furthermore, some studies have demonstrated that these techniques were not designed to detect the most pertinent regions for object recognition. There is only a weak correlation between the distribution of extracted points and eye fixations [3]. The second limitation mentioned in the literature concerns the region appearance description phase. The techniques used in this phase typically describe image regions using high-dimensional vectors [4]. For example, SIFT, the most popular descriptor for object recognition, produces a 128-dimensional vector per region [5].
    The main objective of this thesis is to propose a pipeline for an object recognition algorithm based on human perception which addresses the complexity of object recognition systems: query run time and memory allocation. In this context, we propose a filter based on a visual attention system [6] to address the problem of existing region selection techniques extracting a large number of points of interest. We chose to use bottom-up visual attention systems that encode attentional fixations in a topographic map, known as a saliency map. This map serves as the basis for generating a mask to select salient points according to human interest, from the points extracted by a region selection technique [7]. Furthermore, we addressed the problem of high dimensionality of descriptors in the region appearance description phase. We proposed a new hybrid descriptor representing the spatial frequency of some perceptual features extracted by a visual attention system (color, texture, intensity) [8]. This descriptor consists of a concatenation of energy measures computed at the output of a filter bank [9], at each level of the multi-resolution pyramid of perceptual features. This descriptor has the advantage of being lower dimensional than traditional descriptors.
    The test of our filtering approach, using the Perreira da Silva system [10] as a filter on VOC2005, demonstrated that we can maintain approximately the same performance of an object recognition system by selecting only 40% of the extracted points (using Harris-Laplace [11] and Laplacian [12]), while obtaining a substantial reduction in complexity (40% reduction in query run time). Furthermore, evaluating our descriptor with an object recognition system using Harris-Laplace and Laplacian interest point detectors on the VOC2007 database showed a slight decrease in performance (5% reduction of average precision) compared to the original system based on the SIFT descriptor, but with a 50% reduction in complexity. In addition, we evaluated our descriptor using a visual attention system as the region selection technique on VOC2005. The experiment showed a slight decrease in performance (3% reduction in precision), but a drastically reduced complexity of the system (with 5% reduction in query run-time and 70% in complexity).
    In this thesis, we proposed two approaches to manage the problems of complexity in object recognition systems. In the future, it would be interesting to address the problems of the last two phases of an object recognition system, image representation and classification, by introducing perceptually plausible concepts such as deep learning techniques.
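The saliency-based filtering step can be sketched as ranking the detected interest points by the saliency value at their location and keeping only the top fraction. The abstract does not state how the mask is derived from the saliency map, so the ranking rule, the function name `filter_by_saliency`, and the default `keep_fraction=0.4` (chosen to echo the 40% figure reported above) are assumptions.

```python
def filter_by_saliency(points, saliency, keep_fraction=0.4):
    """Keep the most salient interest points.

    points: list of (row, col) positions from any interest point detector.
    saliency: 2D list of floats, indexed as saliency[row][col].
    Returns the kept points, most salient first.
    """
    if not points:
        return []
    ranked = sorted(points, key=lambda p: saliency[p[0]][p[1]], reverse=True)
    # Always keep at least one point so downstream stages have some input.
    n_keep = max(1, round(keep_fraction * len(points)))
    return ranked[:n_keep]
```

Descriptors would then be computed only for the returned points, which is where the reduction in query run time comes from.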