10 research outputs found

    Real-Time Salient Closed Boundary Tracking via Line Segments Perceptual Grouping

    This paper presents a novel real-time method for tracking salient closed boundaries from video image sequences. This method operates on a set of straight line segments that are produced by line detection. The tracking scheme is coherently integrated into a perceptual grouping framework in which the visual tracking problem is tackled by identifying a subset of these line segments and connecting them sequentially to form a closed boundary with the largest saliency and a certain similarity to the previous one. Specifically, we define a new tracking criterion which combines a grouping cost and an area similarity constraint. The proposed criterion makes the resulting boundary tracking more robust to local minima. To achieve real-time tracking performance, we use Delaunay Triangulation to build a graph model with the detected line segments and then reduce the tracking problem to finding the optimal cycle in this graph. This is solved by our newly proposed closed boundary candidates searching algorithm called "Bidirectional Shortest Path (BDSP)". The efficiency and robustness of the proposed method are tested on real video sequences as well as during a robot arm pouring experiment.
    Comment: 7 pages, 8 figures, The 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017) submission ID 103
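The abstract above reduces tracking to finding an optimal cycle in a graph. As a rough illustration of that kind of reduction only, the sketch below finds the cheapest cycle through one chosen edge by running Dijkstra's algorithm between its endpoints with that edge excluded. It is not the BDSP algorithm, whose details the abstract does not give. The function name `min_cycle_through_edge`, the dict-of-dicts graph layout, and the additive non-negative edge costs are all assumptions made for this example.

```python
import heapq

def min_cycle_through_edge(adj, u, v):
    """Cheapest simple cycle containing edge (u, v) in an undirected graph.

    adj: dict mapping node -> dict of neighbour -> non-negative cost
         (hypothetical layout, chosen for this sketch).
    Returns (total_cost, path from v to u), or None if no cycle exists.
    """
    dist = {v: 0.0}
    prev = {}
    heap = [(0.0, v)]
    done = set()
    while heap:
        d, n = heapq.heappop(heap)
        if n in done:
            continue
        done.add(n)
        if n == u:
            break  # shortest v -> u path found
        for m, w in adj[n].items():
            if {n, m} == {u, v}:
                continue  # the closing edge itself may not be reused
            nd = d + w
            if nd < dist.get(m, float("inf")):
                dist[m] = nd
                prev[m] = n
                heapq.heappush(heap, (nd, m))
    if u not in done:
        return None
    # Walk the predecessor links back from u to v, then reverse.
    path = [u]
    while path[-1] != v:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[u] + adj[u][v], path

# Toy graph: a unit-cost square a-b-c-d with a cheaper diagonal a-c.
graph = {
    "a": {"b": 1.0, "d": 1.0, "c": 0.5},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0, "a": 0.5},
    "d": {"c": 1.0, "a": 1.0},
}
result = min_cycle_through_edge(graph, "a", "b")
```

Repeating the search for each candidate edge and keeping the best result gives a simple, unoptimised way to find a minimum-cost cycle. A real tracker would also need the saliency and area-similarity terms the abstract mentions.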

    A Combinatorial Solution to Non-Rigid 3D Shape-to-Image Matching

    We propose a combinatorial solution for the problem of non-rigidly matching a 3D shape to 3D image data. To this end, we model the shape as a triangular mesh and allow each triangle of this mesh to be rigidly transformed to achieve a suitable matching to the image. By penalising the distance and the relative rotation between neighbouring triangles, our matching compromises between image and shape information. In this paper, we resolve two major challenges. Firstly, we address the resulting large and NP-hard combinatorial problem with a suitable graph-theoretic approach. Secondly, we propose an efficient discretisation of the unbounded 6-dimensional Lie group SE(3). To our knowledge, this is the first combinatorial formulation for non-rigid 3D shape-to-image matching. In contrast to existing local (gradient descent) optimisation methods, we obtain solutions that do not require a good initialisation and that are within a bound of the optimal solution. We evaluate the proposed method on the two problems of non-rigid 3D shape-to-shape and non-rigid 3D shape-to-image registration and demonstrate that it provides promising results.
    Comment: 10 pages, 7 figures

    Rastreamento visual sob mudanças extremas de iluminação utilizando a soma da variância condicional (Visual tracking under extreme illumination changes using the sum of conditional variance)

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2014.
    Direct visual tracking is currently solved mainly with gradient descent optimization. The speed of convergence of these techniques allows the use of transformation models with many degrees of freedom. The most popular similarity measure for direct tracking is the Sum of Squared Differences, even though this measure is not robust to illumination changes in the scene. These changes, when left uncompensated, can destabilize the convergence of the algorithms. One compensation technique uses a parametric illumination model, which increases the number of parameters to be computed. Since most applications that use direct visual tracking need results in real time, adding the illumination model can make them impractical. This work presents a novel direct visual tracking approach that copes with extreme illumination conditions. Using the Sum of Conditional Variance as a base, the proposed method uses sub-images to handle extreme illumination changes. The proposed method reduces the computational burden compared to similar approaches in the literature. Experimental results show an average reduction of 57.5% in processing time for color sequences.
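To make the contrast between the two similarity measures concrete, here is a minimal sketch of the Sum of Conditional Variance as it is commonly described in the direct-tracking literature. For each template intensity bin it computes the mean intensity observed in the current image, then sums the squared deviations from that mean. This covers only the basic measure on flat pixel lists, not the sub-image scheme proposed in the dissertation. The function names, the 32-bin quantisation and the 0..255 intensity range are assumptions made for this example.

```python
def ssd(template, image):
    """Sum of Squared Differences between two equal-length pixel lists."""
    return sum((t - v) ** 2 for t, v in zip(template, image))

def scv(template, image, bins=32, levels=256):
    """Sum of Conditional Variance (basic form, illustrative only).

    template, image: equal-length flat sequences of intensities in
    [0, levels). bins and levels are assumed defaults for this sketch.
    """
    if len(template) != len(image):
        raise ValueError("template and image must have the same size")
    # Quantise each template pixel into a bin index.
    idx = [min(int(t * bins / levels), bins - 1) for t in template]
    sums = [0.0] * bins
    counts = [0] * bins
    for b, v in zip(idx, image):
        sums[b] += v
        counts[b] += 1
    # Expected image intensity given the template bin: E[image | template].
    expected = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    # Squared deviation of each image pixel from its conditional mean.
    return sum((v - expected[b]) ** 2 for b, v in zip(idx, image))

# A template with one intensity per bin, and the same patch with its
# intensities inverted (a strong, nonlinear-looking illumination change).
template = list(range(0, 256, 8)) * 4
inverted = [255 - t for t in template]
scv_value = scv(template, inverted)
ssd_value = ssd(template, inverted)
```

In this toy case the inverted patch gives a large SSD, while the SCV is zero because every template bin maps to a single image intensity. That is the property that makes the measure tolerant to global intensity remappings.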

    Bottom-up Object Segmentation for Visual Recognition

    Automatic recognition and segmentation of objects in images is a central open problem in computer vision. Most previous approaches have pursued either sliding-window object detection or dense classification of overlapping local image patches. By contrast, the framework introduced in this thesis attempts to identify the spatial extent of objects prior to recognition, using bottom-up computational processes and mid-level selection cues. After a set of plausible object hypotheses is identified, a sequential recognition process is executed, based on continuous estimates of the spatial overlap between the image segment hypotheses and each putative class. The object hypotheses are represented as figure-ground segmentations, and are extracted automatically, without prior knowledge of the properties of individual object classes, by solving a sequence of constrained parametric min-cut problems (CPMC) on a regular image grid. It is shown that CPMC significantly outperforms the state of the art for low-level segmentation in the PASCAL VOC 2009 and 2010 datasets. Results beyond the current state of the art for image classification, object detection and semantic segmentation are also demonstrated in a number of challenging datasets including Caltech-101, ETHZ-Shape as well as PASCAL VOC 2009-11. These results suggest that a greater emphasis on grouping and image organization may be valuable for making progress in high-level tasks such as object recognition and scene understanding.

    Hierarchical Visual Content Modelling and Query based on Trees

    In recent years, such vast archives of video information have become available that human annotation of content is no longer feasible; automation of video content analysis is therefore highly desirable. The recognition of semantic content in images is a problem that relies on prior knowledge and learnt information and that, to date, has only been partially solved. Salient analysis, on the other hand, is statistically based and highlights regions that are distinct from their surroundings, while also being scalable and repeatable. The arrangement of salient information into hierarchical tree structures in the spatial and temporal domains forms an important step to bridge the semantic salient gap. Salient regions are identified using region analysis, rank ordered and documented in a tree for further analysis. A structure of this kind contains all the information in the original video and forms an intermediary between video processing and video understanding, transforming video analysis into a syntactic database analysis problem. This contribution demonstrates the formulation of spatio-temporal salient trees and the syntax to index them, and provides an interface for higher level cognition in machine vision.