1,909 research outputs found
Parametric region-based foreground segmentation in planar and multi-view sequences
Foreground segmentation in video sequences is an important area of image processing that attracts great interest in the scientific community, since it makes it possible to detect the objects that appear in the sequences under analysis and underpins the correct performance of high-level applications that use foreground segmentation as an initial step.
The current Ph.D. thesis entitled Parametric Region-Based Foreground Segmentation in Planar and Multi-View Sequences details, in the following pages, the research work carried out within this field. In this investigation, we propose to use parametric probabilistic models at pixel-wise and region level in order to model the different classes that are involved in the classification process of the different regions of the image: foreground, background and, in some sequences, shadow. The development is presented in the following chapters as a generalization of the techniques proposed for object segmentation in 2D planar sequences to the 3D multi-view environment, where we establish a cooperative relationship between all the sensors that are recording the scene.
Hence, different scenarios have been analyzed in this thesis in order to improve the foreground segmentation techniques:
In the first part of this research, we present segmentation methods appropriate for 2D planar scenarios. We start by dealing with foreground segmentation in static camera sequences, where a system that combines a pixel-wise background model with region-based foreground and shadow models is proposed in a Bayesian classification framework. The research continues with the application of this method to moving camera scenarios, where the Bayesian framework is developed between foreground and background classes, both characterized with region-based models, in order to obtain a robust foreground segmentation for this kind of sequence.
The second stage of the research is devoted to applying these 2D techniques to multi-view acquisition setups, where several cameras record the scene at the same time. At the beginning of this part, we propose a foreground segmentation system for sequences recorded by means of color and depth sensors, which combines different probabilistic models created for the background and foreground classes in each one of the views, taking into account the reliability that each sensor presents. The investigation continues by proposing foreground segmentation methods for multi-view smart room scenarios. In these sections, we design two systems where foreground segmentation and 3D reconstruction are combined in order to improve the results of each process. The proposals end with the presentation of a multi-view segmentation system where a foreground probabilistic model is proposed in the 3D space to gather all the object information that appears in the views.
The results presented in each one of the proposals show that the foreground segmentation and also the 3D reconstruction can be improved, in these scenarios, by using parametric probabilistic models for modeling the objects to segment, thus introducing the information of the object in a Bayesian classification framework.
The segmentation of foreground objects in video sequences is an important area of image processing that attracts great interest from the scientific community, since it makes it possible to detect the objects that appear in the sequences under analysis and enables the proper operation of high-level applications that use this segmentation as an input parameter.
The present doctoral thesis, entitled Parametric Region-Based Foreground Segmentation in Planar and Multi-View Sequences, details, in the pages that follow, the research work carried out in this field. This research proposes using parametric probabilistic models at pixel level and at region level to model the different classes involved in the classification of the image regions: foreground, background and, in some sequences, shadow regions. The development is presented in the following chapters as a generalization of techniques proposed for object segmentation in 2D single-camera sequences to the 3D multi-camera environment, where cooperation is established between the different sensors that take part in recording the scene.
In this way, different scenarios have been studied with the aim of improving the segmentation techniques for each of them. In the first part of the research, segmentation methods for single-camera scenarios are presented. Specifically, the work begins with foreground segmentation for a static camera, where a complete system is proposed based on Bayesian classification between the pixel-level model defined for the background and the region-level models created for the foreground objects and the shadow that each of them casts. The research continues with the application of this method to sequences recorded with a moving camera, where the Bayesian classification is posed between the background and foreground classes, both characterized with region-level models, with the aim of obtaining a robust segmentation for this type of sequence.
The second part of the research focuses on applying these single-camera techniques to multi-view environments, where several cameras jointly record the same scene. At the beginning of that part, a foreground segmentation is proposed for sequences in which a color camera is combined with a depth camera, in a classification that combines the different probabilistic models created for the background and the foreground in each camera according to the reliability of each sensor. The research continues by proposing foreground segmentation methods for multi-view environments in smart rooms. In these sections, two systems are designed in which foreground segmentation and 3D reconstruction are combined to improve the results of each of these processes.
The proposals end with the presentation of a multi-camera segmentation system in which the information about the object to segment is centralized through the design of a 3D probabilistic model.
The results presented for each of the systems show that foreground segmentation and 3D reconstruction can be improved in these scenarios through the use of parametric probabilistic models for the objects to segment, thus introducing the available object information into a Bayesian classification framework.
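The Bayesian classification between background, foreground and shadow models described in this abstract can be illustrated with a minimal sketch. It is not the thesis's system: it assumes a single intensity value per pixel, one 1-D Gaussian per class, and hypothetical means, variances and priors chosen only for illustration. The thesis itself combines a pixel-wise background model with region-based foreground and shadow models.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def classify_pixel(value, models, priors):
    """Return (best_class, posteriors) using Bayes' rule.

    models: {class_name: (mean, variance)}  -- hypothetical parametric models
    priors: {class_name: prior probability} -- hypothetical class priors
    """
    # Unnormalized posterior: prior times likelihood for each class.
    scores = {c: priors[c] * gaussian_pdf(value, *models[c]) for c in models}
    total = sum(scores.values())
    posteriors = {c: s / total for c, s in scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Illustrative (assumed) class models over pixel intensity in [0, 255].
models = {
    "background": (100.0, 25.0),
    "foreground": (180.0, 400.0),
    "shadow": (60.0, 100.0),
}
priors = {"background": 0.7, "foreground": 0.2, "shadow": 0.1}

label, post = classify_pixel(102.0, models, priors)
print(label)
```

A pixel is assigned to whichever class maximizes prior times likelihood; changing the priors shifts the decision boundaries between the three classes.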
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues, that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved?
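The abstract does not spell out the formulation it calls the de-facto standard. As a hedged illustration, SLAM is commonly posed as maximum a posteriori estimation over a factor graph; the symbols below (state X, measurements z_k, measurement functions h_k, information matrices Omega_k) are notation assumed here, not taken from the abstract.

```latex
% X: robot trajectory and map variables; z_k: k-th measurement;
% h_k: measurement model acting on the subset X_k; \Omega_k: information matrix.
\begin{align}
X^{\star} &= \operatorname*{arg\,max}_{X} \; p(X \mid Z)
           = \operatorname*{arg\,max}_{X} \; p(X) \prod_{k=1}^{m} p(z_k \mid X_k) \\
          &= \operatorname*{arg\,min}_{X} \; \sum_{k=0}^{m} \bigl\lVert h_k(X_k) - z_k \bigr\rVert^{2}_{\Omega_k}
\end{align}
```

The second line holds under the assumption of zero-mean Gaussian measurement noise, with the prior written as the factor k = 0, which turns the problem into nonlinear least squares.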
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework which highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and thus the constraints imposed on the type of video
that each technique is able to address. Making the hypotheses and
constraints explicit makes the framework particularly useful for selecting a
method, given an application. Another advantage of the proposed organization is
that it allows the newest approaches to be categorized seamlessly with
traditional ones, while providing an insightful perspective on the evolution of
the action recognition task up to now. That perspective is the basis for the
discussion at the end of the paper, where we also present the main open issues
in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
tables
Analysis of Three-Dimensional Protein Images
A fundamental goal of research in molecular biology is to understand protein
structure. Protein crystallography is currently the most successful method for
determining the three-dimensional (3D) conformation of a protein, yet it
remains labor intensive and relies on an expert's ability to derive and
evaluate a protein scene model. In this paper, the problem of protein structure
determination is formulated as an exercise in scene analysis. A computational
methodology is presented in which a 3D image of a protein is segmented into a
graph of critical points. Bayesian and certainty factor approaches are
described and used to analyze critical point graphs and identify meaningful
substructures, such as alpha-helices and beta-sheets. Results of applying the
methodologies to protein images at low and medium resolution are reported. The
research is related to approaches to representation, segmentation and
classification in vision, as well as to top-down approaches to protein
structure prediction.
Comment: See http://www.jair.org/ for any accompanying files
Task-driven active sensing framework applied to leaf probing
This manuscript version is made available under the CC-BY-NC-ND 4.0 license: http://creativecommons.org/licenses/by-nc-nd/4.0/
This article presents a new method for actively exploring a 3D workspace with the aim of localizing relevant regions for a given task. Our method encodes the exploration route in a multi-layer occupancy grid map. This map, together with a multiple-view estimator and a maximum-information-gain gathering approach, incrementally provides a better understanding of the scene until the task termination criterion is reached. The approach is designed to be applicable to any task entailing 3D object exploration where some prior knowledge of the object's approximate shape is available. Its suitability is demonstrated here for a leaf probing task using an eye-in-hand arm configuration in the context of a phenotyping application.
Peer Reviewed. Postprint (author's final draft)
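The idea of choosing views by maximum information gain over an occupancy grid can be sketched as follows. This is a simplified stand-in, not the article's method: it assumes a flat dictionary of cells rather than a multi-layer map, uses the summed entropy of the visible cells as a proxy for expected information gain, and the cell names, probabilities and visibility sets are hypothetical.

```python
import math


def cell_entropy(p):
    """Shannon entropy (in bits) of a binary occupancy probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully known cell carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def best_view(grid, views):
    """Pick the candidate view whose visible cells carry the most entropy.

    grid:  {cell_id: occupancy probability}
    views: {view_name: list of visible cell_ids}  -- assumed visibility sets
    """
    def gain(cells):
        return sum(cell_entropy(grid[c]) for c in cells)

    return max(views, key=lambda v: gain(views[v]))


# Hypothetical grid: 0.5 means unknown, values near 0 or 1 mean well observed.
grid = {"a": 0.5, "b": 0.5, "c": 0.95, "d": 0.05, "e": 0.5}
views = {"left": ["a", "c"], "top": ["a", "b", "e"], "right": ["c", "d"]}

print(best_view(grid, views))
```

In an exploration loop, the sensor would move to the selected view, update the occupancy probabilities of the observed cells, and repeat until a task-specific termination criterion is met.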