    Segmentation and semantic labelling of RGBD data with convolutional neural networks and surface fitting

    We present an approach for segmentation and semantic labelling of RGBD data that jointly exploits geometrical cues and deep learning techniques. An initial over-segmentation is performed using spectral clustering, and a set of non-uniform rational B-spline (NURBS) surfaces is fitted on the extracted segments. A convolutional neural network (CNN) then receives as input colour and geometry data together with the surface fitting parameters. The network is made of nine convolutional stages followed by a softmax classifier and produces a vector of descriptors for each sample. In the next step, an iterative merging algorithm recombines the output of the over-segmentation into larger regions matching the various elements of the scene. Pairs of adjacent segments with higher similarity according to the CNN features are candidates for merging, and the surface fitting accuracy is used to detect which pairs of segments belong to the same surface. Finally, a set of labelled segments is obtained by combining the segmentation output with the descriptors from the CNN. Experimental results show how the proposed approach outperforms state-of-the-art methods and provides an accurate segmentation and labelling.
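The iterative merging step described above can be sketched roughly as follows. All names, thresholds, and the cosine-similarity choice are illustrative assumptions, not details from the paper: adjacent segment pairs whose CNN descriptors are similar become merge candidates, and a pair is merged only if a single surface still fits their union well.

```python
import numpy as np

def merge_segments(descriptors, adjacency, fit_error, sim_thresh=0.9, err_thresh=0.05):
    """Hypothetical sketch of the merging stage.

    descriptors: {segment_id: CNN feature vector}
    adjacency:   set of (a, b) pairs of adjacent segment ids
    fit_error:   function mapping a frozenset of segment ids to a surface
                 fitting residual (a stand-in for the NURBS fitting accuracy)
    """
    # union-find forest: each segment starts as its own region
    parent = {s: s for s in descriptors}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    merged = True
    while merged:  # repeat until no pair can be merged
        merged = False
        for a, b in sorted(adjacency):
            ra, rb = find(a), find(b)
            if ra == rb:
                continue
            da, db = descriptors[ra], descriptors[rb]
            # cosine similarity between the CNN descriptors of the two regions
            sim = float(np.dot(da, db) / (np.linalg.norm(da) * np.linalg.norm(db)))
            # merge only if descriptors agree AND one surface fits the union
            if sim > sim_thresh and fit_error(frozenset({ra, rb})) < err_thresh:
                parent[rb] = ra
                merged = True
    return {s: find(s) for s in descriptors}
```

In this sketch the surface-fitting test acts as a veto: two regions with similar appearance are still kept apart when no single surface explains both.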

    Joint segmentation of color and depth data based on splitting and merging driven by surface fitting

    This paper proposes a segmentation scheme based on the joint use of color and depth data together with a 3D surface estimation scheme. First, a set of multi-dimensional vectors is built from color, geometry and surface orientation information. Normalized cuts spectral clustering is then applied in order to recursively segment the scene in two parts, thus obtaining an over-segmentation. This procedure is followed by a recursive merging stage where close segments belonging to the same object are joined together. At each step of both procedures a NURBS model is fitted on the computed segments, and the accuracy of the fitting is used as a measure of the plausibility that a segment represents a single surface or object. By comparing the accuracy with that of the previous step, it is possible to determine whether each splitting or merging operation leads to a better scene representation and consequently whether to perform it. Experimental results show how the proposed method provides an accurate and reliable segmentation.
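The accept/reject rule that drives the split-and-merge procedure can be illustrated with a minimal sketch; `fit_error` and the frozenset segment representation are assumed stand-ins for the paper's NURBS fitting residual, not its actual implementation.

```python
def should_merge(fit_error, seg_a, seg_b):
    """Keep a merge only if it does not degrade the surface fit.

    fit_error(segment) returns the surface-fitting residual of a segment;
    segments are modeled here simply as frozensets of pixel/point ids.
    """
    # worst residual when the two parts are fitted separately
    before = max(fit_error(seg_a), fit_error(seg_b))
    # residual when a single surface must explain the union
    after = fit_error(seg_a | seg_b)
    # merging is plausible only if one surface explains the union at least
    # as well as two separate surfaces explained the parts
    return after <= before
```

The same comparison, run in the opposite direction, decides whether a candidate split improves the representation enough to be kept.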

    Unsupervised segmentation of RGB-D images

    The purpose of a segmentation method is to decompose an image into its constituent parts. Segmentation is generally the first stage of an image analysis system, and one of the most critical, since its result affects all subsequent stages. The central goal of this task is to group perceptually similar objects in an image based on certain features. Traditionally, image processing, computer vision and robotics applications have focused on color images. However, the use of color information is limited to some extent, because images captured with conventional cameras cannot record all the information that the three-dimensional scene provides. One alternative for addressing these difficulties, and for making segmentation algorithms applied to conventional-camera images more robust, is to incorporate the depth information lost during capture. Images that contain both the color of the scene and the depth of its objects are called RGB-D images. A key point of methods that segment images using color and distance data is determining the best way to fuse these two sources of information in order to extract the objects present in the scene more accurately. A large number of techniques use supervised learning methods. However, in many cases no databases exist that would allow supervised techniques to be applied, and even when they do, the cost of training these methods can be prohibitive. Unsupervised techniques, unlike supervised ones, do not require a training phase over a training set and can therefore be used in a wide range of applications.
Within the scope of this specialization thesis, the analysis of current methods for unsupervised segmentation of RGB-D images is of particular interest. A second objective of this work is to analyze the evaluation metrics that indicate the quality of the segmentation process.

    Deep learning for scene understanding with color and depth data

    Significant advancements have been made in recent years concerning both data acquisition and processing hardware and optimization and machine learning techniques. On one hand, the introduction of depth sensors in the consumer market has made it possible to acquire 3D data at very low cost, overcoming many of the limitations and ambiguities that typically affect computer vision applications based on color information. At the same time, computationally faster GPUs have allowed researchers to perform time-consuming experiments even on big data. On the other hand, the development of effective machine learning algorithms, including deep learning techniques, has provided a highly performing tool for exploiting the enormous amount of data now at hand. In light of such encouraging premises, three classical computer vision problems have been selected, and novel approaches for their solution are proposed in this work that both leverage the output of a deep Convolutional Neural Network (ConvNet) and jointly exploit color and depth data to achieve competitive results. In particular, a novel semantic segmentation scheme for color and depth data is presented that uses the features extracted from a ConvNet together with geometric cues. A method for 3D shape classification is also proposed that uses a deep ConvNet fed with specific 3D data representations. Finally, a ConvNet for ToF and stereo confidence estimation is employed underneath a ToF-stereo fusion algorithm, thus avoiding reliance on complex yet inaccurate noise models for the confidence estimation task.

    Time-of-Flight Cameras and Microsoft Kinect™


    Scene Segmentation Assisted by Stereo Vision

    Stereo vision systems for 3D reconstruction have been deeply studied and can nowadays provide a reasonably accurate estimate of the 3D geometry of a framed scene. They are commonly used merely to extract the 3D structure of the scene. However, a great variety of applications is interested not in the geometry itself, but rather in scene analysis operations, among which scene segmentation is a very important one. Classically, scene segmentation has been tackled by means of color information only, but it turns out to be a badly conditioned image processing operation that remains very challenging. This paper proposes a new framework for scene segmentation where color information is assisted by 3D geometry data obtained by stereo vision techniques. This approach resembles in some way what happens inside our brain, where the two different views coming from the eyes are used to recognize the various objects in the scene; exploiting a pair of images instead of just one greatly improves the segmentation quality and robustness. Clearly, the performance of the approach depends on the specific stereo vision algorithm used to extract the geometry information. This paper investigates which stereo vision algorithms are best suited to this kind of analysis. Experimental results confirm the effectiveness of the proposed framework and allow stereo vision systems to be properly ranked on the basis of their performance when applied to the scene segmentation problem.
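A minimal sketch of fusing color and geometry into joint per-pixel feature vectors, in the spirit of the framework described above. The specific features (CIELab colors plus 3D points) and the single weighting parameter are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def build_features(lab_image, points_3d, lam=1.0):
    """Build joint color+geometry feature vectors for segmentation.

    lab_image: (H, W, 3) array of CIELab colors
    points_3d: (H, W, 3) array of 3D points estimated by stereo vision
    lam:       relative weight of the geometry cue versus the color cue
    """
    h, w, _ = lab_image.shape
    color = lab_image.reshape(-1, 3).astype(float)
    geom = points_3d.reshape(-1, 3).astype(float)
    # normalize each cue to comparable scale so lam alone controls the balance
    color /= color.std() + 1e-9
    geom /= geom.std() + 1e-9
    # one 6D vector [L, a, b, x, y, z] per pixel, shape (H*W, 6)
    return np.hstack([color, lam * geom])
```

The resulting vectors could then be fed to any clustering-based segmentation (e.g. normalized cuts); the quality of `points_3d`, and hence of the segmentation, depends on the stereo algorithm that produced them, which is what the paper evaluates.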
