Online Mutual Foreground Segmentation for Multispectral Stereo Videos
The segmentation of video sequences into foreground and background regions is
a low-level process commonly used in video content analysis and smart
surveillance applications. Using a multispectral camera setup can improve this
process by providing more diverse data to help identify objects despite adverse
imaging conditions. The registration of several data sources is however not
trivial if the appearance of objects produced by each sensor differs
substantially. The problem is further complicated with close-range stereo pairs,
where parallax effects cannot be ignored. In this work, we present a new
method to simultaneously tackle multispectral segmentation and stereo
registration. Using an iterative procedure, we estimate the labeling result for
one problem using the provisional result of the other. Our approach is based on
the alternating minimization of two energy functions that are linked through
the use of dynamic priors. We rely on the integration of shape and appearance
cues to find proper multispectral correspondences, and to properly segment
objects in low contrast regions. We also formulate our model as a frame
processing pipeline using higher order terms to improve the temporal coherence
of our results. Our method is evaluated under different configurations on
multiple multispectral datasets, and our implementation is available online.
Comment: Preprint accepted for publication in IJCV (December 2018)
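The alternating scheme described in this abstract (solve one problem using the provisional result of the other, then swap) can be illustrated with a deliberately small toy. The sketch below is not the authors' model: it assumes two scalar unknowns and two hypothetical quadratic energies coupled by a weight `lam`, whereas the paper works with labelings and dynamic priors. It only shows the control flow of alternating minimization.

```python
def alternate_minimize(a, b, lam, iters=50):
    """Alternately minimize two coupled quadratic energies (toy example).

    E1(x | y) = (x - a)**2 + lam * (x - y)**2   # minimized over x, y fixed
    E2(y | x) = (y - b)**2 + lam * (x - y)**2   # minimized over y, x fixed

    a, b, and lam are hypothetical stand-ins for the data terms and the
    coupling strength; they do not come from the paper.
    """
    x, y = a, b  # start each unknown at the minimizer of its own data term
    for _ in range(iters):
        # Closed-form minimizer of E1 with the provisional y held fixed.
        x = (a + lam * y) / (1.0 + lam)
        # Closed-form minimizer of E2 with the freshly updated x held fixed.
        y = (b + lam * x) / (1.0 + lam)
    return x, y


x, y = alternate_minimize(a=0.0, b=1.0, lam=1.0)
```

With `a=0`, `b=1`, `lam=1` the iterates settle at the fixed point x = 1/3, y = 2/3, where neither update changes its variable any further.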
WxBS: Wide Baseline Stereo Generalizations
We have presented a new problem -- the wide multiple baseline stereo (WxBS)
-- which considers matching of images that simultaneously differ in more than
one image acquisition factor such as viewpoint, illumination, sensor type or
where object appearance changes significantly, e.g. over time. A new dataset
with the ground truth for evaluation of matching algorithms has been introduced
and will be made public.
We have extensively tested a large set of popular and recent detectors and
descriptors and show that the combination of RootSIFT and HalfRootSIFT as
descriptors with MSER and Hessian-Affine detectors works best for many
different nuisance factors. We show that simple adaptive thresholding improves
Hessian-Affine, DoG, MSER (and possibly other) detectors and allows them to be
used on infrared and low contrast images.
A novel matching algorithm for addressing the WxBS problem has been
introduced. We have shown experimentally that the WxBS-M matcher dominates the
state-of-the-art methods on both the new and existing datasets.
Comment: Descriptor and detector evaluation expande
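As a reminder of what the RootSIFT descriptor named above does, here is a minimal sketch of the commonly used transform: L1-normalize each SIFT descriptor, then take the element-wise square root. The function name `root_sift` and the `eps` guard are illustrative choices, and the input is assumed to be an array of non-negative SIFT descriptors, one per row. HalfRootSIFT is not shown because the abstract does not describe it.

```python
import numpy as np


def root_sift(descriptors, eps=1e-7):
    """Map SIFT descriptors (one per row, non-negative) to RootSIFT."""
    d = np.asarray(descriptors, dtype=np.float64)
    # L1-normalize each row; eps avoids division by zero for empty rows.
    d = d / (np.abs(d).sum(axis=1, keepdims=True) + eps)
    # Element-wise square root: Euclidean distance between the results
    # corresponds to the Hellinger kernel on the original descriptors.
    return np.sqrt(d)
```

After the transform each non-zero row has (approximately) unit L2 norm, so the descriptors can be passed to any matcher that expects Euclidean distances.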
Geometric and photometric affine invariant image registration
This thesis aims to present a solution to the correspondence problem for the registration
of wide-baseline images taken from uncalibrated cameras. We propose an affine
invariant descriptor that combines the geometry and photometry of the scene to find
correspondences between both views. The geometric affine invariant component of the
descriptor is based on the affine arc-length metric, whereas the photometry is analysed
by invariant colour moments. A graph structure represents the spatial distribution of the
primitive features; i.e. nodes correspond to detected high-curvature points, whereas arcs
represent connectivities by extracted contours. After matching, we refine the search for
correspondences by using a maximum likelihood robust algorithm. We have evaluated
the system over synthetic and real data. The method is endemic to propagation of errors
introduced by approximations in the system.
BAE Systems; Selex Sensors and Airborne System
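For reference, the affine arc-length that the geometric component of this descriptor relies on is usually defined as follows for a planar curve (x(t), y(t)). This is the standard textbook definition, given here as background; it is not quoted from the thesis, and the symbols are assumptions of this note.

```latex
\[
\sigma(t) \;=\; \int_{t_0}^{t}
  \left| \dot{x}(u)\,\ddot{y}(u) - \ddot{x}(u)\,\dot{y}(u) \right|^{1/3} \, du
\]
```

Under an affine map with linear part A, the integrand is multiplied by |det A|^{1/3}, so the affine arc-length is unchanged by area-preserving affine maps and ratios of affine arc-lengths are preserved by general affine maps.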
Method for automatic image registration based on distance-dependent planar projective transformations, oriented to images without common features
Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Ciencias Físicas, Departamento de Arquitectura de Computadores y Automática, defended on 18-12-2015.
Multisensory data fusion oriented to image-based applications improves the accuracy, quality and availability of the data and, consequently, the performance of robotic systems. It combines the information of a scene acquired from multiple, different sources into a unified representation of the 3D world that is more informative for subsequent image processing, improving either reliability, by using redundant information, or capability, by taking advantage of complementary information. Image registration is one of the most relevant steps in image fusion techniques. This procedure aims at the geometrical alignment of two or more images. Normally, it relies on feature-matching techniques, which is a drawback when combining sensors that cannot deliver common features. For instance, robust feature matching is not reliable when combining ToF and RGB cameras. Typically, the fusion of these two sensors has been addressed by computing the camera calibration parameters for the coordinate transformation between them. As a result, a low-resolution colour depth map is provided. To improve the resolution of these maps and reduce the loss of colour information, extrapolation techniques are adopted. A crucial issue for computing high-quality, accurate dense maps is the presence of noise in the depth measurements from the ToF camera, which is normally reduced by means of sensor calibration and filtering techniques.
However, the filtering methods implemented for data extrapolation and denoising usually over-smooth the data, consequently reducing the accuracy of the registration procedure...
Multisensor fusion oriented to image-processing applications, known as image fusion, is a technique that improves the accuracy, quality and availability of data about a three-dimensional environment, which in turn improves the performance and operability of robotic systems. This fusion is achieved by combining the information acquired by multiple, diverse data-capture sources, grouped so as to obtain a better representation of the 3D environment that is far more illustrative and enriching for the implementation of image-processing methods. This improves the reliability and capability of the system by using the redundant information acquired by multiple sensors. Image registration is one of the most important procedures in image fusion. Its main objective is the geometric alignment of two or more images. Normally, this process depends on techniques that search for common patterns between images, which can be a drawback when combining sensors that do not provide data with similar features. One example is the fusion of high-resolution colour (RGB) cameras with low-resolution Time-of-Flight (ToF) cameras, for which robust detection of common patterns between the images captured by the two sensors is not possible. In general, the fusion of these cameras is carried out by computing their calibration parameters, which allow the homogeneous transformation between them to be performed.
As a result of this procedure, low-resolution depth and colour maps are obtained. To improve the resolution of these maps and avoid the loss of colour information, various data extrapolation techniques are used. A crucial factor in obtaining high-quality, high-accuracy maps is the presence of noise in the depth measurements obtained by ToF cameras. This problem is normally reduced through calibration of these sensors and data filtering techniques. However, the filtering techniques used, both for data interpolation and for noise reduction, tend to over-smooth the original data, which reduces the accuracy of the image registration...
Sección Deptal. de Arquitectura de Computadores y Automática (Físicas); Fac. de Ciencias Físicas
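The calibration-based ToF/RGB fusion that this thesis abstract describes as the typical baseline can be sketched as a rigid transform followed by a pinhole projection. The sketch below assumes an ideal pinhole model without lens distortion; the names `R`, `t` (ToF-to-RGB extrinsics) and `K` (RGB intrinsics) and the function `project_tof_to_rgb` are illustrative and are not taken from the thesis, whose own method is a different, distance-dependent projective approach.

```python
import numpy as np


def project_tof_to_rgb(points_tof, R, t, K):
    """Project 3D points from the ToF camera frame into RGB pixel coordinates.

    points_tof: (N, 3) points in the ToF frame, in metres.
    R, t:       assumed 3x3 rotation and 3-vector translation, ToF -> RGB frame.
    K:          assumed 3x3 intrinsic matrix of the RGB camera.
    Points are expected to lie in front of the RGB camera (positive depth).
    """
    pts = np.asarray(points_tof, dtype=np.float64)
    # Rigid transform into the RGB camera frame: X_rgb = R @ X_tof + t.
    cam = pts @ np.asarray(R, dtype=np.float64).T + np.asarray(t, dtype=np.float64)
    # Apply the intrinsics, then divide by depth to get pixel coordinates.
    uvw = cam @ np.asarray(K, dtype=np.float64).T
    return uvw[:, :2] / uvw[:, 2:3]


K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 40.0],
              [0.0, 0.0, 1.0]])
pixels = project_tof_to_rgb([[0.0, 0.0, 1.0], [0.1, -0.2, 2.0]],
                            R=np.eye(3), t=np.zeros(3), K=K)
# pixels is [[50, 40], [55, 30]]
```

Because the ToF sensor has far fewer pixels than the RGB camera, this mapping colours only a sparse set of RGB pixels, which is why the abstract mentions extrapolation to densify the result.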
OpenPTrack: Open Source Multi-Camera Calibration and People Tracking for RGB-D Camera Networks
OpenPTrack is open source software for multi-camera calibration and people tracking in RGB-D camera networks. It tracks people in large volumes at sensor frame rate and currently supports a heterogeneous set of 3D sensors.
In this work, we describe its user-friendly calibration procedure, which consists of simple steps with real-time feedback that yield accurate estimates of the camera poses, which are then used for tracking people. On top of a calibration based on moving a checkerboard within the tracking space and on a global optimization of camera and checkerboard poses, a novel procedure that aligns people detections coming from all sensors in an x-y-time space is used for refining camera poses.
While people detection is executed locally, in the machines connected to each sensor, tracking is performed by a single node which takes into account detections from all over the network. Here we detail how a cascade of algorithms working on depth point clouds and color, infrared and disparity images is used to perform people detection from different types of sensors and in any indoor light condition.
We present experiments showing that a considerable improvement can be obtained with the proposed calibration refinement procedure that exploits people detections, and we compare Kinect v1, Kinect v2 and Mesa SR4500 performance for people tracking applications. OpenPTrack is based on the Robot Operating System and the Point Cloud Library and has already been adopted in networks composed of up to ten imagers for interactive arts, education, culture and human–robot interaction applications.