38 research outputs found
Contributions to region-based image and video analysis: feature aggregation, background subtraction and description constraining
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones. Date of defence: 22-01-2016. Full-text access to this thesis is embargoed until 22-07-2017.
The use of regions for image and video analysis has been traditionally motivated by their ability
to diminish the number of processed units and hence, the number of required decisions. However,
as we explore in this thesis, this is just one of the potential advantages that regions may
provide. When dealing with regions, two description spaces may be differentiated: the decision
space, on which regions are shaped (region segmentation), and the feature space, on which
regions are used for analysis (region-based applications). These two spaces are highly related.
The solutions taken on the decision space severely affect their performance in the feature space.
Accordingly, in this thesis we propose contributions on both spaces. Regarding the contributions
to region segmentation, these are two-fold. Firstly, we give a twist to a classical region segmentation
technique, the Mean-Shift, by exploring new solutions to automatically set the spectral
kernel bandwidth. Secondly, we propose a method to describe the micro-texture of a pixel
neighbourhood by using an easily customisable filter-bank methodology based on the
discrete cosine transform (DCT). The rest of the thesis is devoted to describing region-based
approaches to several highly topical issues in computer vision; two broad tasks are explored:
background subtraction (BS) and local descriptors (LD). Concerning BS, regions are here used
as complementary cues to refine pixel-based BS algorithms: by providing illumination-robust
cues and by storing the background dynamics in a region-driven background model. Regarding
LD, the region is here used to reshape the description area usually fixed for local descriptors.
Region-masked versions of classical two-dimensional and three-dimensional local descriptions are
designed. So-built descriptions are proposed for the task of object identification, under a novel
neural-oriented strategy. Furthermore, a local description scheme based on a fuzzy use of the
region membership is derived. This characterisation scheme has been geometrically adapted to
account for projective deformations, providing a suitable tool for finding corresponding points
in wide-baseline scenarios. Experiments have been conducted for every contribution, discussing
the potential benefits and the limitations of the proposed schemes. Overall, the results obtained
suggest that the region, conditioned by successful aggregation processes, is a reliable and
useful tool to extrapolate pixel-level results, diminish semantic noise, isolate significant object
cues and constrain local descriptions. The methods and approaches described throughout this thesis
present alternative or complementary solutions to pixel-based image processing.
The use of regions for the analysis of images and video sequences has traditionally been
motivated by their usefulness in reducing the number of analysis units and, hence, the number
of decisions. In this thesis we show that this is only one of the many advantages of using
regions. In region-based processing, two analysis spaces must be distinguished: the decision
space, where regions are built, and the feature space, where they are used. Both spaces are
closely related. The solutions designed for building regions in the decision space define their
usefulness in the analysis space. For this reason, we study both spaces throughout this thesis.
In particular, we propose two contributions at the region construction stage. In the first, we
revisit a classical technique, Mean-Shift, and introduce a scheme for the automatic selection of
the bandwidth that allows the density of a given feature to be estimated locally. In the second,
we use the discrete cosine transform to describe local variability in the neighbourhood of a
pixel. In the rest of the thesis we explore solutions in the feature space; in other words, we
propose applications that rely on the region to carry out the processing. These applications
focus on two highly active branches of computer vision: foreground segregation by background
subtraction and local description of image points. In the background subtraction branch, we use
regions as support units for algorithms based exclusively on pixel-level analysis. In particular,
we improve the robustness of these algorithms to local illumination changes and to background
dynamics. For the latter, we define a background model based entirely on regions. The
contributions in the local description branch centre on using the region to define,
automatically, description neighbourhoods around points. In existing approaches, these
description neighbourhoods usually have a fixed size and shape. As a result of this procedure,
masked versions of two-dimensional and three-dimensional descriptors are designed. In the
developed algorithm, we organise the descriptors so designed in a neural structure and use them
for automatic object identification. In addition, we propose a description scheme based on fuzzy
association of pixels to regions. This description neighbourhood is geometrically transformed to
adapt to potential projective deformations in stereo setups where the cameras are widely
separated. Each of the developed approaches is evaluated and discussed, highlighting the
advantages and drawbacks of its use. In general, the results obtained suggest that the region,
assuming it has been built successfully, is a reliable and useful tool for extrapolating
pixel-level results, reducing semantic noise, isolating the significant features of objects and
constraining the local description of these features. The methods and approaches described
throughout this thesis establish alternative or complementary solutions to pixel-level analysis.
This work was partially supported by the Spanish Government through
its FPU grant program and the projects (TEC2007-65400 - SemanticVideo), (TEC2011-25995 Event
Video) and (TEC2014-53176-R HAVideo); the European Commission (IST-FP6-027685 - Mesh); the
Comunidad de Madrid (S-0505/TIC-0223 - ProMultiDis-CM) and the Spanish Administration Agency
CENIT 2007-1007 (VISION)
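The DCT-based micro-texture description mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a square grey-level patch around the pixel and uses the plain orthonormal 2-D DCT-II basis as the filter bank; the function names `dct_basis` and `microtexture_descriptor`, the patch size and the choice to discard the DC term are illustrative assumptions, not the thesis's actual design.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II matrix of size n x n (each row is one basis vector)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    c = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] *= np.sqrt(1.0 / n)       # DC row scaling
    c[1:, :] *= np.sqrt(2.0 / n)      # AC rows scaling
    return c

def microtexture_descriptor(patch):
    """Hypothetical descriptor: magnitudes of the 2-D DCT coefficients of a
    square patch, with the DC (mean intensity) term dropped."""
    patch = np.asarray(patch, dtype=float)
    c = dct_basis(patch.shape[0])
    coeffs = c @ patch @ c.T          # separable 2-D DCT
    return np.abs(coeffs).ravel()[1:] # skip the DC coefficient
```

Under these assumptions a flat patch yields an all-zero descriptor, while a textured patch (for example a checkerboard) produces large high-frequency responses.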
Pattern Recognition
Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one-, two- or three-dimensional; the processing is done in real time or takes hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.
Mobile Robots Navigation
Mobile robot navigation includes different interrelated activities: (i) perception, as obtaining and interpreting sensory information; (ii) exploration, as the strategy that guides the robot to select the next direction to go; (iii) mapping, involving the construction of a spatial representation by using the sensory information perceived; (iv) localization, as the strategy to estimate the robot position within the spatial map; (v) path planning, as the strategy to find a path towards a goal location, whether optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses those activities by integrating results from the research work of several authors from all over the world. Research cases are documented in 32 chapters organized into 7 categories, described next
Advances in Robotics, Automation and Control
The book presents an overview of recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, this book presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improving system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. The book also covers navigation and vision algorithms, automatic handwriting comprehension and speech recognition systems that will be included in the next generation of productive systems.
A Multiple-Systems Approach in the Symbolic Modelling of Human Vision
For most of the thirty years or so of machine vision research, activity has been concentrated mainly in the domain of metric-based approaches: there has been negligible attention to the psychological factors in human vision. With the recent resurgence of interest in neural systems, that is now changing. This thesis discusses relevant aspects of basic visual neuroanatomy, and psychological phenomena, in an attempt to relate the concepts to a model of human vision and the prospective goals of future machine vision systems. It is suggested that, while biological vision is complex, the underlying mechanisms of human vision are more tractable than is often believed. We also argue here that the controversial subject of direct vision plays a crucial role in natural vision, and we attempt to relate this to the model. The recognition of massive parallelism in natural vision has led to proposals for emulating aspects of neural networks in technology. The systems model developed in this work demonstrates software-simulated cellular automata (CAs) in the role of mainly low-level image processing. It is shown that CAs are able to efficiently provide both conventional and neurally-inspired vision functions. The thesis also discusses the use of Prolog as the means of realising higher level image understanding. The symbolic processing developed is basic, but is nevertheless sufficient for the purposes of the present demonstrations. Extensions to the concepts can be easily achieved. The modular systems approach adopted blends together several ideas and processes, and results in a more robust model of human vision that is able to translate a noisy real image into an accessible symbolic form for expert-domain interpretation.
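As a rough illustration of how a cellular automaton can perform low-level image processing, the sketch below applies one synchronous update to a binary image so that only boundary cells survive. The rule, the 4-neighbourhood, the treatment of pixels outside the image as background and the name `ca_edge_step` are assumptions chosen for the example; the abstract does not specify the thesis's CA rules.

```python
import numpy as np

def ca_edge_step(grid):
    """One synchronous CA update on a binary image.

    Assumed rule: a cell is 1 in the next state only if it is currently
    foreground and at least one of its 4-neighbours is background.
    Cells outside the image are treated as background.
    """
    g = np.asarray(grid, dtype=bool)
    p = np.pad(g, 1, mode="constant", constant_values=False)
    up, down = p[:-2, 1:-1], p[2:, 1:-1]
    left, right = p[1:-1, :-2], p[1:-1, 2:]
    surrounded = up & down & left & right   # all four neighbours are foreground
    return (g & ~surrounded).astype(np.uint8)
```

Every cell is updated from the same previous state, which is what makes the rule a cellular automaton step rather than a sequential scan; applied to a filled square it leaves only the outline.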
Adaptive Shadow and Highlight Invariant Colour Segmentation for Traffic Sign Recognition Based on Kohonen SOM
This paper describes an intelligent algorithm for traffic sign recognition which converges quickly, is accurate in its segmentation and adaptive in its behaviour. The proposed approach can segment images of traffic signs in different lighting and environmental conditions and in different countries. It is based on using Kohonen's Self-Organizing Maps (SOM) as a clustering tool and it is developed for Intelligent Vehicle applications. The current approach does not need any prior training. Instead, a small portion, about 1% of the image under investigation, is used for training. This is a key issue to ensure fast convergence and high adaptability. The current approach was tested by using 442 images which were collected under different environmental conditions and from different countries. The proposed approach shows promising results; on faded traffic sign images the result improves to 73%, compared with 53.3% using the traditional algorithm. The adaptability of the system is evident from the segmentation of the traffic sign images from various countries, where the result is 96% for the nine countries included in the test.
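The idea of training a Kohonen SOM on roughly 1% of an image's pixels and then labelling every pixel can be sketched as follows. This is a minimal illustration under stated assumptions: a one-dimensional map, raw RGB values as features, linearly decaying learning rate and neighbourhood width, and the hypothetical names `train_som`, `segment` and `som_segment_from_sample`. The paper's actual map topology, colour space and parameters are not given in the abstract, and the shadow and highlight invariance it describes is not reproduced here.

```python
import numpy as np

def train_som(samples, n_nodes=4, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D Kohonen map on colour samples (rows are feature vectors)."""
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    # Initialise node weights from randomly chosen samples.
    weights = samples[rng.choice(len(samples), n_nodes, replace=True)].copy()
    positions = np.arange(n_nodes)
    total, t = epochs * len(samples), 0
    for _ in range(epochs):
        for x in samples[rng.permutation(len(samples))]:
            frac = t / total
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # decaying neighbourhood width
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best matching unit
            h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            t += 1
    return weights

def segment(image, weights):
    """Label each pixel with the index of its nearest SOM node."""
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    d = ((pixels[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1).reshape(image.shape[:2])

def som_segment_from_sample(image, fraction=0.01, seed=0, **som_kwargs):
    """Train on a random fraction of the pixels, then segment the whole image."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, image.shape[-1])
    n = max(1, int(round(fraction * len(pixels))))
    sample = pixels[rng.choice(len(pixels), n, replace=False)]
    return segment(image, train_som(sample, seed=seed, **som_kwargs))
```

Because the map is fitted to pixels drawn from the image being segmented, no separate training set is needed, which matches the adaptive behaviour the abstract describes.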
Sustainable Agriculture and Advances of Remote Sensing (Volume 2)
Agriculture, as the main source of food and the most important economic activity globally, is being affected by the impacts of climate change. To maintain and increase global food production, reduce biodiversity loss and preserve natural ecosystems, new practices and technologies are required. This book focuses on the latest advances in remote sensing technology and agricultural engineering leading to sustainable agricultural practices. Earth observation data and in situ and proxy remote sensing data are the main sources of information for monitoring and analyzing agricultural activities. Particular attention is given to earth observation satellites and the Internet of Things for data collection, to multispectral and hyperspectral data analysis using machine learning and deep learning, and to WebGIS and the Internet of Things for sharing and publishing the results, among others.