Color constancy for landmark detection in outdoor environments
European Workshop on Advanced Mobile Robots (EUROBOT), 2001, Lund (Sweden)

This work presents an evaluation of three color constancy techniques applied to a landmark detection system designed for a walking robot that has to operate in unknown and unstructured outdoor environments. The first technique is the well-known image conversion to a chromaticity space, and the second is based on successive lighting-intensity and illuminant-color normalizations. The third, which we propose, is based on color ratios derived from a differential model of color constancy and unifies the processes of color constancy and landmark detection. The approach used to detect potential landmarks, common to all evaluated systems, builds on visual saliency concepts, using multiscale color-opponent features to identify salient regions in the images. These regions are selected as landmark candidates and are further characterized by their features for identification and recognition.

This work was supported by the project 'Navegación autónoma de robots guiados por objetivos visuales' (070-720).

Peer Reviewed
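As an illustrative sketch (not the paper's implementation), the first technique, conversion to a chromaticity space, can be written as follows; dividing each channel by the pixel's total intensity discards lighting-intensity variation:

```python
import numpy as np

def to_chromaticity(rgb):
    """Convert an RGB image (H, W, 3) to rg-chromaticity coordinates.

    Normalizing by the per-pixel intensity sum removes uniform
    lighting-intensity changes, a simple form of color constancy.
    """
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=-1, keepdims=True)
    total[total == 0] = 1.0          # avoid division by zero on black pixels
    chroma = rgb / total
    return chroma[..., :2]           # keep (r, g); b = 1 - r - g is redundant

# A pixel and the same pixel under tripled illumination intensity
# map to identical chromaticity coordinates:
px = np.array([[[10, 20, 70]]])
assert np.allclose(to_chromaticity(px), to_chromaticity(3 * px))
```

The two retained coordinates suffice because the three chromaticity components always sum to one.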
Detecting salient cues through illumination-invariant color ratios
This work presents a novel technique for embedding color constancy into a saliency-based system for detecting potential landmarks in outdoor environments. Since multiscale color opponencies are among the ingredients determining saliency, the idea is to make such opponencies directly invariant to illumination variations, rather than enforcing the invariance of the colors themselves. The new technique is compared against the alternative approach of preprocessing the images with a color constancy procedure before they enter the saliency system. The first procedure used in the experimental comparison is the well-known image conversion to chromaticity space, and the second is based on successive lighting-intensity and illuminant-color normalizations. The proposed technique offers significant advantages over the preceding two since, at a lower computational cost, it exhibits higher stability under illumination variations and even under slight viewpoint changes, resulting in a better correspondence of visual saliency to potential landmark elements.

This work was supported by the project 'Sistema reconfigurable para la navegación basada en visión de robots caminantes y rodantes en entornos naturales' (00). The authors would like to acknowledge the support obtained from the Forschungszentrum Informatik and the Institut für Prozessrechentechnik, Automation und Robotik, Karlsruhe University, Germany. This work is partially supported by the Spanish Science and Technology Directorate, in the scope of the project "Reconfigurable system for vision-based navigation of legged and wheeled robots in natural environments (SIRVENT)", grant DPI2003-05193-C02-01.

Peer Reviewed
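The color-ratio idea can be sketched as follows. This is a generic neighbor-ratio invariant under a von Kries (diagonal) illumination model, not the authors' exact formulation:

```python
import numpy as np

def color_ratios(img, eps=1e-6):
    """Horizontal color ratios between neighboring pixels, per channel.

    Under a diagonal (von Kries) illumination model, a spatially
    uniform illuminant scales each channel by a constant, so the ratio
    of neighboring pixel values within a channel cancels the illuminant
    and is therefore illumination invariant.
    """
    img = img.astype(np.float64) + eps   # eps guards against zero pixels
    return img[:, 1:, :] / img[:, :-1, :]

# The ratios are (numerically) unchanged by a per-channel illuminant scaling:
rng = np.random.default_rng(0)
scene = rng.uniform(1, 255, size=(4, 5, 3))
illum = np.array([0.5, 1.3, 0.8])        # hypothetical illuminant gains
assert np.allclose(color_ratios(scene), color_ratios(scene * illum), rtol=1e-4)
```

Because the invariance holds per channel, opponency features built from such ratios inherit it directly, which is the gist of embedding color constancy into the saliency computation.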
Color-contrast landmark detection and encoding in outdoor images
International Conference on Computer Analysis of Images and Patterns (CAIP), 2005, Versailles (France)

This paper describes a system to extract salient regions from an outdoor image and match them against a database of previously acquired landmarks. Region saliency is based mainly on color contrast, although intensity and texture orientation are also taken into account. Remarkably, color constancy is embedded in the saliency detection process through a novel color-ratio algorithm that makes the system robust to illumination changes, so common in outdoor environments. A region is characterized by a combination of its saliency and its color distribution in chromaticity space. Newly acquired landmarks are compared with those already stored in a database through a quadratic distance metric on their characterizations. Experimentation with a database containing 68 natural landmarks acquired with the system yielded good recognition results, in terms of both recall and rank indices. However, the discrimination between landmarks should be improved to avoid false positives, as suggested by the low precision index.

This work was supported by the project 'Sistema reconfigurable para la navegación basada en visión de robots caminantes y rodantes en entornos naturales' (00).

Peer Reviewed
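The quadratic distance metric over color characterizations can be illustrated with a generic quadratic-form histogram distance; the 3-bin layout and the similarity matrix below are toy assumptions, not the paper's parameters:

```python
import numpy as np

def quadratic_distance(h1, h2, A):
    """Quadratic-form distance between two color histograms.

    A[i, j] encodes the perceptual similarity between bins i and j, so
    mass moved to a nearby chromaticity bin partially matches, unlike a
    plain Euclidean distance on the histograms.
    """
    d = h1 - h2
    return float(np.sqrt(d @ A @ d))

# Toy 3-bin histograms; A is built from assumed bin-center distances.
centers = np.array([0.0, 0.5, 1.0])
gaps = np.abs(centers[:, None] - centers[None, :])
A = 1.0 - gaps / gaps.max()          # similarity: 1 on diagonal, 0 at far bins
h1 = np.array([1.0, 0.0, 0.0])
h2 = np.array([0.0, 1.0, 0.0])
h3 = np.array([0.0, 0.0, 1.0])
# Mass moved to an adjacent bin counts as closer than mass moved far away:
assert quadratic_distance(h1, h2, A) < quadratic_distance(h1, h3, A)
```

With A set to the identity matrix the expression reduces to the ordinary Euclidean distance, which shows what the cross-bin terms add.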
Cortical 3D Face Recognition Framework
Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Cortical area V1 contains double-opponent colour blobs, as well as simple, complex and end-stopped cells, which provide input for a multiscale line/edge representation, keypoints for dynamic routing, and saliency maps for Focus-of-Attention. All of these combined allow us to segregate faces. Events of different facial views are stored in memory and combined in order to identify the view and recognise the face, including facial expression. In this paper we show that with five 2D views and their cortical representations it is possible to determine the left-right and frontal-lateral-profile views and to achieve view-invariant recognition of 3D faces.
Outdoor view recognition based on landmark grouping and logistic regression
Vision-based robot localization outdoors has remained more elusive than its indoor counterpart. Drastic illumination changes and the scarcity of suitable landmarks are the main difficulties. This paper attempts to surmount them by deviating from the main trend of using local features. Instead, a global descriptor called the landmark-view is defined, which aggregates the most visually salient landmarks present in each scene. Thus, landmark co-occurrence and the spatial and saliency relationships between landmarks are added to the single-landmark characterization, which is based on saliency and color distribution. A suitable framework to compare landmark-views is developed, and it is shown how this remarkably enhances recognition performance compared against single-landmark recognition. A view-matching model is constructed using logistic regression. Experimentation using 45 views, acquired outdoors and containing 273 landmarks, yielded good recognition results. The overall percentage of correct view classification obtained was 80.6%, indicating the adequacy of the approach.

Peer Reviewed

Postprint (author's final draft)
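A minimal sketch of a logistic-regression view-matching model, trained by gradient descent. The single pair feature here (a hypothetical landmark-overlap score) is a stand-in for the landmark-view comparisons described above, not the paper's feature set:

```python
import numpy as np

def train_logistic(X, y, lr=0.5, iters=2000):
    """Fit a logistic-regression matcher: P(same view) = sigmoid(w . x + b)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])     # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))         # sigmoid
        w -= lr * Xb.T @ (p - y) / len(y)         # gradient of the log-loss
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5).astype(int)

# Toy data: view pairs with high landmark overlap are matches (y = 1).
X = np.array([[0.9], [0.8], [0.7], [0.2], [0.1], [0.3]])
y = np.array([1, 1, 1, 0, 0, 0])
w = train_logistic(X, y)
assert (predict(w, X) == y).all()
```

The learned weights turn the pairwise comparison score into a calibrated match probability, which is what makes logistic regression a natural choice for a binary same-view/different-view decision.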
Multimodal Computational Attention for Scene Understanding
Robotic systems have limited computational capacities; hence, computational attention models are important for focusing on specific stimuli and allowing complex cognitive processing. For this purpose, we developed auditory and visual attention models that enable robotic platforms to efficiently explore and analyze natural scenes. To allow for attention guidance in human-robot interaction, we use machine learning to integrate the influence of verbal and non-verbal social signals into our models.
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection---that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts.

Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
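Sparse coding as described above can be illustrated with a small Orthogonal Matching Pursuit sketch: a signal is represented as a linear combination of a few dictionary atoms. The orthonormal dictionary below is a toy assumption chosen so recovery is exact, not a learned dictionary:

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal Matching Pursuit: a k-sparse code for x over dictionary D.

    Greedily selects the atom (column of D) most correlated with the
    current residual, then re-fits all selected atoms by least squares.
    """
    residual = x.copy()
    support = []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coefs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coefs
    code = np.zeros(D.shape[1])
    code[support] = coefs
    return code

# A signal built from 2 of 8 orthonormal atoms is recovered exactly with k = 2.
rng = np.random.default_rng(1)
D, _ = np.linalg.qr(rng.normal(size=(16, 8)))   # orthonormal columns (toy setup)
x = 3.0 * D[:, 2] - 1.5 * D[:, 5]
code = omp(D, x, k=2)
assert np.allclose(D @ code, x)
```

In the dictionary-learning setting the monograph surveys, D itself is optimized over a training set, alternating sparse coding steps like this one with dictionary updates.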
Skin texture features for face recognition
Face recognition has been deployed in a wide range of important applications including surveillance and forensic identification. However, it still seems to be a challenging problem, as its performance severely degrades under illumination, pose and expression variations, as well as with occlusions and aging. In this thesis, we have investigated the use of local facial skin data as a source of biometric information to improve human recognition. Skin texture features have been exploited in three major tasks, which include (i) improving the performance of conventional face recognition systems, (ii) building an adaptive skin-based face recognition system, and (iii) dealing with circumstances when a full view of the face may not be available. Additionally, a fully automated scheme is presented for localizing eyes and mouth and segmenting four facial regions: forehead, right cheek, left cheek and chin. These four regions are divided into nonoverlapping patches of equal size. A novel skin/non-skin classifier is proposed for detecting patches containing only skin texture and therefore detecting the pure-skin regions. Experiments using the XM2VTS database indicate that the forehead region has the most significant biometric information. The use of forehead texture features improves the rank-1 identification of the Eigenfaces system from 77.63% to 84.07%. The rank-1 identification equals 93.56% when this region is fused with the Kernel Direct Discriminant Analysis algorithm.
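Dividing a segmented facial region into equal, nonoverlapping patches, as described above, can be sketched as follows; the patch size and the policy of discarding incomplete border patches are assumptions for illustration, not the thesis's parameters:

```python
import numpy as np

def split_patches(region, ph, pw):
    """Split a 2D region (H, W) into equal, nonoverlapping ph x pw patches.

    Rows and columns that do not fill a whole patch are discarded, so
    every returned patch has exactly the same size, as required when
    the patches feed a fixed-size texture classifier.
    """
    H, W = region.shape[:2]
    patches = [region[r:r + ph, c:c + pw]
               for r in range(0, H - ph + 1, ph)
               for c in range(0, W - pw + 1, pw)]
    return np.stack(patches)

region = np.arange(48).reshape(6, 8)      # toy stand-in for a forehead region
patches = split_patches(region, 3, 4)
assert patches.shape == (4, 3, 4)         # a 2 x 2 grid of 3x4 patches
```

Each patch could then be passed independently to a skin/non-skin classifier, keeping only pure-skin patches for the texture features.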
Automatic View-Point Selection for Inter-Operative Endoscopic Surveillance
Esophageal adenocarcinoma arises from Barrett's esophagus, which is the most serious complication of gastroesophageal reflux disease. Strategies for screening involve periodic surveillance and tissue biopsies. A major challenge in such regular examinations is to record and track the disease evolution and to re-localize biopsied sites to provide targeted treatments. In this paper, we extend our original inter-operative relocalization framework to provide a constrained image-based search for obtaining the best view-point match to the live view. Within this context we investigate the effect of (a) the choice of feature descriptors and color space, (b) filtering of uninformative frames, and (c) endoscopic modality, on view-point localization. Our experiments indicate an improvement in the best view-point retrieval rate to [92%, 87%] from [73%, 76%] (in our previous approach) for NBI and WL.
Active vision in robot cognition
Doctoral thesis, Informatics Engineering, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2016

As technology and our understanding of the human brain evolve, the idea of creating robots that behave and learn like humans seems to get more and more attention. However, although our knowledge and computational power are constantly growing, we still have much to learn before we can create such machines. Nonetheless, that does not mean we cannot try to validate our knowledge by creating biologically inspired models to mimic some of our brain processes and use them for robotics applications.

In this thesis several biologically inspired models for vision are presented: a keypoint descriptor based on cortical cell responses that allows the creation of binary codes which can be used to represent specific image regions; and a stereo vision model based on cortical cell responses and visual saliency based on color, disparity and motion. Active vision is achieved by combining these vision modules with an attractor dynamics approach for head pan control.

Although biologically inspired models are usually very heavy in terms of processing power, these models were designed to be lightweight so that they can be tested for real-time robot navigation, object recognition and vision steering. The developed vision modules were tested on a child-sized robot, which uses only visual information to navigate, to detect obstacles and to recognize objects in real time. The biologically inspired visual system is integrated with a cognitive architecture, which combines vision with short- and long-term memory for simultaneous localization and mapping (SLAM). Motor control for navigation is also done using attractor dynamics.
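Binary keypoint codes of the kind described above are typically matched by Hamming distance, which is what makes them cheap enough for real-time robot vision. The following is a generic sketch with made-up 8-bit codes, not the thesis's descriptor:

```python
import numpy as np

def hamming_match(query, codes):
    """Match a binary keypoint code against a database by Hamming distance.

    Each code is a 0/1 array; the distance is simply the number of
    differing bits, computed here with a vectorized comparison.
    """
    dists = (codes != query).sum(axis=1)
    best = int(np.argmin(dists))
    return best, int(dists[best])

# Hypothetical database of three 8-bit keypoint codes:
codes = np.array([[0, 1, 1, 0, 1, 0, 0, 1],
                  [1, 1, 0, 0, 0, 1, 1, 0],
                  [0, 0, 1, 1, 1, 0, 1, 1]])
query = np.array([0, 1, 1, 0, 1, 0, 1, 1])   # closest to code 0 (1 bit differs)
idx, d = hamming_match(query, codes)
assert (idx, d) == (0, 1)
```

For longer codes, packing the bits into integers and using XOR with a population count gives the same result with far less memory traffic.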