310 research outputs found
A biologically inspired spiking model of visual processing for image feature detection
To enable fast, reliable feature matching or tracking in scenes, features need to be discrete and meaningful, and hence edge or corner features, commonly called interest points, are often used for this purpose. Experimental research has illustrated that biological vision systems use neuronal circuits to extract particular features such as edges or corners from visual scenes. Inspired by this biological behaviour, this paper proposes a biologically inspired spiking neural network for the purpose of image feature extraction. Standard digital images are processed and converted to spikes in a manner similar to the processing that transforms light into spikes in the retina. Using a hierarchical spiking network, various types of biologically inspired receptive fields are used to extract progressively complex image features. The performance of the network is assessed by examining the repeatability of extracted features, with visual results presented using both synthetic and real images.
Multi-scale keypoints in V1 and face detection
End-stopped cells in cortical area V1, which combine outputs of complex cells tuned to different orientations, serve to detect line and edge crossings (junctions) and points with a large curvature. In this paper we study the importance of the multi-scale keypoint representation, i.e. retinotopic keypoint maps which are tuned to different spatial frequencies (scale or Level-of-Detail). We show that this representation provides important information for Focus-of-Attention (FoA) and object detection. In particular, we show that hierarchically-structured saliency maps for FoA can be obtained, and that combinations over scales in conjunction with spatial symmetries can lead to face detection through grouping operators that deal with keypoints at the eyes, nose and mouth, especially when non-classical receptive field inhibition is employed. Although a face detector can be based on feedforward and feedback loops within area V1, such an operator must be embedded into dorsal and ventral data streams to and from higher areas for obtaining translation-, rotation- and scale-invariant face (object) detection.
Multi-scale keypoint annotation: a biological approach
The primary visual cortex employs simple, complex and end-stopped cells to create a scale space of 1D singularities (lines and edges) and of 2D singularities (line and edge junctions and crossings, called keypoints). In this paper we show first results of a biological model which attributes information about the local image structure to keypoints at all scales, i.e. junction type (L, T, +) and main line/edge orientations. Keypoint annotation in combination with coarse-to-fine scale processing facilitates various processes, such as image matching (stereo and optical flow), object segregation and object tracking.
Integrated multi-scale architecture of the cortex with application to computer vision
Doctoral thesis, Electronics and Computer Engineering, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2007. The main goal of this thesis is to try to understand the functioning of the
visual cortex through the development of computational models. In the
input layer V1 of the visual cortex there are simple, complex and end-stopped
cells. These provide a multi-scale representation of objects and
scene in terms of lines, edges and keypoints. In this thesis we combine
recent progress concerning the development of computational models of
these and other cells with processes in higher cortical areas V2 and V4
etc. Three pertinent challenges are discussed: (i) object recognition embedded
in a cortical architecture; (ii) brightness perception, and (iii)
painterly rendering based on human vision. Specific aspects are Focus-of-Attention
by means of keypoint-based saliency maps, the dynamic
routing of features from V1 through higher cortical areas in order to
obtain translation, rotation and size invariance, and the construction of
normalized object templates with canonical views in visual memory. Our
simulations show that the multi-scale representations can be integrated
into a cortical architecture in order to model subsequent processing steps:
from segregation, via different categorization levels, until final object
recognition is obtained. As for real cortical processing, the system starts
with coarse-scale information, refines categorization by using medium-scale
information, and employs all scales in recognition. We also show
that a 2D brightness model can be based on the multi-scale symbolic
representation of lines and edges, with an additional low-pass channel
and nonlinear amplitude transfer functions, such that object recognition
and brightness perception are combined processes based on the same information.
The brightness model can predict many different effects such
as Mach bands, grating induction, the Craik-O’Brien-Cornsweet illusion
and brightness induction, i.e. the opposite effects of assimilation (White
effect) and simultaneous brightness contrast. Finally, a novel application
is introduced: painterly rendering has been linked to computer vision,
but we propose to link it to human vision because perception and painting
are two processes which are strongly interwoven.
TiViPE Simulation of a Cortical Crossing Cell Model
Abstract. Many cells in cat and monkey visual cortex (area V1 and area 17) respond to gratings and bar patterns of different orientation between center and surround [18]. It has been shown that these cells respond on average 3.3 times more strongly to a crossing pattern than to a single bar [16]. In this paper a computational model for a group of neurons that respond solely to crossing patterns is proposed, and has been implemented in the visual programming environment TiViPE [10]. Simulations show that the operator responds very accurately to crossing patterns in which the angular difference between the 2 bars is 40 degrees or more. The operator also responds appropriately to bar widths between 50 and 200 percent of the preferred bar width and is insensitive to non-uniform illumination conditions. These properties appear to be consistent with the experimental results.
Active vision in robot cognition
Doctoral thesis, Informatics Engineering, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2016. As technology and our understanding of the human brain evolve, the idea of creating
robots that behave and learn like humans seems to get more and more attention.
However, although that knowledge and computational power are constantly growing,
we still have much to learn to be able to create such machines. Nonetheless, that
does not mean we cannot try to validate our knowledge by creating biologically
inspired models to mimic some of our brain processes and use them for robotics
applications.
In this thesis several biologically inspired models for vision are presented: a
keypoint descriptor based on cortical cell responses that allows the creation of binary
codes which can be used to represent specific image regions; and a stereo vision
model based on cortical cell responses and visual saliency based on color, disparity
and motion. Active vision is achieved by combining these vision modules with an
attractor dynamics approach for head pan control.
Although biologically inspired models are usually very heavy in terms of processing
power, these models were designed to be lightweight so that they can be
tested for real-time robot navigation, object recognition and vision steering. The
developed vision modules were tested on a child-sized robot, which uses only visual
information to navigate, to detect obstacles and to recognize objects in real time.
The biologically inspired visual system is integrated with a cognitive architecture,
which combines vision with short- and long-term memory for simultaneous localization
and mapping (SLAM). Motor control for navigation is also done using attractor
dynamics.
Face and object recognition by 3D cortical representations
This thesis presents a novel integrated cortical architecture with significant
emphasis on low-level attentional mechanisms—based on retinal nonstandard
cells and pathways—that can group non-attentional, bottom-up
features present in V1/V2 into “proto-object” shapes. These shapes are extracted
at first using combinations of specific cell types for detecting corners,
bars/edges and curves which work extremely well for geometrically
shaped objects. Later, in the parietal pathway (probably in LIP), arbitrary
shapes can be extracted from population codes of V2 (or even dorsal V3)
oriented cells that encode the outlines of objects as “proto-objects”. Object
shapes obtained at both cortical levels play an important role in bottom-up
local object gist vision, which tries to understand scene context in less than
70 ms and is thought to use both global and local scene features.
Edge conspicuity maps are able to detect borders/edges of objects and
assign them a weight based on their perceptual salience, using readily
available retinal ganglion cell colour-opponency coding. Conspicuity maps
are fundamental in building posterior saliency maps—important for both
bottom-up attention schemes and also for Focus-of-Attention mechanisms
that control eye gaze and object recognition.
Disparity maps are also a main focus of this thesis. They are built upon
binocular simple and complex cells in quadrature, using a Disparity-Energy
Model. These maps are fundamental for perception of distance within a
scene and close/far object relationships in doing foreground to background
segregation.
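As an illustration only, the idea of binocular simple cells in quadrature feeding a complex cell can be sketched in one dimension. This is a minimal sketch of the position-shift variant of the disparity-energy model, not the implementation used in the thesis. The Gabor parameters (`sigma`, `freq`, `half_width`), the function names, and the convention that disparity is the shift of the right signal relative to the left, in samples, are all assumptions made for the example.

```python
import math

def gabor_pair(sigma=4.0, freq=0.125, half_width=12):
    """Return even (cosine) and odd (sine) 1D Gabor kernels in quadrature."""
    xs = range(-half_width, half_width + 1)
    env = [math.exp(-x * x / (2 * sigma * sigma)) for x in xs]
    even = [e * math.cos(2 * math.pi * freq * x) for e, x in zip(env, xs)]
    odd = [e * math.sin(2 * math.pi * freq * x) for e, x in zip(env, xs)]
    return even, odd

def filter_at(signal, kernel, center):
    """Linear receptive-field response centred at `center` (zero padded)."""
    half = len(kernel) // 2
    total = 0.0
    for k, w in enumerate(kernel):
        i = center + k - half
        if 0 <= i < len(signal):
            total += w * signal[i]
    return total

def disparity_energy(left, right, center, disparity, even, odd):
    """Complex-cell energy for one candidate disparity."""
    # Binocular simple cells: sum of left and right monocular responses,
    # with the right receptive field shifted by the candidate disparity.
    s_even = filter_at(left, even, center) + filter_at(right, even, center + disparity)
    s_odd = filter_at(left, odd, center) + filter_at(right, odd, center + disparity)
    # Complex cell: sum of the squared quadrature-pair responses.
    return s_even ** 2 + s_odd ** 2

def estimate_disparity(left, right, center, candidates):
    """Pick the candidate disparity with the highest complex-cell energy."""
    even, odd = gabor_pair()
    return max(candidates,
               key=lambda d: disparity_energy(left, right, center, d, even, odd))
```

Calling `estimate_disparity(left, right, center, range(-4, 5))` on two scanlines returns the candidate shift with the strongest response at `center`. The candidate range is kept within one period of the filter's preferred frequency, because a single frequency channel cannot distinguish shifts that differ by a full period.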
The role of cortical disparity in 3D facial recognition was also explored
when processing faces with very different facial expressions (even extreme
ones), yielding state-of-the-art results when compared to other, non-biological,
computer vision algorithms.
This thesis describes a new integrated cortical architecture, with special emphasis on low-level attention mechanisms, based on cortical connections that use non-standard retinal cells, which can group several low-level visual features, still in a pre-attentional state and present in areas V1/V2, into specific “proto-object” shapes. The shapes in question are first extracted through combinations of specialized cells that locally detect corners, bars/edges and curvatures, which works extremely well for detecting objects with geometric shapes. Later, in the parietal lobe (probably in the Lateral Intra-Parietal cortex), arbitrary shapes can be extracted through activation patterns of neuron populations in V2 (or even in dorsal V3) that encode the periphery of objects as “proto-objects”: basic representations of specific object categories in the brain. Both kinds of shape extracted in the two types of cortical processing (using specific cells or an arbitrary shape coding) play an important role in local gist vision, which tries to understand the general context of the scene presented to the visual system in less than 70 ms, and which is expected to use both global and local visual features.
Conspicuity maps are also used; they allow lines and edges of objects to be detected and assign them a weight based on their perceptual salience, using the natural coding of retinal cells, in which colours are represented by opponency: light/dark, red/green and yellow/blue. Conspicuity maps are fundamental for the subsequent construction of saliency maps, which are important in low-level pre-attentional schemes and also for the Focus-of-Attention mechanisms that control eye movement and the recognition of faces and objects.
In parallel, cortical disparity maps are also developed, and these are also one of the main focuses of this thesis. They are based on binocular simple and complex cortical cells, with the latter processed in quadrature, a model known as the “Disparity-Energy Model”. These disparity maps are fundamental for perceiving distances within a visual scene and also for solving the object/background segregation problem.
The role of cortical disparity is also explored in 3D facial recognition, especially when the faces to be recognized show facial expressions of various forms and intensity levels. The model used presented excellent results, reaching the state of the art and even outperforming non-biological computer vision models.
Fundação para a Ciência e a Tecnologia, Comissão Europei
Medical image enhancement
Each image acquired from a medical imaging system is often part of a two-dimensional (2-D) image set that together represents a three-dimensional (3-D) object for diagnosis. Unfortunately, these images are sometimes of poor quality. The resulting distortions cause an inadequate presentation of the object of interest, which can result in inaccurate image analysis. Blurring is considered a serious problem. Therefore, “deblurring” an image to obtain better quality is an important issue in medical image processing. In our research, the image is initially decomposed. Contrast improvement is achieved by modifying the coefficients obtained from the decomposed image. Small coefficient values represent subtle details and are amplified to improve the visibility of the corresponding details. The stronger image density variations make a major contribution to the overall dynamic range, and have large coefficient values. These values can be reduced without much information loss.
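The coefficient modification described above can be illustrated with a simple sign-preserving power-law mapping. The abstract does not name the decomposition or the mapping that was used, so this is only a sketch under assumptions: the function names, the `knee` crossover value and the `power` exponent are hypothetical, and the detail coefficients are taken to be a plain list of numbers produced by some decomposition.

```python
import math

def remap_coefficient(c, knee=1.0, power=0.6):
    """Amplify |c| below `knee` and attenuate |c| above it, keeping the sign.

    With 0 < power < 1, magnitudes smaller than `knee` grow and
    magnitudes larger than `knee` shrink; `knee` itself is unchanged.
    """
    if c == 0:
        return 0.0
    return math.copysign(knee * (abs(c) / knee) ** power, c)

def enhance(detail_coefficients, knee=1.0, power=0.6):
    """Apply the remapping to every detail coefficient of a decomposed image."""
    return [remap_coefficient(c, knee, power) for c in detail_coefficients]
```

The enhanced image would then be reconstructed from the remapped coefficients with the inverse of whichever decomposition was applied. Choosing `knee` near the boundary between fine detail and strong density variations decides which coefficients are boosted and which are compressed.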