310 research outputs found

    Multi-scale keypoints in V1 and face detection

    End-stopped cells in cortical area V1, which combine outputs of complex cells tuned to different orientations, serve to detect line and edge crossings (junctions) and points with a large curvature. In this paper we study the importance of the multi-scale keypoint representation, i.e. retinotopic keypoint maps which are tuned to different spatial frequencies (scale or Level-of-Detail). We show that this representation provides important information for Focus-of-Attention (FoA) and object detection. In particular, we show that hierarchically-structured saliency maps for FoA can be obtained, and that combinations over scales in conjunction with spatial symmetries can lead to face detection through grouping operators that deal with keypoints at the eyes, nose and mouth, especially when non-classical receptive field inhibition is employed. Although a face detector can be based on feedforward and feedback loops within area V1, such an operator must be embedded into dorsal and ventral data streams to and from higher areas for obtaining translation-, rotation- and scale-invariant face (object) detection.

    Multi-scale keypoint annotation: a biological approach

    The primary visual cortex employs simple, complex and end-stopped cells to create a scale space of 1D singularities (lines and edges) and of 2D singularities (line and edge junctions and crossings, called keypoints). In this paper we show first results of a biological model which attributes information about the local image structure to keypoints at all scales, i.e. junction type (L, T, +) and main line/edge orientations. Keypoint annotation in combination with coarse-to-fine scale processing facilitates various processes, such as image matching (stereo and optical flow), object segregation and object tracking.

    Integrated multi-scale architecture of the cortex with application to computer vision

    Doctoral thesis, Electronics and Computer Engineering, Faculdade de Ciência e Tecnologia, Universidade do Algarve, 2007. The main goal of this thesis is to try to understand the functioning of the visual cortex through the development of computational models. In the input layer V1 of the visual cortex there are simple, complex and end-stopped cells. These provide a multi-scale representation of objects and scenes in terms of lines, edges and keypoints. In this thesis we combine recent progress concerning the development of computational models of these and other cells with processes in higher cortical areas such as V2 and V4. Three pertinent challenges are discussed: (i) object recognition embedded in a cortical architecture; (ii) brightness perception; and (iii) painterly rendering based on human vision. Specific aspects are Focus-of-Attention by means of keypoint-based saliency maps, the dynamic routing of features from V1 through higher cortical areas in order to obtain translation, rotation and size invariance, and the construction of normalized object templates with canonical views in visual memory. Our simulations show that the multi-scale representations can be integrated into a cortical architecture in order to model subsequent processing steps: from segregation, via different categorization levels, until final object recognition is obtained. As in real cortical processing, the system starts with coarse-scale information, refines categorization by using medium-scale information, and employs all scales in recognition. We also show that a 2D brightness model can be based on the multi-scale symbolic representation of lines and edges, with an additional low-pass channel and nonlinear amplitude transfer functions, such that object recognition and brightness perception are combined processes based on the same information. The brightness model can predict many different effects such as Mach bands, grating induction, the Craik-O’Brien-Cornsweet illusion and brightness induction, i.e. the opposite effects of assimilation (White effect) and simultaneous brightness contrast. Finally, a novel application is introduced: painterly rendering has been linked to computer vision, but we propose to link it to human vision because perception and painting are two processes which are strongly interwoven.

    TiViPE Simulation of a Cortical Crossing Cell Model

    Many cells in cat and monkey visual cortex (area V1 and area 17) respond to gratings and bar patterns of different orientation between center and surround [18]. It has been shown that these cells respond on average 3.3 times more strongly to a crossing pattern than to a single bar [16]. In this paper a computational model for a group of neurons that respond solely to crossing patterns is proposed, and has been implemented in the visual programming environment TiViPE [10]. Simulations show that the operator responds very accurately to crossing patterns in which the angular difference between the two bars is 40 degrees or more. The operator also responds appropriately to bar widths between 50 and 200 percent of the preferred bar width and is insensitive to non-uniform illumination conditions. These properties appear to be consistent with the experimental results.
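One way to picture an operator that fires only for crossings is an AND-like combination of two orientation channels. The sketch below is a toy illustration under that assumption, not the model implemented in TiViPE: `crossing_response`, `angular_difference` and the dictionary of channel responses are hypothetical names, and only the 40-degree threshold is taken from the abstract.

```python
def angular_difference(a, b):
    """Smallest difference between two orientations in degrees.

    Orientations (unlike directions) repeat every 180 degrees.
    """
    diff = abs(a - b) % 180.0
    return min(diff, 180.0 - diff)


def crossing_response(channel_responses, min_angle=40.0):
    """Toy crossing operator at a single image location.

    channel_responses maps a preferred orientation (degrees) to a
    non-negative response of an oriented channel (assumed input).
    The output is the largest min() over pairs of channels whose
    orientations differ by at least min_angle, so one bar alone
    (a single active channel) gives zero.
    """
    items = list(channel_responses.items())
    best = 0.0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            (a, ra), (b, rb) = items[i], items[j]
            if angular_difference(a, b) >= min_angle:
                best = max(best, min(ra, rb))
    return best


single_bar = {0.0: 1.0, 45.0: 0.0, 90.0: 0.0, 135.0: 0.0}
crossing = {0.0: 1.0, 45.0: 0.0, 90.0: 0.9, 135.0: 0.0}
print(crossing_response(single_bar), crossing_response(crossing))
```

Using `min` rather than a sum means the weaker of the two bars limits the response, which is one simple way to keep a single strong bar from driving the operator.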

    Active vision in robot cognition

    Doctoral thesis, Computer Science, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2016. As technology and our understanding of the human brain evolve, the idea of creating robots that behave and learn like humans attracts more and more attention. However, although our knowledge and computational power are constantly growing, we still have much to learn before we can create such machines. Nonetheless, that does not mean we cannot try to validate our knowledge by creating biologically inspired models that mimic some of our brain processes and using them in robotics applications. In this thesis several biologically inspired models for vision are presented: a keypoint descriptor based on cortical cell responses, which yields binary codes that can be used to represent specific image regions; a stereo vision model based on cortical cell responses; and visual saliency based on color, disparity and motion. Active vision is achieved by combining these vision modules with an attractor dynamics approach for head pan control. Although biologically inspired models are usually very demanding in terms of processing power, these models were designed to be lightweight so that they can be tested for real-time robot navigation, object recognition and vision steering. The developed vision modules were tested on a child-sized robot, which uses only visual information to navigate, to detect obstacles and to recognize objects in real time. The biologically inspired visual system is integrated with a cognitive architecture, which combines vision with short- and long-term memory for simultaneous localization and mapping (SLAM). Motor control for navigation is also done using attractor dynamics.

    Face and object recognition by 3D cortical representations

    This thesis presents a novel integrated cortical architecture with significant emphasis on low-level attentional mechanisms, based on non-standard retinal cells and pathways, that can group non-attentional, bottom-up features present in V1/V2 into “proto-object” shapes. These shapes are first extracted using combinations of specific cell types for detecting corners, bars/edges and curves, which work extremely well for geometrically shaped objects. Later, in the parietal pathway (probably in LIP), arbitrary shapes can be extracted from population codes of V2 (or even dorsal V3) oriented cells that encode the outlines of objects as “proto-objects”. Object shapes obtained at both cortical levels play an important role in bottom-up local object gist vision, which tries to understand scene context in less than 70 ms and is thought to use both global and local scene features. Edge conspicuity maps detect borders/edges of objects and assign them a weight based on their perceptual salience, using readily available retinal ganglion cell colour-opponency coding. Conspicuity maps are fundamental for subsequently building saliency maps, which are important both for bottom-up attention schemes and for Focus-of-Attention mechanisms that control eye gaze and object recognition. Disparity maps are also a main focus of this thesis. They are built upon binocular simple and complex cells in quadrature, using a Disparity-Energy Model. These maps are fundamental for perception of distance within a scene and of close/far object relationships in foreground-to-background segregation. The role of cortical disparity in 3D facial recognition was also explored when processing faces with very different facial expressions (even extreme ones), yielding state-of-the-art results when compared to other, non-biological, computer vision algorithms.
    This thesis describes a new integrated cortical architecture, with special emphasis on low-level attentional mechanisms based on cortical connections that use non-standard retinal cells, which can group various low-level visual features, still in a pre-attentional state and present in areas V1/V2, into specific “proto-object” shapes. These shapes are first extracted through combinations of specialised cells that locally detect corners, bars/edges and curvatures, which work extremely well for detecting geometrically shaped objects. Subsequently, in the parietal lobe (probably in the Lateral Intra-Parietal cortex), arbitrary shapes can be extracted from the activation patterns of neuron populations in V2 (or even dorsal V3) that encode the outlines of objects as “proto-objects”, i.e. basic representations of specific object categories in the brain. The shapes extracted in both types of cortical processing (using specific cells or an arbitrary shape coding) play an important role in local gist vision, which tries to understand the general context of the scene presented to the visual system in less than 70 ms and is expected to use both global and local visual features. Conspicuity maps are also used; they detect lines and edges of objects and assign them a weight based on their perceptual salience, using the natural coding of retinal cells, in which colours are represented by opponency: light/dark, red/green and yellow/blue. Conspicuity maps are fundamental to the subsequent construction of saliency maps, which are important for low-level pre-attentional schemes and also for the Focus-of-Attention mechanisms that control eye movement and the recognition of faces and objects. In parallel, cortical disparity maps are developed, and these are also one of the main focuses of this thesis. They are based on binocular simple and complex cortical cells, with the latter processed in quadrature, a model known as the “Disparity-Energy Model”. These disparity maps are fundamental for perceiving distances within a visual scene and for solving the object/background segregation problem. The role of cortical disparity is also explored in 3D face recognition, especially when the faces to be recognised show facial expressions of various forms and intensity levels. The model achieved excellent, state-of-the-art results, even outperforming non-biological computer vision models.
    Fundação para a Ciência e a Tecnologia; Comissão Europei
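The abstract only names the Disparity-Energy Model, so the following is a minimal 1D sketch of the position-shift variant of that model and not the thesis implementation. Binocular simple cells sum left-eye and right-eye Gabor responses, an even/odd quadrature pair is squared and summed to give a complex-cell energy, and the preferred disparity with the highest energy is taken as the estimate. The function names, the Gabor parameters `sigma` and `freq`, and the sign convention for disparity are assumptions made for illustration.

```python
import math


def gabor(x, sigma, freq, phase):
    """1D Gabor: Gaussian envelope times a cosine carrier."""
    envelope = math.exp(-x * x / (2.0 * sigma * sigma))
    return envelope * math.cos(2.0 * math.pi * freq * x + phase)


def rf_response(signal, center, sigma, freq, phase):
    """Linear response of a monocular receptive field centred at `center`."""
    radius = int(3 * sigma)
    total = 0.0
    for dx in range(-radius, radius + 1):
        i = center + dx
        if 0 <= i < len(signal):
            total += signal[i] * gabor(dx, sigma, freq, phase)
    return total


def disparity_energy(left, right, center, d, sigma=4.0, freq=0.1):
    """Complex-cell energy for preferred disparity d (in samples).

    The right-eye receptive field is shifted by d relative to the
    left-eye one (assumed sign convention). Each binocular simple
    cell sums both eyes; the even and odd (quadrature) cells are
    squared and added.
    """
    energy = 0.0
    for phase in (0.0, -math.pi / 2.0):  # even and odd Gabor
        simple = (rf_response(left, center, sigma, freq, phase)
                  + rf_response(right, center + d, sigma, freq, phase))
        energy += simple * simple
    return energy


def estimate_disparity(left, right, center, candidates):
    """Pick the candidate disparity with the largest energy."""
    return max(candidates,
               key=lambda d: disparity_energy(left, right, center, d))


# A single bright bar at sample 20 in the left eye and 23 in the right.
left = [0.0] * 40
right = [0.0] * 40
left[20] = 1.0
right[23] = 1.0
print(estimate_disparity(left, right, 20, range(-6, 7)))
```

For an isolated bar the energy peaks where the two receptive fields see the same pattern. With textured input the energy profile can show additional false peaks, which is one reason models of this kind usually pool over several frequencies or neighbouring positions.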

    Medical image enhancement

    Each image acquired from a medical imaging system is often part of a two-dimensional (2-D) image set which together represents a three-dimensional (3-D) object for diagnosis. Unfortunately, these images are sometimes of poor quality. The resulting distortions cause an inadequate presentation of the object of interest, which can result in inaccurate image analysis. Blurring is considered a serious problem. Therefore, “deblurring” an image to obtain better quality is an important issue in medical image processing. In our research, the image is initially decomposed. Contrast improvement is achieved by modifying the coefficients obtained from the decomposed image. Small coefficient values represent subtle details and are amplified to improve the visibility of the corresponding details. The stronger image density variations make a major contribution to the overall dynamic range, and have large coefficient values. These values can be reduced without much information loss.
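The idea of boosting small coefficients relative to large ones can be sketched as follows. The abstract does not name the decomposition or the mapping, so this example assumes a one-level Haar transform on a 1D signal of even length and a sign-preserving power law. `haar_decompose`, `haar_reconstruct`, `remap`, `enhance` and the exponent `p` are all hypothetical choices for illustration.

```python
import math


def haar_decompose(signal):
    """One-level Haar transform: pairwise averages and half-differences."""
    approx = [(signal[i] + signal[i + 1]) / 2.0
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2.0
              for i in range(0, len(signal), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def remap(coeff, max_abs, p=0.5):
    """Sign-preserving power law on a detail coefficient.

    With 0 < p < 1, magnitudes well below max_abs are amplified,
    while a coefficient equal to max_abs is left unchanged, so the
    dynamic range of the details is compressed.
    """
    if coeff == 0 or max_abs == 0:
        return 0.0
    return math.copysign(max_abs * (abs(coeff) / max_abs) ** p, coeff)


def enhance(signal, p=0.5):
    """Decompose, remap the detail coefficients, reconstruct."""
    approx, detail = haar_decompose(signal)
    max_abs = max(abs(d) for d in detail)
    detail = [remap(d, max_abs, p) for d in detail]
    return haar_reconstruct(approx, detail)


# The subtle step (10 -> 12) is widened; the strong edge (50 -> 10) is kept.
print(enhance([10.0, 12.0, 50.0, 10.0]))
```

Multiplying the remapped details by a gain below one would additionally reduce the largest coefficients, which is the second half of the behaviour the abstract describes.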