10 research outputs found

    Attention in hierarchical models of object recognition

    Object recognition and visual attention are tightly linked processes in human perception. Over the last three decades, many models have been proposed to explain these two processes and their interactions, and in some cases these models appear to contradict each other. We suggest a unifying framework for object recognition and attention and review the existing modeling literature in this context. Furthermore, we demonstrate a proof-of-concept implementation for sharing complex features between recognition and attention as a mode of top-down attention to particular objects or object categories.

    Integration of Bottom-Up and Top-Down Cues for Visual Attention Using Non-Linear Relaxation

    Active and selective perception seeks regions of interest in an image in order to reduce the computational complexity associated with time-consuming processes such as object recognition. We describe in this paper a visual attention system that extracts regions of interest by integrating multiple image cues. Bottom-up cues are detected by decomposing the image into a number of feature and conspicuity maps, while a priori knowledge (i.e. models) about objects is used to generate top-down attention cues. Bottom-up and top-down information is combined through a non-linear relaxation process using energy minimization-like procedures. The functionality of the attention system is expanded by the introduction of an alerting (motion-based) system able to explore and avoid obstacles. Experimental results are reported, using cluttered and noisy scenes. 1 Introduction Visual attention is the capability of biological visual systems to rapidly detect interesting parts of the visual input..

    Persuading Visual Attention through Geometry

    Spatially Coherent Geometric Class Labeling of Images and Its Applications

    Automatic scene analysis is an active research area and is useful in many applications such as robotics and automation, industrial manufacturing, architectural design and multimedia. 3D structural information is one of the most important cues for scene analysis. In this thesis, we present a geometric labeling method to automatically extract rough 3D information from a single 2D image. Our method partitions an image scene into five geometric regions by labeling every image pixel as one of five geometric classes (namely, “bottom”, “left”, “center”, “right”, and “top”). We formulate the geometric labeling problem as an energy minimization problem and optimize the energy with a graph-cut-based algorithm. In our energy function, we address the spatial consistency of the geometric labels in the scene while preserving discontinuities along image intensity edges. We also incorporate ordering constraints in our energy function. Ordering constraints specify the possible relative positional labels for neighboring pixels. For example, a pixel labeled “left” cannot be to the right of a pixel labeled “right”, and a pixel labeled “bottom” cannot be above a pixel labeled “top”. Ordering constraints arise naturally in a real scene. We observed that when ordering constraints are used, the commonly used graph-cut-based α-expansion is more likely to get stuck in local minima. To overcome this, we developed new graph-cut moves which we call order-preserving moves. Unlike α-expansion, which works for two labels in each move, order-preserving moves act on all labels. Although the global minimum is still not guaranteed, we show that optimization with order-preserving moves performs significantly better than α-expansion. Experimental results show that it is possible to significantly increase the percentage of reasonably good labelings by promoting spatial consistency and incorporating ordering constraints.
It is also shown that order-preserving moves perform significantly better than the commonly used α-expansion when ordering constraints are used: there is a significant improvement in computational efficiency and optimality, while the improvement in accuracy of pixel labeling is modest. We also demonstrate the usefulness of the extracted 3D structure information of a scene in applications such as novel view generation, virtual scene walk-through, semantic segmentation, scene synthesis, and scene text extraction. We also show how these order-preserving moves can be applied to certain simple shape priors in graph-cut segmentation. Our geometric labeling method makes the following main contributions: (i) We develop a new class of graph-cut moves called order-preserving moves, which perform significantly better than α-expansion when ordering constraints are used. (ii) We formulate the problem in a global optimization framework, addressing the spatial consistency of labels in a scene with an energy function that encourages spatial consistency between neighboring pixels while preserving discontinuities along image intensity edges. (iii) We incorporate relative ordering information about the labels in our energy function. (iv) We show that our ordering constraints can also be used in other applications such as object part segmentation. (v) We also show how the proposed order-preserving moves can be used for certain simple shape priors in graph-cut segmentation.
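
    The energy described in this abstract can be illustrated with a minimal sketch. It only evaluates the energy of a given labeling; it does not implement α-expansion or the order-preserving moves. Only the two ordering rules quoted in the abstract are encoded. Everything else is an assumption made for illustration and is not taken from the thesis: the function names, the `data_cost` callback, the 4-connected grid, the Potts-style smoothness term, and the contrast weighting with parameters `lam` and `sigma`.

```python
import math


def violates_ordering(first, second, direction):
    """Check the two ordering rules quoted in the abstract.

    direction == "horizontal": `first` is immediately left of `second`.
    direction == "vertical":   `first` is immediately above `second`.
    """
    if direction == "horizontal":
        # "left" may not lie to the right of "right".
        return first == "right" and second == "left"
    if direction == "vertical":
        # "bottom" may not lie above "top".
        return first == "bottom" and second == "top"
    raise ValueError(f"unknown direction: {direction!r}")


def pairwise_cost(first, second, i_first, i_second, direction, lam=1.0, sigma=10.0):
    """Smoothness cost for one neighbor pair (assumed form, not the thesis's)."""
    if violates_ordering(first, second, direction):
        return math.inf  # forbidden configuration
    if first == second:
        return 0.0  # no penalty for consistent labels
    # Label changes are cheaper across strong intensity edges.
    return lam * math.exp(-((i_first - i_second) ** 2) / (2.0 * sigma ** 2))


def labeling_energy(labels, intensities, data_cost, lam=1.0, sigma=10.0):
    """Sum of per-pixel data costs and pairwise costs on a 4-connected grid."""
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            total += data_cost(r, c, labels[r][c])
            if c + 1 < cols:
                total += pairwise_cost(labels[r][c], labels[r][c + 1],
                                       intensities[r][c], intensities[r][c + 1],
                                       "horizontal", lam, sigma)
            if r + 1 < rows:
                total += pairwise_cost(labels[r][c], labels[r + 1][c],
                                       intensities[r][c], intensities[r + 1][c],
                                       "vertical", lam, sigma)
    return total
```

    With a zero data cost, `labeling_energy([["right", "left"]], [[0, 0]], lambda r, c, label: 0.0)` is infinite because the pair breaks the horizontal ordering rule, whereas the reversed labeling has a finite energy.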

    Myopia control with multifocal hydrophilic contact lenses, convergence insufficiency, AMD with optic neuropathy

    This report was written following a 6-month internship at the Clinical and Experimental Center for Vision Science (CCECV) of the University of Beira Interior (UBI). It presents three clinical cases selected from the more than 50 consultations I conducted under the supervision of optometry specialist Eduardo Teixeira, chosen because I believe they offer the most to current and future optometrists.
The first clinical case describes the fitting of hydrophilic contact lenses with a design specific to myopia control on a 13-year-old patient whose refractive error had increased over the previous year. It is intended to call attention to the need for a more active and preventive approach to the monitoring of young myopes. The second clinical case concerns an 11-year-old girl who, after screening at her school under the Ver+ Project, was referred to the CCECV of UBI for an optometry consultation. After the necessary optometric exams, convergence insufficiency was diagnosed. To correct this binocular vision problem, visual training was used. This case reinforces the importance of optometry in resolving convergence insufficiency using evidence-based techniques. The third case is a 61-year-old, professionally active man who came to the CCECV for an optometry consultation with complaints related to near-vision work, photophobia, and floaters. Observation of the ocular fundus, optical coherence tomography, and computerized static perimetry led to the suspicion of age-related macular degeneration (AMD) with binocular optic neuropathy. This case is intended to highlight the importance of new diagnostic tools as a fundamental component of screening for eye diseases at a stage where it is still possible to treat them and limit functional damage.

    IMAGE MANAGEMENT USING PATTERN RECOGNITION SYSTEMS

    With the widespread use of personal imaging devices and the continued increase in computing power, casual users need to handle large numbers of images on computers. Image management is challenging because, in addition to searching and browsing textual metadata, we need to address two further challenges. First, thumbnails, which are representative forms of original images, require significant screen space to be represented meaningfully. Second, while image metadata is crucial for managing images, creating metadata for images is expensive. My research comprises three components that address these problems. First, I explore a new way of browsing a large number of images. I redesign and implement a zoomable image browser, PhotoMesa, which is capable of showing thousands of images clustered by metadata. Combined with its simple navigation strategy, the zoomable image environment allows users to scale up the size of an image collection they can comfortably browse. Second, I examine tradeoffs of displaying thumbnails in limited screen space. While bigger thumbnails use more screen space, smaller thumbnails are hard to recognize. I introduce an automatic thumbnail cropping algorithm based on a computer vision saliency model. The cropped thumbnails keep the core informative part and remove the less informative periphery. My user study shows that users performed visual searches more than 18% faster with cropped thumbnails. Finally, I explore semi-automatic annotation techniques to help users make accurate annotations with low effort. Automatic metadata extraction is typically fast but inaccurate, while manual annotation is slow but accurate. I investigate techniques to combine these two approaches. My semi-automatic annotation prototype, SAPHARI, generates image clusters which facilitate efficient bulk annotation. For automatic clustering, I present hierarchical event clustering and clothing-based human recognition.
Experimental results demonstrate the effectiveness of the semi-automatic annotation when applied to personal photo collections. Users were able to annotate 49% and 6% faster with the semi-automatic annotation interface on event and face tasks, respectively.
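
    The idea of saliency-based thumbnail cropping mentioned in this abstract can be sketched as follows. This is not the algorithm from the thesis, whose details the abstract does not give. It assumes a precomputed, non-negative saliency map stored as a list of rows, and it uses a simple stand-in rule: keep the bounding box of all pixels whose saliency reaches a fraction of the peak value. The names `crop_box`, `crop`, and `rel_threshold` are hypothetical.

```python
def crop_box(saliency, rel_threshold=0.5):
    """Return (top, left, bottom, right) around the salient pixels.

    `bottom` and `right` are exclusive. A pixel counts as salient when its
    value is at least `rel_threshold` times the peak saliency (assumed rule).
    """
    peak = max(max(row) for row in saliency)
    cutoff = rel_threshold * peak
    width = len(saliency[0])
    # Rows and columns that contain at least one salient pixel.
    rows = [r for r, row in enumerate(saliency) if any(v >= cutoff for v in row)]
    cols = [c for c in range(width) if any(row[c] >= cutoff for row in saliency)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(image, box):
    """Cut the boxed region out of an image stored as a list of rows."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


saliency_map = [
    [0, 0, 0, 0],
    [0, 5, 9, 0],
    [0, 4, 1, 0],
    [0, 0, 0, 0],
]
box = crop_box(saliency_map, rel_threshold=0.5)
thumbnail = crop(saliency_map, box)
```

    Lowering `rel_threshold` keeps more of the periphery, which mirrors the tradeoff the abstract describes between recognizability and screen space.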