High-capacity preconscious processing in concurrent groupings of colored dots.
Grouping is a perceptual process in which a subset of stimulus components (a group) is selected for a subsequent, typically implicit, perceptual computation. Grouping is a critical precursor to segmenting objects from the background and ultimately to object recognition. Here, we study grouping by color. We present subjects with 300-ms exposures of 12 dots sharing a single, unknown color interspersed among 14 dots of seven different colors. To indicate grouping, subjects point-click the remembered centroid ("center of gravity") of the set of homogeneous dots, of heterogeneous dots, or of all dots. Subjects accurately judge all of these centroids. Furthermore, after a single stimulus exposure, subjects can judge both the heterogeneous and homogeneous centroids; that is, subjects simultaneously group by similarity and by dissimilarity. The centroid paradigm reveals the relative weight of each dot among targets and distractors to the underlying grouping process, offering a more detailed, quantitative description of grouping than was previously possible. A change detection experiment reveals that conscious memory contains fewer than two dots and their locations, whereas an ideal detector would have to perfectly process at least 15 of 26 dots to match the subjects' centroid judgments, indicating an extraordinary capacity for preconscious grouping. A different color set yielded identical results. Grouping theories that rely on predefined feature maps would fail to explain these results. Rather, the results indicate that preconscious grouping is automatic, flexible, and rapid, and a far more complex process than previously believed.
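As an illustration only (not taken from the paper), the three centroids that subjects are asked to judge can be computed directly from the dot positions. The sketch below assumes each dot is an `(x, y, color)` tuple, that the homogeneous set is simply the most frequent color, and that at least one dot has a different color; the function names are hypothetical.

```python
from collections import Counter


def centroid(points):
    """Return the mean (x, y) position of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def group_centroids(dots):
    """Compute centroids of the homogeneous dots, the heterogeneous dots, and all dots.

    `dots` is a list of (x, y, color) tuples. Assumption: the homogeneous
    set is the most frequent color, as in a display with 12 same-colored
    dots among 14 dots of seven other colors.
    """
    counts = Counter(color for _, _, color in dots)
    target_color, _ = counts.most_common(1)[0]
    homogeneous = [(x, y) for x, y, c in dots if c == target_color]
    heterogeneous = [(x, y) for x, y, c in dots if c != target_color]
    return {
        "homogeneous": centroid(homogeneous),
        "heterogeneous": centroid(heterogeneous),
        "all": centroid([(x, y) for x, y, _ in dots]),
    }


# Toy display: three red dots (the homogeneous set) and two distractors.
dots = [(0, 0, "red"), (2, 0, "red"), (1, 3, "red"), (4, 4, "blue"), (6, 2, "green")]
print(group_centroids(dots))
```

Comparing a subject's click against these three reference points is one simple way to see which set of dots the response tracks; the paper's weighting analysis is more detailed than this sketch.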
Attention in hierarchical models of object recognition
Object recognition and visual attention are tightly linked processes in human perception. Over the last three decades, many models have been suggested to explain these two processes and their interactions, and in some cases these models appear to contradict each other. We suggest a unifying framework for object recognition and attention and review the existing modeling literature in this context. Furthermore, we demonstrate a proof-of-concept implementation for sharing complex features between recognition and attention as a mode of top-down attention to particular objects or object categories.
Integration of Bottom-Up and Top-Down Cues for Visual Attention Using Non-Linear Relaxation
Active and selective perception seeks regions of interest in an image in order to reduce the computational complexity associated with time-consuming processes such as object recognition. In this paper we describe a visual attention system that extracts regions of interest by integrating multiple image cues. Bottom-up cues are detected by decomposing the image into a number of feature and conspicuity maps, while a priori knowledge (i.e., models) about objects is used to generate top-down attention cues. Bottom-up and top-down information is combined through a non-linear relaxation process using energy-minimization-like procedures. The functionality of the attention system is expanded by the introduction of an alerting (motion-based) system able to explore and avoid obstacles. Experimental results are reported using cluttered and noisy scenes. Visual attention is the capability of biological visual systems to rapidly detect interesting parts of the visual input.
Spatially Coherent Geometric Class Labeling of Images and Its Applications
Automatic scene analysis is an active research area and is useful in many applications such as robotics and automation, industrial manufacturing, architectural design and multimedia. 3D structural information is one of the most important cues for scene analysis. In this thesis, we present a geometric labeling method to automatically extract rough 3D information from a single 2D image. Our method partitions an image scene into five geometric regions by labeling every image pixel as one of five geometric classes (namely, “bottom”, “left”, “center”, “right”, and “top”). We formulate the geometric labeling problem as an energy minimization problem and optimize the energy with a graph-cut-based algorithm. In our energy function, we address the spatial consistency of the geometric labels in the scene while preserving discontinuities along image intensity edges. We also incorporate ordering constraints in our energy function. Ordering constraints specify the possible relative positional labels for neighboring pixels. For example, a pixel labeled “left” cannot be to the right of a pixel labeled “right”, and a pixel labeled “bottom” cannot be above a pixel labeled “top”. Ordering constraints arise naturally in a real scene. We observed that when ordering constraints are used, the commonly used graph-cut-based α-expansion is more likely to get stuck in local minima. To overcome this, we developed new graph-cut moves which we call order-preserving moves. Unlike α-expansion, which works on two labels in each move, order-preserving moves act on all labels. Although the global minimum is still not guaranteed, we show that optimization with order-preserving moves performs significantly better than α-expansion. Experimental results show that it is possible to significantly increase the percentage of reasonably good labelings by promoting spatial consistency and incorporating ordering constraints.
It is also shown that order-preserving moves perform significantly better than the commonly used α-expansion when ordering constraints are used: there is a significant improvement in computational efficiency and optimality, while the improvement in pixel labeling accuracy is modest. We also demonstrate the usefulness of the extracted 3D structure information of a scene in applications such as novel view generation, virtual scene walk-through, semantic segmentation, scene synthesis, and scene text extraction. We also show how these order-preserving moves can be applied to certain simple shape priors in graph-cut segmentation. Our geometric labeling method has the following main contributions: (i) We develop a new class of graph-cut moves called order-preserving moves, which perform significantly better than α-expansion when ordering constraints are used. (ii) We formulate the problem in a global optimization framework where we address the spatial consistency of labels in a scene by formulating an energy function which encourages spatial consistency between neighboring pixels while preserving discontinuities along image intensity edges. (iii) We incorporate relative ordering information about the labels in our energy function. (iv) We show that our ordering constraints can also be used in other applications such as object part segmentation. (v) We also show how the proposed order-preserving moves can be used for certain simple shape priors in graph-cut segmentation.
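To make the energy formulation above more concrete, here is a minimal sketch that only evaluates such an energy for a given labeling; it does not implement graph cuts or the order-preserving moves. Everything specific is an assumption for illustration: the contrast-sensitive Potts weight `lam * exp(-(Ip - Iq)^2 / (2 * sigma^2))`, the parameter names, the caller-supplied `data_cost` function, and the fact that only the two ordering examples quoted in the abstract are encoded as forbidden (infinite-cost) pairs.

```python
import math

# The five geometric classes named in the abstract.
LABELS = ("bottom", "left", "center", "right", "top")


def pairwise_cost(lp, lq, direction, ip, iq, lam=1.0, sigma=10.0):
    """Cost for neighbors p and q with labels lp, lq and intensities ip, iq.

    direction="h": q is immediately to the right of p.
    direction="v": q is immediately below p.
    """
    # Ordering constraints (only the two examples given in the abstract):
    # "left" may not lie to the right of "right" ...
    if direction == "h" and lp == "right" and lq == "left":
        return math.inf
    # ... and "bottom" may not lie above "top".
    if direction == "v" and lp == "bottom" and lq == "top":
        return math.inf
    if lp == lq:
        return 0.0
    # Assumed contrast-sensitive Potts term: label changes are cheaper
    # across strong intensity edges, which preserves discontinuities.
    return lam * math.exp(-((ip - iq) ** 2) / (2.0 * sigma ** 2))


def energy(labels, image, data_cost, lam=1.0, sigma=10.0):
    """Total energy of a labeling on a 4-connected pixel grid."""
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            total += data_cost(r, c, labels[r][c])
            if c + 1 < cols:  # right neighbor
                total += pairwise_cost(labels[r][c], labels[r][c + 1], "h",
                                       image[r][c], image[r][c + 1], lam, sigma)
            if r + 1 < rows:  # neighbor below
                total += pairwise_cost(labels[r][c], labels[r + 1][c], "v",
                                       image[r][c], image[r + 1][c], lam, sigma)
    return total
```

A labeling that violates an ordering constraint receives infinite energy, so any minimizer is pushed toward order-consistent labelings; how the thesis searches this space efficiently is the subject of its order-preserving moves and is not reproduced here.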
Myopia control with multifocal hydrophilic contact lenses, convergence insufficiency, and AMD with optic neuropathy
This report was written after a 6-month internship at the Clinical and Experimental Center for Vision Science (CCECV) of the University of Beira Interior (UBI). Three clinical cases were selected from among the more than 50 consultations I conducted under the supervision of Optometry Specialist Eduardo Teixeira, because I believe they can convey something more to current and future optometrists.
The first clinical case in this report describes the fitting of hydrophilic contact lenses with a specific design for myopia control in a 13-year-old patient whose refractive error had increased over the previous year. It is intended to call attention to the need for a more active and preventive approach to monitoring young myopes.
The second clinical case concerns an 11-year-old girl who, after a screening at her school under the Ver+ Project, was referred to the CCECV of UBI for an optometry consultation. After the necessary optometric exams, convergence insufficiency was diagnosed. Visual training was used to correct this binocular vision problem. The case is intended to reinforce the importance of optometry in resolving convergence insufficiency using evidence-based techniques.
The third case in this report is a 61-year-old, professionally active man who came to the CCECV for an optometry consultation with complaints related to near-vision work, photophobia, and floaters. Observation of the ocular fundus, optical coherence tomography, and computerized static perimetry led to the suspicion of age-related macular degeneration (AMD) with bilateral optic neuropathy. The case is intended to highlight the importance of new diagnostic tools as a fundamental component of screening for eye diseases at a stage where it is still possible to treat them and limit functional damage.
Image Management Using Pattern Recognition Systems
With the widespread use of personal imaging devices and the continued increase in computing power, casual users need to handle large numbers of images on their computers. Image management is challenging because, in addition to searching and browsing textual metadata, we need to address two further challenges. First, thumbnails, which are representative forms of the original images, require significant screen space to be displayed meaningfully. Second, while image metadata is crucial for managing images, creating metadata for images is expensive. My research comprises three components that address these problems.
First, I explore a new way of browsing a large number of images. I redesign and implement a zoomable image browser, PhotoMesa, which is capable of showing thousands of images clustered by metadata. Combined with its simple navigation strategy, the zoomable image environment allows users to scale up the size of an image collection they can comfortably browse.
Second, I examine tradeoffs of displaying thumbnails in limited screen space. While bigger thumbnails use more screen space, smaller thumbnails are hard to recognize. I introduce an automatic thumbnail cropping algorithm based on a computer vision saliency model. The cropped thumbnails keep the core informative part and remove the less informative periphery. My user study shows that users performed visual searches more than 18% faster with cropped thumbnails.
Finally, I explore semi-automatic annotation techniques to help users make accurate annotations with low effort. Automatic metadata extraction is typically fast but inaccurate, while manual annotation is slow but accurate. I investigate techniques to combine these two approaches. My semi-automatic annotation prototype, SAPHARI, generates image clusters which facilitate efficient bulk annotation. For automatic clustering, I present hierarchical event clustering and clothing-based human recognition. Experimental results demonstrate the effectiveness of semi-automatic annotation when applied to personal photo collections. Users were able to annotate 49% and 6% faster with the semi-automatic annotation interface on event and face tasks, respectively.
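The saliency-based thumbnail cropping mentioned in this abstract can be illustrated with a deliberately simplified stand-in; it is not the algorithm from the thesis. The sketch assumes a precomputed, non-negative saliency map given as a list of rows, and crops to the bounding box of all pixels whose saliency reaches a fraction `rel_threshold` of the peak value. The function names and the threshold rule are hypothetical.

```python
def salient_crop_box(saliency, rel_threshold=0.5):
    """Return (top, left, bottom, right), inclusive, bounding the salient pixels.

    A pixel counts as salient when its value is at least `rel_threshold`
    times the maximum of the map (assumed non-negative).
    """
    peak = max(max(row) for row in saliency)
    cutoff = rel_threshold * peak
    rows = [r for r, row in enumerate(saliency) if any(v >= cutoff for v in row)]
    cols = [c for c in range(len(saliency[0]))
            if any(row[c] >= cutoff for row in saliency)]
    return (min(rows), min(cols), max(rows), max(cols))


def crop(image, box):
    """Crop a list-of-rows image to an inclusive (top, left, bottom, right) box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy 4x4 example: the salient region sits in the middle of the image.
saliency = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 1.0, 0.0],
    [0.0, 0.6, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
box = salient_crop_box(saliency)
print(box, crop(image, box))
```

The cropped region keeps the high-saliency core and drops the periphery, which is the trade-off the abstract describes; a real system would also need to handle aspect ratio and minimum thumbnail size.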