838 research outputs found

    Algorithm that mimics human perceptual grouping of dot patterns

    Latent Noise Segmentation: How Neural Noise Leads to the Emergence of Segmentation and Grouping

    Deep Neural Networks (DNNs) that achieve human-level performance in general tasks like object segmentation typically require supervised labels. In contrast, humans are able to perform these tasks effortlessly without supervision. To accomplish this, the human visual system makes use of perceptual grouping. Understanding how perceptual grouping arises in an unsupervised manner is critical for improving both models of the visual system and computer vision models. In this work, we propose a counterintuitive approach to unsupervised perceptual grouping and segmentation: that they arise because of neural noise, rather than in spite of it. We (1) mathematically demonstrate that under realistic assumptions, neural noise can be used to separate objects from each other, and (2) show that adding noise in a DNN enables the network to segment images even though it was never trained on any segmentation labels. Interestingly, we find that (3) segmenting objects using noise results in segmentation performance that aligns with the perceptual grouping phenomena observed in humans. We introduce the Good Gestalt (GG) datasets -- six datasets designed to specifically test perceptual grouping, and show that our DNN models reproduce many important phenomena in human perception, such as illusory contours, closure, continuity, proximity, and occlusion. Finally, we (4) demonstrate the ecological plausibility of the method by analyzing the sensitivity of the DNN to different magnitudes of noise. We find that some model variants consistently succeed with remarkably low levels of neural noise (σ < 0.001), and surprisingly, that segmenting this way requires as few as a handful of samples. Together, our results suggest a novel unsupervised segmentation method requiring few assumptions, a new explanation for the formation of perceptual grouping, and a potential benefit of neural noise in the visual system.

    Quantification of two Gestalt Laws using curve reconstruction

    Visual perception is the ability to interpret, process, and comprehend all the information received through the sense of sight by association with earlier experiences. Researchers have long struggled to explain what visual processing does to create what we actually see, and have proposed many theoretical approaches to explain how human beings see the world. These approaches differ widely, ranging from early theories such as Gestalt theory to recent computational theories in the field of Artificial Intelligence. Human beings tend to classify objects in the ambient environment into different categories described by various symbols or objects. Similar symbols, or even quite dissimilar ones, may be perceived as belonging together or as belonging to different groups according to people's judgment. Human beings must follow certain rules when they set up relationships between those objects and symbols and finally obtain unambiguous perceptual results through the process of visual perception. To find out the mechanisms underlying these properties of visual perception, the present thesis conducts experiments on perception using curve reconstructions as test cases. The perception model developed through the experiments is implemented in a curve reconstruction algorithm. It is assumed that a good perception model will reconstruct curves in the same manner as human beings perceive them. In the present thesis, a series of methods from Design of Experiments (DOE), ANOVA, and the multivariate nonlinear regression model are applied to investigate the relationships between the points and curves. The results show that our perception model conforms to the way humans perceive the points.

    Dynamic and Integrative Properties of the Primary Visual Cortex

    The ability to derive meaning from complex, ambiguous sensory input requires the integration of information over both space and time, as well as cognitive mechanisms to dynamically shape that integration. We have studied these processes in the primary visual cortex (V1), where neurons have been proposed to integrate visual inputs along a geometric pattern known as the association field (AF). We first used cortical reorganization as a model to investigate the role that a specific network of V1 connections, the long-range horizontal connections, might play in temporal and spatial integration across the AF. When retinal lesions ablate sensory information from portions of the visual field, V1 undergoes a process of reorganization mediated by compensatory changes in the network of horizontal collaterals. The reorganization accompanies the brain’s amazing ability to perceptually “fill-in”, or “see”, the lost visual input. We developed a computational model to simulate cortical reorganization and perceptual fill-in mediated by a plexus of horizontal connections that encode the AF. The model reproduces the major features of the perceptual fill-in reported by human subjects with retinal lesions, and it suggests that V1 neurons, empowered by their horizontal connections, underlie both perceptual fill-in and normal integrative mechanisms that are crucial to our visual perception. These results motivated the second prong of our work, which was to experimentally study the normal integration of information in V1. Since psychophysical and physiological studies suggest that spatial interactions in V1 may be under cognitive control, we investigated the integrative properties of V1 neurons under different cognitive states. We performed extracellular recordings from single V1 neurons in macaques that were trained to perform a delayed-match-to-sample contour detection task. We found that the ability of V1 neurons to summate visual inputs from beyond the classical receptive field (cRF) imbues them with selectivity for complex contour shapes, and that neuronal shape selectivity in V1 changed dynamically according to the shapes monkeys were cued to detect. Over the population, V1 encoded subsets of the AF, predicted by the computational model, that shifted as a function of the monkeys’ expectations. These results support the major conclusions of the theoretical work; even more, they reveal a sophisticated mode of form processing, whereby the selectivity of the whole network in V1 is reshaped by cognitive state.

    Camouflage and perceptual organization in the animal kingdom

    The visual perception of human locomotion

    To function adeptly within our environment, we must perceive and interpret the movements of others. What mechanisms underlie our exquisite visual sensitivity to human movement? To address this question, a set of psychophysical studies was conducted to ascertain the temporal characteristics of the visual perception of human locomotion. Subjects viewed a computer-generated point-light walker presented within a mask under conditions of apparent motion. The temporal delay between the display frames as well as the motion characteristics of the mask were varied. With sufficiently long trial durations, performance in a direction discrimination task remained fairly constant across inter-stimulus interval (ISI) when the walker was presented within a random motion mask but increased with ISI when the mask motion duplicated the motion of the walker. This pattern of results suggests that both low-level and high-level visual analyses are involved in the visual perception of human locomotion. These findings are discussed in relation to recent neurophysiological data suggesting that the visual perception of human movement may involve a functional linkage between the visual and motor systems.

    Human interaction with digital ink : legibility measurement and structural analysis

    Literature suggests that it is possible to design and implement pen-based computer interfaces that resemble the use of pen and paper. These interfaces appear to allow users freedom in expressing ideas and seem to be familiar and easy to use. Different ideas have been put forward concerning this type of interface; however, despite the commonality of aims and problems faced, there does not appear to be a common approach to their design and implementation. This thesis aims to progress the development of pen-based computer interfaces that resemble the use of pen and paper. To do this, a conceptual model is proposed for interfaces that enable interaction with "digital ink". This conceptual model is used to organize and analyse the broad range of literature related to pen-based interfaces, and to identify topics that are not sufficiently addressed by published research. Two issues highlighted by the model, digital ink legibility and digital ink structuring, are then investigated. In the first investigation, methods are devised to objectively and subjectively measure the legibility of handwritten script. These methods are then piloted in experiments that vary the horizontal rendering resolution of handwritten script displayed on a computer screen. Script legibility is shown to decrease with rendering resolution once the resolution drops below a threshold value. In the second investigation, the clustering of digital ink strokes into words is addressed. A method of rating the accuracy of clustering algorithms is proposed: the percentage of words spoiled. The clustering error rate is found to vary among different writers for a clustering algorithm using the geometric features of both ink strokes and the gaps between them. The work contributes a conceptual interface model, methods of measuring digital ink legibility, and techniques for investigating stroke clustering features to the field of digital ink interaction research.

    Impact of object extraction methods on classification performance in surface inspection systems

    In surface inspection applications, the main goal is to detect all areas which might contain defects or unacceptable imperfections, and to classify either every single 'suspicious' region or the investigated part as a whole. After an image is acquired by the machine vision hardware, all pixels that deviate from a pre-defined 'ideal' master image are set to a non-zero value, depending on the magnitude of deviation. This procedure leads to so-called "contrast images", in which accumulations of bright pixels may appear, representing potentially defective areas. In this paper, various methods are presented for grouping these bright pixels together into meaningful objects, ranging from classical image processing techniques to machine-learning-based clustering approaches. One important issue here is to find reasonable groupings even for non-connected and widespread objects. In general, these objects correspond either to real faults or to pseudo-errors that do not affect the surface quality at all. The impact of different extraction methods on the accuracy of image classifiers will be studied. The classifiers are trained with feature vectors calculated for the extracted objects found in images labeled by the user and showing surfaces of production items. In our investigation artificially created contrast images will be considered as well as real ones recorded on-line at a CD imprint production and at an egg inspection system. © Springer-Verlag 2009

    Neural models of inter-cortical networks in the primate visual system for navigation, attention, path perception, and static and kinetic figure-ground perception

    Vision provides the primary means by which many animals distinguish foreground objects from their background and coordinate locomotion through complex environments. The present thesis focuses on mechanisms within the visual system that afford figure-ground segregation and self-motion perception. These processes are modeled as emergent outcomes of dynamical interactions among neural populations in several brain areas. This dissertation specifies and simulates how border-ownership signals emerge in cortex, and how the medial superior temporal area (MSTd) represents path of travel and heading, in the presence of independently moving objects (IMOs). Neurons in visual cortex that signal border-ownership, the perception that a border belongs to a figure and not its background, have been identified but the underlying mechanisms have been unclear. A model is presented that demonstrates that inter-areal interactions across model visual areas V1-V2-V4 afford border-ownership signals similar to those reported in electrophysiology for visual displays containing figures defined by luminance contrast. Competition between model neurons with different receptive field sizes is crucial for reconciling the occlusion of one object by another. The model is extended to determine border-ownership when object borders are kinetically-defined, and to detect the location and size of shapes, despite the curvature of their boundary contours. Navigation in the real world requires humans to travel along curved paths. Many perceptual models have been proposed that focus on heading, which specifies the direction of travel along straight paths, but not on path curvature. In primates, MSTd has been implicated in heading perception. A model of V1, medial temporal area (MT), and MSTd is developed herein that demonstrates how MSTd neurons can simultaneously encode path curvature and heading. Human judgments of heading are accurate in rigid environments, but are biased in the presence of IMOs. The model presented here explains the bias through recurrent connectivity in MSTd and avoids the use of differential motion detectors which, although used in existing models to discount the motion of an IMO relative to its background, are not biologically plausible. Reported modulation of the MSTd population due to attention is explained through competitive dynamics between subpopulations responding to bottom-up and top-down signals.