174 research outputs found

    Biased Competition in Visual Processing Hierarchies: A Learning Approach Using Multiple Cues

    In this contribution, we present a large-scale hierarchical system for object detection fusing bottom-up (signal-driven) processing results with top-down (model or task-driven) attentional modulation. Specifically, we focus on the question of how the autonomous learning of invariant models can be embedded into a performing system and how such models can be used to define object-specific attentional modulation signals. Our system implements bi-directional data flow in a processing hierarchy. The bottom-up data flow proceeds from a preprocessing level to the hypothesis level where object hypotheses created by exhaustive object detection algorithms are represented in a roughly retinotopic way. A competitive selection mechanism is used to determine the most confident hypotheses, which are used on the system level to train multimodal models that link object identity to invariant hypothesis properties. The top-down data flow originates at the system level, where the trained multimodal models are used to obtain space- and feature-based attentional modulation signals, providing biases for the competitive selection process at the hypothesis level. This results in object-specific hypothesis facilitation/suppression in certain image regions which we show to be applicable to different object detection mechanisms. In order to demonstrate the benefits of this approach, we apply the system to the detection of cars in a variety of challenging traffic videos. Evaluating our approach on a publicly available dataset containing approximately 3,500 annotated video images from more than 1 h of driving, we can show strong increases in performance and generalization when compared to object detection in isolation. Furthermore, we compare our results to a late hypothesis rejection approach, showing that early coupling of top-down and bottom-up information is a favorable approach especially when processing resources are constrained

    Coverage, Continuity and Visual Cortical Architecture

    The primary visual cortex of many mammals contains a continuous representation of visual space, with a roughly repetitive aperiodic map of orientation preferences superimposed. It was recently found that orientation preference maps (OPMs) obey statistical laws which are apparently invariant among species widely separated in eutherian evolution. Here, we examine whether one of the most prominent models for the optimization of cortical maps, the elastic net (EN) model, can reproduce this common design. The EN model generates representations which optimally trade off stimulus space coverage and map continuity. While this model has been used in numerous studies, no analytical results about the precise layout of the predicted OPMs have been obtained so far. We present a mathematical approach to analytically calculate the cortical representations predicted by the EN model for the joint mapping of stimulus position and orientation. We find that in all previously studied regimes, predicted OPM layouts are perfectly periodic. An unbiased search through the EN parameter space identifies a novel regime of aperiodic OPMs with pinwheel densities lower than found in experiments. In an extreme limit, aperiodic OPMs quantitatively resembling experimental observations emerge. Stabilization of these layouts results from strong nonlocal interactions rather than from a coverage-continuity-compromise. Our results demonstrate that optimization models for stimulus representations dominated by nonlocal suppressive interactions are in principle capable of correctly predicting the common OPM design. They question that visual cortical feature representations can be explained by a coverage-continuity-compromise. Comment: 100 pages, including an Appendix, 21 + 7 figures

    Neural organisation of innate behaviour in zebrafish larvae

    Animals’ inner worlds are a hazy imitation of reality, shaped by evolution. Of the infinitude of stimuli that can arise in their natural environment, only a few will bear significance for an animal’s survival and reproductive success. Thus, neural circuits have evolved to extract only these relevant stimuli from the background and connect them to downstream effectors. Sometimes, competing representations of the outside world arise in the brain, and these must be resolved to ensure adaptive behaviour. Through the study of an animal’s behaviour, we can learn about its inner world: which stimuli it cares about; the desires these stimuli engender within it; and how its movements enact and extinguish those desires, allowing new stimuli to emerge that reorchestrate the inner world and refresh the cycle. Here, I present three studies that investigate the emergence of this world in the neural circuits of zebrafish larvae. In the first study, I mapped the behavioural sequences of zebrafish larvae as they pursued and consumed prey. Manipulating their vision with genetic mutants, virtual reality, and lesion studies revealed the dynamic features of stimuli that drive switches in the behaviour. I showed that, by chaining kinematically varied swim types into regular sequences, larvae bring prey to a binocular zone in the near visual field. Here, the fused representation of the stimulus across hemispheres releases stereotyped strike manoeuvres, tuned to the distance to the prey. In the second study, I helped investigate how visual circuits build representations of prey and predator stimuli. Measuring the responses of neurons to visual stimuli revealed how feature selectivity arises from the integration of upstream inputs. Features are unevenly represented across space, matching predicted changes in prey percepts as animals progress through their hunting sequences. 
When neurons tuned to specific features were ablated, I showed that the detection of prey was altered, no longer eliciting the usual hunting responses from animals. In the third study, I contributed to the discovery of a circuit in the brain that coordinates behavioural responses to competing stimuli. When confronted with multiple threats, animals either ignore one and escape from the other, or average their locations and escape in an intermediate direction. I showed that these two strategies are mediated by distinct swim types. Inhibiting specific neurons in the brain reduced directional escapes, but not intermediate ones, revealing a circuit that contributes to a bottom-up attention mechanism. Together, these three studies reveal the organisation of behaviour within neural circuits of the larval zebrafish brain. Finally, I consider the broader networks in the brain that might implement and modulate responses to salient visual stimuli, and how these circuits could serve as a substrate for behavioural evolution

    Modelling individual variations in brain structure and function using multimodal MRI

    Every brain is different. Understanding this variability is crucial for investigating the neural substrate underlying individuals’ unique behaviour and developing personalised diagnosis and treatments. This thesis presents novel computational approaches to study individual variability in brain structure and function using magnetic resonance imaging (MRI) data. It comprises three main chapters, each addressing a specific challenge in the field. In Chapter 3, the thesis proposes a novel Image Quality Transfer (IQT) technique, HQ-augmentation, to accurately localise a Deep Brain Stimulation (DBS) target in low-quality clinical-like data. Leveraging high-quality diffusion MRI datasets from the Human Connectome Project (HCP), the HQ-augmentation approach is robust to corruptions in data quality while preserving the individual anatomical variability of the DBS target. It outperforms existing alternatives and generalises to unseen low-quality diffusion MRI datasets with different acquisition protocols, such as the UK Biobank (UKB) dataset. In Chapter 4, the thesis presents a framework for enhancing prediction accuracy of individual task-fMRI activation profiles using the variability of resting-state fMRI. Assuming resting-state functional modes underlie task-evoked activity, this chapter demonstrates that shape and intensity of individualised task activations can be separately modelled. This chapter introduces the concept of "residualisation" and shows that training on residuals leads to better individualised predictions. The framework’s prediction accuracy, validated on HCP and UKB data, is on par with task-fMRI test-retest reliability, suggesting potential for supplementing traditional task localisers. In Chapter 5, the thesis presents a novel framework for individualised retinotopic mapping using resting-state fMRI, from the primary visual cortex to visual cortex area 4. 
The proposed approach reproduces task-elicited retinotopy and captures individual differences in retinotopic organisation. The proposed framework delineates borders of early visual areas more accurately than group-average parcellation and is effective with both high-field 7T and more common 3T resting-state fMRI data, providing a valuable alternative to resource-intensive retinotopy task-fMRI experiments. Overall, this thesis demonstrates the potential of advanced MRI analysis techniques to study individual variability in brain structure and function, paving the way for improved clinical applications tailored to individual patients and a better understanding of neural mechanisms underlying unique human behaviour

    A topological solution to object segmentation and tracking

    The world is composed of objects, the ground, and the sky. Visual perception of objects requires solving two fundamental challenges: segmenting visual input into discrete units, and tracking identities of these units despite appearance changes due to object deformation, changing perspective, and dynamic occlusion. Current computer vision approaches to segmentation and tracking that approach human performance all require learning, raising the question: can objects be segmented and tracked without learning? Here, we show that the mathematical structure of light rays reflected from environment surfaces yields a natural representation of persistent surfaces, and this surface representation provides a solution to both the segmentation and tracking problems. We describe how to generate this surface representation from continuous visual input, and demonstrate that our approach can segment and invariantly track objects in cluttered synthetic video despite severe appearance changes, without requiring learning. Comment: 21 pages, 6 main figures, 3 supplemental figures, and supplementary material containing mathematical proof

    Learning, Moving, And Predicting With Global Motion Representations

    In order to effectively respond to and influence the world they inhabit, animals and other intelligent agents must understand and predict the state of the world and its dynamics. An agent that can characterize how the world moves is better equipped to engage it. Current methods of motion computation rely on local representations of motion (such as optical flow) or simple, rigid global representations (such as camera motion). These methods are useful, but they are difficult to estimate reliably and limited in their applicability to real-world settings, where agents frequently must reason about complex, highly nonrigid motion over long time horizons. In this dissertation, I present methods developed with the goal of building more flexible and powerful notions of motion needed by agents facing the challenges of a dynamic, nonrigid world. This work is organized around a view of motion as a global phenomenon that is not adequately addressed by local or low-level descriptions, but that is best understood when analyzed at the level of whole images and scenes. I develop methods to: (i) robustly estimate camera motion from noisy optical flow estimates by exploiting the global, statistical relationship between the optical flow field and camera motion under projective geometry; (ii) learn representations of visual motion directly from unlabeled image sequences using learning rules derived from a formulation of image transformation in terms of its group properties; (iii) predict future frames of a video by learning a joint representation of the instantaneous state of the visual world and its motion, using a view of motion as transformations of world state. I situate this work in the broader context of ongoing computational and biological investigations into the problem of estimating motion for intelligent perception and action