
    Computing motion in the primate's visual system

    Computing motion on the basis of the time-varying image intensity is a difficult problem for both artificial and biological vision systems. We will show how one well-known gradient-based computer algorithm for estimating visual motion can be implemented within the primate's visual system. This relaxation algorithm computes the optical flow field by minimizing a variational functional of a form commonly encountered in early vision, and proceeds in two steps. In the first stage, local motion is computed, while in the second stage spatial integration occurs. Neurons in the second stage represent the optical flow field via a population-coding scheme, such that the vector sum of all neurons at each location codes for the direction and magnitude of the velocity at that location. The resulting network maps onto the magnocellular pathway of the primate visual system, in particular onto cells in the primary visual cortex (V1) as well as onto cells in the middle temporal area (MT). Our algorithm mimics a number of psychophysical phenomena and illusions (perception of coherent plaids, motion capture, motion coherence) as well as electrophysiological recordings. Thus, a single unifying principle, ‘the final optical flow should be as smooth as possible’ (except at isolated motion discontinuities), explains a large number of phenomena and links single-cell behavior with perception and computational theory.
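
    For reference, the variational functional minimized by such gradient-based relaxation schemes is typically of the Horn-Schunck form (a sketch of the standard functional only; the paper's exact weighting and its handling of motion discontinuities may differ):

        J(u, v) = \iint \Big[ (I_x u + I_y v + I_t)^2 + \lambda \big( \|\nabla u\|^2 + \|\nabla v\|^2 \big) \Big] \, dx \, dy

    where (u, v) is the optical flow field, I_x, I_y, I_t are the spatial and temporal image derivatives, and the smoothness term weighted by \lambda expresses the principle that the final optical flow should be as smooth as possible.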

    Integrated 2-D Optical Flow Sensor

    I present a new focal-plane analog VLSI sensor that estimates optical flow in two visual dimensions. The chip significantly improves on previous approaches, both in the applied model of optical flow estimation and in the actual hardware implementation. Its distributed computational architecture consists of an array of locally connected motion units that collectively solve for the unique optimal optical flow estimate. The novel gradient-based motion model assumes visual motion to be translational, smooth and biased. The model guarantees that the estimation problem is computationally well-posed regardless of the visual input. Model parameters can be globally adjusted, leading to a rich output behavior. Varying the smoothness strength, for example, can provide a continuous spectrum of motion estimates, ranging from normal to global optical flow. Unlike approaches that rely on the explicit matching of brightness edges in space or time, the applied gradient-based model assumes spatiotemporal continuity of the visual information. The non-linear coupling of the individual motion units improves the resulting optical flow estimate because it reduces spatial smoothing across large velocity differences. Extended measurements of a 30x30 array prototype sensor under real-world conditions demonstrate the validity of the model and the robustness and functionality of the implementation.
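
    As an illustration of how the bias assumption makes the gradient-based estimate well-posed, the sketch below solves the regularized least-squares problem for a single (global) flow vector; the variable names and the bias weight sigma are illustrative and not taken from the chip's implementation:

        import numpy as np

        def global_flow_with_bias(Ix, Iy, It, sigma=0.1, v_ref=(0.0, 0.0)):
            # Cost: sum_i (Ix_i*u + Iy_i*v + It_i)^2 + sigma*((u - u_ref)^2 + (v - v_ref)^2).
            # The bias term pulls the estimate toward the reference motion v_ref and makes
            # the normal matrix positive definite, so a unique solution exists for any input.
            A = np.array([[np.sum(Ix * Ix) + sigma, np.sum(Ix * Iy)],
                          [np.sum(Ix * Iy), np.sum(Iy * Iy) + sigma]])
            b = np.array([sigma * v_ref[0] - np.sum(Ix * It),
                          sigma * v_ref[1] - np.sum(Iy * It)])
            return np.linalg.solve(A, b)

    In the full sensor model, neighboring motion units are additionally coupled by a smoothness term; varying the strength of that coupling is what moves the estimate between normal flow and global flow.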

    General highlight detection in sport videos

    Attention is a psychological measurement of human reflection against a stimulus. We propose a general framework for highlight detection that compares attention intensity while viewers watch sports videos. Three steps are involved: adaptive selection of salient features, unified attention estimation, and highlight identification. Adaptive selection computes feature correlation to decide an optimal set of salient features. Unified estimation combines these features by means of multi-resolution autoregressive (MAR) modeling and thus creates a temporal curve of attention intensity. We rank the intensity of attention to discriminate the boundaries of highlights. Such a framework alleviates the semantic uncertainty around sport highlights and leads to efficient and effective highlight detection. The advantages are as follows: (1) the capability of using data at coarse temporal resolutions; (2) robustness against noise caused by modality asynchronism, perception uncertainty and feature mismatch; (3) the employment of Markovian constraints on content presentation; and (4) multi-resolution estimation of attention intensity, which enables the precise allocation of event boundaries.
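
    The final ranking step can be pictured with a short sketch: given the temporal attention-intensity curve produced by the MAR fusion, frames whose attention ranks above a percentile threshold are grouped into candidate highlight segments. The percentile and minimum-length parameters below are illustrative, not values from the paper:

        import numpy as np

        def detect_highlights(attention, percentile=90.0, min_len=5):
            # attention: 1-D array of attention intensity, one value per frame (or shot)
            attention = np.asarray(attention, dtype=float)
            thresh = np.percentile(attention, percentile)
            active = attention >= thresh
            segments, start = [], None
            for i, on in enumerate(active):
                if on and start is None:
                    start = i                          # a candidate highlight begins
                elif not on and start is not None:
                    if i - start >= min_len:
                        segments.append((start, i))    # keep segments long enough to be events
                    start = None
            if start is not None and len(active) - start >= min_len:
                segments.append((start, len(active)))
            return segments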

    Attention and Anticipation in Fast Visual-Inertial Navigation

    We study a Visual-Inertial Navigation (VIN) problem in which a robot needs to estimate its state using an on-board camera and an inertial sensor, without any prior knowledge of the external environment. We consider the case in which the robot can allocate limited resources to VIN, due to tight computational constraints. Therefore, we answer the following question: under limited resources, what are the most relevant visual cues to maximize the performance of visual-inertial navigation? Our approach has four key ingredients. First, it is task-driven, in that the selection of the visual cues is guided by a metric quantifying the VIN performance. Second, it exploits the notion of anticipation, since it uses a simplified model for forward-simulation of robot dynamics, predicting the utility of a set of visual cues over a future time horizon. Third, it is efficient and easy to implement, since it leads to a greedy algorithm for the selection of the most relevant visual cues. Fourth, it provides formal performance guarantees: we leverage submodularity to prove that the greedy selection cannot be far from the optimal (combinatorial) selection. Simulations and real experiments on agile drones show that our approach ensures state-of-the-art VIN performance while maintaining a lean processing time. In easy scenarios, our approach outperforms appearance-based feature selection in terms of localization errors. In the most challenging scenarios, it enables accurate visual-inertial navigation while appearance-based feature selection fails to track the robot's motion during aggressive maneuvers.
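
    A minimal sketch of the greedy selection, assuming the task-driven utility is summarized by the log-determinant of the accumulated information over the prediction horizon (a standard submodular surrogate; the paper's actual VIN performance metric and the forward-simulation of robot dynamics are not reproduced here):

        import numpy as np

        def greedy_select(feature_infos, budget, prior_info=None):
            # feature_infos: one d x d information matrix per candidate visual cue,
            # e.g. obtained by forward-simulating its contribution over the horizon.
            d = feature_infos[0].shape[0]
            acc = np.eye(d) * 1e-6 if prior_info is None else prior_info.copy()
            chosen = []
            for _ in range(budget):
                base = np.linalg.slogdet(acc)[1]
                gains = [(-np.inf if j in chosen
                          else np.linalg.slogdet(acc + M)[1] - base)
                         for j, M in enumerate(feature_infos)]
                best = int(np.argmax(gains))
                chosen.append(best)
                acc = acc + feature_infos[best]
            return chosen

    Because the log-det objective is monotone submodular, the greedy choice attains at least a (1 - 1/e) fraction of the optimal combinatorial selection under a cardinality constraint, which is the style of guarantee the paper leverages.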

    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Simultaneous Localization and Mapping (SLAM) consists in the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial for those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? And is SLAM solved?
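
    The de-facto standard formulation referred to above is maximum-a-posteriori estimation over a factor graph, which under Gaussian noise reduces to a nonlinear least-squares problem:

        \mathcal{X}^{\star} = \arg\max_{\mathcal{X}} p(\mathcal{X} \mid \mathcal{Z}) = \arg\min_{\mathcal{X}} \sum_{k} \| h_k(\mathcal{X}_k) - z_k \|^{2}_{\Sigma_k}

    where \mathcal{X} collects the robot trajectory and map variables, each measurement z_k depends on a small subset \mathcal{X}_k of the variables through the model h_k, and \|\cdot\|_{\Sigma_k} denotes the Mahalanobis norm induced by the measurement covariance \Sigma_k.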

    Computing optical flow in the primate visual system

    Computing motion on the basis of the time-varying image intensity is a difficult problem for both artificial and biological vision systems. We show how gradient models, a well-known class of motion algorithms, can be implemented within the magnocellular pathway of the primate's visual system. Our cooperative algorithm computes optical flow in two steps. In the first stage, assumed to be located in primary visual cortex, local motion is measured, while spatial integration occurs in the second stage, assumed to be located in the middle temporal area (MT). The final optical flow is extracted in this second stage using population coding, such that the velocity is represented by the vector sum of neurons coding for motion in different directions. Our theory, relating the single-cell to the perceptual level, accounts for a number of psychophysical and electrophysiological observations and illusions.
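
    The population-coding readout in the second stage can be written compactly: the velocity represented at a location is the activity-weighted vector sum over units tuned to different directions. The sketch below covers only this readout; the tuning curves and the cooperative dynamics of the first stage are not modeled:

        import numpy as np

        def decode_velocity(activities, preferred_dirs):
            # activities: firing rates of the direction-tuned (MT-like) units at one location
            # preferred_dirs: unit vectors giving each unit's preferred direction of motion
            activities = np.asarray(activities, dtype=float)
            preferred_dirs = np.asarray(preferred_dirs, dtype=float)
            return (activities[:, None] * preferred_dirs).sum(axis=0)

        # Example: eight units with preferred directions spaced every 45 degrees
        angles = np.deg2rad(np.arange(0, 360, 45))
        dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
        velocity = decode_velocity([5.0, 2.0, 0.5, 0.0, 0.0, 0.0, 0.5, 2.0], dirs)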

    Invariance of visual operations at the level of receptive fields

    Receptive field profiles registered by cell recordings have shown that mammalian vision has developed receptive fields tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time. This article presents a theoretical model by which families of idealized receptive field profiles can be derived mathematically from a small set of basic assumptions that correspond to structural properties of the environment. The article also presents a theory for how basic invariance properties to variations in scale, viewing direction and relative motion can be obtained from the output of such receptive fields, using complementary selection mechanisms that operate over the output of families of receptive fields tuned to different parameters. Thereby, the theory shows how basic invariance properties of a visual system can be obtained already at the level of receptive fields, and we can explain the different shapes of receptive field profiles found in biological vision from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.
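
    As a concrete special case of such idealized families (purely spatial, isotropic receptive fields; the article also derives affine and spatio-temporal generalizations), receptive field responses can be modeled as Gaussian derivatives of the image:

        L(x, y; s) = (g(\cdot, \cdot; s) * f)(x, y), \qquad g(x, y; s) = \frac{1}{2 \pi s} \, e^{-(x^2 + y^2)/(2 s)}, \qquad L_{x^{\alpha} y^{\beta}}(\cdot; s) = \partial_{x^{\alpha} y^{\beta}} L(\cdot; s)

    Scale invariance is then obtained by working with scale-normalized derivatives \partial_{\xi} = s^{1/2} \partial_x and selecting, at each image point, the scale at which the normalized response assumes a maximum over scales.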