1,792 research outputs found

    A Neural Model of How the Brain Computes Heading from Optic Flow in Realistic Scenes

    Full text link
    Animals avoid obstacles and approach goals in novel cluttered environments using visual information, notably optic flow, to compute heading, or direction of travel, with respect to objects in the environment. We present a neural model of how heading is computed that describes interactions among neurons in several visual areas of the primate magnocellular pathway, from retina through V1, MT+, and MSTd. The model produces outputs which are qualitatively and quantitatively similar to human heading estimation data in response to complex natural scenes. The model estimates heading to within 1.5° in random dot or photo-realistically rendered scenes and within 3° in video streams from driving in real-world environments. Simulated rotations of less than 1 degree per second do not affect model performance, but faster simulated rotation rates deteriorate performance, as in humans. The model is part of a larger navigational system that identifies and tracks objects while navigating in cluttered environments.National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National-Geospatial Intelligence Agency (NMA201-01-1-2016

    Learning Latent Space Dynamics for Tactile Servoing

    Full text link
    To achieve a dexterous robotic manipulation, we need to endow our robot with tactile feedback capability, i.e. the ability to drive action based on tactile sensing. In this paper, we specifically address the challenge of tactile servoing, i.e. given the current tactile sensing and a target/goal tactile sensing --memorized from a successful task execution in the past-- what is the action that will bring the current tactile sensing to move closer towards the target tactile sensing at the next time step. We develop a data-driven approach to acquire a dynamics model for tactile servoing by learning from demonstration. Moreover, our method represents the tactile sensing information as to lie on a surface --or a 2D manifold-- and perform a manifold learning, making it applicable to any tactile skin geometry. We evaluate our method on a contact point tracking task using a robot equipped with a tactile finger. A video demonstrating our approach can be seen in https://youtu.be/0QK0-Vx7WkIComment: Accepted to be published at the International Conference on Robotics and Automation (ICRA) 2019. The final version for publication at ICRA 2019 is 7 pages (i.e. 6 pages of technical content (including text, figures, tables, acknowledgement, etc.) and 1 page of the Bibliography/References), while this arXiv version is 8 pages (added Appendix and some extra details

    Computational Modeling of Human Dorsal Pathway for Motion Processing

    Get PDF
    Reliable motion estimation in videos is of crucial importance for background iden- tification, object tracking, action recognition, event analysis, self-navigation, etc. Re- constructing the motion field in the 2D image plane is very challenging, due to variations in image quality, scene geometry, lighting condition, and most importantly, camera jit- tering. Traditional optical flow models assume consistent image brightness and smooth motion field, which are violated by unstable illumination and motion discontinuities that are common in real world videos. To recognize observer (or camera) motion robustly in complex, realistic scenarios, we propose a biologically-inspired motion estimation system to overcome issues posed by real world videos. The bottom-up model is inspired from the infrastructure as well as functionalities of human dorsal pathway, and the hierarchical processing stream can be divided into three stages: 1) spatio-temporal processing for local motion, 2) recogni- tion for global motion patterns (camera motion), and 3) preemptive estimation of object motion. To extract effective and meaningful motion features, we apply a series of steer- able, spatio-temporal filters to detect local motion at different speeds and directions, in a way that\u27s selective of motion velocity. The intermediate response maps are cal- ibrated and combined to estimate dense motion fields in local regions, and then, local motions along two orthogonal axes are aggregated for recognizing planar, radial and circular patterns of global motion. We evaluate the model with an extensive, realistic video database that collected by hand with a mobile device (iPad) and the video content varies in scene geometry, lighting condition, view perspective and depth. We achieved high quality result and demonstrated that this bottom-up model is capable of extracting high-level semantic knowledge regarding self motion in realistic scenes. Once the global motion is known, we segment objects from moving backgrounds by compensating for camera motion. For videos captured with non-stationary cam- eras, we consider global motion as a combination of camera motion (background) and object motion (foreground). To estimate foreground motion, we exploit corollary dis- charge mechanism of biological systems and estimate motion preemptively. Since back- ground motions for each pixel are collectively introduced by camera movements, we apply spatial-temporal averaging to estimate the background motion at pixel level, and the initial estimation of foreground motion is derived by comparing global motion and background motion at multiple spatial levels. The real frame signals are compared with those derived by forward predictions, refining estimations for object motion. This mo- tion detection system is applied to detect objects with cluttered, moving backgrounds and is proved to be efficient in locating independently moving, non-rigid regions. The core contribution of this thesis is the invention of a robust motion estimation system for complicated real world videos, with challenges by real sensor noise, complex natural scenes, variations in illumination and depth, and motion discontinuities. The overall system demonstrates biological plausibility and holds great potential for other applications, such as camera motion removal, heading estimation, obstacle avoidance, route planning, and vision-based navigational assistance, etc

    Object Segmentation from Motion Discontinuities and Temporal Occlusions–A Biologically Inspired Model

    Get PDF
    BACKGROUND: Optic flow is an important cue for object detection. Humans are able to perceive objects in a scene using only kinetic boundaries, and can perform the task even when other shape cues are not provided. These kinetic boundaries are characterized by the presence of motion discontinuities in a local neighbourhood. In addition, temporal occlusions appear along the boundaries as the object in front covers the background and the objects that are spatially behind it. METHODOLOGY/PRINCIPAL FINDINGS: From a technical point of view, the detection of motion boundaries for segmentation based on optic flow is a difficult task. This is due to the problem that flow detected along such boundaries is generally not reliable. We propose a model derived from mechanisms found in visual areas V1, MT, and MSTl of human and primate cortex that achieves robust detection along motion boundaries. It includes two separate mechanisms for both the detection of motion discontinuities and of occlusion regions based on how neurons respond to spatial and temporal contrast, respectively. The mechanisms are embedded in a biologically inspired architecture that integrates information of different model components of the visual processing due to feedback connections. In particular, mutual interactions between the detection of motion discontinuities and temporal occlusions allow a considerable improvement of the kinetic boundary detection. CONCLUSIONS/SIGNIFICANCE: A new model is proposed that uses optic flow cues to detect motion discontinuities and object occlusion. We suggest that by combining these results for motion discontinuities and object occlusion, object segmentation within the model can be improved. This idea could also be applied in other models for object segmentation. In addition, we discuss how this model is related to neurophysiological findings. The model was successfully tested both with artificial and real sequences including self and object motion

    Control-Oriented Reduced Order Modeling of Dipteran Flapping Flight

    Get PDF
    Flying insects achieve flight stabilization and control in a manner that requires only small, specialized neural structures to perform the essential components of sensing and feedback, achieving unparalleled levels of robust aerobatic flight on limited computational resources. An engineering mechanism to replicate these control strategies could provide a dramatic increase in the mobility of small scale aerial robotics, but a formal investigation has not yet yielded tools that both quantitatively and intuitively explain flapping wing flight as an "input-output" relationship. This work uses experimental and simulated measurements of insect flight to create reduced order flight dynamics models. The framework presented here creates models that are relevant for the study of control properties. The work begins with automated measurement of insect wing motions in free flight, which are then used to calculate flight forces via an empirically-derived aerodynamics model. When paired with rigid body dynamics and experimentally measured state feedback, both the bare airframe and closed loop systems may be analyzed using frequency domain system identification. Flight dynamics models describing maneuvering about hover and cruise conditions are presented for example fruit flies (Drosophila melanogaster) and blowflies (Calliphorids). The results show that biologically measured feedback paths are appropriate for flight stabilization and sexual dimorphism is only a minor factor in flight dynamics. A method of ranking kinematic control inputs to maximize maneuverability is also presented, showing that the volume of reachable configurations in state space can be dramatically increased due to appropriate choice of kinematic inputs

    A Taxonomy of Deep Convolutional Neural Nets for Computer Vision

    Get PDF
    Traditional architectures for solving computer vision problems and the degree of success they enjoyed have been heavily reliant on hand-crafted features. However, of late, deep learning techniques have offered a compelling alternative -- that of automatically learning problem-specific features. With this new paradigm, every problem in computer vision is now being re-examined from a deep learning perspective. Therefore, it has become important to understand what kind of deep networks are suitable for a given problem. Although general surveys of this fast-moving paradigm (i.e. deep-networks) exist, a survey specific to computer vision is missing. We specifically consider one form of deep networks widely used in computer vision - convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN and then examine the broad variations proposed over time to suit different applications. We hope that our recipe-style survey will serve as a guide, particularly for novice practitioners intending to use deep-learning techniques for computer vision.Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm

    Event-based Vision: A Survey

    Get PDF
    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world

    Biologically Inspired Visual Control of Flying Robots

    Get PDF
    Insects posses an incredible ability to navigate their environment at high speed, despite having small brains and limited visual acuity. Through selective pressure they have evolved computationally efficient means for simultaneously performing navigation tasks and instantaneous control responses. The insect’s main source of information is visual, and through a hierarchy of processes this information is used for perception; at the lowest level are local neurons for detecting image motion and edges, at the higher level are interneurons to spatially integrate the output of previous stages. These higher level processes could be considered as models of the insect's environment, reducing the amount of information to only that which evolution has determined relevant. The scope of this thesis is experimenting with biologically inspired visual control of flying robots through information processing, models of the environment, and flight behaviour. In order to test these ideas I developed a custom quadrotor robot and experimental platform; the 'wasp' system. All algorithms ran on the robot, in real-time or better, and hypotheses were always verified with flight experiments. I developed a new optical flow algorithm that is computationally efficient, and able to be applied in a regular pattern to the image. This technique is used later in my work when considering patterns in the image motion field. Using optical flow in the log-polar coordinate system I developed attitude estimation and time-to-contact algorithms. I find that the log-polar domain is useful for analysing global image motion; and in many ways equivalent to the retinotopic arrange- ment of neurons in the optic lobe of insects, used for the same task. I investigated the role of depth in insect flight using two experiments. In the first experiment, to study how concurrent visual control processes might be combined, I developed a control system using the combined output of two algorithms. The first algorithm was a wide-field optical flow balance strategy and the second an obstacle avoidance strategy which used inertial information to estimate the depth to objects in the environment - objects whose depth was significantly different to their surround- ings. In the second experiment I created an altitude control system which used a model of the environment in the Hough space, and a biologically inspired sampling strategy, to efficiently detect the ground. Both control systems were used to control the flight of a quadrotor in an indoor environment. The methods that insects use to perceive edges and control their flight in response had not been applied to artificial systems before. I developed a quadrotor control system that used the distribution of edges in the environment to regulate the robot height and avoid obstacles. I also developed a model that predicted the distribution of edges in a static scene, and using this prediction was able to estimate the quadrotor altitude
    • 

    corecore