3,302 research outputs found

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those requiring low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
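
The event stream described in this abstract (time, location and sign of per-pixel brightness changes) can be illustrated with a minimal Python sketch. It is not taken from the survey: it assumes an idealized sensor that fires an event whenever the log intensity at a pixel changes by at least a fixed contrast threshold, and it approximates asynchronous operation by comparing discrete frames. The function name `generate_events` and the parameters `threshold` and `eps` are hypothetical.

```python
import numpy as np

def generate_events(frames, timestamps, threshold=0.2, eps=1e-6):
    """Idealized event generation from a stack of intensity frames.

    Assumption: a pixel emits an event when its log intensity differs from
    the value stored at its last event by at least `threshold`.
    Returns a list of (t, x, y, polarity) tuples, polarity in {+1, -1}.
    """
    # Per-pixel reference: log intensity at the time of the last event.
    log_ref = np.log(np.asarray(frames[0], dtype=np.float64) + eps)
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_cur = np.log(np.asarray(frame, dtype=np.float64) + eps)
        diff = log_cur - log_ref
        # Pixels whose brightness change reaches the contrast threshold.
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((t, int(x), int(y), polarity))
            # Reset the reference only at pixels that fired.
            log_ref[y, x] = log_cur[y, x]
    return events

# Usage: one pixel brightens, one darkens, two stay constant.
frames = np.array([[[1.0, 1.0], [1.0, 1.0]],
                   [[2.0, 1.0], [1.0, 0.5]]])
events = generate_events(frames, timestamps=[0.0, 1e-3])
```

Static pixels produce no output at all in this sketch, which is the property that distinguishes an event stream from fixed-rate frames. A real sensor also has noise, latency and refractory effects that are not modeled here.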

    Final report key contents: main results accomplished by the EU-Funded project IM-CLeVeR - Intrinsically Motivated Cumulative Learning Versatile Robots

    This document presents the main scientific and technological achievements of the project IM-CLeVeR. The document is organised as follows: 1. Project executive summary: a brief overview of the project vision, objectives and keywords. 2. Beneficiaries of the project and contacts: list of Teams (partners) of the project, Team Leaders and contacts. 3. Project context and objectives: the vision of the project and its overall objectives. 4. Overview of work performed and main results achieved: a one-page overview of the main results of the project. 5. Overview of main results per partner: a bullet-point list of main results per partner. 6. Main achievements in detail, per partner: a thorough explanation of the main results per partner (including collaborative work), with references to the main publications supporting them.

    Nuclei & Glands Instance Segmentation in Histology Images: A Narrative Review

    Instance segmentation of nuclei and glands in histology images is an important step in the computational pathology workflow for cancer diagnosis, treatment planning and survival analysis. Modern hardware, the recent availability of large-scale, high-quality public datasets and community-organized grand challenges have driven a surge in automated methods focusing on domain-specific challenges, which is pivotal for technological advancement and clinical translation. In this survey, 126 papers presenting AI-based methods for nuclei and glands instance segmentation published in the last five years (2017-2022) are analyzed in depth, and the limitations of current approaches and the open challenges are discussed. Moreover, potential future research directions are presented and the contributions of state-of-the-art methods are summarized. Further, a generalized summary of publicly available datasets and detailed insights into the grand challenges, illustrating the top-performing methods specific to each challenge, are also provided. We intend to give the reader the current state of existing research and pointers to future directions in developing methods that can be used in clinical practice, enabling improved diagnosis, grading, prognosis, and treatment planning of cancer. To the best of our knowledge, no previous work has reviewed instance segmentation in histology images with this focus. Comment: 60 pages, 14 figures

    Bio-inspired electronics for micropower vision processing

    Vision processing is a topic traditionally associated with neurobiology, which is known to encode, process and interpret visual data most effectively. For example, the human retina, an exquisite sheet of neurobiological wetware, is amongst the most powerful and efficient vision processors known to mankind. With improving integrated technologies, this has generated considerable research interest in the microelectronics community in a quest to develop effective, efficient and robust vision processing hardware with real-time capability. This thesis describes the design of a novel biologically-inspired hybrid analogue/digital vision chip, ORASIS, for centroiding, sizing and counting of enclosed objects. This chip is the first two-dimensional silicon retina capable of centroiding and sizing multiple objects in true parallel fashion. Based on a novel distributed architecture, this system achieves ultra-fast and ultra-low power operation in comparison to conventional techniques. Although specifically applied to centroid detection, the generalised architecture in fact presents a new biologically-inspired processing paradigm entitled distributed asynchronous mixed-signal logic processing. This is applicable to vision and sensory processing applications in general that require processing of large numbers of parallel inputs, which normally present a computational bottleneck. Apart from the distributed architecture, the specific centroiding algorithm and the vision chip, other original contributions include: an ultra-low power tunable edge-detection circuit, an adjustable threshold local/global smoothing network and an ON/OFF-adaptive spiking photoreceptor circuit. Finally, a concise yet comprehensive overview of photodiode design methodology is provided for standard CMOS technologies. This aims to form a basic reference from an engineering perspective, bridging theory with measured results. Furthermore, an approximate photodiode expression is presented, aiming to provide vision chip designers with a basic tool for pre-fabrication calculations.

    Visual Attention for Robotic Cognition: A Biologically Inspired Probabilistic Architecture

    The human being, the most magnificent autonomous entity in the universe, frequently takes the decision of 'what to look at' in day-to-day life without even realizing the complexities of the underlying process. When it comes to the design of such an attention system for autonomous robots, this apparently simple task suddenly appears to be an extremely complex one, with highly dynamic interaction among motor skills, knowledge and experience developed throughout a lifetime, the highly connected circuitry of the visual cortex, and super-fast timing. The most fascinating thing about the visual attention system of primates is that the underlying mechanism is not yet precisely known. Different influential theories and hypotheses regarding this mechanism, however, are being proposed in psychology and neuroscience. These theories and hypotheses have encouraged research on synthetic modeling of visual attention in computer vision, computational neuroscience and, very recently, in AI robotics. The major motivation behind the computational modeling of visual attention is two-fold: understanding the mechanism underlying the cognition of primates, and using the principle of focused attention in different real-world applications, e.g. in computer vision, surveillance, and robotics. Accordingly, we observe the rise of two different trends in the computational modeling of visual attention. The first one is mostly focused on developing mathematical models which mimic, as much as possible, the details of the primates' attention system: the structure, the connectivity among visual neurons and different regions of the visual cortex, the flow of information, etc. Such models provide a way to test theories of primates' visual attention with minimal involvement of live subjects. This is a magnificent way to use technological advancement for the understanding of human cognition. The second trend in computational modeling, on the other hand, uses the methodological sophistication of biological processes (like visual attention) to advance technology. These models are mostly concerned with developing a technical system of visual attention which can be used in real-world applications where the principle of focused attention might play a significant role in managing redundant information. This thesis is focused on developing a computational model of visual attention for robotic cognition and, therefore, belongs to the second trend. The design of a visual attention model for robotic systems as a component of their cognition comes with a number of challenges which, generally, do not appear in the traditional computer vision applications of visual attention. The robotic models of visual attention, although heavily inspired by the rich literature of visual attention in computer vision, adopt different measures to cope with these challenges. This thesis proposes a Bayesian model of visual attention designed specifically for robotic systems and, therefore, tackles the challenges involved with robotic visual attention. The operation of the proposed model is guided by the theory of biased competition, a popular theory from cognitive neuroscience describing the mechanism of primates' visual attention. The proposed Bayesian attention model offers a robot-centric approach to visual attention where the head pose of a robot in the 3D world is estimated recursively such that the robot can focus on the most behaviorally relevant stimuli in its environment. The behavioral relevance of an object is determined based on two criteria which are inspired by the postulates of the biased competition hypothesis of visual attention in primates. Accordingly, the proposed model encourages a robot to focus on novel stimuli or on stimuli that are similar to a 'sought for' object, depending on the context. In order to address a number of robot-specific issues of visual attention, the proposed model is further extended to the multi-modal case where speech commands from the human are used to modulate the visual attention behavior of the robot. Because it inherits the properties of Bayesian sensor fusion, the Bayesian model of visual attention naturally accommodates multi-modal information during attention selection. This enables the proposed model to be the core component of an attention-oriented, speech-based human-robot interaction framework. Extensive experiments are performed in the real world to investigate different aspects of the proposed Bayesian visual attention model.

    Deep Learning Methods for 3D Aerial and Satellite Data

    Recent advances in digital electronics have led to an overabundance of observations from electro-optical (EO) imaging sensors spanning high spatial, spectral and temporal resolution. This unprecedented volume, variety, and velocity is overwhelming our capacity to manage and translate that data into actionable information. Although decades of image processing research have taken the human out of the loop for many important tasks, the human analyst is still an irreplaceable link in the image exploitation chain, especially for more complex tasks requiring contextual understanding, memory, discernment, and learning. If knowledge discovery is to keep pace with the growing availability of data, new processing paradigms are needed in order to automate the analysis of earth observation imagery and ease the burden of manual interpretation. To address this gap, this dissertation advances fundamental and applied research in deep learning for aerial and satellite imagery. We show how deep learning, a computational model inspired by the human brain, can be used for (1) tracking, (2) classifying, and (3) modeling from a variety of data sources including full-motion video (FMV), Light Detection and Ranging (LiDAR), and stereo photogrammetry. First, we assess the ability of a bio-inspired tracking method to track small targets using aerial videos. The tracker uses three kinds of saliency maps: appearance, location, and motion. Our approach achieves the best overall performance, including being the only method capable of handling long-term occlusions. Second, we evaluate the classification accuracy of a multi-scale fully convolutional network to label individual points in LiDAR data. Our method uses only the 3D coordinates and corresponding low-dimensional spectral features for each point. Evaluated using the ISPRS 3D Semantic Labeling Contest, our method scored second place with an overall accuracy of 81.6%. Finally, we validate the prediction capability of our neighborhood-aware network to model the bare-earth surface of LiDAR and stereo photogrammetry point clouds. The network bypasses traditionally used ground classifications and seamlessly integrates neighborhood features with point-wise and global features to predict a per-point Digital Terrain Model (DTM). We compare our results with two widely used software packages for DTM extraction, ENVI and LAStools. Together, these efforts have the potential to alleviate the manual burden associated with some of the most challenging and time-consuming geospatial processing tasks, with implications for improving our response to issues of global security, emergency management, and disaster response.

    Active Vision for Scene Understanding

    Visual perception is one of the most important sources of information for both humans and robots. A particular challenge is the acquisition and interpretation of complex unstructured scenes. This work contributes to active vision for humanoid robots. A semantic model of the scene is created, which is extended by successively changing the robot's view in order to explore interaction possibilities of the scene.

    A hierarchical system for a distributed representation of the peripersonal space of a humanoid robot

    Reaching a target object in an unknown and unstructured environment is easily performed by human beings. However, designing a humanoid robot that executes the same task requires the implementation of complex abilities, such as identifying the target in the visual field, estimating its spatial location, and precisely driving the motors of the arm to reach it. While research usually tackles the development of such abilities individually, in this work we integrate a number of computational models into a unified framework, and demonstrate in a humanoid torso the feasibility of an integrated working representation of its peripersonal space. To achieve this goal, we propose a cognitive architecture that connects several models inspired by neural circuits of the visual, frontal and posterior parietal cortices of the brain. The outcome of the integration process is a system that allows the robot to create its internal model and its representation of the surrounding space by interacting with the environment directly, through a mutual adaptation of perception and action. The robot is eventually capable of executing a set of tasks, such as recognizing, gazing at and reaching for target objects, which can work separately or cooperate to support more structured and effective behaviors.