End-to-End Learning of Representations for Asynchronous Event-Based Data
Event cameras are vision sensors that record asynchronous streams of
per-pixel brightness changes, referred to as "events". They have appealing
advantages over frame-based cameras for computer vision, including high
temporal resolution, high dynamic range, and no motion blur. Due to the sparse,
non-uniform spatiotemporal layout of the event signal, pattern recognition
algorithms typically aggregate events into a grid-based representation and
subsequently process it by a standard vision pipeline, e.g., Convolutional
Neural Network (CNN). In this work, we introduce a general framework to convert
event streams into grid-based representations through a sequence of
differentiable operations. Our framework comes with two main advantages: (i)
it allows learning the input event representation together with the
task-dedicated network in an end-to-end manner, and (ii) it lays out a taxonomy
that unifies the
majority of extant event representations in the literature and identifies novel
ones. Empirically, we show that our approach to learning the event
representation end-to-end yields an improvement of approximately 12% on optical
flow estimation and object recognition over state-of-the-art methods.
Comment: To appear at ICCV 2019
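As one concrete instance of a grid-based representation, events can be binned into a spatiotemporal voxel grid with bilinear weighting along the time axis. The sketch below (assuming NumPy; the fixed bilinear kernel is one member of the family of differentiable conversions the framework generalizes, not the learned kernel itself) illustrates the idea:

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate events (x, y, t, polarity) into a (num_bins, H, W) voxel
    grid, splitting each event between its two nearest temporal bins with
    bilinear weights. A common fixed-kernel grid representation."""
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2]
    p = events[:, 3]
    # Normalize timestamps to the bin axis [0, num_bins - 1]
    t = (t - t.min()) / max(t.max() - t.min(), 1e-9) * (num_bins - 1)
    t0 = np.floor(t).astype(int)
    t1 = np.clip(t0 + 1, 0, num_bins - 1)
    # Distribute polarity to the two neighboring bins (unbuffered accumulate)
    np.add.at(grid, (t0, y, x), p * (1.0 - (t - t0)))
    np.add.at(grid, (t1, y, x), p * (t - t0))
    return grid
```

Making the temporal kernel a learned function of t, rather than this fixed triangular one, is what turns such a conversion into a trainable front end.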
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
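The working principle summarized above can be illustrated with an idealized event-generation model: a pixel emits an event whenever its log-intensity deviates from a per-pixel reference by a contrast threshold, with the event's sign encoding the direction of change. The sketch below is a simplified simulation (assumed function name and threshold; no noise model or refractory period):

```python
import numpy as np

def generate_events(log_frames, timestamps, threshold=0.2):
    """Idealized event camera: each pixel fires an event (x, y, t, polarity)
    whenever its log-intensity moves +/- `threshold` away from that pixel's
    reference level, which is then updated by one threshold step."""
    ref = log_frames[0].copy()  # per-pixel reference log-intensity
    events = []
    for frame, t in zip(log_frames[1:], timestamps[1:]):
        diff = frame - ref
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((x, y, t, polarity))
            ref[y, x] += polarity * threshold  # move reference toward new value
    return events
```

Because only changing pixels produce output, a static scene generates no events, which is the source of the sparsity and low power consumption discussed in the survey.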
On Neuromechanical Approaches for the Study of Biological Grasp and Manipulation
Biological and robotic grasp and manipulation are undeniably similar at the
level of mechanical task performance. However, their underlying fundamental
biological vs. engineering mechanisms are, by definition, dramatically
different and can even be antithetical. Even our approach to each is
diametrically opposite: inductive science for the study of biological systems
vs. engineering synthesis for the design and construction of robotic systems.
The past 20 years have seen several conceptual advances in both fields and the
quest to unify them. Chief among them is the reluctant recognition that their
underlying fundamental mechanisms may actually share limited common ground,
while exhibiting many fundamental differences. This recognition is particularly
liberating because it allows us to resolve and move beyond multiple paradoxes
and contradictions that arose from the initial reasonable assumption of a large
common ground. Here, we begin by introducing the perspective of neuromechanics,
which emphasizes that real-world behavior emerges from the intimate
interactions among the physical structure of the system, the mechanical
requirements of a task, the feasible neural control actions to produce it, and
the ability of the neuromuscular system to adapt through interactions with the
environment. This allows us to articulate a succinct overview of a few salient
conceptual paradoxes and contradictions regarding under-determined vs.
over-determined mechanics, under- vs. over-actuated control, prescribed vs.
emergent function, learning vs. implementation vs. adaptation, prescriptive vs.
descriptive synergies, and optimal vs. habitual performance. We conclude by
presenting open questions and suggesting directions for future research. We
hope this frank assessment of the state-of-the-art will encourage and guide
these communities to continue to interact and make progress in these important
areas.
Multi-scale Evolutionary Neural Architecture Search for Deep Spiking Neural Networks
Spiking Neural Networks (SNNs) have received considerable attention not only
for their superior energy efficiency with discrete signal processing, but also
for their natural suitability for integrating multi-scale biological
plasticity. However, most SNNs directly adopt the structure of well-established
DNNs, and automatic Neural Architecture Search (NAS) is rarely applied to SNNs.
The neural motif topology, modular regional structure and
global cross-brain region connection of the human brain are the product of
natural evolution and can serve as a perfect reference for designing
brain-inspired SNN architecture. In this paper, we propose a Multi-Scale
Evolutionary Neural Architecture Search (MSE-NAS) for SNN, simultaneously
considering micro-, meso- and macro-scale brain topologies as the evolutionary
search space. MSE-NAS evolves individual neuron operation, self-organized
integration of multiple circuit motifs, and global connectivity across motifs
through a brain-inspired indirect evaluation function, Representational
Dissimilarity Matrices (RDMs). This training-free fitness function could
greatly reduce computational cost and search time, and its task-independent
property enables the searched SNNs to exhibit excellent transferability and
scalability. Extensive experiments demonstrate that the
proposed algorithm achieves state-of-the-art (SOTA) performance with shorter
simulation steps on static datasets (CIFAR10, CIFAR100) and neuromorphic
datasets (CIFAR10-DVS and DVS128-Gesture). The thorough analysis also
illustrates the significant performance improvement and consistent
bio-interpretability deriving from the topological evolution at different
scales and the RDM fitness function.
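A generic RDM computation can be sketched as follows (a minimal NumPy illustration of the standard technique, not the authors' exact implementation): each entry measures the dissimilarity between a network's activation patterns for a pair of stimuli, and comparing RDMs across networks yields a training-free similarity signal:

```python
import numpy as np

def representational_dissimilarity_matrix(activations):
    """RDM from an (n_stimuli, n_features) activation matrix: pairwise
    dissimilarity 1 - Pearson correlation between activation vectors.
    Diagonal is zero (each stimulus is identical to itself)."""
    return 1.0 - np.corrcoef(activations)

def rdm_similarity(rdm_a, rdm_b):
    """Correlate the upper triangles of two RDMs; a high value means the two
    networks organize the stimuli similarly, usable as a fitness proxy."""
    iu = np.triu_indices_from(rdm_a, k=1)
    return np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]
```

In an evolutionary loop, a candidate architecture's RDM (from a forward pass on a fixed stimulus set) would be scored against a reference RDM, avoiding any per-candidate training.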
Automotive Object Detection via Learning Sparse Events by Temporal Dynamics of Spiking Neurons
Event-based sensors, with their high temporal resolution (1 µs) and dynamic
range (120 dB), have the potential to be deployed in high-speed platforms such
as vehicles and drones. However, the highly sparse and fluctuating nature of
events poses challenges for conventional object detection techniques based on
Artificial Neural Networks (ANNs). In contrast, Spiking Neural Networks (SNNs)
are well-suited for representing event-based data due to their inherent
temporal dynamics. In particular, we demonstrate that the membrane potential
dynamics can modulate network activity upon fluctuating events and strengthen
features of sparse input. In addition, the spike-triggered adaptive threshold
can stabilize training which further improves network performance. Based on
this, we develop an efficient spiking feature pyramid network for event-based
object detection. Our proposed SNN outperforms previous SNNs and sophisticated
ANNs with attention mechanisms, achieving a mean average precision (mAP50) of
47.7% on the Gen1 benchmark dataset. This result significantly surpasses the
previous best SNN by 9.7% and demonstrates the potential of SNNs for
event-based vision. Our model has a concise architecture while maintaining high
accuracy and much lower computation cost as a result of sparse computation. Our
code will be publicly available.
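The membrane-potential dynamics and spike-triggered adaptive threshold described above can be sketched with a simple leaky integrate-and-fire neuron (the constants, hard reset, and decay scheme are illustrative assumptions, not the paper's exact formulation):

```python
def lif_adaptive(inputs, tau_mem=0.9, tau_thr=0.95, v_thr0=1.0, beta=0.3):
    """Leaky integrate-and-fire neuron with a spike-triggered adaptive
    threshold: the membrane potential leaks and integrates sparse input;
    each spike hard-resets the potential and raises the threshold, which
    then decays back toward its baseline v_thr0."""
    v, thr = 0.0, v_thr0
    spikes = []
    for x in inputs:
        v = tau_mem * v + x          # leaky integration of input current
        s = 1 if v >= thr else 0
        if s:
            v = 0.0                  # hard reset after a spike
            thr += beta              # spike-triggered threshold increase
        thr = v_thr0 + tau_thr * (thr - v_thr0)  # decay toward baseline
        spikes.append(s)
    return spikes
```

The leak lets sub-threshold traces of sparse events persist across time steps, while the adaptive threshold damps runaway firing under bursty input, the stabilizing effect the abstract attributes to spike-triggered adaptation.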
Towards integrated neural-symbolic systems for human-level AI: Two research programs helping to bridge the gaps
After a human-level-AI-oriented overview of the status quo in neural-symbolic integration, two research programs aiming at overcoming long-standing challenges in the field are suggested to the community. The first program targets a better understanding of the foundational differences and relationships, at the level of computational complexity, between symbolic and subsymbolic computation and representation; this could explain the empirical differences between the paradigms in application scenarios and provide a foothold for subsequent attempts at overcoming them. The second program suggests a new approach and computational architecture for the cognitively inspired anchoring of an agent's learning, knowledge formation, and higher reasoning abilities in real-world interactions through a closed neural-symbolic acting/sensing-processing-reasoning cycle. This could provide new foundations for future agent architectures, multi-agent systems, robotics, and cognitive systems, and facilitate a deeper understanding of development and interaction in human-technological settings.