
    StereoSpike: Depth Learning with a Spiking Neural Network

    Depth estimation is an important computer vision task, useful in particular for navigation in autonomous vehicles or for object manipulation in robotics. Here we solved it using an end-to-end neuromorphic approach, combining two event-based cameras and a Spiking Neural Network (SNN) with a slightly modified U-Net-like encoder-decoder architecture, which we named StereoSpike. More specifically, we used the Multi Vehicle Stereo Event Camera Dataset (MVSEC). It provides a depth ground truth, which was used to train StereoSpike in a supervised manner using surrogate gradient descent. We propose a novel readout paradigm to obtain a dense analog prediction -- the depth of each pixel -- from the spikes of the decoder. We demonstrate that this architecture generalizes very well, even better than its non-spiking counterparts, leading to state-of-the-art test accuracy. To the best of our knowledge, this is the first time that such a large-scale regression problem has been solved by a fully spiking network. Finally, we show that low firing rates (<10%) can be obtained via regularization, with a minimal cost in accuracy. This means that StereoSpike could be efficiently implemented on neuromorphic chips, opening the door to low-power, real-time embedded systems.
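
    The abstract above mentions training a spiking network with surrogate gradient descent. The minimal PyTorch sketch below illustrates that general idea -- a Heaviside spike in the forward pass and a smooth surrogate derivative in the backward pass, wrapped in a leaky integrate-and-fire layer. The neuron model, slope constant, and layer interface are illustrative assumptions, not the actual StereoSpike implementation.

    import torch

    class SpikeFn(torch.autograd.Function):
        """Heaviside spike in the forward pass, smooth surrogate gradient in the backward pass."""
        @staticmethod
        def forward(ctx, membrane_potential):
            ctx.save_for_backward(membrane_potential)
            return (membrane_potential > 0).float()

        @staticmethod
        def backward(ctx, grad_output):
            (membrane_potential,) = ctx.saved_tensors
            slope = 10.0  # illustrative surrogate steepness
            surrogate = 1.0 / (1.0 + slope * membrane_potential.abs()) ** 2
            return grad_output * surrogate

    class LIFLayer(torch.nn.Module):
        """Leaky integrate-and-fire dynamics unrolled over time, trainable end to end."""
        def __init__(self, beta=0.9, threshold=1.0):
            super().__init__()
            self.beta, self.threshold = beta, threshold

        def forward(self, inputs):               # inputs: (time, batch, features)
            mem = torch.zeros_like(inputs[0])
            spikes = []
            for x in inputs:                     # iterate over time steps
                mem = self.beta * mem + x        # leaky integration
                s = SpikeFn.apply(mem - self.threshold)
                mem = mem - s * self.threshold   # soft reset after a spike
                spikes.append(s)
            return torch.stack(spikes)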

    RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions

    Depth estimation from monocular images is pivotal for real-world visual perception systems. While current learning-based depth estimation models train and test on meticulously curated data, they often overlook out-of-distribution (OoD) situations. Yet, in practical settings -- especially safety-critical ones like autonomous driving -- common corruptions can arise. Addressing this oversight, we introduce a comprehensive robustness test suite, RoboDepth, encompassing 18 corruptions spanning three categories: i) weather and lighting conditions; ii) sensor failures and movement; and iii) data processing anomalies. We subsequently benchmark 42 depth estimation models across indoor and outdoor scenes to assess their resilience to these corruptions. Our findings underscore that, in the absence of a dedicated robustness evaluation framework, many leading depth estimation models may be susceptible to typical corruptions. We delve into design considerations for crafting more robust depth estimation models, touching upon pre-training, augmentation, modality, model capacity, and learning paradigms. We anticipate our benchmark will establish a foundational platform for advancing robust OoD depth estimation. (Comment: NeurIPS 2023; 45 pages, 25 figures, 13 tables; code at https://github.com/ldkong1205/RoboDept)
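
    As a rough illustration of the evaluation protocol described above (a depth model benchmarked against corruption types at several severity levels), the sketch below averages a standard depth metric over corruptions. The corruption function, severity scale, and metric are placeholder assumptions, not the actual RoboDepth suite.

    import numpy as np

    def gaussian_noise(img, severity):
        """One example corruption: additive Gaussian noise with increasing severity (1-5)."""
        sigma = [0.02, 0.04, 0.08, 0.12, 0.18][severity - 1]
        return np.clip(img + np.random.normal(0.0, sigma, img.shape), 0.0, 1.0)

    def abs_rel(pred, gt):
        """Absolute relative error, a common monocular depth metric."""
        valid = gt > 0
        return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

    def evaluate_robustness(model, images, gts, corruptions, severities=(1, 2, 3, 4, 5)):
        """Average the depth metric over corruption types and severity levels."""
        scores = {}
        for name, corrupt in corruptions.items():
            per_severity = []
            for s in severities:
                errors = [abs_rel(model(corrupt(img, s)), gt) for img, gt in zip(images, gts)]
                per_severity.append(np.mean(errors))
            scores[name] = float(np.mean(per_severity))
        return scores

    # Example usage: scores = evaluate_robustness(model, images, gts, {"gaussian_noise": gaussian_noise})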

    tRNS boosts visual perceptual learning in participants with bilateral macular degeneration

    Perceptual learning (PL) has shown promise in enhancing residual visual functions in patients with age-related macular degeneration (MD); however, it requires prolonged training, and evidence of generalization to untrained visual functions is limited. Recent studies suggest that combining transcranial random noise stimulation (tRNS) with perceptual learning produces faster and larger visual improvements in participants with normal vision. Thus, this approach might hold the key to improving PL effects in MD. To test this, we trained two groups of MD participants on a contrast detection task with (n = 5) or without (n = 7) concomitant occipital tRNS. The training consisted of a lateral masking paradigm in which the participant had to detect a central low-contrast Gabor target. Transfer tasks, including contrast sensitivity, near and far visual acuity, and visual crowding, were measured at pre-, mid-, and post-test. Combining tRNS and perceptual learning led to greater improvements in the trained task, evidenced by a larger increment in contrast sensitivity and reduced inhibition at the shortest target-to-flanker distance. The overall amount of transfer was similar between the two groups. These results suggest that coupling tRNS and perceptual learning has promising potential applications as a clinical rehabilitation strategy to improve vision in MD patients.
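
    The lateral masking paradigm mentioned above uses a low-contrast central Gabor target with higher-contrast flankers at a variable target-to-flanker distance. The sketch below shows how such a stimulus might be generated; all parameters (wavelength, envelope width, contrasts, separation) are illustrative assumptions rather than the values used in the study.

    import numpy as np

    def gabor(size, wavelength, sigma, contrast, phase=0.0):
        """Grayscale Gabor patch in [0, 1] with mean luminance 0.5."""
        half = size // 2
        y, x = np.mgrid[-half:half, -half:half]
        grating = np.cos(2 * np.pi * x / wavelength + phase)
        envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
        return 0.5 + 0.5 * contrast * grating * envelope

    def lateral_masking_display(target_contrast=0.05, flanker_contrast=0.6,
                                separation=3, size=256, wavelength=16, sigma=8, patch=64):
        """Central low-contrast target with two higher-contrast flankers.

        separation is expressed in multiples of the carrier wavelength.
        """
        canvas = np.full((size, size), 0.5)
        placements = [(0, target_contrast),                          # central target
                      (-separation * wavelength, flanker_contrast),  # upper flanker
                      (separation * wavelength, flanker_contrast)]   # lower flanker
        for dy, c in placements:
            g = gabor(patch, wavelength, sigma, c)
            top = size // 2 + dy - patch // 2
            left = size // 2 - patch // 2
            canvas[top:top + patch, left:left + patch] += g - 0.5    # add modulation to background
        return np.clip(canvas, 0.0, 1.0)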

    26th Annual Computational Neuroscience Meeting (CNS*2017): Part 1


    Optical flow estimation from event-based cameras and spiking neural networks

    Event-based cameras are raising interest within the computer vision community. These sensors operate with asynchronous pixels, emitting events, or “spikes”, when the luminance change at a given pixel since the last event surpasses a certain threshold. Thanks to their inherent qualities, such as their low power consumption, low latency, and high dynamic range, they seem particularly tailored to applications with challenging temporal constraints and safety requirements. Event-based sensors are an excellent fit for Spiking Neural Networks (SNNs), since the coupling of an asynchronous sensor with neuromorphic hardware can yield real-time systems with minimal power requirements. In this work, we seek to develop one such system, using both event sensor data from the DSEC dataset and spiking neural networks to estimate optical flow for driving scenarios. We propose a U-Net-like SNN which, after supervised training, is able to make dense optical flow estimations. To do so, we encourage both a minimal norm for the error vector and a minimal angle between the ground-truth and predicted flow, training our model with back-propagation using a surrogate gradient. In addition, the use of 3D convolutions allows us to capture the dynamic nature of the data by increasing the temporal receptive fields. Upsampling after each decoding stage ensures that each decoder's output contributes to the final estimation. Thanks to separable convolutions, we have been able to develop a lightweight model (compared to competitors) that can nonetheless yield reasonably accurate optical flow estimates.
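
    The training objective described above combines a minimal norm for the error vector with a minimal angle between ground-truth and predicted flow. A minimal PyTorch sketch of such a combined loss is given below; the weighting between the two terms is an illustrative assumption, not the one used in the paper.

    import torch

    def flow_loss(pred, gt, angle_weight=1.0, eps=1e-8):
        """pred, gt: (batch, 2, H, W) optical flow fields."""
        # Norm of the per-pixel error vector (average endpoint error).
        epe = torch.norm(pred - gt, dim=1).mean()

        # Angle term: 1 - cosine similarity between predicted and ground-truth vectors.
        dot = (pred * gt).sum(dim=1)
        norms = torch.norm(pred, dim=1) * torch.norm(gt, dim=1)
        angular = (1.0 - dot / (norms + eps)).mean()

        return epe + angle_weight * angular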

    Sub-optimality of the early visual system explained through biologically plausible plasticity

    The early visual cortex is the site of crucial pre-processing for more complex, biologically relevant computations that drive perception and, ultimately, behaviour. This pre-processing is often viewed as an optimisation which enables the most efficient representation of visual input. However, measurements in monkey and cat suggest that receptive fields in the primary visual cortex are often noisy, blobby, and symmetrical, making them sub-optimal for operations such as edge detection. We propose that this sub-optimality occurs because the receptive fields do not emerge through a global minimisation of the generative error, but through locally operating biological mechanisms such as spike-timing-dependent plasticity. Using an orientation discrimination paradigm, we show that, while sub-optimal, such models offer a much better description of biology at multiple levels: single cells, population coding, and perception. Taken together, our results underline the need to carefully consider the distinction between information-theoretic and biological notions of optimality in early sensory populations.
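
    The abstract above points to spike-timing-dependent plasticity (STDP) as a locally operating learning mechanism. The following sketch shows a standard pair-based STDP weight update for reference; the time constants, amplitudes, and weight bounds are illustrative assumptions, not the model used in the paper.

    import numpy as np

    def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                    tau_plus=20.0, tau_minus=20.0, w_min=0.0, w_max=1.0):
        """Pair-based STDP update for one synapse, given pre- and postsynaptic spike times (ms)."""
        dt = t_post - t_pre
        if dt > 0:      # presynaptic spike precedes postsynaptic spike: potentiation
            w += a_plus * np.exp(-dt / tau_plus)
        elif dt < 0:    # postsynaptic spike precedes presynaptic spike: depression
            w -= a_minus * np.exp(dt / tau_minus)
        return float(np.clip(w, w_min, w_max))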