Practical classification of different moving targets using automotive radar and deep neural networks
In this work, the authors present results for classification of different classes of targets (car, single and multiple people, bicycle) using automotive radar data and different neural networks. A fast implementation of radar algorithms for detection, tracking, and micro-Doppler extraction is proposed in conjunction with the automotive radar transceiver TEF810X and microcontroller unit SR32R274 manufactured by NXP Semiconductors. Three different types of neural networks are considered, namely a classic convolutional network, a residual network, and a combination of convolutional and recurrent networks, for different classification problems across the four classes of targets recorded. Considerable accuracy (close to 100% in some cases) and low latency of the radar pre-processing prior to classification (∼0.55 s to produce a 0.5 s long spectrogram) are demonstrated in this study, and possible shortcomings and outstanding issues are discussed.
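The pre-processing chain described in the abstract ends with a micro-Doppler spectrogram that is fed to the networks. As a hedged illustration of that last step, here is a minimal numpy sketch of an STFT magnitude spectrogram computed over a 0.5 s complex baseband radar return; the signal model, window length, and hop size are illustrative assumptions, not the actual TEF810X/SR32R274 implementation.

```python
import numpy as np

def micro_doppler_spectrogram(iq, win_len=128, hop=32):
    """STFT magnitude spectrogram (in dB) of a complex baseband radar return.

    Returns an array of shape (doppler bins, time frames).
    """
    win = np.hanning(win_len)
    frames = []
    for start in range(0, len(iq) - win_len + 1, hop):
        seg = iq[start:start + win_len] * win
        spec = np.fft.fftshift(np.fft.fft(seg))      # centre zero Doppler
        frames.append(20 * np.log10(np.abs(spec) + 1e-12))
    return np.array(frames).T

# Simulated 0.5 s return (illustrative): a constant bulk Doppler shift plus a
# sinusoidal micro-Doppler modulation, e.g. from a pedestrian's limb motion.
fs = 4000.0                                          # assumed sample rate, Hz
t = np.arange(int(0.5 * fs)) / fs
bulk = np.exp(2j * np.pi * 400 * t)
micro = np.exp(2j * np.pi * 60 * np.sin(2 * np.pi * 2 * t))
spec = micro_doppler_spectrogram(bulk * micro)
print(spec.shape)  # (128 Doppler bins, 59 time frames)
```

A 2D array like `spec` is exactly the kind of input a convolutional network consumes directly, while the convolutional-plus-recurrent variant mentioned in the abstract would instead treat its time frames as a sequence.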
Humans and deep networks largely agree on which kinds of variation make object recognition harder
View-invariant object recognition is a challenging problem, which has
attracted much attention among the psychology, neuroscience, and computer
vision communities. Humans are remarkably good at it, even though some variations
are presumably harder to handle than others (e.g. 3D rotations). Humans
are thought to solve the problem through hierarchical processing along the
ventral stream, which progressively extracts more and more invariant visual
features. This feed-forward architecture has inspired a new generation of
bio-inspired computer vision systems called deep convolutional neural networks
(DCNN), which are currently the best algorithms for object recognition in
natural images. Here, for the first time, we systematically compared human
feed-forward vision and DCNNs at view-invariant object recognition using the
same images and controlling for both the kinds of transformation as well as
their magnitude. We used four object categories and images were rendered from
3D computer models. In total, 89 human subjects participated in 10 experiments
in which they had to discriminate between two or four categories after rapid
presentation with backward masking. We also tested two recent DCNNs on the same
tasks. We found that humans and DCNNs largely agreed on the relative
difficulties of each kind of variation: rotation in depth is by far the hardest
transformation to handle, followed by scale, then rotation in plane, and
finally position. This suggests that humans recognize objects mainly through 2D
template matching, rather than by constructing 3D object models, and that DCNNs
are not too unreasonable models of human feed-forward vision. Also, our results
show that the variation levels in rotation in depth and scale strongly modulate
both humans' and DCNNs' recognition performances. We thus argue that these
variations should be controlled in the image datasets used in vision research
…
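For the 2D cases the study controls (position, in-plane rotation, scale), generating stimuli at a chosen variation magnitude amounts to resampling an image under a parameterised affine transform; rotation in depth, the hardest case, instead requires the 3D rendering the authors used. The sketch below is a minimal numpy illustration of the 2D transforms only; the function name and parameters are illustrative, not from the paper.

```python
import numpy as np

def transform_image(img, angle_deg=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Resample `img` under an in-plane rotation, isotropic scaling, and
    translation about the image centre (nearest-neighbour, zeros outside)."""
    h, w = img.shape
    theta = np.deg2rad(angle_deg)
    # Inverse map: for each output pixel, find the source pixel it came from.
    inv = np.array([[np.cos(theta),  np.sin(theta)],
                    [-np.sin(theta), np.cos(theta)]]) / scale
    centre = np.array([(h - 1) / 2.0, (w - 1) / 2.0])
    rows, cols = np.indices((h, w))
    coords = np.stack([rows.ravel(), cols.ravel()]).astype(float)
    src = inv @ (coords - (centre + np.asarray(shift, float))[:, None]) \
        + centre[:, None]
    src = np.rint(src).astype(int)
    ok = (src[0] >= 0) & (src[0] < h) & (src[1] >= 0) & (src[1] < w)
    out = np.zeros_like(img)
    out[rows.ravel()[ok], cols.ravel()[ok]] = img[src[0, ok], src[1, ok]]
    return out

img = np.arange(9.0).reshape(3, 3)
identity = transform_image(img)              # no variation: image unchanged
rotated = transform_image(img, angle_deg=90) # one in-plane rotation level
```

Sweeping `angle_deg`, `scale`, or `shift` over a fixed grid is one simple way to hold the *kind* of transformation constant while varying its *magnitude*, which is the control the abstract argues image datasets in vision research should provide.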