10,205 research outputs found
Eye in the Sky: Real-time Drone Surveillance System (DSS) for Violent Individuals Identification using ScatterNet Hybrid Deep Learning Network
Drone systems have been deployed by various law enforcement agencies to
monitor hostiles, spy on foreign drug cartels, conduct border control
operations, etc. This paper introduces a real-time drone surveillance system to
identify violent individuals in public areas. The system first uses the Feature
Pyramid Network to detect humans from aerial images. The image region with the
human is used by the proposed ScatterNet Hybrid Deep Learning (SHDL) network
for human pose estimation. The orientations between the limbs of the estimated
pose are next used to identify the violent individuals. The proposed deep
network can learn meaningful representations quickly using ScatterNet and
structural priors with relatively fewer labeled examples. The system detects
the violent individuals in real-time by processing the drone images in the
cloud. This research also introduces the aerial violent individual dataset used
for training the deep network which hopefully may encourage researchers
interested in using deep learning for aerial surveillance. The pose estimation
and violent individuals identification performance is compared with the
state-of-the-art techniques.Comment: To Appear in the Efficient Deep Learning for Computer Vision (ECV)
workshop at IEEE Computer Vision and Pattern Recognition (CVPR) 2018. Youtube
demo at this: https://www.youtube.com/watch?v=zYypJPJipY
Learning how to be robust: Deep polynomial regression
Polynomial regression is a recurrent problem with a large number of
applications. In computer vision it often appears in motion analysis. Whatever
the application, standard methods for regression of polynomial models tend to
deliver biased results when the input data is heavily contaminated by outliers.
Moreover, the problem is even harder when outliers have strong structure.
Departing from problem-tailored heuristics for robust estimation of parametric
models, we explore deep convolutional neural networks. Our work aims to find a
generic approach for training deep regression models without the explicit need
of supervised annotation. We bypass the need for a tailored loss function on
the regression parameters by attaching to our model a differentiable hard-wired
decoder corresponding to the polynomial operation at hand. We demonstrate the
value of our findings by comparing with standard robust regression methods.
Furthermore, we demonstrate how to use such models for a real computer vision
problem, i.e., video stabilization. The qualitative and quantitative
experiments show that neural networks are able to learn robustness for general
polynomial regression, with results that well overpass scores of traditional
robust estimation methods.Comment: 18 pages, conferenc
Learning Human Pose Estimation Features with Convolutional Networks
This paper introduces a new architecture for human pose estimation using a
multi- layer convolutional network architecture and a modified learning
technique that learns low-level features and higher-level weak spatial models.
Unconstrained human pose estimation is one of the hardest problems in computer
vision, and our new architecture and learning schema shows significant
improvement over the current state-of-the-art results. The main contribution of
this paper is showing, for the first time, that a specific variation of deep
learning is able to outperform all existing traditional architectures on this
task. The paper also discusses several lessons learned while researching
alternatives, most notably, that it is possible to learn strong low-level
feature detectors on features that might even just cover a few pixels in the
image. Higher-level spatial models improve somewhat the overall result, but to
a much lesser extent then expected. Many researchers previously argued that the
kinematic structure and top-down information is crucial for this domain, but
with our purely bottom up, and weak spatial model, we could improve other more
complicated architectures that currently produce the best results. This mirrors
what many other researchers, like those in the speech recognition, object
recognition, and other domains have experienced
- …