17,810 research outputs found
Classification of Occluded Objects using Fast Recurrent Processing
Recurrent neural networks are powerful tools for handling incomplete data
problems in computer vision, thanks to their significant generative
capabilities. However, the computational demand for these algorithms is too
high to work in real time, without specialized hardware or software solutions.
In this paper, we propose a framework for augmenting recurrent processing
capabilities into a feedforward network without sacrificing much from
computational efficiency. We assume a mixture model and generate samples of the
last hidden layer according to the class decisions of the output layer, modify
the hidden layer activity using the samples, and propagate to lower layers. For
visual occlusion problem, the iterative procedure emulates feedforward-feedback
loop, filling-in the missing hidden layer activity with meaningful
representations. The proposed algorithm is tested on a widely used dataset, and
shown to achieve 2 improvement in classification accuracy for occluded
objects. When compared to Restricted Boltzmann Machines, our algorithm shows
superior performance for occluded object classification.Comment: arXiv admin note: text overlap with arXiv:1409.8576 by other author
Data Imputation through the Identification of Local Anomalies
We introduce a comprehensive and statistical framework in a model free
setting for a complete treatment of localized data corruptions due to severe
noise sources, e.g., an occluder in the case of a visual recording. Within this
framework, we propose i) a novel algorithm to efficiently separate, i.e.,
detect and localize, possible corruptions from a given suspicious data instance
and ii) a Maximum A Posteriori (MAP) estimator to impute the corrupted data. As
a generalization to Euclidean distance, we also propose a novel distance
measure, which is based on the ranked deviations among the data attributes and
empirically shown to be superior in separating the corruptions. Our algorithm
first splits the suspicious instance into parts through a binary partitioning
tree in the space of data attributes and iteratively tests those parts to
detect local anomalies using the nominal statistics extracted from an
uncorrupted (clean) reference data set. Once each part is labeled as anomalous
vs normal, the corresponding binary patterns over this tree that characterize
corruptions are identified and the affected attributes are imputed. Under a
certain conditional independency structure assumed for the binary patterns, we
analytically show that the false alarm rate of the introduced algorithm in
detecting the corruptions is independent of the data and can be directly set
without any parameter tuning. The proposed framework is tested over several
well-known machine learning data sets with synthetically generated corruptions;
and experimentally shown to produce remarkable improvements in terms of
classification purposes with strong corruption separation capabilities. Our
experiments also indicate that the proposed algorithms outperform the typical
approaches and are robust to varying training phase conditions
What is Holding Back Convnets for Detection?
Convolutional neural networks have recently shown excellent results in
general object detection and many other tasks. Albeit very effective, they
involve many user-defined design choices. In this paper we want to better
understand these choices by inspecting two key aspects "what did the network
learn?", and "what can the network learn?". We exploit new annotations
(Pascal3D+), to enable a new empirical analysis of the R-CNN detector. Despite
common belief, our results indicate that existing state-of-the-art convnet
architectures are not invariant to various appearance factors. In fact, all
considered networks have similar weak points which cannot be mitigated by
simply increasing the training data (architectural changes are needed). We show
that overall performance can improve when using image renderings for data
augmentation. We report the best known results on the Pascal3D+ detection and
view-point estimation tasks
Face recognition technologies for evidential evaluation of video traces
Human recognition from video traces is an important task in forensic investigations and evidence evaluations. Compared with other biometric traits, face is one of the most popularly used modalities for human recognition due to the fact that its collection is non-intrusive and requires less cooperation from the subjects. Moreover, face images taken at a long distance can still provide reasonable resolution, while most biometric modalities, such as iris and fingerprint, do not have this merit. In this chapter, we discuss automatic face recognition technologies for evidential evaluations of video traces. We first introduce the general concepts in both forensic and automatic face recognition , then analyse the difficulties in face recognition from videos . We summarise and categorise the approaches for handling different uncontrollable factors in difficult recognition conditions. Finally we discuss some challenges and trends in face recognition research in both forensics and biometrics . Given its merits tested in many deployed systems and great potential in other emerging applications, considerable research and development efforts are expected to be devoted in face recognition in the near future
Real time hand gesture recognition including hand segmentation and tracking
In this paper we present a system that performs automatic gesture recognition. The system consists of two main components: (i) A unified technique for segmentation and tracking of face and hands using a skin detection algorithm along with handling occlusion between skin objects to keep track of the status of the occluded parts. This is realized by combining 3 useful features, namely, color, motion and position. (ii) A static and dynamic gesture recognition system. Static gesture recognition is achieved using a robust hand shape classification, based on PCA subspaces, that is invariant to scale along with small translation and rotation transformations. Combining hand shape classification with position information and using DHMMs allows us to accomplish dynamic gesture recognition
- âŠ