Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
This work presents a first evaluation of using spatio-temporal receptive
fields from a recently proposed time-causal spatio-temporal scale-space
framework as primitives for video analysis. We propose a new family of video
descriptors based on regional statistics of spatio-temporal receptive field
responses and evaluate this approach on the problem of dynamic texture
recognition. Our approach generalises a previously used method, based on joint
histograms of receptive field responses, from the spatial to the
spatio-temporal domain and from object recognition to dynamic texture
recognition. The time-recursive formulation enables computationally efficient
time-causal recognition. The experimental evaluation demonstrates performance
competitive with the state of the art. In particular, it is shown that binary
versions of our dynamic texture descriptors achieve improved performance
compared to a large range of similar methods using different primitives either
handcrafted or learned from data. Further, our qualitative and quantitative
investigation into parameter choices and the use of different sets of receptive
fields highlights the robustness and flexibility of our approach. Together,
these results support the descriptive power of this family of time-causal
spatio-temporal receptive fields, validate our approach for dynamic texture
recognition and point towards the possibility of designing a range of video
analysis methods based on these new time-causal spatio-temporal primitives.
Comment: 29 pages, 16 figures
On the design of an ECOC-compliant genetic algorithm
Genetic Algorithms (GA) have previously been applied to Error-Correcting Output Codes (ECOC) in state-of-the-art works in order to find a suitable coding matrix. Nevertheless, none of the presented techniques directly takes into account the properties of the ECOC matrix. As a result, the considered search space is unnecessarily large. In this paper, a novel genetic strategy to optimize the ECOC coding step is presented. This strategy redefines the usual crossover and mutation operators in order to take into account the theoretical properties of the ECOC framework. Thus, it reduces the search space and lets the algorithm converge faster. In addition, a novel operator that is able to enlarge the code in a smart way is introduced. The methodology is tested on several UCI datasets and four challenging computer vision problems. Furthermore, the analysis of the results in terms of performance, code length and number of Support Vectors (SVs) shows that the optimization process is able to find very efficient codes, in terms of the trade-off between classification performance and the number of classifiers. Finally, the per-dichotomizer classification results show that the proposal is able to obtain similar or even better results while defining a more compact set of dichotomies and SVs compared to state-of-the-art approaches.
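As a rough illustration of what an ECOC coding matrix is and how it is used at prediction time, the sketch below decodes the outputs of a set of binary dichotomizers by minimum Hamming distance. The 4-class, 7-column matrix and the names `CODING_MATRIX`, `hamming_distance` and `decode` are assumptions made for this example; the abstract does not give the paper's matrices, genetic operators or decoding rule.

```python
from typing import List, Sequence

# Hypothetical coding matrix: one row (codeword) per class, one column per
# binary dichotomizer. Every pair of rows differs in 4 positions, so a single
# wrong dichotomizer output can still be corrected.
CODING_MATRIX: List[List[int]] = [
    [+1, +1, +1, +1, +1, +1, +1],  # class 0
    [-1, -1, -1, -1, +1, +1, +1],  # class 1
    [-1, -1, +1, +1, -1, -1, +1],  # class 2
    [-1, +1, -1, +1, -1, +1, -1],  # class 3
]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two codewords disagree."""
    return sum(1 for x, y in zip(a, b) if x != y)


def decode(predictions: Sequence[int],
           matrix: Sequence[Sequence[int]] = CODING_MATRIX) -> int:
    """Return the index of the class whose codeword is closest to the predictions."""
    distances = [hamming_distance(predictions, row) for row in matrix]
    return distances.index(min(distances))


if __name__ == "__main__":
    # Codeword of class 2 with the first dichotomizer output flipped.
    noisy = [+1, -1, +1, +1, -1, -1, +1]
    print(decode(noisy))
```

Each column of the matrix corresponds to one binary classifier to train, which is why a shorter code with the same separation between rows is cheaper to use.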
Fast traffic sign recognition using color segmentation and deep convolutional networks
The use of Computer Vision techniques for the automatic recognition of road
signs is fundamental for the development of intelligent vehicles and advanced
driver assistance systems. In this paper, we describe a procedure based on
color segmentation, Histogram of Oriented Gradients (HOG), and Convolutional
Neural Networks (CNN) for detecting and classifying road signs. Detection is
sped up by a pre-processing step to reduce the search space, while
classification is carried out by using a Deep Learning technique. A
quantitative evaluation of the proposed approach has been conducted on the
well-known German Traffic Sign data set and on the novel Data set of Italian
Traffic Signs (DITS), which is publicly available and contains challenging
sequences captured in adverse weather conditions and in an urban scenario at
night-time. Experimental results demonstrate the effectiveness of the proposed
approach in terms of both classification accuracy and computational speed.
Total Recall: Understanding Traffic Signs using Deep Hierarchical Convolutional Neural Networks
Recognizing Traffic Signs using intelligent systems can drastically reduce
the number of accidents happening world-wide. With the arrival of Self-driving
cars it has become a staple challenge to solve the automatic recognition of
Traffic and Hand-held signs in the major streets. Various machine learning
techniques like Random Forest, SVM as well as deep learning models have been
proposed for classifying traffic signs. Though they reach state-of-the-art
performance on a particular data-set, they fall short of tackling multiple
Traffic Sign Recognition benchmarks. In this paper, we propose a novel and
one-for-all architecture that aces multiple benchmarks with better overall
score than the state-of-the-art architectures. Our model is made of residual
convolutional blocks with hierarchical dilated skip connections joined in
steps. With this we score 99.33% Accuracy in German sign recognition benchmark
and 99.17% Accuracy in Belgian traffic sign classification benchmark. Moreover,
we propose a newly devised dilated residual learning representation technique
which is very low in both memory and computational complexity.
Learning sound representations using trainable COPE feature extractors
Sound analysis research has mainly been focused on speech and music
processing. The deployed methodologies are not suitable for analysis of sounds
with varying background noise, in many cases with very low signal-to-noise
ratio (SNR). In this paper, we present a method for the detection of patterns
of interest in audio signals. We propose novel trainable feature extractors,
which we call COPE (Combination of Peaks of Energy). The structure of a COPE
feature extractor is determined using a single prototype sound pattern in an
automatic configuration process, which is a type of representation learning. We
construct a set of COPE feature extractors, configured on a number of training
patterns. Then we take their responses to build feature vectors that we use in
combination with a classifier to detect and classify patterns of interest in
audio signals. We carried out experiments on four public data sets: MIVIA audio
events, MIVIA road events, ESC-10 and TU Dortmund data sets. The results that
we achieved (recognition rate equal to 91.71% on the MIVIA audio events, 94% on
the MIVIA road events, 81.25% on the ESC-10 and 94.27% on the TU Dortmund)
demonstrate the effectiveness of the proposed method and are higher than the
ones obtained by other existing approaches. The COPE feature extractors have
high robustness to variations of SNR. Real-time performance is achieved even
when the value of a large number of features is computed.
Comment: Accepted for publication in Pattern Recognition
Advanced Driver-Assistance System with Traffic Sign Recognition for Safe and Efficient Driving
Advanced Driver-Assistance Systems (ADAS) coupled with traffic sign recognition could lead to safer driving environments. This study presents a sophisticated, yet robust and accurate traffic sign detection system for ADAS, using computer vision and machine learning (ML). The unavailability of large local traffic sign datasets and the imbalance of the traffic sign distribution are the key bottlenecks of this research. Hence, we choose to work with support vector machines (SVM) on a custom-built unbalanced dataset, to build a lightweight model with excellent classification accuracy. The SVM model delivered its best performance with the radial basis kernel, C=10, and gamma=0.0001. In the proposed method, the same priority was given to processing time (testing time) and accuracy, as traffic sign identification is time critical. The final accuracy obtained was 87% (with confidence interval 84%-90%) with a processing time of 0.64s (with confidence interval of 0.57s-0.67s) for correct detection at testing, which emphasizes the effectiveness of the proposed method.
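To make the reported kernel setting concrete, the sketch below evaluates a radial basis kernel with gamma=0.0001, assuming the common definition K(x, y) = exp(-gamma * ||x - y||^2). The function name `rbf_kernel` and the toy input vectors are assumptions for illustration; the abstract does not describe the study's features or implementation.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 1e-4) -> float:
    """Radial basis kernel under the assumed definition exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    # Toy feature vectors (hypothetical values, not taken from the study).
    u = [10.0, 20.0, 30.0]
    v = [12.0, 18.0, 33.0]
    print(rbf_kernel(u, u))  # identical vectors give similarity 1.0
    print(rbf_kernel(u, v))  # nearby vectors give a value slightly below 1.0
```

With a gamma this small, the kernel decays slowly with distance, so each training sample influences a wide neighbourhood of feature space.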
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication