Improving Texture Categorization with Biologically Inspired Filtering
Within the domain of texture classification, a lot of effort has been spent
on local descriptors, leading to many powerful algorithms. However,
preprocessing techniques have received much less attention despite their
substantial potential for improving overall classification performance. We
address this gap by proposing a novel, simple, yet very powerful
biologically inspired filtering (BF) method that simulates the performance of the
human retina. In the proposed approach, given a texture image, we apply a
difference-of-Gaussians (DoG) filter to detect the "edges" and then split the
filtered image into two "maps" along the two sides of its edges. The feature
extraction step is then carried
out on the two "maps" instead of the input image. Our algorithm has several
advantages such as simplicity, robustness to illumination and noise, and
discriminative power. Experimental results on three large texture databases
show that, at an extremely low computational cost, the proposed method
significantly improves the performance of many texture classification systems,
notably in noisy environments. The source code of the proposed algorithm can
be downloaded from https://sites.google.com/site/nsonvu/code.
Comment: 11 pages
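The filtering step described above can be illustrated with a minimal sketch. It assumes that "splitting along the two sides of the edges" means separating the positive and negative parts of the DoG response; the abstract does not spell this out, so that reading, the function names, and the sigma values are all assumptions rather than the authors' implementation.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating border pixels."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(k * row[min(max(x + j - radius, 0), n - 1)]
                for j, k in enumerate(kernel))
            for x in range(n)
        ])
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def dog_two_maps(image, sigma_inner=1.0, sigma_outer=2.0):
    """DoG response split into its positive and negative parts (assumed reading)."""
    narrow = gaussian_blur(image, sigma_inner)
    wide = gaussian_blur(image, sigma_outer)
    dog = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(narrow, wide)]
    positive = [[max(v, 0.0) for v in row] for row in dog]
    negative = [[max(-v, 0.0) for v in row] for row in dog]
    return positive, negative

# Toy input: a vertical step edge between columns 7 and 8.
image = [[0.0] * 8 + [1.0] * 8 for _ in range(8)]
pos_map, neg_map = dog_two_maps(image)
```

On this toy image the positive map responds on the bright side of the edge and the negative map on the dark side; a texture descriptor would then be computed on each map separately.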
Learning Discriminative Stein Kernel for SPD Matrices and Its Applications
The Stein kernel has recently shown promising performance in classifying images
represented by symmetric positive definite (SPD) matrices. It evaluates the
similarity between two SPD matrices through their eigenvalues. In this paper,
we argue that directly using the original eigenvalues may be problematic
because: i) Eigenvalue estimation becomes biased when the number of samples is
inadequate, which may lead to unreliable kernel evaluation; ii) More
importantly, eigenvalues only reflect the property of an individual SPD matrix.
They are not necessarily optimal for computing the Stein kernel when the goal is to
discriminate different sets of SPD matrices. To address the two issues in one
shot, we propose a discriminative Stein kernel, in which an extra parameter
vector is defined to adjust the eigenvalues of the input SPD matrices. The
optimal parameter values are sought by optimizing a proxy of classification
performance. To show the generality of the proposed method, three different
kernel learning criteria that are commonly used in the literature are employed
respectively as a proxy. A comprehensive experimental study is conducted on a
variety of image classification tasks to compare our proposed discriminative
Stein kernel with the original Stein kernel and other commonly used methods for
evaluating the similarity between SPD matrices. The experimental results
demonstrate that the discriminative Stein kernel can attain greater
discrimination and better align with classification tasks by altering the
eigenvalues. As a result, it achieves higher classification performance than the
original Stein kernel and other commonly used methods.
Comment: 13 pages
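As a rough illustration, the sketch below computes the Stein kernel k(X, Y) = exp(-theta * S(X, Y)), with S(X, Y) = log det((X + Y) / 2) - (1/2) log det(XY), for 2x2 SPD matrices, and adjusts the eigenvalues with a parameter vector alpha beforehand. Using alpha as per-eigenvalue exponents is an assumption made here for illustration; the abstract only says that a parameter vector adjusts the eigenvalues. The matrices X and Y and all function names are hypothetical, and alpha is fixed by hand rather than learned.

```python
import math

def eig_sym2(m):
    """Closed-form eigen-decomposition of a symmetric 2x2 matrix.

    Returns eigenvalues in descending order and the matching unit eigenvectors.
    """
    (a, b), (_, c) = m
    angle = 0.5 * math.atan2(2 * b, a - c)
    cs, sn = math.cos(angle), math.sin(angle)
    lam1 = a * cs * cs + 2 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2 * b * cs * sn + c * cs * cs
    return (lam1, lam2), ((cs, sn), (-sn, cs))

def adjust_eigenvalues(m, alpha):
    """Rebuild the matrix with each eigenvalue raised to alpha[i] (assumed form)."""
    lams, vecs = eig_sym2(m)
    mus = [lam ** a for lam, a in zip(lams, alpha)]
    return [[sum(mu * v[i] * v[j] for mu, v in zip(mus, vecs))
             for j in range(2)] for i in range(2)]

def logdet2(m):
    return math.log(m[0][0] * m[1][1] - m[0][1] * m[1][0])

def stein_divergence(x, y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * (log det X + log det Y)."""
    mid = [[(x[i][j] + y[i][j]) / 2 for j in range(2)] for i in range(2)]
    return logdet2(mid) - 0.5 * (logdet2(x) + logdet2(y))

def stein_kernel(x, y, theta=1.0, alpha=(1.0, 1.0)):
    """Stein kernel on eigenvalue-adjusted inputs; alpha = (1, 1) gives the original."""
    xa, ya = adjust_eigenvalues(x, alpha), adjust_eigenvalues(y, alpha)
    return math.exp(-theta * stein_divergence(xa, ya))

# Hypothetical SPD inputs.
X = [[2.0, 0.5], [0.5, 1.0]]
Y = [[1.0, -0.2], [-0.2, 3.0]]
k_original = stein_kernel(X, Y)
k_adjusted = stein_kernel(X, Y, alpha=(0.5, 1.5))
```

In the paper's setting alpha would be chosen by optimizing a kernel learning criterion on labelled data; here it only shows how changing the eigenvalues changes the kernel value.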
Pyramidal Fisher Motion for Multiview Gait Recognition
The goal of this paper is to identify individuals by analyzing their gait.
Instead of using binary silhouettes as input data (as done in many previous
works) we propose and evaluate the use of motion descriptors based on densely
sampled short-term trajectories. We take advantage of state-of-the-art people
detectors to define custom spatial configurations of the descriptors around the
target person, thus obtaining a pyramidal representation of the gait motion.
The local motion features (described by the Divergence-Curl-Shear descriptor)
extracted on the different spatial areas of the person are combined into a
single high-level gait descriptor by using the Fisher Vector encoding. The
proposed approach, coined Pyramidal Fisher Motion, is experimentally validated
on the recent 'AVA Multiview Gait' dataset. The results show that this new
approach is promising for the problem of gait recognition.
Comment: Submitted to International Conference on Pattern Recognition, ICPR,
201
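The Divergence-Curl-Shear descriptor mentioned above is built from first-order kinematic quantities of the optical flow. The sketch below computes only those per-pixel quantities with central differences; it is an illustration under that assumption, not the descriptor itself (which aggregates such values into histograms), and the rotating flow field and function name are hypothetical.

```python
import math

def kinematic_features(u, v):
    """Divergence, curl and shear of a flow field (u, v) at interior pixels.

    u[y][x] and v[y][x] are the horizontal and vertical flow components.
    """
    h, w = len(u), len(u[0])
    div, curl, shear = [], [], []
    for y in range(1, h - 1):
        div_row, curl_row, shear_row = [], [], []
        for x in range(1, w - 1):
            du_dx = (u[y][x + 1] - u[y][x - 1]) / 2
            du_dy = (u[y + 1][x] - u[y - 1][x]) / 2
            dv_dx = (v[y][x + 1] - v[y][x - 1]) / 2
            dv_dy = (v[y + 1][x] - v[y - 1][x]) / 2
            hyp1 = du_dx - dv_dy  # hyperbolic terms used to define shear
            hyp2 = du_dy + dv_dx
            div_row.append(du_dx + dv_dy)
            curl_row.append(dv_dx - du_dy)
            shear_row.append(math.sqrt(hyp1 * hyp1 + hyp2 * hyp2))
        div.append(div_row)
        curl.append(curl_row)
        shear.append(shear_row)
    return div, curl, shear

# Hypothetical flow: a pure rotation, u = -y and v = x.
size = 5
u = [[-float(y) for x in range(size)] for y in range(size)]
v = [[float(x) for x in range(size)] for y in range(size)]
div, curl, shear = kinematic_features(u, v)
```

For a pure rotation the divergence and shear vanish and the curl is constant, which is what makes these quantities useful for separating different kinds of local motion.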
Accuracy Booster: Performance Boosting using Feature Map Re-calibration
Convolutional Neural Networks (CNNs) have been extremely successful in solving
intensive computer vision tasks. The convolutional filters used in CNNs have
played a major role in this success by extracting useful features from the
inputs. Recently, researchers have tried to boost the performance of CNNs by
re-calibrating the feature maps produced by these filters, e.g.,
Squeeze-and-Excitation Networks (SENets). These approaches achieve better
performance by exciting the important channels or feature maps while
diminishing the rest. However, in the process, architectural complexity has
increased. We propose an architectural block that introduces much lower
complexity than the existing methods of CNN performance boosting while
performing significantly better than them. We carry out experiments on the
CIFAR, ImageNet and MS-COCO datasets, and show that the proposed block can
challenge the state-of-the-art results. Our method boosts the ResNet-50
architecture to perform comparably to the ResNet-152 architecture, which is a
three times deeper network, on classification. We also show experimentally that
our method is not limited to classification but also generalizes well to other
tasks such as object detection.
Comment: IEEE Winter Conference on Applications of Computer Vision (WACV),
202
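For context, the feature map re-calibration that the abstract attributes to Squeeze-and-Excitation Networks can be sketched as follows. This shows the SENet-style baseline only, not the block proposed in the paper, whose design the abstract does not describe. The weights are hand-set placeholders (in a real network they are learned), and all names are hypothetical.

```python
import math

def squeeze(feature_maps):
    """Global average pooling: one scalar per channel."""
    return [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in feature_maps]

def dense(vec, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def excite(z, w1, b1, w2, b2):
    """Bottleneck MLP (ReLU then sigmoid) producing one gate per channel."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]

def recalibrate(feature_maps, w1, b1, w2, b2):
    """Scale every channel by its gate."""
    gates = excite(squeeze(feature_maps), w1, b1, w2, b2)
    scaled = [[[g * value for value in row] for row in fm]
              for fm, g in zip(feature_maps, gates)]
    return scaled, gates

# Two 2x2 channels and placeholder weights (2 channels -> 1 hidden unit -> 2 gates).
feature_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
w1, b1 = [[0.5, -0.5]], [0.0]
w2, b2 = [[4.0], [-4.0]], [0.0, 0.0]
scaled, gates = recalibrate(feature_maps, w1, b1, w2, b2)
```

With these placeholder weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the first is emphasized and the second diminished.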
Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks
This work addresses the problem of vehicle identification through
non-overlapping cameras. As our main contribution, we introduce a novel dataset
for vehicle identification, called Vehicle-Rear, that contains more than three
hours of high-resolution videos, with accurate information about the make,
model, color and year of nearly 3,000 vehicles, in addition to the position and
identification of their license plates. To explore our dataset, we design a
two-stream CNN that simultaneously uses two of the most distinctive and
persistent features available: the vehicle's appearance and its license plate.
This is an attempt to tackle a major problem: false alarms caused by vehicles
with similar designs or by very close license plate identifiers. In the first
network stream, shape similarities are identified by a Siamese CNN that uses a
pair of low-resolution vehicle patches recorded by two different cameras. In
the second stream, we use a CNN for OCR to extract textual information,
confidence scores, and string similarities from a pair of high-resolution
license plate patches. Then, features from both streams are merged by a
sequence of fully connected layers for decision. In our experiments, we
compared the two-stream network against several well-known CNN architectures
using single or multiple vehicle features. The architectures, trained models,
and dataset are publicly available at https://github.com/icarofua/vehicle-rear
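One of the inputs to the second stream is a string similarity between the two recognized license plates. The abstract does not say which measure is used, so the sketch below assumes a normalized Levenshtein similarity purely for illustration; the plate strings and function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]

def plate_similarity(a, b):
    """Similarity in [0, 1]; 1.0 means the OCR strings are identical."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))

# Hypothetical OCR outputs from two cameras, differing in one character.
similarity = plate_similarity("ABC1234", "ABC1284")
```

A score like this cannot separate very close plate identifiers on its own, which is the false-alarm case the abstract describes and the reason the network also uses the appearance stream.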
PCA-RECT: An Energy-efficient Object Detection Approach for Event Cameras
We present the first purely event-based, energy-efficient approach for object
detection and categorization using an event camera. Compared to traditional
frame-based cameras, event cameras offer attractive properties such as high
temporal resolution (on the order of microseconds), low power consumption (a few
hundred mW), and wide dynamic range (120 dB). However, event-based object
recognition systems are far behind their frame-based counterparts in terms of
accuracy. To this end, this paper presents an event-based feature extraction
method devised by accumulating local activity across the image frame and then
applying principal component analysis (PCA) to the normalized neighborhood
region. Subsequently, we propose a backtracking-free k-d tree mechanism for
efficient feature matching by taking advantage of the low-dimensionality of the
feature representation. Additionally, the proposed k-d tree mechanism allows
for feature selection to obtain a lower-dimensional dictionary representation
when hardware resources are too limited to implement dimensionality reduction.
Consequently, the proposed system can be realized on a field-programmable gate
array (FPGA) device, leading to a high performance-to-resource ratio. The
proposed system is tested on real-world event-based datasets for object
categorization, showing superior classification performance and relevance to
state-of-the-art algorithms. Additionally, we verified the object detection
method and real-time FPGA performance in lab settings under non-controlled
illumination conditions with limited training data and ground truth
annotations.
Comment: Accepted in ACCV 2018 Workshops, to appear
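The idea of matching features in a k-d tree without backtracking can be sketched as a single root-to-leaf descent that keeps the closest node seen along the way. This is a generic approximate nearest-neighbour sketch, assumed here for illustration; the paper's exact mechanism and its feature selection step may differ, and the dictionary points and function names are hypothetical.

```python
def build_kdtree(points, depth=0):
    """Build a k-d tree by median split, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match_without_backtracking(tree, query):
    """Descend once from root to leaf, keeping the closest node on the path.

    Sibling branches are never revisited, so the result is approximate.
    """
    best, best_dist = None, float("inf")
    node = tree
    while node is not None:
        dist = squared_distance(node["point"], query)
        if dist < best_dist:
            best, best_dist = node["point"], dist
        axis = node["axis"]
        node = node["left"] if query[axis] < node["point"][axis] else node["right"]
    return best, best_dist

# Hypothetical low-dimensional dictionary of feature vectors.
dictionary = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(dictionary)
match, dist = match_without_backtracking(tree, (9, 2))
```

Skipping the backtracking step bounds the work per query by the tree depth, a fixed cost that suits a hardware implementation, in exchange for occasionally missing the true nearest neighbour.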