Dynamic texture and scene classification by transferring deep image features
Dynamic texture and scene classification are two fundamental problems in
understanding natural video content. Extracting robust and effective features
is a crucial step towards solving these problems. However, existing
approaches are sensitive to varying illumination, viewpoint changes, or even
camera motion, and/or lack spatial information. Inspired by the success of
deep structures in image classification, we attempt to leverage a deep
structure to extract features for dynamic texture and scene classification.
To tackle the challenges in training a deep structure, we propose to transfer
some prior knowledge from
image domain to video domain. To be specific, we propose to apply a
well-trained Convolutional Neural Network (ConvNet) as a mid-level feature
extractor to extract features from each frame, and then form a representation
of a video by concatenating the first and the second order statistics over the
mid-level features. We term this two-level feature extraction scheme the
Transferred ConvNet Feature (TCoF). Moreover, we explore two different
implementations of the TCoF scheme, i.e., the \textit{spatial} TCoF and the
\textit{temporal} TCoF, in which the mean-removed frames and the difference
between two adjacent frames are used as the inputs of the ConvNet,
respectively. We systematically evaluate the proposed spatial TCoF and
temporal TCoF schemes on three benchmark data sets, DynTex, YUPENN, and
Maryland, and demonstrate that the proposed approach yields superior
performance.
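The two-level TCoF scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify which second-order statistic is used, so per-dimension variance stands in here, and the random 4096-d features are placeholders for real ConvNet activations.

```python
import numpy as np

def tcof_representation(frame_features):
    """Form a video representation by concatenating first- and
    second-order statistics over per-frame mid-level features.

    frame_features: (num_frames, feat_dim) array of ConvNet features.
    """
    mu = frame_features.mean(axis=0)   # first-order statistic
    var = frame_features.var(axis=0)   # second-order statistic (assumed form)
    return np.concatenate([mu, var])

# Toy usage: 30 frames of hypothetical 4096-d ConvNet features.
rng = np.random.default_rng(0)
feats = rng.standard_normal((30, 4096))
video_repr = tcof_representation(feats)  # 8192-d video descriptor
```

For the spatial TCoF, `frame_features` would come from mean-removed frames; for the temporal TCoF, from differences of adjacent frames.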
Local Jet Pattern: A Robust Descriptor for Texture Classification
Methods based on local image features have recently shown promise for texture
classification tasks, especially in the presence of large intra-class variation
due to illumination, scale, and viewpoint changes. Inspired by the theories of
image structure analysis, this paper presents a simple, efficient, yet robust
descriptor, named the local jet pattern (LJP), for texture classification. In
this approach, a jet space representation of a texture image is derived from a
set of derivatives of Gaussian (DtGs) filter responses up to second order, the
so-called local jet vectors (LJV), which also satisfy scale-space properties.
The LJP is obtained by utilizing the relationship of the center pixel with its
local neighborhood in jet space. Finally, the feature vector of a
texture region is formed by concatenating the histogram of LJP for all elements
of LJV. Together, all DtGs responses up to second order preserve the intrinsic
local image structure and achieve invariance to scale, rotation, and
reflection. This allows us to develop a texture classification framework that
is discriminative and robust. In extensive experiments on five standard
texture image databases, employing the nearest subspace classifier (NSC), the
proposed descriptor achieves 100%, 99.92%, 99.75%, 99.16%, and 99.65% accuracy
on Outex_TC-00010 (Outex_TC10), Outex_TC-00012 (Outex_TC12), KTH-TIPS,
Brodatz, and CUReT, respectively, outperforming state-of-the-art methods.
Comment: Accepted in Multimedia Tools and Applications, Springer
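The jet-space representation above can be sketched with standard scale-space tools: the local jet at each pixel stacks the derivatives-of-Gaussian responses up to second order. The function name and the `sigma` value below are illustrative choices, built on SciPy's `gaussian_filter`, not the paper's exact pipeline.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def local_jet_vectors(image, sigma=1.0):
    """Stack DtG responses up to second order at every pixel:
    L, Ly, Lx, Lyy, Lxy, Lxx (the local jet)."""
    # (order along axis 0, order along axis 1) for each derivative.
    orders = [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
    return np.stack(
        [gaussian_filter(image, sigma, order=o) for o in orders], axis=-1
    )

img = np.random.default_rng(1).random((64, 64))
ljv = local_jet_vectors(img)  # one 6-d jet vector per pixel
```

The LJP codes would then be computed from center/neighbor relationships in each of these six response planes, and their histograms concatenated.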
HEp-2 Cell Classification via Fusing Texture and Shape Information
Indirect Immunofluorescence (IIF) HEp-2 cell imaging provides effective
evidence for the diagnosis of autoimmune diseases. Recently, computer-aided
diagnosis of autoimmune diseases by IIF HEp-2 cell classification has
attracted great attention. However, the HEp-2 cell classification task is
quite challenging due to large intra-class variation and small between-class
variation. In this paper we propose an effective and efficient approach for
the automatic classification of IIF HEp-2 cell images by fusing
multi-resolution texture information and
richer shape information. To be specific, we propose to: a) capture the
multi-resolution texture information by a novel Pairwise Rotation Invariant
Co-occurrence of Local Gabor Binary Pattern (PRICoLGBP) descriptor, b) depict
the richer shape information by using an Improved Fisher Vector (IFV) model
with RootSIFT features which are sampled from large image patches in multiple
scales, and c) combine them properly. We systematically evaluate the proposed
approach on the IEEE International Conference on Pattern Recognition (ICPR)
2012, IEEE International Conference on Image Processing (ICIP) 2013 and ICPR
2014 contest data sets. The experimental results show that the proposed
methods significantly outperform the winners of the ICPR 2012 and ICIP 2013
contests, and achieve performance comparable with the winner of the newly
released ICPR 2014 contest.
Comment: 11 pages, 7 figures
Ensemble of Multi-View Learning Classifiers for Cross-Domain Iris Presentation Attack Detection
The adoption of large-scale iris recognition systems around the world has
brought to light the importance of detecting presentation attack images
(textured contact lenses and printouts). This work presents a new approach in
iris Presentation Attack Detection (PAD), by exploring combinations of
Convolutional Neural Networks (CNNs) and transformed input spaces through
binarized statistical image features (BSIF). Our method combines lightweight
CNNs to classify multiple BSIF views of the input image. Following explorations
on complementary input spaces leading to more discriminative features to detect
presentation attacks, we also propose an algorithm to select the best (and most
discriminative) predictors for the task at hand. An ensemble of predictors makes
use of their expected individual performances to aggregate their results into a
final prediction. Results show that this technique improves on the current
state of the art in iris PAD, outperforming the winner of LivDet-Iris2017
competition both for intra- and cross-dataset scenarios, and illustrating the
very difficult nature of the cross-dataset scenario.
Comment: IEEE Transactions on Information Forensics and Security (Early
Access), 201
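The aggregation step, in which an ensemble uses the predictors' expected individual performances to weight their results, can be sketched as follows. The exact weighting rule is not given in the abstract; normalized validation accuracies are assumed here as the per-predictor weights.

```python
import numpy as np

def weighted_ensemble(scores, val_accuracies):
    """Fuse per-predictor attack scores into one decision score,
    weighting each predictor by its (assumed) validation accuracy."""
    w = np.asarray(val_accuracies, dtype=float)
    w = w / w.sum()                 # normalize weights to sum to 1
    return float(np.dot(w, scores))

# Three hypothetical BSIF-view CNNs scoring one image (1.0 = attack).
fused = weighted_ensemble([0.9, 0.2, 0.8], [0.95, 0.70, 0.90])
```

A final accept/reject decision would then threshold `fused`, with the threshold chosen on validation data.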
RaspiReader: An Open Source Fingerprint Reader Facilitating Spoof Detection
We present the design and prototype of an open source, optical fingerprint
reader, called RaspiReader, using ubiquitous components. RaspiReader, a
low-cost and easy-to-assemble reader, provides the fingerprint research
community a seamless and simple method for gaining more control over the
sensing component of fingerprint recognition systems. In particular, we posit
that this versatile fingerprint reader will encourage researchers to explore
novel spoof detection methods that integrate both hardware and software.
RaspiReader's hardware is customized with two cameras for fingerprint
acquisition with one camera providing high contrast, frustrated total internal
reflection (FTIR) images, and the other camera outputting direct images. Using
both of these image streams, we extract complementary information which, when
fused together, results in highly discriminative features for fingerprint spoof
(presentation attack) detection. Our experimental results demonstrate a marked
improvement over previous spoof detection methods which rely only on FTIR
images provided by COTS optical readers. Finally, fingerprint matching
experiments between images acquired from the FTIR output of the RaspiReader and
images acquired from a COTS fingerprint reader verify the interoperability of
the RaspiReader with existing COTS optical readers.
Comment: 14 pages, 14 figures
Discriminative Representation Combinations for Accurate Face Spoofing Detection
Three discriminative representations for face presentation attack detection
are introduced in this paper. Firstly, we design a descriptor called the
spatial pyramid coding micro-texture (SPMT) feature to characterize local
appearance information. Secondly, we utilize SSD, a deep learning framework
for object detection, to excavate context cues and conduct end-to-end face
presentation attack detection. Finally, we design a descriptor called the template
face matched binocular depth (TFBD) feature to characterize stereo structures
of real and fake faces. For accurate presentation attack detection, we also
design two kinds of representation combinations. Firstly, we propose a
decision-level cascade strategy to combine SPMT with SSD. Secondly, we use a
simple score fusion strategy to combine face structure cues (TFBD) with local
micro-texture features (SPMT). To demonstrate the effectiveness of our design,
we evaluate the representation combination of SPMT and SSD on three public
datasets, which outperforms all other state-of-the-art methods. In addition, we
evaluate the representation combination of SPMT and TFBD on our dataset and
excellent performance is also achieved.
Comment: To be published in Pattern Recognition
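A decision-level cascade of the kind described above might look like the following sketch. The thresholds, the score convention (higher = more attack-like), and the ordering of the two stages are assumptions for illustration, not the paper's actual configuration.

```python
def cascade_decision(ssd_score, spmt_score, t_reject=0.2, t_accept=0.8):
    """Decision-level cascade: the first-stage detector resolves
    confident cases; uncertain ones fall through to the second stage."""
    if ssd_score >= t_accept:      # first stage confident: attack
        return "attack"
    if ssd_score <= t_reject:      # first stage confident: genuine
        return "real"
    # Uncertain band: defer to the micro-texture classifier.
    return "attack" if spmt_score >= 0.5 else "real"

decision = cascade_decision(0.5, 0.7)  # ambiguous SSD, SPMT decides
```

A cascade like this keeps the cheap confident path fast while reserving the second descriptor for the hard middle band of scores.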
From BoW to CNN: Two Decades of Texture Representation for Texture Classification
Texture is a fundamental characteristic of many types of images, and texture
representation is one of the essential and challenging problems in computer
vision and pattern recognition which has attracted extensive research
attention. Since 2000, texture representations based on Bag of Words (BoW) and
on Convolutional Neural Networks (CNNs) have been extensively studied with
impressive performance. Given this period of remarkable evolution, this paper
aims to present a comprehensive survey of advances in texture representation
over the last two decades. More than 200 major publications are cited in this
survey covering different aspects of the research, which includes (i) problem
description; (ii) recent advances in the broad categories of BoW-based,
CNN-based and attribute-based methods; and (iii) evaluation issues,
specifically benchmark datasets and state-of-the-art results. Looking back at
what has been achieved so far, the survey discusses open challenges and
directions for future research.
Comment: Accepted by IJC
Marrying Tracking with ELM: A Metric Constraint Guided Multiple Feature Fusion Method
Object tracking is an important problem in computer vision and surveillance
systems. Existing models mainly exploit single-view features (i.e., color,
texture, or shape) to solve the problem, failing to describe objects
comprehensively. In this paper, we solve the problem from a multi-view
perspective by leveraging complementary and latent multi-view information, so
as to be robust to partial occlusion and background clutter, especially when
surrounding objects are similar to the target, while also addressing tracking
drift. However, a multi-view fusion strategy inevitably makes tracking less
efficient. To this end, we propose to marry ELM (Extreme Learning Machine) to
multi-view fusion to train the global hidden output weights, effectively
exploiting the local information from each view. Following this principle, we
propose a novel method to obtain the optimal sample as the target object,
which avoids tracking drift resulting from noisy samples. Our method is
evaluated on 12 challenging image sequences with different attributes,
including illumination, occlusion, deformation, etc., and demonstrates better
performance than several state-of-the-art methods in terms of effectiveness
and robustness.
Comment: arXiv admin note: substantial text overlap with arXiv:1807.1021
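The closed-form ELM training step the abstract alludes to can be sketched as follows, assuming the standard single-hidden-layer formulation: random input weights and biases, with the hidden output weights solved via the Moore-Penrose pseudoinverse. The tracking-specific multi-view fusion is omitted.

```python
import numpy as np

def train_elm(X, T, n_hidden=50, seed=0):
    """Extreme Learning Machine: random hidden layer, output weights
    solved in closed form (no iterative backpropagation)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)            # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T      # hidden output weights, closed form
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy usage: fit a small regression problem.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
T = X.sum(axis=1, keepdims=True)
W, b, beta = train_elm(X, T, n_hidden=50)
pred = elm_predict(X, W, b, beta)
```

The single pseudoinverse solve is what makes ELM attractive for the efficiency problem raised in the abstract: training cost is one least-squares fit rather than many gradient iterations.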
Deep Local Binary Patterns
Local Binary Pattern (LBP) is a traditional descriptor for texture analysis
that gained attention in the last decade. Being robust to several nuisance
factors, such as changes in illumination, translation, and scaling, LBPs
achieved state-of-the-art results in several applications. However, LBPs are
not able to capture high-level features from the image, merely encoding
features with low abstraction levels. In this work, we propose Deep LBP,
which borrows ideas from
the deep learning community to improve LBP expressiveness. By using
parametrized data-driven LBP, we enable successive applications of the LBP
operators with increasing abstraction levels. We validate the relevance of the
proposed idea in several datasets from a wide range of applications. Deep LBP
improved the performance of traditional and multiscale LBP in all cases.
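For reference, a minimal sketch of the classical 3x3 LBP operator that Deep LBP generalizes: each interior pixel's eight neighbors are thresholded against the center and the comparison bits are packed into a code. The parametrized, data-driven variants described in the abstract are not reproduced here.

```python
import numpy as np

def lbp_3x3(image):
    """Basic 3x3 LBP: threshold the 8 neighbors of each interior pixel
    against the center and pack the comparison bits into a code."""
    c = image[1:-1, 1:-1].astype(np.int32)
    # Neighbors enumerated clockwise from the top-left corner.
    neighbors = [image[0:-2, 0:-2], image[0:-2, 1:-1], image[0:-2, 2:],
                 image[1:-1, 2:],   image[2:, 2:],     image[2:, 1:-1],
                 image[2:, 0:-2],   image[1:-1, 0:-2]]
    codes = np.zeros_like(c)
    for bit, n in enumerate(neighbors):
        codes |= (n.astype(np.int32) >= c).astype(np.int32) << bit
    return codes

img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
codes = lbp_3x3(img)  # neighbors 6, 9, 8, 7 are >= center 5 -> bits 3..6
```

A Deep LBP-style stack would replace the fixed threshold-at-center rule with learned parameters and apply the operator repeatedly, each pass raising the abstraction level of the codes.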
Texture retrieval using periodically extended and adaptive curvelets
Image retrieval is an important problem in the area of multimedia processing.
This paper presents two new curvelet-based algorithms for texture retrieval
which are suitable for use in constrained-memory devices. The developed
algorithms are tested on three publicly available texture datasets: CUReT,
Mondial-Marmi, and STex-fabric. Our experiments confirm the effectiveness of
the proposed system. Furthermore, a weighted version of the retrieval
algorithm is developed, which is shown to achieve promising results in the
classification of seismic activities.