Memory Integrity of CNNs for Cross-Dataset Facial Expression Recognition
Facial expression recognition is a major problem in the domain of artificial
intelligence. One of the best ways to solve this problem is the use of
convolutional neural networks (CNNs). However, a large amount of data is
required to train these networks properly, but most of the datasets available
for facial expression recognition are relatively small. A common way to
circumvent the lack of data is to use CNNs trained on large datasets from
different domains and fine-tune their layers to the target
domain. However, fine-tuning does not preserve memory integrity,
as CNNs tend to forget patterns they have learned. In this paper,
we evaluate different strategies of fine-tuning a CNN with the aim of assessing
the memory integrity of such strategies in a cross-dataset scenario. A CNN
pre-trained on a source dataset is used as the baseline and four adaptation
strategies have been evaluated: fine-tuning its fully connected layers;
fine-tuning its last convolutional layer and its fully connected layers;
retraining the CNN on a target dataset; and fusing the source and target
datasets and retraining the CNN. Experimental results on four datasets have
shown that the fusion of the source and the target datasets provides the best
trade-off between accuracy and memory integrity.
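As a minimal sketch of the four strategies above, assuming a generic CNN whose layer names are purely illustrative (not the authors' actual architecture), each strategy can be expressed as the set of layers it updates:

```python
# Sketch of the four adaptation strategies compared in the paper, expressed
# as which layers of a hypothetical CNN are trainable. The layer names are
# illustrative placeholders, not the architecture used in the paper.

LAYERS = ["conv1", "conv2", "conv3", "fc1", "fc2"]

def trainable_layers(strategy):
    """Return the layers updated by each fine-tuning strategy."""
    if strategy == "fc_only":             # fine-tune fully connected layers
        return [l for l in LAYERS if l.startswith("fc")]
    if strategy == "last_conv_and_fc":    # last conv + fully connected layers
        return ["conv3"] + [l for l in LAYERS if l.startswith("fc")]
    if strategy in ("retrain_target", "retrain_fused"):
        return list(LAYERS)               # all layers retrained
    raise ValueError(strategy)

print(trainable_layers("fc_only"))            # only the classifier head moves
print(trainable_layers("last_conv_and_fc"))   # adapts the last features too
```

The fused-retraining strategy differs from target-only retraining not in which layers move but in the training data (source plus target), which is what preserves the source patterns.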
Texture CNN for Thermoelectric Metal Pipe Image Classification
In this paper, the concept of representation learning based on deep neural
networks is applied as an alternative to the use of handcrafted features in a
method for automatic visual inspection of corroded thermoelectric metallic
pipes. A texture convolutional neural network (TCNN) replaces handcrafted
features based on Local Phase Quantization (LPQ) and Haralick descriptors (HD)
with the advantage of learning an appropriate textural representation and the
decision boundaries into a single optimization process. Experimental results
have shown that it is possible to reach an accuracy of 99.20% in the task of
identifying different levels of corrosion on the internal surface of
thermoelectric pipe walls, while using a compact network that requires much
less effort in parameter tuning when compared to the handcrafted approach,
since the TCNN architecture is compact in terms of the number of layers and
connections. The observed results open up the possibility of using deep neural
networks in real-time applications such as the automatic inspection of
thermoelectric metal pipes.
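The learned-filter idea at the heart of the TCNN can be illustrated with a plain-numpy convolution; the image patch and the hand-set edge filter below are toy stand-ins for what a trained texture filter would compute:

```python
import numpy as np

# Sketch of the core idea: instead of handcrafted LPQ/Haralick descriptors,
# a texture CNN learns convolutional filters and the decision boundary in a
# single optimization. Here one hand-set filter response is computed with
# plain numpy to illustrate what a single learned filter does.

def conv2d_valid(img, kernel):
    """Valid 2-D cross-correlation of one image patch with one filter."""
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.array([[0., 0., 1., 1.],
                [0., 0., 1., 1.],
                [0., 0., 1., 1.]])
edge = np.array([[-1., 1.]])        # responds to vertical intensity edges
print(conv2d_valid(img, edge))      # strong response only at the edge column
```

In the TCNN these filter weights are learned by backpropagation together with the classifier, rather than fixed in advance as here.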
Two-View Fine-grained Classification of Plant Species
Automatic plant classification is a challenging problem due to the wide
biodiversity of the existing plant species in a fine-grained scenario. Powerful
deep learning architectures have been used to improve the classification
performance in such a fine-grained problem, but the resulting models are usually
highly dependent on a large training dataset and are not scalable. In
this paper, we propose a novel method based on a two-view leaf image
representation and a hierarchical classification strategy for fine-grained
recognition of plant species. It uses the botanical taxonomy as a basis for a
coarse-to-fine strategy applied to identify the plant genus and species. The
two-view representation provides complementary global and local features of
leaf images. A deep metric based on Siamese convolutional neural networks is
used to reduce the dependence on a large number of training samples and make
the method scalable to new plant species. The experimental results on two
challenging fine-grained datasets of leaf images (i.e. LifeCLEF 2015 and
LeafSnap) have shown the effectiveness of the proposed method, which achieved
recognition accuracies of 0.87 and 0.96, respectively.
Comment: Submitted to Ecological Informatics
People Counting in Crowded and Outdoor Scenes using a Hybrid Multi-Camera Approach
This paper presents two novel approaches for people counting in crowded and
open environments that combine the information gathered by multiple views.
Multiple cameras are used to expand the field of view as well as to mitigate the
problem of occlusion that commonly affects the performance of counting methods
using single cameras. The first approach is regarded as a direct approach and
it attempts to segment and count each individual in the crowd. For such an aim,
two head detectors trained with head images are employed: one based on support
vector machines and another based on an Adaboost perceptron. The second approach,
regarded as an indirect approach, employs learning algorithms and statistical
analysis of the whole crowd to achieve counting. For such an aim, corner points
are extracted from groups of people in a foreground image and fed to a
learning algorithm which estimates the number of people in the scene. Both
approaches count the number of people in the scene, not only in a given
image or video frame of the scene. The experimental results obtained on the
benchmark PETS2009 video dataset show that the proposed indirect method surpasses
other methods with improvements of up to 46.7% and provides accurate counting
results for crowded scenes. On the other hand, the direct method shows high
error rates because it has to solve much more complex problems, such as the
segmentation of individual heads.
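The indirect approach can be sketched as a regression from corner-point counts to people counts; the corner counts and ground-truth annotations below are synthetic illustrations, not PETS2009 data:

```python
import numpy as np

# Sketch of the indirect approach: corner features extracted from foreground
# blobs are fed to a learned regressor that estimates crowd size. A simple
# linear least-squares fit stands in for the learning algorithm, and the
# training pairs below are synthetic.

corners = np.array([[120.], [240.], [360.]])    # corner points per frame
people  = np.array([10., 20., 30.])             # annotated people counts

X = np.hstack([corners, np.ones((3, 1))])       # add an intercept term
w, *_ = np.linalg.lstsq(X, people, rcond=None)  # fit the linear regressor

estimate = np.array([180., 1.]) @ w             # predict for 180 corners
print(round(float(estimate)))                   # → 15
```

In practice the mapping is learned on many frames and corner counts are normalized for perspective, but the estimate-from-aggregate-features principle is the same.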
A Classifier-free Ensemble Selection Method based on Data Diversity in Random Subspaces
The Ensemble of Classifiers (EoC) has been shown to be effective in improving
the performance of single classifiers by combining their outputs, and one of
the most important properties involved in the selection of the best EoC from a
pool of classifiers is considered to be classifier diversity. In general,
classifier diversity does not occur randomly, but is generated systematically
by various ensemble creation methods. By using diverse data subsets to train
classifiers, these methods can create diverse classifiers for the EoC. In this
work, we propose a scheme to measure data diversity directly from random
subspaces, and explore the possibility of using it to select the best data
subsets for the construction of the EoC. Our scheme is the first ensemble
selection method to be presented in the literature based on the concept of data
diversity. Its main advantage over the traditional framework (ensemble creation
then selection) is that it obviates the need for classifier training prior to
ensemble selection. A single Genetic Algorithm (GA) and a Multi-Objective
Genetic Algorithm (MOGA) were evaluated to search for the best solutions for
the classifier-free ensemble selection. In both cases, objective functions
based on different clustering diversity measures were implemented and tested.
All the results obtained with the proposed classifier-free ensemble selection
method were compared with the traditional classifier-based ensemble selection
using Mean Classifier Error (ME) and Majority Voting Error (MVE). The
applicability of the method is tested on UCI machine learning problems and the NIST
SD19 handwritten numerals dataset.
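One plausible clustering-based diversity measure of the kind the scheme relies on is pairwise-grouping disagreement between clusterings built on two random subspaces. This toy version, which is an illustration and not necessarily the authors' measure, operates on cluster labels directly:

```python
# Sketch of measuring data diversity directly from random subspaces: two
# feature subsets are judged diverse when clusterings built on them disagree
# about which samples group together. The label lists below stand in for
# cluster assignments produced on each subspace; the measure itself is an
# illustrative choice, not necessarily the one used in the paper.

def disagreement(a, b):
    """Fraction of sample pairs grouped differently by the two clusterings."""
    pairs = [(i, j) for i in range(len(a)) for j in range(i + 1, len(a))]
    diff = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return diff / len(pairs)

print(disagreement([0, 0, 1, 1], [0, 0, 1, 1]))  # 0.0 — identical groupings
print(disagreement([0, 0, 1, 1], [0, 1, 0, 1]))  # higher — diverse subspaces
```

Because the measure needs only cluster labels, a GA can score candidate subspace ensembles without ever training a classifier, which is the point of the classifier-free selection.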
Double Transfer Learning for Breast Cancer Histopathologic Image Classification
This work proposes a classification approach for breast cancer
histopathologic images (HI) that uses transfer learning to extract features
from HI using an Inception-v3 CNN pre-trained on the ImageNet dataset. We also
use transfer learning to train a support vector machine (SVM) classifier on
a tissue-labeled colorectal cancer dataset, aiming to filter the patches of a
breast cancer HI and remove the irrelevant ones. We show that removing
irrelevant patches before training a second SVM classifier improves the
accuracy of classifying malignant and benign tumors in breast cancer images. We
are able to improve the classification accuracy by 3.7% using the feature
extraction transfer learning and by an additional 0.7% using the irrelevant patch
elimination. The proposed approach outperforms the state of the art in three
out of the four magnification factors of the breast cancer dataset.
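The patch-filtering stage can be sketched as follows; the patch records and the relevance rule are illustrative stand-ins for the Inception-v3 features and the first-stage SVM:

```python
# Sketch of the two-stage pipeline: a patch-relevance classifier (trained on
# the colorectal tissue dataset) discards irrelevant patches before the
# malignant/benign classifier sees them. The patch dictionaries and the
# score-threshold rule below are hypothetical stand-ins for CNN features
# and the trained SVMs.

def filter_patches(patches, is_relevant):
    """Keep only the patches the first-stage classifier marks as relevant."""
    return [p for p in patches if is_relevant(p)]

patches = [{"id": 1, "tissue": 0.9},   # relevant tissue patch
           {"id": 2, "tissue": 0.1}]   # background / irrelevant patch
kept = filter_patches(patches, lambda p: p["tissue"] > 0.5)
print([p["id"] for p in kept])         # → [1]
```

Only the kept patches are used to train the second SVM, which is where the reported 0.7% gain from irrelevant-patch elimination comes from.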
Texture CNN for Histopathological Image Classification
Biopsies are the gold standard for breast cancer diagnosis. This task can be
improved by the use of Computer Aided Diagnosis (CAD) systems, reducing the
time of diagnosis and reducing the inter and intra-observer variability. The
advances in computing have brought this type of system closer to reality.
However, datasets of Histopathological Images (HI) from biopsies are quite
small and unbalanced, which makes it difficult to use modern machine learning
techniques such as deep learning. In this paper, we propose a compact
architecture based on texture filters that has fewer parameters than
traditional deep models but is able to capture the difference between malignant
and benign tissues with relative accuracy. The experimental results on the
BreakHis dataset have shown that the proposed texture CNN achieves almost 90%
accuracy in classifying benign and malignant tissues.
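To see why a compact architecture helps on small datasets, a quick parameter count of two small convolutional layers (the shapes are hypothetical, not the paper's exact architecture) shows how modest the budget is compared with deep models that run into millions of parameters:

```python
# Illustrative parameter count for a compact texture CNN. The layer shapes
# below are hypothetical examples, not the architecture from the paper;
# they just show how few parameters a shallow conv stack needs.

def conv_params(in_ch, out_ch, k):
    """Weights plus biases of one k x k convolutional layer."""
    return in_ch * out_ch * k * k + out_ch

compact = conv_params(3, 32, 3) + conv_params(32, 32, 3)
print(compact)  # → 10144, versus millions in typical deep models
```

Fewer parameters mean less risk of overfitting a small, unbalanced HI dataset.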
Histopathologic Image Processing: A Review
Histopathologic Images (HI) are the gold standard for evaluation of some
tumors. However, the analysis of such images is challenging even for
experienced pathologists, resulting in inter- and intra-observer variability.
Besides that, the analysis is time- and resource-consuming. One way to
accelerate such an analysis is by using Computer Aided Diagnosis systems. In
this work we present a literature review of the computing techniques to
process HI, including shallow and deep methods. We cover the most common tasks
for processing HI, such as segmentation, feature extraction, unsupervised
learning and supervised learning. A dedicated section presents the datasets found
during the literature review. We also present a case study of breast cancer
classification using a mix of deep and shallow machine learning methods. The
proposed method obtained an accuracy of 91% in the best case, outperforming the
compared baseline of the dataset.
Data Augmentation for Histopathological Images Based on Gaussian-Laplacian Pyramid Blending
Data imbalance is a major problem that affects several machine learning (ML)
algorithms. Such a problem is troublesome because most of the ML algorithms
attempt to optimize a loss function that does not take into account the data
imbalance. Accordingly, the ML algorithm simply generates a trivial model that
is biased toward predicting the most frequent class in the training data. In
the case of histopathologic images (HIs), both low-level and high-level data
augmentation (DA) techniques still present performance issues when applied in
the presence of inter-patient variability; hence the model tends to learn
color representations related to the staining process. In this paper,
we propose a novel approach capable of not only augmenting an HI dataset but also
distributing the inter-patient variability by means of image blending using the
Gaussian-Laplacian pyramid. The proposed approach consists of finding the
Gaussian pyramids of two images of different patients and finding the Laplacian
pyramids thereof. Afterwards, the left-half side and the right-half side of
different HIs are joined in each level of the Laplacian pyramid, and from the
joint pyramids, the original image is reconstructed. This composition combines
the stain variation of two patients, preventing color differences from
misleading the learning process. Experimental results on the BreakHis dataset
have shown promising gains vis-a-vis the majority of DA techniques presented
in the literature.
Comment: 8 pages
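A one-dimensional sketch of the blending procedure, with crude downsample/upsample operators standing in for proper Gaussian filtering, shows the level-wise half-and-half join and the reconstruction:

```python
import numpy as np

# 1-D sketch of the augmentation: build Laplacian pyramids of two signals
# (standing in for the two patients' images), join the left half of one with
# the right half of the other at every level, then reconstruct. The crude
# decimate/repeat operators below replace proper Gaussian filtering, so this
# illustrates the structure of the method, not its exact filters.

def down(x):  return x[::2]                        # crude downsample
def up(x, n): return np.repeat(x, 2)[:n]           # crude upsample

def laplacian_pyramid(x, levels):
    pyr = []
    for _ in range(levels):
        small = down(x)
        pyr.append(x - up(small, len(x)))          # detail at this level
        x = small
    pyr.append(x)                                  # coarsest residual
    return pyr

def reconstruct(pyr):
    x = pyr[-1]
    for detail in reversed(pyr[:-1]):
        x = up(x, len(detail)) + detail
    return x

a = np.array([0., 0., 0., 0., 8., 8., 8., 8.])     # "patient A"
b = np.array([4., 4., 4., 4., 2., 2., 2., 2.])     # "patient B"
pa, pb = laplacian_pyramid(a, 2), laplacian_pyramid(b, 2)
# join the left half of A with the right half of B at every pyramid level
joined = [np.concatenate([la[:len(la) // 2], lb[len(lb) // 2:]])
          for la, lb in zip(pa, pb)]
blended = reconstruct(joined)
print(blended)
```

Because the join happens per pyramid level, coarse color/stain statistics and fine texture are mixed smoothly rather than with a hard seam, which is what lets the augmented images carry both patients' stain variation.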
CNN Hyperparameter tuning applied to Iris Liveness Detection
The iris pattern has significantly improved the biometric recognition field
due to its high level of stability and uniqueness. Such physical feature has
played an important role in security and other related areas. However,
presentation attacks, also known as spoofing techniques, can be used to bypass
the biometric system with artifacts such as printed images, artificial eyes,
and textured contact lenses. To improve the security of these systems, many
liveness detection methods have been proposed, and the first International Iris
Liveness Detection competition was launched in 2013 to evaluate their
effectiveness. In this paper, we propose a hyperparameter tuning of the CASIA
algorithm, submitted by the Chinese Academy of Sciences to the third
competition on Iris Liveness Detection, in 2017. The proposed modifications
yielded an overall improvement, with an 8.48% Attack Presentation
Classification Error Rate (APCER) and a 0.18% Bonafide Presentation
Classification Error Rate (BPCER) for the evaluation of the combined datasets.
Other threshold values were also evaluated in an attempt to reduce the trade-off
between the APCER and the BPCER on the evaluated datasets, with successful results.
Comment: Accepted for presentation at the International Conference on Computer
Vision Theory and Applications (VISAPP 2020)
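The two error rates reported above can be computed as follows; the scores and the 0.5 threshold are illustrative values, not the competition's actual decision scores:

```python
# Sketch of the two error rates used to evaluate the tuned model: APCER is
# the fraction of attack samples accepted as bona fide, and BPCER is the
# fraction of bona fide samples rejected. The scores and threshold below
# are illustrative, not values from the evaluated datasets.

def apcer_bpcer(attack_scores, bonafide_scores, threshold):
    """Higher score = more live-looking; accept when score >= threshold."""
    apcer = sum(s >= threshold for s in attack_scores) / len(attack_scores)
    bpcer = sum(s < threshold for s in bonafide_scores) / len(bonafide_scores)
    return apcer, bpcer

attacks  = [0.1, 0.4, 0.8, 0.2]   # one attack passes the 0.5 threshold
bonafide = [0.9, 0.7, 0.3]        # one genuine iris is rejected
print(apcer_bpcer(attacks, bonafide, 0.5))
```

Raising the threshold lowers APCER at the cost of BPCER and vice versa, which is exactly the trade-off the abstract describes exploring with other threshold values.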