Newton Method-based Subspace Support Vector Data Description
In this paper, we present an adaptation of Newton's method for the
optimization of Subspace Support Vector Data Description (S-SVDD). The
objective of S-SVDD is to map the original data to a subspace optimized for
one-class classification, and the iterative optimization process of data
mapping and description in S-SVDD relies on gradient descent. However, gradient
descent only utilizes first-order information, which may lead to suboptimal
results. To address this limitation, we leverage Newton's method to enhance
data mapping and data description for an improved optimization of subspace
learning-based one-class classification. By incorporating this second-order curvature information, Newton's method offers a more efficient strategy for subspace
learning in one-class classification as compared to gradient-based
optimization. The paper discusses the limitations of gradient descent and the
advantages of using Newton's method in subspace learning for one-class
classification tasks. We provide both linear and nonlinear formulations of
Newton's method-based optimization for S-SVDD. In our experiments, we explored
both the minimization and maximization strategies of the objective. The results
demonstrate that the proposed optimization strategy outperforms the
gradient-based S-SVDD in most cases.
Comment: 8 pages, 2 figures, 2 tables, 1 algorithm. Accepted at IEEE Symposium Series on Computational Intelligence 202
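As a rough illustration of the idea, the sketch below applies a damped Newton update to a toy quadratic objective standing in for the S-SVDD subspace loss; the actual S-SVDD objective couples the projection matrix with the data description term, and `grad`/`hess` here are placeholders for its first- and second-order derivatives.

```python
import numpy as np

def newton_step(grad_fn, hess_fn, q, damping=1e-3):
    """One damped Newton update: q <- q - (H + damping*I)^{-1} g.
    Unlike a plain gradient step, it uses second-order curvature information."""
    g = grad_fn(q)
    H = hess_fn(q) + damping * np.eye(len(q))  # regularize for invertibility
    return q - np.linalg.solve(H, g)

# Toy quadratic objective standing in for the S-SVDD subspace loss.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
grad = lambda q: A @ q - b
hess = lambda q: A

q = np.zeros(2)
for _ in range(5):
    q = newton_step(grad, hess, q)
print(q, np.linalg.solve(A, b))  # converges to the exact minimizer
```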
Color Constancy Convolutional Autoencoder
In this paper, we study the importance of pre-training for the generalization
capability in the color constancy problem. We propose two novel approaches
based on convolutional autoencoders: an unsupervised pre-training algorithm
using a fine-tuned encoder and a semi-supervised pre-training algorithm using a
novel composite-loss function. This enables us to address the data scarcity problem and achieve results competitive with the state of the art while requiring far fewer parameters on the ColorChecker RECommended dataset. We further
study the over-fitting phenomenon on the recently introduced version of
INTEL-TUT Dataset for Camera Invariant Color Constancy Research, which has both
field and non-field scenes acquired by three different camera models.
Comment: 6 pages, 1 figure, 3 tables
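A minimal sketch of the two-stage idea (unsupervised pre-training of a convolutional autoencoder on reconstruction, then reusing its encoder for illuminant regression); the layer sizes, the `illuminant_head`, and the plain MSE reconstruction loss are illustrative assumptions, not the paper's architecture or composite loss.

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Small convolutional autoencoder; layer sizes are illustrative."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Stage 1: unsupervised pre-training on unlabeled images (reconstruction loss).
ae = ConvAE()
recon_loss = nn.MSELoss()

# Stage 2: reuse the pre-trained encoder with a small head that regresses the
# scene illuminant (3 RGB gains); only the labeled data is needed here.
illuminant_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 3))

x = torch.rand(4, 3, 64, 64)            # dummy batch of images
pretrain = recon_loss(ae(x), x)          # minimized during pre-training
illum = illuminant_head(ae.encoder(x))   # fine-tuned against ground-truth illuminants
```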
Class-wise Generalization Error: an Information-Theoretic Analysis
Existing generalization theories of supervised learning typically take a
holistic approach and provide bounds for the expected generalization over the
whole data distribution, which implicitly assumes that the model generalizes
similarly for all the classes. In practice, however, there are significant
variations in generalization performance among different classes, which cannot
be captured by the existing generalization bounds. In this work, we tackle this
problem by theoretically studying the class-generalization error, which
quantifies the generalization performance of each individual class. We derive a
novel information-theoretic bound for class-generalization error using the KL
divergence, and we further obtain several tighter bounds using the conditional
mutual information (CMI), which are significantly easier to estimate in
practice. We empirically validate our proposed bounds in different neural
networks and show that they accurately capture the complex class-generalization
error behavior. Moreover, we show that the theoretical tools developed in this
paper can be applied in several applications beyond this context.
Comment: 26 pages
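The bounds themselves are information-theoretic, but the quantity they control can be measured directly. Below is a minimal sketch of the empirical per-class generalization gap, with synthetic losses and labels as placeholders; the KL and CMI bounds from the paper are not computed here.

```python
import numpy as np

def class_generalization_gap(train_losses, train_labels, test_losses, test_labels, num_classes):
    """Empirical per-class gap between expected test loss and train loss,
    i.e. the quantity the class-wise generalization bounds aim to control."""
    gaps = np.zeros(num_classes)
    for c in range(num_classes):
        gaps[c] = test_losses[test_labels == c].mean() - train_losses[train_labels == c].mean()
    return gaps

# Dummy example with 3 classes and random losses.
rng = np.random.default_rng(0)
tr_y, te_y = rng.integers(0, 3, 1000), rng.integers(0, 3, 1000)
tr_l, te_l = rng.random(1000) * 0.2, rng.random(1000) * 0.5
print(class_generalization_gap(tr_l, tr_y, te_l, te_y, 3))
```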
On Feature Diversity in Energy-based Models
Energy-based learning is a powerful learning paradigm that encapsulates
various discriminative and generative approaches. An energy-based model (EBM)
is typically formed of inner-model(s) that learn a combination of the different
features to generate an energy mapping for each input configuration. In this
paper, we focus on the diversity of the produced feature set. We extend the
probably approximately correct (PAC) theory of EBMs and analyze the effect of
redundancy reduction on the performance of EBMs. We derive generalization
bounds for various learning contexts, i.e., regression, classification, and
implicit regression, with different energy functions, and we show that reducing the redundancy of the feature set can consistently decrease the gap between the true and empirical expectation of the energy and boost the performance of the model.
Comment: 18 pages, 3 figures
Reducing Redundancy in the Bottleneck Representation of the Autoencoders
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks, e.g., dimensionality reduction, image compression, and image denoising. An autoencoder (AE) has two goals: (i) compress the original input to a
low-dimensional space at the bottleneck of the network topology using an
encoder, (ii) reconstruct the input from the representation at the bottleneck
using a decoder. Both encoder and decoder are optimized jointly by minimizing a
distortion-based loss, which implicitly forces the model to keep only those variations of the input data that are required to reconstruct it and to reduce
redundancies. In this paper, we propose a scheme to explicitly penalize feature
redundancies in the bottleneck representation. To this end, we propose an
additional loss term, based on the pair-wise correlation of the neurons, which
complements the standard reconstruction loss forcing the encoder to learn a
more diverse and richer representation of the input. We tested our approach
across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST. The experimental results show that the proposed loss consistently leads to superior performance compared to the standard AE loss.
Comment: 6 pages, 4 figures. The paper is under consideration at Pattern Recognition Letters
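A minimal sketch of the kind of pair-wise correlation penalty described above, assuming a mean of squared off-diagonal correlations over the bottleneck units; the exact form and weighting used in the paper may differ.

```python
import torch

def correlation_penalty(z):
    """Off-diagonal penalty on the pair-wise correlation of bottleneck units.
    z: (batch, k) bottleneck activations. Added to the reconstruction loss to
    discourage redundant (highly correlated) features."""
    z = z - z.mean(dim=0, keepdim=True)
    z = z / (z.std(dim=0, keepdim=True) + 1e-8)
    corr = (z.T @ z) / z.shape[0]                 # (k, k) correlation matrix
    off_diag = corr - torch.diag(torch.diag(corr))
    return (off_diag ** 2).mean()

# total loss during training (lambda_div is a hyper-parameter):
# loss = reconstruction_loss + lambda_div * correlation_penalty(bottleneck)
```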
Within-layer Diversity Reduces Generalization Gap
Neural networks are composed of multiple layers arranged in a hierarchical
structure jointly trained with a gradient-based optimization, where the errors
are back-propagated from the last layer back to the first one. At each
optimization step, neurons at a given layer receive feedback from neurons
belonging to higher layers of the hierarchy. In this paper, we propose to
complement this traditional 'between-layer' feedback with additional
'within-layer' feedback to encourage diversity of the activations within the
same layer. To this end, we measure the pairwise similarity between the outputs
of the neurons and use it to model the layer's overall diversity. By penalizing
similarities and promoting diversity, we encourage each neuron to learn a
distinctive representation and, thus, to enrich the data representation learned
within the layer and to increase the total capacity of the model. We
theoretically study how the within-layer activation diversity affects the
generalization performance of a neural network and prove that increasing the
diversity of hidden activations reduces the estimation error. In addition to
the theoretical guarantees, we present an empirical study on three datasets
confirming that the proposed approach enhances the performance of
state-of-the-art neural network models and decreases the generalization gap.
Comment: 18 pages, 1 figure, 3 tables
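As a sketch of this within-layer feedback, the term below measures the pair-wise cosine similarity between neuron responses over a batch and penalizes it alongside the task loss; the choice of cosine similarity and the weighting are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def within_layer_diversity_loss(h):
    """Penalize pair-wise similarity between the outputs of neurons in one layer.
    h: (batch, n_units) activations; each column is one neuron's response over
    the batch, so similar columns indicate redundant neurons."""
    cols = F.normalize(h.t(), dim=1)            # (n_units, batch), unit-norm rows
    sim = cols @ cols.t()                       # (n_units, n_units) cosine similarities
    off_diag = sim - torch.diag(sim.diag())     # ignore self-similarity
    n = sim.shape[0]
    return (off_diag ** 2).sum() / (n * (n - 1))

# usage during training (lambda_div is a hyper-parameter):
# loss = task_loss + lambda_div * within_layer_diversity_loss(hidden_activations)
```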
Efficient CNN with uncorrelated Bag of Features pooling
Despite the superior performance of CNNs, deploying them on devices with low computational power is still limited, as they are typically computationally expensive.
One key cause of the high complexity is the connection between the convolution
layers and the fully connected layers, which typically requires a high number
of parameters. To alleviate this issue, Bag of Features (BoF) pooling has been
recently proposed. BoF learns a dictionary that is used to compile a histogram
representation of the input. In this paper, we propose an approach that builds
on top of BoF pooling to boost its efficiency by ensuring that the items of the
learned dictionary are non-redundant. We propose an additional loss term, based
on the pair-wise correlation of the items of the dictionary, which complements
the standard loss to explicitly regularize the model to learn a more diverse
and rich dictionary. The proposed strategy yields an efficient variant of BoF
and further boosts its performance, without any additional parameters.
Comment: 6 pages, 2 figures
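A simplified sketch of BoF pooling with an extra decorrelation term on the dictionary: each spatial feature is soft-assigned to learned codewords and the assignments are averaged into a histogram. The softmax over dot-product similarities and the form of the decorrelation loss are assumptions for illustration (BoF is often formulated with kernel-based assignments).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoFPooling(nn.Module):
    """Bag-of-Features pooling with a penalty that keeps codewords non-redundant."""
    def __init__(self, feat_dim, n_codewords):
        super().__init__()
        self.codebook = nn.Parameter(torch.randn(n_codewords, feat_dim))

    def forward(self, x):                         # x: (batch, feat_dim, H, W)
        feats = x.flatten(2).transpose(1, 2)      # (batch, H*W, feat_dim)
        sims = feats @ self.codebook.T            # similarity to each codeword
        assign = F.softmax(sims, dim=-1)          # soft assignments
        return assign.mean(dim=1)                 # (batch, n_codewords) histogram

    def decorrelation_loss(self):
        c = F.normalize(self.codebook, dim=1)
        gram = c @ c.T                            # pair-wise codeword similarity
        off = gram - torch.diag(gram.diag())
        return (off ** 2).mean()

pool = BoFPooling(feat_dim=64, n_codewords=32)
hist = pool(torch.rand(2, 64, 7, 7))              # (2, 32) histogram fed to the classifier
# total loss: task_loss + lambda_reg * pool.decorrelation_loss()
```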
Graph Embedding with Data Uncertainty
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines. The main aim is to learn a meaningful low-dimensional embedding of the data. However, most subspace learning methods do
not take into consideration possible measurement inaccuracies or artifacts that
can lead to data with high uncertainty. Thus, learning directly from raw data
can be misleading and can negatively impact the accuracy. In this paper, we
propose to model artifacts in training data using probability distributions;
each data point is represented by a Gaussian distribution centered at the
original data point and having a variance modeling its uncertainty. We
reformulate the Graph Embedding framework to make it suitable for learning from
distributions and we study as special cases the Linear Discriminant Analysis
and the Marginal Fisher Analysis techniques. Furthermore, we propose two
schemes for modeling data uncertainty based on pair-wise distances in unsupervised and supervised contexts.
Comment: 20 pages, 4 figures
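The core idea can be sketched for the unsupervised case with diagonal covariances: replace plain pair-wise distances with their expectation under the per-point Gaussians and build the graph weights from those expected distances. The closed form used below, E||u - v||^2 = ||mu_i - mu_j||^2 + tr(Sigma_i) + tr(Sigma_j), holds for independent Gaussians; the full framework in the paper goes further (LDA and MFA reformulations), and the kernel bandwidth below is an arbitrary choice.

```python
import numpy as np

def expected_pairwise_sq_distances(means, variances):
    """Expected squared Euclidean distances between points modeled as Gaussians
    N(mu_i, diag(var_i)): ||mu_i - mu_j||^2 + sum(var_i) + sum(var_j)."""
    sq = ((means[:, None, :] - means[None, :, :]) ** 2).sum(-1)
    tr = variances.sum(axis=1)
    d = sq + tr[:, None] + tr[None, :]
    np.fill_diagonal(d, 0.0)                     # a point is at distance 0 from itself
    return d

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                      # measured data points
V = np.abs(rng.normal(scale=0.1, size=(5, 3)))   # per-feature uncertainty (variances)
D = expected_pairwise_sq_distances(X, V)
W = np.exp(-D / D.mean())                        # uncertainty-aware graph weights
```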
Monte Carlo Dropout Ensembles for Robust Illumination Estimation
Computational color constancy is a preprocessing step used in many camera
systems. The main aim is to discount the effect of the illumination on the
colors in the scene and restore the original colors of the objects. Recently,
several deep learning-based approaches have been proposed to solve this problem
and they often led to state-of-the-art performance in terms of average errors.
However, for extreme samples, these methods fail and lead to high errors. In
this paper, we address this limitation by proposing to aggregate different deep
learning methods according to their output uncertainty. We estimate the
relative uncertainty of each approach using Monte Carlo dropout and the final
illumination estimate is obtained as the sum of the different model estimates
weighted by the log-inverse of their corresponding uncertainties. The proposed
framework leads to state-of-the-art performance on the INTEL-TAU dataset.
Comment: 7 pages, 6 figures
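A minimal sketch of the aggregation step: each model's uncertainty is estimated from the spread of its Monte Carlo dropout passes, and the per-model estimates are combined with weights proportional to the log-inverse of those uncertainties. The `model(x, training=True)` call and the clipping of negative weights are placeholders and assumptions, not the paper's exact procedure.

```python
import numpy as np

def mc_dropout_estimate(model, x, n_samples=20):
    """Run a dropout-enabled model several times at test time and return the
    mean illuminant estimate and its uncertainty (mean predictive variance).
    `model(x, training=True)` stands in for any forward pass with dropout active."""
    preds = np.stack([model(x, training=True) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.var(axis=0).mean()

def aggregate_illuminants(estimates, uncertainties, eps=1e-8):
    """Combine per-model illuminant estimates, weighting each by the
    log-inverse of its uncertainty (normalized to sum to one)."""
    w = np.log(1.0 / (np.asarray(uncertainties) + eps))
    w = np.clip(w, eps, None)   # guard against negative weights when uncertainty > 1
    w = w / w.sum()
    return (w[:, None] * np.stack(estimates)).sum(axis=0)
```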