Urban Land Cover Classification with Missing Data Modalities Using Deep Convolutional Neural Networks
Automatic urban land cover classification is a fundamental problem in remote
sensing, e.g. for environmental monitoring. The problem is highly challenging,
as classes generally have low inter-class and high intra-class variance.
Techniques to improve urban land cover classification performance in remote
sensing include fusion of data from different sensors with different data
modalities. However, such techniques require all modalities to be available to
the classifier in the decision-making process, i.e. at test time, as well as in
training. If a data modality is missing at test time, current state-of-the-art
approaches have in general no procedure available for exploiting information
from these modalities. This represents a waste of potentially useful
information. We propose as a remedy a convolutional neural network (CNN)
architecture for urban land cover classification which is able to embed all
available training modalities in a so-called hallucination network. The network
will in effect replace missing data modalities in the test phase, enabling
fusion capabilities even when data modalities are missing in testing. We
demonstrate the method using two datasets consisting of optical and digital
surface model (DSM) images. We simulate missing modalities by assuming that DSM
images are missing during testing. Our method outperforms both a standard CNN
trained only on optical images and an ensemble of two standard CNNs. We
further evaluate the potential of our method to handle situations where only
some DSM images are missing during testing. Overall, we show that we can
clearly exploit training-time information about the missing modality during
testing.
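The hallucination idea can be sketched as follows: a branch that receives only the optical input is trained to reproduce the DSM branch's features, so that at test time the missing modality's features can be approximated and fused. Below is a minimal numpy sketch, not the paper's CNN; the names (`f_opt`, `f_dsm`, `f_hall`) are illustrative, and a linear least-squares fit stands in for the L2 feature-matching loss used to train a hallucination network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for learned feature extractors (hypothetical names):
W_opt = rng.normal(size=(8, 4))
W_dsm = rng.normal(size=(8, 4))

def f_opt(x):  # optical branch
    return np.tanh(x @ W_opt)

def f_dsm(x):  # DSM branch, only available at training time
    return np.tanh(x @ W_dsm)

# Hallucination branch: takes the OPTICAL input but is trained to mimic
# the DSM branch's features.  A least-squares fit on paired training data
# is a crude stand-in for the L2 feature-matching loss.
X_opt_train = rng.normal(size=(200, 8))
X_dsm_train = rng.normal(size=(200, 8))
target_feats = f_dsm(X_dsm_train)
W_hall, *_ = np.linalg.lstsq(np.tanh(X_opt_train), target_feats, rcond=None)

def f_hall(x):  # hallucinated "DSM" features from optical input alone
    return np.tanh(x) @ W_hall

# Test time: DSM is missing, so we fuse optical + hallucinated features.
x_test = rng.normal(size=(5, 8))
fused = np.concatenate([f_opt(x_test), f_hall(x_test)], axis=1)
print(fused.shape)  # both 4-dim feature sets fused: (5, 8)
```

The fused representation has the same shape as it would if both modalities were observed, which is what allows the downstream classifier to be reused unchanged.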
Training Echo State Networks with Regularization through Dimensionality Reduction
In this paper we introduce a new framework to train an Echo State Network to
predict real-valued time series. The method consists of projecting the output
of the internal layer of the network on a space with lower dimensionality,
before training the output layer to learn the target task. Notably, we enforce
a regularization constraint that leads to better generalization capabilities.
We evaluate the performance of our approach on several benchmark tasks, using
different techniques to train the readout of the network, and achieve superior
predictive performance with the proposed framework. Finally, we provide insight
into the effectiveness of the proposed mechanism by visualizing trajectories in
phase space and by relying on methodologies from nonlinear time-series
analysis. By applying our method to well-known chaotic systems, we provide
evidence that the lower-dimensional embedding
retains the dynamical properties of the underlying system better than the
full-dimensional internal states of the network.
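The pipeline described above (drive an untrained reservoir, project its states onto a lower-dimensional subspace, then train only the readout) can be sketched in numpy. This is an illustrative toy, not the paper's implementation: the reservoir size, spectral radius, input series, and number of principal components are all assumed values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Minimal echo state network driven by a toy 1-D time series.
N = 100                                          # reservoir size (illustrative)
W_in = rng.uniform(-1, 1, size=(N, 1))
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

u = np.sin(np.linspace(0, 8 * np.pi, 400))       # input series
states = np.zeros((len(u), N))
x = np.zeros(N)
for t, ut in enumerate(u):
    x = np.tanh(W_in[:, 0] * ut + W @ x)
    states[t] = x

# Regularization through dimensionality reduction: project the reservoir
# states onto their d leading principal components (PCA via SVD on
# centered states) before training the readout.
d = 10
mu = states.mean(axis=0)
_, _, Vt = np.linalg.svd(states - mu, full_matrices=False)
Z = (states - mu) @ Vt[:d].T                     # low-dimensional states

# One-step-ahead linear readout trained by least squares on the
# projected states.
target = u[1:]
W_out, *_ = np.linalg.lstsq(Z[:-1], target, rcond=None)
pred = Z[:-1] @ W_out
mse = np.mean((pred - target) ** 2)
print(Z.shape, mse)
```

Only `W_out` is trained; the reservoir weights stay fixed, and the readout now sees a 10-dimensional state instead of the full 100-dimensional one.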
Learning Latent Representations of Bank Customers With The Variational Autoencoder
Learning data representations that reflect the customers' creditworthiness
can improve marketing campaigns, customer relationship management, data and
process management, or credit risk assessment in retail banks. In this
research, we adopt the Variational Autoencoder (VAE), which has the ability to
learn latent representations that contain useful information. We show that it
is possible to steer the representations in the latent space of the VAE using
the Weight of Evidence, forming a specific grouping of the data that reflects
the customers' creditworthiness. Our proposed method learns a latent
representation of the data, which shows a well-defined clustering structure
capturing the customers' creditworthiness. These clusters are well suited for
the aforementioned banks' activities. Further, our methodology generalizes to
new customers, captures high-dimensional and complex financial data, and scales
to large data sets.
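The Weight of Evidence (WoE) used for steering is a standard credit-scoring transform: for each bin of a feature, WoE = ln(share of good customers in the bin / share of bad customers in the bin). A minimal sketch with made-up counts:

```python
import math

# Weight of Evidence per bin: WoE_i = ln(p_good_i / p_bad_i), where
# p_good_i / p_bad_i are the shares of good / bad customers in bin i.
# The counts below are invented for illustration.
bins = {
    "low":    {"good": 50,  "bad": 50},
    "medium": {"good": 300, "bad": 100},
    "high":   {"good": 650, "bad": 50},
}
total_good = sum(b["good"] for b in bins.values())   # 1000
total_bad = sum(b["bad"] for b in bins.values())     # 200

woe = {
    name: math.log((b["good"] / total_good) / (b["bad"] / total_bad))
    for name, b in bins.items()
}
for name, w in woe.items():
    print(f"{name}: {w:+.3f}")
```

Negative WoE marks bins dominated by bad customers and positive WoE bins dominated by good ones, which is the ordering the VAE's latent grouping is steered to reflect.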
Deep Generative Models for Reject Inference in Credit Scoring
Credit scoring models based on accepted applications may be biased, and this
bias can have both a statistical and an economic impact. Reject inference is
the process of attempting to infer the creditworthiness status of the rejected
applications. In this research, we use deep generative models to develop two
new semi-supervised Bayesian models for reject inference in credit scoring, in
which we model the data generating process to be dependent on a Gaussian
mixture. The goal is to improve the classification accuracy in credit scoring
models by adding reject applications. Our proposed models infer the unknown
creditworthiness of the rejected applications by exact enumeration of the two
possible outcomes of the loan (default or non-default). The efficient
stochastic gradient optimization technique used in deep generative models makes
our models suitable for large data sets. Finally, the experiments in this
research show that our proposed models perform better than classical and
alternative machine learning models for reject inference in credit scoring.
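The "exact enumeration of the two possible outcomes" amounts to marginalizing the unknown default label y of a rejected application: p(x) = Σ_y p(x | y) p(y). The sketch below uses simple 1-D Gaussian class conditionals and an assumed prior, not the paper's deep generative models, purely to illustrate the enumeration step.

```python
import numpy as np

# For a rejected application x with unknown label y in {0, 1}
# (non-default / default), enumerate both outcomes:
#   p(x) = p(x | y=0) p(y=0) + p(x | y=1) p(y=1).
def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

p_default = 0.2                     # prior p(y = 1), an assumed value
mu = {0: 2.0, 1: -1.0}              # illustrative class-conditional means
x_rejected = 1.5                    # a rejected applicant's feature value

joint = {y: gauss(x_rejected, mu[y], 1.0) * (p_default if y else 1 - p_default)
         for y in (0, 1)}
evidence = joint[0] + joint[1]      # marginal likelihood by exact enumeration
posterior_default = joint[1] / evidence   # inferred creditworthiness
print(posterior_default)
```

Because y takes only two values, the sum is exact and cheap; the rejected application contributes to training through this marginal rather than through a guessed hard label.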
Bidirectional deep-readout echo state networks
We propose a deep architecture for the classification of multivariate time
series. By means of a recurrent and untrained reservoir we generate a vectorial
representation that embeds temporal relationships in the data. To improve the
memorization capability, we implement a bidirectional reservoir, whose last
state also captures past dependencies in the input. We apply dimensionality
reduction to the final reservoir states to obtain compressed fixed size
representations of the time series. These are subsequently fed into a deep
feedforward network trained to perform the final classification. We test our
architecture on benchmark datasets and on a real-world use case of blood
sample classification. Results show that our method performs better than a
standard echo state network and, at the same time, achieves results comparable
to a fully trained recurrent network, but with faster training.
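The bidirectional reservoir representation can be sketched as follows: run the same untrained reservoir over the series and over its time-reversal, then concatenate the two final states into one fixed-size vector. Sizes below are illustrative, and the dimensionality-reduction and feedforward-classifier stages are omitted.

```python
import numpy as np

rng = np.random.default_rng(3)

# One untrained reservoir, applied in both time directions.
N = 50
W_in = rng.uniform(-1, 1, size=N)
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # echo state property

def last_state(series):
    x = np.zeros(N)
    for ut in series:
        x = np.tanh(W_in * ut + W @ x)
    return x

u = rng.normal(size=200)            # a toy univariate time series
# Forward pass sees the whole past; backward pass (reversed input) makes
# the final state depend on the series' beginning as well.
rep = np.concatenate([last_state(u), last_state(u[::-1])])
print(rep.shape)                    # fixed-size representation
```

The key property is that `rep` has the same size regardless of the series length, which is what allows a standard feedforward network to consume it.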
Time Series Cluster Kernel for Learning Similarities between Multivariate Time Series with Missing Data
Similarity-based approaches represent a promising direction for time series
analysis. However, many such methods rely on parameter tuning, and some have
shortcomings if the time series are multivariate (MTS), due to dependencies
between attributes, or if the time series contain missing data. In this paper, we
address these challenges within the powerful context of kernel methods by
proposing the robust \emph{time series cluster kernel} (TCK). The approach
taken leverages the missing data handling properties of Gaussian mixture models
(GMM) augmented with informative prior distributions. An ensemble learning
approach is exploited to ensure robustness to parameters by combining the
clustering results of many GMMs to form the final kernel.
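The ensemble-to-kernel step can be sketched by counting co-assignments across many clusterings. For simplicity the sketch uses cheap k-means runs with random initializations on static 2-D points, rather than the paper's GMMs with informative priors fitted to time series; the point is only how an ensemble of clusterings is combined into one similarity matrix.

```python
import numpy as np

rng = np.random.default_rng(4)

def kmeans_labels(X, k, rng, iters=10):
    """Plain k-means returning hard cluster labels (illustrative stand-in
    for a GMM ensemble member)."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# Two well-separated toy blobs of 20 points each.
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])

# Ensemble kernel: average, over many clusterings, of the indicator that
# two points land in the same cluster.
K = np.zeros((len(X), len(X)))
n_runs = 30
for _ in range(n_runs):
    labels = kmeans_labels(X, k=2, rng=rng)
    K += (labels[:, None] == labels[None, :])
K /= n_runs                          # kernel entries in [0, 1]
print(K[0, 1], K[0, 25])             # same-blob vs cross-blob similarity
```

Averaging over many randomly initialized members is what makes the resulting kernel robust to any single clustering's parameter choices, which mirrors the robustness claim above.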
We evaluate the TCK on synthetic and real data and compare to other
state-of-the-art techniques. The experimental results demonstrate that the TCK
is robust to parameter choices, provides competitive results for MTS without
missing data and outstanding results for missing data.