Learning with a neural network based on similarity measures
Currently, in machine learning, there is a growing interest in finding new and
better predictive models that can deal with heterogeneous data and missing values.
In this thesis, two learning algorithms are proposed that can deal with both issues.
The first learning algorithm that is studied consists of a neural network based on
similarity measures, the Similarity Neural Network (SNN). It is a two-layer network,
where the hidden layer computes the similarity between the input data and a set of
prototypes, and the output layer gathers these results to predict the output. In
this thesis, several variants of this algorithm are proposed and compared to determine
which performs best. These variants differ, for example, in how the prototypes are
chosen and how the parameters of the activation function are set. A full analysis is
performed in the experiments section.
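The two-layer architecture described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes a Gaussian activation over squared Euclidean distance as the similarity measure on numeric inputs (the thesis uses similarity measures that also handle heterogeneous data and missing values), and all names (`snn_forward`, `gamma`) are hypothetical.

```python
import numpy as np

def snn_forward(x, prototypes, gamma, w, b):
    """Sketch of a two-layer SNN: the hidden layer computes the
    similarity of the input to each prototype, and the output layer
    combines those similarities linearly to produce the prediction."""
    # Hidden layer: similarity of x to every prototype
    # (here a Gaussian of the squared Euclidean distance).
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    h = np.exp(-gamma * d2)
    # Output layer: weighted sum of the hidden activations.
    return h @ w + b

# Toy example: 3 prototypes in 2-D, one output unit.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
w = np.array([0.5, -0.2, 0.7])
y = snn_forward(np.array([1.0, 0.5]), prototypes, gamma=1.0, w=w, b=0.1)
```

The prototypes play the role of the hidden units' "weights": each hidden unit responds most strongly to inputs close to its prototype.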
Secondly, an Ensemble of SNNs is proposed. The purpose of using an ensemble
is to increase predictive performance, reduce variability and reduce learning time
complexity. This second learning algorithm combines the predictions of a set of SNNs
and gives the response of the ensemble based on these predictions. For this algorithm,
several ensemble learners are proposed, that is, different ways to combine these
predictions. These variants are analyzed with a set of experiments.
The main goal of this thesis is to understand these two methods, derive training
algorithms and compare them with traditional learning algorithms, such as the classical
Random Forest. The experimental results show that both methods are competitive,
obtaining results similar to those of the Random Forest and improving on it in some
problems. Sixteen datasets with heterogeneous data and missing values are tested,
some of them large and difficult problems. Regarding the SNN, the experiments show
that adding regularization to the network strongly influences the model. Regarding
the ensemble, the results suggest that the simplest ensemble learner (mean or
majority vote of the SNNs) is the one that performs best. Both proposals achieve
similar and quite good performance metrics, but the ensemble obtains slightly
better predictions.