Learning with a neural network based on similarity measures

Currently, in machine learning there is growing interest in finding new and better predictive models that can deal with heterogeneous data and missing values. In this thesis, two learning algorithms are proposed that can deal with both issues. The first is a neural network based on similarity measures, the Similarity Neural Network (SNN). It is a two-layer network, where the hidden layer computes the similarity between the input data and a set of prototypes, and the output layer gathers these results and predicts the output. Several variants of this algorithm are proposed and analyzed to determine which performs best; they differ in how the prototypes are chosen and how the parameters of the activation function are set. A full analysis is performed in the experiments section.

Secondly, an Ensemble of SNNs is proposed. The purpose of using an ensemble is to increase predictive performance, reduce variance, and reduce the time complexity of learning. This second algorithm combines the predictions of a set of SNNs and gives the response of the ensemble based on these predictions. Several ensemble learners, that is, different ways to combine these predictions, are proposed and analyzed with a set of experiments.

The main goal of this thesis is to understand these two methods, derive training algorithms for them, and compare them with traditional learning algorithms such as the classical Random Forest. The methods are evaluated on 16 datasets with heterogeneous data and missing values, some of them large and difficult problems. The experiments show competitive performance for both methods, with results similar to Random Forest and better on some problems. Regarding the SNN, the experiments show that adding regularization to the network has a strong influence on the model. Regarding the ensemble, the results suggest that the simplest ensemble learner (the mean or majority vote of the SNNs) performs best. Both proposals achieve similar and good performance, with the ensemble obtaining slightly better predictions.
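The abstract describes the architecture only at a high level. As a concrete illustration, below is a minimal sketch of an SNN and a mean/majority-vote ensemble. It assumes a Gaussian similarity on numeric features, prototypes sampled at random from the training set, and a ridge-regularized linear output layer; the thesis's actual similarity measures (which also handle heterogeneous data and missing values), prototype-selection strategies, and training algorithm may differ. All names and hyperparameters here are hypothetical.

```python
import numpy as np

class SNN:
    """Sketch of a Similarity Neural Network: a hidden layer of
    similarities to prototypes, followed by a linear output layer."""

    def __init__(self, n_prototypes=10, gamma=1.0, lam=1e-2, seed=0):
        self.n_prototypes = n_prototypes
        self.gamma = gamma  # bandwidth of the similarity (activation) function
        self.lam = lam      # ridge regularization strength
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        # Prototype selection: sample training points at random
        # (one of several strategies the abstract says are compared).
        idx = self.rng.choice(len(X), self.n_prototypes, replace=False)
        self.prototypes_ = X[idx]
        H = self._similarities(X)
        # Output layer: ridge-regularized least squares on the hidden
        # activations (the abstract notes regularization matters strongly).
        A = H.T @ H + self.lam * np.eye(self.n_prototypes)
        self.w_ = np.linalg.solve(A, H.T @ y)
        return self

    def _similarities(self, X):
        # Hidden layer: Gaussian similarity of each input to each prototype.
        d2 = ((X[:, None, :] - self.prototypes_[None, :, :]) ** 2).sum(-1)
        return np.exp(-self.gamma * d2)

    def predict(self, X):
        return self._similarities(X) @ self.w_

def ensemble_predict(snns, X):
    # Simplest ensemble learner from the abstract: average the member
    # predictions; for class labels, a majority vote would replace the mean.
    return np.mean([m.predict(X) for m in snns], axis=0)

# Toy usage: five SNNs with different random prototypes, averaged.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
members = [SNN(n_prototypes=15, seed=s).fit(X, y) for s in range(5)]
y_hat = ensemble_predict(members, X)
```

Averaging independently trained members reduces the variance introduced by random prototype selection, which is consistent with the abstract's finding that the simplest combiners (mean or majority vote) already perform well.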
