
    On building ensembles of stacked denoising auto-encoding classifiers and their further improvement

    Aggregating diverse learners and training deep architectures are the two principal avenues for increasing the expressive capability of neural networks, so their combination merits attention. In this contribution, we study how to apply two conventional diversity methods, bagging and label switching, to a general deep machine, the stacked denoising auto-encoding classifier, in order to solve a number of appropriately selected image recognition problems. The main conclusion of our work is that binarizing multi-class problems is the key to obtaining benefit from these diversity methods. Additionally, we verify that adding other performance improvement procedures, such as pre-emphasizing training samples and elastic distortion mechanisms, further increases the quality of the results. In particular, an appropriate combination of all the above methods leads to a new absolute record in classifying MNIST handwritten digits. These facts reveal clear opportunities for designing more powerful classifiers by combining different improvement techniques. (C) 2017 Elsevier B.V. All rights reserved. This work has been partly supported by research grants CASI-CAM-CM (S2013/ICE-2845, Madrid Community) and Macro-ADOBE (TEC2015-67719, MINECO-FEDER EU), as well as by the research network DAMA (TIN2015-70308-REDT, MINECO)
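    As a hedged illustration of this recipe, the sketch below binarizes a multi-class task one-vs-one and bags the base learner inside each dichotomy. A plain MLPClassifier stands in for the stacked denoising auto-encoding classifier, and the digits dataset, layer size, and bag count are illustrative assumptions rather than the paper's setup.

```python
# Sketch: one-vs-one binarization wrapped around a bagged base learner.
# The MLP is a stand-in for the stacked denoising auto-encoding classifier.
from sklearn.datasets import load_digits
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200, random_state=0)
# Bagging supplies the diversity; one-vs-one binarization is what lets it pay off.
ensemble = OneVsOneClassifier(BaggingClassifier(base, n_estimators=5, random_state=0))
ensemble.fit(X_tr, y_tr)
print("test accuracy:", ensemble.score(X_te, y_te))
```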

    Pre-emphasizing Binarized Ensembles to Improve Classification Performance

    14th International Work-Conference on Artificial Neural Networks, IWANN 2017. Machine ensembles are learning architectures that offer high expressive capacity and, consequently, remarkable performance, thanks to their high number of trainable parameters. In this paper, we explore whether binarization techniques are effective at improving standard diversification methods, and whether a simple additional trick, weighting the training examples, allows better results to be obtained. Experimental results for three selected classification problems show that binarization lets standard direct diversification methods (bagging, in particular) achieve better results, with even more significant performance improvements when the training samples are pre-emphasized. Some research avenues that this finding opens are mentioned in the conclusions. This work has been partly supported by research grants CASI-CAM-CM (S2013/ICE-2845, DGUI-CM and FEDER) and Macro-ADOBE (TEC2015-67719-P, MINECO)
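    The weighting trick can be sketched as follows: train once, derive per-sample weights that stress the examples the model finds hard, and retrain with those weights. The alpha-mixed emphasis function and the logistic-regression base learner below are illustrative assumptions, not the paper's exact pre-emphasis scheme.

```python
# Sketch of pre-emphasis: weight each training sample by how hard it is.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
# Probability assigned to the true class of each training sample.
proba = clf.predict_proba(X_tr)[np.arange(len(y_tr)), y_tr]
alpha = 0.3                                        # mix of uniform vs. error-driven emphasis
weights = alpha + (1.0 - alpha) * (1.0 - proba)    # hard samples get larger weights

emphasized = LogisticRegression(max_iter=5000).fit(X_tr, y_tr, sample_weight=weights)
print("plain:     ", clf.score(X_te, y_te))
print("emphasized:", emphasized.score(X_te, y_te))
```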

    Designing mixture of deep experts

    Mixture of Experts (MoE) is a classical ensemble architecture in which each member is specialised in a given part of the input space, its area of expertise. Working in this manner, we aim to specialise the experts on smaller problems, solving the original problem through a divide-and-conquer approach. The goal of our research is to first reproduce the work of Collobert et al. [1] (2002) and then extend it by using neural networks as experts on different datasets. Specialised representations are learned over different aspects of the problem, and the outputs of the different members are merged according to their specific expertise. This expertise can itself be learned by a network acting as a gating function. The MoE architecture is composed of N expert networks combined via a gating network that partitions the input space accordingly: a divide-and-conquer strategy supervised by the gate. Using a specialised cost function, the experts specialise in their sub-spaces; exploiting the discriminative power of the experts works much better than simply clustering. The gating network has to learn how to assign examples to the different specialists. Such models show promise for building larger networks that remain cheap to compute at test time and more parallelizable at training time. We were able to reproduce the work of the authors and implemented a multi-class gater to classify images. Neural networks perform best with large amounts of data, yet some of our experiments require dividing the dataset and training multiple networks. We observe that, in this data-deprived condition, our MoEs are almost on par with, and compete with, ensembles trained on the complete data. Keywords: Machine Learning, Multi-Layer Perceptrons, Mixture of Experts, Support Vector Machines, Divide and Conquer, Stochastic Gradient Descent, Optimization
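    A minimal PyTorch sketch of this architecture, assuming dense (non-sparse) gating and illustrative layer sizes, might look like the following; it is not the authors' implementation.

```python
# Sketch: N expert MLPs combined by a softmax gating network, trained end to end.
import torch
import torch.nn as nn

class MoE(nn.Module):
    def __init__(self, d_in, d_hidden, n_classes, n_experts):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, n_classes))
            for _ in range(n_experts)
        )
        self.gate = nn.Linear(d_in, n_experts)  # learns which expert owns which region

    def forward(self, x):
        g = torch.softmax(self.gate(x), dim=-1)               # (B, E) mixing weights
        outs = torch.stack([e(x) for e in self.experts], -1)  # (B, C, E) expert outputs
        return torch.einsum("bce,be->bc", outs, g)            # gate-weighted combination

model = MoE(d_in=64, d_hidden=32, n_classes=10, n_experts=4)
x, y = torch.randn(8, 64), torch.randint(0, 10, (8,))
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()  # gate and experts specialise jointly through this gradient
```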

    Embedded Feature Ranking for Ensemble MLP Classifiers


    Assessment of Cross-train Machine Learning Techniques for QoT-Estimation in agnostic Optical Networks

    With the evolution of 5G technology, high-definition video, virtual reality, and the Internet of Things (IoT), the demand for high-capacity optical networks has been increasing dramatically. To support this capacity demand, low-margin optical networks are attracting operator interest. To exploit this techno-economic interest, planning tools and models for quality-of-transmission estimation (QoT-E) with higher accuracy are needed. However, given the heterogeneity of state-of-the-art optical networks, it is challenging to develop such an accurate planning tool and low-margin QoT-E models using traditional analytical approaches. Fortunately, data-driven machine-learning (ML) cognition provides a promising path. This paper reports the use of cross-trained ML-based learning methods to predict the QoT of an unestablished lightpath (LP) in an agnostic network, based on data retrieved from the already established LPs of an in-service network. This advance prediction of the QoT of an unestablished LP in an agnostic network is a key enabler not only for the optimal planning of the network but also for automatically deploying LPs with a minimum margin in a reliable manner. The QoT metric of the LPs is the generalized signal-to-noise ratio (GSNR), which includes the effects of both amplified spontaneous emission (ASE) noise and non-linear interference (NLI) accumulation. Real field data is mimicked using the well-tested network simulation tool GNPy. Using the generated synthetic data set, supervised ML techniques such as a wide deep neural network, a deep neural network, a multi-layer perceptron regressor, a boosted tree regressor, a decision tree regressor, and a random forest regressor are applied, demonstrating GSNR prediction for an unestablished LP in an agnostic network with a maximum error of 0.40 dB
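    To make the workflow concrete, the hedged sketch below fits one of the listed regressors (a random forest) to synthetic lightpath features and evaluates GSNR prediction error. The feature set and the toy data generator are mere stand-ins for GNPy output, not the paper's data.

```python
# Sketch: regress GSNR (dB) from per-lightpath features on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
# Illustrative features: span count, total length (km), launch power (dBm),
# number of co-propagating channels.
X = np.column_stack([rng.integers(1, 30, n), rng.uniform(50, 2000, n),
                     rng.uniform(-2, 3, n), rng.integers(1, 80, n)])
gsnr = 35 - 0.004 * X[:, 1] - 0.05 * X[:, 3] + rng.normal(0, 0.3, n)  # toy GSNR in dB

X_tr, X_te, y_tr, y_te = train_test_split(X, gsnr, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("max |error| (dB):", np.abs(pred - y_te).max())
print("MAE (dB):", mean_absolute_error(y_te, pred))
```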

    Diversity in deep learning through auto-encoding

    Designing general deep learners has remained a challenge for decades. The present century has seen the emergence of several new, effective procedures for it. Among them, representational methods merit particular attention because they not only serve to build powerful machines but also extract relevant high-level features from the observations. Expansive denoising auto-encoders are (elements of) one such family of representational deep machines. On the other hand, ensembles are a well-established alternative for obtaining high-performance solutions to empirical, sample-based inference problems; they rely on introducing diversity into a number of different learners. Obviously, this principle can also be applied to deep neural networks, but, surprisingly, very few studies explore the possibility. In this doctoral dissertation, we investigate whether conventional diversification techniques, including binarization for multiclass databases, further improve the performance of expansive denoising auto-encoder based classifiers. Both “Bagging” and “Switching” are used, as well as one-versus-one and error-correcting-output-code binarization schemes, with two basic types of architectures: T, which has a common auto-encoding unit, and G, which also diversifies that representational element. The experimental results confirm that, if binarization is included, combining diversity and depth offers significant performance advantages, especially with T architectures. To complete the exploration of improving denoising auto-encoding based classifiers, the application of sufficiently flexible pre-emphasis functions is also analyzed. This kind of pre-emphasis provides performance advantages by itself, but the advantages become very important when pre-emphasis is combined with diversification, especially if different emphasis parameters are applied to different dichotomies in multiclass problems. A conventional elastic distortion allows record results. These results are not only relevant in themselves but also open a series of promising research avenues, which are presented in the final chapter of this thesis. Official Doctoral Programme in Multimedia and Communications. Committee: Chair, Antonio Artés Rodríguez; Secretary, Sancho Salcedo Sanz; Member, Pedro Antonio Gutiérrez Peña
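    As a hedged illustration of the “Switching” diversification mentioned above, the sketch below trains each ensemble member on labels with a small random fraction reassigned and combines members by majority vote; the 10% switch rate and the MLP base learner are illustrative assumptions, not the thesis's exact configuration.

```python
# Sketch of label switching: each member sees labels with a fraction reassigned.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
classes = np.unique(y_tr)
rng = np.random.default_rng(0)

members, rate = [], 0.1  # 10% switch rate: an illustrative choice
for seed in range(5):
    y_sw = y_tr.copy()
    flip = rng.random(len(y_sw)) < rate
    y_sw[flip] = rng.choice(classes, flip.sum())  # reassign switched labels at random
    members.append(MLPClassifier(hidden_layer_sizes=(64,), max_iter=200,
                                 random_state=seed).fit(X_tr, y_sw))

votes = np.stack([m.predict(X_te) for m in members])          # (members, samples)
majority = np.array([np.bincount(col).argmax() for col in votes.T])
print("ensemble accuracy:", (majority == y_te).mean())
```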