
    Deep diffusion autoencoders

    International Joint Conference on Neural Networks, held in Budapest in 2019. © 2019 IEEE.
    Extending work by Mishne et al., we propose Deep Diffusion Autoencoders (DDA) that learn an encoder-decoder map using a composite loss function that simultaneously minimizes the reconstruction error at the output layer and the distance to a Diffusion Map embedding in the bottleneck layer. These DDA are thus able to reconstruct new patterns from points in the embedding space in a way that preserves the geometry of the sample and, as a consequence, our experiments show that they may provide a powerful tool for data augmentation.
    With partial support from Spain's grants TIN2016-76406-P and S2013/ICE-2845 CASI-CAM-CM. Work also supported by the project FACIL (Ayudas Fundación BBVA a Equipos de Investigación Científica 2016) and the UAM–ADIC Chair for Data Science and Machine Learning. We also gratefully acknowledge the use of the facilities of Centro de Computación Científica (CCC) at UAM.
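    The composite objective described above pairs an ordinary reconstruction loss with a penalty tying the bottleneck codes to precomputed Diffusion Map coordinates. A minimal PyTorch sketch, assuming a small MLP autoencoder, precomputed embedding targets dm_targets and a weighting factor alpha (none of which are specified by the abstract):

import torch
import torch.nn as nn

class DDA(nn.Module):
    def __init__(self, d_in, d_embed):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(),
                                     nn.Linear(64, d_embed))
        self.decoder = nn.Sequential(nn.Linear(d_embed, 64), nn.ReLU(),
                                     nn.Linear(64, d_in))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def dda_loss(x, z, x_hat, dm_targets, alpha=1.0):
    # Reconstruction error at the output plus distance of the bottleneck
    # codes to the Diffusion Map embedding of the same points.
    mse = nn.functional.mse_loss
    return mse(x_hat, x) + alpha * mse(z, dm_targets)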

    Enforcing Group Structure through the Group Fused Lasso

    We introduce the Group Total Variation (GTV) regularizer, a modification of Total Variation that uses the ℓ2,1 norm instead of the ℓ1 norm in order to deal with multidimensional features. When used as the only regularizer, GTV can be applied jointly with iterative convex optimization algorithms such as FISTA. This requires computing its proximal operator, which we derive using a dual formulation. GTV can also be combined with a Group Lasso (GL) regularizer, leading to what we call the Group Fused Lasso (GFL), whose proximal operator can then be computed by combining the GTV and GL proximals through the proximal Dykstra algorithm. We illustrate how to apply GFL to strongly structured but ill-posed regression problems, as well as the use of GTV to denoise colour images.
    Acknowledgements: with partial support from Spain's grant TIN2010-21575-C02-01 and the UAM–ADIC Chair for Machine Learning.
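    Two of the building blocks named above have compact implementations: the proximal operator of the ℓ2,1 (Group Lasso) penalty is row-wise block soft-thresholding, and the proximal Dykstra iteration combines two proximal operators into that of their sum. A minimal NumPy sketch, leaving the GTV proximal itself (derived in the paper via a dual formulation) as a placeholder:

import numpy as np

def prox_group_lasso(X, lam):
    # Row-wise block soft-thresholding: each row of X is one feature group.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * X

def prox_dykstra(x, prox_f, prox_g, n_iter=50):
    # Approximates the proximal operator of f + g from the proximal
    # operators of f and g (e.g. prox_f = GTV prox, prox_g = GL prox).
    p = np.zeros_like(x)
    q = np.zeros_like(x)
    for _ in range(n_iter):
        y = prox_f(x + p)
        p = x + p - y
        x = prox_g(y + q)
        q = y + q - x
    return x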

    Sparse methods for wind energy prediction

    © 2012 IEEE. International Joint Conference on Neural Networks (IJCNN), held in 2012 in Brisbane, QLD, Australia.
    In this work we analyze and apply to the prediction of wind energy some of the best known regularized linear regression algorithms, such as Ordinary Least Squares, Ridge Regression and, particularly, Lasso, Group Lasso and Elastic-Net, which also seek to impose a certain degree of sparseness on the final models. To achieve this goal, some of them introduce a non-differentiable regularization term that requires special techniques to solve the corresponding optimization problem that will yield the final model. Proximal algorithms have recently been introduced precisely to handle this kind of optimization problem, and so we briefly review how to apply them in regularized linear regression. Moreover, the proximal method FISTA is used when applying the non-differentiable models to the problem of predicting the global wind energy production in Spain, using as inputs numerical weather forecasts for the entire Iberian peninsula. Our results show how some of the studied sparsity-inducing models are able to produce a coherent selection of features, attaining similar performance to a baseline model using expert information while making use of fewer data features.
    The authors acknowledge partial support from grant TIN2010-21575-C02-01 of the TIN Subprogram from Spain's MICINN and from the Cátedra UAM-IIC en Modelado y Predicción. The first author is also supported by the FPU-MEC grant AP2008-00167. We also thank Red Eléctrica de España, Spain's TSO, for providing historic wind energy data.
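    As an illustration of the kind of sparsity-inducing models compared above, the following sketch fits Lasso and Elastic-Net with scikit-learn; the synthetic X and y merely stand in for the NWP feature matrix and the wind energy production series, which are not available here:

import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 500))                 # stand-in for NWP features
y = X[:, :10] @ rng.normal(size=10) + 0.1 * rng.normal(size=1000)

lasso = Lasso(alpha=0.05).fit(X, y)
enet = ElasticNet(alpha=0.05, l1_ratio=0.5).fit(X, y)
print("non-zero Lasso coefficients:", np.sum(lasso.coef_ != 0))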

    Sparse Linear Wind Farm Energy Forecast

    In this work we apply sparse linear regression methods to forecast wind farm energy production using numerical weather prediction (NWP) features over several pressure levels, a problem where pattern dimension can become very large. We place sparse regression in the context of proximal optimization, which we briefly review, and we show how sparse methods outperform other models while at the same time shedding light on the most relevant NWP features and on their predictive structure.
    With partial support from grant TIN2010-21575-C02-01 of Spain's Ministerio de Economía y Competitividad and the UAM–ADIC Chair for Machine Learning in Modelling and Prediction. The first author is supported by the FPU-MEC grant AP2008-00167. We thank our colleague Álvaro Barbero for the software used in this work.
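    The proximal optimization setting mentioned above is typified by FISTA applied to ℓ1-regularized least squares: a gradient step on the smooth squared error, a soft-thresholding proximal step, and a momentum extrapolation. A minimal NumPy sketch, with the step size taken from the spectral norm of the design matrix (the data and regularization strength are assumptions):

import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(A, b, lam, n_iter=200):
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)           # gradient of the smooth part
        x_new = soft_threshold(y - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x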

    Faster SVM training via conjugate SMO

    We propose an improved version of the SMO algorithm for training classification and regression SVMs, based on a conjugate descent procedure. This new approach only involves a modest increase in the computational cost of each iteration but, in turn, usually results in a substantial decrease in the number of iterations required to converge to a given precision. Besides, we prove convergence of the iterates of this new Conjugate SMO, as well as a linear rate when the kernel matrix is positive definite. We have implemented Conjugate SMO within the LIBSVM library and show experimentally that it is faster for many hyper-parameter configurations, often being a better option than second order SMO when performing a grid search for SVM tuning.
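    The grid-search scenario where a faster solver pays off looks as follows; this sketch uses the stock LIBSVM solver exposed through scikit-learn's SVC, not the Conjugate SMO implementation described above, and the grid values are arbitrary:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
# Every (C, gamma) pair triggers a full SMO training run per CV fold,
# which is why a faster SMO variant matters here.
grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5).fit(X, y)
print(search.best_params_, search.best_score_)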

    Structure Learning in Deep Multi-Task Models

    Multi-Task Learning (MTL) aims at improving the learning process by solving different tasks simultaneously. Two general approaches for neural MTL are hard and soft information sharing during training. Here we propose two new approaches to neural MTL. The first one uses a common model to enforce a soft sharing learning of the tasks considered. The second one adds a graph Laplacian term to a hard sharing neural model with the goal of detecting existing but a priori unknown task relations. We test both approaches on real and synthetic datasets and show that either one can improve on other MTL neural models.
    The authors acknowledge support from the European Regional Development Fund and the Spanish State Research Agency of the Ministry of Economy, Industry, and Competitiveness under the project PID2019-106827GB-I00. They also thank the UAM–ADIC Chair for Data Science and Machine Learning and gratefully acknowledge the use of the facilities of Centro de Computación Científica (CCC) at UAM.
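    A sketch of the second approach helps fix ideas: a shared trunk feeds per-task heads, and a graph Laplacian penalty over the head weights pulls related tasks together. The architecture, the task adjacency matrix A and the penalty weight mu below are assumptions, not the paper's configuration:

import torch
import torch.nn as nn

n_tasks, d_in, d_hidden = 3, 10, 32
trunk = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())     # hard-shared layers
heads = nn.ModuleList(nn.Linear(d_hidden, 1) for _ in range(n_tasks))

A = torch.ones(n_tasks, n_tasks) - torch.eye(n_tasks)   # assumed task graph
L = torch.diag(A.sum(dim=1)) - A                         # graph Laplacian

def laplacian_penalty():
    W = torch.stack([h.weight.flatten() for h in heads])  # one row per task
    # trace(W^T L W) = (1/2) * sum_{t,t'} A[t,t'] * ||w_t - w_t'||^2
    return torch.trace(W.T @ L @ W)

def mtl_loss(x, targets, mu=1e-3):
    shared = trunk(x)
    task_losses = [nn.functional.mse_loss(heads[t](shared).squeeze(-1), targets[t])
                   for t in range(n_tasks)]
    return sum(task_losses) + mu * laplacian_penalty()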

    Auto-adaptive multi-scale Laplacian Pyramids for modeling non-uniform data

    Kernel-based techniques have become a common way of describing the local and global relationships of data samples generated in real-world processes. In this research we focus on a multi-scale kernel-based technique named Auto-adaptive Laplacian Pyramids (ALP). This method can be useful for function approximation and interpolation. ALP is an extension of the standard Laplacian Pyramids model that incorporates a modified leave-one-out cross-validation procedure, which makes the method stable and automatic in terms of parameter selection without extra cost. This paper introduces a new algorithm that extends ALP to fit datasets that are non-uniformly distributed. In particular, the optimal stopping criterion becomes point-dependent with respect to the local noise level and the sampling rate. Experimental results over real datasets highlight the advantages of the proposed multi-scale technique for modeling and learning complex, high dimensional data.
    The authors wish to thank Prof. Ronald R. Coifman for helpful remarks. They also gratefully acknowledge the use of the facilities of Centro de Computación Científica (CCC) at Universidad Autónoma de Madrid. Funding: this work was supported by Spanish grants of the Ministerio de Ciencia, Innovación y Universidades [grant numbers TIN2013-42351-P, TIN2015-70308-REDT, TIN2016-76406-P]; project CASI-CAM-CM supported by Madri+d [grant number S2013/ICE-2845]; project FACIL supported by Fundación BBVA (2016); and the UAM–ADIC Chair for Data Science and Machine Learning.
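    The underlying Laplacian Pyramids scheme is easy to sketch: each level fits the residual left by the previous levels with a Gaussian kernel smoother of halved bandwidth. The NumPy sketch below shows only this basic multi-scale loop, not the auto-adaptive, point-dependent stopping rule that the paper introduces:

import numpy as np

def gaussian_smoother(X_train, X_eval, values, sigma):
    # Normalized Gaussian kernel smoother of `values` evaluated at X_eval.
    d2 = ((X_eval[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    return (K @ values) / K.sum(axis=1)

def laplacian_pyramid(X, f, sigma0=1.0, n_levels=5):
    approx = np.zeros_like(f)
    sigma = sigma0
    for _ in range(n_levels):
        residual = f - approx
        approx = approx + gaussian_smoother(X, X, residual, sigma)
        sigma /= 2.0                      # finer scale at every level
    return approx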

    Tecnología, computación e inteligencia

    We continue this section of the journal, devoted to celebrated lectures delivered at the Universidad Autónoma de Madrid throughout its history, either as inaugural lectures of an academic year or as lectures given by Doctores Honoris Causa appointed by this university upon their investiture. These are, therefore, lectures with important content related to science and the progress of knowledge, delivered by distinguished figures from the academic, scientific or social world. On this occasion we publish the inaugural lecture of the Universidad Autónoma de Madrid for the 2006-2007 academic year, delivered by José R. Dorronsoro, Professor of Computer Engineering at the UAM.

    Deep least squares Fisher discriminant analysis

    © 2020 IEEE. While being one of the first and most elegant tools for dimensionality reduction, Fisher linear discriminant analysis (FLDA) is not currently considered among the top methods for feature extraction or classification. In this paper we review two recent approaches to FLDA, namely least squares Fisher discriminant analysis (LSFDA) and regularized kernel FDA (RKFDA), and propose deep FDA (DFDA), a straightforward nonlinear extension of LSFDA that takes advantage of recent advances in deep neural networks. We compare the performance of RKFDA and DFDA on a large number of two-class and multiclass problems, many of them involving class-imbalanced data sets and some having quite large sample sizes; we use for this the areas under the receiver operating characteristic (ROC) curves of the classifiers considered. As we shall see, the classification performance of both methods is often very similar and particularly good on imbalanced problems, but building DFDA models is considerably faster than doing so for RKFDA, particularly in problems with quite large sample sizes.
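    A two-class sketch conveys the idea behind a deep, least squares take on FDA: replace the linear map of least squares FDA with a small network and fit the classical LSFDA targets (N/N1 for one class, -N/N2 for the other) by minimizing squared error. The network and training step below are assumptions; the paper's exact architecture and multiclass target encoding are not reproduced:

import torch
import torch.nn as nn

def lsfda_targets(y):
    # y is a 0/1 tensor of class labels; these targets make least squares
    # regression recover the classical two-class Fisher direction.
    n, n1 = len(y), y.sum()
    n0 = n - n1
    return torch.where(y == 1, n / n1, -n / n0).float()

net = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 1))

def dfda_step(x, y, optimizer):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(net(x).squeeze(-1), lsfda_targets(y))
    loss.backward()
    optimizer.step()
    return loss.item()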

    Robust losses in deep regression

    The noise distribution of a given regression problem is not known in advance and, since the assumed noise model is reflected in the loss to be used, the choice of loss should not be fixed beforehand either. In this work we address this issue by examining seven regression losses, some of them proposed in the field of robust linear regression, over twelve problems. While in our experiments some losses appear better suited to most of the problems, we feel it is more appropriate to conclude that the choice of a “best loss” is problem dependent and should perhaps be handled similarly to what is done in hyperparameter selection.
    The authors acknowledge financial support from the European Regional Development Fund and the Spanish State Research Agency of the Ministry of Economy, Industry, and Competitiveness under the project PID2019-106827GB-I00. They also thank the support of the UAM–ADIC Chair for Data Science and Machine Learning and gratefully acknowledge the use of the facilities of Centro de Computación Científica (CCC) at UAM.
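    For concreteness, a few robust regression losses of the kind compared above are sketched below; the exact set of seven losses examined in the paper is not reproduced, and these are standard textbook definitions mapping residuals to per-sample penalties:

import torch

def l1_loss(r):
    # Absolute loss: linear growth, less sensitive to outliers than MSE.
    return r.abs()

def huber_loss(r, delta=1.0):
    # Quadratic near zero, linear beyond delta.
    quad = 0.5 * r ** 2
    lin = delta * (r.abs() - 0.5 * delta)
    return torch.where(r.abs() <= delta, quad, lin)

def tukey_biweight(r, c=4.685):
    # Bounded loss: saturates at c^2 / 6 for large residuals.
    inside = 1.0 - (1.0 - (r / c) ** 2) ** 3
    return torch.where(r.abs() <= c,
                       (c ** 2 / 6.0) * inside,
                       torch.full_like(r, c ** 2 / 6.0))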