5,439 research outputs found

    Hyperparameter optimization with approximate gradient

    Most models in machine learning contain at least one hyperparameter to control model complexity. Choosing an appropriate set of hyperparameters is both crucial for model accuracy and computationally challenging. In this work we propose an algorithm for the optimization of continuous hyperparameters using inexact gradient information. An advantage of this method is that hyperparameters can be updated before model parameters have fully converged. We also give sufficient conditions for the global convergence of this method, based on regularity conditions of the involved functions and summability of errors. Finally, we validate the empirical performance of this method on the estimation of regularization constants of L2-regularized logistic regression and kernel ridge regression. Empirical benchmarks indicate that our approach is highly competitive with state-of-the-art methods. Comment: Proceedings of the International Conference on Machine Learning (ICML)
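
The idea of updating a hyperparameter from an inexactly solved model can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a one-dimensional ridge regression, a validation loss as the outer objective, and a geometrically decreasing inner tolerance, and every function name in it is hypothetical. The inner problem is solved only up to a tolerance, and the resulting approximate weight is plugged into the implicit-function-theorem formula for the hypergradient.

```python
def ridge_gradient(w, xs, ys, lam):
    """Gradient in w of 0.5 * sum((x*w - y)^2) + 0.5 * lam * w^2 (scalar w)."""
    return sum(x * (x * w - y) for x, y in zip(xs, ys)) + lam * w


def inner_solve(xs, ys, lam, tol, w=0.0):
    """Run gradient descent only until |gradient| <= tol (an inexact solve)."""
    step = 0.5 / (sum(x * x for x in xs) + lam)
    grad = ridge_gradient(w, xs, ys, lam)
    while abs(grad) > tol:
        w -= step * grad
        grad = ridge_gradient(w, xs, ys, lam)
    return w


def approx_hypergradient(train, val, lam, tol):
    """Approximate d(validation loss)/d(lam) from an inexact inner solution."""
    xs, ys = train
    w = inner_solve(xs, ys, lam, tol)
    hessian = sum(x * x for x in xs) + lam  # second derivative of the training loss in w
    dw_dlam = -w / hessian                  # implicit function theorem
    val_grad = sum(x * (x * w - y) for x, y in zip(*val))
    return val_grad * dw_dlam


def tune(train, val, lam=1.0, lr=5.0, iters=20):
    """Gradient descent on lam; the tolerance schedule is an assumption."""
    for k in range(iters):
        tol = 0.1 * 0.5 ** k  # errors shrink geometrically, hence are summable
        lam = max(lam - lr * approx_hypergradient(train, val, lam, tol), 1e-8)
    return lam
```

In this scalar setting the Hessian is a single number; with vector-valued parameters the same formula would require solving a linear system, which could itself be done approximately.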

    On the Consistency of Ordinal Regression Methods

    Many of the ordinal regression models that have been proposed in the literature can be seen as methods that minimize a convex surrogate of the zero-one, absolute, or squared loss functions. A key property for studying the statistical implications of such approximations is Fisher consistency. Fisher consistency is a desirable property for surrogate loss functions and implies that in the population setting, i.e., if the probability distribution that generates the data were available, optimization of the surrogate would yield the best possible model. In this paper we characterize the Fisher consistency of a rich family of surrogate loss functions used in the context of ordinal regression, including support vector ordinal regression, ORBoosting and least absolute deviation. We show that, for a family of surrogate loss functions that subsumes support vector ordinal regression and ORBoosting, consistency can be fully characterized by the derivative of a real-valued function at zero, as happens for convex margin-based surrogates in binary classification. We also derive excess risk bounds for a surrogate of the absolute error that generalize existing risk bounds for binary classification. Finally, our analysis suggests a novel surrogate of the squared error loss. We compare this novel surrogate with competing approaches on 9 different datasets. Our method proves highly competitive in practice, outperforming the least squares loss on 7 out of 9 datasets. Comment: Journal of Machine Learning Research 18 (2017)
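
To make the notion of a convex surrogate for ordinal losses concrete, the sketch below contrasts the absolute error with an all-threshold hinge surrogate, a construction known from the ordinal regression literature. Whether it matches the exact formulation analyzed in the paper is an assumption, and the function names, the label range 1..k, and the threshold convention are all illustrative.

```python
def hinge(t):
    """Convex margin loss: zero once the margin t reaches 1."""
    return max(0.0, 1.0 - t)


def all_threshold_loss(score, thresholds, label):
    """Sum of hinge penalties over all k-1 thresholds.

    `thresholds` is sorted ascending and `label` lies in 1..k. The score
    should sit above every threshold below the label and below the rest.
    """
    loss = 0.0
    for i, theta in enumerate(thresholds, start=1):
        if i < label:
            loss += hinge(score - theta)  # score should exceed theta
        else:
            loss += hinge(theta - score)  # score should stay below theta
    return loss


def predict(score, thresholds):
    """Predicted label: one plus the number of thresholds the score exceeds."""
    return 1 + sum(1 for theta in thresholds if score > theta)


def absolute_error(predicted, label):
    """The non-convex target loss that the surrogate stands in for."""
    return abs(predicted - label)
```

The surrogate is convex in the score and the thresholds, whereas the absolute error of the predicted label is piecewise constant, which is why the former can be minimized with standard convex solvers.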

    Design of VR app applied to cognitive training

    The main objective of this project is the design of a virtual reality application to improve the treatment of patients with mild cognitive impairment, and to study the possible advantages this technology can offer in this field. Virtual reality was chosen because it increases the sense of immersion compared with current technologies. Virtual reality is already being used in this type of treatment and is achieving very good results with patients. Moreover, this visual immersion technique is expected to help improve patients' ability to cope with new problems, such as getting started with virtual reality itself, a fundamental issue that supports the improvement of patients in the early stages of the disease. The application consists of a virtual supermarket environment in which the patient can perform various tests. It has several levels of varying complexity, always preceded by a tutorial. The application was built in two phases. First, the script was written in collaboration with the Alzheimer's unit of the Hospital Clínic; the levels of the application were defined at this stage. Next, the application was implemented in collaboration with the company Vysion 360. To be used by the Alzheimer's unit of the Hospital Clínic, the application had to meet several criteria. First, there must be enough difficulty levels to support long-term treatment. Second, to create a good immersive experience, the environment must be as realistic as possible. Finally, a local database was created to store the information from all sessions, which is later used to analyze the patients' evolution. With this application, results for patients with mild cognitive impairment are expected to improve on those of earlier techniques, especially thanks to the highly immersive experience achieved with virtual reality, which helps patients concentrate during treatment.

    Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization

    Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures. Yet, despite their practical success, support for nonsmooth objectives is still lacking, making them unsuitable for many problems of interest in machine learning, such as the Lasso, group Lasso or empirical risk minimization with convex constraints. In this work, we propose and analyze ProxASAGA, a fully asynchronous sparse method inspired by SAGA, a variance-reduced incremental gradient algorithm. The proposed method is easy to implement and significantly outperforms the state of the art on several nonsmooth, large-scale problems. We prove that our method achieves a theoretical linear speedup with respect to the sequential version under assumptions on the sparsity of gradients and block-separability of the proximal term. Empirical benchmarks on a multi-core architecture illustrate practical speedups of up to 12x on a 20-core machine. Comment: Appears in Advances in Neural Information Processing Systems 30 (NIPS 2017), 28 pages
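
Nonsmooth terms such as the Lasso penalty are typically handled through a proximal operator rather than a gradient. The sketch below shows only that sequential building block, a single proximal gradient step with soft-thresholding; it is not ProxASAGA, it involves no asynchrony or variance reduction, and the function names are hypothetical.

```python
def soft_threshold(v, threshold):
    """Proximal operator of threshold * ||.||_1, applied elementwise."""
    return [max(abs(x) - threshold, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def proximal_gradient_step(w, grad, step, l1_penalty):
    """Gradient step on the smooth part, then the prox of the L1 term."""
    moved = [wi - step * gi for wi, gi in zip(w, grad)]
    return soft_threshold(moved, step * l1_penalty)
```

Because the L1 norm is separable across coordinates, the operator acts on each coordinate independently, which is the kind of structure the block-separability assumption in the abstract refers to.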

    Could the ease of doing business be considered a predictor of countries' socio-economic wealth? An empirical analysis using PLS-SEM

    The wealth of nations differs significantly due to a range of factors. One of the reasons identified by previous studies is the level of entrepreneurship promotion by governments. This aspect has scarcely been studied empirically to date. Therefore, this paper sheds some light on the question by building a construct out of ten Ease of Doing Business Index (EDBI) measures developed by the World Bank and relating it to a construct shaped by two measures of socio-economic wealth (SEW), namely gross domestic product and the Human Development Index. To this end, we conduct a structural equation model analysis using the partial least squares (PLS-SEM) method with a 2018 database comprising secondary data from 190 countries. As the main contribution of this study, the results show that good performance in the EDBI ranking predicts good performance in the SEW ranking. Additionally, this study pioneers the use of these rankings to build composite constructs (latent variables) and relate them. For these reasons, our findings are useful for both academia and governments responsible for promoting entrepreneurship, as the latter is identified as the key enabler of economic development.

    Learning from Noisy Label Distributions

    In this paper, we consider a novel machine learning problem: learning a classifier from noisy label distributions. In this problem, each instance with a feature vector belongs to at least one group. Then, instead of the true label of each instance, we observe the label distribution of the instances associated with a group, where the label distribution is distorted by unknown noise. Our goals are to (1) estimate the true label of each instance, and (2) learn a classifier that predicts the true label of a new instance. We propose a probabilistic model that treats the true label distributions of groups and the parameters that represent the noise as hidden variables. The model can be learned with a variational Bayesian method. In numerical experiments, we show that the proposed model outperforms existing methods in estimating the true labels of instances. Comment: Accepted in ICANN201

    HRF estimation improves sensitivity of fMRI encoding and decoding models

    Extracting activation patterns from functional Magnetic Resonance Imaging (fMRI) datasets remains challenging in rapid-event designs due to the inherent delay of the blood oxygen level-dependent (BOLD) signal. The general linear model (GLM) estimates the activation from a design matrix and a fixed hemodynamic response function (HRF). However, the HRF is known to vary substantially between subjects and brain regions. In this paper, we propose a model for jointly estimating the HRF and the activation patterns via a low-rank representation of task effects. This model is based on the linearity assumption behind the GLM and can be computed using standard gradient-based solvers. We use the activation patterns computed by our model as input data for encoding and decoding studies and report performance improvement in both settings. Comment: 3rd International Workshop on Pattern Recognition in NeuroImaging (2013)

    Success Factors in Peer-to-Business (P2B) Crowdlending: A Predictive Approach

    Peer-to-Business (P2B) crowdlending is gaining importance among companies seeking funding. However, not all projects get the same take-up by the crowd. Thus, this study aims to determine the key factors that drive non-professional investors to choose a given loan in an online environment. To this end, we analyzed 243 crowdlending campaigns on the October.eu platform. We obtained a series of variables from the analyzed loans and examined them using logistic regression. Results indicate that loan amount, loan term and overall credit rating are the key predictors of P2B crowdlending success among non-professional lenders. These findings may be useful for predicting whether the crowd will subscribe to a loan request or not. This information would help businesses to modify specific loan characteristics (if possible) to make their loans more attractive, or could even lead companies to consider a different financial option. It could also help platforms select and adapt project parameters to secure their success.

    A Real-Time Remote IDS Testbed for Connected Vehicles

    Connected vehicles are becoming commonplace. A constant connection between vehicles and a central server enables new features and services. This added connectivity raises the likelihood of exposure to attackers and risks unauthorized access. A possible countermeasure is the use of intrusion detection systems (IDS), which aim to detect these intrusions during or after their occurrence. The problem with IDS is the large variety of possible approaches with no sensible way to compare them. Our contribution to this problem comprises the conceptualization and implementation of a testbed for an automotive real-world scenario. This amounts to a server-side IDS detecting intrusions into vehicles remotely. To verify the validity of our approach, we evaluate the testbed from multiple perspectives, including its fitness for purpose and the quality of the data it generates. Our evaluation shows that the testbed makes the effective assessment of various IDS possible. It solves multiple problems of existing approaches, including class imbalance. Additionally, it enables reproducibility and the generation of data of varying detection difficulty. This allows for comprehensive evaluation of real-time, remote IDS. Comment: Peer-reviewed version accepted for publication in the proceedings of the 34th ACM/SIGAPP Symposium On Applied Computing (SAC'19)

    Relationship Amongst Technology Use, Work Overload, and Psychological Detachment from Work

    Permanent connection to the work world as a result of new technologies raises the possibility of workday extensions and excessive workloads. The present study addresses the relationship between technology and psychological detachment from work resulting from work overload. Participants were 313 professionals from the health sector who responded to three instruments used in similar studies. Regression and dependence analyses were conducted through PLS-SEM, and the significance of factor loadings, path coefficients and variances was examined through the bootstrapping method. Results of the study corroborate a negative effect of technology use on psychological detachment from work and a positive correlation between technology and work overload. Additionally, there is a significant indirect effect of technology on psychological detachment from work as a result of work overload. Findings extend the literature related to the stressor-detachment model, and support the idea that workers who are often connected to their jobs by technological tools are less likely to reach adequate psychological detachment levels. Implications for the academic community and practitioners are discussed.