A regression model with a hidden logistic process for signal parametrization
A new approach for signal parametrization, which consists of a specific
regression model incorporating a discrete hidden logistic process, is proposed.
The model parameters are estimated by the maximum likelihood method performed
by a dedicated Expectation Maximization (EM) algorithm. The parameters of the
hidden logistic process, in the inner loop of the EM algorithm, are estimated
using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. An
experimental study using simulated and real data reveals good performances of
the proposed approach.
Comment: In Proceedings of the XVIIth European Symposium on Artificial Neural
Networks, Computational Intelligence and Machine Learning (ESANN), pages
503-508, 2009, Bruges, Belgium.
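The regime-switching mechanism described above can be sketched in a few lines: a softmax (multinomial logistic) function of time gates K polynomial regimes, so steep gating weights give an abrupt change and shallow ones a smooth transition. The parameter values below (`w`, `beta`) are illustrative, not taken from the paper, and the sketch covers only the generative mean, not the EM/IRLS estimation.

```python
import numpy as np

def logistic_proportions(t, w):
    """Softmax gating weights pi_k(t); w has shape (K, 2): intercept, slope."""
    logits = w[:, 0][None, :] + np.outer(t, w[:, 1])  # (T, K)
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def regression_mean(t, w, beta):
    """Mixture mean: sum_k pi_k(t) * polynomial_k(t)."""
    pi = logistic_proportions(t, w)                   # (T, K)
    X = np.vander(t, beta.shape[1], increasing=True)  # (T, degree+1)
    regimes = X @ beta.T                              # each regime's polynomial
    return (pi * regimes).sum(axis=1)

t = np.linspace(0.0, 1.0, 200)
w = np.array([[20.0, -40.0], [-20.0, 40.0]])  # sharp switch near t = 0.5
beta = np.array([[0.0, 1.0], [2.0, -1.0]])    # two linear regimes
mu = regression_mean(t, w, beta)              # regime 1 early, regime 2 late
```

Shrinking the slope magnitudes in `w` makes the transition between the two regimes smooth rather than abrupt.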
Time series modeling by a regression approach based on a latent process
Time series are used in many domains including finance, engineering,
economics and bioinformatics generally to represent the change of a measurement
over time. Modeling techniques may then be used to give a synthetic
representation of such data. A new approach for time series modeling is
proposed in this paper. It consists of a regression model incorporating a
discrete hidden logistic process allowing for activating smoothly or abruptly
different polynomial regression models. The model parameters are estimated by
the maximum likelihood method performed by a dedicated Expectation Maximization
(EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative
Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process
parameters. To evaluate the proposed approach, an experimental study on
simulated data and real world data was performed using two alternative
approaches: a heteroskedastic piecewise regression model using a global
optimization algorithm based on dynamic programming, and a Hidden Markov
Regression Model whose parameters are estimated by the Baum-Welch algorithm.
Finally, in the context of the remote monitoring of components of the French
railway infrastructure, and more particularly the switch mechanism, the
proposed approach has been applied to modeling and classifying time series
representing the condition measurements acquired during switch operations.
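Since the M step relies on multi-class IRLS, a minimal binary IRLS sketch may help fix ideas. This is the standard Newton iteration for logistic regression, not the paper's multi-class implementation, and the data below is synthetic.

```python
import numpy as np

def irls_logistic(X, y, n_iter=25):
    """Binary logistic regression fit by IRLS (Newton's method)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))     # current predicted probabilities
        weights = p * (1.0 - p)              # diagonal IRLS weights
        # Newton step: solve (X' W X) delta = X' (y - p)
        H = X.T @ (X * weights[:, None]) + 1e-8 * np.eye(X.shape[1])
        w += np.linalg.solve(H, X.T @ (y - p))
    return w

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
true_w = np.array([-1.0, 2.0])
y = (rng.random(500) < 1.0 / (1.0 + np.exp(-X @ true_w))).astype(float)
w_hat = irls_logistic(X, y)   # should land near (-1, 2)
```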
Supervised learning of a regression model based on latent process. Application to the estimation of fuel cell life time
This paper describes a pattern recognition approach aiming to estimate fuel
cell duration time from electrochemical impedance spectroscopy measurements. It
consists in first extracting features from both the real and imaginary parts
of the impedance spectrum. A parametric model is considered for the real part,
whereas a regression model with latent variables is used for the imaginary
part. Then, a linear regression model using different subsets of the extracted
features is used for the estimation of the fuel cell duration time. The
performance of the proposed approach is evaluated on an experimental data set
to show its feasibility. This could lead to interesting perspectives for a
predictive maintenance policy for fuel cells.
Comment: In Proceedings of the 8th IEEE International Conference on Machine
Learning and Applications (IEEE ICMLA'09), pages 632-637, 2009, Miami Beach,
FL, USA.
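The final estimation step above is plain least squares on feature subsets. A hedged sketch on synthetic data follows; the features and target below are placeholders, not the paper's actual impedance features.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
features = rng.normal(size=(n, 4))   # stand-ins for 4 extracted features
# synthetic "duration" depending on features 0 and 2 only
lifetime = 3.0 * features[:, 0] - 2.0 * features[:, 2] \
           + rng.normal(scale=0.1, size=n)

def fit_subset(F, y, cols):
    """Least-squares fit on a column subset; returns coefficients and RMSE."""
    X = np.column_stack([np.ones(len(y)), F[:, cols]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rmse = np.sqrt(np.mean((y - X @ coef) ** 2))
    return coef, rmse

_, rmse_good = fit_subset(features, lifetime, [0, 2])  # informative subset
_, rmse_poor = fit_subset(features, lifetime, [1, 3])  # uninformative subset
```

Comparing RMSE across subsets mirrors the abstract's idea of evaluating different feature subsets for the duration estimate.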
Gene Hunting with Knockoffs for Hidden Markov Models
Modern scientific studies often require the identification of a subset of
relevant explanatory variables, in the attempt to understand an interesting
phenomenon. Several statistical methods have been developed to automate this
task, but only recently has the framework of model-free knockoffs proposed a
general solution that can perform variable selection under rigorous type-I
error control, without relying on strong modeling assumptions. In this paper,
we extend the methodology of model-free knockoffs to a rich family of problems
where the distribution of the covariates can be described by a hidden Markov
model (HMM). We develop an exact and efficient algorithm to sample knockoff
copies of an HMM. We then argue that combined with the knockoffs selective
framework, they provide a natural and powerful tool for performing principled
inference in genome-wide association studies with guaranteed FDR control.
Finally, we apply our methodology to several datasets aimed at studying
Crohn's disease and several continuous phenotypes, e.g. cholesterol levels.
Comment: 35 pages, 13 figures, 9 tables.
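The FDR guarantee comes from the knockoff filter's data-dependent threshold rather than from the HMM sampler itself. Below is a sketch of that selection step only, applied to synthetic W statistics; the signal/null split is illustrative.

```python
import numpy as np

def knockoff_threshold(W, q=0.1):
    """Knockoff+ threshold: smallest t such that
    (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf

rng = np.random.default_rng(2)
W = np.concatenate([
    rng.normal(loc=8.0, size=30),                                   # signals
    rng.choice([-1.0, 1.0], size=300) * rng.exponential(size=300),  # sign-symmetric nulls
])
t = knockoff_threshold(W, q=0.2)
selected = np.where(W >= t)[0]   # mostly the first 30 (signal) indices
```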
A Selective Overview of Deep Learning
Deep learning has arguably achieved tremendous success in recent years. In
simple words, deep learning uses the composition of many nonlinear functions to
model the complex dependency between input features and labels. While neural
networks have a long history, recent advances have greatly improved their
performance in computer vision, natural language processing, etc. From the
statistical and scientific perspective, it is natural to ask: What is deep
learning? What are the new characteristics of deep learning, compared with
classical methods? What are the theoretical foundations of deep learning? To
answer these questions, we introduce common neural network models (e.g.,
convolutional neural nets, recurrent neural nets, generative adversarial nets)
and training techniques (e.g., stochastic gradient descent, dropout, batch
normalization) from a statistical point of view. Along the way, we highlight
new characteristics of deep learning (including depth and over-parametrization)
and explain their practical and theoretical benefits. We also sample recent
results on theories of deep learning, many of which are only suggestive. While
a complete understanding of deep learning remains elusive, we hope that our
perspectives and discussions serve as a stimulus for new statistical research.
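Two of the training techniques named above, inverted dropout and a plain SGD step, can be sketched in a few lines of numpy. The shapes, rates, and squared loss are illustrative choices, not tied to any model in the text; the all-positive weights are just a trick to keep every ReLU unit active in this toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(a, rate, train=True):
    """Inverted dropout: zero a fraction `rate` of entries and rescale the
    rest by 1/(1-rate), so the expected activation is unchanged."""
    if not train:
        return a
    mask = rng.random(a.shape) >= rate
    return a * mask / (1.0 - rate)

a = np.ones(8)
a_train = dropout(a, rate=0.5)              # some entries zeroed, rest doubled
a_test = dropout(a, rate=0.5, train=False)  # identity at test time

# one SGD step on a squared loss through a tiny ReLU network
x = np.ones((1, 3))
y = np.array([[1.0]])
W1 = 0.5 * np.abs(rng.normal(size=(3, 8)))  # positive -> all ReLU units active
W2 = 0.5 * rng.normal(size=(8, 1))
lr = 0.01
h = np.maximum(0.0, x @ W1)                 # hidden ReLU activations
err = h @ W2 - y
loss_before = float(0.5 * err ** 2)
W2 = W2 - lr * h.T @ err                    # gradient of 0.5*err^2 w.r.t. W2
loss_after = float(0.5 * (h @ W2 - y) ** 2)
```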
A hidden process regression model for functional data description. Application to curve discrimination
A new approach for functional data description is proposed in this paper. It
consists of a regression model with a discrete hidden logistic process which is
adapted for modeling curves with abrupt or smooth regime changes. The model
parameters are estimated in a maximum likelihood framework through a dedicated
Expectation Maximization (EM) algorithm. From the proposed generative model, a
curve discrimination rule is derived using the Maximum A Posteriori rule. The
proposed model is evaluated using simulated curves and real world curves
acquired during railway switch operations, by performing comparisons with the
piecewise regression approach in terms of curve modeling and classification.
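The Maximum A Posteriori rule itself is one line: assign a curve to the class maximizing log-likelihood plus log-prior. In this sketch the per-class log-likelihoods are given numbers standing in for those produced by fitted generative curve models.

```python
import numpy as np

def map_classify(loglik, log_prior):
    """MAP rule. loglik: (n_curves, n_classes) array of log p(y_i | class g);
    log_prior: (n_classes,). Returns the argmax-posterior class per curve."""
    return np.argmax(loglik + log_prior[None, :], axis=1)

# illustrative log-likelihoods for 2 curves under 2 class models
loglik = np.array([[-10.0, -3.0],
                   [-2.0, -9.0]])
log_prior = np.log(np.array([0.5, 0.5]))
labels = map_classify(loglik, log_prior)   # curve 0 -> class 1, curve 1 -> class 0
```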
Fast Second-Order Stochastic Backpropagation for Variational Inference
We propose a second-order (Hessian or Hessian-free) based optimization method
for variational inference inspired by Gaussian backpropagation, and argue that
quasi-Newton optimization can be developed as well. This is accomplished by
generalizing the gradient computation in stochastic backpropagation via a
reparametrization trick with lower complexity. As an illustrative example, we
apply this approach to the problems of Bayesian logistic regression and
variational auto-encoder (VAE). Additionally, we compute bounds on the
estimator variance of intractable expectations for the family of Lipschitz
continuous functions. Our method is practical, scalable, and model-free. We
demonstrate our method on several real-world datasets and provide comparisons
with other stochastic gradient methods to show substantial enhancement in
convergence rates.
Comment: Accepted by NIPS 2015.
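The reparametrization trick underlying the gradient computation can be illustrated on a toy expectation: writing z = mu + sigma*eps with eps ~ N(0,1) turns d/dmu E[f(z)] into E[f'(z)], which is estimable by Monte Carlo. Here f(z) = z^2 is chosen so the exact answer (2*mu) is known; this is a toy function, not the paper's VAE objective.

```python
import numpy as np

rng = np.random.default_rng(3)

def grad_mu_estimate(mu, sigma, f_prime, n=200_000):
    """Pathwise (reparametrized) estimate of d/dmu E[f(mu + sigma*eps)]."""
    eps = rng.normal(size=n)
    z = mu + sigma * eps
    return f_prime(z).mean()     # d/dmu E[f(z)] = E[f'(z)]

# f(z) = z^2: E[f(z)] = mu^2 + sigma^2, so the exact gradient is 2*mu
g = grad_mu_estimate(mu=1.5, sigma=0.7, f_prime=lambda z: 2.0 * z)
```

The same rewrite, applied twice, gives the second-order (Hessian) quantities the method builds on.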
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
We study the properties of common loss surfaces through their Hessian matrix.
In particular, in the context of deep learning, we empirically show that the
spectrum of the Hessian is composed of two parts: (1) the bulk centered near
zero, (2) and outliers away from the bulk. We present numerical evidence and
mathematical justifications to the following conjectures laid out by Sagun et
al. (2016): Fixing data, increasing the number of parameters merely scales the
bulk of the spectrum; fixing the dimension and changing the data (for instance
adding more clusters or making the data less separable) only affects the
outliers. We believe that our observations have striking implications for
non-convex optimization in high dimensions. First, the flatness of such
landscapes (which can be measured by the singularity of the Hessian) implies
that classical notions of basins of attraction may be quite misleading, and
that the discussion of wide/narrow basins may need a new perspective around
over-parametrization and redundancy, which are able to create large connected
components at the bottom of the landscape. Second, the dependence of a small
number of large eigenvalues on the data distribution can be linked to the
spectrum of the covariance matrix of gradients of model outputs. With this in
mind, we may reevaluate the connections within the data-architecture-algorithm
framework of a model, hoping that it would shed light on the geometry of
high-dimensional and non-convex spaces in modern applications. In particular,
we present a case that links the two observations: small and large batch
gradient descent appear to converge to different basins of attraction but we
show that they are in fact connected through their flat region and so belong to
the same basin.
Comment: Minor update for the ICLR 2018 Workshop Track presentation.
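The bulk-plus-outliers picture can be reproduced in a toy setting where the Hessian is available in closed form: for linear least squares the Hessian is X^T X / n, and clustered data produces a bulk of small eigenvalues plus one large outlier per cluster direction. This is a stand-in illustration, not the paper's deep-network experiment.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 2000, 50
means = rng.normal(scale=6.0, size=(3, d))          # 3 cluster centers
X = means[rng.integers(0, 3, size=n)] + rng.normal(size=(n, d))

H = X.T @ X / n                                     # Hessian of 0.5*||Xw - y||^2 / n
eigs = np.sort(np.linalg.eigvalsh(H))[::-1]         # descending eigenvalues
n_outliers = int(np.sum(eigs > 10 * np.median(eigs)))
```

Adding a fourth cluster changes `n_outliers`, while adding parameters (columns of pure noise) only thickens the bulk, mirroring the conjectures quoted above.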
An overview of latent Markov models for longitudinal categorical data
We provide a comprehensive overview of latent Markov (LM) models for the
analysis of longitudinal categorical data. The main assumption behind these
models is that the response variables are conditionally independent given a
latent process which follows a first-order Markov chain. We first illustrate
the basic LM model in which the conditional distribution of each response
variable given the corresponding latent variable and the initial and transition
probabilities of the latent process are unconstrained. For this model we also
illustrate in detail maximum likelihood estimation through the
Expectation-Maximization algorithm, which may be efficiently implemented by
recursions known in the hidden Markov literature. We then illustrate several
constrained versions of the basic LM model, which make the model more
parsimonious and allow us to include and test hypotheses of interest. These
constraints may be put on the conditional distribution of the response
variables given the latent process (measurement model) or on the distribution
of the latent process (latent model). We also deal with extensions of the LM model
for the inclusion of individual covariates and to multilevel data. Covariates
may affect the measurement or the latent model; we discuss the implications of
these two different approaches according to the context of application.
Finally, we outline methods for obtaining standard errors for the parameter
estimates, for selecting the number of states and for path prediction. Models
and related inference are illustrated by the description of relevant
socio-economic applications available in the literature.
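The recursion referred to above is the standard forward algorithm: alpha_t(u) = p(y_1, ..., y_t, U_t = u) is updated in O(k^2) per time step, giving the likelihood in O(T k^2) instead of summing over all k^T latent paths. The parameter values below are illustrative.

```python
import numpy as np

def forward_loglik(pi, Pi, Phi, y):
    """Log-likelihood of a basic LM/HMM model via the forward recursion.
    pi: (k,) initial probabilities; Pi: (k, k) transition matrix;
    Phi: (k, c) emission probabilities p(y | state); y: (T,) categories."""
    alpha = pi * Phi[:, y[0]]                   # alpha_1(u)
    for t in range(1, len(y)):
        alpha = (alpha @ Pi) * Phi[:, y[t]]     # alpha_t(u), O(k^2) per step
    return np.log(alpha.sum())

pi = np.array([0.5, 0.5])
Pi = np.array([[0.9, 0.1],
               [0.1, 0.9]])
Phi = np.array([[0.8, 0.2],
                [0.2, 0.8]])
y = np.array([0, 0, 1, 1])
ll = forward_loglik(pi, Pi, Phi, y)
```

For longer series the recursion is usually rescaled at each step to avoid underflow; that detail is omitted here.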
Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent
In this paper, we propose a novel approach to automatically determine the
batch size in stochastic gradient descent methods. The choice of the batch size
induces a trade-off between the accuracy of the gradient estimate and the cost
in terms of samples of each update. We propose to determine the batch size by
optimizing the ratio between a lower bound to a linear or quadratic Taylor
approximation of the expected improvement and the number of samples used to
estimate the gradient. The performance of the proposed approach is empirically
compared with related methods on popular classification tasks.
The work was presented at the NIPS workshop on Optimizing the Optimizers.
Barcelona, Spain, 2016.
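A heavily simplified sketch of the cost-sensitive idea, under the assumption (mine, not the paper's exact bound) that the expected one-step improvement with an m-sample gradient estimate behaves like ||g||^2 - sigma^2/m: the batch size is chosen to maximize improvement per sample, so noisier gradients call for larger batches.

```python
import numpy as np

def best_batch_size(g_norm_sq, sigma_sq, m_max=4096):
    """Batch size maximizing (improvement lower bound) / (samples used)."""
    ms = np.arange(1, m_max + 1)
    improvement = np.maximum(g_norm_sq - sigma_sq / ms, 0.0)
    return int(ms[np.argmax(improvement / ms)])

m_noisy = best_batch_size(g_norm_sq=1.0, sigma_sq=10.0)  # noisy gradient -> larger batch
m_clean = best_batch_size(g_norm_sq=1.0, sigma_sq=0.5)   # clean gradient -> smaller batch
```

Under this toy bound the continuous optimum is m = 2*sigma_sq/g_norm_sq, so the first call lands at 20 and the second at 1.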