3,782 research outputs found

    Nonstationary regression with support vector machines

    Get PDF
    In this work, we introduce a method for data analysis in nonstationary environments: time-adaptive support vector regression (TA-SVR). The proposed approach extends a previous development that was limited to classification problems. Focusing our study on time series applications, we show that TA-SVR can improve the accuracy of several aspects of nonstationary data analysis, namely the tasks of modelling and prediction, input relevance estimation, and reconstruction of a hidden forcing profile.Fil: Uzal, Lucas César. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y Sistemas; ArgentinaFil: Grinblat, Guillermo Luis. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y Sistemas; ArgentinaFil: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y Sistemas; ArgentinaFil: Verdes, Pablo Fabian. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y Sistemas; Argentin

    Increasing power for voxel-wise genome-wide association studies : the random field theory, least square kernel machines and fast permutation procedures

    Get PDF
    Imaging traits are thought to have more direct links to genetic variation than diagnostic measures based on cognitive or clinical assessments and provide a powerful substrate to examine the influence of genetics on human brains. Although imaging genetics has attracted growing attention and interest, most brain-wide genome-wide association studies focus on voxel-wise single-locus approaches, without taking advantage of the spatial information in images or combining the effect of multiple genetic variants. In this paper we present a fast implementation of voxel- and cluster-wise inferences based on the random field theory to fully use the spatial information in images. The approach is combined with a multi-locus model based on least square kernel machines to associate the joint effect of several single nucleotide polymorphisms (SNP) with imaging traits. A fast permutation procedure is also proposed which significantly reduces the number of permutations needed relative to the standard empirical method and provides accurate small p-value estimates based on parametric tail approximation. We explored the relation between 448,294 single nucleotide polymorphisms and 18,043 genes in 31,662 voxels of the entire brain across 740 elderly subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Structural MRI scans were analyzed using tensor-based morphometry (TBM) to compute 3D maps of regional brain volume differences compared to an average template image based on healthy elderly subjects. We find method to be more sensitive compared with voxel-wise single-locus approaches. A number of genes were identified as having significant associations with volumetric changes. The most associated gene was GRIN2B, which encodes the N-methyl-d-aspartate (NMDA) glutamate receptor NR2B subunit and affects both the parietal and temporal lobes in human brains. Its role in Alzheimer's disease has been widely acknowledged and studied, suggesting the validity of the approach. The various advantages over existing approaches indicate a great potential offered by this novel framework to detect genetic influences on human brains

    String and Membrane Gaussian Processes

    Full text link
    In this paper we introduce a novel framework for making exact nonparametric Bayesian inference on latent functions, that is particularly suitable for Big Data tasks. Firstly, we introduce a class of stochastic processes we refer to as string Gaussian processes (string GPs), which are not to be mistaken for Gaussian processes operating on text. We construct string GPs so that their finite-dimensional marginals exhibit suitable local conditional independence structures, which allow for scalable, distributed, and flexible nonparametric Bayesian inference, without resorting to approximations, and while ensuring some mild global regularity constraints. Furthermore, string GP priors naturally cope with heterogeneous input data, and the gradient of the learned latent function is readily available for explanatory analysis. Secondly, we provide some theoretical results relating our approach to the standard GP paradigm. In particular, we prove that some string GPs are Gaussian processes, which provides a complementary global perspective on our framework. Finally, we derive a scalable and distributed MCMC scheme for supervised learning tasks under string GP priors. The proposed MCMC scheme has computational time complexity O(N)\mathcal{O}(N) and memory requirement O(dN)\mathcal{O}(dN), where NN is the data size and dd the dimension of the input space. We illustrate the efficacy of the proposed approach on several synthetic and real-world datasets, including a dataset with 66 millions input points and 88 attributes.Comment: To appear in the Journal of Machine Learning Research (JMLR), Volume 1

    Sparse least squares support vector regression for nonstationary systems

    Get PDF
    A new adaptive sparse least squares support vector regression algorithm, referred to as SLSSVR has been introduced for the adaptive modeling of nonstationary systems. Using a sliding window of recent data set of size N to track t he non-stationary characteristics of the incoming data, our adaptive model is initially formulated based on least squares support vector regression with forgetting factor (without bias term). In order to obtain a sparse model in which some parameters are exactly zeros, a l 1 penalty was applied in parameter estimation in the dual problem. Furthermore we exploit the fact that since the associated system/kernel matrix in positive definite, the dual solution of least squares support vector machine without bias term, can be solved iteratively with guaranteed convergence. Furthermore since the models between two consecutive time steps there are (N-1) shared kernels/parameters, the online solution can be obtained efficiently using coordinate descent algorithm in the form of Gauss-Seidel algorithm with minimal number of iterations. This allows a very sparse model per time step to be obtained very efficiently, avoiding expensive matrix inversion. The real stock market dataset and simulated examples have shown that the proposed approaches can lead to superior performances in comparison with the linear recursive least algorithm and a number of online non-linear approaches in terms of modelling performance and model size
    corecore