Search CORE

1,230 research outputs found

Sparse Volterra and Polynomial Regression Models: Recoverability and Estimation

Author: Giannakis Georgios B.
Kekatos Vassilis
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 06/09/2011
Field of study

Volterra and polynomial regression models play a major role in nonlinear system identification and inference tasks. Exciting applications ranging from neuroscience to genome-wide association analysis build on these models with the additional requirement of parsimony. This requirement has high interpretative value, but unfortunately cannot be met by least-squares based or kernel regression methods. To this end, compressed sampling (CS) approaches, already successful in linear regression settings, can offer a viable alternative. The viability of CS for sparse Volterra and polynomial models is the core theme of this work. A common sparse regression task is initially posed for the two models. Building on (weighted) Lasso-based schemes, an adaptive RLS-type algorithm is developed for sparse polynomial regressions. The identifiability of polynomial models is critically challenged by dimensionality. However, following the CS principle, when these models are sparse, they could be recovered by far fewer measurements. To quantify the sufficient number of measurements for a given level of sparsity, restricted isometry properties (RIP) are investigated in commonly met polynomial regression settings, generalizing known results for their linear counterparts. The merits of the novel (weighted) adaptive CS algorithms to sparse polynomial modeling are verified through synthetic as well as real data tests for genotype-phenotype analysis.Comment: 20 pages, to appear in IEEE Trans. on Signal Processin

arXiv.org e-Print Archive

Crossref

Sparse Bilinear Logistic Regression

Author: Baraniuk Richard G.
Shi Jianing V.
Xu Yangyang
Publication venue
Publication date: 15/04/2014
Field of study

In this paper, we introduce the concept of sparse bilinear logistic regression for decision problems involving explanatory variables that are two-dimensional matrices. Such problems are common in computer vision, brain-computer interfaces, style/content factorization, and parallel factor analysis. The underlying optimization problem is bi-convex; we study its solution and develop an efficient algorithm based on block coordinate descent. We provide a theoretical guarantee for global convergence and estimate the asymptotical convergence rate using the Kurdyka-{\L}ojasiewicz inequality. A range of experiments with simulated and real data demonstrate that sparse bilinear logistic regression outperforms current techniques in several important applications.Comment: 27 pages, 5 figure

arXiv.org e-Print Archive

CiteSeerX

Learning Word Representations with Hierarchical Sparse Coding

Author: Dyer Chris
Faruqui Manaal
Smith Noah A.
Yogatama Dani
Publication venue
Publication date: 06/11/2014
Field of study

We propose a new method for learning word representations using hierarchical regularization in sparse coding inspired by the linguistic study of word meanings. We show an efficient learning algorithm based on stochastic proximal methods that is significantly faster than previous approaches, making it possible to perform hierarchical sparse coding on a corpus of billions of word tokens. Experiments on various benchmark tasks---word similarity ranking, analogies, sentence completion, and sentiment analysis---demonstrate that the method outperforms or is competitive with state-of-the-art methods. Our word representations are available at \url{http://www.ark.cs.cmu.edu/dyogatam/wordvecs/}

arXiv.org e-Print Archive

CiteSeerX

An Efficient Primal-Dual Prox Method for Non-Smooth Optimization

Author: Jin Rong
Mahdavi Mehrdad
Yang Tianbao
Zhu Shenghuo
Publication venue
Publication date: 26/07/2013
Field of study

We study the non-smooth optimization problems in machine learning, where both the loss function and the regularizer are non-smooth functions. Previous studies on efficient empirical loss minimization assume either a smooth loss function or a strongly convex regularizer, making them unsuitable for non-smooth optimization. We develop a simple yet efficient method for a family of non-smooth optimization problems where the dual form of the loss function is bilinear in primal and dual variables. We cast a non-smooth optimization problem into a minimax optimization problem, and develop a primal dual prox method that solves the minimax optimization problem at a rate of

O(1/T)

{assuming that the proximal step can be efficiently solved}, significantly faster than a standard subgradient descent method that has an

O(1/\sqrt{T})

convergence rate. Our empirical study verifies the efficiency of the proposed method for various non-smooth optimization problems that arise ubiquitously in machine learning by comparing it to the state-of-the-art first order methods

arXiv.org e-Print Archive

CiteSeerX

Variable Screening for High Dimensional Time Series

Author: Yousuf Kashif
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2018
Field of study

Variable selection is a widely studied problem in high dimensional statistics, primarily since estimating the precise relationship between the covariates and the response is of great importance in many scientific disciplines. However, most of theory and methods developed towards this goal for the linear model invoke the assumption of iid sub-Gaussian covariates and errors. This paper analyzes the theoretical properties of Sure Independence Screening (SIS) (Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849-911]) for high dimensional linear models with dependent and/or heavy tailed covariates and errors. We also introduce a generalized least squares screening (GLSS) procedure which utilizes the serial correlation present in the data. By utilizing this serial correlation when estimating our marginal effects, GLSS is shown to outperform SIS in many cases. For both procedures we prove sure screening properties, which depend on the moment conditions, and the strength of dependence in the error and covariate processes, amongst other factors. Additionally, combining these screening procedures with the adaptive Lasso is analyzed. Dependence is quantified by functional dependence measures (Wu [Proc. Natl. Acad. Sci. USA 102 (2005) 14150-14154]), and the results rely on the use of Nagaev-type and exponential inequalities for dependent random variables. We also conduct simulations to demonstrate the finite sample performance of these procedures, and include a real data application of forecasting the US inflation rate.Comment: Published in the Electronic Journal of Statistics (https://projecteuclid.org/euclid.ejs/1519700498

arXiv.org e-Print Archive

Crossref

Experimental design trade-offs for gene regulatory network inference: an in silico study of the yeast Saccharomyces cerevisiae cell cycle

Author: Jeroen Frank (5548520)
Marie Burris (5611874)
Michael Wilkins (5611877)
Mikayla Borton (3968105)
Paula Dalcin Martins (4768536)
Richard Wolfe (2504044)
Robert Danczak (5611871)
Simon Roux (151891)
Publication venue
Publication date: 14/12/2017
Field of study

Time-series of high throughput gene sequencing data intended for gene regulatory network (GRN) inference are often short due to the high costs of sampling cell systems. Moreover, experimentalists lack a set of quantitative guidelines that prescribe the minimal number of samples required to infer a reliable GRN model. We study the temporal resolution of data vs quality of GRN inference in order to ultimately overcome this deficit. The evolution of a Markovian jump process model for the Ras/cAMP/PKA pathway of proteins and metabolites in the G1 phase of the Saccharomyces cerevisiae cell cycle is sampled at a number of different rates. For each time-series we infer a linear regression model of the GRN using the LASSO method. The inferred network topology is evaluated in terms of the area under the precision-recall curve AUPR. By plotting the AUPR against the number of samples, we show that the trade-off has a, roughly speaking, sigmoid shape. An optimal number of samples corresponds to values on the ridge of the sigmoid

arXiv.org e-Print Archive

Directory of Open Access Journals

eScholarship - University of California

Radboud Repository

FigShare