    Pathwise Coordinate Optimization for Sparse Learning: Algorithm and Theory

    Pathwise coordinate optimization is one of the most important computational frameworks for high dimensional convex and nonconvex sparse learning problems. It differs from the classical coordinate optimization algorithms in three salient features: warm start initialization, active set updating, and a strong rule for coordinate preselection. Such a complex algorithmic structure grants superior empirical performance, but also poses a significant challenge to theoretical analysis. To tackle this long-standing problem, we develop a new theory showing that these three features play pivotal roles in guaranteeing the outstanding statistical and computational performance of the pathwise coordinate optimization framework. In particular, we analyze the existing pathwise coordinate optimization algorithms and provide new theoretical insights into them. These insights further motivate several modifications to the pathwise coordinate optimization framework, which guarantee linear convergence to a unique sparse local optimum with optimal statistical properties in parameter estimation and support recovery. This is the first result on the computational and statistical guarantees of the pathwise coordinate optimization framework in high dimensions. Thorough numerical experiments are provided to support our theory. Comment: Accepted by the Annals of Statistics, 2016
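The three features named in the abstract above can be illustrated for the plain lasso, with objective (1/2n)||y - Xb||^2 + lam*||b||_1. The following is a minimal sketch under stated assumptions, not the authors' algorithm: the function names (`lasso_cd`, `lasso_path`), the geometric lambda grid, the tolerances, and the use of the sequential strong rule as the preselection step are all illustrative choices, and the columns of `X` are assumed to be nonzero.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, beta, active, n_sweeps=1000, tol=1e-8):
    """Cyclic coordinate descent restricted to the coordinates in `active`."""
    n = X.shape[0]
    col_sq = (X ** 2).sum(axis=0) / n        # assumed nonzero for every column
    r = y - X @ beta                         # residual, kept in sync with beta
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in active:
            old = beta[j]
            rho = X[:, j] @ r / n + col_sq[j] * old
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != old:
                r -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta

def lasso_path(X, y, n_lambdas=20, ratio=0.05):
    """Solve the lasso on a decreasing lambda grid (hypothetical defaults)."""
    n, d = X.shape
    lam_max = np.max(np.abs(X.T @ y)) / n    # smallest lambda giving beta = 0
    lambdas = lam_max * np.geomspace(1.0, ratio, n_lambdas)
    beta, lam_prev, path = np.zeros(d), lam_max, []
    for lam in lambdas:
        # Warm start: beta carries over from the previous lambda.
        corr = np.abs(X.T @ (y - X @ beta)) / n
        # Strong rule (sequential form): preselect likely-active coordinates.
        strong = np.flatnonzero(corr >= 2 * lam - lam_prev)
        active = set(np.flatnonzero(beta)) | set(strong)
        while True:
            # Active set updating: optimize only over the current active set,
            # then add any coordinate that violates the optimality condition.
            beta = lasso_cd(X, y, lam, beta, sorted(active))
            corr = np.abs(X.T @ (y - X @ beta)) / n
            violators = set(np.flatnonzero(corr > lam + 1e-10)) - active
            if not violators:
                break
            active |= violators
        path.append(beta.copy())
        lam_prev = lam
    return lambdas, np.array(path)
```

Calling `lasso_path(X, y)` returns the lambda grid and one coefficient vector per lambda. The optimality check after each inner solve is what keeps the strong rule safe in this sketch: a coordinate that was wrongly screened out is added back before moving to the next lambda.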

    Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery

    We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence $\mathcal{O}(1/\epsilon)$, where $\epsilon$ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network: http://cran.r-project.org/web/packages/camel/. Comment: Journal of Machine Learning Research, 201

    Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization

    Stochastic optimization naturally arises in machine learning. Efficient algorithms with provable guarantees, however, are still largely missing when the objective function is nonconvex and the data points are dependent. This paper studies this fundamental challenge through a streaming PCA problem for stationary time series data. Specifically, our goal is to estimate the principal component of time series data with respect to the covariance matrix of the stationary distribution. Computationally, we propose a variant of Oja's algorithm combined with downsampling to control the bias of the stochastic gradient caused by the data dependency. Theoretically, we quantify the uncertainty of our proposed stochastic algorithm based on diffusion approximations. This allows us to prove the asymptotic rate of convergence and further implies near optimal asymptotic sample complexity. Numerical experiments are provided to support our analysis.
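The combination of Oja's update with downsampling described above can be sketched as follows. This is a minimal illustration rather than the paper's exact procedure: the function name `oja_downsampled`, the constant step size, and the fixed gap between used samples are assumptions made for the example.

```python
import numpy as np

def oja_downsampled(stream, dim, step=0.01, gap=5, seed=0):
    """Estimate the leading principal direction from a dependent stream.

    Only every `gap`-th sample is used (downsampling), so that consecutive
    updates see less correlated data. `step` and `gap` are hypothetical
    tuning parameters for this sketch.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(dim)
    w /= np.linalg.norm(w)
    for t, x in enumerate(stream):
        if t % gap != 0:
            continue                     # skip samples between the kept ones
        w += step * x * (x @ w)          # Oja's stochastic gradient step
        w /= np.linalg.norm(w)           # project back onto the unit sphere
    return w
```

A larger `gap` reduces the dependence between the samples that drive consecutive updates, at the cost of discarding more data.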

    The distribution of p38(MAPK) in the sensorimotor cortex of a mouse model of Alzheimer’s disease

    The p38 mitogen-activated protein kinase [p38(MAPK)] mediates responses to extracellular stressors. An increase in the phosphorylated form of p38(MAPK) [p-p38(MAPK)] has been associated with early events in Alzheimer disease (AD). Although most often associated with processes including apoptosis, inflammation and oxidative stress, p-p38(MAPK) also mediates beneficial physiological functions, such as cell growth, survival and phagocytosis of cellular pathogens. Amyloid plaques [β-amyloid aggregates] are a hallmark of AD-related pathology. As p38(MAPK) has been detected in the vicinity of senile plaques, we combined immunohistochemistry and stereological sampling to quantify the distribution of plaques and p-p38(MAPK)-immunoreactive (IR) cells in the sensorimotor cortex of 3-, 6- and 10-month-old TgCRND8 mice. This animal model expresses an aggressive nature of the AD-related human amyloid-β protein precursor (APP). It was confirmed by the appearance of both dense-core (thioflavin-S-positive) and diffuse plaques, even in the youngest mice. p-p38(MAPK)-IR cells were associated with both dense-core and diffuse plaques, but the expected age-dependent increase in the density of plaque-associated p-p38(MAPK)-IR cells was restricted to dense-core plaques. Furthermore, the density of dense-core plaque-associated p-p38(MAPK)-IR cells was inversely correlated with the size of the core within the given plaque, which supports a role for these microglia in restricting core growth. p-p38(MAPK)-IR cells were also observed throughout wildtype and TgCRND8 mouse cortical parenchyma, but the density of these non-plaque-associated cells remained constant, regardless of age or genotype. We conclude that the constitutive presence of p-p38(MAPK)-IR microglia in aging mouse brain is indicative of a longitudinal role for this kinase in normal brain physiology. 
    Additionally, the majority of p-p38(MAPK)-IR cells were co-immunoreactive for the Macrophage-1 (CD11b/CD18) microglial marker, regardless of whether they were associated with plaques or localized to the parenchyma. We suggest that two facts need to be considered when anticipating targeted p38(MAPK) therapeutics in the context of clinical AD: a pool of p-p38(MAPK)-IR microglia appears to restrict β-amyloid plaque core development, and p-p38(MAPK) has a non-pathological role in the parenchyma.

    NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization

    We study a stochastic and distributed algorithm for nonconvex problems whose objective consists of a sum of $N$ nonconvex $L_i/N$-smooth functions, plus a nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT) algorithm splits the problem into $N$ subproblems, and utilizes an augmented Lagrangian based primal-dual scheme to solve it in a distributed and stochastic manner. With a special non-uniform sampling, a version of NESTT achieves an $\epsilon$-stationary solution using $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/\epsilon)$ gradient evaluations, which can be up to $\mathcal{O}(N)$ times better than the (proximal) gradient descent methods. It also achieves a Q-linear convergence rate for nonconvex $\ell_1$ penalized quadratic problems with polyhedral constraints. Further, we reveal a fundamental connection between primal-dual based methods and a few primal only methods such as IAG/SAG/SAGA. Comment: 35 pages, 2 figure
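The "up to O(N) times better" claim can be checked with simple arithmetic on the constants in front of 1/epsilon. The sketch below rests on an assumption not stated in the abstract: that (proximal) gradient descent costs N gradient evaluations per iteration, with an iteration count proportional to the overall smoothness constant, the sum of L_i/N. The helper names are hypothetical.

```python
import numpy as np

def nestt_constant(L):
    """Constant from the abstract's bound: (sum_i sqrt(L_i / N))^2."""
    L = np.asarray(L, dtype=float)
    return np.sum(np.sqrt(L / L.size)) ** 2

def gd_constant(L):
    """Assumed gradient descent cost: N evaluations per iteration times
    an iteration count proportional to sum_i L_i / N."""
    L = np.asarray(L, dtype=float)
    return L.size * np.sum(L / L.size)

# Equal smoothness constants: both costs coincide (ratio 1).
equal = [4.0, 4.0, 4.0, 4.0]
# One dominant component: the ratio reaches N (here N = 4).
skewed = [100.0, 0.0, 0.0, 0.0]
ratio_equal = gd_constant(equal) / nestt_constant(equal)
ratio_skewed = gd_constant(skewed) / nestt_constant(skewed)
```

Under that assumption, the Cauchy-Schwarz inequality gives (sum_i sqrt(a_i))^2 <= N * sum_i a_i, so the ratio always lies between 1 and N. The gain comes entirely from how unevenly the L_i are distributed.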