Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization
Majorization-minimization algorithms consist of iteratively minimizing a
majorizing surrogate of an objective function. Because of its simplicity and
its wide applicability, this principle has been very popular in statistics and
in signal processing. In this paper, we intend to make this principle scalable.
We introduce a stochastic majorization-minimization scheme which is able to
deal with large-scale or possibly infinite data sets. When applied to convex
optimization problems under suitable assumptions, we show that it achieves an
expected convergence rate of O(1/√n) after n iterations, and of O(1/n)
for strongly convex functions. Equally important, our scheme almost
surely converges to stationary points for a large class of non-convex problems.
We develop several efficient algorithms based on our framework. First, we
propose a new stochastic proximal gradient method, which experimentally matches
state-of-the-art solvers for large-scale ℓ2-logistic regression. Second,
we develop an online DC programming algorithm for non-convex sparse estimation.
Finally, we demonstrate the effectiveness of our approach for solving
large-scale structured matrix factorization problems.
Comment: accepted for publication at Neural Information Processing Systems
(NIPS) 2013. This is the 9-page version followed by 16 pages of appendices.
The title has changed compared to the first technical report.
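The majorization-minimization principle described above can be illustrated with a minimal sketch (not the paper's stochastic scheme): to minimize f(x) = Σ_i |x − a_i|, whose minimizer is the median, each step minimizes a quadratic surrogate that touches f at the current iterate and lies above it everywhere.

```python
# Minimal majorization-minimization (MM) sketch: minimize f(x) = sum_i |x - a_i|
# by iteratively minimizing the quadratic majorizer
#   g(x | x_k) = sum_i (x - a_i)^2 / (2 |x_k - a_i|) + const,
# which equals f at x_k and upper-bounds it elsewhere. Minimizing g gives a
# weighted average, so each MM step is a closed-form reweighted least squares.
def mm_median(a, x0=0.0, iters=100, eps=1e-12):
    x = x0
    for _ in range(iters):
        w = [1.0 / max(abs(x - ai), eps) for ai in a]          # majorizer weights
        x = sum(wi * ai for wi, ai in zip(w, a)) / sum(w)      # surrogate minimizer
    return x

print(mm_median([1.0, 2.0, 7.0, 9.0, 100.0]))  # converges to the median, 7.0
```

The objective decreases monotonically because each surrogate upper-bounds f and matches it at the current point; the stochastic scheme in the paper replaces the full surrogate with one built from a data subsample.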
Algorithms for Large-scale Whole Genome Association Analysis
In order to associate complex traits with genetic polymorphisms, genome-wide
association studies process huge datasets involving tens of thousands of
individuals genotyped for millions of polymorphisms. When handling these
datasets, which exceed the main memory of contemporary computers, one faces two
distinct challenges: 1) Millions of polymorphisms come at the cost of hundreds
of Gigabytes of genotype data, which can only be kept in secondary storage; 2)
the relatedness of the test population is represented by a covariance matrix,
which, for large populations, can only fit in the combined main memory of a
distributed architecture. In this paper, we present solutions for both
challenges: The genotype data is streamed from and to secondary storage using a
double buffering technique, while the covariance matrix is kept across the main
memory of a distributed memory system. We show that these methods sustain
high performance and allow the analysis of enormous datasets.
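The double-buffering idea can be sketched as follows (a hypothetical illustration, not the paper's code): a reader thread prefetches the next block of data from secondary storage while the main thread computes on the current block, so I/O and computation overlap. A bounded queue of capacity two plays the role of the two buffers.

```python
# Double-buffering sketch: overlap disk I/O with computation using a bounded
# queue. While process() runs on one block, the reader thread fills the next.
import threading
import queue

def stream_blocks(path, block_size, process):
    q = queue.Queue(maxsize=2)  # two in-flight buffers: one computing, one loading

    def reader():
        with open(path, "rb") as f:
            while True:
                block = f.read(block_size)
                if not block:
                    break
                q.put(block)  # blocks when both buffers are full
        q.put(None)           # sentinel: end of stream

    threading.Thread(target=reader, daemon=True).start()

    total = 0
    while True:
        block = q.get()
        if block is None:
            break
        total += process(block)  # compute on the current buffer
    return total
```

With genotype data far larger than main memory, this pattern keeps the disk busy whenever computation is the bottleneck, and vice versa.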
Fast algorithms for large scale generalized distance weighted discrimination
High dimension low sample size statistical analysis is important in a wide
range of applications. In such situations, the highly appealing discrimination
method, support vector machine, can be improved to alleviate data piling at the
margin. This leads naturally to the development of distance weighted
discrimination (DWD), which can be modeled as a second-order cone programming
problem and solved by interior-point methods when the scale (in sample size and
feature dimension) of the data is moderate. Here, we design a scalable and
robust algorithm for solving large scale generalized DWD problems. Numerical
experiments on real data sets from the UCI repository demonstrate that our
algorithm is highly efficient in solving large scale problems, and sometimes
even more efficient than the highly optimized LIBLINEAR and LIBSVM for solving
the corresponding SVM problems.
Randomized Tensor Ring Decomposition and Its Application to Large-scale Data Reconstruction
Dimensionality reduction is an essential technique for multi-way large-scale
data, i.e., tensors. Tensor ring (TR) decomposition has become popular due to
its high representation ability and flexibility. However, the traditional TR
decomposition algorithms suffer from high computational cost when facing
large-scale data. In this paper, taking advantage of the recently proposed
tensor random projection method, we propose two TR decomposition algorithms. By
employing random projection on every mode of the large-scale tensor, the TR
decomposition can be processed at a much smaller scale. Simulation
experiments show that the proposed algorithms are substantially faster than
traditional algorithms without loss of accuracy, and our algorithms show
superior performance in deep learning dataset compression and hyperspectral
image reconstruction experiments compared to other randomized algorithms.
Comment: ICASSP submission.
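The per-mode random projection step can be sketched as follows (a hypothetical illustration in plain Python, not the paper's implementation): multiplying the mode-1 unfolding of a 3-way tensor by a Gaussian random matrix shrinks that mode from I to r while approximately preserving the tensor's range along it; repeating this on every mode yields a small tensor on which TR decomposition is cheap.

```python
# Tensor random-projection sketch: compress mode 1 of a 3-way tensor X
# (shape I x J x K, as nested lists) with a Gaussian matrix Omega (r x I),
# giving Y[p][j][k] = sum_i Omega[p][i] * X[i][j][k] of shape r x J x K.
import random

def mode1_project(X, r):
    I, J, K = len(X), len(X[0]), len(X[0][0])
    # Entries scaled by 1/sqrt(r) so projected norms are preserved in expectation.
    omega = [[random.gauss(0.0, 1.0 / r ** 0.5) for _ in range(I)]
             for _ in range(r)]
    return [[[sum(omega[p][i] * X[i][j][k] for i in range(I))
              for k in range(K)]
             for j in range(J)]
            for p in range(r)]
```

Applying the analogous projection to modes 2 and 3 as well reduces the tensor to r × r × r before any TR factorization is computed, which is where the speedup in the abstract comes from.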