Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees
Greedy optimization methods such as Matching Pursuit (MP) and Frank-Wolfe
(FW) algorithms regained popularity in recent years due to their simplicity,
effectiveness and theoretical guarantees. MP and FW address optimization over
the linear span and the convex hull of a set of atoms, respectively. In this
paper, we consider the intermediate case of optimization over the convex cone,
parametrized as the conic hull of a generic atom set, leading to the first
principled definitions of non-negative MP algorithms for which we give explicit
convergence rates and demonstrate excellent empirical performance. In
particular, we derive sublinear (O(1/t)) convergence on general
smooth and convex objectives, and linear (O(e^{-t})) convergence on
strongly convex objectives, in both cases for general sets of atoms.
Furthermore, we establish a clear correspondence of our algorithms to known
algorithms from the MP and FW literature. Our novel algorithms and analyses
target general atom sets and general objective functions, and hence are
directly applicable to a large variety of learning settings.
Comment: NIPS 2017
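The greedy principle described above can be sketched in a few lines: at each step, pick the atom best aligned with the negative gradient and add it with a non-negative coefficient, so the iterate stays inside the conic hull of the atom set. The sketch below is a minimal illustration under our own assumptions (finite atom set given as matrix columns, exact line search valid for quadratic objectives), not the paper's exact variants; all names are ours.

```python
import numpy as np

def nn_matching_pursuit(f_grad, atoms, steps=100):
    """Toy non-negative Matching Pursuit over the conic hull of `atoms`.

    atoms : (d, n) array whose columns are the atoms.
    f_grad: callable returning the gradient of the objective at x.
    """
    d, n = atoms.shape
    x = np.zeros(d)
    coeffs = np.zeros(n)
    for _ in range(steps):
        g = f_grad(x)
        scores = -atoms.T @ g                 # alignment with descent direction
        j = int(np.argmax(scores))
        if scores[j] <= 0:                    # no atom improves: stop
            break
        a = atoms[:, j]
        gamma = scores[j] / (a @ a)           # exact line search (quadratic f)
        coeffs[j] += gamma                    # gamma >= 0: stay in the cone
        x = x + gamma * a
    return x, coeffs
```

With orthonormal atoms and a least-squares objective, this recovers any non-negative combination of the atoms exactly.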
A Nonconvex Splitting Method for Symmetric Nonnegative Matrix Factorization: Convergence Analysis and Optimality
Symmetric nonnegative matrix factorization (SymNMF) has important
applications in data analytics problems such as document clustering, community
detection and image segmentation. In this paper, we propose a novel nonconvex
variable splitting method for solving SymNMF. The proposed algorithm is
guaranteed to converge to the set of Karush-Kuhn-Tucker (KKT) points of the
nonconvex SymNMF problem. Furthermore, it achieves a global sublinear
convergence rate. We also show that the algorithm can be efficiently
implemented in parallel. Further, sufficient conditions are provided which
guarantee the global and local optimality of the obtained solutions. Extensive
numerical results performed on both synthetic and real data sets suggest that
the proposed algorithm converges quickly to a local minimum solution.
Comment: IEEE Transactions on Signal Processing (to appear)
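To make the splitting idea concrete, the toy sketch below factors a symmetric nonnegative matrix M ≈ H Hᵀ by introducing a second copy W of the factor and alternating ridge-type least-squares updates with a penalty ρ‖W − H‖² that pulls the two copies together. This is an illustrative simplification under our own assumptions, not the paper's convergence-guaranteed algorithm.

```python
import numpy as np

def symnmf_split(M, r, rho=1.0, iters=200, seed=0):
    """Illustrative variable splitting for SymNMF: min ||M - W H^T||_F^2
    + rho ||W - H||_F^2 with W, H >= 0, alternating over W and H."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    H = rng.random((n, r))
    W = H.copy()
    I = np.eye(r)
    for _ in range(iters):
        # Closed-form ridge-type update in W, then clip to the nonneg orthant.
        W = np.maximum((M @ H + rho * H) @ np.linalg.inv(H.T @ H + rho * I), 0)
        # Symmetric update in H.
        H = np.maximum((M.T @ W + rho * W) @ np.linalg.inv(W.T @ W + rho * I), 0)
    return 0.5 * (W + H)   # average the split variables as the final factor
```

On exactly factorizable symmetric nonnegative data, the two copies coincide at convergence and the penalty term vanishes.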
A unified approach to non-negative matrix factorization and probabilistic latent semantic indexing
Non-negative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into two matrices, W and H, each with nonnegative entries, V ~ WH. NMF has been shown to yield a unique parts-based, sparse representation of the data. The nonnegativity constraints in NMF allow only additive combinations of the data, which enables the method to learn parts with distinct physical interpretations. In the last few years, NMF has been successfully applied in a variety of areas such as natural language processing, information retrieval, image processing, speech recognition and computational biology for the analysis and interpretation of large-scale data.
We present a generalized approach to NMF based on Renyi's divergence between two non-negative matrices related to the Poisson likelihood. Our approach unifies various competing models and provides a unique framework for NMF. Furthermore, we generalize the equivalence between NMF and probabilistic latent semantic indexing, a well-known method used in text mining and document clustering applications. We evaluate the performance of our method in the unsupervised setting using consensus clustering and demonstrate its applicability using real-life and simulated data.
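For reference, the multiplicative updates mentioned above, specialized to the generalized KL divergence (the divergence associated with the Poisson likelihood), can be written as follows. This is the classical Lee-Seung scheme, not the Renyi-divergence generalization proposed in the paper; the small `eps` guards divisions.

```python
import numpy as np

def nmf_kl(V, r, iters=300, eps=1e-12, seed=0):
    """Multiplicative updates for NMF under the generalized KL divergence,
    i.e. the divergence tied to the Poisson likelihood: V ~ W @ H."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(iters):
        R = V / (W @ H + eps)                              # elementwise ratio
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)    # update H, keep >= 0
        R = V / (W @ H + eps)
        W *= (R @ H.T) / (H.sum(axis=1)[None, :] + eps)    # update W, keep >= 0
    return W, H
```

Because the updates are multiplicative, nonnegativity of W and H is preserved automatically at every iteration.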
Exploring multimodal data fusion through joint decompositions with flexible couplings
A Bayesian framework is proposed to define flexible coupling models for joint
tensor decompositions of multiple data sets. Under this framework, a natural
formulation of the data fusion problem is to cast it in terms of a joint
maximum a posteriori (MAP) estimator. Data-driven scenarios of joint posterior
distributions are provided, including general Gaussian priors and non-Gaussian
coupling priors. We present and discuss implementation issues of algorithms
used to obtain the joint MAP estimator. We also show how this framework can be
adapted to tackle the problem of joint decompositions of large datasets. In the
case of a conditional Gaussian coupling with a linear transformation, we give
theoretical bounds on the data fusion performance using the Bayesian Cramér-Rao
bound. Simulations are reported for hybrid coupling models ranging from simple
additive Gaussian models, to Gamma-type models with positive variables and to
the coupling of data sets which are inherently of different size due to
different resolutions of the measurement devices.
Comment: 15 pages, 7 figures, revised version
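As a toy instance of such a joint MAP estimator, the sketch below couples two matrix factorizations through a Gaussian prior on the difference of their shared-mode factors and minimizes the negative log-posterior by plain gradient descent. All symbols, sizes and step sizes are our illustrative choices (equal coupled dimensions, flat priors on the uncoupled factors); the paper treats general tensor models and richer coupling priors.

```python
import numpy as np

def coupled_map(Y1, Y2, r, lam=1.0, lr=0.01, iters=2000, seed=0):
    """Toy joint MAP for Y1 ~ A @ B1.T and Y2 ~ C @ B2.T with a Gaussian
    coupling prior B1 ~ N(B2, 1/lam), fitted by gradient descent."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((Y1.shape[0], r)) * 0.1
    B1 = rng.standard_normal((Y1.shape[1], r)) * 0.1
    C = rng.standard_normal((Y2.shape[0], r)) * 0.1
    B2 = rng.standard_normal((Y2.shape[1], r)) * 0.1
    for _ in range(iters):
        E1 = A @ B1.T - Y1                     # residual of data set 1
        E2 = C @ B2.T - Y2                     # residual of data set 2
        A -= lr * (E1 @ B1)
        B1 -= lr * (E1.T @ A + lam * (B1 - B2))  # data fit + coupling pull
        C -= lr * (E2 @ B2)
        B2 -= lr * (E2.T @ C + lam * (B2 - B1))
    return A, B1, C, B2
```

The coupling term lam * (B1 - B2) is exactly the gradient of the Gaussian coupling prior's negative log-density, so the fixed points are stationary points of the joint posterior.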
Adaptive Density Estimation for Generative Models
Unsupervised learning of generative models has seen tremendous progress over
recent years, in particular due to generative adversarial networks (GANs),
variational autoencoders, and flow-based models. GANs have dramatically
improved sample quality, but suffer from two drawbacks: (i) they mode-drop,
i.e., they do not cover the full support of the training data, and (ii) they do
not allow for likelihood evaluations on held-out data. In contrast,
likelihood-based training encourages models to cover the full support of the
training data, but yields poorer samples. These mutual shortcomings can in
principle be addressed by training generative latent variable models in a
hybrid adversarial-likelihood manner. However, we show that commonly made
parametric assumptions create a conflict between the two objectives, making
successful hybrid models non-trivial. As a solution, we propose to use deep invertible
transformations in the latent variable decoder. This approach allows for
likelihood computations in image space, is more efficient than fully invertible
models, and can take full advantage of adversarial training. We show that our
model significantly improves over existing hybrid models: offering GAN-like
samples, IS and FID scores that are competitive with fully adversarial models,
and improved likelihood scores.
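The key ingredient, an invertible transformation with a tractable log-determinant so that likelihoods can be evaluated exactly by change of variables, can be illustrated with a single affine coupling layer. This is our toy example with hand-chosen linear scale/shift functions, not the paper's architecture.

```python
import numpy as np

def coupling_forward(z, W, b):
    """One affine coupling layer: x1 = z1, x2 = z2 * exp(s) + t,
    where (s, t) = split(W @ z1 + b). Invertible with cheap log-det."""
    d = z.shape[0] // 2
    z1, z2 = z[:d], z[d:]
    h = W @ z1 + b
    s, t = h[:d], h[d:]
    x = np.concatenate([z1, z2 * np.exp(s) + t])
    return x, s.sum()                          # log |det Jacobian| = sum(s)

def coupling_inverse(x, W, b):
    d = x.shape[0] // 2
    x1, x2 = x[:d], x[d:]
    h = W @ x1 + b                             # x1 == z1, so (s, t) recompute
    s, t = h[:d], h[d:]
    return np.concatenate([x1, (x2 - t) * np.exp(-s)])

def log_likelihood(x, W, b):
    """Exact log p(x) via change of variables with a standard normal base."""
    d = x.shape[0] // 2
    z = coupling_inverse(x, W, b)
    s = (W @ z[:d] + b)[:d]
    log_base = -0.5 * (z @ z) - 0.5 * x.shape[0] * np.log(2 * np.pi)
    return log_base - s.sum()                  # minus forward log-det
```

Because only half of the variables are transformed, the Jacobian is triangular and its log-determinant is just the sum of the scales, which is what makes exact likelihood evaluation cheap.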
A new steplength selection for scaled gradient methods with application to image deblurring
Gradient methods are frequently used in large scale image deblurring problems
since they avoid the onerous computation of the Hessian matrix of the objective
function. Second order information is typically sought by a clever choice of
the steplength parameter defining the descent direction, as in the case of the
well-known Barzilai and Borwein rules. In a recent paper, a strategy for the
steplength selection approximating the inverse of some eigenvalues of the
Hessian matrix has been proposed for gradient methods applied to unconstrained
minimization problems. In the quadratic case, this approach is based on a
Lanczos process applied every m iterations to the matrix of the m most recent
gradients, but the idea can be extended to a general objective function. In
this paper we extend this rule to the case of scaled gradient projection
methods applied to non-negatively constrained minimization problems, and we
test the effectiveness of the proposed strategy in image deblurring problems in
both the presence and the absence of an explicit edge-preserving regularization
term.
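As a minimal illustration of the ingredients involved, the sketch below runs gradient projection with a Barzilai-Borwein steplength on a non-negatively constrained least-squares problem, a toy stand-in for deblurring. The paper's contribution, replacing the BB rule with Hessian eigenvalue approximations obtained from a Lanczos process on recent gradients, is not reproduced here.

```python
import numpy as np

def gp_bb_nonneg(A, b, iters=200):
    """Gradient projection with BB1 steplength for
    min 0.5 * ||A x - b||^2  subject to  x >= 0."""
    x = np.zeros(A.shape[1])
    g = A.T @ (A @ x - b)
    alpha = 1.0 / np.linalg.norm(A, 2) ** 2    # safe initial steplength
    for _ in range(iters):
        x_new = np.maximum(x - alpha * g, 0)   # project onto the nonneg orthant
        g_new = A.T @ (A @ x_new - b)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 1e-12:
            alpha = (s @ s) / sy               # BB1: approximates 1/eigenvalue
        x, g = x_new, g_new
    return x
```

The BB steplength (s, s)/(s, y) is a Rayleigh-quotient-like estimate of an inverse Hessian eigenvalue, which is precisely the kind of cheap second-order information the steplength-selection literature exploits.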