3,578 research outputs found

    Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees

    Full text link
    Greedy optimization methods such as Matching Pursuit (MP) and Frank-Wolfe (FW) algorithms regained popularity in recent years due to their simplicity, effectiveness and theoretical guarantees. MP and FW address optimization over the linear span and the convex hull of a set of atoms, respectively. In this paper, we consider the intermediate case of optimization over the convex cone, parametrized as the conic hull of a generic atom set, leading to the first principled definitions of non-negative MP algorithms for which we give explicit convergence rates and demonstrate excellent empirical performance. In particular, we derive sublinear (O(1/t)\mathcal{O}(1/t)) convergence on general smooth and convex objectives, and linear convergence (O(e−t)\mathcal{O}(e^{-t})) on strongly convex objectives, in both cases for general sets of atoms. Furthermore, we establish a clear correspondence of our algorithms to known algorithms from the MP and FW literature. Our novel algorithms and analyses target general atom sets and general objective functions, and hence are directly applicable to a large variety of learning settings.Comment: NIPS 201

    Low-complexity RLS algorithms using dichotomous coordinate descent iterations

    Get PDF
    In this paper, we derive low-complexity recursive least squares (RLS) adaptive filtering algorithms. We express the RLS problem in terms of auxiliary normal equations with respect to increments of the filter weights and apply this approach to the exponentially weighted and sliding window cases to derive new RLS techniques. For solving the auxiliary equations, line search methods are used. We first consider conjugate gradient iterations with a complexity of O(N-2) operations per sample; N being the number of the filter weights. To reduce the complexity and make the algorithms more suitable for finite precision implementation, we propose a new dichotomous coordinate descent (DCD) algorithm and apply it to the auxiliary equations. This results in a transversal RLS adaptive filter with complexity as low as 3N multiplications per sample, which is only slightly higher than the complexity of the least mean squares (LMS) algorithm (2N multiplications). Simulations are used to compare the performance of the proposed algorithms against the classical RLS and known advanced adaptive algorithms. Fixed-point FPGA implementation of the proposed DCD-based RLS algorithm is also discussed and results of such implementation are presented

    Non-convex Optimization for Machine Learning

    Full text link
    A vast majority of machine learning algorithms train their models and perform inference by solving optimization problems. In order to capture the learning and prediction problems accurately, structural constraints such as sparsity or low rank are frequently imposed or else the objective itself is designed to be a non-convex function. This is especially true of algorithms that operate in high-dimensional spaces or that train non-linear models such as tensor models and deep networks. The freedom to express the learning problem as a non-convex optimization problem gives immense modeling power to the algorithm designer, but often such problems are NP-hard to solve. A popular workaround to this has been to relax non-convex problems to convex ones and use traditional methods to solve the (convex) relaxed optimization problems. However this approach may be lossy and nevertheless presents significant challenges for large scale optimization. On the other hand, direct approaches to non-convex optimization have met with resounding success in several domains and remain the methods of choice for the practitioner, as they frequently outperform relaxation-based techniques - popular heuristics include projected gradient descent and alternating minimization. However, these are often poorly understood in terms of their convergence and other properties. This monograph presents a selection of recent advances that bridge a long-standing gap in our understanding of these heuristics. The monograph will lead the reader through several widely used non-convex optimization techniques, as well as applications thereof. The goal of this monograph is to both, introduce the rich literature in this area, as well as equip the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems.Comment: The official publication is available from now publishers via http://dx.doi.org/10.1561/220000005

    Optimization with Sparsity-Inducing Penalties

    Get PDF
    Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate non-smooth norms. The goal of this paper is to present from a general perspective optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted ℓ2\ell_2-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view
    • 

    corecore