12,410 research outputs found

    Data-Efficient Learning via Minimizing Hyperspherical Energy

    Full text link
    Deep learning on large-scale data is dominant nowadays. The unprecedented scale of data has been arguably one of the most important driving forces for the success of deep learning. However, there still exist scenarios where collecting data or labels could be extremely expensive, e.g., medical imaging and robotics. To fill up this gap, this paper considers the problem of data-efficient learning from scratch using a small amount of representative data. First, we characterize this problem by active learning on homeomorphic tubes of spherical manifolds. This naturally generates feasible hypothesis class. With homologous topological properties, we identify an important connection -- finding tube manifolds is equivalent to minimizing hyperspherical energy (MHE) in physical geometry. Inspired by this connection, we propose a MHE-based active learning (MHEAL) algorithm, and provide comprehensive theoretical guarantees for MHEAL, covering convergence and generalization analysis. Finally, we demonstrate the empirical performance of MHEAL in a wide range of applications on data-efficient learning, including deep clustering, distribution matching, version space sampling and deep active learning

    Activity Identification and Local Linear Convergence of Forward--Backward-type methods

    Full text link
    In this paper, we consider a class of Forward--Backward (FB) splitting methods that includes several variants (e.g. inertial schemes, FISTA) for minimizing the sum of two proper convex and lower semi-continuous functions, one of which has a Lipschitz continuous gradient, and the other is partly smooth relatively to a smooth active manifold M\mathcal{M}. We propose a unified framework, under which we show that, this class of FB-type algorithms (i) correctly identifies the active manifolds in a finite number of iterations (finite activity identification), and (ii) then enters a local linear convergence regime, which we characterize precisely in terms of the structure of the underlying active manifolds. For simpler problems involving polyhedral functions, we show finite termination. We also establish and explain why FISTA (with convergent sequences) locally oscillates and can be slower than FB. These results may have numerous applications including in signal/image processing, sparse recovery and machine learning. Indeed, the obtained results explain the typical behaviour that has been observed numerically for many problems in these fields such as the Lasso, the group Lasso, the fused Lasso and the nuclear norm regularization to name only a few.Comment: Full length version of the previous short on

    Local Linear Convergence Analysis of Primal-Dual Splitting Methods

    Full text link
    In this paper, we study the local linear convergence properties of a versatile class of Primal-Dual splitting methods for minimizing composite non-smooth convex optimization problems. Under the assumption that the non-smooth components of the problem are partly smooth relative to smooth manifolds, we present a unified local convergence analysis framework for these methods. More precisely, in our framework we first show that (i) the sequences generated by Primal-Dual splitting methods identify a pair of primal and dual smooth manifolds in a finite number of iterations, and then (ii) enter a local linear convergence regime, which is characterized based on the structure of the underlying active smooth manifolds. We also show how our results for Primal-Dual splitting can be specialized to cover existing ones on Forward-Backward splitting and Douglas-Rachford splitting/ADMM (alternating direction methods of multipliers). Moreover, based on these obtained local convergence analysis result, several practical acceleration techniques are discussed. To exemplify the usefulness of the obtained result, we consider several concrete numerical experiments arising from fields including signal/image processing, inverse problems and machine learning, etc. The demonstration not only verifies the local linear convergence behaviour of Primal-Dual splitting methods, but also the insights on how to accelerate them in practice

    Sparse Coding on Symmetric Positive Definite Manifolds using Bregman Divergences

    Full text link
    This paper introduces sparse coding and dictionary learning for Symmetric Positive Definite (SPD) matrices, which are often used in machine learning, computer vision and related areas. Unlike traditional sparse coding schemes that work in vector spaces, in this paper we discuss how SPD matrices can be described by sparse combination of dictionary atoms, where the atoms are also SPD matrices. We propose to seek sparse coding by embedding the space of SPD matrices into Hilbert spaces through two types of Bregman matrix divergences. This not only leads to an efficient way of performing sparse coding, but also an online and iterative scheme for dictionary learning. We apply the proposed methods to several computer vision tasks where images are represented by region covariance matrices. Our proposed algorithms outperform state-of-the-art methods on a wide range of classification tasks, including face recognition, action recognition, material classification and texture categorization

    The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

    Get PDF
    Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning and have been utilized by other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found only in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments, some "old" and some new. These are generally unknown to most of the astronomical community, but are vital to the analysis and visualization of complex datasets and images. In order for astronomers to take advantage of the richness and complexity of the new era of data, and to be able to identify, adopt, and apply new solutions, the astronomical community needs a certain degree of awareness and understanding of the new concepts. One of the goals of this paper is to help bridge the gap between applied mathematics, artificial intelligence and computer science on the one side and astronomy on the other.Comment: 24 pages, 8 Figures, 1 Table. Accepted for publication: "Advances in Astronomy, special issue "Robotic Astronomy
    corecore