66 research outputs found

    Adaptive Relaxed ADMM: Convergence Theory and Practical Implementation

    Full text link
    Many modern computer vision and machine learning applications rely on solving difficult optimization problems that involve non-differentiable objective functions and constraints. The alternating direction method of multipliers (ADMM) is a widely used approach to solve such problems. Relaxed ADMM is a generalization of ADMM that often achieves better performance, but its efficiency depends strongly on algorithm parameters that must be chosen by an expert user. We propose an adaptive method that automatically tunes the key algorithm parameters to achieve optimal performance without user oversight. Inspired by recent work on adaptivity, the proposed adaptive relaxed ADMM (ARADMM) is derived by assuming a Barzilai-Borwein style linear gradient. A detailed convergence analysis of ARADMM is provided, and numerical results on several applications demonstrate fast practical convergence.Comment: CVPR 201

    DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting

    Full text link
    Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising alternative. However, even for reasonably-sized neural networks, these relaxations are not tractable, and so must be replaced by even weaker relaxations in practice. In this work, we propose a novel operator splitting method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller sub-problems that often have analytical solutions. The method is modular and scales to problem instances that were previously impossible to solve exactly due to their size. Furthermore, the solver operations are amenable to fast parallelization with GPU acceleration. We demonstrate our method in obtaining tighter bounds on the worst-case performance of large convolutional networks in image classification and reinforcement learning settings

    Scalable Machine Learning Methods for Massive Biomedical Data Analysis.

    Full text link
    Modern data acquisition techniques have enabled biomedical researchers to collect and analyze datasets of substantial size and complexity. The massive size of these datasets allows us to comprehensively study the biological system of interest at an unprecedented level of detail, which may lead to the discovery of clinically relevant biomarkers. Nonetheless, the dimensionality of these datasets presents critical computational and statistical challenges, as traditional statistical methods break down when the number of predictors dominates the number of observations, a setting frequently encountered in biomedical data analysis. This difficulty is compounded by the fact that biological data tend to be noisy and often possess complex correlation patterns among the predictors. The central goal of this dissertation is to develop a computationally tractable machine learning framework that allows us to extract scientifically meaningful information from these massive and highly complex biomedical datasets. We motivate the scope of our study by considering two important problems with clinical relevance: (1) uncertainty analysis for biomedical image registration, and (2) psychiatric disease prediction based on functional connectomes, which are high dimensional correlation maps generated from resting state functional MRI.PhDElectrical Engineering: SystemsUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/111354/1/takanori_1.pd

    Plug-and-Play Methods Provably Converge with Properly Trained Denoisers

    Full text link
    Plug-and-play (PnP) is a non-convex framework that integrates modern denoising priors, such as BM3D or deep learning-based denoisers, into ADMM or other proximal algorithms. An advantage of PnP is that one can use pre-trained denoisers when there is not sufficient data for end-to-end training. Although PnP has been recently studied extensively with great empirical success, theoretical analysis addressing even the most basic question of convergence has been insufficient. In this paper, we theoretically establish convergence of PnP-FBS and PnP-ADMM, without using diminishing stepsizes, under a certain Lipschitz condition on the denoisers. We then propose real spectral normalization, a technique for training deep learning-based denoisers to satisfy the proposed Lipschitz condition. Finally, we present experimental results validating the theory.Comment: Published in the International Conference on Machine Learning, 201

    Review of Extreme Multilabel Classification

    Full text link
    Extreme multilabel classification or XML, is an active area of interest in machine learning. Compared to traditional multilabel classification, here the number of labels is extremely large, hence, the name extreme multilabel classification. Using classical one versus all classification wont scale in this case due to large number of labels, same is true for any other classifiers. Embedding of labels as well as features into smaller label space is an essential first step. Moreover, other issues include existence of head and tail labels, where tail labels are labels which exist in relatively smaller number of given samples. The existence of tail labels creates issues during embedding. This area has invited application of wide range of approaches ranging from bit compression motivated from compressed sensing, tree based embeddings, deep learning based latent space embedding including using attention weights, linear algebra based embeddings such as SVD, clustering, hashing, to name a few. The community has come up with a useful set of metrics to identify correctly the prediction for head or tail labels.Comment: 46 pages, 13 figure

    Scalable and Ensemble Learning for Big Data

    Get PDF
    University of Minnesota Ph.D. dissertation. May 2019. Major: Electrical/Computer Engineering. Advisor: Georgios Giannakis. 1 computer file (PDF); xi, 126 pages.The turn of the decade has trademarked society and computing research with a ``data deluge.'' As the number of smart, highly accurate and Internet-capable devices increases, so does the amount of data that is generated and collected. While this sheer amount of data has the potential to enable high quality inference, and mining of information, it introduces numerous challenges in the processing and pattern analysis, since available statistical inference and machine learning approaches do not necessarily scale well with the number of data and their dimensionality. In addition to the challenges related to scalability, data gathered are often noisy, dynamic, contaminated by outliers or corrupted to specifically inhibit the inference task. Moreover, many machine learning approaches have been shown to be susceptible to adversarial attacks. At the same time, the cost of cloud and distributed computing is rapidly declining. Therefore, there is a pressing need for statistical inference and machine learning tools that are robust to attacks and scale with the volume and dimensionality of the data, by harnessing efficiently the available computational resources. This thesis is centered on analytical and algorithmic foundations that aim to enable statistical inference and data analytics from large volumes of high-dimensional data. The vision is to establish a comprehensive framework based on state-of-the-art machine learning, optimization and statistical inference tools to enable truly large-scale inference, which can tap on the available (possibly distributed) computational resources, and be resilient to adversarial attacks. The ultimate goal is to both analytically and numerically demonstrate how valuable insights from signal processing can lead to markedly improved and accelerated learning tools. To this end, the present thesis investigates two main research thrusts: i) Large-scale subspace clustering; and ii) unsupervised ensemble learning. The aforementioned research thrusts introduce novel algorithms that aim to tackle the issues of large-scale learning. The potential of the proposed algorithms is showcased by rigorous theoretical results and extensive numerical tests
    • ā€¦
    corecore