Adaptive Relaxed ADMM: Convergence Theory and Practical Implementation
Many modern computer vision and machine learning applications rely on solving
difficult optimization problems that involve non-differentiable objective
functions and constraints. The alternating direction method of multipliers
(ADMM) is a widely used approach to solve such problems. Relaxed ADMM is a
generalization of ADMM that often achieves better performance, but its
efficiency depends strongly on algorithm parameters that must be chosen by an
expert user. We propose an adaptive method that automatically tunes the key
algorithm parameters to achieve optimal performance without user oversight.
Inspired by recent work on adaptivity, the proposed adaptive relaxed ADMM
(ARADMM) is derived by assuming a Barzilai-Borwein style linear gradient. A
detailed convergence analysis of ARADMM is provided, and numerical results on
several applications demonstrate fast practical convergence. Comment: CVPR 201
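To make the relaxation step in the abstract above concrete, here is a minimal sketch of relaxed ADMM with fixed parameters, applied to a lasso problem. It is not the ARADMM method: the penalty `rho` and relaxation `alpha` are held constant rather than tuned adaptively, and the function names, the lasso objective, and the default values are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def relaxed_admm_lasso(A, b, lam, rho=1.0, alpha=1.5, iters=500):
    """Sketch: minimize 0.5*||Ax - b||^2 + lam*||z||_1 subject to x = z.

    rho (penalty) and alpha (relaxation, typically in (0, 2)) are fixed
    here; an adaptive scheme would update them during the iterations.
    """
    n = A.shape[1]
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    # A^T A + rho*I does not change while rho is fixed, so factor it once.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Atb = A.T @ b
    for _ in range(iters):
        # x-update: solve the regularized least-squares subproblem.
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # Relaxation: blend the new x with the previous z.
        x_hat = alpha * x + (1.0 - alpha) * z
        # z-update: proximal step on the l1 term.
        z = soft_threshold(x_hat + u, lam / rho)
        # Dual update on the relaxed residual.
        u = u + x_hat - z
    return z
```

Setting `alpha=1.0` recovers plain ADMM, which makes it easy to compare the two variants on the same problem.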
DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting
Analyzing the worst-case performance of deep neural networks against input
perturbations amounts to solving a large-scale non-convex optimization problem,
for which several past works have proposed convex relaxations as a promising
alternative. However, even for reasonably-sized neural networks, these
relaxations are not tractable, and so must be replaced by even weaker
relaxations in practice. In this work, we propose a novel operator splitting
method that can directly solve a convex relaxation of the problem to high
accuracy, by splitting it into smaller sub-problems that often have analytical
solutions. The method is modular and scales to problem instances that were
previously impossible to solve exactly due to their size. Furthermore, the
solver operations are amenable to fast parallelization with GPU acceleration.
We demonstrate our method in obtaining tighter bounds on the worst-case
performance of large convolutional networks in image classification and
reinforcement learning settings.
Scalable Machine Learning Methods for Massive Biomedical Data Analysis.
Modern data acquisition techniques have enabled biomedical researchers to collect and analyze datasets of substantial size and complexity. The massive size of these datasets allows us to comprehensively study the biological system of interest at an unprecedented level of detail, which may lead to the discovery of clinically relevant biomarkers. Nonetheless, the dimensionality of these datasets presents critical computational and statistical challenges, as traditional statistical methods break down when the number of predictors dominates the number of observations, a setting frequently encountered in biomedical data analysis. This difficulty is compounded by the fact that biological data tend to be noisy and often possess complex correlation patterns among the predictors. The central goal of this dissertation is to develop a computationally tractable machine learning framework that allows us to extract scientifically meaningful information from these massive and highly complex biomedical datasets. We motivate the scope of our study by considering two important problems with clinical relevance: (1) uncertainty analysis for biomedical image registration, and (2) psychiatric disease prediction based on functional connectomes, which are high dimensional correlation maps generated from resting state functional MRI. PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/111354/1/takanori_1.pd
Plug-and-Play Methods Provably Converge with Properly Trained Denoisers
Plug-and-play (PnP) is a non-convex framework that integrates modern
denoising priors, such as BM3D or deep learning-based denoisers, into ADMM or
other proximal algorithms. An advantage of PnP is that one can use pre-trained
denoisers when there is not sufficient data for end-to-end training. Although
PnP has been recently studied extensively with great empirical success,
theoretical analysis addressing even the most basic question of convergence has
been insufficient. In this paper, we theoretically establish convergence of
PnP-FBS and PnP-ADMM, without using diminishing stepsizes, under a certain
Lipschitz condition on the denoisers. We then propose real spectral
normalization, a technique for training deep learning-based denoisers to
satisfy the proposed Lipschitz condition. Finally, we present experimental
results validating the theory. Comment: Published in the International Conference on Machine Learning, 201
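The forward-backward form of plug-and-play mentioned in the abstract above can be sketched as a gradient step on the data-fit term followed by a denoiser call. This is a toy illustration under stated assumptions, not the paper's implementation: the function name `pnp_fbs`, the quadratic data term, and the uniform-shrinkage stand-in for a denoiser are all hypothetical, and the stand-in is chosen only so that the toy iteration is a contraction.

```python
import numpy as np

def pnp_fbs(grad_f, denoiser, x0, step, iters=100):
    """Sketch of plug-and-play forward-backward splitting.

    Repeats x <- denoiser(x - step * grad_f(x)) with a constant step size.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = denoiser(x - step * grad_f(x))
    return x

# Toy data term (assumption): f(x) = 0.5 * ||x - b||^2, so grad_f(x) = x - b.
b = np.array([2.0, -4.0, 6.0])
grad_f = lambda x: x - b

# Stand-in "denoiser" (assumption): uniform shrinkage by a factor of 0.5.
# A real PnP setup would call BM3D or a trained network here instead.
shrink = lambda v: 0.5 * v

x_star = pnp_fbs(grad_f, shrink, x0=np.zeros(3), step=0.5)
```

With these choices each iteration maps x to 0.25*x + 0.25*b, so the iterates approach the fixed point b/3 without any diminishing step size.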
Review of Extreme Multilabel Classification
Extreme multilabel classification, or XML, is an active area of interest in
machine learning. Compared to traditional multilabel classification, the
number of labels here is extremely large, hence the name extreme multilabel
classification. Classical one-versus-all classification won't scale in
this case because of the large number of labels, and the same is true of
other classifiers. Embedding labels as well as features into a smaller label
space is an essential first step. Other issues include the existence of head
and tail labels, where tail labels are labels that occur in a relatively small
number of the given samples. The existence of tail labels creates issues during
embedding. This area has invited a wide range of approaches, including
bit compression motivated by compressed sensing, tree-based
embeddings, deep learning-based latent space embeddings (including those using
attention weights), linear algebra-based embeddings such as SVD, clustering,
and hashing, to name a few. The community has come up with a useful set of
metrics for correctly identifying predictions for head or tail labels. Comment: 46 pages, 13 figure
Scalable and Ensemble Learning for Big Data
University of Minnesota Ph.D. dissertation. May 2019. Major: Electrical/Computer Engineering. Advisor: Georgios Giannakis. 1 computer file (PDF); xi, 126 pages. The turn of the decade has trademarked society and computing research with a "data deluge." As the number of smart, highly accurate and Internet-capable devices increases, so does the amount of data that is generated and collected. While this sheer amount of data has the potential to enable high quality inference and mining of information, it introduces numerous challenges in processing and pattern analysis, since available statistical inference and machine learning approaches do not necessarily scale well with the volume of data and their dimensionality. In addition to the challenges related to scalability, the data gathered are often noisy, dynamic, contaminated by outliers, or corrupted specifically to inhibit the inference task. Moreover, many machine learning approaches have been shown to be susceptible to adversarial attacks. At the same time, the cost of cloud and distributed computing is rapidly declining. Therefore, there is a pressing need for statistical inference and machine learning tools that are robust to attacks and scale with the volume and dimensionality of the data by efficiently harnessing the available computational resources. This thesis is centered on analytical and algorithmic foundations that aim to enable statistical inference and data analytics from large volumes of high-dimensional data. The vision is to establish a comprehensive framework based on state-of-the-art machine learning, optimization and statistical inference tools to enable truly large-scale inference, which can tap into the available (possibly distributed) computational resources and be resilient to adversarial attacks. The ultimate goal is to demonstrate, both analytically and numerically, how valuable insights from signal processing can lead to markedly improved and accelerated learning tools.
To this end, the present thesis investigates two main research thrusts: i) large-scale subspace clustering; and ii) unsupervised ensemble learning. These research thrusts introduce novel algorithms that aim to tackle the issues of large-scale learning. The potential of the proposed algorithms is showcased by rigorous theoretical results and extensive numerical tests.