Faster and Non-ergodic O(1/K) Stochastic Alternating Direction Method of Multipliers
We study stochastic convex optimization subject to linear equality
constraints. The traditional stochastic Alternating Direction Method of
Multipliers (ADMM) and its Nesterov-accelerated variant can only achieve
ergodic O(1/\sqrt{K}) convergence rates, where K is the number of iterations.
By introducing Variance Reduction (VR) techniques, the convergence rates
improve to ergodic O(1/K). In this paper, we propose a new stochastic ADMM
that carefully integrates Nesterov's extrapolation and VR techniques. We prove
that our algorithm achieves a non-ergodic O(1/K) convergence rate, which is
optimal for separable linearly constrained non-smooth convex problems, while
the convergence rates of VR-based ADMM methods are in fact tight at
O(1/\sqrt{K}) in the non-ergodic sense. To the best of our knowledge, this is
the first work to achieve a truly accelerated stochastic convergence rate for
constrained convex problems. Experimental results demonstrate that our
algorithm is significantly faster than existing state-of-the-art stochastic
ADMM methods.
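To make the combination concrete, here is a minimal sketch (our own construction, not the authors' algorithm) of a linearized stochastic ADMM step on a toy equality-constrained least-squares problem, pairing an SVRG-style variance-reduced gradient with a Nesterov-type extrapolation point; the step size eta, penalty beta, and extrapolation weight theta are illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
n, d, m = 200, 10, 3
F = rng.standard_normal((n, d)); y = rng.standard_normal(n)  # data for the f_i
A = rng.standard_normal((m, d)); b = rng.standard_normal(m)  # constraint A x = b

def grad_i(x, i):    # stochastic gradient of f_i(x) = 0.5*(F[i] @ x - y[i])**2
    return F[i] * (F[i] @ x - y[i])

def full_grad(x):    # full gradient, used as the SVRG anchor
    return F.T @ (F @ x - y) / n

x = np.zeros(d); x_prev = x.copy()
lam = np.zeros(m)                          # dual multipliers
eta, beta, theta = 0.01, 1.0, 0.5          # illustrative hyper-parameters

for epoch in range(30):
    snap = x.copy(); g_snap = full_grad(snap)        # SVRG snapshot point
    for _ in range(n):
        w = x + theta * (x - x_prev)                 # Nesterov-type extrapolation
        i = rng.integers(n)
        g = grad_i(w, i) - grad_i(snap, i) + g_snap  # variance-reduced gradient
        x_prev = x
        # linearized primal step on the augmented Lagrangian
        x = w - eta * (g + A.T @ (lam + beta * (A @ w - b)))
    lam = lam + beta * (A @ x - b)                   # dual ascent step

print("constraint violation:", np.linalg.norm(A @ x - b))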
Auxiliary Image Regularization for Deep CNNs with Noisy Labels
Precisely labeled data sets with a sufficient number of samples are very
important for training deep convolutional neural networks (CNNs). However,
many of the available real-world data sets contain erroneously labeled
samples, and those errors substantially hinder the learning of very accurate
CNN models. In this work, we consider the problem of training a deep CNN model
for image classification with mislabeled training samples, an issue that is
common in real image data sets with tags supplied by amateur users. To solve
this problem, we propose an auxiliary image regularization technique,
optimized by the stochastic Alternating Direction Method of Multipliers (ADMM)
algorithm, that automatically exploits the mutual context information among
training images and encourages the model to select reliable images to
robustify the learning process. Comprehensive experiments on benchmark data
sets clearly demonstrate that our proposed regularized CNN model is resistant
to label noise in the training data. Comment: Published as a conference paper
at ICLR 201
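The abstract does not spell out the regularizer, so the sketch below shows only the generic splitting pattern behind "loss plus auxiliary regularizer" training with stochastic ADMM (our illustration, not the paper's formulation): minimize L(w) + R(v) subject to w = v, with an l1 penalty standing in for the auxiliary image regularizer and rho, eta, lam as illustrative hyper-parameters.

import numpy as np

rng = np.random.default_rng(1)
d = 50
X = rng.standard_normal((500, d)); y = rng.integers(0, 2, 500) * 2.0 - 1.0

def stoch_grad(w, batch):    # minibatch gradient of a logistic training loss
    Xb, yb = X[batch], y[batch]
    s = 1.0 / (1.0 + np.exp(yb * (Xb @ w)))
    return -(Xb * (yb * s)[:, None]).mean(axis=0)

def prox_l1(u, t):           # prox of t*||.||_1, a stand-in regularizer
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

w = np.zeros(d); v = np.zeros(d); u = np.zeros(d)  # primal, split copy, scaled dual
rho, eta, lam = 1.0, 0.1, 0.05                     # illustrative hyper-parameters

for it in range(2000):
    batch = rng.integers(0, 500, 32)
    w -= eta * (stoch_grad(w, batch) + rho * (w - v + u))  # stochastic w-step
    v = prox_l1(w + u, lam / rho)                          # prox step enforces R
    u += w - v                                             # dual update ties w to v

print("nonzero coordinates kept by the regularizer:", int((np.abs(v) > 1e-8).sum()))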
Game-Theoretic Design of Secure and Resilient Distributed Support Vector Machines with Adversaries
With a large number of sensors and control units in networked systems,
distributed support vector machines (DSVMs) play a fundamental role in
scalable and efficient multi-sensor classification and prediction tasks.
However, DSVMs are vulnerable to adversaries who can modify and generate data
to deceive the system into misclassification and misprediction. This work aims
to design defense strategies for the DSVM learner against a potential
adversary. We establish a game-theoretic framework to capture the conflicting
interests of the DSVM learner and the attacker. The Nash equilibrium of the
game allows us to predict the outcome of learning algorithms in adversarial
environments and to enhance the resilience of machine learning through dynamic
distributed learning algorithms. We show that the DSVM learner is less
vulnerable when it uses a balanced network with fewer nodes and higher degree.
We also show that adding more training samples is an efficient defense
strategy against an attacker. We present secure and resilient DSVM algorithms
with verification and rejection methods, and show their resilience against
adversaries in numerical experiments.
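As a toy picture of such a game, the sketch below (our illustration of best-response dynamics, not the paper's DSVM game) alternates a hinge-loss learner update with a budget-constrained attacker who shifts the training points to raise the learner's loss; the budget eps and step size eta are assumptions.

import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 5))
y = np.sign(X @ rng.standard_normal(5))

def hinge_grad(w, Xa, y, C=1.0):          # subgradient of ridge + hinge loss
    active = y * (Xa @ w) < 1
    g = w.copy()
    if active.any():
        g -= C * (Xa[active] * y[active, None]).mean(axis=0)
    return g

w = np.zeros(5)
delta = np.zeros_like(X)                  # attacker's data perturbation
eps, eta = 0.3, 0.05                      # illustrative budget and step size

for rnd in range(50):
    for _ in range(100):                  # learner's (approximate) best response
        w -= eta * hinge_grad(w, X + delta, y)
    # attacker's response: push every point against its label, then project
    # back onto the per-sample budget ||delta_i||_2 <= eps
    delta -= eta * (y[:, None] * w)
    norms = np.maximum(np.linalg.norm(delta, axis=1, keepdims=True), 1e-12)
    delta *= np.minimum(1.0, eps / norms)

print("training accuracy at the fixed point:", (np.sign((X + delta) @ w) == y).mean())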
Learning-based Resource Optimization in Ultra Reliable Low Latency HetNets
In this paper, the problems of user offloading and resource optimization are
jointly addressed to support ultra-reliable low-latency communication (URLLC)
in HetNets. In particular, we consider a multi-tier network with a single
macro base station (MBS) and multiple overlaid small cell base stations
(SBSs), serving users with different latency and reliability constraints. By
modeling the latency and reliability constraints of users with probabilistic
guarantees, we formulate the joint problem of user offloading and resource
allocation (JUR) in a URLLC setting as an optimization problem that minimizes
the MBS's cost of serving users. In the considered scheme, SBSs bid to serve
the URLLC users under their coverage at a given price, and the MBS decides
whether to serve each user locally or to offload it to one of the overlaid
SBSs. Since the JUR optimization problem is NP-hard, we propose a
low-complexity learning-based heuristic method (LHM), which combines a support
vector machine-based user association model with a convex resource
optimization (CRO) algorithm. To further reduce the delay, we propose an
alternating direction method of multipliers (ADMM)-based solution to the CRO
problem. Simulation results show that, using LHM, the MBS significantly
decreases the spectrum access delay for users (by 93%) compared to JUR, while
also reducing its bandwidth and power costs in serving users (by 33%) compared
to directly serving users without offloading. Comment: Submitted to IEEE
Globecom 201
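A minimal sketch of the two-stage idea follows (our construction, not the paper's LHM; the three features, the synthetic labels, and the toy bandwidth rule are all assumptions for illustration): an SVM learned offline replaces the NP-hard joint decision at run time, and a simple convex-style allocation then splits bandwidth among the users kept at the MBS.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
# synthetic training set: user features -> decision from an (offline) exact solver
feats = rng.random((300, 3))              # [latency budget, reliability, channel gain]
labels = (feats @ np.array([1.5, 1.0, -2.0]) > 0.2).astype(int)  # 1 = offload to SBS
assoc = SVC(kernel="rbf").fit(feats, labels)

new_users = rng.random((20, 3))
offload = assoc.predict(new_users).astype(bool)   # fast run-time association
local = new_users[~offload]

# toy convex-style resource step at the MBS: users with worse channel gain
# receive proportionally larger normalized bandwidth shares
gains = np.maximum(local[:, 2], 1e-3)
shares = (1.0 / gains) / (1.0 / gains).sum()
print(f"{int(offload.sum())} users offloaded; MBS bandwidth shares: {np.round(shares, 3)}")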
Proximal Methods for Sparse Optimal Scoring and Discriminant Analysis
Linear discriminant analysis (LDA) is a classical method for dimensionality
reduction, in which discriminant vectors are sought to project data to a
lower-dimensional space for optimal separability of classes. Several recent
papers have outlined strategies for exploiting sparsity when using LDA with
high-dimensional data. However, many lack scalable methods for solving the
underlying optimization problems. We propose three new numerical optimization
schemes for solving the sparse optimal scoring formulation of LDA, based on
block coordinate descent, the proximal gradient method, and the alternating
direction method of multipliers. We show that the per-iteration cost of these
methods scales linearly in the dimension of the data provided that restricted
regularization terms are employed, and cubically in the dimension of the data
in the worst case. Furthermore, we establish that if our block coordinate
descent framework generates convergent subsequences of iterates, then these
subsequences converge to stationary points of the sparse optimal scoring
problem. We demonstrate the effectiveness of our new methods with empirical
results for classification of Gaussian data and data sets drawn from
benchmarking repositories, including time-series and multispectral X-ray data,
and provide Matlab and R implementations of our optimization schemes.
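For instance, with the scoring vector held fixed, the proximal-gradient variant reduces to an ISTA-style iteration with soft-thresholding. The sketch below is a minimal illustration of that inner step (not the paper's full block coordinate scheme; dimensions and lam are illustrative), solving minimize 0.5*||Y theta - X beta||^2 + lam*||beta||_1 over beta.

import numpy as np

rng = np.random.default_rng(4)
n, p, k = 100, 40, 3
X = rng.standard_normal((n, p))
Y = np.eye(k)[rng.integers(0, k, n)]      # class-indicator matrix
theta = rng.standard_normal(k)            # scoring vector, fixed for this step
t = Y @ theta
lam = 1.0
L = np.linalg.norm(X, 2) ** 2             # Lipschitz constant of the smooth part

beta = np.zeros(p)
for _ in range(500):
    g = X.T @ (X @ beta - t)                                  # gradient step ...
    u = beta - g / L
    beta = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)  # ... then l1 prox

print("variables selected:", int((np.abs(beta) > 1e-8).sum()))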
Structured Adversarial Attack: Towards General Implementation and Better Interpretability
When generating adversarial examples to attack deep neural networks (DNNs),
the Lp norm of the added perturbation is usually used to measure the
similarity between the original image and the adversarial example. However,
adversarial attacks that perturb the raw input space may fail to capture
structural information hidden in the input. This work develops a more general
attack model, the structured attack (StrAttack), which explores group sparsity
in adversarial perturbations by sliding a mask through images with the aim of
extracting key spatial structures. An ADMM (alternating direction method of
multipliers)-based framework is proposed that splits the original problem into
a sequence of analytically solvable subproblems and can be generalized to
implement other attack methods. Strong group sparsity is achieved in the
adversarial perturbations even at the same level of Lp-norm distortion as
state-of-the-art attacks. We demonstrate the effectiveness of StrAttack
through extensive experimental results on MNIST, CIFAR-10, and ImageNet. We
also show that StrAttack provides better interpretability (i.e., better
correspondence with discriminative image regions) through adversarial saliency
maps (Papernot et al., 2016b) and class activation maps (Zhou et al., 2016).
Comment: Published as a conference paper at ICLR 201
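The group-sparsity mechanism can be illustrated by the z-subproblem of such an ADMM splitting: group soft-thresholding applied patch by patch zeroes out whole spatial groups of the perturbation, leaving only key structures. The sketch below is our illustration of that operator, not the full StrAttack; the patch size and threshold tau are assumptions.

import numpy as np

def group_soft_threshold(delta, patch=4, tau=0.5):
    """Zero patches whose l2 norm is below tau; shrink the remaining patches."""
    out = delta.copy()
    h, w = delta.shape
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            blk = out[i:i + patch, j:j + patch]   # a view into `out`
            nrm = np.linalg.norm(blk)
            blk *= max(0.0, 1.0 - tau / nrm) if nrm > 0 else 0.0
    return out

rng = np.random.default_rng(5)
delta = 0.1 * rng.standard_normal((28, 28))       # diffuse, unstructured noise
delta[8:16, 8:16] += 1.0                          # one strong localized structure
sparse = group_soft_threshold(delta)
kept = int((np.abs(sparse).reshape(7, 4, 7, 4).sum(axis=(1, 3)) > 0).sum())
print(f"{kept} of 49 patches survive group thresholding")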
Sparse Coding with Fast Image Alignment via Large Displacement Optical Flow
Sparse representation-based classifiers have shown outstanding accuracy and
robustness in image classification tasks, even in the presence of intense
noise and occlusion. However, it has been discovered that performance degrades
significantly either when the test image is not aligned with the dictionary
atoms or when the dictionary atoms themselves are not aligned with each other,
in which cases the sparse linear representation assumption fails. In this
paper, with both training and test images misaligned, we introduce a novel
sparse coding framework that efficiently adapts the dictionary atoms to the
test image via large displacement optical flow. In the proposed algorithm,
every dictionary atom is automatically aligned with the input image, and the
sparse code is then recovered using the adapted dictionary atoms. A
corresponding supervised dictionary learning algorithm is also developed for
the proposed framework. Experimental results on digit recognition datasets
verify the efficacy and robustness of the proposed algorithm. Comment: ICASSP
201
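A rough sketch of the pipeline follows (our illustration; OpenCV's Farneback flow is used as a stand-in for the paper's large displacement optical flow, and the random atoms are placeholders): estimate a dense flow from each dictionary atom to the test image, warp the atom accordingly, then sparse-code the test image over the adapted atoms.

import cv2
import numpy as np
from sklearn.linear_model import Lasso

def align_atom(atom, test):
    """Warp `atom` toward `test` along a dense estimated flow."""
    a8 = (atom * 255).astype(np.uint8)    # Farneback expects 8-bit inputs
    t8 = (test * 255).astype(np.uint8)
    flow = cv2.calcOpticalFlowFarneback(a8, t8, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = atom.shape
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (gx + flow[..., 0]).astype(np.float32)
    map_y = (gy + flow[..., 1]).astype(np.float32)
    return cv2.remap(atom, map_x, map_y, cv2.INTER_LINEAR)

rng = np.random.default_rng(6)
atoms = [rng.random((32, 32)).astype(np.float32) for _ in range(10)]  # toy dictionary
test = np.clip(np.roll(atoms[0], 4, axis=1)                           # shifted atom 0
               + 0.05 * rng.random((32, 32)).astype(np.float32), 0, 1)

adapted = np.stack([align_atom(a, test).ravel() for a in atoms], axis=1)
code = Lasso(alpha=0.01).fit(adapted, test.ravel()).coef_             # sparse code
print("dominant adapted atom:", int(np.argmax(np.abs(code))))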
Orientation Determination from Cryo-EM images Using Least Unsquared Deviation
A major challenge in single particle reconstruction from cryo-electron
microscopy is to establish a reliable ab-initio three-dimensional model using
two-dimensional projection images with unknown orientations. Common-lines based
methods estimate the orientations without additional geometric information.
However, such methods fail when the detection rate of common-lines is too low
due to the high level of noise in the images. In previous work, an
approximation to the least-squares global self-consistency error was obtained
using convex relaxation by semidefinite programming. In this paper we
introduce a more robust global self-consistency error and show that the
corresponding optimization problem can be solved via semidefinite relaxation.
To prevent artificial clustering of the estimated viewing directions, we
further introduce a spectral-norm term that is added as a constraint or as a
regularization term to the relaxed minimization problem. The resulting
problems are solved using either the alternating direction method of
multipliers or an iteratively reweighted least squares procedure. Numerical
experiments with both simulated and real images demonstrate that the proposed
methods significantly reduce the orientation estimation error when the
detection rate of common lines is low.
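The IRLS route can be illustrated on a generic least-unsquared-deviations problem, minimize sum_i ||A_i x - b_i|| with norms rather than squared norms (shown here in scalar form, where it reduces to least absolute deviations): each iteration solves a weighted least-squares problem with weights 1/max(|r_i|, eps), progressively downweighting outlying measurements. This is a minimal generic sketch, not the paper's SDP formulation.

import numpy as np

rng = np.random.default_rng(7)
n, d = 100, 4
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true
b[:15] += 10 * rng.standard_normal(15)    # gross outliers in 15 measurements

x = np.linalg.lstsq(A, b, rcond=None)[0]  # plain least-squares initialization
for _ in range(30):
    r = np.abs(A @ x - b)                 # current residual magnitudes
    w = 1.0 / np.maximum(r, 1e-8)         # IRLS weights downweight outliers
    x = np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (w * b))

print("error vs. ground truth:", np.linalg.norm(x - x_true))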
Against Membership Inference Attack: Pruning is All You Need
The large model size, heavy computational operations, and vulnerability to
membership inference attacks (MIA) have impeded the popularity of deep
learning and deep neural networks (DNNs), especially on mobile devices. To
address these challenges, we envision that the weight pruning technique can
help defend DNNs against MIA while reducing model storage and computational
operations. In this work, we propose a pruning algorithm, and we show that the
proposed algorithm can find a subnetwork that prevents privacy leakage from
MIA while achieving accuracy competitive with the original DNNs. We also
verify our theoretical insights with experiments. Our experimental results
illustrate that the attack accuracy under model compression is up to 13.6% and
10% lower than that of the baseline and the Min-Max game, respectively.
Comment: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Machine
Learning (stat.ML)
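As a baseline picture of what a pruning algorithm produces, the sketch below applies generic magnitude pruning (not the paper's MIA-aware criterion; the sparsity level is an illustrative assumption) to obtain a subnetwork mask.

import numpy as np

def prune_by_magnitude(weights, sparsity=0.9):
    """Zero the `sparsity` fraction of weights with the smallest magnitudes."""
    thresh = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= thresh
    return weights * mask, mask

rng = np.random.default_rng(8)
W = rng.standard_normal((256, 128))       # a stand-in weight matrix
W_pruned, mask = prune_by_magnitude(W, sparsity=0.9)
print(f"kept {mask.mean():.1%} of the weights")   # roughly 10% survive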
Managing Randomization in the Multi-Block Alternating Direction Method of Multipliers for Quadratic Optimization
The Alternating Direction Method of Multipliers (ADMM) has gained a lot of
attention for solving large-scale constrained optimization problems with
separable objectives. However, the two-block variable structure of ADMM still
limits the practical computational efficiency of the method, because at least
one large matrix factorization is needed even for linear and convex quadratic
programming. This drawback may be overcome by enforcing a multi-block
structure on the decision variables in the original optimization problem.
Unfortunately, multi-block ADMM with more than two blocks is not guaranteed to
converge. On the other hand, two positive developments have been made: first,
if in each cyclic loop one randomly permutes the updating order of the
multiple blocks, then the method converges in expectation for solving any
system of linear equations with any number of blocks. Second, such a randomly
permuted ADMM also works for equality-constrained convex quadratic programming
even when the objective function is not separable.
The goal of this paper is twofold. First, we add more randomness to ADMM by
developing a randomly assembled cyclic ADMM (RAC-ADMM), in which the decision
variables in each block are randomly assembled. We discuss the theoretical
properties of RAC-ADMM, show when random assembly helps and when it hurts, and
develop a criterion to guarantee that it converges almost surely. Second,
using the theoretical guidance on RAC-ADMM, we conduct multiple numerical
tests on solving both randomly generated and large-scale benchmark quadratic
optimization problems, including continuous and binary graph-partition and
quadratic-assignment problems as well as selected machine learning problems.
Our numerical tests show that RAC-ADMM, with a variable-grouping strategy, can
significantly improve computational efficiency on most quadratic optimization
problems. Comment: Expanded and streamlined theoretical sections. Added
comparisons with other multi-block ADMM variants. Updated Computational
Studies Section on continuous problems, reporting primal and dual residuals
instead of the objective value gap. Added selected machine learning problems
(ElasticNet/Lasso and Support Vector Machine) to the Computational Studies
Section.
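A minimal sketch of the random-assembly idea follows (our implementation for an equality-constrained convex QP, not the authors' code; the block count and penalty beta are illustrative): in every cycle the variable indices are randomly shuffled into blocks before the cyclic block minimizations, and a standard dual ascent step closes the cycle.

import numpy as np

rng = np.random.default_rng(9)
n, m, nblocks = 30, 10, 3
M = rng.standard_normal((n, n)); H = M.T @ M + np.eye(n)   # convex QP objective
c = rng.standard_normal(n)
A = rng.standard_normal((m, n)); b = rng.standard_normal(m)

x = np.zeros(n); lam = np.zeros(m); beta = 1.0

for cycle in range(200):
    perm = rng.permutation(n)                       # random assembly of variables
    for blk in np.array_split(perm, nblocks):
        rest = np.setdiff1d(np.arange(n), blk)
        # exact minimization of the augmented Lagrangian over this block
        Ab, Ar = A[:, blk], A[:, rest]
        lhs = H[np.ix_(blk, blk)] + beta * Ab.T @ Ab
        rhs = -(c[blk] + H[np.ix_(blk, rest)] @ x[rest]
                + Ab.T @ (lam + beta * (Ar @ x[rest] - b)))
        x[blk] = np.linalg.solve(lhs, rhs)
    lam += beta * (A @ x - b)                       # dual ascent step

print("primal residual ||Ax - b|| =", np.linalg.norm(A @ x - b))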