Zeroth-Order Online Alternating Direction Method of Multipliers: Convergence Analysis and Applications
In this paper, we design and analyze a new zeroth-order online algorithm, namely, the zeroth-order online alternating direction method of multipliers (ZOO-ADMM), which enjoys the dual advantages of being gradient-free and of employing the ADMM to accommodate complex structured regularizers. Compared to first-order gradient-based online algorithms, we show that ZOO-ADMM requires $\sqrt{m}$ times more iterations, leading to a convergence rate of $O(\sqrt{m}/\sqrt{T})$, where $m$ is the number of optimization variables and $T$ is the number of iterations. To accelerate ZOO-ADMM, we propose two minibatch strategies, gradient sample averaging and observation averaging, resulting in an improved convergence rate of $O(\sqrt{1+q^{-1}m}/\sqrt{T})$, where $q$ is the minibatch size. In addition to the convergence analysis, we also demonstrate ZOO-ADMM in applications from signal processing, statistics, and machine learning.
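A minimal sketch of the ingredient that makes such methods gradient-free, the standard two-point zeroth-order gradient estimator (illustrative background under common assumptions, not the paper's full ZOO-ADMM update):

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4):
    """Two-point zeroth-order gradient estimate of f at x: draw a
    Gaussian direction u and use two function evaluations. The
    estimate is unbiased for the mu-smoothed gradient and needs no
    derivative information."""
    u = np.random.randn(*x.shape)
    return (f(x + mu * u) - f(x)) / mu * u

# Toy check on a quadratic, whose true gradient at x is x itself.
f = lambda z: 0.5 * z @ z
x = np.random.randn(5)
est = np.mean([zo_gradient(f, x) for _ in range(2000)], axis=0)
print(np.linalg.norm(est - x))   # small: the averaged estimate recovers x
```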
Fast non-coplanar beam orientation optimization based on group sparsity
The selection of beam orientations, which is a key step in radiation
treatment planning, is particularly challenging for non-coplanar radiotherapy
systems due to the large number of candidate beams. In this paper, we report
progress on the group sparsity approach to beam orientation optimization,
wherein beam angles are selected by solving a large scale fluence map
optimization problem with an additional group sparsity penalty term that
encourages most candidate beams to be inactive. The optimization problem is
solved using an accelerated proximal gradient method, the Fast Iterative
Shrinkage-Thresholding Algorithm (FISTA). We derive a closed-form expression
for a relevant proximal operator which enables the application of FISTA. The
proposed algorithm is used to create non-coplanar treatment plans for four
cases (including head and neck, lung, and prostate cases), and the resulting
plans are compared with clinical plans. The dosimetric quality of the group
sparsity treatment plans is superior to that of the clinical plans. Moreover,
the runtime for the group sparsity approach is typically about 5 minutes.
Problems of this size could not be handled using the previous group sparsity
method for beam orientation optimization, which was slow even on much smaller
coplanar cases. This work demonstrates for the first time that the group
sparsity approach, when combined with an accelerated proximal gradient method
such as FISTA, works effectively for non-coplanar cases with 500-800 candidate
beams.

Comment: A preliminary version of this work was reported in the AAPM 2016 oral presentation "4pi Non-Coplanar IMRT Beam Angle Selection by Convex Optimization with Group Sparsity Penalty" (link: http://www.aapm.org/meetings/2016am/PRAbs.asp?mid=115&aid=33413).
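To make the optimization machinery concrete, here is a hedged sketch of FISTA with block soft-thresholding, the textbook proximal operator for a group-lasso penalty $\lambda \sum_g \|x_g\|_2$; the paper derives a closed-form proximal operator for its own penalty, which this generic toy instance only approximates. All names and the toy least-squares problem are illustrative.

```python
import numpy as np

def group_prox(v, t, groups):
    """Block soft-thresholding: the proximal operator of
    t * sum_g ||v_g||_2. Any group whose norm falls below t is set
    exactly to zero, which is what deactivates candidate beams."""
    out = v.copy()
    for g in groups:
        n = np.linalg.norm(v[g])
        out[g] = 0.0 if n <= t else (1 - t / n) * v[g]
    return out

def fista(grad, prox, L, x0, iters=300):
    """FISTA: accelerated proximal gradient for min f(x) + h(x),
    given grad f, the prox of h, and a Lipschitz constant L of grad f."""
    x, y, tk = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_new = prox(y - grad(y) / L, 1.0 / L)
        t_new = (1 + np.sqrt(1 + 4 * tk ** 2)) / 2
        y = x_new + (tk - 1) / t_new * (x_new - x)
        x, tk = x_new, t_new
    return x

# Toy instance: least squares with a group-sparsity penalty, where the
# ground truth has only the first group active.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 12))
x_true = np.zeros(12)
x_true[:4] = 3 * rng.normal(size=4)
b = A @ x_true + 0.1 * rng.normal(size=30)
groups = [range(0, 4), range(4, 8), range(8, 12)]
lam = 4.0
L = np.linalg.norm(A, 2) ** 2
x = fista(lambda z: A.T @ (A @ z - b),
          lambda z, t: group_prox(z, lam * t, groups),
          L, np.zeros(12))
print([round(np.linalg.norm(x[g]), 4) for g in groups])  # inactive groups are exactly 0
```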
A Two-Step Pre-Processing for Semidefinite Programming
In semidefinite programming (SDP), a number of pre-processing techniques have
been developed including chordal-completion procedures, which reduce the
dimension of individual constraints by exploiting sparsity therein, and facial
reduction, which reduces the dimension of the problem by removing redundant
rows and columns. This paper suggests that these techniques work in a complementary manner and that facial reduction should be applied after chordal-completion procedures. In computational experiments on SDP instances from the SDPLIB benchmark library and on structured instances from polynomial and binary quadratic optimisation, we show that this two-step pre-processing followed by a standard interior-point method outperforms the interior-point method alone, with or without the traditional pre-processing.
Introduction to Nonnegative Matrix Factorization
In this paper, we introduce and provide a short overview of nonnegative
matrix factorization (NMF). Several aspects of NMF are discussed, namely, the
application in hyperspectral imaging, geometry and uniqueness of NMF solutions,
complexity, algorithms, and its link with extended formulations of polyhedra.
In order to put NMF into perspective, the more general problem class of
constrained low-rank matrix approximation problems is first briefly introduced.

Comment: 18 pages, 4 figures
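As a concrete point of entry, a minimal sketch of one classical NMF algorithm, the Lee-Seung multiplicative updates for the Frobenius objective (standard background for such an overview, not a method specific to this paper):

```python
import numpy as np

def nmf(X, r, iters=200, eps=1e-9):
    """Rank-r NMF of a nonnegative matrix X via the classical
    Lee-Seung multiplicative updates for ||X - W H||_F^2. Both
    factors stay elementwise nonnegative throughout."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

X = np.abs(np.random.default_rng(1).normal(size=(20, 30)))
W, H = nmf(X, r=5)
print(np.linalg.norm(X - W @ H) / np.linalg.norm(X))  # relative approximation error
```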
FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods
We consider first order gradient methods for effectively optimizing a
composite objective in the form of a sum of smooth and, potentially, non-smooth
functions. We present accelerated and adaptive gradient methods, called FLAG
and FLARE, which can offer the best of both worlds. They can achieve the
optimal convergence rate by attaining the optimal first-order oracle complexity
for smooth convex optimization. Additionally, they can adaptively and
non-uniformly re-scale the gradient direction to adapt to the limited curvature
available and conform to the geometry of the domain. We show theoretically and
empirically that, through the compounding effects of acceleration and
adaptivity, FLAG and FLARE can be highly effective for many data fitting and
machine learning applications.
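For intuition about what adaptive, non-uniform rescaling of the gradient direction buys, here is a minimal AdaGrad-style diagonal step; it is standard background, not the FLAG or FLARE update itself.

```python
import numpy as np

def adagrad_step(x, g, acc, lr=0.5, eps=1e-8):
    """One diagonally-adaptive gradient step (AdaGrad-style): each
    coordinate is rescaled by the root of its accumulated squared
    gradients, so steep directions take smaller steps and flat
    directions larger ones."""
    acc = acc + g * g
    return x - lr * g / (np.sqrt(acc) + eps), acc

# Minimize a badly scaled quadratic f(x) = 0.5 * x^T diag(d) x.
d = np.array([100.0, 1.0])
x, acc = np.array([1.0, 1.0]), np.zeros(2)
for _ in range(200):
    x, acc = adagrad_step(x, d * x, acc)
print(x)   # both coordinates shrink toward 0 despite the 100x scale gap
```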
Interactions of Computational Complexity Theory and Mathematics
[This paper is a (self-contained) chapter in a new book, Mathematics and Computation, whose draft is available on my homepage at https://www.math.ias.edu/avi/book].
We survey some concrete interaction areas between computational complexity
theory and different fields of mathematics. We hope to demonstrate here that
hardly any area of modern mathematics is untouched by the computational
connection (which in some cases is completely natural and in others may seem
quite surprising). In my view, the breadth, depth, beauty and novelty of these
connections is inspiring, and speaks to a great potential of future
interactions (which indeed, are quickly expanding). We aim for variety. We give
short, simple descriptions (without proofs or much technical detail) of ideas,
motivations, results and connections; this will hopefully entice the reader to
dig deeper. Each vignette focuses only on a single topic within a large mathematical field. We cover the following:
Number Theory: Primality testing
Combinatorial Geometry: Point-line incidences
Operator Theory: The Kadison-Singer problem
Metric Geometry: Distortion of embeddings
Group Theory: Generation and random generation
Statistical Physics: Monte-Carlo Markov chains
Analysis and Probability: Noise stability
Lattice Theory: Short vectors
Invariant Theory: Actions on matrix tuples

Comment: 27 pages
Thinking Required
There exists a theory of a single general-purpose learning algorithm which could explain the principles of its operation. It assumes an initial rough architecture and a small library of simple innate circuits which are prewired at birth, and proposes that all significant mental algorithms are learned. Given current understanding and observations, this paper reviews and lists the ingredients of such an algorithm from architectural and functional perspectives.

Comment: 18 pages
Optimal Algorithms for Distributed Optimization
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(x) \triangleq \sum_{i=1}^{m} f_i(x)$ is strongly convex and smooth, strongly convex, smooth, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
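A minimal sketch of the distributed setting on a ring of agents (standard decentralized gradient descent with illustrative quadratic local objectives, not the paper's accelerated dual method):

```python
import numpy as np

# A ring of m agents, each holding a local quadratic
# f_i(x) = 0.5 * (x - a_i)^2; the global minimizer of sum_i f_i is
# mean(a). W is a doubly stochastic mixing matrix encoding who talks
# to whom. Constant-step decentralized gradient descent converges to
# a neighborhood of the optimum (bias of order the step size).
m, T, lr = 5, 400, 0.05
rng = np.random.default_rng(0)
a = rng.normal(size=m)

W = np.zeros((m, m))
for i in range(m):
    W[i, i] = 0.5
    W[i, (i - 1) % m] = W[i, (i + 1) % m] = 0.25

x = np.zeros(m)
for _ in range(T):
    x = W @ x - lr * (x - a)   # mix with neighbors, then local gradient step
print(x, a.mean())             # each agent is near mean(a), up to O(lr) bias
```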
Differentiating Through a Cone Program
We consider the problem of efficiently computing the derivative of the
solution map of a convex cone program, when it exists. We do this by implicitly
differentiating the residual map for its homogeneous self-dual embedding, and
solving the linear systems of equations required using an iterative method.
This allows us to efficiently compute the derivative operator, and its adjoint,
evaluated at a vector. These correspond to computing an approximate new
solution, given a perturbation to the cone program coefficients (i.e.,
perturbation analysis), and to computing the gradient of a function of the
solution with respect to the coefficients. Our method scales to large problems,
with numbers of coefficients in the millions. We present an open-source Python
implementation of our method that solves a cone program and returns the
derivative and its adjoint as abstract linear maps; our implementation can be
easily integrated into software systems for automatic differentiation.

Comment: Correct sign error on page
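The core tool here is implicit differentiation of a residual map. The sketch below applies the same implicit-function-theorem recipe to a tiny regularized linear system as a stand-in for the self-dual embedding's residual; the matrices are illustrative, not from the paper.

```python
import numpy as np

# Implicit differentiation of a solution map, in miniature. If a
# residual R(x, t) = 0 defines x(t), the implicit function theorem
# gives dx/dt = -(dR/dx)^{-1} (dR/dt). Here the residual is
# R(x, t) = (A + t*I) x - b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
t = 0.5

x = np.linalg.solve(A + t * np.eye(2), b)   # solution at parameter t
dR_dx = A + t * np.eye(2)                   # Jacobian of R in x
dR_dt = x                                   # d/dt of (A + t I) x - b, x fixed
dx_dt = -np.linalg.solve(dR_dx, dR_dt)      # the implicit derivative

h = 1e-6                                    # finite-difference check
x_h = np.linalg.solve(A + (t + h) * np.eye(2), b)
print(dx_dt, (x_h - x) / h)                 # the two should agree
```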
A Comparative Study on Remote Tracking of Parkinson's Disease Progression Using Data Mining Methods
In recent years, applications of data mining methods have become more popular in many fields of medical diagnosis and evaluation. Data mining methods are appropriate tools for discovering and extracting the knowledge available in medical databases. In this study, we divided 11 data mining algorithms into five groups and applied them to a data set of clinical variables from patients with Parkinson's disease (PD) to study the disease progression. The data set includes 22 properties of 42 people, and all of the algorithms are applied to this data set. The Decision Table, with a correlation coefficient of 0.9985, has the best accuracy, and the Decision Stump, with a correlation coefficient of 0.7919, has the lowest accuracy.

Comment: 13 pages, 4 figures
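A hedged sketch of the evaluation protocol (score each algorithm by the correlation coefficient between cross-validated predictions and the target), on synthetic data; the study's Weka-style algorithms, such as Decision Table and Decision Stump, are only loosely approximated by the decision trees used here.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in: 42 subjects with 22 properties, as in the study.
rng = np.random.default_rng(0)
X = rng.normal(size=(42, 22))
y = X @ rng.normal(size=22) + 0.1 * rng.normal(size=42)

# Score each model by the correlation coefficient between
# cross-validated predictions and the target, as the study does.
for depth in (1, 5):   # a depth-1 tree is roughly a decision stump
    model = DecisionTreeRegressor(max_depth=depth, random_state=0)
    pred = cross_val_predict(model, X, y, cv=5)
    r = np.corrcoef(y, pred)[0, 1]
    print(f"max_depth={depth}: correlation coefficient = {r:.4f}")
```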