Hybrid Random/Deterministic Parallel Algorithms for Nonconvex Big Data Optimization
We propose a decomposition framework for the parallel optimization of the sum
of a differentiable (possibly nonconvex) function and a nonsmooth (possibly
nonseparable), convex one. The latter term is usually employed to enforce
structure in the solution, typically sparsity. The main contribution of this
work is a novel parallel, hybrid random/deterministic decomposition
scheme wherein, at each iteration, a subset of (block) variables is updated at
the same time by minimizing local convex approximations of the original
nonconvex function. To tackle huge-scale problems, the (block) variables
to be updated are chosen according to a mixed random and deterministic
procedure, which captures the advantages of both pure deterministic and random
update-based schemes. Almost sure convergence of the proposed scheme is
established. Numerical results show that on huge-scale problems the proposed
hybrid random/deterministic algorithm outperforms both random and deterministic
schemes. Comment: The order of the authors is alphabetical
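The mixed selection described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a convex lasso objective instead of a nonconvex one, uses a plain proximal-gradient step as the local convex approximation, and reads "mixed random and deterministic" as "sample a random subset, then greedily keep the coordinates with the largest proposed step". The function names and the parameters sample_frac and rho are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * |.|, applied elementwise.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, b, x, lam):
    r = A @ x - b
    return 0.5 * r @ r + lam * np.abs(x).sum()

def hybrid_block_descent(A, b, lam, n_iters=200, sample_frac=0.5, rho=0.5, seed=0):
    """Illustrative hybrid random/greedy coordinate selection (assumed scheme)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    sample_size = max(1, int(sample_frac * n))
    x = np.zeros(n)
    history = [lasso_objective(A, b, x, lam)]
    for _ in range(n_iters):
        # Random phase: draw a subset of coordinates without replacement.
        sampled = rng.choice(n, size=sample_size, replace=False)
        residual = A @ x - b
        grad_s = A[:, sampled].T @ residual
        # Proposed proximal-gradient update for every sampled coordinate.
        candidate = soft_threshold(x[sampled] - grad_s / L, lam / L)
        step = np.abs(candidate - x[sampled])
        # Deterministic phase: keep only coordinates whose proposed step is
        # at least rho times the largest step in the sample.
        keep = step >= rho * step.max()
        x[sampled[keep]] = candidate[keep]   # simultaneous (Jacobi-style) update
        history.append(lasso_objective(A, b, x, lam))
    return x, history
```

With the step size 1/L used here, every update minimizes a quadratic upper bound restricted to the selected coordinates, so the objective recorded in history never increases. A real huge-scale implementation would avoid recomputing the full residual at every iteration.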
Coordinate Descent with Bandit Sampling
Coordinate descent methods usually minimize a cost function by updating a
random decision variable (corresponding to one coordinate) at a time. Ideally,
we would update the decision variable that yields the largest decrease in the
cost function. However, finding this coordinate would require checking all of
them, which would effectively negate the improvement in computational
tractability that coordinate descent is intended to afford. To address this, we
propose a new adaptive method for selecting a coordinate. First, we find a
lower bound on the amount the cost function decreases when a coordinate is
updated. We then use a multi-armed bandit algorithm to learn which coordinates
yield the largest lower bound, interleaving this learning with
conventional coordinate descent updates in which the coordinate is selected
in proportion to the expected decrease. We show that our approach improves
the convergence of coordinate descent methods both theoretically and
experimentally. Comment: appearing at NeurIPS 201
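A minimal sketch of the idea, under stated assumptions: the objective is an unregularized least-squares cost, the reward of a coordinate is the decrease it produced the last time it was updated (for a quadratic this equals the standard bound g_j^2 / (2 L_j)), and the selection rule is a simple epsilon-greedy bandit in place of the proportional sampling the abstract describes. The function name and the eps parameter are hypothetical.

```python
import numpy as np

def bandit_coordinate_descent(A, b, n_iters=500, eps=0.2, seed=0):
    """Epsilon-greedy coordinate selection for 0.5 * ||Ax - b||^2 (assumed setup)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    col_sq = (A ** 2).sum(axis=0)      # coordinate-wise curvature; columns assumed nonzero
    x = np.zeros(n)
    residual = A @ x - b
    est = np.full(n, np.inf)           # optimistic start so every coordinate is tried once
    history = [0.5 * residual @ residual]
    for _ in range(n_iters):
        if rng.random() < eps:
            j = int(rng.integers(n))   # explore: uniformly random coordinate
        else:
            j = int(np.argmax(est))    # exploit: largest estimated decrease
        g = A[:, j] @ residual         # partial derivative along coordinate j
        decrease = g * g / (2.0 * col_sq[j])
        delta = -g / col_sq[j]         # exact minimization along coordinate j
        x[j] += delta
        residual += delta * A[:, j]
        est[j] = decrease              # reward observed for the pulled arm
        history.append(0.5 * residual @ residual)
    return x, history
```

Only the selected coordinate's partial derivative is computed at each step, which is the point of learning the rewards instead of scanning all coordinates. The stored estimates become stale as other coordinates move, which is why some exploration is kept.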
Beyond binomial and negative binomial: adaptation in Bernoulli parameter estimation
Estimating the parameter of a Bernoulli process arises in many applications, including photon-efficient active imaging where each illumination period is regarded as a single Bernoulli trial. Motivated by acquisition efficiency when multiple Bernoulli processes (e.g., multiple pixels) are of interest, we formulate the allocation of trials under a constraint on the mean as an optimal resource allocation problem. An oracle-aided trial allocation demonstrates that there can be a significant advantage from varying the allocation for different processes and inspires the introduction of a simple trial allocation gain quantity. Motivated by achieving this gain without an oracle, we present a trellis-based framework for representing and optimizing stopping rules. Considering the convenient case of Beta priors, three implementable stopping rules with similar performances are explored, and the simplest of these is shown to asymptotically achieve the oracle-aided trial allocation. These approaches are further extended to estimating functions of a Bernoulli parameter. In simulations inspired by realistic active imaging scenarios, we demonstrate significant mean-squared error improvements up to 4.36 dB for the estimation of p and up to 1.86 dB for the estimation of log p. https://arxiv.org/abs/1809.08801 First author draft
Beyond Binomial and Negative Binomial: Adaptation in Bernoulli Parameter Estimation
Estimating the parameter of a Bernoulli process arises in many applications,
including photon-efficient active imaging where each illumination period is
regarded as a single Bernoulli trial. Motivated by acquisition efficiency when
multiple Bernoulli processes are of interest, we formulate the allocation of
trials under a constraint on the mean as an optimal resource allocation
problem. An oracle-aided trial allocation demonstrates that there can be a
significant advantage from varying the allocation for different processes and
inspires a simple trial allocation gain quantity. Motivated by realizing this
gain without an oracle, we present a trellis-based framework for representing
and optimizing stopping rules. Considering the convenient case of Beta priors,
three implementable stopping rules with similar performances are explored, and
the simplest of these is shown to asymptotically achieve the oracle-aided trial
allocation. These approaches are further extended to estimating functions of a
Bernoulli parameter. In simulations inspired by realistic active imaging
scenarios, we demonstrate significant mean-squared error improvements: up to
4.36 dB for the estimation of p and up to 1.80 dB for the estimation of log p. Comment: 13 pages, 16 figures
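To make the notion of a data-dependent stopping rule concrete, here is a hedged sketch. It is not one of the three rules studied in the paper: it assumes a Beta(a, b) prior and stops as soon as the posterior variance falls below a threshold, and it does not model the constraint on the mean number of trials. The function name and the var_target parameter are hypothetical.

```python
import numpy as np

def estimate_with_stopping(p_true, var_target=1e-3, max_trials=1000,
                           a=1.0, b=1.0, seed=0):
    """Run Bernoulli trials until the Beta posterior variance is small enough."""
    rng = np.random.default_rng(seed)
    alpha, beta = a, b                 # Beta posterior parameters
    trials = 0
    while trials < max_trials:
        total = alpha + beta
        post_var = alpha * beta / (total ** 2 * (total + 1.0))
        if post_var <= var_target:     # assumed stopping criterion
            break
        if rng.random() < p_true:      # one simulated Bernoulli trial
            alpha += 1.0
        else:
            beta += 1.0
        trials += 1
    return alpha / (alpha + beta), trials   # posterior mean, trials used
```

Under this rule a process with p near 0 or 1 stops after far fewer trials than one with p near 0.5, which illustrates why allocating trials unevenly across processes (e.g., pixels) can pay off.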