12,380 research outputs found

    Escaping the Local Minima via Simulated Annealing: Optimization of Approximately Convex Functions

    Full text link
    We consider the problem of optimizing an approximately convex function over a bounded convex set in Rn\mathbb{R}^n using only function evaluations. The problem is reduced to sampling from an \emph{approximately} log-concave distribution using the Hit-and-Run method, which is shown to have the same O\mathcal{O}^* complexity as sampling from log-concave distributions. In addition to extend the analysis for log-concave distributions to approximate log-concave distributions, the implementation of the 1-dimensional sampler of the Hit-and-Run walk requires new methods and analysis. The algorithm then is based on simulated annealing which does not relies on first order conditions which makes it essentially immune to local minima. We then apply the method to different motivating problems. In the context of zeroth order stochastic convex optimization, the proposed method produces an ϵ\epsilon-minimizer after O(n7.5ϵ2)\mathcal{O}^*(n^{7.5}\epsilon^{-2}) noisy function evaluations by inducing a O(ϵ/n)\mathcal{O}(\epsilon/n)-approximately log concave distribution. We also consider in detail the case when the "amount of non-convexity" decays towards the optimum of the function. Other applications of the method discussed in this work include private computation of empirical risk minimizers, two-stage stochastic programming, and approximate dynamic programming for online learning.Comment: 27 page

    Second-Order Kernel Online Convex Optimization with Adaptive Sketching

    Get PDF
    Kernel online convex optimization (KOCO) is a framework combining the expressiveness of non-parametric kernel models with the regret guarantees of online learning. First-order KOCO methods such as functional gradient descent require only O(t)\mathcal{O}(t) time and space per iteration, and, when the only information on the losses is their convexity, achieve a minimax optimal O(T)\mathcal{O}(\sqrt{T}) regret. Nonetheless, many common losses in kernel problems, such as squared loss, logistic loss, and squared hinge loss posses stronger curvature that can be exploited. In this case, second-order KOCO methods achieve O(log(Det(K)))\mathcal{O}(\log(\text{Det}(\boldsymbol{K}))) regret, which we show scales as O(defflogT)\mathcal{O}(d_{\text{eff}}\log T), where deffd_{\text{eff}} is the effective dimension of the problem and is usually much smaller than O(T)\mathcal{O}(\sqrt{T}). The main drawback of second-order methods is their much higher O(t2)\mathcal{O}(t^2) space and time complexity. In this paper, we introduce kernel online Newton step (KONS), a new second-order KOCO method that also achieves O(defflogT)\mathcal{O}(d_{\text{eff}}\log T) regret. To address the computational complexity of second-order methods, we introduce a new matrix sketching algorithm for the kernel matrix Kt\boldsymbol{K}_t, and show that for a chosen parameter γ1\gamma \leq 1 our Sketched-KONS reduces the space and time complexity by a factor of γ2\gamma^2 to O(t2γ2)\mathcal{O}(t^2\gamma^2) space and time per iteration, while incurring only 1/γ1/\gamma times more regret

    Parameter estimation in softmax decision-making models with linear objective functions

    Full text link
    With an eye towards human-centered automation, we contribute to the development of a systematic means to infer features of human decision-making from behavioral data. Motivated by the common use of softmax selection in models of human decision-making, we study the maximum likelihood parameter estimation problem for softmax decision-making models with linear objective functions. We present conditions under which the likelihood function is convex. These allow us to provide sufficient conditions for convergence of the resulting maximum likelihood estimator and to construct its asymptotic distribution. In the case of models with nonlinear objective functions, we show how the estimator can be applied by linearizing about a nominal parameter value. We apply the estimator to fit the stochastic UCL (Upper Credible Limit) model of human decision-making to human subject data. We show statistically significant differences in behavior across related, but distinct, tasks.Comment: In pres

    Implicit Langevin Algorithms for Sampling From Log-concave Densities

    Full text link
    For sampling from a log-concave density, we study implicit integrators resulting from θ\theta-method discretization of the overdamped Langevin diffusion stochastic differential equation. Theoretical and algorithmic properties of the resulting sampling methods for θ[0,1] \theta \in [0,1] and a range of step sizes are established. Our results generalize and extend prior works in several directions. In particular, for θ1/2\theta\ge1/2, we prove geometric ergodicity and stability of the resulting methods for all step sizes. We show that obtaining subsequent samples amounts to solving a strongly-convex optimization problem, which is readily achievable using one of numerous existing methods. Numerical examples supporting our theoretical analysis are also presented
    corecore