
    Mad Adam?

    Get PDF
    When Adam first met Eve in Eden, he politely introduced himself: Madam. I'm Adam. (Eve should have been named Iris so that she could have replied: Sir, I'm Iris.) Had Adam been more loquacious, he could have used any of the following palindromic introductions -- although some might have led Eve to doubt his sexual orientation and his sanity.

    Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case

    Full text link
    Adam is a commonly used stochastic optimization algorithm in machine learning. However, its convergence is still not fully understood, especially in the non-convex setting. This paper focuses on exploring hyperparameter settings for the convergence of vanilla Adam and tackling the challenges of non-ergodic convergence related to practical application. The primary contributions are summarized as follows: firstly, we introduce precise definitions of ergodic and non-ergodic convergence, which cover nearly all forms of convergence for stochastic optimization algorithms. Meanwhile, we emphasize the superiority of non-ergodic convergence over ergodic convergence. Secondly, we establish a weaker sufficient condition for the ergodic convergence guarantee of Adam, allowing a more relaxed choice of hyperparameters. On this basis, we achieve the almost sure ergodic convergence rate of Adam, which is arbitrarily close to $o(1/\sqrt{K})$. More importantly, we prove, for the first time, that the last iterate of Adam converges to a stationary point for non-convex objectives. Finally, we obtain the non-ergodic convergence rate of $O(1/K)$ for function values under the Polyak-Łojasiewicz (PL) condition. These findings build a solid theoretical foundation for Adam to solve non-convex stochastic optimization problems.
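
    For reference, a minimal sketch of the vanilla Adam update analysed above. The hyperparameter values shown are the common defaults rather than the relaxed conditions derived in the paper, and the toy objective is an assumption chosen only for illustration.

    import numpy as np

    def adam_step(x, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        # Update the biased first/second moment estimates, correct the bias, and step.
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
        return x, m, v

    # Toy stochastic quadratic (assumption): the last iterate printed below is the
    # non-ergodic quantity whose convergence the paper studies.
    rng = np.random.default_rng(0)
    x, m, v = 5.0, 0.0, 0.0
    for t in range(1, 1001):
        g = x - rng.normal(0.0, 0.1)   # stochastic gradient of 0.5 * (x - noise)^2
        x, m, v = adam_step(x, g, m, v, t)
    print("last iterate:", x)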

    Characterization of the glass transition in vitreous silica by temperature scanning small-angle X-ray scattering

    Full text link
    The temperature dependence of the X-ray scattering in the region below the first sharp diffraction peak was measured for silica glasses with low and high OH content (GE-124 and Corning 7980). Data were obtained upon scanning the temperature at 10, 40 and 80 K/min between 400 K and 1820 K. The measurements resolve, for the first time, the hysteresis between heating and cooling through the glass transition for silica glass, and the data have a better signal-to-noise ratio than previous light scattering and differential thermal analysis data. For the glass with the higher hydroxyl concentration the glass transition is broader and occurs at a lower temperature. Fits of the data to the Adam-Gibbs-Fulcher equation provide updated kinetic parameters for this very strong glass. The temperature derivative of the observed X-ray scattering matches that of light scattering to within 14%. (Comment: Europhysics Letters, in press.)
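
    For context, a brief sketch of the Adam-Gibbs relation that underlies Adam-Gibbs-Fulcher fits. The hyperbolic form assumed below for the configurational entropy is an assumption of this sketch, not necessarily the parameterisation used by the authors.

    \documentclass{article}
    \usepackage{amsmath}
    \begin{document}
    % Adam--Gibbs: structural relaxation time set by the configurational entropy S_c(T).
    \begin{align*}
      \tau(T) &= \tau_0 \exp\!\left(\frac{B}{T\,S_c(T)}\right), \\
    % Assuming the hyperbolic form S_c(T) = S_\infty (1 - T_2/T), this reduces to a
    % Fulcher-type (VFT-like) expression with B' = B / S_\infty:
      \tau(T) &= \tau_0 \exp\!\left(\frac{B'}{T - T_2}\right).
    \end{align*}
    \end{document}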

    Adam through a Second-Order Lens

    Full text link
    Research into optimisation for deep learning is characterised by a tension between the computational efficiency of first-order, gradient-based methods (such as SGD and Adam) and the theoretical efficiency of second-order, curvature-based methods (such as quasi-Newton methods and K-FAC). We seek to combine the benefits of both approaches into a single computationally efficient algorithm. Noting that second-order methods often depend on stabilising heuristics (such as Levenberg-Marquardt damping), we propose AdamQLR: an optimiser combining damping and learning rate selection techniques from K-FAC (Martens and Grosse, 2015) with the update directions proposed by Adam, inspired by considering Adam through a second-order lens. We evaluate AdamQLR on a range of regression and classification tasks at various scales, achieving competitive generalisation performance vs. runtime. (Comment: 28 pages, 15 figures, 4 tables. Submitted to ICLR 202)
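
    The following is an illustrative sketch only, not the authors' AdamQLR: it shows the general idea of taking Adam's update direction and choosing the step size from a damped local quadratic model, with a crude Levenberg-Marquardt-style damping schedule. The toy objective, the exact Hessian-vector product used for curvature, and the damping update rule are all assumptions made for this example.

    import numpy as np

    def adam_direction(g, m, v, t, beta1=0.9, beta2=0.999, eps=1e-8):
        # One Adam moment update; returns the (unscaled) Adam descent direction.
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        return -m_hat / (np.sqrt(v_hat) + eps), m, v

    def step_size_from_quadratic(g, d, hvp, damping):
        # Minimise the damped local quadratic model
        #   q(a) = a * g.d + 0.5 * a^2 * d.(H + damping * I).d   over the scalar a.
        curvature = d @ hvp(d) + damping * (d @ d)
        return -(g @ d) / max(curvature, 1e-12)

    # Toy ill-conditioned quadratic objective f(x) = 0.5 * x.A.x (assumption).
    A = np.diag([100.0, 1.0])
    grad = lambda x: A @ x
    hvp = lambda u: A @ u          # exact Hessian-vector product for this toy problem

    x = np.array([1.0, 1.0])
    m = v = np.zeros_like(x)
    damping = 1.0
    for t in range(1, 51):
        g = grad(x)
        d, m, v = adam_direction(g, m, v, t)
        alpha = step_size_from_quadratic(g, d, hvp, damping)
        x = x + alpha * d
        damping = max(0.9 * damping, 1e-4)   # crude LM-style decay (illustrative only)
    print("final x:", x, " f(x):", 0.5 * x @ A @ x)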

    Soziale Gerechtigkeit und die verschiedenen Varianten des Kapitalismus [Social Justice and the Different Varieties of Capitalism]

    Get PDF
    This is a contribution to a collection of classical texts on capitalism, from Adam Smith, G.W.F. Hegel, K. Marx, E. Durkheim, J.S. Mill, A. Sen, and A. Hirschman.

    Mainstream economics and the Austrian school: toward reunification

    Get PDF
    In this paper, I compare the methodology of the Austrian school to two alternative methodologies from the economic mainstream: the ‘orthodox’ and revealed preference methodologies. I argue that Austrian school theorists should stop describing themselves as ‘extreme apriorists’ (or writing suggestively to that effect), and should start giving greater acknowledgement to the importance of empirical work within their research program. The motivation for this dialectical shift is threefold: the approach is more faithful to their actual practices, it better illustrates the underlying similarities between the mainstream and Austrian research paradigms, and it provides a philosophical foundation that is much more plausible in itself.