22 research outputs found

    Momentum-Based Variance Reduction in Non-Convex SGD

    Variance reduction has emerged in recent years as a strong competitor to stochastic gradient descent in non-convex problems, providing the first algorithms to improve upon the convergence rate of stochastic gradient descent for finding first-order critical points. However, variance reduction techniques typically require carefully tuned learning rates and a willingness to use excessively large "mega-batches" in order to achieve their improved results. We present a new algorithm, STORM, that does not require any batches and makes use of adaptive learning rates, enabling simpler implementation and less hyperparameter tuning. Our technique for removing the batches uses a variant of momentum to achieve variance reduction in non-convex optimization. On smooth losses $F$, STORM finds a point $\boldsymbol{x}$ with $\mathbb{E}[\|\nabla F(\boldsymbol{x})\|]\le O(1/\sqrt{T}+\sigma^{1/3}/T^{1/3})$ in $T$ iterations with $\sigma^2$ variance in the gradients, matching the optimal rate but without requiring knowledge of $\sigma$.
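    As a concrete illustration of the batch-free update the abstract describes, below is a minimal Python sketch of STORM's momentum-based variance-reduction step. It assumes a user-supplied oracle stoch_grad(x, xi) returning a stochastic gradient at x under noise sample xi (the same sample must be evaluated at two consecutive iterates); the adaptive schedules follow the form in the paper, but the constants k, w, c are illustrative defaults, not tuned values.

```python
import numpy as np

def storm(stoch_grad, x0, T, k=0.1, w=0.1, c=1.0, rng=None):
    """Sketch of STORM. stoch_grad(x, xi) is an assumed oracle returning a
    stochastic gradient of F at x under noise sample xi."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    xi = rng.integers(1 << 31)             # noise sample for the first gradient
    d = stoch_grad(x, xi)                  # variance-reduced direction d_1
    G2 = float(d @ d)                      # running sum of squared gradient norms
    for _ in range(T - 1):
        eta = k / (w + G2) ** (1.0 / 3.0)  # adaptive step size eta_t
        a = min(c * eta ** 2, 1.0)         # momentum parameter a_{t+1}
        x_new = x - eta * d                # descend along d_t
        xi = rng.integers(1 << 31)         # fresh sample xi_{t+1}
        g_new = stoch_grad(x_new, xi)
        g_old = stoch_grad(x, xi)          # SAME sample at the previous iterate
        d = g_new + (1.0 - a) * (d - g_old)  # momentum with variance correction
        G2 += float(g_new @ g_new)
        x = x_new
    return x
```

    The key departure from ordinary momentum is the correction term $d_t - \nabla f(\boldsymbol{x}_t, \xi_{t+1})$, which cancels accumulated variance without requiring any mega-batch.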

    Design and Calibration of Pinch Force Measurement Using Strain Gauge for Post-Stroke Patients

    Two-finger strength is an indicative measure of pinch impairment. Conventionally, the Fugl-Meyer Upper Extremity Assessment (FMA-UE) is the primary standard for measuring pinch strength in post-stroke survivors. In the literature, this evaluation, performed by a therapist, is subjective and exposed to inter-rater and intra-rater reliability issues. Recently, force-sensing resistors have been used to measure two-finger force, but these sensors suffer from nonlinearity, high hysteresis, and voltage drift. This paper presents a design for pinch force measurement based on a strain gauge. The pinch sensor was calibrated over a range of 0 N to 50 N at a pinching length of 20 mm, with a linearity error of 0.0123% and a hysteresis of 0.513% during the loading and unloading process. The voltage drift averaged 0.24% over 20 minutes. The pinch force measurement system provides objective pinch force measurements for evaluating the rehabilitation progress of post-stroke patients.
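    For context on how the reported percentages might be computed, here is a hedged Python sketch using synthetic loading/unloading data in place of real strain-gauge readings. It applies the standard definitions of linearity error (maximum deviation from a least-squares calibration line) and hysteresis (maximum loading/unloading gap), each as a percentage of the 50 N full scale; this is a generic illustration, not the paper's exact procedure.

```python
import numpy as np

# Synthetic calibration sweep standing in for real strain-gauge data:
# reference force (N) versus bridge output voltage (V), loading and unloading.
rng = np.random.default_rng(1)
force = np.linspace(0.0, 50.0, 11)                    # 0 N to 50 N steps
v_load = 0.04 * force + 0.0005 * rng.normal(size=force.size)
v_unload = v_load + 0.001                             # offset mimicking hysteresis

# Least-squares straight-line calibration: force ~ a * voltage + b.
a, b = np.polyfit(v_load, force, 1)
fitted = a * v_load + b

full_scale = force.max() - force.min()                # 50 N span
linearity_error = np.max(np.abs(fitted - force)) / full_scale * 100.0
hysteresis = np.max(np.abs(a * (v_unload - v_load))) / full_scale * 100.0

print(f"linearity error: {linearity_error:.4f}% FS")
print(f"hysteresis:      {hysteresis:.3f}% FS")
```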

    On the Last Iterate Convergence of Momentum Methods

    SGD with Momentum (SGDM) is widely used for large-scale optimization of machine learning problems. Yet, the theoretical understanding of this algorithm is not complete. In fact, even the most recent results require changes to the algorithm, like an averaging scheme and a projection onto a bounded domain, which are never used in practice. Also, no lower bound is known for SGDM. In this paper, we prove for the first time that for any constant momentum factor there exists a Lipschitz and convex function for which the last iterate of SGDM suffers an error of $\Omega(\frac{\log T}{\sqrt{T}})$ after $T$ steps. Based on this fact, we study a new class of (both adaptive and non-adaptive) Follow-The-Regularized-Leader-based SGDM algorithms with \emph{increasing momentum} and \emph{shrinking updates}. For these algorithms, we show that the last iterate has optimal convergence $O(\frac{1}{\sqrt{T}})$ for unconstrained convex optimization problems. Further, we show that in the interpolation setting with convex and smooth functions, our new SGDM algorithm automatically converges at a rate of $O(\frac{\log T}{T})$. Empirical results are shown as well.
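    To make the "increasing momentum and shrinking updates" idea concrete, here is an illustrative Python sketch of an SGDM loop whose momentum factor grows toward one while the step size shrinks. The schedules beta_t = t/(t+1) and eta_t = eta0/sqrt(t) are placeholder choices for exposition, not the FTRL-based construction that carries the paper's guarantees.

```python
import numpy as np

def sgdm_last_iterate(grad, x0, T, eta0=0.1, rng=None):
    """Illustrative SGDM with increasing momentum and shrinking updates.
    grad(x, rng) is an assumed stochastic-gradient oracle; the schedules
    below are placeholders, not the paper's FTRL-based construction."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    for t in range(1, T + 1):
        g = grad(x, rng)                 # stochastic gradient at x_t
        beta = t / (t + 1.0)             # momentum factor increases toward 1
        m = beta * m + (1.0 - beta) * g  # averaged momentum buffer
        x = x - (eta0 / np.sqrt(t)) * m  # shrinking update
    return x                             # last iterate x_T, which the bounds concern
```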