
    Generalization Bounds for Stochastic Gradient Descent via Localized ε-Covers

    In this paper, we propose a new covering technique localized for the trajectories of SGD. This localization provides an algorithm-specific complexity measured by the covering number, which can have dimension-independent cardinality, in contrast to standard uniform covering arguments that result in exponential dimension dependency. Based on this localized construction, we show that if the objective function is a finite perturbation of a piecewise strongly convex and smooth function with $P$ pieces, i.e., non-convex and non-smooth in general, the generalization error can be upper bounded by $O(\sqrt{(\log n \log(nP))/n})$, where $n$ is the number of data samples. In particular, this rate is independent of dimension and does not require early stopping or a decaying step size. Finally, we employ these results in various contexts and derive generalization bounds for multi-index linear models, multi-class support vector machines, and $K$-means clustering for both hard and soft label setups, improving the known state-of-the-art rates.

    Symmetry in Applied Mathematics

    Applied mathematics and symmetry work together as a powerful tool for problem reduction and solving. We present applications in probability theory and statistics (A Test Detecting the Outliers for Continuous Distributions Based on the Cumulative Distribution Function of the Data Being Tested, The Asymmetric Alpha-Power Skew-t Distribution), fractals, geometry and the like (Khovanov Homology of Three-Strand Braid Links, Volume Preserving Maps Between p-Balls, Generation of Julia and Mandelbrot Sets via Fixed Points), supersymmetry in physics, nanostructures in chemistry, taxonomy in biology and the like (A Continuous Coordinate System for the Plane by Triangular Symmetry, One-Dimensional Optimal System for 2D Rotating Ideal Gas, Minimal Energy Configurations of Finite Molecular Arrays, Noether-Like Operators and First Integrals for Generalized Systems of Lane-Emden Equations), algorithms, programs and software analysis (Algorithm for Neutrosophic Soft Sets in Stochastic Multi-Criteria Group Decision Making Based on Prospect Theory, On a Reduced Cost Higher Order Traub-Steffensen-Like Method for Nonlinear Systems, On a Class of Optimal Fourth Order Multiple Root Solvers without Using Derivatives), and specific subjects (Facility Location Problem Approach for Distributed Drones, Parametric Jensen-Shannon Statistical Complexity and Its Applications on Full-Scale Compartment Fire Data). Diverse topics are thus combined to map out the mathematical core of practical problems.