Generalization Bounds for Stochastic Gradient Descent via Localized -Covers
In this paper, we propose a new covering technique localized to the trajectories of SGD. This localization provides an algorithm-specific complexity measured by the covering number, which can have dimension-independent cardinality, in contrast to standard uniform covering arguments, which result in an exponential dependence on dimension. Based on this localized construction, we show that if the objective function is a finite perturbation of a piecewise strongly convex and smooth function with pieces, i.e., non-convex and non-smooth in general, the generalization error can be upper bounded by , where is the number of data samples. In particular, this rate is dimension-independent and requires neither early stopping nor a decaying step size. Finally, we apply these results in various contexts and derive generalization bounds for multi-index linear models, multi-class support vector machines, and k-means clustering under both hard- and soft-label setups, improving on the known state-of-the-art rates.
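For contrast with the localized construction, the standard uniform covering argument the abstract refers to has the following schematic form (constants omitted; this is the classical bound, not the paper's result): with probability at least 1 - δ over n samples, for a function class F with ε-covering number N(ε, F, ‖·‖),

```latex
\[
  \sup_{f \in \mathcal{F}} \bigl| R(f) - \widehat{R}_n(f) \bigr|
  \;\lesssim\; \varepsilon
  \;+\; \sqrt{\frac{\log N(\varepsilon, \mathcal{F}, \|\cdot\|)
                    + \log(1/\delta)}{n}} .
\]
```

For a d-dimensional parameter class, log N(ε, F, ‖·‖) typically scales like d log(1/ε), i.e., the cover has cardinality on the order of (1/ε)^d; this exponential-in-dimension cardinality is what a cover localized to the SGD trajectory can avoid.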
New Algorithms in Computational Microscopy
Microscopy provides tools to observe objects and their surroundings at resolutions ranging from molecular machinery (angstroms) to individual cells (micrometers). Under a microscope, illumination, such as visible light or an electron beam, interacts with the sample and is scattered onto a plane, where it is recorded. Computational microscopy refers to reconstructing images from these measurements and to improving image quality. As microscopy evolves, new studies emerge and algorithms must be developed not only to provide high-resolution imaging but also to enable new and advanced research. In this dissertation, we focus on algorithm development for inverse problems in microscopy, specifically phase retrieval and tomography, and on applications of these techniques to machine learning. The four studies in this dissertation demonstrate the use of optimization and the calculus of variations in imaging science and other disciplines.

Study 1 focuses on coherent diffractive imaging (CDI), or phase retrieval: a non-linear inverse problem that aims to recover a 2D image from the modulus of its Fourier transform, using the extra information provided by oversampling as a second constraint. To solve this two-constraint minimization, we start from a Hamilton-Jacobi partial differential equation (HJ-PDE) and its Hopf-Lax formula. Introducing a generalized Bregman distance into the HJ-PDE and applying the Legendre transform, we derive our generalized proximal smoothing (GPS) algorithm in the form of a primal-dual hybrid gradient (PDHG) method.
While the reflection operator, acting as an extrapolating momentum, helps overcome local minima, the smoothing induced by the generalized Bregman distance is tuned to improve the convergence and consistency of phase retrieval.

Study 2 focuses on electron tomography: 3D image reconstruction from a set of 2D projections obtained with a transmission electron microscope (TEM) or X-ray microscope. Current tomography algorithms are limited to a single tilt axis and fail when data are fully or partially missing. Using the calculus of variations and the Fourier slice theorem (FST), we develop a highly accurate iterative tomography algorithm that provides higher-resolution imaging, handles missing data, and supports multiple-tilt-axis tomography. The algorithm is further extended to non-isolated objects and partially blocked projections, which have become common in experiments. The success of the real space iterative reconstruction engine (RESIRE) opens a new era for tomography in materials science and magnetic structures (vector tomography).

Studies 3 and 4 apply our algorithms to machine learning. Study 3 develops a stochastic backward Euler method for k-means clustering, a well-known non-convex optimization problem. The algorithm has been shown to find better minima with greater consistency, providing a powerful new tool for classification. Study 4 is a direct application of GPS to gradient descent algorithms in deep learning. Linearizing the Hopf-Lax formula derived in GPS, we obtain our method, Laplacian smoothing gradient descent (LSGD), known simply as gradient smoothing. Our experiments show that LSGD can find better and flatter minima, reduce variance, and achieve higher accuracy and consistency.
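The Fourier slice theorem underlying Study 2 can be checked numerically in the axis-aligned case: the 1D FFT of a projection (line integrals along one axis) equals the central slice of the object's 2D FFT. A minimal NumPy sketch, with a toy random "object":

```python
import numpy as np

# Fourier slice theorem, axis-aligned case: projecting a 2D object along
# axis 0 and taking the 1D FFT of that projection reproduces the k_y = 0
# row of the object's 2D FFT.
rng = np.random.default_rng(1)
obj = rng.random((64, 64))            # toy 2D object

projection = obj.sum(axis=0)          # line integrals along axis 0
slice_1d = np.fft.fft(projection)     # 1D FFT of the projection
central_row = np.fft.fft2(obj)[0, :]  # central (k_y = 0) slice of the 2D FFT

print(np.allclose(slice_1d, central_row))  # → True
```

For a general tilt angle the slice lies along a rotated line through the origin of Fourier space and must be interpolated onto the FFT grid; avoiding that interpolation error is one motivation for real-space iterative schemes such as RESIRE.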
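The gradient-smoothing step of LSGD, as commonly described, replaces the stochastic gradient g by the solution of (I - σL) gₛ = g, where L is the 1D periodic discrete Laplacian on the flattened gradient; because this operator is circulant, the solve costs O(n log n) via the FFT. A minimal sketch under those assumptions (the function name and toy data are illustrative):

```python
import numpy as np

def laplacian_smooth(grad, sigma=1.0):
    """Solve (I - sigma * L) g_s = g for a flattened gradient g, where L is
    the 1D periodic discrete Laplacian (stencil [1, -2, 1]). Since the
    operator is circulant, it diagonalizes under the DFT."""
    n = grad.size
    # DFT eigenvalues of (I - sigma * L); all are >= 1, so smoothing shrinks
    # every non-constant Fourier mode and leaves the mean (DC mode) intact
    eig = 1.0 + 2.0 * sigma - 2.0 * sigma * np.cos(2.0 * np.pi * np.arange(n) / n)
    return np.real(np.fft.ifft(np.fft.fft(grad) / eig))

# toy usage: smooth a noisy gradient vector before taking a descent step
rng = np.random.default_rng(0)
g = rng.standard_normal(256)
g_s = laplacian_smooth(g, sigma=2.0)
```

Because every eigenvalue is at least 1, the smoothed gradient has the same mean but strictly smaller variance than the raw gradient, which is the variance-reduction effect the abstract reports.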
Symmetry in Applied Mathematics
Applied mathematics and symmetry work together as a powerful tool for problem reduction and solving. We present applications in probability theory and statistics (A Test Detecting the Outliers for Continuous Distributions Based on the Cumulative Distribution Function of the Data Being Tested; The Asymmetric Alpha-Power Skew-t Distribution); in fractals, geometry, and the like (Khovanov Homology of Three-Strand Braid Links; Volume Preserving Maps Between p-Balls; Generation of Julia and Mandelbrot Sets via Fixed Points); in supersymmetry (physics), nanostructures (chemistry), taxonomy (biology), and related fields (A Continuous Coordinate System for the Plane by Triangular Symmetry; One-Dimensional Optimal System for 2D Rotating Ideal Gas; Minimal Energy Configurations of Finite Molecular Arrays; Noether-Like Operators and First Integrals for Generalized Systems of Lane-Emden Equations); in algorithms, programs, and software analysis (Algorithm for Neutrosophic Soft Sets in Stochastic Multi-Criteria Group Decision Making Based on Prospect Theory; On a Reduced Cost Higher Order Traub-Steffensen-Like Method for Nonlinear Systems; On a Class of Optimal Fourth Order Multiple Root Solvers without Using Derivatives); and in specific applied subjects (Facility Location Problem Approach for Distributed Drones; Parametric Jensen-Shannon Statistical Complexity and Its Applications on Full-Scale Compartment Fire Data). Diverse topics are thus combined to map out the mathematical core of practical problems.