9,540 research outputs found
Symmetries of Riemann surfaces and magnetic monopoles
This thesis studies, broadly, the role of symmetry in elucidating structure. In particular, I investigate the role that automorphisms of algebraic curves play in three specific contexts: determining the orbits of theta characteristics, influencing the geometry of the highly-symmetric Bring’s curve, and constructing magnetic monopole solutions. On theta characteristics, I show how to turn questions on the existence of invariant characteristics into questions of group cohomology, compute comprehensive tables of orbit decompositions for curves of genus 9 or less, and prove results on the existence of infinite families of curves with invariant characteristics. On Bring’s curve, I identify key points with geometric significance on the curve, completely determine the structure of the quotients by subgroups of automorphisms, finding new elliptic curves in the process, and identify the unique invariant theta characteristic on the curve. With respect to monopoles, I elucidate the role that the Hitchin conditions play in determining monopole spectral curves, the relation between these conditions and the automorphism group of the curve, and I develop the theory of computing Nahm data of symmetric monopoles. As such I classify all 3-monopoles whose Nahm data may be solved for in terms of elliptic functions.
Spectral properties of random matrices
In the first part of this thesis, we give the theoretical foundations of random matrix theory through the definitions of a random matrix, a random probability measure and the corresponding empirical spectral distribution we will be working with. The main technical tool of the first paper, the Stieltjes transform method, is also defined rigorously and analyzed in depth. We then use this tool to prove optimal convergence of the empirical spectral distribution of random sample covariance matrices to the deterministic Marchenko-Pastur distribution. We also give new results about the rigidity of the eigenvalues of this random sample covariance matrix as well as about the rate of their convergence. In the second part of this thesis, we define another important and more general technical tool, the Dyson equation method, which also works well with non-Hermitian random matrices and was used in the second paper. Just like the Stieltjes transform method, it is defined rigorously and analyzed in depth. We then prove new local laws for a random matrix model that interpolates between the Marchenko-Pastur distribution, the elliptical law and the circular law. Through our work these local laws can now be considered universal, which means that they are independent of the initial distribution of the random matrix entries. We finally give an overview of our new results and provide new directions of study.
The infrared structure of perturbative gauge theories
Infrared divergences in the perturbative expansion of gauge theory amplitudes and cross sections have been a focus of theoretical investigations for almost a century. New insights still continue to emerge, as higher perturbative orders are explored, and high-precision phenomenological applications demand an ever more refined understanding. This review aims to provide a pedagogical overview of the subject. We briefly cover some of the early historical results, we provide some simple examples of low-order applications in the context of perturbative QCD, and discuss the necessary tools to extend these results to all perturbative orders. Finally, we describe recent developments concerning the calculation of soft anomalous dimensions in multi-particle scattering amplitudes at high orders, and we provide a brief introduction to the very active field of infrared subtraction for the calculation of differential distributions at colliders. © 2022 Elsevier B.V.
Backpropagation Beyond the Gradient
Automatic differentiation is a key enabler of deep learning: previously, practitioners were limited to models
for which they could manually compute derivatives. Now, they can create sophisticated models with almost
no restrictions and train them using first-order, i.e. gradient, information. Popular libraries like PyTorch
and TensorFlow compute this gradient efficiently, automatically, and conveniently with a single line of
code. Under the hood, reverse-mode automatic differentiation, or gradient backpropagation, powers the
gradient computation in these libraries. Their entire design centers around gradient backpropagation.
These frameworks are specialized around one specific task—computing the average gradient in a mini-batch.
This specialization often complicates the extraction of other information like higher-order statistical moments
of the gradient, or higher-order derivatives like the Hessian. It limits practitioners and researchers to methods
that rely on the gradient. Arguably, this hampers the field from exploring the potential of higher-order
information and there is evidence that focusing solely on the gradient has not led to significant recent
advances in deep learning optimization.
To advance algorithmic research and inspire novel ideas, information beyond the batch-averaged gradient
must be made available at the same level of computational efficiency, automation, and convenience.
This thesis presents approaches to simplify experimentation with rich information beyond the gradient
by making it more readily accessible. We present an implementation of these ideas as an extension to the
backpropagation procedure in PyTorch. Using this newly accessible information, we demonstrate possible use
cases by (i) showing how it can inform our understanding of neural network training by building a diagnostic
tool, and (ii) enabling novel methods to efficiently compute and approximate curvature information.
First, we extend gradient backpropagation for sequential feedforward models to Hessian backpropagation
which enables computing approximate per-layer curvature. This perspective unifies recently proposed block-
diagonal curvature approximations. Like gradient backpropagation, the computation of these second-order
derivatives is modular, and therefore simple to automate and extend to new operations.
Based on the insight that rich information beyond the gradient can be computed efficiently and at the
same time, we extend the backpropagation in PyTorch with the BackPACK library. It provides efficient and
convenient access to statistical moments of the gradient and approximate curvature information, often at a
small overhead compared to computing just the gradient.
Next, we showcase the utility of such information to better understand neural network training. We build
the Cockpit library that visualizes what is happening inside the model during training through various
instruments that rely on BackPACK’s statistics. We show how Cockpit provides a meaningful statistical
summary report to the deep learning engineer to identify bugs in their machine learning pipeline, guide
hyperparameter tuning, and study deep learning phenomena.
Finally, we use BackPACK’s extended automatic differentiation functionality to develop ViViT, an approach
to efficiently compute curvature information, in particular curvature noise. It uses the low-rank structure
of the generalized Gauss-Newton approximation to the Hessian and addresses shortcomings in existing
curvature approximations. Through monitoring curvature noise, we demonstrate how ViViT’s information
helps in understanding challenges to make second-order optimization methods work in practice.
This work develops new tools to experiment more easily with higher-order information in complex deep
learning models. These tools have impacted works on Bayesian applications with Laplace approximations,
out-of-distribution generalization, differential privacy, and the design of automatic differentiation
systems. They constitute one important step towards developing and establishing more efficient deep
learning algorithms.
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Mirror symmetry for Dubrovin-Zhang Frobenius manifolds
Frobenius manifolds were formally defined by Boris Dubrovin in the early 1990s, and serve as a bridge between a priori very different fields of mathematics such as integrable systems theory, enumerative geometry, singularity theory, and mathematical physics. This thesis concerns, in particular, a specific class of Frobenius manifolds constructed on the orbit space of an extension of the affine Weyl group defined by Dubrovin together with Youjin Zhang. Here, we find Landau-Ginzburg superpotentials, or B-model mirrors, for these Frobenius structures by considering the characteristic equation for Lax operators of relativistic Toda chains as proposed by Andrea Brini. As a bonus, the results open up various applications in topology, integrable hierarchies, and Gromov-Witten theory, making interesting research questions in these areas more accessible. Some such applications are considered in this thesis. The form of the determinant of the Saito metric on discriminant strata is investigated, applications to the combinatorics of Lyashko-Looijenga maps are given, and investigations into the integrable systems theoretic and enumerative geometric applications are commenced.
The Diophantine problem in Chevalley groups
In this paper we study the Diophantine problem in Chevalley groups , where is an indecomposable root system of rank , is
an arbitrary commutative ring with . We establish a variant of double
centralizer theorem for elementary unipotents . This theorem is
valid for arbitrary commutative rings with . The result is principle to show
that any one-parametric subgroup , , is Diophantine
in . Then we prove that the Diophantine problem in is
polynomial time equivalent (more precisely, Karp equivalent) to the Diophantine
problem in . This fact gives rise to a number of model-theoretic corollaries
for specific types of rings. Comment: 44 pages
Interfaces and Quantum Algebras, I: Stable Envelopes
The stable envelopes of Okounkov et al. realize some representations of
quantum algebras associated to quivers, using geometry. We relate these
geometric considerations to quantum field theory. The main ingredients are the
supersymmetric interfaces in gauge theories with four supercharges, relation of
supersymmetric vacua to generalized cohomology theories, and Berry connections.
We mainly consider softly broken compactified three dimensional theories. The companion papers will discuss applications of this
construction to symplectic duality, Bethe/gauge correspondence, generalizations
to higher dimensional theories, and other topics. Comment: 152 pages; v2: references added, various explanations improved
Metric perturbations of Kerr spacetime in Lorenz gauge: Circular equatorial orbits
We construct the metric perturbation in Lorenz gauge for a compact body on a
circular equatorial orbit of a rotating black hole (Kerr) spacetime, using a
newly-developed method of separation of variables. The metric perturbation is
formed from a linear sum of differential operators acting on Teukolsky mode
functions, and certain auxiliary scalars, which are solutions to ordinary
differential equations in the frequency domain. For radiative modes, the
solution is uniquely determined by the Weyl scalars, the trace,
and gauge scalars whose amplitudes are determined by imposing
continuity conditions on the metric perturbation at the orbital radius. The
static (zero-frequency) part of the metric perturbation, which is handled
separately, also includes mass and angular momentum completion pieces. The
metric perturbation is validated against the independent results of a 2+1D time
domain code, and we demonstrate agreement at the expected level in all
components, and the absence of gauge discontinuities. In principle, the new
method can be used to determine the Lorenz-gauge metric perturbation at a
sufficiently high precision to enable accurate second-order self-force
calculations on Kerr spacetime in future. We conclude with a discussion of
extensions of the method to eccentric and non-equatorial orbits. Comment: 88 pages, 14 figures