13 research outputs found
Random matrix theory and the loss surfaces of neural networks
Neural network models are one of the most successful approaches to machine
learning, enjoying an enormous amount of development and research over recent
years and finding concrete real-world applications in almost any conceivable
area of science, engineering and modern life in general. The theoretical
understanding of neural networks trails significantly behind their practical
success and the engineering heuristics that have grown up around them. Random
matrix theory provides a rich framework of tools with which aspects of neural
network phenomenology can be explored theoretically. In this thesis, we
establish significant extensions of prior work using random matrix theory to
understand and describe the loss surfaces of large neural networks,
particularly generalising to different architectures. Informed by the
historical applications of random matrix theory in physics and elsewhere, we
establish the presence of local random matrix universality in real neural
networks and then utilise this as a modelling assumption to derive powerful and
novel results about the Hessians of neural network loss surfaces and their
spectra. In addition to these major contributions, we make use of random matrix
models for neural network loss surfaces to shed light on modern neural network
training approaches and even to derive a novel and effective variant of a
popular optimisation algorithm.
Overall, this thesis provides important contributions to cement the place of
random matrix theory in the theoretical study of modern neural networks,
reveals some of the limits of existing approaches and begins the study of an
entirely new role for random matrix theory in the theory of deep learning with
important experimental discoveries and novel theoretical results based on local
random matrix universality.
Comment: 320 pages, PhD thesis
A spin-glass model for the loss surfaces of generative adversarial networks
We present a novel mathematical model that seeks to capture the key design
feature of generative adversarial networks (GANs). Our model consists of two
interacting spin glasses, and we conduct an extensive theoretical analysis of
the complexity of the model's critical points using techniques from Random
Matrix Theory. The result is insights into the loss surfaces of large GANs that
build upon prior insights for simpler networks, but also reveal new structure
unique to this setting.
Comment: 26 pages, 9 figures
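The two-interacting-spin-glass idea can be caricatured numerically. The sketch below is not the paper's model (which analyses critical-point complexity for spherical spin glasses via Random Matrix Theory); it is only a minimal toy in which two players on spheres share a bilinear disorder term with opposite signs, mimicking the adversarial coupling of a GAN. All sizes, couplings, and normalisations here are illustrative choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def spherical_spins(n, rng):
    """A configuration drawn uniformly from the sphere of radius sqrt(n),
    the usual state space for spherical spin glasses."""
    x = rng.standard_normal(n)
    return np.sqrt(n) * x / np.linalg.norm(x)

n_d, n_g = 40, 60  # illustrative sizes for the two glasses (our choice)

# Independent Gaussian disorder: one coupling matrix per glass,
# plus a bilinear term coupling the two (normalisations are ours)
J_d = rng.standard_normal((n_d, n_d))
J_g = rng.standard_normal((n_g, n_g))
J_x = rng.standard_normal((n_d, n_g))

def energies(s_d, s_g):
    """Per-player energies for a toy two-glass game: each player feels its
    own disorder plus a shared interaction, entering with opposite signs."""
    h_d = s_d @ J_d @ s_d / n_d ** 1.5
    h_g = s_g @ J_g @ s_g / n_g ** 1.5
    h_x = s_d @ J_x @ s_g / (n_d * n_g) ** 0.75
    return h_d + h_x, h_g - h_x  # adversarial: one player's gain is the other's loss on h_x

s_d, s_g = spherical_spins(n_d, rng), spherical_spins(n_g, rng)
e_d, e_g = energies(s_d, s_g)
```

The opposite signs on the interaction term are what make the toy "adversarial": neither player can minimise its energy without affecting the other's.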
Universal characteristics of deep neural network loss surfaces from random matrix theory
This paper considers several aspects of random matrix universality in deep
neural networks. Motivated by recent experimental work, we use universal
properties of random matrices related to local statistics to derive practical
implications for deep neural networks based on a realistic model of their
Hessians. In particular we derive universal aspects of outliers in the spectra
of deep neural networks and demonstrate the important role of random matrix
local laws in popular pre-conditioning gradient descent algorithms. We also
present insights into deep neural network loss surfaces from quite general
arguments based on tools from statistical physics and random matrix theory.
Comment: 42 pages
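The spectral outliers mentioned above have a classic random-matrix illustration: adding a rank-one "signal" spike to a GOE noise matrix produces an eigenvalue that detaches from the bulk once the spike strength crosses a threshold (the BBP transition). The following is a generic demonstration of that phenomenon, not a reconstruction of the paper's Hessian model.

```python
import numpy as np

rng = np.random.default_rng(7)

def goe(n, rng):
    """Sample an n x n GOE matrix, normalised so the bulk spectrum fills [-2, 2]."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2 * n)

n, theta = 1000, 3.0          # theta: strength of the rank-one spike
v = np.ones(n) / np.sqrt(n)   # unit vector carrying the spike

# Deformed ensemble: bulk noise plus a rank-one perturbation
h = goe(n, rng) + theta * np.outer(v, v)

eigs = np.linalg.eigvalsh(h)  # ascending order
lam_max = eigs[-1]
# BBP-type prediction: for theta > 1 an outlier detaches from the bulk
# and sits near theta + 1/theta (here 3 + 1/3 ≈ 3.33), while the rest
# of the spectrum stays confined to roughly [-2, 2]
print(lam_max)
```

For theta below 1 the spike is swallowed by the bulk and no outlier appears, which is the transition the "universal aspects of outliers" language refers to.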
Appearance of Random Matrix Theory in Deep Learning
We investigate the local spectral statistics of the loss surface Hessians of
artificial neural networks, where we discover excellent agreement with Gaussian
Orthogonal Ensemble statistics across several network architectures and
datasets. These results shed new light on the applicability of Random Matrix
Theory to modelling neural networks and suggest a previously unrecognised role
for it in the study of loss surfaces in deep learning. Inspired by these
observations, we propose a novel model for the true loss surfaces of neural
networks, consistent with our observations, which allows for Hessian spectral
densities with rank degeneracy and outliers, extensively observed in practice,
and predicts a growing independence of loss gradients as a function of distance
in weight-space. We further investigate the importance of the true loss surface
in neural networks and find, in contrast to previous work, that the exponential
hardness of locating the global minimum has practical consequences for
achieving state-of-the-art performance.
Comment: 33 pages, 14 figures
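A standard way to test for the Gaussian Orthogonal Ensemble local statistics described above is the adjacent-gap ratio, which is scale-free and so needs no unfolding of the spectrum. The sketch below applies it to a freshly sampled GOE matrix, standing in for a network Hessian; the GOE prediction for the mean ratio is approximately 0.5307.

```python
import numpy as np

rng = np.random.default_rng(42)

def goe_matrix(n, rng):
    """Sample an n x n matrix from the Gaussian Orthogonal Ensemble."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2 * n)

def mean_gap_ratio(eigs):
    """Mean adjacent-gap ratio r_i = min(s_i, s_{i+1}) / max(s_i, s_{i+1}),
    where s_i are consecutive level spacings; a standard probe of local
    spectral statistics that is insensitive to the global density."""
    s = np.diff(np.sort(eigs))
    r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
    return r.mean()

eigs = np.linalg.eigvalsh(goe_matrix(1000, rng))
print(mean_gap_ratio(eigs))  # GOE prediction is ~0.5307
```

Applied to empirical Hessian eigenvalues instead of the sampled matrix, agreement with ~0.5307 (rather than the ~0.386 of uncorrelated Poisson levels) is the kind of evidence of GOE local statistics the abstract describes.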
Type, typography and the typographer
This chapter considers the changing role of typography and the evolving practices of the typographer across six centuries. It charts the effect of technology, trade, and training on the profession as typographic responsibility passed from printer to compositor, to designer, and finally to Everyman. This chapter also considers the changing visual appearance of typographic books, their journey to free themselves from the conventions of the manuscript book, and their influence on the e‐book.
What drives adoption of a computerised, multifaceted quality improvement intervention for cardiovascular disease management in primary healthcare settings? A mixed methods analysis using normalisation process theory
Rethinking penal modernism from the global South: The case of convict transportation to Australia
Criminological accounts of penal modernization have generally overlooked the experience of convict transportation to, and in, the global South, an effect of the general tendency of metropolitan theory to embed particular experiences and perspectives, and present them as universal. In consequence, the implications for our understanding of crime and punishment of this momentous penal project, spanning more than 80 years in the case of Australia, have received limited attention. The article reflects on this lacuna in contemporary penal thought and considers some of the historical, conceptual and policy lessons that might be drawn from an effort to incorporate convict transportation into an account of modern penal development.