13 research outputs found

    Random matrix theory and the loss surfaces of neural networks

    Full text link
    Neural network models are one of the most successful approaches to machine learning, enjoying an enormous amount of development and research over recent years and finding concrete real-world applications in almost any conceivable area of science, engineering and modern life in general. The theoretical understanding of neural networks trails significantly behind their practical success and the engineering heuristics that have grown up around them. Random matrix theory provides a rich framework of tools with which aspects of neural network phenomenology can be explored theoretically. In this thesis, we establish significant extensions of prior work using random matrix theory to understand and describe the loss surfaces of large neural networks, particularly generalising to different architectures. Informed by the historical applications of random matrix theory in physics and elsewhere, we establish the presence of local random matrix universality in real neural networks and then utilise this as a modelling assumption to derive powerful and novel results about the Hessians of neural network loss surfaces and their spectra. In addition to these major contributions, we make use of random matrix models for neural network loss surfaces to shed light on modern neural network training approaches and even to derive a novel and effective variant of a popular optimisation algorithm. Overall, this thesis provides important contributions that cement the place of random matrix theory in the theoretical study of modern neural networks, reveals some of the limits of existing approaches, and begins the study of an entirely new role for random matrix theory in the theory of deep learning, with important experimental discoveries and novel theoretical results based on local random matrix universality.
    Comment: 320 pages, PhD thesis
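
    To make the thesis's central object concrete, here is a minimal sketch (my illustration, not the thesis's construction): it estimates the Hessian of a tiny network's loss by finite differences and inspects its eigenvalue spectrum, the quantity whose random-matrix behaviour is studied above. All sizes, data and the finite-difference scheme are assumptions for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((32, 4))      # toy inputs
    y = rng.standard_normal(32)           # toy targets
    d_in, d_h = 4, 5
    n_params = d_in * d_h + d_h           # one hidden layer, no biases

    def loss(theta):
        W1 = theta[:d_in * d_h].reshape(d_in, d_h)
        w2 = theta[d_in * d_h:]
        h = np.tanh(X @ W1)               # hidden activations
        return 0.5 * np.mean((h @ w2 - y) ** 2)

    theta0 = rng.standard_normal(n_params) * 0.5
    eps = 1e-4
    H = np.zeros((n_params, n_params))
    for i in range(n_params):
        for j in range(n_params):
            ei = np.zeros(n_params); ei[i] = eps
            ej = np.zeros(n_params); ej[j] = eps
            # second-order mixed finite difference for d^2 L / dtheta_i dtheta_j
            H[i, j] = (loss(theta0 + ei + ej) - loss(theta0 + ei)
                       - loss(theta0 + ej) + loss(theta0)) / eps ** 2
    H = 0.5 * (H + H.T)                   # symmetrise away rounding error
    print(np.linalg.eigvalsh(H))          # the spectrum: a small bulk plus possible outliers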

    A spin-glass model for the loss surfaces of generative adversarial networks

    Get PDF
    We present a novel mathematical model that seeks to capture the key design feature of generative adversarial networks (GANs). Our model consists of two interacting spin glasses, and we conduct an extensive theoretical analysis of the complexity of the model's critical points using techniques from Random Matrix Theory. The result is insights into the loss surfaces of large GANs that build upon prior insights for simpler networks, but also reveal new structure unique to this setting.
    Comment: 26 pages, 9 figures
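
    As a toy illustration of the two-player structure (the couplings and losses below are assumptions of mine, not the authors' exact model), one can couple two Gaussian disorder terms through shared weights, with each player's weights normalised to a sphere as in spin-glass models:

    import numpy as np

    rng = np.random.default_rng(1)
    n_w, n_v = 8, 8
    J1 = rng.standard_normal((n_v, n_v))        # discriminator-only disorder
    J2 = rng.standard_normal((n_w, n_v))        # generator-discriminator interaction

    def on_sphere(x):
        # spin-glass convention: |x|^2 = n
        return np.sqrt(len(x)) * x / np.linalg.norm(x)

    def loss_disc(w, v):
        # discriminator minimises its own glass minus the shared interaction
        return v @ J1 @ v / np.sqrt(n_v) - w @ J2 @ v / np.sqrt(n_w * n_v)

    def loss_gen(w, v):
        # generator minimises the shared interaction term
        return w @ J2 @ v / np.sqrt(n_w * n_v)

    w = on_sphere(rng.standard_normal(n_w))
    v = on_sphere(rng.standard_normal(n_v))
    print(loss_disc(w, v), loss_gen(w, v))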

    Universal characteristics of deep neural network loss surfaces from random matrix theory

    Full text link
    This paper considers several aspects of random matrix universality in deep neural networks. Motivated by recent experimental work, we use universal properties of random matrices related to local statistics to derive practical implications for deep neural networks based on a realistic model of their Hessians. In particular, we derive universal aspects of outliers in the spectra of deep neural networks and demonstrate the important role of random matrix local laws in popular pre-conditioned gradient descent algorithms. We also present insights into deep neural network loss surfaces from quite general arguments based on tools from statistical physics and random matrix theory.
    Comment: 42 pages
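
    For intuition about how outliers detach from the spectral bulk, a standard random-matrix illustration (not the paper's derivation; the numbers are assumptions): adding a rank-one "spike" of strength theta to a Wigner matrix produces an eigenvalue beyond the bulk edge at 2 once theta exceeds 1, the BBP-type transition.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 1000
    A = rng.standard_normal((n, n))
    W = (A + A.T) / np.sqrt(2 * n)        # Wigner matrix, semicircle bulk on [-2, 2]
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)
    for theta in (0.5, 2.0):
        top = np.linalg.eigvalsh(W + theta * np.outer(u, u))[-1]
        # for theta > 1 the top eigenvalue sits near theta + 1/theta, outside the bulk
        print(theta, top)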

    Appearance of Random Matrix Theory in Deep Learning

    Full text link
    We investigate the local spectral statistics of the loss surface Hessians of artificial neural networks, where we discover excellent agreement with Gaussian Orthogonal Ensemble statistics across several network architectures and datasets. These results shed new light on the applicability of Random Matrix Theory to modelling neural networks and suggest a previously unrecognised role for it in the study of loss surfaces in deep learning. Inspired by these observations, we propose a novel model for the true loss surfaces of neural networks which is consistent with our observations, allows for Hessian spectral densities with the rank degeneracy and outliers extensively observed in practice, and predicts a growing independence of loss gradients as a function of distance in weight-space. We further investigate the importance of the true loss surface in neural networks and find, in contrast to previous work, that the exponential hardness of locating the global minimum has practical consequences for achieving state-of-the-art performance.
    Comment: 33 pages, 14 figures
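
    The flavour of such a local-statistics check can be sketched as follows (a generic GOE sanity test with all details assumed, not the paper's code): adjacent-eigenvalue spacing ratios probe local statistics without the unfolding step needed for raw spacings, and their mean has a known GOE value.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 2000
    A = rng.standard_normal((n, n))
    M = (A + A.T) / 2                     # GOE-distributed symmetric matrix (up to scale)
    eigs = np.sort(np.linalg.eigvalsh(M))
    s = np.diff(eigs)                     # consecutive level spacings
    r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
    # GOE: mean ratio ~0.53 (Wigner-like surmise 4 - 2*sqrt(3) ~ 0.536);
    # uncorrelated (Poisson) levels would give 2*ln(2) - 1 ~ 0.386
    print(r.mean())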

    Type, typography and the typographer

    No full text
    This chapter considers the changing role of typography and the evolving practices of the typographer across six centuries. It charts the effect of technology, trade, and training on the profession as typographic responsibility passed from printer to compositor, to designer, and finally to Everyman. The chapter also considers the changing visual appearance of typographic books, their journey to free themselves from the conventions of the manuscript book, and their influence on the e-book.

    Rethinking penal modernism from the global South: The case of convict transportation to Australia

    No full text
    Criminological accounts of penal modernization have generally overlooked the experience of convict transportation to, and in, the global South, an effect of the general tendency of metropolitan theory to embed particular experiences and perspectives and present them as universal. In consequence, the implications of this momentous penal project, which spanned more than 80 years in the case of Australia, for our understanding of crime and punishment have received limited attention. The article reflects on this lacuna in contemporary penal thought and considers some of the historical, conceptual and policy lessons that might be drawn from an effort to incorporate convict transportation into an account of modern penal development.