908 research outputs found

    Fundamental components of deep learning : a category-theoretic approach

    Get PDF
    Deep learning, despite its remarkable achievements, is still a young field. Like the early stages of many scientific disciplines, it is marked by the discovery of new phenomena, ad-hoc design decisions, and the lack of a uniform and compositional mathematical foundation. From the intricacies of the implementation of backpropagation, through a growing zoo of neural network architectures, to the new and poorly understood phenomena such as double descent, scaling laws or in-context learning, there are few unifying principles in deep learning. This thesis develops a novel mathematical foundation for deep learning based on the language of category theory. We develop a new framework that is a) end-to-end, b) uniform, and c) not merely descriptive, but prescriptive, meaning it is amenable to direct implementation in programming languages with sufficient features. We also systematise many existing approaches, placing many existing constructions and concepts from the literature under the same umbrella. In Part I, the theory, we identify and model two main properties of deep learning systems: they are parametric and bidirectional. We expand on the previously defined construction of categories and Para to study the former, and define weighted optics to study the latter. Combining them yields parametric weighted optics, a categorical model of artificial neural networks, and more: constructions in Part I have close ties to many other kinds of bidirectional processes such as Bayesian updating, value iteration, and game theory. Part II justifies the abstractions from Part I, applying them to model backpropagation, architectures, and supervised learning. We provide a lens-theoretic axiomatisation of differentiation, covering not just smooth spaces, but discrete settings of Boolean circuits as well. We survey existing, and develop new categorical models of neural network architectures. We formalise the notion of optimisers and lastly, combine all the existing concepts together, providing a uniform and compositional framework for supervised learning.Deep learning, despite its remarkable achievements, is still a young field. Like the early stages of many scientific disciplines, it is marked by the discovery of new phenomena, ad-hoc design decisions, and the lack of a uniform and compositional mathematical foundation. From the intricacies of the implementation of backpropagation, through a growing zoo of neural network architectures, to the new and poorly understood phenomena such as double descent, scaling laws or in-context learning, there are few unifying principles in deep learning. This thesis develops a novel mathematical foundation for deep learning based on the language of category theory. We develop a new framework that is a) end-to-end, b) uniform, and c) not merely descriptive, but prescriptive, meaning it is amenable to direct implementation in programming languages with sufficient features. We also systematise many existing approaches, placing many existing constructions and concepts from the literature under the same umbrella. In Part I, the theory, we identify and model two main properties of deep learning systems: they are parametric and bidirectional. We expand on the previously defined construction of categories and Para to study the former, and define weighted optics to study the latter. Combining them yields parametric weighted optics, a categorical model of artificial neural networks, and more: constructions in Part I have close ties to many other kinds of bidirectional processes such as Bayesian updating, value iteration, and game theory. Part II justifies the abstractions from Part I, applying them to model backpropagation, architectures, and supervised learning. We provide a lens-theoretic axiomatisation of differentiation, covering not just smooth spaces, but discrete settings of Boolean circuits as well. We survey existing, and develop new categorical models of neural network architectures. We formalise the notion of optimisers and lastly, combine all the existing concepts together, providing a uniform and compositional framework for supervised learning

    Foundations of Software Science and Computation Structures

    Get PDF
    This open access book constitutes the proceedings of the 23rd International Conference on Foundations of Software Science and Computational Structures, FOSSACS 2020, which took place in Dublin, Ireland, in April 2020, and was held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The 31 regular papers presented in this volume were carefully reviewed and selected from 98 submissions. The papers cover topics such as categorical models and logics; language theory, automata, and games; modal, spatial, and temporal logics; type theory and proof theory; concurrency theory and process calculi; rewriting theory; semantics of programming languages; program analysis, correctness, transformation, and verification; logics of programming; software specification and refinement; models of concurrent, reactive, stochastic, distributed, hybrid, and mobile systems; emerging models of computation; logical aspects of computational complexity; models of software security; and logical foundations of data bases.

    Foundations of Software Science and Computation Structures

    Get PDF
    This open access book constitutes the proceedings of the 23rd International Conference on Foundations of Software Science and Computational Structures, FOSSACS 2020, which took place in Dublin, Ireland, in April 2020, and was held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The 31 regular papers presented in this volume were carefully reviewed and selected from 98 submissions. The papers cover topics such as categorical models and logics; language theory, automata, and games; modal, spatial, and temporal logics; type theory and proof theory; concurrency theory and process calculi; rewriting theory; semantics of programming languages; program analysis, correctness, transformation, and verification; logics of programming; software specification and refinement; models of concurrent, reactive, stochastic, distributed, hybrid, and mobile systems; emerging models of computation; logical aspects of computational complexity; models of software security; and logical foundations of data bases.

    The strong gravitational lens finding challenge

    Get PDF
    Large-scale imaging surveys will increase the number of galaxy-scale strong lensing candidates by maybe three orders of magnitudes beyond the number known today. Finding these rare objects will require picking them out of at least tens of millions of images, and deriving scientific results from them will require quantifying the efficiency and bias of any search method. To achieve these objectives automated methods must be developed. Because gravitational lenses are rare objects, reducing false positives will be particularly important. We present a description and results of an open gravitational lens finding challenge. Participants were asked to classify 100 000 candidate objects as to whether they were gravitational lenses or not with the goal of developing better automated methods for finding lenses in large data sets. A variety of methods were used including visual inspection, arc and ring finders, support vector machines (SVM) and convolutional neural networks (CNN). We find that many of the methods will be easily fast enough to analyse the anticipated data flow. In test data, several methods are able to identify upwards of half the lenses after applying some thresholds on the lens characteristics such as lensed image brightness, size or contrast with the lens galaxy without making a single false-positive identification. This is significantly better than direct inspection by humans was able to do. Having multi-band, ground based data is found to be better for this purpose than single-band space based data with lower noise and higher resolution, suggesting that multi-colour data is crucial. Multi-band space based data will be superior to ground based data. The most difficult challenge for a lens finder is differentiating between rare, irregular and ring-like face-on galaxies and true gravitational lenses. The degree to which the efficiency and biases of lens finders can be quantified largely depends on the realism of the simulated data on which the finders are trained

    Complex extreme nonlinear waves: classical and quantum theory for new computing models

    Get PDF
    The historical role of nonlinear waves in developing the science of complexity, and also their physical feature of being a widespread paradigm in optics, establishes a bridge between two diverse and fundamental fields that can open an immeasurable number of new routes. In what follows, we present our most important results on nonlinear waves in classical and quantum nonlinear optics. About classical phenomenology, we lay the groundwork for establishing one uniform theory of dispersive shock waves, and for controlling complex nonlinear regimes through simple integer topological invariants. The second quantized field theory of optical propagation in nonlinear dispersive media allows us to perform numerical simulations of quantum solitons and the quantum nonlinear box problem. The complexity of light propagation in nonlinear media is here examined from all the main points of view: extreme phenomena, recurrence, control, modulation instability, and so forth. Such an analysis has a major, significant goal: answering the question can nonlinear waves do computation? For this purpose, our study towards the realization of an all-optical computer, able to do computation by implementing machine learning algorithms, is illustrated. The first all-optical realization of the Ising machine and the theoretical foundations of the random optical machine are here reported. We believe that this treatise is a fundamental study for the application of nonlinear waves to new computational techniques, disclosing new procedures to the control of extreme waves, and to the design of new quantum sources and non-classical state generators for future quantum technologies, also giving incredible insights about all-optical reservoir computing. Can nonlinear waves do computation? Our random optical machine draws the route for a positive answer to this question, substituting the randomness either with the uncertainty of quantum noise effects on light propagation or with the arbitrariness of classical, extremely nonlinear regimes, as similarly done by random projection methods and extreme learning machines

    Mathematical foundations for a compositional account of the Bayesian brain

    Get PDF
    This dissertation reports some first steps towards a compositional account of active inference and the Bayesian brain. Specifically, we use the tools of contemporary applied category theory to supply functorial semantics for approximate inference. To do so, we define on the 'syntactic' side the new notion of Bayesian lens and show that Bayesian updating composes according to the compositional lens pattern. Using Bayesian lenses, and inspired by compositional game theory, we define fibrations of statistical games and classify various problems of statistical inference as corresponding sections: the chain rule of the relative entropy is formalized as a strict section, while maximum likelihood estimation and the free energy give lax sections. In the process, we introduce a new notion of 'copy-composition'. On the 'semantic' side, we present a new formalization of general open dynamical systems (particularly: deterministic, stochastic, and random; and discrete- and continuous-time) as certain coalgebras of polynomial functors, which we show collect into monoidal opindexed categories (or, alternatively, into algebras for multicategories of generalized polynomial functors). We use these opindexed categories to define monoidal bicategories of 'cilia': dynamical systems which control lenses, and which supply the target for our functorial semantics. Accordingly, we construct functors which explain the bidirectional compositional structure of predictive coding neural circuits under the free energy principle, thereby giving a formal mathematical underpinning to the bidirectionality observed in the cortex. Along the way, we explain how to compose rate-coded neural circuits using an algebra for a multicategory of linear circuit diagrams, showing subsequently that this is subsumed by lenses and polynomial functors. Because category theory is unfamiliar to many computational neuroscientists and cognitive scientists, we have made a particular effort to give clear, detailed, and approachable expositions of all the category-theoretic structures and results of which we make use. We hope that this dissertation will prove helpful in establishing a new "well-typed'' science of life and mind, and in facilitating interdisciplinary communication
    corecore