Fundamental components of deep learning : a category-theoretic approach

Author: Gavranović Bruno

Deep learning, despite its remarkable achievements, is still a young field. Like the early stages of many scientific disciplines, it is marked by the discovery of new phenomena, ad-hoc design decisions, and the lack of a uniform and compositional mathematical foundation. From the intricacies of the implementation of backpropagation, through a growing zoo of neural network architectures, to the new and poorly understood phenomena such as double descent, scaling laws or in-context learning, there are few unifying principles in deep learning. This thesis develops a novel mathematical foundation for deep learning based on the language of category theory. We develop a new framework that is a) end-to-end, b) uniform, and c) not merely descriptive, but prescriptive, meaning it is amenable to direct implementation in programming languages with sufficient features. We also systematise many existing approaches, placing many existing constructions and concepts from the literature under the same umbrella. In Part I, the theory, we identify and model two main properties of deep learning systems: they are parametric and bidirectional. We expand on the previously defined construction of categories and Para to study the former, and define weighted optics to study the latter. Combining them yields parametric weighted optics, a categorical model of artificial neural networks, and more: constructions in Part I have close ties to many other kinds of bidirectional processes such as Bayesian updating, value iteration, and game theory. Part II justifies the abstractions from Part I, applying them to model backpropagation, architectures, and supervised learning. We provide a lens-theoretic axiomatisation of differentiation, covering not just smooth spaces, but discrete settings of Boolean circuits as well. We survey existing, and develop new categorical models of neural network architectures. We formalise the notion of optimisers and lastly, combine all the existing concepts together, providing a uniform and compositional framework for supervised learning.

STAX (Strathclyde Repository)

Foundations of Software Science and Computation Structures

Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/05/2020

This open access book constitutes the proceedings of the 23rd International Conference on Foundations of Software Science and Computational Structures, FOSSACS 2020, which took place in Dublin, Ireland, in April 2020, and was held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The 31 regular papers presented in this volume were carefully reviewed and selected from 98 submissions. The papers cover topics such as categorical models and logics; language theory, automata, and games; modal, spatial, and temporal logics; type theory and proof theory; concurrency theory and process calculi; rewriting theory; semantics of programming languages; program analysis, correctness, transformation, and verification; logics of programming; software specification and refinement; models of concurrent, reactive, stochastic, distributed, hybrid, and mobile systems; emerging models of computation; logical aspects of computational complexity; models of software security; and logical foundations of data bases.

Directory of Open Access Books (DOAB)

Foundations of Software Science and Computation Structures

Publication venue: 'Springer Science and Business Media LLC'

OAPEN Library

Machine Learning in Confocal Laser Microscopy and Spectroscopy

Author: Yu Yinchuan
Publication venue: Washington State University
Publication date: 01/01/2022

Confocal laser scanning microscopy (CLSM) is a preferred method for obtaining optical images with submicron resolution. Replacing the pinhole and detector of a CLSM with a digital camera (CCD or CMOS) has the potential to simplify the design and reduce cost. However, the relatively slow speed of a typical camera results in long scans. To address this issue, in the present investigation a microlens array (MLA) was used to split the laser beam into 48 beamlets that are focused onto the sample. In essence, 48 pinhole-detector measurements were performed in parallel. Images obtained from the 48 laser spots were stitched together into a final image. Photoluminescence (PL) spectroscopy is a non-destructive optical method that is widely used to characterize semiconductors. In the PL process, a substance absorbs photons and emits light with longer wavelengths. This paper discusses a method for identifying substances from their PL spectra using machine learning, a technique that is efficient in making classifications. Neural networks were constructed by taking simulated PL spectra as the input and the identity of the substance as the output. Six different semiconductors were chosen as categories: gallium oxide (Ga2O3), zinc oxide (ZnO), gallium nitride (GaN), cadmium sulfide (CdS), tungsten disulfide (WS2) and cesium lead bromide (CsPbBr3). The developed algorithm has a high accuracy (>90%) for assigning a substance to one of these six categories from its PL spectrum. With an XY stage, a CLSM can scan a large area on a sample. The height of the objective must be adjusted so that the laser beam focuses on the sample surface. However, if the surface of the sample is not flat, the laser spot will go in and out of focus, degrading the scan. Deep learning, especially convolutional neural networks, is an efficient way to process images and has proven successful in object detection, image classification, face recognition, and related tasks. Deep learning techniques were used to design a model that predicts the out-of-focus distance from an image of the laser spot. The model could be developed into a system that automatically focuses the CLSM in real time.

Washington State University institutional repository

The strong gravitational lens finding challenge

Author: Avestruz C.; Bellagamba F.; Bertin E.; Bom C. R.; Cabanac R.; Courbin F.; Davies A.; Decencière E.; Flamary R.; Gavazzi R.; Geiger M.; Hartley P.; Huertas-Company M.; Jackson N.; Jacobs C.; Jullo E.; Kneib J.-P.; Koopmans L. V. E.; Lanusse F.; Li C.-L.; Li N.; Lightman M.; Ma Q.; Makler M.; Meneghetti M.; Metcalf R. B.; Petrillo C. E.; Schäfer C.; Serjeant S.; Sonnenfeld A.; Tagore A.; Tortora C.; Tuccillo D.; Valentín M. B.; Velasco-Forero S.; Verdoes Kleijn G. A.; Vernardos G.
Publication venue: 'EDP Sciences'
Publication date: 01/01/2019

Large-scale imaging surveys will increase the number of galaxy-scale strong lensing candidates by perhaps three orders of magnitude beyond the number known today. Finding these rare objects will require picking them out of at least tens of millions of images, and deriving scientific results from them will require quantifying the efficiency and bias of any search method. To achieve these objectives, automated methods must be developed. Because gravitational lenses are rare objects, reducing false positives will be particularly important. We present a description and results of an open gravitational lens finding challenge. Participants were asked to classify 100 000 candidate objects as to whether they were gravitational lenses or not, with the goal of developing better automated methods for finding lenses in large data sets. A variety of methods were used, including visual inspection, arc and ring finders, support vector machines (SVM) and convolutional neural networks (CNN). We find that many of the methods will easily be fast enough to analyse the anticipated data flow. In test data, several methods are able to identify upwards of half the lenses, after applying some thresholds on lens characteristics such as lensed image brightness, size or contrast with the lens galaxy, without making a single false-positive identification. This is significantly better than direct inspection by humans was able to do. Having multi-band, ground-based data is found to be better for this purpose than single-band space-based data with lower noise and higher resolution, suggesting that multi-colour data is crucial. Multi-band space-based data will be superior to ground-based data. The most difficult challenge for a lens finder is differentiating between rare, irregular and ring-like face-on galaxies and true gravitational lenses. The degree to which the efficiency and biases of lens finders can be quantified largely depends on the realism of the simulated data on which the finders are trained.

EDP Sciences OAI-PMH repository (1.2.0)

University of Groningen

HAL AMU

HAL Descartes

OA@INAF - Istituto Nazionale di Astrofisica

arXiv.org e-Print Archive

Proceedings - University of Groningen

ARTS repository - University of Groningen

Open Research Online (The Open University)

HAL-INSU

HAL-IRD

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

HAL-MINES ParisTech

HAL-OBSPM

Dissertations of the University of Groningen

Complex extreme nonlinear waves: classical and quantum theory for new computing models

Author: MARCUCCI GIULIA
Publication date: 07/02/2020

The historical role of nonlinear waves in developing the science of complexity, and also their physical feature of being a widespread paradigm in optics, establishes a bridge between two diverse and fundamental fields that can open an immeasurable number of new routes. In what follows, we present our most important results on nonlinear waves in classical and quantum nonlinear optics. Regarding classical phenomenology, we lay the groundwork for establishing one uniform theory of dispersive shock waves, and for controlling complex nonlinear regimes through simple integer topological invariants. The second quantized field theory of optical propagation in nonlinear dispersive media allows us to perform numerical simulations of quantum solitons and the quantum nonlinear box problem. The complexity of light propagation in nonlinear media is here examined from all the main points of view: extreme phenomena, recurrence, control, modulation instability, and so forth. Such an analysis has a major goal: answering the question of whether nonlinear waves can do computation. For this purpose, we illustrate our study towards the realization of an all-optical computer, able to do computation by implementing machine learning algorithms. The first all-optical realization of the Ising machine and the theoretical foundations of the random optical machine are here reported. We believe that this treatise is a fundamental study for the application of nonlinear waves to new computational techniques, disclosing new procedures for the control of extreme waves and for the design of new quantum sources and non-classical state generators for future quantum technologies, and also giving insights into all-optical reservoir computing. Can nonlinear waves do computation? Our random optical machine draws the route for a positive answer to this question, substituting the randomness either with the uncertainty of quantum noise effects on light propagation or with the arbitrariness of classical, extremely nonlinear regimes, as similarly done by random projection methods and extreme learning machines.

Archivio della ricerca- Università di Roma La Sapienza

Mathematical foundations for a compositional account of the Bayesian brain

Author: St Clere Smithe Toby Benedict
Publication date: 19/12/2023

This dissertation reports some first steps towards a compositional account of active inference and the Bayesian brain. Specifically, we use the tools of contemporary applied category theory to supply functorial semantics for approximate inference. To do so, we define on the 'syntactic' side the new notion of Bayesian lens and show that Bayesian updating composes according to the compositional lens pattern. Using Bayesian lenses, and inspired by compositional game theory, we define fibrations of statistical games and classify various problems of statistical inference as corresponding sections: the chain rule of the relative entropy is formalized as a strict section, while maximum likelihood estimation and the free energy give lax sections. In the process, we introduce a new notion of 'copy-composition'. On the 'semantic' side, we present a new formalization of general open dynamical systems (particularly: deterministic, stochastic, and random; and discrete- and continuous-time) as certain coalgebras of polynomial functors, which we show collect into monoidal opindexed categories (or, alternatively, into algebras for multicategories of generalized polynomial functors). We use these opindexed categories to define monoidal bicategories of 'cilia': dynamical systems which control lenses, and which supply the target for our functorial semantics. Accordingly, we construct functors which explain the bidirectional compositional structure of predictive coding neural circuits under the free energy principle, thereby giving a formal mathematical underpinning to the bidirectionality observed in the cortex. Along the way, we explain how to compose rate-coded neural circuits using an algebra for a multicategory of linear circuit diagrams, showing subsequently that this is subsumed by lenses and polynomial functors. Because category theory is unfamiliar to many computational neuroscientists and cognitive scientists, we have made a particular effort to give clear, detailed, and approachable expositions of all the category-theoretic structures and results of which we make use. We hope that this dissertation will prove helpful in establishing a new "well-typed" science of life and mind, and in facilitating interdisciplinary communication.

Oxford University Research Archive