13 research outputs found

    Space-time tradeoffs of lenses and optics via higher category theory

    Full text link
    Optics and lenses are abstract categorical gadgets that model systems with bidirectional data flow. In this paper we observe that the denotational definition of optics (identifying two optics as equivalent by observing their behaviour from the outside) is not suitable for operational, software-oriented approaches where optics are not merely observed, but built with their internal setups in mind. We identify operational differences between denotationally isomorphic categories of cartesian optics and lenses: their different composition rules and the corresponding space-time tradeoffs, positioning them at two opposite ends of a spectrum. With these motivations we lift the existing categorical constructions and their relationships to the 2-categorical level, showing that the relevant operational concerns become visible. We define the 2-category $\textbf{2-Optic}(\mathcal{C})$ whose 2-cells explicitly track optics' internal configuration. We show that the 1-category $\textbf{Optic}(\mathcal{C})$ arises by locally quotienting out the connected components of this 2-category. We show that the embedding of lenses into cartesian optics gets weakened from a functor to an oplax functor whose oplaxator now detects the different composition rule. We describe the difficulties in showing that this functor forms part of an adjunction in any of the standard 2-categories. We conjecture that the well-known isomorphism between cartesian lenses and optics arises out of the lax 2-adjunction between their double-categorical counterparts. In addition to presenting new research, this paper is also meant to be an accessible introduction to the topic. Comment: 28 pages.
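    To make the paper's operational point concrete, here is a minimal Haskell sketch (our own illustration, not code from the paper) of the two composition rules: lens composition re-runs the forward pass inside the backward pass, trading time for space, while optic composition stores a residual, trading space for time.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A cartesian lens: the backward pass re-reads the original input.
data Lens s t a b = Lens { view :: s -> a, update :: s -> b -> t }

-- Lens composition re-runs `view outer` inside the backward pass:
-- a time overhead, but nothing extra is stored.
lensCompose :: Lens x y a b -> Lens s t x y -> Lens s t a b
lensCompose inner outer = Lens
  { view   = view inner . view outer
  , update = \s b -> update outer s (update inner (view outer s) b)
  }

-- An optic: the forward pass saves a residual m that the backward pass reuses.
data Optic s t a b = forall m. Optic (s -> (m, a)) ((m, b) -> t)

-- Optic composition pairs up the residuals: a space overhead, but the
-- backward pass never recomputes a forward pass.
opticCompose :: Optic x y a b -> Optic s t x y -> Optic s t a b
opticCompose (Optic f' b') (Optic f b) = Optic
  (\s -> let (m, x) = f s; (n, a) = f' x in ((m, n), a))
  (\((m, n), newB) -> b (m, b' (n, newB)))
```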

    Fundamental components of deep learning : a category-theoretic approach

    Get PDF
    Deep learning, despite its remarkable achievements, is still a young field. Like the early stages of many scientific disciplines, it is marked by the discovery of new phenomena, ad-hoc design decisions, and the lack of a uniform and compositional mathematical foundation. From the intricacies of the implementation of backpropagation, through a growing zoo of neural network architectures, to new and poorly understood phenomena such as double descent, scaling laws, and in-context learning, there are few unifying principles in deep learning. This thesis develops a novel mathematical foundation for deep learning based on the language of category theory. We develop a new framework that is a) end-to-end, b) uniform, and c) not merely descriptive, but prescriptive, meaning it is amenable to direct implementation in programming languages with sufficient features. We also systematise many existing approaches, placing many existing constructions and concepts from the literature under the same umbrella. In Part I, the theory, we identify and model two main properties of deep learning systems: they are parametric and bidirectional. We expand on the previously defined construction of Para to study the former, and define weighted optics to study the latter. Combining them yields parametric weighted optics, a categorical model of artificial neural networks, and more: constructions in Part I have close ties to many other kinds of bidirectional processes such as Bayesian updating, value iteration, and game theory. Part II justifies the abstractions from Part I, applying them to model backpropagation, architectures, and supervised learning. We provide a lens-theoretic axiomatisation of differentiation, covering not just smooth spaces, but discrete settings of Boolean circuits as well. We survey existing, and develop new, categorical models of neural network architectures. We formalise the notion of optimisers and, lastly, combine all the existing concepts together, providing a uniform and compositional framework for supervised learning.
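    As a rough illustration of the Para construction mentioned above, here is a minimal Haskell sketch over plain functions (the thesis works in a general monoidal category; the names below are ours): a parametric map carries an explicit parameter type, and composition pairs up the parameter spaces.

```haskell
-- A parametric morphism from a to b with parameter type p.
newtype Para p a b = Para { runPara :: p -> a -> b }

-- Composing parametric maps pairs their parameter spaces: running the
-- composite requires supplying both parameters.
paraCompose :: Para q b c -> Para p a b -> Para (p, q) a c
paraCompose (Para g) (Para f) = Para (\(p, q) a -> g q (f p a))

-- The identity is parametrised by the trivial (unit) parameter.
paraId :: Para () a a
paraId = Para (\() a -> a)

-- Example: an affine layer followed by a ReLU, each with its own parameters.
affine :: Para (Double, Double) Double Double
affine = Para (\(w, b) x -> w * x + b)

relu :: Para () Double Double
relu = Para (\() x -> max 0 x)

layer :: Para ((Double, Double), ()) Double Double
layer = paraCompose relu affine
```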

    Categorical Foundations of Gradient-Based Learning

    Get PDF
    We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach to gradient-based learning has examples generalising beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realized in the discrete setting of Boolean circuits. Finally, we demonstrate the practical significance of our framework with an implementation in Python. Comment: 14 pages.
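    As a minimal sketch of the lens view of backpropagation described above (our own simplification, not the paper's code): a differentiable map pairs a forward pass with a reverse derivative, and lens composition is exactly the chain rule.

```haskell
-- A differentiable map as a lens: a forward pass plus a backward pass
-- sending an output cotangent to an input cotangent.
data DLens a b = DLens
  { fwd :: a -> b       -- the function itself
  , rev :: a -> b -> a  -- reverse derivative at a point
  }

-- Lens composition implements the chain rule: the composite's backward
-- pass evaluates the second reverse derivative at the intermediate point.
dCompose :: DLens b c -> DLens a b -> DLens a c
dCompose g f = DLens
  { fwd = fwd g . fwd f
  , rev = \a dc -> rev f a (rev g (fwd f a) dc)
  }

-- Example: square, then scale by 3; the composite differentiates 3x^2.
square, scale3 :: DLens Double Double
square = DLens (\x -> x * x) (\x dy -> 2 * x * dy)
scale3 = DLens (* 3) (\_ dy -> 3 * dy)

main :: IO ()
main = print (rev (dCompose scale3 square) 2 1)  -- prints 12.0, i.e. 6x at x = 2
```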

    Compositional Game Theory, compositionally

    Get PDF
    We present a new compositional approach to compositional game theory (CGT) based upon Arrows, a concept originally from functional programming that is closely related to Tambara modules, and upon operators that build new Arrows from old. We model equilibria as a module over an Arrow and define an operator that builds a new Arrow from such a module over an existing Arrow. We also model strategies as graded Arrows and define an operator which builds a new Arrow by taking the colimit of a graded Arrow. A final operator builds a graded Arrow from a graded bimodule. We use this compositional approach to CGT to show how known and previously unknown variants of open games can be proven to form symmetric monoidal categories.
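    For readers unfamiliar with Arrows, the sketch below recalls the Haskell interface the concept comes from, together with one toy way of building a new Arrow from an old one (plain functions extended with a log); the paper's Arrows and their modules live in a more general categorical setting.

```haskell
import Control.Arrow
import Control.Category
import Prelude hiding (id, (.))

-- A new Arrow from old: plain functions that also thread a list of log
-- messages through composition.
newtype Logged a b = Logged { runLogged :: a -> (b, [String]) }

instance Category Logged where
  id = Logged (\a -> (a, []))
  Logged g . Logged f = Logged $ \a ->
    let (b, w1) = f a
        (c, w2) = g b
    in (c, w1 ++ w2)

instance Arrow Logged where
  arr f = Logged (\a -> (f a, []))
  first (Logged f) = Logged (\(a, c) -> let (b, w) = f a in ((b, c), w))
```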

    Application of Deep Learning for Sentiment Analysis

    No full text
    Given the exponential growth in the amount of data generated worldwide, there is a rising interest in creating models capable of analysing data whose semantic context is unknown. This work focuses on exploring and analysing the theoretical foundations behind recurrent neural networks (RNNs) through a concrete implementation on a sentiment analysis problem. A special case of recurrent neural networks, long short-term memory (LSTM) networks, is implemented for sentiment classification on the IMDb movie review dataset. Numerous architectures and hyperparameter configurations of these networks are tested and compared. A significant performance improvement is observed with deep multi-layer LSTM networks compared to shallow ones.
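    For reference, the LSTM cell mentioned above is standardly given by the following update equations (the textbook formulation, not taken from this work), where $\sigma$ is the logistic sigmoid and $\odot$ is elementwise multiplication:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) &&\text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) &&\text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) &&\text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) &&\text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t &&\text{(cell state)} \\
h_t &= o_t \odot \tanh(c_t) &&\text{(hidden state)}
\end{aligned}
```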

    Compositional Deep Learning

    No full text
    Neural networks have become an increasingly popular tool for solving many real-world problems. They are a general framework for differentiable optimization which includes many other machine learning approaches as special cases. In this thesis we lay out the beginnings of a formal compositional framework for reasoning about a number of components of modern neural network architectures. The language of category theory is used to expand existing work on compositional supervised learning into the territories of unsupervised learning and generative models. By translating neural network architectures, datasets, the parameter-function map, and a number of other concepts to the categorical setting, we show that optimization can be done in the space of functors between two fixed categories, rather than in the space of functions between two sets. We outline a striking correspondence between the deep learning formulation in this thesis and that of categorical database systems. Furthermore, we use the category-theoretic framework to design a novel neural network architecture whose goal is to learn object insertion and object deletion in images from unpaired data. We test the architecture on two different datasets and obtain promising results.
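    As a minimal sketch of the "functors rather than functions" idea (with illustrative names of our own, far simpler than the thesis's actual setting): an architecture can be viewed as a graph of generating edges, and each choice of weights interprets edges as concrete functions and paths as their composites.

```haskell
-- Generating edges of a tiny architecture graph: Input -> Hidden -> Output.
data Edge = InToHidden | HiddenToOut

-- One functor per weight vector: every edge is sent to a concrete function,
-- so different weights give different functors of the same shape.
interpret :: (Double, Double) -> Edge -> (Double -> Double)
interpret (w1, _) InToHidden  = \x -> max 0 (w1 * x)  -- ReLU layer
interpret (_, w2) HiddenToOut = \h -> w2 * h          -- linear readout

-- Functoriality: the path InToHidden ; HiddenToOut is interpreted as the
-- composite function. Optimisation searches the space of such interpretations.
network :: (Double, Double) -> Double -> Double
network ws = interpret ws HiddenToOut . interpret ws InToHidden
```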
