13 research outputs found
Space-time tradeoffs of lenses and optics via higher category theory
Optics and lenses are abstract categorical gadgets that model systems with
bidirectional data flow. In this paper we observe that the denotational
definition of optics (identifying two optics as equivalent by observing their
behaviour from the outside) is not suitable for operational, software-oriented
approaches where optics are not merely observed, but built with their internal
setups in mind. We identify operational differences between denotationally
isomorphic categories of cartesian optics and lenses: their different
composition rule and corresponding space-time tradeoffs, positioning them at
two opposite ends of a spectrum. With these motivations we lift the existing
categorical constructions and their relationships to the 2-categorical level,
showing that the relevant operational concerns become visible. We define the
2-category 2-Optic whose 2-cells explicitly track optics' internal
configuration, and show that the 1-category Optic arises by locally
quotienting out the connected components of this 2-category. We show that
the embedding of lenses into
cartesian optics gets weakened from a functor to an oplax functor whose
oplaxator now detects the different composition rule. We identify the
difficulties in showing that this functor forms part of an adjunction in any
of the standard 2-categories, and conjecture that the well-known isomorphism
between cartesian lenses and optics arises out of the lax
2-adjunction between their double-categorical counterparts. In addition to
presenting new research, this paper is also meant to be an accessible
introduction to the topic. (Comment: 28 pages)
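The space-time tradeoff this abstract refers to can be made concrete in a few lines of Python. Below is a minimal sketch under the usual encoding of a cartesian lens as a get/put pair and an optic as a forward pass returning a residual; the function names are illustrative, not taken from the paper:

```python
# A lens is a pair (get, put): get : S -> A reads the focus,
# put : (S, B') -> S' propagates a change back.
def compose_lens(l1, l2):
    get1, put1 = l1
    get2, put2 = l2
    def get(s):
        return get2(get1(s))
    def put(s, db):
        # get1(s) is recomputed on the backward pass:
        # time overhead, but nothing needs to be stored.
        return put1(s, put2(get1(s), db))
    return (get, put)

# An optic's forward pass instead returns a residual that the
# backward pass consumes, trading recomputation for memory.
def compose_optic(o1, o2):
    fwd1, bwd1 = o1
    fwd2, bwd2 = o2
    def fwd(s):
        a, r1 = fwd1(s)
        b, r2 = fwd2(a)
        return b, (r1, r2)  # residuals accumulate: space overhead
    def bwd(res, db):
        r1, r2 = res
        return bwd1(r1, bwd2(r2, db))
    return (fwd, bwd)

# Usage, reading put as a reverse derivative:
square = (lambda s: s * s, lambda s, db: 2 * s * db)
double = (lambda a: 2 * a, lambda a, db: 2 * db)
get, put = compose_lens(square, double)
print(get(3), put(3, 1))  # 18, 12: d(2s^2)/ds = 4s at s = 3
```

In the lens composite the forward value is recomputed inside every backward pass, while the optic composite keeps residuals in memory instead; these are the two ends of the spectrum the paper describes.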
Fundamental components of deep learning: a category-theoretic approach
Deep learning, despite its remarkable achievements, is still a young field. Like the early stages of many scientific disciplines, it is marked by the discovery of new phenomena, ad-hoc design decisions, and the lack of a uniform and compositional mathematical foundation. From the intricacies of the implementation of backpropagation, through a growing zoo of neural network architectures, to the new and poorly understood phenomena such as double descent, scaling laws or in-context learning, there are few unifying principles in deep learning.
This thesis develops a novel mathematical foundation for deep learning based on the language of category theory. We develop a new framework that is a) end-to-end, b) uniform, and c) not merely descriptive, but prescriptive, meaning it is amenable to direct implementation in programming languages with sufficient features. We also systematise many existing approaches, placing many existing constructions and concepts from the literature under the same umbrella.
In Part I, the theory, we identify and model two main properties of deep learning systems: they are parametric and bidirectional. We expand on the previously defined constructions of actegories and Para to study the former, and define weighted optics to study the latter. Combining them yields parametric weighted optics, a categorical model of artificial neural networks, and more: the constructions in Part I have close ties to many other kinds of bidirectional processes, such as Bayesian updating, value iteration, and game theory.
Part II justifies the abstractions from Part I, applying them to model backpropagation, architectures, and supervised learning. We provide a lens-theoretic axiomatisation of differentiation, covering not just smooth spaces, but discrete settings of Boolean circuits as well. We survey existing, and develop new, categorical models of neural network architectures. We formalise the notion of optimisers and, lastly, combine all the existing concepts together, providing a uniform and compositional framework for supervised learning.
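To make the Para construction mentioned above concrete: a Para morphism from A to B is a choice of parameter object P together with a map P × A → B, and composition pairs up the parameters. A minimal Python sketch, with an illustrative encoding of composite parameters as tuples:

```python
# A Para morphism A -> B is a parameter object P with f : P x A -> B,
# modelled here as a function f(p, a).
class Para:
    def __init__(self, f):
        self.f = f  # f : (params, input) -> output

    def __rshift__(self, other):
        # Sequential composition: the composite parameter is the
        # pair of the components' parameters.
        return Para(lambda pq, a: other.f(pq[1], self.f(pq[0], a)))

# Two parametric maps composed into one whose parameter space is
# the product of the components' parameter spaces:
linear = Para(lambda w, x: w * x)
shift  = Para(lambda b, x: x + b)
affine = linear >> shift
print(affine.f((3.0, 1.0), 2.0))  # (3 * 2) + 1 = 7.0
```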
Categorical Foundations of Gradient-Based Learning
We propose a categorical semantics of gradient-based machine learning
algorithms in terms of lenses, parametrised maps, and reverse derivative
categories. This foundation provides a powerful explanatory and unifying
framework: it encompasses a variety of gradient descent algorithms such as
ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions
such as MSE and Softmax cross-entropy, shedding new light on their
similarities and differences. Our approach to gradient-based learning has
examples generalising beyond the familiar continuous domains (modelled in
categories of smooth maps) and can be realized in the discrete setting of
boolean circuits. Finally, we demonstrate the practical significance of our
framework with an implementation in Python. (Comment: 14 pages)
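To illustrate the shape of this semantics (a hand-written sketch, not the paper's accompanying Python implementation): a layer is a parametric lens, i.e. a forward map together with a reverse derivative, the loss function supplies the initial gradient, and the optimiser consumes the parameter gradient:

```python
# A layer as a "parametric lens": forward pass plus a reverse
# derivative returning gradients for both parameter and input.
class LinearLens:
    def __init__(self, w):
        self.w = w

    def forward(self, x):
        return self.w * x

    def backward(self, x, dy):
        return dy * x, dy * self.w        # (dweight, dinput)

def mse_grad(y_pred, y_true):
    return 2.0 * (y_pred - y_true)        # reverse derivative of MSE

# One step of vanilla gradient descent, composed lens-style:
layer, lr = LinearLens(0.5), 0.1
x, y_true = 3.0, 6.0                      # target behaviour: y = 2x
dy = mse_grad(layer.forward(x), y_true)
dw, _ = layer.backward(x, dy)
layer.w -= lr * dw                        # the optimiser updates the weight
```

Swapping the update line for a momentum or ADAM rule changes only the optimiser component, which is the modularity the framework is after.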
Compositional Game Theory, compositionally
We present a new compositional approach to compositional game theory (CGT), based upon Arrows, a concept originally from functional programming that is closely related to Tambara modules, and upon operators that build new Arrows from old. We model equilibria as a module over an Arrow, and define an operator that builds a new Arrow from such a module over an existing Arrow. We also model strategies as graded Arrows, and define an operator that builds a new Arrow by taking the colimit of a graded Arrow. A final operator builds a graded Arrow from a graded bimodule. We use this compositional approach to CGT to show how known and previously unknown variants of open games can be proven to form symmetric monoidal categories.
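For readers unfamiliar with Arrows: the interface the abstract builds on consists of lifting plain functions, sequential composition, and acting on the first component of a pair. A minimal Python transcription (class and method names are illustrative) is:

```python
# A minimal Arrow interface, transcribed from functional programming.
class Arrow:
    def __init__(self, f):
        self.f = f

    @staticmethod
    def arr(f):                      # lift a plain function
        return Arrow(f)

    def __rshift__(self, other):     # sequential composition (>>>)
        return Arrow(lambda x: other.f(self.f(x)))

    def first(self):                 # act on the first half of a pair
        return Arrow(lambda p: (self.f(p[0]), p[1]))

# Parallel composition (***), derived from first and composition:
def parallel(a1, a2):
    swap = Arrow.arr(lambda p: (p[1], p[0]))
    return a1.first() >> swap >> a2.first() >> swap

double = Arrow.arr(lambda x: 2 * x)
succ   = Arrow.arr(lambda x: x + 1)
print(parallel(double, succ).f((3, 3)))  # (6, 4)
```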
Application of Deep Learning for Sentiment Analysis
Given the exponential growth in the amount of data generated worldwide, there has been rising interest in building models capable of analysing data whose semantic context is unknown. This paper focuses on the exploration and analysis of the theoretical foundations behind models called recurrent neural networks (RNNs), through a concrete implementation on a sentiment analysis problem. A special case of recurrent neural networks, long short-term memory (LSTM) networks, was implemented for sentiment classification on the IMDb movie-review dataset. Various architectures and hyperparameter configurations of these networks were tested and compared. A significant performance improvement was observed with deep multi-layer LSTM networks compared to shallow ones.
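A minimal PyTorch sketch of the kind of model compared in the thesis: an embedding feeding a multi-layer LSTM whose final hidden state classifies a review. The hyperparameters below are illustrative, not the configurations reported in the thesis:

```python
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)

    def forward(self, tokens):               # tokens: (batch, seq_len)
        embedded = self.embed(tokens)
        _, (h_n, _) = self.lstm(embedded)    # h_n: (layers, batch, hidden)
        return self.fc(h_n[-1])              # one sentiment logit per review

model = SentimentLSTM(vocab_size=20_000)
logits = model(torch.randint(0, 20_000, (8, 50)))  # a batch of 8 reviews
```

Depth here is the `num_layers` argument; the shallow baselines correspond to `num_layers=1`.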
Compositional Deep Learning
Neural networks have become an increasingly popular tool for solving many real-world problems. They are a general framework for differentiable optimization which includes many other machine learning approaches as special cases. In this thesis we lay out the beginnings of a formal compositional framework for reasoning about a number of components of modern neural network architectures. The language of category theory is used to expand existing work on compositional supervised learning into the territories of unsupervised learning and generative models. By translating neural network architectures, datasets, the parameter-function map, and a number of other concepts into the categorical setting, we show that optimization can be done in the space of functors between two fixed categories, rather than in the space of functions between two sets. We outline a striking correspondence between the deep learning formulation in this thesis and that of categorical database systems. Furthermore, we use the category-theoretic framework to devise a novel neural network architecture whose goal is to learn the task of object insertion and object deletion in images with unpaired data. We test the architecture on two different datasets and obtain promising results.
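The parameter-function map mentioned above is the observation that a fixed architecture sends each point of parameter space to a concrete function, so training is a search through parameter space rather than through function space directly; a tiny Python illustration:

```python
# A fixed architecture induces a map from parameters to functions;
# optimization searches the domain of this map.
def parameter_function_map(w, b):
    return lambda x: w * x + b  # the function denoted by (w, b)

f = parameter_function_map(2.0, 1.0)
print(f(3.0))  # 7.0
```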