11 research outputs found
Towards a topological-geometrical theory of group equivariant non-expansive operators for data analysis and machine learning
The aim of this paper is to provide a general mathematical framework for
group equivariance in the machine learning context. The framework builds on a
synergy between persistent homology and the theory of group actions. We define
group-equivariant non-expansive operators (GENEOs), which are maps between
function spaces associated with groups of transformations. We study the
topological and metric properties of the space of GENEOs to evaluate their
approximating power and set the basis for general strategies to initialise and
compose operators. We begin by defining suitable pseudo-metrics for the
function spaces, the equivariance groups, and the set of non-expansive
operators. Building on these pseudo-metrics, we prove that the space of GENEOs is
compact and convex, under the assumption that the function spaces are compact
and convex. These results provide fundamental guarantees in a machine learning
perspective. We show examples on the MNIST and fashion-MNIST datasets. By
considering isometry-equivariant non-expansive operators, we describe a simple
strategy to select and sample operators, and show how the selected and sampled
operators can be used to perform both classical metric learning and an
effective initialisation of the kernels of a convolutional neural network.
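The two defining properties of a GENEO, equivariance and non-expansivity, can be checked numerically. Below is a minimal sketch, not taken from the paper: a moving-average operator acting on functions over the cyclic domain Z_n, equivariant with respect to translations; all names are illustrative.

```python
import numpy as np

def shift(phi, t):
    # translation action on functions over Z_n: (phi . g_t)(i) = phi(i + t mod n)
    return np.roll(phi, -t)

def geneo(phi):
    # average of the two cyclic neighbours: linear, translation-equivariant,
    # and non-expansive since the weights are non-negative and sum to 1
    return 0.5 * (shift(phi, 1) + shift(phi, -1))

rng = np.random.default_rng(0)
phi, psi = rng.normal(size=16), rng.normal(size=16)

# equivariance: F(phi . g) = F(phi) . g for every translation g
assert np.allclose(geneo(shift(phi, 3)), shift(geneo(phi), 3))
# non-expansivity in the sup-norm
assert np.abs(geneo(phi) - geneo(psi)).max() <= np.abs(phi - psi).max() + 1e-12
```

Any convex combination of translations gives another translation-equivariant non-expansive operator, which is one reason the space of GENEOs is convex under the paper's assumptions.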
Using topological data analysis for building Bayesian neural networks
For the first time, a simplified approach to constructing Bayesian neural networks is proposed, combining computational
efficiency with the ability to analyze the learning process. The proposed approach is based on Bayesianization of a
deterministic neural network by randomizing parameters only at the interface level, i.e., forming a Bayesian
neural network from a given network by replacing its parameters with probability distributions whose means are
the parameters of the original model. Efficiency metrics of the neural network constructed within this approach,
and of a Bayesian neural network trained through variational inference, were evaluated using topological data
analysis methods. The Bayesianization procedure is implemented through graded variation of the randomization
intensity. As baselines, two neural networks with identical structure were used: a deterministic network and a
classical Bayesian network. The networks were fed the original data of two datasets, both without noise and with
added Gaussian noise. The zeroth and first persistent homology for the embeddings of each layer of the resulting
neural networks were computed. To assess
the quality of classification, the accuracy metric was used. It is shown that the barcodes for embeddings on each layer of
the Bayesianized neural network in all four scenarios are between the corresponding barcodes of the deterministic and
Bayesian neural networks for both zero and first persistent homologies. In this case, the deterministic neural network is
the lower bound, and the Bayesian neural network is the upper bound. It is shown that the structure of data associations
within the Bayesianized neural network is inherited from the deterministic model, but acquires the properties of a Bayesian
one. It has been experimentally established that there is a relationship between the normalized persistent entropy
calculated on neural network embeddings and the accuracy of the neural network. For predicting accuracy, the topology
of embeddings on the middle layer of the neural network model turned out to be the most revealing. The proposed
approach can be used to simplify the construction of a Bayesian neural network from an already trained deterministic
neural network, which opens up the possibility of increasing the accuracy of an existing neural network without
ensembling it with additional classifiers. It also becomes possible to proactively evaluate the effectiveness of the
generated neural network on simplified data without running it on a real dataset, which reduces the resource
intensity of its development.
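The interface-level Bayesianization and the normalized persistent entropy mentioned above can be sketched as follows. This is a minimal illustration under our own assumptions (Gaussian perturbations scaled by a graded intensity `sigma`), not the authors' implementation.

```python
import numpy as np

def bayesianize(weights, sigma, rng):
    # replace each deterministic parameter w by one sample from N(w, (sigma*|w|)^2);
    # sigma grades the randomization intensity, and sigma = 0 recovers the
    # original deterministic network
    return weights + sigma * np.abs(weights) * rng.normal(size=weights.shape)

def normalized_persistent_entropy(lifetimes):
    # H = -sum_i p_i log p_i with p_i = l_i / sum_j l_j, divided by log(n)
    # so that the value lies in [0, 1]
    l = np.asarray(lifetimes, dtype=float)
    p = l / l.sum()
    return float(-(p * np.log(p)).sum() / np.log(len(p)))

rng = np.random.default_rng(0)
w = np.array([0.5, -1.2, 2.0])
assert np.allclose(bayesianize(w, 0.0, rng), w)   # zero intensity: unchanged
# equal bar lifetimes give maximal (normalized) entropy
assert abs(normalized_persistent_entropy([1, 1, 1, 1]) - 1.0) < 1e-12
```

In this sketch the entropy would be computed on the lifetimes of barcode intervals extracted from layer embeddings, which is where the reported correlation with accuracy is observed.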
Landscapes of data sets and functoriality of persistent homology
The aim of this article is to describe a new perspective on functoriality of
persistent homology and explain its intrinsic symmetry that is often
overlooked. A data set for us is a finite collection of functions, called
measurements, with a finite domain. Such a data set might contain internal
symmetries which are effectively captured by the action of a set of the domain
endomorphisms. Different choices of the set of endomorphisms encode different
symmetries of the data set. We describe various category structures on such
enriched data sets and prove some of their properties such as decompositions
and morphism formations. We also describe a data structure, based on coloured
directed graphs, which is convenient to encode the mentioned enrichment. We
show that persistent homology preserves some, but not all, aspects of these
landscapes of enriched data sets; in other words, persistent homology is not a
functor on the entire category of enriched data sets. Nevertheless, we show
that persistent homology is functorial locally. We use the concept of
equivariant operators to capture some of the information missed by persistent
homology.
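A toy version of such an enriched data set, a finite collection of measurements together with a set of domain endomorphisms acting on them, can be sketched as follows; the domain, measurement, and endomorphism below are hypothetical examples.

```python
import numpy as np

# a toy enriched data set: measurements (functions) on the finite domain
# {0, ..., n-1}, together with a chosen set of domain endomorphisms that
# encodes the symmetries we want to keep track of
n = 6
phi = np.array([0, 1, 2, 0, 1, 2])   # a measurement with a period-3 symmetry

def endo(i):
    # a hypothetical endomorphism of the domain (here: a cyclic shift by 3)
    return (i + 3) % n

def act(measurement, f):
    # the action on measurements: phi -> phi . f
    return np.array([measurement[f(i)] for i in range(n)])

# this measurement is fixed by the chosen endomorphism, so this particular
# symmetry of the data set is captured by the action
assert np.array_equal(act(phi, endo), phi)
# acting twice agrees with acting by the composed endomorphism f . f
assert np.array_equal(act(act(phi, endo), endo),
                      np.array([phi[endo(endo(i))] for i in range(n)]))
```

Different choices of the endomorphism set yield different enrichments of the same raw measurements, which is the freedom the category structures in the article organize.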
On the concept of permutant in the theory of group equivariant non-expansive operators
The application of group equivariant non-expansive operators (GENEOs) in deep learning and topological data analysis has recently proved very effective. This thesis investigates the concept of permutant, on which a method for constructing such operators is based. In particular, it is shown that permutants are organized in a lattice structure endowed with a maximum, and that this maximum turns out to be a group.
On the representation of linear group equivariant operators
In recent years there has been growing interest in machine learning and topological data analysis. One possible approach in topological data analysis exploits operators that are equivariant with respect to the action of a group, called GEOs. Such operators are useful both for approximating the natural pseudo-distance through an approach that also involves persistent homology, a key tool in topological data analysis, and for studying neural networks, which can be decomposed into GEOs, from a topological-geometrical point of view. A crucial problem is the construction of classes of GEOs, in order to approximate the space of all such operators.
This thesis deepens the study of linear GEOs. In the first chapter we describe the mathematical context in which the research develops and define GEOs and GENEOs (group equivariant non-expansive operators). In the second chapter we introduce the concept of permutant measure, through which we show how a linear GEO can be constructed. The third chapter addresses the representability of linear GEOs: we show that, under suitable hypotheses, every linear GEO can be associated with a permutant measure.
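The permutant-based construction described in these two theses (averaging a function over a permutant to obtain a linear equivariant operator) can be illustrated with a small sketch; the group, permutant, and names below are our own illustrative choices.

```python
import numpy as np

n = 6
# G: cyclic translations of Z_n; H: a permutant for G, i.e. a set of
# permutations with g H g^{-1} = H for every g in G (translations commute,
# so the shifts by +1 and -1 qualify)
H = [lambda i: (i + 1) % n, lambda i: (i - 1) % n]

def apply_perm(phi, h):
    # right action of a permutation on a function: phi -> phi . h
    return np.array([phi[h(i)] for i in range(n)])

def linear_geo(phi):
    # averaging over a permutant yields a linear GENEO: equivariant, and
    # non-expansive because the averaging weights sum to 1
    return sum(apply_perm(phi, h) for h in H) / len(H)

rng = np.random.default_rng(1)
phi = rng.normal(size=n)
g = lambda i: (i + 2) % n   # a translation in G

# equivariance: F(phi . g) = F(phi) . g
assert np.allclose(linear_geo(apply_perm(phi, g)), apply_perm(linear_geo(phi), g))
```

Replacing the uniform average over H with a (non-negative, normalized) weighting is the finite analogue of the permutant measures studied in the second thesis.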
Why we should use topological data analysis in ageing: Towards defining the “topological shape of ageing”
Living systems are subject to the arrow of time; from birth, they undergo complex transformations (self-organization) in a constant battle for survival, but inevitably ageing and disease trap them to death. Can ageing be understood and eventually reversed? What tools can be employed to further our understanding of ageing? The present article is an invitation for biologists and clinicians to consider key conceptual ideas and computational tools (known to mathematicians and physicists), which potentially may help dissect some of the underlying processes of ageing and disease. Specifically, we first discuss how to classify and analyse complex systems, as well as highlight critical theoretical difficulties that make complex systems hard to study. Subsequently, we introduce Topological Data Analysis, a novel Big Data tool, which may help in the study of complex systems since it extracts knowledge from data in a holistic approach via topological considerations. These conceptual ideas and tools are discussed in a relatively informal way to pave future discussions and collaborations between mathematicians and biologists studying ageing.
Funding: Basque Government under the grant "Artificial Intelligence in BCAM", number EXP. 2019/00432; Inria associated team "NeuroTransSF".
Diketo acid inhibitors of nsp13 of SARS-CoV-2 block viral replication
For RNA viruses, RNA helicases have long been recognized to play critical roles during virus replication cycles, facilitating proper folding and replication of viral RNAs, and therefore represent an ideal target for drug discovery. The SARS-CoV-2 helicase, non-structural protein 13 (nsp13), is a highly conserved protein among all known coronaviruses and is, at the moment, one of the most explored viral targets for identifying new antiviral agents. In the present study, we present six diketo acids (DKAs) as nsp13 inhibitors able to block both SARS-CoV-2 nsp13 enzymatic functions. Among them, four compounds were able to inhibit viral replication in the low micromolar range, being active also against other human coronaviruses such as HCoV-229E and MERS-CoV. The experimental investigation of the binding mode revealed ATP-non-competitive kinetics of inhibition, not affected by a substrate-displacement effect, suggesting an allosteric binding mode that was further supported by molecular modelling calculations predicting binding into a conserved allosteric site located in the RecA2 domain.
Towards a topological–geometrical theory of group equivariant non-expansive operators for data analysis and machine learning
The research carried out by M.G.B. was supported by the European Research Council (Advanced Investigator Grant 671251 to Z.F. Mainen), the Champalimaud Foundation (Z.F. Mainen) and a GPU NVIDIA grant. The research carried out by P.F. and N.Q. was partially supported by GNSAGA-INdAM (Italy). We provide a general mathematical framework for group and set equivariance in machine learning. We define group equivariant non-expansive operators (GENEOs) as maps between function spaces associated with groups of transformations. We study the topological and metric properties of the space of GENEOs to evaluate their approximating power and set the basis for general strategies to initialize and compose operators. We define suitable pseudo-metrics for the function spaces, the equivariance groups and the set of non-expansive operators. We prove that, under suitable assumptions, the space of GENEOs is compact and convex. These results provide fundamental guarantees in a machine learning perspective. By considering isometry-equivariant non-expansive operators, we describe a simple strategy to select and sample operators. Thereafter, we show how selected and sampled operators can be used both to perform classical metric learning and to inject knowledge in artificial neural networks.
Bergomi, Mattia G.; Frosini, Patrizio; Giorgi, Daniela; Quercioli, Nicola