11 research outputs found

    Towards a topological-geometrical theory of group equivariant non-expansive operators for data analysis and machine learning

    The aim of this paper is to provide a general mathematical framework for group equivariance in the machine learning context. The framework builds on a synergy between persistent homology and the theory of group actions. We define group-equivariant non-expansive operators (GENEOs), which are maps between function spaces associated with groups of transformations. We study the topological and metric properties of the space of GENEOs to evaluate their approximating power and set the basis for general strategies to initialise and compose operators. We begin by defining suitable pseudo-metrics for the function spaces, the equivariance groups, and the set of non-expansive operators. Based on these pseudo-metrics, we prove that the space of GENEOs is compact and convex, under the assumption that the function spaces are compact and convex. These results provide fundamental guarantees in a machine learning perspective. We show examples on the MNIST and fashion-MNIST datasets. By considering isometry-equivariant non-expansive operators, we describe a simple strategy to select and sample operators, and show how the selected and sampled operators can be used to perform both classical metric learning and an effective initialisation of the kernels of a convolutional neural network. Comment: Added references. Extended Section 7. Added 3 figures. Corrected typos. 42 pages, 7 figures.
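To make the two defining properties of a GENEO concrete, here is a minimal sketch in a toy setting that is not taken from the paper: signals are real-valued functions on a cyclic grid, the group consists of cyclic shifts, and the operator is a circular convolution whose weights have absolute sum at most 1. The function names `shift` and `convolution_geneo` are hypothetical, and the sup-norm is assumed as the distance between signals.

```python
import numpy as np


def shift(phi, s):
    """Group action (assumed for this toy example): cyclic shift by s positions."""
    return np.roll(np.asarray(phi, dtype=float), s)


def convolution_geneo(phi, weights):
    """Circular convolution F(phi)[i] = sum_k weights[k] * phi[(i - k) mod n].

    If sum_k |weights[k]| <= 1, F commutes with cyclic shifts (equivariance)
    and does not increase sup-norm distances (non-expansivity).
    """
    phi = np.asarray(phi, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.abs(weights).sum() > 1.0 + 1e-12:
        raise ValueError("weights must have absolute sum <= 1 for non-expansivity")
    # np.roll(phi, k)[i] equals phi[(i - k) mod n]
    return sum(w * np.roll(phi, k) for k, w in enumerate(weights))


def sup_distance(a, b):
    """Sup-norm distance between two signals on the same grid."""
    return float(np.max(np.abs(np.asarray(a) - np.asarray(b))))


rng = np.random.default_rng(0)
phi, psi = rng.random(12), rng.random(12)
w = [0.25, 0.5, 0.25]

# Equivariance: applying the shift before or after F gives the same signal.
equivariant = np.allclose(convolution_geneo(shift(phi, 3), w),
                          shift(convolution_geneo(phi, w), 3))
# Non-expansivity: F never increases the distance between two signals.
non_expansive = (sup_distance(convolution_geneo(phi, w), convolution_geneo(psi, w))
                 <= sup_distance(phi, psi) + 1e-12)
print(equivariant, non_expansive)
```

A convex combination of two such operators is again equivariant and non-expansive, which is the elementary fact behind the convexity statement in the abstract.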

    Using topological data analysis for building Bayesian neural networks

    For the first time, a simplified approach to constructing Bayesian neural networks is proposed, combining computational efficiency with the ability to analyze the learning process. The approach is based on Bayesianization of a deterministic neural network by randomizing parameters only at the interface level, i.e., a Bayesian neural network is formed from a given network by replacing its parameters with probability distributions whose mean values are the parameters of the original model. Efficiency metrics of the neural network obtained with this approach, and of a Bayesian neural network constructed through variational inference, were evaluated using topological data analysis methods. The Bayesianization procedure is implemented through graded variation of the randomization intensity. For comparison, two neural networks with identical structure were used: a deterministic network and a classical Bayesian network. The networks were fed the original data of two datasets, both without noise and with added Gaussian noise. The zeroth and first persistent homology of the embeddings produced at each layer of the resulting networks were computed. Classification quality was assessed with the accuracy metric. It is shown that, in all four scenarios, the barcodes for the embeddings at each layer of the Bayesianized neural network lie between the corresponding barcodes of the deterministic and Bayesian neural networks, for both zeroth and first persistent homology. In this case, the deterministic neural network is the lower bound, and the Bayesian neural network is the upper bound. It is shown that the structure of data associations within a Bayesianized neural network is inherited from the deterministic model but acquires the properties of a Bayesian one.
It has been experimentally established that there is a relationship between the normalized persistent entropy calculated on neural network embeddings and the accuracy of the neural network. For predicting accuracy, the topology of the embeddings at the middle layer of the neural network model turned out to be the most revealing. The proposed approach can be used to simplify the construction of a Bayesian neural network from an already trained deterministic neural network, which opens up the possibility of increasing the accuracy of an existing neural network without ensembling it with additional classifiers. It also becomes possible to proactively evaluate the effectiveness of the generated neural network on simplified data without running it on a real dataset, which reduces the resource intensity of its development.
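The two ingredients named in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the randomized parameters are assumed to be Gaussian with the trained weight as mean and the "intensity" as standard deviation, and persistent entropy is assumed to be the Shannon entropy of the normalized bar lengths, divided by the logarithm of the number of finite bars. The names `bayesianize` and `normalized_persistent_entropy` are hypothetical.

```python
import numpy as np


def bayesianize(weights, intensity, rng):
    """Draw one sample of randomized weights centred on the trained weights.

    Assumption: Gaussian noise with standard deviation `intensity`;
    intensity = 0 reproduces the deterministic network exactly.
    """
    weights = np.asarray(weights, dtype=float)
    return weights + intensity * rng.standard_normal(weights.shape)


def normalized_persistent_entropy(barcode):
    """Normalized Shannon entropy of the bar lengths of a finite barcode.

    `barcode` is a sequence of (birth, death) pairs with finite death values.
    Returns a value in [0, 1]; barcodes with fewer than two bars give 0.
    """
    bars = np.asarray(barcode, dtype=float).reshape(-1, 2)
    lengths = bars[:, 1] - bars[:, 0]
    lengths = lengths[lengths > 0]          # ignore zero-length bars
    if lengths.size < 2:
        return 0.0
    p = lengths / lengths.sum()             # bar lengths as a probability vector
    entropy = -np.sum(p * np.log(p))
    return float(entropy / np.log(lengths.size))


rng = np.random.default_rng(0)
trained = np.array([[0.5, -1.0], [2.0, 0.1]])
for intensity in (0.0, 0.05, 0.2):          # graded randomization intensity
    sample = bayesianize(trained, intensity, rng)
    print(intensity, float(np.abs(sample - trained).max()))

print(normalized_persistent_entropy([(0.0, 1.0), (0.0, 4.0), (1.0, 1.5)]))
```

In practice the barcode would come from a persistent homology computation on the layer embeddings; that step is omitted here because the abstract does not say which software was used.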

    Landscapes of data sets and functoriality of persistent homology

    The aim of this article is to describe a new perspective on functoriality of persistent homology and explain its intrinsic symmetry, which is often overlooked. A data set for us is a finite collection of functions, called measurements, with a finite domain. Such a data set might contain internal symmetries, which are effectively captured by the action of a set of endomorphisms of the domain. Different choices of the set of endomorphisms encode different symmetries of the data set. We describe various category structures on such enriched data sets and prove some of their properties, such as decompositions and morphism formations. We also describe a data structure, based on coloured directed graphs, which is convenient for encoding this enrichment. We show that persistent homology preserves some, but not all, aspects of these landscapes of enriched data sets. In other words, persistent homology is not a functor on the entire category of enriched data sets. Nevertheless, we show that persistent homology is functorial locally. We use the concept of equivariant operators to capture some of the information missed by persistent homology.

    On the concept of permutant in the theory of group equivariant non-expansive operators

    The application of group equivariant non-expansive operators (GENEOs) in deep learning and topological data analysis has recently proven very effective. This thesis examines in depth the concept of permutant, on which a method for constructing such operators is based. In particular, it is shown that permutants are organised in a lattice structure with a maximum, and that this maximum is a group.
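The abstract does not spell out the construction, so the following is only a sketch under a commonly used definition, stated here as an assumption: a finite set H of bijections of the domain X is a permutant for a group G if g∘h∘g⁻¹ belongs to H for every h in H and g in G, and averaging φ∘h over H then gives a G-equivariant non-expansive operator. The toy domain (a cyclic grid), the group (generated by a rotation and a reflection), and all function names are hypothetical.

```python
import numpy as np


def conjugate(g, h):
    """Return g∘h∘g⁻¹ as an index array (a permutation p maps i to p[i])."""
    g = np.asarray(g)
    h = np.asarray(h)
    g_inv = np.argsort(g)                   # inverse permutation of g
    return g[h[g_inv]]


def is_permutant(H, generators):
    """Check that the finite set H is closed under conjugation by the generators.

    For a finite H, closure under the generators implies closure under the
    whole group they generate.
    """
    members = {tuple(np.asarray(h).tolist()) for h in H}
    return all(tuple(conjugate(g, h).tolist()) in members
               for g in generators for h in H)


def permutant_operator(phi, H):
    """Average of phi∘h over h in H (phi[h] is the composition phi∘h)."""
    phi = np.asarray(phi, dtype=float)
    return np.mean([phi[np.asarray(h)] for h in H], axis=0)


n = 8
idx = np.arange(n)
rotation = (idx + 1) % n                    # i -> i + 1 (mod n)
reflection = (-idx) % n                     # i -> -i (mod n)
H = [(idx + 1) % n, (idx - 1) % n]          # the two nearest-neighbour shifts

print(is_permutant(H, [rotation, reflection]))

phi = np.random.default_rng(0).random(n)
for g in (rotation, reflection):
    # Equivariance: F(phi∘g) equals F(phi)∘g
    print(np.allclose(permutant_operator(phi[g], H),
                      permutant_operator(phi, H)[g]))
```

The single shift `(idx + 1) % n` on its own is not a permutant for this group, because conjugating it by the reflection yields the opposite shift; including both shifts restores closure.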

    On the representation of linear group equivariant operators

    In recent years there has been growing interest in machine learning and topological data analysis. One possible approach to topological data analysis uses operators that are equivariant with respect to the action of a group, called GEOs. Such operators are useful both for approximating the natural pseudo-distance, through an approach that also involves persistent homology, a key tool in topological data analysis, and for studying neural networks from a topological-geometrical standpoint, since these can be decomposed into GEOs. A crucial problem is the construction of classes of GEOs in order to approximate the space of all operators. This thesis studies linear GEOs in depth. In the first chapter we study the mathematical context in which the research develops and define GEOs and GENEOs (group equivariant non-expansive operators). In the second chapter we introduce the concept of permutant measure and show how it can be used to construct a linear GEO. The third chapter deals with the representability of linear GEOs: we show that, under suitable assumptions, every linear GEO can be associated with a permutant measure.

    Why we should use topological data analysis in ageing: Towards defining the “topological shape of ageing”

    Living systems are subject to the arrow of time; from birth, they undergo complex transformations (self-organization) in a constant battle for survival, but inevitably ageing and disease lead them to death. Can ageing be understood and eventually reversed? What tools can be employed to further our understanding of ageing? The present article is an invitation for biologists and clinicians to consider key conceptual ideas and computational tools (known to mathematicians and physicists) that may help dissect some of the underlying processes of ageing and disease. Specifically, we first discuss how to classify and analyse complex systems, and highlight critical theoretical difficulties that make complex systems hard to study. Subsequently, we introduce Topological Data Analysis, a novel Big Data tool that may help in the study of complex systems, since it extracts knowledge from data in a holistic approach via topological considerations. These conceptual ideas and tools are discussed in a relatively informal way to pave the way for future discussions and collaborations between mathematicians and biologists studying ageing. Funding: Basque Government under the grant “Artificial Intelligence in BCAM number EXP. 2019/00432”; Inria associated team "NeuroTransSF

    Diketo acid inhibitors of nsp13 of SARS-CoV-2 block viral replication

    For RNA viruses, RNA helicases have long been recognized to play critical roles during virus replication cycles, facilitating proper folding and replication of viral RNAs, and therefore represent an ideal target for drug discovery. The SARS-CoV-2 helicase, non-structural protein 13 (nsp13), is highly conserved among all known coronaviruses and is currently one of the most explored viral targets for identifying new possible antiviral agents. In the present study, we present six diketo acids (DKAs) as nsp13 inhibitors able to block both SARS-CoV-2 nsp13 enzymatic functions. Among them, four compounds were able to inhibit viral replication in the low micromolar range, and were also active on other human coronaviruses such as HCoV229E and MERS CoV. The experimental investigation of the binding mode revealed ATP-non-competitive kinetics of inhibition, not affected by substrate-displacement effect, suggesting an allosteric binding mode. This was further supported by molecular modelling calculations predicting binding to a conserved allosteric site located in the RecA2 domain.

    Towards a topological–geometrical theory of group equivariant non-expansive operators for data analysis and machine learning

    The research carried out by M.G.B. was supported by the European Research Council (Advanced Investigator Grant 671251 to Z.F. Mainen), the Champalimaud Foundation (Z.F. Mainen) and a GPU NVIDIA grant. The research carried out by P.F. and N.Q. was partially supported by GNSAGA-INdAM (Italy). We provide a general mathematical framework for group and set equivariance in machine learning. We define group equivariant non-expansive operators (GENEOs) as maps between function spaces associated with groups of transformations. We study the topological and metric properties of the space of GENEOs to evaluate their approximating power and set the basis for general strategies to initialize and compose operators. We define suitable pseudo-metrics for the function spaces, the equivariance groups and the set of non-expansive operators. We prove that, under suitable assumptions, the space of GENEOs is compact and convex. These results provide fundamental guarantees in a machine learning perspective. By considering isometry-equivariant non-expansive operators, we describe a simple strategy to select and sample operators. Thereafter, we show how selected and sampled operators can be used both to perform classical metric learning and to inject knowledge in artificial neural networks. Bergomi, Mattia G.; Frosini, Patrizio; Giorgi, Daniela; Quercioli, Nicola