
    Quantum Algorithms for Interpolation and Sampling

    Gibbs sampling from continuous real-valued functions is a challenging problem of interest in machine learning. Here we leverage quantum Fourier transforms to build a quantum algorithm for this task when the function is periodic. We use quantum algorithms for solving linear ordinary differential equations to solve the Fokker–Planck equation and prepare a quantum state encoding the Gibbs distribution. We show that the efficiency of interpolation and differentiation of these functions on a quantum computer depends on the rate of decay of the Fourier coefficients of the function. We view this property as a concentration of measure in the Fourier domain, and also provide functional analytic conditions for it. Our algorithm makes zeroth-order queries to a quantum oracle of the function. Despite suffering from an exponentially long mixing time, this algorithm allows for exponentially improved precision in sampling, and polynomial quantum speedups in mean estimation in the general case, and particularly under geometric conditions we identify for the critical points of the energy function.

    Enhancing Neuromorphic Computing with Advanced Spiking Neural Network Architectures

    This dissertation proposes ways to address current limitations of neuromorphic computing in order to create energy-efficient and adaptable systems for AI applications. It does so by designing novel spiking neural network architectures with improved performance. Specifically, the two proposed architectures address the issues of training complexity, hyperparameter selection, computational flexibility, and scarcity of neuromorphic training data. The first architecture uses auxiliary learning to improve training performance and data usage, while the second architecture leverages the neuromodulation capability of spiking neurons to improve multitasking classification performance. The proposed architectures are tested on Intel's Loihi 2 neuromorphic chip using several neuromorphic datasets, such as N-MNIST, DVS-CIFAR10, and DVS128-Gesture. The presented results demonstrate the potential of the proposed architectures but also reveal some of their limitations, which are proposed as topics for future research.

    Referring to discourse participants in Ibero-Romance languages

    Synopsis: This volume brings together contributions by researchers focusing on personal pronouns in Ibero-Romance languages, going beyond the well-established variable of expressed vs. non-expressed subjects. While factors such as agreement morphology, topic shift and contrast or emphasis have been argued to account for variable subject expression, several corpus studies on Ibero-Romance languages have shown that the expression of subject pronouns goes beyond these traditionally established factors and is also subject to considerable dialectal variation. One of the factors affecting the choice and expression of personal pronouns or other referential devices is whether the construction is used personally or impersonally. The use and emergence of new impersonal constructions, and eventually also new (im)personal pronouns, as well as the variation found in the expression of human impersonality in different Ibero-Romance language varieties, is another interesting research area that has gained ground in recent years. In addition to variable subject expression, similar methods and theoretical approaches have been applied to study the expression of objects. Finally, reference to the addressee(s) using different address pronouns and other address forms is an important field of study that is closely connected to the variable expression of pronouns. The present book sheds light on all these aspects of reference to discourse participants. The volume contains contributions with a strong empirical background, drawing on various methods and on both written and spoken corpus data from Ibero-Romance languages. The focus on discourse participants highlights the special properties of first and second person referents and the factors affecting them, which are often different from those affecting the anaphoric third person. The chapters are organized into three thematic sections: (i) Variable expression of subjects and objects, (ii) Between personal and impersonal, and (iii) Reference to the addressee.

    Leveraging elasticity theory to calculate cell forces: From analytical insights to machine learning

    Living cells possess capabilities to detect and respond to mechanical features of their surroundings. In traction force microscopy, the traction of cells on an elastic substrate is made visible by observing substrate deformation as measured by the movement of embedded marker beads. Describing the substrates by means of elasticity theory, we can calculate the adhesive forces, improving our understanding of cellular function and behavior. In this dissertation, I combine analytical solutions with numerical methods and machine learning techniques to improve traction prediction in a range of experimental applications. I describe how to include the normal traction component in regularization-based Fourier approaches, which I apply to experimental data. I compare the dominant strategies for traction reconstruction, namely the direct method and inverse, regularization-based approaches, and find that the latter are more precise while the former is more resilient to noise. I find that a point-force-based reconstruction can be used to study the evolution of the force balance in response to microneedle pulling, showing a transition from a dipolar to a monopolar force arrangement. Finally, I show how a conditional invertible neural network not only reconstructs adhesive areas in a more localized manner, but also reveals spatial correlations and variations in the reliability of traction reconstructions.

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates led to a long period of stagnation in drug approvals. Due to the extreme costs associated with introducing a drug to the market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug-like’ molecules and the proteins they bind. Secondly, two deep-learning-based binding site comparison tools were developed, competing with the state of the art on benchmark datasets. The models are able to predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships and has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, the contributions made will aid in the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster.

    Robustness and Interpretability of Neural Networks’ Predictions under Adversarial Attacks

    Deep Neural Networks (DNNs) are powerful predictive models, exceeding human capabilities in a variety of tasks. They learn complex and flexible decision systems from the available data and achieve exceptional performance in multiple machine learning fields, spanning from applications in artificial intelligence, such as image, speech and text recognition, to the more traditional sciences, including medicine, physics and biology. Despite the outstanding achievements, high performance and high predictive accuracy are not sufficient for real-world applications, especially in safety-critical settings, where the usage of DNNs is severely limited by their black-box nature. There is an increasing need to understand how predictions are performed, to provide uncertainty estimates, to guarantee robustness to malicious attacks and to prevent unwanted behaviours. State-of-the-art DNNs are vulnerable to small perturbations in the input data, known as adversarial attacks: maliciously crafted manipulations of the inputs that are perceptually indistinguishable from the original samples but are capable of fooling the model into incorrect predictions. In this work, we prove that such brittleness is related to the geometry of the data manifold and is therefore likely to be an intrinsic feature of DNNs’ predictions. This negative condition suggests a possible direction to overcome such limitation: we study the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks and prove that, in this limit, they are immune to gradient-based adversarial attacks. Furthermore, we propose some training techniques to improve the adversarial robustness of deterministic architectures.
In particular, we experimentally observe that ensembles of NNs trained on random projections of the original inputs into lower-dimensional spaces are more resilient to the attacks. Next, we focus on the problem of interpretability of NNs’ predictions in the setting of saliency-based explanations. We analyze the stability of the explanations under adversarial attacks on the inputs and we prove that, in the large-data and overparameterized limit, Bayesian interpretations are more stable than those provided by deterministic networks. We validate this behaviour in multiple experimental settings in the finite data regime. Finally, we introduce the concept of adversarial perturbations of amino acid sequences for protein Language Models (LMs). Deep Learning models for protein structure prediction, such as AlphaFold2, leverage Transformer architectures and their attention mechanism to capture structural and functional properties of amino acid sequences. Despite the high accuracy of predictions, biologically small perturbations of the input sequences, or even single point mutations, can lead to substantially different 3D structures. On the other hand, protein language models are insensitive to mutations that induce misfolding or dysfunction (e.g. missense mutations). Specifically, predictions of the 3D coordinates do not reveal the structure-disruptive effect of these mutations. Therefore, there is an evident inconsistency between the biological importance of mutations and the resulting change in structural prediction. Inspired by this problem, we introduce the concept of adversarial perturbation of protein sequences in continuous embedding spaces of protein language models. Our method relies on attention scores to detect the most vulnerable amino acid positions in the input sequences. Adversarial mutations are biologically distinct from their references and are able to significantly alter the resulting 3D structures.

    Turbulence closure with small, local neural networks: Forced two-dimensional and β-plane flows

    We parameterize sub-grid scale (SGS) fluxes in sinusoidally forced two-dimensional turbulence on the β-plane at high Reynolds numbers (Re ∼ 25000) using simple 2-layer Convolutional Neural Networks (CNN) having only O(1000) parameters, two orders of magnitude smaller than recent studies employing deeper CNNs with 8-10 layers; we obtain stable, accurate, and long-term online or a posteriori solutions at 16× downscaling factors. Our methodology significantly improves training efficiency and speed of online Large Eddy Simulations (LES) runs, while offering insights into the physics of closure in such turbulent flows. Our approach benefits from extensive hyperparameter searching in learning rate and weight decay coefficient space, as well as the use of cyclical learning rate annealing, which leads to more robust and accurate online solutions compared to fixed learning rates. Our CNNs use either the coarse velocity or the vorticity and strain fields as inputs, and output the two components of the deviatoric stress tensor. We minimize a loss between the SGS vorticity flux divergence (computed from the high-resolution solver) and that obtained from the CNN-modeled deviatoric stress tensor, without requiring energy or enstrophy preserving constraints. The success of shallow CNNs in accurately parameterizing this class of turbulent flows implies that the SGS stresses have a weak non-local dependence on coarse fields; it also aligns with our physical conception that small scales are locally controlled by larger scales such as vortices and their strained filaments. Furthermore, 2-layer CNN parameterizations are more likely to be interpretable and generalizable because of their intrinsic low dimensionality. Comment: 27 pages, 13 figures
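
The abstract above mentions cyclical learning rate annealing without giving the schedule. As an illustration only, one common cyclical form is a triangular schedule that ramps linearly between a lower and an upper bound; the function name, bounds and cycle length below are hypothetical and are not taken from the paper.

```python
import math


def triangular_lr(step, base_lr=1e-4, max_lr=1e-2, half_cycle=100):
    """Triangular cyclical learning rate (illustrative sketch).

    All default values are hypothetical placeholders, not the paper's settings.
    The rate rises linearly from base_lr to max_lr over half_cycle steps,
    then falls back to base_lr over the next half_cycle steps, and repeats.
    """
    # Index of the current cycle, starting at 1.
    cycle = math.floor(1 + step / (2 * half_cycle))
    # Distance from the peak of the current cycle, in units of half_cycle.
    x = abs(step / half_cycle - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Usage: learning rates at the start, peak and end of the first cycle.
rates = [triangular_lr(s) for s in (0, 100, 200)]
```

A cosine-shaped cycle would be an equally plausible reading of "annealing"; the triangular form is used here only because it is the simplest to verify by hand.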

    Contributions to improve the technologies supporting unmanned aircraft operations

    International Mention in the doctoral degree. Unmanned Aerial Vehicles (UAVs), in their smaller versions known as drones, are becoming increasingly important in today's societies. The systems that make them up present a multitude of challenges, of which error can be considered the common denominator. The perception of the environment is measured by sensors that have errors; the models that interpret the information and/or define behaviors are approximations of the world and therefore also have errors. Explaining error allows extending the limits of deterministic models to address real-world problems. The performance of the technologies embedded in drones depends on our ability to understand, model, and control the error of the systems that integrate them, as well as of new technologies that may emerge. Flight controllers integrate various subsystems that are generally dependent on other systems. One example is the guidance system. Guidance systems provide the engine's propulsion controller with the information necessary to accomplish a desired mission. For this purpose, they include a guidance control law that reacts to the information perceived by the perception and navigation systems. The error of any of these subsystems propagates through the ecosystem of the controller, so the study of each of them is essential. On the other hand, among the strategies for error control are state-space estimators, where the Kalman filter has been a great ally of engineers since its appearance in the 1960s. Kalman filters are at the heart of information fusion systems, minimizing the error covariance of the system and allowing the measured states to be filtered and estimated in the absence of observations. State Space Models (SSM) are developed based on a set of hypotheses for modeling the world. Among the assumptions are that the models of the world must be linear and Markovian, and that the error of their models must be Gaussian.
In general, systems are not linear, so linearizations are performed on models that are already approximations of the world. In other cases, the noise to be controlled is not Gaussian, but it is approximated by that distribution in order to be able to deal with it. On the other hand, many systems are not Markovian, i.e., their states do not depend only on the previous state, but there are other dependencies that state space models cannot handle. This thesis presents a collection of studies in which error is formulated and reduced. First, the error in a computer vision-based precision landing system is studied; then estimation and filtering problems are addressed from the deep learning approach. Finally, classification concepts with deep learning over trajectories are studied. The first case of the collection studies the consequences of error propagation in a machine vision-based precision landing system. This paper proposes a set of strategies to reduce the impact on the guidance system, and ultimately reduce the error. The next two studies approach the estimation and filtering problem from the deep learning approach, where error is a function to be minimized by learning. The last case of the collection deals with a trajectory classification problem with real data. This work completes the two main fields in deep learning, regression and classification, where the error is considered as a probability function of class membership.
I would like to thank the Ministry of Science and Innovation for granting me the funding with reference PRE2018-086793, associated with the project TEC2017-88048-C2-2-R, which provided me with the opportunity to carry out all my PhD activities, including completing an international research internship.
Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. President: Antonio Berlanga de Jesús. Secretary: Daniel Arias Medina. Member: Alejandro Martínez Cav
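
The abstract above describes Kalman filters as minimizing the error covariance and estimating states when no observation is available. A minimal sketch of that behaviour, assuming a scalar random-walk state with Gaussian noise (the function name, noise variances and initial values are hypothetical and do not come from the thesis), could look like this:

```python
def kalman_1d(measurements, q=1e-3, r=0.1, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state (illustrative sketch).

    q: process noise variance, r: measurement noise variance (both hypothetical).
    A measurement of None marks a time step with no observation.
    Returns a list of (state estimate, error variance) pairs.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the random-walk model keeps the state; uncertainty grows by q.
        p = p + q
        if z is not None:
            # Update: blend prediction and measurement using the Kalman gain.
            k = p / (p + r)
            x = x + k * (z - x)
            p = (1.0 - k) * p
        estimates.append((x, p))
    return estimates


# Usage: the second step has no observation, so only the prediction is applied.
track = kalman_1d([1.0, None, 1.0])
```

When an observation is missing, the estimate is carried forward unchanged while its variance grows, which is the linear, Markovian and Gaussian setting whose limits the thesis discusses.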

    Data-driven deep-learning methods for the accelerated simulation of Eulerian fluid dynamics

    Deep-learning (DL) methods for the fast inference of the temporal evolution of fluid-dynamics systems, based on the previous recognition of features underlying large sets of fluid-dynamics data, have been studied. Specifically, models based on convolutional neural networks (CNNs) and graph neural networks (GNNs) were proposed and discussed. A U-Net, a popular fully-convolutional architecture, was trained to infer wave dynamics on liquid surfaces surrounded by walls, given as input the system state at previous time-points. A term penalising the error of the spatial derivatives was added to the loss function, which resulted in a suppression of spurious oscillations and a more accurate location and length of the predicted wavefronts. This model proved to generalise accurately to complex wall geometries not seen during training. As opposed to the image data-structures processed by CNNs, graphs offer more freedom in how data is organised and processed. This motivated the use of graphs to represent the state of fluid-dynamic systems discretised by unstructured sets of nodes, and of GNNs to process such graphs. Graphs have enabled more accurate representations of curvilinear geometries and higher-resolution placement exclusively in areas where the physics is more challenging to resolve. Two novel GNN architectures were designed for fluid-dynamics inference: the MuS-GNN, a multi-scale GNN, and the REMuS-GNN, a rotation-equivariant multi-scale GNN. Both architectures work by repeatedly passing messages from each node to its nearest nodes in the graph. Additionally, lower-resolution graphs, with a reduced number of nodes, are defined from the original graph, and messages are also passed from finer to coarser graphs and vice versa. The low-resolution graphs allowed for efficiently capturing physics encompassing a range of length scales.
Advection and fluid flow, modelled by the incompressible Navier-Stokes equations, were the two types of problems used to assess the proposed GNNs. Whereas a single-scale GNN was sufficient to achieve high generalisation accuracy in advection simulations, flow simulation benefited greatly from an increasing number of low-resolution graphs. The generalisation and long-term accuracy of these simulations were further improved by the REMuS-GNN architecture, which processes the system state independently of the orientation of the coordinate system thanks to a rotation-invariant representation and carefully designed components. To the best of the author’s knowledge, the REMuS-GNN architecture was the first rotation-equivariant and multi-scale GNN. The simulations were accelerated by between one (on a CPU) and three (on a GPU) orders of magnitude with respect to a CPU-based numerical solver. Additionally, the parallelisation of multi-scale GNNs resulted in a close-to-linear speedup with the number of CPU cores or GPUs.
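
The abstract above describes adding a term to the loss that penalises the error of the spatial derivatives. A rough sketch of that idea, assuming 2D fields on a regular grid, finite-difference derivatives and a hypothetical weighting factor (none of these details are given in the abstract), is:

```python
import numpy as np


def gradient_penalised_mse(pred, target, weight=0.1):
    """MSE on field values plus weighted MSE on spatial gradients (sketch).

    pred and target are assumed to be 2D arrays on the same regular grid;
    weight is a hypothetical hyperparameter balancing the two terms.
    """
    value_term = np.mean((pred - target) ** 2)
    # For 2D input, np.gradient returns one finite-difference array per axis.
    grad_term = sum(
        np.mean((gp - gt) ** 2)
        for gp, gt in zip(np.gradient(pred), np.gradient(target))
    )
    return value_term + weight * grad_term


# Usage: a linear ramp compared against a flat field.
flat = np.zeros((4, 4))
ramp = np.tile(np.arange(4.0), (4, 1))
loss = gradient_penalised_mse(ramp, flat)
```

In a training setting the same expression would be written with a differentiable array library; NumPy is used here only to keep the arithmetic easy to check.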

    Search for the lepton flavour violating decay τ+ → μ+μ−μ+ with the LHCb experiment

    This thesis presents a search for the lepton flavour violating decay τ+ → μ+μ−μ+ using data collected by the LHCb experiment in proton-proton collisions at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 5.4 fb⁻¹. As of today, no lepton flavour violation has been observed in decays of charged leptons. When accounting for neutrino oscillations in the Standard Model, such processes are expected at undetectable decay rates, thus the observation of the τ+ → μ+μ−μ+ decay would be a clear sign of new physics. Even if the decay is not observed, improving the upper limit on the branching fraction of τ+ → μ+μ−μ+ would allow us to constrain theories of physics beyond the Standard Model that predict a decay rate within the sensitivity of the current experimental facilities. The main challenge of the analysis is the identification and rejection of the background sources, originating from accidental combinations of tracks wrongly associated with a τ decay and from the misidentification of hadrons (π±, K±) coming from D+ or Ds+ decays. The analysis is designed with the signal region removed from the data in order to avoid biases in the optimisation of the signal selection. The background suppression is achieved by means of multivariate classification models trained to distinguish between signal and background events. The number of τ leptons produced in the LHCb acceptance is normalised to the yield observed in the Ds+ → ϕ(μ+μ−)π+ channel. An expected upper limit of B(τ+ → μ+μ−μ+) < 1.8 (2.1) × 10⁻⁸ is found at 90% (95%) confidence level using the CLs method.