269 research outputs found

    Differentiable Tree Operations Promote Compositional Generalization

    Full text link
    In the context of structure-to-structure transformation tasks, learning sequences of discrete symbolic operations poses significant challenges due to their non-differentiability. To facilitate the learning of these symbolic sequences, we introduce a differentiable tree interpreter that compiles high-level symbolic tree operations into subsymbolic matrix operations on tensors. We present a novel Differentiable Tree Machine (DTM) architecture that integrates our interpreter with an external memory and an agent that learns to sequentially select tree operations to execute the target transformation in an end-to-end manner. On out-of-distribution compositional generalization for synthetic semantic parsing and language generation tasks, DTM achieves 100% accuracy while existing baselines such as Transformer, Tree Transformer, LSTM, and Tree2Tree LSTM achieve less than 30%. In addition to this perfect performance, DTM remains highly interpretable.
    Comment: ICML 2023. Code available at https://github.com/psoulos/dt
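
    The key move in the abstract is compiling symbolic tree operations into matrix operations so gradients can flow. Below is a minimal sketch of that idea using tensor product representations; the three-role layout and the `car` construction are our illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

# Sketch: embed a tree as a tensor product representation (TPR) so that a
# symbolic operation such as car becomes a fixed linear map, hence
# differentiable. Illustrative simplification, not the DTM implementation.

d = 4                        # filler (symbol) embedding dimension
roles = np.eye(3)            # orthonormal role vectors: 0=root, 1=left, 2=right
ROOT, LEFT, RIGHT = 0, 1, 2

def embed(bindings):
    """Tree as a filler-by-role matrix: T = sum_i f_i r_i^T."""
    T = np.zeros((d, 3))
    for f, r in bindings:
        T += np.outer(f, roles[r])
    return T

# car extracts the left subtree via a fixed linear map on roles: left -> root.
D_car = np.zeros((3, 3))
D_car[ROOT, LEFT] = 1.0
car = lambda T: T @ D_car.T

f_a, f_b = np.random.randn(d), np.random.randn(d)
T = embed([(f_a, ROOT), (f_b, LEFT)])
assert np.allclose(car(T)[:, ROOT], f_b)   # left child promoted to root
```

    Because `car` is just a matrix product, an agent can output a soft mixture of such operation matrices and still receive gradients end to end.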

    Stochastic compartmental models and CD8+ T cell exhaustion

    Get PDF
    In this PhD thesis, mathematical models for cell differentiation are presented. Cell differentiation is a widely observed process in cellular biology that allows a small pool of unspecialised cells to develop and maintain a larger population of cells with a specific function. Different mathematical techniques are employed in this thesis to study the cell differentiation process. We propose a time-independent stochastic mathematical model that represents a general differentiation process as a sequence of compartments. Since we are interested in the ultimate fate of the system, we define a discrete-time branching process and consider the impact, on the final population, of cells passing through one or multiple compartments. Further, we include time dependency and define a continuous-time Markov chain to analyse cell dynamics along the sequence of compartments over time. We also focus on the journey of a single cell over time and compute a number of summary statistics of interest. Moreover, the impact of different types of differentiation events is considered, and numerical results inspired by biological applications, mainly related to immunology, are summarised to illustrate our theoretical approach and methods. In the last chapter, we focus on a specific cell differentiation process: cells of the immune system have been observed to differentiate towards a dysfunctional state, called exhaustion, during chronic infection or cancer. One of the aims of this PhD thesis is to shed light on the exhaustion-differentiation process of CD8+ T cells and its reversibility, which is a topic of interest for the current and future development of immunotherapies. In particular, based on data collected by the Kaech Lab, several deterministic mathematical models are defined to investigate cells' trajectories towards the exhausted state as well as the duration of the antigen signal at early time points of stimulation.
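
    As a concrete illustration of the compartmental setup, the following is a hedged sketch in which cells in compartment i divide, die, or differentiate into compartment i+1, simulated as a continuous-time Markov chain with the Gillespie algorithm; the rates are illustrative placeholders, not values fitted in the thesis.

```python
import random

# Gillespie simulation of a sequence of compartments: each cell in
# compartment i can divide (birth), die (death), or move to compartment i+1
# (diff). All rates below are illustrative, not from the thesis.

def gillespie(counts, birth, death, diff, t_max):
    t, C = 0.0, len(counts)
    counts = list(counts)
    while t < t_max:
        # one propensity per (compartment, event-type) pair
        props = []
        for i, n in enumerate(counts):
            props += [(birth[i] * n, i, "birth"),
                      (death[i] * n, i, "death"),
                      (diff[i] * n, i, "diff")]
        total = sum(p for p, _, _ in props)
        if total == 0:
            break                               # population extinct
        t += random.expovariate(total)          # time to the next event
        u, acc = random.uniform(0, total), 0.0
        for p, i, kind in props:
            acc += p
            if u <= acc:
                if kind == "birth":
                    counts[i] += 1
                elif kind == "death":
                    counts[i] -= 1
                elif i + 1 < C:                 # differentiate downstream
                    counts[i] -= 1
                    counts[i + 1] += 1
                else:
                    counts[i] -= 1              # exit from the last compartment
                break
    return counts

print(gillespie([100, 0, 0], birth=[0.4] * 3, death=[0.1] * 3,
                diff=[0.3] * 3, t_max=5.0))
```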

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Unsupervised space-time learning in primary visual cortex

    Get PDF
    The mammalian visual system is an incredibly complex computational device, capable of performing the various tasks of seeing: navigation, pattern and object recognition, motor coordination, and trajectory extrapolation, among others. Decades of research have shown that experience-dependent plasticity of cortical circuitry underlies the impressive ability to rapidly learn many of these tasks and to adjust as required. One particular thread of investigation has focused on unsupervised learning, wherein changes to the visual environment lead to corresponding changes in cortical circuits. The most prominent example of unsupervised learning is ocular dominance plasticity, caused by visual deprivation to one eye and leading to a dramatic re-wiring of cortex. Other examples tend to make more subtle changes to the visual environment through passive exposure to novel visual stimuli. Here, we use one such unsupervised paradigm, sequence learning, to study experience-dependent plasticity in the mouse visual system. Through a combination of theory and experiment, we argue that the mammalian visual system is an unsupervised learning device. Beginning with a mathematical exploration of unsupervised learning in biology, engineering, and machine learning, we seek a more precise expression of our fundamental hypothesis. We draw connections between information theory, efficient coding, and common unsupervised learning algorithms such as Hebbian plasticity and principal component analysis. Efficient coding suggests a simple rule for transmitting information in the nervous system: use more spikes to encode unexpected information, and fewer spikes to encode expected information. Therefore, expectation violations ought to produce prediction errors, or brief periods of heightened firing when an unexpected event occurs. Meanwhile, modern unsupervised learning algorithms show how such expectations can be learned. Next, we review data from decades of visual neuroscience research, highlighting the computational principles and synaptic plasticity processes that support biological learning and seeing. By tracking the flow of visual information from the retina to thalamus and primary visual cortex, we discuss how the principle of efficient coding is evident in neural activity. One common example is predictive coding in the retina, where ganglion cells with canonical center-surround receptive fields compute a prediction error, sending spikes to the central nervous system only in response to locally-unpredictable visual stimuli. This behavior can be learned through simple Hebbian plasticity mechanisms. Similar models explain much of the activity of neurons in primary visual cortex, but we also discuss ways in which the theory fails to capture the rich biological complexity. Finally, we present novel experimental results from physiological investigations of the mouse primary visual cortex. We trained mice by passively exposing them to complex spatiotemporal patterns of light: rapidly-flashed sequences of images. We find evidence that visual cortex learns these sequences in a manner consistent with efficient coding, such that unexpected stimuli tend to elicit more firing than expected ones. Overall, we observe dramatic changes in evoked neural activity across days of passive exposure. Neural responses to the first, unexpected sequence element increase with days of training, while responses at other, expected time points either decrease or stay the same. Furthermore, substituting an unexpected element for an expected one or omitting an expected element both cause brief bursts of increased firing. Our results therefore provide evidence for unsupervised learning and efficient coding in the mouse visual system, especially because unexpected events drive prediction errors. Overall, our analysis suggests novel experiments, which could be performed in the near future, and provides a useful framework to understand visual perception and learning.
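
    One concrete instance of the connection drawn above between Hebbian plasticity and principal component analysis is Oja's rule: a Hebbian update with a decay term that drives a linear neuron's weight vector toward the first principal component of its inputs. A small self-contained illustration (not from the thesis):

```python
import numpy as np

# Oja's rule: w += lr * y * (x - y * w), a Hebbian term y*x stabilised by the
# decay -y^2 w. The weights converge to the first principal component.

rng = np.random.default_rng(0)
# Synthetic inputs with most variance along the direction (1, 1) / sqrt(2).
X = rng.normal(size=(5000, 2)) @ np.diag([3.0, 0.5])
X = X @ (np.array([[1, 1], [-1, 1]]) / np.sqrt(2))

w, lr = rng.normal(size=2), 1e-3
for x in X:
    y = w @ x
    w += lr * y * (x - y * w)

pc1 = np.linalg.svd(X, full_matrices=False)[2][0]
print(abs(w @ pc1) / np.linalg.norm(w))   # ~1: aligned with the first PC
```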

    Learned interpreters : structural and learned systematicity in neural networks for program execution

    Full text link
    General purpose deep neural network architectures have made startling advances in machine learning for code, advancing code completion, enabling natural language programming, detecting and repairing bugs, and even solving competitive programming problems at a human level of performance. Nevertheless, these methods struggle to understand the execution behavior of code, even when it is code they write themselves. To this end, we explore interpreter-inspired neural network architectures, introducing a novel architecture family called instruction pointer attention graph neural networks (IPA-GNN).
We apply this family of approaches to several tasks that require reasoning about the execution behavior of programs: learning to execute full and partial programs, code coverage prediction for hardware verification, and predicting runtime errors in competition programs. Through this series of works we make several contributions and encounter multiple surprising and promising results. We introduce a Python library for constructing graph representations of programs for use in machine learning research, which serves as a bedrock for the research in this thesis and in the broader research community. We also introduce rich, large-scale datasets of programs annotated with program behavior, such as outputs and raised errors, to facilitate research in this domain. We find that IPA-GNN methods exhibit improved strong generalization over general purpose methods, performing well when trained to execute only short programs but tested on significantly longer ones. In fact, we find that IPA-GNN methods outperform generic methods on each of the behavior modeling tasks we consider, across both hardware and software domains. We even find that interpreter-inspired methods that model exception handling explicitly have a desirable interpretability property, enabling the prediction of error locations even when trained only on error presence and kind. In total, interpreter-inspired architectures like the IPA-GNN represent a promising path forward for imbuing neural networks with novel capabilities for learning to reason about program executions.
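
    The distinctive mechanism here is a soft instruction pointer. Below is a schematic sketch of that idea as we read it from the abstract: a probability distribution over statements whose mass splits across control-flow successors, with hidden states carried along the same edges. The toy graph, branch probabilities, and names are placeholders; in IPA-GNN the branch weights and state updates are learned, and this is not the released code.

```python
import numpy as np

# Soft instruction pointer over a 4-statement toy program; statement 3 is the
# exit node. Each edge carries a branch probability (learned in IPA-GNN,
# hard-coded here for illustration).

n = 4
succ = {0: [(1, 1.0)],                 # straight-line edge
        1: [(2, 0.7), (3, 0.3)],       # soft branch: 70% loop body, 30% exit
        2: [(1, 1.0)],                 # loop back
        3: [(3, 1.0)]}                 # exit self-loop

p = np.zeros(n); p[0] = 1.0            # pointer starts at statement 0
h = np.zeros((n, 8))                   # per-statement hidden states

def step(p, h):
    p2, h2 = np.zeros_like(p), np.zeros_like(h)
    for s, edges in succ.items():
        for t, b in edges:
            p2[t] += p[s] * b          # pointer mass flows along the edge
            h2[t] += p[s] * b * h[s]   # hidden state flows with the mass
    return p2, h2

for _ in range(20):
    p, h = step(p, h)
print(p)   # probability mass accumulates on the exit node
```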

    Compositional synthesis of reactive systems

    Get PDF
    Synthesis is the task of automatically deriving correct-by-construction implementations from formal specifications. While it is a promising path toward developing verified programs, it is infamous for being hard to solve. Compositionality is recognized as a key technique for reducing the complexity of synthesis. So far, compositional approaches require extensive manual effort. In this thesis, we introduce algorithms that automate these steps. In the first part, we develop compositional synthesis techniques for distributed systems. Providing assumptions on other processes' behavior is fundamental in this setting due to inter-process dependencies. We establish delay-dominance, a new requirement for implementations that allows for implicitly assuming that other processes will not maliciously violate the shared goal. Furthermore, we present an algorithm that computes explicit assumptions on process behavior to address more complex dependencies. In the second part, we transfer the concept of compositionality from distributed to single-process systems. We present a preprocessing technique for synthesis that identifies independently synthesizable system components. We extend this approach to an incremental synthesis algorithm, resulting in more fine-grained decompositions. Our experimental evaluation shows that our techniques automate the required manual efforts, resulting in fully automated compositional synthesis algorithms for both distributed and single-process systems.
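
    As a rough illustration of the preprocessing step described above, one plausible way to identify independently synthesizable components is to connect output variables that co-occur in a specification conjunct and take connected components. This is a hedged sketch under that assumption, with toy conjuncts and crude substring matching standing in for a real LTL parser.

```python
# Partition output variables into connected components via union-find:
# two outputs land in the same component if some conjunct mentions both.
# The conjuncts are plain strings here; a real tool would parse the formula.

def components(conjuncts, outputs):
    parent = {o: o for o in outputs}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]      # path halving
            x = parent[x]
        return x
    for phi in conjuncts:
        # Crude substring matching ("grant1" also matches "grant1b"); this is
        # conservative, since it can only merge components, never split them.
        mentioned = [o for o in outputs if o in phi]
        for a, b in zip(mentioned, mentioned[1:]):
            parent[find(a)] = find(b)          # union outputs sharing a conjunct
    groups = {}
    for o in outputs:
        groups.setdefault(find(o), []).append(o)
    return list(groups.values())

spec = ["G (req1 -> F grant1)", "G (req2 -> F grant2)", "G !(grant1 & grant1b)"]
print(components(spec, ["grant1", "grant1b", "grant2"]))
# [['grant1', 'grant1b'], ['grant2']] -> two independently synthesizable parts
```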

    Beam Tree Recursive Cells

    Full text link
    We propose Beam Tree Recursive Cell (BT-Cell), a backpropagation-friendly framework to extend Recursive Neural Networks (RvNNs) with beam search for latent structure induction. We further extend this framework by proposing a relaxation of the hard top-k operators in beam search for better propagation of gradient signals. We evaluate our proposed models on different out-of-distribution splits of both synthetic and realistic data. Our experiments show that BT-Cell achieves near-perfect performance on several challenging structure-sensitive synthetic tasks like ListOps and logical inference while maintaining comparable performance on realistic data against other RvNN-based models. Additionally, we identify a previously unknown failure case for neural models in generalizing to an unseen number of arguments in ListOps. The code is available at: https://github.com/JRC1995/BeamTreeRecursiveCells
    Comment: Accepted in ICML 2023
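
    For intuition about relaxing the hard top-k, here is a hedged sketch of one standard softening: take k successive softmaxes over the scores, hard-masking each selected item before the next round, so the selection weights remain differentiable in the scores. The paper's exact operator may differ from this.

```python
import torch

# Soft top-k via k successive tempered softmaxes. tau controls how close each
# selection vector is to a hard one-hot. Illustrative, not BT-Cell's operator.

def soft_top_k(scores, k, tau=0.1):
    """Return k soft selection vectors over `scores`."""
    mask = torch.zeros_like(scores)
    picks = []
    for _ in range(k):
        w = torch.softmax((scores + mask) / tau, dim=-1)  # soft argmax
        picks.append(w)
        mask = mask.clone()
        mask[int(torch.argmax(w))] = float("-inf")        # exclude next round
    return torch.stack(picks)

scores = torch.tensor([2.0, 0.5, 1.5, -1.0], requires_grad=True)
vals = torch.tensor([10.0, 20.0, 30.0, 40.0])
P = soft_top_k(scores, k=2)          # rows ~ one-hot for items 0 and 2
(P @ vals).sum().backward()          # expected value of each soft pick
print(scores.grad)                   # nonzero: gradients reach every score
```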