19 research outputs found

    Programming Using Automata and Transducers

    Automata, the simplest model of computation, have proven to be an effective tool in reasoning about programs that operate over strings. Transducers augment automata to produce outputs and have been used to model string and tree transformations such as natural language translations. The success of these models is primarily due to their closure properties and decidable procedures, but good properties come at the price of limited expressiveness. Concretely, most models only support finite alphabets and can only represent small classes of languages and transformations. We focus on addressing these limitations and bridge the gap between the theory of automata and transducers and complex real-world applications: Can we extend automata and transducer models to operate over structured and infinite alphabets? Can we design languages that hide the complexity of these formalisms? Can we define executable models that can process the input efficiently? First, we introduce succinct models of transducers that can operate over large alphabets and design BEX, a language for analysing string coders. We use BEX to prove the correctness of UTF and BASE64 encoders and decoders. Next, we develop a theory of tree transducers over infinite alphabets and design FAST, a language for analysing tree-manipulating programs. We use FAST to detect vulnerabilities in HTML sanitizers, check whether augmented reality taggers conflict, and optimize and analyze functional programs that operate over lists and trees. Finally, we focus on laying the foundations of stream processing of hierarchical data such as XML files and program traces. We introduce two new efficient and executable models that can process the input in a left-to-right linear pass: symbolic visibly pushdown automata and streaming tree transducers. Symbolic visibly pushdown automata are closed under Boolean operations and can specify and efficiently monitor complex properties for hierarchical structures over infinite alphabets. 
Streaming tree transducers can express and efficiently process complex XML transformations while enjoying decidable procedures.

    The category of MSO transductions

    MSO transductions are binary relations between structures that are defined using monadic second-order logic. MSO transductions form a category, since they are closed under composition. We show that many notions from language theory, such as recognizability or tree decompositions, can be defined in an abstract way that only refers to MSO transductions and their compositions.

    Linear High-Order Deterministic Tree Transducers with Regular Look-Ahead

    We introduce the notion of high-order deterministic top-down tree transducers (HODT) whose outputs correspond to single-typed lambda-calculus formulas. These transducers are natural generalizations of known tree transducer models such as deterministic top-down tree transducers, macro tree transducers, and streaming tree transducers. We focus on the linear restriction of high-order tree transducers with look-ahead (HODTR_lin) and prove that this class corresponds to the tree-to-tree functional transformations defined by monadic second-order (MSO) logic. We give a specialized procedure for the composition of those transducers that uses a flow analysis based on coherence spaces and allows us to preserve the linearity of transducers. This procedure has a better complexity than classical algorithms for composition of other equivalent tree transducers, but raises the order of transducers. However, we also indicate that the order of a HODTR_lin can always be bounded by 3, and give a procedure that reduces the order of a HODTR_lin to 3. As the resulting HODTR_lin can then be transformed into other equivalent models, this gives an important insight into composition algorithms for other classes of transducers. Finally, we prove that those results partially translate to the case of almost linear HODTR: the class corresponds to the class of tree transformations performed by MSO with unfolding (not closed under composition), and we provide a mechanism to reduce the order to 3 in this case.

    Doctor of Philosophy

    The efficient transport of particles throughout a cell plays a fundamental role in several cellular processes. Broadly speaking, intracellular transport can be divided into two categories: passive and active transport. Whereas passive transport generally occurs via diffusive processes, active transport requires cellular energy through adenosine triphosphate (ATP). Many active transport processes are driven by molecular motors such as kinesin and dynein, which carry cargo and travel along the microtubules of a cell to deliver specific material to specific locations. Breakdown of molecular motor delivery is correlated with the onset of several diseases, such as Alzheimer's and Parkinson's. We mathematically model two fundamental cellular processes. In the first part, we introduce a possible biophysical mechanism by which cells attain uniformity in vesicle density throughout their body. We do this by modeling bulk motor density dynamics using partial differential equations derived from microscopic descriptions of individual motor-cargo complex dynamics. We then consider the cases where delivery of cargo to cellular targets is (i) irreversible and (ii) reversible. This problem is studied on the semi-infinite interval, disk, and spherical domains. We also consider the case where exclusion effects come into play. In all cases, we find that allowing for reversibility in cargo delivery to cellular targets allows for more uniform vesicle distribution. In the second part, we see how active transport by molecular motors allows for length control and sensing in flagella and axons, respectively. For the flagellum, we model length control using a doubly stochastic Poisson model. For axons, we model bulk motor dynamics by partial differential equations, and show how spatial information may be encoded in the frequency of an oscillating chemical signal being carried by dynein motors.
Furthermore, we discuss how frequency-encoded signals may be decoded by cells, and how these mechanisms break down in the face of noise.

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    Foundations of Software Science and Computation Structures

    This open access book constitutes the proceedings of the 24th International Conference on Foundations of Software Science and Computation Structures, FOSSACS 2021, which was held from March 27 to April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg but was moved to an online format due to the COVID-19 pandemic. The 28 regular papers presented in this volume were carefully reviewed and selected from 88 submissions. They deal with research on theories and methods to support the analysis, integration, synthesis, transformation, and verification of programs and software systems.

    Programming Languages and Systems

    This open access book constitutes the proceedings of the 28th European Symposium on Programming, ESOP 2019, which took place in Prague, Czech Republic, in April 2019 as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019.

    Natural language generation as neural sequence learning and beyond

    Natural Language Generation (NLG) is the task of generating natural language (e.g., English sentences) from machine readable input. In the past few years, deep neural networks have received great attention from the natural language processing community due to impressive performance across different tasks. This thesis addresses NLG problems with deep neural networks from two different modeling views. Under the first view, natural language sentences are modelled as sequences of words, which greatly simplifies their representation and allows us to apply classic sequence modelling neural networks (i.e., recurrent neural networks) to various NLG tasks. Under the second view, natural language sentences are modelled as dependency trees, which are more expressive and allow us to capture linguistic generalisations, leading to neural models that operate on tree structures. Specifically, this thesis develops several novel neural models for natural language generation. Contrary to many existing models, which aim to generate a single sentence, we propose a novel hierarchical recurrent neural network architecture to represent and generate multiple sentences. Beyond the hierarchical recurrent structure, we also propose a means to model context dynamically during generation. We apply this model to the task of Chinese poetry generation and show that it outperforms competitive poetry generation systems. Neural-based natural language generation models usually work well when there is a lot of training data. When the training data is not sufficient, prior knowledge for the task at hand becomes very important. To this end, we propose a deep reinforcement learning framework to inject prior knowledge into neural-based NLG models and apply it to sentence simplification. Experimental results show promising performance using our reinforcement learning framework.
Both poetry generation and sentence simplification are tackled with models following the sequence learning view, where sentences are treated as word sequences. In this thesis, we also explore how to generate natural language sentences as tree structures. We propose a neural model that combines the advantages of syntactic structure and recurrent neural networks. More concretely, our model defines the probability of a sentence by estimating the generation probability of its dependency tree. At each time step, a node is generated based on the representation of the generated subtree. We show experimentally that this model achieves good performance in language modeling and can also generate dependency trees.


    It is well known that regular (or type 3) languages are equivalent to finite automata. Nevertheless, many other characterizations of this class of languages, in terms of computational devices and generative models, are present in the literature. For example, by suitably restricting more general models such as context-free grammars, pushdown automata, and Turing machines, which characterize wider classes of languages, it is possible to obtain formal models that generate or recognize regular languages only. The resulting formalisms provide alternative representations of type 3 languages that may be significantly more concise than other models that share the same expressive power. The goal of this work is to investigate these formal systems from a descriptional complexity perspective, or, in other words, to study the relationships between their sizes, namely the number of symbols used to write down their descriptions. We also present some results related to the famous open question posed by Sakoda and Sipser in 1978 concerning the cost, in number of states, of eliminating nondeterminism from finite automata by exploiting the ability of two-way deterministic automata to move the head back and forth on the input tape.

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume