54 research outputs found

    Pushdown automata in statistical machine translation

    This article describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite state automata representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first pass to address the results of PDA complexity analysis. We study in depth the experimental conditions and tradeoffs in which HiPDT can achieve state-of-the-art performance for large-scale SMT.

    Acta Cybernetica : Volume 19. Number 2.


    ON EXPRESSIVENESS, INFERENCE, AND PARAMETER ESTIMATION OF DISCRETE SEQUENCE MODELS

    Huge neural autoregressive sequence models have achieved impressive performance across different applications, such as NLP, reinforcement learning, and bioinformatics. However, some lingering problems (e.g., consistency and coherency of generated texts) continue to exist, regardless of the parameter count. In the first part of this thesis, we chart a taxonomy of the expressiveness of various sequence model families (Ch 3). In particular, we put forth complexity-theoretic proofs that string latent-variable sequence models are strictly more expressive than energy-based sequence models, which in turn are more expressive than autoregressive sequence models. Based on these findings, we introduce residual energy-based sequence models (Ch 4), a family of energy-based sequence models whose sequence weights can be evaluated efficiently and which perform competitively against autoregressive models. However, we show how unrestricted energy-based sequence models can suffer from uncomputability, and how such a problem is generally unfixable without knowledge of the true sequence distribution (Ch 5). In the second part of the thesis, we study practical sequence model families and algorithms based on theoretical findings in the first part of the thesis. We introduce neural particle smoothing (Ch 6), a family of approximate sampling methods that work with conditional latent variable models. We also introduce neural finite-state transducers (Ch 7), which extend weighted finite state transducers with the introduction of mark strings, allowing scoring transduction paths in a finite state transducer with a neural network. Finally, we propose neural regular expressions (Ch 8), a family of neural sequence models that are easy to engineer, allowing a user to design flexible weighted relations using Marked FSTs, and combine these weighted relations together with various operations.

    Tools and Algorithms for the Construction and Analysis of Systems

    This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. A total of 41 full papers presented in the proceedings were carefully reviewed and selected from 141 submissions. The volume also contains 7 tool papers, 6 Tool Demo papers, and 9 SV-Comp Competition Papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers.

    Acta Cybernetica : Volume 22. Number 2.


    Graph and Hypergraph Decompositions for Exact Algorithms

    This thesis studies exact exponential and fixed-parameter algorithms for hard graph and hypergraph problems. Specifically, we study two techniques that can be used in the development of such algorithms: (i) combinatorial decompositions of both the input instance and the solution, and (ii) evaluation of multilinear forms over semirings. In the first part of the thesis we develop new algorithms for graph and hypergraph problems based on techniques (i) and (ii). While these techniques are independently both useful, the work presented in this part is largely characterised by their joint application. That is, combining results from different pieces of the decompositions often takes the form of a multilinear form evaluation task, and on the other hand, decompositions offer the basic structure for dynamic-programming-style algorithms for the evaluation of multilinear forms. As main positive results of the first part, we give algorithms for three different problem families. First, we give a fast evaluation algorithm for linear forms defined by a disjointness matrix of small sets. This can be applied to obtain faster algorithms for counting maximum-weight objects of small size, such as k-paths in graphs. Second, we give a general framework for exponential-time algorithms for finding maximum-weight subgraphs of bounded tree-width, based on the theory of tree decompositions. Besides basic combinatorial problems, this framework has applications in learning Bayesian network structures. Third, we give a fixed-parameter algorithm for finding unbalanced vertex cuts, that is, vertex cuts that separate a small number of vertices from the rest of the graph. In the second part of the thesis we consider aspects of the complexity theory of linear forms over semirings, in order to better understand technique (ii). Specifically, we study how the presence of different algebraic catalysts in the ground semiring affects the complexity. As the main result, we show that there are linear forms that are easy to compute over semirings with idempotent addition, but difficult to compute over rings, unless the strong exponential time hypothesis fails.
    One of the fundamental goals of computer science is the development of efficient algorithms. From a theoretical perspective, an algorithm is usually considered efficient if its running time depends polynomially on the size of the input. However, there are computational problems for which no polynomial-time algorithms exist. For example, NP-hard problems cannot be solved in polynomial time if the common complexity assumption P ≠ NP holds. Nevertheless, we would often like to solve such hard problems. Two common approaches to solving exactly hard problems that are not solvable in polynomial time are (i) exponential algorithmics and (ii) parameterized algorithmics. Exponential-time algorithmics develops algorithms whose running time is still exponential in the size of the input but which avoid traversing the entire solution space; in other words, the aim is to develop less exponential algorithms. Parameterized algorithmics, in turn, aims to isolate the exponential dependence of the running time into a parameter that is independent of the input size. This dissertation presents exponential-time and parameterized algorithms for solving various hard graph and hypergraph problems exactly. The algorithms presented are based on two algorithmic techniques: (i) the evaluation of multilinear forms over various semirings and (ii) the use of combinatorial decompositions. In addition to the algorithms, the work examines complexity-theoretic questions related to these techniques, which helps in understanding the limitations of the techniques and their as yet unexploited possibilities.