
    Multiple Context-Free Tree Grammars: Lexicalization and Characterization

    Multiple (simple) context-free tree grammars are investigated, where "simple" means "linear and nondeleting". Every multiple context-free tree grammar that is finitely ambiguous can be lexicalized; i.e., it can be transformed into an equivalent one (generating the same tree language) in which each rule of the grammar contains a lexical symbol. Due to this transformation, the rank of the nonterminals increases by at most 1, and the multiplicity (or fan-out) of the grammar increases by at most the maximal rank of the lexical symbols; in particular, the multiplicity does not increase when all lexical symbols have rank 0. Multiple context-free tree grammars have the same tree generating power as multi-component tree adjoining grammars (provided the latter can use a root-marker). Moreover, every multi-component tree adjoining grammar that is finitely ambiguous can be lexicalized. Multiple context-free tree grammars have the same string generating power as multiple context-free (string) grammars and have polynomial time parsing algorithms. A tree language can be generated by a multiple context-free tree grammar if and only if it is the image of a regular tree language under a deterministic finite-copying macro tree transducer. Multiple context-free tree grammars can be used as a synchronous translation device. Comment: 78 pages, 13 figures.

    Composition closure of linear extended top-down tree transducers

    Algorithms and the Foundations of Software technology

    Algebraic decoder specification: coupling formal-language theory and statistical machine translation

    The specification of a decoder, i.e., a program that translates sentences from one natural language into another, is an intricate process, driven by the application and lacking a canonical methodology. The practical nature of decoder development inhibits the transfer of knowledge between theory and application, which is unfortunate because many contemporary decoders are in fact related to formal-language theory. This thesis proposes an algebraic framework where a decoder is specified by an expression built from a fixed set of operations. So far, this framework accommodates contemporary syntax-based decoders, it spans two levels of abstraction, and, primarily, it encourages mutual stimulation between the theory of weighted tree automata and the application.

    Learning words and syntactic cues in highly ambiguous contexts

    The cross-situational word learning paradigm argues that word meanings can be approximated by word-object associations, computed from co-occurrence statistics between words and entities in the world. Lexicon acquisition involves simultaneously guessing (1) which objects are being talked about (the "meaning") and (2) which words relate to those objects. However, most modeling work focuses on acquiring meanings for isolated words, largely neglecting relationships between words or physical entities, which can play an important role in learning. Semantic parsing, on the other hand, aims to learn a mapping between entire utterances and compositional meaning representations where such relations are central. The focus is the mapping between meaning and words, while utterance meanings are treated as observed quantities. Here, we extend the joint inference problem of word learning to account for compositional meanings by incorporating a semantic parsing model for relating utterances to non-linguistic context. Integrating semantic parsing and word learning permits us to explore the impact of word-word and concept-concept relations. The result is a joint-inference problem inherited from the word learning setting where we must simultaneously learn utterance-level and individual word meanings, only now we also contend with the many possible relationships between concepts in the meaning and words in the sentence. To simplify design, we factorize the model into separate modules, one for each of the world, the meaning, and the words, and merge them into a single synchronous grammar for joint inference. There are three main contributions. First, we introduce a novel word learning model and accompanying semantic parser. Second, we produce a corpus which allows us to demonstrate the importance of structure in word learning. Finally, we also present a number of technical innovations required for implementing such a model.

    Acta Cybernetica : Volume 23. Number 1.


    New Results on Context-Free Tree Languages

    Context-free tree languages play an important role in algebraic semantics and are applied in mathematical linguistics. In this thesis, we present some new results on context-free tree languages.

    A Syntactical Reverse Engineering Approach to Fourth Generation Programming Languages Using Formal Methods

    Fourth-generation programming languages (4GLs) feature rapid development with minimum configuration required by developers. However, 4GLs can suffer from limitations such as high maintenance cost and legacy software practices. Reverse engineering an existing large legacy 4GL system into a currently maintainable programming language can be a cheaper and more effective solution than rewriting from scratch. So far, no tools exist for reverse engineering proprietary XML-like and model-driven 4GLs where the full language specification is not in the public domain. This research has developed a novel method of reverse engineering some of the syntax of such 4GLs (with Uniface as an exemplar) derived from a particular system, with a view to providing a reliable method to translate/transpile that system's code and data structures into a modern object-oriented language (such as C#). The method was also applied, although only to a limited extent, to some other 4GLs, Informix and Apex, to show that it was in principle more broadly applicable. A novel method for testing that the syntax had been successfully translated was provided using abstract syntax trees. The novel method took manually crafted grammar rules, together with Encapsulated Document Object Model based data from the source language, and then used parsers to produce syntactically valid and equivalent code in the target/output language. This proof-of-concept research has provided a methodology plus sample code to automate part of the process. The methodology comprised a set of manual or semi-automated steps. Further automation is left for future research. In principle, the author's method could be extended to allow the reverse engineering recovery of the syntax of systems developed in other proprietary 4GLs. This would reduce time and cost for the ongoing maintenance of such systems by enabling their software engineers to work using modern object-oriented languages, methodologies, tools and techniques.

    Key agreement: security / division

    Some key agreement schemes, such as Diffie-Hellman key agreement, reduce to Rabi-Sherman key agreement, in which Alice sends $ab$ to Charlie, Charlie sends $bc$ to Alice, and they agree on the key $a(bc) = (ab)c$, where multiplicative notation here indicates some specialized associative binary operation. All non-interactive key agreement schemes, where each peer independently determines a single delivery to the other, reduce to this case, because the ability to agree implies the existence of an associative operation. By extending the associative operation's domain, the key agreement scheme can be enveloped into a mathematical ring, such that all cryptographic values are ring elements, and all key agreement computations are ring multiplications. (A smaller envelope, a semigroup instead of a ring, is also possible.) Security relies on the difficulty of division: here, meaning an operator $/$ such that $((ab)/b)b = ab$. Security also relies on the difficulty of the less familiar wedge operation $[ab, b, bc] \mapsto abc$. When Rabi-Sherman key agreement is instantiated as Diffie-Hellman key agreement, its multiplication amounts to modular exponentiation; its division amounts to the discrete logarithm problem; the wedge operation amounts to the computational Diffie-Hellman problem. Ring theory is well-developed and implies efficient division algorithms in some specific rings, such as matrix rings over fields. Semigroup theory, though less widely known, also implies efficient division in specific semigroups, such as group-like semigroups. The rarity of key agreement schemes with well-established security suggests that easy multiplication with difficult division (and wedges) is elusive. Reduction of key agreement to ring or semigroup multiplication is not a panacea for cryptanalysis. Nonetheless, novel proposals for key agreement perhaps ought to run the gauntlet of a checklist for vulnerability to well-known division strategies that generalize across several forms of multiplication. Ambitiously applying this process of elimination to a plethora of diverse rings or semigroups might also, if only by a fluke, leave standing a few promising schemes, which might then deserve a more focused cryptanalysis.
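
    The Diffie-Hellman instantiation described in the abstract above can be sketched in a few lines. This is only an illustrative toy, not code from the paper: the prime `P = 23`, the base `G = 5`, the secrets `a` and `c`, and the helper names `mul`, `divide` and `wedge` are all assumptions chosen for the example, and the parameters are far too small to be secure.

```python
# Toy parameters, assumed for illustration only (not from the paper, not secure).
P = 23  # small prime modulus
G = 5   # public base, playing the role of b


def mul(secret: int, element: int) -> int:
    """'Multiply' a secret by a public element: modular exponentiation."""
    return pow(element, secret, P)


def divide(ab: int, b: int) -> int:
    """'Division': find some x with mul(x, b) == ab.

    This is a brute-force discrete logarithm, feasible only because P is tiny.
    """
    for x in range(1, P):
        if mul(x, b) == ab:
            return x
    raise ValueError("no quotient exists")


def wedge(ab: int, b: int, bc: int) -> int:
    """Wedge [ab, b, bc] -> abc, computed here by way of division."""
    return mul(divide(ab, b), bc)


a, c = 6, 15                # Alice's and Charlie's secrets (hypothetical values)
ab = mul(a, G)              # Alice's delivery to Charlie
bc = mul(c, G)              # Charlie's delivery to Alice
key_alice = mul(a, bc)      # a(bc)
key_charlie = mul(c, ab)    # (ab)c; exponents commute in this instantiation

print(ab, bc, key_alice, key_charlie)
```

    In this sketch an attacker who can run `divide` recovers a working secret and hence the key through `wedge`, which is why the abstract ties security to the difficulty of division and of the wedge operation.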

    Syntax-directed translations, tree transformations and bimorphisms

    Syntax-based machine translation arose from the demanding need for systems used in practical translation between natural languages. Such systems should, among other things, model tree transformations, re-order parts of sentences, be symmetric and possess composability or forward and backward application. There are several formal ways to define tree transformations: synchronous grammars, tree transducers and tree bimorphisms. Synchronous grammars perform all kinds of rotations, but their mathematical properties are harder to prove. Tree transducers are operational and easy to implement, but closure under composition does not hold for the main types. Tree bimorphisms are difficult to implement, but they provide a natural tool for proving composability or symmetry. To improve the translation process, synchronous grammars were related to tree bimorphisms and tree transducers. Following this lead, we give a comprehensive study of the theory and properties of syntax-directed translation systems seen from these three very different perspectives that perfectly complement each other: as generating devices (synchronous grammars), as acceptors (transducer machines) and as algebraic structures (bimorphisms). They are investigated and compared both as tree transformation devices and as translation-defining devices. The focus is on bimorphisms, as they have only recently returned to the spotlight, especially given their applications to natural language processing. Moreover, we propose a complete and up-to-date overview of the tree transformation classes defined by bimorphisms, linking them with well-known types of synchronous grammars and tree transducers. We prove or recall all the interesting properties such classes possess, thus improving the mathematical knowledge on synchronous grammars and/or tree transducers. Also, inclusion relations between the main classes of bimorphisms, both as translation devices and as tree transformation mechanisms, are given for the first time through a Hasse diagram. Directions for future work are suggested by exhibiting how to extend previous results to more general classes of bimorphisms and synchronous grammars.