Latent Tree Language Model
In this paper we introduce the Latent Tree Language Model (LTLM), a novel
approach to language modeling that encodes the syntax and semantics of a given
sentence as a tree of word roles.
The learning phase iteratively updates the trees by moving nodes according to
Gibbs sampling. We introduce two algorithms to infer the tree for a given
sentence. The first is based on Gibbs sampling; it is fast, but is not
guaranteed to find the most probable tree. The second is based on dynamic
programming; it is slower, but guaranteed to find the most probable tree. We
provide a comparison of both algorithms.
We combine LTLM with a 4-gram Modified Kneser-Ney language model via linear
interpolation. Our experiments with English and Czech corpora show significant
perplexity reductions (up to 46% for English and 49% for Czech) compared with
the standalone 4-gram Modified Kneser-Ney language model.
Comment: Accepted to EMNLP 201
Deciding the Borel complexity of regular tree languages
We show that it is decidable whether a given regular tree language belongs
to the class Delta^0_2 of the Borel hierarchy, or equivalently whether
the Wadge degree of a regular tree language is countable.
Comment: 15 pages, 2 figures
Bottom Up Quotients and Residuals for Tree Languages
In this paper, we extend the notion of tree language quotients to bottom-up
quotients. Instead of computing the residual of a tree language from top to
bottom and producing a list of tree languages, we show how to compute a set of
k-ary trees, where k is an arbitrary integer. We define the quotient formula
for different combinations of tree languages: union, symbol products,
compositions, iterated symbol products, and iterated compositions. These
computations lead to the definition of the bottom-up quotient tree automaton,
which turns out to be the minimal deterministic tree automaton associated with
a regular tree language in the case of 0-ary trees.
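For readers unfamiliar with the bottom-up model this abstract builds on, a deterministic bottom-up tree automaton can be evaluated as below. The alphabet, states, and acceptance condition are an illustrative assumption, not taken from the paper.

```python
# A tree is a tuple (symbol, child, child, ...).
# A deterministic bottom-up tree automaton maps (symbol, child states...) to
# a state; a tree is accepted when its root evaluates to a final state.
# Toy example: binary f, constants a and b; accept trees whose leaves are all a.

DELTA = {
    ("a",): "qa",
    ("b",): "qb",
    ("f", "qa", "qa"): "qa",
    ("f", "qa", "qb"): "qb",
    ("f", "qb", "qa"): "qb",
    ("f", "qb", "qb"): "qb",
}
FINAL = {"qa"}

def run(t):
    """Evaluate a tree bottom-up: children first, then the root transition."""
    label, *kids = t
    return DELTA[(label, *map(run, kids))]

def accepts(t):
    return run(t) in FINAL
```

For instance, `accepts(("f", ("a",), ("a",)))` holds, while `accepts(("f", ("a",), ("b",)))` does not.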
Multiple Context-Free Tree Grammars: Lexicalization and Characterization
Multiple (simple) context-free tree grammars are investigated, where "simple"
means "linear and nondeleting". Every multiple context-free tree grammar that
is finitely ambiguous can be lexicalized; i.e., it can be transformed into an
equivalent one (generating the same tree language) in which each rule of the
grammar contains a lexical symbol. Due to this transformation, the rank of the
nonterminals increases at most by 1, and the multiplicity (or fan-out) of the
grammar increases at most by the maximal rank of the lexical symbols; in
particular, the multiplicity does not increase when all lexical symbols have
rank 0. Multiple context-free tree grammars have the same tree generating power
as multi-component tree adjoining grammars (provided the latter can use a
root-marker). Moreover, every multi-component tree adjoining grammar that is
finitely ambiguous can be lexicalized. Multiple context-free tree grammars have
the same string generating power as multiple context-free (string) grammars,
and they admit polynomial-time parsing algorithms. A tree language can be generated by a
multiple context-free tree grammar if and only if it is the image of a regular
tree language under a deterministic finite-copying macro tree transducer.
Multiple context-free tree grammars can be used as a synchronous translation
device.
Comment: 78 pages, 13 figures
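A minimal sketch of a simple (linear and nondeleting) context-free tree grammar and one derivation may help make the lexicalization claim concrete. The toy grammar below is invented for the example; in it every rule carries a lexical symbol (a, f, or g), as the lexicalization result requires.

```python
# Trees are tuples (label, children...). Nonterminal A has rank 1; the marker
# "x1" stands for its single argument, which each right-hand side uses exactly
# once (linear) and never drops (nondeleting). Toy grammar, every rule lexical:
#   S     -> A(a)        (lexical symbol a)
#   A(x1) -> f(A(x1))    (lexical symbol f)
#   A(x1) -> g(x1)       (lexical symbol g)
RULES = {
    "S": [("A", ("a",))],
    "A": [("f", ("A", ("x1",))), ("g", ("x1",))],
}

def substitute(t, env):
    """First-order substitution of the argument tree for the marker x1."""
    if t[0] == "x1":
        return env["x1"]
    return (t[0],) + tuple(substitute(c, env) for c in t[1:])

def derive(t, choices):
    """Expand nonterminals top-down, consuming one rule index per application."""
    label, *kids = t
    if label in RULES:
        rhs = RULES[label][choices.pop(0)]
        env = {"x1": kids[0]} if kids else {}
        return derive(substitute(rhs, env), choices)
    return (label,) + tuple(derive(k, choices) for k in kids)
```

With the choice sequence [0, 0, 1], the derivation S => A(a) => f(A(a)) => f(g(a)) yields the tree ("f", ("g", ("a",))); the grammar generates exactly the trees f^n(g(a)).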
Construction of rational expression from tree automata using a generalization of Arden's Lemma
Arden's Lemma is a classical result in language theory that allows the
computation of a rational expression denoting the language recognized by a
finite string automaton. In this paper we generalize this important lemma to
rational tree languages. Moreover, we also propose a construction of a
rational tree expression that denotes the tree language accepted by a finite
tree automaton.
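The string version of Arden's Lemma that this paper generalizes can be stated and sanity-checked in a few lines; the regular-expression strings below are purely illustrative.

```python
import re

# Arden's Lemma (string version): if the empty word is not in A, then the
# language equation  X = A.X | B  has the unique solution  X = A*.B.
# Here expressions are plain regular-expression strings.

def arden(a, b):
    """Solve X = (a)X | (b) for X, returning the expression (a)*(b)."""
    return f"({a})*({b})"
```

For example, `arden("a", "b")` yields `(a)*(b)`, which matches strings such as `aaab` but not `ba`, i.e. exactly the solutions of X = aX | b.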
The Wadge Hierarchy of Deterministic Tree Languages
We provide a complete description of the Wadge hierarchy for
deterministically recognisable sets of infinite trees. In particular we give an
elementary procedure to decide if one deterministic tree language is
continuously reducible to another. This extends Wagner's results on the
hierarchy of omega-regular languages of words to the case of trees.
Comment: 44 pages, 8 figures; extended abstract presented at ICALP 2006,
Venice, Italy; full version appears in the LMCS special issue