Search CORE

2,176 research outputs found

The risks of mixing dependency lengths from sequences of different length

Author: Ferrer-i-Cancho Ramon
Liu Haitao
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2014
Field of study

Mixing dependency lengths from sequences of different length is a common practice in language research. However, the empirical distribution of dependency lengths of sentences of the same length differs from that of sentences of varying length and the distribution of dependency lengths depends on sentence length for real sentences and also under the null hypothesis that dependencies connect vertices located in random positions of the sequence. This suggests that certain results, such as the distribution of syntactic dependency lengths mixing dependencies from sentences of varying length, could be a mere consequence of that mixing. Furthermore, differences in the global averages of dependency length (mixing lengths from sentences of varying length) for two different languages do not simply imply a priori that one language optimizes dependency lengths better than the other because those differences could be due to differences in the distribution of sentence lengths and other factors.Comment: Laguage and referencing has been improved; Eqs. 7, 11, B7 and B8 have been correcte

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization

Author: Cohen S. B.
Smith N. A.
Publication venue
Publication date: 01/01/2010
Field of study

We consider the search for a maximum likelihood assignment of hidden derivations and grammar weights for a probabilistic context-free grammar, the problem approximately solved by “Viterbi training.” We show that solving and even approximating Viterbi training for PCFGs is NP-hard. We motivate the use of uniformat-random initialization for Viterbi EM as an optimal initializer in absence of further information about the correct model parameters, providing an approximate bound on the log-likelihood.

CiteSeerX

Edinburgh Research Explorer

Frequency vs. Association for Constraint Selection in Usage-Based Construction Grammar

Author: Dunn Jonathan
Publication venue
Publication date: 01/01/2019
Field of study

A usage-based Construction Grammar (CxG) posits that slot-constraints generalize from common exemplar constructions. But what is the best model of constraint generalization? This paper evaluates competing frequency-based and association-based models across eight languages using a metric derived from the Minimum Description Length paradigm. The experiments show that association-based models produce better generalizations across all languages by a significant margin

arXiv.org e-Print Archive

Crossref

UC Research Repository

Supertagged phrase-based statistical machine translation

Author: Hassan Hany
Sima'an Khalil
Way Andy
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2007
Field of study

Until quite recently, extending Phrase-based Statistical Machine Translation (PBSMT) with syntactic structure caused system performance to deteriorate. In this work we show that incorporating lexical syntactic descriptions in the form of supertags can yield significantly better PBSMT systems. We describe a novel PBSMT model that integrates supertags into the target language model and the target side of the translation model. Two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar. Despite the differences between these two approaches, the supertaggers give similar improvements. In addition to supertagging, we also explore the utility of a surface global grammaticality measure based on combinatory operators. We perform various experiments on the Arabic to English NIST 2005 test set addressing issues such as sparseness, scalability and the utility of system subcomponents. Our best result (0.4688 BLEU) improves by 6.1% relative to a state-of-theart PBSMT model, which compares very favourably with the leading systems on the NIST 2005 task

CiteSeerX

Irish Universities

DCU Online Research Access Service

International Migration, Integration and Social Cohesion online publications

Crossings as a side effect of dependency lengths

Author: Bick
Christensen
Conover
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Futrell
Gibson
Gildea
Gildea
Gómez-Rodríguez
Hays
Hochberg
Hudson
Iwatate
Jiang
Kawata
Kelih
Liu
Lu
Newman
Poirier
Popper
Prokhorov
Ramasamy
Tanaka
Temperley
Publication venue: 'Wiley'
Publication date: 01/01/2016
Field of study

The syntactic structure of sentences exhibits a striking regularity: dependencies tend to not cross when drawn above the sentence. We investigate two competing explanations. The traditional hypothesis is that this trend arises from an independent principle of syntax that reduces crossings practically to zero. An alternative to this view is the hypothesis that crossings are a side effect of dependency lengths, i.e. sentences with shorter dependency lengths should tend to have fewer crossings. We are able to reject the traditional view in the majority of languages considered. The alternative hypothesis can lead to a more parsimonious theory of language.Comment: the discussion section has been expanded significantly; in press in Complexity (Wiley

arXiv.org e-Print Archive

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

F-structure transfer-based statistical machine translation

Author: Bryl Anton
Graham Yvette
van Genabith Josef
Publication venue: CSLI Publications
Publication date: 01/01/2009
Field of study

In this paper, we describe a statistical deep syntactic transfer decoder that is trained fully automatically on parsed bilingual corpora. Deep syntactic transfer rules are induced automatically from the f-structures of a LFG parsed bitext corpus by automatically aligning local f-structures, and inducing all rules consistent with the node alignment. The transfer decoder outputs the n-best TL f-structures given a SL f-structure as input by applying large numbers of transfer rules and searching for the best output using a log-linear model to combine feature scores. The decoder includes a fully integrated dependency-based tri-gram language model. We include an experimental evaluation of the decoder using different parsing disambiguation resources for the German data to provide a comparison of how the system performs with different German training and test parses

Irish Universities

DCU Online Research Access Service

Parallel Natural Language Parsing: From Analysis to Speedup

Author: Lohuizen Marcellus Paulus van
Publication venue: Technische Universiteit Delft
Publication date: 01/01/2001
Field of study

Electrical Engineering, Mathematics and Computer Scienc

TU Delft Repository

University of Twente Research Information

Discovery of Linguistic Relations Using Lexical Attraction

Author: Yuret Deniz
Publication venue
Publication date: 01/01/1998
Field of study

This work has been motivated by two long term goals: to understand how humans learn language and to build programs that can understand language. Using a representation that makes the relevant features explicit is a prerequisite for successful learning and understanding. Therefore, I chose to represent relations between individual words explicitly in my model. Lexical attraction is defined as the likelihood of such relations. I introduce a new class of probabilistic language models named lexical attraction models which can represent long distance relations between words and I formalize this new class of models using information theory. Within the framework of lexical attraction, I developed an unsupervised language acquisition program that learns to identify linguistic relations in a given sentence. The only explicitly represented linguistic knowledge in the program is lexical attraction. There is no initial grammar or lexicon built in and the only input is raw text. Learning and processing are interdigitated. The processor uses the regularities detected by the learner to impose structure on the input. This structure enables the learner to detect higher level regularities. Using this bootstrapping procedure, the program was trained on 100 million words of Associated Press material and was able to achieve 60% precision and 50% recall in finding relations between content-words. Using knowledge of lexical attraction, the program can identify the correct relations in syntactically ambiguous sentences such as ``I saw the Statue of Liberty flying over New York.''Comment: dissertation, 56 page

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT