198 research outputs found
Spanish Resource Grammar version 2023
We present the latest version of the Spanish Resource Grammar (SRG). The new
SRG uses the most recent version of the FreeLing morphological analyzer and tagger and
is accompanied by a manually verified treebank and a list of documented issues.
We also present the grammar's coverage and overgeneration on a small portion of
a learner corpus, an entirely new research line with respect to the SRG. The
grammar can be used for linguistic research, such as for empirically driven
development of syntactic theory, and in natural language processing
applications such as computer-assisted language learning. Finally, as the
treebanks grow, they can be used for training high-quality semantic parsers and
other systems which may benefit from precise and detailed semantics. Comment: 10 pages, 4 figures
Memory limitations are hidden in grammar
The ability to produce and understand an unlimited number of different sentences is a hallmark of human language. Linguists have sought to define the essence of this generative capacity using formal grammars that describe the syntactic dependencies between constituents, independent of the computational limitations of the human brain. Here, we evaluate this independence assumption by sampling sentences uniformly from the space of possible syntactic structures. We find that the average dependency distance between syntactically related words, a proxy for memory limitations, is less than expected by chance in a collection of state-of-the-art classes of dependency grammars. Our findings indicate that memory limitations have permeated grammatical descriptions, suggesting that it may be impossible to build a parsimonious theory of human linguistic productivity independent of non-linguistic cognitive constraints.
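The dependency-distance proxy used in this abstract is simple to compute from a parsed sentence. The following sketch assumes a common treebank convention (each word stores the 1-based position of its syntactic head, with 0 for the root); the function name and the example sentence are illustrative, not taken from the paper:

```python
def mean_dependency_distance(heads):
    """Average linear distance between each word and its head.

    `heads[i]` is the 1-based position of the head of word i+1;
    the root (head 0) is skipped.
    """
    distances = [abs(dep - head)
                 for dep, head in enumerate(heads, start=1)
                 if head != 0]
    return sum(distances) / len(distances)

# "She quickly read the book":
# She->read, quickly->read, read=root, the->book, book->read
print(mean_dependency_distance([3, 3, 0, 5, 3]))  # -> 1.5
```

Sampling structures uniformly and comparing this statistic against attested grammars is the core of the experiment the abstract describes.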
Universal Semantic Parsing
Universal Dependencies (UD) offer a uniform cross-lingual syntactic
representation, with the aim of advancing multilingual applications. Recent
work shows that semantic parsing can be accomplished by transforming syntactic
dependencies to logical forms. However, this work is limited to English, and
cannot process dependency graphs, which allow handling complex phenomena such
as control. In this work, we introduce UDepLambda, a semantic interface for UD,
which maps natural language to logical forms in an almost language-independent
fashion and can process dependency graphs. We perform experiments on question
answering against Freebase and provide German and Spanish translations of the
WebQuestions and GraphQuestions datasets to facilitate multilingual evaluation.
Results show that UDepLambda outperforms strong baselines across languages and
datasets. For English, it achieves a 4.9 F1 point improvement over the
state-of-the-art on GraphQuestions. Our code and data can be downloaded at
https://github.com/sivareddyg/udeplambda. Comment: EMNLP 2017
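The idea of reading a logical form off labelled dependency edges can be illustrated with a toy sketch. UDepLambda's actual interface assigns lambda expressions to UD labels and composes them; the relation-to-role mapping below is a deliberate simplification for illustration, not the paper's method:

```python
def edges_to_logical_form(predicate, edges):
    """Build a neo-Davidsonian logical form from labelled dependency edges.

    `edges` is a list of (relation, dependent) pairs; only a toy subset
    of UD relations is handled here.
    """
    role = {"nsubj": "arg1", "obj": "arg2"}  # simplified mapping (assumption)
    conjuncts = [f"{predicate}(e)"]
    for rel, dep in edges:
        if rel in role:
            conjuncts.append(f"{role[rel]}(e, {dep})")
    return "exists e. " + " & ".join(conjuncts)

print(edges_to_logical_form("read", [("nsubj", "She"), ("obj", "book")]))
# -> exists e. read(e) & arg1(e, She) & arg2(e, book)
```

Because the transformation is driven by dependency labels rather than language-specific word order, the same machinery transfers across UD treebanks, which is what makes the approach "almost language-independent."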
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
Linguistic typology aims to capture structural and semantic variation across
the world's languages. A large-scale typology could provide excellent guidance
for multilingual Natural Language Processing (NLP), particularly for languages
that suffer from the lack of human labeled resources. We present an extensive
literature survey on the use of typological information in the development of
NLP techniques. Our survey demonstrates that to date, the use of information in
existing typological databases has resulted in consistent but modest
improvements in system performance. We show that this is due to both intrinsic
limitations of databases (in terms of coverage and feature granularity) and
under-employment of the typological features included in them. We advocate for
a new approach that adapts the broad and discrete nature of typological
categories to the contextual and continuous nature of machine learning
algorithms used in contemporary NLP. In particular, we suggest that such an
approach could be facilitated by recent developments in the data-driven
induction of typological knowledge.
Natural Language Syntax Complies with the Free-Energy Principle
Natural language syntax yields an unbounded array of hierarchically
structured expressions. We claim that these are used in the service of active
inference in accord with the free-energy principle (FEP). While conceptual
advances alongside modelling and simulation work have attempted to connect
speech segmentation and linguistic communication with the FEP, we extend this
program to the underlying computations responsible for generating syntactic
objects. We argue that recently proposed principles of economy in language
design - such as "minimal search" criteria from theoretical syntax - adhere to
the FEP. This affords a greater degree of explanatory power to the FEP - with
respect to higher language functions - and offers linguistics a grounding in
first principles with respect to computability. We show how both tree-geometric
depth and a Kolmogorov complexity estimate (recruiting a Lempel-Ziv compression
algorithm) can be used to accurately predict legal operations on syntactic
workspaces, directly in line with formulations of variational free energy
minimization. This is used to motivate a general principle of language design
that we term Turing-Chomsky Compression (TCC). We use TCC to align concerns of
linguists with the normative account of self-organization furnished by the FEP,
by marshalling evidence from theoretical linguistics and psycholinguistics to
ground core principles of efficient syntactic computation within active
inference.
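The abstract's Kolmogorov complexity estimate recruits a Lempel-Ziv compression algorithm. The paper's exact estimator is not specified here, so the following is only a generic LZ78-style phrase count, a standard crude proxy for Kolmogorov complexity (lower counts indicate more compressible, lower-complexity strings):

```python
def lz_complexity(s):
    """Count distinct phrases in a simple LZ78-style parse of `s`.

    Each new phrase is the shortest extension of a previously seen
    phrase; the count grows more slowly for repetitive strings.
    """
    phrases, current = set(), ""
    for ch in s:
        current += ch
        if current not in phrases:
            phrases.add(current)
            current = ""  # start a new phrase
    return len(phrases)

print(lz_complexity("abababab"))  # -> 4 (repetitive, low complexity)
print(lz_complexity("abcdefgh"))  # -> 8 (no repetition, higher complexity)
```

Under the TCC view, derivational steps that keep such compressed descriptions short are the "legal" operations on syntactic workspaces.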
Multiword expressions
Multiword expressions (MWEs) are a challenge for both natural language applications and linguistic theory because they often defy the machinery developed for free combinations, where the default is that the meaning of an utterance can be predicted from its structure. There is a rich body of primarily descriptive work on MWEs for many European languages, but comparative work is scarce. The volume brings together MWE experts to explore the benefits of a multilingual perspective on MWEs. The ten contributions in this volume look at MWEs in Bulgarian, English, French, German, Maori, Modern Greek, Romanian, Serbian, and Spanish. They discuss prominent issues in MWE research such as the classification of MWEs, their formal grammatical modeling, and the description of individual MWE types from the point of view of different theoretical frameworks, such as Dependency Grammar, Generative Grammar, Head-driven Phrase Structure Grammar, Lexical Functional Grammar, and Lexicon Grammar.
DFKI Workshop on Natural Language Generation
On the Saarbrücken campus sites as well as at DFKI, many research activities are pursued in the field of Natural Language Generation (NLG). We felt that too little is known about the full range of these activities and decided to organize a workshop in order to share ideas and promote the results.
This DFKI workshop brought together local researchers working on NLG. Several papers are co-authored by international researchers. Although not all NLG activities are covered in the present document, the papers reviewed for this workshop clearly demonstrate that Saarbrücken counts among the important NLG sites in the world.
Superseded: Grammatical theory: From transformational grammar to constraint-based approaches. Second revised and extended edition.
This book is superseded by the third edition, available at http://langsci-press.org/catalog/book/255.
This book introduces formal grammar theories that play a role in current linguistic theorizing (Phrase Structure Grammar, Transformational Grammar/Government & Binding, Generalized Phrase Structure Grammar, Lexical Functional Grammar, Categorial Grammar, Head-Driven Phrase Structure Grammar, Construction Grammar, Tree Adjoining Grammar). The key assumptions are explained, and it is shown how each theory treats arguments and adjuncts, the active/passive alternation, local reorderings, verb placement, and fronting of constituents over long distances. The analyses are explained with German as the object language.
The second part of the book compares these approaches with respect to their predictions regarding language acquisition and psycholinguistic plausibility. The nativism hypothesis, which assumes that humans possess genetically determined innate language-specific knowledge, is critically examined, and alternative models of language acquisition are discussed. The second part then addresses controversial issues of current theory building, such as whether flat or binary-branching structures are more appropriate, whether constructions should be treated on the phrasal or the lexical level, and whether abstract, non-visible entities should play a role in syntactic analyses. It is shown that the analyses suggested in the respective frameworks are often translatable into each other. The book closes with a chapter showing how properties common to all languages, or to certain classes of languages, can be captured.
The book is a translation of the German book Grammatiktheorie, which was published by Stauffenburg in 2010. The following quotes are taken from reviews:
With this critical yet fair reflection on various grammatical theories, Müller fills what was a major gap in the literature. Karen Lehmann, Zeitschrift für Rezensionen zur germanistischen Sprachwissenschaft, 2012
Stefan Müller's recent introductory textbook, Grammatiktheorie, is an astonishingly comprehensive and insightful survey for beginning students of the present state of syntactic theory. Wolfgang Sternefeld and Frank Richter, Zeitschrift für Sprachwissenschaft, 2012
This is the kind of work that has been sought after for a while [...] The impartial and objective discussion offered by the author is particularly refreshing. Werner Abraham, Germanistik, 2012
This book is a new edition of http://langsci-press.org/catalog/book/25