927 research outputs found

Statistical parsing of morphologically rich languages (SPMRL): what, how and whither

Author: Candito Marie
Foster Jennifer
Goldberg Yoav
Kübler Sandra
Rehbein Ines
Seddah Djamé
Tounsi Lamia
Tsarfaty Reut
Versley Yannick
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2010

The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at the word level. There is ample evidence that the application of readily available statistical parsing models to such languages is susceptible to serious performance degradation. The first workshop on statistical parsing of MRLs hosts a variety of contributions which show that, despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state of affairs with respect to parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests itself as a source of directions for future investigations.

CiteSeerX

INRIA a CCSD electronic archive server

Irish Universities

DCU Online Research Access Service

Hal-Diderot

Parsing With Lexicalized Tree Adjoining Grammar

Author: Joshi Aravind K
Schabes Yves
Publication venue: ScholarlyCommons
Publication date: 01/02/1990

Most current linguistic theories give lexical accounts of several phenomena that used to be considered purely syntactic. The information put in the lexicon is thereby increased in both amount and complexity: see, for example, lexical rules in LFG (Kaplan and Bresnan, 1983), GPSG (Gazdar, Klein, Pullum and Sag, 1985), HPSG (Pollard and Sag, 1987), Combinatory Categorial Grammars (Steedman, 1987), Karttunen's version of Categorial Grammar (Karttunen 1986, 1988), some versions of GB theory (Chomsky 1981), and Lexicon-Grammars (Gross 1984). We would like to take this fact into account while defining a formalism. We therefore explore the view that syntactical rules are not separated from lexical items. We say that a grammar is lexicalized (Schabes, Abeillé and Joshi, 1988) if it consists of: (1) a finite set of structures, each associated with lexical items; each lexical item will be called the anchor of the corresponding structure; the structures define the domain of locality over which constraints are specified; (2) an operation or operations for composing the structures. The notion of anchor is closely related to the word associated with a functor-argument category in Categorial Grammars. Categorial Grammars (as used, for example, by Steedman, 1987) are 'lexicalized' according to our definition, since each basic category has a lexical item associated with it.

ScholarlyCommons@Penn

Tree-Adjoining Grammars and Lexicalized Grammars

Author: Joshi Aravind K
Schabes Yves
Publication venue: ScholarlyCommons
Publication date: 01/03/1991

In this paper, we will describe a tree generating system called tree-adjoining grammar (TAG) and state some of the recent results about TAGs. The work on TAGs is motivated by linguistic considerations. However, a number of formal results have been established for TAGs which, we believe, would be of interest to researchers in tree grammars and tree automata. After giving a short introduction to TAG, we briefly state these results concerning the properties of both the string sets and the tree sets (Section 2). We will also describe the notion of lexicalization of grammars (Section 3) and investigate the relationship of lexicalization to context-free grammars (CFGs) and TAGs (Section 4).

ScholarlyCommons@Penn

Parsing Strategies With 'Lexicalized' Grammars: Application to Tree Adjoining Grammars

Author: Abeillé Anne
Joshi Aravind K
Schabes Yves
Publication venue: ScholarlyCommons
Publication date: 01/08/1988

In this paper, we present a parsing strategy that arose from the development of an Earley-type parsing algorithm for TAGs (Schabes and Joshi 1988) and from some recent linguistic work in TAGs (Abeillé 1988a). In our approach, each elementary structure is systematically associated with a lexical head. These structures specify extended domains of locality (as compared to a context-free grammar) over which constraints can be stated. These constraints either hold within the elementary structure itself or specify what other structures can be composed with a given elementary structure. The 'grammar' consists of a lexicon where each lexical item is associated with a finite number of structures for which that item is the head. There are no separate grammar rules. There are, of course, 'rules' which tell us how these structures are composed. A grammar of this form will be said to be 'lexicalized'. We show that in general context-free grammars cannot be 'lexicalized'. We then show how a 'lexicalized' grammar naturally follows from the extended domain of locality of TAGs and examine briefly some of the linguistic implications of our approach. A general parsing strategy for 'lexicalized' grammars is discussed. In the first stage, the parser selects a set of elementary structures associated with the lexical items in the input sentence, and in the second stage the sentence is parsed with respect to this set. The strategy is independent of the nature of the elementary structures in the underlying grammar. However, we focus our attention on TAGs. Since the set of trees selected at the end of the first stage is not infinite, the parser can in principle use any search strategy. Thus, in particular, a top-down strategy can be used, since problems due to recursive structures are eliminated. We then explain how the Earley-type parser for TAGs can be modified to take advantage of this approach.

ScholarlyCommons@Penn

Punctuation in Quoted Speech

Author: Doran Christine
Publication date: 01/01/1996

Quoted speech is often set off by punctuation marks, in particular quotation marks. Thus, it might seem that the quotation marks would be extremely useful in identifying these structures in texts. Unfortunately, the situation is not quite so clear. In this work, I will argue that quotation marks are not adequate for either identifying or constraining the syntax of quoted speech. More useful information comes from the presence of a quoting verb, which is either a verb of saying or a punctual verb, and the presence of other punctuation marks, usually commas. Using a lexicalized grammar, we can license most quoting clauses as text adjuncts. A distinction will be made not between direct and indirect quoted speech, but rather between adjunct and non-adjunct quoting clauses.
Comment: 11 pages, 11 ps figures, Proceedings of SIGPARSE 96 - Punctuation in Computational Linguistics

arXiv.org e-Print Archive

CiteSeerX

All Fragments Count in Parser Evaluation

Author: Bastings J.
Sima'an K.
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2014

International Migration, Integration and Social Cohesion online publications

"Implicature-Laden" Elicitations in Talk Radio Shows

Author: Herczeg-Deli Ágnes
Publication venue: Walter de Gruyter GmbH
Publication date: 01/12/2011

Indirect elicitations in talk radio programmes on BBC Radio are not uncommon; nevertheless, misunderstanding between the host and his conversational partner is infrequent. Investigating some of the reasons, this paper focuses on how the socio-cultural and cognitive factors of the context interweave in discourse. The author suggests that valid interpretation of, and appropriate response to, inferred elicitations can best be explained within the framework of Relevance Theory and, more specifically, with the presumption of accessibility of schemas obtained from the cognitive environment of the discourse partners. Through examples from empirical research, the paper aims to reveal how the mutual knowledge of the participants controls discourse via the mental processes occurring in the interaction of two minds.

Biblioteka Nauki - repozytorium artykułów

Crossref

Repozytorium Uniwersytetu Łódzkiego (University of Lodz Repository)

Some Novel Applications of Explanation-Based Learning to Parsing Lexicalized Tree-Adjoining Grammars

Author: Joshi Aravind
Srinivas B.
Publication date: 01/01/1995

In this paper we present some novel applications of the Explanation-Based Learning (EBL) technique to parsing Lexicalized Tree-Adjoining Grammars. The novel aspects are (a) immediate generalization of parses in the training set, (b) generalization over recursive structures and (c) representation of generalized parses as Finite State Transducers. A highly impoverished parser called a "stapler" has also been introduced. We present experimental results using EBL for different corpora and architectures to show the effectiveness of our approach.
Comment: uuencoded postscript file

arXiv.org e-Print Archive

CiteSeerX

Crossref

ScholarlyCommons@Penn

Recommended from our members

OBOME - Ontology based opinion mining in UBIPOL

Author: Husani M
Ko A
Kocyigit A
Lee H
Tapucu D
Publication venue: Brunel University
Publication date: 01/01/2012

Ontologies have a special role in the UBIPOL system: they help to structure the policy-related context, provide a conceptualization of the policy domain, and are used in the opinion mining process. In this work we present a system called Ontology Based Opinion Mining Engine (OBOME) for analyzing a domain-specific opinion corpus by first assisting the user with the creation of a domain ontology from the corpus and then determining the polarity of opinion on the various domain aspects. In the former step, the policy domain aspects are identified (namely, which policy category is represented by the concept). This identification is supported by the policy modelling ontology, which describes the most important policy-related classes and structure. Then the most informative documents are extracted from the corpus, and the user is asked to create a set of aspects and related keywords using these documents. In the latter step, we use the corpus-specific ontology to model the domain and extract aspect-polarity associations using grammatical dependencies between words. Later, summarized results are shown to the user to analyze and store. Finally, in an offline process, the policy modelling ontology is updated.

Brunel University Research Archive

Generation and synchronous tree-adjoining grammars

Author: Schabes Yves
Shieber Stuart
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/1990

Tree-adjoining grammars (TAG) have been proposed as a formalism for generation based on the intuition that the extended domain of syntactic locality that TAGs provide should aid in localizing semantic dependencies as well, in turn serving as an aid to generation from semantic representations. We demonstrate that this intuition can be made concrete by using the formalism of synchronous tree-adjoining grammars. The use of synchronous TAGs for generation provides solutions to several problems with previous approaches to TAG generation. Furthermore, the semantic monotonicity requirement previously advocated for generation grammars as a computational aid is seen to be an inherent property of synchronous TAGs.
Engineering and Applied Science

CiteSeerX

Harvard University - DASH

ScholarlyCommons@Penn