Pronominal Anaphora in Basque: computational point of view and the development of a corpus
This paper describes the process of annotating pronominal anaphora in a 54,000-word corpus of Basque. Our aim is to use this annotation as a basis for later computational processing. The linguistic study carried out and the criteria defined for the tagging process are also presented in the paper.
Corpora for Computational Linguistics
Since the mid-1990s, corpora have become very important for computational linguistics. This paper offers a survey of how they are currently used in different fields of the discipline, with particular emphasis on anaphora and coreference resolution, automatic summarisation and term extraction.
Their influence on other fields is also briefly discussed.
Linguistics parameters for zero anaphora resolution
Master's dissertation, Natural Language Processing and Human Language Technology, Univ. do Algarve, 2009. This dissertation describes and proposes a set of linguistically motivated rules for zero
anaphora resolution in the context of a natural language processing chain developed for
Portuguese. Some languages, like Portuguese, allow noun phrase (NP) deletion (or zeroing)
in several syntactic contexts in order to avoid the redundancy that would result from
repetition of previously mentioned words. The co-reference relation between the zeroed
element and its antecedent (or previous mention) in the discourse is here called zero
anaphora (Mitkov, 2002). In Computational Linguistics, zero anaphora resolution may be
viewed as a subtask of anaphora resolution and has an essential role in various Natural
Language Processing applications such as information extraction, automatic abstracting,
dialog systems, machine translation and question answering. The main goal of this
dissertation is to describe the grammatical rules imposing subject NP deletion and referential
constraints in Brazilian Portuguese, in order to allow a correct identification of the
antecedent of the deleted subject NP. Some of these rules were then formalized into the
Xerox Incremental Parser or XIP (Ait-Mokhtar et al., 2002: 121-144) in order to constitute a
module of the Portuguese grammar (Mamede et al. 2010) developed at Spoken Language
Laboratory (L2F). Using this rule-based approach we expected to improve the performance
of the Portuguese grammar, namely by producing better dependency structures with
(reconstructed) zeroed NPs for the syntactic-semantic interface. Because of the complexity
of the task, the scope of this dissertation had to be limited to: (a) subject NP deletion; (b) within
sentence boundaries; and (c) with an explicit antecedent; besides, (d) rules were formalized
based solely on the results of the shallow parser (or chunks), that is, with minimal syntactic
(and no semantic) knowledge. A corpus of different text genres was manually annotated for
zero anaphors and other zero-shaped, usually indefinite, subjects. The rule-based
approach is evaluated and results are presented and discussed.
Gramatika jaietan Patxi Goenagaren omenez
Aurkibidea / Índice / Index:- Hitzaurrea.- Curriculum vitae Patxi Goenaga Mendizabal.- Axun Aierbe Mendizabal: Euskal estilo-liburuetako gramatika-arloko itzulpen-gomendioez.- Gontzal Aldai: Patxi Goenagari 30 mila esker.- Izaskun Aldezabal Roteta: Aditz-azpikategorizazioa.- Iñaki Amundarain: Behar izan + partizipioa: geroaldiko balioaz.- M. J. Aranzabe, J. M. Arriola and Arantza Diaz de Ilarraza: Theoretical and methodological issues of tagging noun phrases structures following dependency grammar formalism.- Xabier Artiagoitia: Some arguments for complement-head order in Basque DPs.- Miren Azkarate Villar: Gertaera- eta emaitza-izenak.- Andoni Barreña, Marijose Ezeizabarrena eta Iñaki García: Entzundako hizkuntzaren eragina haur euskaldun txikien gramatika-garapenean.- Gidor Bilbao: Claude Maugerren eskuliburua Urteren eredu.- Klara Ceberio, Itziar Aduriz, Arantza Diaz de Ilarraza eta Inés M. Garcia Azkoaga: Erreferentziakidetasunaren azterketa eta anotazioa euskarazko corpus batean.- Karlos Cid Abasolo: Gramatika Atxagaren literatur bideetan (I).- Maia Duguine eta Aritz Irurtzun: Ohar batzuk nafar-lapurterazko galdera eta galdegai indartuez.- Luis Eguren: Clíticos léxicos y elipsis nominal.- José Luis Erdozia: Burundako hizkera, Arabako ekialdekoaren hondar euskalkia.- Maitena Etxebarria Arostegui: Análisis y evaluación de la vitalidad sociolingüística del euskera en la C.A.V.- Urtzi Etxeberria eta Ricardo Etxepare: Izen eta gertakarien gaineko kuantifikazioa.- Ricardo Etxepare and Myriam Uribe-Etxebarria: On negation and focus in Spanish and Basque.- Juan Garzia: Bada arazorik etik arazoak daude raino: existentzia-predikazioa eta inespezifikotasuna.- Ricardo Gómez: Euskal gramatikagintza zaharraren historia laburra: XVII-XVIII. mendeak.- Lluïsa Gràcia y Berta Crous: Sobre algunos predicados con fer y tenir en catalán: fer un infart vs. tenir un infart.- Bill Haddican and Paul Foulkes: Mid Vowel Raising and Second Vowel Deletion in Oiartzun Basque.- José Ignacio Hualde eta Oihana Lujanbio: Goizuetako azentuera.- Orreaga Ibarra Murillo: Sobre estrategias discursivas del lenguaje de los jóvenes vascoparlantes: aspectos pragmáticos y discursivos (conectores, marcadores).- Itziar Idiazabal: Gramatika eta hizkuntzaren didaktika.- Itziar Laka: Senezkotasuna hizkuntzan: Gramatika Unibertsalaren inguruko hausnarketa.- Joseba A. Lakarra: Aitzineuskararen gramatikarantz (malkar eta osinetan zehar).- Mikel Lersundi, Igone Zabala eta Agurtzane Elordui: Aditzetiko izenen emankortasunaren azterketa morfopragmatikoa euskarazko corpus orokor eta berezituetan.- Ángel López García: Sobre una propiedad superestructural de la lengua vasca.- Juan Karlos López-Mugartza Iriarte: Erronkaribarko oikonimia, mitoak eta elezaharra.- Jesus Mari Makazaga Eizagirre: Ahozko jarduna komunikazioaren lagungarri: ekarpen bat ahozkoaren estrategia komunikatiboez.- Roger Martin and Juan Uriagereka: Competence for preferences.- Juan Carlos Moreno Cabrera: Alokutibotasunari buruzko zenbait hausnarketa hizkuntzalaritza orokorraren ikuspegitik.- Céline Mounole: Sintaxi diakronikoa eta aditz multzoaren garapena: Inperfektibozko perifrasiaren sorreraz.- José Antonio Mujika: Adlatiboaren berbalizazioaz.- Juan Carlos Odriozola Pereira: Quantifying compounds.- Miren Lourdes Oñederra: Izan edo ez izan: Fonologiak fonetikari ordaintzen diona afrikatuekin.- Javier Ormazabal: Kausatibo aldizkatzeak euskaraz eta inguruko hizkuntzetan.- B. Oyharçabal: Naturalist conceptions about agglutinative languages: Vinson’s ideas about Basque and linguistic Darwinism.- Georges Rebuschi: On older Northern Basque exclamatives in ala.- Milan Rezac: The forms of dative displacement: From Basauri to Itelmen.- Patxi Salaberri: Satznamen direlakoen inguruan. Erlatibozko perpausetan jatorri duten toponimoak aztergai.- Pello Salaburu: Hiztegi kontuak Baztan aldean.- Itziar San Martín: Defective domains in Basque nominalized dependants.- Ibon Sarasola: Iparraldeko hiztegigintza Larramendiren paradigmaren garaian.- Esther Torrego: Revisiting Romance SE.- Itziar Túrrez: Ideas acerca de la lengua de Tomás Tamayo de Vargas: una lectura de sus Anotaciones a Garcilaso.- Blanca Urgell: Berriemaileen gaitasuna eta eredu lexikografikoaren eragina Landucciren hiztegian.- Vidal Valmala: Topic, focus and quantifier float.- Koldo Zuazo: Euskara (batu)aren historiarako.- Juan Joxe Zubiri eta Patxi Salaberri: Zenbait irain-hitzen erabilera. Deklinabide-kasu hautsiak