One model, two languages: training bilingual parsers with harmonized treebanks

Authors: Alonso, Miguel A.; Gómez-Rodríguez, Carlos; Vilares, David
Publication date: 01/01/2016

We introduce an approach to train lexicalized parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that can analyze sentences in either of the learned languages, or even sentences that mix both. We test the approach on the Universal Dependency Treebanks, training with MaltParser and MaltOptimizer. The results show that these bilingual parsers are more than competitive: most combinations preserve accuracy, and some even achieve significant improvements over the corresponding monolingual parsers. Preliminary experiments also show the approach to be promising on texts with code-switching and when more languages are added.

Comment: 7 pages, 4 tables, 1 figure

arXiv.org e-Print Archive

Crossref

Using the Variationist Comparative Method to Examine the Role of Language Contact in Synthetic and Periphrastic Verbs in Spanish

Authors: Dumont, Jenny; Vergara Wilson, Damián
Publication venue: The Cupola: Scholarship at Gettysburg College
Publication date: 01/01/2016

Language contact and linguistic change are thought to go hand in hand (e.g. Silva-Corvalán 1994). However, methodological obstacles, such as collecting data at different points in time or the availability of monolingual data for comparison, make claims about language change tenuous. The present study draws on two corpora of spoken Spanish (bilingual New Mexican Spanish and monolingual Ecuadorian Spanish) to quantitatively assess the convergence hypothesis, under which contact with English has produced a change in the Spanish verbal system, reflected in an extension of the Present and Past Progressive forms at the expense of the synthetic Simple Present and Imperfect forms. The data do not show that the Spanish spoken by the bilinguals is changing to more closely resemble the analogous English progressive constructions; instead, they suggest a potential weakening of the linguistic constraints that condition the variation between periphrastic and synthetic forms.

Gettysburg College

Production Methods

Author: Eisenbeiss, Sonja
Publication venue: John Benjamins Publishing Company
Publication date: 01/01/2010

University of Essex Research Repository