Corpora and evaluation tools for multilingual named entity grammar development
We present an effort to develop multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time and date expressions. Subgrammars and gazetteers are shared as much as possible among the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats.
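The abstract does not say how the evaluation tool decides that two annotations partially match. As an illustration only, the following minimal sketch assumes annotations are character spans with a label, and calls a match "partial" when the labels agree and the spans overlap without being identical. The names `Span`, `overlap` and `match_kind` are hypothetical and are not taken from SProUT or the tool described.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A hypothetical annotation: character offsets plus an entity label."""
    start: int  # inclusive
    end: int    # exclusive
    label: str


def overlap(a: Span, b: Span) -> int:
    """Number of characters shared by the two spans (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def match_kind(gold: Span, pred: Span) -> str:
    """Classify a gold/predicted pair as 'exact', 'partial' or 'none'.

    Assumption: a label mismatch always counts as 'none', even when the
    boundaries agree. A real tool might score that case differently.
    """
    if gold.label != pred.label:
        return "none"
    if (gold.start, gold.end) == (pred.start, pred.end):
        return "exact"
    if overlap(gold, pred) > 0:
        return "partial"
    return "none"


if __name__ == "__main__":
    gold = Span(0, 13, "organization")
    pred = Span(0, 8, "organization")  # recognizer stopped too early
    print(match_kind(gold, pred))
```

Reporting exact and partial matches separately makes it possible to see whether a grammar finds the right entities but places their boundaries wrongly.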
Non-Standard and Minority Varieties as Community Languages in the UK: Towards a New Strategy for Language Maintenance
Supplementary schools (also referred to as complementary or Saturday schools) play a key role in teaching community heritage languages. In this way they contribute to strengthening awareness of cultural identity and confidence among pupils of migrant and minority backgrounds. The diaspora setting poses a number of challenges: parents and pupils expect supplementary schools to provide instruction in formal aspects of the heritage languages (reading and writing, and ‘correct’ grammar), but also to help develop competence in using the language in everyday settings, not least in order to enable intergenerational communication. Where the formal language differs from non-standard speech varieties (such as regional dialects), gaps may emerge between expectations and delivery. Most schools do not equip teachers to address such issues because the traditional curricula (including textbooks and teacher training packages that are often imported from the origin countries) fail to take them into consideration.
The paper draws on recent research by specialist sociolinguists working in various UK settings and on a discussion among researchers and practitioners that was hosted by the University of Westminster in April 2019, co-organised by the Multilingual Manchester research unit at the University of Manchester as part of the Multilingual Communities strand of the AHRC Open World Research Initiative consortium ‘Cross-Language Dynamics: Re-shaping Communities.’
Research has shown that teachers, parents and pupils attribute importance to the teaching of standard languages, not least as a way of gaining additional formal qualifications and increasing prospects of university admission and employment. However, pupils also show an interest in everyday speech varieties and often challenge the prevailing language ideologies that fail to recognise their importance in informal communication. Teachers tend to be aware of this tension but lack the training and resources to address it in the classroom.
The workshop findings suggest that failure to take non-standard speech varieties into consideration can discourage pupils from attending supplementary schools and so it also risks having an adverse effect on the transmission of standard heritage languages. Pupils’ motivation can be boosted if they are offered more tools and opportunities to communicate in everyday speech varieties. To that end, non-standard varieties must be valorised and teachers should be equipped with the skills to address language variation and pupils’ multilingual repertoires and to promote them as valuable communicative resources.
The paper recommends that supplementary schools should explore ways to take into account pupils’ multilingualism and use of non-standard varieties. Curricula should be adjusted to recognise non-standard varieties as valuable resources while continuing to teach the formal (standard) varieties. Teacher training modules should be designed that take pupils’ multilingual repertoires into account and equip teachers to understand and address sociolinguistic issues such as structural variation, multilingualism and language ideologies.
The paper also recommends public engagement to address the inequality that underpins the use of the terms ‘community’ versus ‘modern languages’, and calls for collaboration between mainstream (statutory) schools and supplementary schools when it comes to celebrating diversity in their pupils’ backgrounds. Academics should play a greater role in providing advice, support and training to practitioners. They should work with practitioners and stakeholders to raise public awareness of the contribution that supplementary schools make and to develop policies and pedagogical approaches to support them.
MAG: A Multilingual, Knowledge-base Agnostic and Deterministic Entity Linking Approach
Entity linking has recently been the subject of a significant body of
research. Currently, the best performing approaches rely on trained
mono-lingual models. Porting these approaches to other languages is
consequently a difficult endeavor as it requires corresponding training data
and retraining of the models. We address this drawback by presenting a novel
multilingual, knowledge-base agnostic and deterministic approach to entity
linking, dubbed MAG. MAG is based on a combination of context-based retrieval
on structured knowledge bases and graph algorithms. We evaluate MAG on 23 data
sets and in 7 languages. Our results show that the best approach trained on
English datasets (PBOH) achieves a micro F-measure that is up to 4 times worse
on datasets in other languages. MAG, on the other hand, achieves
state-of-the-art performance on English datasets and reaches a micro F-measure
that is up to 0.6 higher than that of PBOH on non-English languages.
Comment: Accepted in K-CAP 2017: Knowledge Capture Conference
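The abstract reports results as a micro F-measure. As a reminder of what that involves, the minimal sketch below pools true positives, false positives and false negatives over all documents before computing precision and recall, and contrasts this with a macro average of per-document scores. The counts are invented for illustration and are not results from the paper; the function names are hypothetical.

```python
from typing import Iterable, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from raw counts; 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_f(counts: Iterable[Counts]) -> float:
    """Micro average: sum the counts over all documents, then score once."""
    counts = list(counts)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f_measure(tp, fp, fn)


def macro_f(counts: Iterable[Counts]) -> float:
    """Macro average: score each document, then take the mean."""
    scores = [f_measure(*c) for c in counts]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Invented per-document counts, for illustration only.
    docs = [(8, 2, 2), (1, 1, 3)]
    print(round(micro_f(docs), 4), round(macro_f(docs), 4))
```

Because the micro average pools counts, documents with many entity mentions weigh more heavily than short ones, whereas the macro average treats every document equally.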
Non-native children speech recognition through transfer learning
This work deals with non-native children's speech and investigates both
multi-task and transfer learning approaches to adapt a multi-language Deep
Neural Network (DNN) to speakers, specifically children, learning a foreign
language. The application scenario is characterized by young students learning
English and German and reading sentences in these second languages, as well as
in their native language. The paper analyzes and discusses techniques for
training effective DNN-based acoustic models starting from children's native
speech and performing adaptation with limited non-native audio material. A
multi-lingual model is adopted as baseline, where a common phonetic lexicon,
defined in terms of the units of the International Phonetic Alphabet (IPA), is
shared across the three languages at hand (Italian, German and English); DNN
adaptation methods based on transfer learning are evaluated on significant
non-native evaluation sets. Results show that the resulting non-native models
allow a significant improvement with respect to a mono-lingual system adapted
to speakers of the target language.