Multilingual Neural Machine Translation System for Indic to Indic Languages
This paper gives an Indic-to-Indic (IL-IL) MNMT baseline model for 11 ILs
implemented on the Samanantar corpus and analyzed on the Flores-200 corpus. All
the models are evaluated using the BLEU score. In addition, the languages are
classified under three groups, namely East Indo-Aryan (EI), Dravidian (DR), and
West Indo-Aryan (WI). The effect of language relatedness on MNMT model
efficiency is studied. Owing to the presence of large corpora from English (EN)
to ILs, MNMT IL-IL models using EN as a pivot are also built and examined. To
achieve this, English-Indic (EN-IL) models are also developed, with and
without the usage of related languages. Results reveal that using related
languages is beneficial for the WI group only, while it is detrimental for the
EI group and shows an inconclusive effect on the DR group, but it is useful for
EN-IL models. Thus, related language groups are used to develop pivot MNMT
models. Furthermore, the IL corpora are transliterated from the corresponding
scripts to a modified ITRANS script, and the best MNMT models from the previous
approaches are built on the transliterated corpus. It is observed that the
usage of pivot models greatly improves MNMT baselines with AS-TA achieving the
minimum BLEU score and PA-HI achieving the maximum score. Among languages, AS,
ML, and TA achieve the lowest BLEU score, whereas HI, PA, and GU perform the
best. Transliteration also helps the models with few exceptions. The best
increment of scores is observed in ML, TA, and BN and the worst average
increment is observed in KN, HI, and PA, across all languages. The best model
obtained is the PA-HI language pair trained on PAWI transliterated corpus which
gives 24.29 BLEU. Comment: 38 pages, 2 figures
On the scope of the referential hierarchy in the typology of grammatical relations
In the late seventies, Bernard Comrie was one of the first linguists to explore the effects of the referential hierarchy (RH) on the distribution of grammatical relations (GRs). The referential hierarchy is also known in the literature as the animacy, empathy or indexability hierarchy and ranks speech act participants (i.e. first and second person) above third persons, animates above inanimates, or more topical referents above less topical referents. Depending on the language, the hierarchy is sometimes extended by analogy to rankings of possessors above possessees, singulars above plurals, or other notions. In his 1981 textbook, Comrie analyzed RH effects as explaining (a) differential case (or adposition) marking of transitive subject ("A") noun phrases in low RH positions (e.g. inanimate or third person) and of object ("P") noun phrases in high RH positions (e.g. animate or first or second person), and (b) hierarchical verb agreement coupled with a direct vs. inverse distinction, as in Algonquian (Comrie 1981: Chapter 6).
Exploring SL Writing and SL Sensitivity during Writing Tasks : poor and advanced writing in a context of second language other than English
This study is part of a larger empirical research project that examines second language (SL) learners' profiles and valid procedures for complete, diagnostic assessment in schools. A total of 102 learners of Portuguese as an SL, aged 7 to 17 years and speaking distinct home languages, were assessed in several linguistic tasks. In this article, we focus on writing performance in the specific task of narrative essay composition. The written outputs were scored on six components adapted from an English SL assessment context (Alberta Education): linguistic vocabulary, grammar, syntax, strategy, socio-linguistic, and discourse. The writing processes and strategies in Portuguese used by different immigrant students were analysed to determine the features and diversity of deficits in authentic texts produced by SL writers. Differentiated performance was based on the diversity of the following variables: grades, previous schooling, home language, instruction in first language, and exposure to Portuguese as a second language. Speakers of Indo-Aryan languages showed low writing scores compared to their peers, and the type of language and respective cognitive mapping (such as Mandarin and Arabic) was the predictor, not linguistic distance. Home language instruction should also be prominently considered in further research to understand the specificities of the cognitive academic profile in a Romance-language learning context. Additionally, this study examined teachers' representations, which are addressed here to understand the educational implications of second language teaching for the psychological distress of different minorities in schools of specific host countries.
Second language education context and home language effect: language dissimilarities and variation in immigrant students' outcomes.
Heritage language speakers struggle in European classrooms with
insufficient material provided for second language (SL) learning
and assessment. Considering the number of instruments and
pertinent studies in English SL, immigrant students are better
prepared than their peers in Romance language settings. This
study investigates how factors such as age and home language
can be used in the teaching environment to predict and examine
the development outcomes of SL students in verbal reasoning
and vocabulary tasks. One hundred and six Portuguese participants, SL
learners, between 8 and 17 years old, were assessed in vocabulary
frequency, verbal analogies and morphological extraction tasks. In
alphabetic languages (Romance languages), immigrant students
(in a SL learning situation) with a strong linguistic distance (a
home language with a very different orthographic foundation) are
expected to struggle in language learning in spite of being aware
of strategies that can improve their skills. The storage and
combination of morphemes can be a demanding task for
individual speakers at different levels. Cognitive mapping is
strongly based on linguistic features of L1 development. Results
show that home language, not age, was a significant predictor of
variation in students' outcomes. Speakers of alphasyllabary
languages (Indo-Aryan languages as L1) were the poorest
performers, the 'linguistic distance' of their languages explaining
the performance results.
In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology
This paper investigates the ability of neural network architectures to
effectively learn diachronic phonological generalizations in a multilingual
setting. We employ models using three different types of language embedding
(dense, sigmoid, and straight-through). We find that the Straight-Through model
outperforms the other two in terms of accuracy, but the Sigmoid model's
language embeddings show the strongest agreement with the traditional
subgrouping of the Slavic languages. We find that the Straight-Through model
has learned coherent, semi-interpretable information about sound change, and
outline directions for future research.
Etyma for 'chicken', 'duck', and 'goose' among language phyla in China and Southeast Asia
This paper considers the history of words for domesticated poultry, including 'chicken',
'goose', and 'duck', in China and mainland Southeast Asia to try to relate associated
domestication events with specific language groups. Linguistic, archaeological and historical
evidence supports Sinitic as one linguistic source, but in other cases, Tai and Austroasiatic
form additional centers of lexical forms which were borrowed by neighboring phyla. It is
hypothesized that these geographic regions of etyma for domesticated birds may represent
instances of bird domestication, or possibly advances in bird husbandry, by speech communities
in the region in the Neolithic Era, followed by spread of both words and cultural practices.
Comparative philology, French music, and the composition of Indo-Europeanism from Fétis to Messiaen.
This thesis argues that the disciplines of comparative philology and linguistics exerted significant force on the priorities and techniques of musicologists and composers in fin-de-siècle France, and examines how ideologies of Indo-Europeanism (or aryanism), concomitant with comparative philology, generated efforts to 'sound out' Indo-Europeanism in music. Using a relational approach, dense interdisciplinary networks of philologists/linguists, musicologists, and composers are reconstructed to demonstrate how musicological appropriations of linguistic research reverberated in musical composition right through the 1950s. These contexts reveal how wide-ranging repertories emerged from ethnic-nationalist projects of reclaiming Indo-European 'patrimony'.
The thesis is in two Parts. Part I, 'Philologie comparée, musicologie, and Indo-European hypotheses', is organised around four overlapping intellectual networks comprising comparative philologists and musicologists. Francophone musicologists' efforts to model their discipline on that of comparative philology are surveyed. Scholars discussed include Fétis, Gevaert, Bourgault-Ducoudray, Burnouf, Meillet, Aubry, Emmanuel, and Grosset. Arguments concerning the place of music between concepts of 'language' and 'race' are retraced, with special attention paid to musicologists' efforts to pinpoint quasi-morphological 'Indo-European' musical structures, in particular 'modes' and 'metres', construed as 'essential' and 'ancestral'.
Part II, 'Composing with philology: performances of authenticity and innovation', describes how the intellectual project elaborated in Part I infiltrated compositional practices. Close musical and paratextual readings show how composers legitimated experimentalism through 'performances' of philological 'authenticity'. Over time, musical parameters such as modes and metres are abstracted and assimilated into compositional lexicons. Composers discussed include Bourgault-Ducoudray, Saint-Saëns, Séverac, Roussel, and Emmanuel. This root system flourishes in the music of Olivier Messiaen, whose rhythmic technique is revisited in light of manuscript materials. From his borrowings of early Indian metres (deśītālas) through his hyperformalist 'Mode de valeurs et d'intensités', Messiaen's rhythmic style is radically reinterpreted as a logical extension of francophone musicology's disciplinary and epistemological inheritance from comparative philology. Gates Cambridge Scholarship