6,265 research outputs found

Identifying cognates in English-Dutch and French-Dutch by means of orthographic information and cross-lingual word embeddings

Author: Labat Sofie
Lefever Els
Singh Pranaydeep
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2020

Between the historical languages and the reconstructed language : an alternative approach to the Gerundive + “Dative of Agent” construction in Indo-European

Author: Barddal Johanna
Danesi Serena
Johnson Cynthia A
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2017

It is argued by Hettrich (1990) that the “dative of agent” construction in the Indo-European languages most likely continues a construction inherited from Proto-Indo-European. In two recent proposals (Danesi 2013, Luraghi 2016), it is argued that the “dative of agent” contains no agent at all, although the two proposals differ with regard to the reconstructability of the “dative of agent” construction. Luraghi argues that it is an independent secondary development from an original beneficiary function (cf. Hettrich 1990), while Danesi maintains that the construction is reconstructable for an earlier proto-stage. Elaborating on Danesi’s approach, we analyze gerundives with the “dative of agent” in six different Indo-European languages that bridge the east–west divide, namely, Sanskrit, Avestan, Ancient Greek, Latin, Tocharian, and Lithuanian. Scrutiny of the data reveals similarities at a morphosyntactic level, a semantic level (i.e. modal meaning and low degree of transitivity), and also, to some extent, at an etymological level. An analysis involving a modal reading of the predicate, with a dative subject and a nominative object, is better equipped to account for the particulars of the “gerundive + nominative + dative” construction than the traditional agentive/passive analysis. The proposal is couched within the theoretical framework of Construction Grammar, in which the basic unit of language is the Construction, i.e. a form–function correspondence, and no principled distinction between lexical items and complex syntactic structures is assumed. As these structures are by definition units of comparanda, required by the Comparative Method, they can be successfully utilized in the reconstruction of a proto-construction for Proto-Indo-European

Ghent University Academic Bibliography

The Adaptation Features and Text Reconstruction in Translating Grand Canal Poems into English

Author: Fan Kai-Fang
Publication venue: 'Scholink Co, Ltd.'
Publication date: 23/08/2022

The English and Chinese versions of the Grand Canal poems have different adaptation features, which influence the text reconstruction of the English versions. This study analyzes the pragmatic adaptation features of the English versions in terms of linguistic choices and non-linguistic choices, with representative works of the English and Chinese versions of the Grand Canal poems as the corpus. The study finds that at the level of linguistic choices, the English versions are, to a large extent, adaptive to the current communicative purpose of the Grand Canal culture expressed in the original Chinese versions, but at the level of non-linguistic choices, they are adaptive to that purpose only to a small extent. Therefore, the practice of translating Grand Canal poems into English needs to be improved in terms of linguistic choices at the level of vocabulary, syntax and rhetoric, as well as non-linguistic choices at the level of state of mind, moods and cultural image, so as to reconstruct English versions that adapt adequately and fully to the pragmatic features of the poems expressed in the Chinese versions and to the needs of target readers of the English versions

Scholink Journals

BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings

Author: Su Jinsong
Xiong Deyi
Zhang Biao
Publication date: 24/11/2016

In this paper, we propose a bidimensional attention-based recursive autoencoder (BattRAE) to integrate clues and source-target interactions at multiple levels of granularity into bilingual phrase representations. We employ recursive autoencoders to generate tree structures of phrases with embeddings at different levels of granularity (e.g., words, sub-phrases and phrases). Over these embeddings on the source and target side, we introduce a bidimensional attention network to learn their interactions encoded in a bidimensional attention matrix, from which we extract two soft attention weight distributions simultaneously. These weight distributions enable BattRAE to generate compositive phrase representations via convolution. Based on the learned phrase representations, we further use a bilinear neural model, trained via a max-margin method, to measure bilingual semantic similarity. To evaluate the effectiveness of BattRAE, we incorporate this semantic similarity as an additional feature into a state-of-the-art SMT system. Extensive experiments on NIST Chinese-English test sets show that our model achieves a substantial improvement of up to 1.63 BLEU points on average over the baseline. Comment: 7 pages, accepted by AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval

Author: Kraaij Wessel
Nie Jian-Yun
Simard Michel
Publication date: 01/01/2003

Although more and more language pairs are covered by machine translation services, there are still many pairs that lack translation resources. Cross-language information retrieval (CLIR) is an application which needs translation functionality of a relatively low level of sophistication since current models for information retrieval (IR) are still based on a bag-of-words. The Web provides a vast resource for the automatic construction of parallel corpora which can be used to train statistical translation models automatically. The resulting translation models can be embedded in several ways in a retrieval model. In this paper, we will investigate the problem of automatically mining parallel texts from the Web and different ways of integrating the translation models within the retrieval process. Our experiments on standard test collections for CLIR show that the Web-based translation models can surpass commercial MT systems in CLIR tasks. These results open the perspective of constructing a fully automatic query translation device for CLIR at a very low cost. Comment: 37 page

arXiv.org e-Print Archive

CiteSeerX

Leiden University Scholarly Publications

Reconstructing Syntax

Publication venue: 'Brill'
Publication date: 07/04/2022

Contributing to the vigorous discussion of the viability of syntactic reconstruction, this volume offers methods for identifying i) cognates in syntax, and ii) the directionality of syntactic change, thus providing historical syntacticians with evidence that syntactic reconstruction is indeed both theoretically and practically feasible. Readership: This volume is of interest to all historical syntacticians and historical linguists, as well as to specialists within Indo-European, Semitic, Austronesian and Native American languages

Directory of Open Access Books (DOAB)

Computational phylogenetics and the classification of South American languages

Author: Chousou‐Polydouri Natalia
Michael Lev
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

In recent years, South Americanist linguists have embraced computational phylogenetic methods to resolve the numerous outstanding questions about the genealogical relationships among the languages of the continent. We provide a critical review of the methods and language classification results that have accumulated thus far, emphasizing the superiority of character-based methods over distance-based ones and the importance of developing adequate comparative datasets for producing well-resolved classifications

Crossref

eScholarship - University of California

ZORA

A viewing and processing tool for the analysis of a comparable corpus of Kiranti mythology

Author: Guillaume Séverine
Lahaussois Aimée
Publication venue: HAL CCSD
Publication date: 01/01/2012

This presentation describes a trilingual corpus of three endangered languages of the Kiranti group (Tibeto-Burman family) from Eastern Nepal. The languages, which are exclusively oral, share a rich mythology, and it is thus possible to build a corpus of the same native narrative material in the three languages. The segments of similar semantic content are tagged with a "similarity" label to identify correspondences among the three language versions of the story. An interface has been developed to allow these similarities to be viewed together, making it possible to compare the different lexical and morphosyntactic features of each language. A concordancer makes it possible to see the various occurrences of words or glosses, and to further compare and contrast the languages

Hal-Diderot