3,291 research outputs found

Cross-language frame semantics transfer in bilingual corpora

Authors: A. Moschitti; C.J. Fillmore; D. Gildea; L. Heyer; M. Palmer; T. Landauer
Publication venue: Springer-Verlag
Publication date: 01/01/2009

Abstract. Recent work on the transfer of semantic information across languages has been applied to the development of resources annotated with Frame information for several non-English European languages. These works assume that parallel corpora annotated for English can be used to transfer the semantic information to the target languages. This paper presents a robust method based on a statistical machine translation step augmented with simple rule-based post-processing. It alleviates problems related to preprocessing errors and the complex optimization required by syntax-dependent models of the cross-lingual mapping. Different alignment strategies are investigated on the Europarl corpus. Results suggest that the quality of the derived annotations is surprisingly good and well suited for training semantic role labeling systems.

CiteSeerX

Crossref

ART

An analysis of The Oxford Guide to practical lexicography (Atkins and Rundell 2008)

Author: de Schryver Gilles-Maurice
Publication date: 01/01/2008

For at least a decade, the lexicographic community at large has been demanding that a modern textbook be designed, one that would place corpora at the centre of the lexicographic enterprise. Written by two of the most respected practising lexicographers, this book has finally arrived, and delivers on very many levels. This review article presents a critical analysis of its features.

Ghent University Academic Bibliography

Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation

Authors: Korhonen Anna; Mrkšić Nikola; Vulić Ivan
Publication date: 01/01/2017

Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines. In this work, we propose a novel cross-lingual transfer method for inducing VerbNets for multiple languages. To the best of our knowledge, this is the first study which demonstrates how the architectures for learning word embeddings can be applied to this challenging syntactic-semantic task. Our method uses cross-lingual translation pairs to tie each of the six target languages into a bilingual vector space with English, jointly specialising the representations to encode the relational information from English VerbNet. A standard clustering algorithm is then run on top of the VerbNet-specialised representations, using vector dimensions as features for learning verb classes. Our results show that the proposed cross-lingual transfer approach sets new state-of-the-art verb classification performance across all six target languages explored in this work. Comment: EMNLP 2017 (long paper)

arXiv.org e-Print Archive

Crossref

Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus

Authors: Fei Hao; Ji Donghong; Zhang Meishan
Publication date: 01/01/2020

Much research effort is devoted to semantic role labeling (SRL), which is crucial for natural language understanding. Supervised approaches have achieved impressive performance when large-scale corpora are available, as they are for resource-rich languages such as English. For low-resource languages with no annotated SRL dataset, however, it is still challenging to obtain competitive performance. Cross-lingual SRL is one promising way to address the problem, and it has advanced considerably with the help of model transfer and annotation projection. In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. Experimental results on Universal Proposition Bank show that the translation-based method is highly effective, and that the automatic pseudo datasets can significantly improve target-language SRL performance. Comment: Accepted at ACL 202

arXiv.org e-Print Archive

Crossref

Learning Bilingual Word Representations by Marginalizing Alignments

Authors: Blunsom Phil; Hermann Karl Moritz; Kočiský Tomáš
Publication date: 01/01/2014

We present a probabilistic model that simultaneously learns alignments and distributed representations for bilingual data. By marginalizing over word alignments, the model captures a larger semantic context than prior work relying on hard alignments. The advantage of this approach is demonstrated in a cross-lingual classification task, where we outperform the prior published state of the art. Comment: Proceedings of ACL 2014 (Short Papers)

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Promoting interdisciplinarity in Greek-English lexicography

Author: Dalpanagioti Thomai
Publication venue: Selected papers on theoretical and applied linguistics
Publication date: 24/07/2019

Modern bilingual lexicography lies at the crossroads between linguistic theory, translation, language technology (related to corpora, databases and delivery media), and user needs considerations. It is the interplay of these factors involved in the route from the raw language data to the finished dictionary that motivates this paper. Promising theoretical perspectives such as frame semantics, the cognitive theory of metaphor and metonymy, and the contextual theory of meaning are combined with corpus methodology in compiling a production-oriented Greek-English entry for the verb περπατάω (‘walk’).

Aristotle University of Thessaloniki: Open Journals / ΑΡΙΣΤΟΤΕΛΕΙΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΘΕΣΣΑΛΟΝΙΚΗΣ