27,126 research outputs found

    TectoMT – a deep-­linguistic core of the combined Chimera MT system

    Get PDF
    Chimera is a machine translation system that combines the TectoMT deep-linguistic core with phrase-based MT system Moses. For English–Czech pair it also uses the Depfix post-correction system. All the components run on Unix/Linux platform and are open source (available from Perl repository CPAN and the LINDAT/CLARIN repository). The main website is https://ufal.mff.cuni.cz/tectomt. The development is currently supported by the QTLeap 7th FP project (http://qtleap.eu)

    META-NET Strategic Research Agenda for Multilingual Europe 2020

    Get PDF
    In everyday communication, Europe’s citizens, business partners and politicians are inevitably confronted with language barriers. Language technology has the potential to overcome these barriers and to provide innovative interfaces to technologies and knowledge. This document presents a Strategic Research Agenda for Multilingual Europe 2020. The agenda was prepared by META-NET, a European Network of Excellence. META-NET consists of 60 research centres in 34 countries, who cooperate with stakeholders from economy, government agencies, research organisations, non-governmental organisations, language communities and European universities. META-NET’s vision is high-quality language technology for all European languages. “The research carried out in the area of language technology is of utmost importance for the consolidation of Portuguese as a language of global communication in the information society.” — Dr. Pedro Passos Coelho (Prime-Minister of Portugal) “It is imperative that language technologies for Slovene are developed systematically if we want Slovene to flourish also in the future digital world.” — Dr. Danilo Türk (President of the Republic of Slovenia) “For such small languages like Latvian keeping up with the ever increasing pace of time and technological development is crucial. The only way to ensure future existence of our language is to provide its users with equal opportunities as the users of larger languages enjoy. Therefore being on the forefront of modern technologies is our opportunity.” — Valdis Dombrovskis (Prime Minister of Latvia) “Europe’s inherent multilingualism and our scientific expertise are the perfect prerequisites for significantly advancing the challenge that language technology poses. META-NET opens up new opportunities for the development of ubiquitous multilingual technologies.” — Prof. Dr. Annette Schavan (German Minister of Education and Research

    Efficient deep processing of japanese

    Get PDF
    We present a broad coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is created for use in real world applications, such that robustness and performance issues play an important role. It is connected to a POS tagging and word segmentation tool. This grammar is being developed in a multilingual context, requiring MRS structures that are easily comparable across languages

    The dawn of the human-machine era: a forecast of new and emerging language technologies

    Get PDF
    New language technologies are coming, thanks to the huge and competing private investment fuelling rapid progress; we can either understand and foresee their effects, or be taken by surprise and spend our time trying to catch up. This report scketches out some transformative new technologies that are likely to fundamentally change our use of language. Some of these may feel unrealistically futuristic or far-fetched, but a central purpose of this report - and the wider LITHME network - is to illustrate that these are mostly just the logical development and maturation of technologies currently in prototype. But will everyone benefit from all these shiny new gadgets? Throughout this report we emphasise a range of groups who will be disadvantaged and issues of inequality. Important issues of security and privacy will accompany new language technologies. A further caution is to re-emphasise the current limitations of AI. Looking ahead, we see many intriguing opportunities and new capabilities, but a range of other uncertainties and inequalities. New devices will enable new ways to talk, to translate, to remember, and to learn. But advances in technology will reproduce existing inequalities among those who cannot afford these devices, among the world's smaller languages, and especially for sign language. Debates over privacy and security will flare and crackle with every new immersive gadget. We will move together into this curious new world with a mix of excitement and apprehension - reacting, debating, sharing and disagreeing as we always do. Plug in, as the human-machine era dawn

    The Role of Emotional and Facial Expression in Synthesised Sign Language Avatars

    Get PDF
    This thesis explores the role that underlying emotional facial expressions might have in regards to understandability in sign language avatars. Focusing specifically on Irish Sign Language (ISL), we examine the Deaf community’s requirement for a visual-gestural language as well as some linguistic attributes of ISL which we consider fundamental to this research. Unlike spoken language, visual-gestural languages such as ISL have no standard written representation. Given this, we compare current methods of written representation for signed languages as we consider: which, if any, is the most suitable transcription method for the medical receptionist dialogue corpus. A growing body of work is emerging from the field of sign language avatar synthesis. These works are now at a point where they can benefit greatly from introducing methods currently used in the field of humanoid animation and, more specifically, the application of morphs to represent facial expression. The hypothesis underpinning this research is: augmenting an existing avatar (eSIGN) with various combinations of the 7 widely accepted universal emotions identified by Ekman (1999) to deliver underlying facial expressions, will make that avatar more human-like. This research accepts as true that this is a factor in improving usability and understandability for ISL users. Using human evaluation methods (Huenerfauth, et al., 2008) the research compares an augmented set of avatar utterances against a baseline set with regards to 2 key areas: comprehension and naturalness of facial configuration. We outline our approach to the evaluation including our choice of ISL participants, interview environment, and evaluation methodology. Remarkably, the results of this manual evaluation show that there was very little difference between the comprehension scores of the baseline avatars and those augmented withEFEs. However, after comparing the comprehension results for the synthetic human avatar “Anna” against the caricature type avatar “Luna”, the synthetic human avatar Anna was the clear winner. The qualitative feedback allowed us an insight into why comprehension scores were not higher in each avatar and we feel that this feedback will be invaluable to the research community in the future development of sign language avatars. Other questions asked in the evaluation focused on sign language avatar technology in a more general manner. Significantly, participant feedback in regard to these questions indicates a rise in the level of literacy amongst Deaf adults as a result of mobile technology

    Proceedings of the COLING 2004 Post Conference Workshop on Multilingual Linguistic Ressources MLR2004

    No full text
    International audienceIn an ever expanding information society, most information systems are now facing the "multilingual challenge". Multilingual language resources play an essential role in modern information systems. Such resources need to provide information on many languages in a common framework and should be (re)usable in many applications (for automatic or human use). Many centres have been involved in national and international projects dedicated to building har- monised language resources and creating expertise in the maintenance and further development of standardised linguistic data. These resources include dictionaries, lexicons, thesauri, word-nets, and annotated corpora developed along the lines of best practices and recommendations. However, since the late 90's, most efforts in scaling up these resources remain the responsibility of the local authorities, usually, with very low funding (if any) and few opportunities for academic recognition of this work. Hence, it is not surprising that many of the resource holders and developers have become reluctant to give free access to the latest versions of their resources, and their actual status is therefore currently rather unclear. The goal of this workshop is to study problems involved in the development, management and reuse of lexical resources in a multilingual context. Moreover, this workshop provides a forum for reviewing the present state of language resources. The workshop is meant to bring to the international community qualitative and quantitative information about the most recent developments in the area of linguistic resources and their use in applications. The impressive number of submissions (38) to this workshop and in other workshops and conferences dedicated to similar topics proves that dealing with multilingual linguistic ressources has become a very hot problem in the Natural Language Processing community. To cope with the number of submissions, the workshop organising committee decided to accept 16 papers from 10 countries based on the reviewers' recommendations. Six of these papers will be presented in a poster session. The papers constitute a representative selection of current trends in research on Multilingual Language Resources, such as multilingual aligned corpora, bilingual and multilingual lexicons, and multilingual speech resources. The papers also represent a characteristic set of approaches to the development of multilingual language resources, such as automatic extraction of information from corpora, combination and re-use of existing resources, online collaborative development of multilingual lexicons, and use of the Web as a multilingual language resource. The development and management of multilingual language resources is a long-term activity in which collaboration among researchers is essential. We hope that this workshop will gather many researchers involved in such developments and will give them the opportunity to discuss, exchange, compare their approaches and strengthen their collaborations in the field. The organisation of this workshop would have been impossible without the hard work of the program committee who managed to provide accurate reviews on time, on a rather tight schedule. We would also like to thank the Coling 2004 organising committee that made this workshop possible. Finally, we hope that this workshop will yield fruitful results for all participants

    A rule-based translation from written Spanish to Spanish Sign Language glosses

    Full text link
    This is the author’s version of a work that was accepted for publication in Computer Speech and Language. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Speech and Language, 28, 3 (2015) DOI: 10.1016/j.csl.2013.10.003One of the aims of Assistive Technologies is to help people with disabilities to communicate with others and to provide means of access to information. As an aid to Deaf people, we present in this work a production-quality rule-based machine system for translating from Spanish to Spanish Sign Language (LSE) glosses, which is a necessary precursor to building a full machine translation system that eventually produces animation output. The system implements a transfer-based architecture from the syntactic functions of dependency analyses. A sketch of LSE is also presented. Several topics regarding translation to sign languages are addressed: the lexical gap, the bootstrapping of a bilingual lexicon, the generation of word order for topic-oriented languages, and the treatment of classifier predicates and classifier names. The system has been evaluated with an open-domain testbed, reporting a 0.30 BLEU (BiLingual Evaluation Understudy) and 42% TER (Translation Error Rate). These results show consistent improvements over a statistical machine translation baseline, and some improvements over the same system preserving the word order in the source sentence. Finally, the linguistic analysis of errors has identified some differences due to a certain degree of structural variation in LSE
    • …
    corecore