1,518 research outputs found

    CLARIN. The infrastructure for language resources

    CLARIN, the "Common Language Resources and Technology Infrastructure", has established itself as a major player in the field of research infrastructures for the humanities. This volume provides a comprehensive overview of the organization, its members, its goals and its functioning, as well as of the tools and resources hosted by the infrastructure. The many contributors representing various fields, from computer science to law to psychology, analyse a wide range of topics, such as the technology behind the CLARIN infrastructure, the use of CLARIN resources in diverse research projects, the achievements of selected national CLARIN consortia, and the challenges that CLARIN has faced and will face in the future. The book will be published in 2022, 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium by the European Commission (Decision 2012/136/EU)

    CLARIN

    The book provides a comprehensive overview of the Common Language Resources and Technology Infrastructure – CLARIN – for the humanities. It covers a broad range of CLARIN language resources and services, its underlying technological infrastructure, the achievements of national consortia, and challenges that CLARIN will tackle in the future. The book is published 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium

    Getting Past the Language Gap: Innovations in Machine Translation

    In this chapter, we review state-of-the-art machine translation systems and discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful, complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods were introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, advances in the related field of computational linguistics, which made off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge, though many advanced research groups are currently pursuing ways to meet it head-on. The next generation of MT will consist of a collection of hybrid systems. This also augurs well for the mobile environment, as we look forward to more advanced and improved technologies, i.e. speech recognition and speech synthesis, that enable Speech-To-Speech machine translation on hand-held devices. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT

    (Im)politeness in email communication: how English speakers and Chinese speakers negotiate meanings and develop intercultural (mis)understandings

    This thesis looks at the way in which Chinese and English speakers employ (im)politeness strategies in their emails to develop intercultural understanding. From a theoretical perspective, this thesis contributes to the discussions of intercultural communication in relation to the negotiation of (im)politeness meaning. From a pedagogic perspective, the thesis reveals the potential for using email to experience culture as a process of meaning negotiation and construction and has relevance to teachers of EFL. Ethnographically-informed discourse analysis is employed to investigate discursively the negotiation of meaning in email interaction. The interplay between computer-mediated communication, speech acts and (im)politeness is explored by using the analytical frameworks of Hymes’ ethnography of communication, Searle’s speech act theory (1969) and Brown and Levinson’s politeness theory (1987). This research shows that ‘(im)politeness’ is not a stable construct. Rather, it is constantly (re)negotiated by the interactants, who take into account the relevant contextualisation cues. It finds that the functions and (im)politeness meanings of speech acts can vary from situation to situation. In addition, this research finds that computer-mediated paralanguage, such as emoticons and written-out laughter, is also important in realising (im)politeness intent and developing intercultural understanding in emails

    Re-examining Phonological and Lexical Correlates of Second Language Comprehensibility: The Role of Rater Experience

    Few researchers and teachers would disagree that some linguistic aspects of second language (L2) speech are more crucial than others for successful communication. Underlying this idea is the assumption that communicative success can be broadly defined in terms of speakers’ ability to convey the intended meaning to the interlocutor, which is frequently captured through a listener-based rating of comprehensibility or ease of understanding (e.g. Derwing & Munro, 2009; Levis, 2005). Previous research has shown that communicative success – for example, as defined through comprehensible L2 speech – depends on several linguistic dimensions of L2 output, including its segmental and suprasegmental pronunciation, fluency-based characteristics, lexical and grammatical content, as well as discourse structure (e.g. Field, 2005; Hahn, 2004; Kang et al., 2010; Trofimovich & Isaacs, 2012). Our chief objective in the current study was to explore the L2 comprehensibility construct from a language assessment perspective (e.g. Isaacs & Thomson, 2013), by targeting rater experience as a possible source of variance influencing the degree to which raters use various characteristics of speech in judging L2 comprehensibility. In keeping with this objective, we asked the following question: To what extent do the linguistic aspects of L2 speech that contribute to comprehensibility ratings depend on raters’ experience?

    Recent Trends in Computational Intelligence

    Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically-inspired technologies, such as swarm intelligence as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the use of CI for solving various applications optimally, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies makes for a strong and reliable prediction tool for handling real-life applications

    Language technologies for a multilingual Europe

    This volume of the series “Translation and Multilingual Natural Language Processing” includes most of the papers presented at the Workshop “Language Technology for a Multilingual Europe”, held at the University of Hamburg on September 27, 2011 in the framework of the conference GSCL 2011 with the topic “Multilingual Resources and Multilingual Applications”, along with several additional contributions. In addition to an overview article on Machine Translation and two contributions on the European initiatives META-NET and Multilingual Web, the volume includes six full research articles. Our intention with this workshop was to bring together various groups concerned with the umbrella topics of multilingualism and language technology, especially multilingual technologies. This encompassed, on the one hand, representatives from research and development in the field of language technologies, and, on the other hand, users from diverse areas such as, among others, industry, administration and funding agencies. The Workshop “Language Technology for a Multilingual Europe” was co-organised by the two GSCL working groups “Text Technology” and “Machine Translation” (http://gscl.info) as well as by META-NET (http://www.meta-net.eu)

    Enhancing Listening and Spoken Skills in Spanish Connected Speech for Anglophones

    Native speech is directed towards native listeners, not designed for comprehension and analysis by language learners. Speed of delivery, or economy of effort, produces a speech signal to which the native listener can assign the correct words. There are no discrete words in the speech signal itself; therefore, there is often a linguistic barrier in dealing with the local spoken language. The creation, development and application of the Dynamic Spanish Speech Corpus (DSSC) facilitated an empirically-based appreciation of speaking speed and prosody as obstacles to intelligibility for learners of Spanish. “Duologues” are natural, relaxed dialogues recorded in such a manner that each interlocutor’s performance can be studied in isolation, thus avoiding problems normally caused by cross-talk and back-channelling. They made possible the identification of the key phonetic features of informal native-native dialogue and, ultimately, the creation of high-quality assets/research data based on natural (unscripted) dialogues recorded at industry audio standards. These assets were used in this study, which involved documenting productive and receptive intelligibility problems when L2 users are exposed to the Spanish speech of native speakers. The aim was to observe where intelligibility problems occur and to determine the reasons for them, based on effects of the first language of the subjects and other criteria, such as number of years learning/using Spanish, previous exposure to spoken Spanish and gender. This was achieved by playing recorded extracts/snippets from the DSSC to which a time-scaling tool was applied

    Proceedings of the VIIth GSCP International Conference

    The 7th International Conference of the Gruppo di Studi sulla Comunicazione Parlata, dedicated to the memory of Claire Blanche-Benveniste, chose as its main theme Speech and Corpora. The wide international origin of the 235 authors from 21 countries and 95 institutions led to papers on many different languages. The 89 papers of this volume reflect the themes of the conference: spoken corpora compilation and annotation, with the connected technological fields; the relation between prosody and pragmatics; speech pathologies; and various papers on phonetics, speech and linguistic analysis, pragmatics and sociolinguistics. Many papers are also dedicated to speech and second language studies. The online publication with FUP allows direct access to sound and video linked to papers (when downloaded)