
    OpenAdaptxt: an open source enabling technology for high quality text entry

    Modern text entry systems, especially for touchscreen phones and novel devices, rely on complex underlying technologies such as error correction and word suggestion. Furthermore, for global deployment a vast number of languages have to be supported. Together, this has raised the bar for new text entry techniques, making development and testing a longer process and thus stifling innovation. For example, testing a new feedback mechanism against a stock keyboard now requires researchers to support at least slip correction and probably word suggestion as well. This paper introduces OpenAdaptxt: an open-source, community-driven text input platform intended to enable the development of higher-quality text input solutions. It is the first commercial-grade open-source enabling technology for modern text entry that supports multiple platforms and provides dictionaries for over 50 spoken languages.
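
    The "slip correction" and "word suggestion" features mentioned above can be illustrated with a small sketch. The snippet below is not OpenAdaptxt code; it is a minimal, assumed example of dictionary-driven suggestion (prefix completion) and slip correction (closest-match lookup), with a toy dictionary chosen purely for illustration.

```python
# Minimal illustrative sketch, not OpenAdaptxt's implementation: a toy
# dictionary plus two helpers of the kind the abstract describes.
from difflib import get_close_matches

DICTIONARY = ["the", "there", "their", "they", "hello", "help", "held"]  # toy data

def suggest(prefix, dictionary=DICTIONARY, limit=3):
    """Word suggestion: complete the current prefix from the dictionary."""
    return [w for w in dictionary if w.startswith(prefix)][:limit]

def correct_slip(typed, dictionary=DICTIONARY, limit=3):
    """Slip correction: propose dictionary words close to a mistyped token."""
    return get_close_matches(typed, dictionary, n=limit, cutoff=0.6)

print(suggest("he"))          # -> ['hello', 'help', 'held']
print(correct_slip("thier"))  # close matches, including 'their' and 'there'
```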

    The new accent technologies: recognition, measurement and manipulation of accented speech


    Using Poetry to Celebrate Students' Diverse Perspectives and Languages


    Referenceless Quality Estimation for Natural Language Generation

    Traditional automatic evaluation measures for natural language generation (NLG) use costly human-authored references to estimate the quality of a system output. In this paper, we propose a referenceless quality estimation (QE) approach based on recurrent neural networks, which predicts a quality score for an NLG system output by comparing it to the source meaning representation only. Our method outperforms traditional metrics and a constant baseline in most respects; we also show that synthetic data helps to increase correlation results by 21% compared to the base system. Our results are comparable to those obtained in similar QE tasks despite the more challenging setting. Comment: Accepted as a regular paper to the 1st Workshop on Learning to Generate Natural Language (LGNL), Sydney, 10 August 201
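
    A referenceless QE model of the general kind described above can be sketched as two recurrent encoders, one for the meaning representation (MR) and one for the system output, feeding a small regressor that predicts a quality score. The PyTorch sketch below is an assumed illustration of that architecture, not the paper's model; all layer sizes, names and the training signal are hypothetical.

```python
# Assumed sketch of a referenceless QE architecture: encode MR and output
# with GRUs, concatenate the final states, and regress a quality score.
import torch
import torch.nn as nn

class ReferencelessQE(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.mr_enc = nn.GRU(emb_dim, hidden, batch_first=True)
        self.out_enc = nn.GRU(emb_dim, hidden, batch_first=True)
        self.scorer = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, mr_ids, out_ids):
        # The final hidden state of each encoder summarises its sequence.
        _, h_mr = self.mr_enc(self.embed(mr_ids))
        _, h_out = self.out_enc(self.embed(out_ids))
        joint = torch.cat([h_mr[-1], h_out[-1]], dim=-1)
        return self.scorer(joint).squeeze(-1)  # predicted quality score

# Usage sketch: token ids for one MR/output pair; in practice the score
# would be trained (e.g. with MSE) against human quality ratings.
model = ReferencelessQE(vocab_size=1000)
mr = torch.randint(1, 1000, (1, 12))
out = torch.randint(1, 1000, (1, 20))
score = model(mr, out)
```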

    Proceedings of the COLING 2004 Post-Conference Workshop on Multilingual Linguistic Resources (MLR2004)

    In an ever-expanding information society, most information systems are now facing the "multilingual challenge". Multilingual language resources play an essential role in modern information systems. Such resources need to provide information on many languages in a common framework and should be (re)usable in many applications (for automatic or human use). Many centres have been involved in national and international projects dedicated to building harmonised language resources and creating expertise in the maintenance and further development of standardised linguistic data. These resources include dictionaries, lexicons, thesauri, wordnets, and annotated corpora developed along the lines of best practices and recommendations. However, since the late '90s, most efforts in scaling up these resources have remained the responsibility of local authorities, usually with very low funding (if any) and few opportunities for academic recognition of this work. Hence, it is not surprising that many resource holders and developers have become reluctant to give free access to the latest versions of their resources, and their actual status is therefore currently rather unclear. The goal of this workshop is to study problems involved in the development, management and reuse of lexical resources in a multilingual context. Moreover, this workshop provides a forum for reviewing the present state of language resources. The workshop is meant to bring to the international community qualitative and quantitative information about the most recent developments in the area of linguistic resources and their use in applications. The impressive number of submissions (38) to this workshop, and to other workshops and conferences dedicated to similar topics, proves that dealing with multilingual linguistic resources has become a very hot topic in the Natural Language Processing community. To cope with the number of submissions, the workshop organising committee decided to accept 16 papers from 10 countries based on the reviewers' recommendations. Six of these papers will be presented in a poster session. The papers constitute a representative selection of current trends in research on Multilingual Language Resources, such as multilingual aligned corpora, bilingual and multilingual lexicons, and multilingual speech resources. The papers also represent a characteristic set of approaches to the development of multilingual language resources, such as automatic extraction of information from corpora, combination and re-use of existing resources, online collaborative development of multilingual lexicons, and use of the Web as a multilingual language resource. The development and management of multilingual language resources is a long-term activity in which collaboration among researchers is essential. We hope that this workshop will gather many researchers involved in such developments and will give them the opportunity to discuss, exchange and compare their approaches and strengthen their collaborations in the field. The organisation of this workshop would have been impossible without the hard work of the program committee, who managed to provide accurate reviews on time, on a rather tight schedule. We would also like to thank the COLING 2004 organising committee that made this workshop possible. Finally, we hope that this workshop will yield fruitful results for all participants.

    Evaluating the Usability of Automatically Generated Captions for People who are Deaf or Hard of Hearing

    The accuracy of Automated Speech Recognition (ASR) technology has improved, but it is still imperfect in many settings. Researchers who evaluate ASR performance often focus on improving the Word Error Rate (WER) metric, but WER has been found to have little correlation with human-subject performance in many applications. We propose a new captioning-focused evaluation metric that better predicts the impact of ASR recognition errors on the usability of automatically generated captions for people who are Deaf or Hard of Hearing (DHH). Through a user study with 30 DHH users, we compared our new metric with the traditional WER metric on a caption usability evaluation task. In a side-by-side comparison of pairs of ASR text output (with identical WER), the text preferred by our new metric was also preferred by DHH participants. Further, our metric had significantly higher correlation with DHH participants' subjective scores on the usability of a caption than the correlation between the WER metric and participants' subjective scores. This new metric could be used to select ASR systems for captioning applications, and it may be a better metric for ASR researchers to consider when optimizing ASR systems. Comment: 10 pages, 8 figures, published in ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '17)
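
    For context on the baseline the study argues against, the sketch below computes standard WER as a word-level edit distance divided by the reference length. It is an assumed illustration, not the paper's proposed captioning metric; the example sentences are hypothetical and show how two captions can share the same WER while differing in how damaging their errors are to a reader.

```python
# Assumed sketch of standard Word Error Rate:
# WER = (substitutions + deletions + insertions) / number of reference words,
# computed via word-level Levenshtein distance.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution / match
    return dp[-1][-1] / max(len(ref), 1)

# Both hypothetical captions score WER = 0.25 against the same reference,
# yet dropping the content word in the second is likely far more damaging.
print(wer("the meeting starts now", "the meeting starts nov"))  # 0.25
print(wer("the meeting starts now", "the meeting starts"))      # 0.25
```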

    How definite are we about the English article system? Chinese learners, L1 interference and the teaching of articles in English for academic purposes programmes.

    Omission and overspecification of the/a/an/Ø are among the most frequently occurring grammatical errors made in English academic writing by Chinese first language (L1) university students (Chuang & Nesi, 2006; Lee & Chen, 2009). However, in the context of competing demands in the English for academic purposes (EAP) syllabus and conflicting evidence about the effectiveness of error correction, EAP tutors are often unsure whether article use should or could be a focus and whether such errors should be corrected or ignored. With the aim of informing pedagogy, this study investigates: whether explicit teaching or correction improves accuracy; which article uses present the most challenges for Chinese students; the causes of error; and whether a focus on article form can be integrated within a modern genre-based/student-centred approach in EAP. First, a questionnaire survey investigates how EAP teachers in higher education explicitly teach or correct English article use. Second, the effect of explicit teaching and correction on English article accuracy is investigated in a longitudinal experiment with a control group. Analysis of this study’s post-study measures raises questions about the sustained benefits of written correction or decontextualised rule-based approaches. Third, findings are presented from a corpus-based study which includes an inductive and deductive analysis of the errors made by Chinese students. Finally, in a fourth study, hypotheses are tested using a multiple-choice test (n=455) and the main findings are presented: 1) general referential article accuracy is significantly affected by proficiency level, genre and students’ familiarity with the topic; 2) Chinese students are most challenged by generic and non-referential contexts of use, which may be partly attributable to the lack of positive L1 transfer effects; 3) overspecification of definite articles is a frequent problem that sometimes gives Chinese B2-level students’ writing an ‘informal tone’; and 4) higher nominal density of pre-qualified noun phrases in academic writing is significantly associated with higher error rates. Several practical recommendations are presented which integrate an occasional focus on article form with whole-text teaching, autonomous proofreading skills, register awareness, and genre-based approaches to EAP pedagogy.