    Empowering Active Learning to Jointly Optimize System and User Demands

    Existing approaches to active learning maximize the system performance by sampling unlabeled instances for annotation that yield the most efficient training. However, when active learning is integrated with an end-user application, this can lead to frustration for participating users, as they spend time labeling instances that they would not otherwise be interested in reading. In this paper, we propose a new active learning approach that jointly optimizes the seemingly counteracting objectives of the active learning system (training efficiently) and the user (receiving useful instances). We study our approach in an educational application, which particularly benefits from this technique as the system needs to rapidly learn to predict the appropriateness of an exercise to a particular user, while the users should receive only exercises that match their skills. We evaluate multiple learning strategies and user types with data from real users and find that our joint approach better satisfies both objectives when alternative methods lead to many unsuitable exercises for end users.
    Comment: To appear as a long paper in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020). Download our code and simulated user models at GitHub: https://github.com/UKPLab/acl2020-empowering-active-learnin

    Topic and background knowledge effects on performance in speaking assessment

    This study explores the extent to which topic and background knowledge of topic affect spoken performance in a high-stakes speaking test. It is argued that evidence of a substantial influence may introduce construct-irrelevant variance and undermine test fairness. Data were collected from 81 non-native speakers of English who performed on 10 topics across three task types. Background knowledge and general language proficiency were measured using self-report questionnaires and C-tests respectively. Score data were analysed using many-facet Rasch measurement and multiple regression. Findings showed that for two of the three task types, the topics used in the study generally exhibited difficulty measures which were statistically distinct. However, the size of the differences in topic difficulties was too small to have a large practical effect on scores. Participants’ different levels of background knowledge were shown to have a systematic effect on performance. However, these statistically significant differences also failed to translate into practical significance. Findings hold implications for speaking performance assessment.

    State of the art review: language testing and assessment (part two)

    In Part 1 of this two-part review article (Alderson & Banerjee, 2001), we first addressed issues of washback, ethics, politics and standards. After a discussion of trends in testing on a national level and in testing for specific purposes, we surveyed developments in computer-based testing and then finally examined self-assessment, alternative assessment and the assessment of young learners. In this second part, we begin by discussing recent theories of construct validity and the theories of language use that help define the constructs that we wish to measure through language tests. The main sections of the second part concentrate on summarising recent research into the constructs themselves, in turn addressing reading, listening, grammatical and lexical abilities, speaking and writing. Finally, we discuss a number of outstanding issues in the field.

    The SLS-Berlin: Validation of a German Computer-Based Screening Test to Measure Reading Proficiency in Early and Late Adulthood

    Reading proficiency, i.e., successfully integrating early word-based information and utilizing this information in later processes of sentence and text comprehension, and its assessment are subject to extensive research. However, screening tests for German adults across the life span are basically non-existent. Therefore, the present article introduces a standardized computerized sentence-based screening measure for German adult readers to assess reading proficiency, including norm data from 2,148 participants covering an age range from 16 to 88 years. The test was developed in accordance with the children’s version of the Salzburger LeseScreening (SLS, Wimmer and Mayringer, 2014). The SLS-Berlin has a high reliability and can easily be implemented in any research setting using the German language. We present a detailed description of the test and report the distribution of SLS-Berlin scores for the norm sample as well as for two subsamples of younger (below 60 years) and older adults (60 and older). For all three samples, we conducted regression analyses to investigate the relationship between sentence characteristics and SLS-Berlin scores. In a second validation study, SLS-Berlin scores were compared with two (pseudo)word reading tests, a test measuring attention and processing speed, and eye movements recorded during expository text reading. Our results confirm the SLS-Berlin’s sensitivity to capture early word decoding and later text-related comprehension processes. The test distinguished very well between skilled and less skilled readers and also within less skilled readers and is therefore a powerful and efficient screening test for German adults to assess interindividual levels of reading proficiency.

    Growth in reading and how children spend their time outside of school

    Running title: Growth in reading. Includes bibliographical references (leaves 36-38). Performed pursuant to contract no. 400-81-0030 of the National Institute of Education.

    Sociolinguistic Conditioning of Phonetic Category Realisation in Non-Native Speech

    The realisation of phonetic categories reflects a complex relationship between individual phonetic parameters and both linguistic and extra-linguistic conditioning of language usage. The present paper investigates the effect of selected sociolinguistic variables, such as age, amount of language use and cultural/social distance, on the English used by Polish immigrants to the U.S. Individual parameters used in the realisation of the category ‘voice’ have been found to vary in their sensitivity to extra-linguistic factors: while the production of target-like values of all parameters is related to age, it is the closure duration that is most stable in its correspondence to age and level of language proficiency. The VOT and vowel duration, on the other hand, prove to be more sensitive to the amount of language use and attitudinal factors.