Empowering Active Learning to Jointly Optimize System and User Demands
Existing approaches to active learning maximize the system performance by
sampling unlabeled instances for annotation that yield the most efficient
training. However, when active learning is integrated with an end-user
application, this can lead to frustration for participating users, as they
spend time labeling instances that they would not otherwise be interested in
reading. In this paper, we propose a new active learning approach that jointly
optimizes the seemingly counteracting objectives of the active learning system
(training efficiently) and the user (receiving useful instances). We study our
approach in an educational application, which particularly benefits from this
technique as the system needs to rapidly learn to predict the appropriateness
of an exercise to a particular user, while the users should receive only
exercises that match their skills. We evaluate multiple learning strategies and
user types with data from real users and find that our joint approach better
satisfies both objectives when alternative methods lead to many unsuitable
exercises for end users. Comment: To appear as a long paper in Proceedings of the 58th Annual Meeting
of the Association for Computational Linguistics (ACL 2020). Download our
code and simulated user models at GitHub:
https://github.com/UKPLab/acl2020-empowering-active-learnin
Topic and background knowledge effects on performance in speaking assessment
This study explores the extent to which topic and background knowledge of topic affect spoken
performance in a high-stakes speaking test. It is argued that evidence of a substantial influence may introduce construct-irrelevant variance and undermine test fairness. Data were collected from 81 non-native speakers of English who performed on 10 topics across three task types. Background knowledge and general language proficiency were measured using self-report questionnaires and C-tests respectively. Score data were analysed using many-facet Rasch measurement and multiple regression. Findings showed that for two of the three task types, the topics used in the study generally exhibited difficulty measures which were statistically distinct. However, the size of the differences in topic difficulties was too small to have a large practical effect on scores. Participants’ different levels of background knowledge were shown to have a systematic effect on performance. However, these statistically significant differences also failed to translate into practical significance. Findings hold implications for speaking performance assessment.
State of the art review: language testing and assessment (part two).
In Part 1 of this two-part review article (Alderson & Banerjee, 2001), we first addressed issues of washback, ethics, politics and standards. After a discussion of trends in testing on a national level and in testing for specific purposes, we surveyed developments in computer-based testing and then finally examined self-assessment, alternative assessment and the assessment of young learners. In this second part, we begin by discussing recent theories of construct validity and the theories of language use that help define the constructs that we wish to measure through language tests. The main sections of the second part concentrate on summarising recent research into the constructs themselves, in turn addressing reading, listening, grammatical and lexical abilities, speaking and writing. Finally, we discuss a number of outstanding issues in the field.
The SLS-Berlin: Validation of a German Computer-Based Screening Test to Measure Reading Proficiency in Early and Late Adulthood
Reading proficiency, i.e., successfully integrating early word-based information and utilizing this information in later processes of sentence and text comprehension, and its assessment are subject to extensive research. However, screening tests for German adults across the life span are basically non-existent. Therefore, the present article introduces a standardized computerized sentence-based screening measure for German adult readers to assess reading proficiency, including norm data from 2,148 participants covering an age range from 16 to 88 years. The test was developed in accordance with the children’s version of the Salzburger LeseScreening (SLS, Wimmer and Mayringer, 2014). The SLS-Berlin has a high reliability and can easily be implemented in any research setting using the German language. We present a detailed description of the test and report the distribution of SLS-Berlin scores for the norm sample as well as for two subsamples of younger (below 60 years) and older adults (60 and older). For all three samples, we conducted regression analyses to investigate the relationship between sentence characteristics and SLS-Berlin scores. In a second validation study, SLS-Berlin scores were compared with two (pseudo)word reading tests, a test measuring attention and processing speed, and eye movements recorded during expository text reading. Our results confirm the SLS-Berlin’s sensitivity to capture early word decoding and later text-related comprehension processes. The test distinguished very well between skilled and less skilled readers and also within less skilled readers and is therefore a powerful and efficient screening test for German adults to assess interindividual levels of reading proficiency.
Growth in reading and how children spend their time outside of school
Running title: Growth in reading. Includes bibliographical references (leaves 36-38). Performed pursuant to contract no. 400-81-0030 of the National Institute of Education.
Sociolinguistic Conditioning of Phonetic Category Realisation in Non-Native Speech
The realisation of phonetic categories reflects a complex relationship between individual phonetic parameters and both linguistic and extra-linguistic conditioning of language usage. The present paper investigates the effect of selected sociolinguistic variables, such as age, amount of language use and cultural/social distance, on the English used by Polish immigrants to the U.S. Individual parameters used in the realisation of the category ‘voice’ have been found to vary in their sensitivity to extra-linguistic factors: while the production of target-like values of all parameters is related to age, closure duration is the most stable in its correspondence to age and level of language proficiency. VOT and vowel duration, on the other hand, prove to be more sensitive to the amount of language use and attitudinal factors.