    Topic and background knowledge effects on performance in speaking assessment

    This study explores the extent to which topic and background knowledge of topic affect spoken performance in a high-stakes speaking test. It is argued that evidence of a substantial influence may introduce construct-irrelevant variance and undermine test fairness. Data were collected from 81 non-native speakers of English who performed on 10 topics across three task types. Background knowledge and general language proficiency were measured using self-report questionnaires and C-tests, respectively. Score data were analysed using many-facet Rasch measurement and multiple regression. Findings showed that for two of the three task types, the topics used in the study generally exhibited difficulty measures that were statistically distinct. However, the size of the differences in topic difficulties was too small to have a large practical effect on scores. Participants' different levels of background knowledge were shown to have a systematic effect on performance. However, these statistically significant differences also failed to translate into practical significance. Findings hold implications for speaking performance assessment.
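
    A common formulation of the many-facet Rasch model used in analyses of this kind relates the log-odds of adjacent score categories to examinee ability, topic difficulty, rater severity, and a category threshold; the facet labels below follow the abstract and are not necessarily the authors' exact model specification:

        \log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k

    Here P_{nijk} is the probability that examinee n receives score category k on topic i from rater j, B_n is examinee ability, D_i is topic difficulty, C_j is rater severity, and F_k is the difficulty of category k relative to category k-1.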

    Integration of a web-based rating system with an oral proficiency interview test: argument-based approach to validation

    This dissertation focuses on the validation of the Oral Proficiency Interview (OPI), a component of the Oral English Certification Test for international teaching assistants. The rating of oral responses was implemented through an innovative computer technology, a web-based rating system called Rater-Platform (R-Plat). The main purpose of the dissertation was to investigate the validity of interpretations and uses of the OPI scores derived from raters' assessment of examinees' performance during the web-based rating process. Following the argument-based validation approach (Kane, 2006), an interpretive argument for the OPI was constructed. The interpretive argument specifies a series of inferences, warrants for each inference, and the underlying assumptions and specific types of backing necessary to support those assumptions. Of the seven inferences (domain description, evaluation, generalization, extrapolation, explanation, utilization, and impact), this study focuses on two. Specifically, it aims to obtain validity evidence for three assumptions underlying the evaluation inference and three assumptions underlying the generalization inference. The research questions addressed: (1) raters' perceptions of R-Plat in terms of clarity, effectiveness, satisfaction, and comfort level; (2) the quality of raters' diagnostic descriptor markings; (3) the quality of raters' comments; (4) the quality of OPI scores; (5) the quality of individual raters' OPI ratings; (6) prompt difficulty; and (7) raters' rating practices. A mixed-methods design was employed to collect and analyze qualitative and quantitative data. Qualitative data consisted of: (a) 14 raters' responses to open-ended questions about their perceptions of R-Plat, (b) 5 recordings of individual/focus group interviews eliciting raters' perceptions, and (c) 1,900 evaluative units extracted from raters' comments about examinees' speaking performance. Quantitative data included: (a) 14 raters' responses to six-point scale statements about their perceptions, (b) 2,524 diagnostic descriptor markings of examinees' speaking ability, (c) OPI scores for 279 examinees, (d) 803 individual raters' ratings, (e) individual prompt ratings divided by each intended prompt level, given by each rater, and (f) individual raters' ratings on the given prompts, grouped by test administration. The results showed that the assumptions for the evaluation inference were supported. Raters' responses to the questionnaire and the individual/focus group interviews revealed positive attitudes towards R-Plat. Diagnostic descriptor markings and raters' comments, analyzed by chi-square tests, distinguished between different speaking ability levels. OPI scores were distributed across different proficiency levels throughout different test administrations. For the generalization inference, both positive and negative evidence was obtained. MFRM analyses showed that OPI scores reliably separated examinees into different speaking ability levels. Observed prompt difficulty matched intended prompt levels, although several problematic prompts were identified. Finally, while the raters used the rating scales adequately and consistently within the same test administration, they were not consistent in their severity. Overall, the foundational parts of the validity argument were successfully established. The findings of this study allow for moving forward with the investigation of the subsequent inferences in order to construct a complete OPI validity argument. They also suggest important implications for argument-based validation research, for the study of rater and task variability, and for future applications of web-based rating systems in speaking assessment.
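
    The chi-square analyses of diagnostic descriptor markings described above can be illustrated with a minimal sketch; the contingency table below is hypothetical and does not reproduce the dissertation's data.

        # Minimal sketch of a chi-square test of independence between examinee
        # proficiency level and diagnostic descriptor markings (hypothetical counts).
        from scipy.stats import chi2_contingency

        observed = [
            [120, 80],   # lower-level examinees: positive vs. negative descriptors
            [200, 60],   # mid-level examinees
            [310, 30],   # higher-level examinees
        ]

        chi2, p, dof, expected = chi2_contingency(observed)
        print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.4g}")

    A significant result of this kind would indicate that descriptor markings are not independent of proficiency level, which is the sense in which such markings can be said to distinguish ability levels.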

    A comparison of holistic, analytic, and part marking models in speaking assessment

    This mixed-methods study examined holistic, analytic, and part marking models (MMs) in terms of their measurement properties and impact on candidate CEFR classifications in a semi-direct online speaking test. Speaking performances of 240 candidates were first marked holistically and by part (phase 1). On the basis of phase 1 findings, which suggested stronger measurement properties for the part MM, phase 2 focused on a comparison of part and analytic MMs. Speaking performances of 400 candidates were rated analytically and by part during that phase. Raters provided open comments on their marking experiences. Results suggested a significant impact of MM; approximately 30% and 50% of candidates in phases 1 and 2 respectively were awarded different (adjacent) CEFR levels depending on the choice of MM used to assign scores. There was a trend of higher CEFR levels with the holistic MM and lower CEFR levels with the part MM. While strong correlations were found between all pairings of MMs, further analyses revealed important differences. The part MM was shown to display superior measurement qualities, particularly in allowing raters to make finer distinctions between different speaking ability levels. These findings have implications for the scoring validity of speaking tests.
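
    The kind of classification comparison reported above can be sketched as follows; the candidates and the two sets of CEFR levels are invented for illustration and do not reproduce the study's scores or cut-offs.

        # Minimal sketch of comparing CEFR classifications produced by two marking
        # models (e.g., holistic vs. part); the data below are hypothetical.
        import pandas as pd

        df = pd.DataFrame({
            "cefr_holistic": ["B1", "B2", "B2", "C1", "B1", "B2"],
            "cefr_part":     ["B1", "B1", "B2", "B2", "A2", "B2"],
        })

        # Cross-tabulate the two classifications and report the share of candidates
        # placed at a different CEFR level depending on the marking model.
        print(pd.crosstab(df["cefr_holistic"], df["cefr_part"]))
        disagreement = (df["cefr_holistic"] != df["cefr_part"]).mean()
        print(f"Candidates classified differently: {disagreement:.0%}")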

    Unstressed Vowels in German Learner English: An Instrumental Study

    This study investigates the production of vowels in unstressed syllables by advanced German learners of English in comparison with native speakers of Standard Southern British English. Two acoustic properties were measured: duration and formant structure. The results indicate that the duration of unstressed vowels is similar in the two groups, though there is some variation depending on the phonetic context. In terms of formant structure, learners produce slightly higher F1 and considerably lower F2, the difference in F2 being statistically significant for each learner. Formant values varied as a function of context and the orthographic representation of the vowel.
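
    The per-learner F2 comparison implied above can be sketched with a simple two-sample test; the formant values (in Hz) below are invented for illustration and are not the study's measurements, and the study's actual statistical procedure may differ.

        # Minimal sketch: compare one learner's unstressed-vowel F2 values against
        # native-speaker reference tokens using Welch's t-test (hypothetical data).
        from scipy.stats import ttest_ind

        learner_f2 = [1350, 1420, 1380, 1290, 1410, 1330]  # learner tokens (Hz)
        native_f2  = [1620, 1580, 1650, 1540, 1600, 1570]  # native-speaker tokens (Hz)

        t, p = ttest_ind(learner_f2, native_f2, equal_var=False)
        print(f"t = {t:.2f}, p = {p:.4f}")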

    Children at risk: their phonemic awareness development in holistic instruction

    Includes bibliographical references (p. 17-19).

    Speaking English performance assessment with the facet Rasch measurement model

    This study aims to assess students' English-speaking ability on the basis of peer assessment. It is a quantitative study involving 10 students. Data were collected using tests and a student speaking assessment rubric with scores ranging from 1 to 5. The speaking assessment criteria are pronunciation, grammar, vocabulary, fluency, and understanding. Data were analyzed using many-facet Rasch measurement (MFRM), which allows the interaction between respondents and items to be examined simultaneously. The results show that the separation indices for the criteria (6.39), speakers (0.51), and raters (5.32), together with the standard deviation values, clearly indicate a good distribution of item difficulty. Reliability is 0.98 for the criteria, 0.21 for speakers, and 0.97 for raters.
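
    In Rasch measurement, the separation index G and the corresponding reliability R are linked by R = G^2 / (1 + G^2). Applying this relation to the reported indices reproduces the reported reliability figures, which is how the three values are matched to criteria, speakers, and raters above:

        R_{\text{criteria}} = \frac{6.39^2}{1 + 6.39^2} \approx 0.98, \qquad R_{\text{speakers}} = \frac{0.51^2}{1 + 0.51^2} \approx 0.21, \qquad R_{\text{raters}} = \frac{5.32^2}{1 + 5.32^2} \approx 0.97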

    The development of automatic speech evaluation system for learners of English

    Degree system: new; report number: Kō No. 3183; degree type: Doctorate (Education); date conferred: 2010/11/30; Waseda University degree record number: Shin 547