2,380 research outputs found

    Usefulness and reliability of online assessments: a Business Faculty's experience

    The usefulness and reliability of online assessment results relate to the clarity, specificity and articulation of assessment purposes, goals and criteria. Cheating and plagiarism are two frequent and controversial issues that arise, and there is a view that the online assessment mode inherently lends itself to both practices. However, reconceptualising practice and redeveloping techniques can pave the way for an authentic assessment approach which minimises student academic dishonesty. This article describes research which investigated online assessment practice in a business faculty at an Australian university and proposes what might constitute good, sustainable practice and design in university online assessment.

    An Argument-Based Validation Study of the Teacher Performance Assessment in Washington State

    This study examines the validity assumptions of the Teacher Performance Assessment (TPA) using data collected from teacher candidates, mentor teachers, university supervisors and university faculty in two programs at one university during the 2012 field test in Washington State. Applying Michael Kane's (2006) work on argument-based validation, this study developed interpretations and assumptions of TPA test score use through five inferences: Construct Representation, Scoring and Evaluation, Generalization, Extrapolation, and Decision Making. This multi-method study utilizes survey, case study, and test score data. The overarching research question that guided the study was “Is the TPA a valid measure for determining teacher readiness?” The overall findings suggest that the operationalized construct of readiness is stable, but scores are not generalizable across populations, and guidance was not in place regarding score meaning and use prior to the field test. Low correlation between the TPA and university instruments provided divergent evidence for the use of TPA scores, indicating that decisions based solely on TPA scores may not be reliable.
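
    For readers unfamiliar with this kind of check, the sketch below shows one conventional way divergent evidence can be quantified: correlating TPA totals with scores from a university instrument and inspecting the coefficient. It is not the study's code; the scores, sample size, and variable names are hypothetical placeholders.

    # Minimal sketch (not the study's analysis): a low correlation between
    # two measures of the same construct is divergent validity evidence.
    import numpy as np
    from scipy import stats

    tpa_scores = np.array([38, 42, 35, 47, 40, 33, 45, 39])                  # hypothetical TPA totals
    supervisor_ratings = np.array([3.2, 3.5, 3.8, 3.1, 3.6, 3.0, 3.4, 3.7])  # hypothetical university instrument

    r, p = stats.pearsonr(tpa_scores, supervisor_ratings)
    print(f"r = {r:.2f}, p = {p:.3f}")
    # A low r indicates the two measures do not rank candidates similarly,
    # so decisions based solely on TPA scores warrant caution.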

    WV Writes and Westest 2 online writing: the impact of writing prompts on student writing proficiency

    This study’s purpose was to determine the effects of students practicing writing with practice prompts prior to completing the summative state writing assessment. It adds data to the body of knowledge on the use of practice prompts before a high-stakes, state-level writing assessment. The research design was a quantitative, post hoc, 2 x 2 ANOVA. The data were obtained from the WESTEST 2 Online Writing composite scores and the five analytic trait scores that comprise them. The study population was 6,459 11th-grade students enrolled in West Virginia public schools. These students had all taken the WESTEST 2 Online Writing as 11th graders in the spring of 2013 and, in preparation for the year-end, state-level writing assessment, had completed either Writing Roadmap 2.0 prompts or WESTEST 2 practice prompts. Using random sampling, 190 students who wrote essays using WESTEST 2 practice prompts and 190 students who wrote essays using Writing Roadmap prompts were selected from the student population, giving a total sample of 380 students. Findings revealed no significant effects of one type of writing prompt over another on composite writing scores or on the five analytic writing scores. However, a significant gender effect was found (p = .000), with females scoring higher than males. The results gave stakeholders evidence that students who practiced with a generic writing prompt scored no better or worse than those who practiced with a mirror image of the high-stakes assessment. Assessment vendors, states, counties, schools, and teachers will all benefit from these findings as new assessment systems are adopted based on Common Core writing standards across the nation. The results also document the discrepancy between female and male writing proficiency, and the study can support efforts to address that inequality so that male students become better writers.
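
    As a rough illustration of the design described above, the following Python sketch sets up a 2 (prompt type) x 2 (gender) between-subjects ANOVA of the kind reported. It is not the study's analysis script: the data are simulated, and the column names and effect sizes are assumptions chosen only to mirror the reported pattern (a gender effect, no prompt effect).

    # Minimal sketch of a 2 x 2 between-subjects ANOVA; data are simulated.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    rng = np.random.default_rng(0)
    n = 380  # 190 students per prompt group, as in the study
    df = pd.DataFrame({
        "prompt": np.repeat(["westest_practice", "writing_roadmap"], n // 2),
        "gender": rng.choice(["female", "male"], size=n),
    })
    # Hypothetical composite scores: a small gender effect, no prompt effect.
    df["composite"] = 20 + (df["gender"] == "female") * 1.5 + rng.normal(0, 3, n)

    model = ols("composite ~ C(prompt) * C(gender)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))  # main effects and interaction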

    Comparative pairs judgements for high-stakes practical assessments

    Assessment of practical tasks, as opposed to that of theoretical tasks, has been considered problematic, mainly because it is usually resource intensive and the scoring is subjective. Most practical tasks need to be assessed on site or involve products that need to be collected, stored, or transported. Moreover, because practical tasks are generally open-ended, and therefore marked subjectively, there is concern over the reliability of the scores. In high-stakes assessment, these problems are even more challenging, and there is a need for an assessment method that can overcome them. In this study, one such method, referred to here as Comparative Pairs judgements, was investigated. This scoring method was applied to samples from the practical examination in two secondary courses in Western Australia: Design and Visual Arts. The study was conducted within the first phase of an Australian Research Council (ARC) Linkage Project titled the Authentic Digital Representation of Creative Works in Education, a collaboration between the Centre for Schooling and Learning Technologies (CSaLT) at Edith Cowan University and the Curriculum Council of Western Australia. The purpose of the present study was to investigate the suitability of Comparative Pairs judgements as an alternative method for assessing high-stakes practical production tasks. The overarching research question was: how representative are the Comparative Pairs judgement scores of the quality of student practical production work in the Visual Arts and Design courses? Student work submitted for the practical examination was digitised for online scoring, enabling online access for judging regardless of the location of the assessors. Both a Comparative Pairs judgements method and an Analytical marking method were used to score these digital representations. An interpretive research paradigm was employed, utilising an explanatory sequential mixed-method design. Data collected for the present study were part of the data collected in the main project. While the data for the main project were quite extensive, only the scoring data, assessor interviews, and online notes were considered relevant to this study, and therefore only these data were analysed and discussed in this thesis. A total of 157 students studying Design and Visual Arts participated in the first phase of the main project and the present study, and a total of 25 assessors participated in the Comparative Pairs judgements and the Analytical marking processes. Scoring data analysed in this study were obtained from three scoring processes: the official practical examination scores, the online Analytical marking, and the Comparative Pairs judgements. Data analysis included descriptive statistics, correlation analysis, Rasch dichotomous modelling, fit statistics, and reliability analysis. A further discrepancy analysis was conducted on student works that showed scoring inconsistency, either between methods of scoring or between assessors. Data from the assessor interviews and judgement notes from the scoring processes were triangulated with the scoring data to examine the validity of the Comparative Pairs judgements method as an alternative scoring method.
    Data from the scoring of the digital representations of the student work in Design and Visual Arts were analysed separately to examine the suitability of the Comparative Pairs judgements in each course, and then compared to examine the influence of the different assessment tasks in the two subjects on the scoring result. Findings for both the Design and Visual Arts courses suggested that the scoring resulting from the Comparative Pairs judgements was reliable, mainly because the numerous judgements and the pairing algorithm cancelled out inconsistencies in individual judgements, creating scoring results that could be more reliable than the more commonly used Analytical marking. The validity analysis, which considered both evidence for and threats against validity, suggested that this method could be a valid one for high-stakes practical assessment in these two courses. The present study found that the reliability of the scores and the validity of the Comparative Pairs judgements as an assessment method make the method suitable for assessing high-stakes practical production, and its findings suggest that the method be applied and further investigated in different educational settings for different practical assessment tasks. This method of judgements should be considered potentially valuable for formative and summative assessment alike, as well as for teacher professional learning and moderation practice.
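
    To illustrate how a pairing method of this kind turns many binary judgements into a scale, the sketch below fits a Bradley-Terry model, which is formally equivalent to the Rasch dichotomous model used in the thesis when applied to paired comparisons. The judgement tuples are hypothetical, and the minorise-maximise iteration is a textbook implementation, not the project's actual software.

    # Minimal Bradley-Terry sketch: estimate a quality score per work
    # from pairwise "which is better" judgements (hypothetical data).
    from collections import defaultdict

    judgements = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B"), ("A", "B")]

    works = sorted({w for pair in judgements for w in pair})
    wins = defaultdict(int)          # judgements won by each work
    pair_counts = defaultdict(int)   # times each pair was compared
    for winner, loser in judgements:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    strength = {w: 1.0 for w in works}
    for _ in range(100):  # minorise-maximise updates until stable
        for i in works:
            denom = sum(
                pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
                for j in works
                if j != i and frozenset((i, j)) in pair_counts
            )
            if denom > 0:
                strength[i] = wins[i] / denom
        total = sum(strength.values())  # fix the scale each sweep
        strength = {w: s / total for w, s in strength.items()}

    print(strength)  # higher value = higher judged quality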

    Meeting the Challenges to Measurement in an Era of Accountability

    Under pressure and support from the federal government, states have increasingly turned to indicators based on student test scores to evaluate teachers and schools, as well as students themselves. The focus thus far has been on test scores in those subject areas where there is a sequence of consecutive tests, such as mathematics or English/language arts in grades 4-8. Teachers in these subject areas, however, constitute less than thirty percent of the teacher workforce in a district. Comparatively little has been written about the measurement of achievement in the other grades and subjects. This volume seeks to remedy this imbalance by focusing on the assessment of student achievement in a broad range of grade levels and subject areas, with particular attention to its use in the evaluation of teachers and schools. It addresses traditional end-of-course tests, as well as alternative measures such as portfolios, exhibitions, and student learning objectives. In each case, issues related to design and development, psychometric considerations, and validity challenges are covered from both a generic and a content-specific perspective. The NCME Applications of Educational Measurement and Assessment series includes edited volumes designed to inform research-based applications of educational measurement and assessment. Edited by leading experts, these books are comprehensive and practical resources on the latest developments in the field. The NCME series editorial board comprises Michael J. Kolen (Chair), Robert L. Brennan, Wayne Camara, Edward H. Haertel, Suzanne Lane, and Rebecca Zwick.

    External validation of the foreign language speaking tasks of the high school leaving exam

    Spain is going through one of its most significant educational changes in the last 20 years. The change involves all educational stages (K-12) and will entail a comprehensive exam at the end of high school, whose score will be used as the main criterion to compete for a place at the university level. One significant part of the exam is the foreign language section. This paper addresses differences between the speaking tasks of the current University Entrance Examination and the future High School Leaving Diploma. It compares, theoretically and qualitatively, a pilot study run by the Ministry of Education, Culture & Sports (MECD) on the use of speaking tasks in the University Entrance Examination with one undertaken by a large research group working together in the OPENPAU project (Reference FFI2011-22442 with ERDF co-financing) on the same matter. The paper intends to show that the scope of the MECD piloting project is limited compared to the OPENPAU group proposal, and it is suggested that the MECD should redefine the test construct to better reflect students’ speaking proficiency, as in the OPENPAU project proposal.

    Evaluating the Validity of Technology-Enhanced Educational Assessment Items and Tasks: An Empirical Approach to Studying Item Features and Scoring Rubrics.

    With the advent of the newly developed Common Core State Standards and the Next Generation Science Standards, innovative assessments, including technology-enhanced items (TEIs) and tasks, will be needed to meet the challenges of developing valid and reliable assessments in a world of computer-based testing. In a recent critique of the next generation assessments in math (i.e., Smarter Balanced), Rasmussen (2015) observed that many aspects of the technology “enhancements” can be expected to do more harm than good, as the computer interfaces may introduce construct-irrelevant variance. This paper focuses on issues surrounding the design of TEIs and on cognitive load theory (Miller, 1956) as a promising framework that can be applied to computer-based item design to mitigate the effects of computer interface usability. Two studies were conducted. In the first study I used multi-level modeling to assess the effect of item characteristics on examinees’ relative performance, hypothesizing that item-level characteristics, namely response format, would significantly contribute to the amount of variance explained by item characteristics over and above student characteristics. In study two, I used two exemplar items to show how data concerning examinees’ actions, produced through latent class analyses (LCA), can be used as evidence in validity investigations. Results from study one suggested that item type does not explain the variation in student scores over and above examinee characteristics. Results from study two suggested that LCA is a useful tool for diagnosing potential issues in the design of items and their scoring rubrics. Evidence from both studies illuminates the immediate need for further research on the computer-based items that are beginning to be used widely in high-stakes, large-scale assessments. In the effort to move away from traditional multiple-choice items and toward more authentic measurement by incorporating technology-based item features, we may be affecting how examinees respond to an item through inadvertent increases in cognitive load. Future research involving experimental manipulation is necessary for understanding how item characteristics affect examinees’ responses.
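
    As a sketch of the study-one analysis style, the Python below fits a multilevel model with a random intercept per examinee and a fixed effect of item response format, then inspects whether the format terms explain score variance beyond examinee differences. The data are simulated, and the column names, item types, and linear (rather than IRT) treatment of scores are all simplifying assumptions.

    # Minimal multilevel sketch: item scores nested within examinees,
    # with response format as an item-level predictor (simulated data).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n_examinees, n_items = 200, 12
    item_types = rng.choice(["multiple_choice", "drag_drop", "hot_spot"], size=n_items)
    df = pd.DataFrame({
        "examinee": np.repeat(np.arange(n_examinees), n_items),
        "item_type": np.tile(item_types, n_examinees),
    })
    ability = rng.normal(0, 1, n_examinees)  # examinee-level differences
    df["score"] = ability[df["examinee"]] + rng.normal(0, 1, len(df))

    # Random intercept per examinee; fixed effects for response format.
    model = smf.mixedlm("score ~ C(item_type)", df, groups=df["examinee"]).fit()
    print(model.summary())  # near-zero format effects mirror the reported result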

    Marking in a Standards-Referencing Framework

    Education systems around the world have adopted standards-referencing in a move to provide meaningful information about students’ knowledge and skills on the completion of their secondary schooling. Standards-referencing systems report student achievement against predetermined descriptions of performance from which the learning outcomes of a syllabus are derived. However, the credibility of a standards-referencing system rests on teachers being able to determine the correct “image” of what students know and can do as they create internal school-based assessment tasks. If the wrong image is produced, the validity of decisions regarding student performance is reduced, which calls into question the credentialling process. As such, when teachers create assessments, they must ensure alignment between the cognitive demands of the learning outcomes, the assessment questions, the marking rubric(s), and the performance band descriptors for the course as they operationalise the theoretical tenets of standards-referencing to maximise the reliability and validity of students’ results. Evidence suggests this is not occurring, and that teachers use an amalgamation of norm-, criterion-, and standards-referencing assessment practices. Given these potential differences between current practice and assessment-system requirements, and the lack of clarity around what exactly is required of teachers’ assessment practices, this thesis first explicates a theoretical assessment process model for effective assessment in a standards-referencing system. The model serves as a blueprint for the practical support of teachers by clarifying how they could create assessments aligned with the principles of standards-referencing, using the New South Wales Higher School Certificate English course as an example. The thesis then determines the extent to which teachers’ practices and beliefs adhere to this idealised process. By contrasting current practice and teacher assessment skills against the model, recommendations are made that identify a clearer path towards effective assessment and marking practices within the current standards-referencing system.