36 research outputs found

    Developing and Examining Validity Evidence for the Writing Rubric to Inform Teacher Educators (WRITE)

    Assessment is an under-researched challenge in writing development, instruction, and teacher preparation. One reason for the lack of research on writing assessment in teacher preparation is that writing achievement is multi-faceted and difficult to measure consistently. Additionally, research has reported that teacher educators and preservice teachers may have limited assessment literacy. In previous studies, researchers have struggled to provide strong evidence of validity, reliability, and fairness across raters, writing samples, and rubric items. In the present study, we fill several gaps in the research literature by developing a rubric, the Writing Rubric to Inform Teacher Educators (WRITE), whose structure promotes assessment literacy while raters score samples. Furthermore, using modern measurement theory, we strengthen the field’s understanding of writing assessment by providing evidence of validity, reliability, and fairness of scores to support the interpretation and use of the WRITE.

    Validity of smartphone heart rate variability pre- and post-resistance exercise

    The aim was to examine the validity of heart rate variability (HRV) measurements from photoplethysmography (PPG) via a smartphone application pre- and post-resistance exercise (RE) and to examine the intraday and interday reliability of the smartphone PPG method. Thirty-one adults underwent two simultaneous ultrashort-term electrocardiograph (ECG) and PPG measurements followed by 1-repetition maximum testing for back squats, bench presses, and bent-over rows. The participants then performed RE, where simultaneous ultrashort-term ECG and PPG measurements were taken: two pre- and one post-exercise. The natural logarithm of the root mean square of successive normal-to-normal (R-R) differences (LnRMSSD) values were compared with paired-sampl
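The HRV index compared in this study, LnRMSSD, is the natural logarithm of the root mean square of successive R-R interval differences. A minimal sketch of that computation follows; the ten-beat recording is made up for illustration, and the smartphone app's actual signal-processing pipeline is not described in the abstract:

```python
import math

def ln_rmssd(rr_intervals_ms):
    """Natural log of the root mean square of successive
    R-R interval differences (LnRMSSD), a common HRV index."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return math.log(rmssd)

# Hypothetical 10-beat recording (milliseconds between successive beats)
rr = [812, 798, 825, 840, 810, 795, 830, 845, 820, 805]
print(round(ln_rmssd(rr), 2))
```

Taking the log is common in applied HRV work because raw RMSSD values are strongly right-skewed, which distorts parametric comparisons such as paired-samples tests.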

    Acute Blood Flow Responses to Varying Blood Flow Restriction Pressures in the Lower Limbs

    International Journal of Exercise Science 16(2): 118-128, 2023. The purpose of this study was to investigate lower limb blood flow responses under varying blood flow restriction (BFR) pressures based on individualized limb occlusion pressures (LOP) using a commonly used occlusion device. Twenty-nine participants (65.5% female, 23.8 ± 4.7 years) volunteered for this study. An 11.5 cm tourniquet was placed around participants’ right proximal thigh, followed by an automated LOP measurement (207.1 ± 29.4 mmHg). Doppler ultrasound was used to assess posterior tibial artery blood flow at rest, followed by 10% increments of LOP (10-90% LOP) in a randomized order. All data were collected during a single 90-minute laboratory visit. Friedman’s and one-way repeated-measures ANOVAs were used to examine potential differences in vessel diameter, volumetric blood flow (VolFlow), and reduction in VolFlow relative to rest (%Rel) between relative pressures. No differences in vessel diameter were observed between rest and all relative pressures (all p > .05). Significant reductions from rest in VolFlow and %Rel were first observed at 50% LOP and 40% LOP, respectively. VolFlow at 80% LOP, a commonly used occlusion pressure in the legs, was not significantly different from 60% (p = .88), 70% (p = .20), or 90% (p = 1.00) LOP. Findings indicate a minimal threshold pressure of 50% LOP may be required to elicit a significant decrease in arterial blood flow at rest when utilizing the 11.5 cm Delfi PTSII tourniquet system.

    The Development and Validation of an Engineering Assessment

    This slide deck, from the Advanced Manufacturing and Prototyping Integrated to Unlock Potential (AMP-IT-UP) project, was presented at the annual meeting of the National Association for Research in Science Teaching in April 2016. The presentation discusses the AMP-IT-UP project and the design of an assessment of the engineering design process. Sections of the slides include the following: Overarching Purpose, Next Generation Science Standards (2013), NGSS Engineering Practices, Designing an Assessment of the Engineering Design Process, Fit Statistics, Pre/Post-Test Analyses, and Engineering and Science Practices Alignment.

    Online_Appendix_A – Supplemental material for Examining the Impacts of Rater Effects in Performance Assessments

    Supplemental material, Online_Appendix_A, for Examining the Impacts of Rater Effects in Performance Assessments by Stefanie A. Wind in Applied Psychological Measurement.

    The Effects of Incomplete Rating Designs in Combination With Rater Effects

    © 2019 by the National Council on Measurement in Education. Researchers have explored a variety of topics related to identifying and distinguishing among specific types of rater effects, as well as the implications of different types of incomplete data collection designs for rater-mediated assessments. In this study, we used simulated data to examine the sensitivity of latent trait model indicators of three rater effects (leniency, central tendency, and severity) in combination with different types of incomplete rating designs (systematic links, anchor performances, and spiral). We used the rating scale model and the partial credit model to calculate rater location estimates, standard errors of rater estimates, model–data fit statistics, and the standard deviation of rating scale category thresholds as indicators of rater effects, and we explored the sensitivity of these indicators to rater effects under different conditions. Our results suggest that it is possible to detect rater effects when each of the three types of rating designs is used. However, there are differences in the sensitivity of each indicator related to the type of rater effect, the type of rating design, and the overall proportion of raters exhibiting an effect. We discuss implications for research and practice related to rater-mediated assessments.

    Not Just Generalizability: A Case for Multifaceted Latent Trait Models in Teacher Observation Systems

    Teacher evaluation systems often include classroom observations in which raters use rating scales to evaluate teachers’ effectiveness. Recently, researchers have promoted the use of multifaceted approaches to investigating reliability using Generalizability theory, instead of rater reliability statistics. Generalizability theory allows analysts to quantify the contribution of multiple sources of variance (e.g., raters and tasks) to measurement error. We used data from a teacher evaluation system to illustrate another multifaceted approach that provides additional indicators of the quality of observational systems. We show how analysts can use Many-Facet Rasch models to identify and control for differences in rater severity, identify idiosyncratic ratings associated with various facets, and evaluate rating scale functioning. We discuss implications for research and practice in teacher evaluation.
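The core idea of the Many-Facet Rasch model described above is that the log-odds of a rating moving up one scale category decompose into a teacher location, a rater severity, and a category threshold, all on a common logit scale. The sketch below illustrates the rating-scale form of that model; the teacher location, severity, and threshold values are invented for illustration, not drawn from the study:

```python
import math

def mfrm_category_probs(theta, severity, thresholds):
    """Rating-scale Many-Facet Rasch model: probability of each rating
    category, given a teacher location (theta), a rater severity, and
    rating-scale thresholds (tau), all in logits."""
    # Cumulative sums of (theta - severity - tau_k) give the
    # unnormalized log-probabilities of categories 0..K.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + theta - severity - tau)
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical case: an average teacher (theta = 0.0) scored by a
# severe rater (severity = +1.0) on a four-category scale.
probs = mfrm_category_probs(0.0, 1.0, thresholds=[-1.5, 0.0, 1.5])
print([round(p, 2) for p in probs])
```

Because severity enters the model explicitly, two raters who use the scale differently can be placed on the same metric and their severity difference controlled for, which is the adjustment the abstract contrasts with Generalizability-theory variance decomposition.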

    Exploring the Influence of Range Restrictions on Connectivity in Sparse Assessment Networks: An Illustration and Exploration Within the Context of Classroom Observations

    Range restrictions, or raters’ tendency to limit their ratings to a subset of available rating scale categories, are well documented in large-scale teacher evaluation systems based on principal observations. When these restrictions occur, the ratings observed during operational teacher evaluations are limited to a subset of the available categories. However, range restrictions are less common within teacher performances that are used to establish links (anchor ratings) in otherwise disconnected assessment systems. As a result, principals’ category use may differ between anchor ratings and operational ratings. The purpose of this study is to explore the consequences of discrepancies in rating scale category use across operational and anchor ratings within the context of teacher evaluation systems based on principal observations. First, we used real data to illustrate the presence of range restriction in operational ratings and its effect on connectivity. Then, we used simulated data to explore these effects through experimental manipulation. Results suggested that discrepancies in range restriction between anchor and operational ratings do not systematically impact the precision of teacher, principal, and teaching practice estimates. We discuss the implications of these results in terms of research and practice for teacher evaluation systems.

    The Stabilizing Influences of Linking Set Size and Model–Data Fit in Sparse Rater-Mediated Assessment Networks

    Previous research includes frequent admonitions regarding the importance of establishing connectivity in data collection designs prior to the application of Rasch models. However, details regarding the influence of characteristics of the linking sets used to establish connections among facets, such as locations on the latent variable, model–data fit, and sample size, have not been thoroughly explored. These considerations are particularly important in assessment systems that involve large proportions of missing data (i.e., sparse designs) and are associated with high-stakes decisions, such as teacher evaluations based on teaching observations. The purpose of this study is to explore the influence of characteristics of linking sets in sparsely connected rating designs on examinee, rater, and task estimates. A simulation design whose characteristics were intended to reflect practical large-scale assessment networks with sparse connections was used to consider the influence of locations on the latent variable, model–data fit, and sample size within linking sets on the stability and model–data fit of estimates. Results suggested that parameter estimates for examinee and task facets are quite robust to modifications in the size, model–data fit, and latent-variable location of the link. Parameter estimates for the rater, while still quite robust, are more sensitive to reductions in link size. The implications are discussed as they relate to research, theory, and practice.
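The connectivity precondition this abstract refers to can be checked mechanically: a rating design is connected when every rater and examinee falls in a single linked subset of the rater–examinee network, so all parameters can be placed on one scale. The following sketch uses a union-find structure over a toy design; the rater and examinee labels are hypothetical:

```python
def connected_subsets(ratings):
    """Group raters and examinees into linked subsets via union-find.
    `ratings` is an iterable of (rater, examinee) pairs; only elements
    in the same subset can be calibrated on a common Rasch scale."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Tag nodes by type so rater "A" and examinee "A" stay distinct.
    for rater, examinee in ratings:
        union(("rater", rater), ("examinee", examinee))

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

# Hypothetical sparse design: raters A and B never score a common
# examinee, so the network splits into two disconnected subsets.
design = [("A", 1), ("A", 2), ("B", 3), ("B", 4)]
print(len(connected_subsets(design)))
```

Adding a single linking rating, such as rater A also scoring examinee 3, merges the two subsets into one, which is exactly the role the linking sets studied here play in sparse designs.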

    Developing an exploratory questionnaire on sense of belonging in language learning using Rasch measurement theory

    This study explored the psychometric properties of a survey designed to measure the belonging construct, based on the pilot administration of a new instrument. The exploration used Rasch measurement theory to determine the degree to which the instrument is usable for studying this construct within the context of language learning and teaching, as well as to inform revisions before its use in future research. Pilot data for the new survey were collected from 249 undergraduate students enrolled at four universities in the southeastern United States and showed good fit to the Rasch model. Overall, the results suggest that students report complex perceptions of their own belonging, indicating that they neither feel isolated nor have a close community in their French language learning. Students also generally showed a lack of pride or shame, in complex response patterns. Implications for theory, research, and practice suggest further exploration of the belonging construct in various respects, both for the active cultivation of community and for advocacy work.