18 research outputs found
Designing and verifying a tool for diagnosing scientific misconceptions in genetics topic
The main purpose of this study was to design and verify the quality of an assessment aimed at diagnosing scientific misconceptions among students enrolled in 10th-grade biology subjects. The sample consisted of N=200 students from schools under the administration of the Office of Secondary Educational Service 31, Nakhon Ratchasima province. We employed the design-based approach, which consists of four phases, namely, construct map, item design, outcome space, and Wright map. The Multidimensional Random Coefficient Multinomial Logit Model (MRCMLM) was used to evaluate the quality of the assessment tool. The assessment tool consists of two dimensions, namely, knowledge and reasoning. It comprises 40 items, with 20 items tapping each dimension. Findings from item analysis and modeling revealed sufficient evidence on the internal structure and validity of the instrument. It can be concluded that the unidimensional model is more appropriate than the multidimensional model for diagnosing scientific misconceptions in the genetics topic when items are dichotomously scored with four options each. The polytomously scored type should be further investigated to determine the best-fitting model for the items relating to evidence-based reasoning for supporting students’ answers
Assessment of Learning in Digital Interactive Social Networks: A Learning Analytics Approach
This paper summarizes initial field-test results from data analytics used in the work of the Assessment and Teaching of 21st Century Skills (ATC21S) project, on the “ICT Literacy – Learning in digital networks” learning progression. This project, sponsored by Cisco, Intel and Microsoft, aims to help educators around the world enable students with the skills to succeed in future career and college goals. The paper begins with describing some expansions to a common definition of learning analytics, then includes a review of the literature on ICT literacy, including the specific development that led to the ATC21S effort. This is followed by a description of the development of a “learning progression” for this project, as well as the logic behind the instrument construction and data analytics, along with examples of each. Data were collected in a demonstration digital environment in four countries: Australia, Finland, Singapore and the U.S. The results indicate that the new constructs developed by the project, and the novel item forms and analytics that were employed, are indeed capable of being employed in a large-scale digital environment. The paper concludes with a discussion of the next steps for this effort
The role of engagement and academic behavioral skills on young students’ academic performance—A validation across four countries
Peer reviewed. The aim of this study is to validate an instrument measuring students’ academic behavioral skills and engagement, skills identified as vital for student achievement. We inspect the reliability and validity of the survey with respect to item fit, factorial structure, relations with academic performance, and the fairness of the items across student groups. The fairness analyses are critical to making valid comparisons between groups and across countries. Data comprising 8520 grade 10 students from four countries were analysed using item response theory. We found that both scales were multidimensional, acted fairly across students’ gender, country, immigrant, and socio-economic background (after removing four items), and were positively and significantly correlated with self-reported and performance-based academic performance. © 2020 Elsevier Lt
Essays in psychometrics and behavioral statistics
This dissertation consists of three chapters. The main focus of the first chapter is on Lord’s paradox. Lord’s paradox arises from the conflicting inferences obtained from two alternative approaches that are typically used in evaluating the treatment effect using a pre-post test design. The chapter is designed as a guide to researchers who are using this research design. As an example, I investigate whether the treatment, a new mathematics curriculum, had an effect on student-level outcomes using both approaches. I demonstrate that Lord’s paradox can occur even when the two approaches are accounting for the measurement error in variables. Ordinal response data obtained from surveys and tests are often modeled using cumulative, adjacent-category, or continuation-ratio logit link functions. Instead of using one of these specifically designed procedures for each of these formulations of logits, we can modify the structure of the data in such a way that methods designed for dichotomous outcomes (i.e., binary logistic regression) allow us to achieve the targeted polytomous contrasting (cumulative, adjacent-category, or continuation-ratio). Thus, one can implement procedures designed for dichotomous outcomes on appropriately expanded data. The techniques presented in the second chapter, which I refer to as data expansion techniques, represent this approach. The third chapter aims to contribute to the estimation and interpretation of multidimensional item response theory (MIRT) models within the field of psychometrics and latent variable modeling. The main goal of the chapter is to advance the use of the second-order Rasch model. A second-order Rasch model assumes an overall dimension as a second-order factor that explains the covariance between the first-order (component) dimensions. The main contribution of the chapter is to suggest ways of using the model while still preserving the advantages of the Rasch model.
Historically, the main challenges in the use of such models were (1) computationally intensive estimation and (2) limited availability of software. In addition, it is difficult to obtain reliable and meaningful estimates when the variance of one of the dimensions is low relative to the other dimensions. In such cases, one first needs to re-assess whether the multidimensional structure is appropriate. One can then use an alternative parameterization of the model to avoid difficulties in the estimation, and the guidelines in this chapter provide recommendations on how to achieve such parameterizations with the Rasch model
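The data expansion idea described for the second chapter can be illustrated with a minimal sketch. It assumes the textbook form of each contrast (cumulative, continuation-ratio, adjacent-category) and is not taken from the dissertation itself; the function name `expand_ordinal` and the row layout are hypothetical. Each ordinal response is turned into one or more binary rows, which could then be passed to any binary logistic regression routine with the cutpoint as a covariate.

```python
def expand_ordinal(y, n_categories, contrast="cumulative"):
    """Expand ordinal responses (coded 0..n_categories-1) into binary rows.

    Returns a list of (person_index, cutpoint, binary_outcome) tuples.
    Hypothetical helper for illustration only.
    """
    rows = []
    for i, yi in enumerate(y):
        for k in range(n_categories - 1):
            if contrast == "cumulative":
                # Every person contributes a row at every cutpoint: is y above k?
                rows.append((i, k, int(yi > k)))
            elif contrast == "continuation":
                # Only persons who reached category k contribute: did they move past it?
                if yi >= k:
                    rows.append((i, k, int(yi > k)))
            elif contrast == "adjacent":
                # Only persons in category k or k+1 contribute: which of the two?
                if yi in (k, k + 1):
                    rows.append((i, k, int(yi == k + 1)))
            else:
                raise ValueError(f"unknown contrast: {contrast}")
    return rows


# Two hypothetical respondents on a three-category item.
for kind in ("cumulative", "continuation", "adjacent"):
    print(kind, expand_ordinal([0, 2], 3, kind))
```

Note that in the cumulative expansion the rows from one person are not independent, so a plain binary logistic fit on the expanded data would need an adjustment for that dependence; this sketch covers only the restructuring step.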
Does the Dizziness Handicap Inventory—Children and Adolescents (DHI-CA) Demonstrate Properties to Support Clinical Application in the Post-Concussion Population: A Rasch Analysis
The purpose of this cross-sectional validation study was to evaluate the clinical utility of the DHI-CA by (1) examining its dimensionality using exploratory factor analysis (EFA) and (2) calibrating DHI-CA items (using the multidimensional Rasch model) to obtain item difficulty levels. A retrospective chart review was conducted for 132 patients between the ages of 8 and 18 years (mean age = 15.3 ± 2.1 years) from a multidisciplinary post-concussion management tertiary center. Data were extracted on age, sex, and DHI-CA. EFA revealed that 12 out of 25 items did not fit in the subscale that they were originally described under, indicating poor dimensionality. Calibration of items on the Wright Maps revealed that 50% of the items pooled in the lower difficulty level, indicating a potential ceiling effect. Corrected item–rest correlations for the physical, emotional, walking/mobility, and community participation ranged from 0.44–0.66, 0.27–0.61, 0.54–0.57, and 0.32–0.69 (p < 0.001), respectively. The clinical utility of the DHI-CA was found to be questionable due to the presence of double-barreled items and the ceiling effect. Clinicians must supplement data from the DHI-CA with other measures and patient interviews to make informed clinical decisions specific to the post-concussion population until new, robust, and valid measures are developed
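As a rough illustration of the corrected item–rest correlations reported above, the sketch below assumes the usual definition: the Pearson correlation between an item's scores and the sum of the remaining items in the same subscale. The function names and the response matrix are hypothetical and are not drawn from the study's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_rest(responses, item):
    """Correlate one item with the total of all other items (item excluded)."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


# Hypothetical responses: rows are respondents, columns are items of one subscale.
responses = [
    [0, 0, 1],
    [1, 1, 1],
    [2, 2, 0],
    [3, 3, 2],
]
print(round(corrected_item_rest(responses, 0), 3))
```

Excluding the item from its own total avoids inflating the correlation, which is why the "corrected" version is preferred for short subscales.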
A Taxonomy of Critical Dimensions at the Intersection of Learning Analytics and Educational Measurement
15 pages. From a measurement perspective, a variety of analytic approaches are fast emerging in the data mining and exploratory analytics branches of the field of data sciences. In particular, for learning analytics, more theory is needed showing how the analytical approaches are related to one another and to their respective purposes when measurement is involved. For example, machine learning acting on process data can yield sets of specific patterns as results, but the critical question from a measurement perspective is: What do these results mean and how can they be used successfully in learning analytics? That is, if the goal is to make an inference regarding some underlying variable or set of elements about a student (or a teacher, school, or other agent or program within an educational setting), what claims are being made regarding the evidence and how can learning analytics contribute? In this paper we introduce techniques that move toward theory extensions that need to be developed at the intersection of learning analytics with measurement technology. For elucidating potential theoretical components from a measurement perspective, we draw on a type of case study research in the computer science domain, specifically employing “use cases.” A use case in computer science describes a scenario of use for software. Different use cases can describe different situations in which software may have utility. Like other multi-case designs, use cases can offer a means of exploring relationships and advancing potential theories by comparing similarities and differences among the cases. Here we explore three LA use case examples that differ purposively in critical ways. Examining their similarities and differences highlights potential dimensions that distinguish among emerging LA use cases at the intersection of data science and measurement technology
Learning in Digital Networks -- ICT literacy: A novel assessment of students' 21st century skills
The present investigation aims to fill some of the gaps revealed in the literature regarding the limited access to more advanced and novel assessment instruments for measuring students' ICT literacy. In particular, this study outlines the adaptation, further development, and validation of the Learning in Digital Networks-ICT literacy (LDN-ICT) test. The LDN-ICT test comprises an online performance-based assessment in which real-time student-student collaboration is facilitated through two different platforms (i.e., Google-Docs and chat). The test attempts to measure students' ability to handle digital information and to communicate and collaborate during problem solving. The data, derived from 144 students in grade 9, were analyzed using item response theory models (unidimensional and multidimensional Rasch models). The appropriateness of the models was evaluated by examining the item fit statistics. To gather validity evidence for the test, we investigated the differential item functioning of the individual items and correlations with other constructs (e.g., self-efficacy, collective efficacy, perceived usefulness and academic aspirations). Our results supported the hypothesized structure of LDN-ICT as comprising four dimensions. No significant differences across gender groups were identified. In support of existing research, we found positive relations to self-efficacy, academic aspirations, and socio-economic background. In sum, our results provide evidence for the reliability and validity of the test. Further refinements and the future use of the test are discussed
Seeking a Better Balance Between Efficiency and Interpretability: Comparing the Likert Response Format With the Guttman Response Format
The Likert response format for items is almost ubiquitous in the social sciences and has particular virtues regarding the relative simplicity of item generation and the efficiency of coding responses. However, in this article, we critique this very common item format, focusing on its affordance for interpretation in terms of internal structure validity evidence. We suggest an alternative, the Guttman response format, which we see as providing a better approach for gathering and interpreting internal structure validity evidence. Using a specific survey-based example, we illustrate how items in this alternative format can be developed, exemplify how such items operate, and explore some comparisons between the results from using the two formats. In conclusion, we recommend using the Guttman response format to improve the interpretability of the resulting outcomes. Finally, we also note how this approach may be used in tandem with items that use the Likert response format to help balance efficiency with interpretability. (PsycInfo Database Record (c) 2022 APA, all rights reserved)
The efficacy of acoustic-based articulatory phenotyping for characterizing and classifying four divergent neurodegenerative diseases using sequential motion rates
Despite the impacts of neurodegeneration on speech function, little is known about how to comprehensively characterize the resulting speech abnormalities using a set of objective measures. Quantitative phenotyping of speech motor impairments may have important implications for identifying clinical syndromes and their underlying etiologies, monitoring disease progression over time, and improving treatment efficacy. The goal of this research was to investigate the validity and classification accuracy of comprehensive acoustic-based articulatory phenotypes in speakers with distinct neurodegenerative diseases. Articulatory phenotypes were characterized based on acoustic features that were selected to represent five components of motor performance: Coordination, Consistency, Speed, Precision, and Rate. The phenotypes were first used to characterize the articulatory abnormalities across four progressive neurologic diseases known to have divergent speech motor deficits: amyotrophic lateral sclerosis (ALS), progressive ataxia (PA), Parkinson’s disease (PD), and the nonfluent variant of primary progressive aphasia and progressive apraxia of speech (nfPPA + PAOS). We then examined the efficacy of articulatory phenotyping for disease classification. Acoustic analyses were conducted on audio recordings of 217 participants (i.e., 46 ALS, 52 PA, 60 PD, 20 nfPPA + PAOS, and 39 controls) during a sequential speech task. Results revealed evidence of distinct articulatory phenotypes for the four clinical groups and that the phenotypes demonstrated strong classification accuracy for all groups except ALS. Our results highlight the phenotypic variability present across neurodegenerative diseases, which, in turn, may inform (1) the differential diagnosis of neurological diseases and (2) the development of sensitive outcome measures for monitoring disease progression or assessing treatment efficacy
Measuring social norms of intimate partner violence to exert control over wife agency, sexuality, and reproductive autonomy: an item response modelling of the IPV-ASRA scale.
Background: The field of violence prevention research is unequivocal that interventions must target contextual factors, like social norms, to reduce gender-based violence. However, limited research exists on the social norms contributing to intimate partner violence or reproductive coercion. One of the driving factors is a lack of measurement tools to accurately assess social norms. Methods: Using an item response modelling approach, this study psychometrically assesses the reliability and validity of a social norms measure of the acceptability of intimate partner violence to exert control over wife agency, sexuality, and reproductive autonomy with data from a population-based sample of married adolescent girls (ages 13-18) and their husbands in rural Niger (n = 559 husband-wife dyads) collected in 2019. Results: A two-dimensional Partial Credit Model for polytomous items was fit, showing evidence of reliability and validity. Higher scores on the "challenging husband authority" dimension were statistically associated with husband perpetration of intimate partner violence. Conclusions: This brief scale is a short (5 items), practical measure with strong reliability and validity evidence. This scale can help identify populations with high need for social norms-focused IPV prevention and help measure the impact of such efforts
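For readers unfamiliar with the Partial Credit Model named above, the following is a minimal sketch of its category probabilities for a single item. It assumes the standard unidimensional form, in which the log-odds of scoring in category x rather than x-1 is the person location minus the x-th step difficulty; the study itself fit a two-dimensional version, which this sketch does not reproduce. The function name and parameter values are hypothetical.

```python
from math import exp


def pcm_probabilities(theta, deltas):
    """Category probabilities for one polytomous item under the Partial Credit Model.

    theta  : person location on the latent dimension
    deltas : step difficulties, one per transition between adjacent categories
    Returns a list of len(deltas) + 1 probabilities (categories 0..m).
    """
    # Category 0 has an empty sum (0); category x accumulates (theta - delta_j) for j <= x.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + (theta - delta))
    weights = [exp(v) for v in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category item with step difficulties -0.5 and 0.8.
print([round(p, 3) for p in pcm_probabilities(0.2, [-0.5, 0.8])])
```

With a single step difficulty the expression reduces to the dichotomous Rasch model, which is one way to check the implementation.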