
    Toward the Development of Cancer Literacy Assessment Tools

    Background: This study is the first to document the development of breast and cervical cancer literacy assessments that can be administered orally by laypersons. Methods: Critical indicators of cancer literacy were identified through a review of the pertinent literature and interviews with ethnically diverse women. A 29-question assessment was pilot-tested for language appropriateness, and a score of 75% was established as the threshold for functional cancer literacy. Results: The assessment tools demonstrated a high level of internal consistency. Paired t-test analysis of pre- and post-intervention tests showed that the instrument was sensitive to changes in breast and cervical cancer literacy as well as improvements in functional cancer literacy. Conclusion: The analysis demonstrated that the instrument is a reliable and valid indicator of breast and cervical cancer literacy. These assessment instruments give researchers and educators a tool for measuring functional cancer literacy, enhancing their ability to tailor appropriate health interventions and promotions.
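
    As a rough illustration of the analyses described, the sketch below scores a dichotomous 29-item assessment against the 75% functional-literacy cut, computes Cronbach's alpha as a measure of internal consistency, and runs a paired t-test on pre/post scores. The data are synthetic and the function names are our own; this is not the authors' instrument or data.

```python
# Minimal sketch with synthetic data: the 29 items and the 75% cut come
# from the abstract; everything else is illustrative.
import numpy as np
from scipy import stats

def cronbach_alpha(items):
    """items: respondents x questions matrix of 0/1 scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
pre = rng.integers(0, 2, size=(50, 29))                  # 29-item assessment
post = np.clip(pre + rng.integers(0, 2, size=pre.shape), 0, 1)

alpha = cronbach_alpha(pre)
pass_rate = (post.mean(axis=1) >= 0.75).mean()           # 75% literacy cut
t, p = stats.ttest_rel(post.mean(axis=1), pre.mean(axis=1))
print(f"alpha={alpha:.2f}  pass rate={pass_rate:.0%}  t={t:.2f}  p={p:.3g}")
```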

    Optimal item pool design for computerized adaptive tests with polytomous items using GPCM

    Computerized adaptive testing (CAT) is a testing procedure with advantages in improving measurement precision and increasing test efficiency. An item pool with optimal characteristics is the foundation for a CAT program to achieve those desirable psychometric features. This study proposed a method for designing an optimal item pool for tests with polytomous items using the generalized partial credit model (GPCM). It extended an existing optimality-approximation method so that polytomous items are described succinctly for the purpose of pool design. Optimal item pools were generated using CAT simulations with and without the practical constraints of content balancing and item exposure control. The performance of the item pools was evaluated against an operational item pool. The results indicated that the item pools designed with stratification based on discrimination parameters performed well, making efficient use of the less discriminative items within the target accuracy levels. The implications for developing item pools are also discussed.
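
    For readers unfamiliar with the model, the sketch below computes GPCM category response probabilities for a single item; it is a generic textbook formulation with illustrative parameter values, not the paper's pool-design algorithm.

```python
# Generic GPCM sketch: P(X = k | theta) is proportional to
# exp( sum_{v<=k} a * (theta - delta_v) ), with delta_0 = 0 by convention.
import numpy as np

def gpcm_probs(theta, a, deltas):
    """Category probabilities for k = 0..m given discrimination a and
    step parameters deltas (length m)."""
    steps = np.concatenate(([0.0], np.asarray(deltas)))  # prepend delta_0 = 0
    cum = np.cumsum(a * (theta - steps))                 # cumulative logits
    num = np.exp(cum - cum.max())                        # numerically stable
    return num / num.sum()

# A CAT engine would use probabilities like these to compute item
# information and pick the most informative item at the current theta.
print(gpcm_probs(theta=0.5, a=1.2, deltas=[-1.0, 0.0, 1.5]))
```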

    Exploring differential item functioning in the SF-36 by demographic, clinical, psychological and social factors in an osteoarthritis population

    The SF-36 is a very commonly used generic measure of health outcome in osteoarthritis (OA). An important, but frequently overlooked, aspect of validating health outcome measures is to establish whether items work in the same way across subgroups of a population: if respondents have the same 'true' level of outcome, does an item give the same score in different subgroups, or is it biased towards one subgroup or another? Differential item functioning (DIF) analysis can identify items that may be biased for one group or another and has been applied to measuring patient-reported outcomes. Items may show DIF across conditions and between cultures; however, the SF-36 has not been specifically examined in an osteoarthritis population, nor in a UK population. Hence, the aim of the study was to apply DIF methods to the SF-36 for a UK OA population. The sample comprised a community sample of 763 people with OA who participated in the Somerset and Avon Survey of Health. The SF-36 was explored for DIF with respect to demographic, social, clinical and psychological factors, using well-developed ordinal regression models to identify DIF items. Results: DIF items were found by age (6 items), employment status (6 items), social class (2 items), mood (2 items), hip vs. knee (2 items), social deprivation (1 item) and body mass index (1 item). Although the impact of the DIF items rarely had a significant effect on the conclusions of group comparisons, in most cases there was a significant change in effect size. Overall, the SF-36 performed well, with only a small number of DIF items identified, a reassuring finding in view of the frequent use of the SF-36 in OA. Nevertheless, where DIF items are identified it would be advisable to analyse data taking account of them, especially when age effects are the focus of interest.
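
    The sketch below shows one widely used ordinal-logistic-regression DIF procedure of the kind the study refers to: a likelihood-ratio test of whether group membership and a group-by-ability interaction improve on a model with ability (total score) alone. The synthetic data, variable names, and use of statsmodels' OrderedModel are illustrative assumptions, not the study's exact models.

```python
# Illustrative ordinal-regression DIF test (assumes statsmodels >= 0.12).
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.miscmodels.ordinal_model import OrderedModel

def dif_lr_test(item, total, group):
    """LR test: do group and group-x-ability terms improve on ability alone?"""
    y = pd.Series(pd.Categorical(item, ordered=True))
    base = pd.DataFrame({"total": total})
    full = base.assign(group=group, interaction=total * group)
    ll0 = OrderedModel(y, base, distr="logit").fit(method="bfgs", disp=False).llf
    ll1 = OrderedModel(y, full, distr="logit").fit(method="bfgs", disp=False).llf
    g2 = 2 * (ll1 - ll0)
    return g2, stats.chi2.sf(g2, 2)           # 2 extra parameters

rng = np.random.default_rng(1)
total = rng.normal(size=400)                  # ability proxy (total score)
group = rng.integers(0, 2, size=400)          # e.g. age group
item = pd.cut(total + 0.6 * group + rng.normal(size=400),
              bins=[-np.inf, -0.5, 0.5, np.inf], labels=False)
print(dif_lr_test(item, total, group))        # large G2 / small p flags DIF
```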

    Network Psychometrics

    This chapter provides a general introduction to network modeling in psychometrics. The chapter starts with the statistical model formulation of pairwise Markov random fields (PMRF), followed by an introduction to the PMRF suitable for binary data: the Ising model. The Ising model is used in ferromagnetism to explain phase transitions in a field of particles. Following the description of the Ising model in statistical physics, the chapter shows that the Ising model is closely related to models used in psychometrics: it can be shown to be equivalent to certain kinds of logistic regression models, loglinear models and multidimensional item response theory (MIRT) models. The equivalence between the Ising model and the MIRT model puts standard psychometrics in a new light and leads to a strikingly different interpretation of well-known latent variable models. The chapter gives an overview of methods that can be used to estimate the Ising model, and concludes with a discussion of the interpretation of latent variables given the equivalence between the Ising model and MIRT. Comment: In Irwing, P., Hughes, D., and Booth, T. (2018). The Wiley Handbook of Psychometric Testing, 2 Volume Set: A Multidisciplinary Reference on Survey, Scale and Test Development. New York: Wiley.
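
    To make the model concrete: for binary variables the Ising model assigns P(x) proportional to exp(sum_i tau_i x_i + sum_{i<j} omega_ij x_i x_j). The sketch below evaluates this exactly for a small network; the parameter values are illustrative.

```python
# Exact Ising probabilities for a small binary network (x_i in {0, 1}).
import itertools
import numpy as np

def ising_pmf(tau, omega):
    """tau: node thresholds; omega: symmetric weights with zero diagonal.
    Returns all 2^n states and their probabilities (feasible for small n)."""
    n = len(tau)
    states = np.array(list(itertools.product([0, 1], repeat=n)))
    # 0.5 * x' omega x equals sum_{i<j} omega_ij x_i x_j for this omega.
    potential = states @ tau + 0.5 * np.einsum("si,ij,sj->s", states, omega, states)
    p = np.exp(potential - potential.max())
    return states, p / p.sum()

tau = np.array([-0.5, 0.2, 0.0])
omega = np.array([[0.0, 1.0, 0.3],
                  [1.0, 0.0, -0.4],
                  [0.3, -0.4, 0.0]])
states, probs = ising_pmf(tau, omega)
# The conditional of one node given the rest is a logistic regression,
# which is the bridge to the (M)IRT equivalence discussed in the chapter.
print(states[probs.argmax()], probs.max())
```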

    Compensatory and non-compensatory multidimensional randomized item response models

    Randomized response (RR) models are often used for analysing univariate randomized response data and measuring the population prevalence of sensitive behaviours. There is much empirical support for the belief that RR methods improve the cooperation of respondents. Recently, RR models have been extended to measure individual unidimensional behaviour. An extension of this modelling framework is proposed to measure compensatory or non-compensatory multiple sensitive factors underlying the randomized item response process. A confirmatory multidimensional randomized item response theory (MRIRT) model is proposed for the analysis of multivariate RR data, modelling the response process and specifying structural relationships between sensitive behaviours and background information. A Markov chain Monte Carlo algorithm is developed to simultaneously estimate the parameters of the MRIRT model. The model extension enables the computation of individual true item response probabilities, estimates of individuals' sensitive behaviour in different domains, and their relationships with background variables. An MRIRT analysis is presented of data from a college alcohol problem scale, measuring alcohol-related socio-emotional and community problems, and an alcohol expectancy questionnaire, measuring alcohol-related sexual enhancement expectancies. Students were interviewed via direct or RR questioning. Scores on alcohol-related problems and expectancies are significantly higher for the group of students questioned using the RR technique. Alcohol-related problems and sexual enhancement expectancies are moderately positively correlated and vary differently across gender and universities.
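
    As background, the sketch below shows the forced-response randomisation that RR item response models typically build on: the observed 'yes' probability is a known mixture of the truthful response probability and a forced response. The design constants and the 2PL link are illustrative assumptions, not the paper's exact MRIRT specification.

```python
# Forced-response design: with probability p_truth the respondent answers
# honestly; otherwise the randomising device forces a 'yes' (illustrative
# constants, not the paper's design).
import numpy as np

def rr_observed_prob(p_true, p_truth=0.8, p_forced_yes=0.1):
    """P(observed yes) = p_truth * P(true yes) + p_forced_yes, where
    p_forced_yes is the unconditional probability of a forced 'yes'."""
    return p_truth * p_true + p_forced_yes

# An IRT layer (here a 2PL for one item) supplies P(true yes | theta);
# an MCMC sampler would estimate theta and item parameters from the
# observed randomized responses.
theta, a, b = 0.3, 1.1, -0.2
p_true = 1.0 / (1.0 + np.exp(-a * (theta - b)))
print(f"P(true yes)={p_true:.3f}  P(observed yes)={rr_observed_prob(p_true):.3f}")
```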

    A Knowledge Graph Enhanced Learner Model to Predict Outcomes to Questions in the Medical Field

    The training curriculum for medical doctors requires the intensive and rapid assimilation of a large body of knowledge. To help medical students optimize their learning path, the SIDES 3.0 national French project aims to extend an existing platform with intelligent learning services. This platform contains a large number of annotated learning resources, from training and evaluation questions to students' learning traces, available as an RDF knowledge graph. For the platform to provide personalized learning services, the knowledge and skills progressively acquired by students on each subject should be taken into account when choosing the training and evaluation questions presented to them in the form of customized quizzes. To achieve such recommendations, a first step is the ability to predict students' outcomes when answering questions (success or failure). With this objective in mind, in this paper we propose a model of students' learning on the SIDES platform that is able to make such predictions. The model extends a state-of-the-art approach to fit the specificity of medical data and to take into account additional knowledge extracted from the OntoSIDES knowledge graph in the form of graph embeddings. Through an evaluation based on learning traces for the pediatrics and cardiovascular specialties, we show that considering the vector representations of answer, question and student nodes substantially improves the prediction results compared to baseline models.
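
    The sketch below shows the general shape of such a predictor: pre-trained knowledge-graph embeddings of the student and question nodes are concatenated and fed to a classifier that predicts success or failure. All names, dimensions, and the plain logistic classifier are illustrative assumptions; the paper extends a state-of-the-art learner model rather than this baseline.

```python
# Baseline sketch: predict answer correctness from (assumed) pre-trained
# graph embeddings of student and question nodes. Synthetic data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n_students, n_questions, dim = 200, 300, 16
student_emb = rng.normal(size=(n_students, dim))   # e.g. RDF2Vec/TransE output
question_emb = rng.normal(size=(n_questions, dim))

# One row per (student, question) interaction from the learning traces.
s_idx = rng.integers(0, n_students, size=5000)
q_idx = rng.integers(0, n_questions, size=5000)
X = np.hstack([student_emb[s_idx], question_emb[q_idx]])
y = rng.integers(0, 2, size=5000)                  # success/failure labels

clf = LogisticRegression(max_iter=1000).fit(X, y)
print("predicted P(success):", clf.predict_proba(X[:3])[:, 1])
```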

    Standard setting: Comparison of two methods

    BACKGROUND: The outcome of assessments is determined by the standard-setting method used. There is a wide range of standard-setting methods, and the two used most extensively in undergraduate medical education in the UK are the norm-reference and criterion-reference methods. The aims of the study were to compare these two standard-setting methods for a multiple-choice question examination and to estimate the test-retest and inter-rater reliability of the modified Angoff method. METHODS: The norm-reference method of standard-setting (mean minus 1 SD) was applied to the 'raw' scores of 78 fourth-year medical students on a multiple-choice question (MCQ) examination. Two panels of raters also set the standard using the modified Angoff method for the same multiple-choice question paper on two occasions (6 months apart). We compared the pass/fail rates derived from the norm-reference and Angoff methods and also assessed the test-retest and inter-rater reliability of the modified Angoff method. RESULTS: The pass rate with the norm-reference method was 85% (66/78) and that with the Angoff method was 100% (78/78). The percentage agreement between the Angoff and norm-reference methods was 78% (95% CI 69%–87%). The modified Angoff method had an inter-rater reliability of 0.81–0.82 and a test-retest reliability of 0.59–0.74. CONCLUSION: There were significant differences in the outcomes of these two standard-setting methods, as shown by the difference in the proportion of candidates that passed and failed the assessment. The modified Angoff method was found to have good inter-rater reliability and moderate test-retest reliability.
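
    Both cut-score computations compared in the study can be stated in a few lines. The sketch below uses synthetic scores and judge ratings for illustration; the panel size and item count are assumptions, not the study's.

```python
# Two standard-setting methods on synthetic data.
import numpy as np

rng = np.random.default_rng(3)
scores = rng.normal(60, 10, size=78)               # 78 candidates' MCQ scores (%)

# Norm-referenced standard: pass mark = mean - 1 SD of the cohort.
norm_cut = scores.mean() - scores.std(ddof=1)

# Modified Angoff: each judge estimates, per item, the probability that a
# minimally competent candidate answers correctly; the cut score is the
# mean over judges of their average item probability.
angoff = rng.uniform(0.3, 0.8, size=(8, 50))       # 8 judges x 50 items (assumed)
angoff_cut = angoff.mean(axis=1).mean() * 100      # as a percentage

print(f"norm cut {norm_cut:.1f}%  Angoff cut {angoff_cut:.1f}%")
print(f"pass rates: {(scores >= norm_cut).mean():.0%} vs "
      f"{(scores >= angoff_cut).mean():.0%}")
```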

    The Number of Factors Problem

    This chapter focuses on formal criteria for assessing dimensionality in exploratory factor modelling, with the aim of facilitating the selection of a proper criterion in empirical practice. It introduces the different foundations that underlie the various criteria and provides an overview of currently available formal criteria, selected on the basis of their popularity in empirical practice and/or proven effectiveness. The chapter successively reviews principal component analysis (PCA)‐based methods and common factor analysis (CFA)‐based methods for assessing the number of common factors, and suggests strategies for assessing the number of factors underlying an empirical data set. It explains the finding, reported in many studies, that the Kaiser criterion yields clearly inaccurate indications of the number of PCs and common factors, mostly indicating too many factors. The performance of the minimum average partial (MAP) method in indicating the number of major factors deteriorated as unique variances increased, with no clear tendency to over‐ or under‐indicate the number of factors.
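
    Two of the reviewed criteria are easy to demonstrate. The sketch below applies the Kaiser criterion (eigenvalues greater than 1) and Horn's parallel analysis to synthetic data with a known three-factor structure; the data and thresholds are illustrative.

```python
# Kaiser criterion vs. parallel analysis on synthetic three-factor data.
import numpy as np

rng = np.random.default_rng(4)
n, p, k = 500, 12, 3
loadings = rng.normal(size=(p, k))
X = rng.normal(size=(n, k)) @ loadings.T + rng.normal(size=(n, p))

eig = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]  # descending
kaiser = int((eig > 1).sum())                  # tends to over-factor

# Parallel analysis: keep components whose eigenvalues exceed the 95th
# percentile of eigenvalues from random data of the same size.
random_eigs = np.array([
    np.linalg.eigvalsh(np.corrcoef(rng.normal(size=(n, p)), rowvar=False))[::-1]
    for _ in range(200)])
threshold = np.percentile(random_eigs, 95, axis=0)
parallel = int((eig > threshold).argmin())     # count of leading exceedances

print(f"Kaiser: {kaiser} factors, parallel analysis: {parallel} factors")
```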

    Linking tests of English for academic purposes to the CEFR: the score user’s perspective

    The Common European Framework of Reference for Languages (CEFR) is widely used in setting language proficiency requirements, including for international students seeking access to university courses taught in English. When different language examinations have been related to the CEFR, the process is claimed to help score users, such as university admissions staff, to compare and evaluate these examinations as tools for selecting qualified applicants. This study analyses the linking claims made for four internationally recognised tests of English widely used in university admissions. It uses the Council of Europe's (2009) suggested stages of specification, standard setting, and empirical validation to frame an evaluation of the extent to which, in this context, the CEFR has fulfilled its potential to "facilitate comparisons between different systems of qualifications." Findings show that testing agencies make little use of CEFR categories to explain test content; represent the relationships between their tests and the framework in different terms; and arrive at conflicting conclusions about the correspondences between test scores and CEFR levels. This raises questions about the capacity of the CEFR to communicate competing views of a test construct within a coherent overarching structure.