
    Precision and sample size requirements for regression-based norming methods for change scores

    To interpret a person’s change score, one typically transforms the change score into, for example, a percentile, so that one knows the person’s location in a distribution of change scores. Transformed scores are referred to as norms, and the construction of norms is referred to as norming. Two often-used norming methods for change scores are the regression-based change approach and the T Scores for Change method. In this article, we discuss the similarities and differences between these norming methods, and use a simulation study to systematically examine the precision of the two methods and to establish the minimum sample size requirements for satisfactory precision.
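The percentile idea in this abstract can be sketched as follows. This is a minimal illustration of locating a change score in a normative distribution; the normative sample, its distribution, and the function name are illustrative assumptions, not the regression-based or T Scores for Change procedures the article itself studies:

```python
import numpy as np

# Hypothetical normative sample of change scores (posttest minus pretest);
# size and distribution are illustrative assumptions.
rng = np.random.default_rng(0)
norm_group = rng.normal(loc=0.0, scale=5.0, size=1000)

def change_score_percentile(change, norms):
    """Percentile of an observed change score in a normative distribution."""
    return 100.0 * np.mean(np.asarray(norms) <= change)

# A change score of 0 should sit near the middle of a symmetric norm group.
p = change_score_percentile(0.0, norm_group)
```

A regression-based norming method would instead condition the norms on covariates such as the pretest score, so that a person is compared with others who started at the same level.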

    Steps toward preregistration of research on research integrity

    Background: A proposal to encourage the preregistration of research on research integrity was developed and adopted as the Amsterdam Agenda at the 5th World Conference on Research Integrity (Amsterdam, 2017). This paper reports on the degree to which abstracts of the 6th World Conference on Research Integrity (Hong Kong, 2019) reported on preregistered research. Methods: Conference registration data on participants presenting a paper or a poster at the 6th WCRI were made available to the research team. Because the data set was too small for inferential statistics, this report is limited to a basic description of results and some recommendations that should be considered when taking further steps to improve preregistration. Results: 19% of the 308 presenters preregistered their research. Of the 56 usable cases, fewer than half provided information on the six key elements of the Amsterdam Agenda. Others provided information that invalidated their data, such as an uninformative URL. There was no discernible difference between qualitative and quantitative research. Conclusions: Some presenters at the WCRI have preregistered their research on research integrity, but further steps are needed to increase the frequency and completeness of preregistration. One approach would be to make preregistration a requirement for research presented at the World Conferences on Research Integrity. http://deepblue.lib.umich.edu/bitstream/2027.42/174042/1/41073_2021_Article_108.pd

    Psychometric properties of the Bermond-Vorst Alexithymia Questionnaire (BVAQ) in the general population and a clinical population

    Introduction: The Bermond-Vorst Alexithymia Questionnaire (BVAQ) has been validated in student samples and small clinical samples, but not in the general population; thus, representative general-population norms are lacking. Aim: We examined the factor structure of the BVAQ in Longitudinal Internet Studies for the Social Sciences panel data from the Dutch general population (N = 974). Results: Factor analyses revealed a first-order five-factor model and a second-order two-factor model. However, in the second-order model, the factor interpreted as analyzing ability loaded on both the affective factor and the cognitive factor. Further analyses showed that the first-order test scores are more reliable than the second-order test scores. External and construct validity were addressed by comparing BVAQ scores with a clinical sample of patients suffering from somatic symptom and related disorder (SSRD) (N = 235). BVAQ scores differed significantly between the general population and patients suffering from SSRD, suggesting acceptable construct validity. Age was positively associated with alexithymia, and males showed higher levels of alexithymia. Discussion: The BVAQ is a reliable alternative for measuring alexithymia.

    A cross-cultural comparison of non-cognitive outputs towards science between Turkish and Dutch students taking into account detected person misfit

    Comparisons of countries based on their PISA results can be considered cross-cultural studies. An important issue in such studies is that the measurement tools must be culturally and linguistically equivalent; cultural or linguistic differences in measurement tools may threaten validity. Aberrant response behavior is another important factor that affects validity, and person-fit analysis is a statistical method for detecting aberrant individual behavior. This study investigated the effect of students who were flagged as aberrant on a comparison of Turkish and Dutch students' non-cognitive outputs towards science in PISA 2015. Whether items contained differential item functioning (DIF) was determined with Poly-SIBTEST for each scale, and misfitting students were detected with the nonparametric GNp and U3p statistics. The flagged students did not affect the comparison of the means of non-cognitive outputs towards science, but they did have an effect on DIF.
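Nonparametric person-fit statistics such as the GNp and U3p mentioned here build on counts of Guttman errors: cases where a respondent answers a harder item correctly while missing an easier one. A minimal sketch of that building block, with made-up response patterns and item proportions-correct (this is not the full GNp or U3p computation):

```python
import numpy as np

def guttman_errors(responses, item_popularity):
    """Count Guttman errors: item pairs where an easier item (higher
    proportion correct) is answered 0 while a harder item is answered 1."""
    order = np.argsort(-np.asarray(item_popularity))  # easiest item first
    r = np.asarray(responses)[order]
    errors = 0
    for i in range(len(r)):
        for j in range(i + 1, len(r)):
            if r[i] == 0 and r[j] == 1:  # easier wrong, harder right
                errors += 1
    return errors

# A Guttman-consistent pattern (all successes on the easiest items) has 0 errors.
g_consistent = guttman_errors([1, 1, 1, 0, 0], [0.9, 0.8, 0.7, 0.4, 0.2])
# A reversed pattern is maximally aberrant for this item ordering.
g_aberrant = guttman_errors([0, 0, 1, 1, 1], [0.9, 0.8, 0.7, 0.4, 0.2])
```

Statistics such as U3p normalize this count against its minimum and maximum given the person's total score, so that patterns can be flagged on a common scale.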

    Comparison of Classical Test Theory and Item Response Theory in Individual Change Assessment

    Clinical psychologists are advised to assess clinical and statistical significance when assessing change in individual patients. Individual change assessment can be conducted using either the methodologies of classical test theory (CTT) or item response theory (IRT). Researchers have been optimistic about the possible advantages of using IRT rather than CTT in change assessment. However, little empirical evidence is available to support the alleged superiority of IRT in the context of individual change assessment. In this study, the authors compared the CTT and IRT methods with respect to their Type I error and detection rates. Preliminary results revealed that IRT is indeed superior to CTT in individual change detection, provided that the tests consist of at least 20 items. For shorter tests, however, CTT is generally better at correctly detecting change in individuals. The results and their implications are discussed.
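The CTT route to statistically significant individual change is commonly operationalized with a reliable change index of the Jacobson-Truax type, which scales the observed difference by the standard error of the difference score. A minimal sketch with illustrative numbers (not the simulation design of this study):

```python
import math

def reliable_change_index(pretest, posttest, sd_pre, reliability):
    """Jacobson-Truax style reliable change index under CTT:
    RCI = (posttest - pretest) / SE_diff, where
    SE_diff = sqrt(2) * SEM and SEM = sd_pre * sqrt(1 - reliability)."""
    sem = sd_pre * math.sqrt(1.0 - reliability)
    se_diff = math.sqrt(2.0) * sem
    return (posttest - pretest) / se_diff

# Illustrative values; |RCI| > 1.96 is conventionally taken as reliable change.
rci = reliable_change_index(pretest=30.0, posttest=18.0,
                            sd_pre=7.5, reliability=0.90)
```

An IRT counterpart would instead compare latent trait estimates at the two occasions, using their standard errors, which is where the test-length dependence reported in the abstract comes into play.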

    Estimating difference-score reliability in pretest-posttest settings

    Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorder, or addiction. Reliability of difference scores is important for interpreting observed change. This article compares the well-documented traditional method and the unfamiliar, rarely used item-level method for estimating difference-score reliability. We simulated data under various conditions that are typical of change assessment in pretest-posttest designs. The item-level method had smaller bias and greater precision than the traditional method and may be recommended for practical use.
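The traditional (test-level) method referred to here estimates difference-score reliability from the standard deviations and reliabilities of the two test scores and their correlation, via the classical formula rho_D = (sd1^2*rho1 + sd2^2*rho2 - 2*sd1*sd2*r12) / (sd1^2 + sd2^2 - 2*sd1*sd2*r12). A sketch with illustrative values (the item-level method the article favors works from the item scores instead):

```python
def difference_score_reliability(sd1, sd2, rel1, rel2, r12):
    """Traditional test-level estimate of difference-score reliability:
    numerator   = true-score variance of the difference,
    denominator = observed variance of the difference."""
    num = sd1**2 * rel1 + sd2**2 * rel2 - 2.0 * sd1 * sd2 * r12
    den = sd1**2 + sd2**2 - 2.0 * sd1 * sd2 * r12
    return num / den

# Illustrative values: two equally reliable administrations correlating 0.5.
rho_d = difference_score_reliability(sd1=10.0, sd2=10.0,
                                     rel1=0.9, rel2=0.9, r12=0.5)
```

The formula makes visible why difference scores can be unreliable: as the pretest-posttest correlation r12 approaches the reliabilities, the numerator shrinks faster than the denominator.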
