25 research outputs found

    The United States COVID-19 Forecast Hub dataset

    Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at the county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
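    As an illustration of the GitHub access path mentioned above, the sketch below reads one submission file directly from the hub's public repository with pandas. The repository name, directory layout, model name, submission date, and target string are assumptions based on the hub's published structure and should be verified against the repository itself.

```python
# Minimal sketch of reading one Forecast Hub submission file from GitHub.
# The path below (reichlab/covid19-forecast-hub, data-processed/<model>/
# <date>-<model>.csv) and the example model/date/target are illustrative
# assumptions, not guaranteed to match the current repository layout.
import pandas as pd

url = (
    "https://raw.githubusercontent.com/reichlab/covid19-forecast-hub/master/"
    "data-processed/COVIDhub-ensemble/2021-01-04-COVIDhub-ensemble.csv"
)
forecasts = pd.read_csv(url, dtype={"location": str})

# Keep the quantile rows for one target, e.g. 1-week-ahead incident deaths
# at the national level (FIPS-style location code "US").
subset = forecasts[
    (forecasts["type"] == "quantile")
    & (forecasts["target"] == "1 wk ahead inc death")
    & (forecasts["location"] == "US")
]
print(subset[["target_end_date", "quantile", "value"]])
```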

    An approach to assessing unidimensionality revisited

    A reanalysis of data from Hambleton and Rovinelli (1986) argues that the methods suggested by Bejar (1980) are a valuable descriptive tool for assessing the unidimensionality assumption when a priori information is available about possible response factors. Index terms: achievement testing, item response theory, unidimensionality.
    Bejar, Isaac I. (1988). An approach to assessing unidimensionality revisited. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/104312

    Subject matter experts' assessment of item statistics

    This study was conducted to determine the degree to which subject matter experts could predict the difficulty and discrimination of items on the Test of Standard Written English. It was concluded that, despite an extended training period, the raters did not approach a high level of accuracy, nor were they able to pinpoint the factors that contribute to item difficulty and discrimination. Further research should attempt to uncover those factors by examining the items from a linguistic and psycholinguistic perspective. It is argued that, by coupling linguistic features of the items with subject matter ratings, it may be possible to attain more accurate predictions of item difficulty and discrimination.

    An application of the continuous response level model to personality measurement

    This paper reports an application of Samejima’s latent trait model for continuous responses. A brief review of latent trait theory is presented, including an elaboration of the theory for test responses other than dichotomous responses, in order to put the continuous model in perspective. The model is then applied using the Impulsivity and Harmavoidance scales of Jackson’s Personality Research Form. Special attention is given to the requirement that the model be invariant across populations and sex groups. Results showed that responses from males fit the model better than those from females, especially for the Harmavoidance scale. The practical and theoretical implications of the study are discussed.
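    For context on the family of models the review covers, Samejima's graded response model expresses the probability of responding in ordered category k or above on an item through a boundary curve, and the continuous response model applied here can be viewed as the limiting case as the ordered categories become arbitrarily fine. A sketch of the usual logistic parameterization (not reproduced from the paper) is:

```latex
% Boundary curve for responding in category k or higher on item i
P^{*}_{ik}(\theta) = \Pr(X_i \ge k \mid \theta)
                   = \frac{1}{1 + \exp\{-a_i(\theta - b_{ik})\}}

% Probability of responding exactly in category k
\Pr(X_i = k \mid \theta) = P^{*}_{ik}(\theta) - P^{*}_{i,k+1}(\theta)
```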

    Towards automatic scoring of non-native spontaneous speech

    This paper investigates the feasibility of automated scoring of the spoken English proficiency of non-native speakers. Unlike existing automated assessments of spoken English, our data consist of spontaneous spoken responses to complex test items. We first compute a set of features relevant for measuring communicative competence based on speech recognition output. We then perform both a quantitative and a qualitative analysis of these features using two different machine learning approaches. (1) We use support vector machines to produce a score and evaluate it with respect to a mode baseline and to human rater agreement. We find that scoring based on support vector machines yields accuracies approaching inter-rater agreement in some cases. (2) We use classification and regression trees to understand the role of different features and feature classes in the characterization of speaking proficiency by human scorers. Our analysis shows that, across all the test items, most or all of the feature classes are used in the nodes of the trees, suggesting that the scores are, appropriately, a combination of multiple components of speaking proficiency. Future research will concentrate on extending the set of features and introducing new feature classes to arrive at a scoring model that comprises additional relevant aspects of speaking proficiency.
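    A rough sketch of the two modeling approaches described above, using scikit-learn rather than the authors' own tooling; the feature names, synthetic data, and score scale are invented placeholders, not the paper's actual feature set.

```python
# Sketch of (1) support vector regression for score prediction and
# (2) a regression tree for inspecting which features drive the scores.
# Features (words per second, silence ratio, etc.) are illustrative only.
import numpy as np
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
n = 200
feature_names = ["words_per_sec", "silence_ratio", "types_per_token", "asr_confidence"]
X = rng.normal(size=(n, 4))  # pretend recognizer-derived features
y = np.clip(2.5 + X @ np.array([0.6, -0.4, 0.5, 0.3])
            + rng.normal(scale=0.3, size=n), 1, 4)  # placeholder 1-4 scores

# (1) Support vector machine producing a proficiency score.
svm_scorer = SVR(kernel="rbf").fit(X, y)
print("example score:", svm_scorer.predict(X[:1]))

# (2) Regression tree to see which feature classes appear in the nodes.
tree = DecisionTreeRegressor(max_depth=3).fit(X, y)
print(export_text(tree, feature_names=feature_names))
```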

    A study of pre-equating based on item response theory

    This report describes a feasibility study of item response theory (IRT) as a means of equating the Test of Standard Written English (TSWE). The study focused on the possibility of pre-equating, that is, deriving the equating transformation prior to the final administration of the test. The three-parameter logistic model was postulated as the response model and its fit was assessed at the item, subscore, and total score levels. Minor problems were found at each of these levels, but on the whole the three-parameter model was found to portray the data well. The adequacy of the equating provided by IRT procedures was investigated in two TSWE forms. It was concluded that pre-equating does not appear to present problems beyond those inherent to IRT equating.
    Bejar, Isaac I.; Wingersky, Marilyn S. (1982). A study of pre-equating based on item response theory. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/101475
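    For reference, the three-parameter logistic model postulated in the study, and the test characteristic (true-score) curve through which IRT pre-equating can relate two forms calibrated on a common scale, take the standard textbook form sketched below (not reproduced from the report; D ≈ 1.7 is the usual scaling constant):

```latex
% Three-parameter logistic item response function for item i
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\{-D a_i(\theta - b_i)\}},
\qquad D \approx 1.7

% Test characteristic (true-score) curve of an n-item form; true-score
% equating relates two forms through the theta at which both curves are read
T(\theta) = \sum_{i=1}^{n} P_i(\theta)
```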

    Factorial invariance in student ratings on instruction

    The factorial invariance of student ratings of instruction across three curricular areas was investigated by means of maximum likelihood factor analysis. The results indicate that a one-factor model was not completely adequate from a statistical point of view. Nevertheless, a single factor was accepted as reasonable from a practical point of view. It was concluded that the single factor was invariant across the three curricular groups. The reliability of the single factor was essentially the same in the three groups and, in every case, very high. Some of the theoretical and practical implications of the study were discussed.
    Bejar, Isaac I.; Doyle, Kenneth O. (1981). Factorial invariance in student ratings on instruction. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/100400
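    A simplified illustration of the idea of checking a one-factor solution across groups: fit a one-factor model separately within each group and compare the loadings. The sketch below uses scikit-learn's FactorAnalysis on synthetic ratings with invented group labels; it is not the maximum likelihood multi-group procedure used in the study, which would normally impose formal equality constraints in dedicated structural equation modeling software.

```python
# Fit a one-factor model per group on synthetic six-item ratings and compare
# the estimated loadings informally. This only eyeballs similarity of the
# loadings; it is not a formal test of factorial invariance.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
true_loadings = np.array([0.8, 0.75, 0.7, 0.65, 0.8, 0.7])

def simulate_group(n):
    """Generate ratings driven by a single latent factor plus noise."""
    factor = rng.normal(size=(n, 1))
    noise = rng.normal(scale=0.5, size=(n, 6))
    return factor @ true_loadings[None, :] + noise

for group in ["humanities", "sciences", "professional"]:  # illustrative labels
    ratings = simulate_group(300)
    fa = FactorAnalysis(n_components=1).fit(ratings)
    loadings = fa.components_.ravel()
    # Fix the sign so loadings are comparable across groups.
    if loadings.sum() < 0:
        loadings = -loadings
    print(group, np.round(loadings, 2))
```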