794 research outputs found

    The influence of the applicants' gender on the modeling of a peer review process by using latent Markov models

    In the grant peer review process we can distinguish various evaluation stages in which assessors judge applications on a rating scale. Bornmann et al. [2008] show that latent Markov models offer a fundamentally good opportunity to model peer review processes statistically. The main objective of this short communication is to test the influence of the applicants' gender on the modeling of a peer review process by using latent Markov models. We found differences in transition probabilities from one stage to the other for applications for a doctoral fellowship submitted by male and female applicants.

    How to reduce the number of rating scale items without predictability loss?

    Rating scales are used to elicit data about qualitative entities (e.g., research collaboration). This study presents an innovative method for reducing the number of rating scale items without predictability loss. The "area under the receiver operator curve method" (AUC ROC) is used. The presented method has reduced the number of rating scale items (variables) to 28.57% (from 21 to 6), making over 70% of collected data unnecessary. Results have been verified by two methods of analysis: Graded Response Model (GRM) and Confirmatory Factor Analysis (CFA). GRM revealed that the new method differentiates observations of high and middle scores. CFA proved that the reliability of the rating scale has not deteriorated by the scale item reduction. Both statistical analyses evidenced the usefulness of the AUC ROC reduction method. Comment: 14 pages, 5 figures
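    The abstract does not spell out the reduction procedure, so the following is only a minimal sketch of one plausible reading: score each item by its individual AUC against a binary criterion and rank the items. The function names `auc` and `rank_items`, the binary criterion, and the toy data are assumptions made for illustration, not the authors' method. The last line only checks the reported arithmetic (6 of 21 items).

```python
from typing import Dict, List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the Mann-Whitney probability that a positive case
    outscores a negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_items(items: Dict[str, Sequence[float]],
               labels: Sequence[int]) -> List[Tuple[str, float]]:
    """Rank hypothetical scale items by their individual AUC, best first."""
    scored = [(name, auc(values, labels)) for name, values in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (invented): two items rated by four respondents.
toy_items = {"item_a": [1, 2, 3, 4], "item_b": [4, 3, 2, 1]}
toy_labels = [0, 0, 1, 1]
ranking = rank_items(toy_items, toy_labels)

# Arithmetic check of the reported reduction: 6 of 21 items retained.
retained_share = round(100 * 6 / 21, 2)
```

    Under this reading, items with an AUC near 0.5 carry little information about the criterion and would be the first candidates for removal.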

    Peer Review Practices

    Peer Review (PRev) is among the oldest certification practices in science and was designed to prevent poor research from taking place. There is overall agreement that PRev is the most solid method for the evaluation of scientific quality. Since PRev spans the boundaries of several societal communities (science and policy, research and practice, academia and bureaucracy, public and private), the purposes and meaning of this process may be understood differently across the communities. In Europe, internationally competitive research activities take place in large superstructures as well as in small, insufficiently funded university departments; research can be publicly or privately funded; the purpose may be applied research, often with a focus on the needs of regional industry, or purely ‘blue-sky’ research. In the current report we focused mainly on PRev of grant applications; the analysis was carried out on the basis of a review of PRev-related literature (Thomson Reuters, Union Library Catalogues, Google Scholar, and reports of selected research funding organisations).

    The academic backbone: longitudinal continuities in educational achievement from secondary school and medical school to MRCP(UK) and the specialist register in UK medical students and doctors

    Background: Selection of medical students in the UK is still largely based on prior academic achievement, although doubts have been expressed as to whether performance in earlier life is predictive of outcomes later in medical school or post-graduate education. This study analyses data from five longitudinal studies of UK medical students and doctors from the early 1970s until the early 2000s. Two of the studies used the AH5, a group test of general intelligence (that is, intellectual aptitude). Sex and ethnic differences were also analyzed in light of the changing demographics of medical students over the past decades. Methods: Data from five cohort studies were available: the Westminster Study (began clinical studies from 1975 to 1982), the 1980, 1985, and 1990 cohort studies (entered medical school in 1981, 1986, and 1991), and the University College London Medical School (UCLMS) Cohort Study (entered clinical studies in 2005 and 2006). Different studies had different outcome measures, but most had performance on basic medical sciences and clinical examinations at medical school, performance in Membership of the Royal Colleges of Physicians (MRCP(UK)) examinations, and being on the General Medical Council Specialist Register. Results: Correlation matrices and path analyses are presented. There were robust correlations across different years at medical school, and medical school performance also predicted MRCP(UK) performance and being on the GMC Specialist Register. A-levels correlated somewhat less with undergraduate and post-graduate performance, but there was restriction of range in entrants. General Certificate of Secondary Education (GCSE)/O-level results also predicted undergraduate and post-graduate outcomes, but less so than did A-level results, but there may be incremental validity for clinical and post-graduate performance. The AH5 had some significant correlations with outcome, but they were inconsistent. 
    Sex and ethnicity also had predictive effects on measures of educational attainment, undergraduate, and post-graduate performance. Women performed better in assessments but were less likely to be on the Specialist Register. Non-white participants generally underperformed in undergraduate and post-graduate assessments, but were equally likely to be on the Specialist Register. There was a suggestion of smaller ethnicity effects in earlier studies. Conclusions: The existence of the Academic Backbone concept is strongly supported, with attainment at secondary school predicting performance in undergraduate and post-graduate medical assessments, and the effects spanning many years. The Academic Backbone is conceptualized in terms of the development of more sophisticated underlying structures of knowledge ('cognitive capital' and 'medical capital'). The Academic Backbone provides strong support for using measures of educational attainment, particularly A-levels, in student selection.

    The motivation and opportunity for socially desirable responding does not alter the general factor of personality

    Socially desirable responding may affect the factor structure of personality questionnaires and may be one of the reasons for the common variance among personality traits. In this study, we test this hypothesis by investigating the influence of the motivational test-taking context (development vs. selection) and the opportunity to distort responses (forced-choice vs. Likert response format) on personality questionnaire scores. Data from real selection and assessment candidates (total N = 3,980) matched on gender, age, and educational level were used. Mean score differences were found between the selection and development groups, with smaller differences for the forced-choice (FC) version. Yet, exploratory structural equation models showed that the overall factor structures as well as the general factor were highly similar across the four groups. Thus, although socially desirable responding may affect mean scores on personality traits, it does not appear to affect factor structure.

    Evaluating and extending statistical methods for estimating the construct-level predictive validity of selection tests

    Background: In this thesis the problem of range restriction was addressed using the United Kingdom Clinical Aptitude Test (UKCAT) and Professional and Linguistic Assessments Board (PLAB) test in the selection of undergraduate medical school entrants and International Medical Graduates (IMGs) in the UK as motivating examples. Methods for correcting for bias in the estimate of predictive validity due to range restriction (particularly Multiple Imputation (MI) and Full Information Maximum Likelihood (FIML)) were evaluated for the predictive validity, single hurdle concurrent and multiple hurdle validity designs under varying degrees of strictness in selection. For MI, the impact of the composition of the imputation model was also investigated. Methods: The performance of MI and FIML was tested through Monte Carlo simulations and validated using PLAB data. Results: Generally, MI and FIML were found to be equivalent in performance and superior to other methods of correcting for range restriction bias for selection ratios of ≤ 20% only in instances where data were multivariate normal. The inclusion of highly predictive variables in the imputation model increased the precision of MI. Conclusion: MI and FIML are viable alternatives for tackling bias in the estimate of predictive validity for direct range restricted data that satisfies the assumption of multivariate normality. Caution should be taken to avoid their application in instances where the assumption of multivariate normality is violated
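    To make the range restriction problem concrete, here is a minimal simulation sketch using only the Python standard library. It draws bivariate normal data, keeps the top 20% on the selection variable, and shows how the observed correlation shrinks. As a point of comparison it applies the classical Thorndike Case II correction, which is a swapped-in technique: the abstract evaluates MI and FIML and does not name this formula. The function names, the true correlation of 0.5, the sample size, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sample_sd(x):
    """Sample standard deviation (n - 1 denominator)."""
    m = sum(x) / len(x)
    return math.sqrt(sum((a - m) ** 2 for a in x) / (len(x) - 1))


def simulate(rho=0.5, n=20000, selection_ratio=0.2, seed=0):
    """Return (full r, range-restricted r, Thorndike Case II corrected r)."""
    rng = random.Random(seed)
    # Bivariate normal: x is the selection test, y the later outcome.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rho * xi + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for xi in x]

    # Direct range restriction: only the top scorers on x are selected.
    pairs = sorted(zip(x, y), key=lambda p: p[0], reverse=True)
    selected = pairs[: int(n * selection_ratio)]
    xs = [p[0] for p in selected]
    ys = [p[1] for p in selected]

    r_full = pearson(x, y)
    r_restricted = pearson(xs, ys)

    # Thorndike Case II: u is the unrestricted-to-restricted SD ratio on x.
    u = sample_sd(x) / sample_sd(xs)
    r_corrected = (r_restricted * u) / math.sqrt(
        1 - r_restricted ** 2 + (r_restricted ** 2) * (u ** 2)
    )
    return r_full, r_restricted, r_corrected


r_full, r_restricted, r_corrected = simulate()
```

    The simulated data are multivariate normal, which is the condition under which the abstract reports the corrections performing well; the sketch says nothing about the non-normal case.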

    DATA-DRIVEN BAYESIAN METHOD-BASED TRAFFIC CRASH DRIVER INJURY SEVERITY FORMULATION, ANALYSIS, AND INFERENCE

    Traffic crashes have resulted in significant cost to society in terms of life and economic losses, and comprehensive examination of crash injury outcome patterns is of practical importance. By inferring the parameters of interest from prior information and studied datasets, Bayesian models are efficient methods in data analysis with more accurate results, but their applications in traffic safety studies are still limited. By examining driver injury severity patterns, this research systematically examines the applicability of Bayesian methods to driver injury severity prediction in traffic crashes. In this study, three types of Bayesian models are defined: hierarchical Bayesian regression model, Bayesian non-regression model and knowledge-based Bayesian non-parametric model, and a conceptual framework is developed for selecting the appropriate Bayesian model based on discrete research purposes. Five Bayesian models are applied accordingly to test their effectiveness in traffic crash driver injury severity prediction and variable impact estimation: hierarchical Bayesian binary logit model, hierarchical Bayesian ordered logit model, hierarchical Bayesian random intercept model with cross-level interactions, multinomial logit (MNL)-Bayesian Network (BN) model, and decision table/naïve Bayes (DTNB) model. A complete dataset containing all crashes occurring on New Mexico roadways in 2010 and 2011 is used for model analyses. The studied dataset is composed of three major sub-datasets: crash dataset, vehicle dataset and driver dataset, and all included variables are therefore divided into two hierarchical levels accordingly: crash-level variables and vehicle/driver variables.
    Across all five models, the results show promising performance in injury severity prediction and variable influence analysis, and they underscore the heterogeneous impacts of the significant variables on driver injury severity outcomes. The performances of these models are also compared with one another and with traditional traffic safety models. Based on the analyzed results, tentative suggestions regarding countermeasures and further research efforts to reduce crash injury severity are proposed. The research results enhance the understanding of the applicability of Bayesian methods in traffic safety analysis and of the mechanisms of crash injury severity outcomes, and provide beneficial inference to improve safety performance of the transportation system.

    A Review of the Role of Causality in Developing Trustworthy AI Systems

    State-of-the-art AI models largely lack an understanding of the cause-effect relationship that governs human understanding of the real world. Consequently, these models do not generalize to unseen data, often produce unfair results, and are difficult to interpret. This has led to efforts to improve the trustworthiness aspects of AI models. Recently, causal modeling and inference methods have emerged as powerful tools. This review aims to provide the reader with an overview of causal methods that have been developed to improve the trustworthiness of AI models. We hope that our contribution will motivate future research on causality-based solutions for trustworthy AI. Comment: 55 pages, 8 figures. Under review