
    A rejoinder to the comments of Benedetto et al. on the paper “Critical remarks on the Italian research assessment exercise VQR 2011–2014” (Journal of Informetrics, 11(2): 337–357)

    The paper “Critical remarks on the Italian research assessment exercise VQR 2011–2014” (Franceschini & Maisano, 2017) analyzed several vulnerabilities of the recently concluded Italian assessment exercise. Some senior (former and current) members of ANVUR promptly commented on our criticisms through a letter to the editor (Benedetto, Checchi, Graziosi, & Malgarini, 2017). We believe that this letter is not very convincing. In the following, we provide a rejoinder to the comments directed at our paper.

    Critical remarks on the Italian research assessment exercise VQR 2011–2014

    For nearly a decade, several national exercises have been implemented for assessing Italian research performance, from the viewpoint of universities and other research institutions. The penultimate one – i.e., the VQR 2004–2010, which adopted a hybrid evaluation approach based on bibliometric analysis and peer review – suffered heavy criticism at the national and international level. The architecture of the subsequent exercise – i.e., the VQR 2011–2014, still in progress – is partly similar to that of the previous one, except for a few presumed improvements. Nevertheless, this exercise too is suffering heavy criticism. This paper presents a structured discussion of the VQR 2011–2014, collecting and organizing the critical arguments that have emerged so far, and developing them in detail. Some of the major vulnerabilities of the VQR 2011–2014 are: (1) the fact that evaluations cover a relatively small fraction of the scientific publications produced by the researchers involved, (2) the incorrect and anachronistic use of journal metrics (e.g., the ISI Impact Factor and similar indicators) for assessing individual papers, and (3) conceptually misleading criteria for normalizing and aggregating the bibliometric indicators in use.

    On tit for tat: Franceschini and Maisano versus ANVUR regarding the Italian research assessment exercise VQR 2011-2014

    The response by Benedetto, Checchi, Graziosi & Malgarini (2017) (hereafter "BCG&M"), past and current members of the Italian Agency for Evaluation of University and Research Systems (ANVUR), to Franceschini and Maisano's ("F&M") article (2017) inevitably draws us into the debate. BCG&M in fact complain "that almost all criticisms to the evaluation procedures adopted in the two Italian research assessments VQR 2004-2010 and 2011-2014 limit themselves to criticize the procedures without proposing anything new and more apt to the scope". Since we raised most of these criticisms in the literature, we welcome this opportunity to retrace our vainly offered "constructive" recommendations, made in the hope of contributing to assessments of the Italian research system more in line with the state of the art in scientometrics. We see it as equally interesting to confront the problem of the failure of knowledge transfer from R&D (scholars) to engineering and production (ANVUR's practitioners) in the Italian VQRs. We will provide a few notes to help the reader understand the context of this failure. We hope that these, together with our more specific comments, will also assist in communicating the reasons for the level of scientometric competence expressed in BCG&M's heated response to F&M's criticism.

    On the Shapley value and its application to the Italian VQR research assessment exercise

    Research assessment exercises have now become common evaluation tools in a number of countries. These exercises have the goal of guiding merit-based public funds allocation, stimulating improvement of research productivity through competition, and assessing the impact of adopted research support policies. One case in point is Italy's most recent research assessment effort, VQR 2011–2014 (Research Quality Evaluation), which, in addition to research institutions, also evaluated university departments and, in some cases, individuals (i.e., recently hired research staff and members of PhD committees). However, the way an institution's score was divided, according to VQR rules, between its constituent departments or its staff members does not enjoy many desirable properties well known from coalitional game theory (e.g., budget balance, fairness, marginality). We propose, instead, an alternative score division rule based on the notion of the Shapley value, a well-known solution concept in coalitional game theory, which enjoys the desirable properties mentioned above. For a significant test case (namely, Sapienza University of Rome, the largest university in Italy), we present a detailed comparison of the scores obtained, for substructures and individuals, by applying the official VQR rules, with those resulting from Shapley value computations. We show that there are significant differences in the resulting scores, making room for improvements in the allocation rules used in research assessment exercises.
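
    To make the division rule concrete, here is a minimal sketch of an exact Shapley value computation in Python. The departments, the coalition scores, and the function shapley_values are hypothetical illustrations, not the authors' implementation; exact enumeration over all orderings is feasible only for a small number of players.

        from itertools import permutations
        from math import factorial

        def shapley_values(players, v):
            """Exact Shapley values: average each player's marginal
            contribution over all n! orderings (small n only)."""
            phi = {p: 0.0 for p in players}
            for order in permutations(players):
                coalition = frozenset()
                for p in order:
                    phi[p] += v(coalition | {p}) - v(coalition)
                    coalition = coalition | {p}
            n_fact = factorial(len(players))
            return {p: phi[p] / n_fact for p in players}

        # Hypothetical coalition scores for three departments; superadditive
        # because pooling departments avoids duplicate product submissions.
        scores = {
            frozenset(): 0.0,
            frozenset({"A"}): 10.0, frozenset({"B"}): 6.0, frozenset({"C"}): 4.0,
            frozenset({"A", "B"}): 18.0, frozenset({"A", "C"}): 15.0,
            frozenset({"B", "C"}): 11.0, frozenset({"A", "B", "C"}): 24.0,
        }
        print(shapley_values(["A", "B", "C"], scores.__getitem__))
        # {'A': 11.5, 'B': 7.5, 'C': 5.0} -- shares sum to v(N) = 24

    By construction the Shapley shares sum exactly to the grand coalition's score, which is the budget-balance property mentioned above.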

    Are Italian research assessment exercises size-biased?

    Research assessment exercises have enjoyed ever-increasing popularity in many countries in recent years, both as a method to guide public funds allocation and as a validation tool for adopted research support policies. Italy’s most recently completed evaluation effort (VQR 2011–14) required each university to submit to the Ministry for Education, University, and Research (MIUR) 2 research products per author (3 in the case of other research institutions), chosen in such a way that the same product is not assigned to two authors belonging to the same institution. This constraint suggests that larger institutions, where collaborations among colleagues may be more frequent, could suffer a size-related bias in their evaluation scores. To validate our claim, we investigate the outcome of artificially splitting Sapienza University of Rome, one of the largest universities in Europe, into a number of separate partitions, according to several criteria, noting significant score increases for several partitioning scenarios.
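
    The mechanism behind the suspected bias can be illustrated with a toy assignment model; this is a sketch under simplifying assumptions (one submission per author instead of the VQR's two, and entirely hypothetical product scores), not the paper's actual computation. Two co-authors in the same institution cannot both submit their shared best paper, while after an artificial split each side may.

        import numpy as np
        from scipy.optimize import linear_sum_assignment

        # Hypothetical scores: authors a and b co-authored P1, their
        # strongest product; P2 is a weaker alternative for each.
        #                   P1   P2
        scores = np.array([[1.0, 0.4],   # author a
                           [1.0, 0.3]])  # author b

        def best_total(matrix):
            # Max-weight assignment of distinct products to authors.
            rows, cols = linear_sum_assignment(matrix, maximize=True)
            return matrix[rows, cols].sum()

        joint = best_total(scores)                               # 1.4: P1 counted once
        split = best_total(scores[:1]) + best_total(scores[1:])  # 2.0: P1 counted twice
        print(joint, split)

    The split scenario scores strictly higher whenever internal co-authorship forces the duplicate-avoiding assignment to fall back on weaker products, which is precisely the size-related effect measured on Sapienza's partitions.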

    Errors and secret data in the Italian research assessment exercise. A comment to a reply

    Italy adopted a performance-based system for funding universities that is centered on the results of a national research assessment exercise carried out by a governmental agency (ANVUR). ANVUR evaluated papers by using “a dual system of evaluation”, that is, by informed peer review or by bibliometrics. With a view to validating that system, ANVUR performed an experiment to estimate the agreement between informed review and bibliometrics. Ancaiani et al. (2015) presented the main results of the experiment. Baccini and De Nicolao (2017) documented in a letter, among other critical issues, that the statistical analysis was not performed on a random sample of articles. A reply to the letter has been published in Research Evaluation (Benedetto et al. 2017). This note highlights that the reply contains (1) errors in the data, (2) problems with the “representativeness” of the sample, (3) unverifiable claims about the weights used for calculating kappas, (4) undisclosed averaging procedures, and (5) a statement about the “same protocol in all areas” contradicted by official reports. Last but not least, the data used by the authors remain undisclosed. A general warning concludes: many recently published papers use data originating from the Italian research assessment exercise. These data are not accessible to the scientific community, and consequently these papers are not reproducible. They can hardly be considered as containing sound evidence, at least until the authors or ANVUR disclose the data necessary for replication.

    Evaluating Research and Scholarly Impact in Criminology and Criminal Justice in the United Kingdom and Italy: A Comparative Perspective

    What scholarly impact is, and how it is evaluated, vary across countries. In the United Kingdom, for instance, scholarly impact is mainly assessed through the Research Excellence Framework (REF), in the context of providing, among other things, accountability for public investment in research, demonstrating the public benefits of research, and informing the selective allocation of research funding. In the REF system, impact needs to show a demonstrable effect on change, or evidence of benefits outside academia, and is formally assessed through case studies. In Italy, there is a comparable system for evaluating research, known as the Evaluation of Research Quality (VQR), but in this latter case the focus is on the quality of selected research outputs as indicators of research performance. Impact is here considered with reference to the so-called third mission (which includes activities aimed at the valorization of research and activities that have positive spillovers into society at large) and is evaluated separately. Our contribution aims at critically analyzing the commonalities and differences between these two systems when it comes to evaluating research in Criminology and Criminal Justice, considering some of the benefits and potential pitfalls of research evaluation in both countries, and discussing how these disciplines are framed and delimited differently in the two countries considered.

    Metrics and peer review agreement at the institutional level

    Over the past decades, many countries have started to fund academic institutions based on evaluations of their scientific performance. In this context, peer review is often used to assess scientific performance, and bibliometric indicators have been suggested as an alternative. A recurrent question is whether peer review and metrics tend to yield similar outcomes. In this paper, we study the agreement between bibliometric indicators and peer review at the institutional level. We also quantify the internal agreement of peer review at the institutional level. We find that the level of agreement is generally higher at the institutional level than at the publication level. Overall, the agreement between metrics and peer review is on par with the internal agreement between two reviewers for certain fields of science. This suggests that, for some fields, bibliometric indicators may possibly be considered as an alternative to peer review for national research assessment exercises.

    Do they agree? Bibliometric evaluation vs informed peer review in the Italian research assessment exercise

    During the Italian research assessment exercise, the national agency ANVUR performed an experiment to assess the agreement between grades attributed to journal articles by informed peer review (IR) and by bibliometrics. A sample of articles was evaluated by using both methods, and agreement was analyzed by weighted Cohen's kappas. ANVUR presented the results as indicating an overall 'good' or 'more than adequate' agreement. This paper re-examines the experiment's results according to the available statistical guidelines for interpreting kappa values, showing that the degree of agreement, always in the range 0.09-0.42, has to be interpreted, for all research fields, as unacceptable, poor or, in a few cases, at most fair. The only notable exception, confirmed also by a statistical meta-analysis, was a moderate agreement for economics and statistics (Area 13) and its sub-fields. We show that the experiment protocol adopted in Area 13 was substantially modified with respect to all the other research fields, to the point that the results for economics and statistics have to be considered fatally flawed. The evidence of poor agreement supports the conclusion that IR and bibliometrics do not produce similar results, and that the adoption of both methods in the Italian research assessment possibly introduced systematic and unknown biases into its final results. The conclusion reached by ANVUR must be reversed: the available evidence does not justify the joint use of IR and bibliometrics within the same research assessment exercise.
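
    For readers unfamiliar with the statistic, here is a minimal sketch, on hypothetical grades rather than ANVUR's undisclosed data, of how a weighted Cohen's kappa is computed and mapped onto standard interpretation bands such as those of Landis and Koch (1977).

        from sklearn.metrics import cohen_kappa_score

        # Hypothetical paired grades on an ordinal 0-3 scale for the same
        # articles: one from informed peer review, one from bibliometrics.
        peer   = [3, 2, 2, 1, 0, 3, 1, 2, 0, 1, 3, 2]
        metric = [2, 2, 1, 1, 1, 3, 0, 2, 1, 2, 2, 2]

        # Quadratic weights penalize large disagreements more heavily; the
        # weighting scheme changes the resulting value, which is one reason
        # undisclosed weights hinder replication.
        kappa = cohen_kappa_score(peer, metric, weights="quadratic")

        # Landis and Koch (1977) interpretation bands.
        label = "less than chance"
        for lo, name in [(0.0, "slight"), (0.21, "fair"), (0.41, "moderate"),
                         (0.61, "substantial"), (0.81, "almost perfect")]:
            if kappa >= lo:
                label = name
        print(f"weighted kappa = {kappa:.2f} ({label})")

    Under such bands, the reported 0.09-0.42 range falls almost entirely below the 'moderate' threshold of 0.41, which is the basis for reading the agreement as poor.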