113 research outputs found
Reply to the comment of Bertocchi et al.
The aim of this note is to reply to Bertocchi et al.'s comment on our paper "Do they agree? Bibliometric evaluation versus informed peer review in the Italian research assessment exercise". Our paper analyzed results of the experiment conducted by the Italian governmental agency ANVUR during the research assessment exercise on the agreement between informed peer review (IR) and bibliometrics. We argued that, according to available statistical guidelines, the results of the experiment indicate poor agreement in all research fields with only one exception: the results reached in the so-called Area 13 (economics and statistics). We argued that this difference was due to the changes introduced in Area 13 with respect to the protocol adopted in all the other areas. Bertocchi et al.'s comment dismisses our explanation and suggests that the difference was due to "differences in the evaluation processes between Area 13 and other areas". In addition, they state that all five of our claims about the Area 13 experiment protocol "are either incorrect or not based on any evidence". Based on textual evidence drawn from ANVUR official reports, we show that: (1) none of the four differences listed by Bertocchi et al. is peculiar to Area 13; (2) their five arguments contesting our claims about the experiment protocol are all contradicted by official records of the experiment itself
Errors and secret data in the Italian research assessment exercise. A comment to a reply
Italy adopted a performance-based system for funding universities that is centered on the results of a national research assessment exercise, carried out by a governmental agency (ANVUR). ANVUR evaluated papers by using "a dual system of evaluation", that is, by informed peer review or by bibliometrics. To validate that system, ANVUR performed an experiment for estimating the agreement between informed review and bibliometrics. Ancaiani et al. (2015) presents the main results of the experiment. Alberto Baccini and De Nicolao (2017) documented in a letter, among other critical issues, that the statistical analysis was not performed on a random sample of articles. A reply to the letter has been published by Research Evaluation (Benedetto et al. 2017). This note highlights that the reply contains (1) errors in data; (2) problems with the "representativeness" of the sample; (3) unverifiable claims about the weights used for calculating kappas; (4) undisclosed averaging procedures; (5) a statement about the "same protocol in all areas" contradicted by official reports. Last but not least, the data used by the authors continue to be undisclosed. A general warning concludes: many recently published papers use data originating from the Italian research assessment exercise. These data are not accessible to the scientific community, and consequently these papers are not reproducible. They can hardly be considered as containing sound evidence, at least until the authors or ANVUR disclose the data necessary for replication
A letter on Ancaiani et al. "Evaluating scientific research in Italy: the 2004-10 research evaluation exercise"
This letter documents some problems in Ancaiani et al. (2015). Namely, the evaluation of concordance, based on Cohen's kappa, reported by Ancaiani et al. was not computed on the whole random sample of 9,199 articles, but on a subset of 7,597 articles. The kappas relative to the whole random sample were in the range 0.07–0.15, indicating an unacceptable agreement between peer review and bibliometrics. The subset was obtained by non-random exclusion of all articles for which bibliometrics produced an uncertain classification; these raw data were not disclosed, so the concordance analysis is not reproducible. The VQR-weighted kappa for Area 13 reported by Ancaiani et al. is higher than that reported by the Area 13 panel and confirmed by Bertocchi et al. (2015), a difference explained by the use, under the same name, of two different sets of weights. Two values of kappa reported by Ancaiani et al. differ from the corresponding ones published in the official report. The results reported by Ancaiani et al. do not support a good concordance between peer review and bibliometrics. As a consequence, the use of both techniques introduced systematic distortions in the final results of the Italian research assessment exercise. The conclusion that the two techniques can be used interchangeably in a research assessment exercise appears to be unsound, being based on a misinterpretation of the statistical significance of kappa values
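The statistic at the center of this dispute, Cohen's kappa, measures inter-rater agreement corrected for chance. As an illustrative sketch (not the ANVUR computation, whose weights and data remain undisclosed), the following pure-Python function computes kappa for two raters, with an optional linear disagreement weighting of the kind used for ordinal merit classes:

```python
def cohen_kappa(r1, r2, weighted=False):
    """Cohen's kappa for two raters over the same items.
    With weighted=True, uses linear distance-based disagreement weights."""
    assert len(r1) == len(r2) and len(r1) > 0
    cats = sorted(set(r1) | set(r2))
    idx = {c: i for i, c in enumerate(cats)}
    n, k = len(r1), len(cats)
    # observed joint distribution of ratings
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[idx[a]][idx[b]] += 1.0 / n
    pa = [sum(row) for row in obs]                              # rater-1 marginals
    pb = [sum(obs[i][j] for i in range(k)) for j in range(k)]   # rater-2 marginals
    # disagreement weights: 0 on the diagonal, distance-based if weighted
    w = [[(abs(i - j) / (k - 1) if weighted else float(i != j))
          for j in range(k)] for i in range(k)]
    d_obs = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(w[i][j] * pa[i] * pb[j] for i in range(k) for j in range(k))
    return 1.0 - d_obs / d_exp
```

For example, `cohen_kappa([1, 1, 2, 2, 3], [1, 2, 2, 2, 3])` gives 0.6875; values in the 0.07–0.15 range cited above fall well below the conventional thresholds for acceptable agreement.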
Are Italian research assessment exercises size-biased?
Research assessment exercises have enjoyed ever-increasing popularity in many countries in recent years, both as a method to guide public funds allocation and as a validation tool for adopted research support policies. Italy's most recently completed evaluation effort (VQR 2011–14) required each university to submit to the Ministry for Education, University, and Research (MIUR) 2 research products per author (3 in the case of other research institutions), chosen in such a way that the same product is not assigned to two authors belonging to the same institution. This constraint suggests that larger institutions, where collaborations among colleagues may be more frequent, could suffer a size-related bias in their evaluation scores. To validate our claim, we investigate the outcome of artificially splitting Sapienza University of Rome, one of the largest universities in Europe, into a number of separate partitions, according to several criteria, noting significant score increases for several partitioning scenarios
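The no-duplication constraint described above is an assignment problem, and the size bias can be seen in a toy example (a sketch, not the paper's actual optimization; for brevity each author submits one product instead of two). When two co-authors of a high-scoring product belong to the same institution, the product counts once; split them into two institutions and it counts twice:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def best_score(scores):
    """Max total score when each author (row) submits one product (column)
    and no product is used twice within the same institution."""
    rows, cols = linear_sum_assignment(scores, maximize=True)
    return scores[rows, cols].sum()

# Toy data: 2 authors, 3 products. Both co-authored product 0 (score 10);
# their next-best alternatives score only 3.
S = np.array([[10.0, 3.0, 0.0],
              [10.0, 0.0, 3.0]])

together = best_score(S)                        # one institution: 10 + 3 = 13
split = best_score(S[:1]) + best_score(S[1:])   # two institutions: 10 + 10 = 20
```

Here `split > together`, mirroring the score increases the paper reports when Sapienza is artificially partitioned.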
The Holy Grail and the Bad Sampling: a test for the homogeneity of missing proportions for evaluating the agreement between peer review and bibliometrics in the Italian research assessment exercises
Two experiments for evaluating the agreement between bibliometrics and informed peer review, based on two large samples of journal articles, were performed by the Italian governmental agency for research evaluation. They were presented as successful and as warranting the combined use of peer review and bibliometrics in research assessment exercises. However, the results of both experiments were supposed to be based on a stratified random sampling of articles with proportional allocation, even though only subsets of the original samples in the strata were selected owing to the presence of missing articles. This kind of selection has the potential to introduce biases into the results of the experiments, since different proportions of articles could be missed in different strata. In order to assess the 'representativeness' of the sampling, we develop a novel statistical test for the homogeneity of missing proportions between strata and apply it to the data of both experiments. The outcome of the testing procedure shows that the null hypothesis of missing proportion homogeneity should be rejected for both experiments. As a consequence, the obtained samples cannot be considered 'representative' of the population of articles submitted to the research assessments. It is therefore impossible to exclude that the combined use of peer review and bibliometrics might have introduced uncontrollable major biases into the final results of the Italian research assessment exercises. Moreover, the two experiments should not be considered valid pieces of knowledge to be used in the ongoing search for the Holy Grail of a definite agreement between peer review and bibliometrics
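The paper's test is novel, but the underlying question — are missing proportions the same in every stratum? — can be sketched with a standard chi-square test of homogeneity on a strata × (missing, present) contingency table. The figures below are invented for illustration, not the experiments' actual counts:

```python
import numpy as np
from scipy.stats import chi2_contingency

def missing_homogeneity_test(missing, totals):
    """Chi-square test of H0: the proportion of missing articles
    is the same in every stratum."""
    missing = np.asarray(missing)
    totals = np.asarray(totals)
    table = np.column_stack([missing, totals - missing])  # strata x (missing, present)
    stat, p, dof, _ = chi2_contingency(table)
    return stat, p, dof

# hypothetical strata with 5%, 15% and 3% missing articles
stat, p, dof = missing_homogeneity_test([50, 150, 30], [1000, 1000, 1000])
```

With proportions as unequal as these, the p-value is essentially zero and H0 is rejected, which is the situation the abstract describes: non-random missingness undermining the claimed 'representativeness' of the samples.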
Taylor Rules and Interest Rate Smoothing in the US and EMU
In this paper we estimate simple Taylor rules, paying particular attention to interest rate smoothing. Following English, Nelson, and Sack (2002), we employ a model in first differences to gain some insights on the presence and significance of the degree of partial adjustment. Moreover, we estimate a nested model to take both interest rate smoothing and serially correlated deviations from various Taylor rate prescriptions into account. Our findings suggest that the lagged interest rate enters the Taylor rule in its own right, and may very well coexist with a serially correlated policy shock. Asymmetric preferences on the output gap level and financial indicators turn out to be important factors to understand Greenspan's policy conduct. By contrast, our findings support standard regressors for the "European" Taylor rule. Keywords: Taylor rules; omitted variables; serial correlation; interest rate smoothing
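A smoothed Taylor rule of the kind estimated here takes the form i_t = ρ·i_{t−1} + (1−ρ)·(c + a·π_t + b·y_t) + ε_t, where ρ is the partial-adjustment (smoothing) coefficient. The following sketch, on synthetic data with made-up parameter values (not the paper's first-difference or nested specifications), shows how OLS on the levels equation recovers ρ and the long-run inflation response:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000
pi = rng.normal(2.0, 1.0, T)    # synthetic inflation
gap = rng.normal(0.0, 1.0, T)   # synthetic output gap

# generate rates from i_t = rho*i_{t-1} + (1-rho)*(c + a*pi_t + b*gap_t) + eps_t
rho, c, a, b = 0.8, 1.0, 1.5, 0.5
i = np.zeros(T)
for t in range(1, T):
    i[t] = rho * i[t-1] + (1 - rho) * (c + a * pi[t] + b * gap[t]) \
           + rng.normal(0.0, 0.1)

# OLS of i_t on [1, i_{t-1}, pi_t, gap_t]
X = np.column_stack([np.ones(T - 1), i[:-1], pi[1:], gap[1:]])
beta, *_ = np.linalg.lstsq(X, i[1:], rcond=None)
rho_hat = beta[1]                 # smoothing coefficient
a_hat = beta[2] / (1 - rho_hat)   # long-run inflation response
```

The division by (1 − ρ̂) illustrates why a large estimated ρ matters: the short-run coefficient on inflation understates the long-run policy response by that factor.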
ANVUR: i dati chiusi della bibliometria di stato
Italy is probably the country in the Western world where the obsession with labels of excellence is most profoundly changing the behavior of researchers and institutions. The massive research evaluation exercise known as VQR inaugurated a phase of growing centralized control, realized through apparently technical devices. The attempt to provide a scientific justification for the evaluation methods adopted has generated an unprecedented conflict between the political, scientific, and ethical dimensions of research. This contribution focuses on the experiment conducted and analyzed by the Italian agency for research evaluation (ANVUR) to validate the methodology adopted for the evaluation. It describes in detail the agency's dissemination strategy, with the publication of excerpts from the official reports in working papers of various institutions, in academic journals, and on blogs run by think tanks. It also illustrates an unprecedented conflict of interest: the methodology and results of the national research evaluation were justified a posteriori in documents written by the same scholars who developed and applied the methodology officially adopted by the Italian government. Moreover, the results published in these articles are not replicable, since the data have never been made available to scholars other than those collaborating with ANVUR
- …