Accelerating Scientific Publication in Biology
Scientific publications enable results and ideas to be transmitted throughout
the scientific community. The number and type of journal publications also have
become the primary criteria used in evaluating career advancement. Our analysis
suggests that publication practices have changed considerably in the life
sciences over the past thirty years. More experimental data is now required for
publication, and the average time required for graduate students to publish
their first paper has increased and is approaching the desirable duration of
Ph.D. training. Since publication is generally a requirement for career
progression, schemes to reduce the time of graduate student and postdoctoral
training may be difficult to implement without also considering new mechanisms
for accelerating communication of their work. The increasing time to
publication also delays potential catalytic effects that ensue when many
scientists have access to new information. The time has come for life
scientists, funding agencies, and publishers to discuss how to communicate new
findings in a way that best serves the interests of the public and the
scientific community.
Comment: 39 pages, 6 figures, 1 table, and a Q&A related to the preprint
Double Trouble? The Communication Dimension of the Reproducibility Crisis in Experimental Psychology and Neuroscience
Most discussions of the reproducibility crisis focus on its epistemic aspect: the fact that the scientific community fails to follow some norms of scientific investigation, which leads to high rates of irreproducibility via a high rate of false positive findings. The purpose of this paper is to argue that there is a heretofore underappreciated and understudied dimension to the reproducibility crisis in experimental psychology and neuroscience that may prove to be at least as important as the epistemic dimension. This is the communication dimension. The link between communication and reproducibility is immediate: independent investigators would not be able to recreate an experiment whose design or implementation were inadequately described. I exploit evidence of a replicability and reproducibility crisis in computational science, as well as research into quality of reporting, to support the claim that a widespread failure to adhere to reporting standards, especially the norm of descriptive completeness, is an important contributing factor in the current reproducibility crisis in experimental psychology and neuroscience.
Counting What Is Measured or Measuring What Counts? League Tables and Their Impact On Higher Education Institutions in England
This report investigates league tables and their impact on higher education institutions (HEIs) in England. It presents findings from two strands of research:
– an analysis of five league tables selected for the study, their methodologies and the underlying data employed, and
– an investigation of how higher education institutions respond to league tables generally and the extent to which they influence institutional decision-making and actions.
The purpose of the research is to stimulate informed debate about the approaches and limitations of the various league tables, and greater understanding among the users and stakeholders of the implications of making decisions based on these sources of information.
Leakage and the Reproducibility Crisis in ML-based Science
The use of machine learning (ML) methods for prediction and forecasting has
become widespread across the quantitative sciences. However, there are many
known methodological pitfalls, including data leakage, in ML-based science. In
this paper, we systematically investigate reproducibility issues in ML-based
science. We show that data leakage is indeed a widespread problem and has led
to severe reproducibility failures. Specifically, through a survey of
literature in research communities that adopted ML methods, we find 17 fields
where errors have been found, collectively affecting 329 papers and in some
cases leading to wildly overoptimistic conclusions. Based on our survey, we
present a fine-grained taxonomy of 8 types of leakage that range from textbook
errors to open research problems.
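As a rough illustration of the kind of "textbook error" such a taxonomy covers, the sketch below (written against scikit-learn, not taken from the paper) contrasts a leaky workflow, where a scaler is fit on the full dataset before the train/test split, with a leak-free pipeline in which every data-dependent step is fit on the training data only.

```python
# Minimal sketch of one textbook form of leakage: preprocessing fit on the full
# dataset before splitting, so test-set statistics leak into training.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Leaky: the scaler sees the test rows before the split.
X_scaled = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X_scaled, y, random_state=0)
leaky = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Leak-free: split first, then fit all preprocessing inside a pipeline
# on the training portion only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clean = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)

print("leaky test accuracy:", leaky.score(X_te, y_te))
print("leak-free test accuracy:", clean.score(X_te, y_te))
```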
We argue for fundamental methodological changes to ML-based science so that
cases of leakage can be caught before publication. To that end, we propose
model info sheets for reporting scientific claims based on ML models that would
address all types of leakage identified in our survey. To investigate the
impact of reproducibility errors and the efficacy of model info sheets, we
undertake a reproducibility study in a field where complex ML models are
believed to vastly outperform older statistical models such as Logistic
Regression (LR): civil war prediction. We find that all papers claiming the
superior performance of complex ML models compared to LR models fail to
reproduce due to data leakage, and complex ML models don't perform
substantively better than decades-old LR models. While none of these errors
could have been caught by reading the papers, model info sheets would enable
the detection of leakage in each case.
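Purely as a hypothetical sketch of the kind of leak-free re-evaluation the civil war prediction case study calls for (not the authors' replication code), the snippet below compares a complex model against plain logistic regression while fitting imputation and scaling inside the cross-validation pipeline, so no statistics are learned from the held-out folds. The data here are synthetic placeholders, not the replication datasets.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for imbalanced country-year data with missing values.
rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=30, weights=[0.95], random_state=0)
X[rng.random(X.shape) < 0.1] = np.nan  # inject missingness

models = {
    "logistic_regression": LogisticRegression(max_iter=2000),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
}

for name, model in models.items():
    # Imputation and scaling are fit only on the training folds of each split:
    # the leak-free counterpart of imputing the full dataset up front.
    pipe = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler(), model)
    scores = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC = {scores.mean():.3f} ± {scores.std():.3f}")
```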
Series distance - an intuitive metric to quantify hydrograph similarity in terms of occurrence, amplitude and timing of hydrological events
Applying metrics to quantify the similarity or dissimilarity of hydrographs is a central task in hydrological modelling, used both in model calibration and the evaluation of simulations or forecasts. Motivated by the shortcomings of standard objective metrics such as the Root Mean Square Error (RMSE) or the Mean Absolute Peak Time Error (MAPTE), and by the advantages of visual inspection as a powerful tool for simultaneous, case-specific and multi-criteria (yet subjective) evaluation, we propose a new objective metric termed Series Distance, which is in close accordance with visual evaluation. The Series Distance quantifies the similarity of two hydrographs neither in a time-aggregated nor in a point-by-point manner, but on the scale of hydrological events. It consists of three parts: a Threat Score, which evaluates overall agreement of event occurrence, and the overall distances of matching observed and simulated events with respect to amplitude and timing. The novelty of the latter two is the way in which matching point pairs on the observed and simulated hydrographs are identified: not by equality in time (as is the case with the RMSE), but by the same relative position in matching segments (rise or recession) of the event, indicating the same underlying hydrological process. Thus, amplitude and timing errors are calculated simultaneously but separately, from point pairs that also match visually, considering complete events rather than only individual points (as is the case with MAPTE). Relative weights can freely be assigned to each component of the Series Distance, which allows (subjective) customization of the metric to various fields of application, but in a traceable way. Each of the three components of the Series Distance can be used in an aggregated or non-aggregated way, which makes the Series Distance a suitable tool for differentiated, process-based model diagnostics. After discussing the applicability of established time series metrics for hydrographs, we present the Series Distance theory, discuss its properties and compare them to those of standard metrics used in hydrology, using both simple, artificial hydrographs and an ensemble of realistic forecasts as examples. The results suggest that the Series Distance quantifies the degree of similarity of two hydrographs in a way comparable to visual inspection, but in an objective, reproducible way.
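As a minimal sketch of the core idea (my reading of the abstract, not the authors' reference implementation), the snippet below computes the amplitude and timing components for a single pair of already-matched, single-peaked events: each event is split into a rise and a recession at its peak, each segment is sampled at the same relative positions, and errors are taken between points sharing a relative position rather than between points at equal times. The event matching and the Threat Score for event occurrence are omitted here.

```python
import numpy as np

def segment_points(t, q, n=50):
    """Sample a single-peaked event at n relative positions on its rise and recession."""
    peak = int(np.argmax(q))
    pts_t, pts_q = [], []
    for seg in (slice(0, peak + 1), slice(peak, len(q))):
        seg_t, seg_q = t[seg], q[seg]
        rel = np.linspace(0.0, 1.0, n)             # relative position within the segment
        pos = seg_t[0] + rel * (seg_t[-1] - seg_t[0])
        pts_t.append(pos)
        pts_q.append(np.interp(pos, seg_t, seg_q))
    return np.concatenate(pts_t), np.concatenate(pts_q)

def series_distance(t_obs, q_obs, t_sim, q_sim, n=50):
    """Mean amplitude and timing error between points of equal relative position."""
    to, qo = segment_points(np.asarray(t_obs, float), np.asarray(q_obs, float), n)
    ts, qs = segment_points(np.asarray(t_sim, float), np.asarray(q_sim, float), n)
    amp_err = np.mean(np.abs(qo - qs))    # amplitude component
    time_err = np.mean(np.abs(to - ts))   # timing component
    return amp_err, time_err

# Toy example: the simulated event peaks 2 h late and about 1 m^3/s too low.
t = np.arange(0, 24, 1.0)
obs = 10.0 * np.exp(-0.5 * ((t - 10) / 3.0) ** 2)
sim = 9.0 * np.exp(-0.5 * ((t - 12) / 3.0) ** 2)
print(series_distance(t, obs, t, sim))
```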