
    Guidelines for developing and reporting machine learning predictive models in biomedical research : a multidisciplinary view

    BACKGROUND: As more researchers turn to big data for new opportunities in biomedical discovery, machine learning models, the backbone of big data analysis, are mentioned more often in biomedical journals. However, owing to the inherent complexity of machine learning methods, they are prone to misuse. Because of the flexibility in specifying machine learning models, the results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs. OBJECTIVE: To develop a set of guidelines on the use of machine learning predictive models within clinical settings, ensuring that the models are correctly applied and sufficiently reported so that true discoveries can be distinguished from random coincidence. METHODS: A multidisciplinary panel of machine learning experts, clinicians, and traditional statisticians was interviewed, using an iterative process in accordance with the Delphi method. RESULTS: The process produced a set of guidelines that consists of (1) a list of reporting items to be included in a research article and (2) a set of practical sequential steps for developing predictive models. CONCLUSIONS: A set of guidelines was generated to enable correct application of machine learning models and consistent reporting of model specifications and results in biomedical research. We believe that such guidelines will accelerate the adoption of big data analysis, particularly with machine learning methods, in the biomedical research community.

    Organic carbon transfers in the subtropical Red River system (Viet Nam): insights on CO2 sources and sinks

    The Red River, draining a 169,000 km² watershed, is the second largest river in Viet Nam and constitutes the main source of water for a large percentage of the population of North Viet Nam. Here we present the results of an investigation into the spatial distribution and temporal dynamics of particulate and dissolved organic carbon (POC and DOC, respectively) in the Red River Basin. POC concentrations ranged from 0.24 to 5.80 mg C L⁻¹ and DOC concentrations ranged from 0.26 to 5.39 mg C L⁻¹. The application of the Seneque/Riverstrahler model to monthly POC and DOC measurements showed that, in general, the model simulations of the temporal variations and spatial distribution of organic carbon (OC) concentration followed the observed trends. They also show the impact of high population densities (up to 994 inhab km⁻² in the delta area) on OC inputs in surface runoff from the different land use classes and from urban point sources. A budget of the main fluxes of OC in the whole river network, including diffuse inputs from soil leaching and runoff and point sources from urban centers, as well as algal net primary production and heterotrophic respiration, was established using the model results. It shows the predominantly heterotrophic character of the river system and provides an estimate of CO2 emissions from the river of 330 Gg C year⁻¹. This value is in reasonable agreement with the few available direct measurements of CO2 fluxes in the downstream part of the river network.