37 research outputs found

    Using R for analysing spatio-temporal datasets: a satellite-based precipitation case study

    Increasing computer power and the availability of remote-sensing data measuring different environmental variables have led to unprecedented opportunities for Earth sciences in recent decades. However, dealing with hundreds or thousands of files, usually in different vector and raster formats and measured with different temporal frequencies, poses considerable computational challenges to taking full advantage of all the available data. R is a language and environment for statistical computing and graphics which includes several functions for data manipulation, calculation and graphical display, which are particularly well suited for Earth sciences. In this work I describe how R was used to exhaustively evaluate seven state-of-the-art satellite-based rainfall estimate (SRE) products (TMPA 3B42v7, CHIRPSv2, CMORPH, PERSIANN-CDR, PERSIANN-CCS-adj, MSWEPv1.1 and PGFv3) over the complex topography and diverse climatic gradients of Chile. First, built-in functions were used to automatically download the satellite images in different raster formats and spatial resolutions and to clip them to the Chilean spatial extent if necessary. Second, the raster package was used to read, plot, and conduct an exploratory data analysis on selected files of each SRE product, in order to detect unexpected problems (rotated spatial domains, order of variables in NetCDF files, etc.). Third, raster was used along with the hydroTSM package to aggregate SRE files into different temporal scales (daily, monthly, seasonal, annual). Finally, the hydroTSM and hydroGOF packages were used to carry out a point-to-pixel comparison between precipitation time series measured at 366 stations and the corresponding grid cell of each SRE.
The modified Kling-Gupta index of model performance was used to identify possible sources of systematic errors in each SRE, while five categorical indices (PC, POD, FAR, ETS, fBIAS) were used to assess the ability of each SRE to correctly identify different precipitation intensities. In the end, R proved to be an efficient environment to deal with thousands of raster, vector and time series files, with different spatial and temporal resolutions and spatial reference systems. In addition, the use of well-documented R scripts made code readable and re-usable, facilitating reproducible research, which is essential to build trust among stakeholders and the scientific community.
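The abstract above names its scores without giving formulas. As a rough illustration only, the sketch below uses the commonly cited definitions of the modified Kling-Gupta efficiency (correlation r, bias ratio beta, variability ratio gamma) and of the contingency-table indices PC, POD, FAR, ETS and fBIAS. It is written in Python rather than R, and the function names, the 1 mm threshold and the sample values are hypothetical, not taken from the study.

```python
import math


def modified_kge(sim, obs):
    """Modified Kling-Gupta efficiency (assumed form):
    KGE' = 1 - sqrt((r - 1)^2 + (beta - 1)^2 + (gamma - 1)^2)."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / (n - 1))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / (n - 1))
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / (n - 1)
    r = cov / (sd_s * sd_o)                    # linear correlation
    beta = mean_s / mean_o                     # bias ratio
    gamma = (sd_s / mean_s) / (sd_o / mean_o)  # ratio of coefficients of variation
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


def categorical_scores(sim, obs, threshold):
    """Contingency-table scores for the event 'precipitation >= threshold'.
    No guard against empty categories (division by zero) in this sketch."""
    hits = misses = false_alarms = correct_negatives = 0
    for s, o in zip(sim, obs):
        sim_event, obs_event = s >= threshold, o >= threshold
        if sim_event and obs_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif sim_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    n = hits + misses + false_alarms + correct_negatives
    # Hits expected by chance, used by the equitable threat score.
    random_hits = (hits + misses) * (hits + false_alarms) / n
    return {
        "PC": (hits + correct_negatives) / n,
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "ETS": (hits - random_hits) / (hits + misses + false_alarms - random_hits),
        "fBIAS": (hits + false_alarms) / (hits + misses),
    }


# Hypothetical daily values (mm): station observations and the matching grid cell.
obs = [0.0, 5.0, 0.0, 12.0, 3.0, 0.0]
sim = [0.0, 4.0, 2.0, 0.0, 6.0, 0.0]
print(modified_kge(sim, obs))
print(categorical_scores(sim, obs, threshold=1.0))
```

Repeating the categorical scores for several thresholds is one way to look at different precipitation intensities separately.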

    The br2 – weighting Method for Estimating the Effects of Air Pollution on Population Health

    Uncertainties, limitations and biases may impede the correct application of linear concentration-response functions to estimate the effects of air pollution exposure on population health. The reliability of a prediction depends largely on the strength of the linear correlation between the studied variables. This work proposes the joint use of the coefficient of determination, r2, with the regression slope, b, as an improved measure of the strength of the linear relation between air pollution and its effects on population health. The proposed br2‑weighting method offers more reliable inferences about the potential effects of air pollution on population health, and can be applied universally to other fields of research.

    The CAMELS-CL dataset: catchment attributes and meteorology for large sample studies – Chile dataset

    We introduce the first catchment dataset for large sample studies in Chile. This dataset includes 516 catchments; it covers particularly wide latitude (17.8 to 55.0° S) and elevation (0 to 6993 m a.s.l.) ranges, and it relies on multiple data sources (including ground data, remote-sensed products and reanalyses) to characterise the hydroclimatic conditions and landscape of a region where in situ measurements are scarce. For each catchment, the dataset provides boundaries, daily streamflow records and basin-averaged daily time series of precipitation (from one national and three global datasets), maximum, minimum and mean temperatures, potential evapotranspiration (PET; from two datasets), and snow water equivalent. We calculated hydro-climatological indices using these time series, and leveraged diverse data sources to extract topographic, geological and land cover features. Relying on publicly available reservoirs and water rights data for the country, we estimated the degree of anthropic intervention within the catchments. To facilitate the use of this dataset and promote common standards in large sample studies, we computed most catchment attributes introduced by Addor et al. (2017) in their Catchment Attributes and MEteorology for Large-sample Studies (CAMELS) dataset, and added several others. We used the dataset presented here (named CAMELS-CL) to characterise regional variations in hydroclimatic conditions over Chile and to explore how basin behaviour is influenced by catchment attributes and water extractions. Further, CAMELS-CL enabled us to analyse biases and uncertainties in basin-wide precipitation and PET. The characterisation of catchment water balances revealed large discrepancies between precipitation products in arid regions and a systematic precipitation underestimation in headwater mountain catchments (high elevations and steep slopes) over humid regions.
We evaluated PET products based on ground data and found a fairly good performance of both products in humid regions (r>0.91) and lower correlation (r<0.76) in hyper-arid regions. Further, the satellite-based PET showed a consistent overestimation of observation-based PET. Finally, we explored local anomalies in catchment response by analysing the relationship between hydrological signatures and an attribute characterising the level of anthropic interventions. We showed that larger anthropic interventions are correlated with lower than normal annual flows, runoff ratios, elasticity of runoff with respect to precipitation, and flashiness of runoff, especially in arid catchments. CAMELS-CL provides unprecedented information on catchments in a region largely underrepresented in large sample studies. This effort is part of an international initiative to create multi-national large sample datasets freely available for the community. CAMELS-CL can be visualised from http://camels.cr2.cl and downloaded from https://doi.pangaea.de/10.1594/PANGAEA.894885

    Panta Rhei benchmark dataset: socio-hydrological data of paired events of floods and droughts

    As the adverse impacts of hydrological extremes increase in many regions of the world, a better understanding of the drivers of changes in risk and impacts is essential for effective flood and drought risk management and climate adaptation. However, there is currently a lack of comprehensive, empirical data about the processes, interactions and feedbacks in complex human-water systems leading to flood and drought impacts. Here we present a benchmark dataset containing socio-hydrological data of paired events, i.e., two floods or two droughts that occurred in the same area. The 45 paired events occurred in 42 different study areas and cover a wide range of socio-economic and hydro-climatic conditions. The dataset is unique in covering both floods and droughts, in the number of cases assessed, and in the quantity of socio-hydrological data. The benchmark dataset comprises: 1) detailed review style reports about the events and key processes between the two events of a pair; 2) the key data table containing variables that assess the indicators which characterise management shortcomings, hazard, exposure, vulnerability and impacts of all events; 3) a table of the indicators-of-change that indicate the differences between the first and second event of a pair. The advantages of the dataset are that it enables comparative analyses across all the paired events based on the indicators-of-change and allows for detailed context- and location-specific assessments based on the extensive data and reports of the individual study areas. The dataset can be used by the scientific community for exploratory data analyses e.g. focused on causal links between risk management, changes in hazard, exposure and vulnerability and flood or drought impacts. The data can also be used for the development, calibration and validation of socio-hydrological models. The dataset is available to the public through the GFZ Data Services (Kreibich et al. 
2023, link for review: https://dataservices.gfz-potsdam.de/panmetaworks/review/923c14519deb04f83815ce108b48dd2581d57b90ce069bec9c948361028b8c85/).

    On the effects of hydrological uncertainty in assessing the impacts of climate change on water resources

    This dissertation focuses on the assessment of projected changes in water resources by the end of this century (2071-2100), considering an ensemble of high resolution future climate scenarios, the effects of hydrological parameterisation, and the bias of the hydrological model in representing different streamflow magnitudes. Quantification of the impacts of climate change on water resources will depend on the emission scenario, climate model, downscaling technique and impact model used to drive the impact study. In particular, hydrological impact studies involve important decisions (e.g., model structure, parameterisation, input data) whose effects are reflected in the final impacts. As a result, quantification of impacts of climate change has to be seen as a "cascade of uncertainty", in which decisions taken in every step of the assessment process convey uncertainties that are unavoidably propagated to subsequent levels. On the other hand, uncertainties in projections of climate models and those involved in the quantification of their hydrological response limit the understanding of those future impacts and hamper the assessment of mitigation policies. The Soil and Water Assessment Tool (SWAT) hydrological model was set up for daily simulations of the western part of the Ebro River basin (~42000 km²) in Spain, during the control period 01/Jan/1961 to 31/Dec/1990, and two subcatchments were selected for testing the methodology proposed in this dissertation. A sensitivity analysis with Latin Hypercube One-factor-At-a-Time (LH-OAT) was carried out in order to identify parameters with a high effect on simulated streamflows.
Then, an uncertainty analysis was carried out using the Generalized Likelihood Uncertainty Estimation (GLUE) methodology, in order to select parameter sets that can be considered as acceptable simulators of the system, adopting a re-scaled Nash-Sutcliffe efficiency as a "less formal" likelihood, and a cut-off threshold equal to zero to discriminate between behavioural and non-behavioural simulators. Afterwards, a Latin Hypercube (LH) sampling strategy was implemented within GLUE, in order to reduce the number of model runs required to obtain a good exploration of the parameter space. The 95% of the cumulative distribution of each predicted output, weighted by the re-scaled likelihood of each behavioural parameter set, was used to compute the predictive uncertainty bounds, both during the control and future scenarios. Bias-corrected daily time series of precipitation and air temperature, for the future period 2071-2100, were derived from an ensemble of six high-resolution climate change scenarios, selected from the EU FP5 PRUDENCE project. Long-term averages of precipitation and air temperature fields were computed for the control period, and projected anomalies for the future scenarios were computed as well, on an annual, seasonal and monthly basis, including expected changes for different elevation bands within the basin. The same bias-corrected time series were then used to drive daily hydrological simulations during the future period for the two selected catchments. For each climate scenario, a number of simulations equal to the number of behavioural parameter sets obtained during the uncertainty analysis was carried out. Resulting streamflows were used to compute daily flow duration curves (FDCs) to provide a qualitative assessment of the relative importance of uncertainties coming from the choice of the driving RCM and from hydrological parameterisation.
In addition, streamflows derived from running each climate scenario with its corresponding behavioural parameter sets were used to compute empirical cumulative density functions (ECDFs) of three selected percentiles, representing different flow magnitudes, in order to provide a quantitative assessment of the projected changes in streamflows. We observed that the hydrological parametric uncertainty was larger than the uncertainty coming from the driving RCM, during the complete future period and each one of the four seasons, for the two selected catchments. However, this result cannot be generalised, because it is conditional on decisions taken during the uncertainty analysis and on the ensemble of RCMs considered. Empirical CDFs computed for projected values of low (Q5), medium (Q50) and high (Q95) flows show that there is a general projected decrease in all the streamflow magnitudes, but bias in the representation of the streamflows during the control period 1961-1990 hampers the assessment of reliable quantitative projections for low and medium flows, whereas projected decreases for high flows range from 0 to 60%, depending on the catchment and the climate scenario considered.
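The GLUE step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: it assumes that "re-scaled" means dividing each behavioural Nash-Sutcliffe efficiency by their sum, and that the 95% predictive bounds are the likelihood-weighted 2.5% and 97.5% quantiles at each time step. The function names and the toy series are hypothetical.

```python
def nse(sim, obs):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    mean_o = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den


def weighted_quantile(pairs, q):
    """pairs: (value, weight) tuples sorted by value, weights summing to 1."""
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall near q = 1


def glue_bounds(simulations, obs, threshold=0.0, lower=0.025, upper=0.975):
    """Keep simulations with NSE above the threshold (behavioural sets) and
    return likelihood-weighted predictive bounds for every time step."""
    scored = [(sim, nse(sim, obs)) for sim in simulations]
    behavioural = [(sim, like) for sim, like in scored if like > threshold]
    if not behavioural:
        raise ValueError("no behavioural parameter sets above the threshold")
    total = sum(like for _, like in behavioural)  # assumed re-scaling: weights sum to 1
    bounds = []
    for t in range(len(obs)):
        pairs = sorted((sim[t], like / total) for sim, like in behavioural)
        bounds.append((weighted_quantile(pairs, lower),
                       weighted_quantile(pairs, upper)))
    return bounds, len(behavioural)


# Toy example: three model runs (one per sampled parameter set) against observations.
observed = [1.0, 2.0, 3.0, 4.0]
runs = [
    [1.0, 2.0, 3.0, 4.0],      # NSE = 1.0, behavioural
    [1.5, 2.5, 3.5, 4.5],      # NSE = 0.8, behavioural
    [10.0, 10.0, 10.0, 10.0],  # NSE < 0, rejected by the zero threshold
]
bounds, n_behavioural = glue_bounds(runs, observed)
print(n_behavioural, bounds)
```

In a setting like the one described, each entry of `runs` would come from one hydrological model run, and the same behavioural sets would then be reused for every future climate scenario.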