    Computing Accurate Probabilistic Estimates of One-D Entropy from Equiprobable Random Samples

    We develop a simple Quantile Spacing (QS) method for accurate probabilistic estimation of one-dimensional entropy from equiprobable random samples, and compare it with the popular Bin-Counting (BC) method. In contrast to BC, which uses equal-width bins with varying probability mass, the QS method uses estimates of the quantiles that divide the support of the data-generating probability density function (pdf) into equal-probability-mass intervals. Whereas BC requires optimal tuning of a bin-width hyperparameter whose value varies with sample size and shape of the pdf, QS requires only specification of the number of quantiles to be used. Results indicate, for the class of distributions tested, that the optimal number of quantile spacings is a fixed fraction of the sample size (empirically determined to be ~0.25-0.35), and that this fraction is relatively insensitive to distributional form or sample size, providing a clear advantage over BC since hyperparameter tuning is not required. Bootstrapping is used to approximate the sampling variability distribution of the resulting entropy estimate, and is shown to accurately reflect the true uncertainty. For the four distributional forms studied (Gaussian, Log-Normal, Exponential and Bimodal Gaussian Mixture), expected estimation bias is less than 1% and uncertainty is relatively low even for very small sample sizes. We speculate that estimating quantile locations, rather than bin probabilities, results in more efficient use of the information in the data to approximate the underlying shape of an unknown data-generating pdf. Comment: 23 pages, 12 figures
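
    The equal-probability-mass idea described above can be illustrated with a short Python sketch. This is a simplified illustration, not the authors' algorithm: it assumes plain empirical quantiles from `numpy.quantile`, treats the sample minimum and maximum as the limits of the support, and omits the bootstrap step. The function name `qs_entropy` and its `fraction` parameter are hypothetical, with the default of 0.3 taken from the middle of the ~0.25-0.35 range quoted above. A plain empirical-quantile version like this may not reproduce the accuracy reported in the abstract.

```python
import numpy as np


def qs_entropy(sample, fraction=0.3):
    """Quantile-spacing style entropy estimate in nats (simplified sketch)."""
    x = np.asarray(sample, dtype=float)
    # Number of equal-probability intervals as a fixed fraction of sample size.
    n_q = max(2, int(round(fraction * x.size)))
    # Empirical quantiles at evenly spaced probabilities.
    # Assumption: the sample min and max bound the support.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_q + 1))
    widths = np.diff(edges)
    if np.any(widths <= 0.0):
        raise ValueError("tied values give zero-width intervals; reduce `fraction`")
    # Each interval holds mass 1/n_q, so the density there is about
    # 1 / (n_q * width), and H = -sum(p * ln(density)) = ln(n_q) + mean(ln(width)).
    return float(np.log(n_q) + np.mean(np.log(widths)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=2000)
    print("estimate:", qs_entropy(x))
    print("true    :", 0.5 * np.log(2 * np.pi * np.e))
```

    Because every interval carries the same probability mass, the only quantity estimated from the data is the set of interval widths, which is the contrast with equal-width bin counting drawn in the abstract.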

    Using the Airborne Snow Observatory to Assess Remotely Sensed Snowfall Products in the California Sierra Nevada

    The Airborne Snow Observatory (ASO) performed two acquisitions over two mountainous basins in California on 29 January and 3 March 2017, encompassing two atmospheric river events that brought heavy snowfall to the area. These surveys produced high-resolution (50 m) maps of snow depth and snow water equivalent (SWE) that were used to estimate monthly areal snowfall accumulation. Comparison of ASO snow accumulation with point measurements showed that the ASO estimates ranged from -10 to +16% relative bias across three sites, a range that is likely inflated, relative to the actual errors in these products, by the mismatch in areal representation between the two quantities. The aggregated SWE accumulations from ASO are then used to evaluate a suite of in situ based and remote sensing precipitation products. During the study period, Parameter-Elevation Regressions on Independent Slopes Model (PRISM) and Mountain Mapper estimates had relative bias 0.8 with ASO snow accumulation over the selected grids at the monthly scale. Finally, we leveraged the fine-scale sampling of the spatially complete ASO products to show that by moving from 100 m to 2 km spatial scales, the perceived bias errors in SWE at point locations increased by an order of magnitude, displaying a nonlinear relationship. The study demonstrates that ASO acquisitions in cold months can bring a new and effective approach to spatial evaluation of precipitation products. National Aeronautics and Space Administration; NASA Energy and Water Cycle Study [NNH13ZDA001N-NEWS]; NASA Terrestrial Hydrology Programs; NASA Western Water Applications Office. 6 month embargo; published online: 10 September 2018.

    Application of machine learning and remote sensing for gap-filling daily precipitation data of a sparsely gauged basin in East Africa

    Access to the spatiotemporal distribution of precipitation is needed in many hydrological applications. However, gauge records often have spatiotemporal gaps. To mitigate this, we considered three main approaches: (i) using remote sensing and reanalysis precipitation products; (ii) machine learning-based approaches; and (iii) gap-filling software explicitly developed for filling the gaps of daily precipitation records. This study evaluated all approaches over a sparsely gauged basin in East Africa. Among the examined precipitation products, PERSIANN-CDR outperformed other satellite products in terms of root mean squared error (7.3 mm) and correlation coefficient (0.46), while having a large bias (50%) compared to the available in situ precipitation records. PERSIANN-CDR also demonstrated the highest skill in distinguishing rainy and non-rainy days. On the other hand, Random Forest outperformed all other approaches (including PERSIANN-CDR), with the lowest relative bias (-2%) and root mean squared error (6.9 mm), and the highest correlation coefficient (0.53).
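
    The three continuous error measures quoted above can be computed as in the following Python sketch. The abstract does not give the formulas, so the definitions here are assumptions: relative bias is taken as the difference in totals divided by the observed total, and days with a missing value in either series are dropped. The function name `continuous_scores` is hypothetical.

```python
import numpy as np


def continuous_scores(estimate, observed):
    """Return (rmse, correlation, relative_bias_percent) for paired daily series."""
    est = np.asarray(estimate, dtype=float)
    obs = np.asarray(observed, dtype=float)
    # Keep only days where both the estimate and the gauge value are present.
    valid = ~(np.isnan(est) | np.isnan(obs))
    est, obs = est[valid], obs[valid]
    rmse = float(np.sqrt(np.mean((est - obs) ** 2)))
    # Pearson correlation coefficient between the two series.
    corr = float(np.corrcoef(est, obs)[0, 1])
    # Assumed definition: total over- or under-estimation relative to the
    # observed total, expressed in percent.
    rel_bias = float(100.0 * (est.sum() - obs.sum()) / obs.sum())
    return rmse, corr, rel_bias


if __name__ == "__main__":
    gauge = [0.0, 5.2, np.nan, 12.1, 0.4]
    product = [0.3, 4.0, 2.0, 15.0, 0.0]
    print(continuous_scores(product, gauge))
```

    Under these definitions a product can rank well on root mean squared error and correlation while still carrying a large relative bias, as reported for PERSIANN-CDR above.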

    Evaluating the evolution of ECMWF precipitation products using observational data for Iran: from ERA40 to ERA5

    European Center for Medium-Range Weather Forecasts Reanalysis (ERA), one of the most widely used precipitation products, has evolved from ERA-40 to ERA-20CM, ERA-20C, ERA-Interim, and ERA5. Studies evaluating the performance of individual ERA products cannot adequately assess the evolution of the products. We compared the performance of all ERA precipitation products at daily, monthly, and annual time scales (1980–2018) using more than 2100 precipitation gauges in Iran. Results indicated that ERA-40 performed worst, followed by ERA-20CM, which showed only minor improvements over ERA-40. ERA-20C considerably outperformed its predecessors, benefiting from the assimilation of observational data. Although several previous studies have reported full superiority of ERA5 over ERA-Interim, our results revealed several shortcomings in ERA5 compared with the ERA-Interim estimates. ERA-Interim and ERA5 performed best overall, with ERA-Interim showing better statistical and categorical skill scores, and ERA5 performing better in estimating extreme precipitation. These results suggest that the accuracy of ERA precipitation products has improved from ERA-40 to ERA-Interim, but not consistently from ERA-Interim to ERA5. This study employed a grid-grid comparison approach by first creating a gridded reference data set through the spatial aggregation of point source observations; however, the results from a point-grid approach showed no change in the overall ranking of products (despite slight changes in the error index values). These findings are useful for model development at a global scale and for hydrological applications in Iran.
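
    The abstract does not name the categorical skill scores it uses. As an illustration only, the Python sketch below computes three scores that are commonly used for wet/dry day detection: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI). The choice of scores, the function name `categorical_scores`, and the 1 mm/day wet-day threshold are all assumptions, not details taken from the study.

```python
import numpy as np


def categorical_scores(estimate, observed, threshold=1.0):
    """Return (pod, far, csi) for wet/dry day detection at a given threshold."""
    est_wet = np.asarray(estimate, dtype=float) >= threshold
    obs_wet = np.asarray(observed, dtype=float) >= threshold
    # Contingency table counts for the wet-day event.
    hits = int(np.sum(est_wet & obs_wet))
    misses = int(np.sum(~est_wet & obs_wet))
    false_alarms = int(np.sum(est_wet & ~obs_wet))
    nan = float("nan")
    # Fraction of observed wet days that the product also flags as wet.
    pod = hits / (hits + misses) if (hits + misses) else nan
    # Fraction of product wet days that were observed as dry.
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else nan
    # Hits relative to all days that were wet in either series.
    total = hits + misses + false_alarms
    csi = hits / total if total else nan
    return pod, far, csi


if __name__ == "__main__":
    gauge = [0.0, 5.0, 0.0, 3.0, 2.0, 0.0]
    product = [0.0, 4.0, 2.0, 0.0, 1.0, 0.0]
    print(categorical_scores(product, gauge))
```

    Scores of this kind depend on the chosen threshold, so a comparison between products is only meaningful when the same threshold is applied to all of them.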