Data-driven risk analysis involves the inference of probability distributions
from measured or simulated data. In the case of a highly reliable system, such
as the electricity grid, the amount of relevant data is often exceedingly
limited, but the impact of estimation errors may be very large. This paper
presents a robust nonparametric Bayesian method to infer possible underlying
distributions. The method obtains rigorous error bounds even for small samples
taken from ill-behaved distributions. The approach has a natural
interpretation in terms of the intervals between ordered observations: the
allocation of probability mass across intervals is well-specified, but the
location of that mass within each interval is unconstrained. This formulation
gives rise to a straightforward computational resampling method: Bayesian
Interval Sampling. In a comparison with common alternative approaches, it is
shown to satisfy strict error bounds even for ill-behaved distributions.

Comment: 13 pages, 3 figures; supplementary information provided. A revised
version of this manuscript has been accepted for publication in Philosophical
Transactions of the Royal Society A: Mathematical, Physical and Engineering
Sciences.
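
The interval interpretation described in the abstract can be illustrated with a
short sketch. It is not the paper's reference implementation, and it rests on
assumptions the abstract does not state: the quantity of interest is the mean,
the distribution has known finite support bounds (`lower`, `upper`), and the
probability mass of the n+1 intervals between ordered observations follows a
flat Dirichlet distribution. The function name `interval_mean_bounds` and all
parameter names are hypothetical. Because the location of mass within each
interval is unconstrained, each resampled weight vector yields a lower and an
upper value for the mean, obtained by placing the mass at the left or right end
of every interval.

```python
import numpy as np

def interval_mean_bounds(samples, lower, upper, n_draws=2000, seed=0):
    """Resample lower/upper values of the mean (illustrative sketch only).

    Assumptions (not taken from the source): flat Dirichlet weights over
    the n+1 intervals, and known support bounds `lower` and `upper`.
    """
    rng = np.random.default_rng(seed)
    x = np.sort(np.asarray(samples, dtype=float))
    # n observations plus the two support bounds give n+1 intervals.
    edges = np.concatenate(([lower], x, [upper]))
    # One weight vector per draw: probability mass allocated to each interval.
    weights = rng.dirichlet(np.ones(len(edges) - 1), size=n_draws)
    # Mass location is unconstrained, so take the extremes of each interval.
    mean_lo = weights @ edges[:-1]  # all mass at left endpoints
    mean_hi = weights @ edges[1:]   # all mass at right endpoints
    return mean_lo, mean_hi

lo, hi = interval_mean_bounds([0.2, 0.5, 0.9], lower=0.0, upper=1.0)
# A conservative 95% band for the mean: 2.5% quantile of the lower values
# and 97.5% quantile of the upper values.
band = (np.quantile(lo, 0.025), np.quantile(hi, 0.975))
```

With only three observations the resulting band is wide, which reflects how
little a small sample constrains the underlying distribution under these
assumptions.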