Comparing predictive distributions in EMOS
EMOS models are widely used post-processing techniques for obtaining
predictive distributions from ensembles for future weather variables. A predictive
distribution can be easily obtained by substituting the unknown parameters
with suitable estimates in the distribution of the future variable, thus obtaining a
so-called estimative distribution. Nonetheless, these distributions may perform poorly
in terms of coverage probability of the corresponding quantiles. In this work we
propose the use of calibrated predictive distributions in the context of EMOS models.
The proposed calibrated predictive distribution improves on estimative solutions,
producing quantiles with exact coverage level. A simulation study assesses
the performance of the calibrated predictive distribution in terms of coverage probabilities,
logarithmic score and CRPS.
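As a rough illustration of the bootstrap calibration idea described above, the sketch below treats the simplest possible case: an i.i.d. Gaussian sample rather than a full EMOS regression on ensemble members. This simplification, and the function names `fit_normal`, `estimative_cdf` and `calibrated_cdf`, are assumptions made for illustration and are not taken from the paper. The calibrated distribution is approximated by averaging, over parametric bootstrap replicates, the fitted probability of the quantile that each replicate assigns to the same nominal level.

```python
import random
from statistics import NormalDist, mean, pstdev


def fit_normal(sample):
    """Maximum likelihood estimates (mean, sd) for an i.i.d. normal sample."""
    return mean(sample), pstdev(sample)


def estimative_cdf(z, mu_hat, sigma_hat):
    """Plug-in (estimative) predictive cdf evaluated at z."""
    return NormalDist(mu_hat, sigma_hat).cdf(z)


def calibrated_cdf(z, sample, n_boot=2000, seed=0):
    """Bootstrap-calibrated predictive cdf at z (illustrative sketch only).

    Assumes a non-degenerate sample (positive fitted standard deviation).
    """
    rng = random.Random(seed)
    n = len(sample)
    mu_hat, sigma_hat = fit_normal(sample)
    # Standardised position of z under the fitted model; the nominal level
    # is alpha = Phi(u), so the alpha-quantile of any normal fit is mu + sd * u.
    u = (z - mu_hat) / sigma_hat
    total = 0.0
    for _ in range(n_boot):
        # Parametric bootstrap: resample from the fitted model and refit.
        boot = [rng.gauss(mu_hat, sigma_hat) for _ in range(n)]
        mu_b, sigma_b = fit_normal(boot)
        # Quantile of level alpha according to the bootstrap fit ...
        z_b = mu_b + sigma_b * u
        # ... and the probability the original fit assigns to it.
        total += estimative_cdf(z_b, mu_hat, sigma_hat)
    return total / n_boot
```

In this Gaussian toy case the exact predictive distribution is a Student t, so the calibrated value in the upper tail should fall below the estimative one. For a sample of size 10 and a point two fitted standard deviations above the fitted mean, the estimative value is about 0.977, while the calibrated value should be close to 0.95.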
Improved maximum likelihood estimation in heteroscedastic nonlinear regression models.
Nonlinear heteroscedastic regression models are a widely used class of models in applied statistics, with applications especially in biology, medicine or chemistry. Nonlinearity and variance heterogeneity can make likelihood estimation for a scalar parameter of interest rather inaccurate for small or moderate samples. In this paper, we suggest a new approach to point estimation based on estimating equations obtained from higher-order pivots for the parameter of interest. In particular, we take as an estimating function the modified directed likelihood. This is a higher-order pivotal quantity that can be easily computed in practice for nonlinear heteroscedastic models with normally distributed errors, using a recently developed S-PLUS library (HOA, 2000). The estimators obtained from this procedure are a refinement of the maximum likelihood estimators, improving their small sample properties and keeping equivariance under reparameterisation. Two applications to real data sets are discussed.
A note on simultaneous calibrated prediction intervals for time series
This paper deals with simultaneous prediction for time series models. In particular, it presents a simple procedure which gives well-calibrated simultaneous prediction intervals with coverage probability close to the target nominal value. Although the exact computation of the proposed intervals is usually not feasible, an approximation can be easily attained by means of a suitable bootstrap simulation procedure. This new predictive solution is much simpler to compute than those already proposed in the literature, which are based on asymptotic calculations. Applications of the bootstrap calibrated procedure to AR, MA and ARCH models are presented.
Robust prediction limits based on M-estimators
In this paper we discuss a robust solution to the problem of prediction. Following Barndorff-Nielsen and Cox (1996) and Vidoni (1998), we propose improved prediction limits based on M-estimators instead of maximum likelihood estimators. To compute these robust prediction limits, the expressions of the bias and variance of an M-estimator are required. Here a general asymptotic approximation for the bias of an M-estimator is derived. Moreover, by means of comparative studies in the context of affine transformation models, we show that the proposed robust procedure for prediction behaves in a similar manner to the classical one when the model is correctly specified, but it is designed to be stable in a neighborhood of the model
Confidence distributions for predictive tail probabilities
In this short paper we propose the use of a calibration procedure in order to obtain
predictive probabilities for a future random variable of interest. The new calibration method gives rise to a confidence distribution function whose probabilities are close to the nominal ones to a high order of approximation. Moreover, the proposed predictive distribution can be easily obtained by means of a bootstrap simulation procedure.
A simulation study is presented in order to assess the good properties of our proposal. The calibrated procedure is also applied to a series of real data related
to sport records, with the aim of accurately estimating
the probability of future records.
DancingLines: An Analytical Scheme to Depict Cross-Platform Event Popularity
Nowadays, events usually burst and propagate online through multiple
modern media such as social networks and search engines. Various studies
discuss event dissemination trends on individual media, while
few studies focus on event popularity analysis from a cross-platform
perspective. Challenges come from the vast diversity of events and media,
limited access to aligned datasets across different media and a great deal of
noise in the datasets. In this paper, we design DancingLines, an innovative
scheme that captures and quantitatively analyzes event popularity between
pairwise text media. It contains two models: TF-SW, a semantic-aware popularity
quantification model, based on an integrated weight coefficient leveraging
Word2Vec and TextRank; and wDTW-CD, a pairwise event popularity time series
alignment model matching different event phases adapted from Dynamic Time
Warping. We also propose three metrics to interpret event popularity trends
between pairwise social platforms. Experimental results on eighteen real-world
event datasets from an influential social network and a popular search engine
validate the effectiveness and applicability of our scheme. DancingLines is
demonstrated to possess broad application potential for discovering
knowledge about various aspects of events and different media.
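For readers unfamiliar with the alignment step, the following is a minimal sketch of classic Dynamic Time Warping, the technique from which wDTW-CD is said to be adapted. It is not the wDTW-CD model itself: the weighting and any other modifications introduced by the authors are not reproduced, and the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b stays
                                 cost[i][j - 1],      # b advances, a stays
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Two popularity curves with the same shape but a time lag, such as `[0, 1, 2, 1, 0]` and `[0, 0, 1, 2, 1, 0]`, receive a DTW cost of zero, which is why this family of methods suits comparing event phases across platforms that react at different speeds.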
Extreme value prediction: an application to sport records
Extreme value theory studies the extreme deviations from the central
portion of a probability distribution.
Results in this field have considerable importance in assessing the risk
that characterises rare events, such as collapse of the stock market,
or earthquakes of exceptional intensity, or floods.
In recent years, the application of extreme value theory to the prediction
of sport records has received increasing interest from the scientific community.
In this work we address the problem of constructing prediction limits for series
of extreme values coming from sport data.
We propose the use of a calibration procedure applied to the generalised
extreme value distribution, in order to obtain a proper predictive distribution
for future records.
The calibrated procedure is applied to series of real data related
to sport records. In particular, we consider sequences of annual maxima
for different athletic events.
Using the proposed calibrated predictive distribution, we show how to correctly predict
the probability of future records and we discuss the existence and
interpretation of ultimate records
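For reference, the generalised extreme value distribution for maxima can be written in its standard form as follows. The symbols for location, scale and shape are notation assumed here, not taken from the paper.

```latex
G(z;\mu,\sigma,\xi)
  = \exp\left\{ -\left[ 1 + \xi\,\frac{z-\mu}{\sigma} \right]_{+}^{-1/\xi} \right\},
  \qquad \sigma > 0,\ \xi \neq 0,
\qquad\text{with the Gumbel limit}\qquad
G(z;\mu,\sigma,0) = \exp\left\{ -\exp\left( -\frac{z-\mu}{\sigma} \right) \right\}.

% For a negative shape parameter the support is bounded above by
z^{*} = \mu - \frac{\sigma}{\xi}, \qquad \xi < 0 .
```

Under this parametrisation a finite upper endpoint exists only when the shape parameter is negative, which is one natural way to read the question of whether an ultimate record exists for a series of annual maxima.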
Simultaneous calibrated prediction intervals for time series
This paper deals with simultaneous prediction for time series models. In
particular, it presents a simple procedure which gives well-calibrated simultaneous
predictive intervals with coverage probability equal to or close to the target nominal
value. Although the exact computation of the proposed intervals is usually not feasible,
an approximation can be easily obtained by means of a suitable bootstrap simulation
procedure. This new predictive solution is much simpler to compute than
those already proposed in the literature, which are based on asymptotic calculations. An
application of the bootstrap calibrated procedure to first order autoregressive models
is presented
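To make the notion of a simultaneous interval concrete, here is a minimal simulation sketch for a zero-mean first order autoregressive model. It is deliberately simpler than the procedure in the paper: the fitted parameters are plugged in as if known, so estimation uncertainty is not calibrated away, and the band is widened only to reach joint coverage over the whole forecast horizon. The helper names `fit_ar1` and `simultaneous_band`, the Gaussian innovations and the zero-mean assumption are all hypothetical choices made for illustration.

```python
import math
import random


def fit_ar1(series):
    """Least-squares fit of a zero-mean AR(1): x_t = phi * x_{t-1} + e_t."""
    num = sum(a * b for a, b in zip(series[1:], series[:-1]))
    den = sum(b * b for b in series[:-1])
    phi = num / den
    resid = [a - phi * b for a, b in zip(series[1:], series[:-1])]
    sigma = math.sqrt(sum(r * r for r in resid) / len(resid))
    return phi, sigma


def simultaneous_band(series, horizon=5, level=0.9, n_paths=4000, seed=0):
    """Simulation-based band covering the whole future path with probability ~level."""
    rng = random.Random(seed)
    phi, sigma = fit_ar1(series)
    last = series[-1]
    # Plug-in point forecasts and forecast standard deviations for steps 1..horizon.
    centre = [last * phi ** (j + 1) for j in range(horizon)]
    sd = [sigma * math.sqrt(sum(phi ** (2 * i) for i in range(j + 1)))
          for j in range(horizon)]
    # For each simulated future path, record its largest standardised deviation.
    max_dev = []
    for _ in range(n_paths):
        x, worst = last, 0.0
        for j in range(horizon):
            x = phi * x + rng.gauss(0.0, sigma)
            worst = max(worst, abs(x - centre[j]) / sd[j])
        max_dev.append(worst)
    max_dev.sort()
    # Empirical quantile of the maximum gives a common multiplier for all steps.
    c = max_dev[min(n_paths - 1, math.ceil(level * n_paths) - 1)]
    band = [(m - c * s, m + c * s) for m, s in zip(centre, sd)]
    return band, c
```

The returned multiplier `c` is larger than the pointwise normal quantile for the same level, which is the price of requiring that all future values fall inside the band at once. A calibrated procedure in the spirit of the abstract would additionally re-estimate the parameters on bootstrap series and adjust the level so that the resulting coverage matches the nominal target.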
Calibrated prediction regions for Gaussian random fields
This paper proposes a method to construct well-calibrated frequentist prediction regions, with particular regard to the highest prediction density regions, which may be useful for multivariate spatial prediction. We consider, in particular, Gaussian random fields, and using a calibrating procedure we effectively improve the estimative prediction regions, because the coverage probability turns out to be closer to the target nominal value. Whenever a closed-form expression for the well-calibrated prediction region is not available, we may specify a simple bootstrap-based estimator. Particular attention is dedicated to the associated, improved predictive distribution function, which can be usefully considered for identifying spatial locations with extreme or unusual observations. A simulation study is presented in order to compare empirically the calibrated predictive regions with the estimative ones. The proposed method is then applied to the global model assessment of a deterministic model for the prediction of PM10 levels using data from a network of air quality monitoring stations.
Dynamics of Information Diffusion and Social Sensing
Statistical inference using social sensors is an area that has witnessed
remarkable progress and is relevant in applications including localizing events
for targeted advertising, marketing, localization of natural disasters and
predicting sentiment of investors in financial markets. This chapter presents a
tutorial description of four important aspects of sensing-based information
diffusion in social networks from a communications/signal processing
perspective. First, diffusion models for information exchange in large scale
social networks together with social sensing via social media networks such as
Twitter are considered. Second, Bayesian social learning models and risk averse
social learning are considered with applications in finance and online
reputation systems. Third, the principle of revealed preferences arising in
micro-economics theory is used to parse datasets to determine if social sensors
are utility maximizers and then determine their utility functions. Finally, the
interaction of social sensors with YouTube channel owners is studied using time
series analysis methods. All four topics are explained in the context of actual
experimental datasets from health networks, social media and psychological
experiments. Also, algorithms are given that exploit the above models to infer
underlying events based on social sensing. The overview, insights, models and
algorithms presented in this chapter stem from recent developments in network
science, economics and signal processing. At a deeper level, this chapter
considers mean field dynamics of networks, risk averse Bayesian social learning
filtering and quickest change detection, data incest in decision making over a
directed acyclic graph of social sensors, inverse optimization problems for
utility function estimation (revealed preferences) and statistical modeling of
interacting social sensors in YouTube social networks.