Modeling Persistent Trends in Distributions
We present a nonparametric framework to model a short sequence of probability
distributions that vary both due to underlying effects of sequential
progression and confounding noise. To distinguish between these two types of
variation and estimate the sequential-progression effects, our approach
leverages an assumption that these effects follow a persistent trend. This work
is motivated by the recent rise of single-cell RNA-sequencing experiments over
a brief time course, which aim to identify genes relevant to the progression of
a particular biological process across diverse cell populations. While
classical statistical tools focus on scalar-response regression or
order-agnostic differences between distributions, it is desirable in this
setting to consider both the full distributions and the structure
imposed by their ordering. We introduce a new regression model for ordinal
covariates where responses are univariate distributions and the underlying
relationship reflects consistent changes in the distributions over increasing
levels of the covariate. This concept is formalized as a "trend" in
distributions, which we define as an evolution that is linear under the
Wasserstein metric. Implemented via a fast alternating projections algorithm,
our method exhibits numerous strengths in simulations and analyses of
single-cell gene expression data.
Comment: To appear in: Journal of the American Statistical Association
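For univariate distributions, the 2-Wasserstein distance reduces to the L2 distance between quantile functions, so an evolution that is "linear under the Wasserstein metric" is one whose quantile functions change linearly with the ordinal level. A minimal numpy sketch of this idea on synthetic data (a plain least-squares fit, not the paper's alternating-projections algorithm):

```python
import numpy as np

# Toy sequence of samples: the distribution shifts steadily, plus noise.
rng = np.random.default_rng(0)
levels = np.arange(5)                        # ordinal covariate t = 0..4
samples = [rng.normal(loc=0.5 * t, scale=1.0, size=400) for t in levels]

# Empirical quantile functions on a common probability grid.
probs = np.linspace(0.01, 0.99, 99)
Q = np.array([np.quantile(s, probs) for s in samples])   # shape (5, 99)

# A Wasserstein-linear trend: fit Q(t, p) = a(p) + b(p) * t by least squares,
# separately at each quantile level p.
T = np.column_stack([np.ones_like(levels), levels])
coef, *_ = np.linalg.lstsq(T, Q, rcond=None)             # shape (2, 99)
Q_trend = T @ coef                                       # fitted quantile functions

# Squared 2-Wasserstein distance between observed and fitted distributions.
w2_sq = np.mean((Q - Q_trend) ** 2, axis=1)
print(np.round(w2_sq, 3))
```

A genuine quantile function must be monotone in p; the raw least-squares fit above can violate this, which is one reason a projection step onto valid (monotone) quantile trends is needed in practice.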
Extreme Value Statistics of the Total Energy in an Intermediate Complexity Model of the Mid-latitude Atmospheric Jet. Part I: Stationary case
A baroclinic model for the atmospheric jet at mid-latitudes is used as a
stochastic generator of time series of the total energy of the system.
Statistical inference of extreme values is applied to yearly maxima sequences
of the time series, in the rigorous setting provided by extreme value theory.
In particular, the Generalized Extreme Value (GEV) family of distributions is
used here. Several physically realistic values of the forcing parameter,
descriptive of the forced equator-to-pole temperature gradient and responsible
for setting the average baroclinicity in the atmospheric model, are examined.
The location and scale GEV parameters are found to have a piecewise smooth,
monotonically increasing dependence on this parameter. This is in agreement
with the similar dependence observed in the same system when other dynamically
and physically relevant observables are considered. The GEV shape parameter
also increases with the forcing but is always negative, as a priori required
by the boundedness of the total energy of the system. The sensitivity of the
statistical inference process is studied with respect to the selection
procedure of the maxima: the roles of both the length of maxima sequences and
of the length of data blocks over which the maxima are computed are critically
analyzed. Issues related to model sensitivity are also explored by varying the
resolution of the system.
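The block-maxima workflow described above (one maximum per data block, then a GEV fit) can be sketched with scipy. Note that scipy's genextreme uses the shape convention c = -xi, so a bounded upper tail (xi < 0) corresponds to positive c. The beta-distributed surrogate series below is an assumption for illustration, standing in for the bounded total-energy observable:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(1)

# Surrogate bounded "total energy" series: 300 "years" of daily values.
series = rng.beta(a=2.0, b=5.0, size=300 * 365)

# Block maxima: one maximum per year-long block.
block = 365
maxima = series.reshape(-1, block).max(axis=1)

# Fit the GEV family; scipy uses shape c = -xi, so xi = -c.
c, loc, scale = genextreme.fit(maxima)
xi = -c
print(f"xi = {xi:.3f}, loc = {loc:.3f}, scale = {scale:.4f}")
```

Changing the block length trades bias (short blocks strain the asymptotic GEV approximation) against variance (long blocks leave few maxima), which is precisely the sensitivity analysis of the maxima-selection procedure described in the abstract.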
Extreme Value GARCH modelling with Bayesian Inference
Extreme value theory is widely used in financial applications such as risk analysis, forecasting and pricing models. One of the major difficulties in applications to finance and economics is that the assumption of independence of time series observations is generally not satisfied, so that dependent extremes may not necessarily lie in the domain of attraction of the classical generalised extreme value distribution. This study examines a conditional extreme value distribution with the added specification that the extreme values (maxima or minima) follow a conditional autoregressive heteroscedasticity process. The dependence is modelled by allowing the location and scale parameters of the extreme value distribution to vary with time. The resulting combined model, GEV-GARCH, is developed by implementing the GARCH volatility mechanism in these extreme value model parameters. Bayesian inference is used for parameter estimation, and posterior inference is available through the Markov chain Monte Carlo (MCMC) method. The model is first applied to simulated data to verify model stability and the reliability of the parameter estimation method. Real stock returns are then used to assess the appropriateness of the model in applications. A comparison is made between the GEV-GARCH and traditional GARCH models. Both produce similar conditional volatility estimates; however, the GEV-GARCH model differs from GARCH in that it can capture and explain extreme quantiles better, because of more reliable extrapolation of the tail behaviour.
Keywords: extreme value distribution, dependency, Bayesian, MCMC, return quantile
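The core object in such a model is a GEV likelihood whose location and scale parameters follow a GARCH-type recursion. A minimal sketch of one such likelihood, with an illustrative parameterization that is not necessarily the paper's exact specification (the Bayesian MCMC step is omitted; this is only the building block either an MCMC or a maximum-likelihood fit would evaluate):

```python
import numpy as np
from scipy.stats import genextreme

def gev_garch_nll(params, x):
    """Negative log-likelihood of a GEV with a GARCH(1,1)-type scale recursion.

    Illustrative parameterization: mu fixed,
    sigma_t^2 = omega + alpha*(x_{t-1} - mu)^2 + beta*sigma_{t-1}^2.
    scipy's genextreme uses shape c = -xi.
    """
    mu, xi, omega, alpha, beta = params
    sig2 = np.empty_like(x)
    sig2[0] = np.var(x)                   # initialize at the sample variance
    for t in range(1, len(x)):
        sig2[t] = omega + alpha * (x[t - 1] - mu) ** 2 + beta * sig2[t - 1]
    sigma = np.sqrt(sig2)
    ll = genextreme.logpdf(x, c=-xi, loc=mu, scale=sigma)
    return -np.sum(ll)

rng = np.random.default_rng(2)
x = genextreme.rvs(c=-0.1, loc=0.0, scale=1.0, size=500, random_state=rng)
nll = gev_garch_nll((0.0, 0.1, 0.5, 0.1, 0.4), x)
print(round(float(nll), 2))
```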
Estimation of Extreme Quantiles for Functions of Dependent Random Variables
We propose a new method for estimating the extreme quantiles for a function
of several dependent random variables. In contrast to the conventional approach
based on extreme value theory, we do not impose the condition that the tail of
the underlying distribution admits an approximate parametric form, and,
furthermore, our estimation makes use of the full observed data. The proposed
method is semiparametric in that no parametric forms are assumed for the
marginal distributions, but appropriate bivariate copulas are selected to model
the joint dependence structure, taking advantage of recent developments in
constructing high-dimensional vine copulas. A sample quantile obtained from a
large bootstrap sample drawn from the fitted joint distribution is then taken
as the estimator of the extreme quantile. This estimator is proved to
be consistent. The reliable and robust performance of the proposed method is
further illustrated by simulation.
Comment: 18 pages, 2 figures
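The estimation recipe (nonparametric margins, a fitted copula, a large bootstrap sample, then a sample quantile) can be illustrated in the simplest two-variable case, with a Gaussian copula standing in for the vine construction used in the paper and an assumed target function X + Y:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

# Observed dependent data; target: an extreme quantile of X + Y.
n = 2000
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n)
x, y = np.exp(z[:, 0]), np.exp(0.5 * z[:, 1])      # dependent, skewed margins

# Step 1: nonparametric margins via empirical CDFs (pseudo-observations).
u = (np.argsort(np.argsort(x)) + 0.5) / n
v = (np.argsort(np.argsort(y)) + 0.5) / n

# Step 2: fit a Gaussian copula by the correlation of normal scores.
rho = np.corrcoef(norm.ppf(u), norm.ppf(v))[0, 1]

# Step 3: draw a large sample from the fitted copula and map it back through
# the empirical quantile functions of each margin.
B = 100_000
zb = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=B)
ub, vb = norm.cdf(zb[:, 0]), norm.cdf(zb[:, 1])
xb = np.quantile(x, ub)
yb = np.quantile(y, vb)

# Step 4: sample quantile of the bootstrap sample of the target function.
q999 = np.quantile(xb + yb, 0.999)
print(round(float(q999), 2))
```

Because the margins are resampled rather than extrapolated parametrically, the estimate uses the full observed data, which is the contrast with the classical EVT approach drawn in the abstract.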
Fixed Effect Estimation of Large T Panel Data Models
This article reviews recent advances in fixed effect estimation of panel data
models for long panels, where the number of time periods is relatively large.
We focus on semiparametric models with unobserved individual and time effects,
where the distribution of the outcome variable conditional on covariates and
unobserved effects is specified parametrically, while the distribution of the
unobserved effects is left unrestricted. Compared to existing reviews on long
panels (Arellano and Hahn 2007; a section in Arellano and Bonhomme 2011) we
discuss models with both individual and time effects, split-panel jackknife
bias corrections, unbalanced panels, distribution and quantile effects, and
other extensions. Understanding and correcting the incidental parameter bias
caused by the estimation of many fixed effects is our main focus, and the
unifying theme is that the order of this bias is given by the simple formula
p/n for all models discussed, with p the number of estimated parameters and n
the total sample size.
Comment: 40 pages, 1 table
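The p/n formula specializes transparently to the two-way model: with N individuals and T periods there are roughly p = N + T estimated effects among n = NT observations, so the bias is of order 1/T + 1/N. A quick numeric check:

```python
# Order of the incidental-parameter bias, p/n, for a panel with N individuals,
# T periods, and both individual and time fixed effects:
# p = N + T estimated effects, n = N * T observations, so p/n = 1/T + 1/N.
for N, T in [(1000, 10), (1000, 50), (100, 100)]:
    p, n = N + T, N * T
    print(f"N={N:5d} T={T:3d}  p/n = {p/n:.4f}  (1/T + 1/N = {1/T + 1/N:.4f})")
```

The table makes the "large T" framing concrete: with N = 1000 and T = 10 the bias order is about 0.10, while lengthening the panel to T = 50 shrinks it to about 0.02.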
Quantile Regression in Risk Calibration
Financial risk control has always been challenging and is now an even harder problem as joint extreme events occur more frequently. For decision makers and government regulators, it is therefore important to obtain accurate information on the interdependency of risk factors. Given a stressful situation for one market participant, one would like to measure how this stress affects other factors. The CoVaR (conditional VaR) framework has been developed for this purpose. The basic technical elements of CoVaR estimation are two levels of quantile regression: one on market risk factors, the other on an individual risk factor. Tests on the functional form of the two-level quantile regression reject linearity. A flexible semiparametric modelling framework for CoVaR is therefore proposed, and a partial linear model (PLM) is analyzed. In an application to stock data covering the crisis period, the PLM outperforms during the crisis, as justified by backtesting procedures. Moreover, using data on global stock market indices, an analysis of the marginal contribution to risk (MCR), defined as the local first-order derivative of the quantile curve, sheds some light on the source of global market risk.
Keywords: CoVaR, Value-at-Risk, quantile regression, locally linear quantile regression, partial linear model, semiparametric model
Local bilinear multiple-output quantile/depth regression
A new quantile regression concept, based on a directional version of Koenker
and Bassett's traditional single-output one, has been introduced in [Ann.
Statist. (2010) 38 635-669] for multiple-output location/linear regression
problems. The polyhedral contours provided by the empirical counterpart of that
concept, however, cannot adapt to unknown nonlinear and/or heteroskedastic
dependencies. This paper therefore introduces local constant and local linear
(actually, bilinear) versions of those contours, which both allow one to
asymptotically recover the conditional halfspace depth contours that completely
characterize the response's conditional distributions. Bahadur representation
and asymptotic normality results are established. Illustrations are provided
both on simulated and real data.
Comment: Published at http://dx.doi.org/10.3150/14-BEJ610 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
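As background for the halfspace-depth connection: in the location case (no covariates), the directional-quantile contours can be approximated by intersecting, over many unit directions u, the halfspaces whose boundaries sit at the tau-quantile of the projection u'Y. A small numpy sketch of the location case only, not the paper's local bilinear regression version:

```python
import numpy as np

rng = np.random.default_rng(6)
Y = rng.multivariate_normal([0, 0], [[1, 0.3], [0.3, 1]], size=2000)

tau = 0.2
angles = np.linspace(0, 2 * np.pi, 90, endpoint=False)
U = np.column_stack([np.cos(angles), np.sin(angles)])   # unit directions

# The tau-quantile of each directional projection defines a halfspace.
q = np.quantile(Y @ U.T, tau, axis=0)                   # shape (90,)

def in_depth_region(y):
    """True if y lies in every halfspace {z : u'z >= q_u(tau)}."""
    return bool(np.all(U @ y >= q))

print(in_depth_region(np.array([0.0, 0.0])),
      in_depth_region(np.array([3.0, 3.0])))
```

The intersection of these halfspaces is a convex polyhedral contour, and the abstract's point is that such global (linear) contours cannot bend to follow nonlinear or heteroskedastic conditional structure, motivating the local constant and local bilinear versions.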
Efficient semiparametric estimation of a partially linear quantile regression model
This paper is concerned with estimating a conditional quantile function that is assumed to be partially linear. The paper develops a simple estimator of the parametric component of the conditional quantile. The semiparametric efficiency bound for the parametric component is derived, and two types of efficient estimators are considered. Asymptotic properties of the proposed estimators are established under regularity conditions. Some Monte Carlo experiments indicate that the proposed estimators perform well in small samples.