Testing for Homogeneity in Mixture Models
Statistical models of unobserved heterogeneity are typically formalized as
mixtures of simple parametric models and interest naturally focuses on testing
for homogeneity versus general mixture alternatives. Many tests of this type
can be interpreted as C(α) tests, as in Neyman (1959), and shown to be
locally asymptotically optimal. These C(α) tests will be contrasted
with a new approach to likelihood ratio testing for general mixture models. The
latter tests are based on estimation of a general nonparametric mixing
distribution with the Kiefer and Wolfowitz (1956) maximum likelihood estimator.
Recent developments in convex optimization have dramatically improved upon
earlier EM methods for computation of these estimators, and recent results on
the large sample behavior of likelihood ratios involving such estimators yield
a tractable form of asymptotic inference. Improved computational efficiency
also facilitates the use of bootstrap methods to determine critical values,
which are shown to work better than the asymptotic critical values in finite
samples. Consistency of the bootstrap procedure is also formally established.
We compare the performance of the two approaches, identifying circumstances in
which each is preferred.
Distribution-Free Tests of Independence in High Dimensions
We consider the testing of mutual independence among all entries in a
d-dimensional random vector based on n independent observations. We study
two families of distribution-free test statistics, which include Kendall's tau
and Spearman's rho as important examples. We show that under the null
hypothesis the test statistics of these two families converge weakly to Gumbel
distributions, and propose tests that control the type I error in the
high-dimensional setting where d > n. We further show that the two tests are
rate-optimal in terms of power against sparse alternatives, and outperform
competitors in simulations, especially when d is large. Comment: to appear in Biometrika
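As a rough illustration of the kind of statistic this abstract describes, the sketch below computes the largest absolute pairwise Kendall's tau across the coordinates of a data matrix. It is only a minimal sketch: the function names and the toy data are hypothetical, and the normalization and Gumbel critical value used by the actual test are not given in the abstract, so they are deliberately omitted.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau (tau-a) for two equal-length sequences.

    Tied pairs contribute zero to the numerator.
    """
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        sign_x = (x[i] > x[j]) - (x[i] < x[j])
        sign_y = (y[i] > y[j]) - (y[i] < y[j])
        s += sign_x * sign_y
    return s / (n * (n - 1) / 2)


def max_abs_tau(rows):
    """Largest |tau| over all coordinate pairs (unnormalized max-type statistic)."""
    cols = list(zip(*rows))  # one tuple per coordinate
    return max(
        abs(kendall_tau(cols[j], cols[k]))
        for j, k in combinations(range(len(cols)), 2)
    )


# Hypothetical toy sample: 4 observations of a 3-dimensional vector.
rows = [(1, 2, 10), (2, 1, 20), (3, 4, 30), (4, 3, 40)]
print(max_abs_tau(rows))
```

Because Kendall's tau depends only on ranks, a statistic built this way does not depend on the marginal distributions, which is what makes such tests distribution-free.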
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed. Comment: 61 pages
Evidence of non-Markovian behavior in the process of bank rating migrations
This paper estimates transition matrices for the ratings of financial institutions, using an unusually informative data set. We show that the process of rating migration exhibits significant non-Markovian behavior, in the sense that the transition intensities are affected by macroeconomic and bank-specific variables. We illustrate how the use of a continuous-time framework may improve the estimation of the transition probabilities. However, the time-homogeneity assumption, frequently made in economic applications, does not hold, even for short time intervals. Thus, the information provided by migrations alone is not enough to forecast the future behavior of ratings. The stage of the business cycle should be taken into account, and individual characteristics of banks must be considered as well.
Keywords: financial institutions; macroeconomic variables; capitalization; supervision; transition intensities. JEL classification: C4; E44; G21; G23; G38.
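For orientation, the sketch below shows the standard time-homogeneous continuous-time estimator of transition intensities: the number of observed moves from rating i to rating j divided by the total time spent in rating i. This is the baseline whose homogeneity assumption the abstract argues does not hold, not the paper's own model; the function name and the toy rating histories are hypothetical.

```python
from collections import defaultdict


def estimate_intensities(histories):
    """Estimate off-diagonal intensities q[(i, j)] = N_ij / T_i.

    histories: list of (path, end_time) pairs. Each path is a list of
    (time, rating) entries sorted by time and listing rating changes only;
    a rating holds until the next entry, or until end_time for the last one.
    """
    transitions = defaultdict(int)  # (i, j) -> number of observed i -> j moves
    exposure = defaultdict(float)   # i -> total time spent in rating i
    for path, end_time in histories:
        for k, (t, rating) in enumerate(path):
            is_last = k + 1 == len(path)
            t_next = end_time if is_last else path[k + 1][0]
            exposure[rating] += t_next - t
            if not is_last:
                transitions[(rating, path[k + 1][1])] += 1
    return {
        (i, j): count / exposure[i]
        for (i, j), count in transitions.items()
        if exposure[i] > 0
    }


# Hypothetical toy histories (times in years), not data from the paper.
histories = [
    ([(0.0, "A"), (1.5, "B")], 3.0),
    ([(0.0, "A")], 3.0),
    ([(0.0, "B"), (2.0, "A")], 3.0),
]
q = estimate_intensities(histories)
print(q)
```

Unlike a yearly cohort count, this estimator uses the exact timing of each migration, which is one reason a continuous-time framework can sharpen the estimates. Letting the intensities depend on covariates such as the business cycle would require a richer model than this sketch.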
Characterization of the frequency of extreme events by the Generalized Pareto Distribution
Based on recent results in extreme value theory, we use a new technique for
the statistical estimation of distribution tails. Specifically, we use the
Gnedenko-Pickands-Balkema-de Haan theorem, which gives a natural limit law for
peak-over-threshold values in the form of the Generalized Pareto Distribution
(GPD). The GPD is useful in finance, insurance, and hydrology; here we
investigate the earthquake energy distribution described by the
Gutenberg-Richter seismic moment-frequency law and analyze shallow earthquakes
(depth h < 70 km) in the Harvard catalog over the period 1977-2000 in 18
seismic zones. The whole GPD is found to approximate the tails of the seismic
moment distributions quite well for moment magnitudes larger than mW = 5.3,
and no statistically significant
regional difference is found for subduction and transform seismic zones. We
confirm that the b-value is very different in mid-ocean ridges compared to
other zones (b = 1.50 ± 0.09 versus b = 1.00 ± 0.05, corresponding to a power law
exponent close to 1 versus 2/3) with a very high statistical confidence. We
propose a physical mechanism for this, contrasting slow healing ruptures in
mid-ocean ridges with fast healing ruptures in other zones. Deviations from the
GPD at the very end of the tail are detected in the sample containing
earthquakes from all major subduction zones (sample size of 4985 events). We
propose a new statistical test of significance of such deviations based on the
bootstrap method. The number of events deviating from the tails of the GPD in the
studied data sets (15-20 at most) is not sufficient for determining the
functional form of those deviations. Thus, it is practically impossible to give
preference to one of the previously suggested parametric families describing
the ends of tails of seismic moment distributions. Comment: pdf document of 21 pages + 2 tables + 20 figures (ps format) + one
file giving the regionalization
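The link between the quoted b-values and the power-law exponents can be checked with a short calculation. The sketch below assumes the usual moment-magnitude relation, in which mW is (2/3) log10 of the seismic moment plus a constant, so a Gutenberg-Richter b-value corresponds to a moment exponent of 2b/3. The function name is hypothetical, and the linear error propagation is an illustrative assumption rather than something stated in the abstract.

```python
def moment_exponent(b, b_err=0.0):
    """Convert a Gutenberg-Richter b-value to a seismic-moment power-law exponent.

    Assumes m_W = (2/3) * log10(M0) + const, so that the number of events
    with moment above M0 scales as M0 ** (-2 * b / 3).
    """
    factor = 2.0 / 3.0
    return factor * b, factor * b_err


# b-values quoted in the abstract above.
zones = [("mid-ocean ridges", 1.50, 0.09), ("other zones", 1.00, 0.05)]
for zone, b, err in zones:
    beta, beta_err = moment_exponent(b, err)
    print(f"{zone}: b = {b:.2f} +/- {err:.2f} -> exponent = {beta:.2f} +/- {beta_err:.2f}")
```

Under this assumption, b = 1.50 maps to an exponent of 1 and b = 1.00 maps to 2/3, matching the values given in the abstract.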