Confidence distribution (CD) -- distribution estimator of a parameter
The notion of confidence distribution (CD), an entirely frequentist concept,
is in essence a Neymanian interpretation of Fisher's Fiducial distribution. It
contains information related to every kind of frequentist inference. In this
article, a CD is viewed as a distribution estimator of a parameter. This leads
naturally to consideration of the information contained in CD, comparison of
CDs and optimal CDs, and connection of the CD concept to the (profile)
likelihood function. A formal development of a multiparameter CD is also
presented.
Comment: Published at http://dx.doi.org/10.1214/074921707000000102 in the IMS
Lecture Notes Monograph Series
(http://www.imstat.org/publications/lecnotes.htm) by the Institute of
Mathematical Statistics (http://www.imstat.org)
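A minimal sketch of a CD used as a distribution estimator, assuming the textbook case of a normal mean with known standard deviation, where H(mu) = Phi(sqrt(n) * (mu - xbar) / sigma). The example and all function names are illustrative assumptions and are not taken from the article.

```python
from statistics import NormalDist


def normal_mean_cd(xbar, sigma, n):
    """Confidence distribution for a normal mean with known sigma (toy case).

    H(mu) = Phi(sqrt(n) * (mu - xbar) / sigma), i.e. the N(xbar, sigma^2 / n)
    CDF read as a function of the parameter mu.
    """
    return NormalDist(mu=xbar, sigma=sigma / n ** 0.5)


def cd_interval(cd, level=0.95):
    """Equal-tailed confidence interval taken from the CD quantiles."""
    alpha = 1.0 - level
    return cd.inv_cdf(alpha / 2), cd.inv_cdf(1 - alpha / 2)


def cd_pvalue_one_sided(cd, mu0):
    """p-value for H0: mu <= mu0 against H1: mu > mu0, read off as H(mu0)."""
    return cd.cdf(mu0)


cd = normal_mean_cd(xbar=1.0, sigma=2.0, n=25)
lower, upper = cd_interval(cd, 0.95)
p_value = cd_pvalue_one_sided(cd, mu0=0.0)
```

In this toy case a single object yields the point estimate (the CD median), confidence intervals at any level, and one-sided p-values, which is the sense in which a CD carries information for several kinds of frequentist inference.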
DeMoN: Depth and Motion Network for Learning Monocular Stereo
In this paper we formulate structure from motion as a learning problem. We
train a convolutional network end-to-end to compute depth and camera motion
from successive, unconstrained image pairs. The architecture is composed of
multiple stacked encoder-decoder networks, the core part being an iterative
network that is able to improve its own predictions. The network estimates not
only depth and motion, but additionally surface normals, optical flow between
the images and confidence of the matching. A crucial component of the approach
is a training loss based on spatial relative differences. Compared to
traditional two-frame structure from motion methods, results are more accurate
and more robust. In contrast to the popular depth-from-single-image networks,
DeMoN learns the concept of matching and, thus, better generalizes to
structures not seen during training.
Comment: Camera ready version for CVPR 2017. Supplementary material included.
Project page:
http://lmb.informatik.uni-freiburg.de/people/ummenhof/depthmotionnet
Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models
The empirical best linear unbiased prediction (EBLUP) method uses a linear mixed
model to combine information from different sources. This
method is particularly useful in small area problems. The variability of an
EBLUP is traditionally measured by the mean squared prediction error (MSPE),
and interval estimates are generally constructed using estimates of the MSPE.
Such methods have shortcomings like under-coverage or over-coverage, excessive
length and lack of interpretability. We propose a parametric bootstrap approach
to estimate the entire distribution of a suitably centered and scaled EBLUP.
The bootstrap histogram is highly accurate, and differs from the true EBLUP
distribution by only $O(d^3 n^{-3/2})$, where $d$ is the number of parameters
and $n$ the number of observations. This result is used to obtain highly
accurate prediction intervals. Simulation results demonstrate the superiority
of this method over existing techniques of constructing prediction intervals in
linear mixed models.
Comment: Published at http://dx.doi.org/10.1214/07-AOS512 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
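The idea of bootstrapping a centered and scaled estimator from a fitted parametric model can be sketched on a much simpler problem. The example below is an assumption-laden toy: it uses a normal mean instead of an EBLUP in a linear mixed model, and the function name and defaults are hypothetical.

```python
import random
import statistics


def parametric_bootstrap_interval(data, n_boot=2000, level=0.95, seed=0):
    """Interval for a normal mean from a parametric bootstrap of the pivot.

    Toy stand-in for the EBLUP setting: fit the model, simulate new data sets
    from the fitted model, and record the centered and scaled estimator.
    """
    rng = random.Random(seed)
    n = len(data)
    mu_hat = statistics.fmean(data)
    sd_hat = statistics.stdev(data)

    pivots = []
    for _ in range(n_boot):
        # Simulate from the fitted model, not by resampling the data.
        sample = [rng.gauss(mu_hat, sd_hat) for _ in range(n)]
        mu_star = statistics.fmean(sample)
        sd_star = statistics.stdev(sample)
        pivots.append((mu_star - mu_hat) / (sd_star / n ** 0.5))
    pivots.sort()

    alpha = 1.0 - level
    q_low = pivots[int((alpha / 2) * n_boot)]
    q_high = pivots[int((1 - alpha / 2) * n_boot) - 1]
    se_hat = sd_hat / n ** 0.5
    # Invert the pivot: mu lies in [mu_hat - q_high * se, mu_hat - q_low * se].
    return mu_hat - q_high * se_hat, mu_hat - q_low * se_hat


data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
low, high = parametric_bootstrap_interval(data)
```

The interval is built from bootstrap quantiles of the pivot and not from a normal approximation with an estimated standard error, which mirrors the contrast the abstract draws with MSPE-based intervals.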
Boosted Beta regression.
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fitting a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established yet unstable approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
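A minimal sketch of the likelihood that underlies beta regression, assuming the common mean-precision parameterization y ~ Beta(mu * phi, (1 - mu) * phi) with a logit link for the mean. The abstract does not state the parameterization, so this choice, the single covariate, and the function names are assumptions. The sketch covers only the likelihood, not the boosting algorithm.

```python
import math


def beta_loglik(y, x, beta0, beta1, phi):
    """Log-likelihood of a beta regression with one covariate.

    Assumed model: logit(mu_i) = beta0 + beta1 * x_i and
    y_i ~ Beta(mu_i * phi, (1 - mu_i) * phi), with 0 < y_i < 1 and phi > 0.
    """
    total = 0.0
    for yi, xi in zip(y, x):
        mu = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * xi)))
        a, b = mu * phi, (1.0 - mu) * phi
        total += (
            math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(yi)
            + (b - 1.0) * math.log(1.0 - yi)
        )
    return total


def beta_variance(mu, phi):
    """Variance implied by the mean-precision parameterization."""
    return mu * (1.0 - mu) / (1.0 + phi)
```

Because the variance depends on both mu and phi, letting phi vary with covariates is one way to model the variance of a percentage response alongside its mean.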
Approximately unbiased tests of regions using multistep-multiscale bootstrap resampling
Approximately unbiased tests based on bootstrap probabilities are considered
for the exponential family of distributions with unknown expectation parameter
vector, where the null hypothesis is represented as an arbitrary-shaped region
with smooth boundaries. This problem has been discussed previously in Efron and
Tibshirani [Ann. Statist. 26 (1998) 1687-1718], and a corrected p-value with
second-order asymptotic accuracy is calculated by the two-level bootstrap of
Efron, Halloran and Holmes [Proc. Natl. Acad. Sci. U.S.A. 93 (1996)
13429-13434] based on the ABC bias correction of Efron [J. Amer. Statist.
Assoc. 82 (1987) 171-185]. Our argument is an extension of their asymptotic
theory, where the geometry, such as the signed distance and the curvature of
the boundary, plays an important role. We give another calculation of the
corrected p-value without finding the ``nearest point'' on the boundary to the
observation, which is required in the two-level bootstrap and is an
implementational burden in complicated problems. The key idea is to alter the
sample size of the replicated dataset from that of the observed dataset. The
frequency of the replicates falling in the region is counted for several sample
sizes, and then the p-value is calculated by looking at the change in the
frequencies along the changing sample sizes. This is the multiscale bootstrap
of Shimodaira [Systematic Biology 51 (2002) 492-508], which is third-order
accurate for the multivariate normal model. Here we introduce a newly devised
multistep-multiscale bootstrap, calculating a third-order accurate p-value for
the exponential family of distributions.
Comment: Published at http://dx.doi.org/10.1214/009053604000000823 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
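The idea of counting how often replicates fall in the region at several sample sizes can be sketched for the simplest case, a bivariate normal model. This is a sketch of the plain multiscale bootstrap under stated assumptions, not the multistep-multiscale method of the paper: the region (a disc), the scale grid, and the linear fit psi(s2) = v + c * s2 with corrected p-value 1 - Phi(v - c) are all assumptions of this illustration. Here s2 stands for the ratio n / n' of the observed to the replicate sample size.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def in_region(point, radius=2.0):
    """Hypothetical null region: a disc of the given radius around the origin."""
    return point[0] ** 2 + point[1] ** 2 <= radius ** 2


def multiscale_bootstrap(y, scales=(0.5, 0.75, 1.0, 1.25, 1.5),
                         n_boot=4000, seed=1):
    """Return (v, c, bp, au_p) for observation y in the bivariate normal model."""
    rng = random.Random(seed)
    psi = []
    for s2 in scales:
        sigma = s2 ** 0.5
        # Replicates y* = y + sigma * eps; count how many land in the region.
        hits = sum(
            in_region((y[0] + sigma * rng.gauss(0, 1),
                       y[1] + sigma * rng.gauss(0, 1)))
            for _ in range(n_boot)
        )
        # Keep the frequency strictly inside (0, 1) so inv_cdf is defined.
        bp = min(max(hits / n_boot, 0.5 / n_boot), 1 - 0.5 / n_boot)
        psi.append(sigma * STD_NORMAL.inv_cdf(1.0 - bp))

    # Least-squares fit of psi against s2: intercept v, slope c.
    x_mean, p_mean = fmean(scales), fmean(psi)
    c = (sum((x - x_mean) * (p - p_mean) for x, p in zip(scales, psi))
         / sum((x - x_mean) ** 2 for x in scales))
    v = p_mean - c * x_mean

    bp_fit = 1.0 - STD_NORMAL.cdf(v + c)  # ordinary bootstrap probability at s2 = 1
    au_p = 1.0 - STD_NORMAL.cdf(v - c)    # corrected p-value from the fitted v and c
    return v, c, bp_fit, au_p


v, c, bp, au_p = multiscale_bootstrap((2.5, 0.0))
```

In this sketch v plays the role of the signed distance to the boundary and c the role of the curvature term, so no nearest point on the boundary has to be located.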
Intrinsic data depth for Hermitian positive definite matrices
Nondegenerate covariance, correlation and spectral density matrices are
necessarily symmetric or Hermitian and positive definite. The main contribution
of this paper is the development of statistical data depths for collections of
Hermitian positive definite matrices by exploiting the geometric structure of
the space as a Riemannian manifold. The depth functions allow one to naturally
characterize most central or outlying matrices, but also provide a practical
framework for inference in the context of samples of positive definite
matrices. First, the desired properties of an intrinsic data depth function
acting on the space of Hermitian positive definite matrices are presented.
Second, we propose two computationally fast pointwise and integrated data depth
functions that satisfy each of these requirements and investigate several
robustness and efficiency aspects. As an application, we construct depth-based
confidence regions for the intrinsic mean of a sample of positive definite
matrices, which are applied to the exploratory analysis of a collection of
covariance matrices associated with a multicenter research trial.
HD 174884: a strongly eccentric, short-period early-type binary system discovered by CoRoT
Accurate photometric CoRoT space observations of a secondary seismological
target, HD 174884, led to the discovery that this star is an astrophysically
important double-lined eclipsing spectroscopic binary in an eccentric orbit (e
of about 0.3), unusual for its short (3.65705d) orbital period. The high
eccentricity, coupled with the orientation of the binary orbit in space,
explains the very unusual observed light curve with strongly unequal primary
and secondary eclipses having the depth ratio of 1-to-100 in the CoRoT 'seismo'
passband. Without the high accuracy of the CoRoT photometry, the secondary
eclipse, 1.5 mmag deep, would have gone unnoticed. A spectroscopic follow-up
program provided 45 high dispersion spectra. The analysis of the CoRoT light
curve was performed with an adapted version of PHOEBE that supports CoRoT
passbands. The final solution was obtained by simultaneous fitting of the light
and the radial velocity curves. Individual star spectra were derived by
spectrum disentangling. The uncertainties of the fit were derived by bootstrap
resampling and the solution uniqueness was tested by heuristic scanning. The
results provide a consistent picture of the system composed of two late B
stars. The Fourier analysis of the light curve fit residuals yields two
components, with orbital frequency multiples and an amplitude of about 0.1
mmag, which are tentatively interpreted as tidally induced pulsations. An
extensive comparison with theoretical models is carried out by means of the
Levenberg-Marquardt minimization technique and the discrepancy between models
and the derived parameters is discussed. The best fitting models yield a young
system age of 125 million years which is consistent with the eccentric orbit
and synchronous component rotation at periastron.
Comment: 15 pages, 12 figures. Accepted for publication by A&
Characterization of the frequency of extreme events by the Generalized Pareto Distribution
Based on recent results in extreme value theory, we use a new technique for
the statistical estimation of distribution tails. Specifically, we use the
Gnedenko-Pickands-Balkema-de Haan theorem, which gives a natural limit law for
peak-over-threshold values in the form of the Generalized Pareto Distribution
(GPD). The GPD is useful in finance, insurance and hydrology; here we investigate the
earthquake energy distribution described by the Gutenberg-Richter seismic
moment-frequency law and analyze shallow earthquakes (depth h < 70 km) in the
Harvard catalog over the period 1977-2000 in 18 seismic zones. The whole GPD is
found to approximate the tails of the seismic moment distributions quite well
for moment-magnitudes larger than mW=5.3, and no statistically significant
regional difference is found for subduction and transform seismic zones. We
confirm that the b-value is very different in mid-ocean ridges compared to
other zones (b=1.50±0.09 versus b=1.00±0.05 corresponding to a power law
exponent close to 1 versus 2/3) with a very high statistical confidence. We
propose a physical mechanism for this, contrasting slow healing ruptures in
mid-ocean ridges with fast healing ruptures in other zones. Deviations from the
GPD at the very end of the tail are detected in the sample containing
earthquakes from all major subduction zones (sample size of 4985 events). We
propose a new statistical test of significance of such deviations based on the
bootstrap method. The number of events deviating from the tails of GPD in the
studied data sets (15-20 at most) is not sufficient for determining the
functional form of those deviations. Thus, it is practically impossible to give
preference to one of the previously suggested parametric families describing
the ends of tails of seismic moment distributions.
Comment: pdf document of 21 pages + 2 tables + 20 figures (ps format) + one
file giving the regionalizatio
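Two quantities in the abstract above can be made concrete with a short sketch: the GPD tail probability for excesses over a threshold, and the conversion from a b-value to the power-law exponent of the seismic moment distribution. The conversion assumes the standard scaling of moment magnitude as (2/3) * log10(moment); the function names are hypothetical.

```python
import math


def gpd_survival(x, xi, scale):
    """P(X - u > x | X > u) under a GPD with shape xi and scale > 0, for x >= 0."""
    if xi == 0.0:
        return math.exp(-x / scale)  # exponential limit of the GPD
    base = 1.0 + xi * x / scale
    if base <= 0.0:
        return 0.0  # beyond the finite upper endpoint when xi < 0
    return base ** (-1.0 / xi)


def moment_exponent_from_b(b):
    """Power-law exponent of the moment distribution implied by a b-value.

    Assumes moment magnitude scales as (2/3) * log10(moment), so that the
    number of events with moment above M behaves like M ** (-2 * b / 3).
    """
    return 2.0 * b / 3.0


ridge_exponent = moment_exponent_from_b(1.5)
other_exponent = moment_exponent_from_b(1.0)
```

Under this assumption b=1.50 maps to an exponent of 1 and b=1.00 to 2/3, which matches the pair of values quoted in the abstract.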