Indexing the Earth Mover's Distance Using Normal Distributions
Querying uncertain data sets (represented as probability distributions)
presents many challenges due to the large amount of data involved and the
difficulty of comparing uncertainty between distributions. The Earth Mover's
Distance (EMD) has increasingly been employed to compare uncertain data due to
its ability to effectively capture the differences between two distributions.
Computing the EMD entails finding a solution to the transportation problem,
which is computationally intensive. In this paper, we propose a new lower bound
to the EMD and an index structure that significantly improve the performance of
EMD-based K-nearest neighbor (K-NN) queries on uncertain databases. The lower
bound approximates the EMD on a projection vector: each distribution is
projected onto a vector and approximated by a normal distribution together with
an accompanying error term. Each normal is then represented as a point in a
Hough-transformed space, where the concept of stochastic dominance is used to
implement an efficient index structure. We show that our method significantly decreases K-NN query
time on uncertain databases. The index structure also scales well with database
cardinality. It is well suited for heterogeneous data sets, helping to keep
EMD-based queries tractable as uncertain data sets become larger and more complex.
Comment: VLDB201
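
The projection idea in the abstract above can be illustrated with a minimal sketch. It assumes a Euclidean ground distance and discrete distributions with unit total mass, and it shows only the general fact that the EMD between one-dimensional projections never exceeds the EMD in the original space. The paper's normal approximation, error term, and Hough-space index are not reproduced here, and the helper names (`emd_1d`, `project`, `projected_emd`) are hypothetical.

```python
import math

def emd_1d(xs, ws, ys, vs):
    """EMD between two weighted point sets on a line (each with total mass 1).

    In one dimension the EMD equals the area between the two CDFs.
    """
    # Source mass counts as positive, target mass as negative.
    events = sorted([(x, w) for x, w in zip(xs, ws)] +
                    [(y, -v) for y, v in zip(ys, vs)])
    total, cdf_gap = 0.0, 0.0
    for (pos, mass), (next_pos, _) in zip(events, events[1:]):
        cdf_gap += mass  # running difference of the two CDFs after this event
        total += abs(cdf_gap) * (next_pos - pos)
    return total

def project(points, direction):
    """Scalar projection of each point onto the unit vector along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    return [sum(p * d for p, d in zip(point, direction)) / norm
            for point in points]

def projected_emd(p_pts, p_wts, q_pts, q_wts, direction):
    """EMD of the two distributions after projecting both onto `direction`."""
    return emd_1d(project(p_pts, direction), p_wts,
                  project(q_pts, direction), q_wts)

# Q is P shifted by (3, 4), so the true EMD under Euclidean distance is 5.
P = [(0.0, 0.0), (1.0, 0.0)]
Q = [(3.0, 4.0), (4.0, 4.0)]
W = [0.5, 0.5]

bound_x = projected_emd(P, W, Q, W, (1.0, 0.0))      # 3.0, a loose bound
bound_shift = projected_emd(P, W, Q, W, (3.0, 4.0))  # approximately 5.0, tight here
```

Both projected values stay at or below the true EMD of 5, which is why such a projection can be used to prune candidates before solving the full transportation problem.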
Combining information from independent sources through confidence distributions
This paper develops new methodology, together with related theories, for
combining information from independent studies through confidence
distributions. A formal definition of a confidence distribution and its
asymptotic counterpart (i.e., asymptotic confidence distribution) are given and
illustrated in the context of combining information. Two general combination
methods are developed: the first along the lines of combining p-values, with
some notable differences in regard to optimality of Bahadur type efficiency;
the second by multiplying and normalizing confidence densities. The latter
approach is inspired by the common approach of multiplying likelihood functions
for combining parametric information. The paper also develops adaptive
combining methods, with supporting asymptotic theory which should be of
practical interest. The key point of the adaptive development is that the
methods attempt to combine only the correct information, downweighting or
excluding studies containing little or wrong information about the true
parameter of interest. The combination methodologies are illustrated in
simulated and real data examples with a variety of applications.
Comment: Published at http://dx.doi.org/10.1214/009053604000001084 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
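
As a rough illustration of the first combination method (along the lines of combining p-values), the sketch below combines confidence distributions for a normal mean using weighted normal scores. This particular transform is an assumption chosen for illustration, since the abstract does not specify one, and the names `normal_cd` and `combine_cds` are hypothetical.

```python
from statistics import NormalDist
import math

STD_NORMAL = NormalDist()

def normal_cd(estimate, std_err):
    """Confidence distribution H(theta) for a normal mean with known std error."""
    return lambda theta: STD_NORMAL.cdf((theta - estimate) / std_err)

def combine_cds(cds, weights):
    """Combine CDs through weighted normal scores (an illustrative choice).

    H_c(theta) = Phi( sum_i w_i * Phi^{-1}(H_i(theta)) / sqrt(sum_i w_i^2) )
    """
    scale = math.sqrt(sum(w * w for w in weights))

    def combined(theta):
        # inv_cdf raises StatisticsError if a CD value rounds to exactly 0 or 1,
        # so theta should stay within a few standard errors of the estimates.
        z = sum(w * STD_NORMAL.inv_cdf(cd(theta)) for cd, w in zip(cds, weights))
        return STD_NORMAL.cdf(z / scale)

    return combined

# Two hypothetical studies: estimates 1.0 (SE 0.5) and 2.0 (SE 1.0).
studies = [(1.0, 0.5), (2.0, 1.0)]
cds = [normal_cd(est, se) for est, se in studies]
weights = [1.0 / se for _, se in studies]
combined_cd = combine_cds(cds, weights)

# With weights 1/SE, the combined CD's median is the inverse-variance
# weighted mean, (4 * 1.0 + 1 * 2.0) / 5 = 1.2.
median_check = combined_cd(1.2)  # approximately 0.5
```

With weights proportional to the inverse standard errors, this combination reproduces the usual inverse-variance weighted estimate in the normal case, which makes it a convenient sanity check.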
Confidence distribution (CD) -- distribution estimator of a parameter
The notion of confidence distribution (CD), an entirely frequentist concept,
is in essence a Neymanian interpretation of Fisher's fiducial distribution. It
contains information related to every kind of frequentist inference. In this
article, a CD is viewed as a distribution estimator of a parameter. This leads
naturally to consideration of the information contained in a CD, comparison of
CDs and optimal CDs, and connection of the CD concept to the (profile)
likelihood function. A formal development of a multiparameter CD is also
presented.
Comment: Published at http://dx.doi.org/10.1214/074921707000000102 in the IMS
Lecture Notes Monograph Series
(http://www.imstat.org/publications/lecnotes.htm) by the Institute of
Mathematical Statistics (http://www.imstat.org
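
To make the "distribution estimator" view concrete, here is a minimal sketch for the textbook case of a normal mean with known standard deviation, where the CD is H(theta) = Phi(sqrt(n) (theta - xbar) / sigma). The sample values and the function name `normal_mean_cd` are assumptions for illustration only.

```python
from statistics import NormalDist
import math

def normal_mean_cd(sample_mean, sigma, n):
    """CD for a normal mean with known sigma: N(sample_mean, sigma^2 / n)."""
    return NormalDist(mu=sample_mean, sigma=sigma / math.sqrt(n))

# Hypothetical data summary: mean 10.0 from n = 16 draws with sigma = 2.0.
cd = normal_mean_cd(sample_mean=10.0, sigma=2.0, n=16)

# Point estimate: the median of the CD.
point_estimate = cd.inv_cdf(0.5)

# 95% confidence interval: the 2.5% and 97.5% quantiles of the CD.
interval_95 = (cd.inv_cdf(0.025), cd.inv_cdf(0.975))

# One-sided p-value for H0: theta <= 9 against H1: theta > 9 is H(9).
p_value = cd.cdf(9.0)
```

A single function thus yields a point estimate, confidence intervals at any level, and one-sided p-values, which is the sense in which a CD carries the usual forms of frequentist inference.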
Fast Spinning Pulsars as Probes of Massive Black Holes' Gravity
Dwarf galaxies and globular clusters may contain intermediate mass black
holes ( to solar masses) in their cores. Estimates of
~ neutron stars in the central parsec of the Galaxy and similar numbers
in small elliptical galaxies and globular clusters, along with an estimated high
probability of ms-pulsar formation in those environments, have led many workers
to propose the use of ms-pulsar timing to measure the mass and spin of
intermediate mass black holes. Models of pulsar motion around a rotating black
hole generally assume geodesic motion of a "test" particle in the Kerr metric.
These approaches account for well-known effects like de Sitter precession and
the Lense-Thirring effect but they do not account for the non-linear effect of
the pulsar's stress-energy tensor on the space-time metric. Here we model the
motion of a pulsar near a black hole with the Mathisson-Papapetrou-Dixon (MPD)
equations. Numerical integration of the MPD equations for black holes of mass 2
X , and solar masses shows that the pulsar will not
remain in an orbital plane, with motion vertical to the plane being largest
relative to the orbit's radial dimensions for the lower-mass black holes. The
pulsar's out of plane motion will lead to timing variations that are up to ~10
microseconds different from those predicted by planar orbit models. Such
variations might be detectable in long term observations of millisecond
pulsars. If pulsar signals are used to measure the mass and spin of
intermediate mass black holes on the basis of dynamical models of the received
pulsar signal, then the out-of-plane motion of the pulsar should be part of that
model.
Comment: Accepted by MNRAS March 27, 201