Indexing the Earth Mover's Distance Using Normal Distributions
Querying uncertain data sets (represented as probability distributions)
presents many challenges due to the large amount of data involved and the
difficulties comparing uncertainty between distributions. The Earth Mover's
Distance (EMD) has increasingly been employed to compare uncertain data due to
its ability to effectively capture the differences between two distributions.
Computing the EMD entails finding a solution to the transportation problem,
which is computationally intensive. In this paper, we propose a new lower bound
to the EMD and an index structure to significantly improve the performance of
EMD-based K-nearest neighbor (K-NN) queries on uncertain databases. The lower
bound approximates the EMD on a projection vector.
Each distribution is projected onto a vector and approximated by a normal
distribution, as well as an accompanying error term. We then represent each
normal as a point in a Hough transformed space. We then use the concept of
stochastic dominance to implement an efficient index structure in the
transformed space. We show that our method significantly decreases K-NN query
time on uncertain databases. The index structure also scales well with database
cardinality. It is well suited for heterogeneous data sets, helping to keep EMD
based queries tractable as uncertain data sets become larger and more complex. Comment: VLDB201
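The projection idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's method: it skips the normal approximation, the error term, and the Hough-space index, and only shows the well-known fact that, under a Euclidean ground distance, the EMD between two distributions projected onto a unit vector never exceeds their full EMD. The function names (`emd_1d`, `project`, `projected_lower_bound`) are hypothetical, and the distributions are assumed to be discrete with equal total mass.

```python
import math


def emd_1d(xs, ws, ys, vs):
    """EMD between two 1-D discrete distributions with equal total mass.

    In one dimension the EMD equals the area between the two CDFs,
    so a single sweep over the sorted support points is enough.
    """
    events = sorted(
        [(x, w) for x, w in zip(xs, ws)] + [(y, -v) for y, v in zip(ys, vs)]
    )
    cost = 0.0
    cdf_gap = 0.0          # CDF of first distribution minus CDF of second
    prev = events[0][0]
    for pos, delta in events:
        cost += abs(cdf_gap) * (pos - prev)
        cdf_gap += delta
        prev = pos
    return cost


def project(points, direction):
    """Scalar projection of each point onto the (normalised) direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    return [sum(p_i * d_i for p_i, d_i in zip(p, direction)) / norm
            for p in points]


def projected_lower_bound(pts_a, w_a, pts_b, w_b, direction):
    """Lower bound on the Euclidean-ground-distance EMD via one projection."""
    return emd_1d(project(pts_a, direction), w_a,
                  project(pts_b, direction), w_b)


# Two toy 2-D distributions: B is A shifted up by one unit, so the true EMD is 1.
pts_a, w_a = [(0.0, 0.0), (1.0, 0.0)], [0.5, 0.5]
pts_b, w_b = [(0.0, 1.0), (1.0, 1.0)], [0.5, 0.5]

print(projected_lower_bound(pts_a, w_a, pts_b, w_b, (0.0, 1.0)))
print(projected_lower_bound(pts_a, w_a, pts_b, w_b, (1.0, 0.0)))
```

In this toy case the vertical projection recovers the full distance, while the horizontal projection gives a trivial bound of zero, which suggests why the choice of projection vector matters for pruning.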
Line-distortion, Bandwidth and Path-length of a graph
We investigate the minimum line-distortion and the minimum bandwidth problems
on unweighted graphs and their relations with the minimum length of a
Robertson-Seymour's path-decomposition. The length of a path-decomposition of a
graph is the largest diameter of a bag in the decomposition. The path-length of
a graph is the minimum length over all its path-decompositions. In particular,
we show:
- if a graph G can be embedded into the line with distortion k, then G
admits a Robertson-Seymour's path-decomposition with bags of diameter at most
k in G;
- for every class of graphs with path-length bounded by a constant, there
exist an efficient constant-factor approximation algorithm for the minimum
line-distortion problem and an efficient constant-factor approximation
algorithm for the minimum bandwidth problem;
- there is an efficient 2-approximation algorithm for computing the
path-length of an arbitrary graph;
- AT-free graphs and some intersection families of graphs have path-length at
most 2;
- for AT-free graphs, there exist a linear time 8-approximation algorithm for
the minimum line-distortion problem and a linear time 4-approximation algorithm
for the minimum bandwidth problem.
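The definition of the length of a path-decomposition given above can be made concrete with a short sketch. It assumes a connected, unweighted graph stored as an adjacency dictionary, takes the sequence of bags as given without checking that it is a valid path-decomposition, and measures the diameter of a bag by distances in the whole graph. The names `bfs_distances` and `decomposition_length` are hypothetical. Computing the path-length itself (the minimum over all decompositions) is a different and harder task that this sketch does not attempt.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def decomposition_length(adj, bags):
    """Largest bag diameter, with distances measured in the whole graph.

    Assumes the graph is connected and that `bags` is already a valid
    path-decomposition; neither property is verified here.
    """
    length = 0
    for bag in bags:
        for u in bag:
            dist = bfs_distances(adj, u)
            length = max(length, max(dist[v] for v in bag))
    return length


# A 4-cycle 0-1-2-3-0 with bags {0,1,3} and {1,2,3}.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
cycle_bags = [{0, 1, 3}, {1, 2, 3}]

# A path 0-1-2-3 with one bag per edge.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
path_bags = [{0, 1}, {1, 2}, {2, 3}]

print(decomposition_length(cycle, cycle_bags))
print(decomposition_length(path, path_bags))
```

Running one breadth-first search per bag vertex keeps the sketch simple; for large graphs a precomputed all-pairs distance table would avoid repeated searches.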
Exploring the magnetic field complexity in M dwarfs at the boundary to full convection
Based on detailed spectral synthesis we carry out quantitative measurements
of the strength and complexity of surface magnetic fields in the four
well-known M-dwarfs GJ 388, GJ 729, GJ 285, and GJ 406 populating the mass
regime around the boundary between partially and fully convective stars. Very
high resolution R=100000, high signal-to-noise (up to 400) near-infrared Stokes
I spectra were obtained with CRIRES at ESO's Very Large Telescope covering
regions of the FeH Wing-Ford transitions at 1 μm. The field distributions in
all four stars are characterized by three distinct groups of field components;
the data are neither consistent with a smooth distribution of different field
strengths, nor with one average field strength covering the full star. We find
evidence of a subtle difference in the field distribution of GJ 285 compared to
the other three targets. GJ 285 also has the highest average field of 3.5kG and
the strongest maximum field component of 7-7.5kG. The maximum local field
strengths in our sample seem to be correlated with rotation rate. While the
average field strength is saturated, the maximum local field strengths in our
sample show no evidence for saturation. We find no difference between the field
distributions of partially and fully convective stars. The one star with
evidence for a field distribution different to the other three is the most
active star (i.e. with largest x-ray luminosity and mean surface magnetic
field) rotating relatively fast. A possible explanation is that rotation
determines the distribution of surface magnetic fields, and that local field
strengths grow with rotation even in stars in which the average field is
already saturated. Comment: 15 pages, 8 figure