    Indexing the Earth Mover's Distance Using Normal Distributions

    Querying uncertain data sets (represented as probability distributions) presents many challenges due to the large amount of data involved and the difficulty of comparing uncertainty between distributions. The Earth Mover's Distance (EMD) has increasingly been employed to compare uncertain data because it effectively captures the differences between two distributions. Computing the EMD entails solving the transportation problem, which is computationally intensive. In this paper, we propose a new lower bound to the EMD and an index structure that significantly improve the performance of EMD-based K-nearest neighbor (K-NN) queries on uncertain databases. The lower bound approximates the EMD on a projection vector: each distribution is projected onto a vector and approximated by a normal distribution together with an accompanying error term. We then represent each normal as a point in a Hough-transformed space and use the concept of stochastic dominance to implement an efficient index structure in that space. We show that our method significantly decreases K-NN query time on uncertain databases. The index structure also scales well with database cardinality and is well suited to heterogeneous data sets, helping to keep EMD-based queries tractable as uncertain data sets become larger and more complex. Comment: VLDB201
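As a rough illustration of why projecting onto a vector can give a lower bound, the sketch below computes the exact one-dimensional EMD between two projected discrete distributions. It assumes a Euclidean ground distance and equal total mass, and it is a generic projection bound, not the normal-approximation bound or index structure proposed in the paper. The function names `emd_1d` and `projected_lower_bound` are hypothetical.

```python
import math

def emd_1d(xs, ps, ys, qs):
    """Exact 1-D EMD between two discrete distributions of equal total mass.

    Computed as the integral of |CDF_a - CDF_b| over the real line.
    """
    # Mass from the first distribution counts positively, the second negatively.
    events = sorted([(x, p) for x, p in zip(xs, ps)] +
                    [(y, -q) for y, q in zip(ys, qs)])
    total, cdf_diff = 0.0, 0.0
    for (pos, mass), (next_pos, _) in zip(events, events[1:]):
        cdf_diff += mass                      # CDF difference just right of pos
        total += abs(cdf_diff) * (next_pos - pos)
    return total

def projected_lower_bound(points_a, weights_a, points_b, weights_b, direction):
    """Lower bound on the Euclidean-ground-distance EMD via a 1-D projection.

    Projection onto a unit vector never increases distances, so the 1-D EMD
    of the projections cannot exceed the EMD in the original space.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]

    def project(points):
        return [sum(c * u for c, u in zip(p, unit)) for p in points]

    return emd_1d(project(points_a), weights_a, project(points_b), weights_b)
```

For two unit point masses at (0, 0) and (3, 4), whose true EMD is 5, projecting onto (1, 0) gives a bound of 3, while projecting onto (3, 4) recovers the full distance.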

    Line-distortion, Bandwidth and Path-length of a graph

    We investigate the minimum line-distortion and the minimum bandwidth problems on unweighted graphs and their relations with the minimum length of a Robertson-Seymour path-decomposition. The length of a path-decomposition of a graph is the largest diameter of a bag in the decomposition. The path-length of a graph is the minimum length over all its path-decompositions. In particular, we show:
    - if a graph G can be embedded into the line with distortion k, then G admits a Robertson-Seymour path-decomposition with bags of diameter at most k in G;
    - for every class of graphs with path-length bounded by a constant, there exist an efficient constant-factor approximation algorithm for the minimum line-distortion problem and an efficient constant-factor approximation algorithm for the minimum bandwidth problem;
    - there is an efficient 2-approximation algorithm for computing the path-length of an arbitrary graph;
    - AT-free graphs and some intersection families of graphs have path-length at most 2;
    - for AT-free graphs, there exist a linear time 8-approximation algorithm for the minimum line-distortion problem and a linear time 4-approximation algorithm for the minimum bandwidth problem.
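The definition of the length of a path-decomposition can be made concrete with a small sketch. It assumes a connected, unweighted graph given as an adjacency dictionary and a list of bags supplied by the caller. It only measures the largest bag diameter (with distances taken in the whole graph); it does not check that the bags form a valid path-decomposition, and it is not one of the approximation algorithms from the paper. The names `bfs_distances` and `decomposition_length` are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def decomposition_length(adj, bags):
    """Largest diameter of a bag, with distances measured in the whole graph.

    Assumes the graph is connected, so every bag vertex is reachable.
    """
    length = 0
    for bag in bags:
        for u in bag:
            dist = bfs_distances(adj, u)
            length = max(length, max(dist[v] for v in bag))
    return length

# A path on four vertices: 0 - 1 - 2 - 3.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
edge_bags = [{0, 1}, {1, 2}, {2, 3}]   # one bag per edge
single_bag = [{0, 1, 2, 3}]            # trivial decomposition with one bag
```

On this path graph, the edge-per-bag decomposition has length 1, while the single-bag decomposition has length 3, the diameter of the graph.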

    Exploring the magnetic field complexity in M dwarfs at the boundary to full convection

    Based on detailed spectral synthesis, we carry out quantitative measurements of the strength and complexity of surface magnetic fields in the four well-known M dwarfs GJ 388, GJ 729, GJ 285, and GJ 406, which populate the mass regime around the boundary between partially and fully convective stars. Very high resolution (R = 100000), high signal-to-noise (up to 400) near-infrared Stokes I spectra were obtained with CRIRES at ESO's Very Large Telescope, covering regions of the FeH Wing-Ford transitions at 1 μm. The field distributions in all four stars are characterized by three distinct groups of field components; the data are consistent neither with a smooth distribution of different field strengths nor with one average field strength covering the full star. We find evidence of a subtle difference in the field distribution of GJ 285 compared to the other three targets. GJ 285 also has the highest average field of 3.5 kG and the strongest maximum field component of 7-7.5 kG. The maximum local field strengths in our sample seem to be correlated with rotation rate. While the average field strength is saturated, the maximum local field strengths in our sample show no evidence for saturation. We find no difference between the field distributions of partially and fully convective stars. The one star with evidence for a field distribution different from the other three is the most active star (i.e. the one with the largest X-ray luminosity and mean surface magnetic field), which rotates relatively fast. A possible explanation is that rotation determines the distribution of surface magnetic fields, and that local field strengths grow with rotation even in stars in which the average field is already saturated. Comment: 15 pages, 8 figures