
    Scalable Probabilistic Similarity Ranking in Uncertain Databases (Technical Report)

    This paper introduces a scalable approach for probabilistic top-k similarity ranking on uncertain vector data. Each uncertain object is represented by a set of vector instances that are assumed to be mutually exclusive. The objective is to rank the uncertain data according to their distance to a reference object. We propose a framework that incrementally computes, for each object instance and ranking position, the probability of the object falling at that ranking position. The resulting rank probability distribution can serve as input for several state-of-the-art probabilistic ranking models. Existing approaches compute this distribution with a dynamic programming approach of quadratic complexity. In this paper we show, both theoretically and experimentally, that our framework reduces this to linear-time complexity with the same memory requirements, facilitated by accessing the uncertain vector instances incrementally in increasing order of their distance to the reference object. Furthermore, we show how the output of our method can be used to apply probabilistic top-k ranking to the objects according to different state-of-the-art definitions. We conduct an experimental evaluation on synthetic and real data, which demonstrates the efficiency of our approach.
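
As context for the complexity claim above, here is a minimal sketch of the standard quadratic dynamic program that such frameworks improve upon: for one candidate instance, each competing object is folded in with its probability of having an instance closer to the reference object, yielding the candidate's distribution over ranking positions (a Poisson-binomial distribution). The function name and the per-object probabilities are illustrative, not taken from the paper.

```python
def rank_distribution(p_closer):
    """Baseline quadratic DP: given p_closer[i] = probability that competing
    object i has an instance closer to the reference object than the
    candidate, return the candidate's distribution over ranking positions
    (a Poisson-binomial distribution over the number of closer objects)."""
    dist = [1.0]  # P(0 objects closer) = 1 before any object is folded in
    for p in p_closer:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1.0 - p)   # object is farther: rank unchanged
            new[k + 1] += q * p       # object is closer: rank shifts by one
        dist = new
    return dist  # dist[k] = P(candidate falls at ranking position k + 1)
```

Folding in n competing objects this way costs O(n^2) per candidate; the linear-time result described above comes from exploiting incremental, distance-sorted access to the instances rather than recomputing this DP from scratch.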

    The Flexible Group Spatial Keyword Query

    We present a new class of service for location-based social networks, called the Flexible Group Spatial Keyword Query, which enables a group of users to collectively find a point of interest (POI) that optimizes an aggregate cost function combining both spatial distances and keyword similarities. In addition, our query service allows users to consider the tradeoffs between obtaining a sub-optimal solution for the entire group and obtaining an optimized solution for only a subgroup. We propose algorithms to process three variants of the query: (i) the group nearest neighbor with keywords query, which finds a POI that optimizes the aggregate cost function for the whole group of size n; (ii) the subgroup nearest neighbor with keywords query, which finds the optimal subgroup and a POI that optimizes the aggregate cost function for a given subgroup size m (m <= n); and (iii) the multiple subgroup nearest neighbor with keywords query, which finds optimal subgroups and corresponding POIs for each of the subgroup sizes in the range [m, n]. We design query processing algorithms based on branch-and-bound and best-first paradigms. Finally, we provide theoretical bounds and conduct extensive experiments with two real datasets, which verify the effectiveness and efficiency of the proposed algorithms.
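
A rough sketch of the kind of aggregate cost function such a query optimizes; the specific weighting (a linear blend) and the keyword similarity measure (Jaccard) are assumptions for illustration, not the paper's definitions.

```python
import math

def aggregate_cost(users, poi, alpha=0.5, agg=max):
    """Illustrative group cost for a POI: each user's cost blends spatial
    distance with keyword dissimilarity via weight alpha, and agg (e.g. max
    or sum) aggregates the per-user costs over the group.
    users: list of ((x, y), set_of_keywords); poi: ((x, y), set_of_keywords).
    """
    (px, py), pkw = poi
    costs = []
    for (ux, uy), ukw in users:
        d = math.hypot(ux - px, uy - py)
        union = ukw | pkw
        sim = len(ukw & pkw) / len(union) if union else 1.0  # Jaccard similarity
        costs.append(alpha * d + (1.0 - alpha) * (1.0 - sim))
    return agg(costs)
```

Note that for a fixed POI under sum aggregation, the best subgroup of a given size m is simply the m users with the smallest per-user costs, which hints at why the subgroup variants (ii) and (iii) remain tractable.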

    Solving Large-Scale Minimum-Weight Triangulation Instances to Provable Optimality

    We consider practical methods for the problem of finding a minimum-weight triangulation (MWT) of a planar point set, a classic problem of computational geometry with many applications. While Mulzer and Rote proved in 2006 that computing an MWT is NP-hard, Beirouti and Snoeyink showed in 1998 that computing provably optimal solutions for MWT instances of up to 80,000 uniformly distributed points is possible, making use of clever heuristics based on geometric insights. We show that these techniques can be refined and extended to much larger instances of different types, based on an array of modifications and parallelizations in combination with more efficient geometric encodings and data structures. As a result, we are able to solve MWT instances with up to 30,000,000 uniformly distributed points in less than 4 minutes to provable optimality. Moreover, we can compute optimal solutions for a vast array of other benchmark instances that are not uniformly distributed, including normally distributed instances (up to 30,000,000 points), all point sets in the TSPLIB (up to 85,900 points), and VLSI instances with up to 744,710 points. This demonstrates that, from a practical point of view, MWT instances can be handled quite well despite their theoretical difficulty.
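
To make the MWT objective concrete, here is the classic O(n^3) interval DP for the tractable special case of a convex polygon, minimizing the total length of the internal diagonals added. This only illustrates the objective; MWT of general point sets, the problem addressed above, is NP-hard and requires the geometric pruning techniques the paper builds on.

```python
import math
from functools import lru_cache

def mwt_convex(points):
    """Minimum-weight triangulation of a CONVEX polygon whose vertices are
    given in boundary order, via interval DP. Returns the total length of
    the internal diagonals used (polygon edges are free)."""
    n = len(points)

    def d(i, j):
        (xi, yi), (xj, yj) = points[i], points[j]
        return math.hypot(xi - xj, yi - yj)

    @lru_cache(maxsize=None)
    def cost(i, j):
        if j - i < 2:            # fewer than 3 vertices: nothing to triangulate
            return 0.0
        best = math.inf
        for k in range(i + 1, j):  # apex of the triangle on chord (i, j)
            w = cost(i, k) + cost(k, j)
            if k - i > 1:
                w += d(i, k)     # (i, k) is a diagonal, not a polygon edge
            if j - k > 1:
                w += d(k, j)     # (k, j) is a diagonal, not a polygon edge
            best = min(best, w)
        return best

    return cost(0, n - 1)
```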

    Uncertain voronoi cell computation based on space decomposition

    Published in LNCS v. 9239: Advances in Spatial and Temporal Databases: 14th International Symposium, SSTD 2015 ... Proceedings. The problem of computing Voronoi cells for spatial objects whose locations are uncertain has recently been studied. In this work, we propose a new approach to compute Voronoi cells for objects with rectangular uncertainty regions. Since exact computation of Voronoi cells is hard, we propose an approximate solution. The main idea of this solution is to apply hierarchical access methods to both the data and the object space. Our space index is used to efficiently find spatial regions which must (not) be inside a Voronoi cell. Our object index is used to efficiently identify Delaunay relations, i.e., data objects which affect the shape of a Voronoi cell. We develop three algorithms to explore the index structures and show that the approach that descends both index structures in parallel yields the fastest query processing times. Our experiments show that we are able to approximate uncertain Voronoi cells much more effectively than the state-of-the-art and, at the same time, improve run-time performance.
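
A minimal sketch of the kind of min/max-distance pruning rule that such hierarchical decompositions rely on to decide that a space region must NOT be inside an object's Voronoi cell. The rectangle distance bounds are standard; the paper's exact pruning criteria may differ.

```python
import math

def min_dist(r1, r2):
    """Smallest possible distance between points of two axis-aligned
    rectangles, each given as ((xlo, ylo), (xhi, yhi))."""
    (ax0, ay0), (ax1, ay1) = r1
    (bx0, by0), (bx1, by1) = r2
    dx = max(bx0 - ax1, ax0 - bx1, 0.0)
    dy = max(by0 - ay1, ay0 - by1, 0.0)
    return math.hypot(dx, dy)

def max_dist(r1, r2):
    """Largest possible distance between points of the two rectangles."""
    (ax0, ay0), (ax1, ay1) = r1
    (bx0, by0), (bx1, by1) = r2
    dx = max(ax1 - bx0, bx1 - ax0)
    dy = max(ay1 - by0, by1 - ay0)
    return math.hypot(dx, dy)

def certainly_outside_cell(region, obj_a, obj_b):
    """Every point of `region` is closer to uncertain object B than to A,
    for every possible instance of A and B: even B's farthest point beats
    A's nearest point. Such regions cannot belong to A's Voronoi cell and
    can be pruned without refinement."""
    return max_dist(region, obj_b) < min_dist(region, obj_a)
```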

    Information system for image classification based on frequency curve proximity

    Given the size that digital collections are currently reaching, retrieving the best match for a document from a large collection by comparing hundreds of tags is a task of considerable algorithmic complexity, even more so when the number of tags in the collection is not fixed. For these cases, similarity search appears to be the best retrieval method, but there is a lack of techniques suited to these conditions. This work presents a combination of machine learning algorithms put together to find the object most similar to a given one in a set of pre-processed objects, based only on their metadata tags. The algorithm represents objects as character frequency curves and is capable of finding relationships between objects without an apparent association. It can also be parallelized using MapReduce strategies to perform the search. The method can be applied to a wide variety of documents with metadata tags. The case study used in this work to demonstrate the similarity search technique is a collection of image objects in JavaScript Object Notation (JSON) containing metadata tags. This work has been done in the context of the project “ASASEC (Advisory System Against Sexual Exploitation of Children)” (HOME/2010/ISEC/AG/043), supported by the European Union through the programme “Prevention and fight against crime”.
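
A toy sketch of the character-frequency-curve idea described above, under assumed choices of alphabet, normalization, and distance measure (the paper's exact definitions may differ): each object's tags are flattened into a normalized per-character frequency vector, and the nearest curve wins.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"  # assumed alphabet

def frequency_curve(tags):
    """Represent an object's metadata tags as a character frequency curve:
    the relative frequency of each alphanumeric character across all tag
    strings, in a fixed alphabet order."""
    text = "".join(tags).lower()
    counts = Counter(c for c in text if c.isalnum())
    total = sum(counts.values()) or 1
    return [counts.get(c, 0) / total for c in ALPHABET]

def curve_distance(c1, c2):
    """Proximity between two curves; plain L1 distance is assumed here."""
    return sum(abs(a - b) for a, b in zip(c1, c2))

def most_similar(query_tags, collection):
    """Return the key of the collection object whose curve is closest to
    the query's curve. A serial stand-in for the MapReduce search."""
    q = frequency_curve(query_tags)
    return min(collection,
               key=lambda k: curve_distance(q, frequency_curve(collection[k])))
```

In the MapReduce formulation mentioned above, the per-object curve construction and distance computation would be the map step and the global minimum the reduce step.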

    Application of Strand-Cartesian Interfaced Solver on Flows Around Various Geometries

    This work examines the application of a high-order numerical method to strand-based grids to solve the Navier-Stokes equations. Coined Flux Correction, this method eliminates error terms in the fluxes of traditional second-order finite-volume Galerkin methods. Flux Correction is first examined for application to the Reynolds-Averaged Navier-Stokes equations to compute turbulent flows on a strictly strand-based domain. Flows over three geometries are examined to demonstrate the method’s capabilities: a three-dimensional bump, an infinite wing, and a hemisphere-cylinder configuration. Comparison with results obtained from established codes shows that the turbulent Flux Correction scheme accurately predicts flow properties such as pressure, velocity profiles, and shock location and strength. However, an overset Cartesian solver is necessary to capture certain flow properties in the wake region more accurately. The Strand-Cartesian Interface Manager (SCIM) uses a combination of second-order trilinear interpolation and mixed-order Lagrange interpolation to establish domain connectivity between the overset grids. Verification of the high-order SCIM code is conducted through the method of manufactured solutions. Steady and unsteady flow around a sphere is used to validate the SCIM library. The method is found to have a combined order of accuracy of approximately 2.5, with improved accuracy for steady cases. However, for unsteady cases the method fails to accurately predict the time-dependent flow field.
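
The second-order trilinear interpolation used for overset-grid connectivity can be sketched generically as follows; the donor-cell search and the mixed-order Lagrange part of SCIM are omitted, and the function and argument names are illustrative, not SCIM's API.

```python
def trilinear(f, x, y, z):
    """Trilinear interpolation inside a unit cube: f maps the 8 corner
    indices (i, j, k) in {0, 1}^3 to solution values, and (x, y, z) lies in
    [0, 1]^3 (local donor-cell coordinates). Each corner contributes with a
    weight equal to the product of its 1D hat-function weights."""
    val = 0.0
    for i in (0, 1):
        wx = x if i else (1.0 - x)
        for j in (0, 1):
            wy = y if j else (1.0 - y)
            for k in (0, 1):
                wz = z if k else (1.0 - z)
                val += f[(i, j, k)] * wx * wy * wz
    return val
```

Trilinear interpolation reproduces any linear field exactly, which is why it delivers the second-order accuracy quoted above; capturing higher-order variation across the interface is what the mixed-order Lagrange interpolation is for.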

    Approximate NN Queries on Streams with Guaranteed Error/Performance Bounds
