Search CORE

6,339 research outputs found

SVS-JOIN : efficient spatial visual similarity join for geo-multimedia

Author: Huang Fang
Yu Hao
Yu Weiren
Zhang Chengyuan
Zhang Zuping
Zhu Lei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/10/2019
Field of study

In the big data era, massive amount of multimedia data with geo-tags has been generated and collected by smart devices equipped with mobile communications module and position sensor module. This trend has put forward higher request on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial database. Previous works focused on spatial textual document search problem, rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to search similar geo-image pairs in both aspects of geo-location and visual content. Firstly, the definition of SVS-JOIN is proposed and then we present the geographical similarity and visual similarity measurement. Inspired by the approach for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm and visual similarity. Besides, an extension of it named SVS-JOIN G is developed, which utilizes spatial grid strategy to improve the search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is carefully designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments are conducted on two geo-image datasets and the results demonstrate that our solution can address the SVS-JOIN problem effectively and efficiently

Warwick Research Archives Portal Repository

An R*-Tree Based Semi-Dynamic Clustering Method for the Efficient Processing of Spatial Join in a Shared-Nothing Parallel Database System

Author: Ganpaa Gayatri
Publication venue: ScholarWorks@UNO
Publication date: 20/01/2006
Field of study

The growing importance of geospatial databases has made it essential to perform complex spatial queries efficiently. To achieve acceptable performance levels, database systems have been increasingly required to make use of parallelism. The spatial join is a computationally expensive operator. Efficient implementation of the join operator is, thus, desirable. The work presented in this document attempts to improve the performance of spatial join queries by distributing the data set across several nodes of a cluster and executing queries across these nodes in parallel. This document discusses a new parallel algorithm that implements the spatial join in an efficient manner. This algorithm is compared to an existing parallel spatial-join algorithm, the clone join. Both algorithms have been implemented on a Beowulf cluster and compared using real datasets. An extensive experimental analysis reveals that the proposed algorithm exhibits superior performance both in declustering time as well as in the execution time of the join query

University of New Orleans

AsterixDB: A Scalable, Open Source BDMS

Author: Alsubaiee Sattam
Altowim Yasser
Altwaijry Hotham
Behm Alexander
Borkar Vinayak
Bu Yingyi
Carey Michael
Cetindil Inci
Cheelangi Madhusudan
Faraaz Khurram
Gabrielova Eugenia
Grover Raman
Heilbron Zachary
Kim Young-Seok
Li Chen
Li Guangqiang
Ok Ji Mahn
Onose Nicola
Pirzadeh Pouria
Tsotras Vassilis
Vernica Rares
Wen Jian
Westmann Till
Publication venue
Publication date: 02/07/2014
Field of study

AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it well-suited to applications like web data warehousing, social data storage and analysis, and other use cases related to Big Data. AsterixDB has a flexible NoSQL style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+-tree, R-tree, and text indexes); support for external as well as natively stored data; a rich set of built-in types; support for fuzzy, spatial, and temporal types and queries; a built-in notion of data feeds for ingestion of data; and transaction support akin to that of a NoSQL store. Development of AsterixDB began in 2009 and led to a mid-2013 initial open source release. This paper is the first complete description of the resulting open source AsterixDB system. Covered herein are the system's data model, its query language, and its software architecture. Also included are a summary of the current status of the project and a first glimpse into how AsterixDB performs when compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform, for things that both technologies can do. Also included is a brief description of some initial trials that the system has undergone and the lessons learned (and plans laid) based on those early "customer" engagements

arXiv.org e-Print Archive

CiteSeerX

The Family of MapReduce and Large Scale Data Processing Systems

Author: Anna Liu
Ayman G. Fayoumi
King Abdulaziz
See Profile
Sherif Sakr
Sherif Sakr
South Wales
South Wales
Publication venue
Publication date: 12/02/2013
Field of study

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program such as issues on data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several followup works after its introduction. This article provides a comprehensive survey for a family of approaches and mechanisms of large scale data processing mechanisms that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of introduced systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author

arXiv.org e-Print Archive

CiteSeerX

Approximate spatio-temporal retrieval

Author: ARGE L.
BACCHUS F.
BACCHUS F.
BERCHTOLD S.
BRINKHOFF T.
BRUNS T.
CHANG S.
Dimitris Papadias
EGENHOFER M.J.
GUTTMAN A.
HARALICK R.M.
HUANG Y. -W.
KOUDAS N.
LEE S.
Nikos Mamoulis
PAPADIAS D.
PAPADIAS D.
PAPADIAS D.
PAPADIAS D.
PARK H.
PREPARATA F.P.
ROTEM D.
ROUSSOPOULOS N.
SEIDL T.
SELLIS T.
Vasilis Delis
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date
Field of study

Crossref

A query processing system for very large spatial databases using a new map algebra

Author: Firouzabadi Seyed-Ali
Publication venue: 'Universite de Sherbrooke'
Publication date: 01/01/2002
Field of study

Dans cette thèse nous introduisons une approche de traitement de requêtes pour des bases de donnée spatiales. Nous expliquons aussi les concepts principaux que nous avons défini et développé: une algèbre spatiale et une approche à base de graphe utilisée dans l'optimisateur. L'algèbre spatiale est défini pour exprimer les requêtes et les règles de transformation pendant les différentes étapes de l'optimisation de requêtes. Nous avons essayé de définir l'algèbre la plus complète que possible pour couvrir une grande variété d'application. L'opérateur algébrique reçoit et produit seulement des carte. Les fonctions reçoivent des cartes et produisent des scalaires ou des objets. L'optimisateur reçoit la requête en expression algébrique et produit un QEP (Query Evaluation Plan) efficace dans deux étapes: génération de QEG (Query Evaluation Graph) et génération de QEP. Dans première étape un graphe (QEG) équivalent de l'expression algébrique est produit. Les règles de transformation sont utilisées pour transformer le graphe a un équivalent plus efficace. Dans deuxième étape un QEP est produit de QEG passé de l'étape précédente. Le QEP est un ensemble des opérations primitives consécutives qui produit les résultats finals (la réponse finale de la requête soumise au base de donnée). Nous avons implémenté l'optimisateur, un générateur de requête spatiale aléatoire, et une base de donnée simulée. La base de donnée spatiale simulée est un ensemble de fonctions pour simuler des opérations spatiales primitives. Les requêtes aléatoires sont soumis à l'optimisateur. Les QEPs générées sont soumis au simulateur de base de données spatiale. Les résultats expérimentaux sont utilisés pour discuter les performances et les caractéristiques de l'optimisateur.Abstract: In this thesis we introduce a query processing approach for spatial databases and explain the main concepts we defined and developed: a spatial algebra and a graph based approach used in the optimizer. The spatial algebra was defined to express queries and transformation rules during different steps of the query optimization. To cover a vast variety of potential applications, we tried to define the algebra as complete as possible. The algebra looks at the spatial data as maps of spatial objects. The algebraic operators act on the maps and result in new maps. Aggregate functions can act on maps and objects and produce objects or basic values (characters, numbers, etc.). The optimizer receives the query in algebraic expression and produces one efficient QEP (Query Evaluation Plan) through two main consecutive blocks: QEG (Query Evaluation Graph) generation and QEP generation. In QEG generation we construct a graph equivalent of the algebraic expression and then apply graph transformation rules to produce one efficient QEG. In QEP generation we receive the efficient QEG and do predicate ordering and approximation and then generate the efficient QEP. The QEP is a set of consecutive phases that must be executed in the specified order. Each phase consist of one or more primitive operations. All primitive operations that are in the same phase can be executed in parallel. We implemented the optimizer, a randomly spatial query generator and a simulated spatial database. The query generator produces random queries for the purpose of testing the optimizer. The simulated spatial database is a set of functions to simulate primitive spatial operations. They return the cost of the corresponding primitive operation according to input parameters. We put randomly generated queries to the optimizer, got the generated QEPs and put them to the spatial database simulator. We used the experimental results to discuss on the optimizer characteristics and performance. The optimizer was designed for databases with a very large number of spatial objects nevertheless most of the concepts we used can be applied to all spatial information systems."--Résumé abrégé par UMI

Savoirs UdeS

Optimal-Location-Selection Query Processing in Spatial Databases

Author: Baihua Zheng
Gencai Chen
Qing Li
Senior Member
Yunjun Gao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009
Field of study

Abstract—This paper introduces and solves a novel type of spatial queries, namely, Optimal-Location-Selection (OLS) search, which has many applications in real life. Given a data object set DA, a target object set DB, a spatial region R, and a critical distance dc in a multidimensional space, an OLS query retrieves those target objects in DB that are outside R but have maximal optimality. Here, the optimality of a target object b 2 DB located outside R is defined as the number of the data objects from DA that are inside R and meanwhile have their distances to b not exceeding dc. When there is a tie, the accumulated distance from the data objects to b serves as the tie breaker, and the one with smaller distance has the better optimality. In this paper, we present the optimality metric, formalize the OLS query, and propose several algorithms for processing OLS queries efficiently. A comprehensive experimental evaluation has been conducted using both real and synthetic data sets to demonstrate the efficiency and effectiveness of the proposed algorithms. Index Terms—Query processing, optimal-location-selection, spatial database, algorithm. Ç

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University