9 research outputs found

bdbms -- A Database Management System for Biological Data

Author: Aref Walid G.
Eltabakh Mohamed Y.
Ouzzani Mourad
Publication venue
Publication date: 01/12/2006
Field of study

Biologists are increasingly using databases for storing and managing their data. Biological databases typically consist of a mixture of raw data, metadata, sequences, annotations, and related data obtained from various sources. Current database technology lacks several functionalities that are needed by biological databases. In this paper, we introduce bdbms, an extensible prototype database management system for supporting biological data. bdbms extends the functionalities of current DBMSs to include: (1) Annotation and provenance management including storage, indexing, manipulation, and querying of annotation and provenance as first class objects in bdbms, (2) Local dependency tracking to track the dependencies and derivations among data items, (3) Update authorization to support data curation via content-based authorization, in contrast to identity-based authorization, and (4) New access methods and their supporting operators that support pattern matching on various types of compressed biological data types. This paper presents the design of bdbms along with the techniques proposed to support these functionalities including an extension to SQL. We also outline some open issues in building bdbms.

Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but you must attribute the work to the author and CIDR 2007. 3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, USA.

arXiv.org e-Print Archive

CiteSeerX

Purdue E-Pubs

An Efficient Algorithm for Bulk-Loading xBR+-trees

Author: Corral Liria Antonio Leopoldo
Manolopoulos Yannis
Roumelis George
Vassilakopoulos Michael
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017
Field of study

A major part of the interface to a database is made up of the queries that can be addressed to this database and answered (processed) in an efficient way, contributing to the quality of the developed software. Efficiently processed spatial queries constitute a fundamental part of the interface to spatial databases due to the wide area of applications that may address such queries, like geographical information systems (GIS), location-based services, computer visualization, automated mapping, facilities management, etc. Another important capability of the interface to a spatial database is to offer the creation of efficient index structures to speed up spatial query processing. The xBR+-tree is a balanced disk-resident quadtree-based index structure for point data, which is very efficient for processing such queries. Bulk-loading refers to the process of creating an index from scratch, when the dataset to be indexed is available beforehand, instead of creating the index gradually (and more slowly), when the dataset elements are inserted one-by-one. In this paper, we present an algorithm for bulk-loading xBR+-trees for big datasets residing on disk, using a limited amount of main memory. The resulting tree is not only built fast, but exhibits high performance in processing a broad range of spatial queries, where one or two datasets are involved. To justify these characteristics, using real and artificial datasets of various cardinalities, first, we present an experimental comparison of this algorithm vs. a previous version of the same algorithm and STR, a popular algorithm of bulk-loading R-trees, regarding tree creation time and the characteristics of the trees created, and second, we experimentally compare the query efficiency of bulk-loaded xBR+-trees vs. bulk-loaded R-trees, regarding I/O and execution time. Thus, this paper contributes to the implementation of spatial database interfaces and the efficient storage organization for big spatial data management.
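
The abstract names STR (Sort-Tile-Recursive) as the R-tree bulk-loading baseline. As a rough illustration of what bulk-loading from a known dataset looks like, the following Python sketch packs 2-D points into leaf pages with STR-style tiling. It is not the xBR+-tree algorithm of the paper, and it sorts in memory rather than on disk; the names `str_pack_leaves` and `leaf_mbr` and the leaf capacity are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def str_pack_leaves(points: List[Point], capacity: int) -> List[List[Point]]:
    """Group 2-D points into leaf pages using STR-style tiling.

    Illustrative sketch only: everything is sorted in main memory.
    """
    if capacity < 1:
        raise ValueError("capacity must be positive")
    if not points:
        return []

    n_leaves = math.ceil(len(points) / capacity)   # leaf pages needed
    n_slices = math.ceil(math.sqrt(n_leaves))      # vertical slices
    slice_size = n_slices * capacity               # points per slice

    by_x = sorted(points)                          # sort by x (ties by y)
    leaves: List[List[Point]] = []
    for i in range(0, len(by_x), slice_size):
        # Within each vertical slice, sort by y and cut into full pages.
        vertical_slice = sorted(by_x[i:i + slice_size], key=lambda p: p[1])
        for j in range(0, len(vertical_slice), capacity):
            leaves.append(vertical_slice[j:j + capacity])
    return leaves


def leaf_mbr(leaf: List[Point]) -> Tuple[float, float, float, float]:
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of one leaf."""
    xs = [p[0] for p in leaf]
    ys = [p[1] for p in leaf]
    return (min(xs), min(ys), max(xs), max(ys))
```

Upper tree levels could be built by applying the same tiling to the centers of the leaf rectangles; that step is omitted here to keep the sketch short.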

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional de la Universidad de Almería (Spain)

Sorting in Space: Multidimensional, spatial, and metric data structures for applications in spatial databases, geographic information systems (GIS), and location-based services

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

Data-parallel polygonization

Author: Erik G. Hoel
Hanan Samet
Publication venue: 'Elsevier BV'
Publication date
Field of study

Crossref

A Framework for Spatio-Temporal Trajectory Data Segmentation and Query

Author: Kang Huaqiang
Publication venue
Publication date: 26/03/2019
Field of study

Trajectory segmentation is a technique for dividing sequential trajectory data into segments. These segments are building blocks for various applications of big trajectory data. Hence a system framework is essential to support trajectory segment indexing, storage, and query. When the size of segments is beyond the computing capacity of a single processing node, a distributed solution is proposed. In this thesis, a distributed trajectory segmentation framework that includes a greedy-split segmentation method is created. This framework consists of distributed in-memory processing and a cluster of graph storage. For fast trajectory queries, a distributed spatial R-tree index of trajectory segments is applied. Using the trajectory indexes, this framework builds queries of segments from in-memory processing and from the graph storage. Based on this segmentation framework, two metrics to measure trajectory similarity and chance of collision are defined. These two metrics are further applied to identify moving groups of trajectories. This study quantitatively evaluates the effects of data partition, parallelism, and data size on the system. The study identifies the bottleneck factors at the data partition stage and validates two mitigation solutions. The evaluation demonstrates that the distributed segmentation method and the system framework scale with the growth of the workload and the size of the parallel cluster.

Concordia University Research Repository

New Efficient Spatial Index Structures, PML-Tree and SMR-Tree, for Spatial Databases

Author: Bang Kap. S.
Publication venue: 'Oklahoma State University Library'
Publication date: 01/12/1995
Field of study

Computer Science

SHAREOK repository

Multidimensional access methods

Author: Volker Gaede
Oliver Günther
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date
Field of study

Crossref

A Population Analysis for Hierarchical Data Structures

Author: Hanan Samet
Randal C. Nelson
Publication venue
Publication date: 01/01/1987
Field of study

A new method termed population analysis is presented for approximating the distribution of node occupancies in hierarchical data structures which store a variable number of geometric data items per node. The basic idea is to describe a dynamic data structure as a set of populations which are permitted to transform into one another according to certain rules. The transformation rules are used to obtain a set of equations describing a population distribution which is stable under insertion of additional information into the structure. These equations can then be solved, either analytically or numerically, to obtain the population distribution. Hierarchical data structures are modeled by letting each population represent the nodes of a given occupancy. A detailed analysis of quadtree data structures for storing point data is presented, and the results are compared to experimental data. Two phenomena referred to as aging and phasing are defined and shown to account for the differences between the experimental results and those predicted by the model. The population technique is compared with statistical methods of analyzing similar data structures.

CR Categories and Subject Descriptors: E.1 [Data]: Data Structures - trees; F.2.2 [Theory of Computation]: Analysis of nonnumerical algorithms and problems - Geometrical problems and computations; H.3.3 [Information Storage and Retrieval]: Content Analysis and Indexing - indexing methods.

Key words and phrases: file structures, bucketing methods, multidimensional attributes, hierarchical data structures, quadtrees.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying
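
The abstract compares the model's predictions with experimental node occupancies. As a minimal sketch of how such experimental data could be gathered, the following Python code builds a PR quadtree (each leaf holds at most `capacity` points, and an overfull block splits into four quadrants) over uniformly random points and tallies the fraction of leaves at each occupancy. It does not implement the paper's population equations; the function names, the unit-square domain, the uniform point distribution, and the assumption that all points are distinct are illustrative choices.

```python
import random
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]


def leaf_occupancies(points: List[Point], capacity: int,
                     box: Tuple[float, float, float] = (0.0, 0.0, 1.0)) -> List[int]:
    """Return the number of points in every leaf of a PR quadtree.

    `box` is (x0, y0, side). Assumes distinct points inside the box;
    coincident points would make the recursion unbounded.
    """
    if len(points) <= capacity:
        return [len(points)]          # this block is a leaf

    x0, y0, side = box
    half = side / 2.0
    quadrants = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
    for x, y in points:
        key = (int(x >= x0 + half), int(y >= y0 + half))
        quadrants[key].append((x, y))

    result: List[int] = []
    for (ix, iy), pts in quadrants.items():
        child_box = (x0 + ix * half, y0 + iy * half, half)
        result.extend(leaf_occupancies(pts, capacity, child_box))
    return result


def occupancy_distribution(n_points: int, capacity: int, seed: int = 0) -> List[float]:
    """Fraction of leaves holding 0, 1, ..., capacity points."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n_points)]
    counts = Counter(leaf_occupancies(points, capacity))
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in range(capacity + 1)]
```

Because the shape of a PR quadtree depends only on the set of points and not on insertion order, building the tree in one pass gives the same leaves as inserting the points one at a time. Calling `occupancy_distribution` for several values of `n_points` gives a series of measured distributions that could be set against a model's prediction.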

CiteSeerX

A population analysis for hierarchical data structures

Author: Hanan Samet
Randal C. Nelson
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date
Field of study

Crossref