Performance comparison of point and spatial access methods
In the past few years a large number of multidimensional point access methods, also called
multiattribute index structures, have been suggested, all of them claiming good performance. Since no
performance comparison of these structures under arbitrary (strongly correlated nonuniform, in short
"ugly") data distributions and under various types of queries has been performed, database
researchers and designers have been hesitant to use any of these new point access methods. As shown in
a recent paper, such point access methods are not only important in traditional database applications.
In new applications such as CAD/CIM and geographic or environmental information systems, access
methods for spatial objects are needed. As recently shown, such access methods are based on point
access methods in terms of functionality and performance. Our performance comparison naturally
consists of two parts. In part I we compare multidimensional point access methods, whereas in
part II spatial access methods for rectangles are compared. In part I we present a survey and
classification of existing point access methods. We then carefully select the following four methods
for implementation and performance comparison under seven different data files (distributions) and
various types of queries: the 2-level grid file, the BANG file, the hB-tree and a new scheme called
the BUDDY hash tree. We were surprised to see one method emerge as the clear winner: the
BUDDY hash tree. Its average performance is at least 20% better than that of its competitors, and it is
robust under ugly data and queries. In part II we compare spatial access methods for rectangles.
After presenting a survey and classification of existing spatial access methods, we carefully selected
the following four methods for implementation and performance comparison under six different data
files (distributions) and various types of queries: the R-tree, the BANG file, PLOP hashing and the
BUDDY hash tree. The results showed two winners: the BANG file and the BUDDY hash tree.
This comparison is a first step towards a standardized testbed or benchmark. We offer our data and
query files to every designer of a new point or spatial access method so that they can run their
implementation in our testbed.
Chronoprints: Identifying Samples by Visualizing How They Change over Space and Time.
The modern tools of chemistry excel at identifying a sample, but the cost, size, complexity, and power consumption of these instruments often preclude their use in resource-limited settings. In this work, we demonstrate a simple and low-cost method for identifying a sample based on visualizing how the sample changes over space and time in response to a perturbation. Different types of perturbations could be used, and in this proof-of-concept we use a dynamic temperature gradient that rapidly cools different parts of the sample at different rates. We accomplish this by first loading several samples into long parallel channels on a "microfluidic thermometer chip." We then immerse one end of the chip in liquid nitrogen to create a dynamic temperature gradient along the channels, and we use an inexpensive USB microscope to record a video of how the samples respond to the changing temperature gradient. The video is then converted into several bitmap images (one per sample) that capture each sample's response to the perturbation in both space (the y-axis; the distance along the dynamic temperature gradient) and time (the x-axis); we call these images "chronological fingerprints" or "chronoprints" of each sample. If two samples' chronoprints are similar, this suggests that the samples are the same chemical substance or mixture, but if two samples' chronoprints are significantly different, this proves that the samples are chemically different. Since chronoprints are just bitmap images, they can be compared using a variety of techniques from computer science, and in this work we use three different image comparison algorithms to quantify chronoprint similarity. 
As a demonstration of the versatility of chronoprints, we use them in three different applications: distinguishing authentic olive oil from adulterated oil (an example of the over $10 billion global problem of food fraud), identifying adulterated or counterfeit medication (which represents around 10% of all medication in low- and middle-income countries), and distinguishing the occasionally confused pharmaceutical ingredients glycerol and diethylene glycol (whose accidental or intentional substitution has led to hundreds of deaths). The simplicity and versatility of chronoprints should make them valuable analytical tools in a variety of different fields.
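The construction described in this abstract (one image per sample, with distance along the channel on the y-axis and time on the x-axis) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `build_chronoprint`, `mean_abs_difference`, and `synthetic_frames` are hypothetical, frames are plain nested lists of grayscale values rather than real video, and the mean absolute pixel difference is a simple stand-in, not necessarily one of the three comparison algorithms the authors used.

```python
def build_chronoprint(frames, column):
    """Stack one pixel column from every video frame into a space-time image.

    frames: list of 2D grayscale frames (each a list of rows of ints).
    column: pixel column assumed to lie along one sample's channel.
    Returns a 2D list whose rows are positions along the channel (y-axis)
    and whose columns are successive frames (x-axis, time).
    """
    height = len(frames[0])
    return [[frame[y][column] for frame in frames] for y in range(height)]


def mean_abs_difference(image_a, image_b):
    """Average absolute pixel difference between two equally sized images.

    Illustrative similarity score only: 0.0 means identical images.
    """
    total = 0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def synthetic_frames(speed, n_frames=6, height=6):
    """Fake one-column video: a bright front advances `speed` rows per frame."""
    return [
        [[255 if y < speed * t else 0] for y in range(height)]
        for t in range(n_frames)
    ]


# Two samples that respond identically, and one that responds faster.
print_a = build_chronoprint(synthetic_frames(speed=1), column=0)
print_b = build_chronoprint(synthetic_frames(speed=1), column=0)
print_c = build_chronoprint(synthetic_frames(speed=2), column=0)

same_score = mean_abs_difference(print_a, print_b)
diff_score = mean_abs_difference(print_a, print_c)
```

In this toy setup, `same_score` is zero because the two samples respond identically, while `diff_score` is positive because the third sample's front moves at a different rate. A real pipeline would also need to handle video decoding, channel alignment, and noise, none of which is attempted here.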
Data Management and Mining in Astrophysical Databases
We analyse the issues involved in the management and mining of astrophysical
data. The traditional approach to data management in the astrophysical field is
not able to keep up with the increasing size of the data gathered by modern
detectors. An essential role in astrophysical research will be played by
automatic tools for information extraction from large datasets, i.e. data
mining techniques, such as clustering and classification algorithms. This calls
for an approach to data management based on data warehousing, emphasizing the
efficiency and simplicity of data access; efficiency is obtained using
multidimensional access methods and simplicity is achieved by properly handling
metadata. Clustering and classification techniques, on large datasets, pose
additional requirements: computational and memory scalability with respect to
the data size, interpretability and objectivity of clustering or classification
results. In this study we address some possible solutions.
Comment: 10 pages, Late