Search CORE

4 research outputs found

B-puiden intervallipoistot

Author: Lilja Timo
Publication venue
Publication date: 01/01/2005
Field of study

Intervallipoistossa B-puuhakurakenteesta poistetaan kaikki annetulle avainvälille kuuluvat avaimet. Intervallipoisto on erikoistapaus yleisestä ryhmäpoisto-operaatiosta, jossa joukko avaimia poistetaan puusta. Kun poistettavan ryhmän aliryhmä muodostaa jatkuvan avainvälialueen puussa, voidaan soveltaa intervallipoistoa. Relaatiotietokannoissa transaktion toteuttava prosessi tuottaa indeksiin kohdistuvia intervallipoisto-operaatioita tietyissä tilanteissa silloin, kun viite-eheys on säilytettävä. Ylhäältä alaspäin etenevä tasapainotus on menetelmä, jossa puu tasapainotetaan päivitysoperaation etsintävaiheessa. Alun perin tasapainotukset suoritettiin vasta lisäys- tai poisto-operaation jälkeen erillisellä tasapainotusvaiheella. Rinnakkaisia järjestelmiä tarkasteltaessa ylhäältä alaspäin etenevä tasapainotus takaa sen, että solmuja tarvitsee lukita ainoastaan vakiomäärä, kun taas alhaalta ylöspäin etenevässä tasapainotuksessa lukkojen lukumäärä on pahimmassa tapauksessa suhteessa puun korkeuteen. Siispä ylhäältä alaspäin etenevä tasapainotus mahdollistaa prosessien tehokkaamman rinnakkaisen suorittamisen. Tässä työssä luodaan katsaus B-, B+-, Blink- ja (a, b)-puihin ja niissä sovellettaviin tasapainotusmenetelmiin sekä selvitetään nykyisin tunnettuja intervallipoisto-operaation toteutustapoja. Työssä esitellään yksityiskohtaisesti ylhäältä alaspäin etenevä yksivaiheinen intervallipoistoalgoritmi, joka suorittaa vakiomäärän tasapainotusoperaatioita tasoitetun vaativuuden mielessä. Työssä esitellään myös kokeellinen osuus, jossa pyritään vertailemaan algoritmin suorituskykyä muihin lähestymistapoihin ja hakemaan kokeellista tukea esitetylle teorialle. Työn uusi tulos on ylhäältä alaspäin tasapainottava yksivaiheinen intervallipoistoalgoritmi

Aaltodoc Publication Archive

Partial Replica Location And Selection For Spatial Datasets

Author: Tian Yun
Publication venue: eGrove
Publication date: 01/01/2013
Field of study

As the size of scientific datasets continues to grow, we will not be able to store enormous datasets on a single grid node, but must distribute them across many grid nodes. The implementation of partial or incomplete replicas, which represent only a subset of a larger dataset, has been an active topic of research. Partial Spatial Replicas extend this functionality to spatial data, allowing us to distribute a spatial dataset in pieces over several locations. We investigate solutions to the partial spatial replica selection problems. First, we describe and develop two designs for an Spatial Replica Location Service (SRLS), which must return the set of replicas that intersect with a query region. Integrating a relational database, a spatial data structure and grid computing software, we build a scalable solution that works well even for several million replicas. In our SRLS, we have improved performance by designing a R-tree structure in the backend database, and by aggregating several queries into one larger query, which reduces overhead. We also use the Morton Space-filling Curve during R-tree construction, which improves spatial locality. In addition, we describe R-tree Prefetching(RTP), which effectively utilizes the modern multi-processor architecture. Second, we present and implement a fast replica selection algorithm in which a set of partial replicas is chosen from a set of candidates so that retrieval performance is maximized. Using an R-tree based heuristic algorithm, we achieve O(n log n) complexity for this NP-complete problem. We describe a model for disk access performance that takes filesystem prefetching into account and is sufficiently accurate for spatial replica selection. Making a few simplifying assumptions, we present a fast replica selection algorithm for partial spatial replicas. The algorithm uses a greedy approach that attempts to maximize performance by choosing a collection of replica subsets that allow fast data retrieval by a client machine. Experiments show that the performance of the solution found by our algorithm is on average always at least 91% and 93.4% of the performance of the optimal solution in 4-node and 8-node tests respectively

eGrove (Univ. of Mississippi)

Bulk loading a Data Warehouse built upon a UB-Tree,” presented at IDEAS

Author: Akihiko Kawakami
Robert Fenk
Volker Markl
Publication venue
Publication date
Field of study

This paper considers the issue of bulk loading large data sets for the UB-Tree, a multidimensional index structure. Especially in dataware housing (DW), data mining and OLAP it is necessary to have efficient bulk loading techniques, because loading occurs not continuously, but only from time to time with usually large data sets. We propose two techniques, one for initial loading, which creates a new UB-Tree, and one for incremental loading, which adds data to an existing UB-Tree. Both techniques try to minimize I/O and CPU cost. Measurements with artificial data and data of a commercial data warehouse demonstrate that our algorithms are efficient and able to handle large data sets. As well as the UB-Tree, they are easily integrated into a RDBMS. Keywords: bulk loading, UB-tree, multidimensional index, dataware housing, data mining, OLAP

CiteSeerX