A Back-to-Basics Empirical Study of Priority Queues
The theory community has proposed several new heap variants in the recent
past which have remained largely untested experimentally. We take the field
back to the drawing board, with straightforward implementations of both classic
and novel structures using only standard, well-known optimizations. We study
the behavior of each structure on a variety of inputs, including artificial
workloads, workloads generated by running algorithms on real map data, and
workloads from a discrete event simulator used in recent systems networking
research. We provide observations about which characteristics are most
correlated to performance. For example, we find that the L1 cache miss rate
appears to be strongly correlated with wallclock time. We also provide
observations about how the input sequence affects the relative performance of
the different heap variants. For example, we show (both theoretically and in
practice) that certain random insertion-deletion sequences are degenerate and
can lead to misleading results. Overall, our findings suggest that while the
conventional wisdom holds in some cases, it is sorely mistaken in others.
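The random insertion-deletion workloads the abstract warns about can be reproduced with a small harness. The sketch below is hypothetical (it uses Python's `heapq` as a stand-in binary heap and a uniform-random generator), not the paper's benchmark code:

```python
import heapq
import random

def random_workload(n_ops, p_insert=0.5, seed=0):
    """Generate a random insertion/deletion sequence (hypothetical
    workload generator; the paper's actual workloads differ)."""
    rng = random.Random(seed)
    ops = []
    size = 0
    for _ in range(n_ops):
        if size == 0 or rng.random() < p_insert:
            ops.append(("insert", rng.random()))
            size += 1
        else:
            ops.append(("delete-min", None))
            size -= 1
    return ops

def run(ops):
    """Replay a workload on a binary heap, returning the popped minima."""
    heap, popped = [], []
    for op, key in ops:
        if op == "insert":
            heapq.heappush(heap, key)
        else:
            popped.append(heapq.heappop(heap))
    return popped
```

Replaying the same operation sequence against each heap variant under test is what makes such comparisons apples-to-apples.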
The Logarithmic Funnel Heap: A Statistically Self-Similar Priority Queue
The present work contains the design and analysis of a statistically
self-similar data structure using linear space and supporting the operations
insert, search, remove, increase-key, and decrease-key for a deterministic
priority queue in expected O(1) time. Extract-max runs in O(log N) time. The
depth of the data structure is at most log* N. On the highest level, each
element acts as the entrance of a discrete, log* N-level funnel with a
logarithmically decreasing stem diameter, where the stem diameter denotes a
metric for the expected number of items maintained on a given level.
Comment: 14 pages, 4 figures
Smooth heaps and a dual view of self-adjusting data structures
We present a new connection between self-adjusting binary search trees (BSTs)
and heaps, two fundamental, extensively studied, and practically relevant
families of data structures. Roughly speaking, we map an arbitrary heap
algorithm within a natural model, to a corresponding BST algorithm with the
same cost on a dual sequence of operations (i.e. the same sequence with the
roles of time and key-space switched). This is the first general transformation
between the two families of data structures.
There is a rich theory of dynamic optimality for BSTs (i.e. the theory of
competitiveness between BST algorithms). The lack of an analogous theory for
heaps has been noted in the literature. Through our connection, we transfer all
instance-specific lower bounds known for BSTs to a general model of heaps,
initiating a theory of dynamic optimality for heaps.
On the algorithmic side, we obtain a new, simple and efficient heap
algorithm, which we call the smooth heap. We show the smooth heap to be the
heap-counterpart of Greedy, the BST algorithm with the strongest proven and
conjectured properties from the literature, widely believed to be
instance-optimal. Assuming the optimality of Greedy, the smooth heap is also
optimal within our model of heap algorithms. As corollaries of results known
for Greedy, we obtain instance-specific upper bounds for the smooth heap, with
applications in adaptive sorting.
Intriguingly, the smooth heap, although derived from a non-practical BST
algorithm, is simple and easy to implement (e.g. it stores no auxiliary data
besides the keys and tree pointers). It can be seen as a variation on the
popular pairing heap data structure, extending it with a "power-of-two-choices"
type of heuristic.
Comment: Presented at STOC 2018, light revision, additional figure
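Since the smooth heap is described as a variation on the pairing heap, a minimal baseline helps fix ideas. The sketch below is a standard two-pass pairing heap in Python; it is not the smooth heap's "power-of-two-choices" restructuring, only the classic structure it extends:

```python
class PairingHeap:
    """One node of a minimal pairing heap (min-heap). As the abstract
    notes for the smooth heap, no auxiliary data is stored beyond keys
    and tree pointers."""
    __slots__ = ("key", "children")

    def __init__(self, key):
        self.key = key
        self.children = []

def meld(a, b):
    """Link two heaps: the larger root becomes a child of the smaller."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    a.children.append(b)
    return a

def insert(heap, key):
    return meld(heap, PairingHeap(key))

def delete_min(heap):
    """Two-pass pairing: pair children left to right, then meld the
    pairs right to left. Returns (min_key, new_heap)."""
    kids = heap.children
    paired = [meld(kids[i], kids[i + 1] if i + 1 < len(kids) else None)
              for i in range(0, len(kids), 2)]
    new = None
    for t in reversed(paired):
        new = meld(new, t)
    return heap.key, new
```

The smooth heap replaces this fixed pairing rule with a locally greedy linking choice, which is where its stronger instance-specific guarantees come from.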
Algebras for weighted search
Weighted search is an essential component of many fundamental and useful algorithms. Despite this, it is relatively underexplored as a computational effect, receiving not nearly as much attention as either depth- or breadth-first search. This paper explores the algebraic underpinning of weighted search and demonstrates how to implement it as a monad transformer. The development first explores breadth-first search, which can be expressed as a polynomial over semirings. These polynomials are generalised to the free semimodule monad to capture a wide range of applications, including probability monads, polynomial monads, and monads for weighted search. Finally, a monad transformer based on the free semimodule monad is introduced. Applying optimisations to this type yields an implementation of pairing heaps, which is then used to implement Dijkstra's algorithm and efficient probabilistic sampling. The construction is formalised in Cubical Agda and implemented in Haskell.
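In the (min, +) tropical semiring, weighted search specializes to shortest-path search. A plain Python Dijkstra sketch of that one instance (the adjacency-dict encoding is an assumption; the paper's Haskell monad-transformer construction is not reproduced here):

```python
import heapq

def dijkstra(graph, source):
    """Weighted search over the (min, +) tropical semiring: a plain
    Dijkstra sketch. `graph` is a hypothetical encoding mapping each
    node to a list of (neighbor, edge_weight) pairs."""
    dist = {source: 0}
    frontier = [(0, source)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry; a better path was found already
        for v, w in graph.get(u, []):
            nd = d + w  # semiring "multiplication": extend a path
            if nd < dist.get(v, float("inf")):  # "addition": keep the min
                dist[v] = nd
                heapq.heappush(frontier, (nd, v))
    return dist
```

Swapping (min, +) for another semiring changes what the same search computes, which is the generality the paper captures algebraically.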
Selectable Heaps and Their Application to Lazy Search Trees
We show that the O(log n) time extract-minimum operation of efficient priority queues can be generalized to the extraction of the k smallest elements in O(k log(n/k)) time. We first show that the heap-ordered tree selection of Kaplan et al. can be applied to the heap-ordered trees of the classic Fibonacci heap to support the extraction in O(k log(n/k)) amortized time. We then show that selection is possible in a priority queue with optimal worst-case guarantees by applying heap-ordered tree selection to Brodal queues, supporting the operation in O(k log(n/k)) worst-case time.
Via a reduction from the multiple selection problem, Ω(k log(n/k)) time is necessary.
We then apply the result to the lazy search trees of Sandlund & Wild, creating a new interval data structure based on selectable heaps. This gives optimal O(B+n) lazy search tree performance, lowering the complexity of insertion into a gap Δi to O(log(n/|Δi|)) time. An O(1)-time merge operation is also made possible under certain conditions. If Brodal queues are used, all runtimes of the lazy search tree can be made worst-case. The presented data structure uses the soft heaps of Chazelle, biased search trees, and efficient priority queues in a non-trivial way, approaching the theoretically best data structure for ordered data.
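The flavor of heap selection can be illustrated on an ordinary array-based binary heap: a secondary heap of candidate nodes yields the k smallest elements without disturbing the original. This is a simpler O(k log k) cousin of the O(k log(n/k)) bounds above, not the Kaplan et al. or Brodal-queue constructions:

```python
import heapq

def k_smallest_of_heap(heap, k):
    """Return the k smallest elements of an array-based binary min-heap
    in O(k log k) time without modifying it. Only children of already
    selected nodes can be the next smallest, so a candidate heap of
    (value, index) pairs suffices."""
    if not heap or k <= 0:
        return []
    out = []
    cand = [(heap[0], 0)]  # the root is the unique initial candidate
    while cand and len(out) < k:
        val, i = heapq.heappop(cand)
        out.append(val)
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(heap):
                heapq.heappush(cand, (heap[child], child))
    return out
```

The candidate heap never exceeds k + 1 entries, which is what bounds each step by O(log k).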
Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries
We study ranked enumeration of join-query results according to very general
orders defined by selective dioids. Our main contribution is a framework for
ranked enumeration over a class of dynamic programming problems that
generalizes seemingly different problems that had been studied in isolation. To
this end, we extend classic algorithms that find the k-shortest paths in a
weighted graph. For full conjunctive queries, including cyclic ones, our
approach is optimal in terms of the time to return the top result and the delay
between results. These optimality properties are derived for the widely used
notion of data complexity, which treats query size as a constant. By performing
a careful cost analysis, we are able to uncover a previously unknown tradeoff
between two incomparable enumeration approaches: one has lower complexity when
the number of returned results is small, the other when the number is very
large. We theoretically and empirically demonstrate the superiority of our
techniques over batch algorithms, which produce the full result and then sort
it. Our technique is not only faster for returning the first few results, but
on some inputs beats the batch algorithm even when all results are produced.
Comment: 50 pages, 19 figures
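Ranked enumeration with a priority-queue frontier can be illustrated on a toy instance: enumerating the k smallest pairwise sums of two sorted lists under the (min, +) dioid. This sketch is illustrative only and is not the paper's algorithm:

```python
import heapq

def ranked_sums(xs, ys, k):
    """Enumerate the k smallest pairwise sums of two sorted lists in
    ranked order. A frontier heap holds index pairs; popping the current
    minimum and pushing its successors yields results one by one, with
    delay O(log k) per result rather than sorting all |xs|*|ys| sums."""
    if not xs or not ys:
        return []
    out = []
    seen = {(0, 0)}
    frontier = [(xs[0] + ys[0], 0, 0)]
    while frontier and len(out) < k:
        s, i, j = heapq.heappop(frontier)
        out.append(s)
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(xs) and nj < len(ys) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(frontier, (xs[ni] + ys[nj], ni, nj))
    return out
```

The batch-versus-incremental tradeoff the abstract describes shows up even here: for small k the frontier does far less work than materializing and sorting every sum.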
Optimizing the Operation of Route-Finding Algorithms on Road Data
Abstract. Finding the fastest route on road data is part of many people's everyday lives, for example when using mobile devices. Finding the optimal route between two points is a topic with a large body of prior research. Classic algorithms that solve the fastest-route problem on a graph, and that can also be applied to road data, are Dijkstra's algorithm and the A* algorithm. Recent research has developed fast algorithms for which the route data is preprocessed to improve speed. The central research question of this work is how classic route-finding algorithms can be optimized on the Digiroad dataset covering the roads of Finland, using methods found in prior research. The work studies optimizing data structures for road data, using average speed as a heuristic when optimizing travel time, and how road-data metadata can be exploited in algorithm optimization. The results are compared with algorithms that rely on preprocessing.
The literature review covers classic routing algorithms, the special characteristics of road data, and how the development of preprocessing-based methods has influenced research. It also discusses the role of data structures in routing algorithms and the results obtained with different combinations of optimization methods. The research method of this work is Design Science. The artifact created with this method reads Digiroad data and serves as a unified platform for three experimental setups that answer the stated research questions. In each experiment, data on the algorithms' behavior was collected with performance and quality metrics using five hundred randomly selected point pairs.
The central result of the data structures and algorithms experiment was that the A* algorithm performs 3.59 times faster than Dijkstra's algorithm on the Digiroad data. In addition, the practical implementation of the data structures has a large impact on algorithm speed. By optimizing the binary heap, routing speed with the A* algorithm was improved 6.43-fold compared to a conventional binary heap. The average-speed experiment studied the A* heuristic and which assumed average speed gives the best results when the heuristic is based on straight-line distance. The best balance between routing speed and deviation from the optimal route was achieved with values of 70–90 km/h; the deviation was then 0.03–1.61% from the optimal solution, with routing time 3.54–14.84 times better than the reference Dijkstra's algorithm. The third experiment exploited the Digiroad metadata by building various two-level hierarchical algorithms on top of A* and the results of the earlier experiments, with the goal of further improving routing times. The results fall into two groups. The highest-quality algorithms achieved a 14.25–18.27-fold improvement over Dijkstra's algorithm with a 0.79–0.92% deviation from the optimal route. The fastest algorithms achieved a 26.36–37.54-fold improvement over Dijkstra's algorithm, but the deviation from the optimal route was then 2.60–3.21%. In speed, the studied algorithms can approach simple preprocessing-based algorithms if the deviation from the optimal route is acceptable.
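The thesis's average-speed heuristic can be sketched as an A* variant: the estimated remaining time is the straight-line distance divided by an assumed average speed. The graph and coordinate encodings below are assumptions, and `assumed_speed_kmh` corresponds to the 70–90 km/h range the experiments found best:

```python
import heapq
import math

def a_star_time(graph, coords, source, target, assumed_speed_kmh=80.0):
    """A* minimizing travel time with an average-speed heuristic.
    Hypothetical encodings: `graph` maps each node to (neighbor,
    travel_time_hours) pairs, `coords` maps nodes to planar (x, y)
    positions in km. If some roads are faster than the assumed speed,
    the heuristic can overestimate, which is consistent with the small
    deviations from the optimal route reported above."""
    def h(u):
        (x1, y1), (x2, y2) = coords[u], coords[target]
        return math.hypot(x2 - x1, y2 - y1) / assumed_speed_kmh

    g = {source: 0.0}
    done = set()
    frontier = [(h(source), source)]
    while frontier:
        _, u = heapq.heappop(frontier)
        if u == target:
            return g[u]
        if u in done:
            continue  # already expanded with its best known cost
        done.add(u)
        for v, travel_time in graph.get(u, []):
            ng = g[u] + travel_time
            if ng < g.get(v, float("inf")):
                g[v] = ng
                heapq.heappush(frontier, (ng + h(v), v))
    return math.inf
```

Raising the assumed speed weakens the heuristic (slower but safer search); lowering it makes the search greedier, trading route optimality for speed, exactly the trade-off the experiments quantify.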