Search CORE

6,811 research outputs found

Fast Deterministic Selection

Author: Alexandrescu Andrei
Publication venue
Publication date: 04/08/2016
Field of study

The Median of Medians (also known as BFPRT) algorithm, although a landmark theoretical achievement, is seldom used in practice because it and its variants are slower than simple approaches based on sampling. The main contribution of this paper is a fast linear-time deterministic selection algorithm QuickselectAdaptive based on a refined definition of MedianOfMedians. The algorithm's performance brings deterministic selection---along with its desirable properties of reproducible runs, predictable run times, and immunity to pathological inputs---in the range of practicality. We demonstrate results on independent and identically distributed random inputs and on normally-distributed inputs. Measurements show that QuickselectAdaptive is faster than state-of-the-art baselines.Comment: Pre-publication draf

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Analysis of pivot sampling in dual-pivot Quicksort: A holistic analysis of Yaroslavskiy's partitioning scheme

Author: Martínez Parra Conrado
Nebel Markus E.
Wild Sebastian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/08/2015
Field of study

The final publication is available at Springer via http://dx.doi.org/10.1007/s00453-015-0041-7The new dual-pivot Quicksort by Vladimir Yaroslavskiy-used in Oracle's Java runtime library since version 7-features intriguing asymmetries. They make a basic variant of this algorithm use less comparisons than classic single-pivot Quicksort. In this paper, we extend the analysis to the case where the two pivots are chosen as fixed order statistics of a random sample. Surprisingly, dual-pivot Quicksort then needs more comparisons than a corresponding version of classic Quicksort, so it is clear that counting comparisons is not sufficient to explain the running time advantages observed for Yaroslavskiy's algorithm in practice. Consequently, we take a more holistic approach and give also the precise leading term of the average number of swaps, the number of executed Java Bytecode instructions and the number of scanned elements, a new simple cost measure that approximates I/O costs in the memory hierarchy. We determine optimal order statistics for each of the cost measures. It turns out that the asymmetries in Yaroslavskiy's algorithm render pivots with a systematic skew more efficient than the symmetric choice. Moreover, we finally have a convincing explanation for the success of Yaroslavskiy's algorithm in practice: compared with corresponding versions of classic single-pivot Quicksort, dual-pivot Quicksort needs significantly less I/Os, both with and without pivot sampling.Peer ReviewedPostprint (author's final draft

arXiv.org e-Print Archive

University of Liverpool Repository

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

Publications at Bielefeld University

University of Southern Denmark Research Output

Analysis of Quickselect under Yaroslavskiy's Dual-Pivoting Algorithm

Author: Mahmoud Hosam
Nebel Markus E.
Wild Sebastian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

There is excitement within the algorithms community about a new partitioning method introduced by Yaroslavskiy. This algorithm renders Quicksort slightly faster than the case when it runs under classic partitioning methods. We show that this improved performance in Quicksort is not sustained in Quickselect; a variant of Quicksort for finding order statistics. We investigate the number of comparisons made by Quickselect to find a key with a randomly selected rank under Yaroslavskiy's algorithm. This grand averaging is a smoothing operator over all individual distributions for specific fixed order statistics. We give the exact grand average. The grand distribution of the number of comparison (when suitably scaled) is given as the fixed-point solution of a distributional equation of a contraction in the Zolotarev metric space. Our investigation shows that Quickselect under older partitioning methods slightly outperforms Quickselect under Yaroslavskiy's algorithm, for an order statistic of a random rank. Similar results are obtained for extremal order statistics, where again we find the exact average, and the distribution for the number of comparisons (when suitably scaled). Both limiting distributions are of perpetuities (a sum of products of independent mixed continuous random variables).Comment: full version with appendices; otherwise identical to Algorithmica versio

arXiv.org e-Print Archive

Crossref

Publications at Bielefeld University

Simple Symmetric Sustainable Sorting -- the greeNsort article

Author: Oehlschlägel Jens
Publication venue
Publication date: 02/02/2024
Field of study

We explored an uncharted part of the solution space for sorting algorithms: the role of symmetry in divide&conquer algorithms. We found/designed novel simple binary Quicksort and Mergesort algorithms operating in contiguous space which achieve improved trade-offs between worst-case CPU-efficiency, best-case adaptivity and RAM-requirements. The 'greeNsort' algorithms need less hardware (RAM) and/or less energy (CPU) compared to the prior art. The new algorithms fit a theoretical framework: 'Footprint' KPIs allow to compare algorithms with different RAM-requirements, a new 'definition' of sorting API-targets simplifies construction of stable algorithms with mirrored scan directions, and our ordinal machine model encourages robust algorithms that minimize access 'distance'. Unlike earlier 'Quicksorts', our 'Zacksort', 'Zucksort' and 'Ducksort' algorithms optimally marry CPU-efficiency and tie-adaptivity. Unlike earlier 'Mergesorts' which required 100% distant buffer, our 'Frogsort' and 'Geckosort' algorithms achieve similar CPU-efficiency with 50% or less local buffer. Unlike natural Mergesorts such as 'Timsort' which are optimized for the best case of full-presorting, our 'Octosort' and 'Squidsort' algorithms achieve excellent bi-adaptivity to presorted best-cases without sacrificing worst-case efficiency in real sorting tasks. Our 'Walksort' and 'Jumpsort' have lower Footprint than the impressive low-memory 'Grailsort' and 'Sqrtsort' of Astrelin. Given the current climate-emergency, this is a call to action for all maintainers of sorting libraries, all software-engineers using custom sorting code, all professors teaching algorithms, all IT professionals designing programming languages, compilers and CPUs: check for better algorithms and consider symmetric code-mirroring.Comment: 50 pages, 6 Figures, latest version under https://github.com/greeNsort/greeNsort.article, see also https://greensort.or

arXiv.org e-Print Archive

Virtual Prototyping for Dynamically Reconfigurable Architectures using Dynamic Generic Mapping

Author: Gibson Darrell
Long David Ian
Vasilko Milan
Publication venue
Publication date: 26/10/1998
Field of study

This paper presents a virtual prototyping methodology for Dynamically Reconfigurable (DR) FPGAs. The methodology is based around a library of VHDL image processing components and allows the rapid prototyping and algorithmic development of low-level image processing systems. For the effective modelling of dynamically reconfigurable designs a new technique named, Dynamic Generic Mapping is introduced. This method allows efficient representation of dynamic reconfiguration without needing any additional components to model the reconfiguration process. This gives the designer more flexibility in modelling dynamic configurations than other methodologies. Models created using this technique can then be simulated and targeted to a specific technology using the same code. This technique is demonstrated through the realisation of modules for a motion tracking system targeted to a DR environment, RIFLE-62

Bournemouth University Research Online

Efficient bulk-loading methods for temporal and multidimensional index structures

Author: Achakeev Daniar
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2013
Field of study

Nahezu alle naturwissenschaftlichen Bereiche profitieren von neuesten Analyse- und Verarbeitungsmethoden für große Datenmengen. Diese Verfahren setzten eine effiziente Verarbeitung von geo- und zeitbezogenen Daten voraus, da die Zeit und die Position wichtige Attribute vieler Daten sind. Die effiziente Anfrageverarbeitung wird insbesondere durch den Einsatz von Indexstrukturen ermöglicht. Im Fokus dieser Arbeit liegen zwei Indexstrukturen: Multiversion B-Baum (MVBT) und R-Baum. Die erste Struktur wird für die Verwaltung von zeitbehafteten Daten, die zweite für die Indexierung von mehrdimensionalen Rechteckdaten eingesetzt. Ständig- und schnellwachsendes Datenvolumen stellt eine große Herausforderung an die Informatik dar. Der Aufbau und das Aktualisieren von Indexen mit herkömmlichen Methoden (Datensatz für Datensatz) ist nicht mehr effizient. Um zeitnahe und kosteneffiziente Datenverarbeitung zu ermöglichen, werden Verfahren zum schnellen Laden von Indexstrukturen dringend benötigt. Im ersten Teil der Arbeit widmen wir uns der Frage, ob es ein Verfahren für das Laden von MVBT existiert, das die gleiche I/O-Komplexität wie das externe Sortieren besitz. Bis jetzt blieb diese Frage unbeantwortet. In dieser Arbeit haben wir eine neue Kostruktionsmethode entwickelt und haben gezeigt, dass diese gleiche Zeitkomplexität wie das externe Sortieren besitzt. Dabei haben wir zwei algorithmische Techniken eingesetzt: Gewichts-Balancierung und Puffer-Bäume. Unsere Experimenten zeigen, dass das Resultat nicht nur theoretischer Bedeutung ist. Im zweiten Teil der Arbeit beschäftigen wir uns mit der Frage, ob und wie statistische Informationen über Geo-Anfragen ausgenutzt werden können, um die Anfrageperformanz von R-Bäumen zu verbessern. Unsere neue Methode verwendet Informationen wie Seitenverhältnis und Seitenlängen eines repräsentativen Anfragerechtecks, um einen guten R-Baum bezüglich eines häufig eingesetzten Kostenmodells aufzubauen. Falls diese Informationen nicht verfügbar sind, optimieren wir R-Bäume bezüglich der Summe der Volumina von minimal umgebenden Rechtecken der Blattknoten. Da das Problem des Aufbaus von optimalen R-Bäumen bezüglich dieses Kostenmaßes NP-hart ist, führen wir zunächst das Problem auf ein eindimensionales Partitionierungsproblem zurück, indem wir die Daten bezüglich optimierte raumfüllende Kurven sortieren. Dann lösen wir dieses Problem durch Einsatz vom dynamischen Programmieren. Die I/O-Komplexität des Verfahrens ist gleich der von externem Sortieren, da die I/O-Laufzeit der Methode durch die Laufzeit des Sortierens dominiert wird. Im letzten Teil der Arbeit haben wir die entwickelten Partitionierungsvefahren für den Aufbau von Geo-Histogrammen eingesetzt, da diese ähnlich zu R-Bäumen eine disjunkte Partitionierung des Raums erzeugen. Ergebnisse von intensiven Experimenten zeigen, dass sich unter Verwendung von neuen Partitionierungstechniken sowohl R-Bäume mit besserer Anfrageperformanz als auch Geo-Histogrammen mit besserer Schätzqualität im Vergleich zu Konkurrenzverfahren generieren lassen

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg