54 research outputs found
Heaps and heapsort on secondary storage
A heap structure designed for secondary storage is suggested that tries to make the best use of the available buffer space in primary memory. The heap is a complete multi-way tree, with multi-page blocks of records as nodes, satisfying a generalized heap property. A special feature of the tree is that the nodes may be partially filled, as in B-trees. The structure is complemented with priority-queue operations insert and delete-max. When handling a sequence of $S$ operations, the number of page transfers performed is shown to be $O\left(\sum_{i=1}^{S} \frac{1}{P} \log_{M/P} \frac{N_i}{P}\right)$, where $P$ denotes the number of records fitting into a page, $M$ the capacity of the buffer space in records, and $N_i$ the number of records in the heap prior to the $i$th operation (assuming $P \geq 1$ and $S > M \geq c \cdot P$, where $c$ is a small positive constant). The number of comparisons required when handling the sequence is $O\left(\sum_{i=1}^{S} \log_2 N_i\right)$. Using the suggested data structure we obtain an optimal external heapsort that performs $O\left(\frac{N}{P} \log_{M/P} \frac{N}{P}\right)$ page transfers and $O(N \log_2 N)$ comparisons in the worst case when sorting $N$ records.
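To get a feel for the external heapsort bounds, here is a rough worked instance. The parameter values are assumptions chosen only for illustration (they do not come from the paper), and the constants hidden by the O-notation are ignored. Take $N = 10^9$ records, $P = 10^3$ records per page, and a buffer of $M = 10^6$ records:

$$
\frac{N}{P} = 10^{6}, \qquad \frac{M}{P} = 10^{3}, \qquad \log_{M/P} \frac{N}{P} = \log_{10^{3}} 10^{6} = 2
$$

$$
\frac{N}{P} \log_{M/P} \frac{N}{P} = 2 \times 10^{6} \ \text{page transfers (up to a constant factor)}
$$

$$
N \log_2 N \approx 10^{9} \times 29.9 \approx 3.0 \times 10^{10} \ \text{comparisons (up to a constant factor)}
$$

Under these assumed values the logarithmic factor in the I/O bound is only 2, because its base is the number of pages that fit in the buffer.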
An In-Place Sorting with O(n log n) Comparisons and O(n) Moves
We present the first in-place algorithm for sorting an array of size n that
performs, in the worst case, at most O(n log n) element comparisons and O(n)
element transports.
This solves a long-standing open problem, stated explicitly, e.g., in [J.I.
Munro and V. Raman, Sorting with minimum data movement, J. Algorithms, 13,
374-93, 1992], of whether there exists a sorting algorithm that matches the
asymptotic lower bounds on all computational resources simultaneously.
Two-way replacement selection
The performance of external sorting is highly dependent on the length of the runs generated.
One of the most commonly used run generation strategies is Replacement Selection (RS) because,
on average, it generates runs that are twice the size of the memory available.
However, RS generates shorter runs for data with certain characteristics, such as inputs sorted inversely with respect to the desired output order.
The goal of this project is to propose and analyze two-way replacement selection (2WRS),
which is a generalization of RS obtained by implementing two heaps instead of the single
heap implemented by RS. The appropriate management of these two heaps allows generating runs larger than the memory available in a stable way, i.e., independently of the characteristics of the datasets.
Depending on the changing characteristics of the input dataset,
2WRS assigns a new data record to one or the other heap, and grows or shrinks each heap,
adapting to the increasing or decreasing trend of the dataset.
On average, 2WRS creates runs of at least the length generated by RS,
and longer for datasets that combine increasing and decreasing data subsets.
We tested both algorithms on large datasets with different characteristics:
2WRS performs at least similarly to RS, and achieves speedups over 2.5 when RS fails
to generate large runs.
The project consists of developing an external sorting algorithm based on Replacement Selection that solves the problems inherent to replacement selection.
The student will have to design and implement the algorithm, carry out a statistical study of its efficiency, and compare the time efficiency of the new algorithm with that of replacement selection.
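The run-length behaviour described in this abstract can be reproduced with a minimal sketch of classic, single-heap replacement selection. This is not the 2WRS algorithm and not the authors' implementation. The function name `replacement_selection`, the parameter `memory_size`, and the in-memory lists standing in for files on disk are all assumptions made for illustration.

```python
import heapq
import random

_SENTINEL = object()  # marks the end of the input stream


def replacement_selection(records, memory_size):
    """Split `records` into sorted runs using a heap of at most `memory_size` items.

    Illustrative sketch of classic RS; lists stand in for files on disk.
    """
    if memory_size < 1:
        raise ValueError("memory_size must be at least 1")
    it = iter(records)
    # Heap entries are (run_number, key): a record deferred to the next run
    # sorts after every record that still belongs to the current run.
    heap = []
    for key in it:
        heap.append((0, key))
        if len(heap) == memory_size:
            break
    heapq.heapify(heap)

    runs, current, current_run = [], [], 0
    while heap:
        run_no, key = heapq.heappop(heap)
        if run_no != current_run:
            # Only deferred records remain, so the current run is finished.
            runs.append(current)
            current, current_run = [], run_no
        current.append(key)
        nxt = next(it, _SENTINEL)
        if nxt is not _SENTINEL:
            # A record smaller than the one just written cannot extend this run.
            next_run = run_no if nxt >= key else run_no + 1
            heapq.heappush(heap, (next_run, nxt))
    if current:
        runs.append(current)
    return runs


if __name__ == "__main__":
    random.seed(0)
    shuffled = random.sample(range(10_000), 10_000)
    runs = replacement_selection(shuffled, 100)
    print("random input, mean run length:", sum(map(len, runs)) / len(runs))

    # Reverse-sorted input is the bad case: every run holds exactly memory_size records.
    worst = replacement_selection(range(999, -1, -1), 100)
    print("reverse-sorted input, run lengths:", sorted({len(r) for r in worst}))
```

On shuffled input the mean run length should come out near twice `memory_size`, while reverse-sorted input yields runs of exactly `memory_size` records. The second case is the weakness that motivates a second heap in 2WRS.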
RAM-Efficient External Memory Sorting
In recent years a large number of problems have been considered in external
memory models of computation, where the complexity measure is the number of
blocks of data that are moved between slow external memory and fast internal
memory (also called I/Os). In practice, however, internal memory time often
dominates the total running time once I/O-efficiency has been obtained. In this
paper we study algorithms for fundamental problems that are simultaneously
I/O-efficient and internal memory efficient in the RAM model of computation. Comment: To appear in Proceedings of ISAAC 2013, getting the Best Paper Award
Data Structures & Algorithm Analysis in C++
This is the textbook for CSIS 215 at Liberty University.