5 research outputs found

Improved Average Complexity for Comparison-Based Sorting

Author: DE Knuth
FK Hwang
GK Manacher
J Schulte
LR Ford
M Ayala-Rincón
M Peczarski
M Thanh
Publication date: 02/05/2017

This paper studies the average-case number of comparisons required by sorting algorithms. The information-theoretic lower bound is $n \lg n - 1.4427n + O(\log n)$. For many efficient algorithms, the leading $n \lg n$ term is easy to achieve, so our focus is on the (negative) constant factor of the linear term. The current best value is $-1.3999$, for the MergeInsertion sort. Our new value is $-1.4106$, narrowing the gap to the lower bound by some $25\%$. An important building block of our algorithm is "two-element insertion," which inserts two numbers $A$ and $B$, with $A<B$, into a sorted sequence $T$. This insertion algorithm is still sufficiently simple for rigorous mathematical analysis and works well for a certain range of the length of $T$ for which the simple binary insertion does not, thus allowing us to take a complementary approach with the binary insertion.

Comment: 21 pages, 2 figures
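The figures quoted in this abstract can be checked numerically. The following is a minimal sketch, not code from the paper: the function names are illustrative assumptions, and it only evaluates $\lg(n!)$ against the two leading terms and recomputes the "some 25%" gap reduction from the three constants given above.

```python
import math

def lg_factorial(n: int) -> float:
    """Information-theoretic lower bound lg(n!) on the number of comparisons."""
    return math.lgamma(n + 1) / math.log(2)

def leading_terms(n: int, c: float = 1.4427) -> float:
    """The two leading terms n lg n - c*n quoted in the abstract."""
    return n * math.log2(n) - c * n

def gap_reduction(lower: float, old: float, new: float) -> float:
    """Fraction of the gap between `old` and `lower` closed by `new`."""
    return (new - old) / (lower - old)

# The difference between lg(n!) and the leading terms should grow slowly,
# in line with the O(log n) remainder.
for n in (10, 100, 1000, 10**6):
    print(n, round(lg_factorial(n), 1), round(lg_factorial(n) - leading_terms(n), 3))

# Linear-term constants: lower bound 1.4427, MergeInsertion 1.3999, new 1.4106.
print(round(gap_reduction(1.4427, 1.3999, 1.4106), 3))  # 0.25
```

The last line reproduces the claim directly: (1.4106 - 1.3999) / (1.4427 - 1.3999) = 0.0107 / 0.0428 = 0.25.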

arXiv.org e-Print Archive

Crossref

On the Optimality of Tape Merge of Two Lists with Similar Size

Author: Li Qian
Sun Xiaoming
Zhang Jialin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th International Symposium on Algorithms and Computation (ISAAC 2016)
Publication date: 01/01/2016

The problem of merging sorted lists in the least number of pairwise comparisons has been solved completely only for a few special cases. Graham and Karp \cite{taocp} independently discovered that the tape merge algorithm is optimal in the worst case when the two lists have the same size. In their seminal papers, Stockmeyer and Yao \cite{yao}, Murphy and Paull \cite{3k3}, and Christen \cite{christen1978optimality} independently showed that when the lists to be merged are of sizes $m$ and $n$ satisfying $m\leq n\leq\lfloor\frac{3}{2}m\rfloor+1$, the tape merge algorithm is optimal in the worst case. This paper extends this result by showing that the tape merge algorithm is optimal in the worst case whenever the size of one list is no larger than 1.52 times the size of the other. The main tool we use to prove lower bounds is Knuth's adversary methods \cite{taocp}. In addition, we show that the lower bound cannot be improved to 1.8 via Knuth's adversary methods. We also develop a new inequality about Knuth's adversary methods, which might be interesting in its own right. Moreover, we design a simple procedure to achieve a constant improvement of the upper bounds for $2m-2\leq n\leq 3m$.
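For readers unfamiliar with the term, tape merge is the standard linear two-way merge. The sketch below is an illustration under that reading, not code from the paper: `tape_merge` counts pairwise comparisons, and the hypothetical helper `in_classical_range` tests the size condition $m\leq n\leq\lfloor\frac{3}{2}m\rfloor+1$ quoted in the abstract.

```python
def tape_merge(a, b):
    """Merge two sorted lists; return (merged_list, number_of_comparisons)."""
    merged, comparisons = [], 0
    i = j = 0
    while i < len(a) and j < len(b):
        comparisons += 1              # one pairwise comparison per step
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One list is exhausted; the rest is copied without further comparisons.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged, comparisons

def in_classical_range(m, n):
    """True when m <= n <= floor(3m/2) + 1 (hypothetical helper name)."""
    return m <= n <= (3 * m) // 2 + 1

merged, used = tape_merge([1, 3, 5], [2, 4, 6])
print(merged, used)  # [1, 2, 3, 4, 5, 6] 5
```

On perfectly interleaved inputs such as the one above, the merge uses m + n - 1 comparisons, which is its worst case.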

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

The production of partial orders

Author: Schönhage A.
Publication date: 01/01/1976

Numérisation de Documents Anciens Mathématiques

Algorithms and Data Structures for In-Memory Text Search Engines

Author: Transier Frederik
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2010

KITopen

Efficient Evaluation of Set Expressions

Author: Mirzazadeh Mehdi
Publication venue: University of Waterloo
Publication date: 01/01/2014

In this thesis, we study the problem of evaluating set expressions over sorted sets in the comparison model. The problem arises in the context of evaluating search queries in text database systems: most text search engines maintain an inverted list that records, for each possible word, the set of documents containing it. Thus, answering a query reduces to computing the union, the intersection, or a more complex set expression over the sets of documents containing the words in the query. As a first step, for a given expression over a number of sets, we investigate the worst-case complexity of evaluating the expression in terms of the sizes of the sets. We prove lower bounds and provide algorithms whose running time matches them up to a constant factor. We then refine the problem further and design an algorithm that computes such expressions according to the degree to which the input sets are interleaved, rather than considering only set sizes. We prove the optimality of our algorithm by presenting a matching lower bound sensitive to the interleaving measure. The algorithms we present differ in the set operators they allow in input expressions. We provide algorithms that are worst-case optimal for inputs with union, intersection, and symmetric difference operators. One of the algorithms we provide also supports minus and complement operators and is conjectured to be optimal when an input is allowed to contain these operators as well. We also provide a worst-case optimal algorithm for the form of the problem where the input may contain "threshold" operators, which generalize union and intersection operators: for a number t, a t-threshold operator selects elements that appear in at least t of the operand sets. Finally, the adaptive algorithm we provide supports union and intersection operators.
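The t-threshold operator defined above can be illustrated with a naive baseline. This is a sketch under stated assumptions, not the thesis's algorithm: it scans every element once, so it is neither adaptive nor comparison-optimal, and the name `threshold` and the sample posting lists are invented for illustration. Each input is assumed to be a sorted list without duplicates.

```python
from heapq import merge
from itertools import groupby

def threshold(sorted_sets, t):
    """Return, in sorted order, the elements occurring in at least t inputs.

    Assumes every input list is sorted and free of duplicates, so the length
    of each run of equal values in the merged stream equals the number of
    input sets that contain that value.
    """
    result = []
    for value, run in groupby(merge(*sorted_sets)):
        if sum(1 for _ in run) >= t:
            result.append(value)
    return result

# Hypothetical posting lists (document IDs) for three query words.
docs = [[1, 3, 5, 7], [3, 4, 5], [5, 7, 9]]
print(threshold(docs, 1))          # union: [1, 3, 4, 5, 7, 9]
print(threshold(docs, 2))          # at least two sets: [3, 5, 7]
print(threshold(docs, len(docs)))  # intersection: [5]
```

Setting t = 1 gives the union and t equal to the number of operands gives the intersection, which is the sense in which threshold operators generalize both.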

CiteSeerX

University of Waterloo's Institutional Repository