Efficient Optimization for Rank-based Loss Functions
The accuracy of information retrieval systems is often measured using complex
loss functions such as the average precision (AP) or the normalized discounted
cumulative gain (NDCG). Given a set of positive and negative samples, the
parameters of a retrieval system can be estimated by minimizing these loss
functions. However, the non-differentiability and non-decomposability of these
loss functions do not allow for simple gradient-based optimization
algorithms. This issue is generally circumvented by either optimizing a
structured hinge-loss upper bound to the loss function or by using asymptotic
methods like the direct-loss minimization framework. Yet, the high
computational complexity of loss-augmented inference, which is necessary for
both frameworks, prohibits its use on large training data sets. To
alleviate this deficiency, we present a novel quicksort flavored algorithm for
a large class of non-decomposable loss functions. We provide a complete
characterization of the loss functions that are amenable to our algorithm, and
show that it includes both AP and NDCG based loss functions. Furthermore, we
prove that no comparison based algorithm can improve upon the computational
complexity of our approach asymptotically. We demonstrate the effectiveness of
our approach in the context of optimizing the structured hinge loss upper bound
of AP and NDCG loss for learning models for a variety of vision tasks. We show
that our approach provides significantly better results than simpler
decomposable loss functions, while requiring a comparable training time.
Comment: 15 pages, 2 figures
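To see why a rank-based loss such as AP is non-decomposable, note that each relevant item contributes a precision term that depends on every higher-ranked item, so the loss cannot be written as a sum of independent per-sample terms. A minimal sketch (our illustration, not the paper's algorithm):

```python
def average_precision(labels):
    """AP of a ranked list of binary relevance labels, ordered from
    highest- to lowest-scored item."""
    hits, precisions = 0, []
    for i, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)  # precision at this relevant rank
    return sum(precisions) / max(hits, 1)

# Moving one item changes the precision at every later relevant rank,
# which is what makes AP non-decomposable.
print(average_precision([1, 0, 1, 1, 0]))  # ≈ 0.806
```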
Just Sort It! A Simple and Effective Approach to Active Preference Learning
We address the problem of learning a ranking by using adaptively chosen
pairwise comparisons. Our goal is to recover the ranking accurately but to
sample the comparisons sparingly. If all comparison outcomes are consistent
with the ranking, the optimal solution is to use an efficient sorting
algorithm, such as Quicksort. But how do sorting algorithms behave if some
comparison outcomes are inconsistent with the ranking? We give favorable
guarantees for Quicksort for the popular Bradley-Terry model, under natural
assumptions on the parameters. Furthermore, we empirically demonstrate that
sorting algorithms lead to a very simple and effective active learning
strategy: repeatedly sort the items. This strategy performs as well as
state-of-the-art methods (and much better than random sampling) at a minuscule
fraction of the computational cost.
Comment: Accepted at ICML 201
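Under the Bradley-Terry model, item i beats item j with probability w_i / (w_i + w_j) for positive strengths w. A hedged sketch of Quicksort driven by such noisy comparisons (the weights and function names are illustrative; this is not the paper's exact procedure):

```python
import random

def bt_compare(wi, wj, rng):
    """One noisy comparison: True means 'i beats j', which happens with
    Bradley-Terry probability wi / (wi + wj)."""
    return rng.random() < wi / (wi + wj)

def noisy_quicksort(items, weights, rng):
    """Quicksort where every comparison is a single noisy BT outcome.
    Returns item indices ordered from (estimated) strongest to weakest."""
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]
    stronger, weaker = [], []
    for x in items:
        if x == pivot:
            continue
        # Each pair is compared exactly once, so an error stays local.
        (stronger if bt_compare(weights[x], weights[pivot], rng)
         else weaker).append(x)
    return (noisy_quicksort(stronger, weights, rng) + [pivot]
            + noisy_quicksort(weaker, weights, rng))

rng = random.Random(0)
weights = [1.0, 4.0, 2.0, 8.0]  # illustrative BT strengths
print(noisy_quicksort(list(range(len(weights))), weights, rng))
```

With well-separated strengths the output is usually the true ranking; repeating the sort and aggregating the resulting ranks is the "just sort it" active-learning strategy described above.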
Engineering Parallel String Sorting
We discuss how string sorting algorithms can be parallelized on modern
multi-core shared memory machines. As a synthesis of the best sequential string
sorting algorithms and successful parallel sorting algorithms for atomic
objects, we first propose string sample sort. The algorithm makes effective use
of the memory hierarchy, uses additional word level parallelism, and largely
avoids branch mispredictions. Then we focus on NUMA architectures, and develop
parallel multiway LCP-merge and -mergesort to reduce the number of random
memory accesses to remote nodes. Additionally, we parallelize variants of
multikey quicksort and radix sort that are also useful in certain situations.
Comprehensive experiments on five current multi-core platforms are then
reported and discussed. The experiments show that our implementations scale
very well on real-world inputs and modern machines.
Comment: 46 pages, extension of "Parallel String Sample Sort" arXiv:1305.115
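Multikey quicksort partitions strings three ways on a single character position and recurses one character deeper only in the "equal" bucket. The paper's implementations are parallel C++; the following sequential Python sketch only illustrates the partitioning idea:

```python
def multikey_quicksort(strings, depth=0):
    """Three-way radix quicksort on the character at position `depth`."""
    if len(strings) <= 1:
        return list(strings)
    def ch(s):
        return s[depth] if depth < len(s) else ''  # '' sorts first
    pivot = ch(strings[len(strings) // 2])
    lt = [s for s in strings if ch(s) < pivot]
    eq = [s for s in strings if ch(s) == pivot]
    gt = [s for s in strings if ch(s) > pivot]
    # Only the equal bucket advances to the next character; exhausted
    # strings (pivot == '') are already fully sorted among themselves.
    eq = eq if pivot == '' else multikey_quicksort(eq, depth + 1)
    return multikey_quicksort(lt, depth) + eq + multikey_quicksort(gt, depth)

print(multikey_quicksort(["banana", "app", "apple", "apply", "app"]))
# → ['app', 'app', 'apple', 'apply', 'banana']
```

Because equal-character buckets never re-examine already-matched prefixes, the total work is proportional to the distinguishing prefixes (the LCPs), which is what makes this family attractive for string data.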
Notes on the applicability of contraction method for stable limit laws
We present a proof of the classical stable limit laws using the contraction method in combination with the Zolotarev metric. Furthermore, a stable limit law is proved for scaled sums of growing length. This limit law is alternatively formulated for sequences of random variables defined by a simple degenerate recursion.
A Unified Approach to Tail Estimates for Randomized Incremental Construction
By combining several interesting applications of random sampling in geometric algorithms, such as point location, linear programming, segment intersection, and binary space partitioning, Clarkson and Shor [Kenneth L. Clarkson and Peter W. Shor, 1989] developed a general framework of randomized incremental construction (RIC). The basic idea is to add objects in a random order and show that this approach yields efficient/optimal bounds on expected running time. Even quicksort can be viewed as a special case of this paradigm. However, unlike quicksort, sharper tail estimates on running time are not known for most of these problems. Barring some promising attempts in [Kurt Mehlhorn et al., 1993; Kenneth L. Clarkson et al., 1992; Raimund Seidel, 1991], the general question remains unresolved.
In this paper we present a general technique to obtain tail estimates for RIC and provide applications to some fundamental problems like Delaunay triangulations and the construction of visibility maps of intersecting line segments. The main result of the paper is derived from a new and careful application of Freedman's [David Freedman, 1975] inequality for martingale concentration, which overcomes the bottleneck of the better-known Azuma-Hoeffding inequality. Further, we explore instances where an RIC-based algorithm may not have inverse-polynomial tail estimates. In particular, we show that the RIC time bound for the trapezoidal map can encounter a running time of Omega(n log n log log n) with probability exceeding 1/sqrt(n). This rules out inverse-polynomial concentration bounds within a constant factor of the O(n log n) expected running time.
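The quicksort-as-RIC view can be made concrete: inserting keys in a uniformly random order into a plain binary search tree reproduces quicksort's recursion tree, so the expected O(n log n) comparison count comes from the random insertion order alone. A hedged sketch (our illustration, not code from the paper):

```python
import random

def _insert(node, x):
    """Insert x into an unbalanced BST stored as [key, left, right]."""
    if node is None:
        return [x, None, None]
    if x < node[0]:
        node[1] = _insert(node[1], x)
    else:
        node[2] = _insert(node[2], x)
    return node

def _inorder(node, out):
    if node is not None:
        _inorder(node[1], out)
        out.append(node[0])
        _inorder(node[2], out)

def ric_sort(items, rng=random):
    """Randomized incremental construction: the random insertion order
    alone gives expected O(n log n) work, despite no rebalancing."""
    order = list(items)
    rng.shuffle(order)
    root = None
    for x in order:
        root = _insert(root, x)
    out = []
    _inorder(root, out)
    return out

print(ric_sort([5, 3, 1, 4, 2]))  # → [1, 2, 3, 4, 5]
```

Tail estimates ask how far the actual work strays from this expectation; the paper's martingale analysis addresses exactly that question for geometric RIC algorithms.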
Comparison and Enhancement of Sorting Algorithms
Among the primordial problems in computer science are searching, arranging, and ordering lists of items or information. Sorting is an important data-structure operation that makes these daunting tasks easy and helps in searching and arranging information. Many sorting algorithms have been developed to enhance performance in terms of computational complexity, efficiency, memory, space, speed, and other factors. Although there is an enormous number of sorting algorithms, the searching and sorting problem has attracted a great deal of research and experimentation, because efficient sorting is important to optimizing the use of other algorithms. This paper compares the performance of seven existing sorting algorithms, namely Bubble sort, Merge sort, Quick sort, Heap sort, Insertion sort, Shell sort, and Selection sort, and proposes enhancements to these algorithms to make sorting through them better. In many cases the enhancement was found to be faster than the existing algorithms.
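A toy harness in the spirit of such comparisons (our own sketch; the algorithms, input sizes, and the early-exit tweak to Bubble sort are illustrative, not the paper's experimental setup):

```python
import random
import timeit

def bubble_sort(a):
    a = list(a)
    for i in range(len(a)):
        swapped = False
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:  # a common enhancement: stop once no swaps occur
            break
    return a

def quick_sort(a):
    if len(a) <= 1:
        return list(a)
    pivot = a[len(a) // 2]
    return (quick_sort([x for x in a if x < pivot])
            + [x for x in a if x == pivot]
            + quick_sort([x for x in a if x > pivot]))

data = [random.randrange(10**6) for _ in range(1000)]
for name, fn in [("bubble", bubble_sort), ("quick", quick_sort)]:
    t = timeit.timeit(lambda: fn(data), number=3)
    print(f"{name:>6}: {t:.4f} s")
```

On random data the O(n log n) quicksort dominates; the early-exit tweak helps Bubble sort mainly on nearly sorted inputs.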