
    Efficient Optimization for Rank-based Loss Functions

    The accuracy of information retrieval systems is often measured using complex loss functions such as average precision (AP) or normalized discounted cumulative gain (NDCG). Given a set of positive and negative samples, the parameters of a retrieval system can be estimated by minimizing these loss functions. However, the non-differentiability and non-decomposability of these loss functions do not allow for simple gradient-based optimization algorithms. This issue is generally circumvented either by optimizing a structured hinge-loss upper bound on the loss function or by using asymptotic methods such as the direct-loss-minimization framework. Yet the high computational complexity of loss-augmented inference, which both frameworks require, prohibits their use on large training data sets. To alleviate this deficiency, we present a novel quicksort-flavored algorithm for a large class of non-decomposable loss functions. We provide a complete characterization of the loss functions that are amenable to our algorithm, and show that it includes both AP- and NDCG-based loss functions. Furthermore, we prove that no comparison-based algorithm can asymptotically improve upon the computational complexity of our approach. We demonstrate the effectiveness of our approach by optimizing the structured hinge-loss upper bound of the AP and NDCG losses when learning models for a variety of vision tasks. Our approach provides significantly better results than simpler decomposable loss functions, while requiring comparable training time. Comment: 15 pages, 2 figures
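    For concreteness, a minimal sketch of the AP loss at the heart of this paper; this is a plain evaluation of average precision from scores and binary labels, not the authors' quicksort-flavored loss-augmented inference, and the function and parameter names are illustrative:

        # Illustrative sketch: the AP loss on a scored sample set.
        # `scores` and `labels` are hypothetical names.
        def average_precision(scores, labels):
            """AP for binary labels (1 = positive); higher score ranks first."""
            ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
            num_pos = sum(labels)
            hits, ap = 0, 0.0
            for i, (_, label) in enumerate(ranked, start=1):
                if label == 1:
                    hits += 1
                    ap += hits / i  # precision at each positive's rank
            return ap / num_pos

        def ap_loss(scores, labels):
            return 1.0 - average_precision(scores, labels)

        print(ap_loss([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))  # 1 - (1 + 2/3)/2 ~ 0.167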

    Just Sort It! A Simple and Effective Approach to Active Preference Learning

    We address the problem of learning a ranking by using adaptively chosen pairwise comparisons. Our goal is to recover the ranking accurately but to sample the comparisons sparingly. If all comparison outcomes are consistent with the ranking, the optimal solution is to use an efficient sorting algorithm, such as Quicksort. But how do sorting algorithms behave if some comparison outcomes are inconsistent with the ranking? We give favorable guarantees for Quicksort under the popular Bradley-Terry model, under natural assumptions on the parameters. Furthermore, we empirically demonstrate that sorting algorithms lead to a very simple and effective active learning strategy: repeatedly sort the items. This strategy performs as well as state-of-the-art methods (and much better than random sampling) at a minuscule fraction of the computational cost. Comment: Accepted at ICML 2017
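    A minimal sketch of the idea (illustrative only; the utilities and the repetition scheme are assumptions, not the paper's exact procedure): run Quicksort where each comparison is a noisy draw from a Bradley-Terry model, and sort repeatedly to average out the noise:

        import math, random

        def bt_compare(u_i, u_j):
            """Noisy comparison: i beats j with Bradley-Terry probability
            exp(u_i) / (exp(u_i) + exp(u_j))."""
            return random.random() < 1.0 / (1.0 + math.exp(u_j - u_i))

        def noisy_quicksort(items, utility):
            """Quicksort driven by noisy pairwise comparisons (best item first)."""
            if len(items) <= 1:
                return items
            pivot, rest = items[0], items[1:]
            better = [x for x in rest if bt_compare(utility[x], utility[pivot])]
            worse = [x for x in rest if x not in better]
            return (noisy_quicksort(better, utility) + [pivot]
                    + noisy_quicksort(worse, utility))

        # "Just sort it": repeated sorting smooths out inconsistent comparisons.
        utility = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}  # hypothetical true utilities
        rank_sum = {x: 0 for x in utility}
        for _ in range(50):
            for r, x in enumerate(noisy_quicksort(list(utility), utility)):
                rank_sum[x] += r
        print(sorted(utility, key=lambda x: rank_sum[x]))  # likely ['a', 'b', 'c', 'd']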

    Engineering Parallel String Sorting

    We discuss how string sorting algorithms can be parallelized on modern multi-core shared-memory machines. As a synthesis of the best sequential string sorting algorithms and successful parallel sorting algorithms for atomic objects, we first propose string sample sort. The algorithm makes effective use of the memory hierarchy, uses additional word-level parallelism, and largely avoids branch mispredictions. We then focus on NUMA architectures, and develop parallel multiway LCP-merge and LCP-mergesort to reduce the number of random memory accesses to remote nodes. Additionally, we parallelize variants of multikey quicksort and radix sort that are also useful in certain situations. Comprehensive experiments on five current multi-core platforms are then reported and discussed. The experiments show that our implementations scale very well on real-world inputs and modern machines. Comment: 46 pages, extension of "Parallel String Sample Sort" arXiv:1305.1157
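    For background, a single-threaded sketch of multikey quicksort, one of the building blocks the paper parallelizes (simplified and sequential; not the paper's implementation):

        def multikey_quicksort(strings, depth=0):
            """Multikey (three-way radix) quicksort: partition on the character
            at position `depth`; -1 marks end-of-string and sorts first."""
            if len(strings) <= 1:
                return strings
            ch = lambda s: ord(s[depth]) if depth < len(s) else -1
            pivot = ch(strings[len(strings) // 2])
            lt = [s for s in strings if ch(s) < pivot]
            eq = [s for s in strings if ch(s) == pivot]
            gt = [s for s in strings if ch(s) > pivot]
            if pivot >= 0:  # recurse deeper only while the pivot character exists
                eq = multikey_quicksort(eq, depth + 1)
            return multikey_quicksort(lt, depth) + eq + multikey_quicksort(gt, depth)

        print(multikey_quicksort(["banana", "band", "ban", "apple", "ape"]))
        # ['ape', 'apple', 'ban', 'banana', 'band']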

    Notes on the applicability of contraction method for stable limit laws

    We present a proof of the classical stable limit laws using the contraction method in combination with the Zolotarev metric. Furthermore, a stable limit law is proved for scaled sums of growing sequences. This limit law is alternatively formulated for sequences of random variables defined by a simple degenerate recursion.
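    For context, the standard contraction-method template (background notation, not quoted from the paper): a sequence satisfying a distributional recurrence is shown to converge to the fixed point of a limit map, with contraction measured in the Zolotarev metric:

        % Standard contraction-method setup (background, not from the paper):
        \[
          Y_n \stackrel{d}{=} \sum_{r=1}^{K} A_r(n)\, Y^{(r)}_{I_r(n)} + b(n),
          \qquad
          Y \stackrel{d}{=} \sum_{r=1}^{K} A_r\, Y^{(r)} + b,
        \]
        % where the Y^{(r)} are i.i.d. copies of the limit Y, independent of
        % (A_1, ..., A_K, b). Contraction is measured in the Zolotarev metric
        \[
          \zeta_s(X, Y) = \sup_{f \in \mathcal{F}_s}
            \bigl|\, \mathbb{E} f(X) - \mathbb{E} f(Y) \,\bigr|,
        \]
        % with F_s the class of functions whose floor(s)-th derivative is
        % Hoelder continuous of order s - floor(s) with constant 1.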

    A Unified Approach to Tail Estimates for Randomized Incremental Construction

    By combining several interesting applications of random sampling in geometric algorithms, such as point location, linear programming, segment intersections, and binary space partitioning, Clarkson and Shor [Kenneth L. Clarkson and Peter W. Shor, 1989] developed a general framework of randomized incremental construction (RIC). The basic idea is to add objects in a random order and show that this approach yields efficient/optimal bounds on expected running time. Even quicksort can be viewed as a special case of this paradigm. However, unlike quicksort, for most of these problems sharper tail estimates on their running times are not known. Barring some promising attempts in [Kurt Mehlhorn et al., 1993; Kenneth L. Clarkson et al., 1992; Raimund Seidel, 1991], the general question remains unresolved. In this paper we present a general technique to obtain tail estimates for RIC and provide applications to some fundamental problems, such as Delaunay triangulations and the construction of visibility maps of intersecting line segments. The main result of the paper is derived from a new and careful application of Freedman's [David Freedman, 1975] inequality for martingale concentration, which overcomes the bottleneck of the better-known Azuma-Hoeffding inequality. Further, we explore instances where an RIC-based algorithm may not have inverse polynomial tail estimates. In particular, we show that the RIC time bounds for the trapezoidal map can encounter a running time of Omega(n log n log log n) with probability exceeding 1/sqrt(n). This rules out inverse polynomial concentration bounds within a constant factor of the O(n log n) expected running time.
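    For reference, Freedman's inequality in a standard form (the paper's exact statement may differ): for a martingale (M_k) with M_0 = 0 and increments bounded by R,

        % Freedman's martingale inequality, standard form. V_n is the
        % predictable quadratic variation of the martingale.
        \[
          V_n = \sum_{k=1}^{n} \mathbb{E}\bigl[(M_k - M_{k-1})^2 \,\big|\, \mathcal{F}_{k-1}\bigr],
          \qquad
          \Pr\bigl(M_n \ge t \ \text{and}\ V_n \le \sigma^2\bigr)
          \le \exp\!\left(-\frac{t^2}{2(\sigma^2 + Rt/3)}\right).
        \]

    Azuma-Hoeffding uses only the worst-case increment bound R and gives roughly exp(-t^2 / (2 n R^2)); conditioning on the typically much smaller variance V_n is what enables the sharper tail estimates.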

    Comparison and Enhancement of Sorting Algorithms

    Searching, arranging, and ordering lists of items are among the fundamental problems in computer science. Sorting is an important data structure operation that makes these tasks much easier and helps in searching and organizing information. Many sorting algorithms have been developed to improve performance in terms of computational complexity, efficiency, memory, space, speed, and other factors. Although there is an enormous number of sorting algorithms, the searching and sorting problem has attracted a great deal of research and experimentation, because efficient sorting is important for optimizing the use of other algorithms. This paper compares the performance of seven existing sorting algorithms, namely Bubble sort, Merge sort, Quick sort, Heap sort, Insertion sort, Shell sort, and Selection sort, and provides an enhancement to these algorithms to improve their sorting performance. In many cases the enhanced versions were found to be faster than the existing algorithms.
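    A minimal sketch of this kind of comparison (the timing harness, the algorithm selection, and the early-exit enhancement shown here are illustrative, not the paper's):

        import random, time

        def bubble_sort(a):
            a = list(a)
            for i in range(len(a) - 1, 0, -1):
                swapped = False
                for j in range(i):
                    if a[j] > a[j + 1]:
                        a[j], a[j + 1] = a[j + 1], a[j]
                        swapped = True
                if not swapped:  # common enhancement: stop early once sorted
                    break
            return a

        def insertion_sort(a):
            a = list(a)
            for i in range(1, len(a)):
                key, j = a[i], i - 1
                while j >= 0 and a[j] > key:
                    a[j + 1] = a[j]
                    j -= 1
                a[j + 1] = key
            return a

        data = [random.randint(0, 10**6) for _ in range(2000)]
        for name, fn in [("bubble", bubble_sort), ("insertion", insertion_sort),
                         ("timsort", sorted)]:
            t0 = time.perf_counter()
            assert fn(data) == sorted(data)
            print(f"{name:9s} {time.perf_counter() - t0:.4f}s")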