18,031 research outputs found
Twenty-Five Comparators is Optimal when Sorting Nine Inputs (and Twenty-Nine for Ten)
This paper describes a computer-assisted non-existence proof of nine-input
sorting networks consisting of 24 comparators, hence showing that the
25-comparator sorting network found by Floyd in 1964 is optimal. As a
corollary, we obtain that the 29-comparator network found by Waksman in 1969 is
optimal when sorting ten inputs.
This closes the two smallest open instances of the optimal size sorting
network problem, which have been open since the results of Floyd and Knuth from
1966 proving optimality for sorting networks of up to eight inputs.
The proof involves a combination of two methodologies: one based on
exploiting the abundance of symmetries in sorting networks, and the other,
based on an encoding of the problem to that of satisfiability of propositional
logic. We illustrate that, while each of these can single handed solve smaller
instances of the problem, it is their combination which leads to an efficient
solution for nine inputs.Comment: 18 page
Formalizing Size-Optimal Sorting Networks: Extracting a Certified Proof Checker
Since the proof of the four color theorem in 1976, computer-generated proofs
have become a reality in mathematics and computer science. During the last
decade, we have seen formal proofs using verified proof assistants being used
to verify the validity of such proofs.
In this paper, we describe a formalized theory of size-optimal sorting
networks. From this formalization we extract a certified checker that
successfully verifies computer-generated proofs of optimality on up to 8
inputs. The checker relies on an untrusted oracle to shortcut the search for
witnesses on more than 1.6 million NP-complete subproblems.Comment: IMADA-preprint-c
Optimizing a Certified Proof Checker for a Large-Scale Computer-Generated Proof
In recent work, we formalized the theory of optimal-size sorting networks
with the goal of extracting a verified checker for the large-scale
computer-generated proof that 25 comparisons are optimal when sorting 9 inputs,
which required more than a decade of CPU time and produced 27 GB of proof
witnesses. The checker uses an untrusted oracle based on these witnesses and is
able to verify the smaller case of 8 inputs within a couple of days, but it did
not scale to the full proof for 9 inputs. In this paper, we describe several
non-trivial optimizations of the algorithm in the checker, obtained by
appropriately changing the formalization and capitalizing on the symbiosis with
an adequate implementation of the oracle. We provide experimental evidence of
orders of magnitude improvements to both runtime and memory footprint for 8
inputs, and actually manage to check the full proof for 9 inputs.Comment: IMADA-preprint-c
Even faster sorting of (not only) integers
In this paper we introduce RADULS2, the fastest parallel sorter based on
radix algorithm. It is optimized to process huge amounts of data making use of
modern multicore CPUs. The main novelties include: extremely optimized
algorithm for handling tiny arrays (up to about a hundred of records) that
could appear even billions times as subproblems to handle and improved
processing of larger subarrays with better use of non-temporal memory stores
Conclave: secure multi-party computation on big data (extended TR)
Secure Multi-Party Computation (MPC) allows mutually distrusting parties to
run joint computations without revealing private data. Current MPC algorithms
scale poorly with data size, which makes MPC on "big data" prohibitively slow
and inhibits its practical use.
Many relational analytics queries can maintain MPC's end-to-end security
guarantee without using cryptographic MPC techniques for all operations.
Conclave is a query compiler that accelerates such queries by transforming them
into a combination of data-parallel, local cleartext processing and small MPC
steps. When parties trust others with specific subsets of the data, Conclave
applies new hybrid MPC-cleartext protocols to run additional steps outside of
MPC and improve scalability further.
Our Conclave prototype generates code for cleartext processing in Python and
Spark, and for secure MPC using the Sharemind and Obliv-C frameworks. Conclave
scales to data sets between three and six orders of magnitude larger than
state-of-the-art MPC frameworks support on their own. Thanks to its hybrid
protocols, Conclave also substantially outperforms SMCQL, the most similar
existing system.Comment: Extended technical report for EuroSys 2019 pape
Engineering faster sorters for small sets of items
Sorting a set of items is a task that can be useful by itself or as a building block for more complex operations. That is why a lot of effort has been put into finding sorting algorithms that sort large sets as efficiently as possible. But the more sophisticated and complex the algorithms become, the less efficient they are for small sets of items due to large constant factors. A relatively simple sorting algorithm that is often used as a base case sorter is insertion sort, because it has small code size and small constant factors influencing its execution time. We aim to determine if there is a faster way to sort small sets of items to provide an efficient base case sorter. We looked at sorting networks, at how they can improve the speed of sorting few elements, and how to implement them in an efficient manner using conditional moves. Since sorting networks need to be implemented explicitly for each set size, providing networks for larger sizes becomes less efficient due to increased code sizes. To also enable the sorting of slightly larger base cases, we adapted sample sort to Register Sample Sort, to break down those larger sets into sizes that can in turn be sorted by sorting networks. From our experiments we found that when sorting only small sets of integers, the sorting networks outperform insertion sort by a factor of at least 1.76 for any array size between six and 16, and by a factor of 2.72 on average across all machines and array sizes. When integrating sorting networks as a base case sorter into Quicksort, we achieved far less performance improvements over using insertion sort, which is probably due to the networks having a larger code size and cluttering the L1 instruction cache. The same effect occurs when including Register Sample Sort as a base case sorter for IPS4o. But for x86 machines that have a larger L1 instruction cache of 64 KiB or more, we obtained speedups of 12.7% when using sorting networks as a base case sorter in std::sort, and of 5%–6% when integrating Register Sample Sort as a base case sorter into IPS4o, each in comparison to using insertion sort as the base case sorter. In conclusion, the desired improvement in speed could only be achieved under special circumstances, but the results clearly show the potential of using conditional moves in the field of sorting algorithms
The Economics of International Differences in Educational Achievement
An emerging economic literature over the past decade has made use of international tests of educational achievement to analyze the determinants and impacts of cognitive skills. The cross-country comparative approach provides a number of unique advantages over national studies: It can exploit institutional variation that does not exist within countries; draw on much larger variation than usually available within any country; reveal whether any result is country-specific or more general; test whether effects are systematically heterogeneous in different settings; circumvent selection issues that plague within-country identification by using system-level aggregated measures; and uncover general-equilibrium effects that often elude studies in a single country. The advantages come at the price of concerns about the limited number of country observations, the cross-sectional character of most available achievement data, and possible bias from unobserved country factors like culture. This chapter reviews the economic literature on international differences in educational achievement, restricting itself to comparative analyses that are not possible within single countries and placing particular emphasis on studies trying to address key issues of empirical identification. While quantitative input measures show little impact, several measures of institutional structures and of the quality of the teaching force can account for significant portions of the large international differences in the level and equity of student achievement. Variations in skills measured by the international tests are in turn strongly related to individual labor-market outcomes and, perhaps more importantly, to cross-country variations in economic growth.human capital, cognitive skills, international student achievement tests, education production function
- …