Radix Sorting With No Extra Space
It is well known that n integers in the range [1,n^c] can be sorted in O(n)
time in the RAM model using radix sorting. More generally, integers in any
range [1,U] can be sorted in O(n sqrt{loglog n}) time. However, these
algorithms use O(n) words of extra memory. Is this necessary?
We present a simple, stable, integer sorting algorithm for words of size
O(log n), which works in O(n) time and uses only O(1) words of extra memory on
a RAM model. This is the integer sorting case most useful in practice. We
extend this result with the same bounds to the case when the keys are read-only,
which is of theoretical interest. Another interesting question is the case of
arbitrary c. Here we present a black-box transformation from any RAM sorting
algorithm to a sorting algorithm which uses only O(1) extra space and has the
same running time. This settles the complexity of in-place sorting in terms of
the complexity of sorting.
Comment: Full version of paper accepted to ESA 2007 (17 pages).
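The linear-time bound in the first sentence comes from least-significant-digit radix sort with stable counting passes in a base of roughly n. A minimal sketch of that classical version (which uses O(n) extra space per pass, in contrast with the paper's O(1)-space result):

```python
def radix_sort(a, base=None):
    """Stable LSD radix sort for non-negative integers.

    With base ~ n, keys below n^c need only c passes, giving O(n)
    total time. Uses O(n) extra space; the paper's contribution is
    matching this running time with only O(1) extra words.
    """
    if not a:
        return a
    base = base or max(2, len(a))
    m = max(a)
    shift = 1
    while shift <= m:
        buckets = [[] for _ in range(base)]
        for x in a:
            buckets[(x // shift) % base].append(x)  # stable: preserves order
        a = [x for b in buckets for x in b]
        shift *= base
    return a
```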
CoNLL-Merge: Efficient Harmonization of Concurrent Tokenization and Textual Variation
The proper detection of tokens in running text represents the initial processing step in modular NLP pipelines. But strategies for defining these minimal units can differ, and conflicting analyses of the same text seriously limit the integration of subsequent linguistic annotations into a shared representation. As a solution, we introduce CoNLL-Merge, a practical tool for harmonizing TSV-related data models as they occur, e.g., in multi-layer corpora with non-sequential, concurrent tokenizations, but also in ensemble combinations in Natural Language Processing. CoNLL-Merge works unsupervised, requires no manual intervention or external data sources, and comes with a flexible API for fully automated merging routines, validity and sanity checks. Users can choose from several merging strategies and either preserve a reference tokenization (with possible losses of annotation granularity), create a common tokenization layer consisting of minimal shared subtokens (loss-less in terms of annotation granularity, destructive against a reference tokenization), or present tokenization clashes (loss-less and non-destructive, but introducing empty tokens as place-holders for unaligned elements). We demonstrate the applicability of the tool on two use cases from natural language processing and computational philology.
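The "minimal shared subtokens" strategy can be illustrated with a small sketch: cut the text at every token boundary contributed by either layer. This is an illustration of the idea only, not CoNLL-Merge's actual API:

```python
def shared_subtokens(tok_a, tok_b):
    """Split two tokenizations of the same text into the minimal
    common subtokens, by taking the union of both layers' character
    boundaries. Loss-less: every original token is a concatenation
    of consecutive subtokens.
    """
    text = "".join(tok_a)
    assert text == "".join(tok_b), "tokenizations must cover the same text"
    bounds = {0, len(text)}
    for toks in (tok_a, tok_b):          # collect boundaries from both layers
        pos = 0
        for t in toks:
            pos += len(t)
            bounds.add(pos)
    cuts = sorted(bounds)
    return [text[i:j] for i, j in zip(cuts, cuts[1:])]

# "doesn't" split differently by two tokenizers
print(shared_subtokens(["does", "n't"], ["doesn", "'t"]))
# → ['does', 'n', "'t"]
```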
Engineering Parallel String Sorting
We discuss how string sorting algorithms can be parallelized on modern
multi-core shared memory machines. As a synthesis of the best sequential string
sorting algorithms and successful parallel sorting algorithms for atomic
objects, we first propose string sample sort. The algorithm makes effective use
of the memory hierarchy, uses additional word level parallelism, and largely
avoids branch mispredictions. Then we focus on NUMA architectures, and develop
parallel multiway LCP-merge and -mergesort to reduce the number of random
memory accesses to remote nodes. Additionally, we parallelize variants of
multikey quicksort and radix sort that are also useful in certain situations.
Comprehensive experiments on five current multi-core platforms are then
reported and discussed. The experiments show that our implementations scale
very well on real-world inputs and modern machines.Comment: 46 pages, extension of "Parallel String Sample Sort" arXiv:1305.115
Revising the U.S. Vertical Merger Guidelines: Policy Issues and an Interim Guide for Practitioners
Mergers and acquisitions are a major component of antitrust law and practice. The U.S. antitrust agencies spend a majority of their time on merger enforcement. The focus of most merger review at the agencies involves horizontal mergers, that is, mergers among firms that compete at the same level of production or distribution.
Vertical mergers combine firms at different levels of production or distribution. In the simplest case, a vertical merger joins together a firm that produces an input (and competes in an input market) with a firm that uses that input to produce output (and competes in an output market).
Over the years, the agencies have issued Merger Guidelines that outline the type of analysis carried out by the agencies and the agencies’ enforcement intentions in light of the state of the law. These Guidelines are used by agency staff in evaluating mergers, as well as by outside counsel and the courts.
Guidelines for vertical mergers were issued in 1968 and revised in 1984, but have not been updated since. Those Guidelines are now woefully out of date. They do not reflect current economic thinking about vertical mergers. Nor do they reflect current agency practice. Nor do they reflect the analytic approach taken in the 2010 Horizontal Merger Guidelines. As a result, practitioners and firms lack the benefits of up-to-date guidance from the U.S. enforcement agencies.
Worst-Case Efficient Sorting with QuickMergesort
The two most prominent solutions for the sorting problem are Quicksort and
Mergesort. While Quicksort is very fast on average, Mergesort additionally
gives worst-case guarantees, but needs extra space for a linear number of
elements. Worst-case efficient in-place sorting, however, remains a challenge:
the standard solution, Heapsort, suffers from a bad cache behavior and is also
not overly fast for in-cache instances.
In this work we present median-of-medians QuickMergesort (MoMQuickMergesort),
a new variant of QuickMergesort, which combines Quicksort with Mergesort
allowing the latter to be implemented in place. Our new variant applies the
median-of-medians algorithm for selecting pivots in order to circumvent the
quadratic worst case. Indeed, we show that it uses at most n log n + O(n)
comparisons for n large enough.
We experimentally confirm the theoretical estimates and show that the new
algorithm outperforms Heapsort by far and is only around 10% slower than
Introsort (the std::sort implementation of libstdc++), which has a rather poor
guarantee for the worst case. We also simulate the worst case, which is only
around 10% slower than the average case. In particular, the new algorithm is a
natural candidate to replace Heapsort as a worst-case stopper in Introsort.
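The pivot rule named in the abstract is the classical median-of-medians selection: the chosen pivot is guaranteed to lie in the middle portion of the input, which is what rules out the quadratic worst case. A minimal sketch of the pivot selection alone (not the authors' full QuickMergesort):

```python
def median_of_medians(a):
    """Deterministic pivot choice in O(n): split into groups of 5,
    take each group's median, and recurse on the list of medians.
    The result has rank between roughly 30% and 70% of len(a).
    """
    if len(a) <= 5:
        return sorted(a)[len(a) // 2]
    medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
               for i in range(0, len(a), 5)]
    return median_of_medians(medians)
```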
Generalized gap acceptance models for unsignalized intersections
This paper contributes to the modeling and analysis of unsignalized
intersections. In classical gap acceptance models vehicles on the minor road
accept any gap greater than the critical gap, and reject gaps below this
threshold, where the gap is the time between two subsequent vehicles on the
major road. The main contribution of this paper is to develop a series of
generalizations of existing models, thus increasing the model's practical
applicability significantly. First, we incorporate driver impatience behavior
while allowing for a realistic merging behavior; we do so by distinguishing
between the critical gap and the merging time, thus allowing multiple vehicles
to use a sufficiently large gap. Incorporating this feature is particularly
challenging in models with driver impatience. Secondly, we allow for multiple
classes of gap acceptance behavior, enabling us to distinguish between
different driver types and/or different vehicle types. Thirdly, we use the
novel MX/SM2/1 queueing model, which has batch arrivals, dependent service
times, and a different service-time distribution for vehicles arriving in an
empty queue on the minor road (where `service time' refers to the time required
to find a sufficiently large gap). This setup facilitates the analysis of the
service-time distribution of an arbitrary vehicle on the minor road and of the
queue length on the minor road. In particular, we can compute the mean service
time, thus enabling the evaluation of the capacity for the minor road vehicles.
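The classical critical-gap rule described above (accept the first major-road gap exceeding a fixed threshold) can be illustrated with a tiny Monte Carlo sketch. Parameter values here are hypothetical, and the sketch deliberately omits the paper's extensions (impatience, merging times, multiple driver classes):

```python
import random

def mean_service_time(critical_gap, major_rate, trials=10000, seed=1):
    """Estimate the minor-road 'service time': the wait until the first
    gap on the major road is at least the critical gap.  Major-road
    headways are exponential (Poisson traffic); classical model only.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        waited = 0.0
        while True:
            gap = rng.expovariate(major_rate)
            if gap >= critical_gap:   # accept: the driver merges
                break
            waited += gap             # reject: wait out the whole gap
        total += waited
    return total / trials
```

For this memoryless special case the mean has a closed form, (e^(rate * T) - 1) / rate - T with T the critical gap, which the simulation can be checked against.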