Multitriangulations, pseudotriangulations and primitive sorting networks
We study the set of all pseudoline arrangements with contact points which
cover a given support. We define a natural notion of flip between these
arrangements and study the graph of these flips. In particular, we provide an
enumeration algorithm for arrangements with a given support, based on the
properties of certain greedy pseudoline arrangements and on their connection
with sorting networks. Both the running time per arrangement and the working
space of our algorithm are polynomial.
As the motivation for this work, we provide in this paper a new
interpretation of both pseudotriangulations and multitriangulations in terms of
pseudoline arrangements on specific supports. This interpretation explains
their common properties and leads to a natural definition of
multipseudotriangulations, which generalizes both. We study elementary
properties of multipseudotriangulations and compare them to iterations of
pseudotriangulations.
Comment: 60 pages, 40 figures; minor corrections and improvements of presentation
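To make the sorting-network connection concrete, here is a minimal sketch of a primitive sorting network, that is, one whose comparators act only on adjacent wires. It is an illustration and not the enumeration algorithm of the paper; the function names and the bubble-sort layout are assumptions chosen for the example. Correctness is checked with the zero-one principle: a comparator network sorts every input if and only if it sorts every 0/1 input.

```python
from itertools import product


def apply_network(comparators, values):
    """Run a primitive network: comparator i acts on adjacent wires i and i+1."""
    v = list(values)
    for i in comparators:
        if v[i] > v[i + 1]:
            v[i], v[i + 1] = v[i + 1], v[i]
    return v


def is_sorting_network(comparators, n):
    """Zero-one principle: testing all 2**n bit vectors is sufficient."""
    return all(
        apply_network(comparators, bits) == sorted(bits)
        for bits in product((0, 1), repeat=n)
    )


def bubble_network(n):
    """Bubble-sort layout (hypothetical helper): n*(n-1)/2 adjacent comparators."""
    return [i for end in range(n - 1, 0, -1) for i in range(end)]


if __name__ == "__main__":
    net = bubble_network(4)
    print(net, is_sorting_network(net, 4))
```

The exhaustive check is exponential in the number of wires, so it is only practical for small networks.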
Synchronous Counting and Computational Algorithm Design
Consider a complete communication network on n nodes, each of which is a
state machine. In synchronous 2-counting, the nodes receive a common clock
pulse and they have to agree on which pulses are "odd" and which are "even". We
require that the solution is self-stabilising (reaching the correct operation
from any initial state) and that it tolerates f Byzantine failures (nodes that
send arbitrary misinformation). Prior algorithms are expensive to implement in
hardware: they require a source of random bits or a large number of states.
This work consists of two parts. In the first part, we use computational
techniques (often known as synthesis) to construct very compact deterministic
algorithms for the first non-trivial case of f = 1. While no algorithm exists
for n < 4, we show that as few as 3 states per node are sufficient for all
values n ≥ 4. Moreover, the problem cannot be solved with only 2 states per
node for n = 4, but there is a 2-state solution for all values n ≥ 6.
In the second part, we develop and compare two different approaches for
synthesising synchronous counting algorithms. Both approaches are based on
casting the synthesis problem as a propositional satisfiability (SAT) problem
and employing modern SAT-solvers. The difference lies in how to solve the SAT
problem: either in a direct fashion, or incrementally within a counter-example
guided abstraction refinement loop. Empirical results suggest that the former
technique is more efficient if we want to synthesise time-optimal algorithms,
while the latter technique discovers non-optimal algorithms more quickly.
Comment: 35 pages, extended and revised version
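The self-stabilisation requirement can be illustrated with a toy, fault-free 2-counter, checked by brute force over all initial states. This is only a sketch under strong assumptions: there are no Byzantine nodes, the majority-flip rule is a hypothetical choice made for the example, and none of it is one of the synthesised algorithms from the paper. It shows what "reaching the correct operation from any initial state" means for four nodes with two states each.

```python
from itertools import product


def step(states):
    """One synchronous round: every node sees all states and outputs the
    complement of the majority value (ties resolve to 0)."""
    ones = sum(states)
    majority = 1 if 2 * ones > len(states) else 0
    return tuple(1 - majority for _ in states)


def stabilisation_time(initial, horizon=10):
    """First round from which all nodes agree and alternate, or None."""
    trace = [tuple(initial)]
    for _ in range(horizon):
        trace.append(step(trace[-1]))
    for t in range(horizon):
        counting = all(
            len(set(trace[r])) == 1 and trace[r + 1][0] == 1 - trace[r][0]
            for r in range(t, horizon)
        )
        if counting:
            return t
    return None


if __name__ == "__main__":
    # Exhaustively check every initial configuration of 4 two-state nodes.
    times = [stabilisation_time(s) for s in product((0, 1), repeat=4)]
    print(max(times))
```

A single misbehaving node can break this rule when the other nodes are evenly split, which is why the fault-tolerant case needs the more careful constructions described above.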
A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs
Sorting is at the core of many database operations, such as index creation,
sort-merge joins, and user-requested output sorting. As GPUs are emerging as a
promising platform to accelerate various operations, sorting on GPUs becomes a
viable endeavour. Over the past few years, several improvements have been
proposed for sorting on GPUs, leading to the first radix sort implementations
that achieve a sorting rate of over one billion 32-bit keys per second. Yet,
state-of-the-art approaches are heavily memory bandwidth-bound, as they require
substantially more memory transfers than their CPU-based counterparts.
Our work proposes a novel approach that almost halves the amount of memory
transfers and, therefore, considerably lifts the memory bandwidth limitation.
Being able to sort two gigabytes of eight-byte records in as little as 50
milliseconds, our approach achieves a 2.32-fold improvement over the
state-of-the-art GPU-based radix sort for uniform distributions, sustaining a
minimum speed-up of no less than a factor of 1.66 for skewed distributions.
To address inputs that either do not reside on the GPU or exceed the
available device memory, we build on our efficient GPU sorting approach with a
pipelined heterogeneous sorting algorithm that mitigates the overhead
associated with PCIe data transfers. Comparing the end-to-end sorting
performance to the state-of-the-art CPU-based radix sort running 16 threads,
our heterogeneous approach achieves a 2.06-fold and a 1.53-fold improvement for
sorting 64 GB key-value pairs with a skewed and a uniform distribution,
respectively.
Comment: 16 pages, accepted at SIGMOD 201
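For readers unfamiliar with radix sorting, the following is a minimal CPU sketch of a least-significant-digit radix sort on unsigned integer keys. It is not the hybrid GPU approach of the paper; the function name and the 8-bit digit width are assumptions made for illustration. It does show why such sorts are memory bound: every digit pass reads and rewrites the whole key array.

```python
def lsd_radix_sort(keys, key_bits=32, digit_bits=8):
    """Sort non-negative integers below 2**key_bits, one digit per pass."""
    radix = 1 << digit_bits
    mask = radix - 1
    src = list(keys)
    for shift in range(0, key_bits, digit_bits):
        # Histogram of the current digit.
        counts = [0] * radix
        for k in src:
            counts[(k >> shift) & mask] += 1
        # Exclusive prefix sum gives the first output slot of each bucket.
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += counts[d]
        # Stable scatter: one full read and one full write of the keys.
        dst = [0] * len(src)
        for k in src:
            d = (k >> shift) & mask
            dst[offsets[d]] = k
            offsets[d] += 1
        src = dst
    return src


if __name__ == "__main__":
    print(lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

With 32-bit keys and 8-bit digits this makes four passes over the data, so reducing the number of passes or the traffic per pass directly reduces memory transfers.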
Algorithmic Complexity of Power Law Networks
It has been experimentally observed that the majority of real-world networks
follow a power law degree distribution. The aim of this paper is to study the
algorithmic complexity of such "typical" networks. The contribution of this
work is twofold.
First, we define a deterministic condition for checking whether a graph has a
power law degree distribution and experimentally validate it on real-world
networks. This definition allows us to derive interesting properties of power
law networks. We observe that for exponents of the degree distribution in the
range such networks exhibit the double power law phenomenon that was
observed for several real-world networks. Our observation indicates that this
phenomenon could be explained by purely graph-theoretical properties.
The second aim of our work is to give a novel theoretical explanation of why
many algorithms run faster on real-world data than what is predicted by
algorithmic worst-case analysis. We show how to exploit the power law degree
distribution to design faster algorithms for a number of classical P-time
problems including transitive closure, maximum matching, determinant, PageRank
and matrix inverse. Moreover, we deal with the problems of counting triangles
and finding a maximum clique. Previously, it has only been shown that these
problems can be solved very efficiently on power law graphs when these graphs
are random, e.g., drawn at random from some distribution. However, it is
unclear how to relate such a theoretical analysis to real-world graphs, which
are fixed. Instead, we show that the randomness assumption can be
replaced with a simple condition on the degrees of adjacent vertices, which can
be used to obtain similar results. As a result, in some range of power law
exponents, we are able to solve the maximum clique problem in polynomial time,
although in general power law networks the problem is NP-complete.
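As an illustration of how a skewed degree distribution can be exploited, here is a minimal sketch of degree-ordered triangle counting. This is a standard technique and not necessarily the algorithm of the paper; the function name is an assumption, and vertex identifiers are assumed to be mutually comparable (for example, integers). Each edge is oriented from its lower-ranked endpoint to its higher-ranked one, so the few high-degree hubs keep only short out-neighbour lists.

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def rank(x):
        # Order vertices by degree, breaking ties by identifier.
        return (len(adj[x]), x)

    # Keep only neighbours of strictly higher rank.
    out = {u: {v for v in nbrs if rank(v) > rank(u)} for u, nbrs in adj.items()}

    # Every triangle is found exactly once, at its lowest-ranked vertex.
    return sum(len(out[u] & out[v]) for u in out for v in out[u])


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(k4))
```

The cost is governed by the out-degrees after orientation rather than by the raw degrees, which is where a power law degree sequence helps.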
Aspects of k-k-Routing in Meshes and OTIS Networks
Efficient data transport in parallel computers built on
sparse interconnection networks is crucial for their
performance. A basic transport problem in such a computer
is the k-k routing problem. In this thesis,
aspects of the k-k routing problem on r-dimensional
meshes and OTIS-G networks are discussed. The first oblivious
routing algorithms for these networks are presented
that solve the k-k routing problem in an
asymptotically optimal running time and a constant
buffer size. Furthermore, other aspects of the k-k
routing problem for OTIS-G networks are analysed.
In particular, lower bounds for the problem based on the
diameter and bisection width of OTIS-G networks are
given, and the k-k sorting problem on the OTIS-Mesh
is considered. Based on OTIS-G networks, a new class
of networks, called Extended OTIS-G networks, is introduced,
which have smaller diameters than OTIS-G networks.
Efficient data transport is crucial for the performance of parallel computers that communicate over an interconnection network. A basic transport problem in such a computer is the k-k routing problem. This thesis examines aspects of this problem on r-dimensional meshes and OTIS-G networks. It presents the first oblivious routing algorithm that solves the k-k routing problem on these networks in asymptotically optimal running time with constant buffer size. For OTIS-G networks, lower bounds on the running time of the problem are given, based on the diameter and bisection width of the networks. In addition, an algorithm is presented that solves the k-k sorting problem with a running time close to the bisection and diameter bounds. Based on OTIS-G networks, a new class of networks is introduced, the Extended OTIS-G networks, which have a smaller diameter than OTIS-G networks.
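The term "oblivious" means that the path of a packet depends only on its source and destination, not on other traffic. A minimal sketch of one classical oblivious scheme, dimension-order routing on an r-dimensional mesh, is given below. It is not the algorithm developed in the thesis, and the function name is an assumption made for the example.

```python
def dimension_order_path(src, dst):
    """Return the nodes visited when correcting one coordinate at a time.

    src and dst are equal-length coordinate tuples on an r-dimensional mesh.
    The path is fully determined by (src, dst), hence oblivious.
    """
    path = [tuple(src)]
    cur = list(src)
    for dim in range(len(cur)):
        step = 1 if dst[dim] > cur[dim] else -1
        while cur[dim] != dst[dim]:
            cur[dim] += step
            path.append(tuple(cur))
    return path


if __name__ == "__main__":
    print(dimension_order_path((0, 0), (2, 1)))
```

The path length equals the Manhattan distance between source and destination; the hard part in k-k routing, where each node sends and receives up to k packets, is bounding congestion and buffer sizes, which this sketch does not address.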