8 research outputs found

    Succinct List Indexing in Optimal Time


    On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs

    A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two algorithms, one serial and one parallel, to maintain series-parallel (SP) relationships "on the fly" for fork-join multithreaded programs. The serial SP-order algorithm runs in O(1) amortized time per operation. In contrast, the previously best algorithm requires time per operation proportional to Tarjan's functional inverse of Ackermann's function. SP-order employs an order-maintenance data structure that allows us to implement a more efficient "English-Hebrew" labeling scheme than was used in earlier race detectors, which immediately yields an improved determinacy-race detector. In particular, any fork-join program running in T₁ time on a single processor can be checked on the fly for determinacy races in O(T₁) time. Corresponding improved bounds can also be obtained for more sophisticated data-race detectors, for example, those that use locks. By combining SP-order with Feng and Leiserson's serial SP-bags algorithm, we obtain a parallel SP-maintenance algorithm, called SP-hybrid. Suppose that a fork-join program has n threads, T₁ work, and a critical-path length of T∞. When executed on P processors, we prove that SP-hybrid runs in O((T₁/P + PT∞) lg n) expected time. To understand this bound, consider that the original program obtains linear speed-up over a 1-processor execution when P = O(T₁/T∞). In contrast, SP-hybrid obtains linear speed-up when P = O(√(T₁/T∞)), but the work is increased by a factor of O(lg n). Singapore-MIT Alliance (SMA)
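The "English-Hebrew" idea in the abstract can be illustrated on a static SP parse tree: build two total orders of the threads, an English order that visits the children of parallel nodes left-to-right and a Hebrew order that visits them right-to-left. Two threads are in series exactly when both orders agree on their relative position, and logically parallel when the orders disagree. A minimal Python sketch under those assumptions (the tree encoding and function names are illustrative, not the paper's data structures):

```python
# Toy SP parse tree: ('S', left, right) runs left before right in series;
# ('P', left, right) runs its children logically in parallel. Leaves are
# thread names (strings).

def order(node, flip_parallel, out):
    """Append leaves in traversal order; optionally flip P-node children."""
    if isinstance(node, str):
        out.append(node)
    else:
        kind, left, right = node
        a, b = (right, left) if (kind == 'P' and flip_parallel) else (left, right)
        order(a, flip_parallel, out)
        order(b, flip_parallel, out)

def labels(tree):
    """Return (English, Hebrew) position labels for every thread."""
    eng, heb = [], []
    order(tree, False, eng)   # English: P-children left-to-right
    order(tree, True, heb)    # Hebrew: P-children right-to-left
    return ({t: i for i, t in enumerate(eng)},
            {t: i for i, t in enumerate(heb)})

def precedes(eng, heb, a, b):
    # a precedes b in series iff a comes before b in BOTH orders;
    # if the two orders disagree, a and b are logically parallel.
    return eng[a] < eng[b] and heb[a] < heb[b]

tree = ('S', 'u', ('P', 'v', 'w'))   # u runs first, then v parallel to w
eng, heb = labels(tree)
print(precedes(eng, heb, 'u', 'v'))  # True: series
print(precedes(eng, heb, 'v', 'w') or precedes(eng, heb, 'w', 'v'))  # False: parallel
```

The paper's contribution is maintaining such labels on the fly as the program forks and joins, in O(1) amortized time per operation via order-maintenance; this static sketch only shows why two opposing orders suffice to answer an SP query.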

    Load balancing and locality in range-queriable data structures


    Provably good race detection that runs in parallel

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 93-98). A multithreaded parallel program that is intended to be deterministic may exhibit nondeterminism due to bugs called determinacy races. A key capability of race detectors is to determine whether one thread executes logically in parallel with another thread or whether the threads must operate in series. This thesis presents two algorithms, one serial and one parallel, to maintain the series-parallel (SP) relationships "on the fly" for fork-join multithreaded programs. For a fork-join program with T₁ work and a critical-path length of T∞, the serial SP-maintenance algorithm runs in O(T₁) time. The parallel algorithm executes in the nearly optimal O(T₁/P + PT∞) time when run on P processors using an efficient scheduler. These SP-maintenance algorithms can be incorporated into race detectors to get a provably good race detector that runs in parallel. This thesis describes an efficient parallel race detector I call Nondeterminator-3. For a fork-join program with T₁ work, critical-path length T∞, and v shared memory locations, the Nondeterminator-3 runs in O(T₁/P + PT∞ lg P + min{(T₁ lg P)/P, vT∞ lg P}) expected time when run on P processors using an efficient scheduler. by Jeremy T. Fineman. S.M.
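As a rough illustration of how an SP query plugs into a determinacy-race detector: the usual shadow-space scheme records, per memory location, the last writer and a reader, and flags a race whenever a new access is logically parallel to a recorded conflicting access. A hedged toy sketch (the SP oracle is stubbed out with a hard-coded relation; all names here are hypothetical, not the Nondeterminator-3's actual interfaces):

```python
# Stub SP oracle: in a real detector, parallel(a, b) would be answered by
# an on-the-fly SP-maintenance structure. Here 'v' and 'w' are parallel.
PARALLEL = {frozenset(('v', 'w'))}

def parallel(a, b):
    return frozenset((a, b)) in PARALLEL

last_writer = {}   # shadow space: location -> thread of last write
last_reader = {}   # location -> thread of a previous read

def check_write(thread, loc):
    """Record a write; report True if it races with a parallel access."""
    race = any(table.get(loc) is not None and parallel(table[loc], thread)
               for table in (last_writer, last_reader))
    last_writer[loc] = thread
    return race

def check_read(thread, loc):
    """Record a read; report True if it races with a parallel write."""
    prev = last_writer.get(loc)
    race = prev is not None and parallel(prev, thread)
    last_reader[loc] = thread
    return race

print(check_write('v', 'x'))  # False: first access to x
print(check_read('w', 'x'))   # True: w is parallel to v, which wrote x
```

Reads conflict only with writes, while writes conflict with both prior reads and prior writes; that asymmetry is why the shadow space tracks the two separately.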

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume


    Vergleichen und Aggregieren von partiellen Ordnungen (Comparing and Aggregating Partial Orders)

    Comparing and aggregating information is a central area in the analysis of voting systems. There, the differing opinions of voters about a set of candidates must be aggregated into an election result that is as fair as possible. In most political elections, each voter picks a single candidate by checking a box. Beyond this, rank aggregation problems are also studied as a variant of voting systems. In these, each voter expresses an opinion as a total order over the set of candidates, which often represents a complex opinion more precisely than the choice of a single favorite candidate would. The outcome of a rank aggregation problem is then likewise a total order of the candidates, namely one with minimum distance to the voters' opinions. Kendall's tau distance and Spearman's footrule distance, among others, have become established as distance measures between two total orders. Modern applications of rank aggregation in machine learning, artificial intelligence, bioinformatics, and above all in various areas of the World Wide Web have brought already known but so far little-studied aspects into the focus of research. First, the algorithmic complexity of rank aggregation problems is gaining importance. Second, many of these applications involve incomplete "voter opinions" with undecided or incomparable candidates, so that total orders are no longer suitable to represent them. This thesis takes up both aspects and studies the algorithmic complexity of rank aggregation problems in which voter opinions are represented by weak or partial orders instead of total orders.
To this end, Kendall's tau distance and Spearman's footrule distance are generalized in several natural ways. It turns out that even computing the distance between two orders then becomes an algorithmically complex problem in its own right. Computing the generalized versions of Kendall's tau distance or Spearman's footrule distance remains efficient for weak orders; as soon as partial orders are considered, however, the problems become NP-complete and thus presumably no longer efficiently solvable. For this case, results on the approximability and the parameterized complexity of the problems are presented. The complexity of the rank aggregation problems themselves also increases: variants that are efficiently solvable for total orders become NP-complete for weak orders, while some variants that are NP-complete for total orders even lie outside the complexity class NP for partial orders. The thesis concludes with an outlook on open problems.
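For total orders, the two distance measures named above are straightforward to compute. A small Python sketch (helper names are illustrative) of Kendall's tau distance, which counts candidate pairs ranked in opposite relative order, and Spearman's footrule distance, which sums positional displacements:

```python
from itertools import combinations

def kendall_tau(order_a, order_b):
    # Count pairs of candidates whose relative order disagrees
    # between the two rankings (discordant pairs).
    pos_a = {c: i for i, c in enumerate(order_a)}
    pos_b = {c: i for i, c in enumerate(order_b)}
    return sum(1 for x, y in combinations(order_a, 2)
               if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0)

def footrule(order_a, order_b):
    # Sum, over all candidates, of how far each candidate's position
    # shifts between the two rankings.
    pos_b = {c: i for i, c in enumerate(order_b)}
    return sum(abs(i - pos_b[c]) for i, c in enumerate(order_a))

print(kendall_tau(['a', 'b', 'c'], ['c', 'a', 'b']))  # 2 discordant pairs
print(footrule(['a', 'b', 'c'], ['c', 'a', 'b']))     # 1 + 1 + 2 = 4
```

The thesis's point is that once voters submit weak or partial orders instead of total ones, even suitably generalized versions of these distances can become NP-hard to compute.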

    Algorithms incorporating concurrency and caching

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 189-203). This thesis describes provably good algorithms for modern large-scale computer systems, including today's multicores. Designing efficient algorithms for these systems involves overcoming many challenges, including concurrency (dealing with parallel accesses to the same data) and caching (achieving good memory performance). This thesis includes two parallel algorithms that focus on testing for atomicity violations in a parallel fork-join program. These algorithms augment a parallel program with a data structure that answers queries about the program's structure, on the fly. Specifically, one data structure, called SP-ordered-bags, maintains the series-parallel relationships among threads, which is vital for uncovering race conditions (bugs) in the program. Another data structure, called XConflict, aids in detecting conflicts in a transactional-memory system with nested parallel transactions. For a program with T₁ work and span T∞, maintaining either data structure adds an overhead of PT∞ to the running time of the parallel program when executed on P processors using an efficient scheduler, yielding a total runtime of O(T₁/P + PT∞). For each of these data structures, queries can be answered in O(1) time. This thesis also introduces the compressed sparse blocks (CSB) storage format for sparse matrices, which allows both Ax and Aᵀx to be computed efficiently in parallel, where A is an n × n sparse matrix with nnz > n nonzeros and x is a dense n-vector. The parallel multiplication algorithm uses Θ(nnz) work and ... span, yielding a parallelism of ..., which is amply high for virtually any large matrix. Also addressing concurrency, this thesis considers two scheduling problems.
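The key property of the CSB format mentioned above is that a single blocked layout serves both Ax and Aᵀx, with no transposed copy of the matrix. A minimal serial Python sketch of that idea (the block encoding here is a toy stand-in, not the actual CSB layout, and BETA is an arbitrary illustrative block size):

```python
BETA = 2  # toy block dimension; CSB would pick this relative to n

def to_blocks(triples):
    """Group (row, col, value) triples into BETA x BETA sparse blocks
    keyed by block coordinates, storing block-local offsets."""
    blocks = {}
    for i, j, v in triples:
        blocks.setdefault((i // BETA, j // BETA), []).append(
            (i % BETA, j % BETA, v))
    return blocks

def spmv(blocks, x, n, transpose=False):
    """Compute A*x (or A^T*x) from the same blocked structure: for the
    transpose, simply swap each entry's row and column on the fly."""
    y = [0.0] * n
    for (bi, bj), entries in blocks.items():
        for i, j, v in entries:
            r, c = bi * BETA + i, bj * BETA + j
            if transpose:
                r, c = c, r
            y[r] += v * x[c]
    return y

# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]] given as coordinate triples
A = [(0, 0, 1.0), (0, 2, 2.0), (1, 1, 3.0), (2, 0, 4.0), (2, 2, 5.0)]
blocks = to_blocks(A)
x = [1.0, 1.0, 1.0]
print(spmv(blocks, x, 3))                  # A·x  -> [3.0, 3.0, 9.0]
print(spmv(blocks, x, 3, transpose=True))  # Aᵀ·x -> [5.0, 3.0, 7.0]
```

The actual CSB algorithm additionally parallelizes over blocks (and within dense block rows) to get the work and span bounds quoted in the abstract; this sketch only shows why one layout suffices for both multiplication directions.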
The first scheduling problem, motivated by transactional memory, considers randomized backoff when jobs have different lengths. I give an analysis showing that binary exponential backoff achieves makespan V·2^Θ(√lg n) with high probability, where V is the total length of all n contending jobs. This bound is significantly larger than when jobs are all the same size. A variant of exponential backoff, however, achieves makespan of ... with high probability. I also present the size-hashed backoff protocol, specifically designed for jobs having different lengths, that achieves makespan ... with high probability. The second scheduling problem considers scheduling n unit-length jobs on m unrelated machines, where each job may fail probabilistically. Specifically, an input consists of a set of n jobs, a directed acyclic graph G describing the precedence constraints among jobs, and a failure probability qij for each job j and machine i. The goal is to find a schedule that minimizes the expected makespan. I give an O(log log(min{m, n}))-approximation for the case of independent jobs (when there are no precedence constraints) and an O(log(n + m) log log(min{m, n}))-approximation algorithm when the precedence constraints form disjoint chains. This chain algorithm can be extended into one that supports precedence constraints that are trees, which worsens the approximation by another log(n) factor. To address caching, this thesis includes several new variants of cache-oblivious dynamic dictionaries. A cache-oblivious dictionary fills the same niche as a classic B-tree, but it does so without tuning for particular memory parameters. Thus, cache-oblivious dictionaries optimize for all levels of a multilevel hierarchy and are more portable than traditional B-trees. I describe how to add concurrency to several previously existing cache-oblivious dictionaries.
I also describe two new data structures that achieve significantly cheaper insertions with a small overhead on searches. The cache-oblivious lookahead array (COLA) supports insertions/deletions and searches in O((1/B) log N) and O(log N) memory transfers, respectively, where B is the block size, M is the memory size, and N is the number of elements in the data structure. The xDict supports these operations in O((1/(εB^(1−ε))) log_B(N/M)) and O((1/ε) log_B(N/M)) memory transfers, respectively, where 0 < ε < 1 is a tunable parameter. Also on caching, this thesis answers the question: what is the worst possible page-replacement strategy? The goal of this whimsical chapter is to devise an online strategy that achieves the highest possible fraction of page faults / cache misses as compared to the worst offline strategy. I show that there is no deterministic strategy that is competitive with the worst offline strategy. I also give a randomized strategy based on the most-recently-used heuristic and show that it is the worst possible page-replacement policy. On a more serious note, I also show that direct mapping is, in some sense, a worst possible page-replacement policy. Finally, this thesis includes a new algorithm, following a new approach, for the problem of maintaining a topological ordering of a dag as edges are dynamically inserted. The main result included here is an O(n² log n) algorithm for maintaining a topological ordering in the presence of up to m < n(n − 1)/2 edge insertions. In contrast, the previously best algorithm has a total running time of O(min{m^(3/2), n^(5/2)}). Although these algorithms are not parallel and do not exhibit particularly good locality, some of the data-structural techniques employed in my solution are similar to others in this thesis. by Jeremy T. Fineman. Ph.D.
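The COLA insertion idea can be sketched in a few lines: level k holds either zero or 2^k sorted elements, and an insertion merges full levels downward like carries in a binary counter, which is where the cheap amortized insert cost comes from. A hedged serial sketch without the paper's lookahead pointers (so a search here is a binary search per level, costlier than the actual COLA's search bound):

```python
import bisect
import heapq

def cola_insert(levels, x):
    """Insert x: merge the carry into successive full levels, like
    incrementing a binary counter, until an empty level is found."""
    carry = [x]
    k = 0
    while True:
        if k == len(levels):
            levels.append([])
        if not levels[k]:
            levels[k] = carry
            return
        carry = list(heapq.merge(levels[k], carry))  # merge two sorted runs
        levels[k] = []
        k += 1

def cola_search(levels, x):
    """Binary-search each non-empty level from smallest to largest."""
    for lvl in levels:
        i = bisect.bisect_left(lvl, x)
        if i < len(lvl) and lvl[i] == x:
            return True
    return False

levels = []
for v in [5, 3, 8, 1, 9]:
    cola_insert(levels, v)
print(cola_search(levels, 8), cola_search(levels, 4))  # True False
```

Each element participates in at most log N merges, and merging is sequential, which is the intuition behind the O((1/B) log N) amortized transfer bound for insertion; the actual COLA adds fractional-cascading-style lookahead pointers to bring searches down to O(log N) transfers.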