53 research outputs found

    Optimizing for a Many-Core Architecture without Compromising Ease-of-Programming

    Get PDF
    Faced with nearly stagnant clock speed advances, chip manufacturers have turned to parallelism as the source for continuing performance improvements. But even though numerous parallel architectures have already been brought to market, a universally accepted methodology for programming them for general purpose applications has yet to emerge. Existing solutions tend to be hardware-specific, rendering them difficult to use for the majority of application programmers and domain experts, and not providing scalability guarantees for future generations of the hardware. This dissertation advances the validation of the following thesis: it is possible to develop efficient general-purpose programs for a many-core platform using a model recognized for its simplicity. To prove this thesis, we refer to the eXplicit Multi-Threading (XMT) architecture designed and built at the University of Maryland. XMT is an attempt at re-inventing parallel computing with a solid theoretical foundation and an aggressive scalable design. Algorithmically, XMT is inspired by the PRAM (Parallel Random Access Machine) model and the architecture design is focused on reducing inter-task communication and synchronization overheads and providing an easy-to-program parallel model. This thesis builds upon the existing XMT infrastructure to improve support for efficient execution with a focus on ease-of-programming. Our contributions aim at reducing the programmer's effort in developing XMT applications and improving the overall performance. More concretely, we: (1) present a work-flow guiding programmers to produce efficient parallel solutions starting from a high-level problem; (2) introduce an analytical performance model for XMT programs and provide a methodology to project running time from an implementation; (3) propose and evaluate RAP -- an improved resource-aware compiler loop prefetching algorithm targeted at fine-grained many-core architectures; we demonstrate performance improvements of up to 34.79% on average over the GCC loop prefetching implementation and up to 24.61% on average over a simple hardware prefetching scheme; and (4) implement a number of parallel benchmarks and evaluate the overall performance of XMT relative to existing serial and parallel solutions, showing speedups of up to 13.89x vs.~ a serial processor and 8.10x vs.~parallel code optimized for an existing many-core (GPU). We also discuss the implementation and optimization of the Max-Flow algorithm on XMT, a problem which is among the more advanced in terms of complexity, benchmarking and research interest in the parallel algorithms community. We demonstrate better speed-ups compared to a best serial solution than previous attempts on other parallel platforms

    SAT-based Analysis, (Re-)Configuration & Optimization in the Context of Automotive Product documentation

    Get PDF
    Es gibt einen steigenden Trend hin zu kundenindividueller Massenproduktion (mass customization), insbesondere im Bereich der Automobilkonfiguration. Kundenindividuelle Massenproduktion fĂŒhrt zu einem enormen Anstieg der KomplexitĂ€t. Es gibt Hunderte von Ausstattungsoptionen aus denen ein Kunde wĂ€hlen kann um sich sein persönliches Auto zusammenzustellen. Die Anzahl der unterschiedlichen konfigurierbaren Autos eines deutschen Premium-Herstellers liegt fĂŒr ein Fahrzeugmodell bei bis zu 10^80. SAT-basierte Methoden haben sich zur Verifikation der StĂŒckliste (bill of materials) von Automobilkonfigurationen etabliert. Carsten Sinz hat Mitte der 90er im Bereich der SAT-basierten Verifikationsmethoden fĂŒr die Daimler AG Pionierarbeit geleistet. Darauf aufbauend wurde nach 2005 ein produktives Software System bei der Daimler AG installiert. SpĂ€ter folgten weitere deutsche Automobilhersteller und installierten ebenfalls SAT-basierte Systeme zur Verifikation ihrer StĂŒcklisten. Die vorliegende Arbeit besteht aus zwei Hauptteilen. Der erste Teil beschĂ€ftigt sich mit der Entwicklung weiterer SAT-basierter Methoden fĂŒr Automobilkonfigurationen. Wir zeigen, dass sich SAT-basierte Methoden fĂŒr interaktive Automobilkonfiguration eignen. Wir behandeln unterschiedliche Aspekte der interaktiven Konfiguration. Darunter KonsistenzprĂŒfung, Generierung von Beispielen, ErklĂ€rungen und die Vermeidung von Fehlkonfigurationen. Außerdem entwickeln wir SAT-basierte Methoden zur Verifikation von dynamischen Zusammenbauten. Ein dynamischer Zusammenbau reprĂ€sentiert die chronologische Zusammenbau-Reihenfolge komplexer Teile. Der zweite Teil beschĂ€ftigt sich mit der Optimierung von Automobilkonfigurationen. Wir erlĂ€utern und vergleichen unterschiedliche Optimierungsprobleme der Aussagenlogik sowie deren algorithmische LösungsansĂ€tze. Wir beschreiben AnwendungsfĂ€lle aus der Automobilkonfiguration und zeigen wie diese als aussagenlogisches Optimierungsproblem formalisiert werden können. Beispielsweise möchte man zu einer Menge an AusstattungswĂŒnschen ein Test-Fahrzeug mit minimaler ErgĂ€nzung weiterer Ausstattungen berechnen um Kosten zu sparen. DesWeiteren beschĂ€ftigen wir uns mit der Problemstellung eine kleinste Menge an Fahrzeugen zu berechnen um eine Testmenge abzudecken. Im Rahmen dieser Arbeit haben wir einen Prototypen eines (Re-)Konfigurators, genannt AutoConfig, entwickelt. Unser (Re-)Konfigurator verwendet im Kern SAT-basierte Methoden und besitzt eine grafische BenutzeroberflĂ€che, welche interaktive Konfiguration erlaubt. AutoConfig kann mit Instanzen von drei großen deutschen Automobilherstellern umgehen, aber ist nicht alleine darauf beschrĂ€nkt. Mit Hilfe dieses Prototyps wollen wir die Anwendbarkeit unserer Methoden demonstrieren

    Decomposition methods for large scale stochastic and robust optimization problems

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2011.Cataloged from PDF version of thesis.Includes bibliographical references (p. 107-112).We propose new decomposition methods for use on broad families of stochastic and robust optimization problems in order to yield tractable approaches for large-scale real world application. We introduce a new type of a Markov decision problem named the Generalized Rest less Bandits Problem that encompasses a broad generalization of the restless bandit problem. For this class of stochastic optimization problems, we develop a nested policy heuristic which iteratively solves a series of sub-problems operating on smaller bandit systems. We also develop linear-optimization based bounds for the Generalized Restless Bandit problem and demonstrate promising computational performance of the nested policy heuristic on a large-scale real world application of search term selection for sponsored search advertising. We further study the distributionally robust optimization problem with known mean, covariance and support. These optimization models are attractive in their real world applications as they require the model consumer to only rely on those statistics of uncertainty that are known with relative confidence rather than making arbitrary assumptions about the exact dynamics of the underlying distribution of uncertainty. Known to be AP - hard, current approaches invoke tractable but often weak relaxations for real-world applications. We develop a decomposition method for this family of problems which recursively derives sub-policies along projected dimensions of uncertainty and provides a sequence of bounds on the value of the derived policy. In the development of this method, we prove that non-convex quadratic optimization in n-dimensions over a box in two-dimensions is efficiently solvable. We also show that this same decomposition method yields a promising heuristic for the MAXCUT problem. We then provide promising computational results in the context of a real world fixed income portfolio optimization problem. The decomposition methods developed in this thesis recursively derive sub-policies on projected dimensions of the master problem. These sub-policies are optimal on relaxations which admit "tight" projections of the master problem; that is, the projection of the feasible region for the relaxation is equivalent to the projection of that of master problem along the dimensions of the sub-policy. Additionally, these decomposition strategies provide a hierarchical solution structure that aids in solving large-scale problems.by Adrian Bernard Druke Becker.Ph.D

    Geometric intersection problems

    Full text link

    Eight Biennial Report : April 2005 – March 2007

    No full text

    On Flows, Paths, Roots, and Zeros

    Get PDF
    This thesis has two parts; in the first of which we give new results for various network flow problems. (1) We present a novel dual ascent algorithm for min-cost flow and show that an implementation of it is very efficient on certain instance classes. (2) We approach the problem of numerical stability of interior point network flow algorithms by giving a path following method that works with integer arithmetic solely and is thus guaranteed to be free of any nu-merical instabilities. (3) We present a gradient descent approach for the undirected transship-ment problem and its special case, the single source shortest path problem (SSSP). For distrib-uted computation models this yields the first SSSP-algorithm with near-optimal number of communication rounds. The second part deals with fundamental topics from algebraic computation. (1) We give an algorithm for computing the complex roots of a complex polynomial. While achieving a com-parable bit complexity as previous best results, our algorithm is simple and promising to be of practical impact. It uses a test for counting the roots of a polynomial in a region that is based on Pellet's theorem. (2) We extend this test to polynomial systems, i.e., we develop an algorithm that can certify the existence of a k-fold zero of a zero-dimensional polynomial system within a given region. For bivariate systems, we show experimentally that this approach yields signifi-cant improvements when used as inclusion predicate in an elimination method.Im ersten Teil dieser Dissertation prĂ€sentieren wir neue Resultate fĂŒr verschiedene Netzwerkflussprobleme. (1)Wir geben eine neue Duale-Aufstiegsmethode fĂŒr das Min-Cost-Flow- Problem an und zeigen, dass eine Implementierung dieser Methode sehr effizient auf gewissen Instanzklassen ist. (2)Wir behandeln numerische StabilitĂ€t von Innere-Punkte-Methoden fĂŒrNetwerkflĂŒsse, indem wir eine solche Methode angeben die mit ganzzahliger Arithmetik arbeitet und daher garantiert frei von numerischen InstabilitĂ€ten ist. (3) Wir prĂ€sentieren ein Gradienten-Abstiegsverfahren fĂŒr das ungerichtete Transshipment-Problem, und seinen Spezialfall, das Single-Source-Shortest-Problem (SSSP), die fĂŒr SSSP in verteilten Rechenmodellen die erste mit nahe-optimaler Anzahl von Kommunikationsrunden ist. Der zweite Teil handelt von fundamentalen Problemen der Computeralgebra. (1) Wir geben einen Algorithmus zum Berechnen der komplexen Nullstellen eines komplexen Polynoms an, der eine vergleichbare BitkomplexitĂ€t zu vorherigen besten Resultaten hat, aber vergleichsweise einfach und daher vielversprechend fĂŒr die Praxis ist. (2)Wir erweitern den darin verwendeten Pellet-Test zum ZĂ€hlen der Nullstellen eines Polynoms auf Polynomsysteme, sodass wir die Existenz einer k-fachen Nullstelle eines Systems in einer gegebenen Region zertifizieren können. FĂŒr bivariate Systeme zeigen wir experimentell, dass eine Integration dieses Ansatzes in eine Eliminationsmethode zu einer signifikanten Verbesserung fĂŒhrt
    • 

    corecore