10 research outputs found

    Sublinear Parallel Time Recognition of Tree Adjoining Language

    Get PDF
    A parallel algorithm is presented for recognizing the class of languages generated by tree adjoining grammars, a tree rewriting system which has applications in computational Linguistics. This class of languages is known to properly include all context-free languages; for example, the non-context-free sets {anbncn} and {ww) are in this class. It is shown that the recognition problem for tree adjoining languages can be solved by a concurrent-read, exclusive-write parallel random-access machine (CREW PRAM) in 0 (log2(n)) time using polynomially many processors. This extends a previous result for context-free languages

    An Efficient Algorithm for Bicriteria Minimum-Cost Circulation Problem

    Get PDF
    This paper is concerned with a bicriteria minimum-cost circulation problem which arises in interactive multicriteria decision making. The author presents a strongly polynomial algorithm for this problem, that is achieved by making use of the parametric characterization of optimal solutions and a strongly polynomial algorithm for the single objective minimum-cost circulation problem

    Parallel Algorithmic Techniques for Combinatorial Computation

    Get PDF
    Parallel computation offers the promise of great improvements in the solution of problems that, if we were restricted to sequential computation, would take so much time that solution would be impractical. There is a drawback to the use of parallel computers, however, and that is that they seem to be harder to program. For this reason, parallel algorithms in practice are often restricted to simple problems such as matrix multiplication. Certainly this is useful, and in fact we shall see later some non-obvious uses of matrix manipulation, but many of the large problems requiring solution are of a more complex nature. In particular, an instance of a problem may be structured as an arbitrary graph or tree, rather than in the regular order of a matrix. In this paper we describe a number of algorithmic techniques that have been developed for solving such combinatorial problems. The intent of the paper is to show how the algorithmic tools we present can be used as building blocks for higher level algorithms, and to present pointers to the literature for the reader to look up the specifics of these algorithms. We make no claim to completeness; a number of techniques have been omitted for brevity or because their chief application is not combinatorial in nature. In particular we give very little attention to parallel sorting, although sorting is used as a subroutine in a number of the algorithms we describe. We also only describe algorithms, and not lower bounds, for solving problems in parallel

    Aspects of practical implementations of PRAM algorithms

    Get PDF
    The PRAM is a shared memory model of parallel computation which abstracts away from inessential engineering details. It provides a very simple architecture independent model and provides a good programming environment. Theoreticians of the computer science community have proved that it is possible to emulate the theoretical PRAM model using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large scale general purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme crn achieve scalable parallel performance and portable parallel software and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we presented the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimcnsional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation; balanced tree computations. Fast Fourier Transforms and matrix multiplications

    Parallel iterative solution methods for Markov decision processes

    Get PDF

    On the Design, Analysis, and Implementation of Algorithms for Selected Problems in Graphs and Networks

    Get PDF
    This thesis studies three problems in network optimization, viz., the minimum spanning tree verification (MSTV) problem, the undirected negative cost cycle detection (UNCCD) problem, and the negative cost girth (NCG) problem. These problems find applications in several domains including program verification, proof theory, real-time scheduling, social networking, and operations research.;The MSTV problem is defined as follows: Given an undirected graph G = (V,E) and a spanning tree T, is T a minimum spanning tree of G? We focus on the case where the number of distinct edge weights is bounded. Using a bucketed data structure to organize the edge weights, we present an efficient algorithm for the MSTV problem, which runs in O (| E| + |V| · K) time, where K is the number of distinct edge weights. When K is a fixed constant, this algorithm runs in linear time. We also profile our MSTV algorithm with the current fastest known MSTV implementation. Our results demonstrate the superiority of our algorithm when K ≀ 24.;The UNCCD problem is defined as follows: Given an undirected graph G = (V,E) with arbitrarily weighted edges, does G contain a negative cost cycle? We discuss two polynomial time algorithms for solving the UNCCD problem: the b-matching approach and the T-join approach. We obtain new results for the case where the edge costs are integers in the range {lcub}--K ·· K{rcub}, where K is a positive constant. We also provide the first extensive empirical study that profiles the discussed UNCCD algorithms for various graph types, sizes, and experiments.;The NCG problem is defined as follows: Given a directed graph G = (V,E) with arbitrarily weighted edges, find the length, or number of edges, of the negative cost cycle having the least number of edges. We discuss three strongly polynomial NCG algorithms. The first NCG algorithm is known as the matrix multiplication approach in the literature. We present two new NCG algorithms that are asymptotically and empirically superior to the matrix multiplication approach for sparse graphs. We also provide a parallel implementation of the matrix multiplication approach that runs in polylogarithmic parallel time using a polynomial number of processors. We include an implementation profile to demonstrate the efficiency of the parallel implementation as we increase the graph size and number of processors. We also present an NCG algorithm for planar graphs that is asymptotically faster than the fastest topology-oblivious algorithm when restricted to planar graphs

    Analyse von IT-Anwendungen mittels Zeitvariation

    Get PDF
    Performanzprobleme treten in der Praxis von IT-Anwendungen hĂ€ufig auf, trotz steigender Hardwareleistung und verschiedenster AnsĂ€tze zur Entwicklung performanter Software im Softwarelebenszyklus. Modellbasierte Performanzanalysen ermöglichen auf Basis von Entwurfsartefakten eine PrĂ€vention von Performanzproblemen. Bei bestehenden oder teilweise implementierten IT-Anwendungen wird versucht, durch Hardwareskalierung oder Optimierung des Codes Performanzprobleme zu beheben. Beide AnsĂ€tze haben Nachteile: modellbasierte AnsĂ€tze werden durch die benötigte hohe Expertise nicht generell genutzt, die nachtrĂ€gliche Optimierung ist ein unsystematischer und unkoordinierter Prozess. Diese Dissertation schlĂ€gt einen neuen Ansatz zur Performanzanalyse fĂŒr eine nachfolgende Optimierung vor. Mittels eines Experiments werden Performanzwechselwirkungen in der IT-Anwendung identifiziert. Basis des Experiments, das Analyseinstrumentarium, ist eine zielgerichtete, zeitliche Variation von Start-, Endzeitpunkt oder Laufzeitdauer von AblĂ€ufen der IT-Anwendung. Diese Herangehensweise ist automatisierbar und kann strukturiert und ohne hohen Lernaufwand im Softwareentwicklungsprozess angewandt werden. Mittels der Turingmaschine wird bewiesen, dass durch die zeitliche Variation des Analyseinstrumentariums die Korrektheit von sequentiellen Berechnung beibehalten wird. Dies wird auf nebenlĂ€ufige Systeme mittels der parallelen Registermaschine erweitert und diskutiert. Mit diesem praxisnahen Maschinenmodell wird dargelegt, dass die entdeckten WirkzusammenhĂ€nge des Analyseinstrumentariums Optimierungskandidaten identifizieren. Eine spezielle Experimentierumgebung, in der die AblĂ€ufe eines Systems, bestehend aus Software und Hardware, programmierbar variiert werden können, wird mittels einer Virtualisierungslösung realisiert. Techniken zur Nutzung des Analyseinstrumentariums durch eine Instrumentierung werden angegeben. Eine Methode zur Ermittlung von Mindestanforderungen von IT-Anwendungen an die Hardware wird prĂ€sentiert und mittels der Experimentierumgebung anhand von zwei Szenarios und dem Android Betriebssystem exemplifiziert. Verschiedene Verfahren, um aus den Beobachtungen des Experiments die Optimierungskandidaten des Systems zu eruieren, werden vorgestellt, klassifiziert und evaluiert. Die Identifikation von Optimierungskandidaten und -potenzial wird an Illustrationsszenarios und mehreren großen IT-Anwendungen mittels dieser Methoden praktisch demonstriert. Als konsequente Erweiterung wird auf Basis des Analyseinstrumentariums eine Testmethode zum Validieren eines Systems gegenĂŒber nicht deterministisch reproduzierbaren Fehlern, die auf Grund mangelnder Synchronisationsmechanismen (z.B. Races) oder zeitlicher AblĂ€ufe entstehen (z.B. Heisenbugs, alterungsbedingte Fehler), angegeben.Performance problems are very common in IT-Application, even though hardware performance is consistently increasing and there are several different software performance engineering methodologies during the software life cycle. The early model based performance predictions are offering a prevention of performance problems based on software engineering artifacts. Existing or partially implemented IT-Applications are optimized with hardware scaling or code tuning. There are disadvantages with both approaches: the model based performance predictions are not generally used due to the needed high expertise, the ex post optimization is an unsystematic and unstructured process. This thesis proposes a novel approach to a performance analysis for a subsequent optimization of the IT-Application. Via an experiment in the IT-Application performance interdependencies are identified. The core of the analysis is a specific variation of start-, end time or runtime of events or processes in the IT-Application. This approach is automatic and can easily be used in a structured way in the software development process. With a Turingmachine the correctness of this experimental approach was proved. With these temporal variations the correctness of a sequential calculation is held. This is extended and discussed on concurrent systems with a parallel Registermachine. With this very practical machine model the effect of the experiment and the subsequent identification of optimization potential and candidates are demonstrated. A special experimental environment to vary temporal processes and events of the hardware and the software of a system was developed with a virtual machine. Techniques for this experimental approach via instrumenting are stated. A method to determine minimum hardware requirements with this experimental approach is presented and exemplified with two scenarios based on the Android Framework. Different techniques to determine candidates and potential for an optimization are presented, classified and evaluated. The process to analyze and identify optimization candidates and potential is demonstrated on scenarios for illustration purposes and real IT-Applications. As a consistent extension a test methodology enabling a test of non-deterministic reproducible errors is given. Such non-deterministic reproducible errors are faults in the system caused by insufficient synchronization mechanisms (for example Races or Heisenbugs) or aging-related faults

    Methodology and Software for Interactive Decision Support

    Get PDF
    These Proceedings report the scientific results of an International Workshop on "Methodology and Software for Interactive Decision Support" organized jointly by the System and Decision Sciences Program of IIASA and The National Committee for Applied Systems Analysis and Management in Bulgaria. Several other Bulgarian institutions sponsored the workshop -- The Committee for Science to the Council of Ministers, The State Committee for Research and Technology and The Bulgarian Industrial Association. The workshop was held in Albena, on the Black Sea Coast. In the first section, "Theory and Algorithms for Multiple Criteria Optimization," new theoretical developments in multiple criteria optimization are presented. In the second section, "Theory, Methodology and Software for Decision Support Systems," the principles of building decision support systems are presented as well as software tools constituting the building components of such systems. Moreover, several papers are devoted to the general methodology of building such systems or present experimental design of systems supporting certain class of decision problems. The third section addresses issues of "Applications of Decision Support Systems and Computer Implementations of Decision Support Systems." Another part of this section has a special character. Beside theoretical and methodological papers, several practical implementations of software for decision support have been presented during the workshop. These software packages varied from very experimental and illustrative implementations of some theoretical concept to well developed and documented systems being currently commercially distributed and used for solving practical problems

    Models for Parallel Computation in Multi-Core, Heterogeneous, and Ultra Wide-Word Architectures

    Get PDF
    Multi-core processors have become the dominant processor architecture with 2, 4, and 8 cores on a chip being widely available and an increasing number of cores predicted for the future. In addition, the decreasing costs and increasing programmability of Graphic Processing Units (GPUs) have made these an accessible source of parallel processing power in general purpose computing. Among the many research challenges that this scenario has raised are the fundamental problems related to theoretical modeling of computation in these architectures. In this thesis we study several aspects of computation in modern parallel architectures, from modeling of computation in multi-cores and heterogeneous platforms, to multi-core cache management strategies, through the proposal of an architecture that exploits bit-parallelism on thousands of bits. Observing that in practice multi-cores have a small number of cores, we propose a model for low-degree parallelism for these architectures. We argue that assuming a small number of processors (logarithmic in a problem's input size) simplifies the design of parallel algorithms. We show that in this model a large class of divide-and-conquer and dynamic programming algorithms can be parallelized with simple modifications to sequential programs, while achieving optimal parallel speedups. We further explore low-degree-parallelism in computation, providing evidence of fundamental differences in practice and theory between systems with a sublinear and linear number of processors, and suggesting a sharp theoretical gap between the classes of problems that are efficiently parallelizable in each case. Efficient strategies to manage shared caches play a crucial role in multi-core performance. We propose a model for paging in multi-core shared caches, which extends classical paging to a setting in which several threads share the cache. We show that in this setting traditional cache management policies perform poorly, and that any effective strategy must partition the cache among threads, with a partition that adapts dynamically to the demands of each thread. Inspired by the shared cache setting, we introduce the minimum cache usage problem, an extension to classical sequential paging in which algorithms must account for the amount of cache they use. This cache-aware model seeks algorithms with good performance in terms of faults and the amount of cache used, and has applications in energy efficient caching and in shared cache scenarios. The wide availability of GPUs has added to the parallel power of multi-cores, however, most applications underutilize the available resources. We propose a model for hybrid computation in heterogeneous systems with multi-cores and GPU, and describe strategies for generic parallelization and efficient scheduling of a large class of divide-and-conquer algorithms. Lastly, we introduce the Ultra-Wide Word architecture and model, an extension of the word-RAM model, that allows for constant time operations on thousands of bits in parallel. We show that a large class of existing algorithms can be implemented in the Ultra-Wide Word model, achieving speedups comparable to those of multi-threaded computations, while avoiding the more difficult aspects of parallel programming
    corecore