10 research outputs found

Sublinear Parallel Time Recognition of Tree Adjoining Language

Author: Palis Michael A.
Shende Sunil
Publication venue: ScholarlyCommons
Publication date: 01/08/1988

A parallel algorithm is presented for recognizing the class of languages generated by tree adjoining grammars, a tree rewriting system which has applications in computational linguistics. This class of languages is known to properly include all context-free languages; for example, the non-context-free sets {a^n b^n c^n} and {ww} are in this class. It is shown that the recognition problem for tree adjoining languages can be solved by a concurrent-read, exclusive-write parallel random-access machine (CREW PRAM) in O(log^2(n)) time using polynomially many processors. This extends a previous result for context-free languages.

ScholarlyCommons@Penn

Reset Sequences for Finite Automata with Application to Design of Parts Orienters

Author: Eppstein David
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1987

Natarajan reduced the problem of designing a certain type of mechanical parts orienter to that of finding reset sequences for monotonic deterministic finite automata. He gave algorithms that in polynomial time either find such sequences or prove that no such sequence exists. In this paper we present a new algorithm based on breadth first search that runs in faster asymptotic time than Natarajan's algorithms, and in addition finds the shortest possible reset sequence if such a sequence exists. We give tight bounds on the length of the minimum reset sequence. We further improve the time and space bounds of another algorithm given by Natarajan, which finds reset sequences for general automata in the special case that all states are initially possible

Columbia University Academic Commons
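
The reset-sequence problem in the abstract above can be illustrated with a small sketch. This is not the algorithm from the paper, which exploits monotonicity to run in polynomial time; it is a plain breadth-first search over sets of possible states, which finds a shortest reset sequence but can take exponential time in the worst case. The function name and the dictionary encoding of the transition function `delta` are assumptions made for illustration.

```python
from collections import deque


def shortest_reset_sequence(states, alphabet, delta):
    """Return a shortest input word that sends every state to one common state.

    delta maps (state, symbol) -> state (a hypothetical encoding).
    Returns a list of symbols, or None if no reset sequence exists.
    """
    start = frozenset(states)
    if len(start) <= 1:
        return []  # nothing to reset
    # parent[S] = (previous set, symbol applied), used to rebuild the word
    parent = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            nxt = frozenset(delta[(q, symbol)] for q in current)
            if nxt in parent:
                continue  # already reached by a word that is no longer
            parent[nxt] = (current, symbol)
            if len(nxt) == 1:
                # All states have been merged: walk back to the start set.
                word = []
                node = nxt
                while parent[node] is not None:
                    node, sym = parent[node]
                    word.append(sym)
                return word[::-1]
            queue.append(nxt)
    return None


# Three-state example: 'a' rotates the states, 'b' maps state 2 to 0.
delta = {
    (0, "a"): 1, (1, "a"): 2, (2, "a"): 0,
    (0, "b"): 0, (1, "b"): 1, (2, "b"): 0,
}
word = shortest_reset_sequence([0, 1, 2], ["a", "b"], delta)
```

Because breadth-first search visits state sets in order of distance from the initial set, the first singleton reached corresponds to a reset sequence of minimum length.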

An Efficient Algorithm for Bicriteria Minimum-Cost Circulation Problem

Author: Katoh N.
Publication venue: WP-87-098
Publication date: 01/07/1987

This paper is concerned with a bicriteria minimum-cost circulation problem which arises in interactive multicriteria decision making. The author presents a strongly polynomial algorithm for this problem, obtained by combining the parametric characterization of optimal solutions with a strongly polynomial algorithm for the single-objective minimum-cost circulation problem.

International Institute for Applied Systems Analysis (IIASA)

Parallel Algorithmic Techniques for Combinatorial Computation

Author: Eppstein David
Galil Zvi
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1988

Parallel computation offers the promise of great improvements in the solution of problems that, if we were restricted to sequential computation, would take so much time that solution would be impractical. There is a drawback to the use of parallel computers, however, and that is that they seem to be harder to program. For this reason, parallel algorithms in practice are often restricted to simple problems such as matrix multiplication. Certainly this is useful, and in fact we shall see later some non-obvious uses of matrix manipulation, but many of the large problems requiring solution are of a more complex nature. In particular, an instance of a problem may be structured as an arbitrary graph or tree, rather than in the regular order of a matrix. In this paper we describe a number of algorithmic techniques that have been developed for solving such combinatorial problems. The intent of the paper is to show how the algorithmic tools we present can be used as building blocks for higher level algorithms, and to present pointers to the literature for the reader to look up the specifics of these algorithms. We make no claim to completeness; a number of techniques have been omitted for brevity or because their chief application is not combinatorial in nature. In particular we give very little attention to parallel sorting, although sorting is used as a subroutine in a number of the algorithms we describe. We also only describe algorithms, and not lower bounds, for solving problems in parallel

CiteSeerX

Columbia University Academic Commons

Aspects of practical implementations of PRAM algorithms

Author: Ravindran Somasundaram

The PRAM is a shared memory model of parallel computation which abstracts away from inessential engineering details. It provides a very simple architecture-independent model and a good programming environment. Theoreticians of the computer science community have proved that it is possible to emulate the theoretical PRAM model using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large scale general purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme can achieve scalable parallel performance and portable parallel software and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we present the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimensional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation: balanced tree computations, Fast Fourier Transforms and matrix multiplications.

Warwick Research Archives Portal Repository

Parallel iterative solution methods for Markov decision processes

Author: Archibald Thomas Welsh
Publication venue: The University of Edinburgh
Publication date: 01/01/1992

Edinburgh Research Archive

On the Design, Analysis, and Implementation of Algorithms for Selected Problems in Graphs and Networks

Author: Williamson Matthew D.
Publication venue: The Research Repository @ WVU
Publication date: 01/05/2013

This thesis studies three problems in network optimization, viz., the minimum spanning tree verification (MSTV) problem, the undirected negative cost cycle detection (UNCCD) problem, and the negative cost girth (NCG) problem. These problems find applications in several domains including program verification, proof theory, real-time scheduling, social networking, and operations research. The MSTV problem is defined as follows: Given an undirected graph G = (V,E) and a spanning tree T, is T a minimum spanning tree of G? We focus on the case where the number of distinct edge weights is bounded. Using a bucketed data structure to organize the edge weights, we present an efficient algorithm for the MSTV problem, which runs in O(|E| + |V| · K) time, where K is the number of distinct edge weights. When K is a fixed constant, this algorithm runs in linear time. We also profile our MSTV algorithm with the current fastest known MSTV implementation. Our results demonstrate the superiority of our algorithm when K ≤ 24. The UNCCD problem is defined as follows: Given an undirected graph G = (V,E) with arbitrarily weighted edges, does G contain a negative cost cycle? We discuss two polynomial time algorithms for solving the UNCCD problem: the b-matching approach and the T-join approach. We obtain new results for the case where the edge costs are integers in the range {-K, ..., K}, where K is a positive constant. We also provide the first extensive empirical study that profiles the discussed UNCCD algorithms for various graph types, sizes, and experiments. The NCG problem is defined as follows: Given a directed graph G = (V,E) with arbitrarily weighted edges, find the length, or number of edges, of the negative cost cycle having the least number of edges. We discuss three strongly polynomial NCG algorithms. The first NCG algorithm is known as the matrix multiplication approach in the literature. We present two new NCG algorithms that are asymptotically and empirically superior to the matrix multiplication approach for sparse graphs. We also provide a parallel implementation of the matrix multiplication approach that runs in polylogarithmic parallel time using a polynomial number of processors. We include an implementation profile to demonstrate the efficiency of the parallel implementation as we increase the graph size and number of processors. We also present an NCG algorithm for planar graphs that is asymptotically faster than the fastest topology-oblivious algorithm when restricted to planar graphs.

The Research Repository @ WVU (West Virginia University)
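
The MSTV question stated in the abstract above can be made concrete with a short sketch. It is not the thesis's bucketed O(|E| + |V| · K) algorithm; it is a simple check based on the standard cycle property (a spanning tree is minimum exactly when no non-tree edge is lighter than the heaviest edge on the tree path joining its endpoints), and it may take time quadratic in the graph size. The function name and the edge-list encoding are assumptions, and the input `tree_edges` is assumed to already be a spanning tree of the graph.

```python
from collections import defaultdict


def is_minimum_spanning_tree(edges, tree_edges):
    """Check the cycle property for every edge of the graph.

    edges, tree_edges: lists of (u, v, weight) for an undirected graph.
    tree_edges is assumed (not verified) to be a spanning tree.
    """
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def heaviest_from(src):
        # heaviest[x] = largest edge weight on the unique tree path src..x
        heaviest = {src: float("-inf")}
        stack = [src]
        while stack:
            x = stack.pop()
            for y, w in adj[x]:
                if y not in heaviest:
                    heaviest[y] = max(heaviest[x], w)
                    stack.append(y)
        return heaviest

    cache = {}  # one tree traversal per distinct edge endpoint u
    for u, v, w in edges:
        if u not in cache:
            cache[u] = heaviest_from(u)
        if w < cache[u][v]:
            # Swapping this edge in for the heaviest path edge lowers the cost.
            return False
    return True


triangle = [(0, 1, 1), (1, 2, 2), (0, 2, 3)]
good_tree = [(0, 1, 1), (1, 2, 2)]
bad_tree = [(0, 1, 1), (0, 2, 3)]
```

On the triangle, `good_tree` passes the check, while `bad_tree` fails because the edge of weight 2 is lighter than the weight-3 edge on the tree path between its endpoints.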

Analyse von IT-Anwendungen mittels Zeitvariation

Author: Mangold Florian
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 18/11/2010

Performance problems occur frequently in practical IT applications, despite increasing hardware performance and a wide range of approaches to developing performant software across the software life cycle. Model-based performance analyses make it possible to prevent performance problems on the basis of design artifacts. For existing or partially implemented IT applications, performance problems are addressed through hardware scaling or code optimization. Both approaches have disadvantages: model-based approaches are not widely used because of the high level of expertise they require, and after-the-fact optimization is an unsystematic and uncoordinated process. This dissertation proposes a new approach to performance analysis in preparation for subsequent optimization. An experiment is used to identify performance interdependencies within the IT application. The basis of the experiment, the analysis instrument, is a targeted temporal variation of the start time, end time, or runtime of processes in the IT application. This approach can be automated and can be applied in the software development process in a structured way and without a steep learning curve. Using the Turing machine, it is proved that the temporal variation performed by the analysis instrument preserves the correctness of sequential computations. This result is extended to concurrent systems by means of the parallel register machine and discussed. With this practice-oriented machine model it is shown that the cause-and-effect relationships uncovered by the analysis instrument identify optimization candidates. A dedicated experimental environment, in which the processes of a system consisting of software and hardware can be varied programmatically, is realized with a virtualization solution. Techniques for applying the analysis instrument through instrumentation are given. A method for determining the minimum hardware requirements of IT applications is presented and exemplified in the experimental environment with two scenarios and the Android operating system. Various procedures for deriving the system's optimization candidates from the observations of the experiment are presented, classified, and evaluated. The identification of optimization candidates and optimization potential with these methods is demonstrated in practice on illustrative scenarios and several large IT applications. As a consistent extension, a test method based on the analysis instrument is given for validating a system against errors that are not deterministically reproducible and that arise from insufficient synchronization mechanisms (e.g. races) or from timing (e.g. Heisenbugs, aging-related faults).

Digitale Hochschulschriften der LMU

Methodology and Software for Interactive Decision Support

Author: Lewandowski A.
Stanchev I.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1989

These Proceedings report the scientific results of an International Workshop on "Methodology and Software for Interactive Decision Support" organized jointly by the System and Decision Sciences Program of IIASA and The National Committee for Applied Systems Analysis and Management in Bulgaria. Several other Bulgarian institutions sponsored the workshop: The Committee for Science to the Council of Ministers, The State Committee for Research and Technology and The Bulgarian Industrial Association. The workshop was held in Albena, on the Black Sea Coast. In the first section, "Theory and Algorithms for Multiple Criteria Optimization," new theoretical developments in multiple criteria optimization are presented. In the second section, "Theory, Methodology and Software for Decision Support Systems," the principles of building decision support systems are presented as well as software tools constituting the building components of such systems. Moreover, several papers are devoted to the general methodology of building such systems or present experimental designs of systems supporting a certain class of decision problems. The third section addresses issues of "Applications of Decision Support Systems and Computer Implementations of Decision Support Systems." Another part of this section has a special character. Besides theoretical and methodological papers, several practical implementations of software for decision support were presented during the workshop. These software packages varied from very experimental and illustrative implementations of some theoretical concept to well developed and documented systems that are currently commercially distributed and used for solving practical problems.

International Institute for Applied Systems Analysis (IIASA)

Models for Parallel Computation in Multi-Core, Heterogeneous, and Ultra Wide-Word Architectures

Author: Salinger Alejandro
Publication venue: 'University of Waterloo'
Publication date: 01/01/2013

Multi-core processors have become the dominant processor architecture with 2, 4, and 8 cores on a chip being widely available and an increasing number of cores predicted for the future. In addition, the decreasing costs and increasing programmability of Graphics Processing Units (GPUs) have made these an accessible source of parallel processing power in general purpose computing. Among the many research challenges that this scenario has raised are the fundamental problems related to theoretical modeling of computation in these architectures. In this thesis we study several aspects of computation in modern parallel architectures, from modeling of computation in multi-cores and heterogeneous platforms, to multi-core cache management strategies, through the proposal of an architecture that exploits bit-parallelism on thousands of bits. Observing that in practice multi-cores have a small number of cores, we propose a model for low-degree parallelism for these architectures. We argue that assuming a small number of processors (logarithmic in a problem's input size) simplifies the design of parallel algorithms. We show that in this model a large class of divide-and-conquer and dynamic programming algorithms can be parallelized with simple modifications to sequential programs, while achieving optimal parallel speedups. We further explore low-degree parallelism in computation, providing evidence of fundamental differences in practice and theory between systems with a sublinear and linear number of processors, and suggesting a sharp theoretical gap between the classes of problems that are efficiently parallelizable in each case. Efficient strategies to manage shared caches play a crucial role in multi-core performance. We propose a model for paging in multi-core shared caches, which extends classical paging to a setting in which several threads share the cache.
We show that in this setting traditional cache management policies perform poorly, and that any effective strategy must partition the cache among threads, with a partition that adapts dynamically to the demands of each thread. Inspired by the shared cache setting, we introduce the minimum cache usage problem, an extension to classical sequential paging in which algorithms must account for the amount of cache they use. This cache-aware model seeks algorithms with good performance in terms of faults and the amount of cache used, and has applications in energy efficient caching and in shared cache scenarios. The wide availability of GPUs has added to the parallel power of multi-cores; however, most applications underutilize the available resources. We propose a model for hybrid computation in heterogeneous systems with multi-cores and GPUs, and describe strategies for generic parallelization and efficient scheduling of a large class of divide-and-conquer algorithms. Lastly, we introduce the Ultra-Wide Word architecture and model, an extension of the word-RAM model that allows for constant time operations on thousands of bits in parallel. We show that a large class of existing algorithms can be implemented in the Ultra-Wide Word model, achieving speedups comparable to those of multi-threaded computations, while avoiding the more difficult aspects of parallel programming.

University of Waterloo's Institutional Repository