206 research outputs found

A Reverse Engineering Methodology for Extracting Parallelism From Design Abstractions.

Author: Erraguntla Ravi Chandra
Publication venue: LSU Digital Commons
Publication date: 01/01/1996

Migration of code from sequential environments to parallel processing environments is often done in an ad hoc manner. The purpose of this research is to develop a reverse engineering methodology to facilitate systematic migration of code from sequential to parallel processing environments. The research results include the development of a three-phase methodology and the design and development of a reverse engineering toolkit (abbreviated as RETK), which serves to establish a working model for the methodology. The methodology consists of three phases: Analysis, Synthesis, and Transformation. The Analysis phase uses concepts from reverse engineering research to recover the sequential design description from programs using a new design recovery technique. The Synthesis phase comprises processes that compute the data and control dependences by using the design abstractions produced by the Analysis phase to construct the program dependence graph. The Transformation phase consists of processes that require knowledge-based analysis of the program and dependence information produced by the Analysis and Synthesis phases, respectively. Design recommendations for parallel environments are the key output of the Transformation phase. The main components of RETK are an Information Extractor, a Dependence Analyzer, and a Design Assistant that implement the processes of the Analysis, Synthesis, and Transformation phases, respectively. The object-oriented design and implementation of the Information Extractor and Dependence Analyzer are described. The design and implementation of the Design Assistant using the C Language Integrated Production System (CLIPS) are described. In addition, experimental results of applying the methodology to test programs with RETK are presented. The results include analysis of a Numerical Aerodynamic Simulation (NAS) benchmark program.
By uniquely combining research in reverse engineering, dependence analysis, and knowledge-based analysis, the methodology provides a systematic approach for code migration. The benefits of using the methodology are increased comprehensibility and improved efficiency in migrating sequential systems to parallel environments.
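The data-dependence computation mentioned in the Synthesis phase can be illustrated with a minimal sketch. It is not RETK's implementation: the `(label, writes, reads)` statement representation and the function name `data_dependences` are assumptions made here, and the analysis is deliberately conservative (it ignores intervening redefinitions and control flow).

```python
from itertools import combinations


def data_dependences(statements):
    """Return (kind, earlier, later, variable) tuples for straight-line code.

    `statements` is a list of (label, writes, reads) tuples in program order.
    This representation is an assumption for the sketch, not RETK's format.
    """
    deps = []
    # combinations() preserves input order, so `first` always precedes `second`.
    for first, second in combinations(statements, 2):
        label_1, writes_1, reads_1 = first
        label_2, writes_2, reads_2 = second
        for var in sorted(writes_1 & reads_2):   # write, then read
            deps.append(("flow", label_1, label_2, var))
        for var in sorted(reads_1 & writes_2):   # read, then write
            deps.append(("anti", label_1, label_2, var))
        for var in sorted(writes_1 & writes_2):  # write, then write
            deps.append(("output", label_1, label_2, var))
    return deps


program = [
    ("S1", {"a"}, {"b", "c"}),  # a = b + c
    ("S2", {"d"}, {"a"}),       # d = a * 2
    ("S3", {"b"}, {"e"}),       # b = e - 1
]
deps = data_dependences(program)

# Statement pairs with no dependence between them are candidates for
# parallel execution.
dependent_pairs = {(earlier, later) for _, earlier, later, _ in deps}
independent = [
    (s[0], t[0])
    for s, t in combinations(program, 2)
    if (s[0], t[0]) not in dependent_pairs
]
```

In this toy program S2 and S3 share no dependence, so they are the only pair that could safely run concurrently.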

Louisiana State University

Multiplex: Unifying Conventional and Speculative Thread-Level Parallelism on a Chip Multiprocessor

Author: Eigenmann Rudolph
Falsafi Babak
Kim Seon Wook
Ooi Chong-Liang
Park Il
Vijaykumar T. N.
Publication date: 06/04/2009

Recent proposals for Chip Multiprocessors (CMPs) advocate speculative, or implicit, threading in which the hardware employs prediction to peel off instruction sequences (i.e., implicit threads) from the sequential execution stream and speculatively executes them in parallel on multiple processor cores. These proposals augment a conventional multiprocessor, which employs explicit threading, with the ability to handle implicit threads. Current proposals focus on only implicitly-threaded code sections. This paper identifies, for the first time, the issues in combining explicit and implicit threading. We present the Multiplex architecture to combine the two threading models. Multiplex exploits the similarities between implicit and explicit threading, and provides unified support for the two threading models without additional hardware. Multiplex groups a subset of protocol states in an implicitly-threaded CMP to provide a write-invalidate protocol for explicit threads. Using a fully-integrated compiler infrastructure for automatic generation of Multiplex code, this paper presents a detailed performance analysis for entire benchmarks, instead of just implicitly-threaded sections, as done in previous papers. We show that neither threading model alone performs consistently better than the other across the benchmarks. A CMP with four dual-issue CPUs achieves a speedup of 1.48 and 2.17 over one dual-issue CPU, using implicit-only and explicit-only threading, respectively. Multiplex matches or outperforms the better of the two threading models for every benchmark, and a four-CPU Multiplex achieves a speedup of 2.63. Our detailed analysis indicates that the dominant overheads in an implicitly-threaded CMP are speculation state overflow due to limited L1 cache capacity, and load imbalance and data dependences in fine-grain threads.

Infoscience - École polytechnique fédérale de Lausanne

A Methodology for Programming Production Systems and its Implications on Parallelism

Author: Pasik Alexander J.
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/1988

Production systems have been studied as a language for artificial intelligence programming for over a decade. The flexibility of a programming paradigm which allows for loosely structured, independent rules to represent knowledge is attractive. Unfortunately, two seemingly independent phenomena have hindered the ability to take full advantage of production systems. First, the performance of large production systems suffers due to the large amounts of computation required to run them. Second, the programming styles of individuals primarily accustomed to conventional programming have adversely affected the maintainability and performance of the resulting systems. The parallel execution of production systems has been studied in order to address the performance issues. Preliminary results have been interpreted pessimistically; production systems have been observed to contain only moderate to low levels of parallelism. By investigating the issue of programming style, however, it will be shown that the apparent lack of large-scale or massive parallelism is an artifact of this problem. Indeed, a set of programming guidelines and tools will be presented which yield more maintainable, understandable, and parallelizable production systems. Is there a programming methodology or environment which will allow for the development of more maintainable and parallelizable production systems? This work will attempt to demonstrate that, using a combination of several techniques, the resulting production systems will more appropriately conform with the theory which supports their use. Production systems are not appropriate for encoding all problem solving tasks. They are appropriate when there is a clear separation of explicit control knowledge, tabular knowledge, and pattern-directed knowledge. This classification has been presented by many researchers in the field, often in order to advocate their separation.
The issue has been addressed from a knowledge representation standpoint: here it will be one of several issues which, when addressed properly, will result in systems with improved performance in addition to their more adequate representation of the knowledge. Substantially more parallelism can be extracted from these systems. In this regard, the techniques complement parallel match algorithms which provide the first step in the solution for mapping production systems onto parallel architectures. The techniques are table-driven rules, creating constrained copies of culprit rules, multiple rule firing, and combining rule chains. These methods are combined into a new way of viewing production system execution. Rather than assuming the sequentiality of production systems and trying to extract parallelism explicitly, the systems are assumed to be implicitly parallel and all necessarily sequential aspects are explicitly defined.
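One possible reading of "table-driven rules" is sketched below: tabular knowledge is kept as data, and a single generic rule consults the table instead of one hand-written rule per case. This interpretation, the `ACTION_TABLE` contents, and the function name `table_driven_rule` are all assumptions for illustration; the abstract does not specify the technique at this level of detail.

```python
# Hypothetical tabular knowledge. In a one-rule-per-case style, each row
# would be a separate production.
ACTION_TABLE = {"red": "stop", "amber": "slow", "green": "go"}


def table_driven_rule(working_memory, table):
    """Apply one generic rule to every working-memory element that has a
    matching table row, returning (element id, action) pairs."""
    return [
        (element["id"], table[element["signal"]])
        for element in working_memory
        if element.get("signal") in table
    ]


working_memory = [
    {"id": 1, "signal": "red"},
    {"id": 2, "signal": "green"},
    {"id": 3, "signal": "blue"},  # no table row, so the rule does not fire
]
firings = table_driven_rule(working_memory, ACTION_TABLE)
```

Each firing here reads one element and writes nothing shared, so under this reading the instantiations are independent of one another and could in principle be evaluated in parallel.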

Columbia University Academic Commons

Parallelization of Goal-Driven, Production Systems on Hypercube Machines in a C Environment.

Author: Shrivastava Rajendra Kumar
Publication venue: LSU Digital Commons
Publication date: 01/01/1990

Production systems are widely used in artificial intelligence to capture the notion of expertise in modeling expert systems. Production systems are computationally intensive programs spending most of the execution time in their MATCH or recognise phase. Efforts have been made by the research in this dissertation to minimize the production system's execution time by optimizing the MATCH phase. Goal-oriented deterministic production systems are commonly used for robotics applications and formed the main class of production systems that were studied in this dissertation. The main motivation for the research was to provide a better MATCH algorithm and use the multiprocessing capabilities of existing parallel computer hardware. The dissertation realizes these goals by transforming a traditional production system's scalar equivalence operations into a C arithmetic hashing function to generate an indexing variable for the switch-case construct of the C language. Partitioning of the working memory into homogeneous blocks and distributing production memory over the multiprocessors enhanced the MIMD operation of the production system. A scheme is formulated and implemented to identify a few key condition elements that may be used as an indexing variable and reduce the number of condition elements used in the MATCH phase. The complete translation from OPS5 code to C and the implementation scheme are presented in this dissertation. Various issues regarding the distribution of the inference engine over the multiprocessor environment and other related synchronization topics for distributed systems are covered in the dissertation. A detailed description of the parallel computer's simulator is also provided in the dissertation.
The dissertation identifies other research topics and problems related to parallelization of production systems, the most significant being the ability to incorporate LEARNING in production systems by using one or all of the idle processors that are waiting for the active processor to complete its activities.
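The idea of using a key condition element as an index into the rule set can be sketched as follows. The dissertation describes a C hashing function feeding a switch-case construct; here a Python dictionary lookup stands in for that mechanism. The rule format, the example rules, and the names `build_index` and `match` are hypothetical, and every rule is assumed to test the key attribute.

```python
from collections import defaultdict


def build_index(rules, key_attr):
    """Group rule names by the constant each rule requires for `key_attr`."""
    index = defaultdict(list)
    for name, conditions in rules.items():
        index[conditions[key_attr]].append(name)
    return index


def match(index, rules, key_attr, element):
    """Test only the rules selected by the element's key value, then check
    their remaining equality conditions."""
    candidates = index.get(element.get(key_attr), [])
    return [
        name
        for name in candidates
        if all(element.get(attr) == value for attr, value in rules[name].items())
    ]


# Hypothetical goal-driven rules: each maps attributes to required constants.
rules = {
    "pick-up": {"goal": "grasp", "hand": "empty"},
    "reach": {"goal": "grasp", "hand": "far"},
    "put-down": {"goal": "release", "hand": "full"},
}
index = build_index(rules, "goal")
matched = match(index, rules, "goal", {"goal": "grasp", "hand": "empty"})
```

Only the two rules filed under "grasp" are examined for this element; "put-down" is never tested, which is the reduction in MATCH work that the indexing is meant to provide.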

Louisiana State University

Survey on encode biometric data for transmission in wireless communication networks

Author: Al_Barazanchi Israa
Hussein Ali Mohammed
Ibrahim Amer
Wahbah Hasan
Publication venue: International University of Sarajevo
Publication date: 26/12/2021

The aim of this research survey is to review an enhanced model, supported by artificial intelligence, for encoding biometric data for transmission in wireless communication networks. Such transmission can be tricky, as performance decreases with increasing network size due to interference, especially if channels and network topology are not selected carefully beforehand. Additionally, network dissociations may occur easily if crucial links fail, as redundancy is neglected for signal transmission. Therefore, we present several algorithms and their implementations, which address this problem by finding a network topology and channel assignment that minimize interference. This allows a deployment to increase its throughput by utilizing more bandwidth in the local spectrum, reducing coverage as well as connectivity issues across multiple AI-based techniques. Our evaluation survey shows an increase in throughput of up to multiple times or more compared to a baseline scenario in which no optimization has taken place and only one channel is used for the whole network with AI-based techniques. Furthermore, our solution also provides robust signal transmission, which tackles the issue of network partition for coverage and for single link failures by using an airborne wireless network. The highest end-to-end connectivity stands at a 10 Mbps data rate with a maximum propagation distance of several kilometers. Wireless network coverage is depicted for several signal transmission data rates, with 10 Mbps having the lowest coverage issues at a moderate propagation distance when using the enhanced model to encode biometric data for transmission in wireless communication.

Periodicals of Engineering and Natural Sciences (PEN - International University of Sarajevo)

DynamO: A free O(N) general event-driven molecular-dynamics simulator

Author: Adams
Alder
Allen
Andersen
Atkinson
Bannerman
Bird
Bowers
Brilliantov
Chapela
Cheatham
Cheon
Cundall
Deltour
Donev
Elliott
Fish
Frenkel
Goldhirsch
Haff
Haile
Hansson
Hess
Hoover
Hopkins
Jefferson
Jorgensen
Krantz
Lees
Limbach
Lubachevsky
MacKerel
Marchut
Marin
Marrink
Miller
Molinero
Monticelli
Nguyen
Paul
Phillips
Plimpton
Ponder
Pöschel
Rahman
Raman
Rapaport
Schultz
Sigurgeirsson
Smith
Spoel
Swendsen
Unlu
Vahid
van Zon
Verlet
Vrabecz
Walton
Wong
Woodcock
Woodhead
Publication venue: Wiley
Publication date: 26/07/2011

Molecular-dynamics algorithms for systems of particles interacting through discrete or "hard" potentials are fundamentally different to the methods for continuous or "soft" potential systems. Although many software packages have been developed for continuous potential systems, software for discrete potential systems based on event-driven algorithms is relatively scarce and specialized. We present DynamO, a general event-driven simulation package which displays the optimal O(N) asymptotic scaling of the computational cost with the number of particles N, rather than the O(N log(N)) scaling found in most standard algorithms. DynamO provides reference implementations of the best available event-driven algorithms. These techniques allow the rapid simulation of both complex and large (>10^6 particles) systems for long times. The performance of the program is benchmarked for elastic hard sphere systems, homogeneous cooling and sheared inelastic hard spheres, and equilibrium Lennard-Jones fluids. This software and its documentation are distributed under the GNU General Public License and can be freely downloaded from http://marcusbannerman.co.uk/dynamo
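The core step of an event-driven hard-sphere simulation, predicting when two spheres will next touch, can be sketched as follows. This is a textbook illustration and not DynamO's code: the function names are assumptions, all spheres share one diameter `sigma`, and the all-pairs scan is O(N^2), whereas the abstract describes algorithms with O(N) scaling.

```python
import heapq
import math
from itertools import combinations


def collision_time(r1, v1, r2, v2, sigma):
    """Time until two spheres of diameter `sigma` touch, or None if they
    never do. Solves |r + v t| = sigma for the relative position r and
    relative velocity v."""
    r = [b - a for a, b in zip(r1, r2)]
    v = [b - a for a, b in zip(v1, v2)]
    b = sum(x * y for x, y in zip(r, v))
    if b >= 0:  # not approaching (also covers zero relative velocity)
        return None
    vv = sum(x * x for x in v)
    rr = sum(x * x for x in r)
    disc = b * b - vv * (rr - sigma * sigma)
    if disc < 0:  # trajectories miss each other
        return None
    return (-b - math.sqrt(disc)) / vv


def predict_events(positions, velocities, sigma):
    """Build a min-heap of (time, i, j) collision events over all pairs."""
    events = []
    for i, j in combinations(range(len(positions)), 2):
        t = collision_time(positions[i], velocities[i],
                           positions[j], velocities[j], sigma)
        if t is not None:
            heapq.heappush(events, (t, i, j))
    return events


positions = [(0, 0, 0), (3, 0, 0), (0, 5, 0)]
velocities = [(1, 0, 0), (-1, 0, 0), (0, 0, 0)]
events = predict_events(positions, velocities, sigma=1)
```

A full simulator would pop the earliest event from the heap, advance the particles to that time, update the colliding pair's velocities, and then re-predict events only for the particles involved.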

arXiv.org e-Print Archive

Crossref

University of Strathclyde Institutional Repository