
382 research outputs found

Experimental Benchmarks and Initial Evaluation of the Performance of the PASM System Prototype

Author: Casavant T. L.
Fineberg A.
Jamieson Leah H.
McPheters M. J.
Schwederski T.
Siegel H. S.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/1988

The work reported here represents experiences with the PASM parallel processing system prototype during its first operational year. Most of the experiments were performed by students in the Fall semester of 1987. The first programming, and the first timing measurements, were made during the summer of 1987 by Sam Fineberg. The goal of the collection of experiments presented here was to undertake an Application-driven Architecture Study of the PASM system as a paradigm for parallel architecture evaluation in general. PASM was an excellent vehicle for experimenting with this evaluation technique due to its unique architectural features. Among these are: 1. A reconfigurable, partitionable multistage circuit-switched network. 2. Support for both SIMD and MIMD programs. 3. Ability to execute hybrid SIMD/MIMD programs. 4. An instruction queue which allows overlap of control-flow and data manipulation between micro-control (MC) units and processing elements (PE). It had been hypothesized that superlinear speed-up over the number of PEs could be attained with this feature, and experimental results verified this. 5. Support for barrier synchronization of MIMD tasks. This feature was exploited in some non-standard ways to show the ability to decouple variant length SIMD instructions into multiple MIMD streams for an overall performance benefit. This type of study is expected to continue in the future on PASM and other parallel machines at Purdue. This report should serve as a guide for this future work as well

Purdue E-Pubs

Reading list of selected PASM-related publications

Author: Siegel Howard Jay
Young Dalton
Publication venue: 'Springer Publishing Company'
Publication date: 01/01/2010

Prepared for a chapter to be published in the forthcoming Encyclopedia of Parallel Computing by Springer Publishing Company. The Encyclopedia will contain a broad coverage of the field and will include entries on machine organization, programming, algorithms, and applications. The broad coverage, together with extensive pointers to the literature for in-depth study, is expected to make the Encyclopedia a useful reference tool in parallel computing

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Experimental Evaluation of SIMD PE-Mask Generation and Hybrid Mode Parallel Computing on Multi-Microprocessor Systems

Author: Casavant Thomas L.
Fineberg Samuel A.
Siegel Howard Jay
Publication venue: 'Purdue University (bepress)'
Publication date: 01/11/1988

Experimentation aimed at determining the potential efficiency of multi-microprocessor designs of SIMD machines is reported. The experimentation is based on timing measurements made on the PASM system prototype at Purdue. The application used to measure and evaluate this phenomenon was bitonic sorting, which has feasible solutions in both SIMD and MIMD modes of computation, as well as in at least two hybrids of SIMD and MIMD modes. Bitonic sorting was coded in these four ways, and experiments were performed that examine the tradeoffs among all of these modes. Also, a new PE mask generation scheme for SIMD systems based on multiple off-the-shelf microprocessors is proposed, and its performance was measured.

Purdue E-Pubs

Prospectus for a Remote PASM Execution and Debugging Environment - PDB

Author: Casavant Thomas L.
Lumpp James E., Jr
Nation Wayne
Schwederski Thomas
Publication venue: 'Purdue University (bepress)'
Publication date: 01/04/1988

This document describes four design alternatives for a remote debugging and execution environment for the PASM Parallel Processing System Prototype in the School of EE at Purdue. Two alternatives involve acquisition of modest hardware for system enhancement, while the others are software-only solutions. All solutions involve use of a high-resolution bit-mapped graphics device, mouse and keyboard input, and a broad-band Ethernet-like communication medium. These latter components are currently available. The goal of this environment is to support any type of debugging which is currently supported by using the front panel of the machine and several terminals which are manually multiplexed between PEs and other resource management processors of the system. The environment will support voluntary output of processor activity from, and input to, any of the 30 processors of the PASM prototype. This configuration represents a step toward multiprogramming of the machine and will support development of software tools, languages and additional applications. Debugging information will be in the form of textual (or other) output displayed on virtual windows of a high-resolution device such as a SUN 3/50

Purdue E-Pubs

Model for an Intelligent Operating System for Executing Tasks on a Reconfigurable Parallel Architecture

Author: Chu C. Henry
Delp Edward J.
Jamieson Leah H.
Siegel Howard Jay
Weil Frank J
Whinston Andrew B
Publication venue: 'Purdue University (bepress)'
Publication date: 01/11/1988

Parallel processing is one approach to achieve the large computational processing capabilities required by many real-time computing tasks. One of the problems that must be addressed in the use of reconfigurable multiprocessor systems is matching the architecture configuration to the algorithms to be executed. This paper presents a conceptual model that explores the potential of artificial intelligence tools, specifically expert systems, to design an Intelligent Operating System for multiprocessor systems. The target task is the implementation of image understanding systems on multiprocessor architectures. PASM is used as an example multiprocessor. The Intelligent Operating System concepts developed here could also be used to address other problems requiring real-time processing. An example image understanding task is presented to illustrate the concept of intelligent scheduling by the Intelligent Operating System. Also considered is the use of the conceptual model when developing an image understanding system in order to test different strategies for choosing algorithms, imposing execution order constraints, and integrating results from various algorithms

Purdue E-Pubs

Reconfiguration for Fault Tolerance and Performance Analysis

Author: Kollmeier Harold Henry
Publication venue: ScholarlyCommons
Publication date: 01/11/1987

Architecture reconfiguration, the ability of a system to alter the active interconnection among modules, has a history of different purposes and strategies. Its purposes range from the relatively simple desire to formalize procedures that all processes have in common, to reconfiguration for the improvement of fault tolerance, to reconfiguration for performance enhancement, either through the simple maximizing of system use or by sophisticated notions of wedding topology to the specific needs of a given process. Strategies range from straightforward redundancy by means of an identical backup system to intricate structures employing multistage interconnection networks. The present discussion surveys the more important contributions to developments in reconfigurable architecture. The strategy here is in a sense to approach the field from an historical perspective, with the goal of developing a more coherent theory of reconfiguration. First, the Turing and von Neumann machines are discussed from the perspective of system reconfiguration, and it is seen that this early important theoretical work contains little that anticipates reconfiguration. Then some early developments in reconfiguration are analyzed, including the work of Estrin and associates on the fixed plus variable restructurable computer system, the attempt to theorize about configurable computers by Miller and Cocke, and the work of Reddi and Feustel on their restructurable computer system. The discussion then focuses on the most sustained systems for fault tolerance and performance enhancement that have been proposed. An attempt will be made to define fault tolerance and to investigate some of the strategies used to achieve it. By investigating four different systems, the Tandem computer, the C.vmp system, the Extra Stage Cube, and the Gamma network, the move from dynamic redundancy to reconfiguration is observed. Then reconfiguration for performance enhancement is discussed. A survey of some proposals is attempted, then the discussion focuses on the most sustained systems that have been proposed: PASM, the DC architecture, the Star local network, and the NYU Ultracomputer. The discussion is organized around a comparison of control, scheduling, communication, and network topology. Finally, comparisons are drawn between fault tolerance and performance enhancement, in order to clarify the notion of reconfiguration and to reveal the common ground of fault tolerance and performance enhancement as well as the areas in which they diverge. An attempt is made in the conclusion to derive from this survey and analysis some observations on the nature of reconfiguration, as well as some remarks on necessary further areas of research.

CiteSeerX

ScholarlyCommons@Penn

Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system

Author: Li Li
Long Yanhua
Wei Haoran
Xu Dongxing
Publication date: 02/11/2022

Exploiting effective target modeling units is very important and has always been a concern in end-to-end automatic speech recognition (ASR). In this work, we propose a phonetic-assisted multi-target units (PMU) modeling approach to enhance the Conformer-Transducer ASR system in a progressive representation learning manner. Specifically, PMU first uses pronunciation-assisted subword modeling (PASM) and byte pair encoding (BPE) to produce phonetic-induced and text-induced target units separately. Then, three new frameworks are investigated to enhance the acoustic encoder: a basic PMU, a paraCTC, and a pcaCTC, which integrate the PASM and BPE units at different levels for CTC and transducer multi-task training. Experiments on both LibriSpeech and accented ASR tasks show that the proposed PMU significantly outperforms conventional BPE, reducing the WER of the LibriSpeech clean, LibriSpeech other, and six accented ASR test sets by a relative 12.7%, 6.0%, and 7.7%, respectively. Comment: 5 pages, 1 figure, submitted to ICASSP 202

arXiv.org e-Print Archive

Bringing skeletons out of the closet: a pragmatic manifesto for skeletal parallel programming

Author: Cole Murray
Publication date: 01/01/2004

Edinburgh Research Explorer