50 research outputs found
Dynamic Systolization for Developing Multiprocessor Supercomputers
A dynamic network approach is introduced for developing reconfigurable, systolic arrays or wavefront processors; This allows one to design very powerful and flexible processors to be used in a general-purpose, reconfigurable, and fault-tolerant, multiprocessor computer system. The concepts of macro-dataflow and multitasking can be integrated to handle variable-resolution granularities in computationally intensive algorithms. A multiprocessor architecture, Remps, is proposed based on these design methodologies. The Remps architecture is generalized from the Cedar, HEP, Cray X- MP, Trac, NYU ultracomputer, S-l, Pumps, Chip, and SAM projects. Our goal is to provide a multiprocessor research model for developing design methodologies, multiprocessing and multitasking supports, dynamic systolic/wavefront array processors, interconnection networks, reconfiguration techniques, and performance analysis tools. These system design and operational techniques should be useful to those who are developing or evaluating multiprocessor supercomputers
Design of testbed and emulation tools
The research summarized was concerned with the design of testbed and emulation tools suitable to assist in projecting, with reasonable accuracy, the expected performance of highly concurrent computing systems on large, complete applications. Such testbed and emulation tools are intended for the eventual use of those exploring new concurrent system architectures and organizations, either as users or as designers of such systems. While a range of alternatives was considered, a software based set of hierarchical tools was chosen to provide maximum flexibility, to ease in moving to new computers as technology improves and to take advantage of the inherent reliability and availability of commercially available computing systems
A formalism for describing and simulating systems with interacting components.
This thesis addresses the problem of descriptive complexity presented by systems involving a high number of interacting components. It investigates the evaluation measure of performability and its application to such systems. A new description and simulation language, ICE and it's application to performability modelling is presented. ICE (Interacting ComponEnts) is based upon an earlier description language which was first proposed for defining reliability problems. ICE is declarative in style and has a limited number of keywords. The ethos in the development of the language has been to provide an intuitive formalism with a powerful descriptive space. The full syntax of the language is presented with discussion as to its philosophy. The implementation of a discrete event simulator using an ICE interface is described, with use being made of examples to illustrate the functionality of the code and the semantics of the language. Random numbers are used to provide the required stochastic behaviour within the simulator. The behaviour of an industry standard generator within the simulator and different methods of number allocation are shown. A new generator is proposed that is a development of a fast hardware shift register generator and is demonstrated to possess good statistical properties and operational speed. For the purpose of providing a rigorous description of the language and clarification of its semantics, a computational model is developed using the formalism of extended coloured Petri nets. This model also gives an indication of the language's descriptive power relative to that of a recognised and well developed technique. Some recognised temporal and structural problems of system event modelling are identified. and ICE solutions given. The growing research area of ATM communication networks is introduced and a sophisticated top down model of an ATM switch presented. This model is simulated and interesting results are given. A generic ICE framework for performability modelling is developed and demonstrated. This is considered as a positive contribution to the general field of performability research
Architecture, Design, Simulation and Performance Evaluation for Implementing ALAX -- The ATM LAN Access Switch Integrating the IEEE 1355 Serial Bus
IEEE 1355 is a serial bus standard for Heterogeneous Inter Connect (HIC) developed for "enabling high-performance, scalable, modular and parallel systems to be built with low system integration cost." However to date, few systems have been built around this standard specification. In this thesis, we propose ALAX -- an internetworking switching device based on IEEE 1355. The aim of the thesis is two-fold. First, we discuss and summarize research works leading to the architecture, design and simulation development for ALAX; we synthesize and analyze relevant data collected from the simulation experiments of the 4- port model of ALAX (i.e., 4-by-4 with four input and output queues) -- these activities were conducted during the 2-year length of the project. Secondly, we expand the original 4-by-4 size of the ALAX simulation model into 8-, 12- and 16-port models and present and interpret the outcomes. Thus, overall we establish a performance assessment of the ALAX switch, and also identify several critical design measurements to support the ALAX prototype implementation. We review progresses made in Local Area Networks (LANs) where traditional software-enabled bridges or routers are being replaced in many instances by hardware-enabled switches to enhance network performance. Within that context, ATM (Asynchronous Transfer Mode) technology emerges as an alternative for the next generation of high-speed LANs. Hence, ALAX incarnates our effective approach to build an ATM-LAN interface using a suitable switching platform. ALAX currently provides the capability to conveniently interconnect legacy Ethernet and ATM- based networks. Its distributed architecture features a multi- processor environment of T9000 transputers with parallel processing capability, a 32-by-32 way non-blocking crossbar fabric (C104 chipset) partitioned into Transport (i.e., Data) and Control planes, and many other modules interlaced with IEEE 1355- based connectors. It also employs existing and emerging protocols such as LANE (LAN Emulation), IEEE 802.3 and SNMP (Simple Network Management Protocol). We provide the component breakdown of the ALAX simulation model based on Optimized Network Engineering Tools (OPNET). The critical parameters for the study are acceptable processor speeds and queuing sizes of shared memory buffer at each switch port. The performance metric used is the end-to-end packet delay. Finally, we end the thesis with conclusive recommendations pertaining to performance and design measurement, and a brief summary of areas for further research study
NETRA - A Parallel Architecture for Integrated Vision Systems I: Architecture and Organization
Coordinated Science Laboratory was formerly known as Control Systems LaboratoryNational Aeronautics and Space Administration / NASA-NAG-1-61
Adaptive source routing and route generation for multicomputers
Ankara : Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University, 1995.Thesis (Master's) -- Bilkent University, 1995.Includes bibliographical references leaves 62-64.Scalable multicomputers are based upon interconnection networks that typically
provide multiple communication routes between any given pair of processor
nodes. In such networks, the selection of the routes is an important problem
because of its impact on the communication performance. We propose the
adaptive source routing (ASR) scheme which combines adaptive routing and
source routing into one which has the advantages of both schemes. In ASR,
the degree of adaptivity of each packet is determined at the source processor.
Every packet can be routed in a fully adaptive or partially adaptive or nonadaptive
manner, all within the same network at the same time. The ASR
scheme permits any network topology to be used provided that deadlock constraints
are satisfied. We evaluate and compare performance of the adaptive
source routing and non-adaptive randomized routing by simulations. Also we
propose an algorithm to generate adaptive routes for all pairs of processors in
any multistage interconnection network. Adaptive routes are stored in a route
table in each processor’s memory and provide high bandwidth and reliable interprocessor
communication. We evaluate the performance of the algorithm on
IBM SP2 networks in terms of obtained bandwidth, time to fill in the route
tables, and efficiency exploited by the parallel execution of the algorithm.Aydoğan, YücelM.S
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems
Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems
Recommended from our members
Canonical approximation in the performance analysis of distributed systems
The problem of analyzing distributed systems arises in many areas of computer science, such as communication networks, distributed databases, packet radio networks, VLSI communications and switching mechanisms. Analysis of distributed systems is difficult since one must deal with many tightly-interacting components. The number of possible state configurations typically grows exponentially with the system size, making the exact analysis intractable even for relatively small systems. For the stochastic models of these systems, whose steady-state probability is of the product form, many global performance measures of interest can be computed once one knows the normalization constant of the steady-state probability distribution. This constant, called the system partition function, is typically difficult to derive in closed form. The key difficulty in performance analysis of such models can be viewed as trying to derive a good approximation to the partition function or calculate it numerically. In this Ph.D. work we introduce a new approximation technique to analyze a variety of such models of distributed systems. This technique, which we call the method of Canonical Approximation, is similar to that developed in statistical physics to compute the partition function. The new method gives a closed-form approximation of the partition function and of the global performance measures. It is computationally simple with complexity independent of the system size, gives an excellent degree of precision for large systems, and is applicable to a wide variety of problems. The method is applied to the analysis of multihop packet radio networks, locking schemes in database systems, closed queueing networks, and interconnection networks
The connection machine
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988.Bibliography: leaves 134-157.by William Daniel Hillis.Ph.D