722 research outputs found

A Non-blocking Interconnection Network-Shared Cache Organization for Multi-core Processors

Author: Allam Rebhi Mohammad AbuMwais
علام ربحي محمد ابومويس
Publication venue: Al-Quds University
Publication date: 28/09/2013

Al-Quds University Digital Repository

System configuration and executive requirements specifications for reusable shuttle and space station/base

Author: Curran R. T.
Fitzpatrick W. S.
Johnson J. M.
Kennedy J. R.

NASA Technical Reports Server

A bibliography on parallel and vector numerical algorithms

Author: Ortega J. M.
Voigt R. G.

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

NASA Technical Reports Server

The exploitation of parallelism on shared memory multiprocessors

Author: Stoker Michael Allan
Publication venue: Newcastle University
Publication date: 01/01/1990

PhD Thesis

With the arrival of many general-purpose shared memory multiple processor (multiprocessor) computers in the commercial arena during the mid-1980s, a rift has opened between the raw processing power offered by the emerging hardware and the relative inability of its operating software to deliver this power effectively to potential users. This rift stems from the fact that, currently, no computational model capable of elegantly expressing parallel activity is mature enough to be universally accepted and used as the basis for programming languages that exploit the parallelism multiprocessors offer. In addition, there is a lack of software tools to assist programmers in designing and debugging parallel programs. Although much research has been done in the field of programming languages, no undisputed candidate for the most appropriate language for programming shared memory multiprocessors has yet been found. This thesis examines why this state of affairs has arisen and proposes programming language constructs, together with a programming methodology and environment, to close the ever-widening gap between hardware and software. The novel programming constructs described in this thesis are intended for use in imperative languages, even though they make use of the synchronisation inherent in the dataflow model by using the semantics of single assignment when operating on shared data, giving rise to the term shared values. As there are several distinct parallel programming paradigms, matching flavours of shared value are developed to permit the concise expression of these paradigms.

The Science and Engineering Research Council

Newcastle University eTheses
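
The abstract above describes shared values as shared data with single-assignment semantics, so that synchronisation follows from the dataflow. A minimal sketch of that idea is given here in Python. The class name `SharedValue` and its methods are hypothetical and are not taken from the thesis, which proposes language constructs rather than a library.

```python
import threading


class SharedValue:
    """Hypothetical single-assignment cell: written once, readers block until then."""

    def __init__(self):
        self._ready = threading.Event()   # set once the value has been assigned
        self._lock = threading.Lock()     # serialises competing writers
        self._value = None

    def assign(self, value):
        # Single assignment: a second write is an error, not an update.
        with self._lock:
            if self._ready.is_set():
                raise RuntimeError("shared value already assigned")
            self._value = value
            self._ready.set()

    def read(self, timeout=None):
        # Readers synchronise implicitly by waiting for the one assignment.
        if not self._ready.wait(timeout):
            raise TimeoutError("shared value not yet assigned")
        return self._value


def demo():
    cell = SharedValue()
    results = []
    # The consumer starts first and blocks inside read() until assign() runs.
    consumer = threading.Thread(target=lambda: results.append(cell.read() * 2))
    consumer.start()
    cell.assign(21)
    consumer.join()
    return results
```

Because the cell can only move from unassigned to assigned, a reader needs no lock around the value itself. The write is what releases every waiting reader.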

A multiprocessor system using a switch matrix configuration

Author: Aoufi Rabah
Publication venue: Scholars' Mine
Publication date: 01/01/1980

This thesis describes a class of interconnection networks based on the use of a switch matrix to provide processor-to-memory communication. This switch allows a direct link between any processor and any memory module. The cost and performance of this network are examined analytically. The results are compared with those of a multiprocessor system using a time-shared bus configuration, and it is shown that for the two extreme cases of maximum and minimum throughput, the two approaches are equivalent from a performance point of view. However, in the general case, even with a higher cost, the switch matrix provides much better performance than the time-shared bus configuration. Furthermore, the architecture of a multiprocessor MIMD-type computer using a switch matrix is investigated, and Petri net techniques are used to model process coordination among processors. (Abstract, page ii)

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Solution of partial differential equations on vector and parallel computers

Author: Ortega J. M.
Voigt R. G.

The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed

NASA Technical Reports Server

Center for Aeronautics and Space Information Sciences

Author: Flynn Michael J.

This report summarizes the research done during 1991/92 under the Center for Aeronautics and Space Information Sciences (CASIS) program. The topics covered are computer architecture, networking, and neural nets.

NASA Technical Reports Server

Approaches to multiprocessor error recovery using an on-chip interconnect subsystem

Author: Vadlamani Ramakrishna P
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2010

For future multicores, a dedicated interconnect subsystem for on-chip monitors was found to be highly beneficial in terms of scalability, performance and area. In this thesis, such a monitor network (MNoC) is used in multicores to support selective error identification and recovery and to maintain target chip reliability in the context of dynamic voltage and frequency scaling (DVFS). A selective shared memory multiprocessor recovery is performed using MNoC in which, when an error is detected, only the group of processors sharing an application with the affected processors is recovered. Although the use of DVFS in contemporary multicores provides significant protection from unpredictable thermal events, a potential side effect can be increased processor exposure to soft errors. To address this issue, a flexible fault prevention and recovery mechanism has been developed to selectively enable a small amount of per-core dual modular redundancy (DMR) in response to increased vulnerability, as measured by the processor architectural vulnerability factor (AVF). Our new algorithm for DMR deployment aims to provide a stable effective soft error rate (SER) by using DMR in response to DVFS caused by thermal events. The algorithm is implemented in real time on the multicore using MNoC and a controller that evaluates thermal information and multicore performance statistics in addition to error information. DVFS experiments with a multicore simulator using standard benchmarks show an average 6% improvement in overall power consumption and a stable SER when using selective DMR versus continuous DMR deployment.

CiteSeerX

ScholarWorks@UMass Amherst

GPU implementation of block transforms

Author: Zhang Boyan
Publication venue: Digital Commons @ NJIT
Publication date: 31/08/2012

Traditionally, the intensive floating-point computational ability of Graphics Processing Units (GPUs) has been limited mainly to rendering and visualization applications by architecture and programming model. However, with increasing programmability and architectural progress, the inherent massively parallel computational ability of GPUs has become an essential part of today's mainstream general-purpose (non-graphical) high-performance computing systems. It has been widely reported that adapted GPU-based algorithms significantly outperform their CPU counterparts. The focus of the thesis is to utilize NVIDIA CUDA GPUs to implement orthogonal transforms such as the signal-dependent Karhunen-Loeve Transform and the signal-independent Discrete Cosine Transform. The GPU architecture and programming model are examined. Mathematical preliminaries of orthogonal transforms, eigen-analysis and algorithms are revisited. Due to their highly parallel structure, GPUs are well suited to such computation. Further, the thesis examines multiple implementation schemes and configurations, and performance measurements are provided. A real-time processing display application frame is developed to visually exhibit GPU compute capability.

Digital Commons @ New Jersey Institute of Technology (NJIT)
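
To illustrate the kind of block transform the abstract above refers to, here is a minimal CPU sketch of an orthonormal one-dimensional DCT-II on a single block, in plain Python. It is not the thesis's CUDA implementation. The function names and the block length of 8 used in the usage note are assumptions made for illustration.

```python
import math


def dct_matrix(n):
    """Rows of the orthonormal DCT-II basis for blocks of length n."""
    rows = []
    for k in range(n):
        # The DC row uses a different scale so that the matrix is orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct(block):
    """Forward transform: each output coefficient is an independent dot product."""
    n = len(block)
    c = dct_matrix(n)
    return [sum(c[k][i] * block[i] for i in range(n)) for k in range(n)]


def idct(coeffs):
    """Inverse transform: the transpose of an orthonormal matrix is its inverse."""
    n = len(coeffs)
    c = dct_matrix(n)
    return [sum(c[k][i] * coeffs[k] for k in range(n)) for i in range(n)]
```

For example, `idct(dct(block))` recovers an 8-sample `block` up to floating-point rounding. Each output coefficient depends only on the input block and one basis row, so the coefficients can be computed independently, which is the property that makes such transforms a natural fit for one-thread-per-coefficient GPU execution.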

An investigation into Multiprocessor Systems based on UNIX

Author: Welten P.J.M.
Publication date: 28/02/1989

Pure OAI Repository