137 research outputs found

A study of the selection of microcomputer architectures to automate planetary spacecraft power systems

Author: Nauda A.

Performance and reliability models of alternate microcomputer architectures as a methodology for optimizing system design were examined. A methodology for selecting an optimum microcomputer architecture for autonomous operation of planetary spacecraft power systems was developed. Various microcomputer system architectures are analyzed to determine their application to spacecraft power systems. It is suggested that no standardization formula or common set of guidelines exists which provides an optimum configuration for a given set of specifications.

NASA Technical Reports Server

Static allocation of computation to processors in multicomputers

Author: Norman Michael G.
Publication venue: The University of Edinburgh
Publication date: 01/01/1993

Edinburgh Research Archive

Submicron Systems Architecture Project: Semiannual Technical Report

Author: Seitz Charles L.
Publication venue: California Institute of Technology Library
Publication date: 01/01/1989

No abstract available

Caltech Authors

Towards the Teraflop CFD

Author: Schreiber Robert
Simon Horst D.

We are surveying current projects in the area of parallel supercomputers. The machines considered here will become commercially available in the 1990-1992 time frame. All are suitable for exploring the critical issues in applying parallel processors to large scale scientific computations, in particular CFD calculations. This chapter presents an overview of the surveyed machines, and a detailed analysis of the various architectural and technology approaches taken. Particular emphasis is placed on the feasibility of a Teraflops capability following the paths proposed by various developers.

NASA Technical Reports Server

Submicron Systems Architecture Project: Semiannual Technical Report

Author: Seitz Charles L.
Publication venue: California Institute of Technology Library
Publication date: 01/01/1987

No abstract available

Caltech Authors

Network Multicomputing Using Recoverable Distributed Shared Memory

Author: Carter John B.
Cox Alan L.
Dwarkadas Sandhya
Elnozahy Elmootazbellah N.
Johnson David B.
Keleher Pete
Zwaenepoel Willy
Publication date: 20/10/2005

A network multicomputer is a multiprocessor in which the processors are connected by general-purpose networking technology, in contrast to current distributed memory multiprocessors where a dedicated special-purpose interconnect is used. The advent of high-speed general-purpose networks provides the impetus for a new look at the network multiprocessor model, by removing the bottleneck of current slow networks. However, major software issues remain unsolved. It is pointed out that a convenient machine abstraction must be developed that hides from the application programmer low-level details such as message passing or machine failures. Use is made of distributed shared memory as a programming abstraction, and rollback recovery through consistent checkpointing to provide fault tolerance. Measurements of the authors' implementations of distributed shared memory and consistent checkpointing show that these abstractions can be implemented efficiently.

Infoscience - École polytechnique fédérale de Lausanne

Submicron Systems Architecture: Semiannual Technical Report

Author: Martin Alain J.
McEliece Robert J.
Rem Martin
Seitz Charles L.
Publication venue: California Institute of Technology Library
Publication date: 01/01/1987

No abstract available

Caltech Authors

Compiler optimization to improve data locality for processor multithreading

Author: Balaram Sinharoy
Publication date: 24/04/2020

Over the last decade processor speed has increased dramatically, whereas the speed of the memory subsystem has improved at a modest rate. Due to the increase in cache miss latency (in terms of processor cycles), processors stall on cache misses for a significant portion of their execution time. Multithreaded processors have been proposed in the literature to reduce the processor stall time due to cache misses. Although multithreading improves processor utilization, it may also increase cache miss rates, because in a multithreaded processor multiple threads share the same cache, which effectively reduces the cache size available to each individual thread. Increased processor utilization and the increase in the cache miss rate demand higher memory bandwidth. This paper presents a novel compiler optimization method that improves data locality for each of the threads and enhances data sharing among the threads. The method is based on loop transformation theory and optimizes both spatial and temporal data locality. The created threads exhibit a high level of intra-thread and inter-thread data locality, which effectively reduces both the data cache miss rates and the total execution time of numerically intensive computations running on a multithreaded processor.

CiteSeerX

Adaptive sampling-based profiling techniques for optimizing the distributed JVM runtime

Author: Lam KT
Luo Y
Wang CL
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

Extending the standard Java virtual machine (JVM) for cluster-awareness is a transparent approach to scaling out multithreaded Java applications. While this clustering solution has been gaining momentum in recent years, efficient runtime support for fine-grained object sharing over the distributed JVM remains a challenge. The system efficiency is strongly connected to the global object sharing profile that determines the overall communication cost. Once the sharing or correlation between threads is known, access locality can be optimized by collocating highly correlated threads via dynamic thread migrations. Although correlation tracking techniques have been studied in some page-based software DSM systems, they would entail prohibitively high overheads and low accuracy when ported to fine-grained object-based systems. In this paper, we propose a lightweight sampling-based profiling technique for tracking inter-thread sharing. To preserve locality across migrations, we also propose a stack sampling mechanism for profiling the set of objects which are tightly coupled with a migrant thread. Sampling rates in both techniques can vary adaptively to strike a balance between precision and overhead. Such adaptive techniques are particularly useful for applications whose sharing patterns could change dynamically. The profiling results can be exploited for effective thread-to-core placement and dynamic load balancing in a distributed object sharing environment. We present the design and preliminary performance results of our distributed JVM with the profiling implemented. Experimental results show that the profiling is able to obtain over 95% accurate global sharing profiles at a cost of only a few percent of execution time increase for fine- to medium-grained applications. © 2010 IEEE.

The 24th IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2010), Atlanta, GA, 19-23 April 2010. In Proceedings of the 24th IPDPS, 2010, p. 1-1

HKU Scholars Hub