4,019 research outputs found

A case study for NoC based homogeneous MPSoC architectures

Author: Casu Mario Roberto
Macchiarulo Luca
Ruo Roch Massimo
Tota Sergio Vincenzo
Zamboni Maurizio
Publication venue: IEEE
Publication date: 01/01/2009

The many-core design paradigm requires flexible and modular hardware and software components to provide the required scalability to next-generation on-chip multiprocessor architectures. A multidisciplinary approach is necessary to consider all the interactions between the different components of the design. In this paper, a complete design methodology that tackles at once the aspects of system-level modeling, hardware architecture, and programming model has been successfully used for the implementation of a multiprocessor network-on-chip (NoC)-based system, the NoCRay graphic accelerator. The design, based on 16 processors, has been prototyped on a field-programmable gate array (FPGA) and then laid out in 90-nm technology. Post-layout results show very low power and area, as well as a 500 MHz clock frequency. Results show that an array of small and simple processors outperforms a single high-end general-purpose processo

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Redundancy management for efficient fault recovery in NASA's distributed computing system

Author: Malek Miroslaw
Pandya Mihir
Yau Kitty

The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding computational graphs of tasks in the system architecture and for reconfiguring these tasks after a failure has occurred. Computational structures represented by a path and a complete binary tree were considered, and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of the Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance

NASA Technical Reports Server

Automatic data/program partitioning using the single assignment principle

Author: Bic Lubomir
Nagel Mark D.
Roy John M.A.
Publication venue: eScholarship, University of California
Publication date: 01/01/1989

Loosely-coupled MIMD architectures do not suffer from memory contention; hence large numbers of processors may be utilized. The main problem, however, is how to partition data and programs in order to exploit the available parallelism. In this paper we show that efficient schemes for automatic data/program partitioning and synchronization may be employed if single assignment is used. Using simulations of program loops common to scientific computations (the Livermore Loops), we demonstrate that only a small fraction of data accesses are remote and thus the degradation in network performance due to multiprocessing is minimal

eScholarship - University of California

DSIM: A distributed simulator

Author: Goswami Kumar K.
Iyer Ravishankar K.

Discrete event-driven simulation makes it possible to model a computer system in detail. However, such simulation models can require significant time to execute. This is especially true when modeling large parallel or distributed systems containing many processors and a complex communication network. One solution is to distribute the simulation over several processors. If enough parallelism is achieved, large simulation models can be efficiently executed. This study proposes a distributed simulator called DSIM which can run on various architectures. A simulated test environment is used to verify and characterize the performance of DSIM. The results of the experiments indicate that speedup is application-dependent and, in DSIM's case, is also dependent on how the simulation model is distributed among the processors. Furthermore, the experiments reveal that the communication overhead of Ethernet-based distributed systems makes it difficult to achieve reasonable speedup unless the simulation model is computation-bound

NASA Technical Reports Server

Strengthening measurements from the edges: application-level packet loss rate estimation

Author: Carlson R.
Dischinger M.
Juan Carlos De Martin
Michela Meo
Simone Basso
Sánchez M. A.
Publication venue: ACM New York, NY, USA
Publication date: 01/01/2013

Network users know much less than ISPs, Internet exchanges and content providers about what happens inside the network. Consequently, users can neither easily detect network neutrality violations nor readily exercise their market power by knowledgeably switching ISPs. This paper contributes to the ongoing efforts to empower users by proposing two models to estimate -- via application-level measurements -- a key network indicator, i.e., the packet loss rate (PLR) experienced by FTP-like TCP downloads. Controlled, testbed, and large-scale experiments show that the Inverse Mathis model is simpler and more consistent across the whole PLR range, but less accurate than the more advanced Likely Rexmit model for landline connections and moderate PL
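
As a rough illustration of what an "inverse Mathis" estimate could look like, the sketch below inverts the well-known Mathis approximation for steady-state TCP throughput, BW ≈ (MSS / RTT) · C / sqrt(p), to solve for the loss rate p. The abstract does not give the paper's exact formulation, so the constant C = sqrt(3/2), the default MSS of 1460 bytes, the clamping to 1.0, and the function name `inverse_mathis_plr` are all assumptions made for this example.

```python
import math

# Assumed constant from the classic Mathis approximation; the paper's
# actual model may use a different value.
MATHIS_C = math.sqrt(3.0 / 2.0)


def inverse_mathis_plr(goodput_bps, rtt_s, mss_bytes=1460):
    """Estimate the packet loss rate from application-level measurements.

    goodput_bps: measured download rate in bits per second
    rtt_s:       measured round-trip time in seconds
    mss_bytes:   assumed TCP maximum segment size in bytes (hypothetical default)
    """
    if goodput_bps <= 0 or rtt_s <= 0 or mss_bytes <= 0:
        raise ValueError("goodput, RTT and MSS must all be positive")
    mss_bits = mss_bytes * 8
    # Mathis: BW = (MSS / RTT) * C / sqrt(p)  =>  p = (C * MSS / (RTT * BW))^2
    plr = (MATHIS_C * mss_bits / (rtt_s * goodput_bps)) ** 2
    # A loss rate cannot exceed 1, so clamp the estimate.
    return min(plr, 1.0)
```

Under these assumptions, a faster download at the same RTT maps to a lower estimated loss rate, which is the intuition behind estimating PLR from the edge without access to packet traces.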

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Validation of multiprocessor systems

Author: Kong T.
Segall Z.
Siewiorek D. P.

Experiments that can be used to validate the fault-free performance of multiprocessor systems in aerospace systems integrating flight controls and avionics are discussed. Engineering prototypes for two fault-tolerant multiprocessors are tested

NASA Technical Reports Server

Principles for problem aggregation and assignment in medium scale multiprocessors

Author: Nicol David M.
Saltz Joel H.

One of the most important issues in parallel processing is the mapping of workload to processors. This paper considers a large class of problems having a high degree of potential fine-grained parallelism, and execution requirements that are either not predictable or too costly to predict. The main issues in mapping such a problem onto medium-scale multiprocessors are those of aggregation and assignment. We study a method of parameterized aggregation that makes few assumptions about the workload. The mapping of aggregate units of work onto processors is uniform, and exploits locality of workload intensity to balance the unknown workload. In general, a finer aggregate granularity leads to a better balance at the price of increased communication/synchronization costs; the aggregation parameters can be adjusted to find a reasonable granularity. The effectiveness of this scheme is demonstrated on three model problems: an adaptive one-dimensional fluid dynamics problem with message passing, a sparse triangular linear system solver on both a shared-memory and a message-passing machine, and a two-dimensional time-driven battlefield simulation employing message passing. Using the model problems, the tradeoffs between balanced workload and communication/synchronization costs are studied. Finally, an analytical model is used to explain why the method balances workload and minimizes the variance in system behavior

NASA Technical Reports Server

Building Resilient Cloud Over Unreliable Commodity Infrastructure

Author: Bansal Sorav
Deshpande Deepak
Iyer Sreekanth
Kedia Piyus
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 31/08/2012

Cloud Computing has emerged as a successful computing paradigm for efficiently utilizing managed compute infrastructure such as high-speed rack-mounted servers, connected with high-speed networking, and reliable storage. Usually such infrastructure is dedicated, physically secured and has reliable power and networking infrastructure. However, much of our idle compute capacity is present in unmanaged infrastructure like idle desktops, lab machines, physically distant server machines, and laptops. We present a scheme to utilize this idle compute capacity on a best-effort basis and provide high availability even in the face of failure of individual components or facilities. We run virtual machines on the commodity infrastructure and present a cloud interface to our end users. The primary challenge is to maintain availability in the presence of node failures, network failures, and power failures. We run multiple copies of a Virtual Machine (VM) redundantly on geographically dispersed physical machines to achieve availability. If one of the running copies of a VM fails, we seamlessly switch over to another running copy. We use Virtual Machine Record/Replay capability to implement this redundancy and switchover. So far, we have implemented VM Record/Replay for uniprocessor machines over Linux/KVM and are currently working on VM Record/Replay on shared-memory multiprocessor machines. We report initial experimental results based on our implementation. Comment: Oral presentation at IEEE "Cloud Computing for Emerging Markets", Oct. 11-12, 2012, Bangalore, Indi

arXiv.org e-Print Archive

Crossref