Search CORE

2,743 research outputs found

A NoC-based hybrid message-passing/shared-memory approach to CMP design

Author: Agarwal
Daemen
Forsell
Grecu
Karniadakis
Lorensen
Mario R. Casu
Massimo Ruo Roch
Maurizio Zamboni
Owens
Paulin
Radulescu
Sergio V. Tota
Snir
Tota
Publication venue: Elsevier
Publication date: 01/01/2011
Field of study

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips

Author: Hechtman Blake A.
Sorin Daniel J.
Publication venue
Publication date: 01/01/2013
Field of study

The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the cores with cache-coherent shared virtual memory (CCSVM), this is not the communication paradigm used by any current HMC. In this paper, we present a CCSVM design for a CPU/MTTOP chip, as well as an extension of the pthreads programming model, called xthreads, for programming this HMC. Our goal is to evaluate the potential performance benefits of tightly coupling heterogeneous cores with CCSVM

arXiv.org e-Print Archive

CiteSeerX

Crossref

A Non-blocking Interconnection Network-Shared Cache Organization for Multi-core Processors

Author: Allam Rebhi Mohammad AbuMwais
علام ربحي محمد ابومويس
Publication venue: جامعة القدس
Publication date: 28/09/2013
Field of study

Al-Quds University Digital Repository

Shared versus distributed memory multiprocessors

Author: Jordan Harry F.
Publication venue
Publication date
Field of study

The question of whether multiprocessors should have shared or distributed memory has attracted a great deal of attention. Some researchers argue strongly for building distributed memory machines, while others argue just as strongly for programming shared memory multiprocessors. A great deal of research is underway on both types of parallel systems. Special emphasis is placed on systems with a very large number of processors for computation intensive tasks and considers research and implementation trends. It appears that the two types of systems will likely converge to a common form for large scale multiprocessors

NASA Technical Reports Server

Support for Programming Models in Network-on-Chip-based Many-core Systems

Author: Rasmussen Morten Sleth
Publication venue: Technical University of Denmark
Publication date: 01/09/2010
Field of study

Online Research Database In Technology

Concepts of Communication and Synchronization in FPGA-Based Embedded Multiprocessor Systems

Author: David Antonio-Torres
Publication venue: 'IntechOpen'
Publication date: 16/03/2012
Field of study

IntechOpen

Efficient Inter-Task Communication for Nested Loop Programs on a Multiprocessor System

Author: Bekooij Marco
Bijlsma Tjerk
Jansen Pierre
Smit Gerard
Publication venue
Publication date: 01/01/2007
Field of study

In modern multiprocessor systems, processors can be stalled by inter-task communication when reading from a remote buffer. This paper presents a solution for the inter-task communication, that has a minimal impact on the performance of the system, hides the inter-task communication latency without requiring additional hardware. The solution applies to jobs, represented as task graphs, where the tasks are nested loop programs. Buffers are allocated in scratch-pad memories of the consuming tasks to provide low latency read access. For the nested loop programs, minimal buffer sizes can be determined to cover all possible communication patterns. The added computational complexity is low, as the solution adds only a few operations to the nested loop programs

University of Twente Research Information

The force on the flex: Global parallelism and portability

Author: Jordan H. F.
Publication venue
Publication date
Field of study

A parallel programming methodology, called the force, supports the construction of programs to be executed in parallel by an unspecified, but potentially large, number of processes. The methodology was originally developed on a pipelined, shared memory multiprocessor, the Denelcor HEP, and embodies the primitive operations of the force in a set of macros which expand into multiprocessor Fortran code. A small set of primitives is sufficient to write large parallel programs, and the system has been used to produce 10,000 line programs in computational fluid dynamics. The level of complexity of the force primitives is intermediate. It is high enough to mask detailed architectural differences between multiprocessors but low enough to give the user control over performance. The system is being ported to a medium scale multiprocessor, the Flex/32, which is a 20 processor system with a mixture of shared and local memory. Memory organization and the type of processor synchronization supported by the hardware on the two machines lead to some differences in efficient implementations of the force primitives, but the user interface remains the same. An initial implementation was done by retargeting the macros to Flexible Computer Corporation's ConCurrent C language. Subsequently, the macros were caused to directly produce the system calls which form the basis for ConCurrent C. The implementation of the Fortran based system is in step with Flexible Computer Corporations's implementation of a Fortran system in the parallel environment

NASA Technical Reports Server