Cycle-accurate evaluation of reconfigurable photonic networks-on-chip

Author: Artundo Iñigo
Debaes Christof
Heirman Wim
Thienpont Hugo
Van Campenhout Jan
Publication venue: SPIE-Intl Soc Optical Eng
Publication date: 01/01/2010
There is little doubt that the most important limiting factors of the performance of next-generation Chip Multiprocessors (CMPs) will be the power efficiency and the available communication speed between cores. Photonic Networks-on-Chip (NoCs) have been suggested as a viable route to relieve the off- and on-chip interconnection bottleneck. Low-loss integrated optical waveguides can transport very high-speed data signals over longer distances as compared to on-chip electrical signaling. In addition, with the development of silicon microrings, photonic switches can be integrated to route signals in a data-transparent way. Although several photonic NoC proposals exist, their use is often limited to the communication of large data messages due to a relatively long set-up time of the photonic channels. In this work, we evaluate a reconfigurable photonic NoC in which the topology is adapted automatically (on a microsecond scale) to the evolving traffic situation by use of silicon microrings. To evaluate this system's performance, the proposed architecture has been implemented in a detailed full-system cycle-accurate simulator which is capable of generating realistic workloads and traffic patterns. In addition, a model was developed to estimate the power consumption of the full interconnection network, which was compared with other photonic and electrical NoC solutions. We find that our proposed network architecture significantly lowers the average memory access latency (35% reduction) while only generating a modest increase in power consumption (20%), compared to a conventional concentrated mesh electrical signaling approach. When comparing our solution to high-speed circuit-switched photonic NoCs, long photonic channel set-up times can be tolerated, which makes our approach directly applicable to current shared-memory CMPs.

Crossref

Ghent University Academic Bibliography

Simulation of 1+1 dimensional surface growth and lattices gases using GPUs

Author: Gergely Ódor
Géza Ódor
Henrik Schulz
Máté Ferenc Nagy
Publication venue: Elsevier BV
Publication date: 01/01/2011
Restricted solid-on-solid surface growth models can be mapped onto binary lattice gases. We show that efficient simulation algorithms can be realized on GPUs either by CUDA or by OpenCL programming. We consider a deposition/evaporation model following Kardar-Parisi-Zhang growth in 1+1 dimensions related to the Asymmetric Simple Exclusion Process and show that, for sizes that fit into the shared memory of GPUs, one can achieve the maximum parallelization speedup (~ x100 for a Quadro FX 5800 graphics card with respect to a single CPU of 2.67 GHz). This permits us to study the effect of quenched columnar disorder, requiring extremely long simulation times. We compare the CUDA realization with an OpenCL implementation designed for processor clusters via MPI. A two-lane traffic model with randomized turning points is also realized and the dynamical behavior has been investigated.

Comment: 20 pages, 12 figures, 1 table, to appear in Comp. Phys. Com
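
As a rough illustration of the mapping mentioned in the abstract above, the following serial Python sketch treats each down-step of a 1+1 dimensional surface as a particle and each up-step as a hole, so that deposition at a local minimum becomes a particle hop to the right and evaporation at a local maximum becomes a hop to the left. It is not the authors' CUDA or OpenCL code; the function names, the random-sequential update, the periodic boundaries, and the parameters `p` and `q` are all assumptions made for this example.

```python
import random


def heights_from_occupancy(occ):
    """Rebuild a height profile from a lattice gas configuration.

    Assumed convention: a particle (1) is a down-step, a hole (0) an up-step.
    """
    h = [0]
    for n in occ:
        h.append(h[-1] + (1 - 2 * n))
    return h


def sweep(occ, p, q, rng):
    """One random-sequential sweep on a ring of len(occ) sites.

    p: probability of accepting a deposition (particle hops right).
    q: probability of accepting an evaporation (particle hops left).
    """
    size = len(occ)
    for _ in range(size):
        i = rng.randrange(size)
        j = (i + 1) % size  # periodic boundary (an assumption of this sketch)
        if occ[i] == 1 and occ[j] == 0 and rng.random() < p:
            occ[i], occ[j] = 0, 1  # local minimum filled: height rises by 2
        elif occ[i] == 0 and occ[j] == 1 and rng.random() < q:
            occ[i], occ[j] = 1, 0  # local maximum removed: height drops by 2
    return occ


def run_demo(size=64, sweeps=100, p=1.0, q=0.0, seed=0):
    """Start from an alternating (on average flat) surface and evolve it."""
    rng = random.Random(seed)
    occ = [i % 2 for i in range(size)]
    for _ in range(sweeps):
        sweep(occ, p, q, rng)
    return occ
```

Calling `heights_from_occupancy(run_demo())` returns a height profile whose neighbouring sites always differ by exactly one unit, which is the restricted solid-on-solid constraint. A GPU version would have to update many sites in parallel without conflicts, which this sketch does not attempt.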

arXiv.org e-Print Archive

Crossref

ELTE Digital Institutional Repository (EDIT)

Queueing analysis of a canonical model of real-time multiprocessors

Author: Krishna C. M.
Shin K. G.
A logical classification of multiprocessor structures from the point of view of control applications is presented, together with a computation of the response time distribution for a canonical model of a real-time multiprocessor. The multiprocessor is approximated by a blocking model. Two separate models are derived: one created from the system's point of view, and the other from the point of view of an incoming task.

NASA Technical Reports Server

Fault tolerance in super-scalar and VLIW processors

Author: Blough Douglas M.
Nicolau Alexandru
Publication venue: eScholarship, University of California
Publication date: 01/01/1991
In this paper, we present a method for utilizing the spare capacity in super-scalar and very long instruction word (VLIW) processors to tolerate functional unit failures. Unlike previous work that was primarily interested in detection of transient faults, we are concerned with more permanent and/or intermittent faults which necessitate processor reconfiguration. Our method utilizes the VLIW compiler or the superscalar scheduler to insert redundant operations whenever idle functional units exist. The results of these redundant operations are used to detect and diagnose functional unit failures. For super-scalar processors, the scheduler can then utilize this information to ensure that operations are performed only on non-faulty units. In VLIW processors, this is equivalent to recompiling the code to run on the remaining non-faulty functional units. Since in certain applications, recompilation may not be possible, we consider two alternative reconfiguration strategies for VLIW processors. These strategies sacrifice storage space and execution time, respectively, in order to reconfigure without recompiling. We present Markov models that describe the behavior of processors using these different approaches and we evaluate their reliabilities. The results show that, while super-scalar and VLIW with recompilation provide the highest reliability, all proposed strategies significantly increase reliability over that of an unprotected processor.
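
The idea of filling idle functional units with redundant operations, as described in the abstract above, can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not the scheduler or compiler pass from the paper: the fault model (a faulty unit returns a sum that is off by one), the single-fault assumption, the use of a third unit as referee for diagnosis, and all function names are hypothetical.

```python
def execute(unit, a, b, faulty):
    """Hypothetical fault model: a faulty unit returns a wrong sum."""
    return a + b + (1 if unit in faulty else 0)


def run_cycle(ops, units, faulty=frozenset()):
    """Issue the additions in ops on the first units; reuse idle units as shadows.

    Returns the primary results and a list of (primary, shadow, op) mismatches.
    """
    primary = units[:len(ops)]
    idle = units[len(ops):]
    results, mismatches = [], []
    for k, (a, b) in enumerate(ops):
        r = execute(primary[k], a, b, faulty)
        results.append(r)
        if k < len(idle):  # a spare unit exists: run a redundant copy
            shadow = execute(idle[k], a, b, faulty)
            if shadow != r:
                mismatches.append((primary[k], idle[k], (a, b)))
    return results, mismatches


def diagnose(mismatch, units, faulty=frozenset()):
    """Re-run the operation on a third unit to decide which of the two failed.

    Assumes at most one faulty unit; returns its id, or None if undecided.
    """
    u, v, (a, b) = mismatch
    referee = next(w for w in units if w not in (u, v))
    ref = execute(referee, a, b, faulty)
    ru = execute(u, a, b, faulty)
    rv = execute(v, a, b, faulty)
    if ru == ref and rv != ref:
        return v
    if rv == ref and ru != ref:
        return u
    return None
```

For example, with four units, two operations per cycle, and unit 1 marked faulty, `run_cycle` reports one mismatch between units 1 and 3, and `diagnose` then singles out unit 1. A scheduler could subsequently avoid issuing operations to that unit.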

eScholarship - University of California