Search CORE

1,538 research outputs found

Energy Model of Networks-on-Chip and a Bus

Author: Becker Jens E.
Becker Jürgen
Kavaldjiev Nikolay
Smit Gerard J.M.
Wolkotte Pascal T.
Publication venue: IEEE Computer Society
Publication date: 01/01/2005
Field of study

A Network-on-Chip (NoC) is an energy-efficient onchip communication architecture for Multi-Processor Systemon-Chip (MPSoC) architectures. In earlier papers we proposed two Network-on-Chip architectures based on packet-switching and circuit-switching. In this paper we derive an energy model for both NoC architectures to predict their energy consumption per transported bit. Both architectures are also compared with a traditional bus architecture. The energy model is primarily needed to find a near optimal run-time mapping (from an energy point of view) of inter-process communication to NoC link

CiteSeerX

Crossref

University of Twente Research Information

OpenCL + OpenSHMEM Hybrid Programming Model for the Adapteva Epiphany Architecture

Author: D Wentzlaff
J Ross
JE Stone
M Baker
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/08/2016
Field of study

There is interest in exploring hybrid OpenSHMEM + X programming models to extend the applicability of the OpenSHMEM interface to more hardware architectures. We present a hybrid OpenCL + OpenSHMEM programming model for device-level programming for architectures like the Adapteva Epiphany many-core RISC array processor. The Epiphany architecture comprises a 2D array of low-power RISC cores with minimal uncore functionality connected by a 2D mesh Network-on-Chip (NoC). The Epiphany architecture offers high computational energy efficiency for integer and floating point calculations as well as parallel scalability. The Epiphany-III is available as a coprocessor in platforms that also utilize an ARM CPU host. OpenCL provides good functionality for supporting a co-design programming model in which the host CPU offloads parallel work to a coprocessor. However, the OpenCL memory model is inconsistent with the Epiphany memory architecture and lacks support for inter-core communication. We propose a hybrid programming model in which OpenSHMEM provides a better solution by replacing the non-standard OpenCL extensions introduced to achieve high performance with the Epiphany architecture. We demonstrate the proposed programming model for matrix-matrix multiplication based on Cannon's algorithm showing that the hybrid model addresses the deficiencies of using OpenCL alone to achieve good benchmark performance.Comment: 12 pages, 5 figures, OpenSHMEM 2016: Third workshop on OpenSHMEM and Related Technologie

arXiv.org e-Print Archive

Crossref

04241 Abstracts Collection -- Graph Transformations and Process Algebras for Modeling Distributed and Mobile Systems

Author: Gardner Philippa
Montanari Ugo
Publication venue: Dagstuhl Seminar Proceedings. 04241 - Graph Transformations and Process Algebras for Modeling Distributed and Mobile Systems
Publication date: 01/01/2005
Field of study

Recently there has been a lot of research, combining concepts of process algebra with those of the theory of graph grammars and graph transformation systems. Both can be viewed as general frameworks in which one can specify and reason about concurrent and distributed systems. There are many areas where both theories overlap and this reaches much further than just using graphs to give a graphic representation to processes. Processes in a communication network can be seen in two different ways: as terms in an algebraic theory, emphasizing their behaviour and their interaction with the environment, and as nodes (or edges) in a graph, emphasizing their topology and their connectedness. Especially topology, mobility and dynamic reconfigurations at runtime can be modelled in a very intuitive way using graph transformation. On the other hand the definition and proof of behavioural equivalences is often easier in the process algebra setting. Also standard techniques of algebraic semantics for universal constructions, refinement and compositionality can take better advantage of the process algebra representation. An important example where the combined theory is more convenient than both alternatives is for defining the concurrent (noninterleaving), abstract semantics of distributed systems. Here graph transformations lack abstraction and process algebras lack expressiveness. Another important example is the work on bigraphical reactive systems with the aim of deriving a labelled transitions system from an unlabelled reactive system such that the resulting bisimilarity is a congruence. Here, graphs seem to be a convenient framework, in which this theory can be stated and developed. So, although it is the central aim of both frameworks to model and reason about concurrent systems, the semantics of processes can have a very different flavour in these theories. Research in this area aims at combining the advantages of both frameworks and translating concepts of one theory into the other. The Dagsuthl Seminar, which took place from 06.06. to 11.06.2004, was aimed at bringing together researchers of the two communities in order to share their ideas and develop new concepts. These proceedings4 of the do not only contain abstracts of the talks given at the seminar, but also summaries of topics of central interest. We would like to thank all participants of the seminar for coming and sharing their ideas and everybody who has contributed to the proceedings

Dagstuhl Research Online Publication Server

Dagstuhl News January - December 2000

Author: Wilhelm Reinhard
Publication venue: Dagstuhl Publications. Dagstuhl News
Publication date: 01/01/2000
Field of study

"Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic

Dagstuhl Research Online Publication Server

Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks

Author: Chandra Vikas
Chau Benson
Esmaeilzadeh Hadi
Kim Joon Kyung
Lai Liangzhen
Park Jongse
Sharma Hardik
Suda Naveen
Publication venue
Publication date: 30/05/2018
Field of study

Fully realizing the potential of acceleration for Deep Neural Networks (DNNs) requires understanding and leveraging algorithmic properties. This paper builds upon the algorithmic insight that bitwidth of operations in DNNs can be reduced without compromising their classification accuracy. However, to prevent accuracy loss, the bitwidth varies significantly across DNNs and it may even be adjusted for each layer. Thus, a fixed-bitwidth accelerator would either offer limited benefits to accommodate the worst-case bitwidth requirements, or lead to a degradation in final accuracy. To alleviate these deficiencies, this work introduces dynamic bit-level fusion/decomposition as a new dimension in the design of DNN accelerators. We explore this dimension by designing Bit Fusion, a bit-flexible accelerator, that constitutes an array of bit-level processing elements that dynamically fuse to match the bitwidth of individual DNN layers. This flexibility in the architecture enables minimizing the computation and the communication at the finest granularity possible with no loss in accuracy. We evaluate the benefits of BitFusion using eight real-world feed-forward and recurrent DNNs. The proposed microarchitecture is implemented in Verilog and synthesized in 45 nm technology. Using the synthesis results and cycle accurate simulation, we compare the benefits of Bit Fusion to two state-of-the-art DNN accelerators, Eyeriss and Stripes. In the same area, frequency, and process technology, BitFusion offers 3.9x speedup and 5.1x energy savings over Eyeriss. Compared to Stripes, BitFusion provides 2.6x speedup and 3.9x energy reduction at 45 nm node when BitFusion area and frequency are set to those of Stripes. Scaling to GPU technology node of 16 nm, BitFusion almost matches the performance of a 250-Watt Titan Xp, which uses 8-bit vector instructions, while BitFusion merely consumes 895 milliwatts of power

arXiv.org e-Print Archive

Crossref

Reconfigurable and software-defined networks of connectors and components

Author: BRUNI ROBERTO
MONTANARI UGO GIOVANNI ERASMO
Sammartino M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015
Field of study

The diffusion of adaptive systems motivate the study of models of software entities whose interaction capabilities can evolve dynamically. In this paper we overview the contributions in the ASCENS project in the area of software defined networks and of reconfigurable connectors. In particular we highlight: (i) the definition of the Network-conscious pi-calculus and its use in the modeling and verification of the PASTRY protocol, and (ii) the mutual correspondence between different frameworks for defining networks of connectors together with two suitable enhancements for addressing dynamically changing systems

Archivio della Ricerca - Università di Pisa