
14,434 research outputs found

A case study for NoC based homogeneous MPSoC architectures

Author: Casu Mario Roberto
Macchiarulo Luca
Ruo Roch Massimo
Tota Sergio Vincenzo
Zamboni Maurizio
Publication venue: IEEE
Publication date: 01/01/2009

The many-core design paradigm requires flexible and modular hardware and software components to provide the scalability needed by next-generation on-chip multiprocessor architectures. A multidisciplinary approach is necessary to consider all the interactions between the different components of the design. In this paper, a complete design methodology that tackles at once the aspects of system-level modeling, hardware architecture, and programming model has been successfully used for the implementation of a multiprocessor network-on-chip (NoC)-based system, the NoCRay graphic accelerator. The design, based on 16 processors, was prototyped on a field-programmable gate array (FPGA) and then laid out in 90-nm technology. Post-layout results show very low power and area, as well as a clock frequency of 500 MHz. Results also show that an array of small and simple processors outperforms a single high-end general-purpose processor.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Tribes Is Hard in the Message Passing Model

Author: Chattopadhyay Arkadev
Mukhopadhyay Sagnik
Publication date: 01/01/2015

We consider the point-to-point message passing model of communication in which there are $k$ processors with individual private inputs, each $n$ bits long. Each processor is located at a node of an underlying undirected graph and has access to private random coins. An edge of the graph is a private channel of communication between its endpoints. The processors have to compute a given function of all their inputs by communicating along these channels. While this model has been widely used in distributed computing, strong lower bounds on the amount of communication needed to compute simple functions have just begun to appear. In this work, we prove a tight lower bound of $\Omega(kn)$ on the communication needed for computing the Tribes function, when the underlying graph is a star of $k+1$ nodes that has $k$ leaves with inputs and a center with no input. A lower bound on this topology easily implies comparable bounds for others. Our lower bounds are obtained by building upon the recent information-theoretic techniques of Braverman et al. (FOCS'13) and combining them with the earlier work of Jayram, Kumar and Sivakumar (STOC'03). This approach yields information complexity bounds that are of independent interest.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Representing Conversations for Scalable Overhearing

Author: Gutnik G.
Kaminka G. A.
Publication venue: AI Access Foundation
Publication date: 26/09/2011

Open distributed multi-agent systems are gaining interest in the academic community and in industry. In such open settings, agents are often coordinated using standardized agent conversation protocols. The representation of such protocols (for analysis, validation, monitoring, etc.) is an important aspect of multi-agent applications. Recently, Petri nets have been shown to be an interesting approach to such representation, and radically different approaches using Petri nets have been proposed. However, their relative strengths and weaknesses have not been examined. Moreover, their scalability and suitability for different tasks have not been addressed. This paper addresses both these challenges. First, we analyze existing Petri net representations in terms of their scalability and appropriateness for overhearing, an important task in monitoring open multi-agent systems. Then, building on the insights gained, we introduce a novel representation using Colored Petri nets that explicitly represents legal joint conversation states and messages. This representation approach offers significant improvements in scalability and is particularly suitable for overhearing. Furthermore, we show that this new representation offers comprehensive coverage of all conversation features of the FIPA conversation standards. We also present a procedure for transforming AUML conversation protocol diagrams (a standard human-readable representation) to our Colored Petri net representation.

arXiv.org e-Print Archive

Crossref

A NoC-based hybrid message-passing/shared-memory approach to CMP design

Author: Agarwal
Daemen
Forsell
Grecu
Karniadakis
Lorensen
Mario R. Casu
Massimo Ruo Roch
Maurizio Zamboni
Owens
Paulin
Radulescu
Sergio V. Tota
Snir
Tota
Publication venue: Elsevier
Publication date: 01/01/2011

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Turbomachinery CFD on parallel computers

Author: Blech Richard A.
Milner Edward J.
Quealy Angela
Townsend Scott E.

The role of multistage turbomachinery simulation in the development of propulsion system models is discussed. In particular, the need for simulations with higher fidelity and faster turnaround time is highlighted. It is shown how such fast simulations can be used in engineering-oriented environments. The use of parallel processing to achieve the required turnaround times is discussed. Current work by several researchers in this area is summarized. Parallel turbomachinery CFD research at the NASA Lewis Research Center is then highlighted. These efforts are focused on implementing the average-passage turbomachinery model on MIMD, distributed-memory parallel computers. Performance results are given for inviscid, single-blade-row and viscous, multistage applications on several parallel computers, including networked workstations.

NASA Technical Reports Server

Scalable data abstractions for distributed parallel computations

Author: Hanlon James
Hollis Simon J.
May David
Publication date: 03/10/2012

The ability to express a program as a hierarchical composition of parts is an essential tool in managing the complexity of software. A key abstraction this provides is the separation of the representation of data from the computation. Many current parallel programming models use a shared memory model to provide data abstraction, but this does not scale well to large numbers of cores because of non-determinism and access latency. This paper proposes a simple programming model that allows scalable parallel programs to be expressed with distributed representations of data, and that gives the programmer the flexibility to employ shared or distributed styles of data-parallelism where applicable. It is capable of an efficient implementation and, with the provision of a small set of primitive capabilities in the hardware, it can be compiled to operate directly on the hardware, in the same way that stack-based allocation operates for subroutines in sequential machines.

arXiv.org e-Print Archive

Explore Bristol Research