5,577 research outputs found
OutFlank Routing: Increasing Throughput in Toroidal Interconnection Networks
We present a new, deadlock-free, routing scheme for toroidal interconnection
networks, called OutFlank Routing (OFR). OFR is an adaptive strategy which
exploits non-minimal links, both in the source and in the destination nodes.
When minimal links are congested, OFR deroutes packets to carefully chosen
intermediate destinations, in order to obtain travel paths which are only an
additive constant longer than the shortest ones. Since routing performance is
very sensitive to changes in the traffic model or in the router parameters, an
accurate discrete-event simulator of the toroidal network has been developed to
empirically validate OFR, by comparing it against other relevant routing
strategies, over a range of typical real-world traffic patterns. On the
16x16x16 (4096 nodes) simulated network OFR exhibits improvements of the
maximum sustained throughput between 14% and 114%, with respect to Adaptive
Bubble Routing.Comment: 9 pages, 5 figures, to be presented at ICPADS 201
An analytical performance model for the Spidergon NoC
Networks on chip (NoC) emerged as a promising alternative to bus-based interconnect networks to handle the increasing communication requirements of the large systems on chip. Employing an appropriate topology for a NoC is of high importance mainly because it typically trade-offs between cross-cutting concerns such as performance and cost. The spidergon topology is a novel architecture which is proposed recently for NoC domain. The objective of the spidergon NoC has been addressing the need for a fixed and optimized topology to realize cost effective multi-processor SoC (MPSoC) development [7]. In this paper we analyze the traffic behavior in the spidergon scheme and present an analytical evaluation of the average message latency in the architecture. We prove the validity of the analysis by comparing the model against the results produced by a discreteevent simulator
Performance evaluation of multi-core multi-cluster architecture
A multi-core cluster is a cluster composed of numbers of nodes where each node has a number of processors, each with more than one core within each single chip. Cluster nodes are connected via an interconnection network. Multi-cored processors are able to achieve higher performance without driving up power consumption and heat, which is the main concern in a single-core processor. A general problem in the network arises from the fact that multiple messages can be in transit at the same time on the same network links. This paper considers the communication latencies of a multi-core multi-cluster architecture will be investigated using simulation experiments and measurements under various working conditions
Parallel and Distributed Simulation from Many Cores to the Public Cloud (Extended Version)
In this tutorial paper, we will firstly review some basic simulation concepts
and then introduce the parallel and distributed simulation techniques in view
of some new challenges of today and tomorrow. More in particular, in the last
years there has been a wide diffusion of many cores architectures and we can
expect this trend to continue. On the other hand, the success of cloud
computing is strongly promoting the everything as a service paradigm. Is
parallel and distributed simulation ready for these new challenges? The current
approaches present many limitations in terms of usability and adaptivity: there
is a strong need for new evaluation metrics and for revising the currently
implemented mechanisms. In the last part of the paper, we propose a new
approach based on multi-agent systems for the simulation of complex systems. It
is possible to implement advanced techniques such as the migration of simulated
entities in order to build mechanisms that are both adaptive and very easy to
use. Adaptive mechanisms are able to significantly reduce the communication
cost in the parallel/distributed architectures, to implement load-balance
techniques and to cope with execution environments that are both variable and
dynamic. Finally, such mechanisms will be used to build simulations on top of
unreliable cloud services.Comment: Tutorial paper published in the Proceedings of the International
Conference on High Performance Computing and Simulation (HPCS 2011). Istanbul
(Turkey), IEEE, July 2011. ISBN 978-1-61284-382-
Analytical performance modelling of adaptive wormhole routing in the star interconnection network
The star graph was introduced as an attractive alternative to the well-known hypercube and its properties have been well studied in the past. Most of these studies have focused on topological properties and algorithmic aspects of this network. Although several analytical models have been proposed in the literature for different interconnection networks, none of them have dealt with star graphs. This paper proposes the first analytical model to predict message latency in wormhole-switched star interconnection networks with fully adaptive routing. The analysis focuses on a fully adaptive routing algorithm which has shown to be the most effective for star graphs. The results obtained from simulation experiments confirm that the proposed model exhibits a good accuracy under different operating conditions
Engine performance characteristics and evaluation of variation in the length of intake plenum
In the engine with multipoint fuel injection system using electronically controlled fuel injectors has an intake manifold in which only the air flows and, the fuel is injected into the intake valve. Since the intake manifolds transport mainly air, the supercharging effects of the variable length intake plenum will be different from carbureted engine. Engine tests have been carried out with the aim of constituting a base study to design a new variable length intake manifold plenum. The objective in this research is to study the engine performance characteristics and to evaluate the effects of the variation in the length of intake plenum. The engine test bed used for experimental work consists of a control panel, a hydraulic dynamometer and measurement instruments to measure the parameters of engine performance characteristics. The control panel is being used to perform administrative and management operating system. Besides that, the hydraulic dynamometer was used to measure the power of an engine by using a cell filled with liquid to increase its load. Thus, measurement instrument is provided in this test to measure the as brake torque, brake power, thermal efficiency and specific fuel consumption. The results showed that the variation in the plenum length causes an improvement on the engine performance characteristics especially on the fuel consumption at high load and low engine speeds which are put forward the system using for urban roads. From this experiment, it will show the behavior of engine performance
An empirical evaluation of techniques for parallel simulation of message passing networks
209 p.[EN]In the field of computer design, simulation is an essential tool to validate and evaluate architectural proposals. Conventional simulation techniques, designed for their use in sequential computers, are too slow if the system to simulate is large or complex. The aim of this work is to search for techniques to accelerate simulations exploiting the parallelism available in current, commercial multicomputers, and to use these techniques to study a model of a message router. This router has been designed to constitute the communication infrastructure of a (hypothetical) massively parallel computer.
Three parallel simulation techniques have been considered: synchronous, asynchronous-conservative and asynchronous-optimistic. These algorithms have been implemented in three multicomputers: a transputer-based Supernode, an Intel Paragon and a network of workstations. The influence that factors such as the characteristics of the simulated models, the organization of the simulators and the characteristics of the target multicomputers have in the performance of the simulations has been measured and characterized.
It is concluded that optimistic parallel simulation techniques are not suitable for the considered kind of models, although they may provide good performance in other environments. A network of workstations is not the right platform for our experiments, because the communication demands of the parallel simulators surpass the abilities of local area networks—the granularity is too fine. Synchronous and conservative parallel simulation techniques perform very well in the Supernode and in the Paragon, specially if the model to simulate is complex or large—precisely the worst case for traditional, sequential simulators. This way, studies previously considered as unrealizable, due to their exceedingly high computational cost, can be performed in reasonable times. Additionally, the spectrum of possibilities of using multicomputers can be broadened to execute more than numeric applications.[ES]En el ámbito del diseño de computadores, la simulación es una herramienta imprescindible para la validación y evaluación de cualquier propuesta arquitectónica. Las ténicas convencionales de simulación, diseñadas para su utilización en computadores secuenciales, son demasiado lentas si el sistema a simular es grande o complejo. El objetivo de esta tesis es buscar técnicas para acelerar estas simulaciones, aprovechando el paralelismo disponible en multicomputadores comerciales, y usar esas técnicas para el estudio de un modelo de encaminador de mensajes. Este encaminador está diseñado para formar infraestructura de comunicaciones de un hipotético computador masivamente paralelo.
En este trabajo se consideran tres técnicas de simulación paralela: síncrona, asíncrona-conservadora y asíncrona-optimista. Estos algoritmos se han implementado en tres multicomputadores: un Supernode basado en Transputers, un Intel Paragon y una red de estaciones de trabajo. Se caracteriza la influencia que tienen en las prestaciones de los simuladores aspectos tales como los parámetros del modelo simulado, la organización del simulador y las características del multicomputador utilizado.
Se concluye que las técnicas de simulación paralela optimista no resultan adecuadas para trabajar con el modelo considerado, aunque pueden ofrecer un buen rendimiento en otros entornos. La red de estaciones de trabajo no resulta una plataforma apropiada para estas simulaciones, ya que una red local no reúne condiciones para la ejecución de aplicaciones paralelas de grano fino. Las técnicas de simulación paralela síncrona y conservadora dan muy buenos resultados en el Supernode y en el Paragon, especialmente si el modelo a simular es complejo o grande—precisamente el peor caso para los algoritmos secuenciales. De esta forma, estudios previamente considerados inviables, por ser demasiado costosos computacionalmente, pueden realizarse en tiempos razonables. Además, se amplía el espectro de posibilidades de los multicomputadores, utilizándolos para algo más que aplicaciones numéricas.Este trabajo ha sido parcialmente subvencionado por la Comisión Interministerial de Ciencia y Tecnología, bajo contrato TIC95-037
Software-based fault-tolerant routing algorithm in multidimensional networks
Massively parallel computing systems are being built with hundreds or thousands of components such as nodes, links, memories, and connectors. The failure of a component in such systems will not only reduce the computational power but also alter the network's topology. The software-based fault-tolerant routing algorithm is a popular routing to achieve fault-tolerance capability in networks. This algorithm is initially proposed only for two dimensional networks (Suh et al., 2000). Since, higher dimensional networks have been widely employed in many contemporary massively parallel systems; this paper proposes an approach to extend this routing scheme to these indispensable higher dimensional networks. Deadlock and livelock freedom and the performance of presented algorithm, have been investigated for networks with different dimensionality and various fault regions. Furthermore, performance results have been presented through simulation experiments
- …