Search CORE

55,213 research outputs found

Scalable parallel communications

Author: Foudriat E. C.
Khanna S.
Maly K.
Mukkamala R.
Overstreet C. M.
Sekhar Y. S.
Zubair M.
Publication venue
Publication date
Field of study

Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulations studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors, and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low cost solution to providing multiple 100 Mbps on current machines. In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (it also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, other TCP's running in parallel provide high bandwidth service to a single application); and (3) coarse grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism) also with near linear speed-ups

NASA Technical Reports Server

Extending and Implementing the Self-adaptive Virtual Processor for Distributed Memory Architectures

Author: Koivisto Juha
van Tol Michiel W.
Publication venue
Publication date: 01/01/2011
Field of study

Many-core architectures of the future are likely to have distributed memory organizations and need fine grained concurrency management to be used effectively. The Self-adaptive Virtual Processor (SVP) is an abstract concurrent programming model which can provide this, but the model and its current implementations assume a single address space shared memory. We investigate and extend SVP to handle distributed environments, and discuss a prototype SVP implementation which transparently supports execution on heterogeneous distributed memory clusters over TCP/IP connections, while retaining the original SVP programming model

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications

String Figure: A Scalable and Elastic Memory Network Architecture

Author: IEEE
Miller Ethan L
Ogleari Matheus Almeida
Qian Chen
Yu Ye
Zhao Jishen
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

Crossref

eScholarship - University of California

Cache Equalizer: A Cache Pressure Aware Block Placement Scheme for Large-Scale Chip Multiprocessors

Author: Cho Sangyeun
Hammoud Mohammad
Melhem Rami
Publication venue: Department of Computer Science, University of Pittsburgh
Publication date: 01/01/2009
Field of study

This paper describes Cache Equalizer (CE), a novel distributed cache management scheme for large scale chip multiprocessors (CMPs). Our work is motivated by large asymmetry in cache sets usages. CE decouples the physical locations of cache blocks from their addresses for the sake of reducing misses caused by destructive interferences. Temporal pressure at the on-chip last-level cache, is continuously collected at a group (comprised of cache sets) granularity, and periodically recorded at the memory controller to guide the placement process. An incoming block is consequently placed at a cache group that exhibits the minimum pressure. CE provides Quality of Service (QoS) by robustly offering better performance than the baseline shared NUCA cache. Simulation results using a full-system simulator demonstrate that CE outperforms shared NUCA caches by an average of 15.5% and by as much as 28.5% for the benchmark programs we examined. Furthermore, evaluations manifested the outperformance of CE versus related CMP cache designs

D-Scholarship@Pitt

Supporting Cyber-Physical Systems with Wireless Sensor Networks: An Outlook of Software and Services

Author: Duquennoy Simon
Höglund Joel
Misra Prasant
Mottola Luca
Raza Shahid
Tsiftes Nicolas
Voigt Thiemo
Publication venue: Indian Institute of Science
Publication date: 01/01/2013
Field of study

Sensing, communication, computation and control technologies are the essential building blocks of a cyber-physical system (CPS). Wireless sensor networks (WSNs) are a way to support CPS as they provide fine-grained spatial-temporal sensing, communication and computation at a low premium of cost and power. In this article, we explore the fundamental concepts guiding the design and implementation of WSNs. We report the latest developments in WSN software and services for meeting existing requirements and newer demands; particularly in the areas of: operating system, simulator and emulator, programming abstraction, virtualization, IP-based communication and security, time and location, and network monitoring and management. We also reflect on the ongoing efforts in providing dependable assurances for WSN-driven CPS. Finally, we report on its applicability with a case-study on smart buildings

Archivio istituzionale della ricerca - Politecnico di Milano

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Parallel Discrete Event Simulation with Erlang

Author: D'Angelo Gabriele
Marzolla Moreno
Toscano Luca
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012
Field of study

Discrete Event Simulation (DES) is a widely used technique in which the state of the simulator is updated by events happening at discrete points in time (hence the name). DES is used to model and analyze many kinds of systems, including computer architectures, communication networks, street traffic, and others. Parallel and Distributed Simulation (PADS) aims at improving the efficiency of DES by partitioning the simulation model across multiple processing elements, in order to enabling larger and/or more detailed studies to be carried out. The interest on PADS is increasing since the widespread availability of multicore processors and affordable high performance computing clusters. However, designing parallel simulation models requires considerable expertise, the result being that PADS techniques are not as widespread as they could be. In this paper we describe ErlangTW, a parallel simulation middleware based on the Time Warp synchronization protocol. ErlangTW is entirely written in Erlang, a concurrent, functional programming language specifically targeted at building distributed systems. We argue that writing parallel simulation models in Erlang is considerably easier than using conventional programming languages. Moreover, ErlangTW allows simulation models to be executed either on single-core, multicore and distributed computing architectures. We describe the design and prototype implementation of ErlangTW, and report some preliminary performance results on multicore and distributed architectures using the well known PHOLD benchmark.Comment: Proceedings of ACM SIGPLAN Workshop on Functional High-Performance Computing (FHPC 2012) in conjunction with ICFP 2012. ISBN: 978-1-4503-1577-

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A Case for Time Slotted Channel Hopping for ICN in the IoT

Author: Adjih Cédric
Baccelli Emmanuel
Hahm Oliver
Schmidt Thomas C.
Wählisch Matthias
Publication venue
Publication date: 01/01/2016
Field of study

Recent proposals to simplify the operation of the IoT include the use of Information Centric Networking (ICN) paradigms. While this is promising, several challenges remain. In this paper, our core contributions (a) leverage ICN communication patterns to dynamically optimize the use of TSCH (Time Slotted Channel Hopping), a wireless link layer technology increasingly popular in the IoT, and (b) make IoT-style routing adaptive to names, resources, and traffic patterns throughout the network--both without cross-layering. Through a series of experiments on the FIT IoT-LAB interconnecting typical IoT hardware, we find that our approach is fully robust against wireless interference, and almost halves the energy consumed for transmission when compared to CSMA. Most importantly, our adaptive scheduling prevents the time-slotted MAC layer from sacrificing throughput and delay

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

REPOSIT

The simplicity project: easing the burden of using complex and heterogeneous ICT devices and services

Author: EU FP5 (Funder)
Blefari-Melazzi N.
Ceneri G.
Cortese G.
Davies Nigel
Dellas N.
Friday Adrian
Hamard J.
Koutsoloukas E.
Niedermeier C.
Noda C.
Papanis J.
Petrioli C.
Rukzio Enrico
Storz Oliver
Urban J.
Publication venue
Publication date: 01/01/2004
Field of study

As of today, to exploit the variety of different "services", users need to configure each of their devices by using different procedures and need to explicitly select among heterogeneous access technologies and protocols. In addition to that, users are authenticated and charged by different means. The lack of implicit human computer interaction, context-awareness and standardisation places an enormous burden of complexity on the shoulders of the final users. The IST-Simplicity project aims at leveraging such problems by: i) automatically creating and customizing a user communication space; ii) adapting services to user terminal characteristics and to users preferences; iii) orchestrating network capabilities. The aim of this paper is to present the technical framework of the IST-Simplicity project. This paper is a thorough analysis and qualitative evaluation of the different technologies, standards and works presented in the literature related to the Simplicity system to be developed

Lancaster E-Prints

Archivio della ricerca- Università di Roma La Sapienza