55,213 research outputs found

    Scalable parallel communications

    Get PDF
    Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulations studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors, and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low cost solution to providing multiple 100 Mbps on current machines. In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (it also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, other TCP's running in parallel provide high bandwidth service to a single application); and (3) coarse grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism) also with near linear speed-ups

    Extending and Implementing the Self-adaptive Virtual Processor for Distributed Memory Architectures

    Get PDF
    Many-core architectures of the future are likely to have distributed memory organizations and need fine grained concurrency management to be used effectively. The Self-adaptive Virtual Processor (SVP) is an abstract concurrent programming model which can provide this, but the model and its current implementations assume a single address space shared memory. We investigate and extend SVP to handle distributed environments, and discuss a prototype SVP implementation which transparently supports execution on heterogeneous distributed memory clusters over TCP/IP connections, while retaining the original SVP programming model

    Cache Equalizer: A Cache Pressure Aware Block Placement Scheme for Large-Scale Chip Multiprocessors

    Get PDF
    This paper describes Cache Equalizer (CE), a novel distributed cache management scheme for large scale chip multiprocessors (CMPs). Our work is motivated by large asymmetry in cache sets usages. CE decouples the physical locations of cache blocks from their addresses for the sake of reducing misses caused by destructive interferences. Temporal pressure at the on-chip last-level cache, is continuously collected at a group (comprised of cache sets) granularity, and periodically recorded at the memory controller to guide the placement process. An incoming block is consequently placed at a cache group that exhibits the minimum pressure. CE provides Quality of Service (QoS) by robustly offering better performance than the baseline shared NUCA cache. Simulation results using a full-system simulator demonstrate that CE outperforms shared NUCA caches by an average of 15.5% and by as much as 28.5% for the benchmark programs we examined. Furthermore, evaluations manifested the outperformance of CE versus related CMP cache designs

    Supporting Cyber-Physical Systems with Wireless Sensor Networks: An Outlook of Software and Services

    Get PDF
    Sensing, communication, computation and control technologies are the essential building blocks of a cyber-physical system (CPS). Wireless sensor networks (WSNs) are a way to support CPS as they provide fine-grained spatial-temporal sensing, communication and computation at a low premium of cost and power. In this article, we explore the fundamental concepts guiding the design and implementation of WSNs. We report the latest developments in WSN software and services for meeting existing requirements and newer demands; particularly in the areas of: operating system, simulator and emulator, programming abstraction, virtualization, IP-based communication and security, time and location, and network monitoring and management. We also reflect on the ongoing efforts in providing dependable assurances for WSN-driven CPS. Finally, we report on its applicability with a case-study on smart buildings

    Parallel Discrete Event Simulation with Erlang

    Full text link
    Discrete Event Simulation (DES) is a widely used technique in which the state of the simulator is updated by events happening at discrete points in time (hence the name). DES is used to model and analyze many kinds of systems, including computer architectures, communication networks, street traffic, and others. Parallel and Distributed Simulation (PADS) aims at improving the efficiency of DES by partitioning the simulation model across multiple processing elements, in order to enabling larger and/or more detailed studies to be carried out. The interest on PADS is increasing since the widespread availability of multicore processors and affordable high performance computing clusters. However, designing parallel simulation models requires considerable expertise, the result being that PADS techniques are not as widespread as they could be. In this paper we describe ErlangTW, a parallel simulation middleware based on the Time Warp synchronization protocol. ErlangTW is entirely written in Erlang, a concurrent, functional programming language specifically targeted at building distributed systems. We argue that writing parallel simulation models in Erlang is considerably easier than using conventional programming languages. Moreover, ErlangTW allows simulation models to be executed either on single-core, multicore and distributed computing architectures. We describe the design and prototype implementation of ErlangTW, and report some preliminary performance results on multicore and distributed architectures using the well known PHOLD benchmark.Comment: Proceedings of ACM SIGPLAN Workshop on Functional High-Performance Computing (FHPC 2012) in conjunction with ICFP 2012. ISBN: 978-1-4503-1577-

    A Case for Time Slotted Channel Hopping for ICN in the IoT

    Full text link
    Recent proposals to simplify the operation of the IoT include the use of Information Centric Networking (ICN) paradigms. While this is promising, several challenges remain. In this paper, our core contributions (a) leverage ICN communication patterns to dynamically optimize the use of TSCH (Time Slotted Channel Hopping), a wireless link layer technology increasingly popular in the IoT, and (b) make IoT-style routing adaptive to names, resources, and traffic patterns throughout the network--both without cross-layering. Through a series of experiments on the FIT IoT-LAB interconnecting typical IoT hardware, we find that our approach is fully robust against wireless interference, and almost halves the energy consumed for transmission when compared to CSMA. Most importantly, our adaptive scheduling prevents the time-slotted MAC layer from sacrificing throughput and delay

    The simplicity project: easing the burden of using complex and heterogeneous ICT devices and services

    Get PDF
    As of today, to exploit the variety of different "services", users need to configure each of their devices by using different procedures and need to explicitly select among heterogeneous access technologies and protocols. In addition to that, users are authenticated and charged by different means. The lack of implicit human computer interaction, context-awareness and standardisation places an enormous burden of complexity on the shoulders of the final users. The IST-Simplicity project aims at leveraging such problems by: i) automatically creating and customizing a user communication space; ii) adapting services to user terminal characteristics and to users preferences; iii) orchestrating network capabilities. The aim of this paper is to present the technical framework of the IST-Simplicity project. This paper is a thorough analysis and qualitative evaluation of the different technologies, standards and works presented in the literature related to the Simplicity system to be developed
    • 

    corecore