    PPE-level protocols for carpet clusters

    Journal Article. We describe the lowest level of a suite of protocols for workstation cluster multicomputers: the parts implemented in hardware by a Protocol Processing Engine (PPE) and the software level immediately above the PPE. The stated goal of this work is extremely low end-to-end latency communications on independent workstations connected by a packet switching communication fabric. The workstations are expected to run a commercial operating system and must present the same security characteristics as traditional protocols. We begin with a realization of sender-based protocols. Such protocols can avoid much of the copying that slows down traditional approaches and can also reduce the overhead involved in demultiplexing packet streams and notification of recipients. Finally, we present some measurements of an early prototype.

    Heterogeneous LTE/Wi-Fi architecture for intelligent transportation systems

    Intelligent Transportation Systems (ITS) make use of advanced technologies to enhance road safety and improve traffic efficiency. It is anticipated that ITS will play a vital future role in improving traffic efficiency, safety and comfort and in reducing emissions. To help passengers travel safely, efficiently and conveniently, several application requirements have to be met simultaneously. In addition to the delivery of regular traffic and safety information, vehicular networks have recently been required to support infotainment services. Previous vehicular network designs and architectures do not satisfy this increasing traffic demand, as they are set up for either voice or data traffic, which is not suitable for the transfer of vehicular traffic. This new requirement is one of the key drivers behind the need for new mobile wireless broadband architectures and technologies. For this purpose, this thesis proposes and investigates a heterogeneous IEEE 802.11 and LTE vehicular system that supports both infotainment and ITS traffic control data. IEEE 802.11g is used for V2V communications and as an on-board access network, while LTE is used for V2I communications. A simulation-based performance study is conducted to validate the feasibility of the proposed system in an urban vehicular environment. The system performance is evaluated in terms of data loss, data rate, delay and jitter. Several simulation scenarios are performed and evaluated. In the V2I-only scenario, the delay, jitter and data drops for both ITS and video traffic are within the acceptable limits, as defined by vehicular application requirements. Although video packet drops tend to increase during handover from one eNodeB to another, the attainable data loss rate is still below the defined benchmarks.
    In the integrated V2V-V2I scenario, data loss in uplink ITS traffic was initially observed, so a burst communication technique is applied to prevent packet losses in the critical uplink ITS traffic. A quantitative analysis is performed to determine the number of packets per burst and the inter-packet and inter-burst intervals. It is found that a substantial improvement is achieved using a two-packet burst, with which no packets are lost in the uplink direction. The delay, jitter and data drops for both uplink and downlink ITS traffic and for video traffic are below the benchmarks of vehicular applications. Thus, the results indicate that the proposed heterogeneous system offers acceptable performance that meets the requirements of the different vehicular applications. All simulations are conducted on OPNET Network Modeler and results are subjected to a 95% confidence analysis.

    Message passing on InfiniBand RDMA for parallel run-time supports

    InfiniBand networks are commonly used in the high performance computing area. They offer RDMA-based operations that help to improve the performance of communication subsystems. In this paper, we propose a minimal message-passing communication layer providing the programmer with a point-to-point communication channel implemented by way of InfiniBand RDMA features. Unlike other libraries exploiting the InfiniBand features, such as the well-known Message Passing Interface (MPI), the proposed library is a communication layer only rather than a programming model, and can easily be used as a building block for high-level parallel programming frameworks. Evaluated on micro-benchmarks, the proposed RDMA-based communication channel implementation achieves performance comparable to highly optimised MPI/InfiniBand implementations. Finally, the flexibility of the communication layer is evaluated by integrating it within the FastFlow parallel framework, currently supporting TCP/IP networks (via the ZeroMQ communication library). © 2014 IEEE

    Middleware for large scale in situ analytics workflows

    The trend to exascale is causing researchers to rethink the entire computational science stack, as future generation machines will contain both diverse hardware environments and run times that manage them. Additionally, the science applications themselves are stepping away from the traditional bulk-synchronous model and are moving towards a more dynamic and decoupled environment where analysis routines are run in situ alongside the large scale simulations. This thesis presents CoApps, a middleware that allows in situ science analytics applications to operate in a location-flexible manner. Additionally, CoApps explores methods to extract information from, and issue management operations to, lower level run times that are managing the diverse hardware expected to be found on next generation exascale machines. This work leverages experience with several extremely scalable applications in materials and fusion, and has been evaluated on machines ranging from local Linux clusters to the supercomputer Titan. Ph.D.

    G-LOMARC-TS: Lookahead group matchmaking for time/space sharing on multi-core parallel machines

    Parallel machines with multi-core nodes are becoming increasingly popular. The performances of applications running on these machines are improved gradually due to the resource competition in each node. Research has found that coscheduling different applications with complementary resource characteristics on the same set of nodes (semi time sharing) may improve the performance. We propose a scheduling algorithm, G-LOMARC-TS, which incorporates both space and semi time sharing scheduling methods and, where possible, matches groups of jobs for coscheduling. Since matchmaking may select jobs further down the waiting queue and thereby delay the jobs at the front of the queue, fairness for each individual job is monitored and the delay is kept within a limited bound. Several heuristics are used to solve the NP-complete problem of forming groups. Our experimental results show both utilization gain and average relative response time improvements of G-LOMARC-TS over several other scheduling policies.

    The role of the host in a cooperating mainframe and workstation environment, volumes 1 and 2

    In recent years, advancements made in computer systems have prompted a move from centralized computing based on timesharing a large mainframe computer to distributed computing based on a connected set of engineering workstations. A major factor in this advancement is the increased performance and lower cost of engineering workstations. The shift to distributed computing from centralized computing has led to challenges associated with the residency of application programs within the system. In a combined system of multiple engineering workstations attached to a mainframe host, the question arises of how a system designer should assign applications between the larger mainframe host and the smaller, yet powerful, workstations. The concepts related to real time data processing are analyzed, and systems are presented which use a host mainframe and a number of engineering workstations interconnected by a local area network. In most cases, distributed systems can be classified as having a single function or multiple functions and as executing programs in real time or nonreal time. In a system of multiple computers, the degree of autonomy of the computers is important; a system with one master control computer generally differs in reliability, performance, and complexity from a system in which all computers share the control. This research is concerned with generating general criteria and principles for software residency decisions (host or workstation) for a diverse yet coupled group of users (the clustered workstations) which may need the use of a shared resource (the mainframe) to perform their functions.

    Design-space exploration of most-recent-only communication using Myrinet on SGI ccNUMA architectures

    Technical Report. SGI's current ccNUMA multiprocessor architectures offer high scalability and performance without sacrificing the ease of use of simpler SMP systems. Although these systems also provide a standard PCI expansion bus, the bridging between PCI and SGI's ccNUMA architecture invalidates the assumptions typically made by network protocol designers attempting to use Myrinet to reduce communications latencies. We explore the complications introduced by SGI's architecture in the context of designing most-recent-only communications, in which a reader requires only the most recent datum produced by a writer.