Search CORE

1,308 research outputs found

Overlapping of Communication and Computation and Early Binding: Fundamental Mechanisms for Improving Parallel Performance on Clusters of Workstations

Author: Dimitrov Rossen Petkov
Publication venue: Scholars Junction
Publication date: 12/05/2001
Field of study

This study considers software techniques for improving performance on clusters of workstations and approaches for designing message-passing middleware that facilitate scalable, parallel processing. Early binding and overlapping of communication and computation are identified as fundamental approaches for improving parallel performance and scalability on clusters. Currently, cluster computers using the Message-Passing Interface for interprocess communication are the predominant choice for building high-performance computing facilities, which makes the findings of this work relevant to a wide audience from the areas of high-performance computing and parallel processing. The performance-enhancing techniques studied in this work are presently underutilized in practice because of the lack of adequate support by existing message-passing libraries and are also rarely considered by parallel algorithm designers. Furthermore, commonly accepted methods for performance analysis and evaluation of parallel systems omit these techniques and focus primarily on more obvious communication characteristics such as latency and bandwidth. This study provides a theoretical framework for describing early binding and overlapping of communication and computation in models for parallel programming. This framework defines four new performance metrics that facilitate new approaches for performance analysis of parallel systems and algorithms. This dissertation provides experimental data that validate the correctness and accuracy of the performance analysis based on the new framework. The theoretical results of this performance analysis can be used by designers of parallel system and application software for assessing the quality of their implementations and for predicting the effective performance benefits of early binding and overlapping. This work presents MPI/Pro, a new MPI implementation that is specifically optimized for clusters of workstations interconnected with high-speed networks. This MPI implementation emphasizes features such as persistent communication, asynchronous processing, low processor overhead, and independent message progress. These features are identified as critical for delivering maximum performance to applications. The experimental section of this dissertation demonstrates the capability of MPI/Pro to facilitate software techniques that result in significant application performance improvements. Specific demonstrations with Virtual Interface Architecture and TCP/IP over Ethernet are offered

Scholars Junction - Mississippi State University Institutional Repository

Monotonicity and run-time scheduling

Author
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009
Field of study

Crossref

Effect of Switchover Time in Cyclically Switched Systems

Author: Khurram Aziz
Publication venue: 'IntechOpen'
Publication date: 01/12/2009
Field of study

IntechOpen

Performance evaluation in terms of congestion and flow control of interconnected token ring local area networks

Author: Raza Muhammad Salim
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/1991
Field of study

In an interconnected network, if user demands are allowed to exceed the system capacity, unpleasant congestion effects occur which rapidly neutralize the delay and efficiency advantages. Congestion can be eliminated by using an appropriate set of traffic monitoring and control procedures called flow control procedures. This thesis first investigates the major technical concepts underlying the token-ring technology; performance and flow control issues and then gives an approximate analytical solution in terms of mean end-toned delay in a system of token-ring local area network interconnected through bridges. The analytical solution is based on an approximation of the mean end-to-end delay in a stand alone LAN and then extended by approximating the arrival rates at the bridges as a function of the throughput of each sub network. Besides throughput and delay, a more compact form of performance measure called power has also been in the study

Digital Commons @ New Jersey Institute of Technology (NJIT)

SDN Architecture and Southbound APIs for IPv6 Segment Routing Enabled Wide Area Networks

Author: Filsfils Clarence
Salsano Stefano
Tajiki Mohammad Mahdi
Ventre Pier Luigi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

The SRv6 architecture (Segment Routing based on IPv6 data plane) is a promising solution to support services like Traffic Engineering, Service Function Chaining and Virtual Private Networks in IPv6 backbones and datacenters. The SRv6 architecture has interesting scalability properties as it reduces the amount of state information that needs to be configured in the nodes to support the network services. In this paper, we describe the advantages of complementing the SRv6 technology with an SDN based approach in backbone networks. We discuss the architecture of a SRv6 enabled network based on Linux nodes. In addition, we present the design and implementation of the Southbound API between the SDN controller and the SRv6 device. We have defined a data-model and four different implementations of the API, respectively based on gRPC, REST, NETCONF and remote Command Line Interface (CLI). Since it is important to support both the development and testing aspects we have realized an Intent based emulation system to build realistic and reproducible experiments. This collection of tools automate most of the configuration aspects relieving the experimenter from a significant effort. Finally, we have realized an evaluation of some performance aspects of our architecture and of the different variants of the Southbound APIs and we have analyzed the effects of the configuration updates in the SRv6 enabled nodes

arXiv.org e-Print Archive

ART

Issues in designing transport layer multicast facilities

Author: Dempsey Bert J.
Weaver Alfred C.
Publication venue
Publication date
Field of study

Multicasting denotes a facility in a communications system for providing efficient delivery from a message's source to some well-defined set of locations using a single logical address. While modem network hardware supports multidestination delivery, first generation Transport Layer protocols (e.g., the DoD Transmission Control Protocol (TCP) (15) and ISO TP-4 (41)) did not anticipate the changes over the past decade in underlying network hardware, transmission speeds, and communication patterns that have enabled and driven the interest in reliable multicast. Much recent research has focused on integrating the underlying hardware multicast capability with the reliable services of Transport Layer protocols. Here, we explore the communication issues surrounding the design of such a reliable multicast mechanism. Approaches and solutions from the literature are discussed, and four experimental Transport Layer protocols that incorporate reliable multicast are examined

NASA Technical Reports Server

Using BIP to reinforce correctness of resource-constrained IoT applications

Author: Bozga Marius
Georgiadis Christos,
Katsaros Panagiotis
Lekidis Alexios
Stachtiari Emmanouela
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/06/2015
Field of study

International audienceIoT applications have either a sense-only or a sense-compute-actuate goal and they implement a capability to process and respond to multiple (external) events while performing computations. Existing IoT operating systems provide a versatile execution environment that adheres to the limitations of the interconnected resource-constrained devices. To reduce the development effort, applications are often built on top of RESTful web services, which can be shared and reused. However, the asynchronous communication between remote nodes is prone to event scheduling delays, which cannot be predicted and taken into account while programming the application. Moreover, to avoid long delays in message processing and communication due to packet collisions, the data transmission frequencies between the system's nodes have to carefully chosen. In general, even when appropriate debugging tools and simulators are available, it is still a hard challenge to guarantee the required functional and non-functional properties at the application and system levels. To this end, we focus on IoT applications for the Contiki OS and we introduce a model-based rigorous analysis approach using the BIP component framework. At the application level, we verify qualitative properties regarding service responsiveness, whereas at the system level we can validate qualitative and quantitative properties using statistical model checking. We present results for an application scenario running on a distributed system infrastructure with nodes executing the Contiki OS

CiteSeerX

Crossref

Hal - Université Grenoble Alpes

The Design of a System Architecture for Mobile Multimedia Computers

Author: Havinga Paul Johannes Mattheus
Publication venue: University of Twente
Publication date: 01/01/2000
Field of study

This chapter discusses the system architecture of a portable computer, called Mobile Digital Companion, which provides support for handling multimedia applications energy efficiently. Because battery life is limited and battery weight is an important factor for the size and the weight of the Mobile Digital Companion, energy management plays a crucial role in the architecture. As the Companion must remain usable in a variety of environments, it has to be flexible and adaptable to various operating conditions. The Mobile Digital Companion has an unconventional architecture that saves energy by using system decomposition at different levels of the architecture and exploits locality of reference with dedicated, optimised modules. The approach is based on dedicated functionality and the extensive use of energy reduction techniques at all levels of system design. The system has an architecture with a general-purpose processor accompanied by a set of heterogeneous autonomous programmable modules, each providing an energy efficient implementation of dedicated tasks. A reconfigurable internal communication network switch exploits locality of reference and eliminates wasteful data copies

CiteSeerX

University of Twente Research Information