26 research outputs found

    On the Use of SCTP in Wireless Networks


    Design of scalable Java message-passing communications over InfiniBand

    This is a post-peer-review, pre-copyedit version of an article published in The Journal of Supercomputing. The final authenticated version is available online at: https://doi.org/10.1007/s11227-011-0654-9 [Abstract] This paper presents ibvdev, a scalable and efficient low-level Java message-passing communication device over InfiniBand. The continuous increase in the number of cores per processor underscores the need for efficient communication support for parallel solutions. Moreover, current system deployments aggregate a significant number of cores through advanced network technologies, such as InfiniBand, increasing the complexity of communication protocols, especially when dealing with hybrid shared/distributed memory architectures such as clusters. Here, Java represents an attractive choice for the development of communication middleware for these systems, as it provides built-in networking and multithreading support. As the performance gap between Java and compiled languages has been narrowing in recent years, Java is an emerging option for High Performance Computing (HPC). The developed communication middleware ibvdev improves the performance of Java applications on clusters of multicore processors interconnected via InfiniBand by: (1) providing Java with direct access to InfiniBand using the InfiniBand Verbs API, somewhat restricted so far to MPI libraries; (2) implementing an efficient and scalable communication protocol which obtains start-up latencies and bandwidths similar to MPI performance results; and (3) allowing its integration in any Java parallel and distributed application. In fact, it has been successfully integrated in the Java messaging library MPJ Express. The experimental evaluation of this middleware on an InfiniBand cluster of multicore processors has shown significant point-to-point performance benefits, up to 85% start-up latency reduction and twice the bandwidth compared to previous Java middleware on InfiniBand.
Additionally, the impact of ibvdev on message-passing collective operations is significant, achieving performance increases of up to one order of magnitude compared to previous Java solutions, especially when combined with multithreading. Finally, the efficiency of this middleware, which is even competitive with MPI in terms of performance, increases the scalability of communication-intensive Java HPC applications. Ministerio de Ciencia e Innovación; TIN2010-1673

    FastMPJ: a scalable and efficient Java message-passing library

    This is a post-peer-review, pre-copyedit version of an article published in Cluster Computing. The final authenticated version is available online at: https://doi.org/10.1007/s10586-014-0345-4 [Abstract] The performance and scalability of communications are key for high performance computing (HPC) applications in the current multi-core era. Despite the significant benefits (e.g., productivity, portability, multithreading) of Java for parallel programming, its poor communications support has hindered its adoption in the HPC community. This paper presents FastMPJ, an efficient message-passing in Java (MPJ) library, boosting Java for HPC by: (1) providing high-performance shared memory communications using Java threads; (2) taking full advantage of high-speed cluster networks (e.g., InfiniBand) to provide low-latency and high-bandwidth communications; (3) including a scalable collective library with topology-aware primitives, automatically selected at runtime; (4) avoiding Java data buffering overheads through zero-copy protocols; and (5) implementing the most widespread MPI-like Java bindings for highly productive development. The comprehensive performance evaluation on representative testbeds (InfiniBand, 10 Gigabit Ethernet, Myrinet, and shared memory systems) has shown that FastMPJ communication primitives rival native MPI implementations, significantly improving the efficiency and scalability of Java HPC parallel applications. Ministerio de Educación y Ciencia; AP2010-4348. Ministerio de Economía y Competitividad; TIN2010-16735. Xunta de Galicia; CN2012/211. Xunta de Galicia; GRC2013/05

    Process Industry 4.0: Effect on Interfaces between MES and Shop floor Integrations in Pulp and Paper Industry

    The purpose of this thesis is to research how Industry 4.0 affects integrations between MES and the shop floor in the pulp and paper industry. Industry 4.0 is a generally used term for the fourth industrial revolution, which introduces modern technologies and production principles into manufacturing. These modern technologies include big data and analytics, cloud computing, and IoT. Four Industry 4.0 design principles (interconnection, information transparency, decentralized decisions, and technical assistance) are treated as central when designing Industry 4.0 compliant factories. The impact of Industry 4.0 on factories covers the entire system, including system architecture, modern technologies on the shop floor, and new communication methods and protocols. The thesis is divided into two parts, a theoretical part and a practical part. First, in the theoretical part, a comprehensive literature review was conducted to identify Industry 4.0 related trends that affect the shop floor. Because little Industry 4.0 related research has been done in the pulp and paper industry, the scope of the literature review also covered research from other industrial fields. Based on the findings of the literature review, an Industry 4.0 compliant prototype was designed and implemented. The design and implementation of the prototype form the practical part of the thesis. The most promising trends that are likely to be seen in factories moving towards Industry 4.0 compliant smart factories are smarter sensors, devices, and products, new wireless communication technologies and IoT messaging protocols, cloud and fog computing, service-oriented architecture, and decentralisation of decision making. As new communication technologies, such as IoT messaging protocols, appeared to be an important part of almost every finding in the literature review, it was decided to build the prototype on communication using OPC UA PubSub over MQTT.
In this thesis it is concluded that investing in Industry 4.0 is crucial for business success in the future. Several articles and other sources introduced in this thesis show the positive impact of Industry 4.0 solutions on factories. These benefits include, e.g., increased profitability and productivity, more adaptability, and solutions for more complicated customer needs and scarce resources. Companies working with Manufacturing Execution Systems should be prepared for all the changes discussed in this thesis, although, as stated in the thesis, many of the recent technologies need to be tested more thoroughly in a real production environment before concluding that they are suitable for the pulp and paper industry. The implemented prototype gives promising results and indicates that communication using OPC UA PubSub over MQTT is relatively easy to implement.

    Design of efficient Java communications for high performance computing

    [Abstract] There is increasing interest in adopting Java as the parallel programming language for the multi-core era. Although Java offers important advantages, such as built-in multithreading and networking support, productivity and portability, the lack of efficient communication middleware is an important drawback for its uptake in High Performance Computing (HPC). This PhD Thesis presents the design, implementation and evaluation of several solutions to improve this situation: (1) a high performance Java sockets implementation (JFS, Java Fast Sockets) on high-speed networks (e.g., Myrinet, InfiniBand) and shared memory (e.g., multi-core) machines; (2) a low-level messaging device, iodev, which efficiently overlaps communication and computation; and (3) a more scalable Java message-passing library, Fast MPJ (F-MPJ). Furthermore, new Java parallel benchmarks have been implemented and used for the performance evaluation of the developed middleware. The final and main conclusion is that the use of Java for HPC is feasible and even advisable when looking for productive development, provided that efficient communication middleware is made available, such as the projects presented in this Thesis. [Summary] The PhD thesis "Design of Efficient Java Communications for High Performance Computing" starts from the initial hypothesis that it is possible to develop Java applications for high performance computing, a field in which performance is crucial, provided that efficient communication middleware is available. Accordingly, several Java communication libraries have been designed, developed and evaluated, from the sockets level up to the message-passing level, achieving notable efficiency gains and confirming that the initial hypothesis is feasible.

    Effects of Communication Protocol Stack Offload on Parallel Performance in Clusters

    The primary research objective of this dissertation is to demonstrate that the effects of communication protocol stack offload (CPSO) on application execution time can be attributed to two complementary sources. First, the application-specific computation may be executed concurrently with the asynchronous communication performed by the communication protocol stack offload engine. Second, the protocol stack processing can be accelerated or decelerated by the offload engine. These two types of performance effects can be quantified using the degree of overlapping D_o and the degree of acceleration D_accs. The composite communication speedup metric S_comm(D_o, D_accs) can be used to quantify the combined effects of the protocol stack offload. The thesis of this dissertation is validated empirically. The degree of overlapping D_o, the degree of acceleration D_accs, and the communication speedup S_comm characteristic of the system configurations under test are derived in the course of experiments performed for the system configurations of interest. It is shown that the proposed metrics adequately describe the effects of the protocol stack offload on the application execution time. Additionally, a set of analytical models of the networking subsystem of a PC-based cluster node is developed. As a result of the modeling, the metrics D_o, D_accs, and S_comm are obtained. The models are evaluated as to their complexity and precision by comparing the modeling results with the measured values of D_o, D_accs, and S_comm. The primary contributions of this dissertation research are as follows. First, the metrics D_accs and S_comm are introduced to complement the D_o metric in evaluating the effects of optimizations in the networking subsystem on parallel performance in clusters. The metrics are shown to adequately describe CPSO performance effects.
Second, a method for assessing the effects of CPSO scenarios on application performance is developed and presented. Third, a set of analytical models of cluster node networking subsystems with CPSO capability is developed and characterised as to their complexity and the precision of their predictions of the D_o and D_accs metrics.

    Building global and scalable systems with atomic multicast

    The rise of worldwide Internet-scale services demands large distributed systems. Indeed, when handling several millions of users, it is common to operate thousands of servers spread across the globe. Here, replication plays a central role, as it helps improve the user experience by hiding failures and by providing acceptable latency. In this thesis, we claim that atomic multicast, with strong and well-defined properties, is the appropriate abstraction to efficiently design and implement globally scalable distributed systems. Internet-scale services rely on data partitioning and replication to provide scalable performance and high availability. Moreover, to reduce user-perceived response times and tolerate disasters (i.e., the failure of a whole datacenter), services are increasingly becoming geographically distributed. Data partitioning and replication, combined with local and geographical distribution, introduce daunting challenges, including the need to carefully order requests among replicas and partitions. One way to tackle this problem is to use group communication primitives that encapsulate order requirements. While replication is a common technique used to design such reliable distributed systems, to cope with the requirements of modern cloud-based "always-on" applications, replication protocols must additionally allow for throughput scalability and dynamic reconfiguration, that is, on-demand replacement or provisioning of system resources. We propose a dynamic atomic multicast protocol which fulfills these requirements. It allows resources to be dynamically added to and removed from an online replicated state machine, and crashed processes to be recovered. Major efforts have been spent in recent years to improve the performance, scalability and reliability of distributed systems. In order to hide the complexity of designing distributed applications, many proposals provide efficient high-level communication abstractions.
Since the implementation of a production-ready system based on this abstraction is still a major task, we further propose to expose our protocol to developers in the form of distributed data structures. B-trees, for example, are commonly used in different kinds of applications, including database indexes or file systems. Providing a distributed, fault-tolerant and scalable data structure would help developers to integrate their applications in a distribution-transparent manner. This work describes how to build reliable and scalable distributed systems based on atomic multicast and demonstrates their capabilities by an implementation of a distributed ordered map that supports dynamic re-partitioning and fast recovery. To substantiate our claim, we ported an existing SQL database on top of our distributed lock-free data structure.

    SCTP - Evaluating, Improving and Extending the Protocol for Broader Deployment

    Access to the full text is blocked; a new version is available under DuEPublico-ID 35000. The Stream Control Transmission Protocol (SCTP), originally designed for the transport of signaling messages over IP-based telephony signaling networks, is a general transport protocol with features suitable for a variety of applications that can benefit from multihoming, multiple streams, or one of SCTP’s numerous extensions. To date, SCTP has found its way into all kernel implementations of UNIX derivatives and a Windows prototype, but there are still flaws, which have to be identified and corrected. In this thesis, first, a suite of tools consisting of an SCTP simulation and testing environment is provided to lay the groundwork for further studies. Starting from comparing and analyzing kernel implementations, several aspects of the protocol that lead to undesirable behavior are examined. Congestion and flow control, which are adopted from the Transmission Control Protocol (TCP), use the same mechanisms but need special treatment because of SCTP’s message orientation. The analysis of the SCTP-specific characteristics with the help of the simulation will finally result in solutions that lead to better performance. The deployment of SCTP is another concern; it can be improved by introducing a specific Network Address Translation (NAT) for SCTP.
The Stream Control Transmission Protocol (SCTP) was originally designed for the transport of signaling messages over IP-based networks. It has since evolved into a general transport protocol with unique properties. It is therefore of particular interest for applications that can benefit from multiple network addresses per connection (multihoming), multiple independent message streams, or one of the numerous protocol extensions. SCTP has by now found its way into the operating system kernels of all UNIX derivatives and a Windows prototype, but there are still flaws whose causes need to be identified and corrected. In this dissertation, a set of tools is first provided to lay the groundwork for further studies. Starting from the analysis and comparison of kernel implementations in different operating systems, several aspects of the protocol that lead to undesirable behavior are examined. The principles of congestion and flow control were adopted from the stream-oriented Transmission Control Protocol (TCP) and therefore use the same mechanisms. As a message-oriented protocol, however, SCTP needs an implementation of the algorithms that accounts for this difference. The analysis of SCTP-specific characteristics with the help of the simulation will finally lead to solutions and to an improvement in throughput. A further concern of this work is the deployment of SCTP, which can be improved by introducing an SCTP-specific method of Network Address Translation (NAT).

    Recent Advances in Wireless Communications and Networks

    This book focuses on the current hottest issues from the lowest layers to the upper layers of wireless communication networks and provides "real-time" research progress on these issues. The authors have made every effort to systematically organize the information on these topics to make it easily accessible to readers of any level. This book also maintains the balance between current research results and their theoretical support. In this book, a variety of novel techniques in wireless communications and networks are investigated. The authors attempt to present these topics in detail. Insightful and reader-friendly descriptions are presented to nourish readers of any level, from practicing and knowledgeable communication engineers to beginning or professional researchers. All interested readers can easily find noteworthy materials in much greater detail than in previous publications and in the references cited in these chapters