
    Frameworks for Protocol Implementation

    This paper reports on the development of a catalogue of frameworks for protocol implementation. Frameworks are software structures developed for a specific application domain, which can be re-used in the implementation of various concrete systems in that domain. By using frameworks we aim to increase the effectiveness of the protocol implementation process. We expect that implementing protocols directly from their specifications improves the correctness and speed of the implementation process, as well as the maintainability of the resulting system. We argue that frameworks should match the concepts underlying the techniques used for specifying protocols. Consequently, we couple the development of frameworks for protocol implementation to the investigation of alternative design models for protocol specification. This paper presents the approach we have been using to develop frameworks and illustrates it with an example of a framework.
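
    Since the abstract ties frameworks to the concepts of the specification techniques, a minimal sketch of that idea may help: assuming a state-transition-oriented specification style, a framework could fix the generic event-handling engine and leave the transition table as the protocol-specific "hot spot". Class and method names below are illustrative, not taken from the paper.

        from abc import ABC, abstractmethod

        class ProtocolEntity(ABC):
            # Reusable part of the framework: a generic event-driven engine whose
            # behaviour is completed by a specification-level transition table.
            def __init__(self, initial_state):
                self.state = initial_state

            @abstractmethod
            def transitions(self):
                # Returns a map (state, event) -> (action, next_state), mirroring a
                # state-transition-oriented protocol specification.
                ...

            def handle(self, event, pdu):
                action, next_state = self.transitions()[(self.state, event)]
                action(pdu)
                self.state = next_state

        class StopAndWaitSender(ProtocolEntity):
            # Concrete protocol: only the specification-derived table is written by hand.
            def __init__(self, send):
                super().__init__("idle")
                self.send = send

            def transitions(self):
                return {
                    ("idle", "data_request"): (self.send, "wait_ack"),
                    ("wait_ack", "ack"): (lambda pdu: None, "idle"),
                }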

    A Flexible and Modular Framework for Implementing Infrastructures for Global Computing

    We present a Java software framework for building infrastructures that support the development of applications for systems where mobility and network awareness are key issues. The framework is particularly useful for developing run-time support for languages oriented towards global computing. It enables platform designers to customize communication protocols and network architectures, and it guarantees transparency of name management and code mobility in distributed environments. The key features are illustrated by means of a couple of simple case studies.
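
    The framework itself is in Java; purely as an illustration of the kind of customization point described above, here is a hedged Python sketch of a pluggable transport interface that the rest of an infrastructure could use without knowing the concrete wire protocol. The names are hypothetical and are not the framework's API.

        import socket
        from abc import ABC, abstractmethod

        class Transport(ABC):
            # Hypothetical customization point: the infrastructure talks to this
            # interface, so a platform designer can swap the wire protocol without
            # touching name management or code mobility.
            @abstractmethod
            def connect(self, host, port): ...

            @abstractmethod
            def send(self, payload: bytes): ...

        class TcpTransport(Transport):
            def __init__(self):
                self.sock = socket.socket()

            def connect(self, host, port):
                self.sock.connect((host, port))

            def send(self, payload: bytes):
                # Length-prefix framing so the receiver can reassemble whole messages.
                self.sock.sendall(len(payload).to_bytes(4, "big") + payload)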

    The End of Slow Networks: It's Time for a Redesign

    Next-generation high-performance RDMA-capable networks will require a fundamental rethinking of the design and architecture of modern distributed DBMSs. These systems are commonly designed and optimized under the assumption that the network is the bottleneck: the network is slow and "thin", and thus needs to be avoided as much as possible. Yet this assumption no longer holds true. With InfiniBand FDR 4x, the bandwidth available for transferring data across the network is in the same ballpark as the bandwidth of one memory channel, and it increases even further with the more recent EDR standard. Moreover, with continuing advances in RDMA, latency is improving similarly fast. In this paper, we first argue that the "old" distributed database design is not capable of taking full advantage of the network. Second, we propose architectural redesigns for OLTP, OLAP and advanced analytical frameworks to take better advantage of the improved bandwidth, latency and RDMA capabilities. Finally, for each of these workload categories, we show that remarkable performance improvements can be achieved.
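
    The "same ballpark" claim can be sanity-checked with nominal figures. Assuming roughly 54.5 Gb/s of effective FDR 4x data rate (after 64b/66b encoding), 100 Gb/s for EDR 4x, and 12.8 GB/s for a single DDR3-1600 memory channel, these are assumed textbook values, not measurements from the paper:

        # Back-of-envelope check of the "network vs. one memory channel" claim.
        # All figures are nominal, assumed values, not measurements from the paper.
        GBIT = 1e9 / 8                      # bytes per gigabit

        fdr_4x = 54.54 * GBIT               # InfiniBand FDR 4x effective data rate
        edr_4x = 100.0 * GBIT               # InfiniBand EDR 4x signalling rate
        ddr3_channel = 12.8e9               # one DDR3-1600 memory channel, bytes/s

        for name, bw in [("FDR 4x", fdr_4x), ("EDR 4x", edr_4x)]:
            print(f"{name}: {bw / 1e9:.1f} GB/s "
                  f"(~{bw / ddr3_channel:.0%} of one DDR3-1600 channel)")

    With these numbers, FDR 4x lands around half the bandwidth of a single memory channel and EDR 4x roughly matches it, which is what the abstract's argument relies on.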

    GeantV: Results from the prototype of concurrent vector particle transport simulation in HEP

    Full detector simulation was among the largest CPU consumers in all CERN experiment software stacks for the first two runs of the Large Hadron Collider (LHC). In the early 2010s, the projections were that simulation demands would scale linearly with the luminosity increase, compensated only partially by an increase in computing resources. Extending fast simulation approaches to more use cases, covering a larger fraction of the simulation budget, is only part of the solution due to intrinsic precision limitations. The remainder requires speeding up the simulation software by several factors, which is out of reach using simple optimizations on the current code base. In this context, the GeantV R&D project was launched, aiming to redesign the legacy particle transport codes so that they benefit from fine-grained parallelism features such as vectorization, as well as from increased code and data locality. This paper presents in detail the results and achievements of this R&D, as well as the conclusions and lessons learnt from the beta prototype.
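
    The vectorization and locality gains described here typically come from regrouping tracks into baskets stored in structure-of-arrays form, so that one transport step applies to many particles at once. The minimal numpy sketch below illustrates that layout idea only; it is not GeantV code, and the field names are illustrative.

        import numpy as np

        # A basket of tracks in structure-of-arrays layout: each attribute is stored
        # contiguously, so a transport step can be applied to all tracks at once
        # instead of looping over particles one by one.
        n = 1024
        pos   = np.random.rand(n, 3)         # positions
        direc = np.random.rand(n, 3)         # directions (illustrative, not normalised)
        step  = np.random.rand(n, 1)         # per-track step lengths

        def propagate(pos, direc, step):
            # One vectorised propagation step over the whole basket.
            return pos + step * direc

        pos = propagate(pos, direc, step)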

    A New Simplified Federated Single Sign-on System

    The work presented in this MPhil thesis addresses this challenge by developing a new simplified FSSO system that allows end-users to access desktop systems, web-based services/applications and non-web-based services/applications using one authentication process. This new system achieves this using two major components: an “Authentication Infrastructure Integration Program” (AIIP) and an “Integration of Desktop Authentication and Web-based Authentication” (IDAWA). The AIIP acquires Kerberos tickets (for end-users who have been authenticated by a Kerberos single sign-on system in one network domain) from Kerberos single sign-on systems in different network domains without establishing trust between these Kerberos single sign-on systems. The IDAWA is an extension to web-based authentication systems (i.e. the web portal), and it authenticates end-users by verifying the end-users' Kerberos tickets. This research also developed new criteria to determine which FSSO systems can deliver true single sign-on to end-users (i.e. allow end-users to access desktop systems, web-based services/applications and non-web-based services/applications using one authentication process). The evaluation shows that the new simplified FSSO system (i.e. the combination of AIIP and IDAWA) can deliver true single sign-on to end-users. In addition, the evaluation shows that the new simplified FSSO system has an advantage over existing FSSO systems in that it does not require additional modifications to network domains' existing non-web-based authentication infrastructures (i.e. Kerberos single sign-on systems) or their firewall rules.
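
    As a rough illustration of how the two components fit together, the sketch below shows the AIIP/IDAWA flow with stub functions. Both helpers are hypothetical placeholders invented for this sketch; the thesis does not publish code, and a real deployment would use a Kerberos/GSSAPI library for ticket acquisition and verification.

        # Conceptual sketch only: both helpers are hypothetical stand-ins for the
        # AIIP (cross-domain ticket acquisition) and IDAWA (portal-side verification).

        def aiip_fetch_ticket(user: str, home_realm: str, service_realm: str) -> bytes:
            # Stub: stands in for AIIP obtaining a service-realm ticket for a user
            # already authenticated in the home realm, without cross-realm trust.
            return f"{user}@{home_realm}->krbtgt/{service_realm}".encode()

        def idawa_verify(ticket: bytes, service_realm: str) -> bool:
            # Stub: stands in for the web portal verifying the ticket against the
            # service realm's KDC before granting the single-sign-on session.
            return ticket.endswith(service_realm.encode())

        def portal_login(user: str, home_realm: str, service_realm: str) -> bool:
            ticket = aiip_fetch_ticket(user, home_realm, service_realm)
            return idawa_verify(ticket, service_realm)

        print(portal_login("alice", "CAMPUS.EXAMPLE", "LAB.EXAMPLE"))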

    Deep Learning Models on CPUs: A Methodology for Efficient Training

    GPUs have been favored for training deep learning models because of their highly parallel architecture. As a result, most studies of training optimization focus on GPUs. There is often a trade-off, however, between cost and efficiency when choosing the hardware for training. In particular, CPU servers can be beneficial if training on CPUs is made more efficient, as they incur lower hardware upgrade costs and make better use of existing infrastructure. This paper makes several contributions to research on training deep learning models using CPUs. First, it presents a method for optimizing the training of deep learning models on Intel CPUs and a toolkit called ProfileDNN, which we developed to improve performance profiling. Second, we describe a generic training optimization method that guides our workflow and explore several case studies in which we identified performance issues and then optimized the Intel Extension for PyTorch, resulting in an overall 2x training performance increase for the RetinaNet-ResNext50 model. Third, we show how to leverage the visualization capabilities of ProfileDNN, which enabled us to pinpoint bottlenecks and create a custom focal loss kernel that was two times faster than the official reference PyTorch implementation.
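
    For context, the computation that the custom kernel accelerates is the standard sigmoid focal loss (Lin et al.); a plain PyTorch reference formulation looks roughly like the sketch below. This is the textbook formulation, not the authors' optimized kernel.

        import torch
        import torch.nn.functional as F

        def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
            # Reference (unoptimised) sigmoid focal loss: down-weights easy examples
            # so training concentrates on hard ones.
            p = torch.sigmoid(logits)
            ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
            p_t = p * targets + (1 - p) * (1 - targets)        # prob. of the true class
            alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
            return (alpha_t * (1 - p_t) ** gamma * ce).mean()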