
    Security Estimates for Quadratic Field Based Cryptosystems

    We describe implementations for solving the discrete logarithm problem in the class group of an imaginary quadratic field and in the infrastructure of a real quadratic field. The algorithms used incorporate improvements over previously-used algorithms, and extensive numerical results are presented demonstrating their efficiency. This data serves as the basis for extrapolations yielding recommended parameter sizes that provide approximately the same level of security as block ciphers with 80-, 112-, 128-, 192-, and 256-bit symmetric keys.
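
    The hardness assumption here is a discrete logarithm problem, and the paper attacks it with subexponential index-calculus methods in class groups and infrastructures. Purely to fix ideas about the underlying problem, here is a generic baby-step giant-step solver over a prime-order multiplicative group; this is a hedged sketch of the DLP itself, not the paper's algorithm:

```python
import math

def bsgs_mod_p(g, h, p):
    """Solve g^x = h (mod p) for x by baby-step giant-step.

    O(sqrt(p)) time and memory; p must be prime so Fermat inversion works.
    """
    m = math.isqrt(p) + 1
    # Baby steps: tabulate g^j mod p for j in [0, m).
    baby = {}
    cur = 1
    for j in range(m):
        baby.setdefault(cur, j)
        cur = cur * g % p
    # Giant steps: walk h * g^(-m*i), looking for a collision with the table.
    factor = pow(g, (p - 2) * m, p)  # g^(-m) mod p via Fermat's little theorem
    gamma = h % p
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None  # h is not in the subgroup generated by g

# 3^x = 13 (mod 17) has the solution x = 4.
assert pow(3, bsgs_mod_p(3, 13, 17), 17) == 13
```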

    Efficient asynchronous accumulators for distributed PKI

    Cryptographic accumulators are a tool for compact set representation and secure set membership proofs. When an element is added to a set by means of an accumulator, a membership witness is generated. This witness can later be used to prove the membership of the element. Typically, the membership witness has to be synchronized with the accumulator value and updated every time another element is added to the accumulator. In this work we propose an accumulator that, unlike any prior scheme, does not require strict synchronization. In our construction a membership witness needs to be updated only a logarithmic number of times in the number of subsequent element additions. Thus, an out-of-date witness can easily be made current. Conversely, a verifier with an out-of-date accumulator value can still verify a current membership witness. These properties make our accumulator construction uniquely suited for use in distributed applications, such as blockchain-based public key infrastructures.
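
    To make the "logarithmic number of witness updates" property concrete, here is a toy append-only accumulator built as a forest of perfect Merkle trees in a binary-counter layout. This is a hedged sketch in the spirit of such constructions, not the paper's actual scheme, and all names are illustrative. A witness holder only absorbs merge events that touch its own tree, and each element's tree merges at most O(log n) times over the accumulator's lifetime:

```python
import hashlib

def H(*parts: bytes) -> bytes:
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

class Witness:
    def __init__(self, leaf_hash: bytes):
        self.path = []          # (side of sibling, sibling hash), leaf to root
        self.root = leaf_hash   # root of the tree currently holding the leaf

    def absorb(self, left: bytes, right: bytes) -> None:
        """Apply one merge event; a no-op unless it touches our tree."""
        if self.root == left:
            self.path.append(("R", right))
        elif self.root == right:
            self.path.append(("L", left))
        else:
            return
        self.root = H(b"node", left, right)

class ForestAccumulator:
    def __init__(self):
        self.roots = []  # roots[i]: root of a perfect tree of 2**i leaves, or None

    def add(self, element: bytes):
        node = H(b"leaf", element)
        wit, events = Witness(node), []
        i = 0
        while i < len(self.roots) and self.roots[i] is not None:
            left, self.roots[i] = self.roots[i], None  # carry, like binary addition
            events.append((left, node))                # older tree goes on the left
            node = H(b"node", left, node)
            i += 1
        if i == len(self.roots):
            self.roots.append(None)
        self.roots[i] = node
        for e in events:
            wit.absorb(*e)
        return wit, events  # events are broadcast so other holders can absorb them

    def verify(self, element: bytes, wit: Witness) -> bool:
        node = H(b"leaf", element)
        for side, sib in wit.path:
            node = H(b"node", sib, node) if side == "L" else H(b"node", node, sib)
        return node in self.roots

acc = ForestAccumulator()
wit_a, _ = acc.add(b"alice-key")
_, events = acc.add(b"bob-key")
for e in events:
    wit_a.absorb(*e)  # alice updates once; most future adds won't touch her tree
assert acc.verify(b"alice-key", wit_a)
```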

    MoPS: A Modular Protection Scheme for Long-Term Storage

    Current trends in technology, such as cloud computing, allow outsourcing the storage, backup, and archiving of data. This provides efficiency and flexibility, but also poses new risks for data security. In particular, it has become crucial to develop protection schemes that ensure security even in the long term, i.e., beyond the lifetime of keys, certificates, and cryptographic primitives. However, all current solutions fail to provide optimal performance for different application scenarios. Thus, in this work, we present MoPS, a modular protection scheme to ensure authenticity and integrity for data stored over long periods of time. MoPS does not come with any requirements regarding the storage architecture and can therefore be used together with existing archiving or storage systems. It supports a set of techniques which can be plugged together, combined, and migrated in order to create customized solutions that fulfill the requirements of different application scenarios in the best possible way. As a proof of concept we implemented MoPS and provide performance measurements. Furthermore, our implementation provides additional features, such as guidance for non-expert users and export functionalities for external verifiers. Comment: Original Publication (in the same form): ASIACCS 2017
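
    The "plug together and migrate" idea can be pictured as protection techniques sharing one small interface, so a stored object can be re-protected with a fresh primitive before the old one weakens. The sketch below is purely illustrative; the interface, class names, and the HMAC stand-in are assumptions for exposition, not MoPS's actual API:

```python
import hashlib
import hmac
from abc import ABC, abstractmethod

class Technique(ABC):
    """One pluggable protection primitive (integrity/authenticity only here)."""
    @abstractmethod
    def protect(self, data: bytes) -> bytes: ...
    @abstractmethod
    def verify(self, data: bytes, proof: bytes) -> bool: ...

class HmacSha256(Technique):
    """Keyed-hash stand-in for a real signature or timestamp technique."""
    def __init__(self, key: bytes):
        self.key = key
    def protect(self, data: bytes) -> bytes:
        return hmac.new(self.key, data, hashlib.sha256).digest()
    def verify(self, data: bytes, proof: bytes) -> bool:
        return hmac.compare_digest(self.protect(data), proof)

class ProtectionChain:
    """Stacks techniques over one stored object; migrating adds a fresh layer
    while keeping the old proofs as evidence of past protection."""
    def __init__(self, data: bytes):
        self.data = data
        self.layers = []  # (technique, proof) pairs, oldest first
    def apply(self, technique: Technique) -> None:
        self.layers.append((technique, technique.protect(self.data)))
    def migrate(self, new_technique: Technique) -> None:
        self.apply(new_technique)  # renew before the old primitive weakens
    def verify(self) -> bool:
        return all(t.verify(self.data, p) for t, p in self.layers)

doc = b"archived record"
chain = ProtectionChain(doc)
chain.apply(HmacSha256(b"key-2017"))
chain.migrate(HmacSha256(b"key-2037"))  # e.g. after the first key's lifetime ends
assert chain.verify()
```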

    Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation

    TensorFlow has been the most widely adopted Machine/Deep Learning framework. However, little exists in the literature that provides a thorough understanding of the capabilities which TensorFlow offers for the distributed training of large ML/DL models that need computation and communication at scale. The most commonly used distributed training approaches for TF can be categorized as follows: 1) Google Remote Procedure Call (gRPC), 2) gRPC+X: X = (InfiniBand Verbs, Message Passing Interface, and GPUDirect RDMA), and 3) No-gRPC: Baidu Allreduce with MPI, Horovod with MPI, and Horovod with NVIDIA NCCL. In this paper, we provide an in-depth performance characterization and analysis of these distributed training approaches on various GPU clusters, including the Piz Daint system (#6 on Top500). We perform experiments to gain novel insights along the following vectors: 1) application-level scalability of DNN training, 2) effect of batch size on scaling efficiency, 3) impact of the MPI library used for no-gRPC approaches, and 4) type and size of DNN architectures. Based on these experiments, we present two key insights: 1) overall, No-gRPC designs achieve better performance than gRPC-based approaches for most configurations, and 2) the performance of No-gRPC is heavily influenced by the gradient aggregation using Allreduce. Finally, we propose a truly CUDA-Aware MPI Allreduce design that exploits CUDA kernels and pointer caching to perform large reductions efficiently. Our proposed designs offer 5-17X better performance than NCCL2 for small and medium messages, and reduce latency by 29% for large messages. The proposed optimizations help Horovod-MPI to achieve approximately 90% scaling efficiency for ResNet-50 training on 64 GPUs. Further, Horovod-MPI achieves 1.8X and 3.2X higher throughput than the native gRPC method for ResNet-50 and MobileNet, respectively, on the Piz Daint cluster. Comment: 10 pages, 9 figures, submitted to IEEE IPDPS 2019 for peer review
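
    Since the key finding is that No-gRPC performance hinges on gradient aggregation via Allreduce, here is a minimal data-parallel gradient-averaging sketch with mpi4py. The names and shapes are illustrative; this shows the communication pattern that Horovod and Baidu Allreduce build on, not the paper's CUDA-Aware design:

```python
# Run under MPI, e.g.: mpirun -np 4 python allreduce_demo.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Stand-in for the gradient each worker computed on its local mini-batch.
local_grad = np.full(8, float(rank), dtype=np.float64)

# Sum the gradients across all workers, then divide to get the mean.
global_grad = np.empty_like(local_grad)
comm.Allreduce(local_grad, global_grad, op=MPI.SUM)
global_grad /= size

# Every rank now holds the same averaged gradient and applies the optimizer
# step locally, keeping the model replicas in sync.
if rank == 0:
    print(global_grad)  # each slot equals mean(0, 1, ..., size-1)
```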