114 research outputs found

    Throughput constrained parallelism reduction in cyclo-static dataflow applications

    Get PDF
    International audienceThis paper deals with semantics-preserving parallelism reduction methods for cyclo-static dataflow applications. Parallelism reduction is the process of equivalent actors fusioning. The principal objectives of parallelism reduction are to decrease the memory footprint of an application and to increase its execution performance. We focus on parallelism reduction methodologies constrained by application throughput. A generic parallelism reduction methodology is introduced. Experimental results are provided for asserting the performance of the proposed method

    A linear programming approach to general dataflow process network verification and dimensioning

    Full text link
    In this paper, we present linear programming-based sufficient conditions, some of them polynomial-time, to establish the liveness and memory boundedness of general dataflow process networks. Furthermore, this approach can be used to obtain safe upper bounds on the size of the channel buffers of such a network.Comment: In Proceedings ICE 2010, arXiv:1010.530

    Generating Code and Memory Buffers to Reorganize Data on Many-core Architectures

    Get PDF
    International audienceThe dataflow programming model has shown to be a relevant approach to efficiently run mas-sively parallel applications over many-core architectures. In this model, some particular builtin agents are in charge of data reorganizations between user agents. Such agents can Split, Join and Duplicate data onto their communication ports. They are widely used in signal processing for example. These system agents, and their associated implementations, are of major impor-tance when it comes to performance, because they can stand on the critical path (think about Amdhal's law). Furthermore, a particular data reorganization can be expressed by the devel-oper in several ways that may lead to inefficient solutions (mostly unneeded data copies and transfers). In this paper, we propose several strategies to manage data reorganization at compile time, with a focus on indexed accesses to shared buffers to avoid data copies. These strategies are complementary: they ensure correctness for each system agent configuration, as well as performance when possible. They have been implemented within the Sigma-C industry-grade compilation toolchain and evaluated over the Kalray MPPA 256-core processor

    A probabilistic design for practical homomorphic majority voting with intrinsic differential privacy

    Get PDF
    As machine learning (ML) has become pervasive throughout various fields (industry, healthcare, social networks), privacy concerns regarding the data used for its training have gained a critical importance. In settings where several parties wish to collaboratively train a common model without jeopardizing their sensitive data, the need for a private training protocol is particularly stringent and implies to protect the data against both the model's end-users and the other actors of the training phase. In this context of secure collaborative learning, Differential Privacy (DP) and Fully Homomorphic Encryption (FHE) are two complementary countermeasures of growing interest to thwart privacy attacks in ML systems. Central to many collaborative training protocols, in the line of PATE, is majority voting aggregation. Thus, in this paper, we design SHIELD, a probabilistic approximate majority voting operator which is faster when homomorphically executed than existing approaches based on exact argmax computation over an histogram of votes. As an additional benefit, the inaccuracy of SHIELD is used as a feature to provably enable DP guarantees. Although SHIELD may have other applications, we focus here on one setting and seamlessly integrate it in the SPEED collaborative training framework from \cite{grivet2021speed} to improve its computational efficiency. After thoroughly describing the FHE implementation of our algorithm and its DP analysis, we present experimental results. To the best of our knowledge, it is the first work in which relaxing the accuracy of an algorithm is constructively usable as a degree of freedom to achieve better FHE performances

    Stochastic graph partitioning: quadratic versus SOCP formulations

    Get PDF
    International audienceWe consider a variant of the graph partitioning problem involving knapsack constraints with Gaussian random coefficients. In this new variant, under this assumption of probability distribution, the problem can be traditionally formulated as a binary SOCP for which the continuous relaxation is convex. In this paper, we reformulate the problem as a binary quadratic constrained program for which the continuous relaxation is not necessarily convex. We propose several linearization techniques for latter: the classical linearization proposed by Fortet (Trabajos de Estadistica 11(2):111–118, 1960) and the linearization proposed by Sherali and Smith (Optim Lett 1(1):33–47, 2007). In addition to the basic implementation of the latter, we propose an improvement which includes, in the computation, constraints coming from the SOCP formulation. Numerical results show that an improvement of Sherali–Smith’s linearization outperforms largely the binary SOCP program and the classical linearization when investigated in a branch-and-bound approach

    Practical Multi-Key Homomorphic Encryption for More Flexible and Efficient Secure Federated Aggregation (preliminary work)

    Get PDF
    In this work, we introduce a lightweight communication-efficient multi-key approach suitable for the Federated Averaging rule. By combining secret-key RLWE-based HE, additive secret sharing and PRFs, we reduce approximately by a half the communication cost per party when compared to the usual public-key instantiations, while keeping practical homomorphic aggregation performances. Additionally, for LWE-based instantiations, our approach reduces the communication cost per party from quadratic to linear in terms of the lattice dimension

    DFA on LS-Designs with a Practical Implementation on SCREAM (extended version)

    Get PDF
    LS-Designs are a family of SPN-based block ciphers whose linear layer is based on the so-called interleaved construction. They will be dedicated to low-end devices with high performance and low-resource constraints, objects which need to be resistant to physical attacks. In this paper we describe a complete Differential Fault Analysis against LS-Designs and also on other families of SPN-based block ciphers. First we explain how fault attacks can be used against their implementations depending on fault models. Then, we validate the DFA in a practical example on a hardware implementation of SCREAM running on an FPGA. The faults have been injected using electromagnetic pulses during the execution of SCREAM and the faulty ciphertexts have been used to recover the key’s bits. Finally, we discuss some countermeasures that could be used to thwart such attacks

    Towards Better Availability and Accountability for IoT Updates by means of a Blockchain

    Get PDF
    International audienceBuilding the Internet of Things requires deploying a huge number of devices with full or limited connectivity to the Internet. Given that these devices are exposed to attackers and generally not secured-by-design, it is essential to be able to update them, to patch their vulnerabilities and to prevent hackers from enrolling them into botnets. Ideally, the update infrastructure should implement the CIA triad properties, i.e., confidentiality, integrity and availability. In this work, we investigate how the use of a blockchain infrastructure can meet these requirements, with a focus on availability

    Stream ciphers: A Practical Solution for Efficient Homomorphic-Ciphertext Compression

    Get PDF
    International audienceIn typical applications of homomorphic encryption, the first step consists for Alice to encrypt some plaintext m under Bob’s public key pk and to send the ciphertext c = HEpk(m) to some third-party evaluator Charlie. This paper specifically considers that first step, i.e. the problem of transmitting c as efficiently as possible from Alice to Charlie. As previously noted, a form of compression is achieved using hybrid encryption. Given a symmetric encryption scheme E, Alice picks a random key k and sends a much smaller ciphertext c′ = (HEpk(k), Ek(m)) that Charlie decompresses homomorphically into the original c using a decryption circuit CE−1 .In this paper, we revisit that paradigm in light of its concrete implemen- tation constraints; in particular E is chosen to be an additive IV-based stream cipher. We investigate the performances offered in this context by Trivium, which belongs to the eSTREAM portfolio, and we also pro- pose a variant with 128-bit security: Kreyvium. We show that Trivium, whose security has been firmly established for over a decade, and the new variant Kreyvium have an excellent performance

    Running compression algorithms in the encrypted domain: a case-study on the homomorphic execution of RLE

    Get PDF
    This paper is devoted to the study of the problem of running compression algorithms in the encrypted domain, using a (somewhat) Fully Homomorphic Encryption (FHE) scheme. We do so with a particular focus on conservative compression algorithms. Despite of the encrypted domain Turing-completeness which comes with the magic of FHE operators, we show that a number of subtleties crop up when it comes to running compression algorithms and, in particular, that guaranteed conservative compression is not possible to achieve in the FHE setting. To illustrate these points, we analyze the most elementary conservative compression algorithm of all, namely Run-Length Encoding (RLE). We first study the way to regularize this algorithm in order to make it (meaningfully) fit within the constraints of a FHE execution. Secondly, we analyze it from the angle of optimizing the resulting structure towards (as much as possible) FHE execution efficiency. The paper is concluded by concrete experimental results obtained using the Fan-Vercauteren cryptosystem as well as the Armadillo FHE compiler
    corecore