
    Recursion based parallelization of exact dense linear algebra routines for Gaussian elimination

    We present block algorithms and their implementation for the parallelization of sub-cubic Gaussian elimination on shared-memory architectures. Contrary to the classical cubic algorithms in parallel numerical linear algebra, we focus here on recursive algorithms and coarse-grain parallelization. Indeed, sub-cubic matrix arithmetic can only be achieved through recursive algorithms, making coarse-grain block algorithms perform more efficiently than fine-grain ones. This work is motivated by the design and implementation of dense linear algebra over a finite field, where fast matrix multiplication is used extensively and where costly modular reductions also advocate for coarse-grain block decomposition. We incrementally build efficient kernels, first for matrix multiplication, then for triangular system solving, on top of which a recursive PLUQ decomposition algorithm is built. We study the parallelization of these kernels using several algorithmic variants, either iterative or recursive, and using different splitting strategies. Experiments show that recursive adaptive methods for matrix multiplication, hybrid recursive-iterative methods for triangular system solving, and tile recursive versions of the PLUQ decomposition, together with various data-mapping policies, provide the best performance on a 32-core NUMA architecture. Overall, we show that the overhead of modular reductions is more than compensated for by the fast linear algebra algorithms and that exact dense linear algebra matches the performance of full-rank reference numerical software, even in the presence of rank deficiencies.
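
    As a rough illustration of the recursive, coarse-grain block approach described above, the sketch below multiplies matrices over Z/pZ by quadrant recursion with a naive base case and a single delayed modular reduction per dot product. The prime p, the cutoff, and the list-of-lists representation are illustrative assumptions; this is not the authors' implementation, and no parallel runtime is shown.

```haskell
-- Minimal sketch of recursion-based block matrix multiplication over Z/pZ.
-- Reductions mod p are delayed to the end of each base-case dot product,
-- mimicking the coarse-grain "reduce rarely" strategy the abstract motivates.
module Main where

import Data.List (transpose)

type Matrix = [[Integer]]

p :: Integer
p = 131071            -- an illustrative word-size prime (2^17 - 1)

cutoff :: Int
cutoff = 64           -- below this dimension, fall back to the naive product

-- Naive product with a single reduction per output entry (delayed reduction).
mulBase :: Matrix -> Matrix -> Matrix
mulBase a b =
  [ [ sum (zipWith (*) row col) `mod` p | col <- transpose b ] | row <- a ]

-- Split a matrix into four quadrants (dimensions assumed even here).
split :: Matrix -> (Matrix, Matrix, Matrix, Matrix)
split m = (tl, tr, bl, br)
  where
    h          = length m `div` 2
    (top, bot) = splitAt h m
    halve rows = unzip (map (splitAt h) rows)
    (tl, tr)   = halve top
    (bl, br)   = halve bot

glue :: (Matrix, Matrix, Matrix, Matrix) -> Matrix
glue (tl, tr, bl, br) = zipWith (++) tl tr ++ zipWith (++) bl br

addM :: Matrix -> Matrix -> Matrix
addM = zipWith (zipWith (\x y -> (x + y) `mod` p))

-- Recursive block product: the four output quadrants are independent,
-- which is what makes coarse-grain (task-per-quadrant) parallelism natural.
mulRec :: Matrix -> Matrix -> Matrix
mulRec a b
  | length a <= cutoff = mulBase a b
  | otherwise =
      glue ( addM (mulRec a11 b11) (mulRec a12 b21)
           , addM (mulRec a11 b12) (mulRec a12 b22)
           , addM (mulRec a21 b11) (mulRec a22 b21)
           , addM (mulRec a21 b12) (mulRec a22 b22) )
  where
    (a11, a12, a21, a22) = split a
    (b11, b12, b21, b22) = split b

main :: IO ()
main = mapM_ print (mulRec [[1, 2], [3, 4]] [[5, 6], [7, 8]])
```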

    Accelerating Homomorphic Encryption in the Cloud Environment through High-Level Synthesis and Reconfigurable Resources

    The recent surge in cloud services is revolutionizing the way that data is stored and processed. Everyone with an internet connection, from large corporations to small companies and private individuals, now has access to cutting-edge processing power and vast amounts of data storage. This rise in cloud computing and storage, however, has brought with it a need for a new type of security. In order to have access to cloud services, users must allow the service provider to have full access to their private, unencrypted data. Users are required to trust the integrity of the service provider and the security of its data centers. The recent development of fully homomorphic encryption schemes can offer a solution to this dilemma. These algorithms allow encrypted data to be used in computations without ever stripping the data of the protection of encryption. Unfortunately, the demanding memory requirements and computational complexity of the proposed schemes have hindered their wide-scale use. Custom hardware accelerators for homomorphic encryption could be implemented on the increasing number of reconfigurable hardware resources in the cloud, but the long development time required for these processors would lead to high production costs. This research seeks to develop a strategy for faster development of homomorphic encryption hardware accelerators using the process of High-Level Synthesis. Insights from existing number theory software libraries and custom hardware accelerators are used to develop a scalable, proof-of-concept software implementation of Karatsuba modular polynomial multiplication. This implementation was designed to be used with High-Level Synthesis to accelerate the large modular polynomial multiplication operations required by homomorphic encryption. The accelerator generated from this implementation by the High-Level Synthesis tool Vivado HLS achieved significant speedup over the implementations available in the highly optimized FLINT software library.
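
    To make the core operation concrete, here is a minimal software-only sketch of Karatsuba polynomial multiplication with coefficients reduced modulo q, the kind of routine the abstract says was prepared for High-Level Synthesis. The modulus, base-case threshold, and coefficient-list representation are illustrative assumptions; the actual HLS design, the Vivado HLS flow, and the FLINT comparison are not reproduced here.

```haskell
-- Karatsuba multiplication for polynomials over Z_q, little-endian
-- coefficient lists (index i holds the coefficient of x^i).
module Main where

type Poly = [Integer]

q :: Integer
q = 12289               -- illustrative modulus; any q > 1 works for this sketch

-- Pointwise addition/subtraction mod q; the shorter operand is zero-padded.
zipWithPad :: (Integer -> Integer -> Integer) -> Poly -> Poly -> Poly
zipWithPad f a b = take n (zipWith f (a ++ zeros) (b ++ zeros))
  where n     = max (length a) (length b)
        zeros = repeat 0

addP, subP :: Poly -> Poly -> Poly
addP = zipWithPad (\x y -> (x + y) `mod` q)
subP = zipWithPad (\x y -> (x - y) `mod` q)

shift :: Int -> Poly -> Poly      -- multiply by x^k
shift k = (replicate k 0 ++)

-- Schoolbook product, used below the recursion threshold.
mulBase :: Poly -> Poly -> Poly
mulBase a b =
  foldl addP [] [ shift i (map ((`mod` q) . (* c)) b) | (i, c) <- zip [0 ..] a ]

-- Karatsuba: split both operands at k and use three half-size products:
--   a*b = z2*x^(2k) + (z1 - z2 - z0)*x^k + z0
karatsuba :: Poly -> Poly -> Poly
karatsuba a b
  | length a <= 8 || length b <= 8 = mulBase a b
  | otherwise = shift (2 * k) z2 `addP` shift k m `addP` z0
  where
    k          = max (length a) (length b) `div` 2
    (aLo, aHi) = splitAt k a
    (bLo, bHi) = splitAt k b
    z0         = karatsuba aLo bLo
    z2         = karatsuba aHi bHi
    z1         = karatsuba (addP aLo aHi) (addP bLo bHi)
    m          = (z1 `subP` z2) `subP` z0

main :: IO ()
main = print (karatsuba [1, 2, 3] [4, 5, 6])   -- expect [4,13,27,27,18] (mod q)
```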

    Algebraic Stream Processing

    We identify and analyse the typically higher-order approaches to stream processing in the literature. From this analysis we motivate an alternative approach to the specification of stream processing systems (SPSs) as stream transformers (STs), based on an essentially first-order equational representation. This technique is called Cartesian form specification. More specifically, while STs are properly second-order objects, we show that using Cartesian forms, the second-order models needed to formalise STs are so weak that we may use and develop well-understood first-order methods from computability theory and mathematical logic to reason about their properties. Indeed, we show that by specifying STs equationally in Cartesian form as primitive recursive functions we have the basis of a new, general-purpose and mathematically sound theory of stream processing that emphasises the formal specification and formal verification of STs. The main topics that we address in the development of this theory are as follows. We present a theoretically well-founded, general-purpose stream processing language ASTRAL (Algebraic Stream TRAnsformer Language) that supports the use of modular specification techniques for full second-order STs. We show how ASTRAL specifications can be given a Cartesian form semantics using the language PREQ, which is an equational characterisation of the primitive recursive functions. In more detail, we show that by compiling ASTRAL specifications into an equivalent Cartesian form in PREQ we can use first-order equational logic with induction as a logical calculus to reason about STs. In particular, using this calculus we identify a syntactic class of correctness statements for which the verification of ASTRAL programmes is decidable relative to this calculus. We define an effective algorithm based on term rewriting techniques to implement this calculus and hence to automatically verify a very broad class of STs, including conventional hardware devices. Finally, we analyse the properties of this abstract algorithm as a proof assistant and discuss various techniques that have been adopted to develop software tools based on this algorithm.
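
    As a toy illustration of the Cartesian-form idea (not ASTRAL or PREQ syntax), the sketch below specifies a running-sum stream transformer by first-order, primitive-recursive equations over discrete time, rather than by manipulating the stream transformer as a higher-order object directly. The example and names are assumptions made for illustration.

```haskell
-- A stream is a function of discrete time; the transformer below is given
-- "in Cartesian form": its output is defined pointwise by primitive
-- recursion on time, using only first-order equations.
module Main where

type Stream a = Integer -> a      -- a stream maps each time step to a value

-- Second-order view: a stream transformer maps streams to streams.
runningSumST :: Stream Integer -> Stream Integer
runningSumST x = out
  where
    -- Cartesian-form equations:
    --   out(0)   = x(0)
    --   out(t+1) = out(t) + x(t+1)
    out 0 = x 0
    out t = out (t - 1) + x t

main :: IO ()
main = print (map (runningSumST (\t -> t + 1)) [0 .. 5])
  -- input stream 1,2,3,4,5,6  =>  output 1,3,6,10,15,21
```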

    Just below the surface: developing knowledge management systems using the paradigm of the noetic prism

    In this paper we examine how the principles embodied in the paradigm of the noetic prism can illuminate the construction of knowledge management systems. We draw on the formalism of the prism to examine three successful tools: frames, spreadsheets and databases, and show how their power and also their shortcomings arise from their domain representation, and how any organisational system based on integration of these tools and conversion between them is inevitably lossy. We suggest how a late-binding, hybrid knowledge-based management system (KBMS) could be designed that draws on the lessons learnt from these tools, by maintaining noetica at an atomic level and storing the combinatory processes necessary to create higher-level structure as the need arises. We outline the “just-below-the-surface” systems design, and describe its implementation in an enterprise-wide knowledge-based system that has all of the conventional office automation features.
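
    A loose sketch of the late-binding design as described: atomic items are stored flat, and the combinatory processes that produce higher-level structure are kept as stored functions applied only at query time. All names, types and the two example views are illustrative assumptions, not details from the paper.

```haskell
-- Atomic items are kept in a flat store; "views" that build higher-level
-- structure are stored functions, bound late (only when a query arrives).
module Main where

import qualified Data.Map as M

type Atom  = (String, String)          -- (entity, value), deliberately minimal
type Store = [Atom]
type View  = Store -> [String]         -- a stored combinatory process

views :: M.Map String View
views = M.fromList
  [ ("entities", map fst)                                  -- just the entity names
  , ("rendered", \s -> [ e ++ " = " ++ v | (e, v) <- s ])  -- one higher-level rendering
  ]

-- Late binding: the combining function is looked up and applied per query.
query :: String -> Store -> [String]
query name store = maybe [] ($ store) (M.lookup name views)

main :: IO ()
main = mapM_ putStrLn (query "rendered" [("budget", "42"), ("owner", "kim")])
```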

    Reliability models for dataflow computer systems

    The demands for concurrent operation within a computer system and the representation of parallelism in programming languages have yielded a new form of program representation known as data flow (DENN 74, DENN 75, TREL 82a). A new model based on data flow principles for parallel computations and parallel computer systems is presented. Necessary conditions for liveness and deadlock-freeness in data flow graphs are derived. The data flow graph is used as a model to represent asynchronous concurrent computer architectures, including data flow computers.
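
    For readers unfamiliar with the model, the sketch below shows the standard data-flow firing rule such graph models build on: a node is enabled when every input arc carries a token, and firing consumes those tokens and produces tokens on the output arcs. The tiny two-node graph and single-token arcs are illustrative assumptions; the thesis's reliability model and its liveness and deadlock-freeness conditions are not reproduced.

```haskell
-- Toy data-flow graph: nodes fire when all input arcs hold tokens.
module Main where

import qualified Data.Map as M

type Arc    = String
type Node   = (String, [Arc], [Arc])      -- (name, input arcs, output arcs)
type Tokens = M.Map Arc Int               -- tokens currently on each arc

-- A node is enabled when every input arc carries at least one token.
enabled :: Tokens -> Node -> Bool
enabled toks (_, ins, _) = all (\a -> M.findWithDefault 0 a toks > 0) ins

-- Firing consumes one token per input arc and produces one per output arc.
fire :: Tokens -> Node -> Tokens
fire toks (_, ins, outs) =
  foldr (M.adjust (subtract 1)) (foldr bump toks outs) ins
  where bump a = M.insertWith (+) a 1

-- A marking with no enabled node is deadlocked; liveness further requires
-- that every node can become enabled again from every reachable marking.
deadlocked :: Tokens -> [Node] -> Bool
deadlocked toks = not . any (enabled toks)

main :: IO ()
main = do
  let nodes = [ ("plus", ["a", "b"], ["c"]), ("dup", ["c"], ["a", "b"]) ]
      start = M.fromList [("a", 1), ("b", 1), ("c", 0)]
      step1 = fire start (head nodes)
  print (map (enabled start) nodes)   -- [True, False]
  print (deadlocked step1 nodes)      -- False: "dup" is now enabled
```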

    Introspective Pushdown Analysis of Higher-Order Programs

    In the static analysis of functional programs, pushdown flow analysis and abstract garbage collection skirt just inside the boundaries of soundness and decidability. Alone, each method reduces analysis times and boosts precision by orders of magnitude. This work illuminates and conquers the theoretical challenges that stand in the way of combining the power of these techniques. The challenge in marrying these techniques is not subtle: computing the reachable control states of a pushdown system relies on limiting access during transition to the top of the stack; abstract garbage collection, on the other hand, needs full access to the entire stack to compute a root set, just as concrete collection does. Introspective pushdown systems resolve this conflict. Introspective pushdown systems provide enough access to the stack to allow abstract garbage collection, but they remain restricted enough to compute control-state reachability, thereby enabling the sound and precise product of pushdown analysis and abstract garbage collection. Experiments reveal synergistic interplay between the techniques, and the fusion demonstrates "better-than-both-worlds" precision.
    Comment: Proceedings of the 17th ACM SIGPLAN International Conference on Functional Programming, 2012, AC
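
    As a toy rendering of the conflict the abstract describes, the sketch below lets a transition read the entire stack (as a garbage-collecting analysis must) while still only pushing or popping its top frame, and then explores reachable control states up to a crude step bound. The states, frames, and the bound are invented for illustration; this is not the paper's summarization algorithm for control-state reachability.

```haskell
-- Toy "introspective" pushdown system: transitions may inspect the whole
-- stack read-only, but may only push or pop the top frame.
module Main where

import Data.List (nub)

data Frame = Frame String deriving (Eq)
type Stack = [Frame]
type State = (String, Stack)              -- (control state, stack)

data Action = Push Frame | Pop | Stay

-- A transition reads the control state, the top of the stack, and (for the
-- "root set"-style computation) the full stack.
step :: State -> [(String, Action)]
step (ctrl, stack) =
  case (ctrl, stack) of
    ("call", _)      -> [("eval", Push (Frame ("k" ++ show roots)))]
    ("eval", _ : _)  -> [("ret", Stay)]
    ("ret",  _ : _)  -> [("call", Pop)]
    _                -> []
  where roots = length (nub stack)        -- full-stack introspection

apply :: Stack -> Action -> Stack
apply s (Push f)  = f : s
apply (_ : s) Pop = s
apply s Stay      = s
apply [] Pop      = []

-- Reachable control states within a step bound (a crude stand-in for the
-- finite summarization that real pushdown reachability relies on).
reachable :: Int -> State -> [String]
reachable 0 (c, _) = [c]
reachable n st@(c, s) =
  nub (c : concat [ reachable (n - 1) (c', apply s a) | (c', a) <- step st ])

main :: IO ()
main = print (reachable 6 ("call", []))   -- ["call","eval","ret"]
```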

    Polymonadic Programming

    Monads are a popular tool for the working functional programmer to structure effectful computations. This paper presents polymonads, a generalization of monads. Polymonads give the familiar monadic bind the more general type forall a,b. L a -> (a -> M b) -> N b, to compose computations with three different kinds of effects, rather than just one. Polymonads subsume monads and parameterized monads, and can express other constructions, including precise type-and-effect systems and information flow tracking; more generally, polymonads correspond to Tate's productoid semantic model. We show how to equip a core language (called lambda-PM) with syntactic support for programming with polymonads. Type inference and elaboration in lambda-PM allows programmers to write polymonadic code directly in an ML-like syntax--our algorithms compute principal types and produce elaborated programs wherein the binds appear explicitly. Furthermore, we prove that the elaboration is coherent: no matter which (type-correct) binds are chosen, the elaborated program's semantics will be the same. Pleasingly, the inferred types are easy to read: the polymonad laws justify (sometimes dramatic) simplifications, but with no effect on a type's generality.Comment: In Proceedings MSFP 2014, arXiv:1406.153
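
    The generalized bind type quoted above can be rendered directly; the sketch below is a minimal Haskell encoding of it as a three-parameter type class with a couple of illustrative instances. The class name and instances are assumptions for illustration, not lambda-PM; neither the inference algorithm nor the coherence result is modelled.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
-- A polymonadic bind relates three (possibly different) type constructors:
--   pbind :: L a -> (a -> M b) -> N b
module Main where

import Data.Functor.Identity (Identity (..))

class PolyBind l m n where
  pbind :: l a -> (a -> m b) -> n b

-- A pure computation bound into a Maybe computation yields a Maybe:
instance PolyBind Identity Maybe Maybe where
  pbind (Identity x) f = f x

-- Two Maybe computations compose as with the ordinary monadic bind,
-- showing how polymonads subsume monads:
instance PolyBind Maybe Maybe Maybe where
  pbind Nothing  _ = Nothing
  pbind (Just x) f = f x

-- A Maybe bound into a pure continuation still has to acknowledge failure:
instance PolyBind Maybe Identity Maybe where
  pbind Nothing  _ = Nothing
  pbind (Just x) f = Just (runIdentity (f x))

main :: IO ()
main = do
  print (pbind (Identity 2) (\x -> Just (x + 1))     :: Maybe Int)  -- Just 3
  print (pbind (Just 2)     (\x -> Identity (x * 3)) :: Maybe Int)  -- Just 6
```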