960 research outputs found
Recursion based parallelization of exact dense linear algebra routines for Gaussian elimination
International audienceWe present block algorithms and their implementation for the parallelization of sub-cubic Gaussian elimination on shared memory architectures.Contrarily to the classical cubic algorithms in parallel numerical linear algebra, we focus here on recursive algorithms and coarse grain parallelization.Indeed, sub-cubic matrix arithmetic can only be achieved through recursive algorithms making coarse grain block algorithms perform more efficiently than fine grain ones. This work is motivated by the design and implementation of dense linear algebraover a finite field, where fast matrix multiplication is used extensively and where costly modular reductions also advocate for coarse grain block decomposition. We incrementally build efficient kernels, for matrix multiplication first, then triangular system solving, on top of which a recursive PLUQ decomposition algorithm is built. We study the parallelization of these kernels using several algorithmic variants: either iterative or recursive and using different splitting strategies. Experiments show that recursive adaptive methods for matrix multiplication, hybrid recursive-iterative methods for triangular system solve and tile recursive versions of the PLUQ decomposition, together with various data mapping policies, provide the best performance on a 32 cores NUMA architecture. Overall, we show that the overhead of modular reductions is more than compensated by the fast linear algebra algorithms and that exact dense linear algebra matches the performance of full rank reference numerical software even in the presence of rank deficiencies
Accelerating Homomorphic Encryption in the Cloud Environment through High-Level Synthesis and Reconfigurable Resources
The recent surge in cloud services is revolutionizing the way that data is stored and processed. Everyone with an internet connection, from large corporations to small companies and private individuals, now have access to cutting-edge processing power and vast amounts of data storage. This rise in cloud computing and storage, however, has brought with it a need for a new type of security. In order to have access to cloud services, users must allow the service provider to have full access to their private, unencrypted data. Users are required to trust the integrity of the service provider and the security of its data centers. The recent development of fully homomorphic encryption schemes can offer a solution to this dilemma. These algorithms allow encrypted data to be used in computations without ever stripping the data of the protection of encryption. Unfortunately, the demanding memory requirements and computational complexity of the proposed schemes has hindered their wide-scale use. Custom hardware accelerators for homomorphic encryption could be implemented on the increasing number of reconfigurable hardware resources in the cloud, but the long development time required for these processors would lead to high production costs. This research seeks to develop a strategy for faster development of homomorphic encryption hardware accelerators using the process of High-Level Synthesis. Insights from existing number theory software libraries and custom hardware accelerators are used to develop a scalable, proof-of-concept software implementation of Karatsuba modular polynomial multiplication. This implementation was designed to be used with High-Level Synthesis to accelerate the large modular polynomial multiplication operations required by homomorphic encryption. The accelerator generated from this implementation by the High-Level Synthesis tool Vivado HLS achieved significant speedup over the implementations available in the highly-optimized FLINT software library
Algebraic Stream Processing
We identify and analyse the typically higher-order approaches to stream processing in the literature. From this analysis we motivate an alternative approach to the specification of SPSs as STs based on an essentially first-order equational representation. This technique is called Cartesian form specification. More specifically, while STs are properly second-order objects we show that using Cartesian forms, the second-order models needed to formalise STs are so weak that we may use and develop well-understood first-order methods from computability theory and mathematical logic to reason about their properties. Indeed, we show that by specifying STs equationally in Cartesian form as primitive recursive functions we have the basis of a new, general purpose and mathematically sound theory of stream processing that emphasises the formal specification and formal verification of STs. The main topics that we address in the development of this theory are as follows. We present a theoretically well-founded general purpose stream processing language ASTRAL (Algebraic Stream TRAnsformer Language) that supports the use of modular specification techniques for full second-order STs. We show how ASTRAL specifications can be given a Cartesian form semantics using the language PREQ that is an equational characterisation of the primitive recursive functions. In more detail, we show that by compiling ASTRAL specifications into an equivalent Cartesian form in PREQ we can use first-order equational logic with induction as a logical calculus to reason about STs. In particular, using this calculus we identify a syntactic class of correctness statements for which the verification of ASTRAL programmes is decidable relative to this calculus. We define an effective algorithm based on term re-writing techniques to implement this calculus and hence to automatically verify a very broad class of STs including conventional hardware devices. Finally, we analyse the properties of this abstract algorithm as a proof assistant and discuss various techniques that have been adopted to develop software tools based on this algorithm
Just below the surface: developing knowledge management systems using the paradigm of the noetic prism
In this paper we examine how the principles embodied in the paradigm of the noetic prism can illuminate the construction of knowledge management systems. We draw on the formalism of the prism to examine three successful tools: frames, spreadsheets and databases, and show how their power and also their shortcomings arise from their domain representation, and how any organisational system based on integration of these tools and conversion between them is inevitably lossy. We suggest how a late-binding, hybrid knowledge based management system (KBMS) could be designed that draws on the lessons learnt from these tools, by maintaining noetica at an atomic level and storing the combinatory processes necessary to create higher level structure as the need arises. We outline the “just-below-the-surface” systems design, and describe its implementation in an enterprise-wide knowledge-based system that has all of the conventional office automation features
Reliability models for dataflow computer systems
The demands for concurrent operation within a computer system and the representation of parallelism in programming languages have yielded a new form of program representation known as data flow (DENN 74, DENN 75, TREL 82a). A new model based on data flow principles for parallel computations and parallel computer systems is presented. Necessary conditions for liveness and deadlock freeness in data flow graphs are derived. The data flow graph is used as a model to represent asynchronous concurrent computer architectures including data flow computers
Introspective Pushdown Analysis of Higher-Order Programs
In the static analysis of functional programs, pushdown flow analysis and
abstract garbage collection skirt just inside the boundaries of soundness and
decidability. Alone, each method reduces analysis times and boosts precision by
orders of magnitude. This work illuminates and conquers the theoretical
challenges that stand in the way of combining the power of these techniques.
The challenge in marrying these techniques is not subtle: computing the
reachable control states of a pushdown system relies on limiting access during
transition to the top of the stack; abstract garbage collection, on the other
hand, needs full access to the entire stack to compute a root set, just as
concrete collection does. \emph{Introspective} pushdown systems resolve this
conflict. Introspective pushdown systems provide enough access to the stack to
allow abstract garbage collection, but they remain restricted enough to compute
control-state reachability, thereby enabling the sound and precise product of
pushdown analysis and abstract garbage collection. Experiments reveal
synergistic interplay between the techniques, and the fusion demonstrates
"better-than-both-worlds" precision.Comment: Proceedings of the 17th ACM SIGPLAN International Conference on
Functional Programming, 2012, AC
Polymonadic Programming
Monads are a popular tool for the working functional programmer to structure
effectful computations. This paper presents polymonads, a generalization of
monads. Polymonads give the familiar monadic bind the more general type forall
a,b. L a -> (a -> M b) -> N b, to compose computations with three different
kinds of effects, rather than just one. Polymonads subsume monads and
parameterized monads, and can express other constructions, including precise
type-and-effect systems and information flow tracking; more generally,
polymonads correspond to Tate's productoid semantic model. We show how to equip
a core language (called lambda-PM) with syntactic support for programming with
polymonads. Type inference and elaboration in lambda-PM allows programmers to
write polymonadic code directly in an ML-like syntax--our algorithms compute
principal types and produce elaborated programs wherein the binds appear
explicitly. Furthermore, we prove that the elaboration is coherent: no matter
which (type-correct) binds are chosen, the elaborated program's semantics will
be the same. Pleasingly, the inferred types are easy to read: the polymonad
laws justify (sometimes dramatic) simplifications, but with no effect on a
type's generality.Comment: In Proceedings MSFP 2014, arXiv:1406.153
- …