90 research outputs found
Structured Parallelism by Composition - Design and implementation of a framework supporting skeleton compositionality
This thesis is dedicated to the efficient compositionality of algorithmic skeletons, which are abstractions of common parallel programming patterns. Skeletons can be implemented in the functional parallel language Eden as mere parallel higher order functions. The use of algorithmic skeletons facilitates parallel programming massively. This is because they already implement the tedious details of parallel programming and can be specialised for concrete applications by providing problem specific functions and parameters. Efficient skeleton compositionality is of particular importance because complex, specialised skeletons can be compound of simpler base skeletons. The resulting modularity is especially important for the context of functional programming and should not be missing in a functional language. We subdivide composition into three categories:
-Nesting: A skeleton is instantiated from another skeleton instance. Communication is tree shaped, along the call hierarchy. This is directly supported by Eden.
-Composition in sequence: The result of a skeleton is the input for a succeeding skeleton. Function composition is expressed in Eden by the ( . ) operator. For performance reasons the processes of both skeletons should be able to exchange results directly instead of using the indirection via the caller process. We therefore introduce the remote data concept.
-Iteration: A skeleton is called in sequence a variable number of times. This can be defined using recursion and composition in sequence. We optimise the number of skeleton instances, the communication in between the iteration steps and the control of the loop. To this end, we developed an iteration framework where iteration skeletons are composed from control and body skeletons.
Central to our composition concept is remote data. We send a remote data handle instead of ordinary data, the data handle is used at its destination to request the referenced data. Remote data can be used inside arbitrary container types for efficient skeleton composition similar to ordinary distributed data types. The free combinability of remote data with arbitrary container types leads to a high degree of flexibility. The programmer is not restricted by using a predefined set of distributed data types and (re-)distribution functions. Moreover, he can use remote data with arbitrary container types to elegantly create process topologies.
For the special case of skeleton iteration we prevent the repeated construction and deconstruction of skeleton instances for each single iteration step, which is common for the recursive use of skeletons. This minimises the parallel overhead for process and channel creation and allows to keep data local on persistent processes. To this end we provide a skeleton framework. This concept is independent of remote data, however the use of remote data in combination with the iteration framework makes the framework more flexible.
For our case studies, both approaches perform competitively compared to programs with identical parallel structure but which are implemented using monolithic skeletons - i.e. skeleton not composed from simpler ones.
Further, we present extensions of Eden which enhance composition support: generalisation of overloaded communication, generalisation of process instantiation, compositional process placement and extensions of Box types used to adapt communication behaviour
Probability around the Quantum Gravity. Part 1: Pure Planar Gravity
In this paper we study stochastic dynamics which leaves quantum gravity
equilibrium distribution invariant. We start theoretical study of this dynamics
(earlier it was only used for Monte-Carlo simulation). Main new results concern
the existence and properties of local correlation functions in the
thermodynamic limit. The study of dynamics constitutes a third part of the
series of papers where more general class of processes were studied (but it is
self-contained), those processes have some universal significance in
probability and they cover most concrete processes, also they have many
examples in computer science and biology. At the same time the paper can serve
an introduction to quantum gravity for a probabilist: we give a rigorous
exposition of quantum gravity in the planar pure gravity case. Mostly we use
combinatorial techniques, instead of more popular in physics random matrix
models, the central point is the famous exponent.Comment: 40 pages, 11 figure
Recommended from our members
Skeleton Structures and Origami Design
In this dissertation we study problems related to polygonal skeleton structures that have applications to computational origami. The two main structures studied are the straight skeleton of a simple polygon (and its generalizations to planar straight line graphs) and the universal molecule of a Lang polygon. This work builds on results completed jointly with my advisor Ileana Streinu.
Skeleton structures are used in many computational geometry algorithms. Examples include the medial axis, which has applications including shape analysis, optical character recognition, and surface reconstruction; and the Voronoi diagram, which has a wide array of applications including geographic information systems (GIS), point location data structures, motion planning, etc.
The straight skeleton, studied in this work, has applications in origami design, polygon interpolation, biomedical imaging, and terrain modeling, to name just a few. Though the straight skeleton has been well studied in the computational geometry literature for over 20 years, there still exists a significant gap between the fastest algorithms for constructing it and the known lower bounds.
One contribution of this thesis is an efficient algorithm for computing the straight skeleton of a polygon, polygon with holes, or a planar straight-line graph given a secondary structure called the induced motorcycle graph.
The universal molecule is a generalization of the straight skeleton to certain convex polygons that have a particular relationship to a metric tree. It is used in Robert Lang\u27s seminal TreeMaker method for origami design. Informally, the universal molecule is a subdivision of a polygon (or polygonal sheet of paper) that allows the polygon to be ``folded\u27\u27 into a particular 3D shape with certain tree-like properties. One open problem is whether the universal molecule can be rigidly folded: given the initial flat state and a particular desired final ``folded\u27\u27 state, is there a continuous motion between the two states that maintains the faces of the subdivision as rigid panels? A partial characterization is known: for a certain measure zero class of universal molecules there always exists such a folding motion. Another open problem is to remove the restriction of the universal molecule to convex polygons. This is of practical importance since the TreeMaker method sometimes fails to produce an output on valid input due the convexity restriction and extending the universal molecule to non-convex polygons would allow TreeMaker to work on all valid inputs. One further interesting problem is the development of faster algorithms for computing the universal molecule. In this thesis we make the following contributions to the study of the universal molecule. We first characterize the tree-like family of surfaces that are foldable from universal molecules. In order to do this we define a new family of surfaces we call Lang surfaces and prove that a restricted class of these surfaces are equivalent to the universal molecules. Next, we develop and compare efficient implementations for computing the universal molecule. Then, by investigating properties of broader classes of Lang surfaces, we arrive at a generalization of the universal molecule from convex polygons in the plane to non-convex polygons in arbitrary flat surfaces. This is of both practical and theoretical interest. The practical interest is that this work removes the case from Lang\u27s TreeMaker method that causes TreeMaker to fail to produce output in the presence of non-convex polygons. The theoretical interest comes from the fact that our generalization encompasses more than just those surfaces that can be cut out of a sheet of paper, and pertains to polygons that cannot be lied flat in the plane without self-intersections. Finally, we identify a large class of universal molecules that are not foldable by rigid folding motions. This makes progress towards a complete characterization of the foldability of the universal molecule
Recommended from our members
Program Synthesis for Software-Defined Networking
Software-defined networking (SDN) is revolutionizing the networking industry, but even the most advanced SDN programming platforms lack mechanisms for changing the global configuration (the set of all forwarding rules on the switches) correctly and automatically. This seemingly-simple notion of global configuration change (known as a network update) can be quite challenging for SDN programmers to implement by hand, because networks are distributed systems with hundreds or thousands of interacting nodes---even if the initial and final configurations are correct, naïvely updating individual nodes can lead to bugs in the intermediate configurations. Additionally, SDN programs must simultaneously describe both static forwarding behavior, and dynamic updates in response to events. These event-driven updates are critical to get right, but even more difficult to implement correctly due to interleavings of data packets and control messages. Existing SDN platforms offer only weak guarantees in this regard, also opening the door for incorrect behavior. As an added wrinkle, event-driven network programs are often physically distributed, running on several nodes of the network, and this distributed setting makes programming and debugging even more difficult. Bugs arising from any of these issues can cause serious incorrect transient behaviors, including loops, black holes, and access-control violations.This thesis presents a synthesis-based approach for solving these issues. First, I show how to automatically synthesize network updates that are guaranteed to preserve specified properties. I formalize the network updates problem and develop a synthesis algorithm based on counterexample-guided search and incremental model checking. Second, I add the ability to reason about transitions between configurations in response to events, by introducing event-driven consistent updates that are guaranteed to preserve well-defined behaviors in this context. I propose network event structures (NESs) to model constraints on updates, such as which events can be enabled simultaneously and causal dependencies between events. I define an extension of the NetKAT language with mutable state, give semantics to stateful programs using NESs, and discuss provably-correct strategies for implementing NESs in SDNs. Third, I propose a synchronization synthesis approach that allows correct "parallel composition" of several event-driven programs (processes)---the programmer can specify each sequential process, and add a declarative specification of paths that packets are allowed to take. The synthesizer then inserts synchronization among the distributed controller processes such that the declarative specification will be satisfied by all packets traversing the network. The key technical contribution here is a counterexample-guided synthesis algorithm that furnishes network processes with the synchronization required to prevent any races causing specification violations. An important component of this is an extension of network event structures to a more general programming model called event nets based on Petri nets. Finally, I describe an approach for implementing event nets in an efficient distributed way on modern SDN hardware. For each of the core components, I describe a prototype implementation, and present results from experiments on realistic topologies and properties, demonstrating that the tools handle real network programs, and scale to networks of 1000+ nodes
Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms
This thesis presents design and implementation approaches for the parallel algorithms of computer algebra. We use algorithmic skeletons and also further approaches, like data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and some special parallel loops, that we call ‘repeated computation with a possibility of premature termination’. We introduce in this thesis a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms, for these algorithms our arithmetic provides a generic parallelisation approach.
The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language. Moreover, it allows us to refrain from using two different languages—one for the implementation and one for the interface—for our implementation of computer algebra algorithms.
Further, this thesis presents methods for evaluation and estimation of parallel execution times. We partition the parallel execution time into two components. One of them accounts for the quality of the parallelisation, we call it the ‘parallel penalty’. The other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This enables very confident estimations, although using drastically less measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis. We haven also used existing estimation methods.
We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen’s matrix multiplication algorithm and the fast Fourier transform. The latter was used to implement polynomial convolution that leads to a further fast multiplication algorithm. Specially for our implementation of Strassen algorithm we have designed and implemented a divide and conquer skeleton basing on actors. We have implemented the parallel fast Fourier transform, and not only did we use new divide and conquer skeletons, but also developed a map-and-transpose skeleton. It enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows a very good performance. We have analysed the parallel penalty of our programs and compared it to the serial fraction—an approach, known from literature. We also performed execution time estimations of our divide and conquer programs.
This thesis presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, like parMap, farm, workpool, with a premature termination property. We use this to implement the so-called ‘parallel repeated computation’, a special form of a speculative parallel loop. We have implemented two probabilistic primality tests: the Rabin–Miller test and the Jacobi sum test. We parallelised both with our approach. We analysed the task distribution and stated the fitting configurations of the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel. Subsequently, we parallelised it, analysed the load balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements. Parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation is our original contribution.
The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handled the common factors of the numerator or denominator of the fraction with the modulus in a novel manner. This is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we have parallelised the determinant computation using the Gauß elimination. As always, we have performed task distribution analysis and estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables parallelisation of entire classes of computer algebra algorithms.
Summarising, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms
An assessment of gene regulatory network inference algorithms
A conceptual issue regarding gene regulatory network (GRN) inference algorithms is establishing their validity or correctness. In this study, we argue that for this purpose it is useful to conceive these algorithms as estimators of graph-valued parameters of explicit models for gene expression data. On this basis, we perform an assessment of a selection of influential GRN inference algorithms as estimators for two types of models: (i) causal graphs with associated structural equations models (SEMs), and (ii) differential equations models based on the thermodynamics of gene expression. Our findings corroborate that networks of marginal dependence fail in estimating GRNs, but they also suggest that the strength of statistical association as measured by mutual information may be indicative of GRN structure. Also, in simulations, we find that the GRN inference algorithms GENIE3 and TIGRESS outperform competing algorithms. However, more importantly, we also find that many observed patterns hinge on the GRN topology and the assumed data generating mechanism.Un problema conceptual con respecto a los algoritmos de inferencia de redes de regulación génica (RRG) es cómo establecer su validez. En este estudio sostenemos que para este objetivo conviene concebir estos algoritmos como estimadores de parámetros de modelos estadÃsticos explÃcitos para datos de expresión génica. Sobre esta base, realizamos una evaluación de una selección de algoritmos de inferencia de RRG como estimadores para dos tipos de modelos: (i) modelos de grafos causales asociados a modelos de ecuaciones estructurales (MEE), y (ii) modelos de ecuaciones diferenciales basados en la termodinámica de la expresion genica. Nuestros hallazgos corroboran que las redes de dependencias marginales fallan en la estimación de las RRG, pero también sugieren que la fuerza de la asociación estadÃstica medida por la información mutua puede reflejar en cierto grado la estructura de las RRG. Además, en un estudio de simulaciones, encontramos que los algoritmos de inferencia GENIE3 y TIGRESS son los de mejor desempeño. Sin embargo, crucialmente, también encontramos que muchos patrones observados en las simulaciones dependen de la topologÃa de la RRG y del modelo generador de datos.MaestrÃ
Workflow models for heterogeneous distributed systems
The role of data in modern scientific workflows becomes more and more crucial. The unprecedented amount of data available in the digital era, combined with the recent advancements in Machine Learning and High-Performance Computing (HPC), let computers surpass human performances in a wide range of fields, such as Computer Vision, Natural Language Processing and Bioinformatics. However, a solid data management strategy becomes crucial for key aspects like performance optimisation, privacy preservation and security.
Most modern programming paradigms for Big Data analysis adhere to the principle of data locality: moving computation closer to the data to remove transfer-related overheads and risks. Still, there are scenarios in which it is worth, or even unavoidable, to transfer data between different steps of a complex workflow.
The contribution of this dissertation is twofold. First, it defines a novel methodology for distributed modular applications, allowing topology-aware scheduling and data management while separating business logic, data dependencies, parallel patterns and execution environments. In addition, it introduces computational notebooks as a high-level and user-friendly interface to this new kind of workflow, aiming to flatten the learning curve and improve the adoption of such methodology.
Each of these contributions is accompanied by a full-fledged, Open Source implementation, which has been used for evaluation purposes and allows the interested reader to experience the related methodology first-hand. The validity of the proposed approaches has been demonstrated on a total of five real scientific applications in the domains of Deep Learning, Bioinformatics and Molecular Dynamics Simulation, executing them on large-scale mixed cloud-High-Performance Computing (HPC) infrastructures
- …