
    Skeletons for Distributed Topological Computation

    Parallel implementation of topological algorithms is highly desirable, but the challenges, from reconstructing algorithms around independent threads through to runtime load balancing, have proven to be formidable. This problem, made all the more acute by the diversity of hardware platforms, has led to new kinds of implementation platform for computational science, with sophisticated runtime systems managing and coordinating large thread counts to keep processing elements heavily utilized. While simpler and more portable than direct management of threads, these approaches still entangle program logic with resource management. Similar kinds of highly parallel runtime systems have also been developed for functional languages. Here, however, language support for higher-order functions allows a cleaner separation between the algorithm and 'skeletons' that express generic patterns of parallel computation. We report results on using this technique to develop a distributed version of the Joint Contour Net, a generalization of the Contour Tree to multifields. We present performance comparisons against a recent Haskell implementation using shared-memory parallelism, and initial work on a skeleton for distributed-memory implementation that uses an innovative strategy to reduce inter-process communication overheads.
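
    As a flavour of the skeleton idea (a minimal sketch in plain Haskell using Control.Parallel.Strategies, not the paper's Joint Contour Net code), a 'farm' can be written as an ordinary higher-order function: the caller supplies a sequential worker, and the skeleton alone owns the chunking and sparking.

        import Control.DeepSeq (NFData)
        import Control.Parallel.Strategies (parMap, rdeepseq)

        -- Farm skeleton: apply a sequential worker to chunks of the input
        -- in parallel. The worker contains no parallelism logic at all.
        farm :: NFData b => Int -> (a -> b) -> [a] -> [b]
        farm chunkSize worker xs =
          concat (parMap rdeepseq (map worker) (chunks xs))
          where
            chunks [] = []
            chunks ys = let (h, t) = splitAt chunkSize ys in h : chunks t

    Swapping the coordination strategy (say, a work pool instead of static chunks) changes only the skeleton, never the worker.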

    Structured Parallelism by Composition - Design and implementation of a framework supporting skeleton compositionality

    This thesis is dedicated to the efficient compositionality of algorithmic skeletons, which are abstractions of common parallel programming patterns. Skeletons can be implemented in the functional parallel language Eden as ordinary parallel higher-order functions. Algorithmic skeletons greatly facilitate parallel programming, because they already implement the tedious details of parallel coordination and can be specialised for concrete applications by providing problem-specific functions and parameters. Efficient skeleton compositionality is of particular importance because complex, specialised skeletons can be composed of simpler base skeletons. The resulting modularity is especially important in the context of functional programming and should not be missing from a functional language. We subdivide composition into three categories:
    - Nesting: a skeleton is instantiated from another skeleton instance. Communication is tree-shaped, along the call hierarchy. This is directly supported by Eden.
    - Composition in sequence: the result of a skeleton is the input for a succeeding skeleton. Function composition is expressed in Eden by the ( . ) operator. For performance reasons, the processes of both skeletons should be able to exchange results directly instead of using the indirection via the caller process. We therefore introduce the remote data concept.
    - Iteration: a skeleton is called in sequence a variable number of times. This can be defined using recursion and composition in sequence. We optimise the number of skeleton instances, the communication between the iteration steps, and the control of the loop. To this end, we developed an iteration framework where iteration skeletons are composed from control and body skeletons.
    Central to our composition concept is remote data: we send a remote data handle instead of the ordinary data, and the handle is used at its destination to request the referenced data. Remote data can be used inside arbitrary container types for efficient skeleton composition, similar to ordinary distributed data types. The free combinability of remote data with arbitrary container types leads to a high degree of flexibility: the programmer is not restricted to a predefined set of distributed data types and (re-)distribution functions, and can moreover use remote data with arbitrary container types to elegantly create process topologies. For the special case of skeleton iteration, we prevent the repeated construction and deconstruction of skeleton instances for each single iteration step, which is common for the recursive use of skeletons. This minimises the parallel overhead for process and channel creation and allows data to be kept local on persistent processes. To this end, we provide a skeleton framework. This concept is independent of remote data; however, using remote data in combination with the iteration framework makes the framework more flexible. For our case studies, both approaches perform competitively compared to programs with identical parallel structure that are implemented using monolithic skeletons, i.e. skeletons not composed from simpler ones. Further, we present extensions of Eden which enhance composition support: generalisation of overloaded communication, generalisation of process instantiation, compositional process placement, and extensions of Box types used to adapt communication behaviour.
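
    A sketch of composition in sequence via remote data, written against the Eden API as described in the Eden literature (the module layout and exact signatures here are our assumptions, not code from the thesis):

        import Control.Parallel.Eden (Trans, Process, process, (#), RD, release, fetch)

        -- Stage 1 releases its result as a remote data handle; stage 2
        -- fetches the data directly from stage 1's process, so the bulk
        -- data bypasses the caller and only the small handle travels via it.
        stage1 :: (Trans a, Trans b) => (a -> b) -> Process a (RD b)
        stage1 f = process (release . f)

        stage2 :: (Trans b, Trans c) => (b -> c) -> Process (RD b) c
        stage2 g = process (g . fetch)

        -- Sequential composition of the two skeleton stages.
        pipe2 :: (Trans a, Trans b, Trans c) => (a -> b) -> (b -> c) -> a -> c
        pipe2 f g x = stage2 g # (stage1 f # x)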

    Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms

    This thesis presents design and implementation approaches for the parallel algorithms of computer algebra. We use algorithmic skeletons as well as further approaches, such as data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and for some special parallel loops that we call 'repeated computation with a possibility of premature termination'. We introduce in this thesis a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms, for which our arithmetic provides a generic parallelisation approach. The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language, and allows us to refrain from using two different languages, one for the implementation and one for the interface, in our implementation of computer algebra algorithms. Further, this thesis presents methods for the evaluation and estimation of parallel execution times. We partition the parallel execution time into two components: one accounts for the quality of the parallelisation, which we call the 'parallel penalty'; the other is the sequential execution time. For the estimation, we predict both components separately using statistical methods. This enables very confident estimations while using drastically fewer measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis, and we have also used existing estimation methods. We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen's matrix multiplication algorithm, and the fast Fourier transform. The latter was used to implement polynomial convolution, which leads to a further fast multiplication algorithm. Specifically for our implementation of Strassen's algorithm, we designed and implemented a divide and conquer skeleton based on actors. For the parallel fast Fourier transform we not only used new divide and conquer skeletons but also developed a map-and-transpose skeleton, which enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows very good performance. We have analysed the parallel penalty of our programs and compared it to the serial fraction, an approach known from the literature. We also performed execution time estimations of our divide and conquer programs. This thesis further presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, such as parMap, farm, and workpool, with a premature termination property. We use this to implement the so-called 'parallel repeated computation', a special form of a speculative parallel loop. We have implemented two probabilistic primality tests, the Rabin–Miller test and the Jacobi sum test, and parallelised both with our approach. We analysed the task distribution and determined fitting configurations for the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel; subsequently, we parallelised it, analysed the load balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements.
    The parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation are our original contributions. The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handled the common factors of the numerator or denominator of the fraction with the modulus in a novel manner; this is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we parallelised the determinant computation using Gaussian elimination. As always, we performed task distribution analysis and estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables parallelisation of entire classes of computer algebra algorithms. Summarising, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms.
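
    For illustration, a generic divide and conquer skeleton can be expressed as a single higher-order function. The sketch below uses GHC evaluation strategies rather than the thesis's Eden processes, and the depth bound on parallelism is our own simplification:

        import Control.DeepSeq (NFData)
        import Control.Parallel.Strategies (parList, rdeepseq, using)

        -- divConq trivial solve divide combine depth prob:
        -- spark subproblems in parallel down to 'depth', then go sequential.
        divConq :: NFData b
                => (a -> Bool)      -- is the problem trivially solvable?
                -> (a -> b)         -- direct solution
                -> (a -> [a])       -- split into subproblems
                -> (a -> [b] -> b)  -- combine subresults
                -> Int -> a -> b
        divConq trivial solve divide combine = go
          where
            go depth prob
              | trivial prob = solve prob
              | depth <= 0   = combine prob (map (go 0) (divide prob))
              | otherwise    = combine prob
                                 (map (go (depth - 1)) (divide prob)
                                    `using` parList rdeepseq)

    Karatsuba multiplication, for instance, instantiates 'divide' to split each operand and 'combine' to reassemble the three partial products.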

    Exploring the concept of interaction computing through the discrete algebraic analysis of the Belousov–Zhabotinsky reaction

    Interaction computing (IC) aims to map the properties of integrable low-dimensional non-linear dynamical systems to the discrete domain of finite-state automata, in an attempt to reproduce in software the self-organizing and dynamically stable properties of sub-cellular biochemical systems. As the work reported in this paper is still at the early stages of theory development, it focuses on the analysis of a particularly simple chemical oscillator, the Belousov–Zhabotinsky (BZ) reaction. After retracing the rationale for IC developed over the past several years from the physical, biological, mathematical, and computer science points of view, the paper presents an elementary discussion of the Krohn–Rhodes decomposition of finite-state automata, including the holonomy decomposition of a simple automaton, and of its interpretation as an abstract positional number system. The method is then applied to the analysis of the algebraic properties of discrete finite-state automata derived from a simplified Petri net model of the BZ reaction. In the simplest, symmetrical case, the corresponding automaton is, not surprisingly, found to contain exclusively cyclic groups. In a second, asymmetrical case, the decomposition is much more complex and includes five different simple non-abelian groups, whose potential relevance arises from their ability to encode functionally complete algebras. The possible computational relevance of these findings is discussed and possible conclusions are drawn.
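
    To fix ideas: Krohn–Rhodes-style analysis treats each input letter of an automaton as a transformation of the state set and studies the semigroup those transformations generate. The toy Haskell sketch below (our illustration, not the paper's tooling) enumerates that semigroup for a three-state automaton with a single cyclic letter, recovering the three elements of the cyclic group Z3:

        import Data.List (nub)

        -- A letter acts as a total transformation of the states {0..n-1}:
        -- t !! s is the image of state s.
        type Transformation = [Int]

        -- Apply t, then u.
        compose :: Transformation -> Transformation -> Transformation
        compose t u = map (u !!) t

        -- Close the generating letters under composition.
        generated :: [Transformation] -> [Transformation]
        generated gens = go gens
          where
            go ts =
              let ts' = nub (ts ++ [compose a b | a <- ts, b <- gens])
              in if length ts' == length ts then ts else go ts'

        -- One letter cycling 0 -> 1 -> 2 -> 0 generates exactly Z3.
        main :: IO ()
        main = print (generated [[1, 2, 0]])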

    Methotrexate Toxicity in Growing Long Bones of Young Rats: A Model for Studying Cancer Chemotherapy-Induced Bone Growth Defects in Children

    The advancement and intensive use of chemotherapy in treating childhood cancers has led to a growing population of young cancer survivors who face increased bone health risks. However, the underlying mechanisms of chemotherapy-induced skeletal defects remain largely unclear. Methotrexate (MTX), the most commonly used antimetabolite in paediatric cancer treatment, is known to cause bone growth defects in children undergoing chemotherapy. Animal studies have not only confirmed the clinical observations but also increased our understanding of the mechanisms underlying chemotherapy-induced skeletal damage. These models revealed that high-dose MTX can cause growth plate dysfunction, damage osteoprogenitor cells, suppress bone formation, and increase bone resorption and marrow adipogenesis, resulting in overall bone loss. While recent rat studies have shown that the antidote folinic acid can reduce MTX damage in the growth plate and bone, future studies should investigate potential adjuvant treatments to reduce chemotherapy-induced skeletal toxicities.

    ctrl. + Z: a DNA / ZOO for the 21st century

    The disappearance of naturally occurring organisms, their extinction, and their reinterpretation through science revive the ancient allegory of Plato's cave. The story is a scenario in which reality and illusion are confused: Socrates asks Glaucon to imagine a cave inhabited by prisoners who have been chained and held immobile since childhood: not only are their arms and legs held in place, but their heads are also fixed, compelled to gaze at a wall in front of them. Behind the prisoners is an enormous fire, and between the fire and the prisoners is a raised walkway, along which people walk carrying objects on their heads. The prisoners watch the shadows cast by the men, and hear their echoes, not knowing that they are shadows and reflections. Socrates suggests that the prisoners would take the shadows and echoes to be reality, not just reflections of reality, since they are all they have ever known, and the whole of their society would depend on the shadows on the wall. We are currently confronted with a similar conundrum, in which information can be misconstrued as both reality and myth. The headlines have been a riot of outcries since the escalation of rhino poaching for new-age traditional medicine; the result of rhino poaching is the rhino's imminent extinction. Without the media frenzy, animals would silently disappear and man would fail to acknowledge the part he has played until it was too late. Our relationship with animals provides us with a useful mirror of society. The incomprehension between man and any other species forces us to project emotions and meaning onto animals in order to understand them; this synapse of ambiguity creates a void that is filled with questions, curiosity, and guilt. The rising number of vulnerable species highlights the fact that measures taken to stall extinction are ineffective. The artificial landscapes man has devised to preserve animals, namely nature reserves, zoological gardens, and natural history museums, construct new versions of reality into which we file nature so that it corresponds with human logic. Our incessant need to control, dissect, and extrapolate habitats has resulted in anthropomorphic and anthropocentric typologies. Through assessing these preservation models and their priorities, which seem more concerned with capture and display for capital than with re-establishing a natural order, I argue that the current situation is outdated and requires reinvention. The human population has hindered the natural migration of animals; however, it is now possible to reinstate some of this natural order by establishing a network of genetics between zoos, natural history museums, and nature reserves. In the process of collecting animal DNA data, we are creating a backup system for animals of the future. My thesis proposes the integration of the concepts of game reserve, zoo, natural history museum, and cryobank into a single 'DNA Zoo' concept for the 21st century.

    GUMSMP: a scalable parallel Haskell implementation

    The most widely available high-performance platforms today are hierarchical, with shared-memory leaves, e.g. clusters of multi-cores, or NUMA machines with multiple regions. The Glasgow Haskell Compiler (GHC) provides a number of parallel Haskell implementations targeting different parallel architectures. In particular, GHC-SMP supports shared-memory architectures, and GHC-GUM supports distributed-memory machines. Both implementations use different, but related, runtime system (RTS) mechanisms and achieve good performance. A specialised RTS for the ubiquitous hierarchical architectures is lacking. This thesis presents the design, implementation, and evaluation of a new parallel Haskell RTS, GUMSMP, that combines shared- and distributed-memory mechanisms to exploit hierarchical architectures more effectively. The design evaluates a variety of choices and aims to efficiently combine scalable distributed-memory parallelism, using a virtual shared heap over a hierarchical architecture, with low-overhead shared-memory parallelism on shared-memory nodes. Key design objectives in realising this system are to prefer local work and to exploit mostly passive load distribution with pre-fetching. Systematic performance evaluation shows that the automatic hierarchical load distribution policies must be carefully tuned to obtain good performance. We investigate the impact of several policies, including work pre-fetching, favouring inter-node work distribution, and spark segregation with different export and select policies. We present performance results for GUMSMP, demonstrating good scalability for a set of benchmarks on up to 300 cores. Moreover, our policies provide performance improvements of up to a factor of 1.5 compared to GHC-GUM. The thesis also provides a performance evaluation of distributed and shared heap implementations of parallel Haskell on a state-of-the-art physical shared-memory NUMA machine. The evaluation exposes bottlenecks in memory management, which limit scalability beyond 25 cores. We demonstrate that GUMSMP, which combines both distributed and shared heap abstractions, consistently outperforms the shared-memory GHC-SMP on seven benchmarks by a factor of 3.3 on average. Specifically, we show that the best results are obtained when sharing memory only within a single NUMA region and using distributed-memory abstractions across the regions.
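
    The units of work that these load-distribution policies move around are GpH 'sparks': records of potential parallelism that the RTS may turn into threads locally or, in GUM-style systems, ship to another node. A minimal spark-generating program (standard GpH, not code from the thesis) looks like this:

        import Control.Parallel (par, pseq)

        -- 'par' registers x as a spark; the runtime's policies decide
        -- where, and whether, it is actually evaluated in parallel.
        parFib :: Int -> Integer
        parFib n
          | n < 25    = seqFib n                    -- sequential cutoff
          | otherwise = x `par` (y `pseq` (x + y))  -- spark x, work on y
          where
            x = parFib (n - 1)
            y = parFib (n - 2)

        seqFib :: Int -> Integer
        seqFib m = if m < 2 then toInteger m else seqFib (m - 1) + seqFib (m - 2)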

    The Ability of Soil Pore Network Metrics to Predict Redox Dynamics Is Scale Dependent

    Variations in microbial community structure and metabolic efficiency are governed in part by oxygen availability, which is a function of water content, diffusion distance, and oxygen demand; for this reason, the volume, connectivity, and geometry of soil pores may exert primary controls on spatial metabolic diversity in soil. Here, we combine quantitative pore network metrics derived from X-ray computed tomography (XCT) with measurements of electromotive potentials to assess how the metabolic status of soil depends on variations in the overall pore network architecture. Contrasting pore network architectures were generated using a Mollisol A horizon and compared to intact control samples from the same soil. Mesocosms from each structural treatment were instrumented with Pt electrodes to record available-energy dynamics during a regimen of varying moisture conditions. We found that volume-based XCT metrics were more frequently correlated with metrics describing changes in available energy than medial-axis XCT metrics. The abundance of significant correlations between pore network metrics and available-energy parameters was not only a function of pore architecture but also of the dimensions of the sub-sample chosen for XCT analysis. Pore network metrics had the greatest power to statistically explain changes in available energy in the smallest volumes analyzed. Our work underscores the importance of scale in observations of natural systems.