65 research outputs found

    Two Fundamental Concepts in Skeletal Parallel Programming

    We define the concepts of nesting mode and interaction mode as they arise in the description of skeletal parallel programming systems. We suggest…

    The Integration of Task and Data Parallel Skeletons

    We describe a skeletal parallel programming library which integrates task and data parallel constructs within an API for C++. Traditional skeletal requirements for higher-orderness and polymorphism are achieved through operator overloading and templates, while the underlying parallelism is provided by MPI. We present a case study describing two algorithms for the travelling salesman problem.
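    As a rough illustration of the style of API this abstract describes (not the library's actual interface; map_skel and every other name below are invented for the example), a templated data-parallel map skeleton over MPI might look like:

    // Illustrative sketch only: a minimal data-parallel "map" skeleton over MPI,
    // in the spirit of the library described above. Names and signatures are
    // hypothetical, not the paper's actual API.
    #include <mpi.h>
    #include <vector>
    #include <cstdio>

    // A polymorphic map skeleton: applies f to this rank's block of data.
    template <typename F>
    std::vector<double> map_skel(const std::vector<double>& local, F f) {
        std::vector<double> out(local.size());
        for (std::size_t i = 0; i < local.size(); ++i) out[i] = f(local[i]);
        return out;  // each rank transforms only its own block
    }

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        std::vector<double> block(4, rank);  // this rank's data block
        auto doubled = map_skel(block, [](double x) { return 2 * x; });

        double local_sum = 0, global_sum = 0;
        for (double x : doubled) local_sum += x;
        // A reduce skeleton would wrap exactly this collective:
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) std::printf("sum = %g\n", global_sum);
        MPI_Finalize();
    }

    Templates give the polymorphism and the lambda argument gives the higher-orderness; the skeleton hides the MPI plumbing from the caller.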

    A Comparison of Big Data Frameworks on a Layered Dataflow Model

    In the world of Big Data analytics, a range of tools aims to simplify the programming of applications to be executed on clusters. Although each tool claims to provide better programming, data, and execution models, for which only informal (and often confusing) semantics is generally provided, all share a common underlying model: the Dataflow model. The Dataflow model we propose shows how these tools share the same expressiveness at different levels of abstraction. The contribution of this work is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm), thus making it easier to understand high-level data-processing applications written in such frameworks. Second, we provide a layered model that can represent tools and applications following the Dataflow paradigm, and we show how the analyzed tools fit in each level. Comment: 19 pages, 6 figures, 2 tables. In Proc. of the 9th Intl Symposium on High-Level Parallel Programming and Applications (HLPP), July 4-5, 2016, Münster, Germany.
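    To make the shared Dataflow model concrete, here is a minimal, hypothetical sketch (all names invented, no relation to any framework's real API) of a pipeline as a graph of operators; batch and stream engines differ in when tokens flow along the edges, not in the graph itself:

    // Illustrative sketch only: a toy Dataflow graph in plain C++, to make the
    // common model behind Spark/Flink/Storm-style pipelines concrete.
    #include <functional>
    #include <numeric>
    #include <vector>
    #include <cstdio>

    using Data = std::vector<int>;

    struct Node {  // a dataflow operator: consumes a collection, produces one
        std::function<Data(const Data&)> apply;
    };

    int main() {
        Node source{[](const Data&) { return Data{1, 2, 3, 4, 5}; }};
        Node square{[](const Data& in) {
            Data out; for (int x : in) out.push_back(x * x); return out; }};
        Node reduce{[](const Data& in) {
            return Data{std::accumulate(in.begin(), in.end(), 0)}; }};

        // Executing the graph means threading tokens along its edges.
        Data d = reduce.apply(square.apply(source.apply({})));
        std::printf("result = %d\n", d[0]);  // 1+4+9+16+25 = 55
    }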

    Toward optimised skeletons for heterogeneous parallel architecture with performance cost model

    High performance architectures are increasingly heterogeneous, with shared and distributed memory components and accelerators such as GPUs. Programming such architectures is complicated, and performance portability is a major issue as the architectures evolve. This thesis explores the potential for algorithmic skeletons integrating a dynamically parametrised static cost model to deliver portable performance for mostly regular data parallel programs on heterogeneous architectures. The first contribution of this thesis is to address the challenges of programming heterogeneous architectures by providing two skeleton-based programming libraries: HWSkel for heterogeneous multicore clusters, and GPU-HWSkel, which enables GPUs to be exploited as general purpose multi-processor devices. Both libraries provide heterogeneous data parallel algorithmic skeletons including hMap, hMapAll, hReduce, hMapReduce, and hMapReduceAll. The second contribution is the development of cost models for workload distribution. First, we construct an architectural cost model (CM1) to optimise overall processing time for HWSkel heterogeneous skeletons on a heterogeneous system composed of networks of arbitrary numbers of nodes, each with an arbitrary number of cores sharing arbitrary amounts of memory. The cost model characterises the components of the architecture by the number of cores, clock speed, and, crucially, the size of the L2 cache. Second, we extend the HWSkel cost model (CM1) to account for GPU performance. The extended cost model (CM2) is used in the GPU-HWSkel library to automatically find a good distribution for both a single heterogeneous multicore/GPU node and clusters of heterogeneous multicore/GPU nodes. Experiments are carried out on three heterogeneous multicore clusters, four heterogeneous multicore/GPU clusters, and three single heterogeneous multicore/GPU nodes. The results of experimental evaluations for four data parallel benchmarks, i.e. sumEuler, image matching, Fibonacci, and matrix multiplication, show that our combined heterogeneous skeletons and cost models can make good use of resources in heterogeneous systems. Moreover, using cores together with a GPU in the same host can deliver good performance on both single-node and multi-node architectures.
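    As a toy version of the kind of static, parametrised split such a cost model drives (the real CM1/CM2 also weight L2 cache size and GPU performance; this sketch uses only cores and clock speed, and all names are invented):

    // Illustrative sketch only: splitting n_items across heterogeneous nodes in
    // proportion to a simple capability score (cores x clock speed).
    #include <vector>
    #include <cstdio>

    struct NodeSpec { int cores; double ghz; };

    std::vector<long> split_workload(long n_items, const std::vector<NodeSpec>& nodes) {
        double total = 0;
        for (const auto& nd : nodes) total += nd.cores * nd.ghz;
        std::vector<long> share;
        long assigned = 0;
        for (const auto& nd : nodes) {
            long s = static_cast<long>(n_items * (nd.cores * nd.ghz) / total);
            share.push_back(s);
            assigned += s;
        }
        share.back() += n_items - assigned;  // rounding remainder to last node
        return share;
    }

    int main() {
        std::vector<NodeSpec> cluster{{8, 2.4}, {4, 3.0}, {16, 2.0}};
        for (long s : split_workload(1000000, cluster)) std::printf("%ld\n", s);
    }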

    Skeletons for parallel image processing: an overview of the SKiPPER project

    This paper is a general overview of the SKiPPER project, run at Blaise Pascal University between 1996 and 2002. The main goal of the SKiPPER project was to demonstrate the applicability of skeleton-based parallel programming techniques to the fast prototyping of reactive vision applications. The project produced several versions of a full-fledged integrated parallel programming environment (PPE). These PPEs have been used to implement realistic vision applications, such as road following or vehicle tracking for assisted driving, on embedded parallel platforms embarked on semi-autonomous vehicles. All versions of SKiPPER share a common front-end and repertoire of skeletons (presented in previous papers) but differ in the techniques used for implementing skeletons. This paper focuses on these implementation issues through a comparative survey, according to four criteria (efficiency, expressivity, portability, predictability), of these implementation techniques. It also gives an account of the lessons we have learned, both when dealing with these implementation issues and when using the resulting tools for prototyping vision applications.

    Pattern operators for grid

    The definition and programming of distributed applications has become a major research issue due to the increasing availability of (large scale) distributed platforms and the requirements posed by economic globalisation. However, such a task requires a huge effort due to the complexity of distributed environments: large numbers of users may communicate and share information across different authority domains; moreover, the "execution environment" or "computations" are dynamic, since the number of users and the computational infrastructure change over time. Grid environments, in particular, promise to be an answer to such complexity, by providing high performance execution support to large numbers of users and resource sharing across different organisations. Nevertheless, programming in Grid environments is still a difficult task: there is a lack of high level programming paradigms and support tools that may guide the application developer and allow reusability of state-of-the-art solutions. The main goal of the work presented in this thesis is to contribute to the simplification of the development cycle of applications for Grid environments by bringing structure and flexibility to three stages of that cycle through a common model. The stages are: the design phase, the execution phase, and the reconfiguration phase. The common model is based on the manipulation of patterns through pattern operators, and the division of both patterns and operators into two categories, namely structural and behavioural. Moreover, both structural and behavioural patterns are first class entities at each of the aforesaid stages. At the design phase, patterns can be manipulated like other first class entities such as components; this allows a more structured way to build applications by reusing and composing state-of-the-art patterns. At the execution phase, patterns are units of execution control: it is possible, for example, to start, stop, or resume the execution of a pattern as a single entity. At the reconfiguration phase, patterns can also be manipulated as single entities, with the additional advantage that it is possible to perform a structural reconfiguration while keeping some of the behavioural constraints, and vice versa: for example, it is possible to replace a behavioural pattern, which was applied to some structural pattern, with another behavioural pattern. Besides proposing this methodology for distributed application development, the thesis defines a relevant set of pattern operators. The methodology and the expressivity of the pattern operators were assessed through the development of several representative distributed applications. To support this validation, a prototype was designed and implemented, encompassing some relevant patterns and a significant part of the pattern operators defined. The prototype was based on the Triana environment; Triana supports the development and deployment of distributed applications on the Grid through a dataflow-based programming model. Additionally, the thesis presents an analysis of a mapping of some operators for execution control onto the Distributed Resource Management Application API (DRMAA).
    This assessment confirmed the suitability of the proposed model, as well as the generality and flexibility of the defined pattern operators.
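    As a hypothetical sketch of the idea that patterns are first class units of execution control (the class and method names below are invented for illustration, not the thesis's API):

    // Illustrative sketch only: a structural pattern composing sub-patterns,
    // with the whole composition started/stopped/resumed as a single entity.
    #include <memory>
    #include <vector>
    #include <cstdio>

    struct Pattern {
        virtual void start() = 0;
        virtual void stop() = 0;
        virtual void resume() = 0;
        virtual ~Pattern() = default;
    };

    struct Task : Pattern {  // a leaf behavioural pattern
        const char* name;
        explicit Task(const char* n) : name(n) {}
        void start()  override { std::printf("start %s\n", name); }
        void stop()   override { std::printf("stop %s\n", name); }
        void resume() override { std::printf("resume %s\n", name); }
    };

    struct Pipeline : Pattern {  // a structural pattern
        std::vector<std::unique_ptr<Pattern>> stages;
        void start()  override { for (auto& s : stages) s->start(); }
        void stop()   override { for (auto& s : stages) s->stop(); }
        void resume() override { for (auto& s : stages) s->resume(); }
    };

    int main() {
        Pipeline p;
        p.stages.push_back(std::make_unique<Task>("filter"));
        p.stages.push_back(std::make_unique<Task>("render"));
        p.start();  // the whole pipeline is controlled as one entity
        p.stop();
    }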

    Categorization And Visualization Of Parallel Programming Systems

    Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2005.
    Parallel computing, also called high-performance computing, refers to solving problems faster by using multiple processors simultaneously. Nowadays, almost every computationally intensive problem imaginable is being attempted in parallel, for example the simulation of river flows, physics and chemistry problems, and astronomical simulations. This thesis discusses high-performance software for scientific or engineering applications. The term parallel programming systems here means libraries, languages, compiler directives, or other means through which a programmer can express a parallel algorithm. To design high performance programs, there are two keys for the programmer: to understand the problem and find a solution for parallelisation, and to decide on the right system for the implementation, which requires good knowledge of existing parallel programming systems. The programmer, after having understood the problem, has to choose between many systems, some of which are closely related, whereas others differ substantially. This thesis describes and classifies existing parallel programming systems, thus bringing existing surveys up to date, with particular attention to algorithmic skeletons and functional parallel programming. It describes a wiki-based web portal, developed as part of the thesis, for collecting information about the most recent systems. A special syntax and a visualization tool, built on the webdot graph-drawing tool, have been developed; the syntax is easy to learn and use, and the tool allows users to have their own categorization scheme. Finally, the thesis compares the two major programming styles, message passing and shared memory, using two different algorithms in order to show the performance differences between these styles. The algorithms are implemented in OpenMP and MPI, and the performance of both programs is measured on the SMP cluster of Aachen University, Germany, and on the Beowulf cluster of Ulakbim, Ankara.
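    As a hedged illustration of the two styles compared here (a generic example, not code from the thesis), the sketch below writes a global reduction in the shared memory style with OpenMP; a message passing version would compute per-process partial sums and combine them with MPI_Reduce, as in the map-skeleton sketch earlier in this listing.

    // Illustrative sketch only: shared memory reduction with OpenMP. One
    // directive parallelises the sequential loop within a single address space.
    #include <omp.h>
    #include <cstdio>

    int main() {
        const long n = 100000000;
        double sum = 0.0;
        #pragma omp parallel for reduction(+ : sum)
        for (long i = 1; i <= n; ++i) sum += 1.0 / i;
        std::printf("harmonic(%ld) = %f\n", n, sum);
    }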

    StochKit-FF: Efficient Systems Biology on Multicore Architectures

    The stochastic modelling of biological systems is an informative and, in some cases, very adequate technique, which may however turn out to be more expensive than other modelling approaches, such as differential equations. We present StochKit-FF, a parallel version of StochKit, a reference toolkit for stochastic simulations. StochKit-FF is based on the FastFlow programming toolkit for multicores and exploits the novel concept of selective memory. We experiment with StochKit-FF on a model of HIV infection dynamics, with the aim of extracting information from efficiently run experiments, here in terms of average and variance and, in the longer term, of more structured data. Comment: 14 pages + cover page.
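    The shape of the computation can be sketched as follows: independent stochastic replicas run in parallel and are reduced online to summary statistics rather than stored as whole trajectories, which is the flavour of selective memory. StochKit-FF itself builds on FastFlow's skeletons; this hedged sketch uses plain std::thread, and run_replica is an invented stand-in for one stochastic simulation run.

    // Illustrative sketch only: k parallel replicas reduced to mean/variance.
    #include <cstdio>
    #include <random>
    #include <thread>
    #include <vector>

    double run_replica(unsigned seed) {           // stand-in for one SSA run:
        std::mt19937 gen(seed);                   // returns a final population
        std::poisson_distribution<int> pop(100);
        return pop(gen);
    }

    int main() {
        const int k = 8;
        std::vector<double> result(k);
        std::vector<std::thread> pool;
        for (int i = 0; i < k; ++i)
            pool.emplace_back([&result, i] { result[i] = run_replica(42 + i); });
        for (auto& t : pool) t.join();

        double mean = 0, m2 = 0;                  // Welford's online update
        for (int i = 0; i < k; ++i) {
            double d = result[i] - mean;
            mean += d / (i + 1);
            m2 += d * (result[i] - mean);
        }
        std::printf("mean=%.2f variance=%.2f\n", mean, m2 / (k - 1));
    }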