188 research outputs found

    PiCo: A Domain-Specific Language for Data Analytics Pipelines

    Get PDF
    In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models—for which only informal (and often confusing) semantics is generally provided—all share a common under- lying model, namely, the Dataflow model. Using this model as a starting point, it is possible to categorize and analyze almost all aspects about Big Data analytics tools from a high level perspective. This analysis can be considered as a first step toward a formal model to be exploited in the design of a (new) framework for Big Data analytics. By putting clear separations between all levels of abstraction (i.e., from the runtime to the user API), it is easier for a programmer or software designer to avoid mixing low level with high level aspects, as we are often used to see in state-of-the-art Big Data analytics frameworks. From the user-level perspective, we think that a clearer and simple semantics is preferable, together with a strong separation of concerns. For this reason, we use the Dataflow model as a starting point to build a programming environment with a simplified programming model implemented as a Domain-Specific Language, that is on top of a stack of layers that build a prototypical framework for Big Data analytics. The contribution of this thesis is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm, Google Dataflow), thus making it easier to understand high-level data-processing applications written in such frameworks. As result of this analysis, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level. Second, we propose a programming environment based on such layered model in the form of a Domain-Specific Language (DSL) for processing data collections, called PiCo (Pipeline Composition). The main entity of this programming model is the Pipeline, basically a DAG-composition of processing elements. This model is intended to give the user an unique interface for both stream and batch processing, hiding completely data management and focusing only on operations, which are represented by Pipeline stages. Our DSL will be built on top of the FastFlow library, exploiting both shared and distributed parallelism, and implemented in C++11/14 with the aim of porting C++ into the Big Data world

    Design of testbed and emulation tools

    Get PDF
    The research summarized was concerned with the design of testbed and emulation tools suitable to assist in projecting, with reasonable accuracy, the expected performance of highly concurrent computing systems on large, complete applications. Such testbed and emulation tools are intended for the eventual use of those exploring new concurrent system architectures and organizations, either as users or as designers of such systems. While a range of alternatives was considered, a software based set of hierarchical tools was chosen to provide maximum flexibility, to ease in moving to new computers as technology improves and to take advantage of the inherent reliability and availability of commercially available computing systems

    Model-Based Design for High-Performance Signal Processing Applications

    Get PDF
    Developing high-performance signal processing applications requires not only effective signal processing algorithms but also efficient software design methods that can take full advantage of the available processing resources. An increasingly important type of hardware platform for high-performance signal processing is a multicore central processing unit (CPU) combined with a graphics processing unit (GPU) accelerator. Efficiently coordinating computations on both the host (CPU) and device (GPU), and managing host-device data transfers are critical to utilizing CPU-GPU platforms effectively. However, such coordination is challenging for system designers, given the complexity of modern signal processing applications and the stringent constraints under which they must operate. Dataflow models of computation provide a useful framework for addressing this challenge. In such a modeling approach, signal processing applications are represented as directed graphs that can be viewed intuitively as high-level signal flow diagrams. The formal, high-level abstraction provided by dataflow principles provides a useful foundation to investigate model-based analysis and optimization for new challenges in design and implementation of signal processing systems. This thesis presents a new model-based design methodology and an evolution of three novel design tools. These contributions provide an automated design flow for high performance signal processing. The design flow takes high-level dataflow representations as input and systematically derives optimized implementations on CPU-GPU platforms. The proposed design flow and associated design methodology are inspired by a previously-developed application programming interface (API) called the Hybrid Task Graph Scheduler (HTGS). HTGS was developed for implementing scalable workflows for high-performance computing applications on compute nodes that have large numbers of processing cores, and that may be equipped with multiple GPUs. However, HTGS has a limitation due to its relatively loose use of dataflow techniques (or other forms of model-based design), which results in a significant designer effort being required to apply the provided APIs effectively. The main contributions of the thesis are summarized as follows: (1) Development of a companion tool to HTGS that is called the HTGS Model-based Engine (HMBE). HMBE introduces novel capabilities to automatically analyze application dataflow graphs and generate efficient schedules for these graphs through hybrid compile-time and runtime analysis. The systematic, model-based approaches provided by HMBE enable the automation of complex tasks that must be performed manually when using HTGS alone. We have demonstrated the effectiveness of HMBE and the associated model-based design methodology through extensive experiments involving two case studies: an image stitching application for large scale microscopy images, and a background subtraction application for multispectral video streams. (2) Integration of HMBE with HTGS to develop a new design tool for the design and implementation of high-performance signal processing systems. This tool, called HMBE-Integrated-HTGS (HI-HTGS), provides novel capabilities for model-based system design, memory management, and scheduling targeted to multicore platforms. HMBE takes as input a single- or multi-dimensional dataflow model of the given signal processing application. The tool then expands the dataflow model into an expanded representation that exposes more parallelism and provides significantly more detail on the interactions between different application tasks (dataflow actors). This expanded representation is derived by HI-HTGS at compile-time and provided as input to the HI-HTGS runtime system. The runtime system in turn applies the expanded representation to guide dynamic scheduling decisions throughout system execution. (3) Extension of HMBE to the class of CPU-GPU platforms motivated above. We call this new model-based design tool the CPU-GPU Model-Based Engine (CGMBE). CGMBE uses an unfolded dataflow graph representation of the application along with thread-pool-based executors, which are optimized for efficient operation on the targeted CPU-GPU platform. This approach automates complex aspects of the design and implementation process for signal processing system designers while maximizing the utilization of computational power, reducing the memory footprint for both the CPU and GPU, and facilitating experimentation for tuning performance-oriented designs

    A graphical system for parallel software development

    Get PDF
    PhD ThesisParallel architectures have become popular in the race to meet an increasing demand for computational power. While the benefits of parallel computing - the performance improvements due to simultaneous computations - are clear, achieving these benefits has proved difficult. The wide variety of parallel architectures has led to a similarly diverse range of parallel languages and methods for parallel programming, many of which feature complicated architecture-specific language mechanisms. The lack of good tools to assist in the development of parallel software has compounded the problem of parallel programming being limited to a field which is both specialist and fragmented. This thesis investigates techniques for the graphical specification of parallel programs, using an architecture-independent graph-based notation representing the design of the program, combined with conventional sequential languages. Automatic code generation is used to translate the graph program into executable code suitable for different parallel architectures. To overcome the differing performance characteristics of parallel architectures, methods for the graphical adjustment of granularity are proposed and investigated, and an encompassing parallel design environment is presented.Engineering and Physical Sciences Research Council

    Memory-Aware Functional IR for Higher-Level Synthesis of Accelerators

    Get PDF

    Systematic Design Space Exploration of Dynamic Dataflow Programs for Multi-core Platforms

    Get PDF
    The limitations of clock frequency and power dissipation of deep sub-micron CMOS technology have led to the development of massively parallel computing platforms. They consist of dozens or hundreds of processing units and offer a high degree of parallelism. Taking advantage of that parallelism and transforming it into high program performances requires the usage of appropriate parallel programming models and paradigms. Currently, a common practice is to develop parallel applications using methods evolving directly from sequential programming models. However, they lack the abstractions to properly express the concurrency of the processes. An alternative approach is to implement dataflow applications, where the algorithms are described in terms of streams and operators thus their parallelism is directly exposed. Since algorithms are described in an abstract way, they can be easily ported to different types of platforms. Several dataflow models of computation (MoCs) have been formalized so far. They differ in terms of their expressiveness (ability to handle dynamic behavior) and complexity of analysis. So far, most of the research efforts have focused on the simpler cases of static dataflow MoCs, where many analyses are possible at compile-time and several optimization problems are greatly simplified. At the same time, for the most expressive and the most difficult to analyze dynamic dataflow (DDF), there is still a dearth of tools supporting a systematic and automated analysis minimizing the programming efforts of the designer. The objective of this Thesis is to provide a complete framework to analyze, evaluate and refactor DDF applications expressed using the RVC-CAL language. The methodology relies on a systematic design space exploration (DSE) examining different design alternatives in order to optimize the chosen objective function while satisfying the constraints. The research contributions start from a rigorous DSE problem formulation. This provides a basis for the definition of a complete and novel analysis methodology enabling systematic performance improvements of DDF applications. Different stages of the methodology include exploration heuristics, performance estimation and identification of refactoring directions. All of the stages are implemented as appropriate software tools. The contributions are substantiated by several experiments performed with complex dynamic applications on different types of physical platforms

    Design methodology for embedded computer vision systems

    Get PDF
    Computer vision has emerged as one of the most popular domains of embedded appli¬cations. Though various new powerful embedded platforms to support such applica¬tions have emerged in recent years, there is a distinct lack of efficient domain-specific synthesis techniques for optimized implementation of such systems. In this thesis, four different aspects that contribute to efficient design and synthesis of such systems are explored: (1) Graph Transformations: Dataflow modeling is widely used in digital signal processing (DSP) systems. However, support for dynamic behavior in such systems exists mainly at the modeling level and there is a lack of optimized synthesis tech¬niques for these models. New transformation techniques for efficient system-on-chip (SoC) design methods are proposed and implemented for cyclo-static dataflow and its parameterized version (parameterized cyclo-static dataflow) -- two powerful models that allow dynamic reconfigurability and phased behavior in DSP systems. (2) Design Space Exploration: The broad range of target platforms along with the complexity of applications provides a vast design space, calling for efficient tools to explore this space and produce effective design choices. A novel architectural level design methodology based on a formalism called multirate synchronization graphs is presented along with methods for performance evaluation. (3) Multiprocessor Communication Interface: Efficient code synthesis for emerg¬ing new parallel architectures is an important and sparsely-explored problem. A widely-encountered problem in this regard is efficient communication between pro¬cessors running different sub-systems. A widely used tool in the domain of general-purpose multiprocessor clusters is MPI (Message Passing Interface). However, this does not scale well for embedded DSP systems. A new, powerful and highly optimized communication interface for multiprocessor signal processing systems is presented in this work that is based on the integration of relevant properties of MPI with dataflow semantics. (4) Parameterized Design Framework for Particle Filters: Particle filter systems constitute an important class of applications used in a wide number of fields. An effi¬cient design and implementation framework for such systems has been implemented based on the observation that a large number of such applications exhibit similar prop¬erties. The key properties of such applications are identified and parameterized appro¬priately to realize different systems that represent useful trade-off points in the space of possible implementations
    corecore