87 research outputs found

    Parallel and Distributed Machine Learning Algorithms for Scalable Big Data Analytics

    This editorial is for the Special Issue of the journal Future Generation Computing Systems, consisting of selected papers from the 6th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics (ParLearning 2017). In this editorial, we give a high-level overview of the 4 papers contained in this special issue, along with references to some related work.

    Doctor of Philosophy

    Stochastic methods, dense free-form mapping, atlas construction, and total variation are examples of advanced image processing techniques which are robust but computationally demanding. These algorithms often require a large amount of computational power as well as massive memory bandwidth. These requirements used to be fulfilled only by supercomputers. The development of heterogeneous parallel subsystems and computation-specialized devices such as Graphic Processing Units (GPUs) has brought the requisite power to commodity hardware, opening up opportunities for scientists to experiment and evaluate the influence of these techniques on their research and practical applications. However, harnessing the processing power from modern hardware is challenging. The differences between multicore parallel processing systems and conventional models are significant, often requiring algorithms and data structures to be redesigned significantly for efficiency. It also demands in-depth knowledge about modern hardware architectures to optimize these implementations, sometimes on a per-architecture basis. The goal of this dissertation is to introduce a solution for this problem based on a 3D image processing framework, using high performance APIs at the core level to utilize parallel processing power of the GPUs. The design of the framework facilitates an efficient application development process, which does not require scientists to have extensive knowledge about GPU systems, and encourages them to harness this power to solve their computationally challenging problems.
To present the development of this framework, four main problems are described, and the solutions are discussed and evaluated: (1) essential components of a general 3D image processing library: data structures and algorithms, as well as how to implement these building blocks on the GPU architecture for optimal performance; (2) an implementation of unbiased atlas construction algorithms, as an illustration of how to solve a highly complex and computationally expensive algorithm using this framework; (3) an extension of the framework to account for geometry descriptors to solve registration challenges with large scale shape changes and high intensity-contrast differences; and (4) an out-of-core streaming model, which enables developers to implement multi-image processing techniques on commodity hardware.

    Advancing the Multi-Solver Paradigm for Overset CFD Toward Heterogeneous Architectures

    A multi-solver, overset, computational fluid dynamics framework is developed for efficient, large-scale simulation of rotorcraft problems. Two primary features distinguish the developed framework from the current state of the art. First, the framework is designed for heterogeneous compute architectures, making use of both traditional codes run on the Central Processing Unit (CPU) as well as codes run on the Graphics Processing Unit (GPU). Second, a framework-level implementation of the Generalized Minimal Residual linear solver is used to consider all meshes from all solvers in a single linear system. The developed GPU flow solver and framework are validated against conventional implementations, achieving a 5.35× speedup for a single GPU compared to 24 CPU cores. Similarly, the overset linear solver is compared to traditional techniques, demonstrating that the same convergence order can be achieved using as few as half the number of iterations. Applications of the developed methods are organized into two chapters. First, the heterogeneous, overset framework is applied to a notional helicopter configuration based on the ROBIN wind tunnel experiments. A tail rotor and hub are added to create a challenging case representative of a realistic, full-rotorcraft simulation. Interactional aerodynamics between the different components are reviewed in detail. The second application chapter focuses on performance of the overset linear solver for unsteady applications. The GPU solver is used along with an unstructured code to simulate laminar flow over a sphere as well as laminar coaxial rotors designed for a Mars helicopter. In all results, the overset linear solver outperforms the traditional, decoupled approach. Conclusions drawn from both the full-rotorcraft and overset linear solver simulations can have a significant impact on improving modeling of complex rotorcraft aerodynamics.

    Computational Methods in Science and Engineering : Proceedings of the Workshop SimLabs@KIT, November 29 - 30, 2010, Karlsruhe, Germany

    In this proceedings volume we provide a compilation of article contributions equally covering applications from different research fields and ranging from capacity up to capability computing. Besides classical computing aspects such as parallelization, the focus of these proceedings is on multi-scale approaches and methods for tackling algorithm and data complexity. Practical aspects regarding the usage of the HPC infrastructure and the tools and software available at the SCC are also presented.