32 research outputs found

    Parallel Image Processing Using a Pure Topological Framework

    Image processing is a fundamental operation in many real-time applications, and one from which a great deal of parallelism can be extracted. Segmenting the image into its connected components is one of the best-known operations, but there are many others, such as extracting the region adjacency graph (RAG) of these regions or searching for feature points that are invariant to rotation, scale, brightness changes, etc. Most of these algorithms start from tracing-type approaches or scan/raster methods. This necessarily implies a data dependence between the processing of one pixel and the previous one, which prevents a purely parallel approach. In terms of time complexity, this means that the linear order O(N) (N being the number of pixels) cannot be reduced. In this paper, we describe a novel approach based on building a pure topological framework, which allows fully parallel algorithms to be implemented. Concerning topological analysis, a first stage is computed in parallel for every pixel, capturing the local neighbourhood conditions. These are then extended, in a second parallel stage, to the necessary global relations (e.g. joining all the pixels of a connected component). This combinatorial optimization process can be seen as compressing the whole image into a single pixel. Using this final representation, every region can be related to the rest, which yields purely topological constructions of other image operations. Moreover, complex data structures can be avoided: all the processing can be done using matrices (with the same indexation as the original image) and element-wise operations. The time complexity of our topological approach for an m×n-pixel image is near O(log(m+n)), under the assumption that a processing element exists for each pixel. Results on a multicore processor show very good scalability until the memory bandwidth bottleneck is reached, both for larger images and for highly optimized implementations. The inherent parallelism of our approach suggests that even better results will be obtained on other, less classical computing architectures. Funding: Ministerio de Economía y Competitividad (España) TEC2012-37868-C04-02; AEI/FEDER (UE) MTM2016-81030-P; VPPI of the University of Seville
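
    As a flavour of the two-stage idea, below is a minimal NumPy sketch of fully data-parallel connected-component labelling using only element-wise matrix operations with the same indexation as the image. It is an illustration under stated assumptions, not the paper's own algorithm: plain min-label propagation converges in O(diameter) sweeps, whereas the paper's compression scheme reaches near O(log(m+n)).

```python
import numpy as np

def parallel_ccl(mask: np.ndarray, max_iters: int = 100_000) -> np.ndarray:
    """Label the 4-connected foreground components of a boolean m-by-n mask."""
    m, n = mask.shape
    big = np.iinfo(np.int64).max
    # Stage 1 (local, per pixel): every pixel starts with a unique label --
    # its own linear index in the image.
    labels = np.arange(m * n, dtype=np.int64).reshape(m, n)
    # Stage 2 (global): element-wise min-propagation over the 4-neighbourhood.
    # Each sweep touches all pixels independently, so every iteration is
    # embarrassingly parallel; plain propagation needs O(diameter) sweeps.
    for _ in range(max_iters):
        padded = np.pad(np.where(mask, labels, big), 1, constant_values=big)
        neigh = np.minimum.reduce([
            padded[:-2, 1:-1], padded[2:, 1:-1],   # up, down
            padded[1:-1, :-2], padded[1:-1, 2:],   # left, right
        ])
        new = np.where(mask, np.minimum(labels, neigh), labels)
        if np.array_equal(new, labels):
            break
        labels = new
    return np.where(mask, labels, -1)  # background marked -1

# tiny demo: two components in a 3x3 mask
demo = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]], dtype=bool)
print(parallel_ccl(demo))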

    GPU-Accelerated Computation of Vietoris-Rips Persistence Barcodes

    The computation of Vietoris-Rips persistence barcodes is both execution-intensive and memory-intensive. In this paper, we study the computational structure of Vietoris-Rips persistence barcodes, and identify several unique mathematical properties and algorithmic opportunities with connections to the GPU. Mathematically and empirically, we look into the properties of apparent pairs, which are independently identifiable persistence pairs comprising up to 99% of persistence pairs. We give theoretical upper and lower bounds on the apparent pair rate and model the average case. We also design massively parallel algorithms to take advantage of the very large number of simplices that can be processed independently of each other. Having identified these opportunities, we develop GPU-accelerated software for computing Vietoris-Rips persistence barcodes, called Ripser++. The software achieves up to 30x speedup over the total execution time of the original Ripser and also reduces CPU memory usage by up to 2.0x. We believe our GPU-acceleration efforts open a new chapter for the advancement of topological data analysis in the post-Moore's-Law era.
    Comment: 36 pages, 15 figures. To be published in the Symposium on Computational Geometry 2020
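
    For readers who want to experiment, here is a hedged usage sketch with the widely used CPU package ripser.py (scikit-tda); Ripser++ is the GPU-accelerated analogue described above, and its own bindings may expose a different API.

```python
# Computing Vietoris-Rips persistence barcodes for a noisy circle with the
# CPU ripser.py package. The data and parameters are illustrative only.
import numpy as np
from ripser import ripser  # pip install ripser

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
points = np.c_[np.cos(theta), np.sin(theta)] + rng.normal(0, 0.05, (200, 2))

# maxdim=1 computes H0 (components) and H1 (loops); a circle should yield
# one long-lived H1 bar in the barcode.
dgms = ripser(points, maxdim=1)['dgms']
print("H0 pairs:", len(dgms[0]), "H1 pairs:", len(dgms[1]))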

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding on computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

    Tuning the Computational Effort: An Adaptive Accuracy-aware Approach Across System Layers

    This thesis introduces a novel methodology for realizing accuracy-aware systems, helping designers integrate accuracy awareness into their designs. It proposes an adaptive accuracy-aware approach across system layers that addresses current challenges in the domain by combining and tuning accuracy-aware methods on different system layers. To widen the scope of accuracy-aware computing, including approximate computing, to other domains, this thesis presents innovative accuracy-aware methods and techniques for different system layers. The required tuning of these methods is integrated into a configuration layer that adjusts the available knobs of the accuracy-aware methods integrated into a system.
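
    As a toy illustration of such a configuration layer (all names below are hypothetical, not the thesis' actual interfaces), this sketch picks the cheapest setting of one approximation knob that still meets a user-supplied accuracy requirement:

```python
# A hypothetical configuration-layer sketch: tune a single approximation knob
# until the accuracy requirement is met at minimal computational effort.
def tune_knob(run, accuracy_of, target, knob_levels):
    """Pick the cheapest knob level whose output still meets `target`.

    run(level)       -- executes the workload at a given approximation level
    accuracy_of(out) -- scores the result against a reference
    knob_levels      -- candidate levels, ordered cheapest to costliest
    """
    best = None
    for level in knob_levels:
        best = (level, run(level))
        if accuracy_of(best[1]) >= target:
            break  # cheapest level meeting the requirement
    return best

# usage: approximate pi by truncating the Leibniz series at `level` terms
approx_pi = lambda level: 4 * sum((-1) ** k / (2 * k + 1) for k in range(level))
level, out = tune_knob(approx_pi, lambda x: -abs(x - 3.14159265), -1e-3,
                       [10, 100, 1_000, 10_000])
print(level, out)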

    Proceedings, MSVSCC 2015

    The Virginia Modeling, Analysis and Simulation Center (VMASC) of Old Dominion University hosted the 2015 Modeling, Simulation & Visualization Student Capstone Conference on April 16th. The Capstone Conference features students from undergraduate and graduate degree programs in Modeling and Simulation and related fields at many colleges and universities. Students present their research to an audience of fellow students, faculty, judges, and other distinguished guests. For the students, these presentations afford the opportunity to impart their innovative research to members of the M&S community from academic, industry, and government backgrounds. Also participating in the conference are faculty and judges who have volunteered their time to directly support their students' research, facilitate the various conference tracks, serve as judges for each track, and provide overall assistance to the conference. 2015 marks the ninth year of the VMASC Capstone Conference for Modeling, Simulation and Visualization. This year's conference attracted a number of fine student-written papers and presentations, resulting in a total of 51 research works presented. The conference had record attendance thanks to support from the various departments at Old Dominion University, other local universities, and the United States Military Academy at West Point. We greatly appreciate all of the work and energy that went into this year's conference; it was truly a highly collaborative effort that resulted in a very successful symposium for the M&S community and all involved. Below you will find a brief summary of the best papers and presentations, with some simple statistics on the overall conference contributions, followed by a table of contents broken down by conference track category, with a copy of each included body of work. Thank you again for your time and your contribution, as this conference is designed to continuously evolve and adapt to better suit its authors and M&S supporters. Dr. Yuzhong Shen, Graduate Program Director, MSVE, Capstone Conference Chair; John Shull, Graduate Student, MSVE, Capstone Conference Student Chair

    Accelerating Genomic Sequence Alignment using High Performance Reconfigurable Computers

    Reconfigurable computing technology has progressed to a stage where it is now possible to achieve orders-of-magnitude performance and power-efficiency gains over conventional computer architectures for a subset of high performance computing applications. In this thesis, we investigate the potential of reconfigurable computers to accelerate genomic sequence alignment, specifically for genome sequencing applications. We present a highly optimized implementation of a parallel sequence alignment algorithm for the Berkeley Emulation Engine (BEE2) reconfigurable computer, allowing a single BEE2 to align hundreds of sequences simultaneously. For each reconfigurable processor (FPGA), we demonstrate a 61x speedup versus a state-of-the-art implementation on a modern conventional CPU core, and a 56x improvement in performance per watt. We also show that our implementation is highly scalable, and we provide performance results from a cluster implementation using 32 FPGAs. We conclude that reconfigurable computers provide an excellent platform on which to run sequence alignment, and that clusters of reconfigurable computers will be able to cope far more easily with the vast quantities of data produced by new ultra-high-throughput sequencers.
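
    The abstract does not spell out the algorithm, but FPGA aligners of this kind typically implement Smith-Waterman-style dynamic programming, where every cell on an anti-diagonal is independent of the others and can be computed by its own processing element. Below is a small NumPy sketch of that wavefront parallelism; the scoring parameters are illustrative assumptions, not the thesis' actual values.

```python
# In Smith-Waterman dynamic programming, all cells on one anti-diagonal depend
# only on the previous two diagonals, so each diagonal can be filled in
# parallel -- the pattern a systolic FPGA array exploits, sketched here with
# vectorised NumPy operations.
import numpy as np

def smith_waterman(a: str, b: str, match=2, mismatch=-1, gap=-1) -> int:
    H = np.zeros((len(a) + 1, len(b) + 1), dtype=np.int64)
    sa = np.frombuffer(a.encode(), np.uint8)
    sb = np.frombuffer(b.encode(), np.uint8)
    for d in range(2, len(a) + len(b) + 1):          # sweep anti-diagonals
        i = np.arange(max(1, d - len(b)), min(len(a), d - 1) + 1)
        j = d - i
        sub = np.where(sa[i - 1] == sb[j - 1], match, mismatch)
        H[i, j] = np.maximum.reduce([
            np.zeros_like(i),                        # local alignment floor
            H[i - 1, j - 1] + sub,                   # match / mismatch
            H[i - 1, j] + gap, H[i, j - 1] + gap,    # gaps
        ])
    return int(H.max())

print(smith_waterman("GATTACA", "GCATGCU"))  # small demo alignment score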

    High-Performance and Power-Aware Graph Processing on GPUs

    Graphs are a common representation in many problem domains, including engineering, finance, medicine, and scientific applications. Many problems map to very large graphs, often involving millions of vertices. Even though very efficient sequential implementations of graph algorithms exist, they become impractical when applied to such very large graphs. On the other hand, graphics processing units (GPUs) have become widespread architectures, as they provide massive parallelism at low cost. Parallel execution on GPUs may achieve speedups of up to three orders of magnitude with respect to the sequential counterparts. Nevertheless, accelerating efficient and optimized sequential algorithms and porting (i.e., parallelizing) their implementations to such many-core architectures is a very challenging task. The task is made even harder since energy and power consumption are becoming constraints in addition to, or in some cases as an alternative to, performance. This work aims at developing a platform that provides (I) a library of parallel, efficient, and tunable implementations of the most important graph algorithms for GPUs, and (II) an advanced profiling model to analyze both the performance and the power consumption of the algorithm implementations. The platform's goal is twofold: through the library, it aims at saving development effort in the parallelization task through a primitive-based approach; through the profiling framework, it aims at customizing such primitives by considering both the architectural details and the target efficiency metric (i.e., performance or power).
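
    As an illustration of the primitive-based style such libraries expose (the names below are assumptions, not the platform's real API), here is a level-synchronous BFS built from two data-parallel primitives, with NumPy standing in for the GPU:

```python
# A sketch of GPU graph primitives: "advance" expands the frontier along all
# outgoing edges in bulk, "filter" drops already-visited vertices. Real GPU
# libraries run both steps as massively parallel kernels over a CSR graph.
import numpy as np

def bfs(indptr, indices, source, n):
    """BFS over a CSR graph; returns the depth of every vertex (-1 = unreached)."""
    depth = np.full(n, -1, np.int64)
    depth[source] = 0
    frontier = np.array([source])
    level = 0
    while frontier.size:
        # advance: gather all neighbours of the current frontier
        neigh = np.concatenate([indices[indptr[v]:indptr[v + 1]] for v in frontier])
        # filter: keep unvisited vertices only, removing duplicates
        frontier = np.unique(neigh[depth[neigh] == -1])
        level += 1
        depth[frontier] = level
    return depth

# tiny demo: path graph 0-1-2-3 in CSR form
indptr = np.array([0, 1, 3, 5, 6]); indices = np.array([1, 0, 2, 1, 3, 2])
print(bfs(indptr, indices, 0, 4))  # -> [0 1 2 3]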

    Methods and Tools for Using Reconfigurable Accelerators in Multicore Systems (Methoden und Werkzeuge zum Einsatz von rekonfigurierbaren Akzeleratoren in Mehrkernsystemen)

    Computing systems with multicore processors are frequently extended with a reconfigurable accelerator such as an FPGA. Offloading parts of an application into hardware is usually carried out by specialists. So that users can program reconfigurable hardware themselves, my contribution is component-based programming and usage with automatic consideration of data locality. In this way, the accelerators can deliver benefits even for data-intensive applications.
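
    A hypothetical sketch of the component idea (all names below are invented, not the thesis' actual interfaces): a wrapper that dispatches to the FPGA implementation only when the operands already reside in accelerator memory, so data locality is respected without user intervention:

```python
# Component-based dispatch on data locality: avoid host<->FPGA copies by
# running on the accelerator only when all operand buffers are already there.
class Component:
    def __init__(self, cpu_impl, fpga_impl):
        self.cpu_impl, self.fpga_impl = cpu_impl, fpga_impl

    def __call__(self, *buffers):
        # Automatic locality check: every buffer must reside in FPGA memory.
        if buffers and all(getattr(b, "on_fpga", False) for b in buffers):
            return self.fpga_impl(*buffers)
        return self.cpu_impl(*buffers)  # fall back to the host otherwise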