
    Evaluation of low-power architectures in a scientific computing environment

    HPC (High Performance Computing) represents, together with theory and experiments, the third pillar of science. Through HPC, scientists can simulate phenomena otherwise impossible to study. The need to perform larger and more accurate simulations requires HPC to improve continuously, and the field is constantly looking for new computational platforms that can improve cost and power efficiency. The Mont-Blanc project is an EU-funded research project that studies new hardware and software solutions to improve the efficiency of HPC systems. The vision of the project is to leverage the fast-growing market of mobile devices to develop the next generation of supercomputers. In this work we contribute to the objectives of the Mont-Blanc project by evaluating the performance of production scientific applications on innovative low-power architectures. To do so, we describe our experience porting and evaluating state-of-the-art scientific applications on the Mont-Blanc prototype, the first HPC system built with commodity low-power embedded technology. We then extend our study to compare off-the-shelf ARMv8 platforms. Finally, we discuss the most significant issues encountered during the development of the Mont-Blanc prototype system.

    Grid and high performance computing applied to bioinformatics

    Recent advances in genome sequencing technologies and in the biological data analysis techniques used in bioinformatics have led to a fast and continuous increase in biological data. The difficulty of managing the huge amounts of data currently available to researchers, and the need to obtain results within a reasonable time, have led to the use of distributed and parallel computing infrastructures for their analysis. In this context Grid computing has been used successfully. Grid computing is based on a distributed system which interconnects several computers and/or clusters to access global-scale resources. This infrastructure is flexible, highly scalable, and can achieve high performance with data- and compute-intensive algorithms. More recently, bioinformatics has been exploring new approaches based on hardware accelerators such as Graphics Processing Units (GPUs). Initially developed as graphics cards, GPUs have been introduced for scientific purposes because of their performance per watt and the better cost/performance ratio they achieve, in terms of throughput and response time, compared to other high-performance computing solutions. Although developers must have an in-depth knowledge of GPU programming and hardware to be effective, GPU accelerators have produced many impressive results. The use of high-performance computing infrastructures raises the question of how to parallelize algorithms, while limiting data-dependency issues, in order to accelerate computation on massively parallel hardware. In this context, the research activity in this dissertation focused on assessing and testing the impact of these innovative high-performance computing technologies on computational biology. In order to achieve high levels of parallelism and, ultimately, high performance, several bioinformatics algorithms applicable to genome data analysis were selected, analyzed, and implemented. These algorithms have been highly parallelized and optimized to make full use of the GPU hardware resources. The overall results show that the proposed parallel algorithms perform well, justifying the use of this technology. In addition, a software infrastructure for workflow management has been devised to support CPU and GPU computation on a distributed GPU-based infrastructure. This infrastructure also enables a further, coarse-grained data-parallel decomposition across multiple GPUs; results show that the speed-up of the proposed application increases with the number of GPUs.
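    The coarse-grained data-parallel pattern described above can be illustrated with a minimal sketch: the input read set is split into one chunk per GPU, and each chunk is handed to a worker process. Everything here is an illustrative assumption; in particular, run_gpu_kernel is a hypothetical stand-in for the dissertation's CUDA-accelerated analysis step.

        # Minimal sketch of coarse-grained data parallelism over several GPUs.
        # `run_gpu_kernel` is a hypothetical placeholder; a real worker would
        # select its device (e.g. via CUDA_VISIBLE_DEVICES) and launch the
        # actual kernel on its chunk of reads.
        from multiprocessing import Pool

        N_GPUS = 4  # assumption: one worker process per available GPU

        def run_gpu_kernel(args):
            gpu_id, reads = args
            return [len(r) for r in reads]  # placeholder per-read result

        def analyze(reads):
            # Round-robin split of the read set into one chunk per GPU.
            chunks = [(g, reads[g::N_GPUS]) for g in range(N_GPUS)]
            with Pool(N_GPUS) as pool:
                partial = pool.map(run_gpu_kernel, chunks)
            return [r for chunk in partial for r in chunk]

        if __name__ == "__main__":
            print(analyze(["ACGT", "GGATTACA", "TTGACC"]))

    Because the per-chunk work is independent, speed-up grows with the number of GPUs until the chunks become too small to amortize data-transfer costs.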

    Programming model abstractions for optimizing I/O intensive applications

    This thesis contributes, from the perspective of task-based programming models, to the efforts of optimizing I/O-intensive applications. Throughout this thesis, we propose programming model abstractions and mechanisms that target a twofold objective: on the one hand, to improve the I/O and total performance of applications on today's complex storage infrastructures; on the other hand, to achieve this improvement without increasing the complexity of application programming. The following paragraphs briefly summarize each of our contributions.

    First, towards exploiting the compute-I/O patterns of I/O-intensive applications and transparently improving I/O and total performance, we propose a number of abstractions that we refer to as I/O Awareness abstractions. An I/O-aware task-based programming model is able to separate the handling of I/O and computation by supporting I/O Tasks, whose execution can overlap with the execution of compute tasks. Moreover, we provide programming model support to improve I/O performance by addressing the issue of I/O congestion. This is achieved by using Storage Bandwidth Constraints to control the level of task parallelism (see the sketch below). We support two types of such constraints: (i) static storage bandwidth constraints, which are manually set by application developers; and (ii) auto-tunable constraints, which are automatically set and tuned throughout the execution of the application.

    Second, in order to exploit the heterogeneity of modern storage systems and improve performance in a transparent manner, we propose a set of capabilities that we refer to as Storage Heterogeneity Awareness. A storage-heterogeneity-aware task-based programming model builds on the concepts and abstractions introduced in the first contribution to improve the I/O performance of applications on heterogeneous storage systems. More specifically, such programming models support the following features: (i) abstracting the heterogeneity of storage devices and exposing them as one hierarchical storage resource; (ii) supporting dedicated I/O scheduling; and (iii) a mechanism that automatically and periodically flushes obsolete data from higher storage layers to lower storage layers.

    Third, targeting increased levels of application parallelism, we propose a Hybrid Programming Model that combines task-based programming models and MPI. In this model, tasks are used to achieve coarse-grained parallelism on large-scale distributed infrastructures, whereas MPI is used to gain fine-grained parallelism by parallelizing task execution. Such a hybrid programming model offers the possibility to enable parallel I/O and high-level I/O libraries within tasks. We enable this by supporting Native MPI Tasks. These tasks are native to the programming model for two reasons: they execute task code, as opposed to calling external MPI binaries or scripts; and their data transfers and input/output handling are done in a manner completely transparent to application developers. This increases parallelism levels while easing the design and programming of applications.
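    As an illustration of the first contribution, the following is a minimal sketch, in plain Python threads, of two of the ideas above: I/O tasks that overlap with compute tasks, and a static storage bandwidth constraint that caps how many I/O tasks may run concurrently. The constraint value and task bodies are illustrative assumptions, not the thesis's actual runtime interface.

        # Minimal sketch: overlap of I/O tasks with computation, plus a static
        # storage bandwidth constraint modelled as a counting semaphore.
        import threading

        io_slots = threading.Semaphore(2)  # assumed static constraint: at most 2 concurrent I/O tasks

        def io_task(path, data):
            with io_slots:                 # blocks when the storage budget is exhausted
                with open(path, "w") as f: # illustrative I/O payload
                    f.write(data)

        def compute_task(n):
            return sum(i * i for i in range(n))  # placeholder computation

        threads = [threading.Thread(target=io_task, args=("out%d.txt" % i, "x" * 10000))
                   for i in range(8)]
        for t in threads:
            t.start()                      # I/O proceeds in the background...
        result = compute_task(10**6)       # ...while computation overlaps with it
        for t in threads:
            t.join()

    An auto-tunable constraint would replace the fixed value 2 with one adjusted at run time from observed storage throughput.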
    Finally, to exploit the inherent parallelism opportunities in applications and to overlap computation with I/O, we propose an Eager mechanism for releasing data dependencies. Unlike the traditional approach to releasing dependencies, eagerly releasing data dependencies allows successor tasks to be released for execution as soon as their data dependencies are ready, without having to wait for the predecessor task(s) to finish execution completely. In order to support the eager release of data dependencies, we describe the following core modifications to the design of task-based programming models: (i) defining and managing data dependency relationships as parameter-aware dependencies; and (ii) a mechanism for notifying the programming model that an output datum has been generated before the execution of the producer task ends.
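    A minimal sketch of the eager-release idea, under the assumption that each output parameter carries its own readiness flag (the names below are illustrative, not the thesis's runtime interface):

        # Minimal sketch: a successor starts as soon as the datum it depends on
        # is published, not when the producer task returns.
        import threading, time

        class Output:
            def __init__(self):
                self.value = None
                self.ready = threading.Event()

            def publish(self, value):       # producer notifies the runtime early
                self.value = value
                self.ready.set()

        def producer(out):
            out.publish(42)                 # output generated and released eagerly...
            time.sleep(2)                   # ...while the producer keeps executing

        def consumer(out):
            out.ready.wait()                # released when the datum is ready,
            print("consumed", out.value)    # well before the producer exits

        o = Output()
        threading.Thread(target=producer, args=(o,)).start()
        consumer(o)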

    Modern Systems for Large-scale Genomics Data Analysis in the Cloud

    Genomics researchers increasingly turn to cloud computing as a means of accomplishing large-scale analyses efficiently and cost-effectively. Successful operation in the cloud requires careful instrumentation and management to avoid common pitfalls, such as resource bottlenecks and low utilisation, which can both drive up costs and extend the timeline of a scientific project. We developed the Butler framework for large-scale scientific workflow management in the cloud to meet these challenges. The cornerstones of Butler's design are: the ability to support multiple clouds; declarative infrastructure configuration management; scalable, fault-tolerant operation; comprehensive resource monitoring; and automated error detection and recovery. Butler relies on industry-strength open-source components in order to deliver a framework that is robust and scalable to thousands of compute cores and millions of workflow executions. Butler's error detection and self-healing capabilities are unique among scientific workflow frameworks and ensure that analyses are carried out with minimal human intervention. Butler has been used to analyse over 725 TB of DNA sequencing data on the cloud, using 1500 CPU cores and 6 TB of RAM, delivering results with 43% increased efficiency compared to other tools. The flexible design of this framework allows easy adoption within other fields of the Life Sciences and ensures that it will scale together with the demand for scientific analysis in the cloud for years to come.

    Because many bioinformatics tools have been developed in the context of small sample sizes, they often struggle to keep up with the demands of the large-scale data processing required for modern research and clinical sequencing projects, due to limitations in their design. The Rheos software system is designed specifically with these large data sets in mind. Utilising the elastic compute capacity of modern academic and commercial clouds, Rheos takes a service-oriented, containerised approach to the implementation of modern bioinformatics algorithms, which allows the software to achieve the scalability and ease of use required to succeed under the increased operational load of the massive data sets generated by projects like International Cancer Genome Consortium (ICGC) ARGO and the All of Us initiative. Rheos algorithms are based on an innovative stream-based approach to processing genomic data, which enables Rheos to make faster decisions about the presence of genomic mutations that drive diseases such as cancer, thereby improving the tool's efficacy and relevance to clinical sequencing applications. Our testing of the novel germline Single Nucleotide Polymorphism (SNP) and deletion variant calling algorithms developed within Rheos indicates that Rheos achieves ~98% accuracy in SNP calling and ~85% accuracy in deletion calling, which is comparable with other leading tools such as the Genome Analysis Toolkit (GATK), freebayes, and Delly.

    The two frameworks that we developed provide important contributions to solving the ever-growing need for large-scale genomic data analysis in the cloud: by enabling more effective use of existing tools, in the case of Butler, and by providing a new, more dynamic and real-time approach to genomic analysis, in the case of Rheos.
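    The automated error detection and recovery the abstract attributes to Butler can be illustrated with a minimal sketch: a workflow step is monitored, and a failed step is retried with increasing back-off before an operator is involved. This is an illustrative reimplementation of the pattern, not Butler's actual API.

        # Minimal sketch of detect-and-self-heal around a workflow step.
        import time

        def run_with_recovery(step, max_retries=3, backoff_s=5.0):
            for attempt in range(1, max_retries + 1):
                try:
                    return step()                    # normal completion
                except Exception as exc:             # error detected
                    print("step failed (attempt %d): %s" % (attempt, exc))
                    time.sleep(backoff_s * attempt)  # back off, then retry
            raise RuntimeError("step exhausted retries; escalating to an operator")

        # Example: a flaky step that succeeds on the second attempt.
        state = {"calls": 0}
        def flaky_step():
            state["calls"] += 1
            if state["calls"] < 2:
                raise IOError("transient storage error")
            return "done"

        print(run_with_recovery(flaky_step, backoff_s=0.1))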

    Efficient approximate string matching techniques for sequence alignment

    One of the outstanding milestones achieved in recent years in the field of biotechnology research has been the development of high-throughput sequencing (HTS). Because it is at present technically impossible to decode a genome as a whole, HTS technologies read billions of relatively short chunks of a genome at random locations. Such reads then need to be located within a reference for the species being studied (that is, aligned or mapped to the genome): for each read, one identifies regions of the reference that share a large sequence similarity with it, thereby indicating the read's likely point or points of origin. HTS technologies are able to re-sequence a human individual (i.e. to establish the differences between his or her individual genome and the reference genome for the human species) in a very short period of time. They have also paved the way for the development of a number of new protocols and methods, leading to novel insights in genomics and biology in general. However, HTS technologies also pose a challenge to traditional data analysis methods, due to the sheer amount of data to be processed and the need for improved alignment algorithms that can generate accurate results quickly.

    This thesis tackles the problem of sequence alignment as a step within the analysis of HTS data. Its contributions focus on both the methodological aspects and the algorithmic challenges of efficient, scalable, and accurate HTS mapping. From a methodological standpoint, this thesis strives to establish a comprehensive framework able to assess the quality of HTS mapping results. To do so, one has to understand the source and nature of mapping conflicts, and explore the accuracy limits inherent in how sequence alignment is performed for current HTS technologies. From an algorithmic standpoint, this work introduces state-of-the-art index structures and approximate string matching algorithms, contributing novel insights that can be applied in practice towards efficient and accurate read mapping. In more detail: first, we present methods able to reduce the storage space taken by indexes for genome-scale references while still providing fast query access, in order to support effective search algorithms. Second, we describe novel filtering techniques that vastly reduce the computational requirements of sequence mapping, yet are capable of giving strict algorithmic guarantees on the completeness of the results. Finally, this thesis presents new incremental algorithmic techniques able to combine several approximate string matching algorithms; this leads to efficient and flexible search algorithms that allow the user to reach arbitrary search depths.

    All algorithms and methodological contributions of this thesis have been implemented as components of a production aligner, the GEM-mapper, which is publicly available, widely used worldwide, and cited by a sizeable body of literature. It offers flexible and accurate sequence mapping while outperforming other HTS mappers both in running time and in the quality of the results it produces.
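    A classical example of the kind of filtering technique with a completeness guarantee that the abstract mentions is pigeonhole filtration (the thesis's own filters are more sophisticated; this sketch only illustrates the principle). If a read matches a reference region with at most k mismatches, then at least one of any k+1 disjoint pieces of the read must occur there exactly, so exact hits of the pieces nominate every possible candidate position, and each candidate is then verified.

        # Minimal sketch of pigeonhole filtration for approximate matching
        # under Hamming distance: exact piece hits generate candidate
        # positions; verification keeps those within k mismatches.
        def pigeonhole_candidates(read, reference, k):
            piece_len = len(read) // (k + 1)
            candidates = set()
            for i in range(k + 1):
                piece = read[i * piece_len:(i + 1) * piece_len]
                start = reference.find(piece)
                while start != -1:                         # every exact hit of a piece
                    candidates.add(start - i * piece_len)  # implies one candidate position
                    start = reference.find(piece, start + 1)
            return {c for c in candidates if 0 <= c <= len(reference) - len(read)}

        def hamming(a, b):
            return sum(x != y for x, y in zip(a, b))

        def approx_match(read, reference, k):
            return sorted(c for c in pigeonhole_candidates(read, reference, k)
                          if hamming(read, reference[c:c + len(read)]) <= k)

        print(approx_match("ACGTAC", "TTACGAACGTTT", 1))  # -> [2]

    The filter never discards a true match (completeness), while verification work is spent only on the few nominated positions rather than on the whole reference.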

    GSI Scientific Report 2009 [GSI Report 2010-1]
