75 research outputs found

    Supercharging the APGAS Programming Model with Relocatable Distributed Collections

    In this article we present our relocatable distributed collections library. Building on top of the APGAS for Java library, we provide a number of useful intra-node parallel patterns as well as the features necessary to support the distributed nature of the computation through clearly identified methods. In particular, the transfer of distributed collections' entries between processes is supported via an integrated relocation system. This enables dynamic load-balancing capabilities, making it possible for programs to adapt to uneven or evolving cluster performance. The system we developed makes it possible to dynamically control the distribution and the data-flow of distributed programs through high-level abstractions. Programmers using our library can therefore write complex distributed programs combining computation and communication phases through a consistent API. We evaluate the performance of our library against two programs taken from well-known Java benchmark suites, demonstrating superior programmability, and obtaining better performance on one benchmark and reasonable overhead on the second. Finally, we demonstrate the ease and benefits of load-balancing on a more complex application which uses the various features of our library extensively. Comment: 23 pages, 8 figures. Consult the source code in the GitHub repository at https://github.com/handist/collection

    Safety Verification of Phaser Programs

    We address the problem of statically checking control state reachability (as in possibility of assertion violations, race conditions or runtime errors) and plain reachability (as in deadlock-freedom) of phaser programs. Phasers are a modern non-trivial synchronization construct that supports dynamic parallelism with runtime registration and deregistration of spawned tasks. They allow for collective and point-to-point synchronizations. For instance, phasers can enforce barriers or producer-consumer synchronization schemes among all or subsets of the running tasks. Implementations are found in modern languages such as X10 or Habanero Java. Phasers essentially associate phases to individual tasks and use their runtime values to restrict possible concurrent executions. Unbounded phases may result in infinite transition systems even in the case of programs only creating finite numbers of tasks and phasers. We introduce an exact gap-order based procedure that always terminates when checking control reachability for programs generating bounded numbers of coexisting tasks and phasers. We also show that verifying plain reachability is undecidable even for programs generating few tasks and phasers. We then explain how to turn our procedure into a sound analysis for checking plain reachability (including deadlock freedom). We report on preliminary experiments with our open source tool.
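
The barrier behaviour described in this abstract can be illustrated with the `java.util.concurrent.Phaser` class from the Java standard library. This is only a minimal sketch: the abstract names X10 and Habanero Java, not this class, and the class name `PhaserBarrierSketch`, the method `run`, and the log format are hypothetical choices made for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Phaser;

class PhaserBarrierSketch {
    /**
     * Spawns `tasks` threads that each pass through `phases` barrier-separated
     * phases, and returns the "phase:task" events in the order they were logged.
     */
    static List<String> run(int tasks, int phases) {
        List<String> log = new CopyOnWriteArrayList<>();
        // The spawning thread is registered first, so phase 0 cannot
        // advance before every worker has been registered.
        Phaser phaser = new Phaser(1);
        Thread[] workers = new Thread[tasks];
        for (int t = 0; t < tasks; t++) {
            final int id = t;
            phaser.register(); // runtime registration of a spawned task
            workers[t] = new Thread(() -> {
                for (int p = 0; p < phases; p++) {
                    log.add(p + ":" + id);
                    phaser.arriveAndAwaitAdvance(); // barrier among registered tasks
                }
                phaser.arriveAndDeregister(); // runtime deregistration
            });
            workers[t].start();
        }
        phaser.arriveAndDeregister(); // the spawning thread leaves the phaser
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        return log;
    }
}
```

Because every worker logs before arriving at the barrier, all events of one phase appear in the log before any event of the next phase, which is the kind of ordering constraint that phases impose on concurrent executions.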

    Design and implementation of Java bindings in Open MPI

    This paper describes the Java MPI bindings that have been included in the Open MPI distribution. Open MPI is one of the most popular implementations of MPI, the Message-Passing Interface, which is the predominant programming paradigm for parallel applications on distributed memory computers. We have added Java support to Open MPI, exposing MPI functionality to Java programmers. Our approach is based on the Java Native Interface, and has similarities with previous efforts, as well as important differences. This paper serves as a reference for the application program interface, and in addition we provide details of the internal implementation to justify some of the design decisions. We also show some results to assess the performance of the bindings. (C) 2016 Elsevier B.V. All rights reserved. We are indebted to Siegmar Grog for his exhaustive testing of the Java bindings. We also thank Ralph Castain for helping in the integration of the Java bindings in the Open MPI infrastructure. The NPB-MPJ benchmarks used in Section 5 were kindly provided by Guillermo Lopez Taboada. The first two authors were supported by the Spanish Ministry of Economy and Competitiveness under project number TIN2013-41049-P. Vega Gisbert, O.; Román Moltó, JE.; Squyres, JM. (2016). Design and implementation of Java bindings in Open MPI. Parallel Computing. 59:1-20. https://doi.org/10.1016/j.parco.2016.08.004

    Machine learning and data-parallel processing for viral metagenomics

    More than 2 million cancer cases around the world each year are caused by viruses. In addition, there are epidemiological indications that other cancer-associated viruses may also exist. However, the identification of highly divergent and yet unknown viruses in human biospecimens is one of the biggest challenges in bioinformatics. Modern-day Next Generation Sequencing (NGS) technologies can be used to directly sequence biospecimens from clinical cohorts with unprecedented speed and depth. These technologies are able to generate billions of bases with rapidly decreasing cost, but current bioinformatics tools cannot process these massive datasets efficiently. Thus, the objective of this thesis was to facilitate both the detection of highly divergent viruses among generated sequences as well as large-scale analysis of human metagenomic datasets. To re-analyze human sample-derived sequences that were classified as being of “unknown” origin by conventional alignment-based methods, we used a methodology based on profile Hidden Markov Models (HMM) which can capture evolutionary changes by using multiple sequence alignments. We thus identified 510 sequences that were classified as distantly related to viruses. Many of these sequences were homologs to large viruses such as Herpesviridae and Mimiviridae but some of them were also related to small circular viruses such as Circoviridae. We found that bioinformatics analysis using viral profile HMM is capable of extending the classification of previously unknown sequences and consequently the detection of viruses in biospecimens from humans. Different organisms use synonymous codons differently to encode the same amino acids. To investigate whether codon usage bias could predict the presence of viruses in metagenomic sequencing data originating from human samples, we trained Random Forest and Artificial Neural Networks based on Relative Synonymous Codon Usage (RSCU) frequency. 
Our analysis showed that machine learning techniques based on RSCU could identify putative viral sequences with area under the ROC curve of 0.79 and provide important information for taxonomic classification. For identification of viral genomes among raw metagenomic sequences, we developed the tool ViraMiner, a deep learning-based method which uses Convolutional Neural Networks with two convolutional branches. Using 300-base-pair sequences, ViraMiner achieved 0.923 area under the ROC curve, which is a considerable improvement over previous machine learning methods for virus sequence classification. The proposed architecture, to the best of our knowledge, is the first deep learning tool which can detect viral genomes on raw metagenomic sequences originating from a variety of human samples. To enable large-scale analysis of massive metagenomic sequencing data we used Apache Hadoop and Apache Spark to develop ViraPipe, a scalable parallel bioinformatics pipeline for viral metagenomics. ViraPipe (executed on 23 nodes) was 11 times faster in the metagenome analysis than the sequential pipeline (executed on a single node). The new distributed workflow contains several standard bioinformatics tools and can scale to terabytes of data by accessing more computer power from the nodes. To analyze terabytes of RNA-seq data originating from head and neck squamous cell carcinoma samples, we used our parallel bioinformatics pipeline ViraPipe and the most recent version of the HPV sequence database. We detected transcription of HPV viral oncogenes in 92/500 cancers. HPV 16 was the most important HPV type, followed by HPV 33 as the second most common infection. If these cancers are indeed caused by HPV, we estimated that vaccination might prevent about 36 000 head and neck cancer cases in the United States every year. 
In conclusion, the work in this thesis improves the prospects for biomedical researchers to classify the sequence contents of ultra-deep datasets, conduct large-scale analysis of metagenome studies, and detect presence of viral genomes in human biospecimens. Hopefully, this work will contribute to our understanding of biodiversity of viruses in humans, which in turn can help explore infectious causes of human disease.
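
The RSCU feature mentioned in this abstract can be sketched as follows, using the commonly cited definition: the observed count of a codon divided by the mean count over all synonymous codons of the same amino acid. The abstract does not describe the thesis's actual feature pipeline, so the class name `RscuSketch`, the method `rscu`, and the deliberately partial codon table are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RscuSketch {
    // Partial synonymous-codon table from the standard genetic code.
    // Only three amino acids are listed to keep the sketch short (assumption).
    static final Map<String, List<String>> SYNONYMS = Map.of(
        "Lys", List.of("AAA", "AAG"),
        "Phe", List.of("TTT", "TTC"),
        "Ala", List.of("GCT", "GCC", "GCA", "GCG"));

    /** Returns RSCU values for the codons of every amino acid present in seq. */
    static Map<String, Double> rscu(String seq) {
        // Count codons in reading frame 0; a trailing partial codon is ignored.
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 3 <= seq.length(); i += 3) {
            counts.merge(seq.substring(i, i + 3), 1, Integer::sum);
        }
        Map<String, Double> result = new HashMap<>();
        for (List<String> family : SYNONYMS.values()) {
            int total = 0;
            for (String codon : family) {
                total += counts.getOrDefault(codon, 0);
            }
            if (total == 0) {
                continue; // amino acid absent: RSCU is undefined, so skip it
            }
            // Expected count if all synonymous codons were used equally.
            double expected = (double) total / family.size();
            for (String codon : family) {
                result.put(codon, counts.getOrDefault(codon, 0) / expected);
            }
        }
        return result;
    }
}
```

A value of 1.0 means a codon is used exactly as often as expected under uniform synonymous usage; values above or below 1.0 indicate bias. A vector of such values per sequence is the kind of input a classifier could be trained on.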

    Extended update plans

    Formal methods are gaining popularity as a way of increasing the reliability of systems through the use of mathematically based techniques. Their domain is no longer restricted to purely academic environments and examples, as they are slowly moving into industrial settings. The slow rate at which this transition takes place is mainly due to the perceived difficulty of formalising the behaviour of systems. While this is undoubtedly true, it is not the case with all formal methods. Update Plans are a powerful formalism for the description of computer architectures and intermediate to low-level languages. They are a declarative specification language with an underlying imperative machine model. The descriptions using Update Plans are clear, compact, intuitive, unambiguous and simple to read. These characteristics allow for the minimisation of possible errors at early stages of the development process even before a verification takes place. In this thesis an overview of the Update Plans formalism is given and a number of real-world applications are shown. The investigation of the application area focuses on computer architectures for which various specifications already exist. The comparison of Update Plan specifications to other specifications provides a useful insight into the strengths and shortcomings of the formalism. The shortcomings, in particular the lack of synchronisation primitives and modularity, are addressed by the development and evaluation of several syntactic and semantic extensions described in this thesis. The extended formalism is also compared to other specification languages and conclusions are drawn.

    A Framework for Parallelizing OWL Classification in Description Logic Reasoners

    The Web Ontology Language (OWL) is a widely used knowledge representation language for describing knowledge in application domains by using classes, properties, and individuals. Ontology classification is an important and widely used service that computes a taxonomy of all classes occurring in an ontology. It can require significant amounts of runtime, but most OWL reasoners do not support any kind of parallel processing. This thesis reports on a black-box approach to parallelize existing description logic (DL) reasoners for the Web Ontology Language. We focus on OWL ontology classification, which is an important inference service and supported by every major OWL/DL reasoner. To the best of our knowledge, we are the first to propose a flexible parallel framework which can be applied to existing OWL reasoners in order to speed up their classification process. Two versions of our method are discussed: (i) The first version implements a novel thread-level parallel architecture with two parallel strategies to achieve a good speedup factor with an increasing number of threads, but does not rely on locking techniques and thus avoids possible race conditions. (ii) The improved version implements an improved data structure and various parallel computing techniques for precomputing and classification to reduce the overhead of processing ontologies and compete with other DL reasoners based on the wall clock time for classification. In order to test the performance of both versions of our approach, we use a real-world repository for choosing the tested ontologies. For the first version of our approach, we evaluated our prototype implementation with a set of selected real-world ontologies. Our experiments demonstrate very good scalability resulting in a speedup that is linear in the number of available cores. For the second version, its performance is evaluated by parallelizing major OWL reasoners for concept classification. 
Currently, we mainly focus on comparison with two popular DL reasoners: HermiT and JFact. In comparison to the selected black-box reasoners, our results demonstrate that the wall clock time of ontology classification can be improved by one order of magnitude for most real-world ontologies in the repository.

    Programming Persistent Memory

    Beginning and experienced programmers will use this comprehensive guide to persistent memory programming. You will understand how persistent memory brings together several new software/hardware requirements, and offers great promise for better performance and faster application startup times, a huge leap forward in byte-addressable capacity compared with current DRAM offerings. This revolutionary new technology gives applications significant performance and capacity improvements over existing technologies. It requires a new way of thinking and developing, which makes this highly disruptive to the IT/computing industry. The full spectrum of industry sectors that will benefit from this technology include, but are not limited to, in-memory and traditional databases, AI, analytics, HPC, virtualization, and big data. Programming Persistent Memory describes the technology and why it is exciting the industry. It covers the operating system and hardware requirements as well as how to create development environments using emulated or real persistent memory hardware. The book explains fundamental concepts; provides an introduction to persistent memory programming APIs for C, C++, JavaScript, and other languages; discusses RDMA with persistent memory; reviews security features; and presents many examples. Source code and examples that you can run on your own systems are included. 
What You’ll Learn: Understand what persistent memory is, what it does, and the value it brings to the industry; become familiar with the operating system and hardware requirements to use persistent memory; know the fundamentals of persistent memory programming: why it is different from current programming methods, and what developers need to keep in mind when programming for persistence; look at persistent memory application development by example using the Persistent Memory Development Kit (PMDK); design and optimize data structures for persistent memory; study how real-world applications are modified to leverage persistent memory; utilize the tools available for persistent memory programming, application performance profiling, and debugging. Who This Book Is For: C, C++, Java, and Python developers, but will also be useful to software, cloud, and hardware architects across a broad spectrum of sectors, including cloud service providers, independent software vendors, high performance compute, artificial intelligence, data analytics, big data, etc.

    Applications of Mathematical Programming in Personnel Scheduling

    In the few decades of its existence, mathematical programming has evolved into an important branch of operations research and management science. This thesis consists of four papers in which we apply mathematical programming to real-life personnel scheduling and project management problems. We develop exact mathematical programming formulations. Furthermore, we propose effective heuristic strategies to decompose the original problems into subproblems that can be solved efficiently with tailored mathematical programming formulations. We opt for solution methods that are based on mathematical programming, because their advantages in practice are a) the flexibility to easily accommodate changes in the problem setting, b) the possibility to evaluate the quality of the solutions obtained, and c) the possibility to use general-purpose solvers, which are often the only software available in practice.

    State Space Symmetry Reduction for TBP Analysis

    Threaded Behavioral Protocols (TBP) is a specification language for modelling the behavior of software components. This thesis aims at an analysis of TBP specifications within environments which involve an unbounded replication of threads. Such a TBP specification - together with a model of the possible environments - induces an infinite state space which contains a vast amount of symmetries caused by thread replication. A model checking technique addressing such a state space and reducing the symmetries by using symbolic counter abstraction is proposed. In order to utilize the symbolic counter abstraction, the properties of the TBP specifications (called provisions) are converted into thread state reachability properties. The proposed analysis is safe in the sense that it discovers all errors in the model. On the other hand, it may yield spurious errors, i.e., errors that do not correspond to any real error in the model. 
The spurious errors are well identified and further possibilities to reduce them are outlined. Beyond the scope of the specific specifications, this work may also present a small step towards supporting dynamic thread creation in TBP. Department of Software Engineering, Faculty of Mathematics and Physics.
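
The idea of removing thread-replication symmetry by counting can be sketched in a few lines. The example below shows plain counter abstraction with a cutoff, not the symbolic variant the thesis uses, and it is unrelated to the thesis's actual tooling: the class name `CounterAbstractionSketch`, the cutoff value, and the state names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CounterAbstractionSketch {
    // Counts at or above CUTOFF collapse into a single "many" value (assumed cutoff).
    static final int CUTOFF = 2;

    /**
     * Maps a concrete configuration (one local state per thread) to an abstract
     * one: for each local state, how many threads are in it, capped at CUTOFF.
     * Thread identities are dropped, so permutations of threads coincide.
     */
    static Map<String, Integer> abstractState(List<String> threadStates) {
        Map<String, Integer> counters = new TreeMap<>();
        for (String state : threadStates) {
            counters.merge(state, 1, (a, b) -> Math.min(a + b, CUTOFF));
        }
        return counters;
    }
}
```

Two configurations that differ only in which thread is in which local state map to the same abstract state, and so do configurations with three or thirty threads in the same state. The cap keeps the abstract state space finite but discards exact counts, which is one way an analysis of this kind can become over-approximate.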