
    Application-tailored Linear Algebra Algorithms: A search-based Approach

    In this paper, we tackle the problem of automatically generating algorithms for linear algebra operations by taking advantage of problem-specific knowledge. In most situations, users possess much more information about the problem at hand than what current libraries and computing environments accept; evidence shows that, if properly exploited, such information leads to uncommon/unexpected speedups. We introduce a knowledge-aware linear algebra compiler that allows users to input matrix equations together with properties about the operands and the problem itself; for instance, they can specify that the equation is part of a sequence, and how successive instances are related to one another. The compiler exploits all this information to guide the generation of algorithms, to limit the size of the search space, and to avoid redundant computations. We applied the compiler to equations arising as part of sensitivity and genome studies; the algorithms produced exhibit, respectively, 100- and 1000-fold speedups.
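    A minimal sketch of the idea behind such a knowledge-aware compiler, assuming a single declared property (the user states that X has full column rank, hence the Gram matrix X^T X is symmetric positive definite): the same matrix equation is mapped to different factorizations depending on the declared properties of the operands. The function and its interface are illustrative, not the compiler's actual input language.

    # Hypothetical illustration: solve b = inv(X^T X) X^T y, picking a
    # factorization based on a user-declared property of the operands.
    import numpy as np
    from scipy.linalg import cho_factor, cho_solve, lu_factor, lu_solve

    def solve_normal_equations(X, y, gram_is_spd=True):
        A = X.T @ X
        rhs = X.T @ y
        if gram_is_spd:
            # Declared property: X has full column rank, so X^T X is SPD
            # and the cheaper Cholesky factorization applies.
            c, low = cho_factor(A)
            return cho_solve((c, low), rhs)
        # No property declared: fall back to a generic LU solve.
        lu, piv = lu_factor(A)
        return lu_solve((lu, piv), rhs)

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 20))
    y = rng.standard_normal(1000)
    b = solve_normal_equations(X, y, gram_is_spd=True)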

    Computing Petaflops over Terabytes of Data: The Case of Genome-Wide Association Studies

    In many scientific and engineering applications, one has to solve not one but a sequence of instances of the same problem. Oftentimes, the problems in the sequence are linked in a way that allows intermediate results to be reused. A characteristic example for this class of applications is given by the Genome-Wide Association Studies (GWAS), a widespread tool in computational biology. GWAS entails the solution of up to trillions (10^12) of correlated generalized least-squares problems, posing a daunting challenge: the performance of petaflops (10^15 floating-point operations) over terabytes of data. In this paper, we design an algorithm for performing GWAS on multi-core architectures. This is accomplished in three steps. First, we show how to exploit the relation among successive problems, thus reducing the overall computational complexity. Then, through an analysis of the required data transfers, we identify how to eliminate any overhead due to input/output operations. Finally, we study how to decompose the computation into tasks to be distributed among the available cores, to attain high performance and scalability. With our algorithm, a GWAS that currently requires the use of a supercomputer may now be performed in a matter of hours on a single multi-core node. The discussion centers around the methodology to develop the algorithm rather than the specific application. We believe the paper contributes valuable guidelines of general applicability for computational scientists on how to develop and optimize numerical algorithms.
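    The first step above, exploiting the relation among successive problems, can be illustrated with a short sketch. Assuming each instance is a generalized least-squares problem b_i = (X_i^T M^{-1} X_i)^{-1} X_i^T M^{-1} y with the same covariance matrix M throughout, the expensive factorization of M is computed once and shared by every instance; the formulation and names below are assumptions made for illustration, not the paper's code.

    import numpy as np
    from scipy.linalg import cholesky, solve_triangular

    def gls_sequence(X_list, y, M):
        L = cholesky(M, lower=True)                 # factor M = L L^T once
        y_t = solve_triangular(L, y, lower=True)    # whiten y once
        betas = []
        for X in X_list:                            # no O(n^3) work inside the loop
            X_t = solve_triangular(L, X, lower=True)
            b, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
            betas.append(b)
        return betas

    rng = np.random.default_rng(1)
    n, p = 500, 4
    A = rng.standard_normal((n, n))
    M = A @ A.T + n * np.eye(n)                     # SPD covariance matrix
    X_list = [rng.standard_normal((n, p)) for _ in range(10)]
    y = rng.standard_normal(n)
    betas = gls_sequence(X_list, y, M)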

    Large-scale linear regression: Development of high-performance routines

    In statistics, series of ordinary least squares problems (OLS) are used to study the linear correlation among sets of variables of interest; in many studies, the number of such variables is at least in the millions, and the corresponding datasets occupy terabytes of disk space. As the availability of large-scale datasets increases regularly, so does the challenge in dealing with them. Indeed, traditional solvers---which rely on the use of "black-box" routines optimized for a single OLS---are highly inefficient and fail to provide a viable solution for big-data analyses. As a case study, in this paper we consider a linear regression consisting of two-dimensional grids of related OLS problems that arise in the context of genome-wide association analyses, and give a careful walkthrough for the development of ols-grid, a high-performance routine for shared-memory architectures; analogous steps are relevant for tailoring OLS solvers to other applications. In particular, we first illustrate the design of efficient algorithms that exploit the structure of the OLS problems and eliminate redundant computations; then, we show how to effectively deal with datasets that do not fit in main memory; finally, we discuss how to cast the computation in terms of efficient kernels and how to achieve scalability. Importantly, each design decision along the way is justified by simple performance models. ols-grid enables the solution of 10^11 correlated OLS problems operating on terabytes of data in a matter of hours.
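    The elimination of redundant computations can be sketched in a few lines. Assuming each cell (i, j) of the grid is an OLS problem whose design matrix is [X_L | x_i], with a block of covariates X_L shared by every cell, one variable column x_i per grid row, and one right-hand side y_j per grid column, all products involving only shared quantities are formed once and reused; the layout and names are illustrative assumptions, not the actual ols-grid routine.

    import numpy as np

    def ols_grid(X_L, X_R, Y):
        l = X_L.shape[1]
        S_LL = X_L.T @ X_L                        # shared by every cell of the grid
        S_LR = X_L.T @ X_R                        # one column per x_i, reused for all y_j
        S_Ly = X_L.T @ Y                          # one column per y_j, reused for all x_i
        S_Ry = X_R.T @ Y
        s_RR = np.einsum('ij,ij->j', X_R, X_R)    # the scalars x_i^T x_i
        m, t = X_R.shape[1], Y.shape[1]
        betas = np.empty((m, t, l + 1))
        for i in range(m):
            A = np.empty((l + 1, l + 1))          # normal-equations matrix for row i
            A[:l, :l] = S_LL
            A[:l, l] = A[l, :l] = S_LR[:, i]
            A[l, l] = s_RR[i]
            for j in range(t):
                rhs = np.append(S_Ly[:, j], S_Ry[i, j])
                betas[i, j] = np.linalg.solve(A, rhs)
        return betas

    rng = np.random.default_rng(2)
    B = ols_grid(rng.standard_normal((200, 3)),   # fixed covariates
                 rng.standard_normal((200, 5)),   # variable columns (grid rows)
                 rng.standard_normal((200, 4)))   # right-hand sides (grid columns)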

    High Performance Solutions for Big-data GWAS

    In order to associate complex traits with genetic polymorphisms, genome-wide association studies process huge datasets involving tens of thousands of individuals genotyped for millions of polymorphisms. When handling these datasets, which exceed the main memory of contemporary computers, one faces two distinct challenges: 1) millions of polymorphisms and thousands of phenotypes come at the cost of hundreds of gigabytes of data, which can only be kept in secondary storage; 2) the relatedness of the test population is represented by a relationship matrix, which, for large populations, can only fit in the combined main memory of a distributed architecture. In this paper, by using distributed resources such as Cloud or clusters, we address both challenges: the genotype and phenotype data is streamed from secondary storage using a double-buffering technique, while the relationship matrix is kept across the main memory of a distributed memory system. With the help of these solutions, we develop separate algorithms for studies involving only one or a multitude of traits. We show that these algorithms sustain high performance and allow the analysis of enormous datasets.
    Comment: Submitted to Parallel Computing. arXiv admin note: substantial text overlap with arXiv:1304.227
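    The double-buffering technique can be sketched as follows: while one block of genotype data is being processed, the next block is read from secondary storage in a background thread, so input/output overlaps with computation. The file layout, block shape, and function names below are assumptions for illustration, not the paper's implementation.

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def load_block(path, block_idx, rows, cols):
        # Read one contiguous block of the genotype matrix from disk
        # (assumed to be stored as row-major float64).
        offset = block_idx * rows * cols * 8
        return np.fromfile(path, dtype=np.float64,
                           count=rows * cols, offset=offset).reshape(rows, cols)

    def process_stream(path, n_blocks, rows, cols, compute):
        with ThreadPoolExecutor(max_workers=1) as io:
            pending = io.submit(load_block, path, 0, rows, cols)   # prefetch block 0
            for b in range(n_blocks):
                block = pending.result()                           # wait for current block
                if b + 1 < n_blocks:
                    pending = io.submit(load_block, path, b + 1, rows, cols)
                compute(block)                                     # overlaps with the prefetch

    # 'compute' stands for the per-block analysis kernel supplied by the caller.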

    Solving Sequences of Generalized Least-Squares Problems on Multi-threaded Architectures

    Generalized linear mixed-effects models in the context of genome-wide association studies (GWAS) represent a formidable computational challenge: the solution of millions of correlated generalized least-squares problems, and the processing of terabytes of data. We present high-performance in-core and out-of-core shared-memory algorithms for GWAS: by taking advantage of domain-specific knowledge, exploiting multi-core parallelism, and handling data efficiently, our algorithms attain unequalled performance. When compared to GenABEL, one of the most widely used libraries for GWAS, on a 12-core processor we obtain 50-fold speedups. As a consequence, our routines enable genome studies of unprecedented size.
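    A minimal sketch of the multi-core aspect (illustrative only, not the paper's routines): once the shared covariance factor has been applied, as in the sketch for the entry above, each polymorphism's least-squares problem is independent, so the per-problem solves can simply be mapped onto the cores of a shared-memory node.

    import numpy as np
    from concurrent.futures import ProcessPoolExecutor
    from functools import partial

    def solve_one(X_t, y_t):
        # One whitened least-squares problem; X_t and y_t are assumed to be
        # already premultiplied by the inverse Cholesky factor of the covariance.
        b, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
        return b

    def solve_all(X_t_list, y_t, workers=12):
        # The problems are independent, so a plain map over worker processes suffices.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(partial(solve_one, y_t=y_t), X_t_list))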

    Accelerating scientific codes by performance and accuracy modeling

    Scientific software is often driven by multiple parameters that affect both accuracy and performance. Since finding the optimal configuration of these parameters is a highly complex task, it is extremely common that the software is used suboptimally. In a typical scenario, accuracy requirements are imposed, and attained through suboptimal performance. In this paper, we present a methodology for the automatic selection of parameters for simulation codes, and a corresponding prototype tool. To be amenable to our methodology, the target code must expose the parameters affecting accuracy and performance, and there must be formulas available for error bounds and computational complexity of the underlying methods. As a case study, we consider the particle-particle particle-mesh method (PPPM) from the LAMMPS suite for molecular dynamics, and use our tool to identify configurations of the input parameters that achieve a given accuracy in the shortest execution time. When compared with the configurations suggested by expert users, the parameters selected by our tool yield reductions in the time-to-solution ranging between 10% and 60%. In other words, for the typical scenario where a fixed number of core-hours are granted and simulations of a fixed number of timesteps are to be run, usage of our tool may allow up to twice as many simulations. While we develop our ideas using LAMMPS as computational framework and use the PPPM method for dispersion as a case study, the methodology is general and valid for a range of software tools and methods.
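    The core of the selection procedure can be sketched as follows: enumerate candidate configurations, discard those whose predicted error exceeds the requested bound, and return the configuration with the lowest predicted cost. The error and cost models below are placeholders, not the actual PPPM formulas, and the parameter names are hypothetical.

    import itertools

    def select_parameters(candidates, error_model, cost_model, target_error):
        # Keep configurations that meet the accuracy requirement, then pick
        # the one the performance model predicts to be fastest.
        feasible = [p for p in candidates if error_model(**p) <= target_error]
        return min(feasible, key=lambda p: cost_model(**p)) if feasible else None

    # Hypothetical parameter space: mesh size and real-space cutoff.
    grids = [16, 32, 64, 128]
    cutoffs = [2.0, 3.0, 4.0, 5.0]
    candidates = [{"grid": g, "cutoff": r} for g, r in itertools.product(grids, cutoffs)]

    error_model = lambda grid, cutoff: 1.0 / (grid * cutoff**3)    # placeholder error bound
    cost_model = lambda grid, cutoff: grid**3 + 50.0 * cutoff**3   # placeholder cost model

    best = select_parameters(candidates, error_model, cost_model, target_error=1e-4)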

    TMC Biodesign: The Design and Implementation of a Product Development Framework for Successful Innovation in the Healthcare Industry.

    It is not uncommon to see both academic and industry institutions speed through, or even outright skip, the different stages of innovation. Industry often considers early stages of innovation, such as needs identification, to be too risky, or a waste of time and resources. They tend to focus more on improving validated solutions and creating incremental changes, resulting in products that lack innovation. Academia often considers aspects of the innovation process to be too commercial to consider during their research initiatives, which often results in the development of great technologies that cannot be implemented due to their lack of commercial viability, resulting in a great deal of wasted time and capital. There is a stark need to train everyone involved in the product development process to properly appreciate and implement all stages of the innovation cycle. Engineers, physicians, and business-minded people need to be taught how to come together to solve healthcare’s biggest problems. They need to learn how to turn technological developments into commercially viable products that solve customer needs. In partnership with the Texas Medical Center, I present in this research a framework for providing future medical technology leaders the experience required to create transformational solutions to healthcare’s biggest challenges. I provide a structured process for innovating in the complex healthcare industry, beginning with first-hand observations of clinical needs and ending with a plan for commercializing a medical product. This thesis is intended to describe the proposed framework for medical device innovation and evaluate its potential for success through participation in the inaugural fellowship.

    Development of a Robust Genetic Test for Hyperkalemic Periodic Paralysis (HYPP) in Quarter Horses

    A single nucleotide substitution in a region of the skeletal muscle sodium channel gene (SCN4A) is known to cause an equine genetic disorder known as Hyperkalemic Periodic Paralysis (HYPP). The clinical effects of this disorder range from few or no symptoms to frequent episodes of muscle tremors, weakness, and/or complete collapse. Oligonucleotide primer pairs were designed for both the wild-type and mutant alleles of the SCN4A gene for use in Amplification Refractory Mutation System-PCR (ARMS-PCR). These primers were tested with genomic DNA isolated from whole blood, saliva swabs, and hair of individual horses. It was determined that horse hair represented the most easily obtainable and reliable source for genomic DNA isolation. DNA was isolated from the hair of unaffected (wild-type) horses (N/N), a carrier for HYPP (N/H), and a homozygous mutant for the disease (H/H). As expected for the unaffected individual, a PCR product using the wild type-specific primer pair was generated, while the mutant-specific primer pair did not produce a product. DNA isolated from HYPP animals produced a PCR product with only the mutant-specific primer pair, and the carriers for HYPP produced a product in both the wild type- and mutant-specific PCR reactions. This robust DNA-based test was shown to generate an unambiguous assignment of the genetic status of the horses tested with respect to the HYPP trait.

    Fishwalking in the Whites: Do Large Trees Grow Large Trout?

    Death is no parenthesis| Stories
