29 research outputs found

    Growing Open Science with the Combined Potential of Citizen Science and Auto Science

    In this paper, we present our ideas on how best to support researchers in every phase of the research process when dealing with their research data. We propose a Research Data Portal as the central data infrastructure. With the help of this portal, researchers can easily manage and update their research data, share it with collaborators, and reach out to the public. We further propose a Citizen Science Portal, which includes some new and innovative concepts and methods. This portal applies Citizen Science and Auto Science concepts and provides support for bringing together the best of both worlds. Citizen Science promises to engage individuals (scientists and hobby scientists) in helping with research. Auto Science is meant to help analyze research data, e.g., to help publish the data and to improve its quality, by applying methods from artificial intelligence.

    Parallelizing Epistasis Detection in GWAS on FPGA and GPU-Accelerated Computing Systems

    This is a post-peer-review, pre-copyedit version of an article published in IEEE/ACM Transactions on Computational Biology and Bioinformatics. The final authenticated version is available online at: http://dx.doi.org/10.1109/TCBB.2015.2389958
    [Abstract] High-throughput genotyping technologies (such as SNP arrays) allow the rapid collection of up to a few million genetic markers of an individual. Detecting epistasis (based on 2-SNP interactions) in Genome-Wide Association Studies is an important but time-consuming operation, since statistical computations have to be performed for each pair of measured markers. Computational methods to detect epistasis therefore suffer from prohibitively long runtimes; e.g., processing a moderately sized dataset consisting of about 500,000 SNPs and 5,000 samples requires several days using state-of-the-art tools on a standard 3 GHz CPU. In this paper, we demonstrate how this task can be accelerated using a combination of fine-grained and coarse-grained parallelism on two different computing systems. The first architecture is based on reconfigurable hardware (FPGAs), while the second architecture uses multiple GPUs connected to the same host. We show that both systems can achieve speedups of around four orders of magnitude compared to the sequential implementation. This significantly reduces the runtimes for detecting epistasis to only a few minutes for moderately sized datasets and to a few hours for large-scale datasets.
    London. Wellcome Trust; 076113. London. Wellcome Trust; 08547
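The cost of testing every pair of markers can be made concrete with a short calculation. The sketch below only counts the unordered SNP pairs for the dataset size quoted in the abstract; the function name `num_snp_pairs` is a hypothetical helper and not part of the published tools.

```python
import math


def num_snp_pairs(num_snps: int) -> int:
    """Number of unordered 2-SNP combinations, i.e. n choose 2."""
    return math.comb(num_snps, 2)


if __name__ == "__main__":
    # About 500,000 SNPs, as in the moderately sized dataset from the abstract.
    pairs = num_snp_pairs(500_000)
    print(f"{pairs:,} pairwise tests")
```

Because the pair count grows quadratically with the number of markers, doubling the number of SNPs roughly quadruples the work, which is why an exhaustive scan is a natural target for parallel hardware.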

    FPGA-based Acceleration of Detecting Statistical Epistasis in GWAS

    Abstract: Genotype-by-genotype interactions (epistasis) are believed to be a significant source of unexplained genetic variation causing complex chronic diseases, but they have been ignored in genome-wide association studies (GWAS) due to the computational burden of analysis. In this work we show how to benefit from FPGA technology for highly parallel creation of contingency tables in a systolic chain with a subsequent statistical test. We present the implementation for the FPGA-based hardware platform RIVYERA S6-LX150 containing 128 Xilinx Spartan6-LX150 FPGAs. For performance evaluation we compare against the method iLOCi [9]. iLOCi claims to outperform other available tools in terms of accuracy. However, analysis of a dataset from the Wellcome Trust Case Control Consortium (WTCCC) with about 500,000 SNPs and 5,000 samples still takes about 19 hours on a MacPro workstation with two Intel Xeon quad-core CPUs, while our FPGA-based implementation requires only 4 minutes.
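A sequential software sketch may help illustrate what "creation of contingency tables with a subsequent statistical test" means for one SNP pair. It assumes genotypes coded as 0, 1, or 2 and a boolean case/control label per sample. The Pearson chi-square statistic is used purely as a stand-in, since the abstract does not name the test, and all function names are hypothetical. Nothing here reflects the FPGA systolic-chain design itself.

```python
from itertools import combinations


def contingency_table(snp_a, snp_b, is_case):
    """Count samples per cell: 9 genotype combinations x (control, case)."""
    table = [[0, 0] for _ in range(9)]
    for ga, gb, case in zip(snp_a, snp_b, is_case):
        # Row index encodes the genotype pair (ga, gb), each in {0, 1, 2}.
        table[3 * ga + gb][1 if case else 0] += 1
    return table


def chi_square(table):
    """Pearson chi-square statistic; cells with zero expectation are skipped."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def scan_pairs(genotypes, is_case):
    """Exhaustively score every unordered SNP pair (assumed toy-sized input)."""
    return {
        (i, j): chi_square(contingency_table(genotypes[i], genotypes[j], is_case))
        for i, j in combinations(range(len(genotypes)), 2)
    }


if __name__ == "__main__":
    genotypes = [[0, 0, 2, 2], [0, 0, 2, 2], [1, 1, 1, 1]]
    is_case = [False, False, True, True]
    for pair, stat in scan_pairs(genotypes, is_case).items():
        print(pair, round(stat, 3))
```

Each pair is independent of every other pair, so the outer loop in `scan_pairs` is the part that parallel hardware can spread across many processing elements.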

    TOMOGRAPHIC IMAGE RECONSTRUCTION ON THE INSTRUCTION SYSTOLIC ARRAY

    No full text
    Instruction systolic arrays (ISAs) have been developed in order to combine the speed and simplicity of systolic arrays with the flexibility of MIMD parallel computer systems. ISAs are available as square arrays of small RISC processors capable of performing integer and floating point arithmetic. In this paper we show that the systolic control flow can be used for an efficient reconstruction of images from their projections. The demand for fast image reconstruction arises in the field of computerized tomography. It is shown how the new parallel algorithm leads to a high-speed implementation on Systola 1024, the first commercial parallel computer with the ISA architecture.

    Massively parallel solutions for molecular sequence analysis

    No full text
    In this paper we present new approaches to high-performance protein database scanning on two novel massively parallel architectures to gain supercomputer power at low cost. The first architecture is built around a Beowulf PC cluster linked by a high-speed network, with fine-grained parallel Systola 1024 processor boards connected to each node. The second architecture is the Fuzion 150, a new parallel computer with a linear SIMD array of 1536 processing elements on a single chip. We present the design of a database scanning application based on the Smith-Waterman algorithm in order to derive efficient mappings onto these architectures. The implementations lead to significant runtime savings for large-scale database scanning. This result shows that both architectures provide high-throughput sequence similarity analysis solutions at a good price/performance ratio.
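For readers unfamiliar with Smith-Waterman database scanning, the following is a minimal sequential sketch. It assumes a simple scoring scheme (match +2, mismatch -1, linear gap penalty -1) rather than the substitution matrices and gap model an actual protein scan would use, and the function names are hypothetical. It does not reflect the parallel mappings described in the paper.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (linear gap penalty, assumed scores)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def scan_database(query, database):
    """Score the query against every entry and rank entries by descending score."""
    scores = [(name, smith_waterman_score(query, seq)) for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    db = {"x": "TTTT", "y": "GGACGTGG"}
    for name, score in scan_database("ACGT", db):
        print(name, score)
```

Each database entry is scored independently, and within one comparison the cells along an anti-diagonal of the dynamic programming matrix do not depend on each other; these are the two levels of parallelism such architectures can exploit.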

    Morphological hough transform on the instruction systolic array

    No full text