
    OPENMENDEL: A Cooperative Programming Project for Statistical Genetics

    Statistical methods for genome-wide association studies (GWAS) continue to improve. However, the increasing volume and variety of genetic and genomic data make computational speed and ease of data manipulation mandatory in future software. In our view, a collaborative effort of statistical geneticists is required to develop open-source software targeted to genetic epidemiology. Our attempt to meet this need is called the OPENMENDEL project (https://openmendel.github.io). It aims to (1) enable interactive and reproducible analyses with informative intermediate results, (2) scale to big data analytics, (3) embrace parallel and distributed computing, (4) adapt to rapid hardware evolution, (5) allow cloud computing, (6) allow integration of varied genetic data types, and (7) foster easy communication between clinicians, geneticists, statisticians, and computer scientists. This article reviews and makes recommendations to the genetic epidemiology community in the context of the OPENMENDEL project.
    Comment: 16 pages, 2 figures, 2 tables

    GPU ACCELERATION OF THE ISO–7 NUCLEAR REACTION NETWORK USING OPENCL

    We looked at the potential performance increases available through OpenCL and its parallel computing capabilities, including GPU computing as it applies to time integration of nuclear reaction networks. The particular method chosen in this work was the trapezoidal BDF-2 method using Picard iteration, which is a non-linear second-order method. Nuclear reaction network integration by itself is a sequential process and not easily accelerated via parallel computation. However, in tackling a problem like modeling supernova dynamics, a spatial discretization of the volume of the star is necessary, and in many cases it is combined with the computational technique of operator splitting. Every spatial cell then has its own reaction network independent of the others, which is where parallel computation proves useful. The particular reaction network analyzed, called the iso-7 reaction network, tracks the dynamics of 7 of the more dominant nuclides in supernovae. Computational performance was compared between the CPU and the GPU, with the GPU showing performance increases of up to 8 times. This increase was realized at small scale, because the computations were limited to running on a single device at any given time. However, these performance gains would only grow as the problem size was scaled up.
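The scheme described above can be sketched in a few lines. This is a minimal illustration on a toy linear ODE, not the thesis's iso-7 network: BDF-2 is bootstrapped with one trapezoidal step, the implicit update is solved by Picard (fixed-point) iteration, and each spatial cell is integrated independently, which is exactly the parallelism the abstract exploits on the GPU. The cell abundances and step sizes here are illustrative assumptions.

```python
import numpy as np

def bdf2_picard(f, y0, t_end, h, picard_iters=20):
    """Integrate dy/dt = f(y) with BDF-2, solving the implicit update
    by Picard (fixed-point) iteration. The first step is bootstrapped
    with the trapezoidal rule, matching the 'trapezoidal BDF-2' idea."""
    n_steps = int(round(t_end / h))
    y_prev, y_curr = y0, y0
    # bootstrap: one trapezoidal step to obtain y_1
    y_new = y_curr
    for _ in range(picard_iters):
        y_new = y_curr + 0.5 * h * (f(y_curr) + f(y_new))
    y_prev, y_curr = y_curr, y_new
    # BDF-2 steps: y_{n+1} = (4 y_n - y_{n-1})/3 + (2h/3) f(y_{n+1})
    for _ in range(n_steps - 1):
        y_new = y_curr
        for _ in range(picard_iters):
            y_new = (4.0 * y_curr - y_prev) / 3.0 + (2.0 * h / 3.0) * f(y_new)
        y_prev, y_curr = y_curr, y_new
    return y_curr

# Each spatial cell carries its own independent network state, so cells
# can be integrated in parallel; a plain loop stands in for GPU threads.
cells = np.linspace(0.5, 2.0, 8)  # illustrative per-cell initial values
result = np.array([bdf2_picard(lambda y: -y, y0, 1.0, 0.01) for y0 in cells])
```

For the linear test problem dy/dt = -y, each cell's result should approximate its initial value times exp(-1); in the OpenCL setting, the outer per-cell loop becomes the kernel launch dimension.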

    SAT-hadoop-processor: a distributed remote sensing big data processing software for earth observation applications

    Nowadays, several environmental applications take advantage of remote sensing techniques. A considerable volume of this remote sensing data arrives in near real time. Such data are diverse and are provided with high velocity and variety, their pre-processing requires large computing capacities, and a fast execution time is critical. This paper proposes new distributed software for remote sensing data pre-processing and ingestion using cloud computing technology, specifically OpenStack. The developed software discarded 86% of the unneeded daily files and removed around 20% of the erroneous and inaccurate datasets. The parallel processing reduced the total execution time by 90%. Finally, the software efficiently processed and integrated data into the Hadoop storage system, notably HDFS, HBase, and Hive.
    This research was funded by the Erasmus+ KA 107 program, and the UPC funded the APC. This work has received funding from the Spanish Government under contracts PID2019-106774RBC21, PCI2019-111851-2 (LeadingEdge CHIST-ERA), and PCI2019-111850-2 (DiPET CHIST-ERA).
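The discard-and-clean stage the abstract quantifies (86% of daily files dropped, ~20% of erroneous datasets removed) amounts to a metadata filter applied before ingestion. The sketch below assumes a hypothetical record schema (`region_overlap`, `cloud_cover`, `corrupt`); the paper's actual product metadata and thresholds are not specified here.

```python
def is_useful(record):
    """Keep only products that overlap the region of interest, have
    acceptable cloud cover, and carry no corruption flag (thresholds
    are illustrative assumptions, not the paper's values)."""
    return (record["region_overlap"] > 0.0
            and record["cloud_cover"] < 0.8
            and not record["corrupt"])

def filter_batch(records):
    """Pre-ingestion filter; in the distributed system this runs per
    worker before data ever reaches HDFS/HBase/Hive."""
    return [r for r in records if is_useful(r)]

records = [
    {"region_overlap": 0.9, "cloud_cover": 0.1, "corrupt": False},
    {"region_overlap": 0.0, "cloud_cover": 0.2, "corrupt": False},  # outside region
    {"region_overlap": 0.7, "cloud_cover": 0.95, "corrupt": False}, # too cloudy
    {"region_overlap": 0.8, "cloud_cover": 0.3, "corrupt": True},   # corrupted file
]
kept = filter_batch(records)
```

Filtering early is what makes the downstream Hadoop ingestion cheap: only the one useful record above would be written to storage.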

    A visual programming model to implement coarse-grained DSP applications on parallel and heterogeneous clusters

    Digital signal processing (DSP) applications are among the biggest consumers of computing. They process large data volumes represented with high accuracy, they use complex algorithms, and in most cases they must satisfy time constraints. On the other hand, it is now necessary to use parallel and heterogeneous architectures in order to speed up the processing; the best examples are the supercomputers "Tianhe-2" and "Titan" from the TOP500 ranking. These architectures can contain several connected nodes, where each node includes a number of general-purpose processors (multi-core) and a number of accelerators (many-core), finally allowing several levels of parallelism. However, for DSP programmers it is still complicated to exploit all these parallelism levels to reach good performance for their applications. They have to design their implementation to take advantage of all heterogeneous computing units, taking into account the architectural specificities of each of them: communication model, memory management, data management, job scheduling, synchronization, etc. In the present work, we characterize DSP applications and, based on their distinctiveness, we propose a high-level visual programming model and an execution model in order to simplify their implementation while achieving the desired performance.
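The coarse-grained model described above treats each DSP operation as a task and exploits data parallelism across independent sample blocks. This is a minimal sketch of that idea under stated assumptions: the two stages (`window`, `energy`) are invented stand-ins, not the paper's visual-model operators, and a thread pool stands in for the heterogeneous node/accelerator mapping.

```python
from concurrent.futures import ThreadPoolExecutor

def window(block):
    """Stand-in for an element-wise DSP stage (e.g. windowing)."""
    return [x * 0.5 for x in block]

def energy(block):
    """Stand-in for a per-block reduction stage."""
    return sum(x * x for x in block)

# Independent blocks of samples: each can be processed in parallel,
# which is the coarse-grained parallelism the model targets.
blocks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
with ThreadPoolExecutor(max_workers=3) as pool:
    windowed = list(pool.map(window, blocks))  # stage 1, data-parallel
    energies = list(pool.map(energy, windowed))  # stage 2, data-parallel
```

In a real heterogeneous deployment, the scheduler would map some stages to multi-core CPUs and others to accelerators; the dataflow structure of the program stays the same.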