661 research outputs found
A Performance/Cost Model for a CUDA Drug Discovery Application on Physical and Public Cloud Infrastructures
Virtual Screening (VS) methods can considerably aid drug discovery research by predicting how ligands interact with drug targets. BINDSURF is an efficient and fast blind VS methodology for the determination of protein binding sites, depending on the ligand, using the massively parallel architecture of graphics processing units (GPUs) for fast unbiased prescreening of large ligand databases. In this contribution, we provide a performance/cost model for the execution of this application on both local systems and public cloud infrastructures. With our model, it is possible to determine which infrastructure is best in terms of execution time and cost for any given problem to be solved by BINDSURF. Conclusions obtained from our study can be extrapolated to other GPU-based VS methodologies.
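To make the kind of trade-off this abstract describes concrete, below is a minimal sketch in Python of a linear performance/cost comparison between a local GPU system and a public cloud. Every throughput figure, price, and name in it is an illustrative assumption, not a value or interface from the paper.

# A minimal sketch, assuming a simple linear throughput model in the
# spirit of the abstract: estimate execution time and cost for a
# screening campaign on a local GPU system versus a public cloud,
# then compare. All rates and prices are illustrative assumptions.

LOCAL_LIGANDS_PER_HOUR = 50_000   # assumed local GPU throughput
CLOUD_LIGANDS_PER_HOUR = 40_000   # assumed per-instance cloud throughput
LOCAL_COST_PER_HOUR = 0.35        # assumed amortised hardware/energy cost (USD)
CLOUD_PRICE_PER_HOUR = 0.90       # assumed GPU-instance price (USD)

def local_run(n_ligands):
    """Return (hours, cost) for screening n_ligands locally."""
    hours = n_ligands / LOCAL_LIGANDS_PER_HOUR
    return hours, hours * LOCAL_COST_PER_HOUR

def cloud_run(n_ligands, instances=4):
    """Return (hours, cost) for screening n_ligands on the cloud."""
    hours = n_ligands / (CLOUD_LIGANDS_PER_HOUR * instances)
    return hours, hours * instances * CLOUD_PRICE_PER_HOUR

n = 2_000_000  # hypothetical ligand-database size
for name, (t, c) in [("local", local_run(n)), ("cloud", cloud_run(n))]:
    print(f"{name}: {t:.1f} h, ${c:.2f}")

Under these assumed numbers the cloud finishes faster but costs more per campaign, which is exactly the break-even question such a model is built to answer.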
Distributed Computing in a Pandemic: A Review of Technologies Available for Tackling COVID-19
The current COVID-19 global pandemic caused by the SARS-CoV-2 betacoronavirus has resulted in over a million deaths and is having a grave socio-economic impact, hence there is an urgency to find solutions to key research challenges. Much of this COVID-19 research depends on distributed computing. In this article, I review distributed architectures -- various types of clusters, grids and clouds -- that can be leveraged to perform these tasks at scale, at high throughput, with a high degree of parallelism, and which can also be used to work collaboratively. High-performance computing (HPC) clusters will be used to carry out much of this work. Several big-data processing tasks used in reducing the spread of SARS-CoV-2 require high-throughput approaches and a variety of tools, which Hadoop and Spark offer, even on commodity hardware. Extremely large-scale COVID-19 research has also utilised some of the world's fastest supercomputers, such as IBM's SUMMIT -- for ensemble-docking high-throughput screening against SARS-CoV-2 targets for drug repurposing, and high-throughput gene analysis -- and Sentinel, an XPE-Cray based system used to explore natural products. Grid computing has facilitated the formation of the world's first Exascale grid computer. This has accelerated COVID-19 research in molecular dynamics simulations of SARS-CoV-2 spike protein interactions through massively parallel computation, performed with over 1 million volunteer computing devices using the Folding@home platform. Both grids and clouds can also be used for international collaboration by enabling access to important datasets and providing services that allow researchers to focus on research rather than on time-consuming data-management tasks. Comment: 21 pages (15 excl. refs), 2 figures, 3 tables
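As an illustration of the high-throughput, commodity-hardware processing the abstract credits to Hadoop and Spark, below is a minimal PySpark sketch that counts k-mers across a large set of sequencing reads. The input path, file layout, and k-mer length are hypothetical; this is not code from the reviewed article.

# A minimal sketch of a high-throughput Spark task: counting k-mers
# over many sequencing reads. Path and k are illustrative assumptions.
from pyspark.sql import SparkSession

K = 21  # assumed k-mer length

def kmers(read, k=K):
    """Return every k-length substring of one read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

spark = SparkSession.builder.appName("kmer-count-sketch").getOrCreate()
sc = spark.sparkContext

# Hypothetical input: one DNA read per line.
reads = sc.textFile("hdfs:///data/reads.txt")

counts = (reads.flatMap(kmers)                    # explode reads into k-mers
               .map(lambda kmer: (kmer, 1))       # pair each k-mer with 1
               .reduceByKey(lambda a, b: a + b))  # sum counts per k-mer

# Print the ten most frequent k-mers.
for kmer, n in counts.takeOrdered(10, key=lambda kv: -kv[1]):
    print(kmer, n)

spark.stop()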
High-Performance Modelling and Simulation for Big Data Applications
This open access book was prepared as a Final Publication of the COST Action IC1406 "High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)" project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.
- …