113 research outputs found

    Towards Energy Efficiency in Heterogeneous Processors: Findings on Virtual Screening Methods

    Get PDF
    The integration of the latest breakthroughs in computational modeling and high performance computing (HPC) has leveraged advances in the fields of healthcare and drug discovery, among others. By integrating all these developments together, scientists are creating new exciting personal therapeutic strategies for living longer that were unimaginable not that long ago. However, we are witnessing the biggest revolution in HPC in the last decade. Several graphics processing unit architectures have established their niche in the HPC arena but at the expense of an excessive power and heat. A solution for this important problem is based on heterogeneity. In this paper, we analyze power consumption on heterogeneous systems, benchmarking a bioinformatics kernel within the framework of virtual screening methods. Cores and frequencies are tuned to further improve the performance or energy efficiency on those architectures. Our experimental results show that targeted low‐cost systems are the lowest power consumption platforms, although the most energy efficient platform and the best suited for performance improvement is the Kepler GK110 graphics processing unit from Nvidia by using compute unified device architecture. Finally, the open computing language version of virtual screening shows a remarkable performance penalty compared with its compute unified device architecture counterpart.Ingeniería, Industria y Construcció

    Acceleration and Verification of Virtual High-throughput Multiconformer Docking

    Get PDF
    The work in this dissertation explores the use of massive computational power available through modern supercomputers as a virtual laboratory to aid drug discovery. As of November 2013, Tianhe-2, the fastest supercomputer in the world, has a theoretical performance peak of 54,902 TFlop/s or nearly 55 thousand trillion calculations per second. The Titan supercomputer located at Oak Ridge National Laboratory has 560,640 computing cores that can work in parallel to solve scientific problems. In order to harness this computational power to assist in drug discovery, tools are developed to aid in the preparation and analysis of high-throughput virtual docking screens, a tool to predict how and how well small molecules bind to disease associated proteins and potentially serve as a novel drug candidate. Methods and software for performing large screens are developed that run on high-performance computer systems. The future potential and benefits of using these tools to study polypharmacology and revolutionizing the pharmaceutical industry are also discussed

    Enhancing large-scale docking simulation on heterogeneous systems: An MPI vs rCUDA study

    Full text link
    [EN] Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with pharmacological targets, thus accelerating the slow and critical process of finding new drugs. VS methods screen large databases of chemical compounds to find a candidate that interacts with a given target. The computational requirements of VS models, along with the size of the databases, containing up to millions of biological macromolecular structures, means computer clusters are a must. However, programming current clusters of computers is no easy task, as they have become heterogeneous and distributed systems where various programming models need to be used together to fully leverage their resources. This paper evaluates several strategies to provide peak performance to a GPU-based molecular docking application called METADOCK in heterogeneous clusters of computers based on CPU and NVIDIA Graphics Processing Units (GPUs). Our developments start with an OpenMP, MPI and CUDA METADOCK version as a baseline case of cluster utilization. Next, we explore the virtualized GPUs provided by the rCUDA framework in order to facilitate the programming process. rCUDA allows us to use remote GPUs, i.e. installed in other nodes of the cluster, as if they were installed in the local node, so enabling access to them using only OpenMP and CUDA. Finally, several load balancing strategies are analyzed in a search to enhance performance. Our results reveal that the use of middleware like rCUDA is a convincing alternative to leveraging heterogeneous clusters, as it offers even better performance than traditional approaches and also makes it easier to program these emerging clusters.This work is jointly supported by the Fundacion Seneca (Agencia Regional de Ciencia y Tecnologia, Region de Murcia) under grant 18946/JLI/13, and by the Spanish MEC and European Commission FEDER under grants TIN2015-66972-C5-3-R and TIN2016-78799-P (AEI/FEDER, UE). We also thank NVIDIA for hardware donation under GPU Educational Center 2014-2016 and Research Center 2015-2016. Furthermore, researchers from Universitat Politecnica de Valencia are supported by the Generalitat Valenciana under Grant PROMETEO/2017/077. Authors are also grateful for the generous support provided by Mellanox Technologies Inc.Imbernón, B.; Prades Gasulla, J.; Gimenez Canovas, D.; Cecilia, JM.; Silla Jiménez, F. (2018). Enhancing large-scale docking simulation on heterogeneous systems: An MPI vs rCUDA study. Future Generation Computer Systems. 79:26-37. https://doi.org/10.1016/j.future.2017.08.050S26377

    Scheduling and Tuning Kernels for High-performance on Heterogeneous Processor Systems

    Get PDF
    Accelerated parallel computing techniques using devices such as GPUs and Xeon Phis (along with CPUs) have proposed promising solutions of extending the cutting edge of high-performance computer systems. A significant performance improvement can be achieved when suitable workloads are handled by the accelerator. Traditional CPUs can handle those workloads not well suited for accelerators. Combination of multiple types of processors in a single computer system is referred to as a heterogeneous system. This dissertation addresses tuning and scheduling issues in heterogeneous systems. The first section presents work on tuning scientific workloads on three different types of processors: multi-core CPU, Xeon Phi massively parallel processor, and NVIDIA GPU; common tuning methods and platform-specific tuning techniques are presented. Then, analysis is done to demonstrate the performance characteristics of the heterogeneous system on different input data. This section of the dissertation is part of the GeauxDock project, which prototyped a few state-of-art bioinformatics algorithms, and delivered a fast molecular docking program. The second section of this work studies the performance model of the GeauxDock computing kernel. Specifically, the work presents an extraction of features from the input data set and the target systems, and then uses various regression models to calculate the perspective computation time. This helps understand why a certain processor is faster for certain sets of tasks. It also provides the essential information for scheduling on heterogeneous systems. In addition, this dissertation investigates a high-level task scheduling framework for heterogeneous processor systems in which, the pros and cons of using different heterogeneous processors can complement each other. Thus a higher performance can be achieve on heterogeneous computing systems. A new scheduling algorithm with four innovations is presented: Ranked Opportunistic Balancing (ROB), Multi-subject Ranking (MR), Multi-subject Relative Ranking (MRR), and Automatic Small Tasks Rearranging (ASTR). The new algorithm consistently outperforms previously proposed algorithms with better scheduling results, lower computational complexity, and more consistent results over a range of performance prediction errors. Finally, this work extends the heterogeneous task scheduling algorithm to handle power capping feature. It demonstrates that a power-aware scheduler significantly improves the power efficiencies and saves the energy consumption. This suggests that, in addition to performance benefits, heterogeneous systems may have certain advantages on overall power efficiency

    Accelerating the pace of protein functional annotation with intel xeon phi coprocessors

    Get PDF
    © 2002-2011 IEEE. Intel Xeon Phi is a new addition to the family of powerful parallel accelerators. The range of its potential applications in computationally driven research is broad; however, at present, the repository of scientific codes is still relatively limited. In this study, we describe the development and benchmarking of a parallel version of {\mmb e}FindSite, a structural bioinformatics algorithm for the prediction of ligand-binding sites in proteins. Implemented for the Intel Xeon Phi platform, the parallelization of the structure alignment portion of {\mmb e}FindSite using pragma-based OpenMP brings about the desired performance improvements, which scale well with the number of computing cores. Compared to a serial version, the parallel code runs 11.8 and 10.1 times faster on the CPU and the coprocessor, respectively; when both resources are utilized simultaneously, the speedup is 17.6. For example, ligand-binding predictions for 501 benchmarking proteins are completed in 2.1 hours on a single Stampede node equipped with the Intel Xeon Phi card compared to 3.1 hours without the accelerator and 36.8 hours required by a serial version. In addition to the satisfactory parallel performance, porting existing scientific codes to the Intel Xeon Phi architecture is relatively straightforward with a short development time due to the support of common parallel programming models by the coprocessor. The parallel version of {\mmb e}FindSite is freely available to the academic community at www.brylinski.org/efindsite

    High-Throughput parallel blind Virtual Screening using BINDSURF

    Full text link
    corecore