353 research outputs found

Molecular dynamics recipes for genome research

Author: Biagini Tommaso
Capocefalo Daniele
Castellana Stefano
Chillemi Giovanni
Fusilli Caterina
Grottesi Alessandro
Mazza Tommaso
Mazzoccoli Gianluigi
Vescovi Angelo Luigi
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

Molecular dynamics (MD) simulation allows one to predict the time evolution of a system of interacting particles. It is widely used in physics, chemistry and biology to address specific questions about the structural properties and dynamical mechanisms of model systems. MD has had great success in genome research, where it has proved beneficial in sorting pathogenic from neutral genomic mutations. Given their computational requirements, simulations are commonly performed on high-performance computing (HPC) systems, which are generally expensive and hard to administer. However, variables such as the software tool used for modeling and simulation or the size of the molecule under investigation might make one hardware type or configuration more advantageous than another, or even make commodity hardware entirely suitable for MD studies. This work aims to shed light on this aspect.
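To make "predicting the time evolution of a system of interacting particles" concrete, here is a minimal sketch of one common MD integrator, velocity Verlet, applied to a single particle on a spring. The abstract does not say which integrator or software the authors used; the function names, the harmonic force, and all parameter values below are illustrative assumptions.

```python
import math


def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance position x and velocity v by `steps` velocity-Verlet steps."""
    f = force(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
    return x, v


# Hypothetical toy system: unit mass on a unit-stiffness spring.
K_SPRING = 1.0
MASS = 1.0


def spring_force(x):
    return -K_SPRING * x


def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x


x0, v0 = 1.0, 0.0
dt, steps = 0.01, 1000
x1, v1 = velocity_verlet(x0, v0, spring_force, MASS, dt, steps)

# For this toy system the exact trajectory is x(t) = cos(t).
exact_x = math.cos(dt * steps)
energy_drift = abs(total_energy(x1, v1) - total_energy(x0, v0))
```

Production MD codes apply the same update to many atoms at once, and the force evaluation dominates the cost, which is why the choice of hardware matters.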

Unitus DSpace

Archivio della ricerca- Università di Roma La Sapienza

Improving Structural Features Prediction in Protein Structure Modeling

Author: Yaseen Ashraf
Publication venue: ODU Digital Commons
Publication date: 01/07/2014

Proteins play a vital role in the biological activities of all living species. In nature, a protein folds into a specific and energetically favorable three-dimensional structure which is critical to its biological function. Hence, there has been a great effort by researchers in both experimentally determining and computationally predicting the structures of proteins. The current experimental methods of protein structure determination are complicated, time-consuming, and expensive. On the other hand, the sequencing of proteins is fast, simple, and relatively inexpensive. Thus, the gap between the number of known sequences and the determined structures is growing, and is expected to keep expanding. Computational approaches that can generate three-dimensional protein models with high resolution are therefore attractive, due to their broad economic and scientific impacts. Accurately predicting protein structural features, such as secondary structures, disulfide bonds, and solvent accessibility, is a critical intermediate stepping stone toward ultimately obtaining correct three-dimensional models. In this dissertation, we report a set of approaches for improving the accuracy of structural features prediction in protein structure modeling. First of all, we derive a statistical model to generate context-based scores characterizing the favorability of segments of residues in adopting certain structural features. Then, together with other information such as evolutionary and sequence information, we incorporate the context-based scores in machine learning approaches to predict secondary structures, disulfide bonds, and solvent accessibility. Furthermore, we take advantage of emerging high-performance GPU computing architectures to accelerate the calculation of pairwise and high-order interactions in context-based scores. Finally, we make these prediction methods available to the public via web services and software packages.

Old Dominion University

Scientific Application Acceleration Utilizing Heterogeneous Architectures

Author: Weill Edwin
Publication venue: Clemson University Libraries
Publication date: 01/12/2014

Within the past decade, there have been substantial leaps in computer architectures to exploit the parallelism that is inherently present in many applications. The scientific community has benefited from the emergence of not only multi-core processors, but also other, less traditional architectures including general purpose graphical processing units (GPGPUs), field programmable gate arrays (FPGAs), and Intel's Many Integrated Core (MIC) architecture (i.e., Xeon Phi). The popularity of the GPGPU has increased rapidly because of its ability to perform massive amounts of parallel computation quickly and at low cost with an ease of programmability. Also, with the addition of high-level programming interfaces for these devices, technical and non-technical individuals can interface with the device and rapidly obtain improved performance for many algorithms. Many applications can take advantage of the parallelism present in distributed computing and multithreading to achieve higher levels of performance for the computationally intensive parts of the application. The work presented in this thesis implements three applications for use in a performance study of the GPGPU architecture and multi-GPGPU systems. The first application study in this research is a K-Means clustering algorithm that categorizes each data point into the closest cluster. The second algorithm implemented is a spiking neural network algorithm that is used as a computational model for machine learning. The third, and final, study is the longest common subsequences problem, which attempts to enumerate comparisons between sequences (namely, DNA sequences). The results for the aforementioned applications with varying problem sizes and architectural configurations are presented and discussed in this thesis. The K-Means clustering algorithm achieved approximately 97x speedup when utilizing an architecture consisting of 32 CPU/GPGPU pairs.
To achieve this substantial speedup, up to 750,000 data points were used with up to 30,000 centroids (means). The spiking neural network algorithm resulted in speedups of about 33x for the entire algorithm and 160x for each iteration with a two-level network with 1000 total neurons (800 excitatory and 200 inhibitory neurons). The longest common subsequences problem achieved a speedup of greater than 10x with 100 random sequences up to 500 characters in length. The maximum speedup values for each application were achieved by utilizing the GPGPU as well as multi-core devices simultaneously. The computations were scattered over multiple CPU/GPGPU pairs with the computationally intensive pieces of the algorithms offloaded onto the GPGPU device. The research in this thesis illustrates the ability to scale a heterogeneous cluster (i.e., CPUs and GPUs working collaboratively) for large-scale scientific application performance improvements. Each algorithm demonstrates slightly different types of computations and communications, which can be compared to other algorithms to predict how they would perform on an accelerator. The results show that substantial speedups can be achieved for scientific applications when utilizing the GPGPU and multi-core architectures.
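The K-Means step described in this abstract, assigning each data point to its closest cluster, can be sketched on the CPU as follows. This is a plain serial illustration, not the thesis's GPGPU implementation; the function names and the tiny data set are assumptions made for the example. Each point's nearest-centroid search is independent of the others, which is what makes this step a natural candidate for offloading to an accelerator.

```python
import math


def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its closest centroid (Euclidean)."""
    labels = []
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda k: math.dist(p, centroids[k]))
        labels.append(nearest)
    return labels


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of its members; keep it if it has none."""
    dim = len(centroids[0])
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            updated.append(old)
        else:
            updated.append(tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(dim)))
    return updated


# Hypothetical data: two well-separated groups of 2D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = [(1.0, 1.0), (9.0, 9.0)]

labels = assign_to_nearest(points, centroids)
centroids = update_centroids(points, labels, centroids)
```

A full K-Means run repeats the two functions until the labels stop changing.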

Clemson University: TigerPrints

Implementation and acceleration of neuron simulator with CUDA C

Author: Neofytou Alexandros
Νεοφύτου Αλέξανδρος
Publication date: 18/05/2020

DSpace at NTUA

Implementation and acceleration of neuron simulator with CUDA C

Author: Neofytou Alexandros
Νεοφύτου Αλέξανδρος
Publication date: 27/05/2020

DSpace at NTUA

ATOMISTIC SIMULATIONS OF INTRINSICALLY DISORDERED PROTEIN FOLDING AND DYNAMICS

Author: Gong Xiping
Publication venue: ScholarWorks@UMass Amherst
Publication date: 14/11/2023

Intrinsically disordered proteins (IDPs) are crucial in biology and human diseases, necessitating a comprehensive understanding of their structure, dynamics, and interactions. Atomistic simulations have emerged as a key tool for unraveling the molecular intricacies and establishing mechanistic insights into how these proteins facilitate diverse biological functions. However, achieving accurate simulations requires both an appropriate protein force field capable of describing the energy landscape of functionally relevant IDP conformations and sufficient conformational sampling to capture the free energy landscape of IDP dynamics. These factors are fundamental in comprehending potential IDP structures, dynamics, and interactions. I first conducted explicit solvent simulations to assess the performance of two state-of-the-art protein force fields, namely CHARMM36m and a99SB-disp, in capturing the stability of small protein-protein interactions. To evaluate their accuracy, I selected a set of 46 amino acid backbone and side chain pairs with representative configurations and computed the free energy profiles of their interactions. The results demonstrated that CHARMM36m consistently predicted stronger protein-protein interactions compared to a99SB-disp. Notably, the most significant overestimation in CHARMM36m occurred in charged pairs involving Arg and Glu side chains, with an overestimation of up to 2.9 kcal/mol. Through free energy decomposition analysis, I determined that these overestimations were primarily driven by protein-water electrostatic interactions rather than van der Waals (vdW) interactions. Consequently, these findings suggest that careful rebalancing of electrostatic interactions should be considered in the further optimization of protein force fields. 
To enhance the conformational sampling of IDPs, I developed an integrated approach that combines an improved implicit solvent model called Generalized Born with molecular volume and solvent accessible surface area (GBMV2/SA) with a multiscale enhanced sampling (MSES) technique. To make this approach more efficient, I implemented it as a standalone OpenMM plugin on Graphics Processing Units (GPUs). The results demonstrated that the GPU-GBMV2/SA model achieved numerical equivalence to the original CPU-GBMV2/SA models, while providing a remarkable ~60x speedup on a single NVIDIA TITAN X (Pascal) graphics card for molecular dynamics simulations of both folded and unstructured proteins. This significant acceleration greatly facilitated the application of the approach in biomolecular simulations. In addition, I conducted an evaluation of the reliability of GBMV2/SA models in simulating both folded and unfolded proteins. The results revealed that the GBMV2/SA model accurately describes small proteins, but its applicability is limited when it comes to larger proteins such as KID and p53-TAD proteins. This limitation can be attributed to the absence of long-range solute-solvent dispersion interactions in the model. To address this issue, I introduced a comprehensive treatment of nonpolar solvation free energy called the GBMV2/NP model. Unfortunately, the GBMV2/NP model exhibited a destabilizing effect on well-folded proteins, particularly larger ones, due to an inaccurate representation of the repulsive solvent accessible surface area (SASA) model caused by the utilization of unphysical van der Waals volume. This observation highlights the need for further improvements in accurately describing the nonpolar term in the model.

ScholarWorks@UMass Amherst

PERFORMANCE ANALYSIS AND FITNESS OF GPGPU AND MULTICORE ARCHITECTURES FOR SCIENTIFIC APPLICATIONS

Author: Bhuiyan Mohammad
Publication venue: Clemson University Libraries
Publication date: 01/12/2011

Recent trends in computing architecture development have focused on exploiting task- and data-level parallelism from applications. Major hardware vendors are experimenting with novel parallel architectures, such as the Many Integrated Core (MIC) from Intel that integrates 50 or more x86 processors on a single chip, the Accelerated Processing Unit from AMD that integrates a multicore x86 processor with a graphical processing unit (GPU), and many other initiatives that are underway from other hardware vendors. Therefore, various types of architectures are available to developers for accelerating an application. A performance model that predicts the suitability of an architecture for accelerating an application would be very helpful prior to implementation. Thus, in this research, a Fitness model that ranks the potential performance of accelerators for an application is proposed. Then the Fitness model is extended using statistical multiple regression to model both the runtime performance of accelerators and the impact of programming models on accelerator performance with a high degree of accuracy. We have validated both performance models for all the case studies. The error rate of these models, calculated using the experimental performance data, is tolerable in the high-performance computing field. In this research, to develop and validate the two performance models, we have also analyzed the performance of several multicore CPUs and GPGPU architectures and the corresponding programming models using multiple case studies. The first case study used in this research is a matrix-matrix multiplication algorithm. By varying the size of the matrix from small to very large, the performance of the multicore and GPGPU architectures is studied.
The second case study used in this research is a biological spiking neural network (SNN), implemented with four neuron models that have varying requirements for communication and computation, making them useful for performance analysis of the hardware platforms. We report and analyze the performance variation of the four popular accelerators (Intel Xeon, AMD Opteron, Nvidia Fermi, and IBM PS3) and four advanced CPU architectures (Intel 32 core, AMD 32 core, IBM 16 core, and SUN 32 core) with problem size (matrix and network size) scaling, available optimization techniques and execution configuration. This thorough analysis provides insight regarding how the performance of an accelerator is affected by problem size, optimization techniques, and accelerator configuration. We have analyzed the performance impact of four popular multicore parallel programming models, POSIX-threading, Open Multi-Processing (OpenMP), Open Computing Language (OpenCL), and Concurrency Runtime, on an Intel i7 multicore architecture; and two GPGPU programming models, Compute Unified Device Architecture (CUDA) and OpenCL, on an NVIDIA GPGPU. With the broad study conducted using a wide range of application complexity, multiple optimizations, and varying problem size, it was found that, according to their achievable performance, the programming models for the x86 processor cannot be ranked across all applications, whereas the programming models for GPGPU can be ranked conclusively. We have also qualitatively and quantitatively ranked all six programming models in terms of their perceived programming effort. The results and analysis in this research indicate, with support from the proposed performance models, that for a given hardware system the best performance for an application is obtained with a proper match of programming model and architecture.

Clemson University: TigerPrints

Radial Basis Functions: Biomedical Applications and Parallelization

Author: Liu Ke
Publication venue: UWM Digital Commons
Publication date: 01/12/2016

A radial basis function (RBF) is a real-valued function whose values depend only on the distances between an interpolation point and a set of user-specified points called centers. RBF interpolation is one of the primary methods to reconstruct functions from multi-dimensional scattered data. Its ability to generalize to arbitrary space dimensions and to provide spectral accuracy has made it particularly popular in different application areas, including but not limited to: finding numerical solutions of partial differential equations (PDEs), image processing, computer vision and graphics, and deep learning and neural networks. The present thesis discusses three applications of RBF interpolation in biomedical engineering areas: (1) Calcium dynamics modeling, in which we numerically solve a set of PDEs by using meshless numerical methods and RBF-based interpolation techniques; (2) Image restoration and transformation, where an image is restored from its triangular mesh representation or transformed from its original form under translation, rotation, scaling, etc.; (3) Porous structure design, in which RBF interpolation is used to reconstruct a 3D volume containing porous structures from a set of regularly or randomly placed points inside a user-provided surface shape. All three applications have been investigated and their effectiveness has been supported with numerous experimental results. In particular, we innovatively utilize anisotropic distance metrics to define the distance in RBF interpolation and apply them to the aforementioned second and third applications, which show significant improvement in preserving image features or capturing connected porous structures over the isotropic distance-based RBF method. Besides the algorithm designs and their applications in biomedical areas, we also explore several common parallelization techniques (including OpenMP and CUDA-based GPU programming) to accelerate the performance of the present algorithms.
In particular, we analyze how parallel programming can help RBF interpolation to speed up the meshless PDE solver as well as image processing. While RBF has been widely used in various science and engineering fields, the current thesis is expected to spark further interest in this fast-growing area among computational scientists and students, and to encourage applying these techniques to biomedical problems such as the ones investigated in the present work.
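As an illustration of the RBF interpolation described in this abstract, the sketch below fits weights so that a sum of radial kernels placed at the centers reproduces the given values, then evaluates the interpolant at a new point. The Gaussian kernel, the shape parameter `eps`, the isotropic Euclidean distance, and the small dense solver are assumptions chosen for the example; the thesis's anisotropic metrics and parallel implementations are not reproduced here.

```python
import math


def gaussian(r, eps=1.0):
    """Gaussian radial kernel: depends only on the distance r."""
    return math.exp(-(eps * r) ** 2)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rbf_fit(centers, values, eps=1.0):
    """Find weights w such that sum_j w_j * phi(|c_i - c_j|) = values_i."""
    a = [[gaussian(math.dist(ci, cj), eps) for cj in centers] for ci in centers]
    return solve(a, values)


def rbf_eval(x, centers, weights, eps=1.0):
    """Evaluate the interpolant at point x."""
    return sum(w * gaussian(math.dist(x, c), eps)
               for w, c in zip(weights, centers))


# Hypothetical scattered data: samples of f(x, y) = x + y on the unit square.
centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [0.0, 1.0, 1.0, 2.0]

weights = rbf_fit(centers, values)
midpoint_estimate = rbf_eval((0.5, 0.5), centers, weights)
```

Building the kernel matrix and evaluating the interpolant at many points are both loops over independent entries, which is where OpenMP or CUDA parallelization of the kind mentioned in the abstract would typically apply.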

University of Wisconsin-Milwaukee

Efficient Algorithms And Optimizations For Scientific Computing On Many-Core Processors

Author: Rushaidat Kamel
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2015

Designing efficient algorithms for many-core and multicore architectures requires using different strategies to allow for the best exploitation of the hardware resources on those architectures. Researchers have ported many scientific applications to modern many-core and multicore parallel architectures, and by doing so they have achieved significant speedups over running on single CPU cores. While many applications have achieved significant speedups, some applications still require more effort to accelerate due to their inherently serial behavior. One class of applications that has this serial behavior is the Monte Carlo simulations. Monte Carlo simulations have been used to simulate many problems in statistical physics and statistical mechanics that were not possible to simulate using Molecular Dynamics. While there are a fair number of well-known and recognized GPU Molecular Dynamics codes, the existing Monte Carlo ensemble simulations have not been ported to the GPU, so they are relatively slow and could not run large systems in a reasonable amount of time. Due to the previously mentioned shortcomings of existing Monte Carlo ensemble codes and due to the interest of researchers to have a fast Monte Carlo simulation framework that can simulate large systems, a new GPU framework called GOMC is implemented to simulate different particle and molecular-based force fields and ensembles. GOMC simulates different Monte Carlo ensembles such as the canonical, grand canonical, and Gibbs ensembles. This work describes many challenges in developing a GPU Monte Carlo code for such ensembles and how I addressed these challenges. This work also describes efficient many-core and multicore large-scale energy calculations for Monte Carlo Gibbs ensemble using cell lists. Designing Monte Carlo molecular simulations is challenging as they have less computation and parallelism when compared to similar molecular dynamics applications. 
The modified cell list allows for more speedup gains for energy calculations on both many-core and multicore architectures when compared to other implementations without using the conventional cell lists. The work presents results and analysis of the cell list algorithms for each one of the parallel architectures using top-of-the-line GPUs, CPUs, and Intel’s Phi coprocessors. In addition, the work evaluates the performance of the cell list algorithms for different problem sizes and different radial cutoffs. This work also evaluates two cell list approaches, a hybrid MPI+OpenMP approach and a hybrid MPI+CUDA approach. The cell list methods are evaluated on a small cluster of multicore CPUs, Intel Phi coprocessors, and GPUs. The performance results are evaluated using different combinations of MPI processes, threads, and problem sizes. Another application presented in this dissertation involves the understanding of the properties of crystalline materials, and their design and control. Recent developments include the introduction of new models to simulate system behavior and properties that are of large experimental and theoretical interest. One of those models is the Phase-Field Crystal (PFC) model. The PFC model has enabled researchers to simulate 2D and 3D crystal structures and study defects such as dislocations and grain boundaries. In this work, GPUs are used to accelerate the computation of various dynamic properties of polycrystals in the 2D PFC model. Some properties require very intensive computation that may involve hundreds of thousands of atoms. The GPU implementation has achieved significant speedups of more than 46 times for some large-system simulations.
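For readers unfamiliar with cell lists, the sketch below shows the conventional idea in its simplest form: bin particles into cubic cells whose side equals the cutoff, then search for interacting pairs only in a particle's own cell and the adjacent cells. This is a generic serial version without periodic boundaries, not the modified cell list or the GOMC code described in the abstract; the function names and sample coordinates are assumptions.

```python
import math
from collections import defaultdict
from itertools import product


def build_cells(positions, cutoff):
    """Map each cell index (one integer per axis) to the particles inside it."""
    cells = defaultdict(list)
    for i, p in enumerate(positions):
        key = tuple(int(math.floor(c / cutoff)) for c in p)
        cells[key].append(i)
    return cells


def pairs_within_cutoff(positions, cutoff):
    """Return all index pairs (i, j), i < j, closer than or equal to cutoff."""
    cells = build_cells(positions, cutoff)
    dim = len(positions[0])
    offsets = list(product((-1, 0, 1), repeat=dim))  # own cell plus neighbors
    found = set()
    for key, members in cells.items():
        for off in offsets:
            neighbor = tuple(k + o for k, o in zip(key, off))
            # .get avoids creating empty cells while iterating over the dict
            for i in members:
                for j in cells.get(neighbor, ()):
                    if i < j and math.dist(positions[i], positions[j]) <= cutoff:
                        found.add((i, j))
    return found


def brute_force_pairs(positions, cutoff):
    """Reference O(N^2) search used to check the cell-list result."""
    n = len(positions)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(positions[i], positions[j]) <= cutoff}


# Hypothetical particle coordinates and cutoff.
positions = [(0.1, 0.1, 0.1), (0.9, 0.1, 0.1), (2.5, 2.5, 2.5),
             (2.6, 2.4, 2.5), (5.0, 5.0, 5.0)]
cutoff = 1.0

close_pairs = pairs_within_cutoff(positions, cutoff)
```

Because each particle is compared only against nearby cells, the pair search scales roughly linearly with the number of particles at fixed density, rather than quadratically as in the brute-force check.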

Digital Commons@Wayne State University