
    Searching for Globally Optimal Functional Forms for Inter-Atomic Potentials Using Parallel Tempering and Genetic Programming

    We develop a Genetic Programming-based methodology that enables the discovery of novel functional forms for the classical inter-atomic force fields used in molecular dynamics simulations. Unlike previous efforts in the field, which fit only the parameters of fixed functional forms, we use a novel algorithm to search the space of possible functional forms itself. While a follow-on practical procedure will use experimental and ab initio data to find an optimal functional form for a force field, we first validate the approach using a manufactured solution, which has the advantage of a well-defined metric of success. We manufactured a training set of atomic coordinates with an associated set of global energies using the well-known Lennard-Jones inter-atomic potential. We then performed an automatic functional-form fitting procedure, starting from a population of random functions, using a genetic programming formulation and a parallel tempering Metropolis-based optimization algorithm. Our massively parallel method independently discovered the Lennard-Jones function after searching for several hours on 100 processors while covering only a minuscule portion of the configuration space. We find that the method is suitable for unsupervised discovery of functional forms for inter-atomic potentials/force fields. We also find that the parallel tempering Metropolis-based approach significantly improves optimization convergence time and takes good advantage of the parallel cluster architecture.
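
    As a rough illustration of the validation setup described in this abstract, the sketch below manufactures Lennard-Jones reference energies and shows one way a parallel tempering Metropolis step could decide whether to swap candidate functional forms between temperature levels. The function names, parameters, and loss definition are illustrative assumptions, not the authors' code.

```python
import numpy as np

def lennard_jones_energy(coords, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy of one configuration (N x 3 array of positions)."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = np.linalg.norm(coords[i] - coords[j])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return energy

def training_loss(candidate_fn, configurations, target_energies):
    """Sum of squared errors of a candidate functional form on the manufactured training set."""
    predicted = np.array([candidate_fn(c) for c in configurations])
    return float(np.sum((predicted - target_energies) ** 2))

def accept_replica_swap(loss_cold, loss_hot, temp_cold, temp_hot, rng):
    """Standard parallel-tempering Metropolis criterion for exchanging two replicas,
    treating the training loss as the 'energy' being annealed."""
    delta = (1.0 / temp_cold - 1.0 / temp_hot) * (loss_cold - loss_hot)
    return delta >= 0.0 or rng.random() < np.exp(delta)

# Manufactured training data: random clusters scored with the known LJ potential.
rng = np.random.default_rng(0)
configs = [rng.uniform(0.9, 3.0, size=(6, 3)) for _ in range(50)]
targets = np.array([lennard_jones_energy(c) for c in configs])
```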

    Running parallel applications on a heterogeneous environment with accessible development practices and automatic scalability

    Grid computing makes it possible to gather large quantities of resources to work on a problem. To exploit this potential, a framework is needed that presents the resources to the user programmer in a form that maintains productivity. The framework must not only provide accessible development, it must also make efficient use of the resources. The Seeds framework is proposed. It uses current Grid and distributed-computing middleware to provide a parallel programming environment to a wider community of programmers. The framework was used to investigate the feasibility of scaling skeleton/pattern parallel programming onto Grid computing. The research accomplished two goals: it made parallel programming on the Grid more accessible to domain-specific programmers, and it made parallel programs scale on a heterogeneous resource environment. Programming is made easier for the programmer by using skeleton- and pattern-based approaches that effectively isolate the program from the environment. To extend the pattern approach, the pattern adder operator is proposed, implemented, and tested. The results show that the pattern operator can reduce the number of lines of code compared with an MPJ-Express implementation of a stencil algorithm, while incurring an overhead of at most ten microseconds per iteration. The research on scalability involved adapting existing load-balancing techniques to skeletons and patterns, requiring little additional configuration on the part of the programmer. The hierarchical dependency concept is also proposed, which uses a streamed data-flow programming model. The concept introduces data-flow computation hibernation and dependencies that can split to accommodate additional processors. The results from implementing skeletons/patterns on hierarchical dependencies show that an 18.23% increase in code is necessary to enable automatic scalability. The concept can increase speedup depending on the algorithm and grain size.
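
    The skeleton/pattern idea can be pictured with a small stand-in: the user supplies only the per-cell update rule, and the pattern hides iteration and (in the real framework) distribution and load balancing. The sketch below is a single-process Python analogue for illustration only; Seeds itself is a Java framework and this is not its API.

```python
import numpy as np

class StencilPattern:
    """Toy stand-in for a skeleton/pattern: the framework owns the loop structure,
    the user owns only the per-cell update function."""

    def __init__(self, update_cell):
        self.update_cell = update_cell  # user code, isolated from the environment

    def run(self, grid, iterations):
        current = np.array(grid, dtype=float)
        for _ in range(iterations):
            nxt = current.copy()
            for i in range(1, current.shape[0] - 1):
                for j in range(1, current.shape[1] - 1):
                    neighbors = (current[i - 1, j], current[i + 1, j],
                                 current[i, j - 1], current[i, j + 1])
                    nxt[i, j] = self.update_cell(current[i, j], neighbors)
            current = nxt
        return current

# Example "user code": a Jacobi-style relaxation with a hot top boundary.
grid = np.zeros((8, 8))
grid[0, :] = 100.0
heat = StencilPattern(lambda center, nbrs: 0.25 * sum(nbrs))
print(heat.run(grid, iterations=20))
```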

    GronOR: Massively parallel and GPU-accelerated non-orthogonal configuration interaction for large molecular systems

    GronOR is a program package for non-orthogonal configuration interaction calculations for an electronic wave function built in terms of anti-symmetrized products of multi-configuration molecular fragment wave functions. The two-electron integrals that have to be processed may be expressed in terms of atomic orbitals or in terms of an orbital basis determined from the molecular orbitals of the fragments. The code has been specifically designed for execution on distributed-memory, massively parallel, and Graphics Processing Unit (GPU)-accelerated computer architectures, using an MPI+OpenACC/OpenMP programming approach. The task-based execution model used in the implementation allows for linear scaling with the number of nodes on the largest pre-exascale architectures available, provides hardware fault resiliency, and enables effective execution on systems with distinct central processing unit (CPU)-only and GPU-accelerated partitions. The code interfaces with existing multi-configuration electronic structure codes that provide the optimized molecular fragment orbitals, configuration interaction coefficients, and required integrals. Algorithm and implementation details, parallel and accelerated performance benchmarks, and an analysis of the sensitivity of the accuracy of the results and of the computational performance to the thresholds used in the calculations are presented.
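
    The task-based execution model can be sketched as a master/worker farm in which independent matrix-element tasks are handed out on demand, which is also what makes fault resiliency and mixed CPU/GPU partitions tractable. The Python/mpi4py code below is a conceptual stand-in under our own simplifications; GronOR's actual implementation uses MPI+OpenACC/OpenMP, and compute_element here is a placeholder.

```python
# Run with e.g.: mpiexec -n 4 python taskfarm.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
TASK, STOP = 1, 2

def compute_element(i, j):
    # Placeholder for evaluating one Hamiltonian/overlap matrix-element pair.
    return (i, j, float(i * j))

if rank == 0:
    n_states = 8
    tasks = [(i, j) for i in range(n_states) for j in range(i, n_states)]
    results, active = [], 0
    # Seed every worker with one task, then hand out the rest on demand.
    for w in range(1, comm.Get_size()):
        if tasks:
            comm.send(tasks.pop(), dest=w, tag=TASK)
            active += 1
        else:
            comm.send(None, dest=w, tag=STOP)
    while active:
        status = MPI.Status()
        results.append(comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status))
        if tasks:
            comm.send(tasks.pop(), dest=status.Get_source(), tag=TASK)
        else:
            comm.send(None, dest=status.Get_source(), tag=STOP)
            active -= 1
    print(f"collected {len(results)} matrix elements")
else:
    while True:
        status = MPI.Status()
        task = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == STOP:
            break
        comm.send(compute_element(*task), dest=0, tag=TASK)
```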

    CodeCloud: A platform to enable execution of programming models on the Clouds

    This paper presents a platform that supports the execution of scientific applications covering different programming models (such as Master/Slave, Parallel/MPI, MapReduce, and Workflows) on Cloud infrastructures. The platform includes (i) a high-level declarative language to express the requirements of the applications, featuring software customization at runtime; (ii) an approach based on virtual containers to encapsulate the logic of the different programming models; (iii) an infrastructure manager to interact with different IaaS back ends; (iv) configuration software to dynamically configure the provisioned resources; and (v) a catalog and repository of virtual machine images. By using this platform, an application developer can adapt, deploy, and execute parallel applications agnostic to the Cloud back end. The authors wish to thank the financial support received from the Spanish Ministry of Economy and Competitiveness for the project "Servicios avanzados para el despliegue y contextualizacion de aplicaciones virtualizadas para dar soporte a modelos de programacion en entornos cloud", reference TIN2010-17804.
    Caballer Fernández, M.; Alfonso Laguna, C. D.; Moltó, G.; Romero Alcalde, E.; Blanquer Espert, I.; García García, A. (2014). CodeCloud: A platform to enable execution of programming models on the Clouds. Journal of Systems and Software 93:187-198. https://doi.org/10.1016/j.jss.2014.02.005
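
    To make the flow concrete, the sketch below stands in for the platform's declarative description and deployment steps: a description names the programming model, base image, software, and elasticity bounds, and a dispatcher picks the container that encapsulates that model before provisioning and configuring the resources. Every name here (keys, image identifiers, container names) is invented for illustration; CodeCloud defines its own declarative language and components.

```python
# Invented stand-in for a CodeCloud-style application description.
app_description = {
    "programming_model": "master_slave",   # master/slave, MPI, MapReduce, workflow
    "base_image": "sci-base-image",        # hypothetical image from the VM catalog
    "software": ["openmpi", "python"],     # packages configured on the VMs at runtime
    "nodes": {"min": 2, "max": 16},        # elasticity bounds for provisioning
}

# Hypothetical mapping from programming model to the virtual container
# that encapsulates that model's execution logic.
MODEL_CONTAINERS = {
    "master_slave": "containers/master-slave",
    "mpi": "containers/mpi",
    "mapreduce": "containers/mapreduce",
    "workflow": "containers/workflow",
}

def deploy(description, provision, configure):
    """Provision VMs through an IaaS back end (via `provision`), then configure
    each one (via `configure`) with the container for the chosen programming model."""
    container = MODEL_CONTAINERS[description["programming_model"]]
    hosts = provision(description["base_image"], description["nodes"]["min"])
    for host in hosts:
        configure(host, container, description["software"])
    return hosts
```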

    OpenCL Actors - Adding Data Parallelism to Actor-based Programming with CAF

    The actor model of computation has been designed for seamless support of concurrency and distribution. However, it remains unspecific about data-parallel program flows, while the available processing power of modern many-core hardware such as graphics processing units (GPUs) or coprocessors increases the relevance of data parallelism for general-purpose computation. In this work, we introduce OpenCL-enabled actors to the C++ Actor Framework (CAF). This offers a high-level interface for accessing any OpenCL device without leaving the actor paradigm. The new type of actor is integrated into the runtime environment of CAF and gives rise to transparent message passing in distributed systems on heterogeneous hardware. Following the actor logic in CAF, OpenCL kernels can be composed while encapsulated in C++ actors, and hence operate in a multi-stage fashion on data resident on the GPU. Developers are thus enabled to build complex data-parallel programs from primitives without leaving the actor paradigm or sacrificing performance. Our evaluations on commodity GPUs, an Nvidia Tesla, and an Intel Phi reveal the expected linear scaling behavior when offloading larger workloads. For sub-second duties, the efficiency of offloading was found to differ largely between devices. Moreover, our findings indicate a negligible overhead over programming with the native OpenCL API.
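
    The multi-stage composition the abstract describes (kernels chained on data that stays resident on the GPU) can be pictured with a plain OpenCL sketch. The Python/pyopencl code below is only a conceptual stand-in chosen for brevity; it is not CAF's C++ actor API, and the kernels are our own trivial examples.

```python
import numpy as np
import pyopencl as cl

kernel_src = """
__kernel void square(__global const float *in_buf, __global float *out_buf) {
    int i = get_global_id(0);
    out_buf[i] = in_buf[i] * in_buf[i];
}
__kernel void add_one(__global float *buf) {
    int i = get_global_id(0);
    buf[i] += 1.0f;
}
"""

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
prog = cl.Program(ctx, kernel_src).build()

host_in = np.arange(1024, dtype=np.float32)
mf = cl.mem_flags
dev_in = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=host_in)
dev_out = cl.Buffer(ctx, mf.READ_WRITE, size=host_in.nbytes)

# Stage 1 writes dev_out; stage 2 consumes it in place -- no host round trip,
# mirroring composed OpenCL actors operating on GPU-resident data.
prog.square(queue, host_in.shape, None, dev_in, dev_out)
prog.add_one(queue, host_in.shape, None, dev_out)

host_out = np.empty_like(host_in)
cl.enqueue_copy(queue, host_out, dev_out)
print(host_out[:5])  # [ 1.  2.  5. 10. 17.]
```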