36 research outputs found

    Measuring the effects of reaction coordinate and electronic treatments in the QM/MM reaction dynamics of Trypanosoma cruzi trans-sialidase

    Get PDF
    The free energy of activation, as defined in transition state theory, is central to calculating reaction rates, distinguishing between mechanistic paths and elucidating the catalytic process. Computational free energies are accessible through the reaction space comprising the conformational and electronic degrees of freedom orthogonal to the reaction coordinate. The overarching aim of this thesis was to address theoretical and methodological challenges facing current methods for calculating reaction free energies in glycoenzyme systems. Tractable calculations must balance chemical accuracy against sampling efficiency, which necessitates simplifying these complex reaction spaces through quantum mechanics/molecular mechanics partitioning and using a semi-empirical electronic method to sample an approximated reaction coordinate. Here I directly and indirectly interrogate both the appropriate levels of sampling and the accuracy of the semi-empirical method required for reliable analysis of glycoenzyme reaction pathways.
    Free Energies from Adaptive Reaction Coordinate Forces, a method that builds the potential of mean force from multiple iterations of reactive trajectories, was used to construct reaction surfaces and volumes for the glycosylation and deglycosylation reactions comprising the T. cruzi trans-sialidase catalytic itinerary. This enzyme was chosen for the wealth of experimental data available for it, a consequence of its significance as a potential drug target against Chagas disease. Of equal importance is the identification of an elimination reaction competing with the primary transferase activity. This side reaction, observable only in the absence of the trans-sialidase or sialic acid acceptor, presented the opportunity to study the means by which enzymes selectively bias in favor of a single reaction path. I therefore set out to explore the molecular details of how T. cruzi trans-sialidase exerts the precision and selectivity synonymous with enzyme catalysis.
    The chemical nature of the transition state, formally defined as a dividing hypersurface separating the reactant and product regions of phase space, was characterized for the deglycosylation reaction. More than 40 transition state configurations were isolated from reactive trajectories, and the sialic acid substrate conformations were analyzed along with the substrate interactions with the nucleophile and catalytic acid/base. A successful barrier crossing requires that the substrate pass through a family of E₅, ⁴H₅ and ⁶H₅ puckered conformations, all of which interact slightly differently with the enzyme. This work adds new evidence to the prevailing premise that there are several pathways from reactant to product passing through the saddle and that successful product formation is not restricted to the minimum energy path.
    Increasing the reaction space with a multi-dimensional (3-D) reaction coordinate allowed simultaneous monitoring of the hitherto unexplored competition between a minor elimination reaction and the dominant displacement reaction present in both steps of the catalytic cycle. The dominant displacement reactions display lower barriers in the free energy profiles, greater sampling of favorable reactant stereoelectronic alignments and a greater number of possible transition paths leading to successful barrier-crossing reaction trajectories.
    The effects of the electronic degrees of freedom in reaction space were then investigated by running density functional theory reactive trajectories on the semi-empirical free energy surface. To carry out these simulations, Free Energies from Adaptive Reaction Coordinate Forces was ported as a Fortran 90 library that interfaces with the NWChem molecular dynamics package. The resulting B3LYP/6-31G/CHARMM crossing trajectory provides a molecular orbital description of the glycosylation reaction. Direct investigation of the underlying potential energy functions for B3LYP/6-31G(d), B3LYP/6-31G and SCC-DFTB/MIO points to the minimal basis set as the primary limitation in using self-consistent charge density functional tight binding as the quantum mechanical model for enzymatic reactions that transform sialic acid substrates.
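    The following minimal Python sketch illustrates the general idea of building a potential of mean force iteratively from biased sampling, as the method above does from reactive trajectories. It is not the FEARCF implementation: the model double-well potential, the Metropolis walker standing in for QM/MM dynamics, and all numerical parameters are placeholders chosen only for illustration.

    # Illustrative sketch (not FEARCF): iteratively estimate a 1-D potential of
    # mean force by histogramming a reaction coordinate from biased sampling,
    # then feed the negated estimate back as the bias for the next iteration.
    import numpy as np

    rng = np.random.default_rng(0)
    kT = 0.6                                    # thermal energy (arbitrary units)
    edges = np.linspace(-2.0, 2.0, 41)          # reaction-coordinate bins
    centers = 0.5 * (edges[:-1] + edges[1:])

    def potential(x):
        return (x**2 - 1.0)**2                  # model double-well "reaction" profile

    def bias_at(x, bias):
        i = np.clip(np.searchsorted(edges, x) - 1, 0, len(centers) - 1)
        return bias[i]

    def sample(bias, n_steps=20000, x0=-1.0):
        """Metropolis walker on potential + bias; returns visited coordinates."""
        xs, x = [], x0
        for _ in range(n_steps):
            xn = x + rng.normal(scale=0.1)
            if -2.0 < xn < 2.0:
                dE = (potential(xn) + bias_at(xn, bias)) - (potential(x) + bias_at(x, bias))
                if dE <= 0 or rng.random() < np.exp(-dE / kT):
                    x = xn
            xs.append(x)
        return np.array(xs)

    bias = np.zeros(len(centers))
    for it in range(4):                          # a few adaptive iterations
        visited = sample(bias)
        hist, _ = np.histogram(visited, bins=edges)
        prob = np.maximum(hist, 1) / hist.sum()  # avoid log(0) in unvisited bins
        pmf = -kT * np.log(prob)
        pmf -= pmf.min()
        bias = -pmf                              # flatten the surface next iteration
        print(f"iteration {it}: apparent barrier ~ {pmf[len(pmf) // 2]:.2f}")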

    Pinpointing Software Inefficiencies With Profiling

    Get PDF
    Complex codebases with several layers of abstraction have abundant inefficiencies that affect performance. These inefficiencies arise from various causes such as developers' inattention to performance, inappropriate choice of algorithms and inefficient code generation, among others. Much work has been done at compile time to eliminate redundancies, but not all of them can be detected or eliminated by compiler optimization passes: aliasing, limited optimization scopes, and insensitivity to input and execution contexts are severe deterrents to static program analysis. Profiling tools can reveal how resources are used, but they struggle to distinguish whether those resources are used fruitfully. More profiling tools are needed to diagnose resource wastage and pinpoint inefficiencies. We have developed three tools that pinpoint different types of inefficiencies at different granularities.
    We built Runtime Value Numbering (RVN), a dynamic, fine-grained profiler that pinpoints and quantifies redundant computations in an execution. It is based on the classical value numbering technique but works at runtime instead of compile time. We then developed RedSpy, a fine-grained profiler that pinpoints and quantifies value redundancies in program executions. Value redundancy may happen over time at the same locations or in adjacent locations, and thus has temporal and spatial locality; RedSpy identifies both. Furthermore, RedSpy is capable of identifying values that are approximately the same, enabling optimization opportunities in HPC codes that often use floating-point computations. RVN and RedSpy are both instrumentation-based tools: they provide comprehensive results but introduce high space and time overhead.
    Our lightweight framework, Witch, samples consecutive accesses to the same memory location by exploiting two ubiquitous hardware features: performance monitoring units (PMUs) and debug registers. Witch performs no instrumentation. Hence, witchcraft - tools built atop Witch - can detect a variety of software inefficiencies while introducing negligible slowdown and insignificant memory consumption, yet maintaining accuracy comparable to exhaustive instrumentation tools. Witch allowed us to scale our analysis to a large number of codebases.
    All the tools work on fully optimized binary executables and provide insightful optimization guidance by apportioning redundancies to their provenance - source lines and full calling contexts. We applied RVN, RedSpy, and Witch to programs that have been optimization targets for decades and, guided by the tools, we were able to eliminate redundancies that resulted in significant speedups.
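    As a toy illustration of the classical value numbering idea that RVN applies at runtime (this is not the instrumentation-based tool itself, and the three-address statement format below is invented for the example), the Python sketch assigns a value number to each distinct (operator, operand value numbers) tuple and flags any statement that recomputes an existing number as redundant.

    # Toy value numbering: repeated (operator, operand value numbers) tuples
    # indicate a redundant computation of an already-available value.
    from collections import defaultdict

    class ValueNumbering:
        def __init__(self):
            self.table = {}                     # (op, operand VNs...) -> value number
            self.var_vn = {}                    # variable name -> value number
            self.next_vn = 0
            self.redundant = defaultdict(int)   # flagged statements -> count

        def _vn_of(self, operand):
            if operand in self.var_vn:          # operand produced by an earlier statement
                return self.var_vn[operand]
            key = ("leaf", operand)             # constant or program input
            if key not in self.table:
                self.table[key] = self.next_vn
                self.next_vn += 1
            return self.table[key]

        def visit(self, dest, op, *operands):
            """Record 'dest = op(operands)'; report if the value already exists."""
            key = (op,) + tuple(self._vn_of(o) for o in operands)
            if key in self.table:
                self.redundant[(dest, op, operands)] += 1
                print(f"redundant: {dest} = {op}{operands} recomputes VN {self.table[key]}")
            else:
                self.table[key] = self.next_vn
                self.next_vn += 1
            self.var_vn[dest] = self.table[key]

    vn = ValueNumbering()
    vn.visit("t1", "add", "a", "b")
    vn.visit("t2", "add", "a", "b")             # flagged: same value as t1
    vn.visit("t3", "mul", "t1", "c")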

    Quantum Chemistry Calculations for Metabolomics

    Get PDF
    A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials (“standards”), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for “standards-free” identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices, sources of error, and provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of in silico methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.
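    As a small illustration of one common post-processing step when turning quantum chemistry outputs into in silico library entries, the Python sketch below Boltzmann-weights a computed property over candidate conformers. It is a generic calculation, not one taken from the review; the energies and property values are placeholders.

    # Boltzmann-weighted average of a computed property over conformers.
    import math

    R = 8.314462618e-3            # gas constant in kJ/(mol*K)
    T = 298.15                    # temperature in K

    # (relative energy in kJ/mol, computed property for that conformer) - placeholders
    conformers = [(0.0, 151.2), (1.8, 149.7), (4.5, 153.0)]

    weights = [math.exp(-dE / (R * T)) for dE, _ in conformers]
    Z = sum(weights)
    avg_property = sum(w * p for w, (_, p) in zip(weights, conformers)) / Z
    print(f"Boltzmann-averaged property: {avg_property:.1f}")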

    Environmental Molecular Sciences Laboratory 2007 Annual Report

    Full text link

    Large Scale Computing and Storage Requirements for Basic Energy Sciences Research

    Full text link

    An extensive study on iterative solver resilience : characterization, detection and prediction

    Get PDF
    Soft errors caused by transient bit flips have the potential to significantly impact an application's behavior. This has motivated the design of an array of techniques to detect, isolate, and correct soft errors using microarchitectural, architectural, compilation-based, or application-level techniques to minimize their impact on the executing application. The first step toward the design of good error detection/correction techniques involves an understanding of an application's vulnerability to soft errors. This work focuses on the effects of silent data corruption on iterative solvers and on efforts to mitigate those effects.
    In this thesis, we first present the first comprehensive characterization of the impact of soft errors on the convergence characteristics of six iterative methods using application-level fault injection. We analyze the impact of soft errors in terms of the type of error (single- vs. multi-bit), the distribution and location of the bits affected, the data structure and statement impacted, and the variation with time. We create a publicly accessible database with more than 1.5 million fault injection results. We then analyze the performance of existing soft error detection mechanisms and present comparative results.
    Motivated by our observations, we evaluate a machine-learning-based detector whose features are the runtime observations the individual detectors use to arrive at their conclusions. Our evaluation demonstrates improved results over the individual detectors. We then propose a machine-learning-based method to predict a program's error behavior in order to make fault injection studies more efficient, and we demonstrate it by assessing the performance of soft error detectors. We show that our method maintains 84% accuracy on average with up to 53% less cost. We also show that, once a model is trained, further fault injection tests cost 10% of the expected full fault injection runs.
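    The Python sketch below illustrates application-level injection of a single transient bit flip into an iterative solver, in the spirit of the study described above. It is not the thesis framework: the Jacobi solver, the tridiagonal test matrix, and the injection point are arbitrary choices made for the example.

    # Flip one bit of the iterate mid-solve and watch the residual history.
    import numpy as np

    rng = np.random.default_rng(1)

    def flip_bit(x, bit):
        """Flip one bit in the IEEE-754 representation of a float64 value."""
        buf = np.array([x], dtype=np.float64).view(np.uint64)
        buf ^= np.uint64(1) << np.uint64(bit)
        return float(buf.view(np.float64)[0])

    n = 64
    A = (np.diag(np.full(n, 4.0))
         + np.diag(np.full(n - 1, -1.0), 1)
         + np.diag(np.full(n - 1, -1.0), -1))
    b = np.ones(n)
    D_inv = 1.0 / np.diag(A)
    R = A - np.diag(np.diag(A))

    x = np.zeros(n)
    inject_at = 30                               # iteration at which the fault strikes
    target_index, target_bit = int(rng.integers(n)), int(rng.integers(52, 63))
    for k in range(200):
        x = D_inv * (b - R @ x)                  # Jacobi update
        if k == inject_at:                       # single transient bit flip
            x[target_index] = flip_bit(x[target_index], target_bit)
        residual = np.linalg.norm(b - A @ x)
        if k % 20 == 0 or k == inject_at:
            print(f"iter {k:3d}  residual {residual:.3e}")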

    Exploring Computational Chemistry on Emerging Architectures

    Get PDF
    Emerging architectures, such as next-generation microprocessors, graphics processing units, and Intel MIC cards, are being used with increasing popularity in high performance computing. Each of these architectures has advantages over previous generations, including performance, programmability, and power efficiency. With the ever-increasing performance of these architectures, scientific computing applications can attack larger, more complicated problems. However, since applications perform differently on each of the architectures, it is difficult to determine the best tool for the job. This dissertation makes the following contributions to computer engineering and computational science. First, this work implements the computational chemistry variational path integral application, QSATS, on various architectures, ranging from microprocessors to GPUs to Intel MICs. Second, this work explores the use of analytical performance modeling to predict the runtime and scalability of the application on these architectures, allowing a comparison of the architectures when deciding which to use for a given set of program input parameters. The models presented in this dissertation are accurate to within 6%. This work combines novel approaches to the algorithm with exploration of the various architectural features to make the application perform at its peak. In addition, it expands the understanding of computational science applications and their implementation on emerging architectures while providing insight into performance, scalability, and programmer productivity.
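    The Python sketch below shows the general shape of an analytical performance model of the kind described above, decomposing predicted runtime into serial, compute, and communication terms. It is not the dissertation's model; the hardware parameters and workload figures are assumed values used only to illustrate how such a model compares architectures.

    # Simple analytical model: T = serial + compute/workers + communication.
    from dataclasses import dataclass

    @dataclass
    class Device:
        name: str
        flops: float       # assumed sustained FLOP/s for the kernel
        bandwidth: float   # assumed sustained bandwidth in bytes/s
        latency: float     # assumed per-message latency in seconds

    def predicted_runtime(dev, total_flops, bytes_per_step, messages_per_step,
                          steps, serial_seconds, workers):
        compute = total_flops / dev.flops / workers
        comm = steps * (dev.latency * messages_per_step + bytes_per_step / dev.bandwidth)
        return serial_seconds + compute + comm

    devices = [
        Device("CPU", flops=2.0e11, bandwidth=1.2e10, latency=1.5e-6),
        Device("GPU", flops=1.5e12, bandwidth=8.0e9, latency=8.0e-6),
    ]
    for dev in devices:
        for workers in (1, 4, 16):
            t = predicted_runtime(dev, total_flops=5.0e14, bytes_per_step=2.0e7,
                                  messages_per_step=32, steps=1000,
                                  serial_seconds=2.0, workers=workers)
            print(f"{dev.name:3s} x{workers:2d}: predicted {t:8.1f} s")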

    Improving scalability of large-scale distributed Spiking Neural Network simulations on High Performance Computing systems using novel architecture-aware streaming hypergraph partitioning

    Get PDF
    After theory and experimentation, modelling and simulation is regarded as the third pillar of science, helping scientists to further their understanding of complex systems. In recent years there has been a growing scientific focus on computational neuroscience as a means to understand the brain and its functions, with large international projects (Human Brain Project, Brain Activity Map, MindScope and China Brain Project) aiming to further our knowledge of high level cognitive functions. They are a testament to the enormous interest, difficulty and importance of solving the mysteries of the brain. Spiking Neural Network (SNN) simulations are widely used in the domain to facilitate experimentation.
    Scaling SNN simulations to large networks usually results in a more-than-linear increase in computational complexity. The computing resources required for brain-scale simulation far surpass the capabilities of today's personal computers. If those demands are to be met, distributed computation models need to be adopted, since improvements in individual processor speed have slowed due to physical limits on heat dissipation. This is a significant change that requires careful management of the workload at many levels: partitioning of work, communication and workload balancing, efficient inter-process communication and efficient use of available memory. If large scale neuronal network models are to run successfully, simulators must consider these factors and offer viable solutions to the challenges they pose.
    Large scale SNN simulations exhibit most of the issues that affect large distributed computations on general HPC systems. Commonly used workload distribution algorithms (round-robin, random and manual allocation) do not take into account connectivity locality, which is natural in biological networks; this can lead to increased communication requirements when distributing the simulation over multiple computing nodes. State-of-the-art SNN simulations use dense communication collectives to distribute spike data, the common method of point-to-point communication in distributed computation. Sparse communication collectives have been suggested to incur lower overheads when the application's communication pattern is sparse. In this work we characterise the bottlenecks of communication-bound SNN simulations and identify communication balance and sparsity as the main contributors to scalability. We propose hypergraph partitioning to distribute neurons across computing nodes and minimise communication (increasing sparsity). A hypergraph is a generalisation of a graph in which a (hyper)edge can link two or more vertices at once. Coupled with a novel use of sparse-aware communication collectives, computational efficiency increases by up to 40.8 percentage points and simulation time is reduced by up to 73%, compared to the round-robin allocation common in neuronal simulators.
    HPC systems have, by design, highly hierarchical communication network links, with qualitative differences in communication speed and latency between computing nodes. This can create a mismatch between the distributed simulation communication patterns and the physical capabilities of the hardware. If large distributed simulations are to take full advantage of these systems, the communication properties of the HPC system need to be taken into consideration when allocating workload, so that frequent, heavy communication is routed through fast network links.
    Strategies that consider the heterogeneous physical communication capabilities are called architecture-aware. After demonstrating that hypergraph partitioning leads to more efficient workload allocation in SNN simulations, this thesis proposes a novel sequential hypergraph partitioning algorithm that incorporates network bandwidth via profiling. This leads to a significant reduction in execution time (up to 14x speedup in synthetic benchmark simulations compared to architecture-agnostic partitioners).
    The motivating context of this work is large scale brain simulation; however, in the era of social media, large graphs and hypergraphs are increasingly relevant in many other scientific applications. A common feature of such graphs is that they are too big for a single machine to handle, in terms of both performance and memory requirements. State-of-the-art multilevel partitioners have been shown to struggle to scale to large graphs in distributed memory, not just because they take a long time to run, but also because they require full knowledge of the graph (not possible for dynamic graphs) and must fit the graph entirely in memory (not possible for very large graphs). To address those limitations we propose a parallel implementation of our architecture-aware streaming hypergraph partitioning algorithm (HyperPRAW) to model distributed applications. Results demonstrate that HyperPRAW produces consistent speedups over previous streaming approaches that only consider hyperedge overlap (up to 5.2x). Compared to a multilevel global partitioner on dense hypergraphs (those with high average cardinality), HyperPRAW produces workload allocations that speed up a synthetic simulation benchmark by up to 4.3x. HyperPRAW has the potential to scale to very large hypergraphs, as it only requires local information to make allocation decisions and has an order of magnitude smaller memory footprint than global partitioners.
    The combined contributions of this thesis lead to a novel, parallel, scalable, streaming hypergraph partitioning algorithm (HyperPRAW) that can be used to help scale large distributed simulations on HPC systems. HyperPRAW helps tackle three of the main scalability challenges: it produces highly balanced distributed computation and communication, minimising idle time between computing nodes; it reduces communication overhead by placing frequently communicating simulation elements close to each other (where the communication cost is minimal); and it provides a solution with a reasonable memory footprint that allows tackling larger problems than state-of-the-art alternatives such as global multilevel partitioning.
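    The Python sketch below shows a generic single-pass streaming hypergraph partitioning heuristic in the spirit of the approach described above. It is not the HyperPRAW implementation: the scoring function, the imbalance penalty, and the pairwise cost matrix standing in for profiled network bandwidth are all assumptions made for the example.

    # Greedy streaming assignment: favour partitions already holding a vertex's
    # hyperedge neighbours, penalise imbalance and expensive partition pairs.
    from collections import defaultdict

    def stream_partition(vertices, hyperedges, k, imbalance_penalty=1.5, comm_cost=None):
        """hyperedges: dict edge_id -> vertex list; comm_cost: k x k cost matrix."""
        edges_of = defaultdict(list)
        for e, verts in hyperedges.items():
            for v in verts:
                edges_of[v].append(e)

        assignment, part_sizes = {}, [0] * k
        placed_in = defaultdict(set)             # edge_id -> partitions it already spans
        for v in vertices:                       # single pass over the vertex stream
            scores = []
            for p in range(k):
                overlap = sum(1 for e in edges_of[v] if p in placed_in[e])
                remote = sum(comm_cost[p][q] for e in edges_of[v]
                             for q in placed_in[e] if q != p) if comm_cost else 0.0
                balance = imbalance_penalty * part_sizes[p] * k / (len(assignment) + 1)
                scores.append(overlap - remote - balance)
            best = max(range(k), key=lambda p: scores[p])
            assignment[v] = best
            part_sizes[best] += 1
            for e in edges_of[v]:
                placed_in[e].add(best)
        return assignment

    hedges = {0: [0, 1, 2], 1: [2, 3], 2: [3, 4, 5], 3: [0, 5]}
    cost = [[0.0, 0.1], [0.1, 0.0]]              # assumed pairwise link cost
    print(stream_partition(range(6), hedges, k=2, comm_cost=cost))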