191 research outputs found

    Computational Methods in Science and Engineering : Proceedings of the Workshop SimLabs@KIT, November 29 - 30, 2010, Karlsruhe, Germany

    In this proceedings volume we provide a compilation of article contributions equally covering applications from different research fields and ranging from capacity up to capability computing. Besides classical computing aspects such as parallelization, the focus of these proceedings is on multi-scale approaches and methods for tackling algorithm and data complexity. Practical aspects regarding the usage of the HPC infrastructure and the available tools and software at the SCC are also presented.

    Service-Oriented Data Mining


    Complexity, Emergent Systems and Complex Biological Systems: Complex Systems Theory and Biodynamics. [Edited book by I.C. Baianu, with listed contributors (2011)]

    An overview is presented of system dynamics (the study of the behaviour of complex systems), dynamical systems in mathematics, dynamic programming in computer science and control theory, complex systems biology, neurodynamics, and psychodynamics.

    Metal Cations in Protein Force Fields: From Data Set Creation and Benchmarks to Polarizable Force Field Implementation and Adjustment

    Metal cations are essential to life. About one-third of all proteins require metal cofactors to accurately fold or to function. Computer simulations using empirical parameters and classical molecular mechanics models (force fields) are the standard tool to investigate proteins’ structural dynamics and functions in silico. Despite many successes, the accuracy of force fields is limited when cations are involved. The focus of this thesis is the development of tools and strategies to create system-specific force field parameters to accurately describe cation-protein interactions. The accuracy of a force field mainly relies on (i) the parameters derived from increasingly large quantum chemistry or experimental data and (ii) the physics behind the energy formula. The first part of this thesis presents a large and comprehensive quantum chemistry data set on a consistent computational footing that can be used for force field parameterization and benchmarking. The data set covers dipeptides of the 20 proteinogenic amino acids with different possible side chain protonation states, 3 divalent cations (Ca2+, Mg2+, and Ba2+), and a wide relative energy range. Crucial properties related to force field development, such as partial charges, interaction energies, etc., are also provided. To make the data available, the data set was uploaded to the NOMAD repository and its data structure was formalized in an ontology. Besides a proper data basis for parameterization, the physics covered by the terms of the additive force field formulation model impacts its applicability. The second part of this thesis benchmarks three popular non-polarizable force fields and the polarizable Drude model against a quantum chemistry data set. After some adjustments, the Drude model was found to reproduce the reference interaction energy substantially better than the non-polarizable force fields, which showed the importance of explicitly addressing polarization effects. 
Tuning of the Drude model involved Boltzmann-weighted fitting to optimize Thole factors and Lennard-Jones parameters. The obtained parameters were validated by (i) their ability to reproduce reference interaction energies and (ii) molecular dynamics simulations of the N-lobe of calmodulin. This work facilitates the improvement of polarizable force fields for cation-protein interactions by quantum chemistry-driven parameterization combined with molecular dynamics simulations in the condensed phase. While the Drude model shows its potential for simulating cation-protein interactions, it lacks a description of charge transfer effects, which are significant between cation and protein. The CTPOL model extends the classical force field formulation by charge transfer (CT) and polarization (POL). Since the CTPOL model is not readily available in any of the popular molecular dynamics packages, it was implemented in OpenMM. Furthermore, an open-source parameterization tool, called FFAFFURR, was implemented that enables the (system-specific) parameterization of OPLS-AA and CTPOL models. Following the method established in the previous part, the performance of FFAFFURR was evaluated by its ability to reproduce quantum chemistry energies and by molecular dynamics simulations of the zinc finger protein. In conclusion, this thesis takes a step towards the development of next-generation force fields to accurately describe cation-protein interactions by providing (i) reference data, (ii) a force field model that includes charge transfer and polarization, and (iii) a freely available parameterization tool.
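The abstract mentions Boltzmann-weighted fitting without giving details. As a rough illustration only, the sketch below shows one common way such a weighting can enter an error measure: each conformer is weighted by the Boltzmann factor of its relative reference energy, so low-energy structures dominate the fit. The function names, the 300 K temperature, the kcal/mol units, and the sample energies are all assumptions made for this example, not the thesis's actual procedure.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption.
KB_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weights(relative_energies, temperature=300.0):
    """Return normalized Boltzmann weights for a list of relative energies."""
    kt = KB_KCAL_PER_MOL_K * temperature
    e_min = min(relative_energies)  # shift so the lowest energy has exponent 0
    raw = [math.exp(-(e - e_min) / kt) for e in relative_energies]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_rmse(reference, model, weights):
    """Weighted root-mean-square error between reference and model energies."""
    if not (len(reference) == len(model) == len(weights)):
        raise ValueError("reference, model and weights must have equal length")
    squared = sum(w * (m - r) ** 2 for r, m, w in zip(reference, model, weights))
    return math.sqrt(squared / sum(weights))


# Hypothetical data: conformer relative energies and interaction energies.
rel_energies = [0.0, 1.0, 5.0]
ref_interaction = [-120.0, -118.5, -110.0]
ff_interaction = [-119.0, -118.0, -104.0]

weights = boltzmann_weights(rel_energies)
error = weighted_rmse(ref_interaction, ff_interaction, weights)
```

In a parameter fit, an optimizer would vary the force field parameters to minimize a quantity like `error`; the large deviation of the high-energy third conformer contributes little because its weight is small.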

    At the crossroads of big science, open science, and technology transfer

    Big science infrastructures are confronting increasing demands for public accountability, not only for their contribution to scientific discovery but also for their capacity to generate secondary economic value. To build and operate their sophisticated infrastructures, big science often generates frontier technologies by designing and building technical solutions to complex and unprecedented engineering problems. In parallel, the previous decade has seen rapid technological changes disrupt the way science is done and shared, which has led to the coining of the concept of Open Science (OS). Governments are quickly moving towards the OS paradigm and asking big science centres to "open up" the scientific process. Yet these two forces run in opposition, as the commercialization of scientific outputs usually requires significant financial investments, and companies are willing to bear this cost only if they can protect the innovation from imitation or unfair competition. This PhD dissertation aims at understanding how new applications of ICT are affecting primary research outcomes and the resultant technology transfer in the context of big science and OS. It attempts to uncover the tensions between these two normative forces and identify the mechanisms that are employed to overcome them.
The dissertation comprises four separate studies: 1) a mixed-method study combining two large-scale global online surveys of research scientists (2016, 2018) with two case studies in the high energy physics and molecular biology scientific communities, which assesses explanatory factors behind scientific data-sharing practices; 2) a case study of Open Targets, an information infrastructure based upon data commons, where the European Molecular Biology Laboratory-EBI and pharmaceutical companies collaborate and share scientific data and technological tools to accelerate drug discovery; 3) a study of a unique dataset of 170 projects funded under ATTRACT, a novel policy instrument of the European Commission led by European big science infrastructures, which aims to understand the nature of the serendipitous process behind transitioning big science technologies to previously unanticipated commercial applications; and 4) a case study of White Rabbit technology, a sophisticated open-source hardware developed at the European Organization for Nuclear Research (CERN) in collaboration with an extensive ecosystem of companies

    Computational Design of Protein Structure and Prediction of Ligand Binding

    Proteins perform a tremendous array of finely tuned functions which are not only critical in living organisms, but can be used for industrial and medical purposes. The ability to rationally design these molecular machines could provide a wealth of opportunities, for example to improve human health and to expand the range and reduce the cost of many industrial chemical processes. The modularity of a protein sequence combined with many degrees of structural freedom yields a problem that can frequently be best tackled using computational methods. These computational methods, which include bioinformatics analysis, molecular dynamics, empirical force fields, statistical potentials, and machine learning approaches, amongst others, are collectively known as Computational Protein Design (CPD). Here CPD is examined from the perspective of four different goals: successful design of an intended structure, the prediction of folding and unfolding kinetics from structure (kinetic stability in particular), engineering of improved stability, and prediction of binding sites and energetics. A considerable proportion of protein folds, and the majority of the most common folds ("superfolds"), are internally symmetric, suggesting emergence from an ancient repetition event. CPD, an increasingly popular and successful method for generating de novo folded sequences and topologies, suffers from exponential scaling of complexity with protein size. Thus, the overwhelming majority of successful designs are of relatively small proteins (< 100 amino acids). Designing proteins composed of repeated modular elements allows the design space to be partitioned into more manageable portions. Here, a bioinformatics analysis of a "superfold", the beta-trefoil, demonstrated that formation of a globular fold via repetition was not only an ancient event, but an ongoing means of generating diverse and functional sequences.
Modular repetition also promotes rapid evolution for binding multivalent targets in the "evolutionary arms race" between host and pathogen. Finally, modular repetition was used to successfully design, on the first attempt, a well-folded and functional beta-trefoil, called ThreeFoil. Improving protein design requires understanding the outcomes of design and not simply the 3D structure. To this end, I undertook an extensive biophysical characterization of ThreeFoil, with the key finding that its unfolding is extraordinarily slow, with a half-life of almost a decade. This kinetic stability grants ThreeFoil near-immunity to common denaturants as well as high resistance to proteolysis. A large-scale analysis of hundreds of proteins, and coarse-grained modelling of ThreeFoil and other beta-trefoils, indicate that high kinetic stability results from a folded structure rich in contacts between residues distant in sequence (long-range contacts). Furthermore, an analysis of unrelated proteins known to have similar protease resistance demonstrates that the topological complexity resulting from these long-range contacts may be a general mechanism by which proteins remain folded in harsh environments. Despite the remarkable kinetic stability of ThreeFoil, it has only moderate thermodynamic stability. I sought to improve this in order to provide a stability buffer for future functional engineering and mutagenesis. Numerous computational tools which predict stability change upon point mutation were used, and 10 mutations were made based on their recommendations. Despite claims of >80% accuracy for these predictions, only 2 of the 10 mutations were stabilizing. An in-depth analysis of more than 20 such tools shows that, to a large extent, while they are capable of recognizing highly destabilizing mutations, they are unable to distinguish between moderately destabilizing and stabilizing mutations.
Designing protein structure tests our understanding of the determinants of protein folding, but useful function is often the final goal of protein engineering. I explored protein-ligand binding using molecular dynamics for several protein-ligand systems involving both flexible ligand binding to deep pockets and more rigid ligand binding to shallow grooves. I also used various levels of simulation complexity, from gas-phase, to implicit solvent, to fully explicit solvent, and a range of protocols, from simple equilibrium simulations that interrogate known interactions to more complex, energetically biased simulations that explore diverse configurations and yield novel information