167 research outputs found

    Lower-energy conformers search of TPP-1 polypeptide via hybrid particle swarm optimization and genetic algorithm

    Low-energy conformation search on biological macromolecules remains a challenge in biochemical experiments and theoretical studies. Efficient approaches to minimizing the energy of peptide structures are critically needed by researchers who study peptide-protein interactions or design peptide drugs. In this study, we aim to develop a heuristic-based algorithm to efficiently minimize the energy of a promising PD-L1-inhibiting polypeptide, TPP-1, and to build its low-energy conformer pool to advance subsequent structure optimization and molecular docking studies. We find that, using backbone dihedral angles as the decision variables, both particle swarm optimization (PSO) and the genetic algorithm (GA) can outperform other existing heuristic approaches in optimizing the structure of Met-enkephalin, a benchmark pentapeptide for evaluating the efficiency of conformation optimizers. Using the established algorithm pipeline, hybridizing PSO and GA minimized the TPP-1 structure efficiently, and a low-energy pool was built at an acceptable computational cost (a couple of days on a single laptop). Remarkably, the efficiency of hybrid PSO-GA is hundreds of times higher than that of conventional molecular dynamics simulations run under the force field. The stereochemical quality of the minimized structures was validated using Ramachandran plots. In summary, hybrid PSO-GA minimizes the TPP-1 structure efficiently and yields a low-energy conformer pool within a reasonably short time. Overall, our approach can be extended to biochemical research to speed up peptide conformation determination and hence can facilitate peptide-based drug development.
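
The idea of searching over dihedral angles with PSO can be illustrated with a minimal sketch. It is not the authors' hybrid PSO-GA pipeline: `toy_energy` is a hypothetical stand-in for a force-field energy, and the swarm size, inertia weight `w`, and acceleration coefficients `c1` and `c2` are assumed values chosen only for illustration.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical stand-in for a force-field energy (not a real force field).

    Periodic in every dihedral angle (radians); zero when all angles equal 1.0.
    """
    return sum(1.0 - math.cos(a - 1.0) for a in angles)


def pso_minimize(energy, n_angles, n_particles=30, n_iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO over dihedral angles kept in [-pi, pi)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_angles for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_e = [energy(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_e[i])
    gbest, gbest_e = pbest[g][:], pbest_e[g]    # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_angles):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then wrap the angle back into [-pi, pi).
                pos[i][d] = (pos[i][d] + vel[i][d] + math.pi) % (2 * math.pi) - math.pi
            e = energy(pos[i])
            if e < pbest_e[i]:
                pbest[i], pbest_e[i] = pos[i][:], e
                if e < gbest_e:
                    gbest, gbest_e = pos[i][:], e
    return gbest, gbest_e


if __name__ == "__main__":
    best_angles, best_energy = pso_minimize(toy_energy, n_angles=4)
    print(best_angles, best_energy)
```

In a real application the energy callback would build the peptide from the angles and evaluate a force field; a GA step (crossover and mutation of angle vectors) could be interleaved with the swarm update to form a hybrid.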

    Metal Cations in Protein Force Fields: From Data Set Creation and Benchmarks to Polarizable Force Field Implementation and Adjustment

    Metal cations are essential to life. About one-third of all proteins require metal cofactors to accurately fold or to function. Computer simulations using empirical parameters and classical molecular mechanics models (force fields) are the standard tool to investigate proteins’ structural dynamics and functions in silico. Despite many successes, the accuracy of force fields is limited when cations are involved. The focus of this thesis is the development of tools and strategies to create system-specific force field parameters to accurately describe cation-protein interactions. The accuracy of a force field mainly relies on (i) the parameters derived from increasingly large quantum chemistry or experimental data and (ii) the physics behind the energy formula. The first part of this thesis presents a large and comprehensive quantum chemistry data set on a consistent computational footing that can be used for force field parameterization and benchmarking. The data set covers dipeptides of the 20 proteinogenic amino acids with different possible side chain protonation states, 3 divalent cations (Ca2+, Mg2+, and Ba2+), and a wide relative energy range. Crucial properties related to force field development, such as partial charges, interaction energies, etc., are also provided. To make the data available, the data set was uploaded to the NOMAD repository and its data structure was formalized in an ontology. Besides a proper data basis for parameterization, the physics covered by the terms of the additive force field formulation model impacts its applicability. The second part of this thesis benchmarks three popular non-polarizable force fields and the polarizable Drude model against a quantum chemistry data set. After some adjustments, the Drude model was found to reproduce the reference interaction energy substantially better than the non-polarizable force fields, which showed the importance of explicitly addressing polarization effects. 
Adjustment of the Drude model involved Boltzmann-weighted fitting to optimize Thole factors and Lennard-Jones parameters. The obtained parameters were validated by (i) their ability to reproduce reference interaction energies and (ii) molecular dynamics simulations of the N-lobe of calmodulin. This work facilitates the improvement of polarizable force fields for cation-protein interactions by quantum chemistry-driven parameterization combined with molecular dynamics simulations in the condensed phase. While the Drude model shows its potential for simulating cation-protein interactions, it lacks a description of charge transfer effects, which are significant between cation and protein. The CTPOL model extends the classical force field formulation by charge transfer (CT) and polarization (POL). Since the CTPOL model is not readily available in any of the popular molecular dynamics packages, it was implemented in OpenMM. Furthermore, an open-source parameterization tool, called FFAFFURR, was implemented that enables the (system-specific) parameterization of OPLS-AA and CTPOL models. Following the method established in the previous part, the performance of FFAFFURR was evaluated by its ability to reproduce quantum chemistry energies and by molecular dynamics simulations of the zinc finger protein. In conclusion, this thesis takes a step towards the development of next-generation force fields that accurately describe cation-protein interactions by providing (i) reference data, (ii) a force field model that includes charge transfer and polarization, and (iii) a freely available parameterization tool.
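
Boltzmann-weighted fitting, as mentioned in the abstract above, can be sketched as follows. The abstract does not give the objective function, so this is an assumed, generic form: each reference conformer is weighted by the Boltzmann factor of its relative reference energy, and the weighted squared error between model and reference energies is what a parameter optimizer would minimize. The function names, the temperature, and the use of kcal/mol are illustrative assumptions.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def boltzmann_weights(rel_energies, temperature=300.0):
    """Normalized Boltzmann weights for relative energies given in kcal/mol."""
    kt = K_B_KCAL * temperature
    e_min = min(rel_energies)  # shift by the minimum for numerical stability
    raw = [math.exp(-(e - e_min) / kt) for e in rel_energies]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_sq_error(reference, model, weights):
    """Weighted sum of squared deviations between model and reference energies."""
    return sum(w * (m - r) ** 2 for w, r, m in zip(weights, reference, model))


if __name__ == "__main__":
    ref = [0.0, 1.0, 5.0]   # hypothetical reference relative energies
    ff = [0.1, 1.4, 3.0]    # hypothetical force-field energies for the same conformers
    w = boltzmann_weights(ref)
    print(w, weighted_sq_error(ref, ff, w))
```

With this weighting, errors on low-energy conformers dominate the objective, while high-energy conformers contribute little.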

    AQME: Automated quantum mechanical environments for researchers and educators

    AQME, automated quantum mechanical environments, is a free and open-source Python package for the rapid deployment of automated workflows using cheminformatics and quantum chemistry. AQME workflows integrate tasks performed across multiple computational chemistry packages and data formats, preserving all computational protocols, data, and metadata for machine and human users to access and reuse. AQME has a modular structure of independent modules that can be implemented in any sequence, allowing the users to use all or only the desired parts of the program. The code has been developed for researchers with basic familiarity with the Python programming language. The CSEARCH module interfaces to molecular mechanics and semi-empirical QM (SQM) conformer generation tools (e.g., RDKit and Conformer–Rotamer Ensemble Sampling Tool, CREST) starting from various initial structure formats. The CMIN module enables geometry refinement with SQM and neural network potentials, such as ANI. The QPREP module interfaces with multiple QM programs, such as Gaussian, ORCA, and PySCF. The QCORR module processes QM results, storing structural, energetic, and property data while also enabling automated error handling (i.e., convergence errors, wrong number of imaginary frequencies, isomerization, etc.) and job resubmission. The QDESCP module provides easy access to QM ensemble-averaged molecular descriptors and computed properties, such as NMR spectra. Overall, AQME provides automated, transparent, and reproducible workflows to produce, analyze and archive computational chemistry results. SMILES inputs can be used, and many aspects of tedious human manipulation can be avoided. Installation and execution on Windows, macOS, and Linux platforms have been tested, and the code has been developed to support access through Jupyter Notebooks, the command line, and job submission (e.g., Slurm) scripts. 
Examples of pre-configured workflows are available in various formats, and hands-on video tutorials illustrate their use

    In silico studies of the effect of phenolic compounds from grape seed extracts on the activity of phosphoinositide 3-kinase (PI3K) and the farnesoid x receptor (FXR)

    Montserrat Vaqué Marquès. This thesis was written with the aim of applying computational methods that have already been developed for molecular design and simulation (i.e. pharmacophore generation and protein-ligand docking) to nutrigenomics. So, in silico tools that are routinely used by the pharmaceutical industry to develop drugs have been used to understand, at the molecular level, how natural products such as phenolic compounds (i.e. molecules that are commonly found in fruits and vegetables) can improve health and prevent diseases. Therefore, we first focused on predicting the structure of protein-ligand complexes. The docking algorithms can use the individual structures from receptor and ligand to predict (1) whether they can form a complex and (2) if so, the structure of the resulting complex. This prediction can be made, for instance, with AutoGrid/AutoDock, the most cited docking software in the literature. The automation of AutoGrid/AutoDock is not trivial for tasks such as (1) the virtual screening of a library of ligands against a set of possible receptors; (2) the use of receptor flexibility and (3) making a blind-docking experiment with the whole receptor surface. Therefore, in order to circumvent these limitations, we have designed BDT (i.e.
blind-docking tester; http://www.quimica.urv.cat/~pujadas/BDT), an easy-to-use graphic interface for using AutoGrid/AutoDock. BDT is a Tcl/Tk graphic front-end application that runs on top of four Fortran programs and which controls the conditions of the AutoGrid and AutoDock runs. As far as the modulation of the glucose metabolism is concerned, several in vivo and in vitro results obtained by our group have shown that grape seed procyanidin extracts (GSPE) stimulate glucose uptake in 3T3-L1 adipocytes and thus help to maintain their glucose homeostasis. In contrast, it is also well known that although some phenolic compounds do not affect glucose uptake, others reversibly inhibit it in several cell lines. Moreover, for at least some of these phenolic compounds, this inhibition is the result of their competition with ATP for the ATP-binding site in p110α (i.e. the α isoform of the catalytic subunit of phosphoinositide 3-kinase or PI3Kα). Furthermore, recent studies with isoform-specific inhibitors have identified p110α as the crucial isoform for insulin-stimulated glucose-uptake in some cell lines. Therefore, although it has been proved that the addition of phenolic compound extracts to food can have an overall benefit on health, it should be taken into account that some of these molecules may exacerbate insulin resistance in susceptible individuals via impaired glucose uptake in muscle and adipose tissues and, therefore, produce an undesirable side effect. In this context, we have applied computational approaches (i.e. protein-ligand docking and 3D-QSAR) to predict the IC50 (i.e. the concentration that reduces the p110α activity to 50%). Our results agree with previous experimental results and predict that some compounds are potential inhibitors of this enzyme. Recent results in our research group have demonstrated that the phenolic compounds in GSPE increase the activity of the farnesoid X receptor (i.e. 
FXR) in a dose-dependent way when the natural ligand of FXR (i.e. CDCA) is also present. The phenolic compounds might induce specific conformational changes that increase FXR activity and then contribute to cardioprotection through mechanisms that are independent of their intrinsic antioxidant capacities but that involve direct interaction with FXR to modulate gene expression. Taking into account this hypothesis a 3D-QSAR analysis was made in an attempt to understand how phenolic compounds activate FXR. So, our results explain why phenolic compounds cannot activate FXR by themselves and how they can add new interactions to stabilize the active conformation of FXR when its natural ligand (i.e. CDCA) is present. Therefore, we proposed a mechanism of FXR activation by dietary phenolic compounds in which they may enhance bile acid-bound FXR activity

    Determining Material Structures and Surface Chemistry by Genetic Algorithms and Quantum Chemical Simulations

    With the advent of modern computing, the use of simulation in chemistry has become just as important as experiment. Simulations were originally only applicable to small molecules, but modern techniques, such as density functional theory (DFT) allow extension to materials science. While there are many valuable techniques for synthesis and characterization in chemistry laboratories, there are far more materials possible than can be synthesized, each with an entire host of surfaces. This wealth of chemical space to explore begs the use of computational chemistry to mimic synthesis and experimental characterization. In this work, genetic algorithms (GA), for the former, and DFT calculations, for the latter, are developed and used for the in silico exploration of materials chemistry. Genetic algorithms were first theorized in 1975 by John Holland and over the years subsequently expanded and developed for a variety of purposes. The first application to chemistry came in the early 1990’s and surface chemistry, specifically, appeared soon after. To complement the ability of a GA to explore chemical space is a second algorithmic technique: machine learning (ML) wherein a program is able to categorize or predict properties of an input after reviewing many, many examples of similar inputs. ML has more nebulous origins than GA, but applications to chemistry also appeared in the 1990’s. A history perspective and assessment of these techniques towards surface chemistry follows in this work. A GA designed to find the crystal structure of layered chemical materials given the material’s X-ray diffraction pattern is then developed. The approach reduces crystals into layers of atoms that are transformed and stacked until they repeat. In this manner, an entire crystal need only be represented by its base layer (or two, in some cases) and a set of instructions on how the layers are to be arranged and stacked. 
Molecules that may be present may not quite behave in this fashion, and so a second set of descriptors exist to determine the molecule’s position and orientation. Finally, the lattice of the unit cell is specified, and the structure is built to match. The GA determines the structure’s X-ray diffraction pattern, compares it against a provided experimental pattern, and assigns it a fitness value, where a higher value indicates a better match and a more fit individual. The most fit individuals mate, exchanging genetic material (which may mutate) to produce offspring which are further subjected to the same procedure. This GA can find the structure of bulk, layered, organic, and inorganic materials. Once a material’s bulk structure has been determined, surfaces of the material can be derived and analyzed by DFT. In this thesis, DFT is used to validate results from the GA regarding lithium-aluminum layered double hydroxide. Surface chemistry is more directly explored in the prediction of adsorbates on surfaces of lithiated nickel-manganese-cobalt oxide, a common cathode material in lithium-ion batteries. Surfaces are evaluated at the DFT+U level of theory, which reduces electron over-delocalization, and the energies of the surfaces both bare and with adsorbates are compared. By applying first-principles thermodynamics to predict system energies under varying temperatures and pressures, the behavior of these surfaces in experimental conditions is predicted to be mostly pristine and bare of adsorbates. For breadth, this thesis also presents an investigation of the electronic and optical properties of organic semiconductors via DFT and time-dependent DFT calculations
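
The selection, mating, and mutation loop described in the abstract above can be sketched in a few lines. This is a toy illustration, not the thesis's algorithm: the genome is a plain list of numbers, `simulate_pattern` is a hypothetical placeholder for an X-ray diffraction simulation, and the population size, mutation rate, and fitness formula are assumed.

```python
import random


def simulate_pattern(genome):
    """Hypothetical placeholder for simulating a diffraction pattern from a structure."""
    return list(genome)


def fitness(genome, target):
    """Higher is better: 1.0 for a perfect match with the target pattern."""
    mismatch = sum((p - t) ** 2 for p, t in zip(simulate_pattern(genome), target))
    return 1.0 / (1.0 + mismatch)


def evolve(target, pop_size=40, n_gen=150, n_parents=10,
           mut_rate=0.2, mut_scale=0.1, seed=0):
    """Elitist GA: the fittest individuals survive and mate to refill the population."""
    rng = random.Random(seed)
    n = len(target)  # requires n >= 2 for one-point crossover
    pop = [[rng.uniform(0.0, 1.0) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(n_gen):
        pop.sort(key=lambda g: fitness(g, target), reverse=True)
        parents = pop[:n_parents]
        children = []
        while len(children) < pop_size - n_parents:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            # Each gene mutates with probability mut_rate (Gaussian perturbation).
            child = [g + rng.gauss(0.0, mut_scale) if rng.random() < mut_rate else g
                     for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda g: fitness(g, target))


if __name__ == "__main__":
    experimental = [0.2, 0.8, 0.5, 0.1]  # hypothetical target pattern
    best = evolve(experimental)
    print(best, fitness(best, experimental))
```

In the setting the abstract describes, the genome would instead encode the base layer, the stacking instructions, molecular positions and orientations, and the lattice, and the fitness would compare simulated and experimental diffraction patterns.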

    COMPUTATIONAL TECHNIQUES TO EVALUATE AT ATOMIC LEVEL THE MECHANISM OF MOLECULAR BINDING

    Integrins are an important class of transmembrane receptors that relay signals bidirectionally across the plasma membrane, regulating several cell functions and playing a key role in diverse pathological processes. Specifically, integrin subtype αIIbβ3 is involved in thrombosis and stroke, while subtypes αvβ3 and α5β1 play an important role in angiogenesis and tumor progression. They therefore emerged as attractive pharmacological targets. In the past decades several peptides and peptidomimetics targeting these proteins and based on the integrin recognition motif RGD (Arg-Gly-Asp) have been developed, whereby their affinity and selectivity for a specific integrin subtype have been fine-tuned by modulation of RGD flanking residues, by cyclization or by introduction of chemical modifications. Thus far, the design and development of RGD-based cyclopeptides have been mainly based on empirical approaches, requiring expensive and time-consuming synthesis campaigns. In this field, the employment of computational tools, which could be valuable to accelerate the drug design and optimization process, has been limited by the inherent difficulties of predicting in silico the three-dimensional structure and the inhibitory activity of cyclopeptides. However, recent improvements in both computational resources and in docking and modeling techniques are expected to open new perspectives in the development of cyclopeptides as modulators of protein-protein interactions and, particularly, as integrin inhibitors. Within this PhD project, I have investigated the applicability of computational techniques in predicting and rationalizing how the environment of the recognition motif in cyclopeptides (i.e. flanking residues and introduction of chemical modification) could influence their integrin affinity and selectivity. 
These features can regulate integrin affinity by favoring direct interactions with the receptor and/or by modulating the three-dimensional conformational properties of the recognition motif. To take both aspects into account, I have proposed and optimized a multi-stage computational protocol in which an exhaustive conformational sampling of the investigated cyclopeptides is followed by docking calculations and re-scoring techniques. Specifically: i) the exhaustive sampling is achieved by using Metadynamics in its Bias Exchange variant (BE-META), an enhanced sampling technique that accelerates rare events, making it possible to cross the high free energy barriers characteristic of cyclopeptides and providing reliable estimates of the populations of the accessible conformers; ii) the docking calculations, complemented with the re-scoring technique MM-GB/SA (Molecular Mechanics Generalized Born Surface Area) and the cluster analysis of the decoy poses, make it possible to evaluate the ability of each peptide to engage in interactions with the receptors and to rank the docking poses according to their binding ability; iii) a joint analysis of the previous outcomes results in a reliable ranking of cyclopeptides based on their binding affinity and in the rationalization of their structure-activity relationship. This computational protocol has been exploited in two different applications, illustrated within the thesis. In the first application the protocol has been applied to rationalize how the introduction of chemical modifications, specifically backbone N-methylation, impacts the equilibrium conformation and consequently the integrin affinity of five RGD-containing cyclic hexapeptides, which were previously generated by the group of Professor Kessler to modulate their selectivity for αIIbβ3 integrin. 
The study revealed that backbone N-methylation affects the preferences of the φ dihedral angle of the methylated residue, specifically favoring the adoption of additional conformations characterized by a 180° twist of the peptide bond plane preceding the methylated residue. These twists of dihedral angles were found to have relevant consequences for the cyclopeptide conformation, influencing the formation of intramolecular hydrogen bonds as well as some structural features that are known to be fundamental in integrin binding. Both structural analysis and docking calculations made it possible to identify the “bioactive” conformation (i.e. an extended RGD conformation able to recapitulate the canonical electrostatic and the additional stabilizing hydrophobic interactions). Of note, the cyclopeptides that are pre-organized, already in their free state, in this bioactive conformation are the ones displaying the best αIIbβ3 binding affinity in terms of IC50 values, confirming that pre-organization of cyclopeptides in solution can strongly affect their binding strength to the receptor and demonstrating that knowledge of their conformational equilibrium is fundamental to providing reliable affinity predictions. In the second application, I have focused my attention on cyclopeptides harboring a recently discovered integrin recognition motif: isoDGR (isoAsp-Gly-Arg), deriving from the spontaneous deamidation of the NGR (Asn-Gly-Arg) sequence present in integrin natural ligands. As a preliminary step, I have systematically tested the accuracy of eight Molecular Mechanics force fields in reproducing the equilibrium properties of isoDGR-based cyclopeptides, for which NMR experiments have been acquired. The comparison between simulated and NMR-derived data (i.e. 
chemical shifts and J scalar couplings) revealed that, while most of the investigated force fields can properly reproduce the equilibrium conformational properties of cyclic peptides, only two of them (i.e. the AMBER force fields ff99sb-ildn and ff99sb*-ildn) are able to recover the NMR observables characteristic of the non-standard residue isoaspartate with an accuracy close to the systematic uncertainty. Overall, these results suggest that the transferability of force field parameters to non-standard amino acids is not straightforward. However, these two force fields achieved a satisfactory accuracy and were therefore employed for the subsequent investigation. I thus applied the computational protocol to rationalize the diverse selectivity and affinity profiles for integrins αvβ3 and α5β1, both related to cancer, displayed by three isoDGR-based cyclic hexapeptides. These molecules differ in the residues flanking the isoDGR motif and show appealing tumor-homing properties; specifically, it has been shown that one of these, c(CGisoDGRG), can be coupled with human serum albumin through a chemical linker to be used as a drug delivery agent for functionalized gold nanoparticles. Herein, I investigated the role of the chemical linker in improving the affinity and selectivity of c(CGisoDGRG) for αvβ3. The application of the multi-stage protocol made it possible to propose an explanation for the different selectivity profiles displayed by these molecules, where the direct interactions engaged by the flanking residues and/or their steric hindrance seem to be largely responsible for the observed different affinities. As a last result, through the combination of MD and NMR techniques, I demonstrated that the chemical linker improved the αvβ3 affinity of c(CGisoDGRG) by engaging in direct interactions with the receptor, and I proposed two possible complex models, which reproduce data from Saturation Transfer Difference experiments well. 
Overall, in this PhD work I have shown that the combination of different computational techniques (BE-META, docking and MM-GB/SA re-scoring) could be a reliable approach to structure-activity relationship studies of cyclopeptides. Specifically, the proposed protocol is able to predict the influence of the recognition motif environment (i.e. chemical modification and flanking residues) on integrin affinities. These two features regulate integrin affinity differently: the first by conformational modulation of the recognition motif, the second by engaging in direct interactions with the receptor. Of note, the approach can deal with both of these mechanisms of affinity modulation. We expect that the protocol described here could be used in the future to screen novel peptide libraries or to complement biochemical experiments during the drug optimization stages, assisting organic chemists in the design of more effective integrin-targeting peptides.

    TUNING OPTIMIZATION SOFTWARE PARAMETERS FOR MIXED INTEGER PROGRAMMING PROBLEMS

    The tuning of optimization software is of key interest to researchers solving mixed integer programming (MIP) problems. The efficiency of the optimization software can be greatly impacted by the solver’s parameter settings and the structure of the MIP. A designed experiment approach is used to fit a statistical model that suggests the parameter settings providing the largest reduction in the primal integral metric. Using tuning exemplars of six and 59 factors (parameters) of optimization software, experimentation takes place on three classes of MIPs: survivable fixed telecommunication network design, a formulation of the support vector machine with the ramp loss and L1-norm regularization, and node packing for coding theory graphs. This research presents and demonstrates a framework for tuning a portfolio of MIP instances, not only to obtain good parameter settings for future instances of the same class of MIPs, but also to gain insights into which parameters and parameter interactions are significant for that class of MIPs. The framework is used for benchmarking of solvers with tuned parameters on a portfolio of instances. A group screening method provides a way to reduce the number of factors in a design and reduces the time it takes to perform the tuning process. Portfolio benchmarking provides performance information of optimization solvers on a class with instances of a similar structure

    Computational Approaches to Address the Next-Generation Sequencing Era

    In this thesis, I propose new algorithms and models to address biological problems. Computer science in fact plays a key role in proteomics and genetics research due to the advent of big datasets. In the context of protein study, I developed new methods for protein function prediction based on information retrieval principles. By using heterogeneous sources of knowledge, like graph search and sequence similarity, I designed a tool called INGA that can be used to annotate entire genomes. It has been benchmarked during the Critical Assessment of Function Annotation challenge, and it proved to be one of the most effective approaches for function inference. To better characterize proteins from the structural point of view, I proposed a protein conformer detection strategy based on residue interaction network (RIN) data. RIN graphs were extended to deal with the time-dependent protein coordinate fluctuations, and were generated by clustering algorithms. An implementation called RING MD effectively highlighted the key amino acids known to be functionally relevant in Ubiquitin. These amino acids in fact are very important to explain the protein's three-dimensional dynamics. With the same rationale, RIN graphs were also used to predict the impact of mutations within a protein structure. By combining information about a mutant node in the network and its features, an artificial neural network was trained to estimate the Gibbs free energy change of a protein. Extreme changes in the internal energy might lead to protein unfolding, and possibly to disease. The reduction of a protein's flexibility may hamper its function as well. As an example, the extreme fluctuations observed in intrinsically disordered proteins (IDPs) are fundamental for their activities. To better understand IDPs, I contributed to the collection of the largest dataset of disordered regions.
In the following analysis, the typical functions of these sequences and the biological processes in which they are involved were identified. Due to the importance of their detection, a comprehensive assessment of disorder predictors was performed to identify the state-of-the-art methods and their limitations. In the context of genetics, I focused on phenotype prediction. During the Critical Assessment of Genome Interpretation (CAGI), I proposed new approaches for the analysis of exome data to prioritize the risk of Crohn's disease and abnormal cholesterol levels. These are often defined as complex diseases, since the mechanism behind their onset is still unknown. In my study, human samples with an enrichment of mutations in critical genes were predicted to have a high genetic risk. In addition to disease-associated genes, protein interaction networks were considered to better account for variant accumulation in biological pathways. This strategy was shown to be among the best approaches by CAGI organizers. In the simpler case of Mendelian traits, with BOOGIE I designed a method for human blood group prediction based on exome data. It uses a specialized version of the nearest neighbor algorithm to match the gene variants in an unannotated exome with the ones available in a reference knowledge base. The most similar hit is used to transfer the blood group. With an accuracy above 90%, BOOGIE is a proof-of-concept that shows the potential applications of genetic prediction, and can be easily extended to any Mendelian trait. To summarize, this thesis is a partial answer to the exponential growth of available sequences that need further experiments. By integrating heterogeneous information and designing new predictive models based on machine learning, I developed novel tools for biological data analysis and classification.
All implementations are freely available to the community and might be helpful in future investigations such as drug design and disease studies

    Scheduling and Tuning Kernels for High-performance on Heterogeneous Processor Systems

    Accelerated parallel computing techniques using devices such as GPUs and Xeon Phis (along with CPUs) offer promising ways of extending the cutting edge of high-performance computer systems. A significant performance improvement can be achieved when suitable workloads are handled by the accelerator. Traditional CPUs can handle those workloads not well suited for accelerators. The combination of multiple types of processors in a single computer system is referred to as a heterogeneous system. This dissertation addresses tuning and scheduling issues in heterogeneous systems. The first section presents work on tuning scientific workloads on three different types of processors: multi-core CPU, Xeon Phi massively parallel processor, and NVIDIA GPU; common tuning methods and platform-specific tuning techniques are presented. Then, analysis is done to demonstrate the performance characteristics of the heterogeneous system on different input data. This section of the dissertation is part of the GeauxDock project, which prototyped a few state-of-the-art bioinformatics algorithms and delivered a fast molecular docking program. The second section of this work studies the performance model of the GeauxDock computing kernel. Specifically, the work presents an extraction of features from the input data set and the target systems, and then uses various regression models to calculate the prospective computation time. This helps explain why a certain processor is faster for certain sets of tasks. It also provides the essential information for scheduling on heterogeneous systems. In addition, this dissertation investigates a high-level task scheduling framework for heterogeneous processor systems in which the pros and cons of using different heterogeneous processors can complement each other. Thus, higher performance can be achieved on heterogeneous computing systems.
A new scheduling algorithm with four innovations is presented: Ranked Opportunistic Balancing (ROB), Multi-subject Ranking (MR), Multi-subject Relative Ranking (MRR), and Automatic Small Tasks Rearranging (ASTR). The new algorithm consistently outperforms previously proposed algorithms with better scheduling results, lower computational complexity, and more consistent results over a range of performance prediction errors. Finally, this work extends the heterogeneous task scheduling algorithm to handle the power capping feature. It demonstrates that a power-aware scheduler significantly improves power efficiency and reduces energy consumption. This suggests that, in addition to performance benefits, heterogeneous systems may have certain advantages in overall power efficiency

    A theoretical study of metal-organic frameworks

    Among the options for carbon sequestration, the development of CO2 capture materials has gained momentum over the past two decades. The design and construction of chemical and physical absorbents for the capture of CO2 and clean energy storage is a crucial technology for a sustainable low-carbon future. Metal-organic frameworks (MOFs) provide a new vision for the adsorption of molecules on solid surfaces. The interest in MOFs is owed to their ultrahigh porosity, high surface areas and tuneable pore sizes and shapes. The main objective of this thesis was to adopt a rational predictive capacity used in MOF design to control properties such as framework porosity and flexibility on a molecular scale. The in-silico studies were carried out by using ab initio quantum mechanical approaches such as density functional theory and perturbation theory. In addition, semi-classical methods such as the Grand Canonical Monte Carlo (GCMC) approach were also used. A structural motif called vicinal fluorination was adopted to study MOF linkers in isolation and in a framework. An extensive conformational study, in various solvents, was carried out to investigate the effect of vicinal fluorination on the isolated MOF linkers and therefore elucidate their conformational stability. The effect of fluorination on adsorption isotherms was also investigated. Moreover, various fluorination patterns were explored. Adsorption isotherms of a non-fluorinated copper-based MOF based on experimental work, and of its various fluorinated analogues, were predicted using the GCMC method. It was found that vicinal fluorination is not dominant in controlling conformations of some MOF linkers. Rather, an interplay of interactions, including solute and steric interactions, influences the conformational stability on rotational profiles.
However, vicinal fluorination was shown to control the flexibility of the linkers used in MOFs, as it controls the force constants around the minima of rotational profiles of isolated MOF linkers. The study also highlighted the importance of the solvent on the relative energies of the linker conformations, which has a potential impact on the synthesis of MOFs. With the help of computational methods and validation from experimental data, the structural and sorption properties of the framework, upon fluorination, were shown to have consequences on the adsorption properties of the MOF. Vicinally fluorinated frameworks were shown to have higher uptakes at low temperature and low pressures