Search CORE

16 research outputs found

An Algebraic Approach to Modelling the Regulation of Gene Expression

Author: Abdel Aleem Hosameldin Anwar Mohamed
Publication venue
Publication date: 01/08/2011
Field of study

The University of Manchester - Institutional Repository

Nouvelles méthodes de calcul pour la prédiction des interactions protéine-protéine au niveau structural

Author: Popov Petr
Publication venue: HAL CCSD
Publication date: 28/01/2015
Field of study

Molecular docking is a method that predicts orientation of one molecule with respect to another one when forming a complex. The first computational method of molecular docking was applied to find new candidates against HIV-1 protease in 1990. Since then, using of docking pipelines has become a standard practice in drug discovery. Typically, a docking protocol comprises different phases. The exhaustive sampling of the binding site upon rigid-body approximation of the docking subunits is required. Clustering algorithms are used to group similar binding candidates. Refinement methods are applied to take into account flexibility of the molecular complex and to eliminate possible docking artefacts. Finally, scoring algorithms are employed to select the best binding candidates. The current thesis presents novel algorithms of docking protocols that facilitate structure prediction of protein complexes, which belong to one of the most important target classes in the structure-based drug design. First, DockTrina - a new algorithm to predict conformations of triangular protein trimers (i.e. trimers with pair-wise contacts between all three pairs of proteins) is presented. The method takes as input pair-wise contact predictions from a rigid-body docking program. It then scans and scores all possible combinations of pairs of monomers using a very fast root mean square deviation (RMSD) test. Being fast and efficient, DockTrina outperforms state-of-the-art computational methods dedicated to predict structure of protein oligomers on the collected benchmark of protein trimers. Second, RigidRMSD - a C++ library that in constant time computes RMSDs between molecular poses corresponding to rigid-body transformations is presented. The library is practically useful for clustering docking poses, resulting in ten times speed up compared to standard RMSD-based clustering algorithms. Third, KSENIA - a novel knowledge-based scoring function for protein-protein interactions is developed. The problem of scoring function reconstruction is formulated and solved as a convex optimization problem. As a result, KSENIA is a smooth function and, thus, is suitable for the gradient-base refinement of molecular structures. Remarkably, it is shown that native interfaces of protein complexes provide sufficient information to reconstruct a well-discriminative scoring function. Fourth, CARBON - a new algorithm for the rigid-body refinement of docking candidates is proposed. The rigid-body optimization problem is viewed as the calculation of quasi-static trajectories of rigid bodies influenced by the energy function. To circumvent the typical problem of incorrect stepsizes for rotation and translation movements of molecular complexes, the concept of controlled advancement is introduced. CARBON works well both in combination with a classical force-field and a knowledge-based scoring function. CARBON is also suitable for refinement of molecular complexes with moderate and large steric clashes between its subunits. Finally, a novel method to evaluate prediction capability of scoring functions is introduced. It allows to rigorously assess the performance of the scoring function of interest on benchmarks of molecular complexes. The method manipulates with the score distributions rather than with scores of particular conformations, which makes it advantageous compared to the standard hit-rate criteria. The methods described in the thesis are tested and validated on various protein-protein benchmarks. The implemented algorithms are successfully used in the CAPRI contest for structure prediction of protein-protein complexes. The developed methodology can be easily adapted to the recognition of other types of molecular interactions, involving ligands, polysaccharides, RNAs, etc. The C++ versions of the presented algorithms will be made available as SAMSON Elements for the SAMSON software platform at http://www.samson-connect.net or at http://nano-d.inrialpes.fr/software.Le docking moléculaire est une méthode permettant de prédire l'orientation d'une molécule donnée relativement à une autre lorsque celles-ci forment un complexe. Le premier algorithme de docking moléculaire a vu jour en 1990 afin de trouver de nouveaux candidats face à la protéase du VIH-1. Depuis, l'utilisation de protocoles de docking est devenue une pratique standard dans le domaine de la conception de nouveaux médicaments. Typiquement, un protocole de docking comporte plusieurs phases. Il requiert l'échantillonnage exhaustif du site d'interaction où les éléments impliqués sont considérées rigides. Des algorithmes de clustering sont utilisés afin de regrouper les candidats à l'appariement similaires. Des méthodes d'affinage sont appliquées pour prendre en compte la flexibilité au sein complexe moléculaire et afin d'éliminer de possibles artefacts de docking. Enfin, des algorithmes d'évaluation sont utilisés pour sélectionner les meilleurs candidats pour le docking. Cette thèse présente de nouveaux algorithmes de protocoles de docking qui facilitent la prédiction des structures de complexes protéinaires, une des cibles les plus importantes parmi les cibles visées par les méthodes de conception de médicaments. Une première contribution concerne l‘algorithme Docktrina qui permet de prédire les conformations de trimères protéinaires triangulaires. Celui-ci prend en entrée des prédictions de contacts paire-à-paire à partir d'hypothèse de corps rigides. Ensuite toutes les combinaisons possibles de paires de monomères sont évalués à l'aide d'un test de distance RMSD efficace. Cette méthode à la fois rapide et efficace améliore l'état de l'art sur les protéines trimères. Deuxièmement, nous présentons RigidRMSD une librairie C++ qui évalue en temps constant les distances RMSD entre conformations moléculaires correspondant à des transformations rigides. Cette librairie est en pratique utile lors du clustering de positions de docking, conduisant à des temps de calcul améliorés d'un facteur dix, comparé aux temps de calcul des algorithmes standards. Une troisième contribution concerne KSENIA, une fonction d'évaluation à base de connaissance pour l'étude des interactions protéine-protéine. Le problème de la reconstruction de fonction d'évaluation est alors formulé et résolu comme un problème d'optimisation convexe. Quatrièmement, CARBON, un nouvel algorithme pour l'affinage des candidats au docking basés sur des modèles corps-rigides est proposé. Le problème d'optimisation de corps-rigides est vu comme le calcul de trajectoires quasi-statiques de corps rigides influencés par la fonction énergie. CARBON fonctionne aussi bien avec un champ de force classique qu'avec une fonction d'évaluation à base de connaissance. CARBON est aussi utile pour l'affinage de complexes moléculaires qui comportent des clashes stériques modérés à importants. Finalement, une nouvelle méthode permet d'estimer les capacités de prédiction des fonctions d'évaluation. Celle-ci permet d‘évaluer de façon rigoureuse la performance de la fonction d'évaluation concernée sur des benchmarks de complexes moléculaires. La méthode manipule la distribution des scores attribués et non pas directement les scores de conformations particulières, ce qui la rend avantageuse au regard des critères standard basés sur le score le plus élevé. Les méthodes décrites au sein de la thèse sont testées et validées sur différents benchmarks protéines-protéines. Les algorithmes implémentés ont été utilisés avec succès pour la compétition CAPRI concernant la prédiction de complexes protéine-protéine. La méthodologie développée peut facilement être adaptée pour de la reconnaissance d'autres types d'interactions moléculaires impliquant par exemple des ligands, de l'ARN… Les implémentations en C++ des différents algorithmes présentés seront mises à disposition comme SAMSON Elements de la plateforme logicielle SAMSON sur http://www.samson-connect.net ou sur http://nano-d.inrialpes.fr/software

Thèses en Ligne

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

NASA Thesaurus. Volume 2: Access vocabulary

Author
Publication venue
Publication date
Field of study

The NASA Thesaurus -- Volume 2, Access Vocabulary -- contains an alphabetical listing of all Thesaurus terms (postable and nonpostable) and permutations of all multiword and pseudo-multiword terms. Also included are Other Words (non-Thesaurus terms) consisting of abbreviations, chemical symbols, etc. The permutations and Other Words provide 'access' to the appropriate postable entries in the Thesaurus

NASA Technical Reports Server

Principal Component Analysis

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

This book is aimed at raising awareness of researchers, scientists and engineers on the benefits of Principal Component Analysis (PCA) in data analysis. In this book, the reader will find the applications of PCA in fields such as taxonomy, biology, pharmacy,finance, agriculture, ecology, health and architecture

Directory of Open Access Books (DOAB)

Recommended from our members

Predicting multibody assembly of proteins

Author: Rasheed Md. Muhibur
Publication venue
Publication date: 25/09/2014
Field of study

textThis thesis addresses the multi-body assembly (MBA) problem in the context of protein assemblies. [...] In this thesis, we chose the protein assembly domain because accurate and reliable computational modeling, simulation and prediction of such assemblies would clearly accelerate discoveries in understanding of the complexities of metabolic pathways, identifying the molecular basis for normal health and diseases, and in the designing of new drugs and other therapeutics. [...] [We developed] F²Dock (Fast Fourier Docking) which includes a multi-term function which includes both a statistical thermodynamic approximation of molecular free energy as well as several of knowledge-based terms. Parameters of the scoring model were learned based on a large set of positive/negative examples, and when tested on 176 protein complexes of various types, showed excellent accuracy in ranking correct configurations higher (F² Dock ranks the correcti solution as the top ranked one in 22/176 cases, which is better than other unsupervised prediction software on the same benchmark). Most of the protein-protein interaction scoring terms can be expressed as integrals over the occupied volume, boundary, or a set of discrete points (atom locations), of distance dependent decaying kernels. We developed a dynamic adaptive grid (DAG) data structure which computes smooth surface and volumetric representations of a protein complex in O(m log m) time, where m is the number of atoms assuming that the smallest feature size h is [theta](r[subscript max]) where r[subscript max] is the radius of the largest atom; updates in O(log m) time; and uses O(m)memory. We also developed the dynamic packing grids (DPG) data structure which supports quasi-constant time updates (O(log w)) and spherical neighborhood queries (O(log log w)), where w is the word-size in the RAM. DPG and DAG together results in O(k) time approximation of scoring terms where k << m is the size of the contact region between proteins. [...] [W]e consider the symmetric spherical shell assembly case, where multiple copies of identical proteins tile the surface of a sphere. Though this is a restricted subclass of MBA, it is an important one since it would accelerate development of drugs and antibodies to prevent viruses from forming capsids, which have such spherical symmetry in nature. We proved that it is possible to characterize the space of possible symmetric spherical layouts using a small number of representative local arrangements (called tiles), and their global configurations (tiling). We further show that the tilings, and the mapping of proteins to tilings on arbitrary sized shells is parameterized by 3 discrete parameters and 6 continuous degrees of freedom; and the 3 discrete DOF can be restricted to a constant number of cases if the size of the shell is known (in terms of the number of protein n). We also consider the case where a coarse model of the whole complex of proteins are available. We show that even when such coarse models do not show atomic positions, they can be sufficient to identify a general location for each protein and its neighbors, and thereby restricts the configurational space. We developed an iterative refinement search protocol that leverages such multi-resolution structural data to predict accurate high resolution model of protein complexes, and successfully applied the protocol to model gp120, a protein on the spike of HIV and currently the most feasible target for anti-HIV drug design.Computer Science

Texas ScholarWorks

Deriving Protein Structures Efficiently by Integrating Experimental Data into Biomolecular Simulations

Author: Weiel-Potyagaylo Marie
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 06/08/2021
Field of study

Proteine sind molekulare Nanomaschinen in biologischen Zellen. Sie sind wesentliche Bausteine aller bekannten Lebensformen, von Einzellern bis hin zu Menschen, und erfüllen vielfältige Funktionen, wie beispielsweise den Sauerstofftransport im Blut oder als Bestandteil von Haaren. Störungen ihrer physiologischen Funktion können jedoch schwere degenerative Krankheiten wie Alzheimer und Parkinson verursachen. Die Entwicklung wirksamer Therapien für solche Proteinfehlfaltungserkrankungen erfordert ein tiefgreifendes Verständnis der molekularen Struktur und Dynamik von Proteinen. Da Proteine aufgrund ihrer lichtmikroskopisch nicht mehr auflösbaren Größe nur indirekt beobachtet werden können, sind experimentelle Strukturdaten meist uneindeutig. Dieses Problem lässt sich in silico mittels physikalischer Modellierung biomolekularer Dynamik lösen. In diesem Feld haben sich datengestützte Molekulardynamiksimulationen als neues Paradigma für das Zusammenfügen der einzelnen Datenbausteine zu einem schlüssigen Gesamtbild der enkodierten Proteinstruktur etabliert. Die Strukturdaten werden dabei als integraler Bestandteil in ein physikbasiertes Modell eingebunden. In dieser Arbeit untersuche ich, wie sogenannte strukturbasierte Modelle verwendet werden können, um mehrdeutige Strukturdaten zu komplementieren und die enthaltenen Informationen zu extrahieren. Diese Modelle liefern eine effiziente Beschreibung der aus der evolutionär optimierten nativen Struktur eines Proteins resultierenden Dynamik. Mithilfe meiner systematischen Simulationsmethode XSBM können biologische Kleinwinkelröntgenstreudaten mit möglichst geringem Rechenaufwand als physikalische Proteinstrukturen interpretiert werden. Die Funktionalität solcher datengestützten Methoden hängt stark von den verwendeten Simulationsparametern ab. Eine große Herausforderung besteht darin, experimentelle Informationen und theoretisches Wissen in geeigneter Weise relativ zueinander zu gewichten. In dieser Arbeit zeige ich, wie die entsprechenden Simulationsparameterräume mit Computational-Intelligence-Verfahren effizient erkundet und funktionale Parameter ausgewählt werden können, um die Leistungsfähigkeit komplexer physikbasierter Simulationstechniken zu optimieren. Ich präsentiere FLAPS, eine datengetriebene metaheuristische Optimierungsmethode zur vollautomatischen, reproduzierbaren Parametersuche für biomolekulare Simulationen. FLAPS ist ein adaptiver partikelschwarmbasierter Algorithmus inspiriert vom Verhalten natürlicher Vogel- und Fischschwärme, der das Problem der relativen Gewichtung verschiedener Kriterien in der multivariaten Optimierung generell lösen kann. Neben massiven Fortschritten in der Verwendung von künstlichen Intelligenzen zur Proteinstrukturvorhersage ermöglichen leistungsoptimierte datengestützte Simulationen detaillierte Einblicke in die komplexe Beziehung von biomolekularer Struktur, Dynamik und Funktion. Solche computergestützten Methoden können Zusammenhänge zwischen den einzelnen Puzzleteilen experimenteller Strukturinformationen herstellen und so unser Verständnis von Proteinen als den Grundbausteinen des Lebens vertiefen

KITopen

NASA thesaurus. Volume 2: Access vocabulary

Author
Publication venue
Publication date
Field of study

The Access Vocabulary, which is essentially a permuted index, provides access to any word or number in authorized postable and nonpostable terms. Additional entries include postable and nonpostable terms, other word entries, and pseudo-multiword terms that are permutations of words that contain words within words. The Access Vocabulary contains 40,738 entries that give increased access to the hierarchies in Volume 1 - Hierarchical Listing

NASA Technical Reports Server

NASA thesaurus. Volume 2: Access vocabulary

Author
Publication venue
Publication date
Field of study

The access vocabulary, which is essentially a permuted index, provides access to any word or number in authorized postable and nonpostable terms. Additional entries include postable and nonpostable terms, other word entries and pseudo-multiword terms that are permutations of words that contain words within words. The access vocabulary contains almost 42,000 entries that give increased access to the hierarchies in Volume 1 - Hierarchical Listing

NASA Technical Reports Server

IN SILICO METHODS FOR DRUG DESIGN AND DISCOVERY

Author: José L. Medina-Franco
Kamil Kuca
Marian Valko
Simone Brogi
Teodorico Castro Ramalho
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2020
Field of study

Computer-aided drug design (CADD) methodologies are playing an ever-increasing role in drug discovery that are critical in the cost-effective identification of promising drug candidates. These computational methods are relevant in limiting the use of animal models in pharmacological research, for aiding the rational design of novel and safe drug candidates, and for repositioning marketed drugs, supporting medicinal chemists and pharmacologists during the drug discovery trajectory.Within this field of research, we launched a Research Topic in Frontiers in Chemistry in March 2019 entitled “In silico Methods for Drug Design and Discovery,” which involved two sections of the journal: Medicinal and Pharmaceutical Chemistry and Theoretical and Computational Chemistry. For the reasons mentioned, this Research Topic attracted the attention of scientists and received a large number of submitted manuscripts. Among them 27 Original Research articles, five Review articles, and two Perspective articles have been published within the Research Topic. The Original Research articles cover most of the topics in CADD, reporting advanced in silico methods in drug discovery, while the Review articles offer a point of view of some computer-driven techniques applied to drug research. Finally, the Perspective articles provide a vision of specific computational approaches with an outlook in the modern era of CADD

Archivio della Ricerca - Università di Pisa