341 research outputs found

    The role of dynamic hydrogen bond networks in protonation coupled dynamics of retinal proteins

    Hydrogen bonds (H-bonds) are an essential interaction in membrane proteins. For proteins embedded in complex hydrated lipid bilayers, intramolecular interactions mediated by hydrogen-bond networks are often crucial for function. Internal water molecules that occupy stable sites inside the protein, or water molecules that visit transiently from the bulk, can play an important role in shaping local conformational dynamics. They form complex networks that bridge regions of the protein via water-mediated hydrogen bonds, and these networks can function as wires for proton transfer as part of the protein’s function. For example, the membrane-embedded channelrhodopsins, which are found in archaea, are proteins that couple light-induced isomerization of a retinal chromophore with proton transfer reactions and passive flow of cations through their pore. I contributed to the development of a new algorithm package that features a unique approach to H-bond analyses. I analyzed long Molecular Dynamics (MD) trajectories of channelrhodopsin variants embedded in hydrated lipid membranes, as well as large data sets of static structures, to detect and dissect dynamic hydrogen-bond networks. The photocycle of channelrhodopsins begins with light absorption and isomerization of the retinal from an all-trans state to a 13-cis state, followed by deprotonation of the Schiff base. The retinal is therefore at the center of the analyses. Using two-dimensional graphs of the protein H-bond networks, I identified protein groups potentially important for the proton transfer activity. Local dynamics are strongly affected by point mutations of amino acids important for function. The interior of channelrhodopsin C1C2 hosts extensive networks of protein groups and H-bonded water molecules, including a previously unreported network that can transiently bridge the two retinal chromophores in channelrhodopsin dimers. 
In a recently identified inward proton pump, AntR, I applied centrality measures to MD trajectories of the homology model I generated, to assess the communication between amino acid residues within the networks. I detected a frequently sampled long water chain that connects the retinal with a candidate proton acceptor, and found that a conserved serine in the vicinity of the retinal chromophore plays a significant role in the connectivity and communication of the H-bond networks upon isomerization. A similar water bridge is sampled in independent simulations of ChR2, where a candidate for the proton donor group connects to the 13-cis,15-anti retinal. Proton transfer reactions often take place through certain amino acids, which form patterns. I analyzed H-bond patterns, or motifs, in large hand-curated datasets of static structures of α-helical transmembrane proteins, organized according to the superfamilies they belong to, their function, and an alternative classification method. The presence of motifs in TM proteins is tightly related to the family or superfamily of the host protein and to the position of the motifs along the membrane normal.
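
The graph view of an H-bond network described in this abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm package: the node names (RET, W1, ASP_A and so on) and the H-bond list are hypothetical, and degree centrality stands in for whichever centrality measures the author actually used. Each H-bond is treated as an undirected edge, and a breadth-first search finds the shortest water-mediated path between two groups.

```python
from collections import deque


def build_graph(hbonds):
    """Build an undirected adjacency map from (group_a, group_b) H-bond pairs."""
    graph = {}
    for a, b in hbonds:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest node path, or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly H-bonded to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


# Hypothetical H-bonds from one simulation frame (all names are illustrative only).
hbonds = [
    ("RET", "W1"),
    ("W1", "W2"),
    ("W2", "ASP_A"),
    ("SER_B", "W1"),
    ("SER_B", "THR_C"),
]
graph = build_graph(hbonds)
path = shortest_path(graph, "RET", "ASP_A")
centrality = degree_centrality(graph)
print(path)
print(centrality["W1"])
```

In a trajectory analysis one would typically repeat this per frame and count how often a given path is present, which is one way to read "frequently sampled" water chains.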

    New evolutionary approaches to protein structure prediction

    Doctoral programme in Biotechnology and Chemical Technology. Protein Structure Prediction (PSP) is one of the principal problems in Bioinformatics, and multiple approaches have been developed to predict the structure of a protein. Determining the three-dimensional structure of proteins is necessary to understand their functions at the molecular level. A useful and commonly used representation of protein 3D structure is the protein contact map, which represents binary proximities (contact or non-contact) between each pair of amino acids of a protein. This thesis includes a compilation of soft computing techniques for the protein structure prediction problem (secondary and tertiary structures). A novel evolutionary secondary structure predictor is also described in detail. The results obtained confirm the validity of our proposal. Furthermore, we propose a multi-objective evolutionary approach for contact map prediction based on physico-chemical properties of amino acids. The evolutionary algorithm produces a set of decision rules that identify contacts between amino acids. Each rule imposes a set of conditions on amino acid properties in order to predict contacts. Results obtained by our approach on four different protein data sets are also presented. Finally, a statistical study was performed to extract valid conclusions from the set of prediction rules generated by our algorithm. Universidad Pablo de Olavide. Centro de Estudios de Postgrad
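
The contact map mentioned in this abstract can be made concrete with a short sketch. It computes the map from known coordinates rather than predicting it, so it shows the target representation, not the thesis's evolutionary method. The 8 Å distance threshold and the example coordinates are assumptions chosen for illustration; the abstract does not state a cutoff.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a symmetric binary matrix: 1 if two residues lie within `threshold`.

    `coords` holds one (x, y, z) point per residue, e.g. C-alpha positions in
    angstroms. The diagonal is left at 0 (a residue is not its own contact).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Hypothetical coordinates: three residues spaced 3.8 A apart, one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = contact_map(coords)
for row in cmap:
    print(row)
```

A prediction method such as the rule-based one described above would output a matrix of this same shape from sequence-derived properties alone.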

    Modelling the structure and interactions of leukocyte integrins

    The structure of heterodimeric transmembrane proteins is complex, and insufficient structural information exists for leukocyte integrins. To determine their structure, homology modelling was conducted and modelling software was evaluated. Leukocyte integrin homologs were obtained from the PDB, and models were generated using online servers and MODELLER. Template homologs were fewer in number and of lower quality than those available for monomeric extracellular proteins. Models were docked using ClusPro, HADDOCK2.2 and AutoDock Vina, and evaluated using PROSA, Verify-3D and PROSESS. Higher quality models were generated when MODELLER was used to model the monomeric subunits separately in three defined domain regions (extracellular, transmembrane and cytoplasmic). Template selection for these proteins is critical, as an intricate relationship exists between model quality, template quality, template quantity, template resolution, target-template identity and template sequence coverage. Docking monomeric subunits was challenging when using ClusPro, and the best ligand docking procedures were completed using AutoDock Vina. PROSESS provided the most accurate evaluation of protein models in comparison to PROSA and Verify-3D. These results indicate that although homology modelling is a powerful tool, there is much room for improvement. The set of experimentally obtained templates within the PDB should be expanded, and energy functions should cater for both monomeric and transmembrane heterodimeric proteins. Leukocyte integrins appear to adopt a closed conformation, which may still facilitate LDV ligand association within the α/β interface. The α3β1 integrin may interact with laminin-5 through the ELV sequence within the G-domain of the α laminin subuni

    Characterization of Coenzyme Q Biosynthesis Proteins through Integrative Modeling at the Protein-Membrane Interface

    Integral and peripheral membrane proteins account for one-third of the human proteome, and they are estimated to represent the target for over 50% of modern medicinal drugs. Despite their central role in medicine, the complex, heterogeneous and dynamic nature of biological membranes complicates the investigation of their mechanism of action by both experimental and computational techniques. Among the different membrane bound compartments in eukaryotic cells, mitochondria are highly complex in form and function, and they harbor a unique proteome that remains largely unexplored. A growing number of inherited metabolic diseases are associated with mitochondrial dysfunction, which necessitates the structural and functional elucidation of mitochondrial proteins. In this thesis, we combine experimental and computational methods to explore the activity of COQ8 and COQ9, two functionally elusive proteins of the biosynthetic complex that produces coenzyme Q, a redox-active lipid component of the mitochondrial electron transport chain. (i) Conserved Lipid Modulation of Ancient Kinase-Like UbiB Family Member COQ8. We demonstrate that COQ8 has an ATPase function that is activated when it specifically associates with cardiolipin-containing membranes. We identify its interaction surface with the inner mitochondrial membrane, which gives hints about the possible interaction surfaces with other members of the coenzyme Q synthesis machinery and has implications on how it mediates functional interactions with lipids. Collectively, this work reveals how the positioning of COQ8 on the inner mitochondrial membrane is key to its activation, and therefore advances our understanding of the COQ8 function. (ii) Membrane, Lipid, and Protein Interactions of Coenzyme Q Biosynthesis Protein COQ9. 
We explore the lipid binding activity of COQ9, and we reveal that COQ9 repurposes an ancient bacterial fold to selectively bind aromatic isoprenes, including CoQ intermediates that reside within the bilayer. We elucidate the mechanistic details of its membrane binding process, by which COQ9 warps the membrane surface and creates a tightly sealed hydrophobic region to access its lipid cargo. Finally, we establish a potential molecular interface between COQ9 and COQ7, the enzyme that catalyzes the penultimate step in CoQ biosynthesis, suggesting a model whereby COQ9 presents intermediates to CoQ enzymes to overcome the hydrophobic barrier of the membrane. Collectively, our results provide a mechanism for how a lipid binding protein might access, select, and extract specific cargo from a membrane and present it to a peripheral membrane enzyme. In conclusion, our work is a good illustration of the interplay between experiment and modeling in protein research and specifically in understanding how proteins perform their action in direct synergy with membrane environments. We anticipate our integrative methodologies and mechanistic findings will prove relevant to other membrane proteins, whose fine functional modulation at the membrane-water interface has been historically challenging to characterize

    Bionano-Interfaces through Peptide Design

    The clinical success of restoring bone and tooth function through implants critically depends on maintaining an infection-free, integrated interface between the host tissue and the biomaterial surface. Surgical site infections, that is, infections occurring within one year of surgery, arise in approximately 160,000-300,000 cases in the US annually. Antibiotics are the conventional treatment for preventing infections, but they are becoming ineffective because widespread use has led to bacterial antibiotic resistance. There is an urgent need both to combat bacterial drug resistance through new antimicrobial agents and to limit the spread of drug resistance by restricting delivery of these agents to the implant site. This work aims to reduce surgical site infections from implants by designing chimeric antimicrobial peptides that integrate a novel and effective delivery method. In recent years, antimicrobial peptides (AMPs) have attracted interest as natural sources for new antimicrobial agents. As part of the immune system in all life forms, they are examples of antibacterial agents whose efficacy has been maintained across evolutionary time. Both natural and synthetic AMPs show significant promise for solving antibiotic resistance problems. In this work (Chapter 4), AMP1 and AMP2 were shown to be active against three different strains of pathogens. In the literature, these peptides have been shown to be effective against multi-drug resistant bacteria. However, the difficulty of delivering them effectively to the implantation site limits their clinical use. In recent years, different groups have adapted covalent chemistry-based or non-specific physical adsorption methods for antimicrobial peptide coatings on implant surfaces. Many of these procedures use harsh chemical conditions and require multiple reaction steps. Furthermore, none of these methods allows control over the orientation of these molecules on the surface, which is an essential consideration for biomolecules. 
In the last few decades, solid binding peptides have attracted great interest due to their material specificity and self-assembly properties. These peptides offer robust surface adsorption and assembly in diverse applications. In this work, a design method was demonstrated for chimeric antimicrobial peptides that can self-assemble and self-orient onto biomaterial surfaces. Three specific aims address this two-fold strategy of self-assembly and self-orientation: 1) Develop classification and design methods using rough set theory and genetic algorithm search to customize antibacterial peptides; 2) Develop chimeric peptides by designing spacer sequences to improve the activity of antimicrobial peptides on titanium surfaces; 3) Verify the approach as an enabling technology by expanding the chimeric design approach to other biomaterials. In Aim 1, a peptide classification tool was developed because selecting an antimicrobial peptide for an application was difficult among the thousands of peptide sequences available. A rule-based rough set theory classification algorithm was developed to group antimicrobial peptides by chemical properties. This work is the first application of rough set theory to peptide activity analysis. On benchmark data sets, the classification method resulted in low false discovery rates. The novel rough set theory method was combined with a novel genetic algorithm search, resulting in a method for customizing active antibacterial peptides using sequence-based relationships. Inspired by the fact that spacer sequences play critical roles between functional protein domains, in Aim 2, chimeric peptides were designed to combine solid binding functionality with antimicrobial functionality. To improve how these functions worked together in the same peptide sequence, new spacer sequences were engineered. 
The rough set theory method from Aim 1 was used to find structure-based relationships and to discover new spacer sequences that improved the antimicrobial activity of the chimeric peptides. In Aim 3, the proposed approach is demonstrated as an enabling technology. Tests on calcium phosphate verified the modularity of the chimeric antimicrobial self-assembling peptide approach. Other chimeric peptides were designed for the common biomaterials zirconia and urethane polymer. Finally, an antimicrobial peptide was engineered for a dental adhesive system, toward applying spacer design concepts to optimize the antimicrobial activity

    Machine learning applications for the topology prediction of transmembrane beta-barrel proteins

    This PhD thesis focuses on the topology prediction of beta-barrel transmembrane proteins. Transmembrane proteins adopt various conformations that are related to the functions they provide. The two predominant classes are alpha-helix bundles and beta-barrels. Alpha-helical proteins are present in larger numbers than beta-barrel transmembrane proteins in structure databases. Therefore, there is a need for computational tools that can predict and detect the structure of beta-barrel transmembrane proteins. Transmembrane proteins are used for active transport across the membrane or for signal transduction. Given the importance of these roles, it is essential to understand the structures of the proteins. Transmembrane proteins are also a significant focus for new drug discovery. Transmembrane beta-barrel proteins play critical roles in the translocation machinery, pore formation, membrane anchoring, and ion exchange. In bioinformatics, many years of research have been spent on the topology prediction of transmembrane alpha-helices. Efforts on topology prediction for transmembrane beta-barrel (TMB) proteins have been overshadowed, and the prediction accuracy could be improved with further research. Various methodologies have been developed to predict TMB protein topology. Methods available in the literature include turn identification, hydrophobicity profiles, rule-based prediction, hidden Markov models (HMM), artificial neural networks (ANN), radial basis function networks, and combinations of these methods. The use of a cascading classifier has never been fully explored. This research presents and evaluates approaches such as ANN, K-Nearest Neighbors (KNN) and Support Vector Machines (SVM), together with a novel approach to TMB topology prediction based on a cascading classifier. Computer simulations have been implemented in MATLAB, and the results have been evaluated. 
Data were collected from various datasets and pre-processed for each machine learning technique. A deep neural network was built with an input layer, hidden layers, and an output layer. The cascading classifier was optimised mainly by optimising each machine learning algorithm used and by starting from the parameters that gave the best results for each algorithm. The cascading classifier results show that the proposed methodology predicts transmembrane beta-barrel protein topologies with high accuracy for randomly selected proteins. Using the cascading classifier approach, the best overall accuracy for TMB topology prediction is 76.3%, with a precision of 0.831 and a recall (probability of detection) of 0.799. This accuracy is achieved using a two-layer cascading classifier. By constructing and using various machine-learning frameworks, systems were developed to analyse TMB topologies with significant robustness. We have presented several experimental findings that may be useful for future research. Using the cascading classifier, we used a novel approach for the topology prediction of TMB proteins
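
The accuracy, precision and recall figures quoted in this abstract follow standard definitions, which the sketch below works through on a toy example. The thesis used MATLAB; Python is used here only for illustration. The labels are hypothetical per-residue predictions (1 for a membrane-spanning strand residue, 0 otherwise) and are not data from the thesis.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # Precision: of everything predicted positive, how much was right.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall (probability of detection): of all true positives, how many were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical labels: 1 = strand residue, 0 = non-strand residue.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Here three of four true strand residues are found and three of four positive predictions are correct, so all three metrics equal 0.75.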

    Biological Protein Patterning Systems across the Domains of Life: from Experiments to Modelling

    Distinct localisation of macromolecular structures relative to cell shape is a common feature across the domains of life. One mechanism for achieving spatiotemporal intracellular organisation is the Turing reaction-diffusion system (e.g. the Min system in the bacterium Escherichia coli, which is involved in controlling cell division). In this thesis, I explore potential Turing systems in archaea and eukaryotes as well as the effects of subdiffusion. Recently, a MinD homologue, MinD4, in the archaeon Haloferax volcanii was found to form a dynamic spatiotemporal pattern that is distinct from E. coli in its localisation and function. I investigate all four archaeal Min paralogue systems in H. volcanii by identifying four putative MinD activator proteins based on their genomic location and show that they alter motility but do not control MinD4 patterning. Additionally, one of these proteins shows remarkably fast dynamic motion with speeds comparable to eukaryotic molecular motors, while its function appears to be to control motility via interaction with the archaellum. In metazoa, neurons are highly specialised cells whose functions rely on the proper segregation of proteins to the axonal and somatodendritic compartments. These compartments are bounded by a structure called the axon initial segment (AIS) which is precisely positioned in the proximal axonal region during early neuronal development. How neurons control these self-organised localisations is poorly understood. Using a top-down analysis of developing neurons in vitro, I show that the AIS lies at the nodal plane of the first non-homogeneous spatial harmonic of the neuron shape while a key axonal protein, Tau, is distributed with a concentration that matches the same harmonic. These results are consistent with an underlying Turing patterning system which remains to be identified. The complex intracellular environment often gives rise to the subdiffusive dynamics of molecules that may affect patterning. 
To simulate the subdiffusive transport of biopolymers, I develop a stochastic simulation algorithm based on the continuous time random walk framework, which is then applied to a model of a dimeric molecular motor. This provides insight into the effects of subdiffusion on motor dynamics, where subdiffusion reduces motor speed while increasing the stall force. Overall, this thesis makes progress towards understanding intracellular patterning systems in different organisms, across the domains of life
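
The continuous time random walk (CTRW) framework named in this abstract can be sketched in one dimension. This is a generic textbook-style CTRW, not the thesis's algorithm for biopolymers or molecular motors. The walker waits a heavy-tailed random time before each unit jump, and a Pareto waiting-time law with exponent 0 < alpha < 1 is the assumed source of subdiffusion; the parameter names and defaults are illustrative.

```python
import random


def ctrw_trajectory(t_max, alpha=0.5, tau0=1.0, step=1.0, rng=None):
    """Simulate a 1D continuous time random walk up to time t_max.

    Waiting times follow a Pareto law, P(tau > s) = (tau0 / s) ** alpha for
    s >= tau0. For 0 < alpha < 1 the mean waiting time diverges, which is the
    usual route to subdiffusive spreading in this framework.
    """
    rng = rng or random.Random()
    t, x = 0.0, 0.0
    times, positions = [t], [x]
    while True:
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids division by zero
        tau = tau0 * u ** (-1.0 / alpha)  # inverse-transform Pareto sample
        if t + tau > t_max:
            break  # the walker is still waiting when the observation ends
        t += tau
        x += step if rng.random() < 0.5 else -step  # unbiased jump
        times.append(t)
        positions.append(x)
    return times, positions


times, positions = ctrw_trajectory(1000.0, alpha=0.5, rng=random.Random(0))
print(len(times) - 1, "jumps; final position", positions[-1])
```

Averaging the squared displacement over many such trajectories would be the usual way to check that it grows more slowly than linearly in time.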

    Development of a deep learning-based computational framework for the classification of protein sequences

    Master's dissertation in Bioinformatics. Proteins are among the most important biological structures in living organisms, since they perform multiple biological functions. Each protein has different characteristics and properties, which can be employed in many industries, such as industrial biotechnology and clinical applications, with a positive impact. Modern high-throughput methods allow protein sequencing, which provides protein sequence data. Machine learning methodologies are applied to characterize proteins using information from the protein sequence. However, a major problem associated with this approach is how to properly encode the protein sequences without losing the biological relationship between the amino acid residues. The transformation of the protein sequence into a numeric representation is done by encoder methods. The main objective of this project is therefore to study different encoders and identify the methods that yield the best biological representation of the protein sequences when used in machine learning (ML) models to predict different labels related to their function. The methods were analyzed in two case studies. The first is related to enzymes, since they are a well-established case in the literature. The second used transporter sequences, a less studied case in the literature. In both cases, the data were collected from the curated database Swiss-Prot. The encoders that were tested include: calculated protein descriptors; substitution matrix methods; position-specific scoring matrices; and encoding by pre-trained transformer methods. The use of state-of-the-art pretrained transformers to encode protein sequences proved to be a good biological representation for subsequent application in state-of-the-art ML methods. 
Namely, the ESM-1b transformer achieved a Matthews correlation coefficient above 0.9 for any multiclassification task of the transporter classification system.
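
The Matthews correlation coefficient reported in this abstract has a standard multiclass generalisation, which the sketch below computes from label counts. The class labels and predictions are hypothetical, and the dissertation does not say which implementation it used; this is only a worked instance of the formula MCC = (c*s - sum_k t_k*p_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2)), where c is the number of correct predictions, s the number of samples, and t_k and p_k the true and predicted counts of class k.

```python
import math
from collections import Counter


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from two label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)  # total number of samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)  # how often each class truly occurs
    p_counts = Counter(y_pred)  # how often each class is predicted
    labels = set(t_counts) | set(p_counts)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    spread_true = s * s - sum(v * v for v in t_counts.values())
    spread_pred = s * s - sum(v * v for v in p_counts.values())
    denominator = math.sqrt(spread_true * spread_pred)
    # Convention: return 0.0 when only one class is present or predicted.
    return numerator / denominator if denominator else 0.0


# Hypothetical transporter classes (illustrative labels only).
y_true = ["channel", "channel", "carrier", "carrier", "pump", "pump"]
y_pred = ["channel", "channel", "carrier", "pump", "pump", "pump"]
mcc = multiclass_mcc(y_true, y_pred)
print(round(mcc, 4))
```

With one error in six samples the score is about 0.78, which shows that a value above 0.9 requires predictions that are nearly perfect across all classes.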