231 research outputs found

    Machine-learning methods for structure prediction of β-barrel membrane proteins

    Different types of proteins exist with diverse functions that are essential for living organisms. An important class of proteins is represented by transmembrane proteins, which are inserted into biological membranes and perform essential cellular functions such as cell communication and active transport across the membrane. Transmembrane β-barrels (TMBBs) are a sub-class of membrane proteins largely under-represented in structure databases because of the extreme difficulty of experimental structure determination. For this reason, computational tools that are able to predict the structure of TMBBs are needed. In this thesis, two computational problems related to TMBBs were addressed: the detection of TMBBs in large datasets of proteins and the prediction of the topology of TMBB proteins. Firstly, a method for TMBB detection was presented based on a novel neural network framework for variable-length sequence classification. The proposed approach was validated on a non-redundant dataset of proteins. Furthermore, we carried out genome-wide detection using the entire Escherichia coli proteome. In both experiments, the method significantly outperformed other existing state-of-the-art approaches, reaching very high PPV (92%) and MCC (0.82). Secondly, a method was also introduced for TMBB topology prediction. The proposed approach is based on grammatical modelling and probabilistic discriminative models for sequence data labeling. The method was evaluated using a newly generated dataset of 38 TMBB proteins obtained from high-resolution data in the PDB. The results show that the model correctly predicts the topologies of 25 out of 38 protein chains in the dataset. When tested on previously released datasets, the performance of the proposed approach was comparable or superior to the current state of the art in TMBB topology prediction.

    Cascading classifier application for topology prediction of TMB proteins

    This paper is concerned with the use of a cascading classifier for transmembrane beta-barrel (TMB) topology prediction. Most novel drug design requires the use of membrane proteins. Transmembrane proteins have key roles such as active transport across the membrane and signal transduction, among other functions. Given these key roles, understanding their structures, mechanisms, and regulation at the molecular level through computational modeling is essential. In the field of bioinformatics, many years have been spent on transmembrane protein structure prediction, focusing on alpha-helical membrane proteins. Technological developments have been increasingly utilized in order to understand membrane protein function and structure in more detail. Various methodologies have been developed for the prediction of TMB protein topology; however, the use of a cascading classifier has not been fully explored. This research presents a novel approach for TMB topology prediction. The MATLAB computer simulation results show that the proposed methodology predicts transmembrane topologies with high accuracy for randomly selected proteins.

    Machine learning applications for the topology prediction of transmembrane beta-barrel proteins

    The research topic for this PhD thesis focuses on the topology prediction of beta-barrel transmembrane proteins. Transmembrane proteins adopt various conformations that are related to the functions they provide. The two most predominant classes are alpha-helix bundles and beta-barrel transmembrane proteins. Alpha-helix proteins are present in larger numbers than beta-barrel transmembrane proteins in structure databases. Therefore, there is a need to find computational tools that can predict and detect the structure of beta-barrel transmembrane proteins. Transmembrane proteins are used for active transport across the membrane or signal transduction. Knowing the importance of their roles, it becomes essential to understand the structures of the proteins. Transmembrane proteins are also a significant focus for new drug discovery. Transmembrane beta-barrel proteins play critical roles in the translocation machinery, pore formation, membrane anchoring, and ion exchange. In bioinformatics, many years of research have been spent on the topology prediction of transmembrane alpha-helices. Efforts toward TMB (transmembrane beta-barrel) protein topology prediction have been overshadowed, and the prediction accuracy could be improved with further research. Various methodologies have been developed in the past to predict TMB protein topology. Methods available in the literature include turn identification, hydrophobicity profiles, rule-based prediction, HMM (Hidden Markov Model), ANN (Artificial Neural Networks), radial basis function networks, or combinations of methods. The use of a cascading classifier has never been fully explored. This research presents and evaluates approaches such as ANN (Artificial Neural Networks), KNN (K-Nearest Neighbors), SVM (Support Vector Machines), and a novel approach to TMB topology prediction with the use of a cascading classifier. Computer simulations have been implemented in MATLAB, and the results have been evaluated. Data were collected from various datasets and pre-processed for each machine learning technique. A deep neural network was built with an input layer, hidden layers, and an output layer. The cascading classifier was optimised mainly by optimising each machine learning algorithm used and by starting from the parameters that gave the best results for each algorithm. The cascading classifier results show that the proposed methodology predicts transmembrane beta-barrel protein topologies with high accuracy for randomly selected proteins. Using the cascading classifier approach, the best overall accuracy is 76.3%, with a precision of 0.831 and a recall (probability of detection) of 0.799 for TMB topology prediction. The accuracy of 76.3% is achieved using a two-layer cascading classifier. By constructing and using various machine-learning frameworks, systems were developed to analyse the TMB topologies with significant robustness. We have presented several experimental findings that may be useful for future research. Using the cascading classifier, we used a novel approach for the topology prediction of TMB proteins.

    Eukaryote-wide sequence analysis of mitochondrial β-barrel outer membrane proteins

    Background: The outer membranes of mitochondria are thought to be homologous to the outer membranes of Gram-negative bacteria, which contain hundreds of distinct families of β-barrel membrane proteins (BOMPs), often forming channels for transport of nutrients or drugs. However, only four families of mitochondrial BOMPs (MBOMPs) have been confirmed to date. Although estimates as high as 100 have been made in the past, the number of yet undiscovered MBOMPs is an open question. Fortunately, the recent discovery of a membrane integration signal (the β-signal) for MBOMPs gave us an opportunity to look for undiscovered MBOMPs. Results: We present the results of a comprehensive survey of eukaryotic protein sequences intended to identify new MBOMPs. Our search employs recent results on β-signals as well as structural information and a novel BOMP predictor trained on both bacterial and mitochondrial BOMPs. Our principal finding is circumstantial evidence suggesting that few MBOMPs remain to be discovered, if one assumes that, like known MBOMPs, novel MBOMPs will be monomeric and β-signal dependent. In addition, our analysis of MBOMP homologs reveals some exceptions to the current model of the β-signal, but confirms its consistent presence in the C-terminal region of MBOMP proteins. We also report a β-signal independent search for MBOMPs against the yeast and Arabidopsis proteomes. We find no good candidate MBOMPs in yeast, but the Arabidopsis results are less conclusive. Conclusions: Our results suggest there are no remaining MBOMPs left to discover in yeast; and if one assumes all MBOMPs are β-signal dependent, few MBOMP families remain undiscovered in any sequenced organism.

    Computational Approaches to Understanding the Structure, Dynamics, Functions, and Mechanisms of Various Bacterial Proteins

    The 3D structure of a protein can be fundamentally useful for understanding protein function. In the absence of an experimentally determined structure, the most common way to obtain protein structures is to use homology modeling, or the mapping of the target sequence onto a closely related homolog with an available structure. However, despite recent efforts in structural biology, the 3D structures of many proteins remain unknown. Recent advances in genomic and metagenomic sequencing coupled with coevolution analysis and protein structure prediction have allowed for highly accurate models of proteins that were previously considered intractable to model due to the lack of suitable templates. Structural models obtained from homology modeling, coevolution-based modeling, or crystallography can then be used with other computational tools such as small molecule docking or molecular dynamics (MD) simulations to help understand protein function, dynamics, and mechanism. Here coevolution-based modeling was used to build a structural model of the HgcAB complex involved in mercury methylation (Chapter I). Based on the model it was proposed that conserved cysteines in HgcB are involved in shuttling mercury, methylmercury, or both. MD simulations and docking to a homology model of E. coli inosine monophosphate dehydrogenase (IMPDH) provided insights into how a single amino acid mutation could relieve inhibition by altering protein structure and dynamics (Chapter II). Coevolution-based structure prediction was also combined with docking and experimental activity data to generate machine learning models that predict enzyme substrate scope for a series of bacterial nitrilases (Chapter III). Machine learning was also used to identify physicochemical properties that describe outer membrane permeability and efflux in E. coli and P. aeruginosa, and new efflux pump inhibitors for the E. coli AcrAB-TolC efflux pump were identified using existing physicochemical guidelines in combination with small molecule docking to a homology model of AcrA (Chapter IV). Lastly, quantum mechanical/molecular mechanical simulations were used to study the mechanism of a key proton transfer step in Toho-1 beta-lactamase using experimentally determined structures of both the apo and cefotaxime-bound forms. These simulations revealed that substrate binding promotes catalysis by enhancing the favorability of this initial proton transfer step (Chapter V).

    The Design of Heteromeric and Metal-binding Alpha-Helical Barrels

    Introduction: The field of protein design has drastically evolved over the past four years. Both the protein folding problem, which involves predicting the 3D arrangement of atoms from a given sequence of amino acids, and its inverse, have been technically solved after 50 years. However, the black box nature of the tools developed to address these problems limits our comprehension of protein folding and dynamics. Harnessing this knowledge could revolutionise sectors such as drug design, disease diagnosis, energy transfer, and material science. This work focuses on the rational design of protein scaffolds called coiled coils, positioning them as a model for advancing our control and understanding of proteins. Results: In this thesis, we navigate the uncharted territory of coiled coils with reduced symmetry. We generate novel A3B3 hexameric α-helical barrels with both parallel and antiparallel helix orientations, expanding the understanding of coiled-coil assemblies and introducing new scaffolds. Utilising these assemblies, we create covalently attached bipyridyl functional groups situated within the barrel cores, capable of chelating iron and ruthenium ions. Additionally, we develop intrinsically disordered peptide sequences that assemble only upon the introduction of specific metal ions. This can be applied both to metal sensing and to metal-mediated sensing of other ligands. Conclusions: This research advances the field of protein design through the generation of novel α-helical barrels and the development of coiled-coil assemblies with innovative functionalities. Our work has allowed for new potential applications in bio-sensing and catalysis and has further demonstrated the broad versatility of coiled-coil scaffolds. Implications: This study illuminates the potential of coiled coils in the understanding of protein structure-function relationships. It introduces metal-sensitive peptide sequences for bio-sensing and photocatalysis within α-helical barrels, potentially paving the way for advancements in applications for de novo designed proteins.

    OPTIMIZATION OF TIME-RESPONSE AND AMPLIFICATION FEATURES OF EGOTs FOR NEUROPHYSIOLOGICAL APPLICATIONS

    In device engineering, basic neuron-to-neuron communication has recently inspired the development of increasingly structured and efficient brain-mimicking setups in which the information flow can be processed with strategies resembling physiological ones. This is possible thanks to the use of organic neuromorphic devices, which can share the same electrolytic medium and adjust reciprocal connection weights according to temporal features of the input signals. In a parallel, although conceptually deeply interconnected, fashion, device engineers are directing their efforts towards novel tools to interface the brain and to decipher its signalling strategies. This led to several technological advances which allow scientists to transduce brain activity and, piece by piece, to create a detailed map of its functions. This effort extends over a wide spectrum of length-scales, zooming out from neuron-to-neuron communication up to global activity of neural populations. Both these scientific endeavours, namely mimicking neural communication and transducing brain activity, can benefit from the technology of Electrolyte-Gated Organic Transistors (EGOTs). EGOTs are low-power electronic devices that functionally integrate the electrolytic environment through the exploitation of organic mixed ionic-electronic conductors. This enables the conversion of ionic signals into electronic ones, making such architectures ideal building blocks for neuroelectronics. This has driven extensive scientific and technological investigation on EGOTs. Such devices have been successfully demonstrated both as transducers and amplifiers of electrophysiological activity and as neuromorphic units. These promising results arise from the fact that EGOTs are active devices, which widely extends their applicability window beyond the capabilities of passive electronics (i.e. electrodes) but poses major integration hurdles. Being transistors, EGOTs need two driving voltages to be operated. If, on the one hand, the presence of two voltages becomes an advantage for the modulation of the device response (e.g. for devising EGOT-based neuromorphic circuitry), on the other hand it can become detrimental in brain interfaces, since it may result in a non-null bias directly applied to the brain. If such voltage exceeds the electrochemical stability window of water, undesired faradaic reactions may lead to critical tissue and/or device damage. This work addresses EGOT applications in neuroelectronics from the above-described dual perspective, spanning from neuromorphic device engineering to the implementation of in vivo brain-device interfaces. The advantages of using three-terminal architectures for neuromorphic devices, achieving reversible fine-tuning of their response plasticity, are highlighted. Jointly, the possibility of obtaining a multilevel memory unit by acting on the gate potential is discussed. Additionally, a novel mode of operation for EGOTs is introduced, enabling full retention of amplification capability while, at the same time, avoiding the application of a bias to the brain. Starting from these premises, a novel set of ultra-conformable active micro-epicortical arrays is presented, which fully integrate in situ fabricated EGOT recording sites onto medical-grade polyimide substrates. Finally, a fully organic circuit for signal processing is presented, exploiting ad-hoc designed organic passive components coupled with EGOT devices. This unprecedented approach provides the possibility to sort complex signals into their constitutive frequency components in real time, thereby delineating innovative strategies to devise organic-based functional building-blocks for brain-machine interfaces.

    Evolution and Engineering in Escherichia coli

    Characterization of protein secretion in Mycobacterium leprae using phoA fusions in Escherichia coli and Mycobacterium smegmatis

    Complete sequencing and annotation of the M. leprae genome has provided new information related to proteins constituting its hypothetical proteome. Since M. leprae cannot be grown in vitro, novel approaches are needed to determine which proteins are expressed during infection and whether these proteins are related to pathogenesis. Secreted proteins represent a distinct group of proteins with respect to their structure, function, and contribution to virulence, and are of particular importance for vaccine development because they are often immunogenic and have the potential to be recognized early in infection. The objectives of this study were: 1) to identify putatively secreted proteins of M. leprae based on protein sequence homologies with known M. tuberculosis (MT) secreted proteins; 2) to apply bioinformatic tools designed to assess proteins for secretion to the proteins selected in objective 1, with the goal of improving the likelihood that selected proteins are secreted by M. leprae; 3) to validate secretion of selected M. leprae (ML) proteins through genetic cloning of predicted secreted ML protein genes using surrogate host bacteria, E. coli and M. smegmatis. Bioinformatics identified 24 proteins with high probability for secretion in M. leprae. Fifteen of 24 ML genes showed more than 50% amino acid homology with their M. tuberculosis counterparts and were studied for gene expression and secretion. mRNA analysis identified transcripts for all Sec-dependent pathway proteins of 15 genes predicted to be secreted in M. leprae. PhoA fusion studies in E. coli showed that 5 of 6 (83%) ML proteins (ML0091, ML0097, ML0620, ML1811 and ML1812) were secreted in E. coli and 2 of 7 (29%) proteins (ML0715 and ML2569) were secreted in M. smegmatis. Only lipoproteins were secreted in M. smegmatis, suggesting the importance of mycobacterial-related characteristics for secretion of ML lipoproteins. These results suggest that bioinformatic tools are reliable predictors for identifying secreted proteins in M. leprae and support the hypothesis that Sec-dependent secretion exists in M. leprae.

    The Impact of Dynamics in Protein Assembly

    Predicting the assembly of multiple proteins into specific complexes is critical to understanding their biological function in an organism, and thus to the design of drugs to address their malfunction. Consequently, a significant body of research and development focuses on methods for elucidating protein quaternary structure. In silico techniques are used both to propose models that decode experimental data and, independently, as a structure prediction tool. These computational methods often consider proteins as rigid structures, yet proteins are inherently flexible molecules, with both local side-chain motion and larger conformational dynamics governing their behaviour. This treatment is particularly problematic for any protein docking engine, where even a simple rearrangement of the side-chain and backbone atoms at the interface of binding partners complicates the successful determination of the correct docked pose. Herein, we present a means of representing protein surface, electrostatics and local dynamics within a single volumetric descriptor, before applying it to a series of physical and biophysical problems to validate it as representative of a protein. We leverage this representation in a protein-protein docking context and demonstrate that its application bypasses the need to compensate for, and predict, specific side-chain packing at the interface of binding partners for both water-soluble and lipid-soluble protein complexes. We find little detriment in the quality of returned predictions with increased flexibility, placing our protein docking approach as highly competitive with comparable methods. We then explore the role of larger conformational dynamics in protein quaternary structure prediction by exploiting large-scale Molecular Dynamics simulations of the SARS-CoV-2 spike glycoprotein to elucidate possible high-order spike-ACE2 oligomeric states. Our results indicate a possible novel path to therapeutics following the COVID-19 pandemic. Overall, we find that the structure of a protein alone is inadequate for understanding its function through its possible binding modes. Therefore, we must also consider the impact of dynamics in protein assembly.