13 research outputs found

    Exploiting Advanced Methods for Membrane Protein Structure Prediction

    Recent strides in computational structural biology have opened up an opportunity to understand previously uncharacterised proteins. The under-representation of transmembrane proteins in the Protein Data Bank highlights the need to apply new and advanced bioinformatics methods to shed light on their structure and function. A protein’s structural information is crucial to understand its function and evolution. Currently, there is only experimental structural data for a tiny fraction of proteins. For instance, membrane proteins are encoded by 30% of the protein-coding genes of the human genome, but they only have a 3.5% representation in the Protein Data Bank (PDB). Membrane protein families are particularly poorly understood due to experimental difficulties, such as over-expression, which can result in toxicity to host cells, as well as difficulty in finding a suitable membrane mimetic to reconstitute the protein. Additionally, membrane proteins are much less conserved across species compared to water-soluble proteins, making sequence-based homologue identification a challenge, and in turn rendering homology modelling of these proteins more difficult. Until the structure of poorly characterised protein families can be elucidated experimentally, ab initio protein modelling can be used to predict a fold allowing for structure based function inferences. Such methods have made significant strides recently due to the availability of contact predictions, with these methods addressing larger targets than conventional fragment-assembly-based ab initio methods. 
This study initially focusses on the structure and function of transmembrane proteins, specifically in the process of autophagosome construction, and demonstrates how covariance prediction data have multiple roles in modern structural bioinformatics: not only acting as restraints for model making and serving for validation of the final models, but also predicting domain boundaries and revealing the presence of cryptic internal repeats not evidenced by sequence analysis. Furthermore, we characterised a contact map feature characteristic of a re-entrant helix which may in future allow detection of this feature in other protein families. The recent innovations in computational structural biology were then applied to revise our current understanding of the structure and function of clinically important proteins. Through the modelling of the transmembrane Pfam families and subsequent mining of their structural libraries we identified the human Oca2 protein as a protein of interest. Oca2 is located on mature melanosomal membranes and mutations of Oca2 can result in a form of oculocutaneous albinism which is the most prevalent and visually identifiable form of albinism. Sequence analysis predicts Oca2 to be a member of the SLC13 transporter family but it has not been classified into any existing SLC families. The modelling of Oca2 with AlphaFold2 and other advanced methods shows that, like SLC13 members, it consists of a scaffold and transport domain and displays a pseudo inverted repeat topology that includes re-entrant loops. This finding contradicts the prevailing consensus view of its topology. In addition to the scaffold and transport domains, the presence of a cryptic GOLD domain is revealed that is likely responsible for its trafficking from the endoplasmic reticulum to the Golgi prior to localisation at the melanosomes and possesses known glycosylation sites.
Analysis of the putative ligand binding site of the model shows the presence of highly conserved key asparagine residues that suggest Oca2 may be a Na+/dicarboxylate symporter. Known critical pathogenic mutations map to structural features present in the repeat regions that form the transport domain. Exploiting the AlphaFold2 multimeric modelling protocol in combination with conventional homology modelling allowed the building of a plausible homodimer in both an inward- and outward-facing conformation, supporting an elevator-type transport mechanism.

    Structural Insights into Pink-eyed Dilution Protein (Oca2).

    Recent innovations in computational structural biology have opened an opportunity to revise our current understanding of the structure and function of clinically important proteins. This study centres on human Oca2, which is located on mature melanosomal membranes. Mutations of Oca2 can result in a form of oculocutaneous albinism which is the most prevalent and visually identifiable form of albinism. Sequence analysis predicts Oca2 to be a member of the SLC13 transporter family but it has not been classified into any existing SLC families. The modelling of Oca2 with AlphaFold2 and other advanced methods shows that, like SLC13 members, it consists of a scaffold and transport domain and displays a pseudo inverted repeat topology that includes re-entrant loops. This finding contradicts the prevailing consensus view of its topology. In addition to the scaffold and transport domains, the presence of a cryptic GOLD domain is revealed that is likely responsible for its trafficking from the endoplasmic reticulum to the Golgi prior to localisation at the melanosomes. The GOLD domain harbours some known glycosylation sites. Analysis of the putative ligand binding site of the model shows the presence of highly conserved key asparagine residues that suggest Oca2 may be a Na+/dicarboxylate symporter. Known critical pathogenic mutations map to structural features present in the repeat regions that form the transport domain. Exploiting the AlphaFold2 multimeric modelling protocol in combination with conventional homology modelling allowed the building of plausible homodimers in both inward- and outward-facing conformations, supporting an elevator-type transport mechanism.

    Beta Repeats Models

    ● Deep learning-based structure-modelling methods discover novel predicted β-solenoids. ● Structural database screening identifies additional structural neighbours. ● Study uncovers unprecedentedly large and small β-solenoids. ● Models cover the full range of β-solenoid cross-sectional shapes. ● Present a novel β-solenoid coil shape with plausible complexes. ● Predicted structures are linked to possible functions. ● Eukaryotic and prokaryotic adhesins identified.

    Deep Learning-based structure modelling illuminates structure and function in uncharted regions of β-solenoid fold space

    Repeat proteins are common in all domains of life and exhibit a wide range of functions. One class of repeat protein contains solenoid folds where the repeating unit consists of β-strands separated by tight turns. β-solenoids have distinguishing structural features such as handedness, twist, oligomerisation state, coil shape and size which give rise to their diversity. Characterised β-solenoid repeat proteins are known to form regions in bacterial and viral virulence factors, antifreeze proteins and functional amyloids. For many of these proteins, the experimental structure has not been solved, as they are difficult to crystallise or model. Here we use various deep learning-based structure-modelling methods to discover novel predicted β-solenoids, perform structural database searches to mine further structural neighbours and relate their predicted structure to possible functions. We find both eukaryotic and prokaryotic adhesins, confirming a known functional linkage between adhesin function and the β-solenoid fold. We further identify exceptionally long, flat β-solenoid folds as possible structures of mucin tandem repeat regions and unprecedentedly small β-solenoid structures. Additionally, we characterise a novel β-solenoid coil shape, the FapC Greek key β-solenoid, as well as plausible complexes between it and other proteins involved in Pseudomonas functional amyloid fibres.

    Evaluation of model refinement in CASP14

    We report here an assessment of the model refinement category of the 14th round of Critical Assessment of Structure Prediction (CASP14). As before, predictors submitted up to five ranked refinements, along with associated residue-level error estimates, for targets that had a wide range of starting quality. The ability of groups to accurately rank their submissions and to predict coordinate error varied widely. Overall, only four groups out-performed a “naïve predictor” corresponding to resubmission of the starting model. Among the top groups there are interesting differences of approach and in the spread of improvements seen: some methods are more conservative, others more adventurous. Some targets were “double-barrelled”, for which predictors were offered a high-quality AlphaFold 2 (AF2)-derived prediction alongside another of lower quality. The AF2-derived models were largely unimprovable, many of their apparent errors being found to reside at domain and, especially, crystal lattice contacts. Refinement is shown to have a mixed impact overall on structure-based function annotation methods to predict nucleic acid binding, spot catalytic sites and dock protein structures.

    In silico prediction of structure and function for a large family of transmembrane proteins that includes human Tmem41b.

    Background: Recent strides in computational structural biology have opened up an opportunity to understand previously uncharacterised proteins. The under-representation of transmembrane proteins in the Protein Data Bank highlights the need to apply new and advanced bioinformatics methods to shed light on their structure and function. This study focuses on a family of transmembrane proteins containing the Pfam domain PF09335 ('SNARE_ASSOC'/'VTT'/'Tvp38'/'DedA'). One prominent member, Tmem41b, has been shown to be involved in early stages of autophagosome formation and is vital in mouse embryonic development as well as being identified as a viral host factor of SARS-CoV-2. Methods: We used evolutionary covariance-derived information to construct and validate ab initio models, make domain boundary predictions and infer local structural features. Results: The results from the structural bioinformatics analysis of Tmem41b and its homologues showed that they contain a tandem repeat that is clearly visible in evolutionary covariance data but much less so by sequence analysis. Furthermore, cross-referencing of other prediction data with covariance analysis showed that the internal repeat features two-fold rotational symmetry. Ab initio modelling of Tmem41b and homologues reinforces these structural predictions. Local structural features predicted to be present in Tmem41b were also present in Cl⁻/H⁺ antiporters. Conclusions: The results of this study strongly point to Tmem41b and its homologues being transporters for an as-yet uncharacterised substrate, possibly using H⁺ antiporter activity as their mechanism for transport.

    Breaking the conformational ensemble barrier: Ensemble structure modeling challenges in CASP15

    For the first time, the 2022 CASP (Critical Assessment of Structure Prediction) community experiment included a section on computing multiple conformations for protein and RNA structures. There was full or partial success in reproducing the ensembles for four of the nine targets, an encouraging result. For protein structures, enhanced sampling with variations of the AlphaFold2 deep learning method was by far the most effective approach. One substantial conformational change caused by a single mutation across a complex interface was accurately reproduced. In two other assembly modeling cases, methods succeeded in sampling conformations near to the experimental ones even though environmental factors were not included in the calculations. An experimentally derived flexibility ensemble allowed a single accurate RNA structure model to be identified. Difficulties included how to handle sparse or low-resolution experimental data and the current lack of effective methods for modeling RNA/protein complexes. However, these and other obstacles appear addressable.