8 research outputs found

    The human RBPome: From genes and proteins to human disease

    RNA-binding proteins (RBPs) play a central role in mediating post-transcriptional regulation of genes. However, comparatively little is understood about them and their regulatory mechanisms. In this study, we construct a catalogue of 1344 experimentally confirmed RBPs. The domain architecture of RBPs enabled us to classify them into three groups: Classical (29%), Non-classical (19%) and unclassified (52%). The high percentage of proteins with unclassified domains points to the presence of various uncharacterised motifs that can potentially bind RNA. RBPs were found to be highly disordered compared to Non-RBPs (p < 2.2e-16, Fisher's exact test), suggestive of a dynamic regulatory role of RBPs in cellular signalling and homeostasis. Evolutionary analysis in 62 different species showed that RBPs are highly conserved compared to Non-RBPs (p < 2.2e-16, Wilcoxon test), reflecting the conservation of various biological processes like mRNA splicing and ribosome biogenesis. The expression patterns of RBPs from the Human Proteome Map revealed that ~40% of them are ubiquitously expressed and ~60% are tissue-specific. RBPs were also seen to be highly associated with several neurological disorders, cancer and inflammatory diseases. Anatomical contexts like B cells, T cells, foetal liver and foetal brain were found to be strongly enriched for RBPs, implying a prominent role of RBPs in immune responses and different developmental stages. The catalogue and meta-analysis presented here should form a foundation for furthering our understanding of RBPs and the cellular networks they control in years to come. This article is part of a Special Issue entitled: Proteomics in India.

    Mechanisms of binding diversity in protein disorder : molecular recognition features mediating protein interaction networks

    Indiana University-Purdue University Indianapolis (IUPUI)
    Intrinsically disordered proteins are proteins characterized by a lack of stable tertiary structure under physiological conditions. Evidence shows that disordered proteins are not only highly involved in protein interactions, but also have the capability to associate with more than one partner. Short disordered protein fragments, called “molecular recognition features” (MoRFs), were hypothesized to facilitate the binding diversity of highly connected proteins termed “hubs”. MoRFs often couple folding with binding while forming interaction complexes. Two protein disorder mechanisms were proposed to enable hub proteins to bind to multiple partners: 1. One region of disorder could bind to many different partners (one-to-many binding), so the hub protein itself uses disorder for multiple partner binding; and 2. Many different regions of disorder could bind to a single partner (many-to-one binding), so the hub protein is structured but binds to many disordered partners via interaction with disorder. Thousands of MoRF-partner protein complexes were collected from the Protein Data Bank in this study, including 321 one-to-many binding examples and 514 many-to-one binding examples. The conformational flexibility of MoRFs was observed at atomic resolution to help the MoRFs adapt to various binding surfaces of partners, or to enable different MoRFs with non-identical sequences to associate with one specific binding pocket. Strikingly, in one-to-many binding, post-translational modification, alternative splicing and partner topology were revealed to play key roles in partner selection for these fuzzy complexes. On the other hand, three distinct binding profiles were identified in the collected many-to-one dataset: similar, intersecting and independent. In the similar binding profile, the distinct MoRFs interact with almost identical binding sites on the same partner. In the intersecting binding profile, the MoRFs interact with binding sites that are partly shared and partly different. Finally, in the independent binding profile, the MoRFs interact with completely different binding sites. In conclusion, we suggest that protein disorder, post-translational modifications and alternative splicing work together to rewire protein interaction networks.

    The elusive MAESTRO gene: Its human reproductive tissue-specific expression pattern


    Master of Science

    Nuclear magnetic resonance (NMR) spectroscopy was employed to characterize structural and dynamic properties of bacteriophage λN protein (λN). λN is an intrinsically disordered protein (IDP) that interacts with multiple partners to prevent termination in the phage λ-Escherichia coli transcription apparatus. Limited dispersion in the 1H dimension of the 1H-15N heteronuclear correlation spectra confirmed the extensively disordered nature of λN. Resonance assignments were made for the amide-15N, amide-1H, 13Cα and 13Cβ nuclei of more than 90% of the nonproline residues at pH 7 and 5.5, which were subsequently used to calculate secondary structure propensities. Residues 2-7 and 55-75 showed propensities to form α-helical structures, whereas the residues 34-47 and 95-107 showed propensities to form extended structures. Previous studies have shown that residues 1-22 of λN adopt a helical structure when bound to a site in the RNA transcript (boxB) and residues 34-47 form an extended structure to interact with E. coli host transcription factor NusA protein. We have discovered that the residues 55-75 of λN protein, hitherto uncharacterized, have propensities to form transient helical secondary structures. This putative transient helical region spanning residues 55-75 is amphipathic and may form coiled-coil structures, which further suggests a possible structural or functional role of this segment in the antitermination apparatus. To characterize the backbone dynamics of λN, 15N longitudinal relaxation rates (R1), transverse relaxation rates (R2), and steady-state 15N-1H nuclear Overhauser effects were measured. Significantly elevated transverse relaxation rates (R2) for the amide groups of residues 55-75 indicated slow conformational exchange on the μs-ms timescale, consistent with a transient secondary structure in this segment of λN. Faster amide-bond motions were analyzed by mapping reduced spectral density functions, derived from the 15N relaxation parameters, which further revealed backbone motions on two or more timescales, as expected for a nonglobular disordered protein. The results of this NMR study suggest the presence of previously unknown functional domains of λN protein, which may enhance our understanding of the phage λ-Escherichia coli antitermination apparatus and allow further investigations of the binding mechanisms of IDPs with their interacting partners.

    Optimizing hydropathy scale to improve IDP prediction and characterizing IDPs' functions

    Indiana University-Purdue University Indianapolis (IUPUI)
    Intrinsically disordered proteins (IDPs) are flexible proteins without defined 3D structures. Studies show that IDPs are abundant in nature and actively involved in numerous biological processes. Two crucial subjects in the study of IDPs are analyzing their functions and identifying them. We thus carried out three projects to better understand IDPs. In the first project, we propose a method that separates IDPs into different function groups. We used the CH-CDF plot approach, which is based on the combined use of two predictors and subclassifies proteins into four groups: structured, mixed, disordered, and rare. Studies show different structural biases for each group. The mixed class has more order-promoting residues and more ordered regions than the disordered class. In addition, the disordered class is highly active in mitosis-related processes, among others. Meanwhile, the mixed class is highly associated with signaling pathways, where having both ordered and disordered regions could possibly be important. The second project is about identifying whether an unknown protein is entirely disordered. One of the earliest predictors for this purpose, the charge-hydropathy plot (C-H plot), exploited the charge and hydropathy features of the protein. Not only is this algorithm simple yet powerful, but its input parameters, charge and hydropathy, are also informative and readily interpretable. We found that using different hydropathy scales significantly affects the prediction accuracy. Therefore, we sought to identify a new hydropathy scale that optimizes the prediction. This new scale achieves an accuracy of 91%, a significant improvement over the original 79%. In our third project, we developed a per-residue C-H IDP predictor, in which three hydropathy scales are optimized individually. This is to account for the amino acid composition differences in three regions of a protein sequence (N-terminal, C-terminal and internal). We then combined them into a single per-residue predictor that achieves an accuracy of 74% for per-residue predictions for proteins containing long IDP regions.
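
    A minimal sketch of a whole-protein C-H plot call, to make the idea in the abstract above concrete. It is not the optimized scale or predictor described in the thesis: it assumes the Kyte-Doolittle hydropathy scale rescaled to 0-1 and the commonly cited linear boundary <R> = 2.785<H> - 1.151 from the original charge-hydropathy work. The function names and the `scale`, `slope` and `intercept` parameters are hypothetical and chosen for illustration only.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the thesis optimizes a different one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq, scale=KYTE_DOOLITTLE):
    """Mean hydropathy of a sequence after rescaling the scale to the 0-1 range."""
    lo, hi = min(scale.values()), max(scale.values())
    return sum((scale[aa] - lo) / (hi - lo) for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute mean net charge, counting K/R as +1 and D/E as -1."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def ch_plot_call(seq, scale=KYTE_DOOLITTLE, slope=2.785, intercept=-1.151):
    """Call a protein 'disordered' if it lies above the assumed C-H boundary line."""
    h = mean_hydropathy(seq, scale)
    r = mean_net_charge(seq)
    return "disordered" if r > slope * h + intercept else "structured"


if __name__ == "__main__":
    # Toy sequences only; real use would pass full protein sequences.
    print(ch_plot_call("K" * 20))
    print(ch_plot_call("L" * 20))
```

    Passing a different dictionary as `scale` is how an alternative hydropathy scale, such as an optimized one, could be swapped in under this sketch; the boundary coefficients would likely need refitting for any new scale.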

    Bioinformatics studies of the structure and function of membrane proteins

    Membrane proteins perform a variety of very important biological functions necessary for the survival of the cell. The vast majority of transmembrane proteins are proteins whose transmembrane segments form an alpha-helix composed mainly of hydrophobic residues, spanning the hydrophobic interior of the lipid bilayer. The importance of transmembrane proteins, as well as the inherent difficulties in crystallizing them and obtaining their three-dimensional structures, dictates the need for developing computational algorithms and tools that may allow a reliable and fast prediction of their structural and functional features.
In order to understand their function we must acquire knowledge about their structure and topology in relation to the membrane. By topology, we refer to the number and the exact localization of transmembrane segments, as well as their orientation with respect to the lipid bilayer. In this study, we developed novel algorithms and computer software, based on modern mathematical methods and machine learning approaches, to predict the structure and function of membrane proteins. We focused on two major groups: alpha-helical transmembrane proteins and peripheral membrane proteins. In addition, we constructed specialized, publicly available databases containing information about the structure and function of membrane proteins. The LPLRpred method allows the determination of a region in a protein sequence that can be used for the insertion of target sites when studying the topology of alpha-helical transmembrane proteins. This information may be used to computationally guide the design of new experiments and to minimize their number and cost. The ExTopoDB database is the most up-to-date worldwide resource of experimental information about the topology of alpha-helical transmembrane proteins. The database might be a valuable tool for researchers designing new experiments and also for bioinformaticians, since it provides a large representative set that can be used for training and testing prediction algorithms. Phosphorylation and glycosylation are post-translational modifications (PTMs) that occur in a compartment-specific manner (phosphorylated regions lie on the cytoplasmic side and glycosylated regions on the extracellular side), and therefore the presence of a phosphorylation or glycosylation site in a transmembrane protein provides valuable topological information. We examined the combination of phosphorylation and glycosylation site prediction with transmembrane protein topology prediction. The HMMpTM method integrates a novel feature in topology prediction. It is not just a consensus of post-translational modification and topology prediction; it integrates phosphorylation and glycosylation prediction in a single Hidden Markov Model in order to more accurately predict the orientation of transmembrane proteins in membranes. Given that general topology prediction algorithms perform poorly in the case of GPCRs, we developed a specialized method for their structural and topological annotation and functional classification. GPCRpipe is a pipeline for the accurate detection and annotation of GPCRs in proteomes. Moreover, GPCRpipe may offer information regarding the family of GPCRs to which the predicted proteins may belong and their coupling specificity to certain families of G-proteins. A study of the molecular interactions of peripheral membrane proteins was performed in order to obtain insights about their role and organization in relation to the human plasma membrane. The mpMoRFsDB database and the accompanying analysis of MoRFs constitute the first study of such protein regions in membrane proteins and provide insights about the disorder-based protein-protein interactions in membrane proteins.

    Intrinsically disordered proteins in molecular recognition and structural proteomics

    Indiana University-Purdue University Indianapolis (IUPUI)
    Intrinsically disordered proteins (IDPs) are abundant in nature, being more prevalent in the proteomes of eukaryotes than in those of bacteria or archaea. As introduced in Chapter I, these proteins, or portions of these proteins, lack stable equilibrium structures and instead have dynamic conformations that vary over time and population. Despite the lack of preformed structure, IDPs carry out many and varied molecular functions and participate in vital biological pathways. In particular, IDPs play important roles in cellular signaling that are, in part, enabled by the ability of IDPs to mediate molecular recognition. In Chapter II, the role of intrinsic disorder in molecular recognition is examined through two examples: p53 and 14-3-3. The p53 protein uses intrinsically disordered regions at its N- and C-termini to interact with a large number of partners, often using the same residues. The 14-3-3 protein is a structured domain that uses the same binding site to recognize multiple intrinsically disordered partners. Examination of the structural details of these interactions highlights the importance of intrinsic disorder and induced fit in molecular recognition. More generally, many intrinsically disordered regions that mediate interactions share similar features that are identifiable from protein sequence. Chapter IV reviews several models of IDP-mediated protein-protein interactions that use completely different parameterizations. Each model has its relative strengths in identifying novel interaction regions, and all suggest that IDP-mediated interactions are common in nature. In addition to their biological importance, IDPs are also practically important in the structural study of proteins. The presence of intrinsically disordered regions can inhibit crystallization and solution NMR studies of otherwise well-structured proteins. This problem is compounded in the context of high-throughput structure determination. In Chapter III, the effect of IDPs on structure determination by X-ray crystallography is examined. Examination of existing crystal structures from the PDB shows that protein crystals are intolerant of intrinsic disorder. A retrospective analysis of Protein Structure Initiative data indicates that prediction of intrinsic disorder may be useful in the prioritization and improvement of targets for structure determination.

    Ethylene response and phytohormone-mediated regulation of gene expression in Komagataeibacter xylinus ATCC 53582

    Komagataeibacter xylinus ATCC 53582 is a fruit-associated, cellulose-producing bacterium that responds to and synthesizes phytohormones. This thesis elaborates on the ecophysiology of K. xylinus. Responses to indole-3-acetic acid (IAA), abscisic acid (ABA) and ethylene, produced in situ from ethephon, were of particular focus. The effects of these phytohormones on K. xylinus cellulose production and on the expression of cellulose biosynthesis-related genes (bcsA, bcsB, bcsC, bcsD, cmcAx, ccpAx and bglAx) were determined using pellicle assays and reverse transcription quantitative polymerase chain reaction (RT-qPCR), respectively. Ethylene enhanced cellulose yield by upregulating bcsA and bcsB expression, while IAA decreased cellulose yield by downregulating bcsA. Differential gene expression within the bacterial cellulose synthesis (bcs) operon is reported, and a phytohormone-regulated CRP/FNR transcription factor was identified that may influence K. xylinus cellulose biosynthesis. Based on evidence provided in this thesis, the classification of K. xylinus as a saprophyte and its potential to accelerate fruit ripening in nature are proposed.