80 research outputs found

    Human Protein Reference Database—2009 update

    Human Protein Reference Database (HPRD—http://www.hprd.org/), initially described in 2003, is a database of curated proteomic information pertaining to human proteins. We have recently added a number of new features in HPRD. These include PhosphoMotif Finder, which allows users to find the presence of over 320 experimentally verified phosphorylation motifs in proteins of interest. Another new feature is a protein distributed annotation system—Human Proteinpedia (http://www.humanproteinpedia.org/)—through which laboratories can submit their data, which is mapped onto protein entries in HPRD. Over 75 laboratories involved in proteomics research have already participated in this effort by submitting data for over 15 000 human proteins. The submitted data includes mass spectrometry and protein microarray-derived data, among other data types. Finally, HPRD is also linked to a compendium of human signaling pathways developed by our group, NetPath (http://www.netpath.org/), which currently contains annotations for several cancer and immune signaling pathways. Since the last update, more than 5500 new protein sequences have been added, making HPRD a comprehensive resource for studying the human proteome

    Systems biology approaches applied to regenerative medicine

    Systems biology is the creation of theoretical and mathematical models for the study of biological systems, as an engine for hypothesis generation and to provide context to experimental data. It is underpinned by the collection and analysis of complex datasets from different biological systems, including global gene, RNA, protein and metabolite profiles. Regenerative medicine seeks to replace or repair tissues with compromised function (for example, through injury, deficiency or pathology), in order to improve their functionality. In this paper, we will address the application of systems biology approaches to the study of regenerative medicine, with a particular focus on approaches to study modifications to the genome, transcripts and small RNAs, proteins and metabolites

    Quantitative Proteomic Analysis of Human Embryonic Stem Cell Differentiation by 8-Plex iTRAQ Labelling

    Analysis of gene expression to define the molecular mechanisms and pathways involved in human embryonic stem cell (hESC) proliferation and differentiation has allowed further deciphering of the self-renewal and pluripotency characteristics of hESCs. Proteins associated with hESCs were discovered through isobaric tags for relative and absolute quantification (iTRAQ). Undifferentiated hESCs and hESCs in different stages of spontaneous differentiation by embryoid body (EB) formation were analyzed. Using the iTRAQ approach, we identified 156 differentially expressed proteins involved in cell proliferation, apoptosis, transcription, translation, mRNA processing, and protein synthesis. Proteins involved in nucleic acid binding, protein synthesis, and integrin signaling were downregulated during differentiation, whereas cytoskeleton proteins were upregulated. The present findings add insight to our understanding of the mechanisms involved in hESC proliferation and differentiation

    Addressing statistical biases in nucleotide-derived protein databases for proteogenomic search strategies

    Proteogenomics has the potential to advance genome annotation through high quality peptide identifications derived from mass spectrometry experiments, which demonstrate a given gene or isoform is expressed and translated at the protein level. This can advance our understanding of genome function, discovering novel genes and gene structure that have not yet been identified or validated. Because of the high-throughput shotgun nature of most proteomics experiments, it is essential to carefully control for false positives and prevent any potential misannotation. A number of statistical procedures to deal with this are in wide use in proteomics, calculating false discovery rate (FDR) and posterior error probability (PEP) values for groups and individual peptide spectrum matches (PSMs). These methods control for multiple testing and exploit decoy databases to estimate statistical significance. Here, we show that database choice has a major effect on these confidence estimates leading to significant differences in the number of PSMs reported. We note that standard target:decoy approaches using six-frame translations of nucleotide sequences, such as assembled transcriptome data, apparently underestimate the confidence assigned to the PSMs. The source of this error stems from the inflated and unusual nature of the six-frame database, where for every target sequence there exist five “incorrect” targets that are unlikely to code for protein. The attendant FDR and PEP estimates lead to fewer accepted PSMs at fixed thresholds, and we show that this effect is a product of the database and statistical modeling and not the search engine. A variety of approaches to limit database size and remove noncoding target sequences are examined and discussed in terms of the altered statistical estimates generated and PSMs reported. These results are of importance to groups carrying out proteogenomics, aiming to maximize the validation and discovery of gene structure in sequenced genomes, while still controlling for false positives
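
For readers unfamiliar with target:decoy estimation, the following is a minimal Python sketch of how an FDR and q-values can be derived from a ranked list of PSMs. It is an illustration only and not the procedure used in the paper: the simple `decoys / targets` estimator, the function names, and the `(score, is_decoy)` input format are all assumptions made for this example.

```python
from typing import List, Tuple

# A PSM is represented here as (score, is_decoy); higher scores are better.
# This representation is an assumption made for illustration.
PSM = Tuple[float, bool]


def target_decoy_qvalues(psms: List[PSM]) -> List[Tuple[float, bool, float]]:
    """Return PSMs sorted by descending score, each with an estimated q-value."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR at each score threshold: decoy hits / target hits so far.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this threshold or any more permissive one.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qvalues)]


def count_accepted_targets(psms: List[PSM], threshold: float = 0.01) -> int:
    """Count target PSMs whose q-value is at or below the threshold."""
    return sum(
        1
        for _, is_decoy, q in target_decoy_qvalues(psms)
        if not is_decoy and q <= threshold
    )
```

Under this estimator, any change to the database that shifts the balance of high-scoring decoy and target hits changes the number of target PSMs accepted at a fixed threshold, which is the kind of database dependence the abstract describes.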

    Correction: Exome Sequencing in an Admixed Isolated Population Indicates NFXL1 Variants Confer a Risk for Specific Language Impairment

    Children affected by Specific Language Impairment (SLI) fail to acquire age appropriate language skills despite adequate intelligence and opportunity. SLI is highly heritable, but the understanding of underlying genetic mechanisms has proved challenging. In this study, we use molecular genetic techniques to investigate an admixed isolated founder population from the Robinson Crusoe Island (Chile), who are affected by a high incidence of SLI, increasing the power to discover contributory genetic factors. We utilize exome sequencing in selected individuals from this population to identify eight coding variants that are of putative significance. We then apply association analyses across the wider population to highlight a single rare coding variant (rs144169475, Minor Allele Frequency of 4.1% in admixed South American populations) in the NFXL1 gene that confers a nonsynonymous change (N150K) and is significantly associated with language impairment in the Robinson Crusoe population (p = 2.04 × 10⁻⁴, 8 variants tested). Subsequent sequencing of NFXL1 in 117 UK SLI cases identified four individuals with heterozygous variants predicted to be of functional consequence. We conclude that coding variants within NFXL1 confer an increased risk of SLI within a complex genetic model

    HMGA1 Reprograms Somatic Cells into Pluripotent Stem Cells by Inducing Stem Cell Transcriptional Networks

    PMCID: PMC3499526. BACKGROUND: Although recent studies have identified genes expressed in human embryonic stem cells (hESCs) that induce pluripotency, the molecular underpinnings of normal stem cell function remain poorly understood. The high mobility group A1 (HMGA1) gene is highly expressed in hESCs and poorly differentiated, stem-like cancers; however, its role in these settings has been unclear. METHODS/PRINCIPAL FINDINGS: We show that HMGA1 is highly expressed in fully reprogrammed iPSCs and hESCs, with intermediate levels in ECCs and low levels in fibroblasts. When hESCs are induced to differentiate, HMGA1 decreases and parallels that of other pluripotency factors. Conversely, forced expression of HMGA1 blocks differentiation of hESCs. We also discovered that HMGA1 enhances cellular reprogramming of somatic cells to iPSCs together with the Yamanaka factors (OCT4, SOX2, KLF4, cMYC - OSKM). HMGA1 increases the number and size of iPSC colonies compared to OSKM controls. Surprisingly, there was normal differentiation in vitro and benign teratoma formation in vivo of the HMGA1-derived iPSCs. During the reprogramming process, HMGA1 induces the expression of pluripotency genes, including SOX2, LIN28, and cMYC, while knockdown of HMGA1 in hESCs results in the repression of these genes. Chromatin immunoprecipitation shows that HMGA1 binds to the promoters of these pluripotency genes in vivo. In addition, interfering with HMGA1 function using a short hairpin RNA or a dominant-negative construct blocks cellular reprogramming to a pluripotent state. CONCLUSIONS: Our findings demonstrate for the first time that HMGA1 enhances cellular reprogramming from a somatic cell to a fully pluripotent stem cell. These findings identify a novel role for HMGA1 as a key regulator of the stem cell state by inducing transcriptional networks that drive pluripotency. Although further studies are needed, these HMGA1 pathways could be exploited in regenerative medicine or as novel therapeutic targets for poorly differentiated, stem-like cancers.

    Identification and molecular characterization of a novel protein Saglin as a target of monoclonal antibodies affecting salivary gland infectivity of Plasmodium sporozoites

    Molecular mechanisms underlying the interaction between malarial sporozoites and putative receptor(s) on the salivary glands of Anopheles gambiae remain largely unknown. In previous studies, a salivary gland protein of ∼100 kDa was identified as a putative target based on recognition of the protein by a monoclonal antibody (mAb) 2A3 that caused a ≥ 70% reduction in the average number of sporozoites per infected salivary gland when fed to mosquitoes. Using affinity purification we purified the target of this mAb from extracts of female A. gambiae salivary glands and it was found to be a novel protein by tandem mass spectrometric analysis. Biochemical and molecular characterization of the 100 kDa protein showed that this molecule, designated Saglin, exists as a disulphide-bonded homodimer of 50 kDa subunits. The ability to form homodimers was retained even in the recombinant Saglin expressed in mammalian cells (HEK293). The amino acid sequence of Saglin contains a signal peptide suggesting that Saglin is a secreted protein. If Saglin is indeed involved in the process of invasion of A. gambiae salivary glands by sporozoites of Plasmodium, it could provide a novel target for future investigations aimed at interruption of malaria transmission. © 2007 The Authors

    Proteogenomic analysis of Candida glabrata using high resolution mass spectrometry.

    Candida glabrata is a common opportunistic human pathogen leading to significant mortality in immunosuppressed and immunodeficient individuals. We carried out proteomic analysis of C. glabrata using high resolution Fourier transform mass spectrometry with MS resolution of 60,000 and MS/MS resolution of 7500. On the basis of 32,453 unique peptides identified from 118,815 peptide-spectrum matches, we validated 4421 of the 5283 predicted protein-coding genes (83%) in the C. glabrata genome. Further, searching the tandem mass spectra against a six frame translated genome database of C. glabrata resulted in identification of 11 novel protein coding genes and correction of gene boundaries for 14 predicted gene models. A subset of novel protein-coding genes and corrected gene models were validated at the transcript level by RT-PCR and sequencing. Our study illustrates how proteogenomic analysis enabled by high resolution mass spectrometry can enrich genome annotation and should be an integral part of ongoing genome sequencing and annotation efforts
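
The six-frame translated database mentioned in the abstract above can be illustrated with a short Python sketch. It is a generic example, not the authors' pipeline: it assumes the standard genetic code, plain uppercase or lowercase A/C/G/T input, and hypothetical function names, and it marks any codon it cannot resolve as `X`.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
# Using the standard code here is an assumption made for this illustration.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO_ACIDS[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq: str) -> dict:
    """Translate a DNA sequence in three forward and three reverse frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames
```

For example, `six_frame_translate("ATGGCCTAA")` gives `MA*` in frame `+1`; the other five frames are also emitted even though at most one frame at a given locus is expected to encode the protein, which is why such databases are much larger than a curated protein set.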