
685 research outputs found

Automatic annotation of experimentally derived, evolutionarily conserved post-translational modifications onto multiple genomes

Author: A. Marchler-Bauer
Aebersold
Amanchy
Arnott
Bailey
Beausoleil
Blom
Boekhorst
Chothia
Cox
Craig
Deutsch
Doolittle
Eddy
Fong
Frank
Geer
Geer
Gnad
Hill
Hornbeck
Hyland
Jones
Krishnamoorthy
L. Y. Geer
Mann
Marchler-Bauer
Mardis
Maris
Masaracchia
Miller
Obenauer
Palumbo
Pan
Ruttenberg
S. H. Bryant
Searle
Tabb
Tahara
Tan
Tanner
Tatusov
Ubersax
V. Sridhara
Wendt
Yates
Zhai
Publication venue: Oxford University Press
Publication date
Field of study

New generation sequencing technologies have resulted in significant increases in the number of complete genomes. Functional characterization of these genomes, such as by high-throughput proteomics, is an important but challenging task due to the difficulty of scaling up existing experimental techniques. By use of comparative genomics techniques, experimental results can be transferred from one genome to another, while at the same time minimizing errors by requiring discovery in multiple genomes. In this study, protein phosphorylation, an essential component of many cellular processes, is studied using data from large-scale proteomics analyses of the phosphoproteome. Phosphorylation sites from Homo sapiens, Mus musculus and Drosophila melanogaster phosphopeptide data sets were mapped onto conserved domains in the manually curated portion of NCBI’s Conserved Domain Database (CDD). In this subset, 25 phosphorylation sites are found to be evolutionarily conserved between the three species studied. Transfer of phosphorylation annotation of these conserved sites onto sequences sharing the same conserved domains yields 3253 phosphosite annotations for proteins from Coelomata, the taxonomic division that spans H. sapiens, M. musculus and D. melanogaster. The method scales automatically, so as the amount of experimental phosphoproteomics data increases, more conserved phosphorylation sites may be revealed.
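The conservation filter and annotation transfer described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the representation of a site as a (domain accession, alignment column) pair, the function names, the restriction to S/T/Y residues, and the accessions used in any call are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: a site is (domain accession, column in the domain alignment).
DomainSite = Tuple[str, int]
# Hypothetical domain hit: (domain accession, {alignment column: (protein position, residue)}).
DomainHit = Tuple[str, Dict[int, Tuple[int, str]]]

# Assumption for this sketch: only Ser, Thr and Tyr are accepted as phosphorylatable.
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def conserved_sites(sites_by_species: Dict[str, Set[DomainSite]]) -> Set[DomainSite]:
    """Keep only the domain positions observed as phosphorylated in every species."""
    site_sets = list(sites_by_species.values())
    if not site_sets:
        return set()
    return set.intersection(*site_sets)


def transfer_annotations(
    conserved: Set[DomainSite],
    domain_hits: Dict[str, List[DomainHit]],
) -> List[Tuple[str, int, str]]:
    """Annotate proteins that carry a conserved site at an aligned, phosphorylatable residue."""
    annotations = []
    for protein_id, hits in domain_hits.items():
        for domain_id, column_to_residue in hits:
            for column, (position, residue) in column_to_residue.items():
                if (domain_id, column) in conserved and residue in PHOSPHO_RESIDUES:
                    annotations.append((protein_id, position, residue))
    return sorted(annotations)
```

Requiring a site to appear in all species before transferring it is what the abstract refers to as minimizing errors by requiring discovery in multiple genomes; the residue check is an extra guard added here for illustration.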

Crossref

PubMed Central

wKinMut: An integrated tool for the analysis and interpretation of mutations in human protein kinases

Author: A Baudot
A Gonzalez-Perez
A Torkamani
A Valencia
Alfonso Valencia
Angela del Pozo
B Reva
C Ferrer-Costa
C Greenman
C Greenman
C Ortutay
D Miranda-Saavedra
G Lopez
G Manning
G Wainreb
I Friedberg
IA Adzhubei
J Hurst
J Izarzugaza
JM Izarzugaza
JMG Izarzugaza
JMG Izarzugaza
Jose MG Izarzugaza
JS Kaminker
LD Wood
M Cline
M Krallinger
M Krallinger
Miguel Vazquez
MR Stratton
P Beltrao
P Lahiry
P Minguez
P Yue
PC Ng
R Calabrese
R Hoffmann
R Karchin
R Karchin
RJ Clifford
S Bamford
T Sjöblom
V Quesada
V Ramensky
VG Krishnan
XS Puente
Y Bromberg
YL Yip
Z Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

BACKGROUND: Protein kinases are involved in relevant physiological functions and a broad number of mutations in this superfamily have been reported in the literature to affect protein function and stability. Unfortunately, exploring the phenotypic consequences of each individual mutation remains a considerable challenge. RESULTS: The wKinMut web-server offers direct prediction of the potential pathogenicity of the mutations from a number of methods, including our recently developed prediction method based on the combination of information from a range of diverse sources, including physicochemical properties and functional annotations from FireDB and Swissprot and kinase-specific characteristics such as membership of specific kinase groups, annotation with disease-associated GO terms or the occurrence of the mutation in PFAM domains, and the relevance of the residues in determining kinase subfamily specificity from S3Det. This predictor yields interesting results that compare favourably with other methods in the field when applied to protein kinases. Together with the predictions, wKinMut offers a number of integrated services for the analysis of mutations. These include: the classification of the kinase, information about associations of the kinase with other proteins extracted from iHop, the mapping of the mutations onto PDB structures, pathogenicity records from a number of databases and the classification of mutations in large-scale cancer studies. Importantly, wKinMut is connected with the SNP2L system that extracts mentions of mutations directly from the literature, and therefore increases the possibilities of finding interesting functional information associated with the studied mutations.
CONCLUSIONS: wKinMut facilitates the exploration of the information available about individual mutations by integrating prediction approaches with the automatic extraction of information from the literature (text mining) and several state-of-the-art databases. wKinMut has been used during the last year for the analysis of the consequences of mutations in the context of a number of cancer genome projects, including the recent analysis of Chronic Lymphocytic Leukemia cases and is publicly available at http://wkinmut.bioinfo.cnio.es

Crossref

Springer - Publisher Connector

PubMed Central

Online Research Database In Technology

The computational analysis of post-translational modifications

Author: Damerell David Robert
Publication venue
Publication date: 01/01/2011
Field of study

Post-translational modifications (PTMs) of proteins present a means to increase the proteome size and diversity of an organism through the inclusion of structural elements not encoded at the sequence level alone. Their erroneous inclusion or exclusion has been linked to a variety of diseases and disorders; thus their characterisation has the potential to present viable drug targets. The proliferation of newer high-throughput methods, such as mass spectrometry, to identify such modifications has led to a rapid increase in the number of databases and tools to display and analyse such vast amounts of data effectively. This study covers the development of one such tool, PTM Browser, and the construction of the underlying database that it is based upon. This new database was initially seeded with annotations from the Swiss-Prot and Phospho.ELM resources. The initial database of PTMs was then expanded to include a large repertoire of previously unannotated proteins for a selection of topical species (e.g. Danio rerio and Tetraodon nigroviridis). Orthologue assignments have also been added to the database, to allow for queries to be performed regarding the conservation of modifications between homologous proteins. The PTM Browser tool allows for a full exploration of this new database of PTMs, with a special focus on allowing users to identify modifications that are shared between particular species as well as those that are specific to them. This tool is freely available for non-commercial use at the following URL: http://www.ptmbrowser.org. An analysis is presented on the conservation of modifications between members of the tumour suppressor family, p53, using this new tool. This tool has also been used to analyse the conservation of modifications between super-kingdoms and Eukaryote species.

Sussex Research Online

OpenGrey Repository

Classification of Intrinsically Disordered Regions and Proteins.

Author: Buljan Marija
Daughdrill Gary W.
Fuxreiter Mónika
Lang Benjamin
Tompa Péter
van der Lee Robin
Weatheritt Robert J.
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2014
Field of study

Repository of the Academy's Library

Experimental detection of short regulatory motifs in eukaryotic proteins: tips for good practice as well as for bad

Author
Publication venue: BioMed Central
Publication date: 18/11/2015
Field of study

Springer - Publisher Connector

GENOME INFORMATICS

Author: Birol I.
Cant J.
Champ M.
Publication venue
Publication date: 01/09/2010
Field of study

Cold Spring Harbor Laboratory Institutional Repository

Recommended from our members

Mechanisms of change in protein architecture

Author: Buljan Marija
Publication venue: University of Cambridge
Publication date: 11/01/2011
Field of study

Proteins are the basic building blocks and functional units in all living organisms. Moreover, differences between species can frequently be explained with differences in their protein complements. Importantly, proteins are often composed of segments, i.e. domains that have a certain level of evolutionary, structural and/or functional independence. The majority of proteins in nature contain two or more domains, and an individual domain can often occur in combinations with different domain partners. In the first part of my thesis, I traced the history of animal gene families and the proteins these genes encode. By this means, I was able to infer events where changes in protein domain architectures took place. This showed that both insertions and deletions of single copy domains preferentially occur at protein termini, but also that changes are more likely to occur after gene duplication than organism speciation. Finally, domains that were most frequently gained were the ones that are related to an increase in organismal complexity, thus underlining the important role of domain shuffling in animal evolution. In the second part of my thesis, I focused on a set of high confidence domain gain events and investigated the evidence for molecular mechanisms that caused these domain gains. In agreement with observations from the first part - that changes preferentially occur at the termini - I have found that the strongest contribution to gains of novel domains in proteins comes from gene fusion through the joining of exons from adjacent genes into a novel gene unit. Two other mechanisms that have been suggested to play a major role in the evolution of animal proteins, retroposition and middle insertions through intronic recombination, have a smaller role in comparison to gene fusions. Since the majority of these domain gains are again observed after gene duplication, this suggests a powerful mechanism for neofunctionalization after gene duplication. 
Finally, in the last part of my thesis, I address a mechanism that increases the number and variety of proteins in an organism – alternative splicing. In particular, I investigate the functional consequences of tissue-specific alternative splicing events. I found that tissue-specific splicing tends to affect exons that encode protein regions without defined secondary or tertiary structure. Importantly, it is known that these disordered regions frequently play a role in protein interactions. In agreement with this, I observed significant enrichment of tissue-specifically encoded protein segments in disordered binding peptides and posttranslationally modified sites. A possible result of the finely regulated alternative splicing of these segments is a tissue-specific rewiring of protein networks. In conclusion, both alternative splicing and domain shuffling can increase proteome diversity. However, a protein with a new function can often directly or indirectly shape the functions of other proteins in its environment.

Apollo (Cambridge)

Deep sequencing of pre-translational mRNPs reveals hidden flux through evolutionarily conserved AS-NMD pathways

Author: Kovalak Carrie A.
Publication venue: eScholarship@UMassChan
Publication date: 06/01/2020
Field of study

Deep sequencing of mRNAs (RNA-Seq) is now the preferred method for transcriptome-wide quantification of gene expression. Yet many mRNA isoforms, such as those eliminated by nonsense-mediated decay (NMD), are inherently unstable. Thus a significant drawback of steady-state RNA-Seq is that it provides marginal information on the flux through alternative splicing pathways. Measurement of such flux necessitates capture of newly made species prior to mRNA decay. One means to capture nascent mRNAs is affinity purifying either the exon junction complex (EJC) or activated spliceosomes. Late-stage spliceosomes deposit the EJC upstream of exon-exon junctions, where it remains associated until the first round of translation. As most mRNA decay pathways are translation-dependent, these EJC- or spliceosome-associated, pre-translational mRNAs should provide an accurate record of the initial population of alternate mRNA isoforms. Previous work has analyzed the protein composition and structure of pre-translational mRNPs in detail. While in the Moore lab, my project has focused on exploring the diversity of mRNA isoforms contained within these complexes. As expected, known NMD isoforms are more highly represented in pre-translational mRNPs than in RNA-Seq libraries. To investigate whether pre-translational mRNPs contain novel mRNA isoforms, we created a bioinformatics pipeline that identified thousands of previously unannotated splicing events. Though many can be attributed to “splicing noise”, others are evolutionarily conserved events that produce new AS-NMD isoforms likely involved in maintenance of protein homeostasis. Several of these occur in genes whose overexpression has been linked to poor cancer prognosis.

eScholarship@UMMS

Gene expression data analysis using novel methods: Predicting time delayed correlations and evolutionarily conserved functional modules

Author: Balasubramaniyan Rajarajeswari
Kämper Jörg (Dr. )
Publication venue: Philipps-Universität Marburg, Biologie
Publication date: 01/01/2005
Field of study

Microarray technology enables the study of gene expression on a large scale. One of the main challenges has been to devise methods to cluster genes that share similar expression profiles. In gene expression time courses, a particular gene may encode a transcription factor and thus control several genes downstream; in this case, the gene expression profiles may be staggered, indicating a time-delayed response in transcription of the later genes. The standard clustering algorithms consider gene expression profiles in a global way, thus often ignoring such local time-delayed correlations. We have developed novel methods to capture time-delayed correlations between expression profiles: (1) a method using dynamic programming and (2) CLARITY, an algorithm that uses a local shape-based similarity measure to predict time-delayed correlations and local correlations. We used CLARITY on a dataset describing the change in gene expression during the mitotic cell cycle in Saccharomyces cerevisiae. The obtained clusters were significantly enriched with genes that share similar functions, reflecting the fact that genes with a similar function are often co-regulated and thus co-expressed. Time-shifted as well as local correlations could also be predicted using CLARITY. In datasets where the expression profiles of independent experiments are compared, the standard clustering algorithms often cluster according to all conditions, considering all genes. This increases the background noise and can cause genes that change their expression only under particular conditions to be missed. We have employed a genetic-algorithm-based module predictor that is capable of identifying groups of genes that change their expression only in a subset of conditions. With the aim of supplementing the Ustilago maydis genome annotation, we have used the module prediction algorithm on various independent datasets from Ustilago maydis. The predicted modules were cross-referenced in various Saccharomyces cerevisiae datasets to check their evolutionary conservation between these two organisms. The key contributions of this thesis are novel methods that explore biological information from DNA microarray data.
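The idea of a time-delayed correlation between two expression profiles can be illustrated with a plain lagged Pearson correlation. This is a deliberately simple stand-in, not the dynamic programming method or the CLARITY algorithm from the thesis; the function names and the minimum overlap of three time points are assumptions made for the sketch.

```python
from math import sqrt
from typing import Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag(
    regulator: Sequence[float], target: Sequence[float], max_lag: int
) -> Tuple[int, float]:
    """Find the delay (in time points) at which target best follows regulator."""
    best = (0, pearson(regulator, target))
    for lag in range(1, max_lag + 1):
        # Stop once fewer than three overlapping time points remain.
        if len(regulator) - lag < 3:
            break
        # Compare regulator[t] with target[t + lag].
        r = pearson(regulator[:-lag], target[lag:])
        if r > best[1]:
            best = (lag, r)
    return best
```

A global measure computed at lag 0 would miss a target whose profile is a shifted copy of its regulator, which is the situation the abstract describes for a transcription factor and its downstream genes.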

Identification, organisation and visualisation of complete proteomes in UniProt throughout all taxonomic ranks: archaea, bacteria, eukaryote and virus

Author: Stanley Eleanor Juliet
Publication venue: Cranfield University
Publication date: 01/04/2012
Field of study

Users of uniprot.org want to be able to query, retrieve and download proteome sets for an organism of their choice. They expect the data to be easily accessed, complete and up to date based on current available knowledge. UniProt release 2012_01 (25th Jan 2012) contains the proteomes of 2,923 organisms; 50% of which are bacteria, 38% viruses, 8% eukaryota and 4% archaea. Note that the term 'organism' is used in a broad sense to include subspecies, strains and isolates. Each completely sequenced organism is processed as an independent organism, hence the availability of 38 strain-specific Escherichia coli proteomes that are accessible for download. There is a project within UniProt dedicated to the mammoth task of maintaining the “Proteomes database”. This active resource is essential for UniProt to continually provide high quality proteome sets to the users. Accurate identification and incorporation of new, publicly available proteomes, as well as the maintenance of existing proteomes, permits sustained growth of the proteomes project. This is a huge, complicated and vital task accomplished by the activities of both curators and programmers. This thesis explains the data input and output of the proteomes database: the flow of genome project data from the nucleotide database into the proteomes database, then from each genome how a proteome is identified, augmented and made visible to uniprot.org users. Along this journey of discovery many issues arose: puzzles concerning data gathering, data integrity and data visualisation. All were resolved and the outcome is a well-documented, actively maintained database that strives to provide optimal proteome information to its users.

Cranfield CERES