322 research outputs found

Clustering the annotation space of proteins

Author: Kunin Victor
Ouzounis Christos A
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Current protein clustering methods rely on either sequence or functional similarities between proteins, thereby limiting inferences to one of these areas. RESULTS: Here we report a new approach, named CLAN, which clusters proteins according to both annotation and sequence similarity. This approach is extremely fast, clustering the complete SwissProt database within minutes. It is also accurate, recovering consistent protein families that agree, on average, more than 97% with sequence-based protein families from Pfam. Discrepancies between sequence- and annotation-based clusters were scrutinized and the reasons reported. We demonstrate examples for each of these cases, and thoroughly discuss an example of a propagated error in SwissProt: a vacuolar ATPase subunit M9.2 erroneously annotated as vacuolar ATP synthase subunit H. The CLAN algorithm is available from the authors and the CLAN database is accessible at CONCLUSIONS: CLAN creates refined function-and-sequence specific protein families that can be used for identification and annotation of unknown family members. It also allows easy identification of erroneous annotations by spotting inconsistencies between similarities on annotation and sequence levels.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

King's Research Portal

Genome-wide expression patterns in physiological cardiac hypertrophy

Author: Drozdov Ignat
Ouzounis Christos A
Shah Ajay M
Tsoka Sophia
Publication venue: BioMed Central
Publication date: 01/01/2010

Springer - Publisher Connector

PubMed Central

King's Research Portal

Probabilistic annotation of protein sequences based on functional classifications

Author: Audit Benjamin
Gilks Walter R
Levy Emmanuel D
Ouzounis Christos A
Publication date: 01/01/2005

RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief, you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

Background: One of the most evident achievements of bioinformatics is the development of methods that transfer biological knowledge from characterised proteins to uncharacterised sequences. This mode of protein function assignment is mostly based on the detection of sequence similarity and the premise that functional properties are conserved during evolution. Most automatic approaches developed to date rely on the identification of clusters of homologous proteins and the mapping of new proteins onto these clusters, which are expected to share functional characteristics. Results: Here, we invert the logic of this process, by considering the mapping of sequences directly to a functional classification instead of mapping functions to a sequence clustering. In this mode, the starting point is a database of proteins labelled according to a functional classification scheme, and the subsequent use of sequence similarity allows defining the membership of new proteins to these functional classes. In this framework, we define the Correspondence Indicators as measures of relationship between sequence and function and further formulate two Bayesian approaches to estimate the probability for a sequence of unknown function to belong to a functional class. This approach allows the parametrisation of different sequence search strategies and provides a direct measure of annotation error rates. We validate this approach with a database of enzymes labelled by their corresponding four-digit EC numbers and analyse specific cases. Conclusion: The performance of this method is significantly higher than the simple strategy consisting in transferring the annotation from the highest scoring BLAST match and is expected to find applications in automated functional annotation pipelines.

HAL-ENS-LYON

Springer - Publisher Connector

PubMed Central

Apollo (Cambridge)

King's Research Portal

Measuring genome conservation across taxa: divided strains and united kingdoms

Author: Ahren Dag
Goldovsky Leon
Janssen Paul
Kunin Victor
Ouzounis Christos A.
Publication venue: Oxford University Press
Publication date: 01/01/2005

Species evolutionary relationships have traditionally been defined by sequence similarities of phylogenetic marker molecules, recently followed by whole-genome phylogenies based on gene order, average ortholog similarity or gene content. Here, we introduce genome conservation, a novel metric of evolutionary distance between species that simultaneously takes into account both gene content and sequence similarity at the whole-genome level. Genome conservation represents a robust distance measure, as demonstrated by accurate phylogenetic reconstructions. The genome conservation matrix for all presently sequenced organisms exhibits a remarkable ability to define evolutionary relationships across all taxonomic ranges. An assessment of taxonomic ranks with genome conservation shows that certain ranks are inadequately described and raises the possibility for a more precise and quantitative taxonomy in the future. All phylogenetic reconstructions are available at the genome phylogeny server.

CiteSeerX

Lund University Publications

Crossref

PubMed Central

King's Research Portal

Disease association and comparative genomics of compositional bias in human proteins [version 2; peer review: 2 approved]

Author: Anastasia Chasapi
Christos A. Ouzounis
Christos E. Kouros
Vasiliki Makri
Publication venue: 'F1000 Research Ltd'
Publication date: 01/04/2023

Background: The evolutionary rate of disordered protein regions varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of compositional bias, indicative of disorder, across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The UniProt Reference Proteome dataset, containing 11297 proteomes, was used as the target dataset for the comparative genomics of a well-defined subset of the human genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease association in the human genome reveals a significant bias towards biased regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated proteins with compositional bias across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of compositional bias, disease association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.

Directory of Open Access Journals

Tumorigenic Properties of Iron Regulatory Protein 2 (IRP2) Mediated by Its Specific 73-Amino Acids Insert

Author: Carmen Maffettone
Christos Ouzounis
Guohua Chen
Ignat Drozdov
Kostas Pantopoulos
Michael Polymenis
Publication venue: Public Library of Science
Publication date: 01/01/2010

Iron regulatory proteins, IRP1 and IRP2, bind to mRNAs harboring iron responsive elements and control their expression. IRPs may also perform additional functions. Thus, IRP1 exhibited apparent tumor suppressor properties in a tumor xenograft model. Here we examined the effects of IRP2 in a similar setting. Human H1299 lung cancer cells or clones engineered for tetracycline-inducible expression of wild type IRP2, or the deletion mutant IRP2Δ73 (lacking a specific insert of 73 amino acids), were injected subcutaneously into nude mice. The induction of IRP2 profoundly stimulated the growth of tumor xenografts, and this response was blunted by addition of tetracycline to the drinking water of the animals, to turn off the IRP2 transgene. Interestingly, IRP2Δ73 failed to promote tumor growth above control levels. As expected, xenografts expressing the IRP2 transgene exhibited high levels of transferrin receptor 1 (TfR1); however, the expression of other known IRP targets was not affected. Moreover, these xenografts manifested increased c-MYC levels and ERK1/2 phosphorylation. A microarray analysis identified distinct gene expression patterns between control and tumors containing IRP2 or IRP1 transgenes. By contrast, gene expression profiles of control and IRP2Δ73-related tumors were more similar, consistent with their growth phenotype. Collectively, these data demonstrate an apparent pro-oncogenic activity of IRP2 that depends on its specific 73-amino-acid insert, and provide further evidence for a link between IRPs and cancer biology.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

King's Research Portal

Emergence, development and diversification of the TGF-β signalling pathway within the animal kingdom

Author: Freilich Shiri
Goldovsky Leon
Heldin Carl-Henrik
Huminiecki Lukasz
Moustakas Aristidis
Ouzounis Christos
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The question of how genomic processes, such as gene duplication, give rise to co-ordinated organismal properties, such as emergence of new body plans, organs and lifestyles, is of importance in developmental and evolutionary biology. Herein, we focus on the diversification of the transforming growth factor-β (TGF-β) pathway, one of the fundamental and versatile metazoan signal transduction engines. Results: After an investigation of 33 genomes, we show that the emergence of the TGF-β pathway coincided with appearance of the first known animal species. The primordial pathway repertoire consisted of four Smads and four receptors, similar to those observed in the extant genome of the early diverging tablet animal (Trichoplax adhaerens). We subsequently retrace duplications in ancestral genomes on the lineage leading to humans, as well as lineage-specific duplications, such as those which gave rise to novel Smads and receptors in teleost fishes. We conclude that the diversification of the TGF-β pathway can be parsimoniously explained according to the 2R model, with additional rounds of duplications in teleost fishes. Finally, we investigate duplications followed by accelerated evolution which gave rise to an atypical TGF-β pathway in free-living bacterial feeding nematodes of the genus Rhabditis. Conclusion: Our results challenge the view of well-conserved developmental pathways. The TGF-β signal transduction engine has expanded through gene duplication, continually adopting new functions, as animals grew in anatomical complexity, colonized new environments, and developed an active immune system.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central