40 research outputs found

InParanoid 6: eukaryotic ortholog clusters with inparalogs

Author: A.-C. Berglund
Dessimoz
E. L. L. Sonnhammer
E. Sjolund
Fitch
G. Ostlund
Hulsen
O'Brien
Sonnhammer
Tatusov
Publication venue: Oxford University Press

The InParanoid eukaryotic ortholog database (http://InParanoid.sbc.su.se/) has been updated to version 6 and is now based on 35 species. We collected all available ‘complete’ eukaryotic proteomes and Escherichia coli, and calculated ortholog groups for all 595 species pairs using the InParanoid program. This resulted in 2 642 187 pairwise ortholog groups in total. The orthology-based species relations are presented in an orthophylogram. InParanoid clusters contain one or more orthologs from each of the two species. Multiple orthologs in the same species, i.e. inparalogs, result from gene duplications after the species divergence. A new InParanoid website has been developed which is optimized for speed both for users and for updating the system. The XML output format has been improved for efficient processing of the InParanoid ortholog clusters.
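
The figure of 595 species pairs follows directly from the 35 species in the database: every unordered pair is compared once, giving 35 × 34 / 2 = 595. The short sketch below only checks that arithmetic; the species names are hypothetical placeholders, not the actual InParanoid species list.

```python
from itertools import combinations
from math import comb

# Number of species stated in the abstract.
n_species = 35

# Unordered pairs of distinct species: n * (n - 1) / 2.
n_pairs = comb(n_species, 2)

# Placeholder names (assumption, for illustration only).
species = [f"species_{i:02d}" for i in range(n_species)]

# Enumerating the pairs explicitly gives the same count.
pairs = list(combinations(species, 2))

print(n_pairs, len(pairs))
```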

Crossref

PubMed Central

primetv: a viewer for reconciled trees

Author: Arvestad Lars
Berglund Sonnhammer Ann-Charlotte
Lagergren Jens
Schreil Eva
Sennblad Bengt
Publication venue: BioMed Central
Publication date: 01/01/2007

Crossref

Springer - Publisher Connector

PubMed Central

Spatio-temporal analysis of prostate tumors in situ suggests pre-existence of treatment-resistant clones

Author: Bergenstråhle Ludvig
Berglund Emelie
Erickson Andrew
Friedrich Stefanie
Helleday Thomas
Lamb Alastair D
Liu Yao
Lundeberg Joakim
Marklund Maja
Schultz Niklas
Sonnhammer Erik
Tanoglidi Anna
Tarish Firas
Publication venue: Springer Science and Business Media LLC
Publication date: 17/09/2022

The molecular mechanisms underlying lethal castration-resistant prostate cancer remain poorly understood, with intratumoral heterogeneity a likely contributing factor. To examine the temporal aspects of resistance, we analyze tumor heterogeneity in needle biopsies collected before and after treatment with androgen deprivation therapy. By doing so, we are able to couple clinical responsiveness and morphological information such as Gleason score to transcriptome-wide data. Our data-driven analysis of transcriptomes identifies several distinct intratumoral cell populations, characterized by their unique gene expression profiles. Certain cell populations present before treatment exhibit gene expression profiles that match those of resistant tumor cell clusters present after treatment. We confirm that these clusters are resistant by the localization of active androgen receptors to the nuclei in cancer cells post-treatment. Our data also demonstrate that most stromal cells adjacent to resistant clusters do not express the androgen receptor, and we identify differentially expressed genes for these cells. Altogether, this study shows the potential to increase the power in predicting resistant tumors.

UCL Discovery

PubMed Central

Unifying Parsimonious Tree Reconciliation

Author: A.C. Berglund-Sonnhammer
B. Vernot
C. Conow
C. Zmasek
D. Brooks
D. Merkle
D. Merkle
F. Ronquist
F. Ronquist
F. Ronquist
G. Nelson
G. Nelson
G.J. Szollosi
J.P. Doyon
K. Chen
M. Charleston
M. Charleston
M. Goodman
M. Hafner
M. Hendy
M. Zandee
M.S. Hafner
R. Page
R. Page
R. Page
R. Patro
R.D.M. Page
R.D.M. Page
Y. Ovadia
Publication date: 01/01/2013

Evolution is a process that is influenced by various environmental factors, e.g. the interactions between different species, genes, and biogeographical properties. Hence, it is interesting to study the combined evolutionary history of multiple species, their genes, and the environment they live in. A common approach to address this research problem is to describe each individual evolution as a phylogenetic tree and construct a tree reconciliation which is parsimonious with respect to a given event model. Unfortunately, most of the previous approaches are designed for only one of host-parasite systems, gene tree/species tree reconciliation, or biogeography. Hence, a method is desirable that addresses the general problem of mapping phylogenetic trees and covers all varieties of coevolving systems, including e.g. predator-prey and symbiotic relationships. To overcome this gap, we introduce a generalized cophylogenetic event model considering the combinatorially complete set of local coevolutionary events. We give a dynamic-programming-based heuristic for solving the maximum parsimony reconciliation problem in time O(n^2), for two phylogenies each with at most n leaves. Furthermore, we present an exact branch-and-bound algorithm which uses the results from the dynamic programming heuristic for discarding partial reconciliations. The approach has been implemented as a Java application which is freely available from http://pacosy.informatik.uni-leipzig.de/coresym.
Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013).

arXiv.org e-Print Archive

Crossref

Algorithm of OMA for large-scale orthology inference

Author: A Alexeyenko
A Bateman
A Schneider
AC Berglund-Sonnhammer
AK Bjorklund
Alexander CJ Roth
AM Altenhoff
AR Mushegian
C Dessimoz
C Dessimoz
C Dessimoz
CEV Storm
Christophe Dessimoz
CM Zmasek
D Fulton
DA Benson
DP Wall
ELL Sonnhammer
Gaston H Gonnet
K Chen
L Jensen
L Li
M Dayhoff
M Farrar
M Gil
M Remm
P Flicek
R Balasubramanian
RA Notebaart
RL Tatusov
RL Tatusov
RTJMvan der Heijden
TF DeLuca
TF Smith
WM Fitch
Publication venue: BioMed Central
Publication date: 01/12/2008

Since the publication of our article (Roth, Gonnet, and Dessimoz: BMC Bioinformatics 2008 9: 518), we have noticed several errors, which we correct in the following.

Repository for Publications and Research Data

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UCL Discovery

eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations

Author: A. Roth
Altschul
Aurrecoechea
Berglund
C. von Mering
D. Szklarczyk
Datta
Edgar
Eyre
Felsenstein
Finn
Fitch
Gilbert
Guindon
Harris
Hubbard
Huerta-Cepas
I. Letunic
J. Muller
Jensen
Jensen
Kanehisa
Katoh
Koonin
Kriventseva
Kuhn
Kuzniar
L. J. Jensen
Letunic
Letunic
Li
Loytynoja
M. Kuhn
Makarova
P. Bork
P. Julien
Pruitt
Roth
S. Powell
Saebo
Sonnhammer
Swarbreck
T. Doerks
Tatusov
Tatusov
Thompson
Thompson
Uchiyama
van der Heijden
Vilella
Wapinski
Waterhouse
Zmasek
Publication venue: Oxford University Press
Publication date: 01/01/2010

The identification of orthologous relationships forms the basis for most comparative genomics studies. Here, we present the second version of the eggNOG database, which contains orthologous groups (OGs) constructed through identification of reciprocal best BLAST matches and triangular linkage clustering. We applied this procedure to 630 complete genomes (529 bacteria, 46 archaea and 55 eukaryotes), which is a 2-fold increase relative to the previous version. The pipeline yielded 224 847 OGs, including 9724 extended versions of the original COG and KOG. We computed OGs for different levels of the tree of life; in addition to the species groups included in our first release (i.e. fungi, metazoa, insects, vertebrates and mammals), we have now constructed OGs for archaea, fishes, rodents and primates. We automatically annotate the non-supervised orthologous groups (NOGs) with functional descriptions, protein domains, and functional categories as defined initially for the COG/KOG database. In-depth analysis is facilitated by precomputed high-quality multiple sequence alignments and maximum-likelihood trees for each of the available OGs. Altogether, eggNOG covers 2 242 035 proteins (built from 2 590 259 proteins) and provides a broad functional description for at least 1 966 709 (88%) of them. Users can access the complete set of orthologous groups via a web interface at: http://eggnog.embl.de
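
The abstract names reciprocal best BLAST matches as the first step in building orthologous groups. The sketch below illustrates only that general idea on toy data; it is not the eggNOG pipeline, it omits the triangular linkage clustering step, and the function names, protein identifiers, and scores are all hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score such as a
    bit score. On ties, the first pair encountered is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores between proteins of two hypothetical genomes A and B.
a_vs_b = {
    ("a1", "b1"): 250.0,
    ("a1", "b2"): 90.0,
    ("a2", "b2"): 180.0,
    ("a3", "b1"): 120.0,
}
b_vs_a = {
    ("b1", "a1"): 245.0,
    ("b1", "a3"): 110.0,
    ("b2", "a2"): 175.0,
}

rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh)
```

In this toy input, a3 has b1 as its best hit, but b1 prefers a1, so the pair (a3, b1) is not reciprocal and is left out.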

Crossref

PubMed Central

UCL Discovery

Copenhagen University Research Information System

ZORA

MDC Repository

Ortho2ExpressMatrix—a web server that interprets cross-species gene expression data by gene family information

Author: A Krause
A Krause
A Valencia
AC Berglund
AJ Enright
AJ Enright
AJ Vilella
Andreas H Ludewig
BY Liao
C Frech
EL Sonnhammer
EV Koonin
G Ostlund
H Edwards
H Parkinson
HS Le
I Rivals
J Michaud
KI Goh
L Huminiecki
M Kanehisa
M Kapushesky
M Pellegrini
M Remm
Michal R Schweiger
P Flicek
Ralf Herwig
Ramu Chenna
RC Friedman
RD Finn
RL Tatusov
S Abhiman
S Griffiths-Jones
S Haider
SF Altschul
Sylvia Krobitsch
T Barrett
T Domazet-Loso
T Meinel
T Meinel
Thomas Meinel
TJ Hubbard
TW Harris
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The study of gene families is pivotal for the understanding of gene evolution across different organisms and such phylogenetic background is often used to infer biochemical functions of genes. Modern high-throughput experiments offer the possibility to analyze the entire transcriptome of an organism; however, it is often difficult to deduce functional information from that data.
Results: To improve functional interpretation of gene expression we introduce Ortho2ExpressMatrix, a novel tool that integrates complex gene family information, computed from sequence similarity, with comparative gene expression profiles of two pre-selected biological objects: gene families are displayed with two-dimensional matrices. Parameters of the tool are object type (two organisms, two individuals, two tissues, etc.), type of computational gene family inference, experimental meta-data, microarray platform, gene annotation level and genome build. Family information in Ortho2ExpressMatrix is based on computationally different protein family approaches such as EnsemblCompara, InParanoid, SYSTERS and Ensembl Family. Currently, respective all-against-all associations are available for five species: human, mouse, worm, fruit fly and yeast. Additionally, microRNA expression can be examined with respect to miRBase or TargetScan families. The visualization, which is typical for Ortho2ExpressMatrix, is rendered as a matrix view that displays functional traits of genes (differential expression) as well as sequence similarity of protein family members (BLAST e-values) in colour codes. Such translations are intended to facilitate the user's perception of the research object.
Conclusions: Ortho2ExpressMatrix integrates gene family information with genome-wide expression data in order to enhance functional interpretation of high-throughput analyses on diseases, environmental factors, or genetic modification or compound treatment experiments. The tool explores differential gene expression in the light of orthology, paralogy and structure of gene families up to the point of ambiguity analyses. Results can be used for filtering and prioritization in functional genomic, biomedical and systems biology applications. The web server is freely accessible at http://bioinf-data.charite.de/o2em/cgi-bin/o2em.pl.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

The Tree versus the Forest: The Fungal Tree of Life and the Topological Diversity within the Yeast Phylome

Author: AC Berglund-Sonnhammer
B Robbertse
BE Dutilh
C Ane
Christophe d'Enfert
CM Zmasek
D Pisani
DA Fitzpatrick
EE Kuramae
EE Kuramae
F Delsuc
G Talavera
H Akaike
HA Schmidt
J Huerta-Cepas
J Huerta-Cepas
J Ruan
J Stoye
JA Eisen
JD Thompson
JE Galagan
KM Wong
KP Byrne
M Anisimova
Marina Marcet-Houben
MD Rasmussen
MJ Cornell
O Gascuel
RC Edgar
RT van der Heijden
S Guindon
SV Edwards
T Dagan
T Gabaldon
T Gabaldón
TF Smith
Toni Gabaldón
Publication venue: Public Library of Science

A recurrent topic in phylogenomics is the combination of various sequence alignments to reconstruct a tree that describes the evolutionary relationships within a group of species. However, such an approach has been criticized for not being able to properly represent the topological diversity found among gene trees. To evaluate the representativeness of species trees based on concatenated alignments, we reconstruct several fungal species trees and compare them with the complete collection of phylogenies of genes encoded in the Saccharomyces cerevisiae genome. We found that, despite high levels of among-gene topological variation, the species trees do represent widely supported phylogenetic relationships. Most topological discrepancies between gene and species trees are concentrated in certain conflicting nodes. We propose to map such information on the species tree so that it accounts for the levels of congruence across the genome. We identified the lack of sufficient accuracy of current alignment and phylogenetic methods as an important source for the topological diversity encountered among gene trees. Finally, we discuss the implications of the high levels of topological variation for phylogeny-based orthology prediction strategies.

Crossref

Directory of Open Access Journals

PubMed Central

Toward a General Model for the Evolutionary Dynamics of Gene Duplicates

Author: Anke Konrad
Arvestad
Ashley I. Teufel
Aury
Berglund-Sonnhammer
Blomme
Bollobás
Chen
Conant
D'Antonio
D'Souza
David A. Liberles
Denoeud
Dittmar
Dorer
Edger
Ekman
Fernández
Force
Freeling
Gu
He
Hickman
Hughes
Hughes
Hughes
Hughes
Hughes
Hurles
Innan
Iwasa
Johan A. Grahnen
Juettemann
Kay
Kim
Kondrashov
Lee
Letunic
Li
Liang
Liberles
Liberles
Lin
Liu
Liu
Lynch
Lynch
Lynch
Lynch
Lynch
Maere
Makova
Mudholkar
Nielsen
Ohno
Panavas
Papp
Press
Rasmussen
Rastogi
Reece-Hoyes
Rogers
Roth
Storm
Tarrío
Veitia
Veitia
Wagner
Walters
Zhang
Zhang
Zhang
Zhang
Zheng
Østbye
Publication venue: Oxford University Press

Gene duplication is an important process in the functional divergence of genes and genomes. Several processes have been described that lead to duplicate gene retention over different timescales after both smaller-scale events and whole-genome duplication, including neofunctionalization, subfunctionalization, and dosage balance. Two common modes of duplicate gene loss include nonfunctionalization and loss due to population dynamics (failed fixation). Previous work has characterized expectations of duplicate gene retention under the neofunctionalization and subfunctionalization models. Here, that work is extended to dosage balance using simulations. A general model for duplicate gene loss/retention is then presented that is capable of fitting expectations under the different models, is defined at t = 0, and decays to an orthologous asymptotic rate rather than zero, based upon a modified Weibull hazard function. The model in a maximum likelihood framework shows the property of identifiability, recovering the evolutionary mechanism and parameters of simulation. This model is also capable of recovering the evolutionary mechanism of simulation from data generated using an unrelated network population genetic model. Lastly, the general model is applied as part of a mixture model to recent gene duplicates from the Oikopleura dioica genome, suggesting that neofunctionalization may be an important process leading to duplicate gene retention in that organism.

Crossref

PubMed Central

Genome-Wide Influence of Indel Substitutions on Evolution of Bacteria of the PVC Superphylum, Revealed Using a Novel Computational Method

Author: Abascal
Abhiman
Altschul
Anisimova
Benner
Benner
Berglund-Sonnhammer
Blouin
Brandstrom
Britten
Britten
Chan
Chang
Chen
Cho
Clark
Crepin
Crepin
David A. Liberles
Davids
Dayhoff
DeLano
Dorman
Edgar
Edwards
Embley
Felsenstein
Fieseler
Fitch
Fuerst
Fuerst
Gao
Gregory
Griffiths
Gu
Gu
Gu
Hedlund
Henikoff
Horn
Huson
Javelle
Jenkins
Jones
Jones
Kanehisa
Knudsen
Lee
Lefebure
Lefébure
Li
Liberles
Liberles
Lindsay
Lonhienne
Meenan
Messier
Mira
Moran
Naomi L. Ward
Nilsson
Olga K. Kamneva
Orsi
Osterberg
Pascarella
Penn
Petersen
Petrov
Pilhofer
Podlaha
Podlaha
Pupko
Reeves
Retchless
Rivera
Roenner
Roy
Santarella-Mellwig
Schloss
Schully
Smith
Sorek
Stackebrandt
Stamatakis
Taylor
Van de Peer
Viguera
Wagner
Wang
Ward
Ward
Whelan
Yang
Yang
Yang
Yang
Zhang
Zheng
Zhou
Publication venue: Oxford University Press

Whole-genome scans for positive Darwinian selection are widely used to detect evolution of genome novelty. Most approaches are based on evaluation of the nonsynonymous to synonymous substitution rate ratio across evolutionary lineages. These methods are sensitive to saturation of synonymous sites and thus cannot be used to study evolution of distantly related organisms. In contrast, indels occur less frequently than amino acid replacements, accumulate more slowly, and can be employed to characterize evolution of diverged organisms. As indels are also subject to the forces of natural selection, they can generate functional changes through positive selection. Here, we present a new computational approach to detect selective constraints on indel substitutions at the whole-genome level for distantly related organisms. Our method is based on ancestral sequence reconstruction, takes into account the varying susceptibility of different types of secondary structure to indels, and according to simulation studies is conservative. We applied this newly developed framework to characterize the evolution of organisms of the Planctomycetes, Verrucomicrobia, Chlamydiae (PVC) bacterial superphylum. The superphylum contains organisms with unique cell biology, physiology, and diverse lifestyles. It includes bacteria with simple cell organization and more complex eukaryote-like compartmentalization. Lifestyles range from free-living organisms to obligate pathogens. In this study, we conduct a whole-genome level analysis of indel substitutions specific to evolutionary lineages of the PVC superphylum and find that indels evolved under positive selection on up to 12% of gene tree branches. We also analyze possible functional consequences for several case studies of predicted indel events.

Crossref

PubMed Central