Search CORE

Pseudomonas Genome Database: facilitating user-friendly, comprehensive comparisons of microbial genomes

Author: Alfarano
Alm
B. Khaira
Bateman
Brinkman
Chenna
Choi
Darling
F. S. L. Brinkman
Fulton
G. L. Winsor
Gardy
Karp
Lewenza
Lewenza
Lynn
M. D. Whiteside
Markowitz
Peterson
R. E. W. Hancock
R. Lo
Rey
Stein
T. Van Rossum
Tatusov
Tatusov
Wheeler
Winsor
Publication venue: Oxford University Press
Publication date
Field of study

Pseudomonas aeruginosa is a well-studied opportunistic pathogen that is particularly known for its intrinsic antimicrobial resistance, diverse metabolic capacity, and its ability to cause life threatening infections in cystic fibrosis patients. The Pseudomonas Genome Database (http://www.pseudomonas.com) was originally developed as a resource for peer-reviewed, continually updated annotation for the Pseudomonas aeruginosa PAO1 reference strain genome. In order to facilitate cross-strain and cross-species genome comparisons with other Pseudomonas species of importance, we have now expanded the database capabilities to include all Pseudomonas species, and have developed or incorporated methods to facilitate high quality comparative genomics. The database contains robust assessment of orthologs, a novel ortholog clustering method, and incorporates five views of the data at the sequence and annotation levels (Gbrowse, Mauve and custom views) to facilitate genome comparisons. A choice of simple and more flexible user-friendly Boolean search features allows researchers to search and compare annotations or sequences within or between genomes. Other features include more accurate protein subcellular localization predictions and a user-friendly, Boolean searchable log file of updates for the reference strain PAO1. This database aims to continue to provide a high quality, annotated genome resource for the research community and is available under an open source license

eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges

Author: A. Roth
Altenhoff
C. von Mering
Chen
Chen
Ciccarelli
Creevey
D. Szklarczyk
Eisen
Gabaldon
Hulsen
I. Letunic
J. Muller
K. Trachana
Koonin
Kuzniar
L. J. Jensen
Linard
M. Kuhn
Makarova
Milinkovitch
P. Bork
Pearson
R. Arnold
S. Powell
T. Doerks
T. Rattei
Tatusov
Tatusov
Trachana
van der Heijden
von Mering
Wapinski
Publication venue: Oxford University Press
Publication date: 01/01/2012
Field of study

Orthologous relationships form the basis of most comparative genomic and metagenomic studies and are essential for proper phylogenetic and functional analyses. The third version of the eggNOG database (http://eggnog.embl.de) contains non-supervised orthologous groups constructed from 1133 organisms, doubling the number of genes with orthology assignment compared to eggNOG v2. The new release is the result of a number of improvements and expansions: (i) the underlying homology searches are now based on the SIMAP database; (ii) the orthologous groups have been extended to 41 levels of selected taxonomic ranges enabling much more fine-grained orthology assignments; and (iii) the newly designed web page is considerably faster with more functionality. In total, eggNOG v3 contains 721 801 orthologous groups, encompassing a total of 4 396 591 genes. Additionally, we updated 4873 and 4850 original COGs and KOGs, respectively, to include all 1133 organisms. At the universal level, covering all three domains of life, 101 208 orthologous groups are available, while the others are applicable at 40 more limited taxonomic ranges. Each group is amended by multiple sequence alignments and maximum-likelihood trees and broad functional descriptions are provided for 450 904 orthologous groups (62.5%)

University of Birmingham Research Portal

Copenhagen University Research Information System

ZORA

MDC Repository

Beyond representing orthology relations by trees

Author: A Tofigh
AM Altenhoff
C Semple
Consortium T.G.O.
D Huson
D Wen
E Jacox
F Tekaia
G Jin
G. E. Scholz
J Jun
K Chen
K. T. Huber
KT Huber
L Nakhleh
LJJ Iersel van
M Hellmuth
M Hellmuth
M Lafond
M Stolzer
MS Bansal
O Mahmudi
P Gambette
P Górecki
R Tatusov
R Tatusov
S Böcker
S Willson
Y Ovadia
Y Yu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/11/2016
Field of study

Reconstructing the evolutionary past of a family of genes is an important aspect of many genomic studies. To help with this, simple relations on a set of sequences called orthology relations may be employed. In addition to being interesting from a practical point of view they are also attractive from a theoretical perspective in that e.\,g.\,a characterization is known for when such a relation is representable by a certain type of phylogenetic tree. For an orthology relation inferred from real biological data it is however generally too much to hope for that it satisfies that characterization. Rather than trying to correct the data in some way or another which has its own drawbacks, as an alternative, we propose to represent an orthology relation

\delta

in terms of a structure more general than a phylogenetic tree called a phylogenetic network. To compute such a network in the form of a level-1 representation for

\delta

, we formalize an orthology relation in terms of the novel concept of a symbolic 3- dissimilarity which is motivated by the biological concept of a ``cluster of orthologous groups'', or COG for short. For such maps which assign symbols rather that real values to elements, we introduce the novel {\sc Network-Popping} algorithm which has several attractive properties. In addition, we characterize an orthology relation

\delta

on some set

X

that has a level-1 representation in terms of eight natural properties for

\delta

as well as in terms of level-1 representations of orthology relations on certain subsets of

X

Springer - Publisher Connector

University of East Anglia digital repository

The genome and transcriptome of Trichormus sp NMC-1: insights into adaptation to extreme environments on the Qinghai-Tibet Plateau

Author: A Stamatakis
A Zorina
AL Delcher
B Langmead
BA Methé
C Xie
DA Los
DJ Wright
EP Balskus
G Blanc
G Norsang
HÄ Suh
J Qi
J Qi
J Zhang
JF Hess
JI Carreto
JM Shick
JP Zehr
K Mavromatis
KS Siddiqui
L Li
L R
L Ran
M Borodovsky
M Dassanayake
M Li
M Suyama
N Myers
P Pereira
P Puigbò
P Rajaniemi
PH Sudmant
PM Shih
Q Qiu
Q Tang
R Cavicchioli
RC Edgar
RL Tatusov
S Richter
SP Singh
SP Singh
SP Singh
T De Bie
T Kaneko
T Kogej
T Shi
U Consortium
U Nübel
WM Fitch
Z Xu
Z Yang
Z Yang
ZA Cheviron
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/07/2016
Field of study

The Qinghai-Tibet Plateau (QTP) has the highest biodiversity for an extreme environment worldwide, and provides an ideal natural laboratory to study adaptive evolution. In this study, we generated a draft genome sequence of cyanobacteria Trichormus sp. NMC-1 in the QTP and performed whole transcriptome sequencing under low temperature to investigate the genetic mechanism by which T. sp. NMC-1 adapted to the specific environment. Its genome sequence was 5.9 Mb with a G+C content of 39.2% and encompassed a total of 5362 CDS. A phylogenomic tree indicated that this strain belongs to the Trichormus and Anabaena cluster. Genome comparison between T. sp. NMC-1 and six relatives showed that functionally unknown genes occupied a much higher proportion (28.12%) of the T. sp. NMC-1 genome. In addition, functions of specific, significant positively selected, expanded orthogroups, and differentially expressed genes involved in signal transduction, cell wall/membrane biogenesis, secondary metabolite biosynthesis, and energy production and conversion were analyzed to elucidate specific adaptation traits. Further analyses showed that the CheY-like genes, extracellular polysaccharide and mycosporine-like amino acids might play major roles in adaptation to harsh environments. Our findings indicate that sophisticated genetic mechanisms are involved in cyanobacterial adaptation to the extreme environment of the QTP

Institute of Hydrobiology, Chinese Academy Of Sciences

University of Bedfordshire Repository

Database resources of the National Center for Biotechnology Information

Author: A. Souvorov
Altschul
Altschul
Blumenfeld
D. A. Benson
D. J. Lipman
D. L. Wheeler
D. Landsman
D. M. Church
D. R. Maglott
E. Sequeira
E. Yaschenko
Ermolaeva
Fung
G. D. Schuler
G. Starchenko
Geer
Ghedin
J. Ostell
K. Canese
K. D. Pruitt
K. Sirotkin
L. Wagner
L. Y. Geer
M. DiCuccio
M. Feolo
M. Shumway
Ma
Manolio
Needleman
O. Khovayko
R. Edgar
R. L. Tatusov
S. Federhen
S. H. Bryant
S. T. Sherry
Schuler
Schuler
Sewell
Sherry
T. A. Tatusova
T. Barrett
T. L. Madden
Tatusov
Tatusova
Tatusova
V. Chetvernin
V. Miller
W. Helmberg
Wang
Y. Kapustin
Zhang
Publication venue: Oxford University Press
Publication date: 01/01/2006
Field of study

In addition to maintaining the GenBank(®) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link(BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace and Assembly Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Viral Genotyping Tools, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at

Public Library of Science (PLOS)

Ontologies in Quantitative Biology: A Basis for Comparison, Integration, and Discovery

Author: A. G Murzin
A. H Renear
B. R Zeeberg
C Perez-Iratxeta
C. A Orengo
D. A Hosack
D. R Swanson
E. L Sonnhammer
F Al-Shahrour
G Joshi-Tope
H Ogata
J Schulz
K. D Dahlquist
L Montecchi-Palazzi
L. J Jensen
L. J Lu
Lars J. Jensen
M Campillos
M Selbach
M. E Aranguren
M. V Blagosklonny
N. L Washington
Peer Bork
R Hoehndorf
R. L Tatusov
S Kerrien
S. W Doniger
T Attwood
T. R Gruber
W. R Taylor
Publication venue: Public Library of Science
Publication date: 01/05/2010
Field of study

As biology is becoming a data-driven discipline, ontologies become increasingly important for systematically capturing the existing knowledge. This essay discusses current trends and how ontologies can also be used for discovery

Copenhagen University Research Information System

MDC Repository

Algorithm of OMA for large-scale orthology inference

Author: A Alexeyenko
A Bateman
A Schneider
AC Berglund-Sonnhammer
AK Bjorklund
Alexander CJ Roth
AM Altenhoff
AR Mushegian
C Dessimoz
C Dessimoz
C Dessimoz
CEV Storm
Christophe Dessimoz
CM Zmasek
D Fulton
DA Benson
DP Wall
ELL Sonnhammer
Gaston H Gonnet
K Chen
L Jensen
L Li
M Dayhoff
M Farrar
M Gil
M Remm
P Flicek
R Balasubramanian
RA Notebaart
RL Tatusov
RL Tatusov
RTJMvan der Heijden
TF DeLuca
TF Smith
WM Fitch
Publication venue: BioMed Central
Publication date: 01/12/2008
Field of study

Since the publication of our article (Roth, Gonnet, and Dessimoz: BMC Bioinformatics 2008 9: 518), we have noticed several errors, which we correct in the following

Repository for Publications and Research Data

Springer - Publisher Connector

UCL Discovery

Bidirectional best hit r-window gene clusters

Author: A Bergeron
ARO Cavalcanti
B Snel
D Durand
D Sankoff
G Bourque
Hon Wai Leong
J Bentley
K Rudd
L Parida
M Ermolaeva
M Zhang
Melvin Zhang
MP Béal
P Jaccard
R Friedman
R Hoberman
R Hoberman
RL Tatusov
S Gama-Castro
V Muller
W Fitch
X He
X Ling
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background <it>Conserved gene clusters </it>are groups of genes that are located close to one another in the genomes of several species. They tend to code for proteins that have a functional interaction. The identification of conserved gene clusters is an important step towards understanding genome evolution and predicting gene function. Results In this paper, we propose a novel pairwise gene cluster model that combines the notion of bidirectional best hits with the <it>r</it>-window model introduced in 2003 by Durand and Sankoff. The bidirectional best hit (BBH) constraint removes the need to specify the minimum number of shared genes in the <it>r</it>-window model and improves the relevance of the results. We design a subquadratic time algorithm to compute the set of BBH <it>r</it>-window gene clusters efficiently. Conclusion We apply our cluster model to the comparative analysis of <it>E. coli </it>K-12 and <it>B. subtilis </it>and perform an extensive comparison between our new model and the gene teams model developed by Bergeron <it>et al</it>. As compared to the gene teams model, our new cluster model has a slightly lower recall but a higher precision at all levels of recall when the results were ranked using statistical tests. An analysis of the most significant BBH <it>r</it>-window gene cluster show that they correspond to known operons.</p

Springer - Publisher Connector

arXiv.org e-Print Archive

ScholarBank@NUS

Network Archaeology: Uncovering Ancient Networks from Present-day Interactions

Author: A Ahmed
A Kreimer
A Mithani
A Vazquez
A Vázquez
A Wagner
AC Gavin
AL Barabási
B Manna
BP Kelley
C Tantipathananandh
C Wiuf
Carl Kingsford
DJ de Solla Price
DJ Watts
DS Callaway
E Sprinzak
ED Levy
F Guo
F Hormozdiari
G Palla
H Ebel
H Huang
HA Simon
HB Fraser
I Bezáková
I Ispolatov
I Ispolatov
J Bar-Ilan
J Dutkowski
J Felsenstein
J Flannick
J Golbeck
J Hopcroft
J Leskovec
J Leskovec
J Leskovec
J Leskovec
J Leskovec
JB Pereira-Leal
JB Pereira-Leal
Joel S. Bader
JW Pinney
JW Thornton
L Hakes
LA Goodman
M Middendorf
P Shannon
R Kumar
R Milo
R Singh
RL Tatusov
S Hanneke
S Kerrien
S Li
S Navlakha
S Redner
Saket Navlakha
T Makino
TA Gibson
U Güldener
WK Kim
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 30/08/2010
Field of study

Often questions arise about old or extinct networks. What proteins interacted in a long-extinct ancestor species of yeast? Who were the central players in the Last.fm social network 3 years ago? Our ability to answer such questions has been limited by the unavailability of past versions of networks. To overcome these limitations, we propose several algorithms for reconstructing a network's history of growth given only the network as it exists today and a generative model by which the network is believed to have evolved. Our likelihood-based method finds a probable previous state of the network by reversing the forward growth model. This approach retains node identities so that the history of individual nodes can be tracked. We apply these algorithms to uncover older, non-extant biological and social networks believed to have grown via several models, including duplication-mutation with complementarity, forest fire, and preferential attachment. Through experiments on both synthetic and real-world data, we find that our algorithms can estimate node arrival times, identify anchor nodes from which new nodes copy links, and can reveal significant features of networks that have long since disappeared.Comment: 16 pages, 10 figure

Public Library of Science (PLOS)

Cold Spring Harbor Laboratory Institutional Repository