Springer - Publisher Connector

Rise and Demise of Bioinformatics? Promise and Progress

Author: A Atanassov
A Godzik
A Wada
AJ Butte
AJ Butte
AS Khalil
B Palsson
BM Slepchenko
BWS Sobral
C Auffray
C Brewster
C Hack
C Ouzounis
C Ouzounis
C Perez-Iratxeta
C Pons
C Sander
CA Ouzounis
CA Ouzounis
Christos A. Ouzounis
CJ Miller
CJ O'Donnell
D Eisenberg
D Gurwitz
D Howe
D Roy
DA Hanauer
DP Faith
DS Roos
EC Berglund
FC Kafatos
G Alterovitz
G Miller
GA Thorisson
GH Fernald
H Gavaghan
H Kitano
H Volpin
IN Sarkar
J Barker
J Reed
JA Blake
JH Moore
JL Blanchard
JM Thornton
L da Fontoura Costa
L Serrano
LD Stein
M Boden
M Harvey
M Kanehisa
M Krallinger
M Pop
M Pop
M Suarez
MD Ritchie
MV Schneider
N See-Kiong
NC Kyrpides
P Chain
P Nightingale
P Tarczy-Hornoch
Philip E. Bourne
Q Yan
R Fuchs
R Molidor
RA Gatenby
RB Altman
RD Sleator
RJ Robbins
RJ Simpson
S Aldridge
S Buckingham
S Kumar
S Ranganathan
SE Ilyin
SW Scherer
SY Rhee
T Craddock
T Maschio
TF Smith
TK Attwood
V de Lorenzo
V Hatzimanikatis
V Maojo
YP Chen
Publication venue: Public Library of Science
Publication date: 26/04/2012

The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue in the use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning, while at the same time the range of applications is widening to cover an ever-increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself.

Stratification of co-evolving genomic groups using ranked phylogenetic profiles

Author: A Karimpour-Fard
A Muller
A Tsirigos
AC McHardy
AJ Enright
AJ Enright
Assaf Gottlieb
C Nieto
C Ouzounis
CA Ouzounis
Christos A Ouzounis
CM Fraser
DC Krakauer
DP Kreil
Eric Blanc
ES Snitkin
EV Koonin
GS Chang
H Teeling
I Cases
J Reidl
J Wu
L Goldovsky
L Goldovsky
LB Koski
Leon Goldovsky
M Pellegrini
MA Ragan
MR Graham
P Hugenholtz
R Chenna
RJ Case
RL Tatusov
S Cokus
S Freilich
S Garcia-Vallve
S Karlin
S Karlin
S Karlin
S Karlin
S Podell
SA Shelburne
SF Altschul
SG Tringe
Shiri Freilich
Sophia Tsoka
T Abe
TZ DeSantis
V Kunin
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Previous methods of detecting the taxonomic origins of arbitrary sequence collections, with a significant impact to genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database. Results: The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and allows the classification of protein sequences according to the species of origin with high accuracy, allowing taxonomic classification of 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples. Conclusion: Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples.

Springer - Publisher Connector

King's Research Portal

A proteogenomic update to Yersinia: enhancing genome annotation

Author: AJ Link
AM Frank
C Ansong
C Sacerdot
C Wei
CA Ouzounis
D Perlman
IB Rogozin
J Crabtree
JD Bendtsen
JD Jaffe
JE Elias
M Aivaliotis
M Baudet
M Mann
N Gupta
NE Castellana
PR Jungblut
PS Chain
R Pieper
R Pieper
R Pieper
Rembert Pieper
RR Brubaker
S Gallien
S Tanner
Samuel H Payne
Shih-Ting Huang
SL Salzberg
T Dandekar
T Gaasterland
W Deng
Publication venue: BioMed Central
Publication date: 01/08/2010

Background: Modern biomedical research depends on a complete and accurate proteome. With the widespread adoption of new sequencing technologies, genome sequences are generated at a near exponential rate, diminishing the time and effort that can be invested in genome annotation. The resulting gene set contains numerous errors in even the most basic form of annotation: the primary structure of the proteins. Results: The application of experimental proteomics data to genome annotation, called proteogenomics, can quickly and efficiently discover misannotations, yielding a more accurate and complete genome annotation. We present a comprehensive proteogenomic analysis of the plague bacterium, Yersinia pestis KIM. We discover non-annotated genes, correct protein boundaries, remove spuriously annotated ORFs, and make major advances towards accurate identification of signal peptides. Finally, we apply our data to 21 other Yersinia genomes, correcting and enhancing their annotations. Conclusions: In total, 141 gene models were altered and have been updated in RefSeq and Genbank, which can be accessed seamlessly through any NCBI tool (e.g. blast) or downloaded directly. Along with the improved gene models we discover new, more accurate means of identifying signal peptides in proteomics data.

Springer - Publisher Connector

The FGGY carbohydrate kinase family : insights into the evolution of functional specificities

Author: A Osterman
A Vendeville
Adam Godzik
AE Todd
AE Todd
AM Schnoes
Andrei Osterman
B Reva
BE Engelhardt
BG Magor
CA Bonner
CA Orengo
Christos A. Ouzounis
CM Seibert
D Grueninger
D Wu
DA Lee
DA Rodionov
E Di Luccio
G Casari
GE Crooks
HM Berman
I Letunic
Irina Rodionova
JA Capra
JA Capra
JA Gerlt
JH Hurley
JH Hurley
JI Yeh
K Sjolander
K Ye
KB Xavier
LA David
M Ormo
M Pachkov
ME Glasner
MN Price
MV Omelchenko
N Krishnamurthy
Olga Zagnitko
OV Kalinina
P Shannon
R Overbeek
RC Edgar
RC Edgar
RD Finn
RK Aziz
S Cheek
SS Hannenhalli
TA Tatusova
TT Nguyen
W-D Fessner
Y Zhang
Ying Zhang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/12/2011

© The Author(s), 2011. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in PLoS Computational Biology 7 (2011): e1002318, doi:10.1371/journal.pcbi.1002318.

Function diversification in large protein families is a major mechanism driving expansion of cellular networks, providing organisms with new metabolic capabilities and thus adding to their evolutionary success. However, our understanding of the evolutionary mechanisms of functional diversity in such families is very limited, which, among many other reasons, is due to the lack of functionally well-characterized sets of proteins. Here, using the FGGY carbohydrate kinase family as an example, we built a confidently annotated reference set (CARS) of proteins by propagating experimentally verified functional assignments to a limited number of homologous proteins that are supported by their genomic and functional contexts. Then, we analyzed, on both the phylogenetic and the molecular levels, the evolution of different functional specificities in this family. The results show that the different functions (substrate specificities) encoded by FGGY kinases have emerged only once in the evolutionary history following an apparently simple divergent evolutionary model. At the same time, on the molecular level, one isofunctional group (L-ribulokinase, AraB) evolved at least two independent solutions that employed distinct specificity-determining residues for the recognition of the same substrate (L-ribulose). Our analysis provides a detailed model of the evolution of the FGGY kinase family. It also shows that only combined molecular and phylogenetic approaches can help reconstruct a full picture of functional diversifications in such diverse families.

This study was funded by NIH and DOE grants.

Woods Hole Open Access Server

eScholarship - University of California

The Roots of Bioinformatics in Protein Evolution

Author: AJP Martin
AP Ryle
CA Ouzounis
CB Anfinsen
CB Anfinsen
CB Bridges
CH Li
David B. Searls
E Abderhalden
E Margoliash
E Zuckerkandl
EB Lewis
F Sanger
G Braunitzer
GA Mross
HA Itano
Ingram
JB Hagen
K Brew
KA Walsh
L Pauling
MO Dayhoff
MO Dayhoff
MO Dayhoff
MO Dayhoff
MW Nirenberg
P Edman
P Edman
R Eck
RF Doolittle
RF Doolittle
RF Doolittle
RF Doolittle
RL Hill
Russell F. Doolittle
S Henikoff
S Moore
SB Needleman
SG Stephens
SJ Singer
V du Vigneaud
V Ingram
WA Fitch
WM Fitch
Publication venue: Public Library of Science
Publication date: 01/07/2010

eScholarship - University of California

Toxicogenomic Analysis Suggests Chemical-Induced Sexual Dimorphism in the Expression of Metabolic Genes in Zebrafish Liver

Author: A Jost
A Stromberg
A Subramanian
AP Arnold
B Mittendorfer
Baowen Li
BD Robison
C Cheadle
C Dennis
C Holden
CA Mugford
CH Phoenix
Choong Yong Ung
Christos A. Ouzounis
CW Bardin
DJ Waxman
DJ Waxman
E Han
EM Santos
H Horiuchi
J Bakker
Jing Ma
K Begriche
Louxin Zhang
M Enserink
M Gochfeld
M Kanehisa
MA De Leon-Nava
ME Mendelsohn
MJ de Hoon
R Sreenivasan
SC Woods
SH Lam
SH Lam
SH Lam
Siew Hong Lam
T Decsi
TJ Nicolson
W Davies
X Yang
X Yang
Xun Zhang
Y Nishida
YH Yang
Yu Zong Chen
Z Li
Zhiyuan Gong
Publication venue: Public Library of Science (PLoS)
Publication date: 18/12/2012

doi:10.1371/journal.pone.0051971, PLoS ONE 7(12)

ScholarBank@NUS

A Combinatorial Approach to Detect Coevolved Amino Acid Networks in Protein Families of Variable Divergence

Author: A Del Sol
A Del Sol
AE Todd
AK Ramani
Alessandra Carbone
BZ Harris
C Ouzounis
C-H Yeang
CA Wilson
CC Goh
CP Ponting
D Barker
D Bashford
D Bilder
DA Doyle
DD Pollock
EH Syed
GB Gloor
GJ Bartlett
GM Suel
I Mihalek
JJ Perona
Julie Baussand
K Kataoka
L Hedstrom
L Hedstrom
M Fares
M Paoli
M van Ham
MF Perutz
N Ota
NV Grishin
O Lichtarge
PA Glynne
PJ Baker
PJ Baker
R Liddington
RI Dima
Roland Dunbrack
RW Pearson
S Engelen
S Gianni
S Guindon
SW Lockless
T Ohshima
T Sekimoto
T Sekimoto
W Tian
W Zheng
WR Atchley
Z Songyang
Publication venue: Public Library of Science
Publication date: 01/01/2009

Communication between distant sites often defines the biological role of a protein: long-range amino acid interactions are as important in binding specificity, allosteric regulation and conformational change as residues directly contacting the substrate. Maintaining the functional and structural coupling of long-range interacting residues requires coevolution of these residues. Networks of interaction between coevolved residues can be reconstructed, and from these networks one can possibly derive insights into functional mechanisms for the protein family. We propose a combinatorial method for mapping conserved networks of amino acid interactions in a protein, based on the analysis of a set of aligned sequences, the associated distance tree and the combinatorics of its subtrees. The degree of coevolution of all pairs of coevolved residues is identified numerically, and networks are reconstructed with a dedicated clustering algorithm. The method drops the constraints on high sequence divergence that limit the range of applicability of previously proposed statistical approaches. We apply the method to four protein families, where we show an accurate detection of functional networks and the possibility of treating sets of protein sequences of variable divergence.

CiteSeerX

HAL-Inserm

Digital Repository @ Iowa State University (ISU)

Re-Annotation Is an Essential Step in Systems Biology Modeling of Functional Genomics Data

Author: A Harel
A Hutloff
AM Schnoes
Bart H. J. van den Berg
BH van den Berg
C Smith
CA Ouzounis
CE Jones
CE Rudd
CH Wu
D Barrell
D Devos
D Kemmer
DA Benson
DP Wall
E Eyras
E Quevillon
F Meurens
Fiona M. McCarthy
FM McCarthy
G Moreno-Hagelsieb
H Zhou
ICGS Consortium
Iddo Friedberg
J Burnside
JC Camus
JR Wortman
K Sellheyer
KM Kim
L Tian
LL Chen
M Andersson
M Andersson
M Ashburner
M Pruess
M Schena
M Vidric
ME van Berkel
MK Richardson
N Daraselia
N Gupta
N Rocques
O Gundogdu
PB Neerincx
PE Neiman
R Apweiler
R Edgar
RA Shilling
S Washietl
SE Brenner
Shane C. Burgess
SL Salzberg
Susan J. Lamont
T Barrett
TJ Buza
TJ Buza
UM Braga-Neto
V Wood
X Wang
YP de Jong
Publication venue: Public Library of Science
Publication date: 01/01/2010

One motivation of systems biology research is to understand gene functions and interactions from functional genomics data such as that derived from microarrays. Up-to-date structural and functional annotations of genes are an essential foundation of systems biology modeling. We propose that the first essential step in any systems biology modeling of functional genomics data, especially for species with recently sequenced genomes, is gene structural and functional re-annotation. To demonstrate the impact of such re-annotation, we structurally and functionally re-annotated a microarray developed, and previously used, as a tool for disease research. We quantified the impact of this re-annotation on the array based on the total numbers of structural and functional annotations, the Gene Annotation Quality (GAQ) score, and canonical pathway coverage. We next quantified the impact of re-annotation on systems biology modeling using a previously published experiment that used this microarray. We show that re-annotation improves the quantity and quality of structural and functional annotations, allows more comprehensive Gene Ontology-based modeling, and improves pathway coverage for both the whole array and a differentially expressed mRNA subset. Our results also demonstrate that re-annotation can result in a different knowledge outcome from that derived in previously published research findings. We propose that, because of this, re-annotation should be considered an essential first step for deriving value from functional genomics data.

CiteSeerX

Scholars Junction - Mississippi State University Institutional Repository

Global Mapping of DNA Conformational Flexibility on Saccharomyces cerevisiae

Author: A Aranda
A Fungtammasan
A Letessier
A Re
A Sarai
AM Casper
AM Puliti
Andrea Bedini
B Gelfand
B Le Tallec
BR Graveley
C Vaillant
CA Beelman
Christos A. Ouzounis
D Mishmar
D Scannell
DJ Hogan
E Segal
E Zlotorynski
E Zlotorynski
EA Ozonov
EM Prescott
F Ozsolak
Giulia Menconi
H Zhang
I Sbrana
I Tirosh
I Tirosh
I Tirosh
Isabella Sbrana
J Zhao
JD Lieb
JH Graber
JM Perez-Canadillas
K Mimori
K Mrasek
KE Shearwin
KP Byrne
KP O’Brien
L Hurst
M Debatisse
MD Vinces
MTJ van Loenhout
NN Batada
O Shalem
P Milani
PJ Coates
R Gemayel
R Shalgi
Roberto Barale
S Kruglyak
S Semba
SG Durkin
T Tuller
TW Glover
U Nagalakshmi
W Lee
Y Field
Y Lai
Y Wang
Y Yang
Z Guo
Z Guo
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2015

In this study we provide the first comprehensive map of DNA conformational flexibility in the complete Saccharomyces cerevisiae genome. Flexibility plays a key role in DNA supercoiling and DNA/protein binding, regulating DNA transcription, replication or repair. Specific interest in flexibility analysis concerns its relationship with human genome instability. Enrichment in flexible sequences has been detected in unstable regions of the human genome termed fragile sites, where genes map and carry frequent deletions and rearrangements in cancer. Flexible sequences have been suggested to be the determinants of fragile gene proneness to breakage; however, their actual role and properties remain elusive. Our in silico analysis, carried out genome-wide via the StabFlex algorithm, shows the conserved presence of highly flexible regions in the budding yeast genome as well as in genomes of other Saccharomyces sensu stricto species. Flexible peaks in S. cerevisiae identify 175 ORFs mapping on their 3’UTR, a region affecting mRNA translation, localization and stability. (TA)n repeats of different extension shape the central structure of peaks and co-localize with polyadenylation efficiency element (EE) signals. ORFs with flexible peaks share common features. Transcripts are characterized by decreased half-life: this is considered characteristic of genes involved in regulatory systems with high turnover; consistently, their function affects biological processes such as cell cycle regulation or stress response. Our findings support the functional importance of flexibility peaks, suggesting that the flexible sequence may be derived by an expansion of the canonical TAYRTA polyadenylation efficiency element. The flexible (TA)n repeat amplification could be the outcome of an evolutionary neofunctionalization leading to differential 3’-end processing and expression regulation in genes with specific functions.
Our study provides new support for the functional role of flexibility in genomes and a strategy for its characterization inside human fragile sites.

Archivio della Ricerca - Università di Pisa