13 research outputs found

On the Representability of Complete Genomes by Multiple Competing Finite-Context (Markov) Models

Author: A Milosavljević
AJ Pinho
AL Delcher
António J. R. Neves
Armando J. Pinho
B Behzadi
Carlos A. C. Bastos
CB Burge
Christos A. Ouzounis
D Loewenstern
D Robelin
D Salomon
E Rivals
ET Whittaker
G Korodi
G Korodi
G Manzini
GF Hardy
H Richard
I Tabus
J Rissanen
J Venn
J Ziv
K Sayood
K Sjölander
L Allison
L Allison
M Brown
M Rho
M Stanke
MD Cao
MD Cao
MY Borodovsky
MY Borodovsky
MY Borodovsky
P Ferragina
P Salamon
Paulo J. S. G. Ferreira
PS Laplace
R Giancarlo
S Grumbach
S Tavaré
SL Salzberg
SL Zabell
SL Zabell
T Bayes
TC Bell
TI Dix
W Zhu
WE Johnson
X Chen
Z Liu
Publication venue: Public Library of Science
Publication date: 01/01/2011
Field of study

A finite-context (Markov) model of order k yields the probability distribution of the next symbol in a sequence of symbols, given the recent past up to depth k. Markov modeling has long been applied to DNA sequences, for example to find gene-coding regions. With the first studies came the discovery that DNA sequences are non-stationary: distinct regions require distinct model orders. Since then, Markov and hidden Markov models have been extensively used to describe the gene structure of prokaryotes and eukaryotes. However, to our knowledge, a comprehensive study of the potential of Markov models to describe complete genomes is still lacking. We address this gap in this paper. Our approach relies on (i) multiple competing Markov models of different orders, (ii) careful programming techniques that allow orders as large as sixteen, (iii) adequate inverted repeat handling, and (iv) probability estimates suited to the wide range of context depths used. To measure how well a model fits the data at a particular position in the sequence, we use the negative logarithm of the probability estimate at that position. The measure yields information profiles of the sequence, which are of independent interest. The average over the entire sequence, which amounts to the average number of bits per base needed to describe the sequence, is used as a global performance measure. Our main conclusion is that, from the probabilistic or information-theoretic point of view and according to this performance measure, multiple competing Markov models explain entire genomes almost as well as, or even better than, state-of-the-art DNA compression methods, such as XM, which rely on very different statistical models. This is surprising, because Markov models are local (short-range), in contrast with the statistical models underlying other methods, where the extensive data repetitions in DNA sequences are explored, and which therefore have a non-local character.
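The fit measure described in the abstract above (the negative logarithm of the probability estimate at each position, averaged into bits per base) can be illustrated with a minimal sketch. It assumes a single adaptive order-k model with a simple additive-smoothing estimator; the function names, the `alpha` parameter, and the four-letter alphabet are illustrative assumptions, not the paper's implementation, and the sketch omits the competing models and the inverted-repeat handling the abstract mentions.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumed alphabet; ambiguity codes are not handled


def information_profile(seq, k=2, alpha=1.0):
    """Per-position cost in bits under one adaptive order-k model.

    The estimator (n_s + alpha) / (n + alpha * |alphabet|) is an assumed,
    generic additive-smoothing choice, not necessarily the paper's.
    """
    # counts[context][symbol] = times `symbol` has followed `context` so far
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))
    profile = []
    for i, symbol in enumerate(seq):
        # The first k positions use the shorter context that is available.
        context = seq[max(0, i - k):i]
        ctx_counts = counts[context]
        total = sum(ctx_counts.values())
        p = (ctx_counts[symbol] + alpha) / (total + alpha * len(ALPHABET))
        profile.append(-math.log2(p))
        # Update after coding, so the estimate depends only on the past.
        ctx_counts[symbol] += 1
    return profile


def bits_per_base(seq, k=2, alpha=1.0):
    """Average of the information profile (seq must be non-empty)."""
    profile = information_profile(seq, k, alpha)
    return sum(profile) / len(profile)
```

With no history, every symbol costs 2 bits under this estimator (probability 1/4), and the average drops below 2 bits per base as contexts start to predict the next symbol, for instance on a highly repetitive input such as `"ACGT" * 50`.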

CiteSeerX

Public Library of Science (PLOS)

Crossref

Repositório Institucional da Universidade de Aveiro

Directory of Open Access Journals

PubMed Central

Disease proteomics

Author: A Borodovsky
A Yamamoto
AO Gure
C Eymann
CL Nilsson
CM Perou
CP Paweletz
D Greenbaum
DK Arrell
E Gagnon
E Lasonder
E Stockert
EF Petricoin
F Le Naour
FM Brichory
G Chen
G Evans
G Haas
G Zhou
H Antelmann
H Langen
J Madoz-Gurpide
J van Der Velden
JA Westbrook
JD Brenton
JE Van Eyk
JH McKerrow
JP Pellois
L Florens
L Zhang
LJ Old
M Brivio
M Stoeckli
MJ van de Vijver
MM Gourevitch
MY Heinke
N Jessani
N Sabarth
P Ping
PJ Mintz
RA VanBogelen
S Gruvberger
S Hanash
S Hanash
S Hoving
S Rubenwolf
Sam Hanash
SD Reid
SM Hanash
T Sorlie
T Soussi
TM Vondriska
TS Lewis
V Knezevic
WF Patton
WH Robinson
X Zuo
XP Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/03/2003

Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/62680/1/nature01514.pd

Crossref

Deep Blue Documents

Uniform Accuracy of the Maximum Likelihood Estimates for Probabilistic Models of Biological Sequences

Author: A Dembo
AY Mitrophanov
AY Mitrophanov
C Burge
C McDiarmid
CE Lawrence
DR Cox
G Fort
GA Churchill
GO Roberts
H Almagor
H Chernoff
K Azuma
L Saulis
L Saulis
LL Gatlin
LV Osipov
M Borodovsky
M Borodovsky
Mark Borodovsky
MY Borodovsky
MY Borodovsky
P Billingsley
P Gudynas
P Tuominen
P-M Samson
PJ Bickel
PW Glynn
R Durbin
R Montenegro
S Ekisheva
S Karlin
S Karlin
S Tavaré
SP Meyn
SV Nagaev
Svetlana Ekisheva
T Petrie
W Feller
WV Li
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Proteome Databases

Author: A Bairoch
A Bairoch
A Bairoch A
AJ Bleasby
B Jacq
C Barry
C Base
C Ericsson
C Robinson
CS Giometti
DA Benson
E Birney
E Selkov
EE Abola
EL Sonnhammer
G Stoesser
H Boucherie
H Ji
Haike Antelmann
I Moszer
JA Blake
JC Sanchez
JE Hansen
JG Henikoff
JM Cherry
JM Claverie
JM Corbett
K Gevaert
KE Rudd
KH Fasman
L Holm
LF Kolakowski Jr
M Kanehisa
M Kroeger
MC Peitsch
MD Adams
MS Boguski
MY Borodovsky
MY Borodovsky
N Guex
NL Anderson
P Cash
P Fabian
P James
P Jungblut
PD Karp
Phillip Cash
PL Pearson
R Apweiler
R Durbin
R Schneider
RA Sayle
RA Vanbogelen
RD Appel
RD Appel
RF Doolittle
SF Altschul
SR Eddy
Takashi Sazuka
TK Attwood
W Kabsch
WE Payne
Y Tateno
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1997

The awareness that protein and DNA sequence data are essential to the understanding of biological systems is now well established in the life science community. This community is progressively becoming conscious that this is also true of additional information about protein expression, post-translational modifications, tertiary structure and, of course, function. All of this knowledge needs to be encapsulated in various databases. The goal of this chapter is to describe the data resources that are available to researchers working in the field of proteome studies. We will not attempt here to survey all the different databases that are relevant to this field. Such an exercise would be tedious due to the large number of relevant databases and would only be valid for a very short period of time due to the extreme speed with which new databases are appearing and/or disappearing. It is also for this reason that you will find a table at the end of this chapter (Table 5.1) listing the World-Wide Web (WWW) addresses of the databases described in the following sections. The most important component of this table is the Internet address that allows you to download an up-to-date version of the table! We will successively describe the type of information found in the following types of databases: protein sequence, nucleotide sequence, pattern/profile, 2-D PAGE, 3-D structure, post-translational modification, genomic and metabolic. The last section of this chapter will try to predict future trends in the evolution of protein information resources.

Crossref

Archive ouverte UNIGE

De-ubiquitination and ubiquitin ligase domains of A20 downregulate NF-κB signalling

Author: A Borodovsky
A Devin
A Devin
A Kieser
AM Weissman
AT Ting
AW Opipari Jr
C Wang
CM Pickart
CS Shi
DF Legler
EG Lee
H Hsu
H Hsu
H Zhou
JL Poyet
K Tada
KD Wilkinson
KS Makarova
L Aravind
L Deng
MA Kelliher
MY Balakirev
O Micheau
PC Evans
PC Evans
SQ Zhang
SY Lee
V Dixit
WC Yeh
Publication venue: 'Springer Science and Business Media LLC'

Crossref

The ubiquitin-modifying enzyme A20 is required for termination of Toll-like receptor responses

Author: A Borodovsky
A Kovalenko
A Krikos
AM Weissman
AW Opipari Jr.
C Wang
CA Janeway Jr.
D Wald
E Brint
E Trompouki
EG Lee
F Nomura
G Zhang
GJ Nau
GM Barton
GR Bignell
HY Song
I Kinjyo
IE Wertz
JT Cooper
K Heyninck
K Kobayashi
K Ohashi
K Takeda
KS Makarova
L Deng
MA Lomaga
MY Balakirev
P Burkett
PC Evans
PC Evans
R Beyaert
R Nakagawa
RM Hofmann
S Akira
S Akira
T-H Chuang
TR Brummelkamp
V Baud
Y Okamura
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Reactive-site-centric chemoproteomics identifies a distinct class of deubiquitinase enzymes

Author: A Borodovsky
A Borodovsky
A Maréchal
A Maréchal
AE Speers
AEH Elia
AP Turnbull
BH Ha
BH Ha
C Das
C Tagwerker
CM Woo
CW Law
D Flierman
D Komander
D Komander
DA Boudreaux
DS Hewings
DW Bak
E Weerapana
E Weerapana
ElF Oualid
FE Reyes-Turcu
G Beauclair
H Drobecq
H Liu
H Ovaa
HM Yoo
I Dikic
I Letunic
IE Wertz
J Szychowski
J Tkáč
J Wang
J Wang
JA Prescher
JF McGouran
JR Wiśniewski
K Newton
K Newton
KR Love
L Kategaya
LC Dang
LE Sanman
LMI Koharudin
M Abo
M Yu
ME Ritchie
MH Wright
MJ Niphakis
MY Balakirev
PJ Britto
R Ekkebus
RM Hofmann
S Bondalapati
SA Abdul Rehman
SA Beausoleil
T Glatter
T Wang
TG Wucherpfennig
V Chau
V Quesada
W Li
W Liu
WF Vranken
Y Gao
Y Qian
Y Yang
Y Yang
Publication venue: 'Springer Science and Business Media LLC'

Crossref