1,223 research outputs found

Revisiting Waiting Times in DNA evolution

Author: Nicodeme Pierre
Publication date: 29/05/2012

Transcription factors are short stretches of DNA (or $k$-mers), mainly located in promoter sequences, that enhance or repress gene expression. With respect to an initial distribution of letters on the DNA alphabet, Behrens and Vingron consider a random sequence of length $n$ that does not contain a given $k$-mer, or word of size $k$. Under an evolution model of the DNA, they compute the probability $\mathfrak{p}_n$ that this $k$-mer appears after a unit time of 20 years. They prove that the waiting time for the first appearance of the $k$-mer is well approximated by $T_n=1/\mathfrak{p}_n$. Their work relies on the simplifying assumption that the $k$-mer is not self-overlapping. They observe in particular that the waiting time is mostly driven by the initial distribution of letters. Behrens et al. use an automaton-based approach that relaxes the assumption on word overlaps. Their numerical evaluations confirm the validity of the Behrens and Vingron approach for non-self-overlapping words, but provide corrections of up to 44% for highly self-overlapping words such as $\mathtt{AAAAA}$. We devised an approach to the problem by clump analysis and generating functions; this approach leads to a proof of the quasi-linear behaviour of $\mathfrak{p}_n$ for a large range of values of $n$, an important result for DNA evolution. We present here this clump analysis, first by language decomposition, and next by an automaton construction; finally, we describe an equivalent approach by construction of Markov automata.

Comment: 19 pages, 3 Figures, 2 Table

arXiv.org e-Print Archive

HAL-Paris 13

A Coverage Criterion for Spaced Seeds and its Applications to Support Vector Machine String Kernels and k-Mer Distances

Author: Martin Donald E. K.
Noé Laurent
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/01/2014

Spaced seeds have recently been shown not only to detect more alignments, but also to give a more accurate measure of phylogenetic distances (Boden et al., 2013; Horwege et al., 2014; Leimeister et al., 2014), and to provide a lower misclassification rate when used with Support Vector Machines (SVMs) (Onodera and Shibuya, 2013). We confirm these two results by independent experiments, and propose in this article to use a coverage criterion (Benson and Mak, 2008; Martin, 2013; Martin and Noé, 2014) to measure the seed efficiency in both cases, in order to design better seed patterns. We first show how this coverage criterion can be directly measured by a full automaton-based approach. We then illustrate how this criterion performs when compared with two other frequently used criteria, namely the single-hit and multiple-hit criteria, through correlation coefficients with the correct classification or the true distance. Finally, for alignment-free distances, we propose an extension that adopts the coverage criterion, show how it performs, and indicate how it can be efficiently computed.

Comment: http://online.liebertpub.com/doi/abs/10.1089/cmb.2014.017
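To make the three criteria named in the abstract concrete, here is a small brute-force Python sketch. It assumes the usual conventions: a seed is a string over {1, 0} where 1 is a must-match position, an alignment is a string over {1, 0} where 1 is a match, and coverage counts the alignment positions covered by a must-match position of at least one seed hit. The function names are hypothetical, and this direct scan is not the automaton-based computation the article describes.

```python
def seed_hits(seed, alignment):
    """Return start positions where every must-match ('1') position of
    `seed` falls on a match ('1') of `alignment`."""
    care = [j for j, c in enumerate(seed) if c == "1"]
    last_start = len(alignment) - len(seed)
    return [
        i
        for i in range(last_start + 1)
        if all(alignment[i + j] == "1" for j in care)
    ]


def single_hit(seed, alignment):
    """Single-hit criterion: does the seed hit at least once?"""
    return len(seed_hits(seed, alignment)) >= 1


def hit_count(seed, alignment):
    """Multiple-hit style measure: number of seed hits."""
    return len(seed_hits(seed, alignment))


def coverage(seed, alignment):
    """Coverage (as assumed here): number of alignment positions covered
    by a must-match position of at least one hit."""
    care = [j for j, c in enumerate(seed) if c == "1"]
    covered = set()
    for i in seed_hits(seed, alignment):
        covered.update(i + j for j in care)
    return len(covered)


if __name__ == "__main__":
    seed, aln = "1101", "1111011"
    print(seed_hits(seed, aln), hit_count(seed, aln), coverage(seed, aln))
```

With the seed 1101 and the alignment 1111011, the seed hits at positions 0 and 2, and the two hits together cover five alignment positions.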

arXiv.org e-Print Archive

HAL - Lille 3

CiteSeerX

INRIA a CCSD electronic archive server

PubMed Central

A Coverage Criterion for Spaced Seeds and its Applications to Support Vector Machine String Kernels and k-Mer Distances

Author: Laurent Noé
Donald E.K. Martin
Apostolico A.
Bassino F.
Boden M.
Břinda K.
Burkhardt S.
Egidi L.
Gambin A.
Leslie C.S.
Martin D.E.K.
Régnier M.
Simon I.
Zhou L.
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/01/2010

arXiv.org e-Print Archive

HAL - Lille 3

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Copenhagen University Research Information System

Computing with cells: membrane systems - some complexity issues.

Author: Alhazov A.
Andrei Păun
Cardelli L.
Cheruku S.
Ciobanu G.
Csuhaj-Varju E.
Dang Z.
Freund R.
Gutiérrez-Naranjo M. A.
Ibarra O.
Ibarra O. H.
Immerman N.
Ionescu M.
Nishida T. Y.
Oscar H. Ibarra
Petreska B.
Păun A.
Păun Gh.
Savitch W.
Sosik P.
Syropoulos A.
Szelepcsényi R.
Zandron C.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2008

Membrane computing is a branch of natural computing which abstracts computing models from the structure and the functioning of the living cell. The main ingredients of membrane systems, called P systems, are (i) the membrane structure, which consists of a hierarchical arrangement of membranes that delimit compartments where (ii) multisets of symbols, called objects, evolve according to (iii) sets of rules which are localised and associated with compartments. By using the rules in a nondeterministic or deterministic maximally parallel manner, transitions between the system configurations can be obtained. A sequence of transitions is a computation of how the system is evolving. Various ways of controlling the transfer of objects from one membrane to another and of applying the rules, as well as possibilities to dissolve, divide or create membranes, have been studied. Membrane systems have great potential for implementing massively concurrent systems in an efficient way that would allow us to solve currently intractable problems once future biotechnology gives way to a practical bio-realization. In this paper we survey some interesting and fundamental complexity issues such as universality vs. nonuniversality, determinism vs. nondeterminism, membrane and alphabet size hierarchies, characterizations of context-sensitive languages and other language classes, and various notions of parallelism.
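The maximally parallel rule application described in the abstract can be illustrated with a short Python sketch for a single compartment. This is a simplification under stated assumptions: there is one membrane only, each rule is a hypothetical pair (consumed multiset, produced multiset) with a non-empty left-hand side, and the nondeterministic choice is replaced by a fixed greedy order over the rule list. It is not taken from the surveyed paper.

```python
from collections import Counter


def max_parallel_step(objects, rules):
    """Perform one maximally parallel step in a single compartment.

    `objects` is a Counter (multiset of symbols). `rules` is a list of
    (lhs, rhs) Counter pairs; each lhs is assumed non-empty. Rules are
    tried in list order (a deterministic stand-in for the nondeterministic
    choice), each applied as many times as the remaining objects allow.
    Products only become available after the step ends."""
    remaining = Counter(objects)
    produced = Counter()
    for lhs, rhs in rules:
        # Number of times this rule can fire on what is still unused.
        times = min(remaining[sym] // need for sym, need in lhs.items())
        for sym, need in lhs.items():
            remaining[sym] -= need * times
        for sym, amount in rhs.items():
            produced[sym] += amount * times
    # Counter addition keeps only symbols with a positive count.
    return remaining + produced


if __name__ == "__main__":
    rules = [
        (Counter(a=2), Counter(b=1)),       # aa -> b
        (Counter(a=1, b=1), Counter(c=1)),  # ab -> c
    ]
    print(max_parallel_step(Counter(a=3, b=1), rules))
```

Because the pool of unused objects only shrinks during the step, no rule is still applicable to the leftovers when the loop ends, which is the maximality condition. A different rule order can yield a different, equally valid, maximal step.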

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Self-replication, Evolvability and Asynchronicity in Stochastic Worlds

Author: A. Alissandrakis
A. Egri-Nagy
A. Whiten
A.M. Tyrrell
A.W. Burks
C. Adami
C. Darwin
C.G. Langton
C.L. Nehaniv
D. Mange
D. Parnas
E. Nimwegen van
E. Szathmáry
E.F. Codd
E.H. Davidson
F.J. Varela
G. Tempesti
G.P. Wagner
H. Sayama
I. Sommerville
J. Byl
J. Goguen
J. Holland
J. Maynard Smith
J. Szostak
J.A. Reggia
J.D. Lohn
J.D. Watson
J.P. Crutchfield
J.R. Koza
J.T. Bonner
K. Morita
L. Altenberg
L. Margulis
L. Rendell
L. Wolpert
L.E. Orgel
L.J. Fogel
L.W. Buss
M. Conrad
M. Kimura
M. Kirschner
M. Lynch
M. Ridley
M. West-Eberhard
N.J. Macias
O. Leyser
P. Dömösi
P. Wernick
P.M.B. Vitányi
R. Laing
R.E. Michod
S. Ohno
S. Rasmussen
T. Belle Van
T. Quick
T. Toffoli
T.S. Ray
V. Varshavsky
W. Banzhaf
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Crossref