69 research outputs found

A study on the correlation of nucleotide skews and the positioning of the origin of replication: different modes of replication in bacterial species

Author: Almirantis Yannis
Nikolaou Christoforos
Publication venue: Oxford University Press
Publication date: 30/11/2005

Deviations from Chargaff's 2nd parity rule, according to which A∼T and G∼C in single-stranded DNA, have been associated with replication as well as with transcription in prokaryotes. Based mainly on observations of transcription-replication co-linearity in a large number of prokaryotic species, we formulate the hypothesis that replication may follow different modes in genomes whose skews clearly follow different patterns. We conclude that multiple functional origins of replication may exist in the genomes of most archaea and in some exceptional cases of eubacteria, while in the majority of eubacteria replication occurs through a single fixed origin.
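As a rough illustration of what a nucleotide skew is, the sketch below computes the widely used GC skew, (G − C) / (G + C), over consecutive windows and accumulates it along the sequence. The function names, the non-overlapping window scheme, and the toy sequence are assumptions made for this example and are not taken from the paper.

```python
from itertools import accumulate


def gc_skew(window: str) -> float:
    """Return (G - C) / (G + C) for one window, or 0.0 if it has no G or C."""
    w = window.upper()
    g, c = w.count("G"), w.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def windowed_skews(seq: str, window: int) -> list[float]:
    """GC skew of consecutive non-overlapping windows along the sequence."""
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


def cumulative_skew(skews: list[float]) -> list[float]:
    """Running sum of the windowed skews."""
    return list(accumulate(skews))


# Toy sequence (hypothetical): a C-rich half followed by a G-rich half.
toy = "CCCG" * 2 + "GGGC" * 2
skews = windowed_skews(toy, 4)
print(skews)
print(cumulative_skew(skews))
```

In this toy input the windowed skew changes sign once, halfway along the sequence, and the cumulative curve turns at the same place. Turning points of this kind are what skew analyses commonly inspect when looking for candidate origin or terminus positions.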

Crossref

PubMed Central

Optimal Computation of Overabundant Words

Author: Almirantis Yannis
Charalampopoulos Panagiotis
Gao Jia
Iliopoulos Costas S.
Mohamed Manal
Pissis Solon P.
Polychronopoulos Dimitris
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 17th International Workshop on Algorithms in Bioinformatics (WABI 2017)
Publication date: 01/01/2017

The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word w in a given sequence x can be used for classifying w as avoided or overabundant. The definitions used for the expectation and deviation of w in this statistical model were described and biologically justified by Brendel et al. (J Biomol Struct Dyn 1986). We have very recently introduced a time-optimal algorithm for computing all avoided words of a given sequence over an integer alphabet (Algorithms Mol Biol 2017). In this article, we extend this study by presenting an O(n)-time and O(n)-space algorithm for computing all overabundant words in a sequence x of length n over an integer alphabet. Our main result is based on a new non-trivial combinatorial property of the suffix tree T of x: the number of distinct factors of x whose longest infix is the label of an explicit node of T is no more than 3n-4. We further show that the presented algorithm is time-optimal by proving that O(n) is a tight upper bound for the number of overabundant words. Finally, we present experimental results, using both synthetic and real data, which justify the effectiveness and efficiency of our approach in practical terms.
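The statistical model can be illustrated with a naive sketch. The abstract does not state the formulas, so the ones used here are assumptions: the expectation is taken as E(w) = f(prefix) · f(suffix) / f(infix), the deviation as (f(w) − E(w)) / max(√E(w), 1), and a word is called overabundant when its deviation is at least a threshold `rho`. The code enumerates factors by brute force, so it is far from the O(n)-time suffix-tree algorithm the article presents; all function names are hypothetical.

```python
from math import sqrt


def count_occurrences(x: str, w: str) -> int:
    """Number of (possibly overlapping) occurrences of w in x."""
    return sum(1 for i in range(len(x) - len(w) + 1) if x.startswith(w, i))


def deviation(x: str, w: str) -> float:
    """Deviation of w in x under the assumed prefix/suffix/infix model."""
    if len(w) < 3:
        raise ValueError("this sketch only handles words of length >= 3")
    f_infix = count_occurrences(x, w[1:-1])
    # Assumed expectation: f(longest prefix) * f(longest suffix) / f(infix).
    expected = 0.0
    if f_infix:
        expected = (count_occurrences(x, w[:-1])
                    * count_occurrences(x, w[1:]) / f_infix)
    # Assumed normalisation: divide by max(sqrt(E), 1).
    return (count_occurrences(x, w) - expected) / max(sqrt(expected), 1.0)


def overabundant_words(x: str, rho: float, max_len: int) -> set[str]:
    """Brute-force scan of all factors of x with length 3..max_len."""
    factors = {x[i:i + k]
               for k in range(3, max_len + 1)
               for i in range(len(x) - k + 1)}
    return {w for w in factors if deviation(x, w) >= rho}


print(overabundant_words("abcabcxbxb", rho=1.0, max_len=3))
```

In the toy string, "abc" occurs twice while its infix "b" occurs four times, so the assumed expectation is 2 · 2 / 4 = 1 and the deviation is 1.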

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Classification of selectively constrained DNA elements using feature vectors and rule-based classifiers

Author: Dimitris Polychronopoulos
Emanuel Weitschek
Giovanni Felici
Philipp Bucher
Slavica Dimitrieva
Yannis Almirantis
Publication date: 01/01/2014

Little work has been done on the composition of conserved non-coding elements (CNEs), which are identified by comparing two or more genomes and are found in all metazoan genomes. Here we analyze CNEs with a methodology that takes into account word occurrence at various length scales, in the form of a feature vector representation, together with rule-based classifiers. We apply our approach to both protein-coding exons and CNEs, originating from human, insect (Drosophila melanogaster) and worm (Caenorhabditis elegans) genomes, that are either identified in the present study or obtained from the literature. Alignment-free feature vector representation of sequences combined with rule-based classification methods leads to successful classification of the different CNE classes. Biologically meaningful results are derived by comparison with the genomic signatures approach, and classification rates for a variety of functional elements of the genomes along with surrogates are presented.

Open Access Repository

Information decomposition of symbolic sequences

Author: Adams
Almirantis
Audi
Benson
Benson
Chaley
Chechetkin
Cole
Conway
Coward
Dodin
Dodin
E.V. Korotkov
Fraser
Glaser
Grosse
Heringa
Herren
Hertz
Herzel
Jackson
Junker
Korotkova
Kullback
Lobzin
Lotman
M.A. Korotkova
Margot
Marple
McLachlan
N.A. Kudryashov
Ng
Pennisi
Presta
Rackovsky
Ramakrishna
Rashid
Silverman
Stoesser
Tiwari
Tomb
Trifonov
Venter
Voss
Wang
Weiss
Yaglom
Zirmunsky
Publication venue: Elsevier BV
Publication date: 17/02/2003

We developed a non-parametric method of Information Decomposition (ID) of the content of any symbolic sequence. The method is based on the calculation of Shannon mutual information between the analyzed sequence and artificial symbolic sequences, and allows latent periodicity to be revealed in any symbolic sequence. We show the stability of the ID method in the case of a large number of random letter changes in an analyzed symbolic sequence. We demonstrate the possibilities of the method by analyzing poems as well as DNA and protein sequences. In DNA and protein sequences we show the existence of many DNA and amino acid sequences with different types and lengths of latent periodicity. The possible origin of latent periodicity for different symbolic sequences is discussed. Comment: 18 pages, 8 figures
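The core idea, mutual information between the analyzed sequence and an artificial periodic sequence, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the artificial sequence is taken to be the position index modulo a trial period, the information is measured in bits, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def periodic_mutual_information(seq: str, period: int) -> float:
    """Mutual information (bits) between seq and the artificial
    sequence 0, 1, ..., period-1, 0, 1, ... of the same length."""
    n = len(seq)
    joint = Counter((sym, i % period) for i, sym in enumerate(seq))
    sym_counts = Counter(seq)
    phase_counts = Counter(i % period for i in range(n))
    info = 0.0
    for (sym, phase), c in joint.items():
        # p(sym, phase) / (p(sym) * p(phase)) written with raw counts.
        ratio = c * n / (sym_counts[sym] * phase_counts[phase])
        info += (c / n) * log2(ratio)
    return info


# Scan a few trial periods on a toy sequence with an obvious period of 2.
for p in (2, 3, 4):
    print(p, periodic_mutual_information("ABABABABABAB", p))
```

A trial period that matches the repeat of the sequence (or a multiple of it) yields high mutual information, while an unrelated period yields a value near zero, which is how scanning over periods can expose periodicity.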

arXiv.org e-Print Archive

Crossref

University of Groningen

Information content based model for the topological properties of the gene regulatory network of Escherichia coli

Author: Albert
Alberts
Almirantis
Avery
Ayşe Erzan
Babu
Balcan
Balcan
Balcan
Banzhaf
Barabasi
Barabasi
Benos
Berg
Bergmann
Berkin Malkoç
Bilu
Bollobás
Browning
Buldyrev
Colizza
Colizza
Dawkins
Dawkins
Dobrin
Dodd
Dorogovtsev
Duygu Balcan
Erdös
Erdös
Gama-Castro
Gerland
Gershenzon
Guelzim
Harbison
Jeong
Kashtan
Kauffman
Kim
Koralov
Kugiumtzis
Li
Lynch
Ma
Matsumoto
Milo
Milo
Mungan
Münch
Okuda
O’Flanagan
Pachkov
Reil
Rudd
Salgado
Salgado
Samal
Sengun
Sengupta
Shannon
Shearwin
Sneppen
Spirin
Stormo
Teixeira
van Nimwegen
van Noort
Vazquez
Wagner
Warren
Watson
Wernicke
Zhou
Publication venue: Elsevier BV
Publication date: 29/12/2009

Gene regulatory networks (GRNs) are being studied with increasingly precise quantitative tools and can provide a testing ground for ideas regarding the emergence and evolution of complex biological networks. We analyze the global statistical properties of the transcriptional regulatory network of the prokaryote Escherichia coli, identifying each operon with a node of the network. We propose a null model for this network using the content-based approach applied earlier to the eukaryote Saccharomyces cerevisiae (Balcan et al., 2007). Random sequences that represent promoter regions and binding sequences are associated with the nodes. The length distributions of these sequences are extracted from the relevant databases. The network is constructed by testing for the occurrence of binding sequences within the promoter regions. The ensemble of emergent networks yields an exponentially decaying in-degree distribution and a putative power law dependence for the out-degree distribution with a flat tail, in agreement with the data. The clustering coefficient, degree-degree correlation, rich club coefficient and k-core visualization all agree qualitatively with the empirical network to an extent not yet achieved by any other computational model, to our knowledge. The significant statistical differences can point the way to further research into non-adaptive and adaptive processes in the evolution of the E. coli GRN. Comment: 58 pages, 3 tables, 22 figures. In press, Journal of Theoretical Biology (2009)
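The construction rule described above, an edge wherever a binding sequence occurs inside a promoter region, can be sketched in a few lines. The node names, the binary alphabet, and the fixed toy sequences are assumptions made for illustration; the model in the paper draws random sequences with length distributions taken from databases, which this sketch does not attempt.

```python
from collections import Counter


def build_network(promoters: dict[str, str],
                  binding_seqs: dict[str, str]) -> set[tuple[str, str]]:
    """Directed edge (regulator, target) whenever the regulator's binding
    sequence occurs as a substring of the target's promoter region."""
    return {(regulator, target)
            for regulator, site in binding_seqs.items()
            for target, promoter in promoters.items()
            if site in promoter}


def degree_counts(edges: set[tuple[str, str]]) -> tuple[Counter, Counter]:
    """Return (in-degree, out-degree) counters for the edge set."""
    in_deg = Counter(target for _, target in edges)
    out_deg = Counter(regulator for regulator, _ in edges)
    return in_deg, out_deg


# Hypothetical toy data: three operons, two of which act as regulators.
promoters = {"op1": "0110100", "op2": "111000", "op3": "0000"}
binding_seqs = {"op1": "101", "op2": "00"}

edges = build_network(promoters, binding_seqs)
in_deg, out_deg = degree_counts(edges)
print(sorted(edges))
print(dict(in_deg), dict(out_deg))
```

Even in this toy case the short binding sequence "00" matches more promoters than the longer "101", which hints at how the amount of information in a binding sequence shapes a regulator's out-degree.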

arXiv.org e-Print Archive

Crossref

Minimal Absent Words in Rooted and Unrooted Trees

Author: B Schieber
C Barton
D Belazzougui
D Belazzougui
F Mignosi
F Mignosi
F Mignosi
G Fici
G Fici
M Béal
M Béal
M Crochemore
M Crochemore
M Crochemore
M-P Béal
MA Bender
P Charalampopoulos
P Charalampopoulos
RM Silva
S Chairungsee
T Shibuya
Y Almirantis
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

We extend the theory of minimal absent words to (rooted and unrooted) trees, having edges labeled by letters from an alphabet of cardinality. We show that the set of minimal absent words of a rooted (resp. unrooted) tree T with n nodes has cardinality (resp.), and we show that these bounds are realized. Then, we exhibit algorithms to compute all minimal absent words in a rooted (resp. unrooted) tree in output-sensitive time (resp.), assuming an integer alphabet of size polynomial in n.

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Palermo

Author response image 1. Author response

Author: Almirantis
Andrey
Bel-Vialar
Berlivet
Bernstein
David
Delpretti
Deschamps
Deschamps
Deschamps
Dixon
Duboule
Duboule
Duboule
Duboule
Dubrulle
Durston
Ferraiuolo
Forlani
Fraser
Gaunt
Gaunt
Gaunt
Gerard
Graham
Izpisua-Belmonte
Kmita
Krumlauf
Lewis
Li
Mallo
Marks
Mattout
Montavon
Noordermeer
Noordermeer
Nora
Nora
Orlando
Papageorgiou
Phillips-Cremins
Pourquie
Puschel
Rousseau
Schuettengruber
Sexton
Soshnikova
Spitz
Stock
Tschopp
Tschopp
Tschopp
Tschopp
van de Werken
Wang
Whiting
Woltering
Young
Zakany
Publication venue: eLife Sciences Publications, Ltd

Crossref

The Information Coded in the Yeast Response Elements Accounts for Most of the Topological Properties of Its Transcriptional Regulation Network

Author: A Vazques
A Wagner
A Wagner
AHY Tong
AL Barabasi
AL Barabasi
Alkan Kabakçıoğlu
AS Perelson
Ayşe Erzan
B Alberts
B Bollobas
B Kınıkoğlu
CE Shannon
CT Harbison
D Balcan
DJ Lockhart
DJ Watts
Duygu Balcan
G Caldarelli
Gustavo Stolovitzky
J Ihmels
J Ihmels
J Kleffe
J Watson
M Kellis
M Molloy
M Mungan
MC Teixeira
Muhittin Mungan
N Geard
N Guelzim
NM Luscombe
R Albert
R Dobrin
R Milo
R Pastor-Satorras
RV Sole
S Bergmann
S Carmi
S Huang
S Kauffman
S Kullback
S Zhou
SA Kauffman
SH Strogatz
SN Dorogovstsev
T Reil
TI Lee
V Colizza
V Colizza
V van Noort
W Banzhaf
Y Almirantis
Publication venue: Public Library of Science
Publication date: 27/05/2006

The regulation of gene expression in a cell relies to a major extent on transcription factors, proteins which recognize and bind the DNA at specific binding sites (response elements) within promoter regions associated with each gene. We present an information-theoretic approach to modeling transcriptional regulatory networks, in terms of a simple “sequence-matching” rule and the statistics of the occurrence of binding sequences of given specificity in random promoter regions. The crucial biological input is the distribution of the amount of information coded in these cognate response elements and the length distribution of the promoter regions. We provide an analysis of the transcriptional regulatory network of the yeast Saccharomyces cerevisiae, which we extract from the available databases, with respect to the degree distributions, clustering coefficient, degree correlations, rich-club coefficient and the k-core structure. We find that these topological features are in remarkable agreement with those predicted by our model, on the basis of the amount of information coded in the interaction between the transcription factors and response elements.

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

Koç University Digital Collections

List of Texts

Numerical Study of Travelling Waves in a Reaction - Diffusion System: Response to a Spatio-Temporal Forcing

Author: Almirantis Yannis
Kaufman Marcelle
Publication date: 01/01/1992

DI-fusion