Search CORE

3 research outputs found

Comparative Analysis of Transcription Start Sites Using Mutual Information

Author: Mitra Chanchal K.
Reddy D. Ashok
Publication venue: Beijing Institute of Genomics. Production and hosting by Elsevier B.V.
Publication date: 31/12/2006
Field of study

The transcription start site (TSS) region shows greater variability compared with other promoter elements. We are interested to search for its variability by using information content as a measure. We note in this study that the variability is significant in the block of 5 nucleotides (nt) surrounding the TSS region compared with the block of 15 nt. This suggests that the actual region that may be involved is in the range of 5–10 nt in size. For Escherichia coli, we note that the information content from dinucleotide substitution matrices clearly shows a better discrimination, suggesting the presence of some correlations. However, for human this effect is much less, and for mouse it is practically absent. We can conclude that the presence of short-range correlations within the TSS region is species-dependent and is not universal. We further observe that there are other variable regions in the mitochondrial control element apart from TSS. It is also noted that effective comparisons can only be made on blocks, while single nucleotide comparisons do not give us any detectable signals

Elsevier - Publisher Connector

Identifying statistical dependence in genomic sequences via mutual information estimates

Author: Aktulga HM
Grama AY
Kontoyiannis I
Lyznik LA
Szpankowski L
Szpankowski W
Publication venue
Publication date: 01/01/2007
Field of study

Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5' untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's Combined DNA Index System (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling.Comment: Preliminary version. Final version in EURASIP Journal on Bioinformatics and Systems Biology. See http://www.hindawi.com/journals/bsb

arXiv.org e-Print Archive

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

CUED - Cambridge University Engineering Department

Comparative Analysis of Transcription Start Sites Using Mutual Information

Author: Aerts
Altschul
Arndt
Bajic
Boore
Bucher
Bulyk
Chanchal K. Mitra
Chang
Cover
D. Ashok Reddy
Dawy
Dayhoff
Down
Gershenzon
Gonnet
Henikoff
Hershberg
Karlin
Leitao
Lunter
Martin
Montoya
Périer
Reddy
Reddy
Schwartz
Shannon
Smale
Taanman
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006
Field of study

Crossref