arXiv.org e-Print Archive

MPG.PuRe

Segmentation of DNA sequences into twostate regions and melting fork regions

Author: Bakk A
Eivind Hovig
Eivind Tøstesen
Fang Liu
Geir Kjetil Sandve
Poland D
Zhou Y
Publication venue: 'IOP Publishing'
Publication date: 28/11/2008

The accurate prediction and characterization of DNA melting domains by computational tools could facilitate a broad range of biological applications. However, no algorithm for melting domain prediction has been available until now. The main challenges include the difficulty of mathematically mapping a qualitative description of DNA melting domains to quantitative statistical mechanics models, as well as the absence of 'gold standards' and a need for generality. In this paper, we introduce a new approach to identify the twostate regions and melting fork regions along a given DNA sequence. Compared with an ad hoc segmentation used in one of our previous studies, the new algorithm is based on boundary probability profiles, rather than standard melting maps. We demonstrate that a more detailed characterization of the DNA melting domain map can be obtained using our new method, and this approach is independent of the choice of DNA melting model. We expect this work to drive our understanding of DNA melting domains one step further.

Comment: 17 pages, 8 figures; new introduction, added refs, minor changes

Harvard University - DASH

Computing DNA Duplex Instability Profiles Efficiently with a Two-State Model: Trends of Promoters and Binding Sites

Author: Gelev Vladimir
Kantorovitz Miriam R
Rapti Zoi
Usheva Anny
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/04/2011

Background: DNA instability profiles have been used recently for predicting the transcriptional start site and the location of core promoters, and to gain insight into promoter action. It was also shown that the use of these profiles can significantly improve the performance of motif finding programs. Results: In this work we introduce a new method for computing DNA instability profiles. The model that we use is a modified Ising-type model and it is implemented via statistical mechanics. Our linear-time algorithm computes the profile of a 10,000 base-pair long sequence in less than one second. The method we use also allows the computation of the probability that several consecutive bases are unpaired simultaneously. This is a feature that is not available in other linear-time algorithms. We use the model to compare the thermodynamic trends of promoter sequences of several genomes. In addition, we report results that associate the location of local extrema in the instability profiles with the presence of core promoter elements at these locations and with the location of the transcription start sites (TSS). We also analyzed the instability scores of binding sites of several human core promoter elements. We show that the instability scores of functional binding sites of a given core promoter element are significantly different from the scores of sites with the same motif occurring outside the functional range (relative to the TSS). Conclusions: The time efficiency of the algorithm and its genome-wide applications make this work of broad interest to scientists interested in transcriptional regulation, motif discovery, and comparative genomics.
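To make the idea of a linear-time two-state profile concrete, here is a minimal sketch. It is not the authors' modified Ising-type model: it is a generic nearest-neighbor two-state chain solved by a forward-backward pass. The per-base opening costs in `OPEN_COST`, the boundary weight `SIGMA`, and the function name `open_probabilities` are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical opening costs in units of kT (illustrative, not fitted values):
# A/T pairs are assumed to open more easily than G/C pairs.
OPEN_COST = {"A": 1.0, "T": 1.0, "G": 2.0, "C": 2.0}
SIGMA = 0.01  # assumed weight paid at every closed/open boundary


def _normalize(pair):
    """Rescale a (closed, open) pair so that it sums to 1."""
    total = pair[0] + pair[1]
    return (pair[0] / total, pair[1] / total)


def open_probabilities(seq, sigma=SIGMA):
    """Return P(base i is open) for every position, in O(len(seq)) time."""
    n = len(seq)
    # Statistical weight per site: index 0 is closed, index 1 is open.
    w = [(1.0, math.exp(-OPEN_COST[b])) for b in seq]
    # Neighbor coupling: 1 if adjacent states agree, sigma if they differ.
    couple = ((1.0, sigma), (sigma, 1.0))

    # Forward pass, rescaled at each site to avoid underflow on long inputs.
    fwd = [None] * n
    fwd[0] = _normalize(w[0])
    for i in range(1, n):
        fwd[i] = _normalize(tuple(
            w[i][s] * sum(fwd[i - 1][t] * couple[t][s] for t in (0, 1))
            for s in (0, 1)))

    # Backward pass with the same rescaling.
    bwd = [None] * n
    bwd[n - 1] = (0.5, 0.5)
    for i in range(n - 2, -1, -1):
        bwd[i] = _normalize(tuple(
            sum(couple[s][t] * w[i + 1][t] * bwd[i + 1][t] for t in (0, 1))
            for s in (0, 1)))

    # Per-site normalization cancels the arbitrary scaling factors.
    return [_normalize((fwd[i][0] * bwd[i][0], fwd[i][1] * bwd[i][1]))[1]
            for i in range(n)]
```

Calling `open_probabilities("ATGCGCATTA")` returns one opening probability per base. Each pass visits every site once, so the cost grows linearly with sequence length. The probability that several consecutive bases are open at once, which the abstract highlights, is not covered by this sketch.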

arXiv.org e-Print Archive

The Genomic HyperBrowser: inferential genomics at the sequence level

Author: Clancy Trevor
Ferkingstad Egil
Frigessi Arnoldo
Glad Ingrid K.
Gundersen Sveinung
Holden Lars
Holden Marit
Hovig Eivind
Johansen Morten
Liestøl Knut
Nygaard Vegard
Rydbeck Halfdan
Sandve Geir K.
Tøstesen Eivind
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

The immense increase in the generation of genomic scale data poses an unmet analytical challenge, due to a lack of established methodology with the required flexibility and power. We propose a first principled approach to statistical analysis of sequence-level genomic information. We provide a growing collection of generic biological investigations that query pairwise relations between tracks, represented as mathematical objects, along the genome. The Genomic HyperBrowser implements the approach and is available at http://hyperbrowser.uio.no

NORA - Norwegian Open Research Archives

A stitch in time: Efficient computation of genomic DNA melting bubbles

Author: A Wada
AT Sumner
BY Tong
C Benham
C Flamm
CH Choi
CJ Benham
CR Calladine
D Poland
DJ Wales
DL Stein
E Carlon
E Tøstesen
E Yeramian
Eivind Tøstesen
F Liu
G Altan-Bonnet
GI Jerstad
GJ King
H Wang
J Stelling
KA Dill
KA Marx
KH Hoffmann
M Fixman
MT Wolfinger
P Ak
R Blossey
RA Dimitrov
RD Blake
T Ambjörnsson
TS van Erp
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Background: It is of biological interest to make genome-wide predictions of the locations of DNA melting bubbles using statistical mechanics models. Computationally, this poses the challenge that a generic search through all combinations of bubble starts and ends is quadratic. Results: An efficient algorithm is described, which shows that the time complexity of the task is O(NlogN) rather than quadratic. The algorithm exploits the fact that bubble lengths may be limited, but without a prior assumption of a maximal bubble length. No approximations, such as windowing, have been introduced to reduce the time complexity. More than just finding the bubbles, the algorithm produces a stitch profile, which is a probabilistic graphical model of bubbles and helical regions. The algorithm applies a probability peak finding method based on a hierarchical analysis of the energy barriers in the Poland-Scheraga model. Conclusions: Exact and fast computation of genomic stitch profiles is thus feasible. Sequences of several megabases have been computed, only limited by computer memory. Possible applications are the genome-wide comparisons of bubbles with promoters, TSS, viral integration sites, and other melting-related regions.

Comment: 16 pages, 10 figures

arXiv.org e-Print Archive

NORA - Norwegian Open Research Archives

Exploiting SERS sensitivity to monitor DNA aggregation properties

Author: Capocefalo A.
Caprara D.
Ceccarini M.
Petrillo C.
Postorino P.
Ripanti F.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

In recent decades, DNA has been considered far more than the system carrying the essential genetic instructions. Indeed, because of the remarkable properties of the base-pairing specificity and thermoreversibility of the interactions, DNA plays a central role in the design of innovative architectures at the nanoscale. Here, combining complementary DNA strands with a custom-made solution of silver nanoparticles, we realize plasmonic aggregates to exploit the sensitivity of Surface Enhanced Raman Spectroscopy (SERS) for the identification/detection of the distinctive features of DNA hybridization, both in solution and on dried samples. Moreover, SERS allows monitoring the DNA aggregation process by following the temperature variation of a specific spectroscopic marker associated with the Watson-Crick hydrogen bond formation. This temperature-dependent behavior enables us to precisely reconstruct the melting profile of the selected DNA sequences by spectroscopic measurements only.

Archivio della ricerca- Università di Roma La Sapienza

Technology to accelerate pangenomic scanning for unknown point mutations in exonic sequences: cycling temperature capillary electrophoresis (CTCE)

Author: Bjørheim Jens
Ekstrøm Per O
Thilly William G
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Rapid means to discover and enumerate unknown mutations in the exons of human genes on a pangenomic scale are needed to discover the genes carrying inherited risk for common diseases or the genes in which somatic mutations are required for clonal diseases such as atherosclerosis and cancers. The method of constant denaturing capillary electrophoresis (CDCE) permitted sensitive detection and enumeration of unknown point mutations, but labor-intensive optimization procedures for each exonic sequence made it impractical for application at a pangenomic scale. Results: A variant denaturing capillary electrophoresis protocol, cycling temperature capillary electrophoresis (CTCE), has eliminated the need for the laboratory optimization of separation conditions for each target sequence. Here we report the separation of wild type mutant homoduplexes from wild type/mutant heteroduplexes for 27 randomly chosen target sequences without any laboratory optimization steps. Calculation of the equilibrium melting map of each target sequence attached to a high melting domain (clamp) was sufficient to design the analyte sequence and predict the expected degree of resolution. Conclusion: CTCE provides practical means for economical pangenomic detection and enumeration of point mutations in large-scale human case/control cohort studies. We estimate that the combined reagent, instrumentation and labor costs for scanning the ~250,000 exons and splice sites of the ~25,000 human protein-coding genes using automated CTCE instruments in 100 case cohorts of 10,000 individuals each are now less than U.S. $500 million, less than U.S. $500 per person.

NORA - Norwegian Open Research Archives

A comparison study on feature selection of DNA structural properties for promoter prediction

Author: Gan Yanglan
Guan Jihong
Zhou Shuigeng
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Promoter prediction is an integral step for understanding gene regulation and annotating genomes. Traditional promoter analysis is mainly based on sequence compositional features. Recently, many kinds of structural features have been employed in promoter prediction. However, given the high dimensionality and overfitting problems, it is infeasible to use all available features for promoter prediction. It is therefore necessary to choose appropriate features for the prediction task. Results: This paper conducts an extensive comparison study on feature selection of DNA structural properties for promoter prediction. First, to examine whether promoters possess special structures, we carry out a systematic comparison among the profiles of thirteen structural features on promoter and non-promoter sequences. Second, we investigate the correlations between these structural features and promoter sequences. Third, both filter and wrapper methods are used to select appropriate feature subsets from thirteen different kinds of structural features for promoter prediction, and the predictive power of the selected feature subsets is evaluated. Finally, we compare the prediction performance of the feature subsets selected in this paper with nine existing promoter prediction approaches. Conclusions: Experimental results show that the structural features are differentially correlated with promoters. Specifically, DNA-bending stiffness, DNA denaturation and energy-related features are highly correlated with promoters. The predictive power for promoter sequences differs greatly among structural features. Selecting the relevant features can significantly improve the accuracy of promoter prediction.
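As a rough illustration of what a filter method does, the sketch below ranks features by the absolute Pearson correlation between each feature and the promoter/non-promoter label, then keeps the top `k`. The abstract does not say which filter criterion the paper uses, so the correlation score, the function names `pearson` and `filter_select`, and the toy values are all assumptions made for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def filter_select(features, labels, k):
    """Keep the k feature names with the largest |correlation| to the labels.

    `features` maps a feature name to one value per sequence;
    `labels` holds 1 for promoter and 0 for non-promoter sequences.
    """
    ranked = sorted(features,
                    key=lambda name: abs(pearson(features[name], labels)),
                    reverse=True)
    return ranked[:k]


# Toy data, invented for illustration only (not taken from the paper).
labels = [1, 1, 1, 0, 0, 0]
features = {
    "stiffness":    [0.90, 0.80, 0.85, 0.20, 0.30, 0.25],
    "denaturation": [0.70, 0.90, 0.80, 0.40, 0.30, 0.50],
    "noise":        [0.50, 0.10, 0.90, 0.50, 0.10, 0.90],
}
selected = filter_select(features, labels, k=2)
```

A filter scores each feature independently of any classifier, which keeps it cheap. A wrapper method would instead train and evaluate a predictor on candidate subsets, which costs more but can capture interactions between features.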

Computing DNA duplex instability profiles efficiently with a two-state model: trends of promoters and binding sites

Author: A Krueger
A Usheva
Anny Usheva
BS Alexandrov
C Bi
CH Choi
CJ Benham
D Jost
D Poland
DB Nikolov
E Protozanova
E Tøstesen
F Liu
G Kalosakas
H Wakaguri
HB Houbaviy
J SantaLucia
K Brick
M Peyrard
Miriam R Kantorovitz
P Yakovchuk
R Gordan
R Rohs
T Abeel
T Ambjörnsson
T Dauxois
T Hwa
T van Erp
Vladimir Gelev
X Wang
Z Rapti
Zoi Rapti
Publication venue: 'Springer Science and Business Media LLC'
Publication date