367 research outputs found

    Free energy estimation of short DNA duplex hybridizations

    Background: Estimation of DNA duplex hybridization free energy is widely used for predicting cross-hybridizations in DNA computing and microarray experiments. A number of software programs based on different methods and parametrizations are available for the theoretical estimation of duplex free energies. However, significant differences in free energy values are sometimes observed among estimations obtained with various methods, making it difficult to decide which value is accurate. Results: We present in this study a quantitative comparison of the similarities and differences among four published DNA/DNA duplex free energy calculation methods and an extended Nearest-Neighbour Model for perfect matches based on triplet interactions. The comparison was performed on a benchmark data set with 695 pairs of short oligos that we collected and manually curated from 29 publications. Sequence lengths range from 4 to 30 nucleotides and span a large GC-content percentage range. For perfect matches, we propose an extension of the Nearest-Neighbour Model that matches or exceeds the performance of the existing ones, both in terms of correlations and root mean squared errors. The proposed model was trained on experimental data with temperature, sodium and sequence concentration characteristics that span a wide range of values, giving the model greater power of generalization when used for free energy estimations of DNA duplexes under non-standard experimental conditions. Conclusions: Based on our preliminary results, we conclude that no statistically significant differences exist among free energy approximations obtained with 4 publicly available and widely used programs, when benchmarked against a collection of 695 pairs of short oligos collected and curated by the authors of this work based on 29 publications. The extended Nearest-Neighbour Model based on triplet interactions presented in this work is capable of performing accurate estimations of free energies for perfect match duplexes under both standard and non-standard experimental conditions and may serve as a baseline for further developments in this area of research.
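
The abstract above contrasts the usual nearest-neighbour decomposition (overlapping doublets) with an extension based on triplets. The following is a minimal sketch of that decomposition only, assuming a simple additive model: the function names, the `params` table and the `init` term are hypothetical, no published parameter values are used, and the sketch does not merge a doublet with its reverse complement as real parameter tables do.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count the overlapping k-mers of one strand (k=2 doublets, k=3 triplets)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def additive_free_energy(seq: str, k: int, params: dict, init: float = 0.0) -> float:
    """Sum caller-supplied per-k-mer contributions plus an initiation term.

    `params` maps each k-mer to a free energy contribution; the values are
    an assumption of this sketch, not parameters from the paper.
    """
    counts = kmer_counts(seq, k)
    return init + sum(params[kmer] * n for kmer, n in counts.items())


doublets = kmer_counts("ACGTAC", 2)   # AC, CG, GT, TA, AC
triplets = kmer_counts("ACGTAC", 3)   # ACG, CGT, GTA, TAC

# Placeholder parameters: every doublet contributes -1.0 (arbitrary units).
toy_params = {kmer: -1.0 for kmer in doublets}
toy_energy = additive_free_energy("ACGTAC", 2, toy_params)
```

A sequence of length n yields n - 1 doublets but only n - 2 triplets, while the triplet table has up to 64 entries instead of 16, so a triplet model needs correspondingly more training data.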

    In silico estimation of annealing specificity of query searches in DNA databases

    We consider DNA implementations of databases for digital signals with retrieval and mining capabilities. Digital signals are encoded in DNA sequences and retrieved through annealing between query DNA primers and data carrying DNA target sequences. The hybridization between query and target can be non-specific containing multiple mismatches thus implementing similarity-based searches. In this paper we examine theoretically and by simulation the efficiency of such a system by estimating the concentrations of query-target duplex formations at equilibrium. A coupled kinetic model is used to estimate the concentrations. We offer a derivation that results in an equation that is guaranteed to have a solution and can be easily and accurately solved computationally with bi-section root-finding methods. Finally, we also provide an approximate solution at dilute query concentrations that results in a closed form expression. This expression is used to improve the speed of the bi-section algorithm and also to find a closed form expression for the specificity ratios
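
The abstract above describes solving for equilibrium duplex concentrations by bisection and using a dilute-query closed form. The sketch below illustrates the idea for a single query-target pair only, which is a simplification and not the coupled model of the paper. The function names, the two-state mass-action form K = x / ((q0 - x)(t0 - x)) and the example concentrations are assumptions of this sketch.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def association_constant(delta_g_kcal: float, temp_k: float) -> float:
    """Equilibrium association constant from a duplex free energy (1 M standard state)."""
    return math.exp(-delta_g_kcal / (R_KCAL * temp_k))


def duplex_concentration(k_assoc, q0, t0, tol=1e-18, max_iter=200):
    """Bisection on f(x) = K (q0 - x)(t0 - x) - x over [0, min(q0, t0)].

    f(0) > 0 and f(min(q0, t0)) < 0, so a root is always bracketed.
    """
    def f(x):
        return k_assoc * (q0 - x) * (t0 - x) - x

    lo, hi = 0.0, min(q0, t0)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid          # root lies above mid
        else:
            hi = mid          # root lies at or below mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def dilute_approximation(k_assoc, q0, t0):
    """Closed form for q0 << t0, where the free target stays close to t0."""
    return k_assoc * q0 * t0 / (1.0 + k_assoc * t0)


# Example values (assumed): K = 1e9 per molar, 1 nM query, 1 uM target.
k = 1e9
x = duplex_concentration(k, q0=1e-9, t0=1e-6)
x_dilute = dilute_approximation(k, q0=1e-9, t0=1e-6)
```

The closed form can also serve as a starting guess that narrows the bisection bracket, which is one way to read the speed-up mentioned in the abstract.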

    A thermodynamic approach to designing structure-free combinatorial DNA word sets

    An algorithm is presented for the generation of sets of non-interacting DNA sequences, employing existing thermodynamic models for the prediction of duplex stabilities and secondary structures. A DNA ‘word’ structure is employed in which individual DNA ‘words’ of a given length (e.g. 12mer and 16mer) may be concatenated into longer sequences (e.g. four tandem words and six tandem words). This approach, where multiple word variants are used at each tandem word position, allows very large sets of non-interacting DNA strands to be assembled from combinations of the individual words. Word sets were generated and their figures of merit are compared to sets as described previously in the literature (e.g. 4, 8, 12, 15 and 16mer). The predicted hybridization behavior was experimentally verified on selected members of the sets using standard UV hyperchromism measurements of duplex melting temperatures (T(m)s). Additional experimental validation was obtained by using the sequences in formulating and solving a small example of a DNA computing problem

    Physico-chemical foundations underpinning microarray and next-generation sequencing experiments

    Hybridization of nucleic acids on solid surfaces is a key process in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen, Germany, in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of state-of-the-art approaches, based on physico-chemical foundations, to modeling nucleic acid hybridization on solid surfaces. In addition, practical application of current knowledge is emphasized

    Sequence dependence of cross-hybridization on short oligo microarrays

    One of the critical problems in short oligo microarray technology is how to deal with cross-hybridization that produces spurious data. Little is known about the details of the cross-hybridization effect at the molecular level. Here, we report a free energy analysis of cross-hybridization on short oligo microarrays using data from a spike-in study. Our analysis revealed that cross-hybridization on the arrays is mostly caused by oligo fragments with a run of 10–16 nt complementary to the probes. Mismatches were estimated to be energetically much more costly in cross-hybridization than in gene-specific hybridization, implying that the sources of cross-hybridization must be very different between the two probes of a PM–MM pair. Consequently, it is unreliable to use the MM probe signal to track the cross-hybridizing signal on the corresponding PM probe. Our results also showed that the oligo fragments tend to bind to the 5′ ends of the probes, and are rarely seen at the 3′ ends. These results are useful for microarray design and data analysis
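
The abstract above attributes cross-hybridization mostly to fragments containing a run of 10–16 nt complementary to a probe. As a minimal sketch, and not the authors' free energy analysis, the longest such run can be found by taking the reverse complement of the fragment and computing its longest common substring with the probe. The function names are hypothetical, and only Watson-Crick pairs over A, C, G, T are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(probe: str, fragment: str) -> int:
    """Length of the longest contiguous, antiparallel, perfectly paired stretch.

    Implemented as the longest common substring between the probe and the
    reverse complement of the fragment (row-by-row dynamic programming).
    """
    probe = probe.upper()
    target = reverse_complement(fragment)
    best = 0
    prev = [0] * (len(target) + 1)
    for p in probe:
        curr = [0] * (len(target) + 1)
        for j, t in enumerate(target, start=1):
            if p == t:
                curr[j] = prev[j - 1] + 1   # extend the run ending at the previous bases
                best = max(best, curr[j])
        prev = curr
    return best


run = longest_complementary_run("ACGTTGCA", "CAAC")  # CAAC pairs with GTTG in the probe
```

A fragment could then be flagged as a candidate cross-hybridizer when the returned run length reaches a chosen threshold, for example 10.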

    A competitive hybridization model predicts probe signal intensity on high density DNA microarrays

    A central, unresolved problem of DNA microarray technology is the interpretation of different signal intensities from multiple probes targeting the same transcript. We propose a competitive hybridization model for DNA microarray hybridization. Our model uses a probe-specific dissociation constant that is computed with the current nearest-neighbor model and existing parameters, and only four global parameters that are fitted to Affymetrix Latin Square data. This model can successfully predict signal intensities of individual probes, and therefore makes it possible to quantify the absolute concentration of targets. Our results offer critical insights into the design and data interpretation of DNA microarrays

    Application of an oligonucleotide hybridization model for the optimization of PCR and microarrays

    The electronic version of the thesis does not contain the publications. Nucleic acids are unique among organic macromolecules in their ability to encode, decode and transmit digital information. This property is used in emerging technologies as diverse as medical diagnosis, nanoscale engineering and information storage. It is nevertheless important to understand that this digital information processing rests on the chemical properties of nucleic acids, the most important being hybridization: the spontaneous formation of a double-stranded helix between complementary or partially complementary single-stranded molecules. Taking the thermodynamic properties of nucleic acid hybridization into account allows researchers to model the process with great accuracy and thus improve many associated technologies. In this thesis the hybridization model is used to optimize multiplex PCR and microarray hybridization. We developed an efficient algorithm to distribute PCR primer pairs into multiplex groups based on their compatibility with each other. The algorithm is implemented both as a standalone program and as a web application. We analyzed the probable causes of multiplex PCR failure and demonstrated that a large number of nonspecific hybridization sites in the template DNA is detrimental to PCR quality. Primer pairs with too many nonspecific hybridization sites not only worked poorly themselves but also caused the failure of other primer pairs amplified with them. We developed a computer program that generates an exhaustive list of all possible hybridization probes for the detection of bacterial tmRNA that are capable of distinguishing between two groups of source RNA. The probes were evaluated on a microarray, and we showed that keeping the hybridization energy cutoff between target and non-target groups above 4 kcal/mol eliminated all false-positive signals. We analyzed the possibility of increasing the hybridization speed of bacterial tmRNA at low temperatures by applying short specific oligonucleotides that selectively hybridize with template molecules and break up their secondary structure. Using this method, the hybridization speed was increased fourfold at 37 °C.
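
The thesis abstract above mentions an algorithm that distributes PCR primer pairs into multiplex groups according to their mutual compatibility, but does not describe it. The sketch below is therefore only one plausible approach, a greedy first-fit grouping. The function name, the `compatible` predicate and the `max_group_size` limit are all assumptions of this sketch.

```python
from typing import Callable, Hashable, List, Sequence


def group_primer_pairs(
    pairs: Sequence[Hashable],
    compatible: Callable[[Hashable, Hashable], bool],
    max_group_size: int,
) -> List[list]:
    """Greedy first-fit: put each primer pair into the first group where it
    is compatible with every member, opening a new group when none fits."""
    groups: List[list] = []
    for pair in pairs:
        for group in groups:
            if len(group) < max_group_size and all(
                compatible(pair, other) for other in group
            ):
                group.append(pair)
                break
        else:
            groups.append([pair])   # no existing group could take this pair
    return groups


# Toy example: primer pairs A and B are assumed to interact with each other.
incompatible = {frozenset(("A", "B"))}
groups = group_primer_pairs(
    ["A", "B", "C", "D"],
    compatible=lambda x, y: frozenset((x, y)) not in incompatible,
    max_group_size=3,
)
```

In practice the `compatible` predicate could be built on a hybridization free energy threshold between primers, in the spirit of the energy cutoff mentioned in the abstract. First-fit does not guarantee the minimum number of groups.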

    A multivariate prediction model for microarray cross-hybridization

    BACKGROUND: Expression microarray analysis is one of the most popular molecular diagnostic techniques in the post-genomic era. However, this technique faces the fundamental problem of potential cross-hybridization. This is a pervasive problem for both oligonucleotide and cDNA microarrays; it is considered particularly problematic for the latter. No comprehensive multivariate predictive modeling has been performed to understand how multiple variables contribute to (cross-) hybridization. RESULTS: We propose a systematic search strategy using multiple multivariate models [multiple linear regressions, regression trees, and artificial neural network analyses (ANNs)] to select an effective set of predictors for hybridization. We validate this approach on a set of DNA microarrays with cytochrome p450 family genes. The performance of our multiple multivariate models is compared with that of a recently proposed third-order polynomial regression method that uses percent identity as the sole predictor. All multivariate models agree that the 'most contiguous base pairs between probe and target sequences,' rather than percent identity, is the best univariate predictor. The predictive power is improved by inclusion of additional nonlinear effects, in particular target GC content, when regression trees or ANNs are used. CONCLUSION: A systematic multivariate approach is provided to assess the importance of multiple sequence features for hybridization and of relationships among these features. This approach can easily be applied to larger datasets. This will allow future developments of generalized hybridization models that will be able to correct for false-positive cross-hybridization signals in expression experiments

    DNA molecular recognition specificity : pairwise and in competition

    Despite its importance in biological systems, the molecular recognition underlying DNA hybridization within complex, competitive environments is poorly understood. The present thesis investigates DNA hybridization in thermal equilibrium for DNA strands bound to the surface of a microarray as well as in solution in the presence of one or more competitors. For the latter we employ fluorescence anisotropy and fluorescence correlation spectroscopy to determine binding affinities of two DNA strands in a pairwise manner and in the presence of a single competitor. Our results reveal that there must be a non-trivial interaction between the competing strands that extends beyond simple double helix formation. This is a signature of cooperative behavior, which can lead to more complex binding phenomena than previously thought. Moreover, we find surprising differences between the results of the two techniques, which we attribute to differing sensitivities to distinct microstates of double helix formation. The second part of this work is performed with surface-bound DNA and is devoted to experimentally determining a sufficient number of differing bases between two sequences to avoid cross-hybridization. We construct a set of 23 non-interacting sequences with a length of 7 bases. We conclude that for systems of increasing complexity a high level of discrimination between many competitors is essential for accurate recognition.