Thermodynamically Stable DNA Code Design using a Similarity Significance Model
DNA code design aims to generate a set of DNA sequences (codewords) with
a minimum likelihood of undesired hybridizations among sequences and their
reverse-complement (RC) pairs (cross-hybridization). Inspired by the distinct
hybridization affinities (or stabilities) of the perfect double helices formed
by individual single-stranded DNA (ssDNA) sequences and their RC pairs, we propose a novel
similarity significance (SS) model to measure the similarity between DNA
sequences. In particular, instead of directly measuring the similarity of two
sequences with a conventional metric, the proposed SS model evaluates
how much more likely undesirable hybridizations are to occur than desirable
hybridizations in the presence of the two measured sequences and their RC
pairs. With this SS model, we construct thermodynamically stable DNA codes
subject to several combinatorial constraints using a sorting-based algorithm.
The proposed scheme yields DNA codes with larger code sizes and wider free
energy gaps (hence better cross-hybridization performance) than
existing methods. Comment: To appear in ISIT 202
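The abstract above does not say which combinatorial constraints are used, so the following is only a minimal sketch under stated assumptions: it checks three constraints that commonly appear in DNA code design (pairwise Hamming distance, reverse-complement Hamming distance, and fixed GC content). The function names and the choice of constraints are illustrative; the sketch does not implement the SS model or the sorting-based algorithm.

```python
# Illustrative only: the constraint set below is an assumption,
# not the constraint set or algorithm of the paper.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement (RC) of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def gc_content(seq):
    """Number of G or C bases in the sequence."""
    return sum(base in "GC" for base in seq)


def satisfies_constraints(code, min_dist, gc):
    """Check the assumed constraints for every codeword and codeword pair."""
    for i, u in enumerate(code):
        if gc_content(u) != gc:
            return False
        # A codeword should also be far from its own reverse complement.
        if hamming(u, reverse_complement(u)) < min_dist:
            return False
        for v in code[i + 1:]:
            if hamming(u, v) < min_dist:
                return False
            if hamming(u, reverse_complement(v)) < min_dist:
                return False
    return True
```

For example, `satisfies_constraints(["AACC", "CACA"], 2, 2)` holds, while raising `min_dist` to 3 fails because the two codewords differ in only two positions.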
Frustration in Biomolecules
Biomolecules are the prime information processing elements of living matter.
Most of these inanimate systems are polymers that compute their structures and
dynamics using as input seemingly random character strings of their sequence,
following which they coalesce and perform integrated cellular functions. In
large computational systems with finite interaction codes, the appearance of
conflicting goals is inevitable. Simple conflicting forces can lead to quite
complex structures and behaviors, leading to the concept of "frustration" in
condensed matter. We present here some basic ideas about frustration in
biomolecules and how the frustration concept leads to a better appreciation of
many aspects of the architecture of biomolecules, and how structure connects to
function. These ideas are simultaneously both seductively simple and perilously
subtle to grasp completely. The energy landscape theory of protein folding
provides a framework for quantifying frustration in large systems and has been
implemented at many levels of description. We first review the notion of
frustration from the areas of abstract logic and its uses in simple condensed
matter systems. We then discuss how the frustration concept applies
specifically to heteropolymers, testing folding landscape theory in computer
simulations of protein models and in experimentally accessible systems.
Studying the aspects of frustration averaged over many proteins provides ways
to infer energy functions useful for reliable structure prediction. We discuss
how frustration affects folding, how a large part of the biological functions
of proteins is related to subtle local frustration effects, and how frustration
influences the appearance of metastable states, the nature of binding
processes, catalysis and allosteric transitions. We hope to illustrate how
frustration is a fundamental concept in relating function to structural
biology. Comment: 97 pages, 30 figures
Three-dimensional modeling of single stranded DNA hairpins for aptamer-based biosensors.
Aptamers consist of short oligonucleotides that bind specific targets. They provide advantages over antibodies, including robustness, low cost, and reusability. Their chemical structure allows the insertion of reporter molecules and surface-binding agents in specific locations, which has recently been exploited for the development of aptamer-based biosensors and direct detection strategies. Mainstream use of these devices, however, still requires significant improvements in optimization for consistency and reproducibility. DNA aptamers are more stable than their RNA counterparts for biomedical applications but have the disadvantage of lacking the wide array of computational tools available for RNA structural prediction. Here, we present the first approach to predict from sequence the three-dimensional structures of single-stranded (ss) DNA required for aptamer applications, focusing explicitly on ssDNA hairpins. The approach consists of a pipeline that sequentially integrates building ssDNA secondary structure from sequence, constructing equivalent 3D ssRNA models, transforming the 3D ssRNA models into ssDNA 3D structures, and refining the resulting ssDNA 3D structures. Through this pipeline, our approach faithfully predicts the representative structures available in the Nucleic Acid Database and Protein Data Bank databases. Our results thus open up a much-needed avenue for integrating DNA in the computational analysis and design of aptamer-based biosensors.
Structural and Dynamic Insight into Hirudin Epitopes-HLADRB1 0101 Complexes and their Modified Peptide Ligands: A Molecular Dynamic Simulation Study
Purpose: To develop a hirudin therapeutic protein that eliminates unwanted immune response. Methods: Molecular dynamic simulation was performed on immunodominant hirudin epitopes 1-15 and 13-27 and their analogs, modified peptide ligands (MPLs), namely, [Lys4] Hir1-15 and [Gly9] Hir1-15, [Gly21] Hir13-27 and [Lys21] Hir13-27. The selected epitopes were modeled and 20 ns of molecular dynamics simulation was performed on peptide-HLA1 0101 and MPLs-HLA1 0101 complexes to gain a better understanding of the molecular recognition mechanisms of MHC peptide binding. The process was characterized by evaluating the root mean square deviation (RMSD) and total energy of binding. Result: All complexes of MPLs-HLA-DRB1 0101 showed thermodynamically unstable structures in comparison with native epitopes-HLA-DRB1 0101. The findings indicate that these analogs have a different orientation in HLA grooves and are not available for suitable interaction with HLA-DRB1 0101. Conclusion: Altogether, the results show the potential of predictive methods and molecular modeling in molecular mimicry of peptide-MHC interaction and provide insights into the binding characteristics of the antigen presentation mechanism. Keywords: Modified peptide ligand, Epitopes, MHC peptide binding, Hirudin, Modified peptide ligands, Molecular dynamic simulation, Binding free energy
Rationalising sequence selection by ligand assemblies in the DNA minor groove : the case for thiazotropsin A
The DNA sequence and structure dependence of the formation of minor groove complexes at 5′-XCTAGY-3′ by the short lexitropsin thiazotropsin A is explored based on NMR spectroscopy, isothermal titration calorimetry (ITC), circular dichroism (CD) and qualitative molecular modeling. The structure and solution behaviour of the complexes are similar whether X = A, T, C or G and Y = T, A, I or C, CCTAGI being thermodynamically the most favoured (ΔG = -11.1 ± 0.1 kcal mol⁻¹). Binding site selectivity observed by NMR for ACTAGT in the presence of TCTAGA when both accessible sequences are concatenated in a 15-mer DNA duplex construct is consistent with thermodynamic parameters (|ΔG(ACTAGT)| > |ΔG(TCTAGA)|) measured separately for the binding sites and with predictions from modeling studies. Steric bulk in the minor groove for Y = G causes unfavourable ligand-DNA interactions reflected in a less favourable Gibbs free energy of binding (ΔG = -8.5 ± 0.01 kcal mol⁻¹). ITC and CD data establish that thiazotropsin A binds the ODNs with binding constants between 10⁶ and 10⁸ M⁻¹ and reveal that binding is driven enthalpically through hydrogen bond formation and van der Waals interactions. The consequences of these findings are considered with respect to ligand self-association and the energetics responsible for driving DNA recognition by small molecule DNA minor groove binders.
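As a quick consistency check on the figures quoted above, the standard relation K = exp(-ΔG/RT) converts a binding free energy into an association constant. The sketch below assumes a temperature of 298.15 K, which the abstract does not state; the function name is illustrative.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def association_constant(delta_g_kcal, temperature_k=298.15):
    """Association constant (M^-1) from a binding free energy in kcal/mol.

    Uses K = exp(-dG / (R * T)). The default temperature is an assumption,
    not a value taken from the abstract.
    """
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


k_most_favoured = association_constant(-11.1)   # CCTAGI site
k_least_favoured = association_constant(-8.5)   # Y = G site
```

Under that temperature assumption, the two free energies correspond to constants on the order of 10^8 and 10^6 M^-1 respectively, in line with the range reported from the ITC and CD data.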
Computational methods for the discovery and analysis of genes and other functional DNA sequences
The need for automating genome analysis is a result of the tremendous amount of genomic data. As of today, a high-throughput DNA sequencing machine can run millions of sequencing reactions in parallel, and it is becoming faster and cheaper to sequence the entire genome of an organism. Public databases containing genomic data are growing exponentially, hence the rise in demand for intuitive automated methods of DNA analysis and subsequent gene identification. However, the complexity of gene organization makes automation a challenging task, and smart algorithm design and parallelization are necessary to perform accurate analyses in reasonable amounts of time. This work describes two such automated methods for the identification of novel genes within given DNA sequences. The first method utilizes negative selection patterns as an evolutionary rationale for the identification of additional members of a gene family. As input it requires a known protein-coding gene in that family. The second method is a massively parallel data mining algorithm that searches a whole genome for inverted repeats (palindromic sequences) and identifies potential precursors of non-coding RNA genes. Both methods were validated successfully on the fully sequenced and well-studied plant species Arabidopsis thaliana. --Abstract, page iv
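To make the notion of an inverted repeat concrete, here is a minimal single-threaded sketch that scans a sequence for a stem whose downstream arm is the reverse complement of the upstream arm, separated by a short loop. It is a naive illustration, not the massively parallel algorithm described in the abstract; the function names and the `stem_len` and `max_loop` parameters are assumptions.

```python
# Naive illustration only; not the thesis's parallel data mining algorithm.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem_len, max_loop):
    """Return (left_start, right_start) pairs of exact inverted repeats.

    The left arm is seq[left_start:left_start + stem_len]; the right arm
    has the same length, equals the reverse complement of the left arm,
    and begins at most max_loop bases after the left arm ends.
    """
    hits = []
    n = len(seq)
    for i in range(n - 2 * stem_len + 1):
        target = reverse_complement(seq[i:i + stem_len])
        for loop in range(max_loop + 1):
            j = i + stem_len + loop
            if j + stem_len > n:
                break  # right arm would run past the end of the sequence
            if seq[j:j + stem_len] == target:
                hits.append((i, j))
    return hits
```

For instance, in `"ACGGTTTTCCGT"` the arm `ACGG` at position 0 pairs with `CCGT` at position 8 across a four-base loop, so a search with `stem_len=4` and `max_loop=4` reports that pair.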