916,475 research outputs found

    DNA Steganalysis Using Deep Recurrent Neural Networks

    Full text link
    Recent advances in next-generation sequencing technologies have facilitated the use of deoxyribonucleic acid (DNA) as a novel covert channels in steganography. There are various methods that exist in other domains to detect hidden messages in conventional covert channels. However, they have not been applied to DNA steganography. The current most common detection approaches, namely frequency analysis-based methods, often overlook important signals when directly applied to DNA steganography because those methods depend on the distribution of the number of sequence characters. To address this limitation, we propose a general sequence learning-based DNA steganalysis framework. The proposed approach learns the intrinsic distribution of coding and non-coding sequences and detects hidden messages by exploiting distribution variations after hiding these messages. Using deep recurrent neural networks (RNNs), our framework identifies the distribution variations by using the classification score to predict whether a sequence is to be a coding or non-coding sequence. We compare our proposed method to various existing methods and biological sequence analysis methods implemented on top of our framework. According to our experimental results, our approach delivers a robust detection performance compared to other tools

    Investigations into the molecular effects of single nucleotide polymorphism

    Get PDF
    Objectives: DNA sequences are very rich in short repeats and their pattern can be altered by point mutations. We wanted to investigate the effect of single nucleotide polymorphism (SNP) on the pattern of short DNA repeats and its biological consequences. Methods: Analysis of the pattern of short DNA repeats of the Thy-1 sequence with and without SNP. Searching for DNA-binding factors in any region of significance. Results: Comparing the pattern of short repeats in the Thy-1 gene sequences of Turkish patients with ataxia telangiectasia (AT) with the `wild type' sequence from the DNA database, we identified a missing 8-bp repeat element due to an SNP in position 1271 (intron II) in AT-DNA sequences. Only the mutated sequence had the potential for the formation of a stem loop in DNA or pre-mRNA. In super-shift experiments we found that DNA oligomers covering the area of this SNP formed a complex with proteins amongst which we identified the proliferating cell nuclear antigen (PCNA) protein. Conclusion: SNPs have the potential to alter DNA or pre-mRNA conformation. Although no SNP-depeding formation of the DNA-protein complex was evident, future investigations could reveal differential molecular mechanisms of cellular regulation. Copyright (C) 2001 S. Karger AG, Basel

    {BiQ} Analyzer {HiMod}: An Interactive Software Tool for High-throughput Locus-specific Analysis of 5-Methylcytosine and its Oxidized Derivatives

    Get PDF
    Recent data suggest important biological roles for oxidative modifications of methylated cytosines, specifically hydroxymethylation, formylation and carboxylation. Several assays are now available for profiling these DNA modifications genome-wide as well as in targeted, locus-specific settings. Here we present BiQ Analyzer HiMod, a user-friendly software tool for sequence alignment, quality control and initial analysis of locus-specific DNA modification data. The software supports four different assay types, and it leads the user from raw sequence reads to DNA modification statistics and publication-quality plots. BiQ Analyzer HiMod combines well-established graphical user interface of its predecessor tool, BiQ Analyzer HT, with new and extended analysis modes. BiQ Analyzer HiMod also includes updates of the analysis workspace, an intuitive interface, a custom vector graphics engine and support of additional input and output data formats. The tool is freely available as a stand-alone installation package from http://biq-analyzer-himod.bioinf.mpi-inf.mpg.de/

    BEAST: Bayesian evolutionary analysis by sampling trees

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented.</p> <p>Results</p> <p>BEAST version 1.4.6 consists of 81000 lines of Java source code, 779 classes and 81 packages. It provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions. BEAST source code is object-oriented, modular in design and freely available at <url>http://beast-mcmc.googlecode.com/</url> under the GNU LGPL license.</p> <p>Conclusion</p> <p>BEAST is a powerful and flexible evolutionary analysis package for molecular sequence variation. It also provides a resource for the further development of new models and statistical methods of evolutionary analysis.</p

    Structure, stability and elasticity of DNA nanotube

    Full text link
    DNA nanotubes are tubular structures composed of DNA crossover molecules. We present a bottom up approach for construction and characterization of these structures. Various possible topologies of nanotubes are constructed such as 6-helix, 8-helix and tri-tubes with different sequences and lengths. We have used fully atomistic molecular dynamics simulations to study the structure, stability and elasticity of these structures. Several nanosecond long MD simulations give the microscopic details about DNA nanotubes. Based on the structural analysis of simulation data, we show that 6-helix nanotubes are stable and maintain their tubular structure; while 8-helix nanotubes are flattened to stabilize themselves. We also comment on the sequence dependence and effect of overhangs. These structures are approximately four times more rigid having stretch modulus of ~4000 pN compared to the stretch modulus of 1000 pN of DNA double helix molecule of same length and sequence. The stretch moduli of these nanotubes are also three times larger than those of PX/JX crossover DNA molecules which have stretch modulus in the range of 1500-2000 pN. The calculated persistence length is in the range of few microns which is close to the reported experimental results on certain class of the DNA nanotubes.Comment: Published in Physical Chemistry Chemical Physic

    Analysis of DNA sequence variation within marine species using Beta-coalescents

    Get PDF
    We apply recently developed inference methods based on general coalescent processes to DNA sequence data obtained from various marine species. Several of these species are believed to exhibit so-called shallow gene genealogies, potentially due to extreme reproductive behaviour, e.g. via Hedgecock's "reproduction sweepstakes". Besides the data analysis, in particular the inference of mutation rates and the estimation of the (real) time to the most recent common ancestor, we briefly address the question whether the genealogies might be adequately described by so-called Beta coalescents (as opposed to Kingman's coalescent), allowing multiple mergers of genealogies. The choice of the underlying coalescent model for the genealogy has drastic implications for the estimation of the above quantities, in particular the real-time embedding of the genealogy.Comment: 15 pages, 16 figure

    From Nonspecific DNA–Protein Encounter Complexes to the Prediction of DNA–Protein Interactions

    Get PDF
    ©2009 Gao, Skolnick. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.doi:10.1371/journal.pcbi.1000341DNA–protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA–protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA–protein interaction modes without knowing its specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA–protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA–protein interaction modes exhibit some similarity to specific DNA–protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests, it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that our method performs satisfactorily on protein models whose root-mean-square Ca deviation from native is up to 5 Å from their native structures. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA–protein interaction mode and nonspecific interaction modes may reflect an important sampling step in search of its specific DNA targets by a DNA-binding protein
    corecore