
    DNA sequences classification and computation scheme based on the symmetry principle

    DNA sequences containing diverse novel symmetrical structures frequently play a crucial role in how genomes work. Here we present a new scheme for understanding the structural features and potential mathematical rules of symmetrical DNA sequences, using a method that combines stepwise classification with recursive computation. By defining the symmetry of DNA sequences, we classify all sequences and derive a series of recursive equations for computing the number of sequences that theoretically exist in each class; moreover, the symmetries of typical sequences at different levels are analyzed. The classification and quantitative relations demonstrate that DNA sequences have recursive and nested properties. The scheme may help clarify the formation and growth mechanisms of DNA sequences, because it can derive information about the structure and quantity of longer sequences from that of shorter sequences by recursive rules. Our scheme may provide a new stepping stone to the theoretical characterization, as well as structural analysis, of DNA sequences.
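
    The abstract does not state the paper's symmetry definitions or its recursive equations, so the sketch below is an illustration only. It assumes the simplest symmetry class, reverse-complement palindromes (sequences equal to their own reverse complement), and shows how the count for length n follows recursively from the count for length n - 2. All function names are hypothetical.

```python
from itertools import product

# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def is_rc_symmetric(seq):
    """True if the sequence equals its own reverse complement (assumed symmetry class)."""
    return seq == reverse_complement(seq)

def count_by_enumeration(n):
    """Brute-force count of reverse-complement palindromes of length n."""
    return sum(is_rc_symmetric("".join(p)) for p in product("ACGT", repeat=n))

def count_by_recursion(n):
    """Same count via recursion: wrapping a symmetric core of length n - 2
    in one of the four complementary outer pairs gives a symmetric sequence."""
    if n == 0:
        return 1  # the empty sequence
    if n % 2 == 1:
        return 0  # the middle base would have to be its own complement
    return 4 * count_by_recursion(n - 2)
```

    For short lengths the recursive count can be checked against direct enumeration; for example, `is_rc_symmetric("GAATTC")` is true for the EcoRI recognition site.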

    GenomeFingerprinter and universal genome fingerprint analysis for systematic comparative genomics

    Comparing whole genome sequences at large scale has not been achieved with conventional methods based on pairwise base-to-base comparison; moreover, no attention has been paid to handling, in one sitting, a number of genomes that cross genetic categories (chromosome, plasmid, and phage) and show greater divergence (little or no homology) over large size ranges (from Kbp to Mbp). We created a new method, GenomeFingerprinter, to unambiguously produce three-dimensional coordinates from a sequence, followed by one three-dimensional plot and six two-dimensional trajectory projections to illustrate whole genome fingerprints. We further developed a set of concepts and tools and thereby established a new method, universal genome fingerprint analysis. We demonstrated their applications through case studies on over a hundred genome sequences. In particular, we defined the total genetic component configuration (TGCC) (i.e., chromosome, plasmid, and phage) for describing a strain as a system, the universal genome fingerprint map (UGFM) of TGCC for differentiating a strain as a universal system, and systematic comparative genomics (SCG) for comparing, in one sitting, a number of genomes that cross genetic categories in diverse strains. Using UGFM, UGFM-TGCC, and UGFM-TGCC-SCG, we compared a number of genome sequences with greater divergence (chromosome, plasmid, and phage; bacterium, archaeal bacterium, and virus) over large size ranges (6 Kbp to 5 Mbp), giving new insights into critical problems in microbial genomics in the post-genomic era. This paper provides a new method for rapidly computing, geometrically visualizing, and intuitively comparing genome sequences at the fingerprint level, and hence establishes universal genome fingerprint analysis as a new method for systematic comparative genomics. Comment: 63 pages, 15 figures, 5 tables

    Chloroplot : An Online Program for the Versatile Plotting of Organelle Genomes

    Understanding the complexity of genomic structures and their unique architecture depends on the power of the visualization tools used to represent these features. Such tools should be able to provide a realistic and scalable version of genomic content. Here, we present an online organelle plotting tool focused on chloroplasts, which was developed to visualize the distinctive structure of these genomes. The distinguishing features of this program include its ability to represent the small single-copy (SSC) region in reverse complement, to depict the codon usage bias index for each gene, and to show minor mismatches between inverted repeat (IR) regions, along with user-specified plotting layers. The versatile color schemes and diverse functionalities of the program are specifically designed to give an accurate, scalable representation of plastid genomes. We introduce a Shiny app website for easy use of the program; more advanced use of the tool is possible by further developing and modifying the downloadable source code provided online. The software and its libraries are coded entirely in R and are available at https://irscope.shinyapps.io/chloroplot/. Peer reviewed

    Blueprint for a high-performance biomaterial: full-length spider dragline silk genes.

    Spider dragline (major ampullate) silk outperforms virtually all other natural and manmade materials in terms of tensile strength and toughness. For this reason, the mass production of artificial spider silks through transgenic technologies has been a major goal of biomimetics research. Although all known arthropod silk proteins are extremely large (>200 kilodaltons), recombinant spider silks have been designed from short and incomplete cDNAs, the only available sequences. Here we describe the first full-length spider silk gene sequences and their flanking regions. These genes encode the MaSp1 and MaSp2 proteins that compose the black widow's high-performance dragline silk. Each gene includes a single enormous exon (>9000 base pairs) that translates into a highly repetitive polypeptide. Patterns of variation among sequence repeats at the amino acid and nucleotide levels indicate that the interaction of selection, intergenic recombination, and intragenic recombination governs the evolution of these highly unusual, modular proteins. Phylogenetic footprinting revealed putative regulatory elements in non-coding flanking sequences. Conservation of both upstream and downstream flanking sequences was especially striking between the two paralogous black widow major ampullate silk genes. Because these genes are co-expressed within the same silk gland, there may have been selection for similarity in regulatory regions. Our new data provide complete templates for synthesis of recombinant silk proteins that significantly improve the degree to which artificial silks mimic natural spider dragline fibers.

    Skittle: A 2-Dimensional Genome Visualization Tool

    BACKGROUND: It is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information, regarding both genome function and genome history. In order to discover additional genomic patterns which may have biological significance, novel strategies are required. To partially address this need, we introduce a new data visualization tool entitled Skittle. RESULTS: This program first creates a 2-dimensional nucleotide display by assigning four colors to the four nucleotides, and then text-wraps to a user-adjustable width. This nucleotide display is accompanied by a "repeat map" which comprehensively displays all local repeating units, based upon analysis of all possible local alignments. Skittle includes a smooth-zooming interface which allows the user to analyze genomic patterns at any scale. Skittle is especially useful in identifying and analyzing tandem repeats, including repeats not normally detectable by other methods. However, Skittle is also more generally useful for analysis of any genomic data, allowing users to correlate published annotations and observable visual patterns, and allowing for sequence and construct quality control. CONCLUSIONS: Preliminary observations using Skittle reveal intriguing genomic patterns not otherwise obvious, including structured variations inside tandem repeats. The striking visual patterns revealed by Skittle appear to be useful for hypothesis development, and have already led the authors to theorize that imperfect tandem repeats could act as information carriers, and may form tertiary structures within the interphase nucleus.
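
    The display step described in this abstract (one color per nucleotide, then wrapping at a chosen width) can be sketched in a few lines. This is a minimal illustration, not Skittle's implementation: the color assignment, the function names, and the grey fallback for other symbols are all assumptions, since the abstract specifies none of them.

```python
# Hypothetical color assignment; the abstract does not give Skittle's palette.
COLORS = {"A": "black", "C": "red", "G": "green", "T": "blue"}

def wrap_sequence(seq, width):
    """Split a nucleotide string into rows of `width` bases (the last row may be shorter)."""
    if width < 1:
        raise ValueError("width must be a positive integer")
    return [seq[i:i + width] for i in range(0, len(seq), width)]

def color_grid(seq, width):
    """Return a 2-D grid of color names, one per base; other symbols map to 'grey'."""
    return [
        [COLORS.get(base, "grey") for base in row]
        for row in wrap_sequence(seq.upper(), width)
    ]

# Wrapping at the repeat period stacks identical bases into columns.
rows = wrap_sequence("ACGTACGTACGT", 4)
```

    When the wrap width matches the period of a tandem repeat, each column holds a single color, which is one plausible reason a text-wrapped display makes such repeats stand out.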

    Targeted mutagenesis of the Sap47 gene of Drosophila: Flies lacking the synapse associated protein of 47 kDa are viable and fertile

    BACKGROUND: Conserved proteins preferentially expressed in synaptic terminals of the nervous system are likely to play a significant role in brain function. We have previously identified and molecularly characterized the Sap47 gene which codes for a novel synapse associated protein of 47 kDa in Drosophila. Sequence comparison identifies homologous proteins in numerous species including C. elegans, fish, mouse and human. First hints as to the function of this novel protein family can be obtained by generating mutants for the Sap47 gene in Drosophila. RESULTS: Attempts to eliminate the Sap47 gene through targeted mutagenesis by homologous recombination were unsuccessful. However, several mutants were generated by transposon remobilization after an appropriate insertion line had become available from the Drosophila P-element screen of the Bellen/Hoskins/Rubin/Spradling labs. Characterization of various deletions in the Sap47 gene due to imprecise excision of the P-element identified three null mutants and three hypomorphic mutants. Null mutants are viable and fertile and show no gross structural or obvious behavioural deficits. For cell-specific over-expression and "rescue" of the knock-out flies a transgenic line was generated which expresses the most abundant transcript under the control of the yeast enhancer UAS. In addition, knock-down of the Sap47 gene was achieved by generating 31 transgenic lines expressing Sap47 RNAi constructs, again under UAS control. When driven by a ubiquitously expressed yeast transcription factor (GAL4), Sap47 gene suppression in several of these lines is highly efficient resulting in residual SAP47 protein concentrations in heads as low as 6% of wild type levels. CONCLUSION: The conserved synaptic protein SAP47 of Drosophila is not essential for basic synaptic function. The Sap47 gene region may be refractory to targeted mutagenesis by homologous recombination. RNAi using a construct linking genomic DNA to anti-sense cDNA in our hands is not more effective than using a cDNA-anti-sense cDNA construct. The tools developed in this study will now allow a detailed analysis of the molecular, cellular and systemic function of the SAP47 protein in Drosophila.