    Louse (Insecta : Phthiraptera) mitochondrial 12S rRNA secondary structure is highly variable

    Lice are ectoparasitic insects hosted by birds and mammals. Mitochondrial 12S rRNA sequences obtained from lice show considerable length variation and are very difficult to align. We show that the louse 12S rRNA domain III secondary structure displays considerable variation compared to other insects, in both the shape and number of stems and loops. Phylogenetic trees constructed from tree edit distances between louse 12S rRNA structures do not closely resemble trees constructed from sequence data, suggesting that at least some of this structural variation has arisen independently in different louse lineages. Taken together with previous work on mitochondrial gene order and elevated rates of substitution in louse mitochondrial sequences, the structural variation in louse 12S rRNA confirms the highly distinctive nature of molecular evolution in these insects

    Multiple sequence alignments of partially coding nucleic acid sequences

    BACKGROUND: High quality sequence alignments of RNA and DNA sequences are an important prerequisite for the comparative analysis of genomic sequence data. Nucleic acid sequences, however, exhibit a much larger sequence heterogeneity compared to their encoded protein sequences due to the redundancy of the genetic code. It is desirable, therefore, to make use of the amino acid sequence when aligning coding nucleic acid sequences. In many cases, however, only a part of the sequence of interest is translated. On the other hand, overlapping reading frames may encode multiple alternative proteins, possibly with intermittent non-coding parts. Examples are, in particular, RNA virus genomes. RESULTS: The standard scoring scheme for nucleic acid alignments can be extended to incorporate simultaneously information on translation products in one or more reading frames. Here we present a multiple alignment tool, codaln, that implements a combined nucleic acid plus amino acid scoring model for pairwise and progressive multiple alignments that allows arbitrary weighting for almost all scoring parameters. Resource requirements of codaln are comparable with those of standard tools such as ClustalW. CONCLUSION: We demonstrate the applicability of codaln to various biologically relevant types of sequences (bacteriophage Levivirus and Vertebrate Hox clusters) and show that the combination of nucleic acid and amino acid sequence information leads to improved alignments. These, in turn, increase the performance of analysis tools that depend strictly on good input alignments such as methods for detecting conserved RNA secondary structure elements
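
    The idea of a combined nucleic acid plus amino acid score can be illustrated with a minimal sketch. This is not codaln's scoring model: the function name `combined_score`, the match/mismatch values, and the `aa_weight` parameter are all hypothetical, and the sketch assumes two gapless, equal-length DNA sequences (alphabet T, C, A, G) that are already in the same reading frame.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def combined_score(seq_a, seq_b, aa_weight=1.0,
                   nt_match=1.0, nt_mismatch=-1.0,
                   aa_match=2.0, aa_mismatch=-1.0):
    """Score two gapless, in-frame coding sequences of equal length.

    The result is the nucleotide-level score plus aa_weight times the
    score of the translated codons. All parameter values are illustrative
    assumptions, not taken from any published tool.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must have equal length divisible by 3")

    # Nucleotide term: one contribution per aligned column.
    nt_score = sum(
        nt_match if x == y else nt_mismatch
        for x, y in zip(seq_a, seq_b)
    )

    # Amino acid term: one contribution per aligned codon pair.
    aa_score = 0.0
    for i in range(0, len(seq_a), 3):
        aa_a = CODON_TABLE[seq_a[i:i + 3]]
        aa_b = CODON_TABLE[seq_b[i:i + 3]]
        aa_score += aa_match if aa_a == aa_b else aa_mismatch

    return nt_score + aa_weight * aa_score


# Two synonymous differences (TTT/TTC = Phe, GCT/GCA = Ala):
# nucleotide term is 4 - 2 = 2, amino acid term is 2 + 2 = 4.
print(combined_score("TTTGCT", "TTCGCA"))                 # 6.0
print(combined_score("TTTGCT", "TTCGCA", aa_weight=0.0))  # 2.0
```

    In this toy example the synonymous substitutions are penalised at the nucleotide level but rewarded at the amino acid level, which is the kind of effect that weighting the two terms against each other is meant to control.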

    Xenosurveillance reflects traditional sampling techniques for the identification of human pathogens: A comparative study in West Africa

    BACKGROUND: Novel surveillance strategies are needed to detect the rapid and continuous emergence of infectious disease agents. Ideally, new sampling strategies should be simple to implement, technologically uncomplicated, and applicable to areas where emergence events are known to occur. To this end, xenosurveillance is a technique that makes use of blood collected by hematophagous arthropods to monitor and identify vertebrate pathogens. Mosquitoes are largely ubiquitous animals that often exist in sizable populations. As well, many domestic or peridomestic species of mosquitoes will preferentially take blood-meals from humans, making them a unique and largely untapped reservoir to collect human blood. METHODOLOGY/PRINCIPAL FINDINGS: We sought to take advantage of this phenomenon by systematically collecting blood-fed mosquitoes during a field trial in Northern Liberia to determine whether pathogen sequences from blood-engorged mosquitoes accurately mirror those obtained directly from humans. Specifically, blood was collected from humans via finger-stick and by aspirating blood-fed mosquitoes from the inside of houses. Shotgun metagenomic sequencing of RNA and DNA derived from these specimens was performed to detect pathogen sequences. Samples obtained from xenosurveillance and from finger-stick blood collection produced a similar number and quality of reads aligning to two human viruses, GB virus C and hepatitis B virus. CONCLUSIONS/SIGNIFICANCE: This study represents the first systematic comparison between xenosurveillance and more traditional sampling methodologies, while also demonstrating the viability of xenosurveillance as a tool to sample human blood for circulating pathogens

    De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse populations.

    The human reference genome is used extensively in modern biological research. However, a single consensus representation is inadequate to provide a universal reference structure because it is a haplotype among many in the human population. Using 10× Genomics (10×G) "Linked-Read" technology, we perform whole genome sequencing (WGS) and de novo assembly on 17 individuals across five populations. We identify 1842 breakpoint-resolved non-reference unique insertions (NUIs) that, in aggregate, add up to 2.1 Mb of so far undescribed genomic content. Among these, 64% are considered ancestral to humans since they are found in non-human primate genomes. Furthermore, 37% of the NUIs can be found in the human transcriptome and 14% likely arose from Alu-recombination-mediated deletion. Our results underline the need for a set of human reference genomes that includes a comprehensive list of alternative haplotypes to depict the complete spectrum of genetic diversity across populations

    Design of RNAi reagents for invertebrate model organisms and human disease vectors

    RNAi has become an important tool to silence gene expression in a variety of organisms, in particular when classical genetic methods are missing. However, application of this method in functional studies has raised new challenges in the design of RNAi reagents in order to minimize false positive and false negative results. Since the performance of reagents can rarely be validated on a genome-wide scale, improved computational methods are required that consider experimentally derived design parameters. Here, we describe computational methods for the design of RNAi reagents for invertebrate model organisms and human disease vectors, such as Anopheles. We describe procedures for designing short and long double-stranded RNAs for single genes, and for evaluating their predicted specificity and efficiency. Using a bioinformatics pipeline we also describe how to design a genome-wide RNAi library for Anopheles gambiae

    Hotspot Identification System for identification of core residues in Diabetic Proteins

    Data on genome structural and functional features for various organisms are being accumulated and analyzed in laboratories all over the world. The data are stored and analyzed on a large variety of expert systems. Public access to most of these data offers scientists around the world an unprecedented chance to data-mine and explore in depth this extraordinary information repository, trying to convert data into knowledge. Representing DNA and RNA molecules, and the corresponding proteins, as symbolic sequences (of nucleotides and amino acids, respectively) has definite advantages for the storage, search, and retrieval of genomic information. In this study an attempt is made to develop an algorithm for aligning multiple DNA / protein sequences. In this process hotspots are located in a protein sequence using the multiple sequence alignment

    Profile Context-Sensitive HMMs for Probabilistic Modeling of Sequences With Complex Correlations

    The profile hidden Markov model is a specific type of HMM that is well suited for describing the common features of a set of related sequences. It has been extensively used in computational biology, where it is still one of the most popular tools. In this paper, we propose a new model called the profile context-sensitive HMM. Unlike traditional profile-HMMs, the proposed model is capable of describing complex long-range correlations between distant symbols in a consensus sequence. We also introduce a general algorithm that can be used for finding the optimal state-sequence of an observed symbol sequence based on the given profile-csHMM. The proposed model has an important application in RNA sequence analysis, especially in modeling and analyzing RNA pseudoknots
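
    For an ordinary HMM, the optimal state sequence of an observed symbol sequence is found with the classical Viterbi algorithm, sketched below for orientation. This is not the profile-csHMM algorithm the abstract refers to, which must additionally handle long-range correlations; the two-state model (`H`, `L`) and all of its probabilities are made-up values used only for illustration.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_path, best_log_prob) for an ordinary HMM."""
    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


def to_log(table):
    """Convert a (possibly nested) dict of positive probabilities to logs."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }


# Hypothetical toy model: state H favours G/C, state L favours A/U.
states = ["H", "L"]
log_start = to_log({"H": 0.5, "L": 0.5})
log_trans = to_log({"H": {"H": 0.9, "L": 0.1}, "L": {"H": 0.1, "L": 0.9}})
log_emit = to_log({
    "H": {"G": 0.4, "C": 0.4, "A": 0.1, "U": 0.1},
    "L": {"G": 0.1, "C": 0.1, "A": 0.4, "U": 0.4},
})

path, log_prob = viterbi("GGCAUA", states, log_start, log_trans, log_emit)
print("".join(path))  # HHHLLL
```

    The dynamic programme above only looks one step back, which is why a plain HMM cannot express the pairing of distant symbols found in RNA stems and pseudoknots.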