    Identification and Characterization of Novel Astroviruses

    Approximately 1.8 million children die from diarrhea annually, and millions more suffer multiple episodes of nonfatal diarrhea. Despite the availability of improved molecular diagnostics to detect the known viral agents, the etiology of a large proportion of diarrheal cases is unknown. In fact, it is estimated that no causative agent can be identified in up to 40% of sporadic cases or in gastroenteritis outbreaks. Detection of novel or unexpected viruses is the first step in identifying agents that could potentially close the diagnostic gap and pave the way for the development of more comprehensive preventative measures and better treatments. This dissertation encompasses the first application of cutting-edge mass sequencing approaches to the analysis of viruses present in fecal specimens from patients with diarrhea. Known enteric viruses as well as multiple sequences (with only limited sequence similarity to viruses in GenBank) from putatively novel viruses were detected in pediatric sporadic diarrhea specimens. One virus, Astrovirus MLB1 (AstV-MLB1), was fully sequenced and determined to be a highly divergent, novel astrovirus based on phylogenetic analysis. AstV-MLB1 was further detected by RT-PCR in 4/254 fecal specimens collected at the St. Louis Children's Hospital in 2008, indicating that AstV-MLB1 is currently circulating in North America. A second highly divergent, novel astrovirus, Astrovirus VA1 (AstV-VA1), was identified in two specimens from a gastroenteritis outbreak at a child care center. Mass sequencing yielded nearly the entire genome of AstV-VA1, which appears to be most closely related to astroviruses found in mink and sheep. One additional sample also tested positive for AstV-VA1 by RT-PCR, resulting in detection of the virus in 3/5 specimens collected from the outbreak. This presents the possibility that further investigations might reveal that AstV-VA1 is a causative agent of gastroenteritis outbreaks.
The identification of two novel astroviruses in fecal specimens from children with diarrhea suggests that astroviruses may cause a larger fraction of diarrhea cases than previously recognized. Furthermore, the identification and characterization of novel astroviruses MLB1 and VA1 lays the foundation for future investigations into their potential roles as etiologic agents of diarrhea.

    WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    BACKGROUND: This work addresses the problem of detecting conserved transcription factor binding sites, and regulatory regions in general, through the analysis of sequences from homologous genes, an approach that is becoming increasingly widely used given the ever-increasing amount of genomic data available. RESULTS: We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present neither needs nor computes an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation than the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. CONCLUSION: Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes.

    Bayesian machine learning methods for predicting protein-peptide interactions and detecting mosaic structures in DNA sequences alignments

    Short well-defined domains known as peptide recognition modules (PRMs) regulate many important protein-protein interactions involved in the formation of macromolecular complexes and biochemical pathways. High-throughput experiments like yeast two-hybrid and phage display are expensive and intrinsically noisy; it would therefore be desirable to target informative interactions and pursue in silico approaches. We propose a probabilistic discriminative approach for predicting PRM-mediated protein-protein interactions from sequence data. The model suffered from over-fitting, so Laplacian regularisation was found to be important in achieving a reasonable generalisation performance. A hybrid approach yielded the best performance, where the binding site motifs were initialised with the predictions of a generative model. We also propose another discriminative model which can be applied to all sequences present in the organism at a significantly lower computational cost. This is due to its additional assumption that the underlying binding sites tend to be similar. It is difficult to distinguish between the binding site motifs of the PRM due to the small number of instances of each binding site motif. However, closely related species are expected to share similar binding sites, which would be expected to be highly conserved. We investigated rate variation along DNA sequence alignments, modelling confounding effects such as recombination. Traditional approaches to phylogenetic inference assume that a single phylogenetic tree can represent the relationships and divergences between the taxa. However, taxa sequences exhibit varying levels of conservation, e.g. due to regulatory elements and active binding sites, and certain bacteria and viruses undergo interspecific recombination. We propose a phylogenetic factorial hidden Markov model to infer recombination and rate variation.
We examined the performance of our model and inference scheme on various synthetic alignments, and compared it to state-of-the-art breakpoint models. We investigated three DNA sequence alignments: one of maize actin genes, one bacterial (Neisseria), and one of HIV-1. Inference is carried out in the Bayesian framework, using Reversible Jump Markov Chain Monte Carlo.

    A subgroup of plant aquaporins facilitate the bi-directional diffusion of As(OH)3 and Sb(OH)3 across membranes

    BACKGROUND: Arsenic is a toxic and highly abundant metalloid that endangers human health through drinking water and the food chain. The most common forms of arsenic in the environment are arsenate (As(V)) and arsenite (As(III)). As(V) is a non-functional phosphate analog that enters the food chain via plant phosphate transporters. Inside cells, As(V) becomes reduced to As(III) for subsequent extrusion or compartmentation. Although much is known about As(III) transport and handling in microbes and mammals, the transport systems for As(III) have not yet been characterized in plants. RESULTS: Here we show that the Nodulin26-like Intrinsic Proteins (NIPs) AtNIP5;1 and AtNIP6;1 from Arabidopsis thaliana, OsNIP2;1 and OsNIP3;2 from Oryza sativa, and LjNIP5;1 and LjNIP6;1 from Lotus japonicus are bi-directional As(III) channels. Expression of these NIPs sensitized yeast cells to As(III) and antimonite (Sb(III)), and direct transport assays confirmed their ability to facilitate As(III) transport across cell membranes. On medium containing As(V), expression of the same NIPs improved yeast growth, probably due to increased As(III) efflux. Our data furthermore provide evidence that NIPs can discriminate between highly similar substrates and that they may have differential preferences in the direction of transport. A subgroup of As(III) permeable channels that group together in a phylogenetic tree required N-terminal truncation for functional expression in yeast. CONCLUSION: This is the first molecular identification of plant As(III) transport systems and we propose that metalloid transport through NIPs is a conserved and ancient feature. Our observations are potentially of great importance for improved remediation and tolerance of plants, and may provide a key to the development of low arsenic crops for food production.

    Homology inference with specific molecular constraints

    Evolutionary processes can be considered at multiple levels of biological organization. The work developed in this thesis focuses on protein molecular evolution. Although proteins are linear polymers composed of a basic set of 20 amino acids, they generate an enormous variety of form and function. Proteins that have arisen by common descent are classified into families; they often share common properties including similarities in sequence, structure, and function. Multiple methods have been developed to infer evolutionary relationships between proteins and classify them into families. Yet those generic methods are often inaccurate, especially when specific protein properties limit their applications. In this thesis, we analyse two protein classes that are often difficult subjects for evolutionary analysis: coiled coils, repetitive protein domains defined by a simple, widespread peptide motif (chapters 2 and 3), and Rab small GTPases, a large family of closely related proteins (chapters 4 and 5). In both cases, we analyse the specific properties that determine protein structure and function and use them to improve their evolutionary inference.

    Revisiting the evolution of mouse LINE-1 in the genomic era

    Uniparental Genetic Heritage of Belarusians: Encounter of Rare Middle Eastern Matrilineages with a Central European Mitochondrial DNA Pool

    Ethnic Belarusians make up more than 80% of the nine and a half million people inhabiting the Republic of Belarus. Belarusians, together with Ukrainians and Russians, represent the East Slavic linguistic group, the largest both in numbers and territory, inhabiting East Europe alongside Baltic-, Finno-Permic- and Turkic-speaking people. To date, only a limited number of low-resolution genetic studies have been performed on this population. Therefore, with the phylogeographic analysis of 565 Y-chromosomes and 267 mitochondrial DNAs from six well-covered geographic sub-regions of Belarus, we strove to complement the existing genetic profile of eastern Europeans. Our results reveal that around 80% of the paternal Belarusian gene pool is composed of R1a, I2a and N1c Y-chromosome haplogroups, a profile very similar to that of the two other eastern European populations, Ukrainians and Russians. The maternal Belarusian gene pool encompasses a full range of West Eurasian haplogroups and agrees well with the genetic structure of central-east European populations. Our data attest that latitudinal gradients characterize the variation of the uniparentally transmitted gene pools of modern Belarusians. In particular, the Y-chromosome reflects movements of people in central-east Europe, starting probably as early as the beginning of the Holocene. Furthermore, the matrilineal legacy of Belarusians retains two rare mitochondrial DNA haplogroups, N1a3 and N3, whose phylogeographies were explored in detail after de novo sequencing of 20 and 13 complete mitogenomes, respectively, from all over Eurasia. Our phylogeographic analyses reveal that two mitochondrial DNA lineages, N3 and N1a3, both of Middle Eastern origin, might mark distinct events of matrilineal gene flow to Europe: during the mid-Holocene period and around the Pleistocene-Holocene transition, respectively.

    Phylogenetic and functional analysis of the Cation Diffusion Facilitator (CDF) family: improved signature and prediction of substrate specificity

    BACKGROUND: The Cation Diffusion Facilitator (CDF) family is a ubiquitous family of heavy metal transporters. Much interest in this family has focused on implications for human health and bioremediation. In this work a broad phylogenetic study has been undertaken which, considered in the context of the functional characteristics of some fully characterised CDF transporters, has aimed at identifying molecular determinants of substrate selectivity and at suggesting metal specificity for newly identified CDF transporters. RESULTS: Representative CDF members from all three kingdoms of life (Archaea, Eubacteria, Eukaryotes) were retrieved from genomic databases. Protein sequence alignment has allowed detection of a modified signature that can be used to identify new hypothetical CDF members. Phylogenetic reconstruction has classified the majority of CDF family members into three groups, each containing characterised members that share the same specificity towards the principally transported metal, i.e. Zn, Fe/Zn or Mn. The metal selectivity of newly identified CDF transporters can be inferred by their position in one of these groups. The function of some conserved amino acids was assessed by site-directed mutagenesis in the poplar Zn2+ transporter PtdMTP1 and compared with similar experiments performed in prokaryotic members. An essential structural role can be assigned to a widely conserved glycine residue, while aspartate and histidine residues, highly conserved in putative transmembrane domains, might be involved in metal transport. The potential role of group-conserved amino acid residues in metal specificity is discussed. CONCLUSION: In the present study phylogenetic and functional analyses have allowed the identification of three major substrate-specific CDF groups. The metal selectivity of newly identified CDF transporters can be inferred by their position in one of these groups.
The modified signature sequence proposed in this work can be used to identify new hypothetical CDF members.