
    Methods to study splicing from high-throughput RNA Sequencing data

    The development of novel high-throughput sequencing (HTS) methods for RNA (RNA-Seq) has provided a very powerful means to study splicing under multiple conditions at unprecedented depth. However, the complexity of the information to be analyzed has turned this into a challenging task. In the last few years, a plethora of tools have been developed, allowing researchers to process RNA-Seq data to study the expression of isoforms and splicing events, and their relative changes under different conditions. We provide an overview of the methods available to study splicing from short RNA-Seq data. We group the methods according to the different questions they address: 1) Assignment of the sequencing reads to their likely gene of origin. This is addressed by methods that map reads to the genome and/or to the available gene annotations. 2) Recovering the sequence of splicing events and isoforms. This is addressed by transcript reconstruction and de novo assembly methods. 3) Quantification of events and isoforms. Either after reconstructing transcripts or using an annotation, many methods estimate the expression level or the relative usage of isoforms and/or events. 4) Providing an isoform or event view of differential splicing or expression. These include methods that compare relative event/isoform abundance or isoform expression across two or more conditions. 5) Visualizing splicing regulation. Various tools facilitate the visualization of the RNA-Seq data in the context of alternative splicing. In this review, we do not describe the specific mathematical models behind each method. Our aim is rather to provide an overview that could serve as an entry point for users who need to decide on a suitable tool for a specific analysis. We also attempt to propose a classification of the tools according to the operations they perform, to facilitate the comparison and choice of methods. Comment: 31 pages, 1 figure, 9 tables. Small corrections adde

    Global analyses of endonucleolytic cleavage in mammals reveal expanded repertoires of cleavage-inducing small RNAs and their targets.

    In mammals, small RNAs are important players in post-transcriptional gene regulation. While their roles in mRNA destabilization and translational repression are well appreciated, their involvement in endonucleolytic cleavage of target RNAs is poorly understood. Very few microRNAs are known to guide RNA cleavage. Endogenous small interfering RNAs are expected to induce target cleavage, but their target genes remain largely unknown. We report a systematic study of small RNA-mediated endonucleolytic cleavage in mouse through integrative analysis of small RNA and degradome sequencing data without imposing any bias toward known small RNAs. Hundreds of small cleavage-inducing RNAs and their cognate target genes were identified, significantly expanding the repertoire of known small RNA-guided cleavage events. Strikingly, both small RNAs and their target sites demonstrated significant overlap with retrotransposons, providing evidence for the long-standing speculation that retrotransposable elements in mRNAs are leveraged as signals for gene targeting. Furthermore, our analysis showed that the RNA cleavage pathway is also present in human cells but affects a different repertoire of retrotransposons. These results show that small RNA-guided cleavage is more widespread than previously appreciated. Its impact on retrotransposons in non-coding regions sheds light on important aspects of mammalian gene regulation

    The Echinococcus canadensis (G7) genome: A key knowledge of parasitic platyhelminth human diseases

    Background: The parasite Echinococcus canadensis (G7) (phylum Platyhelminthes, class Cestoda) is one of the causative agents of echinococcosis. Echinococcosis is a worldwide chronic zoonosis affecting humans as well as domestic and wild mammals, which has been reported as a prioritized neglected disease by the World Health Organisation. No genomic data, comparative genomic analyses or efficient therapeutic and diagnostic tools are available for this severe disease. The information presented in this study will help to understand the peculiar biological characters and to design species-specific control tools. Results: We sequenced, assembled and annotated the 115-Mb genome of E. canadensis (G7). Comparative genomic analyses using whole genome data of three Echinococcus species not only confirmed the status of E. canadensis (G7) as a separate species but also demonstrated high nucleotide sequence divergence in relation to E. granulosus (G1). The E. canadensis (G7) genome contains 11,449 genes with a core set of 881 orthologs shared among five cestode species. Comparative genomics revealed that there are more single nucleotide polymorphisms (SNPs) between E. canadensis (G7) and E. granulosus (G1) than between E. canadensis (G7) and E. multilocularis. This result was unexpected since E. canadensis (G7) and E. granulosus (G1) were considered to belong to the species complex E. granulosus sensu lato. We described SNPs in known drug targets and metabolism genes in the E. canadensis (G7) genome. Regarding gene regulation, we analysed three particular features: CpG island distribution along the three Echinococcus genomes, DNA methylation system and small RNA pathway. The results suggest the occurrence of yet unknown gene regulation mechanisms in Echinococcus. Conclusions: This is the first work that addresses Echinococcus comparative genomics. 
The resources presented here will promote the study of mechanisms of parasite development as well as new tools for drug discovery. The availability of a high-quality genome assembly is critical for fully exploring the biology of a pathogenic organism. The E. canadensis (G7) genome presented in this study provides a unique opportunity to address the genetic diversity among the genus Echinococcus and its particular developmental features. At present, there is no unequivocal taxonomic classification of Echinococcus species; however, the genome-wide SNP analysis performed here revealed the phylogenetic distance among these three Echinococcus species. Additional cestode genomes need to be sequenced to be able to resolve their phylogeny.
    Affiliation: Maldonado, Lucas Luciano. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones en Microbiología y Parasitología Médica. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones en Microbiología y Parasitología Médica; Argentina
    Affiliation: Assis, Juliana. Fundación Oswaldo Cruz; Brazil
    Affiliation: Gomes Araújo, Flávio M.. Fundación Oswaldo Cruz; Brazil
    Affiliation: Salim, Anna C. M.. Fundación Oswaldo Cruz; Brazil
    Affiliation: Macchiaroli, Natalia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones en Microbiología y Parasitología Médica. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones en Microbiología y Parasitología Médica; Argentina
    Affiliation: Cucher, Marcela Alejandra. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones en Microbiología y Parasitología Médica. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones en Microbiología y Parasitología Médica; Argentina
    Affiliation: Camicia, Federico. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones en Microbiología y Parasitología Médica. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones en Microbiología y Parasitología Médica; Argentina
    Affiliation: Fox, Adolfo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones en Microbiología y Parasitología Médica. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones en Microbiología y Parasitología Médica; Argentina
    Affiliation: Rosenzvit, Mara Cecilia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones en Microbiología y Parasitología Médica. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones en Microbiología y Parasitología Médica; Argentina
    Affiliation: Oliveira, Guilherme. Instituto Tecnológico Vale; Brazil. Fundación Oswaldo Cruz; Brazil
    Affiliation: Kamenetzky, Laura. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones en Microbiología y Parasitología Médica. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones en Microbiología y Parasitología Médica; Argentin

    A Cryptic Non-Inducible Prophage Confers Phage-Immunity on the Streptococcus thermophilus M17PTZA496

    da Silva Duarte, Vinícius; Giaretta, Sabrina; Campanaro, Stefano; Treu, Laura; Armani, Andrea; Tarrah, Armin; Oliveira de Paula, Sérgio; Giacomini, Alessio; Corich, Viviana

    Species-level functional profiling of metagenomes and metatranscriptomes.

    Functional profiles of microbial communities are typically generated using comprehensive metagenomic or metatranscriptomic sequence read searches, which are time-consuming, prone to spurious mapping, and often limited to community-level quantification. We developed HUMAnN2, a tiered search strategy that enables fast, accurate, and species-resolved functional profiling of host-associated and environmental communities. HUMAnN2 identifies a community's known species, aligns reads to their pangenomes, performs translated search on unclassified reads, and finally quantifies gene families and pathways. Relative to pure translated search, HUMAnN2 is faster and produces more accurate gene family profiles. We applied HUMAnN2 to study clinal variation in marine metabolism, ecological contribution patterns among human microbiome pathways, variation in species' genomic versus transcriptional contributions, and strain profiling. Further, we introduce 'contributional diversity' to explain patterns of ecological assembly across different microbial community types

    sTarPicker: A Method for Efficient Prediction of Bacterial sRNA Targets Based on a Two-Step Model for Hybridization

    Bacterial sRNAs are a class of small regulatory RNAs involved in regulating the expression of a variety of genes. Most sRNAs act in trans via base-pairing with target mRNAs, leading to repression or activation of translation or mRNA degradation. To date, more than 1,000 sRNAs have been identified. However, direct targets have been identified for only approximately 50 of these sRNAs. Computational predictions can provide candidates for target validation, thereby increasing the speed of sRNA target identification. Although several methods have been developed, target prediction for bacterial sRNAs remains challenging. Here, we propose a novel method for sRNA target prediction, termed sTarPicker, which is based on a two-step model for hybridization between an sRNA and an mRNA target. This method first selects stable duplexes after screening all possible duplexes between the sRNA and the potential mRNA target. Next, hybridization between the sRNA and the target is extended to span the entire binding site. Finally, quantitative predictions are produced with an ensemble classifier generated using machine-learning methods. In calculations to determine the hybridization energies of seed regions and binding regions, both thermodynamic stability and site accessibility of the sRNAs and targets were considered. Comparisons with the existing methods showed that sTarPicker performed best in both target prediction and accuracy of the predicted binding sites. sTarPicker can predict bacterial sRNA targets with higher efficiency and determine the exact locations of the interactions with a higher accuracy than competing programs. sTarPicker is available at http://ccb.bmi.ac.cn/starpicker/

    Quantification of miRNAs and Their Networks in the light of Integral Value Transformations

    MicroRNAs (miRNAs), which are on average only 21-25 nucleotides long, are key post-transcriptional regulators of gene expression in metazoans and plants. A proper quantitative understanding of miRNAs is required to comprehend their structure, function, evolution, etc. In this paper, the nucleotide strings of miRNAs of three organisms, namely Homo sapiens (hsa), Macaca mulatta (mml) and Pan troglodytes (ptr), have been quantified and classified based on some characterizing features. A network has been built among the miRNAs of these three organisms through a class of discrete transformations, namely Integral Value Transformations (IVTs), proposed by Sk. S. Hassan et al. [1, 2]. Through this study we have been able to accept or reject a given nucleotide string as a miRNA. This study will help us to recognize a given nucleotide string as a probable miRNA without requiring any conventional biological experiment. This method can be integrated into existing analysis pipelines for small RNA sequencing data (designed for finding novel miRNAs). It would provide more confidence and would make the current analysis pipelines more efficient in predicting probable miRNA candidates for biological validation and in filtering out improbable candidates

    Predicting the Impact of Alternative Splicing on Plant MADS Domain Protein Function

    Several genome-wide studies demonstrated that alternative splicing (AS) significantly increases the transcriptome complexity in plants. However, the impact of AS on the functional diversity of proteins is difficult to assess using genome-wide approaches. The availability of detailed sequence annotations for specific genes and gene families allows for a more detailed assessment of the potential effect of AS on their function. One example is the plant MADS-domain transcription factor family, members of which interact to form protein complexes that function in transcription regulation. Here, we perform an in silico analysis of the potential impact of AS on the protein-protein interaction capabilities of MIKC-type MADS-domain proteins. We first confirmed the expression of transcript isoforms resulting from predicted AS events. Expressed transcript isoforms were considered functional if they were likely to be translated and if their corresponding AS events either had an effect on predicted dimerisation motifs or occurred in regions known to be involved in multimeric complex formation, or otherwise, if their effect was conserved in different species. Nine out of twelve MIKC MADS-box genes predicted to produce multiple protein isoforms harbored putative functional AS events according to those criteria. AS events with conserved effects were only found at the borders of or within the K-box domain. We illustrate how AS can contribute to the evolution of interaction networks through an example of selective inclusion of a recently evolved interaction motif in the MADS AFFECTING FLOWERING1-3 (MAF1–3) subclade. Furthermore, we demonstrate the potential effect of an AS event in SHORT VEGETATIVE PHASE (SVP), resulting in the deletion of a short sequence stretch including a predicted interaction motif, by overexpression of the fully spliced and the alternatively spliced SVP transcripts. 
For most of the AS events we were able to formulate hypotheses about the potential impact on the interaction capabilities of the encoded MIKC protein

    The CACTA transposon Bot1 played a major role in Brassica genome divergence and gene proliferation

    We isolated and characterized a Brassica C genome-specific CACTA element, which was designated Bot1 (Brassica oleracea transposon 1). After analysing phylogenetic relationships, copy numbers and sequence similarity of Bot1 and Bot1 analogues in B. oleracea (C genome) versus Brassica rapa (A genome), we concluded that Bot1 has undergone several rounds of amplification in the oleracea genome only, and has played a major role in the recent rapa and oleracea genome divergence. We performed in silico analyses of the genomic organization and internal structure of Bot1, and established which segment of Bot1 is C-genome specific. Our work reports a fully characterized Brassica repetitive sequence that can distinguish the Brassica A and C chromosomes in the allotetraploid Brassica napus by fluorescent in situ hybridization. We demonstrated that Bot1 carries a host S locus-associated SLL3 gene copy. We speculate that Bot1 was involved in the proliferation of SLL3 around the Brassica genome. The present study reinforces the assumption that transposons are a major driver of genome and gene evolution in higher plants

    Something more is necessary: are genes and genetic diagnostic tests statutory subject matter for US patents?

    In a recent decision (AMP v. USPTO) from the US District Court, patent claims directed at DNA sequences corresponding to human genes and to diagnostic tests based on such genes have been found to be invalid, primarily on the basis that the DNA molecules claimed, which included cDNA, primers and probes, are 'products of nature' and are thus unpatentable. If upheld, this decision will have considerable impact on the ability of biotechnical companies and universities to patent the results of their research. In this article, we will explain the basis for this decision and discuss the appropriateness of patenting discoveries and their (obvious) uses in the light of this fascinating case. While our focus will primarily be on the product claims, diagnostic method claims were also revoked in AMP v. USPTO on the basis that they were for mental acts or did not involve any 'transformation of matter'. This will be discussed in the light of the recent US Supreme Court decision in Bilski v. Kappos, which focused on the patent-eligibility of process claims