728 research outputs found

    Differential expression analysis for sequence count data

    Motivation: High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

    Results: We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power.

    Availability: A free open-source R software package, DESeq, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq

    The RNA-binding protein ELAV regulates Hox RNA processing, expression and function within the Drosophila nervous system

    The regulated head-to-tail expression of Hox genes provides a coordinate system for the activation of specific programmes of cell differentiation according to axial level. Recent work indicates that Hox expression can be regulated via RNA processing but the underlying mechanisms and biological significance of this form of regulation remain poorly understood. Here we explore these issues within the developing Drosophila central nervous system (CNS). We show that the pan-neural RNA-binding protein (RBP) ELAV (Hu antigen) regulates the RNA processing patterns of the Hox gene Ultrabithorax (Ubx) within the embryonic CNS. Using a combination of biochemical, genetic and imaging approaches we demonstrate that ELAV binds to discrete elements within Ubx RNAs and that its genetic removal reduces Ubx protein expression in the CNS leading to the respecification of cellular subroutines under Ubx control, thus defining for the first time a specific cellular role of ELAV within the developing CNS. Artificial provision of ELAV in glial cells (a cell type that lacks ELAV) promotes Ubx expression, suggesting that ELAV-dependent regulation might contribute to cell type-specific Hox expression patterns within the CNS. Finally, we note that expression of abdominal A and Abdominal B is reduced in elav mutant embryos, whereas other Hox genes (Antennapedia) are not affected. Based on these results and the evolutionary conservation of ELAV and Hox genes we propose that the modulation of Hox RNA processing by ELAV serves to adapt the morphogenesis of the CNS to axial level by regulating Hox expression and consequently activating local programmes of neural differentiation.

    RBPDB: a database of RNA-binding specificities

    The RNA-Binding Protein DataBase (RBPDB) is a collection of experimental observations of RNA-binding sites, both in vitro and in vivo, manually curated from primary literature. To build RBPDB, we performed a literature search for experimental binding data for all RNA-binding proteins (RBPs) with known RNA-binding domains in four metazoan species (human, mouse, fly and worm). In total, RBPDB contains binding data on 272 RBPs, including 71 that have motifs in position weight matrix format, and 36 sets of sequences of in vivo-bound transcripts from immunoprecipitation experiments. The database is accessible by a web interface which allows browsing by domain or by organism, searching and export of records, and bulk data downloads. Users can also use RBPDB to scan sequences for RBP-binding sites. RBPDB is freely available, without registration, at http://rbpdb.ccbr.utoronto.ca/

    Exploration for Functional Nucleotide Sequence Candidates within Coding Regions of Mammalian Genes

    The primary role of a protein coding gene is to encode amino acids. Therefore, synonymous sites of codons, which do not change the encoded amino acid, are regarded as evolving neutrally. However, if a certain region of a protein coding gene contains a functional nucleotide element (e.g. splicing signals), synonymous sites in the region may be under selective pressure. The existence of such elements could be detected by searching for regions of low nucleotide substitution. We explored invariant nucleotide sequences in 10 790 orthologous genes of six mammalian species (Homo sapiens, Macaca mulatta, Mus musculus, Rattus norvegicus, Bos taurus, and Canis familiaris), and extracted 4150 sequences whose conservation is significantly stronger than that of other regions of the gene and named them significantly conserved coding sequences (SCCSs). SCCSs are observed in 2273 genes. The genes are mainly involved in development, transcriptional regulation, and neurons, and are expressed in the nervous system and the head and neck organs. No strong influence of conventional factors that affect synonymous substitution was observed in SCCSs. These results imply that SCCSs may have a double function as nucleotide elements and protein coding sequences and have been retained in the course of mammalian evolution.

    Site identification in high-throughput RNA-protein interaction data

    Motivation: Post-transcriptional and co-transcriptional regulation is a crucial link between genotype and phenotype. The central players are the RNA-binding proteins, and experimental technologies [such as cross-linking with immunoprecipitation (CLIP-) and RIP-seq] for probing their activities have advanced rapidly over the course of the past decade. Statistically robust, flexible computational methods for binding site identification from high-throughput immunoprecipitation assays are largely lacking, however.

    Results: We introduce a method for site identification which provides four key advantages over previous methods: (i) it can be applied to all variations of CLIP and RIP-seq technologies, (ii) it accurately models the underlying read-count distributions, (iii) it allows external covariates, such as transcript abundance (which we demonstrate is highly correlated with read count), to inform the site identification process and (iv) it allows for direct comparison of site usage across cell types or conditions. © The Author 2012. Published by Oxford University Press. All rights reserved.

    PRIDB: a protein–RNA interface database

    The Protein–RNA Interface Database (PRIDB) is a comprehensive database of protein–RNA interfaces extracted from complexes in the Protein Data Bank (PDB). It is designed to facilitate detailed analyses of individual protein–RNA complexes and their interfaces, in addition to automated generation of user-defined data sets of protein–RNA interfaces for statistical analyses and machine learning applications. For any chosen PDB complex or list of complexes, PRIDB rapidly displays interfacial amino acids and ribonucleotides within the primary sequences of the interacting protein and RNA chains. PRIDB also identifies ProSite motifs in protein chains and FR3D motifs in RNA chains and provides links to these external databases, as well as to structure files in the PDB. An integrated JMol applet is provided for visualization of interacting atoms and residues in the context of the 3D complex structures. The current version of PRIDB contains structural information regarding 926 protein–RNA complexes available in the PDB (as of 10 October 2010). Atomic- and residue-level contact information for the entire data set can be downloaded in a simple machine-readable format. Also, several non-redundant benchmark data sets of protein–RNA complexes are provided. The PRIDB database is freely available online at http://bindr.gdcb.iastate.edu/PRIDB

    FRA2A is a CGG repeat expansion associated with silencing of AFF3

    Folate-sensitive fragile sites (FSFS) are a rare cytogenetically visible subset of dynamic mutations. Of the eight molecularly characterized FSFS, four are associated with intellectual disability (ID). Cytogenetic expression results from CGG tri-nucleotide-repeat expansion mutation associated with local CpG hypermethylation and transcriptional silencing. The best studied is the FRAXA site in the FMR1 gene, where large expansions cause fragile X syndrome, the most common inherited ID syndrome. Here we studied three families with FRA2A expression at 2q11 associated with a wide spectrum of neurodevelopmental phenotypes. We identified a polymorphic CGG repeat in a conserved, brain-active alternative promoter of the AFF3 gene, an autosomal homolog of the X-linked AFF2/FMR2 gene; expansion of the AFF2 CGG repeat causes FRAXE ID. We found that FRA2A-expressing individuals have mosaic expansions of the AFF3 CGG repeat in the range of several hundred repeat units. Moreover, bisulfite sequencing and pyrosequencing both suggest AFF3 promoter hypermethylation. cSNP analysis demonstrates monoallelic expression of the AFF3 gene in FRA2A carriers, thus predicting that FRA2A expression results in functional haploinsufficiency for AFF3 at least in a subset of tissues. By whole-mount in situ hybridization the mouse AFF3 ortholog shows strong regional expression in the developing brain, somites and limb buds in 9.5-12.5 dpc mouse embryos. Our data suggest that there may be an association between FRA2A and a delay in the acquisition of motor and language skills in the families studied here. However, additional cases are required to firmly establish a causal relationship.

    A role for SSU72 in balancing RNA polymerase II transcription elongation and termination

    Interactions of pre-mRNA 3′ end factors and the CTD of RNA polymerase II (RNAP II) are required for transcription termination and 3′ end processing. Here, we demonstrate that Ssu72p is stably associated with yeast cleavage and polyadenylation factor CPF and provide evidence that it bridges the CPF subunits Pta1p and Ydh1p/Cft2p, the general transcription factor TFIIB, and RNAP II via Rpb2p. Analyses of ssu72-2 mutant cells in the absence and presence of the nuclear exosome component Rrp6p revealed defects in RNAP II transcription elongation and termination. 6-Azauracil, which reduces transcription elongation rates, suppressed the ssu72-2 growth defect at 33°C. The sum of our analyses suggests a negative influence of Ssu72p on RNAP II during transcription that affects the commitment to either elongation or termination.