Mutual Enrichment in Ranked Lists and the Statistical Assessment of Position Weight Matrix Motifs
Statistics in ranked lists is important in analyzing molecular biology
measurement data, such as ChIP-seq, which yields ranked lists of genomic
sequences. State-of-the-art methods study fixed motifs in ranked lists. More
flexible models such as position weight matrix (PWM) motifs are not addressed
in this context. To assess the enrichment of a PWM motif in a ranked list we
use a PWM-induced second ranking on the same set of elements. Possible orders
of one ranked list relative to the other are modeled by permutations. Due to
sample space complexity, it is difficult to characterize tail distributions in
the group of permutations. In this paper we develop tight upper bounds on tail
distributions of the size of the intersection of the top of two uniformly and
independently drawn permutations and demonstrate advantages of this approach
using our software implementation, mmHG-Finder, to study PWMs in several
datasets. Comment: Peer-reviewed and presented as part of the 13th Workshop on
Algorithms in Bioinformatics (WABI2013)
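The quantity the abstract refers to, the size of the intersection of the tops of two rankings, can be illustrated with a minimal sketch. This is not the mmHG-Finder implementation: the function names and the brute-force scan over cutoffs are assumptions made for illustration. The sketch relies only on the standard fact that, for fixed cutoffs n1 and n2 and two independent uniform permutations of N elements, the intersection size is hypergeometric.

```python
from math import comb


def top_intersection(rank_a, rank_b, n1, n2):
    """Size of the overlap between the top n1 of rank_a and the top n2 of rank_b."""
    return len(set(rank_a[:n1]) & set(rank_b[:n2]))


def hypergeom_tail(N, n1, n2, b):
    """P(X >= b) for X ~ Hypergeometric(N, n1, n2).

    X is the overlap of a fixed n1-subset with a uniformly drawn n2-subset
    of N elements. math.comb returns 0 for impossible terms.
    """
    total = comb(N, n2)
    hits = sum(comb(n1, k) * comb(N - n1, n2 - k)
               for k in range(b, min(n1, n2) + 1))
    return hits / total


def min_tail_over_cutoffs(rank_a, rank_b):
    """Hypothetical helper: smallest hypergeometric tail over all cutoff pairs.

    Returns (tail, n1, n2). The minimum is taken over many dependent tests,
    so it is a statistic, not a p-value.
    """
    N = len(rank_a)
    best = (1.0, 0, 0)
    for n1 in range(1, N):
        for n2 in range(1, N):
            b = top_intersection(rank_a, rank_b, n1, n2)
            p = hypergeom_tail(N, n1, n2, b)
            if p < best[0]:
                best = (p, n1, n2)
    return best


if __name__ == "__main__":
    a = list("abcdefgh")
    b = list("abcdhgfe")  # same top four, reversed bottom four
    print(min_tail_over_cutoffs(a, b))
```

Because the minimum ranges over every pair of cutoffs, it needs a correction before it can be read as a significance level; bounding the tail of such a statistic is the problem the abstract describes.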
Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data
Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein-coding genes. miRNAs play important roles in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcriptional regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein-coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoters. However, the length of the primary transcripts and the promoter organization are currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes, we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein-coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein-coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al.
Oyster RNA-seq data support the development of Malacoherpesviridae genomics
The family of double-stranded DNA (dsDNA) Malacoherpesviridae includes viruses
able to infect marine mollusks and detrimental for worldwide aquaculture production.
Due to fast-occurring mortality and a lack of permissive cell lines, the available data
on the few known Malacoherpesviridae provide only partial support for the study
of molecular virus features, life cycle, and evolutionary history. Following thorough
data mining of bivalve and gastropod RNA-seq experiments, we used more than
five million Malacoherpesviridae reads to improve the annotation of viral genomes
and to characterize viral InDels, nucleotide stretches, and SNPs. Both genome and
protein domain analyses confirmed the evolutionary diversification and gene uniqueness
of known Malacoherpesviridae. However, the presence of Malacoherpesviridae-like
sequences integrated within genomes of phylogenetically distant invertebrates indicates
broad diffusion of these viruses and calls for confirmatory investigations.
The manifest co-occurrence of OsHV-1 genotype variants in single RNA-seq samples of
Crassostrea gigas provides further support for the Malacoherpesviridae diversification. In
addition to simple sequence motifs inter-punctuating viral ORFs, recombination-inducing
sequences were found to be enriched in the OsHV-1 and AbHV1-AUS genomes.
Finally, the highly correlated expression of most viral ORFs in multiple oyster samples
is consistent with the burst of viral proteins during the lytic phase.