14 research outputs found

    Open reading frames provide a rich pool of potential natural antisense transcripts in fungal genomes

    Natural antisense transcripts are reported from all kingdoms of life, and several recent genome-wide screens indicate that they are widely distributed. These transcripts seem to be involved in various biological functions and may govern the expression of their respective sense partner. Very little, however, is known about the degree of evolutionary conservation of antisense transcripts. Furthermore, none of the earlier analyses has studied whether antisense relationships are strictly pairwise or part of more complex relationships. Here we present a systematic screen for cis- and trans-located antisense transcripts based on open reading frames (ORFs) from five fungal species. The relative number of ORFs involved in antisense relationships varies greatly between the five species. In addition, other significant differences are found between the species, such as the mean length of the antisense region. The majority of trans-located antisense transcripts are involved in complex relationships, resulting in highly connected networks. The analysis of the degree of evolutionary conservation of antisense transcripts shows that most antisense transcripts have no ortholog in any other species. An annotation of antisense transcripts based on Gene Ontology points to common terms and shows that proteins of genes involved in antisense relationships preferentially localize to the nucleus, with common functions in the regulation or maintenance of nucleic acids.
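
The cis part of such a screen can be illustrated with a minimal sketch. It assumes that a cis-antisense pair is simply two ORFs on opposite strands of the same chromosome whose coordinates overlap; the `Orf` record, the function name, and the coordinates in the usage note are hypothetical, and the abstract does not describe the authors' actual pipeline.

```python
from typing import List, NamedTuple, Tuple


class Orf(NamedTuple):
    """Hypothetical ORF record; coordinates are 1-based and inclusive."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def cis_antisense_pairs(orfs: List[Orf]) -> List[Tuple[str, str, int]]:
    """Return (name_a, name_b, overlap_length) for every pair of ORFs that
    lie on opposite strands of the same chromosome and overlap."""
    pairs = []
    ordered = sorted(orfs, key=lambda o: (o.chrom, o.start))
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Sorted by chromosome and start: once b is on another
            # chromosome or starts after a ends, no later ORF overlaps a.
            if b.chrom != a.chrom or b.start > a.end:
                break
            if a.strand != b.strand:
                overlap = min(a.end, b.end) - b.start + 1
                pairs.append((a.name, b.name, overlap))
    return pairs
```

With made-up ORFs A (chr1:100-400, +), B (chr1:350-600, -) and C (chr1:380-500, +), the function pairs B with both A and C, a small instance of one ORF taking part in more than one antisense relationship.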

    Learning Channel Importance for High Content Imaging with Interpretable Deep Input Channel Mixing

    Uncovering novel drug candidates for treating complex diseases remains one of the most challenging tasks in early discovery research. To tackle this challenge, biopharma research established a standardized high content imaging protocol that tags different cellular compartments per image channel. In order to judge the experimental outcome, the scientist requires knowledge about the channel importance with respect to a certain phenotype for decoding the underlying biology. In contrast to traditional image analysis approaches, such experiments are nowadays preferably analyzed by deep learning based approaches which, however, lack crucial information about the channel importance. To overcome this limitation, we present a novel approach which utilizes multi-spectral information of high content images to interpret a certain aspect of cellular biology. To this end, we base our method on image blending concepts with alpha compositing for an arbitrary number of channels. More specifically, we introduce DCMIX, a lightweight, scalable and end-to-end trainable mixing layer which enables interpretable predictions in high content imaging while retaining the benefits of deep learning based methods. We employ an extensive set of experiments on both MNIST and RXRX1 datasets, demonstrating that DCMIX learns the biologically relevant channel importance without sacrificing prediction performance. Comment: Accepted @ DAGM German Conference on Pattern Recognition (GCPR) 202
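
The idea of reading channel importance off a blending step can be sketched as follows. This is not the DCMIX layer itself: it is a simplified stand-in that collapses the channels with a convex blend whose weights are softmax-normalized, and the function names and the per-channel `logits` parameter are assumptions made for illustration. In a trainable setting the logits would be learned together with the downstream network.

```python
import math
from typing import List, Tuple

Channel = List[List[float]]  # one image channel as rows of pixel values


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of per-channel logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_channels(image: List[Channel],
                 logits: List[float]) -> Tuple[Channel, List[float]]:
    """Blend all channels into one using softmax-normalized weights.

    Returns the blended channel and the weights; the weights sum to 1 and
    can be read as a relative importance score per channel.
    """
    alphas = softmax(logits)
    height, width = len(image[0]), len(image[0][0])
    mixed = [
        [sum(a * ch[y][x] for a, ch in zip(alphas, image))
         for x in range(width)]
        for y in range(height)
    ]
    return mixed, alphas
```

Equal logits give every channel the same weight, so the blend reduces to a per-pixel mean; raising one logit shifts the blend, and the reported importance, toward that channel.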

    Comparative analysis of structured RNAs in S. cerevisiae indicates a multitude of different functions

    Background: Non-coding RNAs (ncRNAs) are an emerging focus for both computational analysis and experimental research, resulting in a growing number of novel, non-protein coding transcripts with often unknown functions. Whole genome screens in higher eukaryotes, for example, provided evidence for a surprisingly large number of ncRNAs. To supplement these searches, we performed a computational analysis of seven yeast species and searched for new ncRNAs and RNA motifs. Results: A comparative analysis of the genomes of seven yeast species yielded roughly 2800 genomic loci that showed the hallmarks of evolutionarily conserved RNA secondary structures. A total of 74% of these regions overlapped with annotated non-coding or coding genes in yeast. Coding sequences that carry predicted structured RNA elements belong to a limited number of groups with common functions, suggesting that these RNA elements are involved in post-transcriptional regulation and/or cellular localization. About 700 conserved RNA structures were found outside annotated coding sequences and known ncRNA genes. Many of these predicted elements overlapped with UTR regions of particular classes of protein coding genes. In addition, a number of RNA elements overlapped with previously characterized antisense transcripts. Transcription of about 120 predicted elements located in promoter regions and other, previously un-annotated, intergenic regions was supported by tiling array experiments, ESTs, or SAGE data. Conclusion: Our computational predictions strongly suggest that yeasts harbor a substantial pool of several hundred novel ncRNAs. In addition, we describe a large number of RNA structures in coding sequences and also within antisense transcripts that were previously characterized using tiling arrays.

    Genomic organization of eukaryotic tRNAs

    BACKGROUND: Surprisingly little is known about the organization and distribution of tRNA genes and tRNA-related sequences on a genome-wide scale. While tRNA gene complements are usually reported in passing as part of genome annotation efforts, and peculiar features such as the tandem arrangements of tRNA genes in Entamoeba histolytica have been described in some detail, systematic comparative studies are rare and mostly restricted to bacteria. We therefore set out to survey the genomic arrangement of tRNA genes and pseudogenes in a wide range of eukaryotes to identify common patterns and taxon-specific peculiarities. RESULTS: In line with previous reports, we find that tRNA complements evolve rapidly and tRNA gene and pseudogene locations are subject to rapid turnover. At the phylum level, the distributions of the numbers of tRNA genes and pseudogenes are very broad, with standard deviations on the order of the mean. Even among closely related species we observe dramatic changes in local organization. For instance, 65% and 87% of the tRNA genes and pseudogenes are located in genomic clusters in zebrafish and stickleback, respectively, while such arrangements are relatively rare in the other three sequenced teleost fish genomes. Among basal metazoa, Trichoplax adhaerens has hardly any duplicated tRNA gene, while the sea anemone Nematostella vectensis boasts more than 17000 tRNA genes and pseudogenes. Dramatic variations are observed even within the eutherian mammals. Higher primates, for instance, have 616 +/- 120 tRNA genes and pseudogenes of which 17% to 36% are arranged in clusters, while the genome of the bushbaby Otolemur garnettii has 45225 tRNA genes and pseudogenes of which only 5.6% appear in clusters. In contrast, the distribution is surprisingly uniform across plant genomes. Consistent with this variability, syntenic conservation of tRNA genes and pseudogenes is also poor in general, with turnover rates comparable to those of unconstrained sequence elements. Despite this large variation in abundance in Eukarya, we observe a significant correlation between the number of tRNA genes, the number of tRNA pseudogenes, and genome size. CONCLUSIONS: The genomic organization of tRNA genes and pseudogenes shows complex lineage-specific patterns characterized by an extensive variability that is in striking contrast to the extreme levels of sequence conservation of the tRNAs themselves. The comprehensive analysis of the genomic organization of tRNA genes and pseudogenes in Eukarya provides a basis for further studies into the interplay of tRNA gene arrangements and genome organization in general.

    Know when you don't know

    Deep convolutional neural networks show outstanding performance in image-based phenotype classification, provided that all existing phenotypes are presented during the training of the network. However, in real-world high-content screening (HCS) experiments, it is often impossible to know all phenotypes in advance. Moreover, novel phenotype discovery itself can be an HCS outcome of interest. This aspect of HCS is not yet covered by classical deep learning approaches. When an image with a novel phenotype is presented to a trained network, the network fails to indicate a novelty discovery and instead assigns the image to a wrong phenotype. To tackle this problem and address the need for novelty detection, we use a recently developed Bayesian approach for deep neural networks called Monte Carlo (MC) dropout to define different uncertainty measures for each phenotype prediction. With real HCS data, we show that these uncertainty measures allow us to identify novel or unclear phenotypes. In addition, we also found that the MC dropout method results in a significant improvement of classification accuracy. The proposed procedure used in our HCS case study can be easily transferred to any existing network architecture and will be beneficial in terms of accuracy and novelty detection.
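
One way to turn MC dropout into an uncertainty measure is sketched below. It assumes that the network has already been run several times on the same image with dropout left active, so that `samples` holds one class-probability vector per stochastic pass. The abstract does not say which measures the authors define; predictive entropy is used here as one common choice, and the function names and the `threshold` parameter are hypothetical.

```python
import math
from typing import List


def predictive_mean(samples: List[List[float]]) -> List[float]:
    """Average the class probabilities over all stochastic forward passes."""
    n_passes = len(samples)
    n_classes = len(samples[0])
    return [sum(s[k] for s in samples) / n_passes for k in range(n_classes)]


def predictive_entropy(samples: List[List[float]]) -> float:
    """Entropy (in nats) of the averaged prediction; higher means less certain."""
    mean = predictive_mean(samples)
    return -sum(p * math.log(p) for p in mean if p > 0)


def flag_novel(samples: List[List[float]], threshold: float) -> bool:
    """Flag an image as novel or unclear when the entropy exceeds a
    user-chosen threshold (hypothetical; it would be tuned on held-out data)."""
    return predictive_entropy(samples) > threshold
```

Passes that agree on one class give an entropy near zero, while passes that disagree push the averaged prediction toward uniform and the entropy toward its maximum of log(number of classes).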

    Noncoding RNA of Glutamine Synthetase I Modulates Antibiotic Production in Streptomyces coelicolor A3(2)

    Overexpression of antisense chromosomal cis-encoded noncoding RNAs (ncRNAs) in glutamine synthetase I resulted in a decrease in growth, protein synthesis, and antibiotic production in Streptomyces coelicolor. In addition, we predicted 3,597 cis-encoded ncRNAs and validated 13 of them experimentally, including several ncRNAs that are differentially expressed in bacterial hormone-defective mutants.

    Examples of predicted ncRNAs: genomic context, tiling array pattern and predicted consensus structure

    Copyright information: Taken from "Comparative analysis of structured RNAs in S. cerevisiae indicates a multitude of different functions", http://www.biomedcentral.com/1741-7007/5/25, BMC Biology 2007;5():25-25. Published online 18 Jun 2007. PMCID: PMC1914338. The color scheme used for coloring the RNA structures and the mountain plot representation is the same as in Figure 2. (a) and (b) Intergenic region with predicted RNA overlaps with transcripts described by David et al [9]. Note that in (b) the sequence for sacKud.contig1979/20479-20583 is truncated (a stretch of seven gaps in the 3' end of the stem, which is not compensated by the deletions in the 5' part of the stem), which probably renders the RNA unusable due to an altered secondary structure. (c) Intergenic region with predicted RNA overlaps with promoter-associated transcript described by Samanta et al [7]. (d) Predicted H/ACA snoRNA, overlapping with transcripts described by both David et al [7] and Davis et al [8].
