PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets

Author: Deshpande Sumukh
England Matthew
Shuttleworth James
Taramonli Sandy
Yang Jianhua
Publication venue: Elsevier BV
Publication date: 01/02/2019

Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs which play a significant role in several biological processes. RNA-seq based transcriptome sequencing has been extensively used for identification of lncRNAs. However, accurate identification of lncRNAs in RNA-seq datasets is crucial for exploring their characteristic functions in the genome, as most coding potential computation (CPC) tools fail to accurately identify them in transcriptomic data. Well-known CPC tools such as CPC2, lncScore, and CPAT are primarily designed for prediction of lncRNAs based on the GENCODE, NONCODE and CANTATAdb databases. The prediction accuracy of these tools often drops when tested on transcriptomic datasets. This leads to more false positives and inaccuracy in the functional annotation process. In this study, we present a novel tool, PLIT, for the identification of lncRNAs in plant RNA-seq datasets. PLIT implements a feature selection method based on L1 regularization and iterative Random Forests (iRF) classification for selection of optimal features. Based on sequence and codon-bias features, it classifies the RNA-seq derived FASTA sequences into coding or long non-coding transcripts. Using L1 regularization, 31 optimal features were obtained based on lncRNA and protein-coding transcripts from 8 plant species. The performance of the tool was evaluated on 7 plant RNA-seq datasets using 10-fold cross-validation. The analysis exhibited superior accuracy when evaluated against currently available state-of-the-art CPC tools.

arXiv.org e-Print Archive

Online Research @ Cardiff

Coventry University Pure Portal

Nuclear RNA sequencing of the mouse erythroid cell transcriptome.

Author: Chakalova Lyubomira
Chen Chih-Yu
Clay Ieuan
Eskiw Christopher H
Fraser Peter
Mitchell Jennifer A
Moir Catherine A
Nagano Takashi
Schoenfelder Stefan
Umlauf David
Publication venue: PLoS One
Publication date: 01/01/2012

In addition to protein-coding genes, a substantial proportion of mammalian genomes are transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq) in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq) of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-Seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A)-enriched RNA-seq. Highly expressed protein-coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs.

Directory of Open Access Journals

PubMed Central

Apollo (Cambridge)

Brunel University Research Archive

FigShare

Analysis of long non-coding RNAs of the rice blast fungus and their interaction with small non-coding RNAs

Author: 최고봉
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2023

Thesis (Ph.D.) -- Seoul National University Graduate School: College of Agriculture and Life Sciences, Interdisciplinary Program in Agricultural Genomics, February 2023. Advisor: 이용환.

Transcription occurs in protein-coding regions as well as in regions where any protein-coding sequence is absent. Although these non-coding RNAs lack coding potential, they play roles in transcriptional, post-transcriptional, translational, and post-translational regulation by controlling protein-coding genes. Non-coding RNAs longer than 200 nucleotides are considered long non-coding RNAs (lncRNAs). As sequencing technology has advanced, a repertoire of lncRNA transcriptomes has accumulated and functional characterization of individual lncRNAs has been performed. LncRNAs have been reported to participate in development, responses to abiotic stresses, and host-microbe interaction. However, their role in plant fungal pathogens is poorly understood due to the limited range of studied species. In this study, we profiled lncRNAs of the rice blast fungus, Magnaporthe oryzae, during disease development to decipher the role of lncRNAs in response to the host. We identified lncRNAs and analyzed their genomic features and expression patterns to understand their properties, which could be related to their functions. Moreover, lncRNAs specifically expressed in infection stages and their target genes were identified to investigate functional lncRNAs. The analysis of target gene functions suggests that these lncRNAs play roles in pathogenesis such as cell wall degradation and evasion of host immunity.

LncRNAs can function alone or in cooperation with small RNAs (sRNAs). LncRNAs generally interact with sRNAs in three ways: lncRNAs can be precursors of sRNAs, be regulated by sRNAs, or regulate sRNA activity. However, their interaction is not well understood in fungi. We profiled lncRNAs and sRNAs in the absence of sRNA biogenesis machinery genes to unravel their interaction in M. oryzae. We selected sRNAs processed by the RNA interference machinery to filter out debris. The analysis of genes targeted by non-coding RNAs suggests that the two classes of non-coding RNAs are involved in different biological processes depending on the type of interaction. This study provides a repertoire of non-coding RNAs and a foundation for functional studies to elucidate their biological roles. This comprehensive study helps to understand the crosstalk between two classes of non-coding RNAs and suggests that non-coding RNAs can be key regulators in biological processes including pathogenesis. Taken together, this work sheds light on the complex regulatory network in plant pathogenic fungi.

Contents
Chapter I. Long non-coding RNA in fungi: Abstract; Introduction; I. LncRNA profiling in fungi; II. Biological roles of lncRNAs in fungi; Perspective; Literature cited.
Chapter II. Genome-wide profiling of long non-coding RNA of the rice blast fungus Magnaporthe oryzae during infection: Abstract; Introduction; Materials and methods (I. RNA extraction and strand-specific sequencing; II. Collection of in planta RNA-seq data; III. Transcriptome assembly; IV. LncRNA identification; V. LncRNA conservation analysis; VI. Assessment of stage specificity and prediction of stage-specific lncRNAs; VII. Target gene prediction; VIII. Validation of lncRNA transcript production); Results (I. Genome-wide identification of lncRNAs in M. oryzae; II. Genomic features of M. oryzae lncRNAs; III. Expression of lncRNA transcripts during infection; IV. Prediction of stage-specifically expressed lncRNA; V. Verification of lncRNA production); Discussion; Literature cited.
Chapter III. Comprehensive genome-wide analysis of non-coding RNAs reveals functions of lncRNA-sRNA crosstalk in the rice blast fungus Magnaporthe oryzae: Abstract; Introduction; Materials and methods (I. Collection of RNA-seq and sRNA-seq data; II. RNA-seq data analysis; III. sRNA-seq data analysis; IV. Target gene prediction and analysis); Results (I. Identification of lncRNAs and Dicer-dependent sRNAs; II. Identification of small RNAs originating from lncRNAs; III. Identification of sRNAs regulating lncRNA expression; IV. Construction of a lncRNA-sRNA-mRNA network); Discussion; Literature cited.
Abstract (in Korean)

SNU Open Repository and Archive

Using Pan RNA-Seq Analysis to Reveal the Ubiquitous Existence of 5′ and 3′ End Small RNAs

Author: Haishuo Ji
Jishou Ruan
Qiang Zhao
Shan Gao
Tao Zhang
Wenjun Bu
Xiaofeng Xu
Xiufeng Jin
Xue Yao
Yanqiang Liu
Ze Chen
Zhi Cheng
Publication venue: Frontiers Media SA
Publication date: 01/02/2019

In this study, we used pan RNA-seq analysis to reveal the ubiquitous existence of both 5′ and 3′ end small RNAs (5′ and 3′ sRNAs). 5′ and 3′ sRNAs alone can be used to annotate nuclear non-coding and mitochondrial genes at 1-bp resolution and identify new steady RNAs, which are usually transcribed from functional genes. We thus provide a simple and cost-effective way to annotate nuclear non-coding and mitochondrial genes and to identify new steady RNAs, particularly long non-coding RNAs (lncRNAs). Using 5′ and 3′ sRNAs, the human mitochondrial annotation was corrected and a novel ncRNA named non-coding mitochondrial RNA 1 (ncMT1) was reported for the first time in this study. We also found that most human tRNA genes have downstream lncRNA genes such as lncTRS-TGA1-1, and corrected the misunderstanding of them in previous studies. Using 5′, 3′, and intronic sRNAs, we reported for the first time that enzymatic double-stranded RNA (dsRNA) cleavage and RNA interference (RNAi) might be involved in the RNA degradation and gene expression regulation of U1 snRNA in humans. We provided a different perspective on the regulation of gene expression in U1 snRNA. We also provided a novel view on cancer and virus-induced diseases, which may help identify diagnostic or therapeutic targets from the ribonuclease III (RNase III) family and its related pathways. Our findings pave the way toward a rediscovery of dsRNA cleavage and RNAi, challenging classical theories.

Directory of Open Access Journals

New technologies accelerate the exploration of non-coding RNAs in horticultural plants.

Author: Hu Rongbin
Liu Degao
Mewalal Ritesh
Tuskan Gerald A
Yang Xiaohan
Publication venue: eScholarship, University of California
Publication date: 01/01/2017

Non-coding RNAs (ncRNAs), that is, RNAs not translated into proteins, are crucial regulators of a variety of biological processes in plants. While protein-encoding genes have been relatively well-annotated in sequenced genomes, accounting for a small portion of the genome space in plants, the universe of plant ncRNAs is rapidly expanding. Recent advances in experimental and computational technologies have generated great momentum for the discovery and functional characterization of ncRNAs. Here we summarize the classification and known biological functions of plant ncRNAs, review the application of next-generation sequencing (NGS) technology and ribosome profiling technology to ncRNA discovery in horticultural plants, and discuss the application of new technologies, especially the new genome-editing tool clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) systems, to functional characterization of plant ncRNAs.

eScholarship - University of California

Non-coding and Coding Transcriptional Profiles Are Significantly Altered in Pediatric Retinoblastoma Tumors.

Author: Bisht Madhoolika
Campbell Moray J
Dotts Kathleen
Elchuri Sailaja V
Khetan Vikas
Krishnakumar Subrmanian
Kumar Ranjith
Miles Wayne O
Nagarajha Selvan Lakshmi Dhevi
Rajasekaran Swetha
Rishi Pukhraj
Sahoo Debashis
Sivaraman Karthikeyan
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Retinoblastoma is a rare pediatric tumor of the retina, caused by the homozygous loss of the Retinoblastoma 1 (RB1) tumor suppressor gene. Previous microarray studies have identified changes in the expression profiles of coding genes; however, our understanding of how non-coding genes change in this tumor is lacking. This is an important area of research, as in many adult malignancies, non-coding genes including LNC-RNAs are used as biomarkers to predict outcome and/or relapse. To establish a complete and in-depth RNA profile of both coding and non-coding genes in Retinoblastoma tumors, we conducted RNA-seq on a cohort of tumors and normal retina controls. This analysis identified widespread transcriptional changes in the levels of both coding and non-coding genes. Unexpectedly, we also found rare RNA fusion products resulting from genomic alterations, specific to Retinoblastoma tumor samples. We then determined whether these gene expression changes, of both coding and non-coding genes, were also found in a completely independent Retinoblastoma cohort. Using our dataset, we then profiled the potential effects of deregulated LNC-RNAs on the expression of neighboring genes, the entire genome, and on mRNAs that contain a putative area of homology. This analysis showed that most deregulated LNC-RNAs do not act locally to change the transcriptional environment, but potentially function to modulate genes at distant sites. From this analysis, we selected a strongly down-regulated LNC-RNA in Retinoblastoma, DRAIC, and found that restoring DRAIC RNA levels significantly slowed the growth of the Y79 Retinoblastoma cell line. Collectively, our work has generated the first non-coding RNA profile of Retinoblastoma tumors and has found that these tumors show widespread transcriptional deregulation.

eScholarship - University of California

Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum.

Author: Bunnik Evelien M
Le Roch Karine G
Lonardi Stefano
Lu Xueqing Maggie
Nasseri Sara
Pokhriyal Neeti
Publication venue: eScholarship, University of California
Publication date: 01/11/2015

Background: Plasmodium falciparum, the deadliest malaria-causing parasite, has an extremely AT-rich (80.7%) genome. Because of high AT-content, sequence-based annotation of genes and functional elements remains challenging. In order to better understand the regulatory network controlling gene expression in the parasite, a more complete genome annotation as well as analysis tools adapted for AT-rich genomes are needed. Recent studies on genome-wide nucleosome positioning in eukaryotes have shown that nucleosome landscapes exhibit regular characteristic patterns at the 5'- and 3'-end of protein and non-protein coding genes. In addition, nucleosome depleted regions can be found near transcription start sites. These unique nucleosome landscape patterns may be exploited for the identification of novel genes. In this paper, we propose a computational approach to discover novel putative genes based exclusively on nucleosome positioning data in the AT-rich genome of P. falciparum. Results: Using binary classifiers trained on nucleosome landscapes at the gene boundaries from two independent nucleosome positioning data sets, we were able to detect a total of 231 regions containing putative genes in the genome of Plasmodium falciparum, of which 67 highly confident genes were found in both data sets. Eighty-eight of these 231 newly predicted genes exhibited transcription signal in RNA-Seq data, indicative of active transcription. In addition, 20 out of 21 selected gene candidates were further validated by RT-PCR, and 28 out of the 231 genes showed significant matches using BLASTN against an expressed sequence tag (EST) database. Furthermore, 108 (47%) out of the 231 putative novel genes overlapped with previously identified but unannotated long non-coding RNAs. Collectively, these results provide experimental validation for 163 predicted genes (70.6%). Finally, 73 out of 231 genes were found to be potentially translated based on their signal in polysome-associated RNA-Seq representing transcripts that are actively being translated. Conclusion: Our results clearly indicate that nucleosome positioning data contains sufficient information for novel gene discovery. As distinct nucleosome landscapes around genes are found in many other eukaryotic organisms, this methodology could be used to characterize the transcriptome of any organism, especially when coupled with other DNA-based gene finding and experimental methods (e.g., RNA-Seq).

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Genome-wide transcription start site profiling in biofilm-grown Burkholderia cenocepacia J2315

Author: Coenye Tom
Deforce Dieter
Förstner Konrad
Sass Andrea
Van Acker Heleen
Van Nieuwerburgh Filip
Vogel Jörg
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Background: Burkholderia cenocepacia is a soil-dwelling Gram-negative Betaproteobacterium with an important role as an opportunistic pathogen in humans. Infections with B. cenocepacia are very difficult to treat due to its high intrinsic resistance to most antibiotics. Biofilm formation further adds to its antibiotic resistance. B. cenocepacia harbours a large, multi-replicon genome with a high GC-content; the reference genome of strain J2315 includes 7374 annotated genes. This study aims to annotate transcription start sites and identify novel transcripts on a whole genome scale. Methods: RNA extracted from B. cenocepacia J2315 biofilms was analysed by differential RNA-sequencing and the resulting dataset compared to data derived from conventional, global RNA-sequencing. Transcription start sites were annotated and further analysed according to their position relative to annotated genes. Results: A total of 4010 transcription start sites were mapped over the whole B. cenocepacia genome and the primary transcription start sites of 2089 genes expressed in B. cenocepacia biofilms were defined. For 64 genes a start codon alternative to the annotated one was proposed. Substantial antisense transcription for 105 genes and two novel protein coding sequences were identified. The distribution of internal transcription start sites can be used to identify genomic islands in B. cenocepacia. A potassium pump strongly induced only under biofilm conditions was found and 15 non-coding small RNAs highly expressed in biofilms were discovered. Conclusions: Mapping transcription start sites across the B. cenocepacia genome added relevant information to the J2315 annotation. Genes and novel regulatory RNAs putatively involved in B. cenocepacia biofilm formation were identified. These findings will help in understanding regulation of B. cenocepacia biofilm formation.

Springer - Publisher Connector

Ghent University Academic Bibliography

PubMed Central

Online-Publikations-Server der Universität Würzburg