128 research outputs found

    Analysis of Antisense Expression by Whole Genome Tiling Microarrays and siRNAs Suggests Mis-Annotation of Arabidopsis Orphan Protein-Coding Genes

    MicroRNAs (miRNAs) and trans-acting small-interfering RNAs (tasi-RNAs) are small (20-22 nt long) RNAs (smRNAs) generated from hairpin secondary structures or antisense transcripts, respectively, that regulate gene expression by Watson-Crick pairing to a target mRNA and altering expression by mechanisms related to RNA interference. The high sequence homology of plant miRNAs to their targets has been the mainstay of miRNA prediction algorithms, which are limited in their predictive power for other kingdoms because miRNA complementarity is less conserved yet transitive processes (production of antisense smRNAs) are active in eukaryotes. We hypothesize that antisense transcription and associated smRNAs are biomarkers which can be computationally modeled for gene discovery. We explored rice (Oryza sativa) sense and antisense gene expression in publicly available whole genome tiling array transcriptome data and sequenced smRNA libraries (as well as C. elegans) and found evidence of transitivity of MIRNA genes similar to that found in Arabidopsis. Statistical analysis of antisense transcript abundances, presence of antisense ESTs, and association with smRNAs suggests several hundred Arabidopsis 'orphan' hypothetical genes are non-coding RNAs. Consistent with this hypothesis, we found novel Arabidopsis homologues of some MIRNA genes on the antisense strand of previously annotated protein-coding genes. A Support Vector Machine (SVM) was applied using thermodynamic energy of binding plus novel expression features of sense/antisense transcription topology and siRNA abundances to build a prediction model of miRNA targets. The SVM when trained on targets could predict the "ancient" (deeply conserved) class of validated Arabidopsis MIRNA genes with an accuracy of 84%, and 76% for "new" rapidly-evolving MIRNA genes. Antisense and smRNA expression features and computational methods may identify novel MIRNA genes and other non-coding RNAs in plants and potentially other kingdoms, which can provide insight into antisense transcription, miRNA evolution, and post-transcriptional gene regulation

    Role of Next-Generation RNA-Seq Data in Discovery and Characterization of Long Non-Coding RNA in Plants

    Next-generation sequencing (NGS) encompasses advanced sequencing technologies that can generate high-throughput RNA-seq data to delve into all possible aspects of the transcriptome. These include short-read sequencing approaches such as 454, Illumina, SOLiD and Ion Torrent, and more advanced single-molecule long-read sequencing approaches including PacBio and nanopore sequencing. Together with computational approaches, these technologies are revealing the necessity of the complex non-coding part of the genome, once dubbed “junk DNA.” The ready availability of high-throughput RNA-seq data has allowed the genome-wide identification of long non-coding RNAs (lncRNAs). High-confidence lncRNAs can be filtered from the whole set of RNA-seq data using a computational pipeline. These can be categorized into intergenic, intronic, sense, antisense, and bidirectional lncRNAs with respect to their genomic localization. In plants, lncRNAs are transcribed by the plant-specific RNA polymerases IV and V in addition to RNA polymerase II, and they target epigenetic regulation through RNA-directed DNA methylation (RdDM). lncRNAs regulate gene expression through a variety of mechanisms, including target mimicry, histone modification, and chromosome looping. The differential expression pattern of lncRNAs during developmental processes and different stress responses indicates their diverse roles in plants
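
The five localization categories named in the abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the `Gene` record, the `classify_lncrna` function, the 1000 bp `promoter_window`, the order in which the categories are tested, and the simplified definitions (for example, any opposite-strand overlap is called antisense, and "bidirectional" means a head-to-head neighbour on the opposite strand close to the gene's transcription start site). Published pipelines define these classes in differing ways.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    """Hypothetical, simplified gene model (inclusive coordinates)."""
    start: int
    end: int
    strand: str                      # "+" or "-"
    exons: List[Tuple[int, int]]     # (start, end) pairs


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify_lncrna(start: int, end: int, strand: str,
                    genes: List[Gene], promoter_window: int = 1000) -> str:
    """Assign one of five localization classes (assumed definitions)."""
    # 1. Transcripts overlapping an annotated gene.
    for g in genes:
        if overlaps(start, end, g.start, g.end):
            if strand != g.strand:
                return "antisense"
            if any(overlaps(start, end, es, ee) for es, ee in g.exons):
                return "sense"
            return "intronic"        # same strand, no exon overlap
    # 2. Non-overlapping, opposite strand, head-to-head near a gene's TSS.
    for g in genes:
        if strand == g.strand:
            continue
        if g.strand == "+" and end < g.start:
            distance = g.start - end
        elif g.strand == "-" and start > g.end:
            distance = start - g.end
        else:
            continue
        if distance <= promoter_window:
            return "bidirectional"
    # 3. Everything else lies between genes.
    return "intergenic"


genes = [Gene(1000, 5000, "+", [(1000, 1500), (3000, 5000)])]
print(classify_lncrna(2000, 2500, "+", genes))   # falls inside the intron
```

In practice such a classification would run against a full genome annotation and handle multiple chromosomes and overlapping genes; the sketch only shows how coordinates and strand determine the class under the stated assumptions.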

    Array-based high-throughput DNA markers for crop improvement

    The last two decades have witnessed a remarkable activity in the development and use of molecular markers both in animal and plant systems. This activity started with low-throughput restriction fragment length polymorphisms and culminated in recent years with single nucleotide polymorphisms (SNPs), which are abundant and uniformly distributed. Although the latter became the markers of choice for many, their discovery needed previous sequence information. However, with the availability of microarrays, SNP platforms have been developed, which allow genotyping of thousands of markers in parallel. Besides SNPs, some other novel marker systems, including single feature polymorphisms, diversity array technology and restriction site-associated DNA markers, have also been developed, where array-based assays have been utilized to provide for the desired ultra-high throughput and low cost. These microarray-based markers are the markers of choice for the future and are already being used for construction of high-density maps, quantitative trait loci (QTL) mapping (including expression QTLs) and genetic diversity analysis with a limited expense in terms of time and money. In this study, we briefly describe the characteristics of these array-based marker systems and review the work that has already been done involving development and use of these markers, not only in simple eukaryotes like yeast, but also in a variety of seed plants with simple or complex genomes

    Genome-wide analysis of alternative splicing of pre-mRNA under salt stress in Arabidopsis

    BACKGROUND: Alternative splicing (AS) of precursor mRNA (pre-mRNA) is an important gene regulation process that potentially regulates many physiological processes in plants, including the response to abiotic stresses such as salt stress. RESULTS: To analyze global changes in AS under salt stress, we obtained high-coverage (~200 times) RNA sequencing data from Arabidopsis thaliana seedlings that were treated with different concentrations of NaCl. We detected that ~49% of all intron-containing genes were alternatively spliced under salt stress, 10% of which experienced significant differential alternative splicing (DAS). Furthermore, AS increased significantly under salt stress compared with under unstressed conditions. We demonstrated that most DAS genes were not differentially regulated by salt stress, suggesting that AS may represent an independent layer of gene regulation in response to stress. Our analysis of functional categories suggested that DAS genes were associated with specific functional pathways, such as the pathways for the responses to stresses and RNA splicing. We revealed that serine/arginine-rich (SR) splicing factors were frequently and specifically regulated in AS under salt stresses, suggesting a complex loop in AS regulation for stress adaptation. We also showed that alternative splicing site selection (SS) occurred most frequently at 4 nucleotides upstream or downstream of the dominant sites and that exon skipping tended to link with alternative SS. CONCLUSIONS: Our study provided a comprehensive view of AS under salt stress and revealed novel insights into the potential roles of AS in plant response to salt stress. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-431) contains supplementary material, which is available to authorized users

    High-throughput computational methods and software for quantitative trait locus (QTL) mapping

    In recent years many new technologies such as tiling arrays and high-throughput sequencing have come to play an important role in systems genetics research. For researchers it is of the utmost importance to understand how this affects their research. This work describes possible solutions to this ‘Big Data’ avalanche which has hit systems genetics. This thesis describes the work carried out during the author’s 4-year PhD project at the Groningen Bioinformatics Centre to develop smarter and more optimized algorithms such as Pheno2Geno and MQM, and to use a collaborative approach such as xQTL workbench to store and analyse high-throughput systems genetics data

    Overcoming challenges in variant calling: exploring sequence diversity in candidate genes for plant development in perennial ryegrass (Lolium perenne)

    Revealing DNA sequence variation within the Lolium perenne genepool is important for genetic analysis and development of breeding applications. We reviewed current literature on plant development to select candidate genes in pathways that control agronomic traits, and identified 503 orthologues in L. perenne. Using targeted resequencing, we constructed a comprehensive catalogue of genomic variation for a L. perenne germplasm collection of 736 genotypes derived from current cultivars, breeding material and wild accessions. To overcome challenges of variant calling in heterogeneous outbreeding species, we used two complementary strategies to explore sequence diversity. First, four variant calling pipelines were integrated with the VariantMetaCaller to reach maximal sensitivity. Additional multiplex amplicon sequencing was used to empirically estimate an appropriate precision threshold. Second, a de novo assembly strategy was used to reconstruct divergent alleles for each gene. The advantage of this approach was illustrated by discovery of 28 novel alleles of LpSDUF247, a polymorphic gene co-segregating with the S-locus of the grass self-incompatibility system. Our approach is applicable to other genetically diverse outbreeding species. The resulting collection of functionally annotated variants can be mined for variants causing phenotypic variation, either through genetic association studies, or by selecting carriers of rare defective alleles for physiological analyses