
3 research outputs found

Inferential considerations for low-count RNA-seq transcripts: a case study on an edaphic subspecies of dominant prairie grass Andropogon gerardii

Author: Raithel Seth
Publication venue: Kansas State University

Master of Science, Statistics, Nora M. Bello

Big bluestem (Andropogon gerardii) is a wide-ranging dominant prairie grass of ecological and agricultural importance to the US Midwest, whereas the edaphic subspecies sand bluestem (A. gerardii ssp. hallii) grows exclusively on sand dunes. Sand bluestem exhibits phenotypic divergence related to epicuticular properties and enhanced drought tolerance relative to big bluestem. Understanding the mechanisms underlying differential drought tolerance is relevant in the face of climate change. For bluestem subspecies, presence or absence of these phenotypes may be associated with RNA transcripts characterized by a low number of read counts. So-called low-count transcripts pose particular inferential challenges and are thus usually filtered out at early steps of data management protocols and ignored in analyses. In this study, we use a plasmode-based approach to assess the relative performance of alternative inferential strategies on RNA-seq transcripts, with special emphasis on low-count transcripts as motivated by differential bluestem phenotypes. Our dataset consists of RNA-seq read counts for 25,582 transcripts (60% of which are classified as low-count) collected from leaf tissue of 4 individual plants of big bluestem and 4 of sand bluestem. We also compare alternative ad hoc data filtering techniques commonly used in RNA-seq pipelines and assess the performance of recently developed statistical methods for differential expression (DE) analysis, namely DESeq2 and edgeR robust. These methods attempt to overcome the inherently noisy behavior of low-count transcripts by shrinkage or by differential weighting of observations, respectively. Our results indicate that proper specification of DE methods can remove the need for ad hoc data filtering at an arbitrary expression threshold, thus allowing for inference on low-count transcripts. Practical recommendations for inference are provided for cases in which low-count RNA-seq transcripts are of interest, as in the comparison of subspecies of bluestem grasses. Insights from this study may also be relevant to other applications focused on transcripts with low expression levels.
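
As an illustration of the kind of ad hoc filtering the abstract refers to, the following is a minimal sketch that flags a transcript as low-count when its mean read count across samples falls below a cutoff. The function name `is_low_count`, the cutoff of 5 reads, and the toy counts are all assumptions made for this example; the abstract does not state which thresholds or filtering rules the study compared.

```python
from statistics import mean

def is_low_count(read_counts, threshold=5.0):
    """Flag a transcript whose mean read count across samples is below threshold.

    The threshold value is a hypothetical cutoff chosen for illustration only.
    """
    return mean(read_counts) < threshold

# Toy read counts for 8 samples (e.g. 4 plants per subspecies); values are made up.
counts = {
    "transcript_a": [0, 1, 0, 2, 0, 0, 1, 0],
    "transcript_b": [120, 98, 143, 110, 87, 95, 102, 130],
    "transcript_c": [3, 4, 2, 5, 9, 8, 7, 10],
}

# An ad hoc filter would discard the flagged transcripts before DE analysis.
low_count = {name for name, reads in counts.items() if is_low_count(reads)}
retained = {name: reads for name, reads in counts.items() if name not in low_count}
```

A filter of this kind removes flagged transcripts before any model is fitted, which is the step the abstract suggests can be avoided when the DE method is properly specified.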

K-State Research Exchange

Additional file 1: Table S1. of Inferential considerations for low-count RNA-seq transcripts: a case study on the dominant prairie grass Andropogon gerardii

Author: Jennifer Shelton (3440891)
Loretta Johnson (738018)
Matthew Galliart (3440894)
Nicolae Herndon (738020)
Nora Bello (594620)
Seth Raithel (3440897)
Sue Brown (2237107)

Number of high-quality reads by ecotype and population for 454 and HiSeq platforms. Figure S1. Workflow diagram of transcriptome assembly pipeline. Figure S2. Cumulative length of sequences and number of sequences for various k-mer values, 454 data, and the combined 454 and HiSeq data. Figure S3. N values for various k-mers and MIRA 454 and MIRA clustered assemblies. Figure S4. Ortholog hit ratio (OHR) for final MIRA clustered assembly. OHR is the length of the BLASTX hit region divided by the length of the protein, in our case using the S. bicolor database. OHR is an estimate of the percent of the full-length protein sequence represented in the assembly. An OHR of 1 indicates a potential full-length transcript. (DOCX 230 kb)
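
The OHR definition above amounts to a simple ratio, sketched below. The function name `ortholog_hit_ratio` and the example lengths are hypothetical, and the sketch assumes both lengths are expressed in the same unit (amino acids here).

```python
def ortholog_hit_ratio(hit_length, protein_length):
    """Length of the BLASTX hit region divided by the length of the protein.

    Both arguments are assumed to be in the same unit (e.g. amino acids).
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    return hit_length / protein_length

# Made-up lengths for illustration.
ohr_full = ortholog_hit_ratio(350, 350)     # hit spans the whole protein
ohr_partial = ortholog_hit_ratio(140, 350)  # hit covers part of the protein
```

Under this definition, a value of 1 corresponds to a hit region as long as the reference protein, matching the description of a potential full-length transcript.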

FigShare

Inferential considerations for low-count RNA-seq transcripts: a case study on the dominant prairie grass Andropogon gerardii

Author: AK Knapp
AS Bleed
B Chevreux
B Langmead
C Lehermeier
C Soneson
D Gianola
DJ McCarthy
E Meyer
F Spitz
GL Gadbury
IPCC
J Shelton
Jennifer Shelton
JH Bullard
JL Duan
JP Steibel
Loretta Johnson
MA Dillies
Matthew Galliart
MD Robinson
MH Schulz
MI Love
MY Liu
Nicolae Herndon
Nora M. Bello
P Chouvarine
PD Reeb
PW Barnes
R Bourgon
R Schmieder
S Anders
Seth Raithel
SF Altschul
SM Van Belleghem
Sue Brown
T Mehta
V Zeng
VM Kvam
XB Zhou
Y Benjamini
Y Chen
Publication venue: Springer Science and Business Media LLC

Crossref