    Direct inference and control of genetic population structure from RNA sequencing data

    RNAseq data can be used to infer genetic variants, yet its use for estimating genetic population structure remains underexplored. Here, we construct a freely available computational tool (RGStraP) to estimate RNAseq-based genetic principal components (RG-PCs) and assess whether RG-PCs can be used to control for population structure in gene expression analyses. Using whole blood samples from understudied Nepalese populations and the Geuvadis study, we show that RG-PCs had comparable results to paired array-based genotypes, with high genotype concordance and high correlations of genetic principal components, capturing subpopulations within the dataset. In differential gene expression analysis, we found that inclusion of RG-PCs as covariates reduced test statistic inflation. Our paper demonstrates that genetic population structure can be directly inferred and controlled for using RNAseq data, thus facilitating improved retrospective and future analyses of transcriptomic data.

    Epidemiology of Human Seasonal Coronaviruses Among People With Mild and Severe Acute Respiratory Illness in Blantyre, Malawi, 2011-2017

    The aim of this study was to characterize the epidemiology of human seasonal coronaviruses (HCoVs) in southern Malawi. We tested for HCoVs 229E, OC43, NL63, and HKU1 using real-time polymerase chain reaction (PCR) on upper respiratory specimens from asymptomatic controls and individuals of all ages recruited through severe acute respiratory illness (SARI) surveillance at Queen Elizabeth Central Hospital, Blantyre, and a prospective influenza-like illness (ILI) observational study between 2011 and 2017. We modeled the probability of having a positive PCR for each HCoV using negative binomial models, and calculated pathogen-attributable fractions (PAFs). Overall, 8.8% (539/6107) of specimens were positive for ≥1 HCoV. OC43 was the most frequently detected HCoV (3.1% [191/6107]). NL63 was more frequently detected in ILI patients (adjusted incidence rate ratio [aIRR], 9.60 [95% confidence interval {CI}, 3.25-28.30]), while 229E (aIRR, 8.99 [95% CI, 1.81-44.70]) was more frequent in SARI patients than asymptomatic controls. In adults, 229E and OC43 were associated with SARI (PAF, 86.5% and 89.4%, respectively), while NL63 was associated with ILI (PAF, 85.1%). The prevalence of HCoVs was similar between children with SARI and controls. All HCoVs had bimodal peaks but distinct seasonality. OC43 was the most prevalent HCoV in acute respiratory illness of all ages. Individual HCoVs had distinct seasonality that differed from temperate settings.

    Epidemiology of human seasonal coronaviruses among people with mild and severe acute respiratory illness in Blantyre, Malawi 2011–2017

    Background: The aim of this study was to characterize the epidemiology of human seasonal coronaviruses (HCoVs) in southern Malawi. Methods: We tested for HCoVs 229E, OC43, NL63, and HKU1 using real-time polymerase chain reaction (PCR) on upper respiratory specimens from asymptomatic controls and individuals of all ages recruited through severe acute respiratory illness (SARI) surveillance at Queen Elizabeth Central Hospital, Blantyre, and a prospective influenza-like illness (ILI) observational study between 2011 and 2017. We modeled the probability of having a positive PCR for each HCoV using negative binomial models, and calculated pathogen-attributable fractions (PAFs). Results: Overall, 8.8% (539/6107) of specimens were positive for ≥1 HCoV. OC43 was the most frequently detected HCoV (3.1% [191/6107]). NL63 was more frequently detected in ILI patients (adjusted incidence rate ratio [aIRR], 9.60 [95% confidence interval {CI}, 3.25-28.30]), while 229E (aIRR, 8.99 [95% CI, 1.81-44.70]) was more frequent in SARI patients than asymptomatic controls. In adults, 229E and OC43 were associated with SARI (PAF, 86.5% and 89.4%, respectively), while NL63 was associated with ILI (PAF, 85.1%). The prevalence of HCoVs was similar between children with SARI and controls. All HCoVs had bimodal peaks but distinct seasonality. Conclusions: OC43 was the most prevalent HCoV in acute respiratory illness of all ages. Individual HCoVs had distinct seasonality that differed from temperate settings.