    Integrative transcriptomics in smoking related lung diseases

    Chronic lung diseases including Chronic Obstructive Pulmonary Disease (COPD), Idiopathic Pulmonary Fibrosis (IPF) and lung cancer are major causes of morbidity and mortality in the United States due to high incidence and limited therapeutic options. To address this critical issue, I have leveraged RNA sequencing and integrative genomics to define disease-associated transcriptomic changes that could potentially be targeted by new therapeutics. We sequenced the lung transcriptome of subjects with IPF (n=19), emphysema (n=19, a subtype of COPD), or neither (n=20). The expression levels of 1770 genes differed between IPF and control lung, and 220 genes differed between emphysema and control lung (p<0.001). Upregulated genes in both emphysema and IPF were enriched for the p53/hypoxia pathway. These results were validated by immunohistochemistry of select p53/hypoxia proteins and by GSEA of independent expression microarray experiments. To identify regulatory events, I constructed an integrative miRNA target prediction and anticorrelation miRNA-mRNA network, which highlighted several miRNAs whose expression levels were the opposite of genes differentially expressed in both IPF and emphysema. MiR-96 was a highly connected hub in this network and was subsequently overexpressed in cell lines to validate several potential regulatory connections. Building upon these successful experiments, I next sought to define gene expression changes and the miRNA-mRNA regulatory network in never smoker lung cancer. Large and small RNAs were sequenced from matched lung adenocarcinoma tumor and adjacent normal lung tissue obtained from 22 subjects (8 never, 14 current and former smokers). I identified 120 genes whose expression was modified uniquely in never smoker lung tumors. Using a repository of gene-expression profiles associated with small bioactive molecules, several compounds which counter the never smoker tumor signature were identified in silico. Leveraging differential expression information, I again constructed an mRNA-miRNA regulatory network, and subsequently identified a potential never smoker oncomir, hsa-mir-424, and its transcription factor target FOXP2. In this thesis, I have identified genes, pathways and miRNA-mRNA regulatory networks that are altered in COPD, IPF, and lung adenocarcinoma among never smokers. My findings may ultimately lead to improved treatment options by identifying targetable pathways, regulators, and therapeutic drug candidates.
    2017-02-01
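    The anticorrelation network described in this abstract can be illustrated with a minimal sketch. It assumes per-sample expression vectors for each miRNA and mRNA, a table of predicted miRNA targets, and a Pearson correlation cutoff. The function names, the cutoff value, and the input layout are all hypothetical and are not taken from the thesis.

```python
from math import sqrt
from collections import Counter

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / (sd_x * sd_y)

def anticorrelated_edges(mirna_expr, mrna_expr, predicted_targets, r_cutoff=-0.5):
    """Keep predicted miRNA-target pairs whose expression is anticorrelated.

    r_cutoff is an illustrative assumption, not a value from the source.
    """
    edges = []
    for mirna, targets in predicted_targets.items():
        for gene in targets:
            if mirna in mirna_expr and gene in mrna_expr:
                r = pearson(mirna_expr[mirna], mrna_expr[gene])
                if r <= r_cutoff:
                    edges.append((mirna, gene, r))
    return edges

def hub_degrees(edges):
    """Count retained targets per miRNA; high counts suggest hub miRNAs."""
    return Counter(mirna for mirna, _, _ in edges)

# Toy data: miR-A rises across samples while G1 falls and G2 rises.
mirna_expr = {"miR-A": [1.0, 2.0, 3.0, 4.0]}
mrna_expr = {"G1": [4.0, 3.0, 2.0, 1.0], "G2": [1.0, 2.0, 3.0, 4.0]}
predicted_targets = {"miR-A": ["G1", "G2"]}
edges = anticorrelated_edges(mirna_expr, mrna_expr, predicted_targets)
degrees = hub_degrees(edges)
```

    In this toy input only the miR-A/G1 pair is retained, because G2 rises together with the miRNA rather than opposing it.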

    Similarities and Differences Between Variants Called With Human Reference Genome HG19 or HG38

    Background: Reference genome selection is a prerequisite for successful analysis of next generation sequencing (NGS) data. Current practice employs one of the two most recent human reference genome versions: HG19 or HG38. To date, the impact of genome version on SNV identification has not been rigorously assessed. Results: We compared the SNVs identified with HG19 and with HG38, leveraging whole genome sequencing (WGS) data from the genome-in-a-bottle (GIAB) project. First, SNVs were called using 26 different bioinformatics pipelines with either HG19 or HG38. Next, two tools were used to convert the called SNVs between HG19 and HG38. Lastly, we calculated conversion rates, analyzed discordant rates between SNVs called with HG19 and with HG38, and characterized the discordant SNVs. Conclusions: The conversion rates from HG38 to HG19 (average 95%) were lower than the conversion rates from HG19 to HG38 (average 99%). The conversion rates varied slightly among the various calling pipelines. Around 1.5% of SNVs were discordantly converted between HG19 and HG38. The conversions from HG38 to HG19 had more SNVs which failed conversion and more discordant SNVs than the opposite conversion (HG19 to HG38). Most of the discordant SNVs had low read depth, were low confidence SNVs as defined by GIAB, and/or were predominated by G/C alleles (52% observed versus 42% expected).
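    The conversion and discordance rates mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: a variant is a (chromosome, position, ref, alt) tuple, a failed conversion is represented as `None`, and a converted SNV counts as discordant when it is absent from the calls made directly on the target build. The abstract does not give the paper's exact definitions, so the function name and these definitions are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)

def conversion_stats(
    source_calls: Set[Variant],
    lifted: Dict[Variant, Optional[Variant]],
    target_calls: Set[Variant],
) -> Dict[str, float]:
    """Summarise a build-to-build conversion of SNV calls.

    source_calls: SNVs called on the source build.
    lifted: source SNV -> converted SNV, or None if conversion failed.
    target_calls: SNVs called directly on the target build.
    """
    total = len(source_calls)
    if total == 0:
        return {"conversion_rate": 0.0, "discordant_rate": 0.0}
    # SNVs that received coordinates on the target build.
    converted = {v: lifted[v] for v in source_calls if lifted.get(v) is not None}
    # Assumed definition: converted, but not found among direct target calls.
    discordant = [v for v, t in converted.items() if t not in target_calls]
    return {
        "conversion_rate": len(converted) / total,
        "discordant_rate": len(discordant) / total,
    }

# Toy example with made-up coordinates: four source calls, one fails
# conversion, and one converts to a site not called on the target build.
src = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
       ("chr2", 300, "G", "A"), ("chr2", 400, "T", "C")}
lift = {
    ("chr1", 100, "A", "G"): ("chr1", 110, "A", "G"),
    ("chr1", 200, "C", "T"): ("chr1", 210, "C", "T"),
    ("chr2", 300, "G", "A"): ("chr2", 310, "G", "A"),
    ("chr2", 400, "T", "C"): None,
}
tgt = {("chr1", 110, "A", "G"), ("chr1", 210, "C", "T")}
stats = conversion_stats(src, lift, tgt)
```

    In practice the `lifted` mapping would come from a coordinate-conversion tool. The sketch only shows how the two rates relate to the three call sets.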

    Assessing Reproducibility of Inherited Variants Detected With Short-Read Whole Genome Sequencing

    Background: Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. Results: To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. Conclusions: Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.

    Structural Changes Due to Antagonist Binding in Ligand Binding Pocket of Androgen Receptor Elucidated Through Molecular Dynamics Simulations

    When a small molecule binds to the androgen receptor (AR), a conformational change can occur which impacts subsequent binding of co-regulator proteins and DNA. In order to accurately study this mechanism, the scientific community needs a crystal structure of the wild-type AR (WT-AR) ligand binding domain bound with antagonist. To address this open need, we leveraged molecular docking and molecular dynamics (MD) simulations to construct a structure of the WT-AR ligand binding domain bound with the antagonist bicalutamide. The structure of mutant AR (Mut-AR) bound with this same antagonist informed this study. After molecular docking analysis pinpointed the suitable binding orientation of a ligand in AR, the model was further optimized through 1 μs of MD simulations. Using this approach, three molecular systems were studied: (1) WT-AR bound with agonist R1881, (2) WT-AR bound with antagonist bicalutamide, and (3) Mut-AR bound with bicalutamide. Our structures were very similar to the experimentally determined structures of both WT-AR with R1881 and Mut-AR with bicalutamide, demonstrating the trustworthiness of this approach. In our model, when WT-AR is bound with bicalutamide, Val716/Lys720/Gln733, or Met734/Gln738/Glu897 move and thus disturb the positive and negative charge clumps of the AF2 site. This disruption of the AF2 site is key for understanding the impact of antagonist binding on subsequent co-regulator binding. In conclusion, the antagonist-induced structural changes in WT-AR detailed in this study will enable further AR research and will facilitate AR-targeting drug discovery.