22 research outputs found

    Genome‐wide association study of INDELs identified four novel susceptibility loci associated with lung cancer risk

    Genome‐wide association studies (GWAS) have identified 45 susceptibility loci associated with lung cancer. After SNPs, small insertions and deletions (INDELs) are the second most abundant genetic polymorphisms in the human genome. INDELs are highly associated with multiple human diseases, including lung cancer. However, few studies with large‐scale samples have systematically evaluated the effects of INDELs on lung cancer risk. Here, we performed a large‐scale meta‐analysis to evaluate INDELs and their risk for lung cancer in 23,202 cases and 19,048 controls. Functional annotations were performed to further explore the potential function of lung cancer risk INDELs. Conditional analysis was used to clarify the relationship between INDELs and SNPs. Four new risk loci were identified in genome‐wide INDEL analysis (1p13.2: rs5777156, Insertion, OR = 0.92, P = 9.10 × 10^−8; 4q28.2: rs58404727, Deletion, OR = 1.19, P = 5.25 × 10^−7; 12p13.31: rs71450133, Deletion, OR = 1.09, P = 8.83 × 10^−7; and 14q22.3: rs34057993, Deletion, OR = 0.90, P = 7.64 × 10^−8). The eQTL analysis and functional annotation suggested that INDELs might affect lung cancer susceptibility by regulating the expression of target genes. After conducting conditional analysis on potential causal SNPs, the INDELs in the new loci were still nominally significant. Our findings indicate that INDELs could be potentially functional genetic variants for lung cancer risk. Further functional experiments are needed to better understand INDEL mechanisms in carcinogenesis.

    Impact of individual level uncertainty of lung cancer polygenic risk score (PRS) on risk stratification

    Background: Although polygenic risk score (PRS) has emerged as a promising tool for predicting cancer risk from genome-wide association studies (GWAS), the individual-level accuracy of lung cancer PRS and the extent of its impact on subsequent clinical applications remain largely unexplored. Methods: Lung cancer PRSs and confidence/credible intervals (CI) were constructed for each individual using two statistical approaches: (1) the weighted sum of 16 GWAS-derived significant SNP loci, with the CI obtained through bootstrapping (PRS-16-CV), and (2) LDpred2, with the CI obtained through posterior sampling (PRS-Bayes), among 17,166 lung cancer cases and 12,894 controls with European ancestry from the International Lung Cancer Consortium. Individuals were classified into different genetic risk subgroups based on the relationship between their own PRS mean/PRS CI and the population-level threshold. Results: Considerable variance in PRS point estimates at the individual level was observed for both methods, with an average standard deviation (s.d.) of 0.12 for PRS-16-CV and a much larger s.d. of 0.88 for PRS-Bayes. Using PRS-16-CV, only 25.0% of individuals with PRS point estimates in the lowest decile of PRS and 16.8% in the highest decile have their entire 95% CI fully contained in the lowest and highest decile, respectively, while PRS-Bayes was unable to find any eligible individuals. Only 19% of the individuals were concordantly identified as having high genetic risk (> 90th percentile) using the two PRS estimators. An increased relative risk of lung cancer comparing the highest PRS percentile to the lowest was observed when taking the CI into account (OR = 2.73, 95% CI: 2.12–3.50, P-value = 4.13 × 10^−15) compared to using PRS-16-CV mean (OR = 2.23, 95% CI: 1.99–2.49, P-value = 5.70 × 10^−46). Improved risk prediction performance with higher AUC was consistently observed in individuals identified by PRS-16-CV CI, and the best performance was achieved by incorporating age, gender, and detailed smoking pack-years (AUC: 0.73, 95% CI = 0.72–0.74). Conclusions: Lung cancer PRS estimates using different methods have modest correlations at the individual level, highlighting the importance of considering individual-level uncertainty when evaluating the practical utility of PRS.

    Systematic analyses of regulatory variants in DNase I hypersensitive sites identified two novel lung cancer susceptibility loci

    DNase I hypersensitive sites (DHS) are abundant in regulatory elements, such as promoters, enhancers and transcription factor binding sites. Many studies have revealed that disease-associated variants are concentrated in DHS-related regions. However, few studies are available on the roles of DHS-related variants in lung cancer. In the current study, we performed a large-scale case-control study with 20,871 lung cancer cases and 15,971 controls to evaluate the associations between regulatory genetic variants in DHS and lung cancer susceptibility. The eQTL (expression quantitative trait loci) analysis and pathway enrichment analysis were performed to identify the possible target genes and pathways. Additionally, we performed motif-based analysis to explore the lung cancer related motifs using sequence kernel association test (SKAT). Two novel variants, rs186332 in 20q13.3 (C>T, OR = 1.17, 95% CI: 1.10-1.24, P = 8.45 × 10^-7) and rs4839323 in 1p13.2 (T>C, OR = 0.92, 95% CI: 0.89-0.95, P = 1.02 × 10^-6), showed significant association with lung cancer risk. The eQTL analysis suggested that these two SNPs might regulate the expression of MRGBP and SLC16A1, respectively. Moreover, the expression of both MRGBP and SLC16A1 was aberrantly elevated in lung tumor tissues. The motif-based analysis identified 10 motifs related to the risk of lung cancer (P < 1.71 × 10^-4). Our findings suggested that variants in DHS might modify lung cancer susceptibility through regulating the expression of surrounding genes. This study provided deeper insight into the roles of DHS-related genetic variants in lung cancer.

    Iam hiQ—a novel pair of accuracy indices for imputed genotypes

    Background: Imputation of untyped markers is a standard tool in genome-wide association studies to close the gap between directly genotyped and other known DNA variants. However, it is fundamental that genotypes are imputed with high accuracy. Several accuracy measures have been proposed and some are implemented in imputation software, but unfortunately these differ across platforms. In the present paper, we introduce Iam hiQ, an independent pair of accuracy measures that can be applied to dosage files, the output of all imputation software. Iam (imputation accuracy measure) quantifies the average amount of individual-specific versus population-specific genotype information in a linear manner. hiQ (heterogeneity in quantities of dosages) addresses the inter-individual heterogeneity between dosages of a marker across the sample at hand. Results: Applying both measures to a large case–control sample of the International Lung Cancer Consortium (ILCCO), comprising 27,065 individuals, we found meaningful thresholds for Iam and hiQ suitable to classify markers of poor accuracy. We demonstrate how Manhattan-like plots and moving averages of Iam and hiQ can be useful to identify regions enriched with less accurately imputed markers, whereas these regions would be missed when applying the accuracy measure info (implemented in IMPUTE2). Conclusion: We recommend using Iam hiQ in addition to other accuracy scores for variant filtering before stepping into the analysis of imputed GWAS data.

    Transcriptome‐wide association study reveals candidate causal genes for lung cancer

    We have recently completed the largest GWAS on lung cancer including 29,266 cases and 56,450 controls of European descent. The goal of this study has been to integrate the complete GWAS results with a large‐scale expression quantitative trait loci (eQTL) mapping study in human lung tissues (n = 1,038) to identify candidate causal genes for lung cancer. We performed transcriptome‐wide association study (TWAS) for lung cancer overall, by histology (adenocarcinoma, squamous cell carcinoma, small cell lung cancer) and smoking subgroups (never‐ and ever‐smokers). We performed replication analysis using lung data from the Genotype‐Tissue Expression (GTEx) project. DNA damage assays were performed in human lung fibroblasts for selected TWAS genes. As expected, the main TWAS signal for all histological subtypes and ever‐smokers was on chromosome 15q25. The gene most strongly associated with lung cancer at this locus using the TWAS approach was IREB2 (P_TWAS = 1.09E‐99), where lower predicted expression increased lung cancer risk. A new lung adenocarcinoma susceptibility locus was revealed on 9p13.3 and associated with higher predicted expression of AQP3 (P_TWAS = 3.72E‐6). Among the 45 previously described lung cancer GWAS loci, we mapped candidate target genes for 17 of them. The association AQP3‐adenocarcinoma on 9p13.3 was replicated using GTEx (P_TWAS = 6.55E‐5). Consistent with the effect of risk alleles on gene expression levels, IREB2 knockdown and AQP3 overproduction promote endogenous DNA damage. These findings indicate genes whose expression in lung tissue directly influences lung cancer risk.

    Mendelian randomization and mediation analysis of leukocyte telomere length and risk of lung and head and neck cancers

    Background: Evidence from observational studies of telomere length (TL) has been conflicting regarding its direction of association with cancer risk. We investigated the causal relevance of TL for lung and head and neck cancers using Mendelian Randomization (MR) and mediation analyses. Methods: We developed a novel genetic instrument for TL in chromosome 5p15.33, using variants identified through deep-sequencing, that were genotyped in 2051 cancer-free subjects. Next, we conducted an MR analysis of lung (16 396 cases, 13 013 controls) and head and neck cancer (4415 cases, 5013 controls) using eight genetic instruments for TL. Lastly, the 5p15.33 instrument and distinct 5p15.33 lung cancer risk loci were evaluated using two-sample mediation analysis, to quantify their direct and indirect, telomere-mediated, effects. Results: The multi-allelic 5p15.33 instrument explained 1.49-2.00% of TL variation in our data (p = 2.6 × 10^-9). The MR analysis estimated that a 1000 base-pair increase in TL increases risk of lung cancer [odds ratio (OR) = 1.41, 95% confidence interval (CI): 1.20-1.65] and lung adenocarcinoma (OR = 1.92, 95% CI: 1.51-2.22), but not squamous lung carcinoma (OR = 1.04, 95% CI: 0.83-1.29) or head and neck cancers (OR = 0.90, 95% CI: 0.70-1.05). Mediation analysis of the 5p15.33 instrument indicated an absence of direct effects on lung cancer risk (OR = 1.00, 95% CI: 0.95-1.04). Analysis of distinct 5p15.33 susceptibility variants estimated that TL mediates up to 40% of the observed associations with lung cancer risk. Conclusions: Our findings support a causal role for long telomeres in lung cancer aetiology, particularly for adenocarcinoma, and demonstrate that telomere maintenance partially mediates the lung cancer susceptibility conferred by 5p15.33 loci.

    Genome-wide interaction study of smoking behavior and non-small cell lung cancer risk in Caucasian population.

    Non-small cell lung cancer (NSCLC) is the most common type of lung cancer. Both environmental and genetic risk factors contribute to lung carcinogenesis. We conducted a genome-wide interaction analysis between SNPs and smoking status (never vs ever smokers) in a European-descent population. We adopted a two-step analysis strategy in the discovery stage: we first conducted a case-only interaction analysis to assess the relationship between SNPs and smoking behavior using 13,336 NSCLC cases. Candidate SNPs with p-value less than 0.001 were further analyzed using a standard case-control interaction analysis including 13,970 controls. The significant SNPs with p-value less than 3.5 × 10^-5 (correcting for multiple tests) from the case-control analysis in the discovery stage were further validated using an independent replication dataset comprising 5,377 controls and 3,054 NSCLC cases. We further stratified the analysis by histological subtypes. Two novel SNPs, rs6441286 and rs17723637, were identified for overall lung cancer risk. The interaction odds ratio and meta-analysis p-value for these two SNPs were 1.24 with 6.96 × 10^-7 and 1.37 with 3.49 × 10^-7, respectively. Additionally, interaction of smoking with rs4751674 was identified in squamous cell lung carcinoma with an odds ratio of 0.58 and p-value of 8.12 × 10^-7. This study is by far the largest genome-wide SNP-smoking interaction analysis reported for lung cancer. The three identified novel SNPs provide potential candidate biomarkers for lung cancer risk screening and intervention. The results from our study reinforce that gene-smoking interactions play important roles in the etiology of lung cancer and account for part of the missing heritability of this disease.

    Genetic interaction analysis among oncogenesis-related genes revealed novel genes and networks in lung cancer development

    The development of cancer is driven by the accumulation of many oncogenesis-related genetic alterations, and tumorigenesis is triggered by complex networks of involved genes rather than independent actions. To explore the epistasis existing among oncogenesis-related genes in lung cancer development, we conducted pairwise genetic interaction analyses among 35,031 SNPs from 2027 oncogenesis-related genes. The genotypes from three independent genome-wide association studies including a total of 24,037 lung cancer patients and 20,401 healthy controls with Caucasian ancestry were analyzed in the study. Using a two-stage study design including discovery and replication studies, and stringent Bonferroni correction for multiple statistical analyses, we identified significant genetic interactions between SNPs in RGL1:RAD51B (OR = 0.44, p value = 3.27 × 10^-11 in overall lung cancer and OR = 0.41, p value = 9.71 × 10^-11 in non-small cell lung cancer), SYNE1:RNF43 (OR = 0.73, p value = 1.01 × 10^-12 in adenocarcinoma) and FHIT:TSPAN8 (OR = 1.82, p value = 7.62 × 10^-11 in squamous cell carcinoma). None of these genes have been identified from previous main effect association studies in lung cancer. Further eQTL gene expression analysis in lung tissues provided information supporting the functional role of the identified epistasis in lung tumorigenesis. Gene set enrichment analysis revealed potential pathways and gene networks underlying molecular mechanisms in the development of overall lung cancer as well as its histological subtypes. Our results provide evidence that genetic interactions between oncogenesis-related genes play an important role in lung tumorigenesis, and that epistasis analysis, combined with functional annotation, provides a valuable tool for uncovering functional novel susceptibility genes that contribute to lung cancer development by interacting with other modifier genes.

    Genome-wide association study of lung adenocarcinoma in East Asia and comparison with a European population

    Lung adenocarcinoma is the most common type of lung cancer. Known risk variants explain only a small fraction of lung adenocarcinoma heritability. Here, we conducted a two-stage genome-wide association study of lung adenocarcinoma of East Asian ancestry (21,658 cases and 150,676 controls; 54.5% never-smokers) and identified 12 novel susceptibility variants, bringing the total number to 28 at 25 independent loci. Transcriptome-wide association analyses together with colocalization studies using a Taiwanese lung expression quantitative trait loci dataset (n = 115) identified novel candidate genes, including FADS1 at 11q12 and ELF5 at 11p13. In a multi-ancestry meta-analysis of East Asian and European studies, four loci were identified at 2p11, 4q32, 16q23, and 18q12. At the same time, most of our findings in East Asian populations showed no evidence of association in European populations. In our studies drawn from East Asian populations, a polygenic risk score based on the 25 loci had a stronger association in never-smokers vs. individuals with a history of smoking (P_interaction = 0.0058). These findings provide new insights into the etiology of lung adenocarcinoma in individuals from East Asian populations, which could be important in developing translational applications.

    Causal relationships between body mass index, smoking, and lung cancer: univariable and multivariable mendelian randomization.

    At the time of cancer diagnosis, body mass index (BMI) is inversely correlated with lung cancer risk, which may reflect reverse causality and confounding due to smoking behavior. We used two‐sample univariable and multivariable Mendelian randomization (MR) to estimate causal relationships of BMI and smoking behaviors on lung cancer and histological subtypes, based on an aggregated genome wide association studies (GWASs) analysis of lung cancer in 29 266 cases and 56 450 controls. We observed a positive causal effect for high BMI on occurrence of small cell lung cancer (odds ratio (OR) = 1.60, 95% confidence interval (CI) = 1.24‐2.06, P = 2.70 × 10^−4). After adjustment for smoking behaviors using multivariable Mendelian randomization (MVMR), a direct causal effect on small cell lung cancer (OR_MVMR = 1.28, 95% CI = 1.06‐1.55, P_MVMR = 0.011), and an inverse effect on lung adenocarcinoma (OR_MVMR = 0.86, 95% CI = 0.77‐0.96, P_MVMR = 0.008) were observed. A weak increased risk of lung squamous cell carcinoma was observed for higher BMI in univariable Mendelian randomization (UVMR) analysis (OR_UVMR = 1.19, 95% CI = 1.01‐1.40, P_UVMR = 0.036), but this effect disappeared after adjustment for smoking (OR_MVMR = 1.02, 95% CI = 0.90‐1.16, P_MVMR = 0.746). These results highlight the histology‐specific impact of BMI on lung carcinogenesis and imply a mediator role of smoking behaviors in the association between BMI and lung cancer.