8 research outputs found

    Additional file 2: of Identifying and correcting epigenetics measurements for systematic sources of variation

    No full text
    Figure S2. Quantile-quantile (QQ) plots for the CpG site-specific analysis with respect to smoking, using standard adjustment (a), residuals (b), ComBat (c) and SVA (d) as correcting methods for the β values. The inflation factor λ is defined as the ratio of the median of the observed log10-transformed p-values from the CpG site-specific analysis to the median of the expected log10-transformed p-values. (PDF 110 kb)
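The definition of λ in the caption above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `inflation_factor` and the choice of expected p-values as the uniform quantiles (i - 0.5)/n are assumptions made here. Note that many studies define the inflation factor on the chi-square scale instead; the sketch follows the log10-based wording of the caption.

```python
import numpy as np

def inflation_factor(p_values):
    """Ratio of median observed to median expected -log10(p).

    Hypothetical helper following the caption's definition; the expected
    p-values are assumed to be the uniform quantiles (i - 0.5) / n.
    """
    p = np.asarray(p_values, dtype=float)
    n = p.size
    # Expected p-values under the null hypothesis (uniform on (0, 1)).
    expected = (np.arange(1, n + 1) - 0.5) / n
    observed_log = -np.log10(p)
    expected_log = -np.log10(expected)
    return float(np.median(observed_log) / np.median(expected_log))

# P-values that match the null expectation exactly, for illustration only.
null_p = (np.arange(1, 1001) - 0.5) / 1000
lam_null = inflation_factor(null_p)
# Squaring the p-values doubles every -log10(p), so the ratio doubles too.
lam_inflated = inflation_factor(null_p ** 2)
```

A value of λ close to 1 suggests the observed p-values follow the null distribution; larger values indicate inflation.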

    Identifying and correcting epigenetics measurements for systematic sources of variation

    No full text
    Abstract
    Background: Methylation measures quantified by microarray techniques can be affected by systematic variation due to the technical processing of samples, which may compromise the accuracy of the measurement process and bias the estimate of the association under investigation. Quantifying the contribution of systematic sources of variation is challenging in datasets characterized by hundreds of thousands of features. In this study, we introduce a method previously developed for the analysis of metabolomics data to evaluate the performance of existing normalizing techniques in correcting for unwanted variation. Illumina Infinium HumanMethylation450K was used to acquire methylation levels at over 421,000 CpG sites for 902 participants of a case-control study on breast cancer nested within the EPIC cohort. The principal component partial R-square (PC-PR2) analysis was used to identify and quantify the variability attributable to potential systematic sources of variation. Three correcting techniques were applied: ComBat, surrogate variables analysis (SVA), and a linear regression model to compute residuals. The impact of each correcting method on the association between smoking status and DNA methylation levels was evaluated, and results were compared with findings from a large meta-analysis.
    Results: A sizeable proportion of systematic variability due to variables expressing ‘batch’ and ‘sample position’ within ‘chip’ was identified, with values of the partial R² statistic equal to 9.5 and 11.4% of total variation, respectively. After application of ComBat or the residuals’ methods, the contribution was 1.3 and 0.2%, respectively. The SVA technique resulted in reduced variability due to ‘batch’ (1.3%) and ‘sample position’ (0.6%), and in diminished variability attributable to ‘chip’ within a batch (0.9%). After the ComBat or residuals’ corrections, a larger number of significant sites (k = 600 and k = 427, respectively) were associated with smoking status than after the SVA correction (k = 96).
    Conclusions: The three correction methods removed systematic variation in DNA methylation data, as assessed by the PC-PR2, which proved a useful tool for exploring variability in high-dimensional data. SVA produced more conservative findings than ComBat in the association between smoking and DNA methylation.
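As a rough illustration of the "residuals" correction named in the abstract, the sketch below regresses each feature on a one-hot encoding of a batch label and keeps the residuals. It is an assumption-laden toy, not the study's model: the function name `residual_correct`, the single `batch` covariate, and the step that adds the overall mean back are all choices made here, and the actual analysis may have used additional technical variables or a different specification.

```python
import numpy as np

def residual_correct(values, batch):
    """Remove batch means from each feature via least-squares residuals.

    values: samples x features array (e.g. methylation M or beta values).
    batch:  one label per sample (hypothetical single technical variable).
    """
    values = np.asarray(values, dtype=float)
    labels, codes = np.unique(np.asarray(batch).ravel(), return_inverse=True)
    # One-hot design matrix: one indicator column per batch level.
    design = np.zeros((values.shape[0], labels.size))
    design[np.arange(values.shape[0]), codes] = 1.0
    # Fit all features at once; the fitted values are the batch means.
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    residuals = values - design @ coef
    # Add the overall feature means back so values stay on the original scale.
    return residuals + values.mean(axis=0)

# Toy data: batch "b" is shifted by +10 relative to batch "a".
toy_values = np.array([[1.0, 2.0], [3.0, 4.0], [11.0, 12.0], [13.0, 14.0]])
toy_batch = ["a", "a", "b", "b"]
corrected = residual_correct(toy_values, toy_batch)
```

After correction, both batches in the toy data share the same per-feature mean, which is the sense in which the batch contribution has been removed.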

    Additional file 3: of Identifying and correcting epigenetics measurements for systematic sources of variation

    No full text
    Figure S3. Quantile-quantile (QQ) plots for the CpG site-specific analysis with respect to smoking, using standard adjustment (a), residuals (b), ComBat (c) and SVA (d) as correcting methods for the M values. The inflation factor λ is defined as the ratio of the median of the observed log10-transformed p-values from the CpG site-specific analysis to the median of the expected log10-transformed p-values. (PDF 110 kb)

    Additional file 1: of Prospective analysis of circulating metabolites and breast cancer in EPIC

    No full text
    Supplementary tables describing the completeness of the metabolite measures (Table S1) and metabolite concentrations by case-control status (Table S2); supplementary figures showing age-adjusted correlations between metabolites in control participants (Figure S1) and adjusted P values for associations between metabolites and different breast cancer subtypes (Figure S2). Abbreviations: BMI: body mass index; EPIC: European Prospective Investigation into Cancer and Nutrition; ER: estrogen receptor; FDR: false discovery rate; HER2: human epidermal growth factor receptor 2; IARC: International Agency for Research on Cancer; MHT: menopause hormone therapy; MS: mass spectrometry; NMR: nuclear magnetic resonance; OR: odds ratio; PC: phosphatidylcholine; PR: progesterone receptor; SD: standard deviation; WC: waist circumference. (PDF 1041 kb)

    Prospective analysis of circulating metabolites and breast cancer in EPIC

    No full text
    Abstract
    Background: Metabolomics is a promising molecular tool to identify novel etiologic pathways leading to cancer. Using a targeted approach, we prospectively investigated the associations between metabolite concentrations in plasma and breast cancer risk.
    Methods: A nested case-control study was established within the European Prospective Investigation into Cancer cohort, which included 1624 first primary incident invasive breast cancer cases (with known estrogen and progesterone receptor and HER2 status) and 1624 matched controls. Metabolites (n = 127: acylcarnitines, amino acids, biogenic amines, glycerophospholipids, hexose, sphingolipids) were measured by mass spectrometry in pre-diagnostic plasma samples and tested for associations with breast cancer incidence using multivariable conditional logistic regression.
    Results: Among women not using hormones at baseline (n = 2248), and after control for multiple tests, concentrations of arginine (odds ratio [OR] per SD = 0.79, 95% confidence interval [CI] = 0.70–0.90), asparagine (OR = 0.83 (0.74–0.92)), and phosphatidylcholines (PCs) ae C36:3 (OR = 0.83 (0.76–0.90)), aa C36:3 (OR = 0.84 (0.77–0.93)), ae C34:2 (OR = 0.85 (0.78–0.94)), ae C36:2 (OR = 0.85 (0.78–0.88)), and ae C38:2 (OR = 0.84 (0.76–0.93)) were inversely associated with breast cancer risk, while the acylcarnitine C2 (OR = 1.23 (1.11–1.35)) was positively associated with disease risk. In the overall population, C2 (OR = 1.15 (1.06–1.24)) and PC ae C36:3 (OR = 0.88 (0.82–0.95)) were associated with risk of breast cancer, and these relationships did not differ by breast cancer subtype, age at diagnosis, fasting status, menopausal status, or adiposity.
    Conclusions: These findings point to potentially novel pathways and biomarkers of breast cancer development. Results warrant replication in other epidemiological studies.
