137 research outputs found

    A genome-wide study of de novo deletions identifies a candidate locus for non-syndromic isolated cleft lip/palate risk

    Background: Copy number variants (CNVs) may play an important part in the development of common birth defects such as oral clefts, and individual patients with multiple birth defects (including clefts) have been shown to carry small and large chromosomal deletions. In this paper we investigate de novo deletions, defined as DNA segments missing in an oral cleft proband but present in both unaffected parents. We compare de novo deletion frequencies in children of European ancestry with an isolated, non-syndromic oral cleft to frequencies in children of European ancestry from randomly sampled trios. Results: We identified a genome-wide significant 62 kilobase (kb) non-coding region on chromosome 7p14.1 where de novo deletions occur more frequently among oral cleft cases than controls. We also observed wider de novo deletions among cleft lip and palate (CLP) cases than seen among cleft palate (CP) and cleft lip (CL) cases. Conclusions: This study presents a region where de novo deletions appear to be involved in the etiology of oral clefts, although the underlying biological mechanisms are still unknown. Larger de novo deletions are more likely to interfere with normal craniofacial development and may result in more severe clefts. Study protocol and sample DNA source can severely affect estimates of de novo deletion frequencies. Follow-up studies are needed to further validate these findings and to potentially identify additional structural variants underlying oral clefts. © 2014 Younkin et al.; licensee BioMed Central Ltd.
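The definition of a de novo deletion used in this abstract (a segment missing in the proband but present in both parents) can be illustrated with a small interval computation. This is only a sketch under simplifying assumptions: deletions are taken as already-called half-open intervals on a single chromosome, and the function names `subtract` and `de_novo_deletions` are hypothetical. It is not the statistical calling procedure used in the study.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base pairs, one chromosome


def subtract(segment: Interval, covers: List[Interval]) -> List[Interval]:
    """Return the parts of `segment` not overlapped by any interval in `covers`."""
    start, end = segment
    remaining = []
    cursor = start
    for c_start, c_end in sorted(covers):
        if c_end <= cursor:
            continue  # this cover lies entirely before the uncovered part
        if c_start >= end:
            break  # this and all later covers lie past the segment
        if c_start > cursor:
            remaining.append((cursor, c_start))
        cursor = max(cursor, c_end)
        if cursor >= end:
            break
    if cursor < end:
        remaining.append((cursor, end))
    return remaining


def de_novo_deletions(child: List[Interval],
                      mother: List[Interval],
                      father: List[Interval]) -> List[Interval]:
    """Regions deleted in the child but deleted in neither parent."""
    parental = mother + father
    result = []
    for seg in child:
        result.extend(subtract(seg, parental))
    return result
```

For example, a child deletion spanning 100 to 200 that partly overlaps a maternal deletion at 150 to 180 leaves two candidate de novo pieces, 100 to 150 and 180 to 200.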

    Empirical Bayes analysis of single nucleotide polymorphisms

    Background: An important goal of whole-genome studies concerned with single nucleotide polymorphisms (SNPs) is the identification of SNPs associated with a covariate of interest such as the case-control status or the type of cancer. Since these studies often comprise the genotypes of hundreds of thousands of SNPs, methods are required that can cope with the corresponding multiple testing problem. For the analysis of gene expression data, approaches such as the empirical Bayes analysis of microarrays have been developed particularly for the detection of genes associated with the response. However, the empirical Bayes analysis of microarrays has only been suggested for binary responses when considering expression values, i.e. continuous predictors. Results: In this paper, we propose a modification of this empirical Bayes analysis that can be used to analyze high-dimensional categorical SNP data. This approach, along with a generalized version of the original empirical Bayes method, is available in the R package siggenes version 1.10.0 and later, which can be downloaded from http://www.bioconductor.org. Conclusion: As applications to two subsets of the HapMap data show, the empirical Bayes analysis of microarrays can be used not only to analyze continuous gene expression data but also to analyze categorical SNP data, where the response is not restricted to be binary. In association studies in which typically several tens to a few hundred SNPs are considered, our approach can furthermore be employed to test interactions of SNPs. Moreover, the posterior probabilities resulting from the empirical Bayes analysis of (prespecified) interactions/genotypes can also be used to quantify the importance of these interactions.

    Sample size requirements to detect the effect of a group of genetic variants in case-control studies

    Background: Because common diseases are caused by complex interactions among many genetic variants along with environmental risk factors, very large sample sizes are usually needed to detect such effects in case-control studies. Nevertheless, many genetic variants act in well-defined biologic systems or metabolic pathways. Therefore, a reasonable first step may be to detect the effect of a group of genetic variants before assessing specific variants. Methods: We present a simple method for determining approximate sample sizes required to detect the average joint effect of a group of genetic variants in a case-control study for multiplicative models. Results: For a range of reasonable numbers of genetic variants, the sample size requirements for the test statistic proposed here are generally not larger than those needed for assessing marginal effects of individual variants and, in many situations, actually decline as the number of genetic variants considered in the group increases. Conclusion: When a significant effect of the group of genetic variants is detected, subsequent multiple tests could be conducted to detect which individual genetic variants and their combinations are associated with disease risk. When testing for an effect size in a group of genetic variants, one can use our global test described in this paper, because the sample size required to detect an effect size in the group is comparatively small. Our method could be viewed as a screening tool for assessing groups of genetic variants involved in pathogenesis and etiology of common complex human diseases.

    Phenotype Prediction Using Regularized Regression on Genetic Data in the DREAM5 Systems Genetics B Challenge

    A major goal of large-scale genomics projects is to enable the use of data from high-throughput experimental methods to predict complex phenotypes such as disease susceptibility. The DREAM5 Systems Genetics B Challenge solicited algorithms to predict soybean plant resistance to the pathogen Phytophthora sojae from training sets including phenotype, genotype, and gene expression data. The challenge test set was divided into three subcategories, one requiring prediction based on only genotype data, another on only gene expression data, and the third on both genotype and gene expression data. Here we present our approach, primarily using regularized regression, which received the best-performer award for subchallenge B2 (gene expression only). We found that despite the availability of 941 genotype markers and 28,395 gene expression features, optimal models determined by cross-validation experiments typically used fewer than ten predictors, underscoring the importance of strong regularization in noisy datasets with far more features than samples. We also present substantial analysis of the training and test setup of the challenge, identifying high variance in performance on the gold standard test sets. Funding: National Science Foundation (U.S.), Graduate Research Fellowship Program; National Defense Science and Engineering Graduate Fellowship.

    The impact of FADS genetic variants on ω6 polyunsaturated fatty acid metabolism in African Americans

    Background: Arachidonic acid (AA) is a long-chain omega-6 polyunsaturated fatty acid (PUFA) synthesized from the precursor dihomo-gamma-linolenic acid (DGLA) that plays a vital role in immunity and inflammation. Variants in the Fatty Acid Desaturase (FADS) family of genes on chromosome 11q have been shown to play a role in PUFA metabolism in populations of European and Asian ancestry; no work has been done in populations of African ancestry to date. Results: In this study, we report that African Americans have significantly higher circulating levels of plasma AA (p = 1.35 × 10^-48) and lower DGLA levels (p = 9.80 × 10^-11) than European Americans. Tests for association in N = 329 individuals across 80 single nucleotide polymorphisms (SNPs) in the FADS locus revealed significant association with AA, DGLA and the AA/DGLA ratio, a measure of enzymatic efficiency, in both racial groups (peak signal p = 2.85 × 10^-16 in African Americans, 2.68 × 10^-23 in European Americans). Ancestry-related differences were observed at an upstream marker previously associated with AA levels (rs174537), wherein 79-82% of African Americans carry two copies of the G allele compared to only 42-45% of European Americans. Importantly, the allelic effect of the G allele, which is associated with enhanced conversion of DGLA to AA, on enzymatic efficiency was similar in both groups. Conclusions: We conclude that the impact of FADS genetic variants on PUFA metabolism, specifically AA levels, is likely more pronounced in African Americans due to the larger proportion of individuals carrying the genotype associated with increased FADS1 enzymatic conversion of DGLA to AA.

    Single nucleotide polymorphisms in obesity-related genes and all-cause and cause-specific mortality: a prospective cohort study

    Background: The aim of this study was to examine the associations between 16 specific single nucleotide polymorphisms (SNPs) in 8 obesity-related genes and overall and cause-specific mortality. We also examined the associations between the SNPs and body mass index (BMI) and change in BMI over time. Methods: Data were analyzed from 9,919 individuals who participated in two large community-based cohort studies conducted in Washington County, Maryland in 1974 (CLUE I) and 1989 (CLUE II). DNA from blood collected in 1989 was genotyped for 16 SNPs in 8 obesity-related genes: monoamine oxidase A (MAOA), lipoprotein lipase (LPL), paraoxonase 1 and 2 (PON1 and PON2), leptin receptor (LEPR), tumor necrosis factor-α (TNFα), and peroxisome proliferative activated receptor-γ and -δ (PPARG and PPARD). Data on height and weight in 1989 (CLUE II baseline) and at age 21 were collected from participants at the time of blood collection. All participants were followed from 1989 to the date of death or the end of follow-up in 2005. Cox proportional hazards regression was used to obtain the relative risk (RR) estimates and 95% confidence intervals (CI) for each SNP and mortality outcomes. Results: The results showed no patterns of association for the selected SNPs and the all-cause and cause-specific mortality outcomes, although statistically significant associations (p < 0.05) were observed between PPARG rs4684847 and all-cause mortality (CC: reference; CT: RR 0.99, 95% CI 0.89, 1.11; TT: RR 0.60, 95% CI 0.39, 0.93) and cancer-related mortality (CC: reference; CT: RR 1.01, 95% CI 0.82, 1.25; TT: RR 0.22, 95% CI 0.06, 0.90) and TNFα rs1799964 and cancer-related mortality (TT: reference; CT: RR 1.23, 95% CI 1.03, 1.47; CC: RR 0.83, 95% CI 0.54, 1.28). Additional analyses showed significant associations between SNPs in LEPR with BMI (rs1137101) and change in BMI over time (rs1045895 and rs1137101). Conclusion: Findings from this cohort study suggest that the selected SNPs are not associated with overall or cause-specific death, although several LEPR SNPs may be related to BMI and BMI change over time.

    A unified framework for multi-locus association analysis of both common and rare variants

    Background: Common, complex diseases are hypothesized to result from a combination of common and rare genetic variants. We developed a unified framework for the joint association testing of both types of variants. Within the framework, we developed a union-intersection test suitable for genome-wide analysis of single nucleotide polymorphisms (SNPs), candidate gene data, as well as medical sequencing data. The union-intersection test is a composite test of association of genotype frequencies and differential correlation among markers. Results: We demonstrated by computer simulation that the false positive error rate was controlled at the expected level. We also demonstrated scenarios in which the multi-locus test was more powerful than traditional single marker analysis. To illustrate use of the union-intersection test with real data, we analyzed a publicly available data set of 319,813 autosomal SNPs genotyped for 938 cases of Parkinson disease and 863 neurologically normal controls for which no genome-wide significant results were found by traditional single marker analysis. We also analyzed an independent follow-up sample of 183 cases and 248 controls for replication. Conclusions: We identified a single risk haplotype with a directionally consistent effect in both samples in the gene GAK, which is involved in clathrin-mediated membrane trafficking. We also found suggestive evidence that directionally inconsistent marginal effects from single marker analysis appeared to result from risk being driven by different haplotypes in the two samples for the genes SYN3 and NGLY1, which are involved in neurotransmitter release and proteasomal degradation, respectively. These results illustrate the utility of our unified framework for genome-wide association analysis of common, complex diseases.

    Large-scale genome-wide association studies and meta-analyses of longitudinal change in adult lung function.

    BACKGROUND: Genome-wide association studies (GWAS) have identified numerous loci influencing cross-sectional lung function, but less is known about genes influencing longitudinal change in lung function. METHODS: We performed GWAS of the rate of change in forced expiratory volume in the first second (FEV1) in 14 longitudinal, population-based cohort studies comprising 27,249 adults of European ancestry using linear mixed effects models and combined cohort-specific results using fixed effect meta-analysis to identify novel genetic loci associated with longitudinal change in lung function. Gene expression analyses were subsequently performed for identified genetic loci. As a secondary aim, we estimated the mean rate of decline in FEV1 by smoking pattern, irrespective of genotypes, across these 14 studies using meta-analysis. RESULTS: The overall meta-analysis produced suggestive evidence for association at the novel IL16/STARD5/TMC3 locus on chromosome 15 (P = 5.71 × 10^-7). In addition, meta-analysis using the five cohorts with ≥3 FEV1 measurements per participant identified the novel ME3 locus on chromosome 11 (P = 2.18 × 10^-8) at genome-wide significance. Neither locus was associated with FEV1 decline in two additional cohort studies. We confirmed gene expression of IL16, STARD5, and ME3 in multiple lung tissues. Publicly available microarray data confirmed differential expression of all three genes in lung samples from COPD patients compared with controls. Irrespective of genotypes, the combined estimate for FEV1 decline was 26.9, 29.2 and 35.7 mL/year in never, former, and persistent smokers, respectively. CONCLUSIONS: In this large-scale GWAS, we identified two novel genetic loci in association with the rate of change in FEV1 that harbor candidate genes with biologically plausible functional links to lung function.
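The abstract above combines cohort-specific results by fixed effect meta-analysis without giving the formula. The sketch below assumes the common inverse-variance weighted form, in which each cohort's effect estimate is weighted by the reciprocal of its squared standard error. The function name `fixed_effect_meta` and the example numbers are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float],
                      ses: Sequence[float]) -> Tuple[float, float, float, float]:
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-cohort effect estimates (e.g. mL/year per allele)
    ses:   per-cohort standard errors of those estimates
    Returns (pooled beta, pooled SE, z statistic, two-sided p-value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]  # precision of each cohort
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical example: three cohorts with made-up estimates.
pooled_beta, pooled_se, z_stat, p_value = fixed_effect_meta(
    [-2.0, -1.5, -2.5], [0.8, 1.0, 1.2]
)
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why large cohorts carry most of the weight in this kind of analysis.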

    Clique-Finding for Heterogeneity and Multidimensionality in Biomarker Epidemiology Research: The CHAMBER Algorithm

    Commonly occurring disease etiology may involve complex combinations of genes and exposures resulting in etiologic heterogeneity. We present a computational algorithm that employs clique-finding for heterogeneity and multidimensionality in biomedical and epidemiological research (the "CHAMBER" algorithm). This algorithm uses graph-building to (1) identify genetic variants that influence disease risk and (2) predict individuals at risk for disease based on inherited genotype. We use a set-covering algorithm to identify optimal cliques and a Boolean function that identifies etiologically heterogeneous groups of individuals. We evaluated this approach using simulated case-control genotype-disease associations involving two- and four-gene patterns. The CHAMBER algorithm correctly identified these simulated etiologies. We also used two population-based case-control studies of breast and endometrial cancer in African American and Caucasian women considering data on genotypes involved in steroid hormone metabolism. We identified novel patterns in both cancer sites that involved genes that sulfate or glucuronidate estrogens or catecholestrogens. These associations were consistent with the hypothesized biological functions of these genes. We also identified cliques representing the joint effect of multiple candidate genes in all groups, suggesting the existence of biologically plausible combinations of hormone metabolism genes in both breast and endometrial cancer in both races. The CHAMBER algorithm may have utility in exploring the multifactorial etiology and etiologic heterogeneity in complex disease.

    The pharmacogenomics of inhaled corticosteroids and lung function decline in COPD.

    Inhaled corticosteroids (ICS) are widely prescribed for patients with chronic obstructive pulmonary disease (COPD), yet have variable outcomes and adverse reactions, which may be genetically determined. The primary aim of the study was to identify the genetic determinants for forced expiratory volume in 1 s (FEV1) changes related to ICS therapy. In the Lung Health Study (LHS)-2, 1116 COPD patients were randomised to the ICS triamcinolone acetonide (n=559) or placebo (n=557) with spirometry performed every 6 months for 3 years. We performed a pharmacogenomic genome-wide association study for the genotype-by-ICS treatment effect on 3 years of FEV1 changes (estimated as slope) in 802 genotyped LHS-2 participants. Replication was performed in 199 COPD patients randomised to the ICS fluticasone or placebo. A total of five loci showed genotype-by-ICS interaction at p<5×10^-6; of these, single nucleotide polymorphism (SNP) rs111720447 on chromosome 7 was replicated (discovery p=4.8×10^-6, replication p=5.9×10^-5) with the same direction of interaction effect. ENCODE (Encyclopedia of DNA Elements) data revealed that in the glucocorticoid-treated (dexamethasone) A549 alveolar cell line, glucocorticoid receptor binding sites were located near SNP rs111720447. In stratified analyses of LHS-2, genotype at SNP rs111720447 was significantly associated with rate of FEV1 decline in patients taking ICS (C allele β 56.36 mL·year^-1, 95% CI 29.96 to 82.76 mL·year^-1) and in patients who were assigned to placebo, although the relationship was weaker and in the opposite direction to that in the ICS group (C allele β -27.57 mL·year^-1, 95% CI -53.27 to -1.87 mL·year^-1). The study uncovered genetic factors associated with FEV1 changes related to ICS in COPD patients, which may provide new insight on the potential biology of steroid responsiveness in COPD.