206 research outputs found

    The effect of minor allele frequency on the likelihood of obtaining false positives

    Determining the most promising single-nucleotide polymorphisms (SNPs) is a challenge in genome-wide association studies, in which hundreds of thousands of association tests are conducted. The power to detect genetic effects depends on minor allele frequency (MAF), and the SNP arrays used in genome-wide association studies include SNPs with a wide distribution of MAFs. Therefore, it is critical to understand MAF's effect on the false positive rate.

    A unified framework for multi-locus association analysis of both common and rare variants

    Background: Common, complex diseases are hypothesized to result from a combination of common and rare genetic variants. We developed a unified framework for the joint association testing of both types of variants. Within the framework, we developed a union-intersection test suitable for genome-wide analysis of single nucleotide polymorphisms (SNPs), candidate gene data, as well as medical sequencing data. The union-intersection test is a composite test of association of genotype frequencies and differential correlation among markers. Results: We demonstrated by computer simulation that the false positive error rate was controlled at the expected level. We also demonstrated scenarios in which the multi-locus test was more powerful than traditional single marker analysis. To illustrate use of the union-intersection test with real data, we analyzed a publicly available data set of 319,813 autosomal SNPs genotyped for 938 cases of Parkinson disease and 863 neurologically normal controls for which no genome-wide significant results were found by traditional single marker analysis. We also analyzed an independent follow-up sample of 183 cases and 248 controls for replication. Conclusions: We identified a single risk haplotype with a directionally consistent effect in both samples in the gene GAK, which is involved in clathrin-mediated membrane trafficking. We also found suggestive evidence that directionally inconsistent marginal effects from single marker analysis appeared to result from risk being driven by different haplotypes in the two samples for the genes SYN3 and NGLY1, which are involved in neurotransmitter release and proteasomal degradation, respectively. These results illustrate the utility of our unified framework for genome-wide association analysis of common, complex diseases.

    Cubic exact solutions for the estimation of pairwise haplotype frequencies: implications for linkage disequilibrium analyses and a web tool 'CubeX'

    Background: The frequency of a haplotype comprising one allele at each of two loci can be expressed as a cubic equation (the 'Hill equation'), the solution of which gives that frequency. Most haplotype and linkage disequilibrium analysis programs use iteration-based algorithms which substitute an estimate of haplotype frequency into the equation, producing a new estimate which is repeatedly fed back into the equation until the values converge to a maximum likelihood estimate (expectation-maximisation). Results: We present a program, "CubeX", which calculates the biologically possible exact solution(s) and provides estimated haplotype frequencies, D', r² and χ² values for each. CubeX provides a "complete" analysis of haplotype frequencies and linkage disequilibrium for a pair of biallelic markers under situations where sampling variation and genotyping errors distort sample Hardy-Weinberg equilibrium, potentially causing more than one biologically possible solution. We also present an analysis of simulations and real data using the algebraically exact solution, which indicates that under perfect sample Hardy-Weinberg equilibrium there is only one biologically possible solution, but that under other conditions there may be more. Conclusion: Our analyses demonstrate that lower allele frequencies, lower sample numbers, population stratification and a possible |D'| value of 1 are particularly susceptible to distortion of sample Hardy-Weinberg equilibrium, which has significant implications for calculation of linkage disequilibrium in small sample sizes (e.g. HapMap) and rarer alleles (e.g. paucimorphisms, q < 0.05) that may have particular disease relevance and require improved approaches for meaningful evaluation.
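
    The iteration-based approach that the abstract contrasts with CubeX can be sketched as follows. This is a minimal illustration, not the CubeX program and not its exact cubic solver: the function names, the 3x3 genotype-count layout and the starting value (linkage equilibrium) are assumptions made for the example. Because it starts from a single point, it converges to only one solution even when several biologically possible ones exist.

```python
def em_haplotype_freq(n, tol=1e-12, max_iter=1000):
    """Iteratively estimate the AB haplotype frequency for two biallelic loci.

    n[i][j] is the number of individuals carrying i copies of allele A
    (locus 1) and j copies of allele B (locus 2); this layout is an
    assumption of the sketch.
    """
    hap_total = 2 * sum(sum(row) for row in n)
    # Allele frequencies are fixed by the genotype counts.
    p_a = (2 * sum(n[2]) + sum(n[1])) / hap_total
    p_b = (2 * sum(row[2] for row in n) + sum(row[1] for row in n)) / hap_total
    # AB haplotypes that can be counted without ambiguity.
    known_ab = 2 * n[2][2] + n[2][1] + n[1][2]
    # Double heterozygotes are either AB/ab or Ab/aB; their phase is unknown.
    double_het = n[1][1]

    f_ab = p_a * p_b  # start from linkage equilibrium
    for _ in range(max_iter):
        cis = f_ab * (1 - p_a - p_b + f_ab)    # f(AB) * f(ab)
        trans = (p_a - f_ab) * (p_b - f_ab)    # f(Ab) * f(aB)
        frac = cis / (cis + trans) if cis + trans > 0 else 0.5
        new_f_ab = (known_ab + double_het * frac) / hap_total
        converged = abs(new_f_ab - f_ab) < tol
        f_ab = new_f_ab
        if converged:
            break
    return f_ab, p_a, p_b


def ld_stats(f_ab, p_a, p_b):
    """Return D, D' and r^2 for polymorphic loci (0 < p_a, p_b < 1)."""
    d = f_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical counts: only AB/AB, AB/ab and ab/ab individuals.
    counts = [[25, 0, 0], [0, 50, 0], [0, 0, 25]]
    f_ab, p_a, p_b = em_haplotype_freq(counts)
    print(f_ab, ld_stats(f_ab, p_a, p_b))
```

    Each pass substitutes the current estimate into the update and feeds the result back in, which is the loop the abstract describes; the fixed points of that update are the roots of the cubic.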

    Accuracy of Predicting the Genetic Risk of Disease Using a Genome-Wide Approach

    Background - The prediction of the genetic disease risk of an individual is a powerful public health tool. While predicting risk has been successful in diseases which follow simple Mendelian inheritance, it has proven challenging in complex diseases for which a large number of loci contribute to the genetic variance. The large numbers of single nucleotide polymorphisms now available provide new opportunities for predicting genetic risk of complex diseases with high accuracy. Methodology/Principal Findings - We have derived simple deterministic formulae to predict the accuracy of predicted genetic risk from population or case control studies using a genome-wide approach and assuming a dichotomous disease phenotype with an underlying continuous liability. We show that the prediction equations are special cases of the more general problem of predicting the accuracy of estimates of genetic values of a continuous phenotype. Our predictive equations are responsive to all parameters that affect accuracy and they are independent of allele frequency and effect distributions. Deterministic prediction errors when tested by simulation were generally small. The common link among the expressions for accuracy is that they are best summarized as the product of the ratio of number of phenotypic records per number of risk loci and the observed heritability. Conclusions/Significance - This study advances the understanding of the relative power of case control and population studies of disease. The predictions represent an upper bound of accuracy which may be achievable with improved effect estimation methods. The formulae derived will help researchers determine an appropriate sample size to attain a certain accuracy when predicting genetic risk.
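
    The abstract does not state the formulae themselves. As a rough illustration of how a summary quantity of the kind it describes (records per risk locus multiplied by heritability) could map to an accuracy between 0 and 1, the sketch below assumes the form r = sqrt(x / (x + 1)) with x = (N / M) * h². That functional form, the function name and the parameter names are assumptions for this example and should be checked against the paper before use.

```python
import math


def predicted_accuracy(n_records, n_loci, h2):
    """Assumed accuracy form r = sqrt(x / (x + 1)), x = (n_records / n_loci) * h2.

    n_records: number of phenotypic records (N)
    n_loci:    number of independent risk loci (M)
    h2:        observed heritability, between 0 and 1
    """
    if n_records < 0 or n_loci <= 0 or not 0.0 <= h2 <= 1.0:
        raise ValueError("expected n_records >= 0, n_loci > 0 and 0 <= h2 <= 1")
    x = (n_records / n_loci) * h2
    return math.sqrt(x / (x + 1.0))


if __name__ == "__main__":
    # Hypothetical inputs: accuracy rises with sample size at fixed n_loci and h2.
    for n in (1_000, 10_000, 100_000):
        print(n, round(predicted_accuracy(n, n_loci=1_000, h2=0.5), 3))
```

    Under this assumed form, accuracy depends on sample size, number of loci and heritability only through their combined product, so halving heritability has the same effect as halving the number of records.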

    Detection of regulator genes and eQTLs in gene networks

    Genetic differences between individuals associated with quantitative phenotypic traits, including disease states, are usually found in non-coding genomic regions. These genetic variants are often also associated with differences in expression levels of nearby genes (they are "expression quantitative trait loci" or eQTLs for short) and presumably play a gene regulatory role, affecting the status of molecular networks of interacting genes, proteins and metabolites. Computational systems biology approaches to reconstruct causal gene networks from large-scale omics data have therefore become essential to understand the structure of networks controlled by eQTLs together with other regulatory genes, and to generate detailed hypotheses about the molecular mechanisms that lead from genotype to phenotype. Here we review the main analytical methods and software to identify eQTLs and their associated genes, to reconstruct co-expression networks and modules, to reconstruct causal Bayesian gene and module networks, and to validate predicted networks in silico. Comment: minor revision with typos corrected; review article; 24 pages, 2 figures.

    Polymorphisms on SSC15q21-q26 Containing QTL for reproduction in Swine and its association with litter size

    Several quantitative trait loci (QTL) for important reproductive traits (ovulation rate) have been identified on porcine chromosome 15 (SSC15). To assist in the selection of positional candidate swine genes for these QTL on SSC15, twenty-one genes had already been assigned to SSC15 in a previous study in our lab, using the radiation hybrid panel IMpRH. Further polymorphism studies were carried out on these positional candidate genes in four breeds of pigs (Duroc, Erhualian, Dahuabai and Landrace) that differ significantly in reproduction traits. A total of nineteen polymorphisms were found in the 21 genes. Among these, seven polymorphisms in six genes were used for association studies, in which an NRP2 polymorphism was found to be significantly (p < 0.05) associated with litter-size traits. NRP2 might be a candidate gene for pig litter size based on its chromosome location (Du et al., 2006), its significant association with litter-size traits and its relationships with the Sema and VEGF superfamilies.

    Polymorphisms of XRCC4 are involved in reduced colorectal cancer risk in Chinese schizophrenia patients

    Background: Genetic factors related to the regulation of apoptosis in schizophrenia patients may be involved in a reduced vulnerability to cancer. XRCC4 is one of the potential candidate genes associated with schizophrenia which might induce colorectal cancer resistance. Methods: To examine the genetic association between colorectal cancer and schizophrenia, we analyzed five SNPs (rs6452526, rs2662238, rs963248, rs35268, rs2386275) covering ~205.7 kb in the region of XRCC4. Results: We observed that two of the five genetic polymorphisms showed statistically significant differences between 312 colorectal cancer subjects without schizophrenia and 270 schizophrenia subjects (rs6452536, p = 0.004, OR 0.61, 95% CI 0.44-0.86; rs35268, p = 0.028, OR 1.54, 95% CI 1.05-2.26). Moreover, the haplotype which combined all five markers was the most significant, giving a global p = 0.0005. Conclusions: Our data indicate for the first time that XRCC4 may be a potential protective gene towards schizophrenia, conferring reduced susceptibility to colorectal cancer in the Han Chinese population.

    Patterns of polymorphism and linkage disequilibrium in cultivated barley

    We carried out a genome-wide analysis of polymorphism (4,596 SNP loci across 190 elite cultivated accessions) chosen to represent the available genetic variation in current elite North West European and North American barley germplasm. Population sub-structure, patterns of diversity and linkage disequilibrium varied considerably across the seven barley chromosomes. Gene-rich and rarely recombining haplotype blocks that may represent up to 60% of the physical length of barley chromosomes extended across the ‘genetic centromeres’. By positioning 2,132 bi-parentally mapped SNP markers with minimum allele frequencies higher than 0.10 by association mapping, 87.3% were located to within 5 cM of their original genetic map position. We show that at this current marker density genetically diverse populations of relatively small size are sufficient to fine map simple traits, provided they are not strongly stratified within the sample, fall outside the genetic centromeres and population sub-structure is effectively controlled in the analysis. Our results have important implications for association mapping, positional cloning, physical mapping and practical plant breeding in barley and other major world cereals including wheat and rye that exhibit comparable genome and genetic features.