16 research outputs found

    Patterns of polymorphism and linkage disequilibrium in cultivated barley

    We carried out a genome-wide analysis of polymorphism (4,596 SNP loci across 190 elite cultivated accessions) chosen to represent the available genetic variation in current elite North West European and North American barley germplasm. Population sub-structure, patterns of diversity and linkage disequilibrium varied considerably across the seven barley chromosomes. Gene-rich and rarely recombining haplotype blocks that may represent up to 60% of the physical length of barley chromosomes extended across the ‘genetic centromeres’. By positioning 2,132 bi-parentally mapped SNP markers with minimum allele frequencies higher than 0.10 by association mapping, 87.3% were located to within 5 cM of their original genetic map position. We show that at this current marker density genetically diverse populations of relatively small size are sufficient to fine map simple traits, providing they are not strongly stratified within the sample, fall outside the genetic centromeres and population sub-structure is effectively controlled in the analysis. Our results have important implications for association mapping, positional cloning, physical mapping and practical plant breeding in barley and other major world cereals including wheat and rye that exhibit comparable genome and genetic features.

    Regional heritability mapping method helps explain missing heritability of blood lipid traits in isolated populations

    Single single-nucleotide polymorphism (SNP) genome-wide association studies (SSGWAS) may fail to identify loci with modest effects on a trait. The recently developed regional heritability mapping (RHM) method can potentially identify such loci. In this study, RHM was compared with the SSGWAS for blood lipid traits (high-density lipoprotein (HDL), low-density lipoprotein (LDL), plasma concentrations of total cholesterol (TC) and triglycerides (TG)). Data comprised 2246 adults from isolated populations genotyped using ∼300 000 SNP arrays. The results were compared with large meta-analyses of these traits for validation. Using RHM, two significant regions affecting HDL on chromosomes 15 and 16 and one affecting LDL on chromosome 19 were identified. These regions covered the most significant SNPs associated with HDL and LDL from the meta-analysis. The chromosome 19 region was identified in our data despite the fact that the most significant SNP in the meta-analysis (or any SNP tagging it) was not genotyped in our SNP array. The SSGWAS identified one SNP associated with HDL on chromosome 16 (the top meta-analysis SNP) and one on chromosome 10 (not reported by RHM or in the meta-analysis and hence possibly a false positive association). The results further confirm that RHM can have better power than SSGWAS in detecting causal regions including regions containing crucial ungenotyped variants. This study suggests that RHM can be a useful tool to explain some of the ‘missing heritability’ of complex trait variation.

    Constructing endophenotypes of complex disease using non-negative matrix factorization and adjusted rand index

    Complex diseases are typically caused by combinations of molecular disturbances that vary widely among different patients. Endophenotypes, a combination of genetic factors associated with a disease, offer a simplified approach to dissect complex traits by reducing genetic heterogeneity. Because molecular dissimilarities often exist between patients with indistinguishable disease symptoms, these unique molecular features may reflect pathogenic heterogeneity. To detect molecular dissimilarities among patients and reduce the complexity of high-dimension data, we have explored an endophenotype-identification analytical procedure that combines non-negative matrix factorization (NMF) and adjusted Rand index (ARI), a measure of the similarity of two clusterings of a data set. To evaluate this procedure, we compared it with a commonly used method, principal component analysis with k-means clustering (PCA-K). A simulation study with a gene expression dataset and genotype information was conducted to examine the performance of our procedure and PCA-K. The results showed that NMF mostly outperformed PCA-K. Additionally, we applied our endophenotype-identification analytical procedure to a publicly available dataset containing data derived from patients with late-onset Alzheimer’s disease (LOAD). NMF distilled information associated with 1,116 transcripts into three metagenes and three molecular subtypes (MS) for patients in the LOAD dataset: MS1, MS2, and MS3. ARI was then used to determine the most representative transcripts for each metagene; 123, 89, and 71 metagene-specific transcripts were identified for MS1, MS2, and MS3, respectively. These metagene-specific transcripts were identified as the endophenotypes. Our results showed that 14, 38, 0, and 28 candidate susceptibility genes listed in the AlzGene database were found for all patients, MS1, MS2, and MS3, respectively. Moreover, we found that MS2 might be a normal-like subtype. Our proposed procedure provides an alternative approach to investigate the pathogenic mechanism of disease and better understand the relationship between phenotype and genotype.