
Clustering and principal-components approach based on heritability for mapping multiple gene expressions

By Yuanjia Wang, Yixin Fang and Shuang Wang


When the number of phenotypes in a genetic study is on the scale of thousands, as in studies of thousands of gene expression levels, single-trait analysis is computationally intensive and requires heavy adjustment for multiple comparisons. Traditional multivariate genetic linkage analysis for quantitative traits focuses on mapping only a few phenotypes and is not feasible for a large number of traits. To cope with high-dimensional phenotype data, clustering analysis and principal-component analysis (PCA) have been proposed to reduce the data dimensionality and to map shared genetic contributions for multiple traits. However, standard clustering analysis and PCA are applicable only to independent observations. In most genetic studies, where family data are collected, these standard analyses can be applied only to founders, which can lead to a loss of information. Here, we proposed a clustering method that can exploit family structure information and applied the method to 29 gene expression levels mapped to a reported hot spot on chromosome 14. We then used a PCA approach based on heritability, applicable to a small number of traits, to combine phenotypes in the clusters. Lastly, we used a penalized PCA approach based on heritability, applicable to an arbitrary number of traits, to combine the 150 gene expression levels with the highest heritability. Genome-wide multipoint linkage analysis was carried out on the individual traits and on the combined traits. Two previously reported peaks on chromosomes 14 and 20 were identified. Linkage evidence was stronger for traits derived from methods that incorporate family structure information.
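
As a rough illustration of the idea behind a heritability-based principal component, the sketch below finds the linear combination of traits with the largest ratio of genetic variance to total variance, with an optional ridge term added to the denominator for the penalized variant. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `heritability_pc`, the `ridge` parameter, the exact form of the penalty, and the toy covariance matrices are all hypothetical, and in a real analysis the genetic and phenotypic covariance matrices would be estimated from family data.

```python
import numpy as np


def heritability_pc(sigma_g, sigma_p, ridge=0.0):
    """Return (w, ratio), where w maximizes
    w' sigma_g w / (w' sigma_p w + ridge * w'w).

    sigma_g: assumed genetic covariance matrix of the traits.
    sigma_p: assumed total phenotypic covariance matrix (positive definite).
    ridge:   hypothetical ridge penalty; 0 gives the unpenalized criterion.
    """
    sigma_g = np.asarray(sigma_g, dtype=float)
    sigma_p = np.asarray(sigma_p, dtype=float)
    denom = sigma_p + ridge * np.eye(sigma_p.shape[0])

    # Cholesky factor of the denominator matrix: denom = L L^T.
    L = np.linalg.cholesky(denom)

    # Reduce the generalized eigenproblem to a standard symmetric one:
    # m = L^{-1} sigma_g L^{-T}.
    tmp = np.linalg.solve(L, sigma_g)
    m = np.linalg.solve(L, tmp.T)

    # eigh returns eigenvalues in ascending order; take the largest.
    _, vecs = np.linalg.eigh(m)
    v = vecs[:, -1]

    # Map back to trait weights (w = L^{-T} v) and scale to unit length.
    w = np.linalg.solve(L.T, v)
    w = w / np.linalg.norm(w)

    ratio = (w @ sigma_g @ w) / (w @ denom @ w)
    return w, ratio


# Toy covariance matrices for three traits (illustrative values only).
sigma_g = np.array([[0.5, 0.3, 0.0],
                    [0.3, 0.4, 0.0],
                    [0.0, 0.0, 0.1]])
sigma_p = np.array([[1.0, 0.4, 0.1],
                    [0.4, 1.0, 0.1],
                    [0.1, 0.1, 1.0]])

weights, h2 = heritability_pc(sigma_g, sigma_p)
weights_pen, h2_pen = heritability_pc(sigma_g, sigma_p, ridge=0.5)
print("unpenalized weights:", weights, "ratio:", h2)
print("penalized weights:  ", weights_pen, "ratio:", h2_pen)
```

Without the penalty, the combined trait is at least as heritable as any single trait under this criterion, because each single trait is itself a candidate combination. The ridge term keeps the problem well conditioned when the number of traits is large relative to the sample size, which is the situation the penalized approach is meant for.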

Topics: Proceedings
Publisher: BioMed Central
Provided by: PubMed Central

