Abstract

Localization of disease susceptibility genes to certain areas in the human genome, or gene mapping, requires careful analysis of genetic marker data. Gene mapping is often carried out using a sample of individuals affected by the disease of interest and a sample of healthy controls. From a data mining perspective, gene mapping can then be cast as a pattern discovery and analysis task: which genetically motivated marker patterns help to separate affected individuals from healthy ones? The marker data consist of haplotypes: a haplotype is a string of genetic markers from one chromosome. Individuals who share a common ancestor, such as those who have inherited the disease gene from that ancestor, potentially share a substring in their haplotypes. Classification or association analysis of haplotypes is thus one approach to gene mapping. Further, analyzing the similarities of haplotypes and clustering them can provide insight into the genetic relationships of individuals, into different mutations, and thus into the genetic etiology of the disease. We describe and illustrate data mining approaches to gene mapping using haplotypes: association analysis, similarity analysis, and clustering. The association-based gene mapping methods have been found to perform well and are being routinely applied in gene mapping projects.
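
To make the association idea concrete, the following is a minimal sketch, not the method described in this work. It assumes haplotypes are equal-length strings of allele codes, takes a "pattern" to be a contiguous run of alleles at a fixed marker position, and scores each pattern with a plain Pearson chi-squared statistic on its case/control counts. The function names, the `max_len` parameter, and the toy data are all hypothetical.

```python
def contains(haplotype, start, pattern):
    """True if the haplotype carries `pattern` starting at marker index `start`."""
    return tuple(haplotype[start:start + len(pattern)]) == tuple(pattern)


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def score_patterns(cases, controls, max_len=3):
    """Score every contiguous marker pattern that occurs in at least one case.

    Returns a dict mapping (start, pattern) to a chi-squared score, where
    `pattern` is a tuple of alleles. Assumption: all haplotypes have the
    same number of markers.
    """
    scores = {}
    n_markers = len(cases[0])
    for start in range(n_markers):
        for length in range(1, min(max_len, n_markers - start) + 1):
            for hap in cases:
                key = (start, tuple(hap[start:start + length]))
                if key in scores:
                    continue
                pattern = key[1]
                a = sum(contains(h, start, pattern) for h in cases)     # cases with pattern
                c = sum(contains(h, start, pattern) for h in controls)  # controls with pattern
                b = len(cases) - a                                      # cases without
                d = len(controls) - c                                   # controls without
                scores[key] = chi_squared_2x2(a, b, c, d)
    return scores


# Toy data (invented for illustration): alleles coded "1" and "2".
cases = ["1221", "1222", "2221", "1212"]
controls = ["1112", "2111", "1211", "2112"]

scores = score_patterns(cases, controls)
best = max(scores, key=scores.get)
```

In this toy example, the highest-scoring pattern points at the marker positions where affected haplotypes most often share a substring. A real analysis would also need to handle missing alleles, multiple testing, and unphased genotypes, none of which this sketch attempts.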
