Data mining for gene mapping

Authors: Evimaria Terzi, Hannu Toivonen, Petteri Hintsanen, Petteri Sevon, Päivi Onkamo
Editors: Jozef Zurada, Medo Kantardzic
Publication venue: IEEE Press

Localization of disease susceptibility genes to certain areas in the human genome, or gene mapping, requires careful analysis of genetic marker data. Gene mapping is often carried out using a sample of individuals affected by the disease of interest and a sample of healthy controls. From a data mining perspective, gene mapping can then be cast as a pattern discovery and analysis task: which genetically motivated marker patterns help to separate affected individuals from healthy ones? The marker data constitute haplotypes: a haplotype is a string of genetic markers from one chromosome. Individuals who share a common ancestor, such as those who have inherited the disease gene from that ancestor, potentially share a substring in their haplotypes. Classification or association analysis of haplotypes is thus one approach to gene mapping. Further, analyzing the similarities of haplotypes and clustering them can provide insight into the genetic relationships of individuals, into different mutations, and thus into the genetic etiology of the disease. We describe and illustrate data mining approaches to gene mapping using haplotypes: association analysis, similarity analysis, and clustering. The association-based gene mapping methods have been found to perform well and are being routinely applied in gene mapping projects.
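
To make the association idea concrete, the following is a minimal sketch and not the authors' method. It assumes haplotypes are encoded as equal-length strings of allele codes, treats a pattern as a contiguous substring at a fixed marker position, and scores it with a plain Pearson chi-square statistic on the case/control counts. The function names, the encoding, and the toy data are all hypothetical.

```python
def carries(haplotype, start, pattern):
    """True if `pattern` occurs in the haplotype beginning at marker index `start`."""
    return haplotype[start:start + len(pattern)] == pattern


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so the table carries no association signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def pattern_association(cases, controls, start, pattern):
    """Score how strongly carrying `pattern` separates cases from controls."""
    a = sum(carries(h, start, pattern) for h in cases)      # cases with pattern
    c = sum(carries(h, start, pattern) for h in controls)   # controls with pattern
    return chi_square_2x2(a, len(cases) - a, c, len(controls) - c)


# Toy haplotypes (assumed encoding: one character per marker allele).
cases = ["12211", "12212", "22211", "12112"]
controls = ["11121", "21122", "12211", "11112"]

score = pattern_association(cases, controls, start=1, pattern="221")
print(score)
```

In this toy data the pattern "221" at marker index 1 is carried by three of four case haplotypes and one of four control haplotypes. A real analysis would scan many positions and pattern lengths and would need to correct for the resulting multiple testing.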

CiteSeerX