Copy number variants (CNVs) in humans can have both adaptive and deleterious effects. Because of their phenotypic significance, researchers have attempted to find single nucleotide polymorphisms (SNPs) in high linkage disequilibrium (LD) with CNVs for use in genome-wide association studies. However, studies have found that CNVs are less likely to be in strong LD with flanking markers. We hypothesized that this “taggability gap” can be explained by duplication events that place paralogous sequences far apart. In support of our hypothesis, we find that duplications are significantly less likely than deletions to have a “tag” SNP, even after controlling for CNV length, allele frequency, and availability of appropriate flanking SNPs. Using a novel likelihood method, we show that many complex CNVs (those due to multiple duplication or deletion polymorphisms) are made up of two loci with little LD between them. Additionally, we find that many polymorphic duplications detected in a recent clone-based study are located far from their parental loci. We also examine two other common hypotheses for the taggability gap: we find that recurrent mutation of both deletions and duplications appears to affect LD, but that lower SNP density around CNVs has no effect. Overall, our results suggest that a substantial fraction of CNVs caused by duplication cannot be tagged by markers flanking the parental locus because they have changed genomic location.
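To make the notion of a “tag” SNP concrete, the following minimal sketch computes the standard r² measure of LD between a biallelic CNV and each flanking SNP from phased haplotypes, then reports the best SNP if it meets a threshold. This illustrates the general idea only and is not the likelihood method described above. The function names, the 0/1 haplotype encoding, the SNP labels, and the 0.8 cutoff (a common convention) are all assumptions made for the example.

```python
from typing import Dict, Optional, Sequence, Tuple


def ld_r2(locus_a: Sequence[int], locus_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic loci given phased haplotypes.

    Each sequence holds one 0/1 allele per haplotype, in the same order.
    For a CNV, 1 could mean "carries the deletion/duplication" (assumed encoding).
    """
    if len(locus_a) != len(locus_b) or len(locus_a) == 0:
        raise ValueError("loci must have the same non-zero number of haplotypes")
    n = len(locus_a)
    p_a = sum(locus_a) / n                      # frequency of allele 1 at locus A
    p_b = sum(locus_b) / n                      # frequency of allele 1 at locus B
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n  # 1-1 haplotype frequency
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                              # a monomorphic locus carries no LD signal
    d = p_ab - p_a * p_b                        # coefficient of linkage disequilibrium D
    return d * d / denom


def best_tag(
    cnv: Sequence[int],
    flanking_snps: Dict[str, Sequence[int]],
    threshold: float = 0.8,                     # assumed cutoff, not taken from the paper
) -> Optional[Tuple[str, float]]:
    """Return (snp_name, r2) for the best flanking SNP, or None if none reaches threshold."""
    best: Optional[Tuple[str, float]] = None
    for name, haplotypes in flanking_snps.items():
        r2 = ld_r2(cnv, haplotypes)
        if best is None or r2 > best[1]:
            best = (name, r2)
    if best is None or best[1] < threshold:
        return None
    return best


if __name__ == "__main__":
    # Hypothetical data: eight phased haplotypes, two made-up SNP labels.
    cnv = [1, 1, 0, 0, 1, 0, 0, 0]
    snps = {
        "snp_linked": [1, 1, 0, 0, 1, 0, 0, 0],    # perfectly correlated with the CNV
        "snp_unlinked": [1, 0, 1, 0, 1, 0, 1, 0],  # nearly independent of the CNV
    }
    print(best_tag(cnv, snps))
```

Under this framing, a duplication whose new copy has moved far from the parental locus would tend to show low r² with every SNP flanking the parental locus, so `best_tag` would return `None`.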