Identity-by-descent (IBD) is a fundamental concept in genetics with many
applications. In a common definition, two haplotypes are said to contain an IBD
segment if they share a segment that is inherited from a recent shared common
ancestor without intervening recombination. Long IBD segments (> 1 cM) can be
efficiently detected by a number of algorithms using high-density SNP array
data from a population sample. However, these approaches detect IBD based on
contiguous segments of identity-by-state, and such segments may exist due to
the conflation of smaller, nearby IBD segments. We quantified this effect using
coalescent simulations, finding that nearly 40% of inferred segments 1-2 cM long
result from the conflation of two or more shorter segments under demographic
scenarios typical for modern humans. This biases the inferred IBD segment
length distribution, and so can affect downstream inferences. We observed this
conflation effect universally across different IBD detection programs and human
demographic histories, and found inference of segments longer than 2 cM to be
much more reliable (less than 5% conflation rate). As an example of how this
can negatively affect downstream analyses, we present and analyze a novel
estimator of the de novo mutation rate using IBD segments, and demonstrate that
the biased length distribution of the IBD segments due to conflation can lead
to inflated estimates if the conflation is not modeled. Understanding the
conflation effect in detail will make its correction in future methods more
tractable.
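
To make the conflation mechanism concrete, the following is a minimal toy sketch,
not the simulation or detection pipeline used in this work. It assumes that a
detector reports two true IBD segments as one whenever the gap between them is
at most a threshold (the hypothetical parameter max_gap_cm), standing in for a
gap too short to contain a discordant SNP on an array. The function names, the
threshold, and the example coordinates are all illustrative assumptions.

```python
from typing import List, Tuple

Segment = Tuple[float, float]        # true IBD segment: (start_cM, end_cM)
Detected = Tuple[float, float, int]  # (start_cM, end_cM, n_true_segments_merged)


def conflate(true_segments: List[Segment], max_gap_cm: float) -> List[Detected]:
    """Merge true segments whose separating gap is <= max_gap_cm.

    Assumption: a short gap is indistinguishable from IBD in identity-by-state
    data, so the detector reports one contiguous segment.
    """
    merged: List[Detected] = []
    for start, end in sorted(true_segments):
        if merged and start - merged[-1][1] <= max_gap_cm:
            prev_start, prev_end, n = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), n + 1)
        else:
            merged.append((start, end, 1))
    return merged


def conflation_rate(detected: List[Detected], min_len: float, max_len: float) -> float:
    """Fraction of detected segments with length in [min_len, max_len)
    that are built from two or more true segments."""
    in_bin = [d for d in detected if min_len <= d[1] - d[0] < max_len]
    if not in_bin:
        return 0.0
    return sum(1 for d in in_bin if d[2] > 1) / len(in_bin)


# Hypothetical true segments on one haplotype pair (positions in cM).
true_segments = [(0.0, 0.7), (0.75, 1.4), (5.0, 6.5), (10.0, 10.4)]
detected = conflate(true_segments, max_gap_cm=0.1)
rate_1_to_2 = conflation_rate(detected, 1.0, 2.0)
```

In this toy input, the two sub-1 cM segments at 0.0-0.7 and 0.75-1.4 are reported
as a single 1.4 cM segment, so one of the two detected segments in the 1-2 cM bin
is a conflation, which shifts the apparent length distribution toward longer
segments.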