A similarity matrix and its application in genomic selection for hedging haplotype diversity

Abstract

Mendelian sampling variance (MSV) has many breeding applications, but the computational cost of calculating it limits its widespread use. Recently proposed selection indices for long-term genetic gain combine genomic estimated breeding values with MSV. Under high selection intensity, however, these indices tend to select similar parents with high MSV potential, which results in the loss of favorable haplotypes. This thesis therefore aimed to develop a faster approach for computing MSV and to derive a similarity matrix for hedging haplotype diversity. The thesis first develops an efficient approach for computing MSV from marker effects, a genetic map, and phased genotypes. Using the same information, it then derives a similarity matrix. The off-diagonal elements of this matrix represent the similarities between parental haplotypes, and each diagonal element represents the similarity of a parent to itself, which equals its MSV. A high similarity indicates that two parents share many heterozygous markers with large effects on a trait in the same linkage phase. Just as covariance matrices of asset prices are used in finance to create diversified portfolios, the similarity matrix can help avoid repeated matings of similar parents and achieve the expected genetic gain while hedging haplotype diversity in the next generation. The thesis then develops the Python package PyMSQ for computing MSV and the similarity matrix to facilitate their use in breeding programs. Compared to gamevar (a recently published Fortran program), PyMSQ was up to 240 times faster at computing MSV in the analyzed data sets. Finally, similarity matrices for milk production and longevity traits were calculated with PyMSQ for a large German Holstein population to assess their applicability, relevance, and influencing factors. The similarity matrix presented in this thesis introduces new criteria for genomic selection, allowing for increased genetic gain while hedging haplotype diversity in breeding programs.
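
The relationship between MSV and the similarity matrix described above can be illustrated with a minimal sketch. It is not PyMSQ's API: the function names are hypothetical, and it assumes a single chromosome, 0/1 allele coding, marker positions in Morgans, and Haldane's mapping function, so that the covariance between the transmission indicators of two markers is exp(-2d)/4 for map distance d. Under these assumptions, each parent is summarized by a vector of marker effects signed by linkage phase (zero at homozygous markers), and the similarity of two parents is a quadratic form in those vectors.

```python
import numpy as np


def gametic_covariance(pos_morgan):
    """Covariance of allele transmission between markers on one chromosome.

    Assumes Haldane's mapping function: r = (1 - exp(-2d)) / 2,
    hence (1 - 2r) / 4 = exp(-2d) / 4 for map distance d in Morgans.
    """
    pos = np.asarray(pos_morgan, dtype=float)
    dist = np.abs(pos[:, None] - pos[None, :])
    return 0.25 * np.exp(-2.0 * dist)


def phase_signed_effects(hap1, hap2, effects):
    """Marker effects signed by phase; zero where a parent is homozygous."""
    h1 = np.asarray(hap1, dtype=float)
    h2 = np.asarray(hap2, dtype=float)
    return (h1 - h2) * np.asarray(effects, dtype=float)


def similarity_matrix(hap1, hap2, effects, pos_morgan):
    """Parents x parents matrix; the diagonal holds each parent's MSV."""
    R = gametic_covariance(pos_morgan)
    M = phase_signed_effects(hap1, hap2, effects)
    return M @ R @ M.T


# Toy data (illustrative only): three parents, two completely linked markers.
hap1 = np.array([[1, 1],   # parent 0: both favorable alleles on haplotype 1
                 [1, 0],   # parent 1: favorable alleles in repulsion
                 [1, 0]])  # parent 2: heterozygous at the first marker only
hap2 = np.array([[0, 0],
                 [0, 1],
                 [0, 0]])
effects = np.array([1.0, 1.0])
pos = np.array([0.0, 0.0])

S = similarity_matrix(hap1, hap2, effects, pos)
msv = np.diag(S)
```

In this toy case, parent 0 transmits either both favorable alleles or neither, so its MSV is the largest of the three; parent 1 transmits exactly one favorable allele in every gamete, so its MSV is zero. The off-diagonal entry for parents 0 and 2 is positive because they share a heterozygous marker in the same phase.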
