
    Simultaneous fine mapping of closely linked epistatic quantitative trait loci using combined linkage disequilibrium and linkage with a general pedigree

    Causal mutations and their intra- and inter-locus interactions play a critical role in complex trait variation. Epistatic quantitative trait loci (QTL) are often difficult to detect, both because linkage analysis requires particular population structures to capture epistatic effects and because main effects are often masked by interaction effects. Mapping their positions is even harder when they are closely linked. The data structure requirement can be relaxed when linkage disequilibrium information is used. We present a mixed linear model nested within an empirical Bayesian approach, which simultaneously takes into account additive, dominance and epistatic effects due to multiple QTL. The covariance structure used in the mixed linear model is based on combined linkage disequilibrium and linkage information. In a simulation study with complex epistatic interactions between QTL, the proposed approach can simultaneously map interacting QTL into a small region. The estimated variance components are accurate and less biased with the proposed approach compared with traditional models.
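
    A rough sketch of this type of model (generic notation, not the authors' exact formulation) is a mixed linear model in which each fitted QTL contributes additive, dominance and additive-by-additive effects, with covariance matrices built from IBD probabilities estimated using combined LD and linkage information:

        \mathbf{y} = \mathbf{X}\boldsymbol{\beta}
                   + \sum_{j} \mathbf{Z}(\mathbf{a}_j + \mathbf{d}_j)
                   + \sum_{j<k} \mathbf{Z}(\mathbf{aa})_{jk} + \mathbf{e},

        \mathbf{a}_j \sim N(\mathbf{0}, \mathbf{G}_j \sigma^2_{a_j}), \quad
        \mathbf{d}_j \sim N(\mathbf{0}, \mathbf{D}_j \sigma^2_{d_j}), \quad
        (\mathbf{aa})_{jk} \sim N(\mathbf{0}, (\mathbf{G}_j \odot \mathbf{G}_k)\,\sigma^2_{aa_{jk}}),

    where G_j and D_j are additive and dominance IBD matrices at the position of QTL j, estimated from combined LD and linkage information, and \odot is the element-wise (Hadamard) product commonly used to construct epistatic covariance matrices.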

    An efficient variance component approach implementing an average information REML suitable for combined LD and linkage mapping with a general complex pedigree

    Variance component (VC) approaches based on restricted maximum likelihood (REML) are an attractive method for positioning quantitative trait loci (QTL). Linkage disequilibrium (LD) information can easily be incorporated into the covariance structure among QTL effects (e.g. a genotype relationship matrix), and the mapping resolution appears to be high. Because LD information is used, the covariance structure becomes much richer and denser than with linkage information alone. This makes an average information (AI) REML algorithm based on mixed model equations and sparse matrix techniques less useful. In addition, (near-)singularity problems often occur at the high marker densities that are common in fine mapping, causing numerical problems in AI-REML based on mixed model equations. The present study investigates the direct use of the variance-covariance matrix of all observations in AI-REML for LD mapping with a general complex pedigree. The method presented is more efficient than the usual approach based on mixed model equations and is robust to numerical problems caused by near-singularity due to closely linked markers. Fitting multiple QTL simultaneously is also feasible with the proposed method, whereas it would drastically increase computing time with mixed model equation-based methods.
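
    For reference, the standard AI-REML quantities written directly in terms of the variance-covariance matrix of the observations V, rather than via the mixed model equations (generic notation, not necessarily the paper's), are:

        \mathbf{V} = \sum_i \sigma^2_i\, \mathbf{Z}_i \mathbf{G}_i \mathbf{Z}_i' + \sigma^2_e \mathbf{I},
        \qquad
        \mathbf{P} = \mathbf{V}^{-1} - \mathbf{V}^{-1}\mathbf{X}(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}'\mathbf{V}^{-1},

        \frac{\partial \log L_R}{\partial \sigma^2_i}
          = -\tfrac{1}{2}\left[\operatorname{tr}\!\left(\mathbf{P}\frac{\partial \mathbf{V}}{\partial \sigma^2_i}\right)
          - \mathbf{y}'\mathbf{P}\frac{\partial \mathbf{V}}{\partial \sigma^2_i}\mathbf{P}\mathbf{y}\right],
        \qquad
        \mathrm{AI}_{ij} = \tfrac{1}{2}\,\mathbf{y}'\mathbf{P}
          \frac{\partial \mathbf{V}}{\partial \sigma^2_i}\mathbf{P}
          \frac{\partial \mathbf{V}}{\partial \sigma^2_j}\mathbf{P}\mathbf{y},

    with the Newton-type update \boldsymbol{\sigma}^{2(t+1)} = \boldsymbol{\sigma}^{2(t)} + \mathrm{AI}^{-1}\,\partial \log L_R / \partial \boldsymbol{\sigma}^2. With LD information the relationship matrix G_i at a fitted QTL position is dense, which is what makes working directly with V attractive relative to sparse mixed-model-equation implementations.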

    Optimization of a crossing system using mate selection

    A simple model based on a single identified quantitative trait locus (QTL) in a two-way crossing system was used to demonstrate the power of mate selection algorithms as a natural means of opportunistic line development for optimizing crossbreeding programs over multiple generations. Mate selection automatically invokes divergent selection in the two parental lines for an over-dominant QTL, and increases the frequency of the favorable allele toward fixation in the sire line for a fully dominant QTL. It was concluded that an optimal strategy of line development can be found by mate selection algorithms for a given set of parameters, such as the genetic model of the QTL, the breeding objective and the initial frequency of the favorable allele in the base populations. The same framework could be used in other scenarios, such as programs involving crossing to exploit breed effects and heterosis. In contrast to classical index selection, this approach to mate selection can optimize long-term responses.
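
    A small worked example of the over-dominant case (illustrative genotypic values, not taken from the paper): code the genotypic values as aa = -a, Aa = d and AA = a with d > a >= 0, and let p_1 and p_2 be the frequencies of the favorable allele A in the two parental lines. The crossbred mean is then

        \bar{P}_{12}(p_1, p_2) = a\,(p_1 + p_2 - 1) + d\,(p_1 + p_2 - 2\,p_1 p_2),

    which is linear in each frequency, so its maximum lies at a corner of the unit square: at (p_1, p_2) = (1, 0) or (0, 1) every crossbred animal is Aa and the mean equals d, whereas fixing both lines for A gives only a < d. This is why mate selection drives the two parental lines apart for an over-dominant QTL.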

    The effect of genomic information on optimal contribution selection in livestock breeding programs

    BACKGROUND: Long-term benefits in animal breeding programs require that increases in genetic merit be balanced against the need to maintain diversity, which is lost through inbreeding. This can be achieved by using optimal contribution selection. The availability of high-density DNA marker information enables the incorporation of genomic data into optimal contribution selection, but this raises the question of how this information affects the balance between genetic merit and diversity. METHODS: The effect of using genomic information in optimal contribution selection was examined based on simulated data and real data on dairy bulls. We compared the genetic merit of selected animals at various levels of co-ancestry restriction when using estimated breeding values based on parent average, genomic or progeny-test information. Furthermore, we estimated the proportion of variation in estimated breeding values that is due to within-family differences. RESULTS: Optimal selection on genomic estimated breeding values increased genetic gain. Genetic merit was further increased by using genomic rather than pedigree-based measures of co-ancestry under an inbreeding restriction policy. Using genomic instead of pedigree relationships to restrict inbreeding had a significant effect only when the population consisted of many large full-sib families; with a half-sib family structure, no difference was observed. In real data from dairy bulls, optimal contribution selection based on genomic estimated breeding values allowed additional improvements in genetic merit at low to moderate inbreeding levels. Genomic estimated breeding values were more accurate and showed more within-family variation than parent-average breeding values; for genomic estimated breeding values, 30 to 40% of the variation was due to within-family differences. Finally, there was no difference between constraining inbreeding via pedigree or genomic relationships in the real data. CONCLUSIONS: The use of genomic estimated breeding values increased genetic gain in optimal contribution selection. Genomic estimated breeding values were more accurate and showed more within-family variation, which led to higher genetic gains for the same restriction on inbreeding. Using genomic relationships to restrict inbreeding provided no additional gain, except in the case of very large full-sib families.
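
    A minimal numerical sketch of the optimal contribution idea (illustrative only; the data, family structure and constraint level below are made up, and the study's own software is not reproduced here): maximize the genetic merit of the selected group, c'EBV, subject to a restriction on its mean relationship c'Kc/2, where K is either a pedigree-based (A) or genomic (G) relationship matrix.

        # Toy optimal contribution selection: maximise group merit under a
        # co-ancestry restriction.  All data are simulated for illustration.
        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(1)
        n = 20
        ebv = rng.normal(0.0, 1.0, n)                      # candidate (G)EBVs
        fam = np.repeat(np.arange(4), 5)                   # four half-sib families of 5
        K = np.where(fam[:, None] == fam[None, :], 0.25, 0.0) + 0.75 * np.eye(n)

        C = 0.08                                           # co-ancestry restriction
        cons = [
            {"type": "eq",   "fun": lambda c: c.sum() - 1.0},        # contributions sum to 1
            {"type": "ineq", "fun": lambda c: C - 0.5 * c @ K @ c},  # restrict group co-ancestry
        ]
        res = minimize(lambda c: -(c @ ebv), np.full(n, 1.0 / n),
                       bounds=[(0.0, 1.0)] * n, constraints=cons, method="SLSQP")

        print("contributions:", np.round(res.x, 3))
        print("group merit:", round(res.x @ ebv, 3),
              "co-ancestry:", round(0.5 * res.x @ K @ res.x, 3))

    Tightening or relaxing the co-ancestry limit C traces out the balance between genetic gain and diversity that the study examines with pedigree versus genomic relationship matrices.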

    Comparing genomic prediction accuracy from purebred, crossbred and combined purebred and crossbred reference populations in sheep

    BACKGROUND: The accuracy of genomic prediction depends largely on the number of animals with both phenotypes and genotypes. In some industries, such as sheep and beef cattle, data are often available from a mixture of breeds, from multiple strains within a breed or from crossbred animals. The objective of this study was to compare the accuracy of genomic prediction for several economically important traits in sheep when using data from purebreds, crossbreds or a combination of the two in the reference population. METHODS: The reference populations were purebred Merinos, crossbreds of Border Leicester (BL), Poll Dorset (PD) or White Suffolk (WS) with Merinos, and combinations of purebred and crossbred animals. Genomic breeding values (GBV) were calculated with genomic best linear unbiased prediction (GBLUP), using a genomic relationship matrix built from 48 599 ovine SNP (single nucleotide polymorphism) genotypes. The accuracy of GBV was assessed in a group of purebred industry sires as the correlation between GBV and accurate estimated breeding values based on progeny records. RESULTS: The accuracy of GBV for Merino sires increased with a larger purebred Merino reference population, but decreased when a large purebred Merino reference population was augmented with records from crossbred animals. The GBV accuracy for the BL, PD and WS breeds based on crossbred data was the same or tended to decrease when more purebred Merinos were added to the crossbred reference population. The prediction accuracy for a particular breed was close to zero when the reference population did not contain any haplotypes of the target breed, except for some low accuracies obtained when predicting PD from WS and vice versa. CONCLUSIONS: This study demonstrates that crossbred animals can be used for genomic prediction of purebred animals using 50k SNP marker density and GBLUP, but crossbred data provided lower accuracy than purebred data. Including data from distant breeds in a reference population had a neutral to slightly negative effect on the accuracy of genomic prediction. Accounting for differences in marker allele frequencies between breeds had only a small effect on the accuracy of genomic prediction from crossbred or combined crossbred and purebred reference populations.
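
    A minimal GBLUP sketch on simulated data (this is not the study's software or data; the genomic relationship matrix follows the widely used VanRaden construction, and the variance ratio is assumed known):

        # Toy GBLUP: build a genomic relationship matrix from SNP genotypes and
        # predict breeding values for non-phenotyped selection candidates.
        import numpy as np

        rng = np.random.default_rng(7)
        n_ref, n_cand, n_snp = 300, 50, 1000
        M = rng.binomial(2, 0.3, size=(n_ref + n_cand, n_snp)).astype(float)  # 0/1/2 genotypes

        p = M.mean(axis=0) / 2.0
        Z = M - 2.0 * p                                   # centre genotypes
        G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))       # VanRaden-style G
        G += np.eye(len(G)) * 1e-3                        # small ridge for invertibility

        # simulate phenotypes for the reference animals only
        beta = rng.normal(0.0, 0.05, n_snp)
        tbv = Z @ beta                                    # "true" breeding values
        h2 = 0.3
        y = tbv[:n_ref] + rng.normal(0.0, np.sqrt(np.var(tbv) * (1 - h2) / h2), n_ref)

        # GBLUP: gebv = G[all, ref] (G[ref, ref] + lambda*I)^-1 (y - mean(y))
        lam = (1.0 - h2) / h2
        Grr = G[:n_ref, :n_ref]
        gebv = G[:, :n_ref] @ np.linalg.solve(Grr + lam * np.eye(n_ref), y - y.mean())

        acc = np.corrcoef(gebv[n_ref:], tbv[n_ref:])[0, 1]
        print("accuracy in candidates:", round(acc, 2))

    In the study, the reference set contains purebred and/or crossbred animals and the "candidates" are the purebred industry sires in which GBV accuracy is evaluated.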

    Using an evolutionary algorithm and parallel computing for haplotyping in a general complex pedigree with multiple marker loci

    BACKGROUND: Haplotype reconstruction is important in linkage mapping and association mapping of quantitative trait loci (QTL). One widely used statistical approach for haplotype reconstruction is simulated annealing (SA), as implemented in SimWalk2. However, the algorithm needs a very large number of sequential iterations, and it does not clearly show whether convergence of the likelihood has been reached. RESULTS: An evolutionary algorithm (EA) is a good alternative whose convergence can be easily assessed during the process. A powerful parallel-computing strategy is feasible with the EA, increasing computational efficiency. It is shown that the EA can be ~4 times faster and give more reliable estimates than SimWalk2 when using 4 processors. In addition, jointly updating dependent variables can increase computational efficiency up to ~2 times. Overall, the proposed method with 4 processors increases computational efficiency up to ~8 times compared to SimWalk2. The efficiency will increase further with a larger number of processors. CONCLUSION: The evolutionary algorithm combined with the joint updating method is a promising tool for haplotype reconstruction in linkage and association mapping of QTL.
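
    A schematic evolutionary-algorithm skeleton with parallel fitness evaluation, in the spirit of the approach described above (the fitness function below is a toy stand-in; in the actual method it would be the likelihood of a candidate haplotype configuration given pedigree and marker data):

        # Generic EA skeleton with parallel fitness evaluation across 4 workers.
        import numpy as np
        from concurrent.futures import ProcessPoolExecutor

        rng = np.random.default_rng(0)
        N_LOCI, POP, GENS = 50, 40, 200
        TARGET = rng.integers(0, 2, N_LOCI)          # toy "true" configuration

        def fitness(candidate):
            # toy score: agreement with the target; the real method evaluates a likelihood
            return int((candidate == TARGET).sum())

        def mutate(candidate, rate=0.02):
            flips = rng.random(N_LOCI) < rate
            return np.where(flips, 1 - candidate, candidate)

        def crossover(a, b):
            cut = rng.integers(1, N_LOCI)
            return np.concatenate([a[:cut], b[cut:]])

        def main():
            pop = [rng.integers(0, 2, N_LOCI) for _ in range(POP)]
            with ProcessPoolExecutor(max_workers=4) as pool:      # parallel evaluation
                for _ in range(GENS):
                    scores = list(pool.map(fitness, pop))
                    order = np.argsort(scores)[::-1]
                    parents = [pop[i] for i in order[: POP // 2]]  # truncation selection
                    children = [mutate(crossover(parents[rng.integers(len(parents))],
                                                 parents[rng.integers(len(parents))]))
                                for _ in range(POP - len(parents))]
                    pop = parents + children
                    if max(scores) == N_LOCI:                      # convergence check
                        break
            print("best fitness:", max(fitness(c) for c in pop), "of", N_LOCI)

        if __name__ == "__main__":
            main()

    Parallel evaluation only pays off when the fitness (likelihood) is expensive, which is the case for haplotype configurations in large, complex pedigrees.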

    Computing approximate standard errors for genetic parameters derived from random regression models fitted by average information REML

    Approximate standard errors (ASE) of variance components for random regression coefficients are calculated from the average information matrix obtained in a residual maximum likelihood procedure. Linear combinations of those coefficients define the additive genetic variance at given points along the trajectory, so ASE of these variances, and of heritabilities derived from them, can also be calculated. In our example, the ASE were larger near the ends of the trajectory.
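
    A minimal sketch of the calculation (made-up numbers; phi would be, e.g., Legendre polynomial covariates evaluated at one point of the trajectory, and C the block of the inverse average information matrix for the additive random-regression (co)variances):

        import numpy as np

        phi = np.array([1.0, 0.5])                 # RR covariates at age t (intercept, slope)

        # Distinct additive (co)variance estimates theta = (k00, k01, k11) and the
        # corresponding block of the inverse AI matrix (their sampling covariance).
        theta = np.array([4.0, 0.8, 1.5])
        C = np.array([[0.20, 0.03, 0.01],
                      [0.03, 0.05, 0.02],
                      [0.01, 0.02, 0.10]])

        # Additive variance at t is phi' K phi, i.e. a linear combination k'theta.
        k = np.array([phi[0]**2, 2 * phi[0] * phi[1], phi[1]**2])
        sigma2_t = k @ theta
        ase = np.sqrt(k @ C @ k)                   # approximate standard error
        print(f"sigma2_a(t) = {sigma2_t:.3f} +/- {ase:.3f}")

    Heritability at the same point is a ratio of such linear combinations, so its ASE is typically obtained from the same inverse AI matrix via the delta method.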

    A combined long-range phasing and long haplotype imputation method to impute phase for SNP genotypes

    BACKGROUND: Knowing the phase of marker genotype data can be useful in genome-wide association studies, because it makes it possible to use analysis frameworks that account for identity by descent or parent of origin of alleles, and it can lead to a large increase in data quantity via genotype or sequence imputation. Long-range phasing and haplotype library imputation constitute a fast and accurate method to impute phase for SNP data. METHODS: A long-range phasing and haplotype library imputation algorithm was developed. It combines information from surrogate parents and long haplotypes to resolve phase in a manner that does not depend on the family structure of a dataset or on the presence of pedigree information. RESULTS: The algorithm performed well on both simulated and real livestock and human datasets, in terms of both phasing accuracy and computational efficiency. The percentage of alleles that could be phased in simulated and real datasets of varying size generally exceeded 98%, while the percentage of alleles incorrectly phased in simulated data was generally less than 0.5%. Phasing accuracy was affected by dataset size, with lower accuracy for dataset sizes below 1000, but was not affected by effective population size, family structure, presence or absence of pedigree information, or SNP density. The method was computationally fast. Compared with a commonly used statistical method (fastPHASE), the current method made about 8% fewer phasing mistakes and ran about 26 times faster on a small dataset. For larger datasets, the differences in computing time are expected to be even greater. A computer program implementing these methods has been made available. CONCLUSIONS: The algorithm and software developed in this study make routine phasing of high-density SNP chips feasible in large datasets.
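
    A minimal sketch of the surrogate-parent idea behind long-range phasing (not the full algorithm, which also distinguishes paternal from maternal surrogates and uses a haplotype library; genotypes are coded 0/1/2 copies of the reference allele and the data below are made up):

        import numpy as np

        def is_surrogate(g_focal, g_other):
            # No opposite homozygotes anywhere in the region => the two individuals
            # could share one haplotype identical by descent (a "surrogate parent").
            return not np.any(((g_focal == 0) & (g_other == 2)) |
                              ((g_focal == 2) & (g_other == 0)))

        def phase_shared_haplotype(g_focal, surrogates):
            # Infer the haplotype the focal individual shares with its surrogates;
            # heterozygous sites stay -1 if no surrogate is homozygous there.
            hap = np.full(len(g_focal), -1)
            hap[g_focal == 0] = 0                  # homozygous sites phase themselves
            hap[g_focal == 2] = 1
            for g_s in surrogates:
                unresolved = (g_focal == 1) & (hap == -1)
                hap[unresolved & (g_s == 0)] = 0   # surrogate homozygous 0/0
                hap[unresolved & (g_s == 2)] = 1   # surrogate homozygous 1/1
            return hap

        g_focal = np.array([1, 2, 1, 0, 1])
        surr_a  = np.array([2, 2, 1, 0, 0])
        surr_b  = np.array([1, 2, 0, 0, 1])
        surrogates = [g for g in (surr_a, surr_b) if is_surrogate(g_focal, g)]
        print(phase_shared_haplotype(g_focal, surrogates))   # -> [1 1 0 0 0]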

    A phasing and imputation method for pedigreed populations that results in a single-stage genomic evaluation

    BACKGROUND: Efficient, robust, and accurate genotype imputation algorithms make large-scale application of genomic selection cost effective. An algorithm that imputes alleles or allele probabilities for all animals in the pedigree and for all genotyped single nucleotide polymorphisms (SNP) provides a framework to combine all pedigree, genomic, and phenotypic information into a single-stage genomic evaluation. METHODS: An algorithm was developed for imputation of genotypes in pedigreed populations that allows imputation for completely ungenotyped animals and for animals genotyped at low density, accommodates a wide variety of pedigree structures for genotyped animals, imputes unmapped SNP, and works for large datasets. The method involves simple phasing rules, long-range phasing, haplotype library imputation and segregation analysis. RESULTS: Imputation accuracy was high and the computational cost was feasible for datasets with pedigrees of up to 25 000 animals. The resulting single-stage genomic evaluation increased the accuracy of estimated genomic breeding values compared to a scenario in which phenotypes on non-genotyped relatives were ignored. CONCLUSIONS: The imputation algorithm and software developed here, and the resulting single-stage genomic evaluation method, provide powerful new ways to exploit imputation and to obtain more accurate genetic evaluations.
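
    A minimal sketch of the simple trio-based phasing rules referred to above (the full method additionally uses long-range phasing, haplotype library imputation and segregation analysis; genotypes coded 0/1/2, toy data only):

        import numpy as np

        def phase_offspring(g_off, g_sire, g_dam):
            # Return (paternal, maternal) gametes of the offspring; -1 = unresolved.
            pat = np.full(len(g_off), -1)
            mat = np.full(len(g_off), -1)

            # Homozygous offspring: both transmitted alleles are known.
            pat[g_off == 0] = 0
            mat[g_off == 0] = 0
            pat[g_off == 2] = 1
            mat[g_off == 2] = 1

            # Heterozygous offspring: a homozygous parent identifies the allele it passed on.
            het = g_off == 1
            pat[het & (g_sire == 0)] = 0
            pat[het & (g_sire == 2)] = 1
            mat[het & (g_dam == 0)] = 0
            mat[het & (g_dam == 2)] = 1

            # If one gamete is resolved at a heterozygous site, the other follows.
            fill_mat = het & (mat == -1) & (pat != -1)
            mat[fill_mat] = 1 - pat[fill_mat]
            fill_pat = het & (pat == -1) & (mat != -1)
            pat[fill_pat] = 1 - mat[fill_pat]
            return pat, mat

        g_sire = np.array([0, 2, 1, 1])
        g_dam  = np.array([2, 1, 1, 0])
        g_off  = np.array([1, 1, 2, 1])
        print(phase_offspring(g_off, g_sire, g_dam))   # -> (array([0, 1, 1, 1]), array([1, 0, 1, 0]))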