164 research outputs found
GENERALIZED LINEAR MIXED MODELS: AN APPLICATION
The purpose of this paper is to present a specific application of the generalized linear mixed model. Animal breeders are often interested in estimating the genetic parameters associated with certain traits. When the trait is measured in terms of a normally distributed response variable, standard variance-component estimation and mixed-model procedures can be used. Increasingly, breeders are interested in categorical traits (degree of calving difficulty, number born, etc.). An application of the generalized linear mixed model to an animal breeding study of the number of lambs born alive will be presented. We will show how the model is determined, how the estimation equations are formed, and the resulting inference.
ANALYSIS OF THE SPATIAL DISTRIBUTION OF SUGARBEET PLANTS
The spatial distribution of emerged sugarbeet plants is an important aspect of the performance of sugarbeet planters. Three major components influencing the spatial distribution are the ability to drop a single seed at a time, the ability to drop the seeds a fixed distance apart, and the ability of the seed to emerge. A model has been developed to describe the distribution of the spacing between emerged sugarbeet plants. The model consists of a mixture of normal and gamma distributions. The spatial data consist of the distances between neighboring emerged plants. Spatial data were collected on 7 planters operated at 3 speeds using both pelleted and encrusted seeds. Four replicates were obtained of each treatment combination. Approximate maximum likelihood estimates of the parameters were obtained separately for each replicate of the treatment combinations.
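A minimal sketch of what a density mixing normal and gamma components can look like. The abstract does not give the number of components or their parameterization, so the two-component form, the function name `mixture_pdf`, and all parameter values below are assumptions for illustration only.

```python
import math


def mixture_pdf(x, w, mu, sigma, shape, scale):
    """Density of a hypothetical two-component mixture:
    w * Normal(mu, sigma) + (1 - w) * Gamma(shape, scale).
    """
    # Normal component, e.g. spacings close to the target seed spacing.
    normal = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    # Gamma component, defined for positive spacings only.
    if x > 0:
        log_gamma_pdf = ((shape - 1.0) * math.log(x) - x / scale
                         - math.lgamma(shape) - shape * math.log(scale))
        gamma = math.exp(log_gamma_pdf)
    else:
        gamma = 0.0
    return w * normal + (1.0 - w) * gamma


# Illustrative values only (spacing in cm); not estimates from the study.
density_at_target = mixture_pdf(18.0, w=0.8, mu=18.0, sigma=2.0, shape=2.0, scale=5.0)
```

With `w=1` the function reduces to a plain normal density, which gives a quick sanity check on the implementation.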
A BAYESIAN GWAS METHOD UTILIZING HAPLOTYPE CLUSTERS FOR A COMPOSITE BREED POPULATION
Commercial beef cattle are often composites of multiple breeds. Current methods used to produce genomic predictors are based on the underlying assumption of animals being sampled from a homogeneous population. As a result, the predictors can perform poorly when used to predict the relative genetic merit of animals whose breed compositions differ. In part, this is due to the changes in linkage disequilibrium between the markers and the quantitative trait loci as we move from one breed to the next. An alternative model based on breed-specific haplotype clusters was developed to allow for differences in linkage disequilibrium across multiple breeds. The haplotype clusters were modeled as hidden states in a hidden Markov model where the genomic effects are associated with loci located on the unobserved clusters. Similar to the Bayes C model, we can model the genomic effects at the loci using a prior that consists of a mixture of a multivariate normal distribution and a point mass at zero. The model will be used to construct genomic predictors using records on 6,552 cattle genotyped for 99,827 mapped SNPs representing various fractions of three different breeds.
A BAYESIAN GWAS METHOD UTILIZING HAPLOTYPE CLUSTERS FOR A COMPOSITE BREED POPULATION
Commercial beef cattle are often composites of multiple breeds. Current methods used to produce genomic predictors are based on the underlying assumption of animals being sampled from a homogeneous population. As a result, the predictors can perform poorly when used to predict the relative genetic merit of animals whose breed compositions differ. In part, this is due to the changes in linkage disequilibrium between the markers and the quantitative trait loci as we move from one breed to the next. An alternative model based on breed-specific haplotype clusters was developed to allow for differences in linkage disequilibrium across multiple breeds. The haplotype clusters were modeled as hidden states in a hidden Markov model where the genomic effects are associated with loci located on the unobserved clusters. Similar to the Bayes C model, we can model the genomic effects at the loci using a prior that consists of a mixture of a multivariate normal distribution and a point mass at zero. The model will be used to construct genomic predictors using records on 5,000 cattle genotyped for 99,827 mapped SNPs representing various fractions of three different breeds.
The impact of training strategies on the accuracy of genomic predictors in United States Red Angus cattle
Genomic selection (GS) has become an integral part of genetic evaluation methodology and has been applied to all major livestock species, including beef and dairy cattle, pigs, and chickens. Significant contributions in increased accuracy of selection decisions have been clearly illustrated in dairy cattle after practical application of GS. In the majority of U.S. beef cattle breeds, similar efforts have also been made to increase the accuracy of genetic merit estimates through the inclusion of genomic information into routine genetic evaluations using a variety of methods. However, prediction accuracies can vary relative to panel density, the number of folds used for cross-validation, and the choice of dependent variables (e.g., EBV, deregressed EBV, adjusted phenotypes). The aim of this study was to evaluate the accuracy of genomic predictors for Red Angus beef cattle with different strategies used in training and evaluation. The reference population consisted of 9,776 Red Angus animals whose genotypes were imputed to 2 medium-density panels consisting of over 50,000 (50K) and approximately 80,000 (80K) SNP. Using the imputed panels, we determined the influence of marker density, exclusion (deregressed EPD adjusting for parental information [DEPD-PA]) or inclusion (deregressed EPD without adjusting for parental information [DEPD]) of parental information in the deregressed EPD used as the dependent variable, and the number of clusters used to partition training animals (3, 5, or 10). A BayesC model with π set to 0.99 was used to predict molecular breeding values (MBV) for 13 traits for which EPD existed. The prediction accuracies were measured as genetic correlations between MBV and weighted deregressed EPD. The average accuracies across all traits were 0.540 and 0.552 when using the 50K and 80K SNP panels, respectively, and 0.538, 0.541, and 0.561 when using 3, 5, and 10 folds, respectively, for cross-validation.
Using DEPD-PA as the response variable resulted in higher accuracies of MBV than those obtained by DEPD for growth and carcass traits. When DEPD were used as the response variable, accuracies were greater for threshold traits and those that are sex limited, likely because these traits suffer from a lack of information content and excluding animals in training with only parental information substantially decreases the training population size. It is recommended that the contribution of parental average to deregressed EPD be removed in the construction of genomic prediction equations. The difference in prediction accuracies between the 2 SNP panels or the number of folds compared herein was negligible.
HOW GOOD ARE SPATIAL GLM'S? A SIMULATION STUDY
An area of increasing interest to agricultural and ecological researchers is the analysis of spatially correlated non-normal data. A generalized linear model (GLM) accounting for spatial covariance was presented by Gotway and Stroup (1997). Their method included approximate inference based on asymptotic distributions. A simulation study was conducted to assess the small sample behavior of their proposed estimates and test statistics. This study suggests that the spatial GLM yields unbiased estimates of treatment means and differences for binomial data, that the spatial GLM improves precision, as measured by MSE, and that the approximate F-statistic is acceptable for hypothesis testing.
SIMULATION STUDY OF SPATIAL-POISSON DATA ASSESSING INCLUSION OF SPATIAL CORRELATION AND NON-NORMALITY IN THE ANALYSIS
Spatial correlation and non-normality in agricultural, geological, or environmental settings can have a significant effect on the accuracy of the results obtained in statistical analyses. Generalized linear mixed models, spatial models, and generalized linear models were compared in order to assess how critical the inclusion of non-normality and spatial correlation is to the analysis. Spatially correlated data with a Poisson distribution were generated in a completely randomized design (CRD) with 2 treatments and 18 repetitions. Four analyses (spatial Poisson, non-spatial Poisson, spatial normal, and non-spatial normal) were conducted on the simulated data to compare their power functions. The degree of spatial correlation, the size of the mean, the dimension of the plots, and the difference between the two treatment means were altered to investigate how the ability to detect differences between the treatments is affected. In addition, the range covariance parameter was estimated and compared among the spatial models. Some covariance parameters were underestimated. The size of the field plot and the treatment means were increased to assess their effects on estimation of the range. The Restricted Maximum Likelihood (REML) covariance parameter estimates were compared to those obtained using Maximum Likelihood (ML). The analysis that incorporated the spatial correlation of the observations and used ML to estimate the covariance parameters had the highest power and most accurate range parameter estimates.
Exact Distribution of Linkage Disequilibrium in the Presence of Mutation, Selection, or Minor Allele Frequency Filtering
Linkage disequilibrium (LD), often expressed in terms of the squared correlation (r²) between allelic values at two loci, is an important concept in many branches of genetics and genomics. Genetic drift and recombination have opposite effects on LD, and thus r² will keep changing until the effects of these two forces are counterbalanced. Several approximations have been used to determine the expected value of r² at equilibrium in the presence or absence of mutation. In this paper, we propose a probability-based approach to compute the exact distribution of allele frequencies at two loci in a finite population at any generation t conditional on the distribution at generation t−1. As r² is a function of this distribution of allele frequencies, this approach can be used to examine the distribution of r² over generations as it approaches equilibrium. The exact distribution of LD from our method is used to describe, quantify, and compare LD at different equilibria, including equilibrium in the absence or presence of mutation, selection, and filtering by minor allele frequency. We also propose a deterministic formula for expected LD in the presence of mutation at equilibrium based on the exact distribution of LD.
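For readers unfamiliar with the statistic, the sketch below computes the squared correlation between two biallelic loci from the standard definition: D = p_AB − p_A·p_B and the squared correlation equals D² / (p_A(1 − p_A)·p_B(1 − p_B)). It illustrates the quantity only, not the exact-distribution method proposed in the paper; the function name `ld_r2` and the example frequencies are hypothetical.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation between allelic values at two biallelic loci.

    p_ab: frequency of the haplotype carrying both allele A and allele B
    p_a, p_b: marginal frequencies of allele A and allele B
    """
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom <= 0.0:
        raise ValueError("undefined when either locus is fixed")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


# Hypothetical frequencies, for illustration only.
r2_equilibrium = ld_r2(0.25, 0.5, 0.5)  # haplotype frequency equals p_a * p_b
r2_complete = ld_r2(0.5, 0.5, 0.5)      # allele A always occurs with allele B
```

The guard on `denom` reflects that the statistic is undefined once either locus is fixed, which is one reason minor allele frequency filtering affects the observed distribution.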
Corn Seed Spacing Uniformity as Affected by Seed Tube Condition
Variation in corn seed spacing from a John Deere MaxEmerge™ Plus Vacumeter planter was evaluated on the University of Nebraska Planter Test Stand in a laboratory setting for two seed tube conditions (new or worn) with two examples of corn seed shape (round or flat). Seed spacing uniformity was measured using three seed spacing uniformity parameters: Coefficient of Precision (CP3), ISO Multiples index, and ISO Miss index.
Differences were detected in all three seed spacing uniformity parameters due to the seed tube condition. The new seed tubes had better seed spacing uniformity than the worn seed tubes, within each example of the seed shapes (round or flat) used in this experiment. For the seed used in this experiment, the round corn seed had better seed spacing uniformity than the flat corn seed, within each of the seed tube conditions (new or worn).
A recommended schedule for seed tube replacement to maintain seed spacing uniformity has not been developed, and more research in this area is needed. Currently, sugarbeet growers in western Nebraska use one of three options: a) test one of their seed tubes on a good planter test stand every year before sugarbeet planting season and replace all tubes when results indicate it will improve seed spacing uniformity to the desired level; b) feel the inside front surface of the seed tube every year before sugarbeet planting season and change seed tubes when the feel of the surface changes from a slick plastic to a very fine sandpaper; or c) replace seed tubes before sugarbeet planting season when they have planted over approximately 150 acres of corn per planter row with their current seed tubes
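The ISO Multiples and Miss indices named in the abstract above can be sketched as follows. The abstract does not state the thresholds; the sketch assumes the commonly used convention that a spacing of at most 0.5 times the theoretical spacing counts as a multiple and a spacing greater than 1.5 times the theoretical spacing counts as a miss. The function name and the sample spacings are hypothetical.

```python
def iso_spacing_indices(spacings, theoretical):
    """Return (multiples_index, miss_index) as percentages of all spacings.

    Assumed thresholds (not given in the abstract):
      multiple: spacing <= 0.5 * theoretical
      miss:     spacing >  1.5 * theoretical
    """
    if not spacings or theoretical <= 0:
        raise ValueError("need at least one spacing and a positive theoretical spacing")
    n = len(spacings)
    multiples = sum(1 for s in spacings if s <= 0.5 * theoretical)
    misses = sum(1 for s in spacings if s > 1.5 * theoretical)
    return 100.0 * multiples / n, 100.0 * misses / n


# Made-up spacings in cm with a theoretical spacing of 18 cm.
sample = [18, 17, 5, 35, 19, 16, 18, 40, 17, 18]
multiples_index, miss_index = iso_spacing_indices(sample, theoretical=18)
```

Lower values of both indices indicate more uniform spacing.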
Using pooled data for genomic prediction in a bivariate framework with missing data
Pooling samples to derive group genotypes can enable the economically efficient use of commercial animals within genetic evaluations. To test a multivariate framework for genetic evaluations using pooled data, simulation was used to mimic a beef cattle population including two moderately heritable traits with varying genetic correlations, genotypes and pedigree data. There were 15 generations (n = 32,000; random selection and mating), and the last generation was subjected to genotyping through pooling. Missing records were induced in two ways: (a) sequential culling and (b) random missing records. Gaps in genotyping were also explored whereby genotyping occurred through generation 13 or 14. Pools of 1, 20, 50 and 100 animals were constructed randomly or by minimizing phenotypic variation. The EBV was estimated using a bivariate single-step genomic best linear unbiased prediction model. Pools of 20 animals constructed by minimizing phenotypic variation generally led to accuracies that were not different from those obtained using individual progeny data. Gaps in genotyping led to significantly different EBV accuracies (p < .05) for sires and dams born in the generation nearest the pools. Pooling of any size generally led to larger accuracies than no information from generation 15 regardless of the way missing records arose, the percentage of records available or the genetic correlation. Pooling to aid in the use of commercial data in genetic evaluations can be utilized in multivariate cases with varying relationships between the traits and in the presence of systematic and randomly missing phenotypes.