742 research outputs found

    R Package wgaim: QTL Analysis in Bi-Parental Populations Using Linear Mixed Models

    The wgaim (whole genome average interval mapping) package, developed in the R system for statistical computing (R Development Core Team 2011), builds on linear mixed modelling techniques by incorporating a whole genome approach to detecting significant quantitative trait loci (QTL) in bi-parental populations. Much of the sophistication is inherited through the well-established linear mixed modelling package ASReml-R (Butler et al. 2009). As wgaim uses an extension of interval mapping to incorporate the whole genome into the analysis, functions are provided which allow conversion of genetic data objects created with the qtl package of Broman and Wu (2010) available in R. Results of QTL analyses are available using summary and print methods as well as diagnostic summaries of the selection method. In addition, the package features a flexible linkage map plotting function that can be easily manipulated to provide an aesthetically pleasing genetic map. As a visual summary, QTL obtained from one or more models can also be added to the linkage map.

    A New Approach to Forest Site Quality Modeling

    Multiple regression and discriminant analysis procedures are commonly used to develop forest site quality models. When they contain many independent variables relative to sample size, these models may be subject to prediction bias. Fit statistics such as R² in regression and classification tables in discriminant analysis show the apparent model accuracy, but this may be a biased estimate of the model's actual accuracy. Sample splitting methods such as cross-validation and the bootstrap can be used to get an unbiased actual accuracy estimate. A discriminant procedure called classification tree analysis uses cross-validation to build the classifier with the greatest estimated actual accuracy. Because cross-validation is used in model development, the model is less likely to be over-fit with insignificant variables when compared with stepwise linear discriminant analysis. Classification tree analysis and linear discriminant analysis were used to develop models that discriminate prime vs. nonprime ponderosa pine (Pinus ponderosa) sites. Prime sites are defined as having site index 25 greater than 7.6 meters; nonprime sites have site index 25 less than 7.6 meters. Forest habitat type, percent sand content, and soil pH were incorporated in both models. The cross-validation estimate of classification tree actual accuracy was 88 percent. A random bootstrap estimate of the linear discriminant function actual accuracy was 80 percent. A multiple regression model developed with random plots revealed little useful information and was biased when applied to prime site plots. The conventional regression approach using random plots may be misleading if one is interested in identifying relatively rare prime sites. Forest habitat types within the ponderosa pine series in southern Utah were examined as site quality indicators. The site index range within any one habitat type was broad. However, the best ponderosa pine sites consistently occurred only in Pinus ponderosa/Quercus gambelii and Pinus ponderosa/Symphoricarpos oreophilus habitat types, or in habitat types within the Pseudotsuga menziesii or Abies concolor series. Therefore, forest habitat type, when used with other site variables, may be useful in predicting prime sites. The effect of aspect at the upper elevational limit of ponderosa pine was examined by comparing mean site index and mean initial 10-year diameter increment on southerly and northerly slopes from two cinder cones. Southerly aspects on both cinder cones had greater mean diameter increment. Southerly aspects on the highest elevation cinder cone had the greatest mean site index. There was no significant difference in mean site index on the lower elevation cinder cone. Optimal aspect for height and diameter growth may differ due to 1) the effect of density on diameter increment; and/or 2) available soil water limiting height growth during the spring and ambient temperature/solar radiation limiting diameter growth in late summer. Optimal aspect for forest production is not constant but varies with tree species, elevation, latitude, and other factors affecting site microclimate.
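The gap between apparent and actual accuracy described in this abstract can be illustrated with a small sketch. It is not the thesis's method: it swaps in a 1-nearest-neighbour classifier (rather than classification trees or linear discriminant analysis) and uses purely random features and labels, so the function names, sample size, and seed are all assumptions made for illustration. Because the labels carry no signal, any accuracy above chance measured on the training data is bias.

```python
import random


def nearest_label(x, points, labels, skip=None):
    """Return the label of the point nearest to x, ignoring index `skip`."""
    best_label, best_dist = None, float("inf")
    for j, (p, lab) in enumerate(zip(points, labels)):
        if j == skip:
            continue
        dist = sum((a - b) ** 2 for a, b in zip(x, p))
        if dist < best_dist:
            best_label, best_dist = lab, dist
    return best_label


def accuracies(n=200, n_features=5, seed=0):
    """Apparent vs leave-one-out accuracy of 1-NN on pure-noise data."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    labels = [rng.randrange(2) for _ in range(n)]  # labels unrelated to features

    # Apparent accuracy: each point is scored against a model that has seen it.
    apparent = sum(
        nearest_label(p, points, labels) == y for p, y in zip(points, labels)
    ) / n

    # Leave-one-out cross-validation: each point is held out when it is scored.
    loo = sum(
        nearest_label(p, points, labels, skip=i) == y
        for i, (p, y) in enumerate(zip(points, labels))
    ) / n
    return apparent, loo


if __name__ == "__main__":
    apparent, loo = accuracies()
    print(f"apparent accuracy: {apparent:.2f}, leave-one-out accuracy: {loo:.2f}")
```

In this setup the apparent accuracy is perfect because every point is its own nearest neighbour, while the held-out estimate reflects that the labels cannot actually be predicted.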

    Estimating Effects and Making Predictions from Genome-Wide Marker Data

    In genome-wide association studies (GWAS), hundreds of thousands of genetic markers (SNPs) are tested for association with a trait or phenotype. Reported effects tend to be larger in magnitude than the true effects of these markers, the so-called "winner's curse." We argue that the classical definition of unbiasedness is not useful in this context and propose to use a different definition of unbiasedness that is a property of the estimator we advocate. We suggest an integrated approach to the estimation of the SNP effects and to the prediction of trait values, treating SNP effects as random instead of fixed effects. Statistical methods traditionally used in the prediction of trait values in the genetics of livestock, which predate the availability of SNP data, can be applied to analysis of GWAS, giving better estimates of the SNP effects and predictions of phenotypic and genetic values in individuals. Comment: Published at http://dx.doi.org/10.1214/09-STS306 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
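A minimal simulation can show why treating SNP effects as random helps with the winner's curse. This is a sketch under strong assumptions that are not taken from the paper: independent markers, normally distributed true effects, and effect and error variances that are treated as known (in practice they would be estimated). The function name and all parameter values are hypothetical.

```python
import random


def winners_curse_demo(n_snps=20000, sd_effect=0.1, sd_error=0.2,
                       n_top=100, seed=1):
    """Compare raw and shrunken estimates for the most extreme SNPs."""
    rng = random.Random(seed)
    true = [rng.gauss(0.0, sd_effect) for _ in range(n_snps)]
    # Fixed-effect style estimate: true effect plus sampling noise.
    raw = [b + rng.gauss(0.0, sd_error) for b in true]

    # Random-effect style estimate: shrink towards zero by the ratio of
    # effect variance to total variance (variances assumed known here).
    shrinkage = sd_effect ** 2 / (sd_effect ** 2 + sd_error ** 2)
    shrunk = [shrinkage * e for e in raw]

    # "Winners": the SNPs with the largest estimated effects in magnitude.
    top = sorted(range(n_snps), key=lambda i: abs(raw[i]), reverse=True)[:n_top]

    def mse(estimates):
        return sum((estimates[i] - true[i]) ** 2 for i in top) / n_top

    return {"shrinkage": shrinkage, "mse_raw": mse(raw), "mse_shrunk": mse(shrunk)}


if __name__ == "__main__":
    for name, value in winners_curse_demo().items():
        print(f"{name}: {value:.4f}")
```

Among the selected markers the raw estimates overstate the true effects, so their mean squared error is expected to be much larger than that of the shrunken estimates.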

    Rapid Methods for Assessing Water, Sanitation and Hygiene (WASH) Services at Refugee Camps in Emergency Settings

    This document describes UNHCR's methodology for conducting rapid WASH household assessments in refugee settings. The document can be used to help determine the most appropriate sampling approach for collecting information to assess WASH services. It can also be used to identify thresholds that can be used for the key WASH indicators that describe WASH services in camps and settlements. This briefing note is complemented by the working paper that contains additional details on the evaluation

    Anisotropic Matérn correlation and spatial prediction using REML

    The Matérn correlation function provides great flexibility for modeling spatially correlated random processes in two dimensions, in particular via a smoothness parameter, whose estimation allows data to determine the degree of smoothness of a spatial process. The extension to include anisotropy provides a very general and flexible class of spatial covariance functions that can be used in a model-based approach to geostatistics, in which parameter estimation is achieved via REML and prediction is within the E-BLUP framework. In this article we develop a general class of linear mixed models using an anisotropic Matérn class with an extended metric. The approach is illustrated by application to soil salinity data in a rice-growing field in Australia, and to fine-scale soil pH data. It is found that anisotropy is an important aspect of both datasets, emphasizing the value of a straightforward and accessible approach to modeling anisotropy.
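As a rough illustration of what an anisotropic Matérn correlation looks like, the sketch below combines geometric anisotropy (a rotation angle and an axis ratio) with closed-form Matérn correlations for three half-integer smoothness values. It makes several assumptions that are not drawn from the article: the parameterisation in which smoothness 0.5 gives the exponential correlation exp(-d/phi), plain Euclidean distance rather than the article's extended metric, and the function and parameter names, which are hypothetical.

```python
import math


def aniso_distance(dx, dy, angle=0.0, ratio=1.0):
    """Distance for lag (dx, dy) after rotating by `angle` (radians)
    and stretching the second rotated axis by `ratio`."""
    c, s = math.cos(angle), math.sin(angle)
    u = c * dx + s * dy
    v = -s * dx + c * dy
    return math.hypot(u, ratio * v)


def matern(distance, phi=1.0, nu=1.5):
    """Matérn correlation for smoothness nu in {0.5, 1.5, 2.5}.

    Uses the closed forms available at half-integer smoothness, with
    range parameter phi scaling the distance."""
    d = distance / phi
    if nu == 0.5:
        return math.exp(-d)
    if nu == 1.5:
        return (1.0 + d) * math.exp(-d)
    if nu == 2.5:
        return (1.0 + d + d * d / 3.0) * math.exp(-d)
    raise ValueError("this sketch only supports nu in {0.5, 1.5, 2.5}")


def aniso_matern(dx, dy, phi=1.0, nu=1.5, angle=0.0, ratio=1.0):
    """Correlation between two points separated by lag (dx, dy)."""
    return matern(aniso_distance(dx, dy, angle, ratio), phi, nu)


if __name__ == "__main__":
    # With ratio > 1, correlation decays faster along the second axis.
    print(aniso_matern(1.0, 0.0, ratio=2.0))
    print(aniso_matern(0.0, 1.0, ratio=2.0))
```

With a ratio of 1 the correlation depends only on the length of the lag; other ratios make it direction dependent, which is the behaviour the abstract refers to as anisotropy. General smoothness values would need a modified Bessel function, which is left out here to keep the sketch dependency-free.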

    Statistical methods for analysis of multi-harvest data from perennial pasture variety selection trials

    Variety selection in perennial pasture crops involves identifying best varieties from data collected from multiple harvest times in field trials. For accurate selection, the statistical methods for analysing such data need to account for the spatial and temporal correlation typically present. This paper provides an approach for analysing multi-harvest data from variety selection trials in which there may be a large number of harvest times. Methods are presented for modelling the variety by harvest effects while accounting for the spatial and temporal correlation between observations. These methods provide an improvement in model fit compared to separate analyses for each harvest, and provide insight into variety by harvest interactions. The approach is illustrated using two traits from a lucerne variety selection trial. The proposed method provides variety predictions allowing for the natural sources of variation and correlation in multi-harvest data

    WGNAM: whole-genome nested association mapping

    A powerful QTL analysis method for nested association mapping populations is presented. Based on a one-stage multi-locus model, it provides accurate predictions of founder-specific QTL effects.

    Sensitivity of genomic selection to using different prior distributions

    Genomic selection describes a selection strategy based on genomic estimated breeding values (GEBV) predicted from dense genetic markers such as single nucleotide polymorphism (SNP) data. Different Bayesian models have been suggested to derive the prediction equation, with the main difference centred around the specification of the prior distributions