The use of plasmodes as a supplement to simulations: A simple example evaluating individual admixture estimation methodologies

Abstract

With the advent of powerful computers, simulation studies are becoming an important tool in statistical methodology research. However, computer simulations of a specific process are only as good as our understanding of the underlying mechanisms. An attractive supplement to simulations is the use of plasmode datasets. Plasmodes are datasets that are generated by natural biological processes, under experimental conditions that allow some aspect of the truth to be known. The benefit of the plasmode approach is that the data are generated through completely natural processes, thus circumventing the common concern about the realism and accuracy of computer-simulated data. The estimation of admixture, or the proportion of an individual’s genome that originates from different founding populations, is a particularly difficult research endeavor that is well suited to the use of plasmodes. Current methods have been tested with simulations of complex populations where the underlying mechanisms, such as the rate and distribution of recombination, are not well understood. To demonstrate the utility of the plasmode approach, data derived from mouse crosses are used to evaluate the effectiveness of several admixture estimation methodologies. Each cross shares a common founding population so that the ancestry proportion for each individual is known, allowing for the comparison of true and estimated individual admixture values. Analysis shows that the estimation methodologies examined (Structure, AdmixMap and FRAPPE) all perform well with simple datasets. However, the performance of the estimation methodologies varied greatly when applied to a plasmode consisting of three founding populations. The results of these examples illustrate the utility of plasmodes in the evaluation of statistical genetics methodologies.
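The comparison of true and estimated individual admixture values described above can be illustrated with a minimal Python sketch. It assumes that the known ancestry proportions and a method's estimates are available as two equal-length lists, and it summarizes their agreement with mean bias and root-mean-square error. The function name `admixture_errors`, the choice of these two metrics, and the numeric values are all hypothetical and serve only as illustration; they are not the study's data, and the abstract does not state which accuracy measures were used.

```python
import math


def admixture_errors(true_props, est_props):
    """Return (bias, rmse) of estimated vs. known ancestry proportions."""
    if len(true_props) != len(est_props) or not true_props:
        raise ValueError("inputs must be non-empty and of equal length")
    # Per-individual error: estimate minus known value.
    diffs = [est - true for true, est in zip(true_props, est_props)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


# Hypothetical values for illustration only (not data from the study).
true_props = [0.5, 0.5, 0.75, 0.75]  # known ancestry proportions
est_props = [0.5, 0.25, 0.75, 1.0]   # estimates from some method

bias, rmse = admixture_errors(true_props, est_props)
print(f"bias={bias:.3f}, rmse={rmse:.3f}")
```

A bias near zero with a large RMSE would suggest estimates that are unbiased on average but imprecise for individuals, which is one reason to report both quantities.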
