
    Pitfalls and Remedies for Cross Validation with Multi-trait Genomic Prediction Methods.

    Incorporating measurements on correlated traits into genomic prediction models can increase prediction accuracy and selection gain. However, multi-trait genomic prediction models are complex and prone to overfitting, which may result in a loss of prediction accuracy relative to single-trait genomic prediction. Cross-validation is considered the gold standard method for selecting and tuning models for genomic prediction in both plant and animal breeding. When used appropriately, cross-validation gives an accurate estimate of the prediction accuracy of a genomic prediction model, and can effectively choose among disparate models based on their expected performance in real data. However, we show that a naive cross-validation strategy applied to the multi-trait prediction problem can be severely biased and lead to sub-optimal choices between single- and multi-trait models when secondary traits are used to aid in the prediction of focal traits and these secondary traits are measured on the individuals to be tested. We use simulations to demonstrate the extent of the problem and propose three partial solutions: 1) a parametric solution from selection index theory, 2) a semi-parametric method for correcting the cross-validation estimates of prediction accuracy, and 3) a fully non-parametric method which we call CV2*: validating model predictions against focal trait measurements from genetically related individuals. The current excitement over high-throughput phenotyping suggests that more comprehensive phenotype measurements will be useful for accelerating breeding programs. Using an appropriate cross-validation strategy should more reliably determine if and when combining information across multiple traits is useful.
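A toy simulation can make the kind of bias described in this abstract concrete. The sketch below is not the authors' simulation design. It assumes one possible mechanism that the abstract does not spell out: the secondary trait shares only non-genetic (environmental) variation with the focal trait. The function name, sample size, and correlation value are all hypothetical choices made for illustration. Under that assumption, a predictor built from the secondary trait measured on the test individuals looks accurate when validated against their own focal-trait phenotypes, yet carries almost no information about their genetic values.

```python
import numpy as np


def simulate_naive_cv_bias(n=5000, env_corr=0.8, seed=0):
    """Toy illustration (assumed setup, not the paper's simulation).

    Returns the accuracy estimated against the focal phenotype and the
    accuracy measured against the true genetic value.
    """
    rng = np.random.default_rng(seed)

    # True genetic values of the focal trait for the test individuals.
    g = rng.normal(size=n)

    # Environmental effects on the two traits, correlated within individual.
    cov = np.array([[1.0, env_corr], [env_corr, 1.0]])
    e = rng.multivariate_normal(np.zeros(2), cov, size=n)

    y_focal = g + e[:, 0]      # focal phenotype = genetics + environment
    y_secondary = e[:, 1]      # secondary trait: environment only (assumption)

    # A "multi-trait" predictor that relies only on the secondary trait
    # measured on the same individuals it is asked to predict.
    prediction = y_secondary

    naive_accuracy = np.corrcoef(prediction, y_focal)[0, 1]
    true_accuracy = np.corrcoef(prediction, g)[0, 1]
    return naive_accuracy, true_accuracy


naive_acc, true_acc = simulate_naive_cv_bias()
print(f"accuracy vs. phenotype:     {naive_acc:.2f}")
print(f"accuracy vs. genetic value: {true_acc:.2f}")
```

In this setup the first correlation is clearly positive while the second is close to zero, which illustrates why validating against phenotypes measured on the same individuals can mislead model choice.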

    An Evaluation of Design-based Properties of Different Composite Estimators

    For the last several decades, the US Census Bureau has been using the AK composite estimation method to produce statistics on employment from the Current Population Survey (CPS) data. The CPS uses a rotating design, and AK estimators are linear combinations of monthly survey-weighted averages (called month-in-sample estimates) in each rotation group. Denoting by $X$ the vector of month-in-sample estimates and by $\Sigma$ its design-based variance, the coefficients of the linear combination were optimized by the Census Bureau after substituting an estimate for $\Sigma$ and under unrealistic stationarity assumptions. To show the limits of this approach, we compared the AK estimator with different competitors using three different synthetic populations that mimic the CPS data and a simplified sample design that mimics the CPS design. In our simulation setup, empirically best estimators have larger mean square error than simple averages. In the real data analysis, the AK estimates are consistently below the survey-weighted estimates, indicating potential bias. Any attempt to improve on the estimated optimal estimator in either class would require a thorough investigation of the highly non-trivial problem of estimating $\Sigma$ in a complex setting like the CPS (we did not entertain this problem in this paper). A different approach is to use a variant of the regression composite estimator used by Statistics Canada. The regression composite estimator does not require estimation of $\Sigma$ and is less sensitive to the rotation group bias in our simulations. Our study demonstrates that there is great potential for improving the estimation of levels and month-to-month changes in the unemployment rates by using the regression composite estimator.
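To illustrate why the estimate of $\Sigma$ matters for this class of estimators, the sketch below computes minimum-variance weights for an unbiased linear combination of estimates that share a common target, given a known covariance matrix. This is a generic textbook construction, not the AK estimator or the Census Bureau's procedure. The function name `min_variance_weights`, the 2×2 matrix, and the two estimates are hypothetical values chosen for illustration.

```python
import numpy as np


def min_variance_weights(sigma):
    """Weights w minimizing w' Sigma w subject to sum(w) == 1.

    Closed form: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1).
    """
    ones = np.ones(sigma.shape[0])
    w = np.linalg.solve(sigma, ones)  # Sigma^{-1} 1 without explicit inverse
    return w / w.sum()


# Hypothetical covariance of two estimates of the same quantity.
sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
x = np.array([5.1, 4.9])  # hypothetical estimates

w = min_variance_weights(sigma)
composite = w @ x              # the combined estimate
composite_var = w @ sigma @ w  # its variance under the assumed Sigma

print("weights:", w)
print("composite estimate:", composite)
print("composite variance:", composite_var)
```

The weights depend entirely on `sigma`, so when the covariance is replaced by a noisy estimate the resulting combination need not be optimal, which is consistent with the abstract's finding that empirically best estimators can have larger mean square error than simple averages.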

    Quasiparticle Interference on the Surface of the Topological Insulator Bi$_2$Te$_3$

    Quasiparticle interference in spectroscopic-imaging scanning tunneling microscopy has been investigated for the surface states of the large-gap topological insulator Bi$_2$Te$_3$ using the T-matrix formalism. Both scalar potential scattering and spin-orbit scattering on the warped hexagonal isoenergy contour are considered. While backscattering is forbidden by time-reversal symmetry, other scattering processes are allowed and depend strongly on the spin configurations of the eigenfunctions at the k points on the isoenergy contour. The characteristic scattering wavevectors found in our analysis agree well with recent experimental results. Comment: 5 pages, 2 figures; some typos are corrected.

    Generalised Umbral Moonshine

    Umbral moonshine describes an unexpected relation between 23 finite groups arising from lattice symmetries and special mock modular forms. It includes the Mathieu moonshine as a special case and can itself be viewed as an example of the more general moonshine phenomenon which connects finite groups and distinguished modular objects. In this paper we introduce the notion of generalised umbral moonshine, which includes the generalised Mathieu moonshine [Gaberdiel M.R., Persson D., Ronellenfitsch H., Volpato R., Commun. Number Theory Phys. 7 (2013), 145-223] as a special case, and provide supporting data for it. A central role is played by the deformed Drinfel'd (or quantum) double of each umbral finite group $G$, specified by a cohomology class in $H^3(G,U(1))$. We conjecture that in each of the 23 cases there exists a rule to assign an infinite-dimensional module for the deformed Drinfel'd double of the umbral finite group underlying the mock modular forms of umbral moonshine and generalised umbral moonshine. We also discuss the possible origin of the generalised umbral moonshine.