136 research outputs found

    Accurate genotype imputation in multiparental populations from low-coverage sequence

    Many different types of multiparental populations have recently been produced to increase genetic diversity and resolution in QTL mapping. Low-coverage genotyping-by-sequencing (GBS) technology has become a cost-effective tool in these populations, despite large amounts of missing data in offspring and founders. In this work, we present a general statistical framework for genotype imputation in such experimental crosses from low-coverage GBS data. Generalizing a previously developed hidden Markov model for calculating ancestral origins of offspring DNA, we present an imputation algorithm that does not require parental data and that is applicable to bi- and multiparental populations. Our imputation algorithm allows heterozygosity of parents and offspring as well as error correction in observed genotypes. Further, our approach can combine imputation and genotype calling from sequencing reads, and it also applies to called genotypes from SNP array data. We evaluate our imputation algorithm using simulated and real data sets in four different types of populations: the F2, the advanced intercross recombinant inbred lines, the multiparent advanced generation intercross, and the cross-pollinated population. Because our approach uses marker data and population design information efficiently, comparisons with previous approaches show that our imputation is accurate even at very low (< 1×) sequencing depth, in addition to providing accurate genotype phasing and error detection.

    Genomic prediction of grain yield and drought-adaptation capacity in sorghum is enhanced by multi-trait analysis

    Grain yield and the stay-green drought adaptation trait are important targets of selection in grain sorghum breeding for broad adaptation to a range of environments. Genomic prediction for these traits may be enhanced by joint multi-trait analysis. The objectives of this study were to assess the capacity of multi-trait models to improve genomic prediction of parental breeding values for grain yield and stay-green in sorghum by using information from correlated auxiliary traits, and to determine the combinations of traits that optimize predictive results in specific scenarios. The dataset included phenotypic performance of 2645 testcross hybrids across 26 environments as well as genomic and pedigree information on their female parental lines. The traits considered were grain yield (GY), stay-green (SG), plant height (PH), and flowering time (FT). We evaluated the improvement in predictive performance of multi-trait G-BLUP models relative to single-trait G-BLUP. The use of a blended kinship matrix exploiting pedigree and genomic information was also explored to optimize multi-trait predictions. Predictive ability for GY increased by up to 16% when PH information on the training population was exploited through multi-trait genomic analysis. For SG prediction, full advantage from multi-trait G-BLUP was obtained only when GY information was also available on the predicted lines per se, with predictive ability improvements of up to 19%. Predictive ability, unbiasedness and accuracy of predictions from conventional multi-trait G-BLUP were further optimized by using a combined pedigree-genomic relationship matrix. Results of this study suggest that multi-trait genomic evaluation combining routinely measured traits may be used to improve prediction of crop productivity and drought adaptability in grain sorghum.
    EEA Pergamino
    Affiliation: Velazco, Julio. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria Pergamino. Sección Forrajeras; Argentina. Wageningen University and Research. Biometris – Mathematical and Statistical Methods; Netherlands
    Affiliation: Jordan, David R. The University of Queensland. Hermitage Research Facility. Queensland Alliance for Agriculture and Food Innovation; Australia
    Affiliation: Mace, Emma S. The University of Queensland. Hermitage Research Facility. Queensland Alliance for Agriculture and Food Innovation; Australia. Hermitage Research Facility. Department of Agriculture and Fisheries; Australia
    Affiliation: Hunt, Colleen H. The University of Queensland. Hermitage Research Facility. Queensland Alliance for Agriculture and Food Innovation; Australia. Hermitage Research Facility. Department of Agriculture and Fisheries; Australia
    Affiliation: Malosetti, Marcos. Wageningen University and Research. Biometris – Mathematical and Statistical Methods; Netherlands
    Affiliation: Eeuwijk, Fred A. van. Wageningen University and Research. Biometris – Mathematical and Statistical Methods; Holand

    Southeast of What? Reflections on SEALS' Success

    In epidemiologic studies, measurement error in dietary variables often attenuates the association between dietary intake and disease occurrence. To adjust for the attenuation caused by error in dietary intake, regression calibration is commonly used. To apply regression calibration, unbiased reference measurements are required. Short-term reference measurements for foods that are not consumed daily contain excess zeroes that pose challenges in the calibration model. We adapted a two-part regression calibration model, initially developed for multiple replicates of reference measurements per individual, to a single-replicate setting. We showed how to handle excess zero reference measurements with a two-step modeling approach, how to explore heteroscedasticity in the consumed amount with a variance-mean graph, how to explore nonlinearity with the generalized additive modeling (GAM) and empirical logit approaches, and how to select covariates in the calibration model. The performance of the two-part calibration model was compared with that of its one-part counterpart. We used vegetable intake and mortality data from the European Prospective Investigation into Cancer and Nutrition (EPIC) study. In EPIC, reference measurements were taken with 24-hour recalls. For each of the three vegetable subgroups assessed separately, correcting for error with an appropriately specified two-part calibration model resulted in about a threefold increase in the strength of association with all-cause mortality, as measured by the log hazard ratio. We further found that the standard way of including covariates in the calibration model can lead to overfitting the two-part calibration model. Moreover, the extent of adjusting for error is influenced by the number and forms of covariates in the calibration model. For episodically consumed foods, we advise researchers to pay special attention to response distribution, nonlinearity, and covariate inclusion in specifying the calibration model.
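The two-part structure described in this abstract can be written out generically as follows. This is only a sketch under stated assumptions: the symbols R (a single reference measurement, such as a 24-hour recall), X (questionnaire intake and other covariates), the logistic link in the first part, and the unspecified function g in the second part are illustrative choices, not the authors' reported specification. The last line is the general identity for a non-negative R that combines the two parts into the calibrated intake.

```latex
\begin{align}
\operatorname{logit} P(R > 0 \mid X) &= \alpha_0 + \alpha^{\top} X, \\
E[R \mid R > 0, X] &= g\!\left(\beta_0 + \beta^{\top} X\right), \\
E[R \mid X] &= P(R > 0 \mid X)\, E[R \mid R > 0, X].
\end{align}
```

A one-part model, by contrast, would model E[R | X] directly and ignore the excess zeroes.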

    Reconstruction of Networks with Direct and Indirect Genetic Effects

    Genetic variance of a phenotypic trait can originate from direct genetic effects, or from indirect effects, i.e., through genetic effects on other traits, affecting the trait of interest. This distinction is often of great importance, for example, when trying to improve crop yield and simultaneously control plant height. As suggested by Sewall Wright, assessing contributions of direct and indirect effects requires knowledge of (1) the presence or absence of direct genetic effects on each trait, and (2) the functional relationships between the traits. Because experimental validation of such relationships is often unfeasible, it is increasingly common to reconstruct them using causal inference methods. However, most current methods require all genetic variance to be explained by a small number of quantitative trait loci (QTL) with fixed effects. Only a few authors have considered the “missing heritability” case, where contributions of many undetectable QTL are modeled with random effects. Usually, these are treated as nuisance terms that need to be eliminated by taking residuals from a multi-trait mixed model (MTM). But fitting such an MTM is challenging, and it is impossible to infer the presence of direct genetic effects. Here, we propose an alternative strategy, where genetic effects are formally included in the graph. This has important advantages: (1) genetic effects can be directly incorporated in causal inference, implemented via our PCgen algorithm, which can analyze many more traits; and (2) we can test the existence of direct genetic effects, and improve the orientation of edges between traits. Finally, we show that reconstruction is much more accurate if individual plant or plot data are used, instead of genotypic means. We have implemented the PCgen-algorithm in the R-package pcgen.

    Determinants of barley grain yield in drought-prone Mediterranean environments

    The determinants of barley grain yield in drought-prone Mediterranean environments have been studied in the Nure x Tremois (NT) population. A large set of yield and other morpho-physiological data were recorded in 118 doubled haploid (DH) lines of the population in multi-environment field trials (18 site-year combinations). Agrometeorological variables were also recorded and calculated at each site. Four main periods of barley development were considered (vegetative, reproductive, early grain filling and late grain filling) to dissect the effect of the growth phases on yield traits. Relationships between agrometeorological variables, grain yield (GY) and its main components (GN and GW) were also investigated by correlation. First, the results gave a clear indication of the involvement of water consumption, calculated from sowing to the early grain filling period, in determining GY and GW (r2=0.616, P=0.007 and r2=0.703, P=0.005, respectively), while GN showed its highest correlation with the total photothermal quotient (PQ) calculated for the same period (r2=0.646, P=0.013). With the sole exception of total PQ calculated during the vegetative period, all significant correlations with GY were associated with water-dependent agrometeorological parameters. Second, the NT segregating population allowed us to weigh the amount of interaction due to genotypes over environments, or to environments in relation to genotypes, by a GGE analysis; 47.67% of the G+GE sum of squares was explained by the first two principal components. Then, the introduction of genomic information at major barley genes regulating the length of the growth cycle allowed us to explain patterns of adaptation of different groups of NT lines according to the variants (alleles) harbored at the vernalization (Vrn-H1) gene in combination with the earliness (Eam6) gene. The superiority of the lines carrying the Nure allele at Eam6 was confirmed by a factorial ANOVA testing the four possible haplotypes obtained by combining alternative alleles at Eam6 and Vrn-H1. Maximum yield potential and differentials among the NT genotypes were finally explored through the Finlay-Wilkinson model to interpret grain yield of NT genotypes together with yield adaptability (Ya), taken as the regression coefficient bi; Ya ranged from 0.71 for NT77 to 1.20 for NT19. Lines harboring the Nure variants at both genes were the highest yielding (3.04 t ha⁻¹) and showed the highest yield adaptability (bi=1.05). The present study constitutes a starting point towards the introduction of genomic variables in agronomic models for barley grain yield in Mediterranean environments.
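The Finlay-Wilkinson regression mentioned in this abstract can be illustrated with a minimal sketch. It assumes a complete genotype-by-environment table of mean yields and uses the common textbook form in which each genotype's yield is regressed on an environmental index (environment mean minus grand mean); the function name and the toy data are hypothetical, and the authors' exact fitting procedure is not stated in the abstract.

```python
import numpy as np

def finlay_wilkinson_slopes(yields):
    """Return one regression coefficient b_i per genotype.

    yields: 2-D array-like, rows = genotypes, columns = environments.
    Assumes a complete table (no missing cells); this is an illustrative
    simplification, not the procedure reported in the study.
    """
    y = np.asarray(yields, dtype=float)
    # Environmental index: environment mean minus grand mean (sums to zero).
    env_index = y.mean(axis=0) - y.mean()
    denom = np.sum(env_index ** 2)
    if denom == 0.0:
        raise ValueError("environments do not differ; slopes are undefined")
    # Center each genotype on its own mean, then take the least-squares slope.
    centered = y - y.mean(axis=1, keepdims=True)
    return centered @ env_index / denom

# Toy data (hypothetical): 3 genotypes in 3 environments.
toy = [[1.0, 2.0, 3.0],
       [2.0, 4.0, 6.0],
       [3.0, 3.0, 3.0]]
slopes = finlay_wilkinson_slopes(toy)
print(slopes)
```

With this index, the slopes average to 1 across genotypes in a complete table, so a value above 1 indicates a genotype that responds more strongly than average to better environments.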

    A two-stage approach for the spatio-temporal analysis of high-throughput phenotyping data

    High throughput phenotyping (HTP) platforms and devices are increasingly used for the characterization of growth and developmental processes for large sets of plant genotypes. Such HTP data require challenging statistical analyses in which longitudinal genetic signals need to be estimated against a background of spatio-temporal noise processes. We propose a two-stage approach for the analysis of such longitudinal HTP data. In a first stage, we correct for design features and spatial trends per time point. In a second stage, we focus on the longitudinal modelling of the spatially corrected data, thereby taking advantage of shared longitudinal features between genotypes and plants within genotypes. We propose a flexible hierarchical three-level P-spline growth curve model, with plants/plots nested in genotypes, and genotypes nested in populations. For selection of genotypes in a plant breeding context, we show how to extract new phenotypes, like growth rates, from the estimated genotypic growth curves and their first-order derivatives. We illustrate our approach on HTP data from the PhenoArch greenhouse platform at INRAE Montpellier and the outdoor Field Phenotyping platform at ETH Zürich.
    Funding:
    Ministerio de Ciencia, Innovación y Universidades | Ref. BCAM Severo Ochoa accreditation SEV-2017-0718
    Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung | Ref. project PhenoCOOL (project no. 169542)
    Horizon 2020 Framework Programme | Ref. grant agreement ID 731013 (EPPN2020)
    Ministerio de Ciencia, Innovación y Universidades | Ref. MTM2017-82379-

    Dynamics of senescence-related QTLs in potato

    The study of the expression of quantitative traits over time helps to understand developmental processes that occur in the course of the growing season. Temperature and other environmental factors play an important role. The dynamics of haulm senescence were observed in a diploid potato mapping population in two consecutive years (2004 and 2005) under field conditions in Finland. The available time series data were used in a smoothed generalized linear model to characterize curves describing the senescence development in terms of its onset, mean and maximum progression rate, and inflection point. These characteristics, together with the individual time points, were used in a quantitative trait loci (QTL) analysis. Although QTLs occurring early in the sene

    Combining pedigree and genomic information to improve prediction quality: an example in sorghum

    Key Message: The use of a kinship matrix integrating pedigree- and marker-based relationships optimized the performance of genomic prediction in sorghum, especially for traits of lower heritability. Abstract: Selection based on genome-wide markers has become an active breeding strategy in crops. Genomic prediction models can make use of pedigree information to account for the residual polygenic effects not captured by markers. Our aim was to evaluate the impact of using pedigree and genomic information on prediction quality of breeding values for different traits in sorghum. We explored BLUP models that use weighted combinations of pedigree and genomic relationship matrices. The optimal weighting factor was empirically determined in order to maximize predictive ability after evaluating a range of candidate weights. The phenotypic data consisted of testcross evaluations of sorghum parental lines across multiple environments. All lines were genotyped, and full pedigree information was available. The performance of the best predictive combined matrix was compared to that of models fitting the component matrices independently. Model performance was assessed using cross-validation. Fitting a combined pedigree–genomic matrix with the optimal weight always yielded the largest increases in predictive ability and the largest reductions in prediction bias relative to the simple G-BLUP. However, the weight that optimized prediction varied across traits. The benefits of including pedigree information in the genomic model were more relevant for traits with lower heritability, such as grain yield and stay-green. Our results suggest that the combination of pedigree and genomic relatedness can be used to optimize predictions of complex traits in crops when the additive variation is not fully explained by markers.
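The weighted combination of relationship matrices and the empirical search over candidate weights can be sketched as below. This is a minimal illustration that assumes a convex combination K = w·A + (1 − w)·G of a pedigree matrix A and a genomic matrix G; the abstract does not give the exact weighting formula, and the function names, the `score` callable, and the toy matrices are all hypothetical. In practice `score` would be something like a cross-validated predictive ability from a BLUP fit, which is not implemented here.

```python
import numpy as np

def blend_kinship(A, G, w):
    """Return the blended matrix w * A + (1 - w) * G.

    A: pedigree relationship matrix, G: genomic relationship matrix
    (same shape). The convex-combination form is an assumption made
    for illustration.
    """
    A = np.asarray(A, dtype=float)
    G = np.asarray(G, dtype=float)
    if A.shape != G.shape:
        raise ValueError("A and G must have the same shape")
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * A + (1.0 - w) * G

def best_weight(A, G, weights, score):
    """Evaluate each candidate weight and return (best_w, best_score).

    score: hypothetical callable mapping a kinship matrix to a number
    to maximize (for example, cross-validated predictive ability).
    """
    scores = [score(blend_kinship(A, G, w)) for w in weights]
    best = int(np.argmax(scores))
    return weights[best], scores[best]

# Toy example (hypothetical matrices and a stand-in score function).
A = np.eye(2)
G = np.array([[1.0, 0.5], [0.5, 1.0]])
toy_score = lambda K: -abs(K[0, 1] - 0.25)  # placeholder, not a real CV score
print(best_weight(A, G, [0.0, 0.5, 1.0], toy_score))
```

Setting w = 0 recovers a purely genomic matrix and w = 1 a purely pedigree-based one, so both component models are included in the same grid of candidate weights.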