128 research outputs found

    On Flexible finite polygenic models for multiple-trait evaluation

    Finite polygenic models (FPM) might be an alternative to the infinitesimal model (TIM) for the genetic evaluation of pedigreed multiple-generation populations for multiple quantitative traits. I present a general, flexible Bayesian method that includes the number of genes in the FPM as an additional random variable. Markov chain Monte Carlo techniques such as Gibbs sampling and the reversible jump sampler are used for implementation. Genotypes of all genes in the FPM are sampled via segregation indicators. A broad range of FPMs, some combined with TIM, are tested empirically for the estimation of variance components and the number of genes in the FPM. Four simulation scenarios were studied, combining genetic models with 5 or 50 additive independent diallelic genes affecting the traits with either random selection or selection on one of the traits. The results are based on ten replicates per simulation scenario. In the case of random selection, uniform priors on additive gene effects led to posterior mean estimates of genetic variance that were positively correlated with the number of genes fitted in the FPM. In the case of trait selection, assuming normal priors on gene effects also led to genetic variance estimates for the selected trait that were negatively correlated with the number of genes in the FPM. This negative correlation was not observed for the unselected trait. Treating the number of genes in the FPM as random revealed a positive correlation between prior and posterior mean estimates of this number, but the prior hardly affected the posterior estimates of genetic variance. Posterior inferences about the number of genes should be considered indicative; trait selection seems to improve the power to distinguish between TIM and FPM.
Based on the results of this study, I suggest not replacing TIM by the FPM, but combining TIM and FPM with the number of genes treated as random, to provide a highly flexible and thereby robust method for variance component estimation in pedigreed populations. Further study is required to explore the full potential of these models under different genetic model assumptions.
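The simulated genetic models in this abstract (5 or 50 additive, independent, diallelic genes) can be illustrated with a minimal sketch. It is not the author's simulation: the function names, the allele frequency of 0.5, the equal gene effects, and the assumption of Hardy-Weinberg proportions are all illustrative choices. Under those assumptions, the additive variance contributed by one gene with allele frequency p and allele effect a is 2p(1-p)a^2, and the contributions of independent genes add up.

```python
import random
from statistics import pvariance


def expected_additive_variance(freqs, effects):
    """Additive variance of independent diallelic genes: sum of 2*p*(1-p)*a^2."""
    return sum(2.0 * p * (1.0 - p) * a * a for p, a in zip(freqs, effects))


def scaled_effects(n_genes, freq, target_variance):
    """Equal effects (an assumption) so that n_genes genes give target_variance."""
    per_gene = target_variance / n_genes
    a = (per_gene / (2.0 * freq * (1.0 - freq))) ** 0.5
    return [a] * n_genes


def simulate_genetic_values(freqs, effects, n_individuals, rng):
    """Draw 0, 1 or 2 allele copies per gene and sum the allele effects."""
    values = []
    for _ in range(n_individuals):
        total = 0.0
        for p, a in zip(freqs, effects):
            # Two independent allele draws per gene (Hardy-Weinberg assumption).
            copies = (rng.random() < p) + (rng.random() < p)
            total += copies * a
        values.append(total)
    return values


if __name__ == "__main__":
    rng = random.Random(1)
    for n_genes in (5, 50):
        freqs = [0.5] * n_genes
        effects = scaled_effects(n_genes, 0.5, 1.0)
        values = simulate_genetic_values(freqs, effects, 5000, rng)
        print(n_genes, expected_additive_variance(freqs, effects), pvariance(values))
```

With the total variance held fixed, the per-gene effect shrinks as the number of genes grows, which is why a model with many genes approaches the infinitesimal model.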

    Learning from data: Plant breeding applications of machine learning

    Increasingly, new sources of data are being incorporated into plant breeding pipelines. The enormous amounts of data from field phenomics and genotyping technologies place data mining and analysis on a completely different level, one that is challenging from both practical and theoretical standpoints. Intelligent decision-making relies on our ability to extract useful information from data that may help us achieve our goals more efficiently. Many plant breeders, agronomists and geneticists perform analyses without knowing the relevant underlying assumptions, strengths or pitfalls of the methods employed. This study assesses the statistical learning properties and plant breeding applications of supervised and unsupervised machine learning techniques. A soybean nested association panel (SoyNAM) was the base population for experiments designed in situ and in silico. We used mixed models and Markov random fields to evaluate phenotypic-genotypic-environmental associations among traits and the learning properties of genome-wide prediction methods. Alternative methods of analysis were proposed.

    Gene mapping using linkage disequilibrium


    Inferences on the genetic control of quantitative traits from selection experiments


    Statistical perspectives on dependencies between genomic markers

    To study the genetic impact on a quantitative trait, molecular markers are used as predictor variables in a statistical model. This habilitation thesis elucidates the challenges associated with such investigations. First, the usefulness of including different kinds of genetic effects, which can be additive or non-additive, was verified. Second, dependencies between markers caused by their proximity on the genome were studied in populations with family stratification. The resulting covariance matrix deserved special attention because it serves multiple functions in several fields of genomic evaluation.

    NATURAL AND ANTHROPOGENIC DRIVERS OF TREE EVOLUTIONARY DYNAMICS

    Species of trees inhabit diverse and heterogeneous environments, and often play important ecological roles in their communities. As a result of their vast ecological breadth, trees have become adapted to various environmental pressures. In this dissertation I examine various environmental factors that drive evolutionary dynamics in three Pinus species in California and Nevada, USA. In chapter two, I assess the influence of management by thinning, fire, and their interaction on fine-scale gene flow within fire-suppressed populations of Pinus lambertiana, a historically dominant and ecologically important member of mixed-conifer forests of the Sierra Nevada, California. Here, I find evidence that treatment prescription differentially affects fine-scale genetic structure and effective gene flow in this species. In my third chapter, I describe the development of a dense linkage map for Pinus balfouriana, which I use in chapter four to assess the quantitative trait locus (QTL) landscape of water-use efficiency across two isolated ranges of the species. I find evidence that precipitation-related variables structure the geographical range of P. balfouriana, that traits related to water-use efficiency are heritable and differentiated across populations, and that the associated QTLs underlying this phenotypic variation explain large proportions of total variation. In chapter five, I assess evidence for local adaptation to the eastern Sierra Nevada rain shadow within P. albicaulis across fine spatial scales of the Lake Tahoe Basin, USA. Here, genetic variation in traits related to water availability was more strongly structured across populations than neutral variation, and loci identified by genome-wide association methods show elevated signals of local adaptation that track soil water availability. In chapter six, I review theory related to polygenic local adaptation and the literature on genotype-phenotype associations in trees.
I find that the evidence suggests a polygenic basis for many traits important to conservation and industry, and I suggest paths forward for best describing such genetic bases in tree species. Overall, my results show that the spatial and genetic structure of trees is often driven by their environment, and that ongoing selective pressures driven by environmental change will continue to be important in these systems.

    Bayesian Adaptive Markov Chain Monte Carlo Estimation of Genetic Parameters

    Accurate estimation of genetic parameters is crucial for an efficient genetic evaluation system. REML and Bayesian methods are commonly used to estimate genetic parameters. The Bayesian approach combines what is known about a parameter, expressed as a prior probability distribution, with the information in the data to obtain a posterior distribution of the parameter of interest. Here a new, fast adaptive Markov chain Monte Carlo (MCMC) sampling algorithm is proposed. It combines a hybrid Gibbs sampler with the Metropolis-Hastings (M-H) algorithm to estimate genetic parameters in linear mixed models with several random effects. The algorithm has two steps: in step 1 the hybrid Gibbs sampler is used to learn an efficient proposal covariance structure for the variance components, and in step 2 the M-H algorithm proposes new values based on the covariance structure learned in step 1. Dependencies among the random effects normally slow the convergence of the MCMC chain, so in the second step those random effects are marginalized out of the likelihood to improve mixing. The new algorithm showed good mixing properties and was about twice as fast as hybrid Gibbs sampling in producing posterior distributions for the variance components. It was also able to detect multiple modes in the posterior distribution. Moreover, the newly proposed exponential prior for variance components yielded a posterior mode of zero for the dominance variance when no dominance was present. The performance of the method was illustrated with field data and simulated data sets.
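The two-step idea in this abstract (first learn a proposal covariance, then run Metropolis-Hastings with it) can be sketched on a toy problem. This is not the thesis algorithm. A pilot random-walk Metropolis run stands in for the hybrid Gibbs sampler of step 1, a correlated bivariate normal stands in for the posterior of the variance components, and the 2.38^2/d proposal scaling is a common rule of thumb rather than anything stated in the source. All function names are hypothetical.

```python
import math
import random


def log_target(x, rho=0.9):
    """Toy log-density (up to a constant): bivariate normal with correlation rho."""
    a, b = x
    return -(a * a - 2.0 * rho * a * b + b * b) / (2.0 * (1.0 - rho * rho))


def covariance_2d(samples):
    """Sample covariance (sxx, sxy, syy) of a list of 2-D points."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return sxx, sxy, syy


def cholesky_2d(cov):
    """Lower-triangular factor (l11, l21, l22) of a 2x2 covariance matrix."""
    sxx, sxy, syy = cov
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 * l21)
    return l11, l21, l22


def metropolis(log_p, start, chol, n_iter, rng):
    """Random-walk Metropolis with proposal x + L z, where z ~ N(0, I)."""
    l11, l21, l22 = chol
    x, lp = start, log_p(start)
    chain, accepted = [], 0
    for _ in range(n_iter):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        prop = (x[0] + l11 * z1, x[1] + l21 * z1 + l22 * z2)
        lp_prop = log_p(prop)
        # Symmetric proposal, so the acceptance ratio is the density ratio.
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter


def two_stage_sampler(log_p, start, n_pilot, n_main, rng):
    """Stage 1 learns a proposal covariance; stage 2 samples with it held fixed."""
    # Stage 1 (stand-in for the hybrid Gibbs step): pilot run, spherical proposal.
    pilot, _ = metropolis(log_p, start, (1.0, 0.0, 1.0), n_pilot, rng)
    cov = covariance_2d(pilot[n_pilot // 2:])  # discard the first half as burn-in
    # Stage 2: scale the learned covariance by 2.38^2 / d with d = 2 (assumption).
    scale = 2.38 ** 2 / 2.0
    chol = cholesky_2d(tuple(scale * c for c in cov))
    return metropolis(log_p, pilot[-1], chol, n_main, rng)


if __name__ == "__main__":
    chain, rate = two_stage_sampler(log_target, (0.0, 0.0), 2000, 20000,
                                    random.Random(0))
    print("acceptance rate:", rate)
    print("covariance (sxx, sxy, syy):", covariance_2d(chain))
```

Because the proposal covariance is frozen before stage 2 begins, the second stage is an ordinary Metropolis-Hastings chain, and the learned covariance only affects how efficiently it mixes.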

    The genetic architecture of psychiatric disorders
