53 research outputs found
Predicting Genetic Values: A Kernel-Based Best Linear Unbiased Prediction With Genomic Data
Genomic data provide a valuable source of information for modeling covariance structures, allowing a more accurate prediction of total genetic values (GVs). We apply the kriging concept, originally developed in the geostatistical context for predictions in the low-dimensional space, to the high-dimensional space spanned by genomic single nucleotide polymorphism (SNP) vectors and study its properties in different gene-action scenarios. Two different kriging methods ["universal kriging" (UK) and "simple kriging" (SK)] are presented. As a novelty, we suggest use of the family of Matérn covariance functions to model the covariance structure of SNP vectors. A genomic best linear unbiased prediction (GBLUP) is applied as a reference method. The three approaches are compared in a whole-genome simulation study considering additive, additive-dominance, and epistatic gene-action models. Predictive performance is measured in terms of correlation between true and predicted GVs and average true GVs of the individuals ranked best by prediction. We show that UK outperforms GBLUP in the presence of dominance and epistatic effects. In a limiting case, it is shown that the genomic covariance structure proposed by VanRaden (2008) can be considered as a covariance function with corresponding quadratic variogram. We also prove theoretically that if a specific linear relationship exists between covariance matrices for two linear mixed models, the GVs resulting from BLUP are linked by a scaling factor. Finally, the relation of kriging to other models is discussed and further options for modeling the covariance structure, which might be more appropriate in the genomic context, are suggested.
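As a rough illustration of what a Matérn covariance on SNP vectors could look like, the sketch below builds a covariance matrix from toy genotypes. It is not the authors' implementation: the smoothness value nu = 3/2 (chosen because it has a simple closed form), the Euclidean distance between SNP vectors, the `scale` value, and all function names are assumptions made for this example.

```python
import math


def matern_32(distance, scale, variance=1.0):
    """Matérn covariance for smoothness nu = 3/2 (closed form).

    The abstract only names the Matérn family; nu = 3/2 is an assumption.
    """
    a = math.sqrt(3.0) * distance / scale
    return variance * (1.0 + a) * math.exp(-a)


def euclidean(x, y):
    """Euclidean distance between two SNP vectors (assumed distance measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def matern_covariance_matrix(snps, scale, variance=1.0):
    """Covariance matrix whose (i, j) entry depends only on the
    distance between the SNP vectors of individuals i and j."""
    n = len(snps)
    return [
        [matern_32(euclidean(snps[i], snps[j]), scale, variance) for j in range(n)]
        for i in range(n)
    ]


# Toy genotypes coded as 0/1/2 copies of one allele (hypothetical data).
snps = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [2, 0, 0, 2],
]
K = matern_covariance_matrix(snps, scale=2.0)
```

Individuals 0 and 1 differ at a single marker, so their covariance is larger than that between individuals 0 and 2, which differ at every marker. In a kriging or BLUP setting a matrix of this kind would take the place of the genomic relationship matrix.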
Purchasing Behavior, Setting, Pricing, Family: Determinants of School Lunch Participation
Despite growing school lunch availability in Germany, its utilization is still low, and students resort to unhealthy alternatives. We investigated predictors of school lunch participation and reasons for nonparticipation in 1215 schoolchildren. Children reported meal habits, parents provided family-related information (like socioeconomic status), and anthropometry was conducted on-site in schools. Associations between school lunch participation and family-related predictors were estimated using logistic regression controlling for age and gender if necessary. School was added as a random effect. School lunch participation was primarily associated with family factors. While having breakfast on schooldays was positively associated with school lunch participation (ORadj = 2.20, p = 0.002), lower secondary schools (ORadj = 0.52, p < 0.001) and low SES (ORadj = 0.25, p < 0.001) were negatively associated. The main reasons for nonparticipation were school- and lunch-related factors (taste, time constraints, pricing). Parents reported pricing as crucial a reason as an unpleasant taste for nonparticipation. Nonparticipants bought sandwiches and energy drinks significantly more often on school days, whereas participants were less often affected by overweight (OR = 0.66, p = 0.043). Our data stress school- and lunch-related factors as an important opportunity to foster school lunch utilization
And yet Again: Having Breakfast Is Positively Associated with Lower BMI and Healthier General Eating Behavior in Schoolchildren
Given the high prevalence of childhood overweight, school-based programs aiming at nutritional behavior may be a good starting point for community-based interventions. Therefore, we investigated associations between school-related meal patterns and weight status in 1215 schoolchildren. Anthropometry was performed on-site in schools. Children reported their meal habits, and parents provided family-related information via questionnaires. Associations between nutritional behavior and weight status were estimated using hierarchical linear and logistic regression. Analyses were adjusted for age, socio-economic status, school type, migration background, and parental weight status. Having breakfast was associated with a lower BMI-SDS (βadj = −0.51, p = 0.004) and a lower risk of being overweight (ORadj = 0.30, p = 0.009), while having two breakfasts resulted in stronger associations (BMI-SDS: βadj = −0.66, p < 0.001; risk of overweight: ORadj = 0.22, p = 0.001). Likewise, children who regularly skipped breakfast on school days showed stronger associations (BMI-SDS: β = 0.49, p < 0.001; risk of overweight: OR = 3.29, p < 0.001) than children who skipped breakfast only occasionally (BMI-SDS: β = 0.43, p < 0.001; risk of overweight: OR = 2.72, p = 0.032). The associations persisted after controlling for parental SES and weight status. Therefore, our data confirm the school setting as a suitable starting point for community-based interventions and may underline the necessity of national programs providing free breakfast and lunch to children.
Using Whole-Genome Sequence Data to Predict Quantitative Trait Phenotypes in Drosophila melanogaster
Predicting organismal phenotypes from genotype data is important for plant and animal breeding, medicine, and evolutionary biology. Genomic-based phenotype prediction has been applied for single-nucleotide polymorphism (SNP) genotyping platforms, but not using complete genome sequences. Here, we report genomic prediction for starvation stress resistance and startle response in Drosophila melanogaster, using ~2.5 million SNPs determined by sequencing the Drosophila Genetic Reference Panel population of inbred lines. We constructed a genomic relationship matrix from the SNP data and used it in a genomic best linear unbiased prediction (GBLUP) model. We assessed predictive ability as the correlation between predicted genetic values and observed phenotypes by cross-validation, and found a predictive ability of 0.239±0.008 (0.230±0.012) for starvation resistance (startle response). The predictive ability of BayesB, a Bayesian method with internal SNP selection, was not greater than GBLUP. Selection of the 5% SNPs with either the highest absolute effect or variance explained did not improve predictive ability. Predictive ability decreased only when fewer than 150,000 SNPs were used to construct the genomic relationship matrix. We hypothesize that predictive power in this population stems from the SNP-based modeling of the subtle relationship structure caused by long-range linkage disequilibrium and not from population structure or SNPs in linkage disequilibrium with causal variants. We discuss the implications of these results for genomic prediction in other organisms.
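The workflow described above (a genomic relationship matrix, GBLUP, and predictive ability from cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses the VanRaden-style relationship matrix, a fixed variance ratio `lam` instead of estimated variance components, the training mean as the only fixed effect, five folds, and simulated toy data. All function names are hypothetical.

```python
import numpy as np


def vanraden_grm(M):
    """Genomic relationship matrix G = Z Z' / (2 * sum p(1-p)).

    M is an (individuals x SNPs) matrix coded 0/1/2; Z centers each
    column by twice its allele frequency.
    """
    p = M.mean(axis=0) / 2.0
    Z = M - 2.0 * p
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup_predict(G, y, train, test, lam):
    """Predict test-set values from training phenotypes.

    lam is the assumed ratio of residual to genetic variance; the mean
    is taken as the training average (a simplification).
    """
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    G_st = G[np.ix_(test, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G_st @ alpha


def cv_predictive_ability(G, y, n_folds, lam, rng):
    """Mean correlation between predictions and observed phenotypes
    across cross-validation folds."""
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    cors = []
    for k, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        pred = gblup_predict(G, y, train, test, lam)
        cors.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(cors))


# Simulated toy data (hypothetical): 60 lines, 500 SNPs, additive effects.
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(60, 500)).astype(float)
beta = rng.normal(0.0, 0.1, size=500)
y = M @ beta + rng.normal(0.0, 1.0, size=60)

G = vanraden_grm(M)
ability = cv_predictive_ability(G, y, n_folds=5, lam=1.0, rng=rng)
```

Here `ability` plays the role of the predictive ability reported in the abstract, although the value obtained from simulated data says nothing about the Drosophila results.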
Large-scale mapping of mutations affecting zebrafish development
BACKGROUND: Large-scale mutagenesis screens in the zebrafish employing the mutagen ENU have isolated several hundred mutant loci that represent putative developmental control genes. In order to realize the potential of such screens, systematic genetic mapping of the mutations is necessary. Here we report on a large-scale effort to map the mutations generated in mutagenesis screening at the Max Planck Institute for Developmental Biology by genome scanning with microsatellite markers. RESULTS: We have selected a set of microsatellite markers and developed methods and scoring criteria suitable for efficient, high-throughput genome scanning. We have used these methods to successfully obtain a rough map position for 319 mutant loci from the Tübingen I mutagenesis screen and subsequent screening of the mutant collection. For 277 of these the corresponding gene is not yet identified. Mapping was successful for 80% of the tested loci. By comparing 21 mutation and gene positions of cloned mutations we have validated the correctness of our linkage group assignments and estimated the standard error of our map positions to be approximately 6 cM. CONCLUSION: By obtaining rough map positions for over 300 zebrafish loci with developmental phenotypes, we have generated a dataset that will be useful not only for cloning of the affected genes, but also to suggest allelism of mutations with similar phenotypes that will be identified in future screens. Furthermore this work validates the usefulness of our methodology for rapid, systematic and inexpensive microsatellite mapping of zebrafish mutations.
Genomic Prediction for Quantitative Traits: Using Kernel Methods and Approaches Based on Complete Genome Sequences
The prediction of genetic values is of great importance in animal and plant breeding, personalized medicine, and evolutionary biology. Traditionally, genetic values are obtained by best linear unbiased prediction (BLUP) within a linear mixed model whose covariance structures can be computed from measures of relatedness between individuals. Nowadays, single nucleotide polymorphism (SNP) markers allow genomic information to be included in the model (genomic BLUP, GBLUP).
The prediction of random variables based on correlated data is also one of the most important areas of geostatistics. There, the so-called "kriging" approach is used, which consists of a BLUP approach with parameterized covariance functions. In this thesis, the kriging concept is transferred to genomic prediction. Using the family of Matérn covariance functions, kriging is compared with the GBLUP approach in a genome-wide simulation study. The results of the simulation study suggest that kriging is superior to the GBLUP approach in non-additive gene-action scenarios.
With the increasing availability of genome-wide sequence data, the methodological development of genome-based prediction methods has regained importance. This thesis contains the world's first study on phenotypic prediction using sequence data in a higher eukaryotic organism. The Drosophila melanogaster Genetic Reference Panel serves as the data basis and comprises sequences and phenotypic data of 157 inbred lines of the model organism Drosophila melanogaster. For the two traits "starvation resistance" and "startle response", moderate predictive accuracies are observed with GBLUP using 2.5 million SNPs. The predictive accuracy of a Bayesian method with internal SNP selection is not greater than the accuracy achieved by GBLUP, and the predictive accuracy of the GBLUP approach decreases only when fewer than 150,000 SNPs are used.
For a third trait ("chill coma recovery"), the GBLUP approach achieves only very low accuracies. Using differentiated analyses and a genome-wide association study that includes pairwise interactions between markers, two possible causes for the failure of the GBLUP approach are identified: the bimodal phenotypic distribution and an extensive network of epistatic interactions between SNPs.
It is known that the accuracy of genomic prediction is also influenced by the underlying structure of linkage disequilibrium (LD) between SNPs. Several, mostly approximate, formulas for the expected level of LD in populations of finite size already exist in the literature. This thesis proposes an alternative recursion formula that describes the development of LD over time, and a simulation study shows that the proposed formula is superior to the widely used formula of Sved in all parameter constellations considered. The theory of discrete-time Markov chains further allows the derivation of the expected LD at equilibrium, which in turn leads to a formula for the effective population size Ne. By analyzing the effect of the inexactness of the recursion formula on the equilibrium state, it can be shown that the resulting error in expected LD can be considerable. Using the human HapMap data set, it is also shown that the Ne estimator depends strongly on the distribution of the minor allele frequency underlying the SNPs selected for analysis.
This thesis covers a wide spectrum of investigations at the interfaces of statistics, animal breeding, and genetics. The results presented are of interest from both a practical and a methodological-statistical point of view.
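For orientation, the widely used approximation attributed to Sved, against which the thesis compares its own recursion, relates expected LD to effective population size as E[r²] ≈ 1 / (1 + 4·Ne·c), where c is the recombination rate between two loci. The sketch below shows only this reference formula and its inversion for Ne. It does not reproduce the alternative recursion proposed in the thesis, and the function names and example values are assumptions.

```python
def sved_expected_r2(ne, c):
    """Sved's approximation of expected LD (r^2) at equilibrium.

    ne: effective population size; c: recombination rate between loci.
    """
    return 1.0 / (1.0 + 4.0 * ne * c)


def ne_from_r2(r2, c):
    """Invert Sved's approximation to obtain an Ne estimate from r^2."""
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must lie in (0, 1]")
    if c <= 0.0:
        raise ValueError("c must be positive")
    return (1.0 / r2 - 1.0) / (4.0 * c)


# Example values (hypothetical): Ne = 1000 and c = 0.001 give r^2 = 1 / (1 + 4).
r2 = sved_expected_r2(ne=1000, c=0.001)
ne = ne_from_r2(r2, c=0.001)
```

Because the estimator divides by r², small changes in low r² values produce large changes in the estimated Ne, which is one reason the choice of SNPs entering the analysis matters.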
- …