Searching for interacting QTL in related populations of an outbreeding species
Many important crop species are outbreeding. In outbreeding species the search for genes affecting traits is complicated by the fact that in a single cross up to four alleles may be present at each locus. This paper is concerned with the search for interacting quantitative trait loci (QTL) in populations which have been obtained by crossing a number of parents. It will be assumed that the parents are unrelated, but the methods can be extended easily to allow a pedigree structure. The approach has two goals: (1) finding QTL that are interacting with other loci and also loci which behave additively; (2) finding parents which segregate at two or more interacting QTL. Large populations obtained by crossing these parents can be used to study interactions in detail. QTL analysis is carried out by means of regression on predictions of QTL genotypes
Rare variants contribute disproportionately to quantitative trait variation in yeast.
How variants with different frequencies contribute to trait variation is a central question in genetics. We use a unique model system to disentangle the contributions of common and rare variants to quantitative traits. We generated ~14,000 progeny from crosses among 16 diverse yeast strains and identified thousands of quantitative trait loci (QTLs) for 38 traits. We combined our results with sequencing data for 1011 yeast isolates to show that rare variants make a disproportionate contribution to trait variation. Evolutionary analyses revealed that this contribution is driven by rare variants that arose recently, and that negative selection has shaped the relationship between variant frequency and effect size. We leveraged the structure of the crosses to resolve hundreds of QTLs to single genes. These results refine our understanding of trait variation at the population level and suggest that studies of rare variants are a fertile ground for discovery of genetic effects
Analysis of epistasis in human complex traits
Thousands of genetic mutations have been associated with many human complex traits and diseases, improving our understanding of the biological mechanisms underlying these phenotypes.
The great majority of genetic association studies have focused exclusively on the direct effects of single mutations, ignoring possible interactions (epistasis). However, since genes operate within complex networks, interactions are expected to exist. The modelling of epistasis could further biological understanding, but the detection of such effects is complicated by a vast search space.
In this thesis, we present a new statistical method to detect genetic interactions affecting quantitative traits in large-scale datasets. Our approach is based on testing for an interaction between a variant and a polygenic score (PGS) comprising a group of other mutations. We develop a new computational algorithm for PGS construction, and show through simulations that this method is robust to false-positives while retaining statistical power.
We apply our approach to 97 quantitative traits in the UK Biobank (UKB) and find 144 independent interactions with the PGS for 52 different traits, including important variants known to affect disease risk at the APOE, FTO and LDLR genes.
We also develop a test to identify, for each variant interacting with the PGS, the variants driving that interaction. This recovers previously known interactions and identifies several novel signals, primarily for biomarker traits. Examples include a large network of genes (including ABO, ASGR1, FUT2, FUT6, PIGC and TREH) affecting alkaline phosphatase levels, and an interaction between IL33 and ALOX15 impacting eosinophil count, potentially implicated in asthma.
Lastly, we extend our analysis to a new dataset of imputed variation at HLA genes in the UKB and find, among others, a new interaction for glycated haemoglobin involving HLA-DQA1*03:01, an allele previously associated with diabetes.
Our results demonstrate the potential for detecting epistatic effects in presently-available genomic datasets. This can allow the uncovering of key 'core' genes modulating the impacts of other regions in the genome, as well as the identification of subgroups of interacting variants of likely functional relevance
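The variant-by-PGS interaction test described in the abstract above can be illustrated with a minimal sketch. It is an ordinary least-squares regression with an interaction term, run on simulated data, and is not the thesis's own algorithm. The function name `interaction_t_stat`, the effect sizes, and the use of a standard-normal PGS are all assumptions made for illustration. The PGS construction procedure the abstract mentions is not reproduced here.

```python
import numpy as np

def interaction_t_stat(y, g, pgs):
    """Fit y ~ 1 + g + pgs + g*pgs by OLS and return the t-statistic
    of the g*pgs coefficient (hypothetical helper, for illustration)."""
    X = np.column_stack([np.ones_like(y), g, pgs, g * pgs])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof              # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance of the estimates
    return beta[3] / np.sqrt(cov[3, 3])

# Simulated data; every number below is an assumption.
rng = np.random.default_rng(0)
n = 5000
g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2
pgs = rng.standard_normal(n)                    # stand-in polygenic score
noise = rng.standard_normal(n)

y_null = 0.2 * g + 0.5 * pgs + noise            # additive effects only
y_epi = y_null + 0.3 * g * pgs                  # adds a variant-by-PGS interaction

t_null = interaction_t_stat(y_null, g, pgs)
t_epi = interaction_t_stat(y_epi, g, pgs)
print(f"t (no interaction): {t_null:.2f}, t (with interaction): {t_epi:.2f}")
```

Testing one variant against a single aggregate score, rather than against every other variant individually, is what keeps the search space manageable in this kind of design.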
Detecting Major Genetic Loci Controlling Phenotypic Variability in Experimental Crosses
Traditional methods for detecting genes that affect complex diseases in humans or animal models, milk production in livestock, or other traits of interest, have asked whether variation in genotype produces a change in that trait's average value. But focusing on differences in the mean ignores differences in variability about that mean. The robustness, or uniformity, of an individual's character is not only of great practical importance in medical genetics and food production but is also of scientific and evolutionary interest (e.g., blood pressure in animal models of heart disease, litter size in pigs, flowering time in plants). We describe a method for detecting major genes controlling the phenotypic variance, referring to these as vQTL. Our method uses a double generalized linear model with linear predictors based on probabilities of line origin. We evaluate our method on simulated F2 and collaborative cross data, and on a real F2 intercross, demonstrating its accuracy and robustness to the presence of ordinary mean-controlling QTL. We also illustrate the connection between vQTL and QTL involved in epistasis, explaining how these concepts overlap. Our method can be applied to a wide range of commonly used experimental crosses and may be extended to genetic association more generally
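The overlap between vQTL and epistatic QTL mentioned in the abstract above can be seen in a small simulation. This is a sketch under stated assumptions, not the double generalized linear model the authors describe. The two loci `a` and `b`, their 0/1 coding, and the effect sizes are hypothetical. When locus `b` is ignored, an interaction between `a` and `b` shows up as a difference in phenotypic variance between the genotype groups at `a`, with almost no difference in the mean.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Two unlinked loci with a hypothetical 0/1 genotype coding.
a = rng.integers(0, 2, size=n)
b = rng.integers(0, 2, size=n)

# Locus b shifts the phenotype only when a == 1 (pure interaction);
# centring b keeps the mean the same in both a groups.
y = 1.0 * a * (b - 0.5) + 0.5 * rng.standard_normal(n)

# Summarise the phenotype by genotype at locus a, ignoring locus b.
mean_by_a = {k: y[a == k].mean() for k in (0, 1)}
var_by_a = {k: y[a == k].var(ddof=1) for k in (0, 1)}

print("means by genotype at a:    ", mean_by_a)
print("variances by genotype at a:", var_by_a)
```

In this setup the expected variance is 0.25 for `a == 0` and 0.5 for `a == 1`, so a test on means alone would miss locus `a`, while a variance-based test would flag it.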
Extensions and Improvements to Random Forests for Classification
The motivation of my dissertation is to address two weaknesses of Random Forests: first, the failure to detect genetic interactions between two single nucleotide polymorphisms (SNPs) in higher dimensions when the interacting SNPs both have weak main effects; and second, the difficulty of interpretation in comparison to parametric methods such as logistic regression, linear discriminant analysis, and linear regression.
We focus on detecting pairwise SNP interactions in genome case-control studies. We determine the parameter settings that optimize the detection of SNP interactions, improve the efficiency of Random Forests, and present an efficient filtering method. The filtering method is compared to leading methods and is shown to be computationally faster while retaining good detection power.
Random Forests allows us to identify clusters, outliers, and important features for subgroups of observations through the visualization of the proximities. We improve the interpretation of Random Forests through the proximities. The resulting new proximities are asymmetric, and the appropriate visualization requires an asymmetric model for interpretation. We propose a new visualization technique for asymmetric data and compare it to existing approaches