732 research outputs found

    Multi-environment QTL mixed models for drought stress adaptation in wheat

    Many quantitative trait loci (QTL) detection methods ignore QTL-by-environment interaction (QEI) and are limited in their accommodation of error and environment-specific variance. This paper outlines a mixed model approach using a recombinant inbred spring wheat population grown in six drought stress trials. Genotype estimates for yield, anthesis date and height were calculated using the best design and spatial effects model for each trial. Parsimonious factor analytic models best captured the variance-covariance structure, including genetic correlations, among environments. The 1RS.1BL rye chromosome translocation (from one parent), which decreased progeny yield by 13.8 g m⁻², was explicitly included in the QTL model. Simple interval mapping (SIM) was used in a genome-wide scan for significant QTL, where QTL effects were fitted as fixed environment-specific effects. All significant environment-specific QTL were subsequently included in a multi-QTL model and evaluated for main and QEI effects, with non-significant QEI effects being dropped. QTL effects (either consistent or environment-specific) included eight yield, four anthesis, and six height QTL. One yield QTL co-located with (or was linked to) an anthesis QTL, while another co-located with a height QTL. In the final multi-QTL model, only one QTL for yield (6 g m⁻²) was consistent across environments (no QEI), while the remaining QTL had significant QEI effects (average size per environment of 5.1 g m⁻²). Compared to single trial analyses, the described framework allowed explicit modelling and detection of QEI effects and incorporation of additional classification information about genotypes.

    A phylogenetic method to perform genome-wide association studies in microbes

    Genome-Wide Association Studies (GWAS) are designed to perform an unbiased search of genetic sequence data with the intent of identifying statistically significant associations with a phenotype or trait of interest. The application of GWAS methods to microbial organisms promises to improve the way we understand, manage, and treat infectious diseases. Yet, while microbial pathogens continue to undermine human health, wealth, and longevity, microbial GWAS methods remain unable to fully capitalise on the growing wealth of bacterial and viral genetic sequence data. Clonal population structure and homologous recombination in microbial organisms make it difficult for existing GWAS methods to achieve both the precision needed to reject false positive findings and the statistical power required to detect genuine associations between microbial genotypic and phenotypic variants. In this thesis, we investigate potential solutions to the most substantial methodological challenges in microbial GWAS, and we introduce a new phylogenetic GWAS approach that has been specifically designed for use in bacterial samples. In presenting our approach, we describe the features that render it robust to the confounding effects of both population structure and recombination, while maintaining high statistical power to detect associations. Our approach is applicable to organisms ranging from purely clonal to frequently recombining, to sequence data from both the core and accessory genome, and to binary, categorical, and continuous phenotypes. We also describe the efforts taken to make our method efficient, scalable, and accessible in its implementation within the open-source R package we have created, called treeWAS. Next, we apply our GWAS method to simulated datasets. We develop multiple frameworks for simulating genotypic and phenotypic data with control over relevant parameters. 
We then present the results of our simulation study, and we use thorough performance testing to demonstrate the power and specificity of our approach, as compared to the performance of alternative cluster-based and dimension-reduction methods. Our approach is then applied to three empirical datasets, from Neisseria gonorrhoeae and Neisseria meningitidis, where we identify core SNPs associated with binary drug resistance and continuous antibiotic minimum inhibitory concentration phenotypes, as well as both core SNP and accessory genome associations with invasive and commensal phenotypes. These applications illustrate the versatility and potential of our method, demonstrating in each case that our approach is capable of confirming known resistance- or virulence-associated loci and discovering novel associations. Our thesis concludes with a review of the previous chapters and an evaluation of the strengths and limitations displayed by the current implementation of our phylogenetic approach to association testing. We discuss key areas for further development, and we propose potential solutions to advance the development of microbial GWAS in future work.

    Genetic susceptibility to the metabolic syndrome

    Honour roll of the Faculté des études supérieures et postdoctorales, 2004-2005. The metabolic syndrome is a cluster of interrelated cardiovascular risk factors co-occurring in the same individual. People with this syndrome are at increased risk of developing diabetes mellitus and cardiovascular diseases. Accordingly, it is important to elucidate the genetic aetiology governing this trait in order to better comprehend its pathogenesis. In the present thesis, heritability and complex segregation analyses as well as candidate gene and genome-wide scan approaches have been applied to shed some light on the genetic architecture of the metabolic syndrome and its individual components. A total of three candidate genes have been investigated, including peroxisome proliferator-activated receptor (PPAR) α and PPARγ as well as phospholipid transfer protein (PLTP). It has been shown that polymorphisms in both PPARα and PLTP genes are significantly associated with several indices of adiposity. In addition, significant gene-gene interactions have been observed between PPARα and PPARγ on glucose/insulin parameters. It has also been shown that the HDL2-cholesterol response to gemfibrozil therapy is modulated by the PPARα L162V polymorphism. Genome-wide linkage scans have been performed on lipid and lipoprotein concentrations. Many chromosome regions harbouring lipoprotein/lipid genes have been identified, including 1q43, 11q13 q24, 15q26.1, and 19q13.32 for LDL-cholesterol, 12q14.1 for HDL-cholesterol, 2p14, 11p13, and 11q24.1 for triglycerides, 18q21.32 for LDL-apolipoprotein (apo) B, and 3p25.2 for apoAI. The genetic contribution to the variation in LDL peak particle diameter (LDL-PPD) has also been investigated. Overall, the results indicate: 1) that LDL-PPD strongly aggregates within families, with heritability estimates above 50%; 2) the existence of a major gene effect affecting the phenotype; and 3) the presence of a major quantitative trait locus located on chromosome 17q. The apo H gene, a positional candidate gene, was then significantly associated with LDL-PPD, suggesting that this gene is responsible for the linkage signal observed on 17q. Finally, factor analyses have been used to construct a quantitative metabolic syndrome variable, and a genome-wide linkage scan has been conducted to identify the genomic regions underlying this trait. A major quantitative trait locus has been observed on chromosome 15q, suggesting that a gene within this region contributes to the clustering of the metabolic syndrome-related phenotypes. Many of these findings must go through independent replication, while others produced new leads that deserve follow-up.

    New strategies to detect and understand genotype-by-environment interactions and QTL-by-environment interactions

    Dissertation submitted for the degree of Doctor in Statistics and Risk Management, specialization in Statistics. Genotype-by-environment interaction (GEI) is frequent in multi-environment trials, and represents differential responses of genotypes across environments. With the development of molecular markers and mapping techniques, researchers can go one step further and analyse the whole genome to detect specific locations of genes which influence a quantitative trait such as yield. These locations are called quantitative trait loci (QTL), and when these QTLs are expressed differently across environments we speak of QTL-by-environment interactions (QEI), which are the basis of GEI. A good understanding of these interactions enables researchers to select better genotypes across different environmental conditions and, consequently, to improve crops in developed and developing countries. In this thesis I intend to present new strategies to improve the detection and understanding of QTLs, especially those exhibiting QEI in the context of multi-environment trials, by using and providing open source software. The first part of this thesis presents a comparison between two of the most used methods to analyse and to structure GEI: the joint regression analysis (JRA) and the additive main effects and multiplicative interaction (AMMI) model. This comparison is made in terms of “robustness” with different incidence rates of missing values, and in terms of dominant/winner genotypes. In the following chapters, two- and three-stage approaches are presented in which the AMMI model is used to gain accuracy in the phenotypic data, and their scores are used to order the environments to find ecological or biological patterns. The first approach (two stages) is appropriate when the error variance is constant across environments, whereas the second (three stages) is more general and accounts for differences in the error variances by using the proposed weighted AMMI model (WAMMI). The final part of the thesis illustrates a strategy to simulate and to model GEI and QEI in complex traits, with the example of yield, based on a number of physiological parameters that are purely genotype dependent. This is done by using an eco-physiological genotype-to-phenotype model with seven parameters defined with a simple QTL basis. Funding: Fundação para a Ciência e Tecnologia - SFRH/BD/35994/2007; project N N310 447838 supported by Ministry of Science and Higher Education, Poland.

    Genetics of age-related maculopathy & Score statistics for X-linked quantitative trait loci

    Age-related maculopathy (ARM) is a common cause of irreparable vision loss in industrialized countries. The disease is characterized by progressive loss of central vision, making everyday tasks challenging. The etiology is complex and has both environmental and strong genetic components. The public health relevance of the work is to improve the understanding of genetic causes in the disease etiology and ultimately to lead to better disease management and prevention. From my ARM work, I present four papers covering a range of statistical approaches. The first paper presents fine-mapping efforts, using both linkage and association methods, under previously identified linkage peaks on chromosomes 1q31 and 10q26. We replicate the discovery of the complement factor H (CFH) gene on 1q31 and identify a novel locus, harboring three closely linked genes (PLEKHA1, LOC387715, and HTRA1), on 10q26. Both discoveries have been widely replicated. In the next paper I present a meta-analysis of 11 CFH and 5 LOC387715 data sets. We also replicate these findings in two independent case-control cohorts, including one cohort where ARM status was not a factor in the ascertainment. In the third paper we replicate discoveries of new complement-related loci (C2 and CFB) on chromosome 19p13 and develop classification models based on SNPs from CFH, LOC387715, and C2. The last paper focuses on applying statistical techniques from the diagnostic medicine literature to ARM. We comment on the importance of understanding the differences and similarities between different goals of genetic studies: improving etiological understanding or finding variants that discriminate well between cases and controls. This work is particularly relevant today, when there has been an explosion in the availability of direct-to-consumer DNA tests. In addition to carrying out linkage and association analysis, I have also extended the statistical theory behind score-based linkage analyses for X chromosomal markers. This work has public health relevance because many complex common diseases have sex-specific differences, such as in prevalence and age of onset. Modeling those appropriately with powerful and robust methods will bring an improved understanding of their genetic basis.

    Gene x Environment Interactions in Developmental Dyslexia

    The goal of this project was to advance understanding of the complex multifactorial etiology of developmental dyslexia, or reading disability (RD), by investigating gene x environment (G x E) interactions. This project tested for G x E interactions using molecular genetic methods and measures of psychosocial and bioenvironmental risk factors. There are two competing predictions that can be derived from existing G x E models about the expected direction of interactions in RD. There could be diathesis-stress interactions in which the effects of genotype are stronger in risk environments, or there could be bioecological interactions in which the effects of genotype are stronger in optimal environments. This study was a sib-pair linkage design including dizygotic twins and their non-twin siblings (age 8-19 years) from 212 families. Analyses initially focused on identifying genetic and environmental risk factors showing main effects on reading phenotypes. Sib-pair linkage analyses with two regression-based linkage models (DeFries-Fulker and Haseman-Elston) showed converging evidence for linkage in 4 regions previously associated with RD, 1p36-p34, 3p12-q13, 6p22.2, and 15q21. Across chromosomal locations, the phenotype with the strongest evidence for linkage was rapid naming. In the environmental analyses, three home variables (parental education, books in the home, and child print exposure) and two bioenvironmental variables (prenatal exposure to smoking and birth weight) showed statistically independent main effects on child reading. The G x E analyses were conducted at the significant linkage peaks with the environments showing main effects. Both DeFries-Fulker and Haseman-Elston G x E analyses showed converging evidence for diathesis-stress G x E interaction with parent education at the chromosome 1 and 3 loci for phonological phenotypes. 
Follow-up analyses controlling for scaling artifacts, G-E correlations, and ADHD comorbidity revealed that the diathesis-stress G x E interactions were generally robust to these confounding factors. Discussion of the results focused on exploration of the diathesis-stress interactions in the context of previous behavioral genetic and molecular genetic findings, including dimensions that may be important for directionality of interactions, such as genetic approach (behavioral versus molecular), sample characteristics (age, disorder, and comorbidity), and environmental range

    New insights into the genetic control of gene expression using a Bayesian multi-tissue approach.

    The majority of expression quantitative trait locus (eQTL) studies have been carried out in single tissues or cell types, using methods that ignore information shared across tissues. Although global analysis of RNA expression in multiple tissues is now feasible, few integrated statistical frameworks for joint analysis of gene expression across tissues combined with simultaneous analysis of multiple genetic variants have been developed to date. Here, we propose Sparse Bayesian Regression models for mapping eQTLs within individual tissues and simultaneously across tissues. Testing these on a set of 2,000 genes in four tissues, we demonstrate that our methods are more powerful than traditional approaches in revealing the true complexity of the eQTL landscape at the systems level. Highlighting the power of our method, we identified a two-eQTL model (cis/trans) for the Hopx gene that was experimentally validated and was not detected by conventional approaches. We showed common genetic regulation of gene expression across four tissues for ∼27% of transcripts, providing a >5-fold increase in eQTL detection when compared with single tissue analyses at the 5% FDR level. These findings provide a new opportunity to uncover complex genetic regulatory mechanisms controlling global gene expression, while the generality of our modelling approach makes it adaptable to other model systems and humans, with broad application to analysis of multiple intermediate and whole-body phenotypes.

    Genome-wide association for major depressive disorder: a possible role for the presynaptic protein piccolo

    Major depressive disorder (MDD) is a common complex trait with enormous public health significance. As part of the Genetic Association Information Network initiative of the US Foundation for the National Institutes of Health, we conducted a genome-wide association study of 435 291 single nucleotide polymorphisms (SNPs) genotyped in 1738 MDD cases and 1802 controls selected to be at low liability for MDD. Of the top 200, 11 signals localized to a 167 kb region overlapping the gene piccolo (PCLO, whose protein product localizes to the cytomatrix of the presynaptic active zone and is important in monoaminergic neurotransmission in the brain) with P-values of 7.7 × 1