
    hapassoc: Software for Likelihood Inference of Trait Associations with SNP Haplotypes and Other Attributes

    Complex medical disorders, such as heart disease and diabetes, are thought to involve a number of genes which act in conjunction with lifestyle and environmental factors to increase disease susceptibility. Associations between complex traits and single nucleotide polymorphisms (SNPs) in candidate genomic regions can provide a useful tool for identifying genetic risk factors. However, analysis of trait associations with single SNPs ignores the potential for extra information from haplotypes, combinations of variants at multiple SNPs along a chromosome inherited from a parent. When haplotype-trait associations are of interest and haplotypes of individuals can be determined, generalized linear models (GLMs) may be used to investigate haplotype associations while adjusting for the effects of non-genetic cofactors or attributes. Unfortunately, haplotypes cannot always be determined cost-effectively when data is collected on unrelated subjects. Uncertain haplotypes may be inferred on the basis of data from single SNPs. However, subsequent analyses of risk factors must account for the resulting uncertainty in haplotype assignment in order to avoid potential errors in interpretation. To account for such uncertainty, we have developed hapassoc, software for R implementing a likelihood approach to inference of haplotype and non-genetic effects in GLMs of trait associations. We provide a description of the underlying statistical method and illustrate the use of hapassoc with examples that highlight the flexibility to specify dominant and recessive effects of genetic risk factors, a feature not shared by other software that restricts users to additive effects only. Additionally, hapassoc can accommodate missing SNP genotypes for limited numbers of subjects.
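    The distinction between additive, dominant and recessive effects mentioned in the abstract above can be illustrated by how the number of copies of a risk haplotype is coded as a GLM covariate. The following is a minimal Python sketch of that coding only; the function name `code_haplotype_effect` and the model labels are hypothetical and are not part of hapassoc, which is an R package with its own interface.

```python
def code_haplotype_effect(copies: int, model: str) -> int:
    """Map the number of copies (0, 1 or 2) of a risk haplotype to a covariate value.

    Hypothetical helper for illustration; not the hapassoc API.
    """
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1 or 2")
    if model == "additive":
        # Effect scales with the number of copies carried.
        return copies
    if model == "dominant":
        # One copy is enough to produce the full effect.
        return 1 if copies >= 1 else 0
    if model == "recessive":
        # Two copies are required to produce the effect.
        return 1 if copies == 2 else 0
    raise ValueError(f"unknown model: {model}")


# Covariate values for 0, 1 and 2 copies under each coding.
codings = {
    model: [code_haplotype_effect(c, model) for c in (0, 1, 2)]
    for model in ("additive", "dominant", "recessive")
}
```

    When haplotypes are uncertain, a likelihood approach such as the one the abstract describes would weight the codings for each haplotype pair compatible with a subject's genotypes, rather than fixing a single value per subject; that step is not shown here.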

    Statistical methods for detecting genes associated with sperm competition in natural populations of Drosophila, using blocks of tightly linked single nucleotide polymorphisms : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Statistics at Massey University, Albany, New Zealand

    The purpose of the project is to develop statistical methods for detecting genes associated with sperm competition in natural populations of Drosophila (fruit flies). The flies' genotype information given by Fiumera et al. (2004) is used as the starting point of the analysis. This dataset utilizes blocks of tightly linked single nucleotide polymorphisms within genes suspected to affect sperm competition. The sperm competition detection process is completed in three stages: maternal and offspring haplotype reconstruction; paternal genotype and offspring fraction estimation; and preferred genotype detection. The software programs HAPLORE and PHASE 2.0 were used for maternal and offspring haplotype reconstruction. The software Parentage was applied to the reconstructed haplotypes to estimate paternal genotypes and the number of offspring they produced. Lastly, Kruskal-Wallis and permutation tests were conducted to detect differences in offspring produced between groups of males with different genotypes.
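    The final stage described above compares offspring counts between groups of males. The sketch below shows one generic form of a two-sided, two-group permutation test on the difference in means. It is an illustration only: the thesis's actual test statistic and number of groups are not given in the abstract, and the function name and the example counts are hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Hypothetical illustration; not the procedure used in the thesis.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Made-up offspring counts for two hypothetical male genotype groups.
genotype_1 = [12, 15, 9, 14, 11]
genotype_2 = [20, 18, 25, 22, 19]
p_value = permutation_test(genotype_1, genotype_2, n_permutations=2_000)
```

    A small p-value here would suggest that the difference in mean offspring count is unlikely under random assignment of genotype labels.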

    Populations in statistical genetic modelling and inference

    What is a population? This review considers how a population may be defined in terms of understanding the structure of the underlying genetics of the individuals involved. The main approach is to consider statistically identifiable groups of randomly mating individuals, which is well defined in theory for any type of (sexual) organism. We discuss generative models using drift, admixture and spatial structure, and the ancestral recombination graph. These are contrasted with statistical models for inference, principal component analysis and other 'non-parametric' methods. The relationships between these approaches are explored with both simulated and real-data examples. The state-of-the-art practical software tools are discussed and contrasted. We conclude that populations are a useful theoretical construct that can be well defined in theory and often approximately exist in practice.

    Haplotype-aware Diplotyping from Noisy Long Reads
