hapassoc: Software for Likelihood Inference of Trait Associations with SNP Haplotypes and Other Attributes
Complex medical disorders, such as heart disease and diabetes, are thought to involve a number of genes which act in conjunction with lifestyle and environmental factors to increase disease susceptibility. Associations between complex traits and single nucleotide polymorphisms (SNPs) in candidate genomic regions can provide a useful tool for identifying genetic risk factors. However, analysis of trait associations with single SNPs ignores the potential for extra information from haplotypes, combinations of variants at multiple SNPs along a chromosome inherited from a parent. When haplotype-trait associations are of interest and haplotypes of individuals can be determined, generalized linear models (GLMs) may be used to investigate haplotype associations while adjusting for the effects of non-genetic cofactors or attributes. Unfortunately, haplotypes cannot always be determined cost-effectively when data are collected on unrelated subjects. Uncertain haplotypes may be inferred on the basis of data from single SNPs. However, subsequent analyses of risk factors must account for the resulting uncertainty in haplotype assignment in order to avoid potential errors in interpretation. To account for such uncertainty, we have developed hapassoc, software for R implementing a likelihood approach to inference of haplotype and non-genetic effects in GLMs of trait associations. We provide a description of the underlying statistical method and illustrate the use of hapassoc with examples that highlight the flexibility to specify dominant and recessive effects of genetic risk factors, a feature not shared by other software that restricts users to additive effects only. Additionally, hapassoc can accommodate missing SNP genotypes for limited numbers of subjects.
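The distinction between additive, dominant and recessive effects mentioned in this abstract can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the hapassoc API (hapassoc is an R package); the function name `code_haplotype` and its arguments are hypothetical. It assumes each subject's haplotype pair is known and shows how the number of copies of a risk haplotype might be turned into a GLM covariate under each model.

```python
def code_haplotype(copies, model="additive"):
    """Map the number of copies (0, 1 or 2) of a risk haplotype to a covariate.

    Hypothetical helper for illustration; not part of hapassoc.
    """
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1 or 2")
    if model == "additive":
        # Each extra copy shifts the linear predictor by the same amount.
        return copies
    if model == "dominant":
        # One copy is enough to carry the full effect.
        return int(copies >= 1)
    if model == "recessive":
        # The effect appears only when both haplotypes are the risk haplotype.
        return int(copies == 2)
    raise ValueError(f"unknown model: {model}")


# Covariate values for subjects carrying 0, 1 and 2 copies under each model.
for model in ("additive", "dominant", "recessive"):
    print(model, [code_haplotype(c, model) for c in (0, 1, 2)])
```

The sketch does not handle uncertain haplotypes; the abstract describes a likelihood approach for that case, which is not reproduced here.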
Statistical methods for detecting genes associated with sperm competition in natural populations of Drosophila, using blocks of tightly linked single nucleotide polymorphisms : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Statistics at Massey University, Albany, New Zealand
The purpose of the project is to develop statistical methods for detecting genes associated with sperm competition in natural populations of Drosophila (fruit flies). The flies' genotype information given by Fiumera et al. (2004) is used as the starting point of the analysis. This dataset utilizes blocks of tightly linked single nucleotide polymorphisms within genes suspected to affect sperm competition. The sperm competition detection process is completed in three stages: maternal and offspring haplotype reconstruction; paternal genotype and offspring fraction estimation; and preferred genotype detection. The software programs HAPLORE and PHASE 2.0 were used for maternal and offspring haplotype reconstruction. The software Parentage was applied to the reconstructed haplotypes to estimate paternal genotypes and the number of offspring they produced. Lastly, Kruskal-Wallis and permutation tests were conducted to detect differences in the number of offspring produced between groups of males with different genotypes.
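The final stage described above, testing whether offspring counts differ between genotype groups, can be sketched as a two-group permutation test. This is a minimal illustration under stated assumptions, not the thesis's actual procedure: the test statistic (absolute difference in group means), the function name `permutation_test`, and the offspring counts in the usage lines are all hypothetical.

```python
import random


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means.

    Returns an estimated p-value. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random by shuffling the pooled counts.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            hits += 1

    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


# Made-up offspring counts for males of two genotypes (illustration only).
genotype_1 = [10, 11, 12, 13, 14, 15]
genotype_2 = [1, 2, 3, 4, 5, 6]
print(permutation_test(genotype_1, genotype_2))
```

A permutation test makes no distributional assumption about the counts, which is one reason it is often paired with a rank-based test such as Kruskal-Wallis.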
Populations in statistical genetic modelling and inference
What is a population? This review considers how a population may be defined
in terms of understanding the structure of the underlying genetics of the
individuals involved. The main approach is to consider statistically
identifiable groups of randomly mating individuals, which is well defined in
theory for any type of (sexual) organism. We discuss generative models using
drift, admixture and spatial structure, and the ancestral recombination graph.
These are contrasted with statistical models for inference, principal component
analysis and other 'non-parametric' methods. The relationships between these
approaches are explored with both simulated and real-data examples. The
state-of-the-art practical software tools are discussed and contrasted. We
conclude that populations are a useful theoretical construct that can be well
defined in theory and often approximately exist in practice.