21 research outputs found

    Most parsimonious haplotype allele sharing determination

    Background: The "common disease – common variant" hypothesis and genome-wide association studies have achieved numerous successes in the last three years, particularly in the genetic mapping of human diseases. Nevertheless, the power of association study methods is still low, in particular for quantitative traits, and a description of the full allelic spectrum is still deemed far from reach. Given the increasing density of available single nucleotide polymorphisms (SNPs) and the block-like structure of the human genome, a popular and successful strategy is to use haplotypes to capture the correlation structure of SNPs in regions of little recombination. The key to the success of this strategy is thus the ability to unambiguously determine the haplotype allele sharing status among the members. Association studies based on haplotype sharing status would have significantly reduced degrees of freedom and would be able to capture the combined effects of tightly linked causal variants.
    Results: For pedigree genotype datasets with a medium density of SNPs, we present two methods for determining the haplotype allele sharing status among the pedigree members. An extensive simulation study showed that both methods performed nearly perfectly on breakpoint discovery, mutation haplotype allele discovery, and shared chromosomal region discovery.
    Conclusion: For pedigree genotype datasets, the haplotype allele sharing status among the members can be determined deterministically, efficiently, and accurately, even for very small pedigrees. Given their excellent performance, the presented haplotype allele sharing status determination programs can be useful in many downstream applications, including haplotype-based association studies.

    Fast and Accurate Haplotype Inference with Hidden Markov Model

    The genome of humans and other diploid organisms consists of paired chromosomes. The haplotype information (the DNA constellation on a single chromosome), which is crucial for disease association analysis and population genetic inference, among many other uses, is however hidden in the data generated for diploid organisms (including humans) by modern high-throughput technologies, which cannot distinguish information from the two homologous chromosomes. Here, I consider the haplotype inference problem in two common scenarios of genetic studies: 1. Model organisms (such as laboratory mice): individuals are bred through a prescribed pedigree design. 2. Outbred organisms (such as humans): individuals (mostly unrelated) are drawn from one or more populations or continental groups. In both scenarios, one individual may share short blocks of chromosomes with other individual(s) or with founder(s), if available. By identifying the shared blocks statistically, I have developed and implemented methods to reconstruct the haplotypes of the individuals under study accurately and more rapidly, and to solve important related problems including genotype imputation and ancestry inference. My methods, based on a hidden Markov model, can scale up to tens of thousands of individuals. Analysis based on my method leads to a new genetic map for a mouse population, which reveals important biological properties of the recombination process. I have also explored study design and empirical quality control for imputation tasks with large-scale datasets from admixed populations. Doctor of Philosophy

    Molecular polymorphisms for phylogeny, pedigree and population structure studies

    A number of types of molecular polymorphisms can be used for studying genetic relationships and evolutionary history. Microsatellites are hypervariable and can be very useful tools for determining population structure, distinguishing sibling species, and verifying parental relationships and pedigrees. However, while microsatellite polymorphisms are useful for resolving relationships between populations within a species, relationships among species or genera will probably be obscured by a high degree of homoplasy (identity arising from evolutionary convergence, not by descent). For long-range evolutionary history, such as the phylogeny from Old World monkeys to humans, mtDNA markers may be better candidates. The aim of this thesis is to assess molecular polymorphisms of different types and their optimal use in different situations. Two widely separated taxa were used for testing: the green monkey Chlorocebus sabaeus, and the sibling dipteran flies Bactrocera tryoni and B. neohumeralis, known collectively as the Queensland fruit fly. In the present study, a complete 16,550 bp mtDNA sequence of the green monkey Chlorocebus sabaeus is reported for the first time and has been annotated (Chapter 2). Knowledge of the mtDNA genome contributes not only to the large-scale identification of single nucleotide polymorphisms (SNPs) (Chapter 4) and the development of other mtDNA polymorphisms, but also to primate phylogenetic and evolutionary study (Chapter 3). Microsatellites used for the green monkey paternity and pedigree studies were developed by cross-amplification using human primers (Chapter 5). For studies of population structure and species discrimination in the Queensland fruit fly (Chapter 7), microsatellites were isolated from a genomic library of Bactrocera tryoni (Chapter 6).
    The complete 16,550 bp mtDNA of the green monkey C. sabaeus, sequenced and annotated here, adds a new node to the primate phylogenetic tree and creates a great opportunity for SNP marker development. The heteroplasmic region was cloned and five different sequences were obtained from a single individual; the implications of this are discussed. The phylogenetic tree reconstructed using the complete mtDNA sequence of C. sabaeus and other primates was used to resolve the controversial taxonomic status of C. sabaeus. Phylogenies of primate evolution using different genes from mtDNA are discussed. Primate evolutionary trees using different substitution types are compared, and the phylogenetic trees constructed using transversions for the complete mtDNA were found to be closer to preconceived expectations than those constructed using transversions + transitions. The sequence of C. sabaeus 12S rRNA reported here agrees with the one published by van der Kuyl et al. (1996), but additional SNPs were identified. SNPs for other regions of mtDNA were explored using dHPLC. Twenty-two PCR segments for 96 individuals were tested by dHPLC. Fifty-five SNPs were found and 10 haplogroups were established.
    Microsatellite markers were used to construct a genealogy for a colony of green monkeys (C. sabaeus) in the UCLA Vervet Monkey Research Colony. Sixteen microsatellites cross-amplified from human primers were used to conduct paternity analysis and pedigree construction. Seventy-eight out of 417 offspring were assigned paternity successfully. The low success rate is attributed to a certain proportion of mismatches between mothers and offspring; the fact that not all candidate fathers were sampled; the limitations of microsatellite polymorphisms; and the weakness of the exclusion method for paternity assessment. Due to the low success rate, the pedigree is split into a few small ones. In a complicated pedigree composed of 75 animals and up to four generations with multiple links, a power male mated with 8 females and contributed 10 offspring to the pedigree. Close inbreeding was avoided.
    Population structure within two species of Queensland fruit fly, Bactrocera tryoni and Bactrocera neohumeralis (Tephritidae: Diptera), is examined using microsatellite polymorphisms. B. tryoni and B. neohumeralis are sympatric sibling species that have similar morphological and ecological features. They even share polymorphisms at the molecular level. A difference in mating time is the main mechanism by which they remain separate species. In the present study, 22 polymorphic and scorable microsatellites were isolated from B. tryoni and tested in the two species sampled from areas of sympatric distribution. Pairwise genetic distance analysis showed explicit differentiation in allele frequencies between the two species, but very weak differences between conspecific populations. Gene flow is higher within B. tryoni than within B. neohumeralis, and gene exchange between the two species exists. An average-linkage clustering tree constructed by UPGMA showed two major clusters distinguishing the two species, and it appears that population structure is highly correlated with geographic distance. The relationships between molecular markers, evolution, and selection are discussed using comparative studies within two large taxa: primates and insects. The degree of conservation and polymorphism in microsatellites varies between taxa, over evolutionary time

    The role of visual adaptation in cichlid fish speciation

    D. Shane Wright (1), Ole Seehausen (2), Ton G.G. Groothuis (1), Martine E. Maan (1). (1) University of Groningen; GELIFES; EGDB. (2) Department of Fish Ecology & Evolution, EAWAG Centre for Ecology, Evolution and Biogeochemistry, Kastanienbaum, and Institute of Ecology and Evolution, Aquatic Ecology, University of Bern.
    In less than 15,000 years, Lake Victoria cichlid fishes have radiated into as many as 500 different species. Ecological and sexual selection are thought to contribute to this ongoing speciation process, but genetic differentiation remains low. However, recent work on the visual pigment genes, opsins, has shown more diversity. Unlike neighboring Lakes Malawi and Tanganyika, Lake Victoria is highly turbid, resulting in a long-wavelength shift in the light spectrum with increasing depth and providing an environmental gradient for exploring divergent coevolution in sensory systems and colour signals via sensory drive. Pundamilia pundamilia and Pundamilia nyererei are two sympatric species found at rocky islands across southern portions of Lake Victoria, differing in male colouration and in the depth at which they reside. Previous work has shown species differentiation in colour discrimination, corresponding to divergent female preferences for conspecific male colouration. A mechanistic link between colour vision and preference would provide a rapid route to reproductive isolation between divergently adapting populations. This link is tested by experimental manipulation of colour vision: raising both species and their hybrids under light conditions mimicking shallow and deep habitats. We quantify the expression of retinal opsins and test behaviours important for speciation: mate choice, habitat preference, and foraging performance.