4,789 research outputs found
On the informativeness of dominant and co-dominant genetic markers for Bayesian supervised clustering
We study the accuracy of a Bayesian supervised method used to cluster
individuals into genetically homogeneous groups on the basis of dominant or
codominant molecular markers. We provide a formula relating an error criterion
to the number of loci used and the number of clusters. This formula is exact and
holds for an arbitrary number of clusters and markers. Our work suggests that
dominant-marker studies can achieve an accuracy similar to that of
codominant-marker studies if the number of markers used in the former is about
1.7 times larger than in the latter.
Cryptic diversity within the major trypanosomiasis vector Glossina fuscipes revealed by molecular markers
Background: The tsetse fly Glossina fuscipes s.l. is responsible for the transmission of approximately 90% of cases of human African trypanosomiasis (HAT) or sleeping sickness. Three G. fuscipes subspecies have been described, primarily based upon subtle differences in the morphology of their genitalia. Here we describe a study conducted across the range of this important vector to determine whether molecular evidence generated from nuclear DNA (microsatellites and gene sequence information), mitochondrial DNA and symbiont DNA support the existence of these taxa as discrete taxonomic units.
Principal Findings: The nuclear ribosomal Internal transcribed spacer 1 (ITS1) provided support for the three subspecies. However, nuclear and mitochondrial sequence data did not support the monophyly of the morphological subspecies G. f. fuscipes or G. f. quanzensis. Instead, the most strongly supported monophyletic group was comprised of flies sampled from Ethiopia. Maternally inherited loci (mtDNA and symbiont) also suggested monophyly of a group from Lake Victoria basin and Tanzania, but this group was not supported by nuclear loci, suggesting different histories of these markers. Microsatellite data confirmed strong structuring across the range of G. fuscipes s.l., and were useful for deriving the interrelationship of closely related populations.
Conclusion/Significance: We propose that morphological classification alone should not be used to classify populations of G. fuscipes for control purposes. The Ethiopian population, which is scheduled to be the target of a sterile insect release (SIT) programme, was notably discrete. From a programmatic perspective this may be either positive, given that it may reflect limited migration into the area, or negative, if the high levels of differentiation are also reflected in reproductive isolation between this population and the flies to be used in the release programme.
Non-stationary patterns of isolation-by-distance: inferring measures of local genetic differentiation with Bayesian kriging
Patterns of isolation-by-distance arise when population differentiation
increases with increasing geographic distances. Patterns of
isolation-by-distance are usually caused by local spatial dispersal, which
explains why differences of allele frequencies between populations accumulate
with distance. However, spatial variations of demographic parameters such as
migration rate or population density can generate non-stationary patterns of
isolation-by-distance where the rate at which genetic differentiation
accumulates varies across space. To characterize non-stationary patterns of
isolation-by-distance, we infer local genetic differentiation based on Bayesian
kriging. Local genetic differentiation for a sampled population is defined as
the average genetic differentiation between the sampled population and fictive
neighboring populations. To avoid defining populations in advance, the method
can also be applied at the scale of individuals making it relevant for
landscape genetics. Inference of local genetic differentiation relies on a
matrix of pairwise similarity or dissimilarity between populations or
individuals such as matrices of FST between pairs of populations. Simulation
studies show that maps of local genetic differentiation can reveal barriers to
gene flow but also other patterns such as continuous variations of gene flow
across habitat. The potential of the method is illustrated with 2 data sets:
genome-wide SNP data for human Swedish populations and AFLP markers for alpine
plant species. The software LocalDiff implementing the method is available at
http://membres-timc.imag.fr/Michael.Blum/LocalDiff.html
Comment: In press, Evolution 201
Assessing population genetic structure via the maximisation of genetic distance
Background: The inference of the hidden structure of a population is an essential issue in population genetics. Recently, several methods have been proposed to infer population structure.
Methods: In this study, a new method to infer the number of clusters and to assign individuals to the inferred populations is proposed. This approach does not make any assumption on Hardy-Weinberg and linkage equilibrium. The implemented criterion is the maximisation (via a simulated annealing algorithm) of the averaged genetic distance between a predefined number of clusters. The performance of this method is compared with two Bayesian approaches, STRUCTURE and BAPS, using simulated data and also a real human data set.
Results: The simulations show that with a reduced number of markers, BAPS overestimates the number of clusters and presents a reduced proportion of correct groupings. The accuracy of the new method is approximately the same as for STRUCTURE. Also, in Hardy-Weinberg and linkage disequilibrium cases, BAPS performs incorrectly. In these situations, STRUCTURE and the new method show an equivalent behaviour with respect to the number of inferred clusters, although the proportion of correct groupings is slightly better with the new method. Re-establishing equilibrium with the randomisation procedures improves the precision of the Bayesian approaches. All methods have a good precision for FST ≥ 0.03, but only STRUCTURE estimates the correct number of clusters for FST as low as 0.01. In situations with a high number of clusters or a more complex population structure, the new method (MGD) performs better than STRUCTURE and BAPS. The results for a human data set analysed with the new method are congruent with the geographical regions previously found.
Conclusion: This new method used to infer the hidden structure in a population, based on the maximisation of the genetic distance and not taking into consideration any assumption about Hardy-Weinberg and linkage equilibrium, performs well under different simulated scenarios and with real data. Therefore, it could be a useful tool to determine genetically homogeneous groups, especially in those situations where the number of clusters is high, with complex population structure and where Hardy-Weinberg and/or linkage disequilibrium are present.
Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication
Horseshoe crabs are marine arthropods with a fossil record extending back
approximately 450 million years. They exhibit remarkable morphological
stability over their long evolutionary history, retaining a number of ancestral
arthropod traits, and are often cited as examples of "living fossils." As
arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose
sequenced genomes (including insects and nematodes) have thus far shown more
divergence from the ancestral pattern of eumetazoan genome organization than
cnidarians, deuterostomes, and lophotrochozoans. However, much of ecdysozoan
diversity remains unrepresented in comparative genomic analyses. Here we use a
new strategy of combined de novo assembly and genetic mapping to examine the
chromosome-scale genome organization of the Atlantic horseshoe crab Limulus
polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by
sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their
parents at a mean redundancy of 1.1x per sample. The map includes 84,307
sequence markers and 5,775 candidate conserved protein coding genes. Comparison
to other metazoan genomes shows that the L. polyphemus genome preserves
ancestral bilaterian linkage groups, and that a common ancestor of modern
horseshoe crabs underwent one or more ancient whole genome duplications (WGDs)
~300 MYA, followed by extensive chromosome fusion.
A General Framework for Updating Belief Distributions
We propose a framework for general Bayesian inference. We argue that a valid
update of a prior belief distribution to a posterior can be made for parameters
which are connected to observations through a loss function rather than the
traditional likelihood function, which is recovered under the special case of
using self-information loss. Modern application areas make it increasingly
challenging for Bayesians to attempt to model the true data generating
mechanism. Moreover, when the object of interest is low dimensional, such as a
mean or median, it is cumbersome to have to achieve this via a complete model
for the whole data distribution. More importantly, there are settings where the
parameter of interest does not directly index a family of density functions and
thus the Bayesian approach to learning about such parameters is currently
regarded as problematic. Our proposed framework uses loss-functions to connect
information in the data to functionals of interest. The updating of beliefs
then follows from a decision theoretic approach involving cumulative loss
functions. Importantly, the procedure coincides with Bayesian updating when a
true likelihood is known, yet provides coherent subjective inference in much
more general settings. Connections to other inference frameworks are
highlighted.
Comment: This is the pre-peer reviewed version of the article "A General
Framework for Updating Belief Distributions", which has been accepted for
publication in the Journal of Statistical Society - Series B. This article
may be used for non-commercial purposes in accordance with Wiley Terms and
Conditions for Self-Archivin
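
The loss-based update described in this abstract can be illustrated with a minimal sketch. It assumes the commonly used form in which the posterior is proportional to the prior times the exponential of the negative cumulative loss; with the self-information loss, -log f(x | theta), this reduces to ordinary Bayesian updating, as the abstract states. The function name, the `weight` parameter, the grid approximation and the toy data are all assumptions made for illustration and are not taken from the article.

```python
import math


def general_bayes_posterior(grid, prior, data, loss, weight=1.0):
    """Posterior weights on a parameter grid under a loss-based update.

    posterior(theta) is proportional to
        prior(theta) * exp(-weight * sum_i loss(theta, x_i)).
    `weight` is a hypothetical scale linking loss to belief (assumption).
    """
    log_post = [
        math.log(p) - weight * sum(loss(theta, x) for x in data)
        for theta, p in zip(grid, prior)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Toy example (made-up numbers): beliefs about a median, using
# absolute-error loss instead of a full model for the data distribution.
data = [1.0, 2.0, 2.5, 3.0, 10.0]
grid = [i * 0.5 for i in range(25)]        # candidate values 0.0 .. 12.0
prior = [1.0 / len(grid)] * len(grid)      # flat prior over the grid
posterior = general_bayes_posterior(grid, prior, data, lambda t, x: abs(t - x))
mode = grid[posterior.index(max(posterior))]
```

In this toy case the posterior mode falls on the sample median, because absolute-error loss targets the median without specifying a likelihood for the whole data distribution.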
Insertion/Deletion markers for assessing the genetic variation and the spatial genetic structure of Tunisian Brachypodium hybridum populations
The wild annual grass Brachypodium hybridum is an allotetraploid species derived from the natural hybridization between the diploid species B. distachyon (2n=10) and B. stacei (2n=20). This trio of species has been suggested as a model system for polyploidy. Brachypodium hybridum is the most widespread Brachypodium species in Tunisia. Natural diversity can be used as a powerful tool to uncover gene function and, in the case of B. hybridum, to understand the functional consequences of polyploidy. Here, we examined the spatial distribution of genetic variation of B. hybridum across its entire range in Tunisia and tested underlying factors that shaped its genetic variation. Population genetic analyses were conducted on 145 individuals from 9 populations using 8 InDel markers. Results indicated a relatively high level of within-population genetic diversity (He = 0.35) and limited among-population differentiation (FPT = 0.20) for this predominantly self-pollinating grass. UPGMA cluster analyses, PCoA and Bayesian clustering supported the demarcation of the populations into 3 groups that were not correlated with location or altitude, suggesting a loose genetic affinity of B. hybridum populations in relation to their geographical locations, and no obvious genetic structure among populations across the study area. This pattern was associated with a considerable amount of asymmetric gene flow between populations. Overall, the results suggest that long-distance seed dispersal is the most important factor in shaping the spatial genetic structure of B. hybridum in Tunisia. They also provide key guidelines for ongoing and future work, including breeding programs and genome-wide association studies.
On Identifying the Optimal Number of Population Clusters via the Deviance Information Criterion
Inferring population structure using Bayesian clustering programs often requires a priori specification of the number of subpopulations, K, from which the sample has been drawn. Here, we explore the utility of a common Bayesian model selection criterion, the Deviance Information Criterion (DIC), for estimating K. We evaluate the accuracy of DIC, as well as other popular approaches, on datasets generated by coalescent simulations under various demographic scenarios. We find that DIC outperforms competing methods in many genetic contexts, validating its application in assessing population structure.
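
As a rough illustration of how DIC can be used to choose the number of subpopulations, the sketch below applies the standard definition, DIC = mean deviance + pD with pD = mean deviance minus the deviance at the posterior mean, and picks the candidate with the smallest DIC. The function names and the toy deviance values are assumptions for illustration only; the exact DIC variant and the clustering software used in the article may differ.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, pD) from posterior deviance draws.

    mean_dev : posterior mean of the deviance D(theta)
    pD       : effective number of parameters, mean_dev - D(theta_bar)
    DIC      : mean_dev + pD
    """
    mean_dev = sum(deviance_samples) / len(deviance_samples)
    p_d = mean_dev - deviance_at_posterior_mean
    return mean_dev + p_d, p_d


def pick_num_clusters(results):
    """Choose the candidate number of clusters with the smallest DIC.

    `results` maps each candidate count to a tuple
    (deviance_samples, deviance_at_posterior_mean).
    """
    return min(results, key=lambda k: dic(*results[k])[0])


# Made-up deviance values for three candidate cluster counts (not real results).
candidates = {
    1: ([120.0, 122.0, 118.0], 119.0),
    2: ([100.0, 102.0, 98.0], 97.0),
    3: ([99.0, 101.0, 97.0], 90.0),
}
best_k = pick_num_clusters(candidates)
```

With these toy numbers the three-cluster model fits slightly better on average, but its larger pD penalty makes the two-cluster model the one with the lowest DIC.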