137 research outputs found
Genealogical lineage sorting leads to significant, but incorrect Bayesian multilocus inference of population structure
Over the past decades, the use of molecular markers has revolutionized biology and led to the foundation of a new research discipline: phylogeography. Of particular interest has been the inference of population structure and biogeography. While initial studies focused on mtDNA as a molecular marker, it has become apparent that selection and genealogical lineage sorting could lead to erroneous inferences. As it is not clear to what extent these forces affect a given marker, it has become common practice to use the combined evidence from a set of molecular markers in an attempt to recover the signals that approximate the true underlying demography. Typically, the number of markers used is determined either by budget constraints or by the statistical power required to recognize significant population differentiation. Using microsatellite markers from Drosophila and humans, we show that even large numbers of loci (>50) can frequently result in statistically well-supported but incorrect inference of population structure using the software BAPS. Most importantly, genomic features, such as chromosomal location, variability of the markers, or recombination rate, cannot explain this observation. Instead, it can be attributed to sampling variation among loci with different realizations of the stochastic lineage sorting. This phenomenon is particularly pronounced for low levels of population differentiation. Our results have important implications for ongoing studies of population differentiation, as we unambiguously demonstrate that statistical significance of population structure inferred from a random set of genetic markers cannot necessarily be taken as evidence for a reliable demographic inference.
The Andaman day gecko paradox: an ancient endemic without pronounced phylogeographic structure
The Andaman day gecko (Phelsuma andamanensis) is endemic to the Andaman Archipelago, located ~6000 km away from Madagascar, where the genus Phelsuma likely evolved. We complemented existing phylogenetic data with additional markers to show that this species consistently branches off early in the evolution of the genus Phelsuma, and this early origin led us to hypothesize that island populations within the Andaman Archipelago could have further diversified. We sampled the Andaman day gecko from all major islands in the Andamans, developed new microsatellite markers and amplified mitochondrial markers to study population diversification. We detected high allelic diversity in microsatellites, but surprisingly poor geographical structuring. This study demonstrates that the Andaman day gecko has a panmictic population (K = 1), but with weak signals for two clusters that we name ‘North’ (North Andaman, Middle Andaman, Interview, Baratang, Neil, and Long Islands) and ‘South’ (Havelock, South Andaman, Little Andaman Islands). The mitochondrial COI gene uncovered wide haplotype sharing across islands with the presence of several private haplotypes (except for Little Andaman Island, which only had an exclusive private haplotype), signalling ongoing admixture. This species differs from two other Andaman endemic geckos for which we provide comparative mitochondrial data, where haplotypes show a distinct phylogeographic structure. Testing population history scenarios for the Andaman day gecko using Approximate Bayesian Computation (ABC) supports two possible scenarios but fails to tease apart whether admixture or divergence produced the two weak clusters. Both scenarios agree that admixture and/or divergence prior to the onset of the last glacial maximum shaped the genetic diversity and structure detected in this study. ABC supports population expansion, possibly explained by anthropogenic food subsidies via plantations of cash crops, potentially coupled with human-mediated dispersal resulting in the observed panmictic population. The Andaman day gecko may thus be a rare example of an island endemic reptile benefiting from habitat modification and increased movement in its native range.
PoPoolation DB: a user-friendly web-based database for the retrieval of natural polymorphisms in Drosophila
Background: The enormous potential of natural variation for the functional characterization of genes has been neglected for a long time. Only recently have functional geneticists started to account for natural variation in their analyses. With the new sequencing technologies it has become feasible to collect sequence information for multiple individuals on a genomic scale. In particular, sequencing pooled DNA samples has been shown to provide a cost-effective approach for characterizing variation in natural populations. While a range of software tools have been developed for mapping these reads onto a reference genome and extracting SNPs, linking this information to population genetic estimators and functional information still poses a major challenge to many researchers. Results: We developed PoPoolation DB, a user-friendly integrated database. PoPoolation DB links variation in natural populations with functional information, allowing a wide range of researchers to take advantage of population genetic data. PoPoolation DB provides the user with population genetic parameters (Watterson's θ or Tajima's π), Tajima's D, SNPs, allele frequencies and indels in regions of interest. The database can be queried by gene name, chromosomal position, or a user-provided query sequence or GTF file. We anticipate that PoPoolation DB will be a highly versatile tool for functional geneticists as well as evolutionary biologists. Conclusions: PoPoolation DB, available at http://www.popoolation.at/pgt, provides an integrated platform for researchers to investigate natural polymorphism and associated functional annotations from UCSC and Flybase genome browsers, population genetic estimators and RNA-seq information.
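For readers unfamiliar with the estimators named in this abstract, the following is a minimal Python sketch of the classical definitions of Watterson's θ, Tajima's π and Tajima's D for a small set of aligned, individually sequenced haplotypes. It is not PoPoolation DB's code: pooled sequencing data generally need estimators corrected for pool size and coverage, which this sketch does not attempt. The function name `diversity_stats` and the toy sequences are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Classical per-locus estimators for aligned haplotypes (hypothetical helper).

    Returns the number of segregating sites S, Tajima's pi (mean pairwise
    differences), Watterson's theta (S / a1) and Tajima's D.
    Tajima's D is None when there are no segregating sites.
    """
    n = len(seqs)
    if n < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two sequences of equal length")

    # A site is segregating if more than one state is observed in its column.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # Tajima's pi: average number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Harmonic sums used by Watterson's estimator and Tajima's D.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_w = seg_sites / a1

    if seg_sites == 0:
        return {"S": 0, "pi": pi, "theta_w": theta_w, "tajima_d": None}

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    tajima_d = (pi - theta_w) / sqrt(variance)
    return {"S": seg_sites, "pi": pi, "theta_w": theta_w, "tajima_d": tajima_d}


if __name__ == "__main__":
    # Toy alignment of four haplotypes, invented for illustration only.
    print(diversity_stats(["AAAA", "AAAT", "AATT", "ATTT"]))
```

Both θ and π are reported per locus here; dividing by the alignment length would give per-site values.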
High performance computation of landscape genomic models integrating local indices of spatial association
Since its introduction, landscape genomics has developed quickly with the increasing availability of both molecular and topo-climatic data. The current challenges of the field mainly involve processing large numbers of models and disentangling selection from demography. Several methods address the latter, either by estimating a neutral model from population structure or by simultaneously inferring environmental and demographic effects. Here we present Samβada, an integrated approach to study signatures of local adaptation, providing rapid processing of whole genome data and enabling assessment of spatial association using molecular markers. Specifically, candidate loci for adaptation are identified by automatically assessing genome-environment associations. In addition, measuring the Local Indicators of Spatial Association (LISA) for these candidate loci makes it possible to detect whether similar genotypes tend to gather in space, which constitutes a useful indication of the possible kinship relationship between individuals. In this paper, we also analyze SNP data from Ugandan cattle to detect signatures of local adaptation with Samβada, BayEnv, LFMM and an outlier method (FDIST approach in Arlequin) and compare their results. Samβada is open source software, available at http://lasig.epfl.ch/sambada for Windows, Linux and MacOS X.
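As an illustration of the Local Indicators of Spatial Association mentioned in this abstract, the sketch below computes one common LISA statistic, the local Moran's I, in plain Python. It is a hedged example rather than the software's own implementation: the function name `local_morans_i`, the toy genotype counts and the hand-written neighbour weights are all assumptions, and significance testing (commonly done by permutation) is omitted.

```python
def local_morans_i(values, weights):
    """Local Moran's I for each observation (hypothetical helper).

    values  : one numeric value per individual, e.g. allele counts 0/1/2
    weights : square matrix; weights[i][j] is the spatial weight of j for i
              (assumed row-standardised, with the diagonal ignored)
    """
    n = len(values)
    if any(len(row) != n for row in weights) or len(weights) != n:
        raise ValueError("weights must be an n x n matrix")

    mean = sum(values) / n
    deviations = [v - mean for v in values]
    # Second moment of the deviations, used to scale the statistic.
    m2 = sum(d * d for d in deviations) / n
    if m2 == 0:
        raise ValueError("all values are identical; local Moran's I is undefined")

    result = []
    for i in range(n):
        # Weighted sum of the neighbours' deviations (spatial lag).
        lag = sum(weights[i][j] * deviations[j] for j in range(n) if j != i)
        result.append(deviations[i] / m2 * lag)
    return result


if __name__ == "__main__":
    # Four individuals along a transect; each is a neighbour of the adjacent ones.
    genotypes = [0, 0, 2, 2]
    w = [
        [0, 1, 0, 0],
        [0.5, 0, 0.5, 0],
        [0, 0.5, 0, 0.5],
        [0, 0, 1, 0],
    ]
    print(local_morans_i(genotypes, w))
```

In this toy transect the two end individuals, whose only neighbour carries the same genotype, receive positive values, which is the pattern of similar genotypes gathering in space that the abstract describes.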
ForestQB: An adaptive query builder to support wildlife research
This paper presents ForestQB, a SPARQL query builder that assists bioscience and wildlife researchers in accessing Linked-Data. As these researchers are often unfamiliar with the Semantic Web and its data ontologies, ForestQB aims to let them extract valuable information from Linked-Data without having to grasp the nature of the data and its underlying technologies. ForestQB integrates form-based query building with natural language to simplify query construction and match user requirements. (A demo is available at https://iotgarage.net/demo/forestQB)
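The idea of a form-based query builder can be sketched in a few lines of Python: form fields are mapped to RDF predicates and turned into a SPARQL SELECT string, so the user never writes the syntax by hand. This is an illustrative sketch only and does not reflect how ForestQB is implemented; the function `build_select_query`, the `ex:` prefix, and the class and predicate names are all hypothetical.

```python
ALLOWED_OPERATORS = {"=", "!=", "<", "<=", ">", ">="}


def format_literal(value):
    """Render a Python value as a SPARQL literal (strings are quoted and escaped)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_select_query(prefixes, rdf_class, fields, filters=None, limit=100):
    """Build a SPARQL SELECT query from form-style input (hypothetical helper).

    prefixes  : {"ex": "http://example.org/wildlife#"}
    rdf_class : prefixed class name of the items to list, e.g. "ex:Observation"
    fields    : {variable_name: predicate}, one entry per form field
    filters   : {variable_name: (operator, value)}, optional constraints
    """
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()]
    variables = " ".join(f"?{name}" for name in fields)
    lines.append(f"SELECT ?item {variables} WHERE {{")
    lines.append(f"  ?item a {rdf_class} .")
    for name, predicate in fields.items():
        lines.append(f"  ?item {predicate} ?{name} .")
    for name, (operator, value) in (filters or {}).items():
        if name not in fields:
            raise ValueError(f"filter on unknown field: {name}")
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        lines.append(f"  FILTER(?{name} {operator} {format_literal(value)})")
    lines.append("}")
    lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # All names below are invented for illustration.
    print(build_select_query(
        prefixes={"ex": "http://example.org/wildlife#"},
        rdf_class="ex:Observation",
        fields={"species": "ex:species", "temp": "ex:temperature"},
        filters={"temp": (">", 25)},
        limit=10,
    ))
```

Restricting operators to a fixed set and escaping string literals keeps user input from altering the structure of the generated query.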
First extraction of eDNA from tree hole water to detect tree frogs: a simple field method piloted in Madagascar
Environmental DNA (eDNA) is becoming an increasingly used tool for monitoring cryptic species within terrestrial and aquatic systems. We present the first method for extracting water from tree holes for eDNA studies of tree-dwelling frogs, and the first use of eDNA for amphibian monitoring in Madagascar. This pilot study expands on a previously developed method and aims to provide a simple field protocol for DNA extraction from very small water samples, using a relatively inexpensive kit compared to other collection methods. We collected 20 ml of water from tree holes in Ambohitantely Special Reserve in Madagascar, with the aim of surveying for the Critically Endangered tree frog Anodonthyla vallani, and we developed species-specific cytochrome c oxidase 1 primers for this species. While our two samples did not detect A. vallani, we successfully extracted up to 16.6 ng/µl of eDNA from the samples and, using 16S rRNA primers, barcoded the tree frog Plethodontohyla mihanika in one of the samples. Despite just two samples being collected, we highlight the future potential of eDNA from tree holes for investigating cryptic habitat specialist amphibians, given that we extracted frog eDNA from just 20 ml of water. The method is rapid, simple, and cost-effective; it can assist cryptic species monitoring in challenging and time-consuming field conditions and should be developed further for frog surveying in Madagascar and beyond. The newly developed primers can be used for further work using this eDNA method to survey threatened Anodonthyla frog species.
Making linked-data accessible: A review
Linked-Data (LD) is a paradigm that utilises the RDF triplestore to describe numerous pieces of knowledge linked together. When an entity is retrieved in LD, its associated data becomes instantly obtainable. SPARQL is the query language that allows users to access LD. However, SPARQL has a complicated syntax that requires prior knowledge. Thus, to encourage end-users to use LD, it is crucial to allow them to obtain the data efficiently, in addition to improving their overall experience. As an alternative to manually constructing SPARQL queries, this paper investigates and reviews existing methods by which LD can be accessed using various tools and techniques, including query builders, visualisation approaches, and several LD applications. We then identify gaps within the literature and highlight future research directions.