61 research outputs found

    Une approche phylogénomique pour inférer l'évolution des eucaryotes (A phylogenomic approach to inferring the evolution of eukaryotes)

    Thesis digitized by the Direction des bibliothèques of the Université de Montréal

    Selecting RAD-Seq Data Analysis Parameters for Population Genetics: The More the Better?

    Restriction site-associated DNA sequencing (RAD-seq) has become a powerful and widely used tool in molecular ecology studies because it allows cost-effective recovery of thousands of polymorphic sites across individuals of non-model organisms. However, its successful implementation in population genetics relies on correct data processing that minimizes potential loci-assembly biases and the resulting genotyping error rates. When no reference genome is available, RAD-seq data processing involves assembling hundreds of thousands of high-throughput sequencing reads into orthologous loci, for which the researcher must select various key parameter values. Previous studies exploring the effect of these parameter values found or assumed that a larger number of recovered polymorphic loci is associated with a better assembly. Here, using three RAD-seq datasets from different species, we explore the effect of read filtering, loci assembly and polymorphic site selection on the number of markers obtained and the genetic differentiation inferred using the Stacks software. We find (i) that recovery of higher numbers of polymorphic loci is not necessarily associated with higher genetic differentiation, (ii) that the presence of PCR duplicates, the selected loci assembly parameters and the selected SNP filtering parameters affect the number of recovered polymorphic loci and the degree of genetic differentiation, and (iii) that this effect differs in each dataset, meaning that defining a systematic universal protocol for RAD-seq data analysis may lead to missing relevant information about population differentiation.

    Genetic connectivity and hybridization with its sister species challenge the current management paradigm of white anglerfish (Lophius piscatorius)

    Understanding the inter- and intraspecific dynamics of fish populations is essential to promote effective management and conservation actions and to predict adaptation to changing conditions. This is possible through the analysis of thousands of genetic markers, which has proven useful for resolving connectivity among populations. Here, we have tackled this issue in the white anglerfish (Lophius piscatorius), which inhabits the Northeast Atlantic and Mediterranean Sea and coexists with its morphologically almost identical sister species, the black anglerfish (L. budegassa). Our genetic analyses, based on 16,000 SNP markers and 700 samples, reveal that (i) white anglerfish from the Mediterranean Sea and the Atlantic Ocean are genetically isolated, but that no differentiation can be observed within the latter, and that (ii) black and white anglerfish naturally hybridize, resulting in a population in which about 20% of individuals are hybrids, most likely sterile, in some areas. These findings challenge the current paradigm of white anglerfish management, which considers three independent management units within the Northeast Atlantic and assumes that all mature fish have reproductive potential. Additionally, the northward distribution of both species, likely due to temperature rises, calls for further monitoring of the abundance and distribution of hybrids to anticipate the effects of climate change on the interactions between the two species and their potential resilience.

    The SAR11 Group of Alpha-Proteobacteria Is Not Related to the Origin of Mitochondria

    Although free-living, members of the successful SAR11 group of marine alpha-proteobacteria have a very small and A+T-rich genome, two features that are typical of mitochondria and related obligate intracellular parasites such as the Rickettsiales. Some previous phylogenetic analyses have suggested that Candidatus Pelagibacter ubique, the first cultured member of this group, is related to the Rickettsiales+mitochondria clade, whereas others disagree with this conclusion. To determine the evolutionary position of the SAR11 group and its relationship to the origin of mitochondria, we have performed phylogenetic analyses on the concatenation of 24 proteins from 5 mitochondria and 71 proteobacteria. Our results indicate that the SAR11 group is not the sister group of the Rickettsiales+mitochondria clade and confirm that the position of this group in the alpha-proteobacterial tree is strongly affected by tree reconstruction artefacts due to compositional bias. As a consequence, genome reduction and bias toward a high A+T content may have evolved independently in the SAR11 species, which points to a different direction in the quest for the closest relatives of mitochondria and Rickettsiales. In addition, our analyses raise doubts about the monophyly of the newly proposed Pelagibacteraceae family.

    Environmental status assessment using DNA metabarcoding: towards a genetics based Marine Biotic Index (gAMBI)

    Marine ecosystem protection and conservation initiatives rely on the assessment of the ecological integrity and health status of marine environments. AZTI's Marine Biotic Index (AMBI), which consists of using macroinvertebrate diversity as an indicator of ecosystem health, is used worldwide for this purpose. Yet, this index requires taxonomic assignment of specimens, which typically involves a time- and resource-consuming visual identification of each sample. DNA barcoding and metabarcoding are potential harmonized, faster and cheaper alternatives for species identification, although the suitability of these methods for easing the implementation of the AMBI is yet to be evaluated. Here, we analyze the requirements for the implementation of a genetics-based AMBI (gAMBI), and show, using available sequence data, that information about presence/absence of the most frequently occurring species provides accurate AMBI values. Our results lay the groundwork for the implementation of the gAMBI, which has direct implications for faster and cheaper marine monitoring and health status assessment.

    Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data

    Background: The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour-specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have recently been developed, but their performance has yet to be thoroughly assessed. To this end, a comprehensive model that integrates the factors of normal cell contamination and intra-tumour heterogeneity, and that can be translated into synthetic data on which to perform benchmarks, is indispensable. Results: We propose such a model and implement it in an R package called CnaGen to synthetically generate a wide range of alterations under different normal cell contamination levels. Six recently published methods for CNA and loss of heterozygosity (LOH) detection on tumour samples were assessed on this synthetic data and on a dilution series of a breast cancer cell line: ASCAT, GAP, GenoCNA, GPHMM, MixHMM and OncoSNP. We report the recall rates in terms of normal cell contamination levels and alteration characteristics (length, copy number and LOH state), as well as the false discovery rate distribution for each copy number under different normal cell contamination levels. The assessed methods are in general better at detecting alterations with low copy number and under low normal cell contamination levels. All methods except GPHMM, which failed to recognize the alteration pattern in the cell-line samples, provided similar results for the synthetic and cell-line sample sets. MixHMM and GenoCNA are the poorest-performing methods, while GAP generally performed better. This supports the viability of approaches other than the common hidden Markov model (HMM)-based ones. Conclusions: We devised and implemented a comprehensive model to generate data that simulate tumoural samples genotyped using SNP arrays. The validity of the model is supported by the similarity of the results obtained with synthetic and real data. Based on these results and on the software implementation of the methods, we recommend GAP for advanced users and GPHMM for a fully driven analysis.