conference paper
The VarGoats 1000 genome project dataset: an alternative approach for WGS data filtering for large-scale analysis of livestock diversity
Abstract
Goat domestication started ca. 11,000 years ago from the bezoar, Capra aegagrus, in SW Asia. Afterward, domestic goats followed the expansion of human populations out of the Fertile Crescent and spread to Europe, Asia, and Africa in a process that lasted a few thousand years. As a result, many populations became locally adapted to highly contrasting environmental conditions. Hybridization with wild goat species also occurred, playing a role in goat evolution through adaptive introgression. These phenomena, combined with more recent human-mediated selection, shaped the global diversity we observe today. VarGoats is a large-scale collaborative effort to assess global genomic variation in goats. Currently, the project has assembled a database of 1327 genomes from 133 local and transboundary domestic goat populations from 4 continents (Europe, Africa, Asia, and Oceania), and 45 genomes from 8 wild goat species. Variant calling followed by quality filtering retained a data set of > 28M biallelic SNPs. Preliminary evaluations showed that commonly adopted variant filtering approaches relying on Minor Allele Frequency (MAF) and Linkage Disequilibrium (LD) may not be suitable for a data set representative of global diversity across multiple species, because LD structure and the presence and frequency of variants differ notably between the local and the global scale. We therefore devised a novel approach based on Minor Allele Count (MAC) and marker spacing (bp-space), specifically designed to avoid the biases introduced by standard filtering procedures and to adequately represent continental and species-specific variation. A comparison of MAF+LD pruning with the newly proposed MAC+bp-space method showed that the latter thins the starting ca. 28M variants down to ca. 13M with only a negligible reduction (1.52%) in bezoar and wild goat diversity. In contrast, LD-based filtering would have caused a loss of 7.55% of bezoar-specific markers and of 20.59% of wild goat-specific variants, potentially hampering downstream analyses.
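As an illustration only, the general idea of combining a minor allele count threshold with a minimum marker spacing can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the VarGoats pipeline: the function names, the genotype encoding (alternate-allele counts 0/1/2, `None` for missing calls), the greedy left-to-right thinning rule, and the default thresholds `min_mac=2` and `min_spacing=100` are all hypothetical, since the abstract does not report the values or the tool used.

```python
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

# (chromosome, position in bp, alt-allele count per sample; None = missing call)
Variant = Tuple[str, int, Sequence[Optional[int]]]


def minor_allele_count(genotypes: Sequence[Optional[int]]) -> int:
    """Count copies of the rarer allele among called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    alt = sum(called)
    total = 2 * len(called)  # two alleles per called diploid sample
    return min(alt, total - alt)


def mac_bp_filter(
    variants: Iterable[Variant],
    min_mac: int = 2,        # hypothetical threshold, not taken from the abstract
    min_spacing: int = 100,  # hypothetical spacing in bp, not taken from the abstract
) -> List[Variant]:
    """Keep variants with MAC >= min_mac, then thin greedily by position.

    A count threshold does not scale with sample size, so an allele carried
    by a few animals of a small wild population can survive even though its
    frequency in the whole data set is very low.
    """
    kept: List[Variant] = []
    last_kept_pos: Dict[str, int] = {}
    for chrom, pos, gts in sorted(variants, key=lambda v: (v[0], v[1])):
        if minor_allele_count(gts) < min_mac:
            continue  # allele too rare to be counted reliably
        prev = last_kept_pos.get(chrom)
        if prev is not None and pos - prev < min_spacing:
            continue  # too close to the previously kept marker
        kept.append((chrom, pos, gts))
        last_kept_pos[chrom] = pos
    return kept
```

Unlike LD pruning, the spacing rule in this sketch depends only on physical position, so it does not rely on an LD structure that may differ between domestic populations and wild species.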
- Conference papers
- Daejeon, South Korea
- large-scale genomics
- biodiversity
- goats and related species
- [SDV]Life Sciences [q-bio]
- [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN]