9 research outputs found
A large interactive visual database of copy number variants discovered in taurine cattle
Background: Copy number variants (CNVs) contribute to genetic diversity and phenotypic variation. We aimed to discover CNVs in taurine cattle using a large collection of whole-genome sequences and to provide an interactive database of the identified CNV regions (CNVRs) that includes visualizations of sequence read alignments, CNV boundaries, and genome annotations.
Results: CNVs were identified in each of 4 whole-genome sequencing datasets, which together represent >500 bulls from 17 breeds, using a popular multi-sample read-depth-based algorithm, cn.MOPS. Quality control and CNVR construction, performed dataset-wise to avoid batch effects, resulted in 26,223 CNVRs covering 107.75 unique Mb (4.05%) of the bovine genome. Hierarchical clustering of samples by CNVR genotypes indicated clear separation by breeds. An interactive HTML database was created that offers data filtering options, provides graphical and tabular data summaries including Hardy-Weinberg equilibrium tests on genotype proportions, and displays genes and quantitative trait loci at each CNVR. Notably, the database provides sequence read alignments at each CNVR genotype and the boundaries of constituent CNVs in individual samples. Besides numerous novel discoveries, we corroborated the genotypes reported for a CNVR at the KIT locus known to be associated with the piebald coat colour phenotype in Hereford and some Simmental cattle.
Conclusions: We present a large, comprehensive collection of taurine cattle CNVs in a novel interactive visual database that displays CNV boundaries, read depths, and genome features for individual CNVRs, thus providing users with a powerful means to explore and scrutinize CNVRs of interest more thoroughly.
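The Hardy-Weinberg equilibrium tests mentioned in the Results can be illustrated with a minimal sketch. The abstract does not say which test the database uses, so the chi-square test with 1 degree of freedom below is an assumption, as are the function name and the idea of coding a deletion CNVR as three genotype classes (for example copy numbers 0, 1, and 2).

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed counts of the three genotype classes.
    Hypothetical helper; not taken from the database described above.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Classes with zero expected count are skipped to avoid division by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    print(hwe_chi_square(25, 50, 25))  # genotype counts in exact HWE proportions
    print(hwe_chi_square(50, 0, 50))   # complete lack of heterozygotes
```

A small p-value at a CNVR can point to genotyping error as well as to biological causes, which is one reason such a test is useful as a quality summary.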
Optimizing Selection of the Reference Population for Genotype Imputation From Array to Sequence Variants
Imputation of high-density genotypes to whole-genome sequences (WGS) is a cost-effective method to increase the density of available markers within a population. Imputed genotypes have been successfully used for genomic selection and discovery of variants associated with traits of interest for the population. To allow for the use of imputed genotypes for genomic analyses, accuracy of imputation must be high. Accuracy of imputation is influenced by multiple factors, such as size and composition of the reference group, and the allele frequency of variants included. Understanding the intended use of imputed WGSs prior to the generation of the reference population is important, as accurate imputation might be more focused, for instance, on common or on rare variants. The aim of this study was to present and evaluate new methods to select animals for sequencing relying on a previously genotyped population. The Genetic Diversity Index method optimizes the number of unique haplotypes in the future reference population, while the Highly Segregating Haplotype selection method targets haplotype alleles found throughout the majority of the population of interest. First, the WGSs of a dairy cattle population were simulated. The simulated sequences mimicked the linkage disequilibrium level and the variants’ frequency distribution observed in currently available Holstein sequences. Then, reference populations of different sizes, in which animals were selected using both novel methods proposed here as well as two other methods presented in previous studies, were created. Finally, accuracies of imputation obtained with different reference populations were compared against each other. The novel methods were found to have overall accuracies of imputation of more than 0.85. Accuracies of imputation of rare variants reached values above 0.50.
In conclusion, if imputed sequences are to be used for discovery of novel associations between variants and traits of interest in the population, animals carrying novel information should be selected and, consequently, the Genetic Diversity Index method proposed here may be used. If sequences are to be used to impute the overall genotyped population, a reference population consisting of carriers of common haplotypes selected using the proposed Highly Segregating Haplotype method is recommended.
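The idea of choosing animals so that the reference population carries as many unique haplotypes as possible can be sketched with a simple greedy heuristic. This is not the Genetic Diversity Index method itself, whose details are not given in the abstract; the function name, the haplotype labels, and the greedy strategy are all assumptions made for illustration.

```python
def select_for_unique_haplotypes(haplotypes_by_animal, n_select):
    """Greedily pick animals that add the most not-yet-covered haplotypes.

    haplotypes_by_animal maps an animal ID to an iterable of haplotype
    labels. Hypothetical sketch; ties are broken by animal ID order.
    """
    candidates = {a: set(h) for a, h in haplotypes_by_animal.items()}
    selected = []
    covered = set()
    while candidates and len(selected) < n_select:
        # Animal contributing the largest number of new haplotypes.
        best = max(sorted(candidates), key=lambda a: len(candidates[a] - covered))
        covered |= candidates.pop(best)
        selected.append(best)
    return selected, covered


if __name__ == "__main__":
    population = {
        "A": ["h1", "h2"],
        "B": ["h1", "h2", "h3"],
        "C": ["h4"],
        "D": ["h3"],
    }
    print(select_for_unique_haplotypes(population, 2))
```

A selection aimed at common haplotypes, as in the Highly Segregating Haplotype approach, would instead weight each haplotype by its frequency in the genotyped population.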
Optimizing selection of the reference population for genotype imputation from array to sequence variants
Imputation of high-density genotypes to whole-genome sequences (WGS) is a cost-effective method to increase the density of available markers within a population. Imputed genotypes have been successfully used for genomic selection and discovery of variants associated with traits of interest for the population. To allow for the use of imputed genotypes for genomic analyses, accuracy of imputation must be high. Accuracy of imputation is influenced by multiple factors, such as size and composition of the reference group, and the allele frequency of variants included. Understanding the intended use of imputed WGSs prior to the generation of the reference population is important, as accurate imputation might be more focused, for instance, on common or on rare variants. The aim of this study was to present and evaluate new methods to select animals for sequencing relying on a previously genotyped population. The Genetic Diversity Index method optimizes the number of unique haplotypes in the future reference population, while the Highly Segregating Haplotype selection method targets haplotype alleles found throughout the majority of the population of interest. First, the WGSs of a dairy cattle population were simulated. The simulated sequences mimicked the linkage disequilibrium level and the variants' frequency distribution observed in currently available Holstein sequences. Then, reference populations of different sizes, in which animals were selected using both novel methods proposed here as well as two other methods presented in previous studies, were created. Finally, accuracies of imputation obtained with different reference populations were compared against each other. The novel methods were found to have overall accuracies of imputation of more than 0.85. Accuracies of imputation of rare variants reached values above 0.50.
In conclusion, if imputed sequences are to be used for discovery of novel associations between variants and traits of interest in the population, animals carrying novel information should be selected and, consequently, the Genetic Diversity Index method proposed here may be used. If sequences are to be used to impute the overall genotyped population, a reference population consisting of carriers of common haplotypes selected using the proposed Highly Segregating Haplotype method is recommended.
High confidence copy number variants identified in Holstein dairy cattle from whole genome sequence and genotype array data
Multiple methods to detect copy number variants (CNV) relying on different types of data have been developed and CNV have been shown to have an impact on phenotypes of numerous traits of economic importance in cattle, such as reproduction and immunity. Further improvements in CNV detection are still needed with regard to the trade-off between high true-positive and low false-positive variant identification rates. Instead of improving single CNV detection methods, variants can be identified in silico with high confidence when multiple methods and datasets are combined. Here, CNV were identified from whole-genome sequences (WGS) and genotype array (GEN) data on 96 Holstein animals. After CNV detection, two sets of high confidence CNV regions (CNVR) were created that contained variants found in both WGS and GEN data following an animal-based (n = 52) and a population-based (n = 36) pipeline. Furthermore, the change in false positive CNV identification rates using different GEN marker densities was evaluated. The population-based approach characterized CNVR, which were more often shared among animals (average 40% more samples per CNVR) and were more often linked to putative functions (48 vs 56% of CNVR) than CNV identified with the animal-based approach. Moreover, false positive identification rates up to 22% were estimated on GEN information. Further research using larger datasets should use a population-wide approach to identify high confidence CNVR.
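Keeping only variants supported by both data types can be sketched as an interval-overlap filter. The version below loosely mirrors an animal-based comparison, where a WGS call is kept if the same animal has a GEN call of the same type overlapping it. The record layout, the function names, and the 50% overlap criterion are assumptions; the abstract does not state the rule the authors applied.

```python
from typing import NamedTuple


class Cnv(NamedTuple):
    animal: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # inclusive
    kind: str   # "del" or "dup"


def overlap_bp(a, b):
    """Number of base pairs shared by two calls (0 if on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def supported_by_both(wgs_calls, gen_calls, min_fraction=0.5):
    """Return WGS calls matched by a GEN call in the same animal.

    A match needs the same CNV type and an overlap covering at least
    min_fraction of the shorter call (assumed threshold).
    """
    kept = []
    for w in wgs_calls:
        for g in gen_calls:
            if w.animal != g.animal or w.kind != g.kind:
                continue
            shared = overlap_bp(w, g)
            shorter = min(w.end - w.start + 1, g.end - g.start + 1)
            if shared > 0 and shared / shorter >= min_fraction:
                kept.append(w)
                break
    return kept
```

A population-based variant of this filter would drop the same-animal condition and compare regions pooled across all animals instead.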
Genetic and genomic analysis of hyperthelia in Brown Swiss cattle
Supernumerary teats (SNT) are any abnormal teats found on a calf in addition to the usual and functional 4 teats. The presence of SNT has also been termed "hyperthelia" since the end of the 19th century. Supernumerary teats can act as an incubator for bacteria, infecting the whole udder, and can interfere with the positioning of the milking machine, and consequently, have economic relevance. Different types of SNT are observed at different positions on the udder. Caudal teats are in the rear, ramal teats are attached to another teat, and intercalary teats are found between 2 regular teats. Not all teats are equally developed; some are completely functional but most are rudimentary and not attached to any mammary gland tissue. Recently, different studies showed the poly/oligogenic character of these malformations in cattle as well as in other mammalian species. The objective of this study was to analyze the genetic architecture and incidence of hyperthelia in Swiss Brown Swiss cattle using both traditional genetic evaluation and imputed whole genome sequence variant information. First, phenotypes collected over the last 20 yr were used together with pedigree information for estimation of genetic variance. Second, breeding values of Brown Swiss bulls were estimated applying the BLUP algorithm. The BLUP-EBV were deregressed and used as phenotypes in genome-wide association studies. The gene LGR5 on chromosome 5 was identified as a candidate for the presence of SNT. Using alternative trait coding, genomic regions on chromosomes 17 and 20 were also identified as being involved in the development of SNT with their own supernumerary mammary gland tissue. Implementing knowledge gained in this study as a routine application allows a more accurate evaluation of the trait and reduction of SNT prevalence in the Swiss Brown Swiss cattle population.
Genome-wide association study for supernumerary teats in Swiss Brown Swiss Cattle reveals LGR5 as a major gene on chromosome 5
Objectives
- To investigate the genetic architecture of the supernumerary teats through genome-wide association studies (GWAS) performed with imputed whole-genome sequence data
- To demonstrate the presence of a major gene influencing the presence of SN
Genome-wide association study between copy number variants and hoof health traits in Holstein dairy cattle.
Genome-wide association studies based on SNP have been completed for multiple traits in dairy cattle; however, copy number variants (CNV) could add genomic information that has yet to be harnessed. The objectives of this study were to identify CNV in genotyped Holstein animals and assess their association with hoof health traits using deregressed estimated breeding values as pseudophenotypes. A total of 23,256 CNV comprising 1,645 genomic regions were identified in 5,845 animals. Fourteen genomic regions harboring structural variations, including 9 deletions and 5 duplications, were associated with at least 1 of the studied hoof health traits. This group of traits included digital dermatitis, interdigital dermatitis, heel horn erosion, sole ulcer, white line lesion, sole hemorrhage, and interdigital hyperplasia; no regions were associated with toe ulcer. Twenty candidate genes overlapped with the regions associated with these traits including SCART1, NRXN2, KIF26A, GPHN, and OR7A17. In this study, an effect on infectious hoof lesions could be attributed to the PRAME (Preferentially Expressed Antigen in Melanoma) gene. Almost all genes detected in association with noninfectious hoof lesions could be linked to known metabolic disorders. Incorporating information on the CNV associated with the traits of interest in this study could improve the accuracy of estimated breeding values. This may further increase the genetic gain for these traits in the Canadian Holstein population, thus reducing the involuntary animal losses due to lameness.
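Collapsing many individual CNV calls into a smaller number of genomic regions is commonly done by merging overlapping intervals. The sketch below shows that general technique only; the abstract does not describe how the study defined its regions, and the function name and tuple layout are assumptions.

```python
def merge_into_regions(cnvs):
    """Merge overlapping CNV calls into regions.

    cnvs is an iterable of (chrom, start, end) tuples with inclusive
    coordinates. Calls on the same chromosome that share at least one
    position are merged into a single region. Hypothetical helper.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        last = regions[-1] if regions else None
        if last and last[0] == chrom and start <= last[2]:
            last[2] = max(last[2], end)  # extend the current region
        else:
            regions.append([chrom, start, end])
    return [tuple(r) for r in regions]


if __name__ == "__main__":
    calls = [("1", 150, 300), ("1", 100, 200), ("1", 400, 500), ("2", 100, 200)]
    print(merge_into_regions(calls))
```

In practice deletions and duplications are often merged separately so that each region keeps a single variant type.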
Genome-wide association analyses reveals copy number variant regions associated with reproduction and disease traits in Canadian Holstein cattle.
This study aimed to evaluate the impact of copy number variants (CNVs) on 13 reproduction and 12 disease traits in Holstein cattle. Intensity signal files containing Log R ratio and B allele frequency information from 13,730 Holstein animals genotyped with a 95K SNP panel, and 8,467 Holstein animals genotyped with a 50K SNP panel were used to identify the CNVs. Subsequently, the identified CNVs were validated using whole genome sequence data from 126 animals, resulting in 870 high-confidence CNV regions (CNVRs) on 12,131 animals. Out of these, 54 CNVRs had frequencies higher than or equal to 1% in the population and were used in the genome-wide association analysis (one CNVR at a time, including the G matrix). Results revealed that 4 CNVRs were significantly (p-value < 3.7 × 10⁻⁵) associated with at least one of the traits analyzed in this study. Specifically, 2 CNVRs were associated with 3 reproduction traits (i.e., calf survival, first service to conception, and non-return rate), and 2 CNVRs were associated with 2 disease traits (i.e., metritis and retained placenta). These CNVRs harbored genes implicated in immune response, cellular signaling, and neuronal development, supporting their potential involvement in these traits. Further investigations to unravel the mechanistic and functional implications of these CNVRs on the mentioned traits are warranted.
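The abstract does not say how the significance threshold was derived. One reading consistent with the reported numbers is a Bonferroni correction over all CNVR-trait combinations, which the short calculation below reproduces. The nominal level of 0.05 and the function name are assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    n_cnvrs = 54          # CNVRs with frequency >= 1% (from the abstract)
    n_traits = 13 + 12    # reproduction plus disease traits (from the abstract)
    alpha = 0.05          # assumed nominal level, not stated in the abstract
    threshold = bonferroni_threshold(alpha, n_cnvrs * n_traits)
    print(f"{n_cnvrs * n_traits} tests, threshold = {threshold:.1e}")
```

With 54 × 25 = 1,350 tests, 0.05 / 1,350 is approximately 3.7 × 10⁻⁵, matching the reported cutoff.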
Genetic mechanisms underlying feed utilization and implementation of genomic selection for improved feed efficiency in dairy cattle
The economic importance of genetically improving feed efficiency has been recognized by cattle producers worldwide. It has the potential to considerably reduce costs, minimize environmental impact, optimize land and resource use efficiency, and improve the overall cattle industry’s profitability. Feed efficiency is a genetically complex trait that can be described as units of product output (e.g. milk yield) per unit of feed input. The main objective of this review paper is to present an overview of the main genetic and physiological mechanisms underlying feed utilization in ruminants and the process towards implementation of genomic selection for feed efficiency in dairy cattle. In summary, feed efficiency can be improved via numerous metabolic pathways and biological mechanisms through genetic selection. Various studies have indicated that feed efficiency is heritable and genomic selection can be successfully implemented in dairy cattle with a large enough training population. In this context, some organizations have worked collaboratively to do research and develop training populations for successful implementation of joint international genomic evaluations. The integration of “-omics” technologies, further investments in high-throughput phenotyping, and identification of novel indicator traits will also be paramount in maximizing the rates of genetic progress for feed efficiency in dairy cattle worldwide.