21 research outputs found

    Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach

    Genome-wide association studies are now widely used in the livestock sector to estimate the association between single nucleotide polymorphisms (SNPs) distributed across the whole genome and one or more traits. As computational power increases, the use of machine learning techniques to analyze large genome-wide datasets becomes possible.

    Development of a 200 single nucleotide polymorphism panel for parentage assessment for 14 Italian goat breeds

    The recent availability of a medium-density SNP chip for goats offers the possibility to develop a useful and less expensive tool for parentage assessment. However, standard approaches to SNP selection for parentage assignment are still ineffective because information on marker positions is lacking. In this study, we describe the identification of a 200-SNP panel for parentage testing in goats. Data on 350 goats of 14 different Italian breeds genotyped with the Illumina 50K SNP array were provided by the Italian Goat Consortium (IGC). The 200-SNP panel was identified by a three-step procedure: 1) parentage assessment by Mendelian errors and genomic parentage to identify true parent-offspring pairs; 2) identification of informative SNPs by canonical discriminant analysis; and 3) reduction by Mendelian errors and stepwise regression. The 200-SNP panel was tested on pairwise comparisons of all animals at each locus. Sensitivity, specificity and accuracy of the panel were assessed. The probability of exclusion (Pe) and the probability of a random coincidental match inclusion (Pi) for each breed were estimated. The panel showed good assessment power, with high sensitivity (0.9429), specificity (1.0) and accuracy (0.99997). Pe values ranged from a minimum of 0.99999981 for Maltese from Sardinia to a maximum of 0.999999999996 for Nicastrese. We further reduced panel size by stepwise regression to 174 SNPs, which showed the same performance as the 200-SNP panel. The development of tools for parentage assessment could improve breeding management even in species with little genetic information, such as goats.
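The Mendelian-error check and the sensitivity, specificity and accuracy measures named in this abstract can be illustrated with a minimal sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele with -1 for a missing call, and it counts opposing homozygotes as Mendelian errors. The function names, the coding and the counts are hypothetical and are not taken from the study.

```python
import numpy as np

def mendelian_error_rate(parent, offspring):
    """Fraction of called loci where parent and offspring are opposing homozygotes.

    Genotypes are coded 0, 1, 2 (copies of one allele); -1 marks a missing call.
    This coding is an assumption made for the example.
    """
    parent = np.asarray(parent)
    offspring = np.asarray(offspring)
    called = (parent >= 0) & (offspring >= 0)
    # A true parent-offspring pair cannot be 0 vs 2 at the same locus.
    conflicts = called & (np.abs(parent - offspring) == 2)
    n_called = int(called.sum())
    return conflicts.sum() / n_called if n_called else float("nan")

def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Toy genotypes (hypothetical): one compatible and one incompatible offspring.
parent = [0, 2, 1, 2, 0, -1]
child_ok = [0, 1, 1, 2, 1, 2]
child_bad = [2, 0, 1, 0, 2, 1]
rate_ok = mendelian_error_rate(parent, child_ok)
rate_bad = mendelian_error_rate(parent, child_bad)

# Hypothetical counts of correctly and wrongly classified pairs.
sens, spec, acc = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

In practice a small tolerance for genotyping errors is usually allowed before a pair is excluded; that threshold would have to be chosen for the panel at hand.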

    Genomic retrospective evaluation of 20 years of selection in Italian Holstein bulls for feet and legs trait

    Under strong directional selection, allele frequencies change rapidly, allowing the identification of genomic regions carrying genes and variants that control selected traits, such as production, functional and morphological traits. Here we searched for selection sweeps by birth date regression on EBVs and the analysis of changes in allele frequencies. Genomic retrospective evaluation of recent selection was performed in 2918 Italian Holstein bulls born between 1979 and 2011. Genotype data from the SELMOL, PROZOO and INNOVAGEN projects were used. Estimated Breeding Values (EBVs) for 32 traits were provided by the Italian Holstein association (ANAFI). Bulls were genotyped with BovineSNP50 v.1 and BovineHD SNP chips. SNP positions were updated to UMD3.1 using SNPchiMp v.3. Genotypes were imputed using BEAGLE (v.3.3.4) to obtain HD genotypes for all individuals. After quality control, a total of 2918 animals and 613,956 SNPs were included in the working dataset. Birth date regressed on the Feet and Legs EBV shows a strong positive trend over the birth date interval analyzed. To detect the genomic regions involved, we first identified PLUS- and MINUS-variant animals for the target EBV over the total year range (134 bulls, group OVERALL) and within each birth year (130 bulls, group BY_YEAR). Then, SNP allele frequencies were obtained within each group for the PLUS- and MINUS-variant pools, and the absolute allele frequency difference (delta) was calculated. Mean delta values were estimated in overlapping sliding windows of 50 SNPs. Only windows with a mean delta above the 75th percentile + 1.5 × the interquartile range were retained. Of these, only regions that overlapped between the OVERALL and BY_YEAR groups were kept. These regions cover 0.84% of the total windows analyzed. Among these, two regions seem particularly interesting. The ~686 Kb region on BTA10 (from position 62,578 to 63,264 Kb) had the highest mean delta in BY_YEAR. The ~417 Kb region on BTA20 (from position 40,738 to 41,155 Kb) had the highest mean delta in OVERALL. Bioinformatic analysis is underway to identify candidate genes, QTLs and metabolic pathways under selection for this trait.
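The window filter described in this abstract (mean absolute allele frequency difference in overlapping windows of 50 SNPs, kept when above the 75th percentile plus 1.5 times the interquartile range) can be sketched as follows. The step of 25 SNPs between window starts, the function names and the simulated frequencies are assumptions made for illustration; the abstract states only that windows overlap.

```python
import numpy as np

def sliding_window_means(delta, window=50, step=25):
    """Mean delta in overlapping windows of `window` SNPs.

    `step` (the offset between window starts) is an assumed value.
    """
    delta = np.asarray(delta, dtype=float)
    starts = np.arange(0, len(delta) - window + 1, step)
    means = np.array([delta[s:s + window].mean() for s in starts])
    return starts, means

def outlier_windows(means):
    """Flag windows whose mean exceeds Q3 + 1.5 * IQR."""
    q1, q3 = np.percentile(means, [25, 75])
    threshold = q3 + 1.5 * (q3 - q1)
    return means > threshold, threshold

# Simulated allele frequencies for the PLUS and MINUS pools (synthetic data).
rng = np.random.default_rng(0)
freq_plus = rng.uniform(0.1, 0.9, size=1000)
freq_minus = np.clip(freq_plus + rng.normal(0, 0.02, size=1000), 0, 1)
# Plant one diverged region covering SNPs 400 to 449.
freq_minus[400:450] = np.clip(freq_plus[400:450] - 0.4, 0, 1)

delta = np.abs(freq_plus - freq_minus)
starts, means = sliding_window_means(delta)
flags, threshold = outlier_windows(means)
```

Running the same filter separately on the OVERALL and BY_YEAR groups and intersecting the flagged windows would correspond to the final retention step described in the abstract.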

    Short communication: Imputing genotypes using PedImpute fast algorithm combining pedigree and population information

    Routine genomic evaluations frequently include a preliminary imputation step, requiring high accuracy and reduced computing time. A new algorithm, PedImpute (http://dekoppel.eu/pedimpute/), was developed and compared with findhap (http://aipl.arsusda.gov/software/findhap/) and BEAGLE (http://faculty.washington.edu/browning/beagle/beagle.html), using 19,904 Holstein genotypes from a 4-country international collaboration (United States, Canada, UK, and Italy). Different scenarios were evaluated on a sample subset that included only single nucleotide polymorphisms from the Bovine low-density (LD) Illumina BeadChip (Illumina Inc., San Diego, CA). Comparative criteria were computing time, percentage of missing alleles, percentage of wrongly imputed alleles, and the allelic squared correlation. Imputation accuracy on ungenotyped animals was also analyzed. The algorithm PedImpute was slightly more accurate and faster than findhap and BEAGLE when sire, dam, and maternal grandsire were genotyped at high density. On the other hand, BEAGLE performed better than both PedImpute and findhap for animals with at least one close relative not genotyped or genotyped at low density. However, computing time and resources using BEAGLE were incompatible with routine genomic evaluations in Italy. Error rate and allelic squared correlation attained by PedImpute ranged from 0.2 to 1.1% and from 96.6 to 99.3%, respectively. When complete genomic information on sire, dam, and maternal grandsire is available, as is expected to be the case in the near future in (at least) dairy cattle, and considering the accuracies obtained and the computation time required, PedImpute represents a valuable choice in routine evaluations among the algorithms tested.
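The comparison criteria listed in this abstract (percentage of missing alleles, percentage of wrongly imputed alleles, allelic squared correlation) can be computed along the following lines. This is a sketch under one common set of definitions, with genotypes coded as 0, 1 or 2 allele copies and NaN for calls the imputation left missing; the exact definitions used in the paper may differ, and the function name and toy genotypes are hypothetical.

```python
import numpy as np

def imputation_metrics(true_geno, imputed_geno):
    """Return (% missing alleles, % wrong alleles, squared correlation).

    Genotypes are allele counts (0, 1, 2); NaN in `imputed_geno` marks a
    genotype that was left unimputed. These conventions are assumptions.
    """
    true_geno = np.asarray(true_geno, dtype=float)
    imputed_geno = np.asarray(imputed_geno, dtype=float)
    missing = np.isnan(imputed_geno)
    called = ~missing
    n_alleles = 2 * true_geno.size  # two alleles per genotype
    # The absolute difference in allele count is the number of wrong alleles.
    wrong = np.abs(true_geno[called] - imputed_geno[called]).sum()
    pct_missing = 100 * 2 * missing.sum() / n_alleles
    pct_wrong = 100 * wrong / n_alleles
    r = np.corrcoef(true_geno[called], imputed_geno[called])[0, 1]
    return pct_missing, pct_wrong, r ** 2

# Toy example (hypothetical genotypes): one wrong allele, one missing genotype.
true_geno = [0, 1, 2, 2, 1, 0, 1, 2]
imputed_geno = [0, 1, 2, 1, 1, 0, np.nan, 2]
pct_missing, pct_wrong, r2 = imputation_metrics(true_geno, imputed_geno)
```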

    Multibreed genomic evaluation for production traits of dairy cattle in the United States using single-step genomic best linear unbiased predictor

    Official multibreed genomic evaluations for dairy cattle in the United States are based on multibreed BLUP evaluation followed by single-breed estimation of SNP effects. Single-step genomic BLUP (ssGBLUP) allows the direct computation of genomic (G)EBV in a multibreed context. This work aimed to develop ssGBLUP multibreed genomic predictions for US dairy cattle using the algorithm for proven and young (APY) to compute the inverse of the genomic relationship matrix. Only purebred Ayrshire (AY), Brown Swiss (BS), Guernsey (GU), Holstein (HO), and Jersey (JE) animals were considered. A 3-trait model with milk (MY), fat (FY), and protein (PY) yields was applied using about 45 million phenotypes recorded from January 2000 to June 2020. The whole data set included about 29.5 million animals, of which almost 4 million were genotyped. All the effects in the model were breed specific, and breed was also considered as fixed unknown parent groups. Evaluations were done for (1) each single breed separately (single); (2) HO and JE together (HO_JE); (3) AY, BS, and GU together (AY_BS_GU); (4) all the 5 breeds together (5_BREEDS). Initially, 15k core animals were used in APY for AY_BS_GU and 5_BREEDS, but larger core sets with more animals from the least represented breeds were also tested. The HO_JE evaluation had a fixed set of 30k core animals, with an equal representation of the 2 breeds, whereas HO and JE single-breed analysis involved 15k core animals. Validation for cows was based on correlations between adjusted phenotypes and (G)EBV, whereas validation for bulls was based on the regression of daughter yield deviations on (G)EBV. Because breed was correctly considered in the model, BLUP results for single and multibreed analyses were the same. Under ssGBLUP, predictability and reliability for AY, BS, and GU were on average 7% and 2% lower in 5_BREEDS compared with single-breed evaluations, respectively. However, validation parameters for these 3 breeds became better than in the single-breed evaluations when 45k animals were included in the core set for 5_BREEDS. Evaluations for Holsteins were more stable across scenarios because of the greatest number of genotyped animals and amount of data. Combining AY, BS, and GU into one evaluation resulted in predictions similar to the ones from single breed, especially when using about 30k core animals in APY. The results showed that single-step large-scale multibreed evaluations are computationally feasible, but fine tuning is needed to avoid a reduction in reliability when numerically dominant breeds are combined. Having evaluations for AY, BS, and GU separated from HO and JE may reduce inflation of GEBV for the first 3 breeds.

    Adding evidence for a role of the SLITRK gene family in the pathogenesis of left displacement of the abomasum in Holstein-Friesian dairy cows

    Left displacement of the abomasum (LDA) is a frequent disease in Holstein-Friesian cattle with a large economic impact on dairy farms. LDA is a multi-factorial disease and genetics is known to play a role (low to moderate estimated heritability). It is therefore of interest to look for genetic polymorphisms associated with LDA in order to find the genes involved and clarify the pathogenesis of the condition. In a population of 62 Italian Holstein-Friesian cows, LDA cases and SNP genotypes were recorded, and a genome-wide association study (GWAS) was performed. A genetic signal of association with LDA was detected on BTA 12, within the sequence of the SLITRK5 gene. This gene is involved in neurological activities such as axonogenesis and synaptic transmission, and this may be related to abomasal hypomotility and atony, which are considered the primary causes of LDA. The results of this research suggest a role for the SLITRK5 gene, and more generally for all nearby genes of the SLITRK family, in the pathogenesis of LDA in dairy cattle. The involvement of the SLITRK gene family may lead to novel approaches for the prevention, diagnosis and therapy of LDA, and could also be used in enhanced selection schemes for healthier animals.

    Use of different statistical approaches to study selection signatures in sheep breeds farmed in Italy

    Natural and artificial selection affect genome structure, causing genetic variation between breeds. Dense marker maps of thousands of SNPs distributed across the whole genome allow for the investigation of chromosomal regions that differ between breeds. Several statistical approaches have been proposed to study selection signatures in livestock species. In this work, four approaches were used to study selection signatures in a sample of 496 sheep belonging to 20 Italian breeds that differ in geographical origin and production aptitude. The four approaches were: I) Fst Outlier Detection (FOD), implemented in the LOSITAN software; II) comparison of breed LS means of the sum of differences in SNP allele frequencies along sliding windows (SNP_DIFF); III) Correspondence Analysis (CA); IV) Canonical Discriminant Analysis (CDA). Animals were genotyped with the Illumina OvineSNP50 BeadChip. The first five chromosomes were considered. After data editing, a total of 20,194 SNPs were retained for the analysis. The different approaches were able to identify the same regions expressing variation between breeds. On OAR6, for example, all methods highlighted a region located between 35 and 41 Mb, where the BMPR1b and ABCG2 loci map. Moreover, SNPs able to differentiate between breeds were also detected at 76, 96 and 107 Mb, near the KIT, IL8 and SCD5 loci, respectively. All methods were able to discriminate breeds and, in general, a geographical pattern of variation was detected. However, each approach may supply a different kind of information. FOD detected a relatively low number of markers under divergent selection, but it was able to identify loci under balancing selection. CA and CDA decomposed the total variability of SNP markers among breeds into different and uncorrelated variables that could be useful for the identification of genes influencing complex traits.

    The assessment of inter-individual variation of whole-genome DNA sequence in 32 cows

    Despite the growing number of sequenced bovine genomes, the knowledge of the population-wide variation of sequences remains limited. In many studies, statistical methodology was not applied in order to relate findings in the sequenced samples to a population-wide level. Our goal was to assess the population-wide variation in DNA sequence based on whole-genome sequences of 32 Holstein-Friesian cows. The number of SNPs significantly varied across individuals. The number of identified SNPs increased with coverage, following a logarithmic curve. A total of 15,272,427 SNPs were identified, 99.16% of them being bi-allelic. Missense SNPs were classified into three categories based on their genomic location: housekeeping genes, genes undergoing strong selection, and genes neutral to selection. The number of missense SNPs was significantly higher within genes neutral to selection than in the other two categories. The number of variants located within 3'UTR and 5'UTR regions was also significantly different across gene families. Moreover, the number of insertions and deletions differed significantly among cows, varying between 261,712 and 330,103 insertions and from 271,398 to 343,649 deletions. Results not only demonstrate inter-individual variation in the number of SNPs and indels but also show that the number of missense SNPs differs across genes representing different functional backgrounds.
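The logarithmic relationship between sequencing coverage and the number of identified SNPs mentioned in this abstract could be fitted as in the sketch below. The model form n_snps = a + b * ln(coverage), the function name and the data points are assumptions chosen for illustration; the values are synthetic and are not results from the study.

```python
import numpy as np

def fit_log_curve(coverage, n_snps):
    """Least-squares fit of n_snps = a + b * ln(coverage); returns (a, b)."""
    b, a = np.polyfit(np.log(coverage), n_snps, deg=1)
    return a, b

# Synthetic points generated from a known curve (SNP counts in millions).
coverage = np.array([5.0, 10.0, 15.0, 20.0, 30.0])
n_snps = 4.0 + 1.2 * np.log(coverage)
a, b = fit_log_curve(coverage, n_snps)
```

With real per-cow coverage and SNP counts, the residuals of such a fit would indicate how much of the inter-individual variation is explained by coverage alone.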