13 research outputs found

    Comparison of variant calling methods for whole genome sequencing data in dairy cattle

    Accurate identification of SNPs from next-generation sequencing data is crucial for high-quality downstream analysis. Whole genome sequence data of 65 key ancestors of genotyped Swiss dairy populations were available for investigation (24 billion reads, 96.8% mapped to UMD31, 12x coverage). Four publicly available variant calling programmes were assessed, and different levels of pre-calling handling were tested and compared for each method. SNP concordance was examined with Illumina’s BovineHD Genotyping BeadChip®. Depending on the variant calling software used, between 16,894,054 and 22,048,382 SNPs were identified (multi-sample calling). A total of 14,644,310 SNPs were identified by all four variant callers (multi-sample calling). InDel counts ranged from 1,997,791 to 2,857,754; 1,708,649 InDels were identified by all four variant callers. A minimum of pre-calling data handling resulted in the highest non-reference sensitivity and the lowest non-reference discrepancy rates.

    The importance of identity-by-state information for the accuracy of genomic selection

    Background: It is commonly assumed that prediction of genome-wide breeding values in genomic selection is achieved by capitalizing on linkage disequilibrium between markers and QTL but also on genetic relationships. Here, we investigated the reliability of predicting genome-wide breeding values based on population-wide linkage disequilibrium information, based on identity-by-descent relationships within the known pedigree, and to what extent linkage disequilibrium information improves predictions based on identity-by-descent genomic relationship information. Methods: The study was performed on milk, fat, and protein yield, using genotype data on 35 706 SNP and deregressed proofs of 1086 Italian Brown Swiss bulls. Genome-wide breeding values were predicted using a genomic identity-by-state relationship matrix and a genomic identity-by-descent relationship matrix (averaged over all marker loci). The identity-by-descent matrix was calculated by linkage analysis using one to five generations of pedigree data. Results: We showed that genome-wide breeding values prediction based only on identity-by-descent genomic relationships within the known pedigree was at least as reliable as that based on identity-by-state, which implicitly also accounts for genomic relationships that occurred before the known pedigree. Furthermore, combining the two matrices did not improve the prediction compared to using identity-by-descent alone. Including different numbers of generations in the pedigree showed that most of the information in genome-wide breeding values prediction comes from animals with known common ancestors less than four generations back in the pedigree. Conclusions: Our results show that, in pedigreed breeding populations, the accuracy of genome-wide breeding values obtained by identity-by-descent relationships was not improved by identity-by-state information. Although, in principle, genomic selection based on identity-by-state does not require pedigree data, it does use the available pedigree structure. Our findings may explain why the prediction equations derived for one breed may not predict accurate genome-wide breeding values when applied to other breeds, since family structures differ among breeds.

    Nonsense-Mediated Decay Enables Intron Gain in Drosophila

    Intron number varies considerably among genomes, but despite their fundamental importance, the mutational mechanisms and evolutionary processes underlying the expansion of intron number remain unknown. Here we show that Drosophila, in contrast to most eukaryotic lineages, is still undergoing a dramatic rate of intron gain. These novel introns carry significantly weaker splice sites that may impede their identification by the spliceosome. Novel introns are more likely to encode a premature termination codon (PTC), indicating that nonsense-mediated decay (NMD) functions as a backup for weak splicing of new introns. Our data suggest that new introns originate when genomic insertions with weak splice sites are hidden from selection by NMD. This mechanism reduces the sequence requirement imposed on novel introns and implies that the capacity of the spliceosome to recognize weak splice sites was a prerequisite for intron gain during eukaryotic evolution.

    Meta-analysis of genome-wide association studies for cattle stature identifies common genes that regulate body size in mammals

    Peer-reviewed. H.D.D., A.J.C., P.J.B. and B.J.H. would like to acknowledge the Dairy Futures Cooperative Research Centre for funding. H.P. and R.F. acknowledge funding from the German Federal Ministry of Education and Research (BMBF) within the AgroClustEr ‘Synbreed - Synergistic Plant and Animal Breeding’ (grant 0315527B). H.P., R.F., R.E. and K.-U.G. acknowledge the Arbeitsgemeinschaft Süddeutscher Rinderzüchter, the Arbeitsgemeinschaft Österreichischer Fleckviehzüchter and ZuchtData EDV Dienstleistungen for providing genotype data. A. Bagnato acknowledges the European Union (EU) Collaborative Project LowInputBreeds (grant agreement 222623) for providing Brown Swiss genotypes. Braunvieh Schweiz is acknowledged for providing Brown Swiss phenotypes. H.P. and R.F. acknowledge the German Holstein Association (DHV) and the Confederación de Asociaciones de Frisona Española (CONCAFE) for sharing genotype data. H.P. was financially supported by a postdoctoral fellowship from the Deutsche Forschungsgemeinschaft (DFG) (grant PA 2789/1-1). D.B. and D.C.P. acknowledge funding from the Research Stimulus Fund (11/S/112) and Science Foundation Ireland (14/IA/2576). M.S. and F.S.S. acknowledge the Canadian Dairy Network (CDN) for providing the Holstein genotypes. P.S. acknowledges funding from the Genome Canada project entitled ‘Whole Genome Selection through Genome Wide Imputation in Beef Cattle’ and acknowledges WestGrid and Compute/Calcul Canada for providing computing resources. J.F.T. was supported by the National Institute of Food and Agriculture, US Department of Agriculture, under awards 2013-68004-20364 and 2015-67015-23183. A. Bagnato, F.P., M.D. and J.W. acknowledge EU Collaborative Project Quantomics (grant agreement 222664) for providing Brown Swiss and Finnish Ayrshire sequences and genotypes. A.C.B. and R.F.V. acknowledge funding from the public–private partnership ‘Breed4Food’ (code BO-22.04-011-001-ASG-LR) and EU FP7 IRSES SEQSEL (grant 317697). A.C.B. and R.F.V. acknowledge CRV (Arnhem, the Netherlands) for providing data on Dutch and New Zealand Holstein and Jersey bulls.

    Stature is affected by many polymorphisms of small effect in humans [1]. In contrast, variation in dogs, even within breeds, has been suggested to be largely due to variants in a small number of genes [2,3]. Here we use data from cattle to compare the genetic architecture of stature to those in humans and dogs. We conducted a meta-analysis for stature using 58,265 cattle from 17 populations with 25.4 million imputed whole-genome sequence variants. Results showed that the genetic architecture of stature in cattle is similar to that in humans, as the lead variants in 163 significantly associated genomic regions (P < 5 × 10⁻⁸) explained at most 13.8% of the phenotypic variance. Most of these variants were noncoding, including variants that were also expression quantitative trait loci (eQTLs) and in ChIP-seq peaks. There was significant overlap in loci for stature with humans and dogs, suggesting that a set of common genes regulates body size in mammals.

    Estimates of missing heritability for complex traits in Brown Swiss cattle

    Background: Genomic selection estimates genetic merit based on dense SNP (single nucleotide polymorphism) genotypes and phenotypes. This requires that SNPs explain a large fraction of the genetic variance. The objectives of this work were: (1) to estimate the fraction of genetic variance explained by dense genome-wide markers using 54 K SNP chip genotyping, and (2) to evaluate the effect of alternative marker-based relationship matrices and corrections for the base population on the fraction of the genetic variance explained by markers. Methods: Two alternative marker-based relationship matrices were estimated using 35 706 SNPs on 1086 dairy bulls. Pedigree- and marker-based relationship matrices were fitted simultaneously or separately in an animal model to estimate the fraction of variance not explained by the markers, i.e. the fraction explained by the pedigree. The phenotypes considered in the analysis were the deregressed estimated breeding values (dEBV) for milk, fat and protein yield and for somatic cell score (SCS). Results: When dEBV were not sufficiently accurate (50 or 70%), the estimated fraction of the genetic variance explained by the markers was around 65% for yield traits and 45% for SCS. Scaling marker genotypes with locus-specific frequencies of heterozygotes slightly increased the variance explained by markers, compared with scaling with the average frequency of heterozygotes across loci. The fraction of the genetic variance explained by the markers, when estimated by fitting the two relationship matrices separately, followed the same trends but was underestimated. With less accurate dEBV estimates, the fraction of the genetic variance explained by markers was underestimated, which is probably an artifact due to the dEBV being estimated by a pedigree-based animal model. Conclusions: When using only highly accurate dEBV, the proportion of the genetic variance explained by the Illumina 54 K SNP chip was approximately 80% for Brown Swiss cattle. These results depend on the SNP chip used and the family structure of the population, i.e. denser SNP panels and closer family relationships are expected to result in a higher fraction of the variance explained by the SNPs.

    Estimates of marker effects for measures of milk flow in the Italian brown Swiss dairy cattle population

    Background: Milkability is a complex trait that is characterized by milk flow traits including average milk flow rate, maximum milk flow rate and total milking time. Milkability has long been recognized as an economically important trait that can be improved through selection. By improving milkability, management costs of milking decrease through reduced labor and improved efficiency of the automatic milking system, which has been identified as an important factor affecting net profit. The objective of this study was to identify markers associated with electronically measured milk flow traits in the Italian Brown Swiss population that could potentially improve selection based on genomic predictions. Results: Sires (n = 1351) of cows with milk flow information were genotyped for 33,074 single nucleotide polymorphism (SNP) markers distributed across 29 Bos taurus autosomes (BTA). Among the six milk flow traits collected (ascending time, time of plateau, descending time, total milking time, maximum milk flow and average milk flow), there were 6,929 (time of plateau) to 14,585 (maximum milk flow) significant SNP markers identified for each trait across all BTA. Unique regions were found for each of the six traits, providing evidence that each individual milk flow trait offers distinct genetic information about milk flow. This study was also successful in identifying functional processes and genes associated with SNPs that influence milk flow. Conclusions: In addition to verifying the presence of previously identified milking speed quantitative trait loci (QTL) within the Italian Brown Swiss population, this study revealed a number of genomic regions associated with milk flow traits that have never been reported as milking speed QTL. While several of these regions were not associated with a known gene or QTL, a number of regions were associated with QTL that have been formerly reported as regions associated with somatic cell count, somatic cell score and udder morphometrics. This provides further evidence of the complexity of milk flow traits and the underlying relationship they have with other economically important traits for dairy cattle. Improved understanding of the overall milking pattern will aid in identification of cows with lower management costs and improved udder health.

    Identification and validation of copy number variants in Italian Brown Swiss dairy cattle using Illumina Bovine SNP50 Beadchip®

    The determination of copy number variation (CNV) is very important for the evaluation of genomic traits in several species because CNVs are a major source of genetic variation, influencing gene expression, phenotypic variation, adaptation and the development of diseases. The aim of this study was to obtain a CNV genome map using the Illumina Bovine SNP50 BeadChip data of 651 bulls of the Italian Brown Swiss breed. PennCNV and SVS7 (Golden Helix) software were used for the detection of the CNVs and Copy Number Variation Regions (CNVRs). A total of 5,099 and 1,289 CNVs were identified with PennCNV and SVS7 software, respectively. These were grouped at the population level into 1,101 (220 losses, 774 gains, 107 complex) and 277 (185 losses, 56 gains and 36 complex) CNVRs. Ten of the selected CNVRs were experimentally validated with a qPCR experiment. GO and pathway analyses were conducted and identified genes (false discovery rate corrected) in the CNVRs related to biological process, cellular component, molecular function and metabolic pathways. Among those, we found the FCGR2B, PPARα, KATNAL1, DNAJC15, PTK2, TG, STAT family, NPM1, GATA2, LMF1 and ECHS1 genes, already known in the literature because of their association with various traits in cattle. Although there is variability in CNVR detection across methods and platforms, this study allowed the identification of CNVRs in Italian Brown Swiss, overlapping those already detected in other breeds and finding additional ones, thus producing new knowledge for association studies with traits of interest in cattle.

    Phospho-Profiling Linking Biology and Clinics in Pediatric Acute Myeloid Leukemia

    Aberrant activation of key signaling molecules is a hallmark of acute myeloid leukemia (AML) and may have prognostic and therapeutic implications. AML comprises several disease entities with a variety of genetic subtypes. A comprehensive model spanning from signal activation patterns in major genetic subtypes of pediatric AML (pedAML) to outcome prediction and pre-clinical response to signaling inhibitors has not yet been provided. We established a high-throughput flow-cytometry based method to assess activation of hallmark phospho-proteins (phospho-flow) in 166 bone-marrow derived pedAML samples under basal and cytokine stimulated conditions. We correlated levels of activated phospho-proteins at diagnosis with relapse incidence in intermediate (IR) and high risk (HR) subtypes. In parallel, we screened a set of signaling inhibitors for their efficacy against primary AML blasts in a flow-cytometry based ex vivo cytotoxicity assay and validated the results in a murine xenograft model. Certain phospho-signal patterns differ between genetic subtypes of pedAML. Some, such as pSTAT5, are seen consistently across all AML subtypes. In IR/HR subtypes, high levels of GM-CSF stimulated pSTAT5 and low levels of unstimulated pJNK correlated with increased relapse risk overall. Combination of GM-CSF/pSTAT5^high and basal/pJNK^low separated three risk groups among IR/HR subtypes. Out of 10 tested signaling inhibitors, midostaurin most effectively affected AML blasts and simultaneously blocked phosphorylation of multiple proteins, including STAT5. In a mouse xenograft model of KMT2A-rearranged pedAML, midostaurin significantly prolonged disease latency. Our study demonstrates the applicability of phospho-flow for relapse-risk assessment in pedAML, whereas functional phenotype-driven ex vivo testing of signaling inhibitors may allow individualized therapy.