176 research outputs found

    Meta-analysis of genome-wide association studies for cattle stature identifies common genes that regulate body size in mammals

    Stature is affected by many polymorphisms of small effect in humans [1]. In contrast, variation in dogs, even within breeds, has been suggested to be largely due to variants in a small number of genes [2,3]. Here we use data from cattle to compare the genetic architecture of stature to those in humans and dogs. We conducted a meta-analysis for stature using 58,265 cattle from 17 populations with 25.4 million imputed whole-genome sequence variants. Results showed that the genetic architecture of stature in cattle is similar to that in humans, as the lead variants in 163 significantly associated genomic regions (P < 5 × 10⁻⁸) explained at most 13.8% of the phenotypic variance. Most of these variants were noncoding, including variants that were also expression quantitative trait loci (eQTLs) and in ChIP–seq peaks. There was significant overlap in loci for stature with humans and dogs, suggesting that a set of common genes regulates body size in mammals

    Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    BACKGROUND: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. RESULTS: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10⁻⁹ change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10⁻⁹ change/site/year) was approximately half of the overall rate (1.9–2.0 × 10⁻⁹ change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. CONCLUSION: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies

    Box–Cox Transformation and Random Regression Models for Fecal Egg Count Data

    Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants, fecal egg count (FEC) is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used in an effort to achieve normality before analysis. However, the transformed data are often still not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box–Cox transformation to approach normality and to estimate (co)variance components. We also proposed using random regression models (RRM) for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4) adjusted FEC data best. Results indicated that transforming FEC data with the Box–Cox transformation family effectively reduced skewness and kurtosis and dramatically increased estimates of heritability. FEC measurements obtained between weeks 12 and 26 of a 26-week experimental challenge period were genetically correlated
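
The idea of transforming skewed counts toward normality can be illustrated with a minimal sketch. It uses the standard one-parameter Box–Cox transform, not the extension the abstract refers to; the `shift` offset, the choice of `lam`, and the sample counts are all hypothetical and chosen only for illustration.

```python
import math

def box_cox(y, lam, shift=0.0):
    """Standard Box-Cox transform of one value.

    `shift` is a hypothetical offset (not from the abstract) that keeps
    zero egg counts strictly positive before transforming.
    """
    v = y + shift
    if v <= 0:
        raise ValueError("y + shift must be positive")
    if lam == 0:
        return math.log(v)           # limiting case as lam -> 0
    return (v ** lam - 1.0) / lam

def skewness(xs):
    """Population skewness: third central moment over variance^1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Made-up, right-skewed counts standing in for FEC data.
fec = [0, 50, 100, 150, 200, 400, 800, 3200]
transformed = [box_cox(y, lam=0.0, shift=1.0) for y in fec]

print("skewness before:", skewness(fec))
print("skewness after: ", skewness(transformed))
```

In practice the value of `lam` would be estimated from the data (for example by maximum likelihood) rather than fixed in advance as it is here.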

    Application of machine learning in SNP discovery

    BACKGROUND: Single nucleotide polymorphisms (SNP) constitute more than 90% of the genetic variation, and hence can account for most trait differences among individuals in a given species. Polymorphism detection software PolyBayes and PolyPhred give high false positive SNP predictions even with stringent parameter values. We developed a machine learning (ML) method to augment PolyBayes to improve its prediction accuracy. ML methods have also been successfully applied to other bioinformatics problems in predicting genes, promoters, transcription factor binding sites and protein structures. RESULTS: The ML program C4.5 was applied to a set of features in order to build a SNP classifier from training data based on human expert decisions (True/False). The training data were 27,275 candidate SNP generated by sequencing 1973 STS (sequence tag sites) (12 Mb) in both directions from 6 diverse homozygous soybean cultivars and PolyBayes analysis. Test data of 18,390 candidate SNP were generated similarly from 1359 additional STS (8 Mb). SNP from both sets were classified by experts. After training the ML classifier, it agreed with the experts on 97.3% of test data compared with 7.8% agreement between PolyBayes and experts. The PolyBayes positive predictive values (PPV) (i.e., fraction of candidate SNP being real) were 7.8% for all predictions and 16.7% for those with 100% posterior probability of being real. Using ML improved the PPV to 84.8%, a 5- to 10-fold increase. While both ML and PolyBayes produced a similar number of true positives, the ML program generated only 249 false positives as compared to 16,955 for PolyBayes. The complexity of the soybean genome may have contributed to high false SNP predictions by PolyBayes and hence results may differ for other genomes. CONCLUSION: A machine learning (ML) method was developed as a supplementary feature to the polymorphism detection software for improving prediction accuracies. The results from this study indicate that a trained ML classifier can significantly reduce human intervention and in this case achieved a 5–10 fold enhanced productivity. The optimized feature set and ML framework can also be applied to all polymorphism discovery software. ML support software is written in Perl and can be easily integrated into an existing SNP discovery pipeline.
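
The positive predictive values quoted in this abstract can be checked with a short calculation. The sketch below assumes that every one of the 18,390 test candidates counts as a PolyBayes positive, so that its true positives equal candidates minus false positives; the ML true-positive count is not reported and is only back-calculated from the stated PPV. Variable names are illustrative.

```python
def ppv(true_pos, false_pos):
    """Positive predictive value: fraction of positive calls that are real."""
    return true_pos / (true_pos + false_pos)

# Figures reported in the abstract.
candidates = 18_390          # test-set candidate SNP
polybayes_fp = 16_955        # PolyBayes false positives
ml_fp = 249                  # ML classifier false positives
ml_ppv_reported = 0.848      # reported ML PPV

# Assumption: all test candidates are PolyBayes positives.
polybayes_tp = candidates - polybayes_fp
polybayes_ppv = ppv(polybayes_tp, polybayes_fp)

# Back-calculated (not reported): true positives implied by the ML PPV.
ml_tp_implied = ml_fp * ml_ppv_reported / (1.0 - ml_ppv_reported)

print(f"PolyBayes PPV: {polybayes_ppv:.1%}")
print(f"Implied ML true positives: {ml_tp_implied:.0f}")
print(f"Fold increase vs. all predictions: {ml_ppv_reported / polybayes_ppv:.1f}")
```

Under these assumptions the two methods end up with a similar number of true positives, which matches the abstract's statement, and the fold increase relative to the 7.8% and 16.7% baselines falls in the reported 5- to 10-fold range.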

    SNP-PHAGE – High throughput SNP discovery pipeline

    BACKGROUND: Single nucleotide polymorphisms (SNPs) as defined here are single base sequence changes or short insertion/deletions between or within individuals of a given species. As a result of their abundance and the availability of high throughput analysis technologies, SNP markers have begun to replace other traditional markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs) and simple sequence repeats (SSRs or microsatellite) markers for fine mapping and association studies in several species. For SNP discovery from chromatogram data, several bioinformatics programs have to be combined to generate an analysis pipeline. Results have to be stored in a relational database to facilitate interrogation through queries or to generate data for further analyses such as determination of linkage disequilibrium and identification of common haplotypes. Although these tasks are routinely performed by several groups, an integrated open source SNP discovery pipeline that can be easily adapted by new groups interested in SNP marker development is currently unavailable. RESULTS: We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied for analyzing sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. This package was developed on UNIX/Linux platform, written in Perl and uses a MySQL database. Scripts to generate a user-friendly web interface are also provided with common queries for preliminary data analysis. A machine learning tool developed by this group for increasing the efficiency of SNP discovery is integrated as a part of this package as an optional feature. The SNP-PHAGE package is being made available open source at . 
CONCLUSION: SNP-PHAGE provides a bioinformatics solution for high throughput SNP discovery, identification of common haplotypes within an amplicon, and GenBank (dbSNP) submissions. SNP selection and visualization are aided through a user-friendly web interface. This tool is useful for analyzing sequence tagged sites (STSs) of genomic sequences, and this software can serve as a starting point for groups interested in developing SNP markers

    Variants Within Genes EDIL3 and ADGRB3 are Associated With Divergent Fecal Egg Counts in Katahdin Sheep at Weaning

    Gastrointestinal nematodes (GIN) pose a severe threat to sheep production worldwide. Anthelmintic drug resistance coupled with growing concern regarding potential environmental effects of drug use have demonstrated the necessity of implementing other methods of GIN control. The aim of this study was to test for genetic variants associated with resistance or susceptibility to GIN in Katahdin sheep to improve the current understanding of the genetic mechanisms responsible for host response to GIN. Linear regression and case-control genome-wide association studies were conducted with high-density genotype data and cube-root transformed weaning fecal egg counts (tFEC) of 583 Katahdin sheep. The case-control GWAS identified two significant SNPs (P-values 1.49e-08 to 1.01e-08) within introns of the gene adhesion G protein-coupled receptor B3 (ADGRB3) associated with lower fecal egg counts. With linear regression, four significant SNPs (P-values 7.82e-08 to 3.34e-08) were identified within the first intron of the gene EGF-like repeats and discoidin domains 3 (EDIL3). These identified SNPs were in very high linkage disequilibrium (r² of 0.996–1), and animals with alternate homozygous genotypes had significantly higher median weaning tFEC phenotypes compared to all other genotypes. Significant SNPs were queried through public databases to identify putative transcription factor binding site (TFBS) and potential lncRNA differences between reference and alternate alleles. Changes in TFBS were predicted at two SNPs, and one significant SNP was found to be within a predicted lncRNA sequence with greater than 90% similarity to a known lncRNA in the bovine genome. The gene EDIL3 has been described in other species for its roles in the inhibition and resolution of inflammation. 
Potential changes of EDIL3 expression mediated through lncRNA expression and/or transcription factor binding may impact the overall immune response and reduce the ability of Katahdin sheep to control GIN infection. This study lays the foundation for further research of EDIL3 and ADGRB3 towards understanding genetic mechanisms of susceptibility to GIN, and suggests these SNPs may contribute to genetic strategies for improving parasite resistance traits in sheep

    Scaling up community-based goat breeding programmes via multi-stakeholder collaboration

    Community-based livestock breeding programmes (CBBPs) have emerged as a potential approach to implement sustainable livestock breeding in smallholder systems. In Malawi and Uganda, goat CBBPs were introduced to improve production and productivity of indigenous goats through selective breeding. Scaling up CBBPs has recently received support owing to evidence-based results from current implementation and results of CBBPs implemented in other regions of the world. This paper explores strategies for scaling up goat CBBPs in Malawi and Uganda, and documents experiences and lessons learned during implementation of the programme. A number of stakeholders supporting goat-based interventions for improving smallholders’ livelihoods exist. This offers an opportunity for different actors to work together by pooling financial resources and technical expertise for establishment and sustainability of goat CBBPs. Scaling up strategies should be an integral part of the pilot design; hence, dissemination partners need to be engaged during the design and inception stages of the pilot CBBPs. Creation of self-sustaining CBBPs requires early collaborative programme planning, meaningful investment and long-term concerted and coordinated efforts by collaborating partners. Permanently established actors, like government agencies and research and training institutions, are better placed to coordinate such efforts. The overall goal of the scaling up programme should be creation of a financially sustainable system, in which smallholders are able, on their own, to transact and sustain operations of their local breeding institutions using locally generated revenue/resources. Since CBBP scaling up is a ‘learning by doing’ process, an effective monitoring and evaluation system should be an integral part of the process

    Development and Characterization of a High Density SNP Genotyping Assay for Cattle

    The success of genome-wide association (GWA) studies for the detection of sequence variation affecting complex traits in humans has spurred interest in the use of large-scale high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for marker-assisted selection in model and agricultural species. A cost-effective and efficient approach for the development of a custom genotyping assay interrogating 54,001 SNP loci to support GWA applications in cattle is described. A novel algorithm for achieving a compressed inter-marker interval distribution proved remarkably successful, with median interval of 37 kb and maximum predicted gap of <350 kb. The assay was tested on a panel of 576 animals from 21 cattle breeds and six outgroup species and revealed that from 39,765 to 46,492 SNP are polymorphic within individual breeds (average minor allele frequency (MAF) ranging from 0.24 to 0.27). The assay also identified 79 putative copy number variants in cattle. Utility for GWA was demonstrated by localizing known variation for coat color and the presence/absence of horns to their correct genomic locations. The combination of SNP selection and the novel spacing algorithm allows an efficient approach for the development of high-density genotyping platforms in species having full or even moderate quality draft sequence. Aspects of the approach can be exploited in species which lack an available genome sequence. The BovineSNP50 assay described here is commercially available from Illumina and provides a robust platform for mapping disease genes and QTL in cattle

    Genome wide CNV analysis reveals additional variants associated with milk production traits in Holsteins

    Milk production is an economically important sector of global agriculture. Much attention has been paid to the identification of quantitative trait loci (QTL) associated with milk, fat, and protein yield and the genetic and molecular mechanisms underlying them. Copy number variation (CNV) is an emerging class of variants which may be associated with complex traits. In this study, we performed a genome-wide association between CNVs and milk production traits in 26,362 Holstein bulls and cows. A total of 99 candidate CNVs were identified using Illumina BovineSNP50 array data, and association tests for each production trait were performed using a linear regression analysis with PCA correction. A total of 34 CNVs on 22 chromosomes were significantly associated with at least one milk production trait after false discovery rate (FDR) correction. Some of those CNVs were located within or near known QTL for milk production traits. We further investigated the relationship between associated CNVs and neighboring SNPs. For all 82 combinations of traits and CNVs (less than 400 kb in length), we found 17 cases where CNVs directly overlapped with tag SNPs and 40 cases where CNVs were adjacent to tag SNPs. In 5 cases, the CNVs were in strong linkage disequilibrium with tag SNPs, either within or adjacent to the same haplotype block. There were an additional 20 cases where CNVs did not have a significant association with SNPs, suggesting that the effects of those CNVs were probably not captured by tag SNPs. We conclude that combining CNV with SNP analyses reveals more genetic variations underlying milk production traits than those revealed by SNPs alone. https://doi.org/10.1186/1471-2164-15-68

    Genome to Phenome: Improving Animal Health, Production, and Well-Being – A New USDA Blueprint for Animal Genome Research 2018–2027

    In 2008, a consortium led by the Agricultural Research Service (ARS) and the National Institute for Food and Agriculture (NIFA) published the “Blueprint for USDA Efforts in Agricultural Animal Genomics 2008–2017,” which served as a guiding document for research and funding in animal genomics. In the decade that followed, many of the goals set forth in the blueprint were accomplished. However, several other goals require further research. In addition, new topics not covered in the original blueprint, which are the result of emerging technologies, require exploration. To develop a new, updated blueprint, ARS and NIFA, along with scientists in the animal genomics field, convened a workshop titled “Genome to Phenome: A USDA Blueprint for Improving Animal Production” in November 2017, and these discussions were used to develop new goals for the next decade. Like the previous blueprint, these goals are grouped into the broad categories “Science to Practice,” “Discovery Science,” and “Infrastructure.” New goals for characterizing the microbiome, enhancing the use of gene editing and other biotechnologies, and preserving genetic diversity are included in the new blueprint, along with updated goals within many genome research topics described in the previous blueprint. The updated blueprint that follows describes the vision, current state of the art, the research needed to advance the field, expected deliverables, and partnerships needed for each animal genomics research topic. Accomplishment of the goals described in the blueprint will significantly increase the ability to meet the demands for animal products by an increasing world population within the next decade