
    Genome-wide association analysis reveals QTL and candidate mutations involved in white spotting in cattle

    Background: White spotting of the coat is a characteristic trait of various domestic species including cattle and other mammals. It is a hallmark of Holstein–Friesian cattle, and several previous studies have detected genetic loci with major effects for white spotting in animals with Holstein–Friesian ancestry. Here, our aim was to better understand the underlying genetic and molecular mechanisms of white spotting by conducting the largest mapping study for this trait in cattle to date. Results: Using imputed whole-genome sequence data, we conducted a genome-wide association analysis in 2973 mixed-breed cows and bulls. Highly significant quantitative trait loci (QTL) were found on chromosomes 6 and 22, highlighting the well-established coat color genes KIT and MITF as likely responsible for these effects. These results are in broad agreement with previous studies, although we also report a third significant QTL on chromosome 2 that appears to be novel. This signal maps immediately adjacent to the PAX3 gene, which encodes a known transcription factor that controls MITF expression and is the causal locus for white spotting in horses. More detailed examination of these loci revealed a candidate causal mutation in PAX3 (p.Thr424Met), and another candidate mutation (rs209784468) within a conserved element in intron 2 of MITF transcripts expressed in the skin. These analyses also revealed a mechanistic ambiguity at the chromosome 6 locus, where highly dispersed association signals suggested multiple or multiallelic QTL involving KIT and/or other genes in this region. Conclusions: Our findings extend those of previous studies that reported KIT as a likely causal gene for white spotting, and report novel associations between candidate causal mutations in both the MITF and PAX3 genes. These QTL have substantial effects and could be used to select animals with darker or, conversely, whiter coats, depending on the desired characteristics.

    Cross-species inference of long non-coding RNAs greatly expands the ruminant transcriptome

    Additional file 3. This file contains all supplementary tables relating to lncRNA identification via the conservation of synteny. Table S3. lncRNAs inferred in one species by the genomic alignment of a transcript assembled with the RNA-seq libraries from a related species. Table S12. Presence of intergenic lncRNAs both in sheep and cattle, in regions of conserved synteny. Table S13. Presence of intergenic lncRNAs both in sheep and goat, in regions of conserved synteny. Table S14. Presence of intergenic lncRNAs both in cattle and goat, in regions of conserved synteny. Table S15. Presence of intergenic lncRNAs both in sheep and humans, in regions of conserved synteny. Table S16. Presence of intergenic lncRNAs both in goat and humans, in regions of conserved synteny. Table S17. Presence of intergenic lncRNAs both in cattle and humans, in regions of conserved synteny. Table S18. High-confidence lncRNA pairs, those conserved across species both sequentially and positionally.

    Verifying explainability of a deep learning tissue classifier trained on RNA-seq data.

    For complex machine learning (ML) algorithms to gain widespread acceptance in decision making, we must be able to identify the features driving the predictions. Explainability models allow transparency of ML algorithms; however, their reliability within high-dimensional data is unclear. To test the reliability of the explainability model SHapley Additive exPlanations (SHAP), we developed a convolutional neural network to predict tissue classification from Genotype-Tissue Expression (GTEx) RNA-seq data representing 16,651 samples from 47 tissues. Our classifier achieved an average F1 score of 96.1% on held-out GTEx samples. Using SHAP values, we identified the 2423 most discriminatory genes, of which 98.6% were also identified by differential expression analysis across all tissues. The SHAP genes reflected expected biological processes involved in tissue differentiation and function. Moreover, SHAP genes clustered tissue types with superior performance when compared to all genes, genes detected by differential expression analysis, or random genes. We demonstrate the utility and reliability of SHAP to explain a deep learning model and highlight the strengths of applying ML to transcriptome data.
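
As a rough illustration of the attribution idea that SHAP builds on, the sketch below computes exact Shapley values for a toy three-feature model by enumerating every feature subset. It is not the authors' pipeline and does not use the `shap` library, which relies on approximations to scale to thousands of genes; the function names and the toy value function are assumptions made up for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by brute-force enumeration of feature subsets.

    value_fn takes a frozenset of feature indices and returns the model
    output when only those features are "present". Cost grows as 2**n,
    so this is only practical for a handful of features.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain feature i.
            weight = (factorial(k) * factorial(n_features - k - 1)
                      / factorial(n_features))
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_model(present):
    """Hypothetical value function: two additive terms plus one interaction."""
    out = 0.0
    if 0 in present:
        out += 2.0
    if 1 in present:
        out += 1.0
    if 0 in present and 2 in present:
        out += 4.0  # interaction credit is split evenly between features 0 and 2
    return out


phi = shapley_values(toy_model, 3)
```

The attributions sum to the difference between the output with all features present and the output with none, which is the property that makes ranking features (here, genes) by their SHAP values meaningful.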

    DNA methylation patterns identify subgroups of pancreatic neuroendocrine tumors with clinical association

    Here we report the DNA methylation profile of 84 sporadic pancreatic neuroendocrine tumors (PanNETs) with associated clinical and genomic information. We identified three subgroups of PanNETs, termed T1, T2 and T3, with distinct patterns of methylation. The T1 subgroup was enriched for functional tumors and ATRX, DAXX and MEN1 wild-type genotypes. The T2 subgroup contained tumors with mutations in ATRX, DAXX and MEN1 and recurrent patterns of chromosomal losses in half of the genome, with no association between regions with recurrent loss and methylation levels. T2 tumors were larger and had lower methylation in the MGMT gene body, which showed positive correlation with gene expression. The T3 subgroup harboured mutations in MEN1 with recurrent loss of chromosome 11, was enriched for grade G1 tumors and showed histological parameters associated with better prognosis. Our results suggest a role for methylation in both driving tumorigenesis and potentially stratifying prognosis in PanNETs.

    Identification of long non-coding RNA in the horse transcriptome

    Background: Efforts to resolve the transcribed sequences in the equine genome have focused on protein-coding RNA. The transcription of the intergenic regions, although detected via total RNA sequencing (RNA-seq), has yet to be characterized in the horse. The most recent equine transcriptome based on RNA-seq from several tissues was a prime opportunity to obtain a concurrent long non-coding RNA (lncRNA) database. Results: This lncRNA database has a breadth of eight tissues and a depth of over 20 million reads for select tissues, providing the deepest and most expansive equine lncRNA database. Utilizing the intergenic reads and three categories of novel genes from a previously published equine transcriptome pipeline, we better describe these groups by annotating the lncRNA candidates. These lncRNA candidates were filtered using an approach adapted from human lncRNA annotation, which removes transcripts based on size, expression, protein-coding capability and distance to the start or stop of annotated protein-coding transcripts. Conclusion: Our equine lncRNA database has 20,800 transcripts that demonstrate characteristics unique to lncRNA including low expression, low exon diversity and low levels of sequence conservation. These candidate lncRNA will serve as a baseline lncRNA annotation and begin to describe the RNA-seq reads assigned to the intergenic space in the horse.
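
The four filtering criteria named in the abstract (size, expression, protein-coding capability, distance to annotated protein-coding transcripts) can be sketched as a simple sequential filter. This is only an illustration: the `Transcript` fields, the direction of each comparison and all threshold values are assumptions for the example, not the values used in the study. The 200 nt minimum reflects the conventional length cutoff for lncRNA; the other defaults are arbitrary placeholders.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    """Hypothetical record for one assembled transcript."""
    name: str
    length_nt: int               # transcript length in nucleotides
    expression: float            # abundance estimate (units are an assumption)
    coding_probability: float    # score from some coding-potential tool, 0..1
    distance_to_coding_bp: int   # distance to nearest annotated coding start/stop


def is_lncrna_candidate(t, min_length_nt=200, min_expression=0.1,
                        max_coding_probability=0.5, min_distance_bp=1000):
    """Return True if the transcript passes all four illustrative filters."""
    return (
        t.length_nt >= min_length_nt                        # size
        and t.expression >= min_expression                  # expression
        and t.coding_probability < max_coding_probability   # coding capability
        and t.distance_to_coding_bp >= min_distance_bp      # distance to genes
    )


def filter_candidates(transcripts, **thresholds):
    """Keep only transcripts that pass every filter."""
    return [t for t in transcripts if is_lncrna_candidate(t, **thresholds)]


# Made-up example records, one failing each criterion and one passing all.
example = [
    Transcript("too_short", 150, 5.0, 0.1, 5000),
    Transcript("too_low", 900, 0.01, 0.1, 5000),
    Transcript("looks_coding", 900, 5.0, 0.9, 5000),
    Transcript("too_close", 900, 5.0, 0.1, 200),
    Transcript("kept", 900, 5.0, 0.1, 5000),
]
kept = filter_candidates(example)
```

Keeping each criterion as a separate, named threshold makes it easy to report how many transcripts each step removes.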