24 research outputs found

    Towards Deep Cellular Phenotyping in Placental Histology

    The placenta is a complex organ, playing multiple roles during fetal development. Very little is known about the association between placental morphological abnormalities and fetal physiology. In this work, we present an open-source, computationally tractable deep learning pipeline to analyse placenta histology at the level of the cell. By utilising two deep Convolutional Neural Network architectures and transfer learning, we can robustly localise and classify placental cells within five classes with an accuracy of 89%. Furthermore, we learn deep embeddings encoding phenotypic knowledge that are capable of both stratifying five distinct cell populations and learning intraclass phenotypic variance. We envisage that the automation of this pipeline to population-scale studies of placenta histology has the potential to improve our understanding of basic cellular placental biology and its variations, particularly its role in predicting adverse birth outcomes. Comment: Updated MRC funding material. Corrected typo that suggested ensembling and Inception accuracy were the same (updated to reflect the fact that the ensemble model is 1% better than previously reported).

    Mixture models for analysis of melting temperature data

    Background: In addition to their use in detecting undesired real-time PCR products, melting temperatures are useful for detecting variations in the desired target sequences. Methodological improvements in recent years allow the generation of high-resolution melting-temperature (Tm) data. However, there is currently no convention on how to statistically analyze such high-resolution Tm data. Results: Mixture model analysis was applied to Tm data. Models were selected based on Akaike's information criterion. Mixture model analysis correctly identified categories in Tm data obtained for known plasmid targets. Using simulated data, we investigated the number of observations required for model construction. The precision of the reported mixing proportions from data fitted to a preconstructed model was also evaluated. Conclusion: Mixture model analysis of Tm data allows the minimum number of different sequences in a set of amplicons and their relative frequencies to be determined. This approach allows Tm data to be analyzed, classified, and compared in an unbiased manner.
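The model-selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Gaussian components, a simple EM fit with a deterministic start, and simulated melting temperatures (two hypothetical amplicon populations near 80 °C and 85 °C). The names `fit_mixture`, `tm`, and `aic` are invented for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_mixture(data, k, iterations=200, min_var=1e-4):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    xs = sorted(data)
    n = len(xs)
    # Deterministic start: one component per contiguous chunk of sorted data.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, min_var)
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    def mixture_density(x):
        return sum(w * normal_pdf(x, m, v)
                   for w, m, v in zip(weights, means, variances))

    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update mixing proportions, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, min_var)  # floor avoids collapse

    log_lik = sum(math.log(mixture_density(x) or 1e-300) for x in xs)
    return weights, means, variances, log_lik


# Simulated Tm values (assumption: two sequence variants, 60 and 40 reads).
rng = random.Random(0)
tm = ([rng.gauss(80.0, 0.3) for _ in range(60)]
      + [rng.gauss(85.0, 0.3) for _ in range(40)])

fits = {k: fit_mixture(tm, k) for k in (1, 2, 3)}
# AIC = 2p - 2 ln L, with p = 3k - 1 free parameters
# (k means, k variances, k - 1 independent mixing proportions).
aic = {k: 2 * (3 * k - 1) - 2 * fits[k][3] for k in fits}
best_k = min(aic, key=aic.get)
print("AIC by number of components:", aic)
print("selected k:", best_k, "mixing proportions:", fits[best_k][0])
```

The mixing proportions of the selected model play the role of the relative frequencies mentioned in the conclusion. AIC can favour an extra component on some data sets, so the selected `k` is best read as a model-based estimate rather than a guarantee.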

    Characterising the genetic architecture of changes in adiposity during adulthood using electronic health records

    Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 24.5 million primary-care health records in over 740,000 individuals in the UK Biobank, Million Veteran Program USA, and Estonian Biobank, to discover and validate the genetic architecture of adiposity trajectories. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI by 14%. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (APOE missense variant). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology of quantitative traits in adulthood

    The RNA-Editing Enzyme ADAR1 Controls Innate Immune Responses to RNA

    The ADAR RNA-editing enzymes deaminate adenosine bases to inosines in cellular RNAs. Aberrant interferon expression occurs in patients in whom ADAR1 mutations cause Aicardi-Goutières syndrome (AGS) or dystonia arising from striatal neurodegeneration. Adar1 mutant mouse embryos show aberrant interferon induction and die by embryonic day E12.5. We demonstrate that Adar1 embryonic lethality is rescued to live birth in Adar1; Mavs double mutants in which the antiviral interferon induction response to cytoplasmic double-stranded RNA (dsRNA) is prevented. Aberrant immune responses in Adar1 mutant mouse embryo fibroblasts are dramatically reduced by restoring the expression of editing-active cytoplasmic ADARs. We propose that inosine in cellular RNA inhibits antiviral inflammatory and interferon responses by altering RLR interactions. Transfecting dsRNA oligonucleotides containing inosine-uracil base pairs into Adar1 mutant mouse embryo fibroblasts reduces the aberrant innate immune response. ADAR1 mutations causing AGS affect the activity of the interferon-inducible cytoplasmic isoform more severely than the nuclear isoform.

    Machine Learning based histology phenotyping to investigate the epidemiologic and genetic basis of adipocyte morphology and cardiometabolic traits

    Genetic studies have recently highlighted the importance of fat distribution, as well as overall adiposity, in the pathogenesis of obesity-associated diseases. Using a large study (n = 1,288) from 4 independent cohorts, we aimed to investigate the relationship between mean adipocyte area and obesity-related traits, and identify genetic factors associated with adipocyte cell size. To perform the first large-scale study of automatic adipocyte phenotyping using both histological and genetic data, we developed a deep learning-based method, the Adipocyte U-Net, to rapidly derive mean adipocyte area estimates from histology images. We validate our method using three state-of-the-art approaches: CellProfiler, Adiposoft and floating adipocytes fractions, all run blindly on two external cohorts. We observe high concordance between our method and the state-of-the-art approaches (Adipocyte U-Net vs. CellProfiler: R² visceral = 0.94, P < 2.2 × 10^-16; R² subcutaneous = 0.91, P < 2.2 × 10^-16), and faster run times (10,000 images: 6 min vs 3.5 h). We applied the Adipocyte U-Net to 4 cohorts with histology, genetic, and phenotypic data (total N = 820). After meta-analysis, we found that mean adipocyte area positively correlated with body mass index (BMI) (P_subq = 8.13 × 10^-69, β_subq = 0.45; P_visc = 2.5 × 10^-55, β_visc = 0.49; average R² across cohorts = 0.49) and that adipocytes in subcutaneous depots are larger than their visceral counterparts (P_meta = 9.8 × 10^-7). Lastly, we performed the largest GWAS and subsequent meta-analysis of mean adipocyte area and intra-individual adipocyte variation (N = 820). Despite having twice as many samples as any similar study, we found no genome-wide significant associations, suggesting that larger sample sizes and a homogenous collection of adipose tissue are likely needed to identify robust genetic associations. C.A.G. received a pump priming grant from Novo Nordisk to carry out this work. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    A systematic evaluation of expression of HERV-W elements; influence of genomic context, viral structure and orientation

    Background: One member of the W family of human endogenous retroviruses (HERV) appears to have been functionally adopted by the human host. Nevertheless, a highly diversified and regulated transcription from a range of HERV-W elements has been observed in human tissues and cells. Aberrant expression of members of this family has also been associated with human disease such as multiple sclerosis (MS) and schizophrenia. It is not known whether this broad expression of HERV-W elements represents transcriptional leakage or specific transcription initiated from the retroviral promoter in the long terminal repeat (LTR) region. Therefore, potential influences of genomic context, structure and orientation on the expression levels of individual HERV-W elements in normal human tissues were systematically investigated. Results: Whereas intronic HERV-W elements with a pseudogene structure exhibited a strong anti-sense orientation bias, intronic elements with a proviral structure and solo LTRs did not. Although a highly variable expression across tissues and elements was observed, systematic effects of context, structure and orientation were also observed. Elements located in intronic regions appeared to be expressed at higher levels than elements located in intergenic regions. Intronic elements with proviral structures were expressed at higher levels than those elements bearing hallmarks of processed pseudogenes or solo LTRs. Relative to their corresponding genes, intronic elements integrated on the sense strand appeared to be transcribed at higher levels than those integrated on the anti-sense strand. Moreover, the expression of proviral elements appeared to be independent from that of their corresponding genes. Conclusions: Intronic HERV-W provirus integrations on the sense strand appear to have elicited a weaker negative selection than pseudogene integrations of transcripts from such elements. Our current findings suggest that the previously observed diversified and tissue-specific expression of elements in the HERV-W family is the result of both directed transcription (involving both the LTR and internal sequence) and leaky transcription of HERV-W elements in normal human tissues.

    GestaltMatcher Database - A global reference for facial phenotypic variability in rare human diseases

    The most important factor that complicates the work of dysmorphologists is the significant phenotypic variability of the human face. Next-Generation Phenotyping (NGP) tools that assist clinicians with recognizing characteristic syndromic patterns are particularly challenged when confronted with patients from populations different from their training data. To that end, we systematically analyzed the impact of genetic ancestry on facial dysmorphism. For that purpose, we established the GestaltMatcher Database (GMDB) as a reference dataset for medical images of patients with rare genetic disorders from around the world. We collected 10,980 frontal facial images (more than a quarter previously unpublished) from 8,346 patients, representing 581 rare disorders. Although the predominant ancestry is still European (67%), data from underrepresented populations have been increased considerably via global collaborations (19% Asian and 7% African). This includes previously unpublished reports for more than 40% of the African patients. The NGP analysis on this diverse dataset revealed characteristic performance differences depending on the composition of training and test sets corresponding to genetic relatedness. For clinical use of NGP, incorporating non-European patients resulted in a profound enhancement of GestaltMatcher performance. The top-5 accuracy rate increased by 11.29%. Importantly, this improvement in delineating the correct disorder from a facial portrait was achieved without decreasing the performance on European patients. By design, GMDB complies with the FAIR principles by rendering the curated medical data findable, accessible, interoperable, and reusable. This means GMDB can also serve as data for training and benchmarking. In summary, our study on facial dysmorphism on a global sample revealed a considerable cross-ancestral phenotypic variability confounding NGP that should be counteracted by international efforts for increasing data diversity. GMDB will serve as a vital reference database for clinicians and a transparent training set for advancing NGP technology.
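For readers unfamiliar with the top-5 accuracy metric quoted in this abstract, the following is a minimal sketch of how such a figure is usually computed. It is a generic illustration, not GestaltMatcher code: the function name `top_k_accuracy`, the disorder labels "A" to "F", and the four example cases are all hypothetical.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=5):
    """Fraction of cases whose true label is among the first k ranked predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one ranked list per true label")
    hits = sum(truth in ranked[:k]
               for ranked, truth in zip(ranked_predictions, true_labels))
    return hits / len(true_labels)


# Hypothetical ranked disorder suggestions for four patients (best match first).
ranked = [
    ["A", "B", "C", "D", "E", "F"],
    ["B", "C", "A", "F", "E", "D"],
    ["F", "E", "D", "C", "B", "A"],
    ["C", "A", "B", "D", "F", "E"],
]
truth = ["A", "F", "A", "D"]  # hypothetical confirmed diagnoses

top1 = top_k_accuracy(ranked, truth, k=1)
top5 = top_k_accuracy(ranked, truth, k=5)
print(f"top-1: {top1:.2f}, top-5: {top5:.2f}")
```

Comparing this value on a fixed test set before and after adding training data from additional populations is one way a change in top-5 accuracy could be measured; the abstract does not specify the exact evaluation protocol.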

    Clinical and molecular consequences of disease-associated de novo mutations in SATB2

    Purpose: To characterize features associated with de novo mutations affecting SATB2 function in individuals ascertained on the basis of intellectual disability. Methods: Twenty previously unreported individuals with 19 different SATB2 mutations (11 loss-of-function and 8 missense variants) were studied. Fibroblasts were used to measure mutant protein production. Subcellular localization and mobility of wild-type and mutant SATB2 were assessed using fluorescently tagged protein. Results: Recurrent clinical features included neurodevelopmental impairment (19/19), absent/near absent speech (16/19), normal somatic growth (17/19), cleft palate (9/19), drooling (12/19), and dental anomalies (8/19). Six of eight missense variants clustered in the first CUT domain. Sibling recurrence due to gonadal mosaicism was seen in one family. A nonsense mutation in the last exon resulted in production of a truncated protein retaining all three DNA-binding domains. SATB2 nuclear mobility was mutation-dependent; p.Arg389Cys in CUT1 increased mobility and both p.Gly515Ser in CUT2 and p.Gln566Lys between CUT2 and HOX reduced mobility. The clinical features in individuals with missense variants were indistinguishable from those with loss of function. Conclusion: SATB2 haploinsufficiency is a common cause of syndromic intellectual disability. When mutant SATB2 protein is produced, the protein appears functionally inactive with a disrupted pattern of chromatin or matrix association

    Mouse genomic variation and its effect on phenotypes and gene regulation

    We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism