21 research outputs found

    Towards Deep Cellular Phenotyping in Placental Histology

    The placenta is a complex organ, playing multiple roles during fetal development. Very little is known about the association between placental morphological abnormalities and fetal physiology. In this work, we present an open-source, computationally tractable deep learning pipeline to analyse placenta histology at the level of the cell. By utilising two deep Convolutional Neural Network architectures and transfer learning, we can robustly localise and classify placental cells within five classes with an accuracy of 89%. Furthermore, we learn deep embeddings encoding phenotypic knowledge that are capable of both stratifying five distinct cell populations and capturing intraclass phenotypic variance. We envisage that the automation of this pipeline for population-scale studies of placenta histology has the potential to improve our understanding of basic cellular placental biology and its variations, particularly its role in predicting adverse birth outcomes. Comment: Updated MRC funding material. Corrected a typo that suggested the ensemble and Inception accuracies were the same (updated to reflect the fact that the ensemble model is 1% better than previously reported).

    Mixture models for analysis of melting temperature data

    Background: In addition to their use in detecting undesired real-time PCR products, melting temperatures are useful for detecting variations in the desired target sequences. Methodological improvements in recent years allow the generation of high-resolution melting-temperature (Tm) data. However, there is currently no convention on how to statistically analyze such high-resolution Tm data. Results: Mixture model analysis was applied to Tm data. Models were selected based on Akaike's information criterion. Mixture model analysis correctly identified categories in Tm data obtained for known plasmid targets. Using simulated data, we investigated the number of observations required for model construction. The precision of the reported mixing proportions from data fitted to a preconstructed model was also evaluated. Conclusion: Mixture model analysis of Tm data allows the minimum number of different sequences in a set of amplicons and their relative frequencies to be determined. This approach allows Tm data to be analyzed, classified, and compared in an unbiased manner.
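
    The model-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: it assumes one-dimensional Gaussian components fitted by a plain EM loop, and the function names (`fit_mixture`, `select_by_aic`), the standard-deviation floor, and the synthetic Tm values are all hypothetical choices made for illustration. For each candidate number of components k it computes AIC = 2p - 2 ln L with p = 3k - 1 free parameters (k means, k standard deviations, k - 1 mixing proportions) and keeps the k with the lowest score.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(x, weights, means, sds):
    """Density of the whole mixture at x."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def fit_mixture(data, k, iters=200, sd_floor=0.01):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, sds, log_likelihood). The start is deterministic:
    means sit at evenly spaced quantiles of the sorted data.
    """
    xs = sorted(data)
    n = len(xs)
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    sds = [max((xs[-1] - xs[0]) / (2.0 * k), sd_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            parts = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(parts) or 1e-300  # guard against numerical underflow
            resp.append([p / total for p in parts])
        # M-step: re-estimate mixing proportions, means and standard deviations.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sds[j] = max(math.sqrt(var), sd_floor)  # floor avoids collapse onto one point
    loglik = sum(math.log(mixture_density(x, weights, means, sds) or 1e-300) for x in xs)
    return weights, means, sds, loglik


def aic(loglik, k):
    """Akaike's information criterion for a k-component 1-D Gaussian mixture."""
    return 2 * (3 * k - 1) - 2.0 * loglik


def select_by_aic(data, max_k=4):
    """Fit mixtures with 1..max_k components; return (best_k, scores, best_fit)."""
    fits = {k: fit_mixture(data, k) for k in range(1, max_k + 1)}
    scores = {k: aic(fits[k][3], k) for k in fits}
    best_k = min(scores, key=scores.get)
    return best_k, scores, fits[best_k]


# Synthetic Tm readings (degrees C) from two hypothetical amplicon populations.
rng = random.Random(0)
tm_values = [rng.gauss(82.0, 0.15) for _ in range(60)] + [rng.gauss(85.0, 0.15) for _ in range(40)]
best_k, scores, best_fit = select_by_aic(tm_values)
print("components chosen by AIC:", best_k)
print("mixing proportions:", [round(w, 2) for w in best_fit[0]])
```

    The fitted mixing proportions play the role of the relative frequencies mentioned in the abstract. In practice an established library implementation would usually be preferable to a hand-written EM loop.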

    The RNA-Editing Enzyme ADAR1 Controls Innate Immune Responses to RNA

    The ADAR RNA-editing enzymes deaminate adenosine bases to inosines in cellular RNAs. Aberrant interferon expression occurs in patients in whom ADAR1 mutations cause Aicardi-Goutières syndrome (AGS) or dystonia arising from striatal neurodegeneration. Adar1 mutant mouse embryos show aberrant interferon induction and die by embryonic day E12.5. We demonstrate that Adar1 embryonic lethality is rescued to live birth in Adar1; Mavs double mutants in which the antiviral interferon induction response to cytoplasmic double-stranded RNA (dsRNA) is prevented. Aberrant immune responses in Adar1 mutant mouse embryo fibroblasts are dramatically reduced by restoring the expression of editing-active cytoplasmic ADARs. We propose that inosine in cellular RNA inhibits antiviral inflammatory and interferon responses by altering RLR interactions. Transfecting dsRNA oligonucleotides containing inosine-uracil base pairs into Adar1 mutant mouse embryo fibroblasts reduces the aberrant innate immune response. ADAR1 mutations causing AGS affect the activity of the interferon-inducible cytoplasmic isoform more severely than the nuclear isoform.

    Machine Learning based histology phenotyping to investigate the epidemiologic and genetic basis of adipocyte morphology and cardiometabolic traits

    Genetic studies have recently highlighted the importance of fat distribution, as well as overall adiposity, in the pathogenesis of obesity-associated diseases. Using a large study (n = 1,288) from 4 independent cohorts, we aimed to investigate the relationship between mean adipocyte area and obesity-related traits, and identify genetic factors associated with adipocyte cell size. To perform the first large-scale study of automatic adipocyte phenotyping using both histological and genetic data, we developed a deep learning-based method, the Adipocyte U-Net, to rapidly derive mean adipocyte area estimates from histology images. We validate our method using three state-of-the-art approaches: CellProfiler, Adiposoft and floating adipocyte fractions, all run blindly on two external cohorts. We observe high concordance between our method and the state-of-the-art approaches (Adipocyte U-Net vs. CellProfiler: R² visceral = 0.94, P < 2.2 × 10⁻¹⁶; R² subcutaneous = 0.91, P < 2.2 × 10⁻¹⁶), and faster run times (10,000 images: 6 min vs 3.5 h). We applied the Adipocyte U-Net to 4 cohorts with histology, genetic, and phenotypic data (total N = 820). After meta-analysis, we found that mean adipocyte area positively correlated with body mass index (BMI) (P subq = 8.13 × 10⁻⁶⁹, β subq = 0.45; P visc = 2.5 × 10⁻⁵⁵, β visc = 0.49; average R² across cohorts = 0.49) and that adipocytes in subcutaneous depots are larger than their visceral counterparts (P meta = 9.8 × 10⁻⁷). Lastly, we performed the largest GWAS and subsequent meta-analysis of mean adipocyte area and intra-individual adipocyte variation (N = 820). Despite having twice as many samples as any similar study, we found no genome-wide significant associations, suggesting that larger sample sizes and a homogeneous collection of adipose tissue are likely needed to identify robust genetic associations. This article is freely available via Open Access. C.A.G. received a pump-priming grant from Novo Nordisk to carry out this work. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    A systematic evaluation of expression of HERV-W elements; influence of genomic context, viral structure and orientation

    Background: One member of the W family of human endogenous retroviruses (HERV) appears to have been functionally adopted by the human host. Nevertheless, a highly diversified and regulated transcription from a range of HERV-W elements has been observed in human tissues and cells. Aberrant expression of members of this family has also been associated with human diseases such as multiple sclerosis (MS) and schizophrenia. It is not known whether this broad expression of HERV-W elements represents transcriptional leakage or specific transcription initiated from the retroviral promoter in the long terminal repeat (LTR) region. Therefore, potential influences of genomic context, structure and orientation on the expression levels of individual HERV-W elements in normal human tissues were systematically investigated. Results: Whereas intronic HERV-W elements with a pseudogene structure exhibited a strong anti-sense orientation bias, intronic elements with a proviral structure and solo LTRs did not. Although a highly variable expression across tissues and elements was observed, systematic effects of context, structure and orientation were also observed. Elements located in intronic regions appeared to be expressed at higher levels than elements located in intergenic regions. Intronic elements with proviral structures were expressed at higher levels than those elements bearing hallmarks of processed pseudogenes or solo LTRs. Relative to their corresponding genes, intronic elements integrated on the sense strand appeared to be transcribed at higher levels than those integrated on the anti-sense strand. Moreover, the expression of proviral elements appeared to be independent from that of their corresponding genes. Conclusions: Intronic HERV-W provirus integrations on the sense strand appear to have elicited a weaker negative selection than pseudogene integrations of transcripts from such elements. Our current findings suggest that the previously observed diversified and tissue-specific expression of elements in the HERV-W family is the result of both directed transcription (involving both the LTR and internal sequence) and leaky transcription of HERV-W elements in normal human tissues.

    Clinical and molecular consequences of disease-associated de novo mutations in SATB2

    Purpose: To characterize features associated with de novo mutations affecting SATB2 function in individuals ascertained on the basis of intellectual disability. Methods: Twenty previously unreported individuals with 19 different SATB2 mutations (11 loss-of-function and 8 missense variants) were studied. Fibroblasts were used to measure mutant protein production. Subcellular localization and mobility of wild-type and mutant SATB2 were assessed using fluorescently tagged protein. Results: Recurrent clinical features included neurodevelopmental impairment (19/19), absent/near absent speech (16/19), normal somatic growth (17/19), cleft palate (9/19), drooling (12/19), and dental anomalies (8/19). Six of eight missense variants clustered in the first CUT domain. Sibling recurrence due to gonadal mosaicism was seen in one family. A nonsense mutation in the last exon resulted in production of a truncated protein retaining all three DNA-binding domains. SATB2 nuclear mobility was mutation-dependent; p.Arg389Cys in CUT1 increased mobility and both p.Gly515Ser in CUT2 and p.Gln566Lys between CUT2 and HOX reduced mobility. The clinical features in individuals with missense variants were indistinguishable from those with loss of function. Conclusion: SATB2 haploinsufficiency is a common cause of syndromic intellectual disability. When mutant SATB2 protein is produced, the protein appears functionally inactive with a disrupted pattern of chromatin or matrix association

    Mouse genomic variation and its effect on phenotypes and gene regulation

    We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism

    Observing the darkest matter of the genome : Expression of human endogenous retrovirus W elements

    The human genome is composed of coding genes and vast stretches of sequences largely considered "junk". Researchers are, however, uncovering widespread and extensive transcription of not only the coding, but also the non-coding sequences in the genomes of many species. Transcripts that do not code for any protein are thought to carry out their potential functions by directly interacting with other sequences and proteins through their base-pairing capabilities or secondary structures. Since little is known about non-coding DNA and its RNA transcripts, they have been called the dark matter of the genome. Half the human genome is composed of repetitive sequences, and about eight percent consists of ancient remnants of retroviral infections called human endogenous retroviruses (HERV). These repetitive elements are usually excluded from most studies of expressed sequences as they are methodologically problematic to identify unambiguously. The dogma has been that degenerated viral sequences are junk and are for the most part transcriptionally silent. This is being revised because transcription of these elements has been observed in human tissues, along with expression variations associated with human diseases. These repetitive regions could be called the darkest matter of the genome. This thesis includes observations of expression patterns of HERV elements, and of increased expression and alterations associated with exogenous virus infections. An evaluation of the currently available sequence-specific assays and a novel melting temperature (Tm) analysis method for studying expression patterns of highly repetitive and homologous sequences are presented herein. The Tm analysis method was further developed with: i) the use of a temperature probe to normalize for temperature deviations in the thermocycler instrument, ii) a curve fit algorithm to interpolate exact temperatures from multiple data points and iii) a new approach to analyzing obtained Tm with mixture models for an impartial and objective statistical analysis. Using these methods, we studied the expression patterns of individual elements within one HERV family in human tissues. We found significant differences in HERV expression patterns between human tissues and between individuals, to an extent similar to that which would be expected for coding transcripts. The observations and methods developed in the course of this thesis may help cast some light on the expression, regulation and functions of these RNAs containing highly repetitive sequences.
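
    Step ii) above, interpolating a temperature from several data points, can be sketched as follows. The abstract does not say which curve fit the thesis uses, so this example swaps in a simple stand-in: a parabola through the highest sample of a melting peak and its two neighbours, with the vertex taken as the interpolated Tm. The function name `interpolate_peak`, the evenly spaced temperature grid, and the sample values are assumptions made for illustration only.

```python
def interpolate_peak(temps, signal):
    """Estimate the temperature at which `signal` peaks.

    Assumes `temps` is evenly spaced. Fits a parabola through the highest
    sample and its two neighbours and returns the vertex position, which can
    lie between two measured temperatures.
    """
    i = max(range(len(signal)), key=signal.__getitem__)
    if i == 0 or i == len(signal) - 1:
        return temps[i]  # peak on the edge: no neighbours on both sides
    y0, y1, y2 = signal[i - 1], signal[i], signal[i + 1]
    curvature = y0 - 2.0 * y1 + y2
    if curvature == 0.0:
        return temps[i]  # flat top: fall back to the sampled temperature
    step = temps[i + 1] - temps[i]
    # Vertex of the parabola through the three points, relative to temps[i].
    return temps[i] + 0.5 * step * (y0 - y2) / curvature


# Hypothetical melting peak sampled every 0.5 degrees C, true maximum at 82.37.
temps = [80.0 + 0.5 * i for i in range(11)]
signal = [5.0 - (t - 82.37) ** 2 for t in temps]
tm_estimate = interpolate_peak(temps, signal)
print(round(tm_estimate, 2))
```

    Because the synthetic peak here is itself a parabola, the vertex is recovered almost exactly. Real melting curves are only approximately parabolic near the maximum, so the estimate would carry some error.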

    “I don’t think people are ready to trust these algorithms at face value”: trust and the use of machine learning algorithms in the diagnosis of rare disease

    Background: As the use of AI becomes more pervasive, and computerised systems are used in clinical decision-making, the role of trust in, and the trustworthiness of, AI tools will need to be addressed. Using the case of computational phenotyping to support the diagnosis of rare disease in dysmorphology, this paper explores under what conditions we could place trust in medical AI tools that employ machine learning. Methods: Semi-structured qualitative interviews (n = 20) with stakeholders (clinical geneticists, data scientists, bioinformaticians, industry and patient support group spokespersons) who design and/or work with computational phenotyping (CP) systems. The method of constant comparison was used to analyse the interview data. Results: Interviewees emphasized the importance of establishing trust in the use of CP technology in identifying rare diseases. Trust was formulated in two interrelated ways in these data. First, interviewees talked about the importance of using CP tools within the context of a trust relationship, arguing that patients will need to trust clinicians who use AI tools and that clinicians will need to trust AI developers, if they are to adopt this technology. Second, they described a need to establish trust in the technology itself, or in the knowledge it provides, that is, epistemic trust. Interviewees suggested CP tools used for the diagnosis of rare diseases might be perceived as more trustworthy if the user is able to vouch for the technology's reliability and accuracy and the person using or developing them is trusted. Conclusion: This study suggests we need to take deliberate and meticulous steps to design reliable or confidence-worthy AI systems for use in healthcare. In addition, we need to devise reliable or confidence-worthy processes that would give rise to reliable systems; these could take the form of RCTs and/or systems of accountability, transparency and responsibility that would signify the epistemic trustworthiness of these tools.