    Learning a local-variable model of aromatic and conjugated systems

    A collection of new approaches to building and training neural networks, collectively referred to as deep learning, is attracting attention in theoretical chemistry. Several groups aim to replace computationally expensive ab initio quantum mechanics calculations with learned estimators. This raises questions about the representability of complex quantum chemical systems with neural networks. Can local-variable models efficiently approximate nonlocal quantum chemical features? Here, we find that convolutional architectures, those that only aggregate information locally, cannot efficiently represent aromaticity and conjugation in large systems. They cannot represent long-range nonlocality known to be important in quantum chemistry. This study uses aromatic and conjugated systems computed from molecule graphs, though reproducing quantum simulations is the ultimate goal. This task, by definition, is both computable and known to be important to chemistry. The failure of convolutional architectures on this focused task calls into question their use in modeling quantum mechanics. To remedy this heretofore unrecognized deficiency, we introduce a new architecture that propagates information back and forth in waves of nonlinear computation. This architecture is still a local-variable model, and it is both computationally and representationally efficient, processing molecules in sublinear time with far fewer parameters than convolutional networks. Wave-like propagation models aromatic and conjugated systems with high accuracy, and even models the impact of small structural changes on large molecules. This new architecture demonstrates that some nonlocal features of quantum chemistry can be efficiently represented in local-variable models.
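
The locality argument in this abstract can be illustrated with a toy calculation. The sketch below is not the authors' architecture; it only tracks which atoms' information can reach which other atoms. It assumes a linear chain as a stand-in for a long conjugated system, and the function names (`chain_graph`, `reach_after_rounds`, `reach_after_sweep`) are hypothetical. Neighbor-only aggregation extends reach by one bond per round, whereas a single forward and backward sweep along the chain lets every atom see every other atom.

```python
from collections import deque


def chain_graph(n):
    """Adjacency list for a linear chain of n atoms (toy conjugated system)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def reach_after_rounds(adj, rounds):
    """Atoms whose information reaches each atom after `rounds` of
    neighbor-only aggregation (the convolution-like case)."""
    reach = {v: {v} for v in adj}
    for _ in range(rounds):
        # Every atom simultaneously absorbs what its direct neighbors know.
        reach = {v: reach[v].union(*(reach[u] for u in adj[v])) for v in adj}
    return reach


def reach_after_sweep(adj, root=0):
    """Atoms whose information reaches each atom after one forward and one
    backward pass in breadth-first order from `root` (the wave-like case)."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        v = queue.popleft()
        order.append(v)
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    pos = {v: i for i, v in enumerate(order)}
    reach = {v: {v} for v in adj}
    # Forward pass: absorb from neighbors that come earlier in the order.
    for v in order:
        for u in adj[v]:
            if pos[u] < pos[v]:
                reach[v] |= reach[u]
    # Backward pass: absorb from neighbors that come later in the order.
    for v in reversed(order):
        for u in adj[v]:
            if pos[u] > pos[v]:
                reach[v] |= reach[u]
    return reach


adj = chain_graph(10)
local = reach_after_rounds(adj, rounds=3)
wave = reach_after_sweep(adj, root=0)
print(sorted(local[0]))  # atoms within 3 bonds of atom 0
print(all(len(atoms) == 10 for atoms in wave.values()))
```

On this chain, three aggregation rounds let atom 0 see only atoms 0 through 3, so covering an n-atom chain would take on the order of n rounds, while the sweep covers it in two passes. Branched or cyclic graphs may need additional passes; the sketch makes no claim about them.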

    Deep learning quantification of percent steatosis in donor liver biopsy frozen sections

    BACKGROUND: Pathologist evaluation of donor liver biopsies provides information for accepting or discarding potential donor livers. Due to the urgent nature of the decision process, this is regularly performed using frozen sectioning at the time of biopsy. The percent steatosis in a donor liver biopsy correlates with transplant outcome; however, there is significant inter- and intra-observer variability in quantifying steatosis, compounded by frozen section artifact. We hypothesized that a deep learning model could identify and quantify steatosis in donor liver biopsies. METHODS: We developed a deep learning convolutional neural network that generates a steatosis probability map from an input whole slide image (WSI) of a hematoxylin and eosin-stained frozen section, and subsequently calculates the percent steatosis. Ninety-six WSI of frozen donor liver sections from our transplant pathology service were annotated for steatosis and used to train (n = 30 WSI) and test (n = 66 WSI) the deep learning model. FINDINGS: The model had good correlation and agreement with the annotation in both the training set (r = 0.88, intraclass correlation coefficient [ICC] = 0.88) and novel input test sets (r = 0.85 and ICC = 0.85). These measurements were superior to the estimates of the on-service pathologist at the time of initial evaluation (r = 0.52 and ICC = 0.52 for the training set, and r = 0.74 and ICC = 0.72 for the test set). INTERPRETATION: Use of this deep learning algorithm could be incorporated into routine pathology workflows for fast, accurate, and reproducible donor liver evaluation. FUNDING: Mid-America Transplant Society.
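
The abstract does not say how the probability map is turned into a percentage. One plausible reading, sketched below, is to threshold the map and divide the steatotic pixel count by the tissue pixel count. The threshold of 0.5, the separate tissue mask, and the function name `percent_steatosis` are all assumptions for illustration, not details from the paper.

```python
def percent_steatosis(prob_map, tissue_mask, threshold=0.5):
    """Percent of tissue pixels whose steatosis probability meets the threshold.

    prob_map:    2D list of per-pixel steatosis probabilities in [0, 1]
    tissue_mask: 2D list of truthy/falsy flags marking pixels that contain tissue
    threshold:   assumed cutoff for calling a pixel steatotic
    """
    tissue_pixels = 0
    steatotic_pixels = 0
    for prob_row, mask_row in zip(prob_map, tissue_mask):
        for prob, is_tissue in zip(prob_row, mask_row):
            if not is_tissue:
                continue  # background glass does not count toward the denominator
            tissue_pixels += 1
            if prob >= threshold:
                steatotic_pixels += 1
    if tissue_pixels == 0:
        raise ValueError("tissue mask contains no tissue pixels")
    return 100.0 * steatotic_pixels / tissue_pixels


# Made-up 2x2 example: three tissue pixels, two of them above the threshold.
toy_probs = [[0.9, 0.1], [0.6, 0.2]]
toy_mask = [[1, 1], [1, 0]]
print(round(percent_steatosis(toy_probs, toy_mask), 1))
```

Restricting the denominator to tissue pixels matters for whole slide images, where much of the field is empty background.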

    ProteomeScout: A repository and analysis resource for post-translational modifications and proteins

    ProteomeScout (https://proteomescout.wustl.edu) is a resource for the study of proteins and their post-translational modifications (PTMs) consisting of a database of PTMs, a repository for experimental data, an analysis suite for PTM experiments, and a tool for visualizing the relationships between complex protein annotations. The PTM database is a compendium of public PTM data, coupled with user-uploaded experimental data. ProteomeScout provides analysis tools for experimental datasets, including summary views and subset selection, which can identify relationships within subsets of data by testing for statistically significant enrichment of protein annotations. Protein annotations are incorporated in the ProteomeScout database from external resources and include terms such as Gene Ontology annotations, domains, secondary structure and non-synonymous polymorphisms. These annotations are available in the database download, in the analysis tools and in the protein viewer. The protein viewer allows for the simultaneous visualization of annotations in an interactive web graphic, which can be exported in Scalable Vector Graphics (SVG) format. Finally, quantitative data measurements associated with public experiments are also easily viewable within protein records, allowing researchers to see how PTMs change across different contexts. ProteomeScout should prove useful for protein researchers and should benefit the proteomics community by providing a stable repository for PTM experiments.
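
The abstract does not name the statistical test behind its annotation enrichment analysis. A common choice for this kind of subset-versus-background question is the one-sided hypergeometric test, and the sketch below assumes that choice purely for illustration. The function name `enrichment_p_value` and the counts in the example are hypothetical and are not taken from ProteomeScout.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, n_subset, n_subset_annotated):
    """One-sided hypergeometric p-value: the probability of drawing at least
    `n_subset_annotated` annotated proteins when `n_subset` proteins are
    sampled without replacement from a background of `n_background` proteins,
    `n_annotated` of which carry the annotation."""
    upper = min(n_annotated, n_subset)
    total = comb(n_background, n_subset)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_subset - k)
        for k in range(n_subset_annotated, upper + 1)
    )
    return tail / total


# Made-up counts: 1000 proteins in the dataset, 50 carry some annotation,
# and 12 of a selected subset of 100 carry it (5 would be expected by chance).
p = enrichment_p_value(1000, 50, 100, 12)
print(p < 0.05)
```

When many annotation terms are tested against the same subset, a multiple-testing correction would normally be applied to the resulting p-values.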

    Modeling Small-Molecule Reactivity Identifies Promiscuous Bioactive Compounds

    Scientists rely on high-throughput screening tools to identify promising small-molecule compounds for the development of biochemical probes and drugs. This study focuses on the identification of promiscuous bioactive compounds, which are compounds that appear active in many high-throughput screening experiments against diverse targets but are often false positives that may not be easily developed into successful probes. These compounds can exhibit bioactivity due to nonspecific, intractable mechanisms of action and/or by interference with specific assay technology readouts. Such “frequent hitters” are now commonly identified using substructure filters, including pan assay interference compounds (PAINS). Herein, we show that mechanistic modeling of small-molecule reactivity using deep learning can improve upon PAINS filters when modeling promiscuous bioactivity in PubChem assays. Without training on high-throughput screening data, a deep learning model of small-molecule reactivity achieves a sensitivity and specificity of 18.5% and 95.5%, respectively, in identifying promiscuous bioactive compounds. This performance is similar to PAINS filters, which achieve a sensitivity of 20.3% at the same specificity. Importantly, such reactivity modeling is complementary to PAINS filters. When PAINS filters and reactivity models are combined, the resulting model outperforms either method alone, achieving a sensitivity of 24% at the same specificity. However, as a probabilistic model, the sensitivity and specificity of the deep learning model can be tuned by adjusting the threshold. Moreover, for a subset of PAINS filters, this reactivity model can help discriminate between promiscuous and nonpromiscuous bioactive compounds even among compounds matching those filters. Critically, the reactivity model provides mechanistic hypotheses for assay interference by predicting the precise atoms involved in compound reactivity. Overall, our analysis suggests that deep learning approaches to modeling promiscuous compound bioactivity may provide a complementary approach to current methods for identifying promiscuous compounds.
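
The point that a probabilistic model's sensitivity and specificity depend on the chosen threshold can be made concrete with a small sketch. The scores and labels below are made up for illustration and have no connection to the paper's data; `sensitivity_specificity` is a hypothetical helper name.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when compounds scoring at or above
    `threshold` are called promiscuous. `labels` holds 1 for truly
    promiscuous compounds and 0 otherwise."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if label and predicted:
            tp += 1
        elif label:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: four promiscuous and four non-promiscuous compounds.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

for cutoff in (0.25, 0.5, 0.75):
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff)
    print(f"threshold={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the threshold trades sensitivity for specificity, which is why the abstract compares methods at a matched specificity rather than at arbitrary operating points.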