
    Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs

    Cryo-electron microscopy (cryoEM) is an increasingly popular method for protein structure determination. However, identifying a sufficient number of particles for analysis (often >100,000) can take months of manual effort. Current computational approaches are limited by high false positive rates and require significant ad-hoc post-processing, especially for unusually shaped particles. To address this shortcoming, we develop Topaz, an efficient and accurate particle picking pipeline using neural networks trained with few labeled particles by newly leveraging the remaining unlabeled particles through the framework of positive-unlabeled (PU) learning. Remarkably, despite using minimal labeled particles, Topaz allows us to improve reconstruction resolution by up to 0.15 Å over published particles on three public cryoEM datasets without any post-processing. Furthermore, we show that our novel generalized-expectation criteria approach to PU learning outperforms existing general PU learning approaches when applied to particle detection, especially for challenging datasets of non-globular proteins. We expect Topaz to be an essential component of cryoEM analysis. Comment: 43 pages, 5 main figures, 6 supplemental figures
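
    The idea of training on a few labeled particles plus many unlabeled regions can be illustrated with a small sketch. This is not the Topaz objective: the function name `pu_loss_sketch`, the class prior `pi`, the `weight` parameter, and the direction of the KL penalty are all assumptions made here for illustration. It combines a log loss on labeled positives with a penalty that pulls the mean predicted probability over unlabeled regions toward an assumed particle fraction.

```python
import math

def pu_loss_sketch(pos_probs, unlabeled_probs, pi, weight=1.0):
    """Hypothetical PU objective: positive log loss plus a KL penalty on the unlabeled mean."""
    # Labeled positives should receive predicted probabilities near 1.
    pos_term = -sum(math.log(p) for p in pos_probs) / len(pos_probs)
    # Unlabeled regions mix particles and background, so only their
    # average predicted probability is constrained, toward the prior pi.
    q = sum(unlabeled_probs) / len(unlabeled_probs)
    # KL divergence between Bernoulli(pi) and Bernoulli(q); needs 0 < pi, q < 1.
    kl = pi * math.log(pi / q) + (1 - pi) * math.log((1 - pi) / (1 - q))
    return pos_term + weight * kl

# Made-up classifier outputs for three labeled and four unlabeled regions.
labeled = [0.9, 0.8, 0.95]
unlabeled = [0.7, 0.1, 0.05, 0.15]
print(pu_loss_sketch(labeled, unlabeled, pi=0.25))
```

    In this toy input the unlabeled mean already matches the assumed prior of 0.25, so the penalty is close to zero and the loss is dominated by the labeled-positive term.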

    Nucleosomes indicate the in vitro radiosensitivity of irradiated bronchoepithelial and lung cancer cells

    Nucleosomes, which are typical cell death products, are elevated in the serum of cancer patients and are known to rapidly increase during radiotherapy. As both normal and malignant cells are damaged by irradiation, we investigated to what extent both cell types contribute to the release of nucleosomes. We cultured monolayers of normal bronchoepithelial lung cells (BEAS-2B, n = 18) and epithelial lung cancer cells (EPLC, n = 18), exposed them to various radiation doses (0, 10 and 30 Gy) and observed them for 5 days. Culture medium was changed every 24 h. Subsequently, nucleosomes were determined in the supernatant by the Cell Death Detection-ELISA(plus) (Roche Diagnostics). Additionally, the cell number was estimated after harvesting the cells in a second preparation. After 5 days, the cell number of BEAS-2B cultures in the irradiated groups (10 Gy: median 0.03 x 10^6 cells/culture, range 0.02-0.08 x 10^6 cells/culture; 30 Gy: median 0.08 x 10^6 cells/culture, range 0.02-0.14 x 10^6 cells/culture) decreased significantly (10 Gy: p = 0.005; 30 Gy: p = 0.005; Wilcoxon test) compared to the non-irradiated control group (median 4.81 x 10^6 cells/culture, range 1.50-9.54 x 10^6 cells/culture). Consistently, nucleosomes remained low in the supernatant of non-irradiated BEAS-2B. However, at 10 Gy, BEAS-2B showed a considerably increasing release of nucleosomes, with a maximum at 72 h (before irradiation: 0.24 x 10^3 arbitrary units, AU, range 0.13-4.09 x 10^3 AU, and after 72 h: 1.94 x 10^3 AU, range 0.11-5.70 x 10^3 AU). At 30 Gy, the release was even stronger, reaching the maximum earlier (at 48 h, 11.09 x 10^3 AU, range 6.89-18.28 x 10^3 AU). In non-irradiated EPLC, nucleosomes increased slightly but steadily. At 10 Gy, we observed a considerably higher release of nucleosomes in EPLC, with a maximum at 72 h (before irradiation: 2.79 x 10^3 AU, range 2.42-3.80 x 10^3 AU, and after 72 h: 7.16 x 10^3 AU, range 4.30-16.20 x 10^3 AU), which was more than 3.5 times higher than in BEAS-2B. At 30 Gy, the maximum (6.22 x 10^3 AU, range 5.13-9.71 x 10^3 AU) was observed already after 24 h. These results indicate that normal bronchoepithelial and malignant lung cancer cells contribute to the release of nucleosomes during irradiation in a dose- and time-dependent manner with cancer cells having a stronger impact at low doses. Copyright (C) 2004 S. Karger AG, Basel
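
    The "more than 3.5 times higher" comparison can be checked from the 72 h values reported for the two 10 Gy groups. A minimal sketch follows; the variable and function names are chosen here for illustration, and the values are taken directly from the abstract in units of 10^3 AU.

```python
# Nucleosome levels at 72 h after 10 Gy, in units of 10^3 AU (from the abstract).
eplc_72h = 7.16     # epithelial lung cancer cells (EPLC)
beas2b_72h = 1.94   # normal bronchoepithelial cells (BEAS-2B)

def fold_difference(a, b):
    """Return how many times larger a is than b."""
    return a / b

ratio = fold_difference(eplc_72h, beas2b_72h)
print(f"EPLC / BEAS-2B at 72 h: {ratio:.2f}x")
```

    The ratio comes out at roughly 3.7, consistent with the stated "more than 3.5 times".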

    PoET: A generative model of protein families as sequences-of-sequences

    Generative protein language models are a natural way to design new proteins with desired functions. However, current models are either difficult to direct to produce a protein from a specific family of interest, or must be trained on a large multiple sequence alignment (MSA) from the specific family of interest, making them unable to benefit from transfer learning across families. To address this, we propose Protein Evolutionary Transformer (PoET), an autoregressive generative model of whole protein families that learns to generate sets of related proteins as sequences-of-sequences across tens of millions of natural protein sequence clusters. PoET can be used as a retrieval-augmented language model to generate and score arbitrary modifications conditioned on any protein family of interest, and can extrapolate from short context lengths to generalize well even for small families. This is enabled by a unique Transformer layer; we model tokens sequentially within sequences while attending between sequences order invariantly, allowing PoET to scale to context lengths beyond those used during training. In extensive experiments on deep mutational scanning datasets, we show that PoET outperforms existing protein language models and evolutionary sequence models for variant function prediction across proteins of all MSA depths. We also demonstrate PoET's ability to controllably generate new protein sequences

    ERCC1 expression and RAD51B activity correlate with cell cycle response to platinum drug treatment not DNA repair

    Background: The H69CIS200 and H69OX400 cell lines are novel models of low-level platinum-drug resistance. Resistance was not associated with increased cellular glutathione or decreased accumulation of platinum; rather, the resistant cell lines have a cell cycle alteration allowing them to rapidly proliferate post drug treatment. Results: A decrease in ERCC1 protein expression and an increase in RAD51B foci activity were observed in association with the platinum-induced cell cycle arrest, but these changes did not correlate with resistance or altered DNA repair capacity. The H69 cells and resistant cell lines have a p53 mutation and consequently decrease expression of p21 in response to platinum drug treatment, promoting progression of the cell cycle instead of increasing p21 to maintain the arrest. Conclusion: Decreased ERCC1 protein and increased RAD51B foci may in part be mediating the maintenance of the cell cycle arrest in the sensitive cells. Resistance in the H69CIS200 and H69OX400 cells may therefore involve the regulation of ERCC1 and RAD51B independent of their roles in DNA repair. The novel mechanism of platinum resistance in the H69CIS200 and H69OX400 cells demonstrates the multifactorial nature of platinum resistance, which can occur independently of alterations in DNA repair capacity and changes in ERCC1

    Modulation of the Ribonucleotide Reductase-Antimetabolite Drug Interaction in Cancer Cell Lines

    RRM1 is a determinant of gemcitabine efficacy in cancer patients. However, the precision of predicting tumor response based on RRM1 levels is not optimal. We used gene-specific overexpression and RNA interference to assess RRM1's impact on different classes of cytotoxic agents, on drug-drug interactions, and the modulating impact of other molecular and cellular parameters. RRM1 was the dominant determinant of gemcitabine efficacy in various cancer cell lines. RRM1 also impacted the efficacy of other antimetabolite agents. It did not disrupt the interaction of two cytotoxic agents when combined. Cell lines with truncation, deletion, and null status of p53 were resistant to gemcitabine without apparent relationship to RRM1 levels. Pemetrexed and carboplatin sensitivity did not appear to be related to p53 mutation status. The impact of p53 mutations in patients treated with gemcitabine should be studied in prospective clinical trials to develop a model with improved precision of predicting drug efficacy

    RRM1 (ribonucleotide reductase M1)

    Review on RRM1 (ribonucleotide reductase M1), with data on DNA, on the protein encoded, and where the gene is implicated

    Reconstructing continuous distributions of 3D protein structure from cryo-EM images

    Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structure of proteins and other macromolecular complexes at near-atomic resolution. In single particle cryo-EM, the central problem is to reconstruct the three-dimensional structure of a macromolecule from 10^{4-7} noisy and randomly oriented two-dimensional projections. However, the imaged protein complexes may exhibit structural variability, which complicates reconstruction and is typically addressed using discrete clustering approaches that fail to capture the full range of protein dynamics. Here, we introduce a novel method for cryo-EM reconstruction that extends naturally to modeling continuous generative factors of structural heterogeneity. This method encodes structures in Fourier space using coordinate-based deep neural networks, and trains these networks from unlabeled 2D cryo-EM images by combining exact inference over image orientation with variational inference for structural heterogeneity. We demonstrate that the proposed method, termed cryoDRGN, can perform ab initio reconstruction of 3D protein complexes from simulated and real 2D cryo-EM image data. To our knowledge, cryoDRGN is the first neural network-based approach for cryo-EM reconstruction and the first end-to-end method for directly reconstructing continuous ensembles of protein structures from cryo-EM images