
    Towards automatic generation of Piping and Instrumentation Diagrams (P&IDs) with Artificial Intelligence

    Developing Piping and Instrumentation Diagrams (P&IDs) is a crucial step during the development of chemical processes. Currently, this is a tedious, manual, and time-consuming task. We propose a novel, completely data-driven method for the prediction of control structures. Our methodology is inspired by end-to-end transformer-based human language translation models. We cast the control structure prediction as a translation task where Process Flow Diagrams (PFDs) are translated to P&IDs. To use established transformer-based language translation models, we represent the P&IDs and PFDs as strings using our recently proposed SFILES 2.0 notation. Model training is performed in a transfer learning approach. First, we pre-train our model using generated P&IDs to learn the grammatical structure of the process diagrams. Thereafter, the model is fine-tuned on real P&IDs, leveraging transfer learning. The model achieved a top-5 accuracy of 74.8% on 10,000 generated P&IDs and 89.2% on 100,000 generated P&IDs. These promising results show great potential for AI-assisted process engineering. The tests on a dataset of 312 real P&IDs indicate the need for a larger P&ID dataset for industry applications.

    Protein Folding in Archaea

    Chaperonins are a specific class of barrel-shaped chaperones, present in almost all organisms. Newly synthesized proteins encapsulated by the chaperonin can attain their native structure unimpaired by aggregation during repeated cycles of ATP-dependent binding and release. Chaperonins are generally divided into two groups. Group I chaperonins, such as the barrel-shaped GroEL oligomer, are found predominantly in bacteria and cooperate with cofactors of the Hsp10 family (i.e. GroES). The Group II chaperonins, on the other hand, do not require an Hsp10 cofactor and are found in the eukaryotic cytosol and in archaea. The function of GroEL is understood in great detail and the substrate interaction proteome has been recently identified. In contrast, our knowledge about the natural substrates of Group II chaperonins is deficient and, as a consequence, mechanistic studies on Group II chaperonins have been limited to using the eukaryotic model substrates actin and tubulin as well as heterologous model substrates. In the present study, the complete substrate spectrum of a Group II chaperonin, the thermosome (Ths) of the mesophilic archaeon Methanosarcina mazei (M. mazei), was analysed for the first time. In addition, the unique coexistence of both the Group I and the Group II chaperonins in M. mazei, which was confirmed in the initial part of the study, provided the opportunity to obtain new insights into how the substrate selection differs between the two chaperonin groups. For these purposes, the chaperonin substrates were isolated by immunoprecipitation of the chaperonin-substrate complexes and identified by liquid chromatography coupled mass spectrometry (LC-MS) using three different approaches: LC-MS after separation of the proteins (i) by classical 2D-PAGE, (ii) by difference gel electrophoresis (Ettan DIGE) and (iii) by 1D-PAGE. Analysis of substrates of both the thermosome (MmThs) and GroEL/GroES (MmGroEL, MmGroES) of M. mazei revealed that each chaperonin handles a defined set of substrates, and both chaperonins contribute to the folding of ~17% of the proteins in the archaeal cytosol. Bioinformatic analysis revealed that the chaperonin specificity is governed by a combination of various physical properties (hydrophobicity, net charge and size), structural features (i.e. the domain fold), and less concrete characteristics like the evolutionary status and, in this context, the phylogenetic origin of the substrate.

    Data augmentation for machine learning of chemical process flowsheets

    Artificial intelligence has great potential for accelerating the design and engineering of chemical processes. Recently, we have shown that transformer-based language models can learn to auto-complete chemical process flowsheets using the SFILES 2.0 string notation. Also, we showed that language translation models can be used to translate Process Flow Diagrams (PFDs) into Piping and Instrumentation Diagrams (P&IDs). However, artificial intelligence methods require big data and flowsheet data is currently limited. To mitigate this challenge of limited data, we propose a new data augmentation methodology for flowsheet data that is represented in the SFILES 2.0 notation. We show that the proposed data augmentation improves the performance of artificial intelligence-based process design models. In our case study, flowsheet data augmentation improved the prediction uncertainty of the flowsheet autocompletion model by 14.7%. In the future, our flowsheet data augmentation can be used for other machine learning algorithms on chemical process flowsheets that are based on SFILES notation. Comment: Submitted to Proceedings of the 33rd European Symposium on Computer Aided Process Engineering (ESCAPE33), June 18-21, 2023, Athens, Greece.

    Lytic Water Dynamics Reveal Evolutionarily Conserved Mechanisms of ATP Hydrolysis by TIP49 AAA+ ATPases

    Eukaryotic TIP49a (Pontin) and TIP49b (Reptin) AAA+ ATPases play essential roles in key cellular processes. How their weak ATPase activity contributes to their important functions remains largely unknown and difficult to analyze because of the divergent properties of TIP49a and TIP49b proteins and of their homo- and hetero-oligomeric assemblies. To circumvent these complexities, we have analyzed the single ancient TIP49 ortholog found in the archaeon Methanopyrus kandleri (mkTIP49). All-atom homology modeling and molecular dynamics simulations validated by biochemical assays reveal highly conserved organizational principles and identify key residues for ATP hydrolysis. An unanticipated crosstalk between Walker B and Sensor I motifs impacts the dynamics of water molecules and highlights a critical role of trans-acting aspartates in the lytic water activation step that is essential for the associative mechanism of ATP hydrolysis.

    Spt4/5 stimulates transcription elongation through the RNA polymerase clamp coiled-coil motif

    Spt5 is the only known RNA polymerase-associated factor that is conserved in all three domains of life. We have solved the structure of the Methanococcus jannaschii Spt4/5 complex by X-ray crystallography, and characterized its function and interaction with the archaeal RNAP in a wholly recombinant in vitro transcription system. Archaeal Spt4 and Spt5 form a stable complex that associates with RNAP independently of the DNA–RNA scaffold of the elongation complex. The association of Spt4/5 with RNAP results in a stimulation of transcription processivity, both in the absence and the presence of the non-template strand. A domain deletion analysis reveals the molecular anatomy of Spt4/5: the Spt5 NusG N-terminal (NGN) domain is the effector domain of the complex that both mediates the interaction with RNAP and is essential for its elongation activity. Using a mutagenesis approach, we have identified a hydrophobic pocket on the Spt5 NGN domain as the binding site for RNAP, and reciprocally the RNAP clamp coiled-coil motif as the binding site for Spt4/5.

    The Initiation Factor TFE and the Elongation Factor Spt4/5 Compete for the RNAP Clamp during Transcription Initiation and Elongation

    TFIIE and the archaeal homolog TFE enhance DNA strand separation of eukaryotic RNAPII and the archaeal RNAP during transcription initiation by an unknown mechanism. We have developed a fluorescently labeled recombinant M. jannaschii RNAP system to probe the archaeal transcription initiation complex, consisting of promoter DNA, TBP, TFB, TFE, and RNAP. We have localized the position of the TFE winged helix (WH) and zinc ribbon (ZR) domains on the RNAP using single-molecule FRET. The interaction sites of the TFE WH domain and the transcription elongation factor Spt4/5 overlap, and both factors compete for RNAP binding. Binding of Spt4/5 to RNAP represses promoter-directed transcription in the absence of TFE, which alleviates this effect by displacing Spt4/5 from RNAP. During elongation, Spt4/5 can displace TFE from the RNAP elongation complex and stimulate processivity. Our results identify the RNAP “clamp” region as a regulatory hot spot for both transcription initiation and transcription elongation.

    The Effect of Chaperonin Buffering on Protein Evolution

    Molecular chaperones are highly conserved and ubiquitous proteins that help other proteins in the cell to fold. Pioneering work by Rutherford and Lindquist suggested that the chaperone Hsp90 could buffer (i.e., suppress) phenotypic variation in its client proteins and that alternate periods of buffering and expression of these variants might be important in adaptive evolution. More recently, Tokuriki and Tawfik presented an explicit mechanism for chaperone-dependent evolution, in which the Escherichia coli chaperonin GroEL facilitated the folding of clients that had accumulated structurally destabilizing but neofunctionalizing mutations in the protein core. But how important an evolutionary force is chaperonin-mediated buffering in nature? Here, we address this question by modeling the per-residue evolutionary rate of the crystallized E. coli proteome, evaluating the relative contributions of chaperonin buffering, functional importance, and structural features such as residue contact density. Previous findings suggest an interaction between codon bias and GroEL in limiting the effects of misfolding errors. Our results suggest that the buffering of deleterious mutations by GroEL increases the evolutionary rate of client proteins. We then examine the evolutionary fate of GroEL clients in the Mycoplasmas, a group of bacteria containing the only known organisms that lack chaperonins. We show that GroEL was lost once in the common ancestor of a monophyletic subgroup of Mycoplasmas, and we evaluate the effect of this loss on the subsequent evolution of client proteins, providing evidence that client homologs in 11 Mycoplasma species have lost their obligate dependency on GroEL for folding. Our analyses indicate that individual molecules such as chaperonins can have significant effects on proteome evolution through their modulation of protein folding.

    Simulation vs. Reality: A Comparison of In Silico Distance Predictions with DEER and FRET Measurements

    Site-specific incorporation of molecular probes such as fluorescent labels and nitroxide spin labels into biomolecules, and subsequent analysis by Förster resonance energy transfer (FRET) and double electron-electron resonance (DEER), can elucidate the distance and distance changes between the probes. However, the probes have an intrinsic conformational flexibility due to the linker by which they are conjugated to the biomolecule. This property minimizes the influence of the label side chain on the structure of the target molecule, but complicates the direct correlation of the experimental inter-label distances with the macromolecular structure or changes thereof. Simulation methods that account for the conformational flexibility and orientation of the probe(s) can be helpful in overcoming this problem. We performed distance measurements using FRET and DEER and explored different simulation techniques to predict inter-label distances using the Rpo4/7 stalk module of the M. jannaschii RNA polymerase. This is a suitable model system because it is rigid and a high-resolution X-ray structure is available. The conformations of the fluorescent labels and nitroxide spin labels on Rpo4/7 were modeled using in vacuo molecular dynamics simulations (MD) and a stochastic Monte Carlo sampling approach. For the nitroxide probes we also performed MD simulations with explicit water and carried out a rotamer library analysis. Our results show that the Monte Carlo simulations are in better agreement with experiments than the MD simulations, and the rotamer library approach results in plausible distance predictions. Because the latter is the least computationally demanding of the methods we have explored, and is readily available to many researchers, it prevails as the method of choice for the interpretation of DEER distance distributions.

    Repression of RNA polymerase by the archaeo-viral regulator ORF145/RIP

    Little is known about how archaeal viruses perturb the transcription machinery of their hosts. Here we provide the first example of an archaeo-viral transcription factor that directly targets the host RNA polymerase (RNAP) and efficiently represses its activity. ORF145 from the temperate Acidianus two-tailed virus (ATV) forms a high-affinity complex with RNAP by binding inside the DNA-binding channel where it locks the flexible RNAP clamp in one position. This counteracts the formation of transcription pre-initiation complexes in vitro and represses abortive and productive transcription initiation, as well as elongation. Both host and viral promoters are subjected to ORF145 repression. Thus, ORF145 has the properties of a global transcription repressor and its overexpression is toxic for Sulfolobus. On the basis of its properties, we have renamed ORF145 RNAP Inhibitory Protein (RIP).