15 research outputs found

    Rapid protein assignments and structures from raw NMR spectra with the deep learning technique ARTINA

    Nuclear Magnetic Resonance (NMR) spectroscopy is one of the major techniques in structural biology with over 11,800 protein structures deposited in the Protein Data Bank. NMR can elucidate structures and dynamics of small and medium size proteins in solution, living cells, and solids, but has been limited by the tedious data analysis process. It typically requires weeks or months of manual work of a trained expert to turn NMR measurements into a protein structure. Automation of this process is an open problem, formulated in the field over 30 years ago. Here, we present a solution to this challenge that enables the completely automated analysis of protein NMR data within hours after completing the measurements. Using only NMR spectra and the protein sequence as input, our machine learning-based method, ARTINA, delivers signal positions, resonance assignments, and structures strictly without any human intervention. Tested on a 100-protein benchmark comprising 1329 multidimensional NMR spectra, ARTINA demonstrated its ability to solve structures with 1.44 Å median RMSD to the PDB reference and to identify 91.36% correct NMR resonance assignments. ARTINA can be used by non-experts, reducing the effort for a protein assignment or structure determination by NMR essentially to the preparation of the sample and the spectra measurements.

    Computer vision-based automated peak picking applied to protein NMR spectra

    Motivation: A detailed analysis of multidimensional NMR spectra of macromolecules requires the identification of individual resonances (peaks). This task can be tedious and time-consuming and often requires support by experienced users. Automated peak picking algorithms were introduced more than 25 years ago, but there are still major deficiencies that often prevent complete and error-free peak picking of biological macromolecule spectra. The major challenges for automated peak picking algorithms are distinguishing artifacts from real peaks, particularly those with irregular shapes, and picking peaks in spectral regions with overlapping resonances, which are very hard to resolve with existing computer algorithms. In both of these cases a visual inspection approach could be more effective than a 'blind' algorithm. Results: We present a novel approach using computer vision (CV) methodology, which could be better adapted to the problem of peak recognition. After suitable 'training' we successfully applied the CV algorithm to spectra of medium-sized soluble proteins up to molecular weights of 26 kDa and to a 130 kDa complex of a tetrameric membrane protein in detergent micelles. Our CV approach outperforms commonly used programs. With suitable training datasets the application of the presented method can be extended to automated peak picking in multidimensional spectra of nucleic acids or carbohydrates and adapted to solid-state NMR spectra. Availability and implementation: CV-Peak Picker is available upon request from the authors. Supplementary information: Supplementary data are available at Bioinformatics online.

    Nuclear magnetic resonance spectroscopy interpretation for protein modeling using computer vision and probabilistic graphical models

    The rapid development of nuclear magnetic resonance (NMR) spectroscopy has enabled fast acquisition of experimental data that determine the structure and dynamics of macromolecules. Nevertheless, owing to a lack of appropriate computational methods, NMR spectra are still analyzed manually by researchers, which takes weeks or years depending on protein complexity. Automation of this process is therefore highly desirable and can significantly reduce the time needed to solve a protein structure. In this work, a new approach to automated analysis of three-dimensional protein NMR spectra is presented. It is based on the Histogram of Oriented Gradients and a Bayesian network, which have not previously been applied in this context. The proposed method was evaluated using benchmark data established by manual labeling of 99 spectroscopic images taken from 6 different NMR experiments. Subsequent validation was carried out using spectra of the upstream of N-ras protein. Using the proposed method, a three-dimensional structure of this protein was calculated. Comparison with the reference structure from the Protein Data Bank reveals no significant differences, showing that the proposed method can be used in practice in NMR laboratories.

    NMRtist: an online platform for automated biomolecular NMR spectra analysis

    We present NMRtist, an online platform that combines deep learning, large-scale optimization, and cloud computing to automate protein NMR spectra analysis. Our website provides virtual storage for NMR spectra deposition together with a set of applications designed for automated peak picking, chemical shift assignment, and protein structure determination. The system can be used by non-experts and allows protein assignments and structures to be determined within hours after the measurements, strictly without any human intervention. ISSN:1367-4803; ISSN:1460-205

    Rapid protein assignments and structures from raw NMR spectra with the deep learning technique ARTINA

    Nuclear Magnetic Resonance (NMR) spectroscopy is a major technique in structural biology with over 11,800 protein structures deposited in the Protein Data Bank. NMR can elucidate structures and dynamics of small and medium size proteins in solution, living cells, and solids, but has been limited by the tedious data analysis process. It typically requires weeks or months of manual work of a trained expert to turn NMR measurements into a protein structure. Automation of this process is an open problem, formulated in the field over 30 years ago. We present a solution to this challenge that enables the completely automated analysis of protein NMR data within hours after completing the measurements. Using only NMR spectra and the protein sequence as input, our machine learning-based method, ARTINA, delivers signal positions, resonance assignments, and structures strictly without human intervention. Tested on a 100-protein benchmark comprising 1329 multidimensional NMR spectra, ARTINA demonstrated its ability to solve structures with 1.44 Å median RMSD to the PDB reference and to identify 91.36% correct NMR resonance assignments. ARTINA can be used by non-experts, reducing the effort for a protein assignment or structure determination by NMR essentially to the preparation of the sample and the spectra measurements. ISSN:2041-172

    Chemical shift transfer: an effective strategy for protein NMR assignment with ARTINA

    Chemical shift transfer (CST) is a well-established technique in NMR spectroscopy that utilizes the chemical shift assignment of one protein (source) to identify chemical shifts of another (target). Given similarity between source and target systems (e.g., using homologs), CST allows the chemical shifts of the target system to be assigned using a limited amount of experimental data. In this study, we propose a deep-learning-based workflow, ARTINA-CST, that automates this procedure, allowing CST to be carried out within minutes or hours of computational time and strictly without any human supervision. We characterize the efficacy of our method using three distinct synthetic and experimental datasets, demonstrating its effectiveness and robustness even when substantial differences exist between the source and target proteins. With its potential applications spanning a wide range of NMR projects, including drug discovery and protein interaction studies, ARTINA-CST is anticipated to be a valuable method that facilitates research in the field. ISSN:2296-889

    PDBcor: An Automated Correlation Extraction Calculator for Multi-State Protein Structures

    Allostery and correlated motion are key elements linking protein dynamics with the mechanisms of action of proteins. Here, we present PDBcor, an automated and unbiased method for the detection and analysis of correlated motions from experimental multi-state protein structures. It uses torsion angle and distance statistics and does not require any structure superposition. Clustering of protein conformers allows us to extract correlations in the form of mutual information based on information theory. With PDBcor, we elucidated correlated motion in the WW domain of PIN1, the protein GB3, and the enzyme cyclophilin, in line with reported findings. Correlations extracted with PDBcor can be utilized in subsequent assays including nuclear magnetic resonance (NMR) multi-state structure optimization and validation. As a guide for the interpretation of PDBcor results, we provide a series of protein structure ensembles that exhibit different levels of correlation, including non-correlated, locally correlated, and globally correlated ensembles. ISSN:0969-2126; ISSN:1878-418

    Data from: A versatile pipeline for the multi-scale digital reconstruction and quantitative analysis of 3D tissue architecture

    A prerequisite for the systems biology analysis of tissues is an accurate digital three-dimensional reconstruction of tissue structure based on images of markers covering multiple scales. Here, we designed a flexible pipeline for the multi-scale reconstruction and quantitative morphological analysis of tissue architecture from microscopy images. Our pipeline includes newly developed algorithms that address specific challenges of thick dense tissue reconstruction. Our implementation allows for a flexible workflow, scalable to high-throughput analysis and applicable to various mammalian tissues. We applied it to the analysis of liver tissue and extracted quantitative parameters of sinusoids, bile canaliculi and cell shapes, recognizing different liver cell types with high accuracy. Using our platform, we uncovered an unexpected zonation pattern of hepatocytes with different size, nuclei and DNA content, thus revealing new features of liver tissue organization. The pipeline also proved effective for analysing lung and kidney tissue, demonstrating its generality and robustness.