584 research outputs found

    Developing a scoring function for NMR structure-based assignments using machine learning

    Determining the assignment of signals received from the experiments (peaks) to specific nuclei of the target molecule in Nuclear Magnetic Resonance (NMR) spectroscopy is an important challenge. Nuclear Vector Replacement (NVR) [2, 3] is a framework for structure-based assignments which combines multiple types of NMR data such as chemical shifts, residual dipolar couplings (RDCs), and NOEs. NVR-BIP [1] is a tool which utilizes a scoring function with a binary integer programming (BIP) model to perform the assignments. In this paper, support vector machines (SVMs) and boosting are employed to combine the terms in NVR-BIP's scoring function by viewing the assignment as a classification problem. The assignment accuracies obtained using this approach show that boosting improves the assignment accuracy of NVR-BIP on our data set when RDCs are not available and outperforms SVMs. With RDCs, boosting and SVMs offer mixed results.

    Combining automated peak tracking in SAR by NMR with structure-based backbone assignment from 15N-NOESY

    BACKGROUND: Chemical shift mapping is an important technique in NMR-based drug screening for identifying the atoms of a target protein that potentially bind to a drug molecule upon the molecule's introduction in increasing concentrations. The goal is to obtain a mapping of peaks with known residue assignment from the reference spectrum of the unbound protein to peaks with unknown assignment in the target spectrum of the bound protein. Although a series of perturbed spectra helps to trace a path from reference peaks to target peaks, a one-to-one mapping generally is not possible, especially for large proteins, due to errors such as noise peaks, missing peaks, peaks that go missing and then reappear, overlapped peaks, and new peaks not associated with any peak in the reference. Due to these difficulties, the mapping is typically done manually or semi-automatically, which is not efficient for high-throughput drug screening. RESULTS: We present PeakWalker, a novel peak walking algorithm for fast-exchange systems that models the errors explicitly and performs many-to-one mapping. On the proteins hBcl(XL), UbcH5B, and histone H1, it achieves an average accuracy of over 95% with less than 1.5 residues predicted per target peak. Given these mappings as input, we present PeakAssigner, a novel combined structure-based backbone resonance and NOE assignment algorithm that uses just 15N-NOESY, while avoiding TOCSY experiments and 13C-labeling, to resolve the ambiguities for a one-to-one mapping. On the three proteins, it achieves an average accuracy of 94% or better. CONCLUSIONS: Our mathematical programming approach for modeling chemical shift mapping as a graph problem, while modeling the errors directly, is potentially a time- and cost-effective first step for high-throughput drug screening based on limited NMR data and homologous 3D structures.

    A Tabu search approach for the nuclear magnetic resonance protein structure based assignment problem

    Nuclear Magnetic Resonance (NMR) Spectroscopy is an experimental technique which exploits the magnetic properties of specific nuclei and enables the study of proteins in solution. The key bottleneck of NMR studies is to map the NMR peaks to corresponding nuclei, also known as the assignment problem. Structure Based Assignment (SBA) is an approach to solve this computationally challenging problem by using prior information about the protein obtained from a homologous structure. [17] used the Nuclear Vector Replacement (NVR) [29] framework to model SBA as a binary integer programming problem (NVR-BIP). In this thesis, we prove that this problem is NP-hard and propose a tabu search algorithm (NVR-TS) equipped with a guided perturbation mechanism to efficiently solve it. NVR-TS uses a quadratic penalty relaxation of NVR-BIP where the violations in the Nuclear Overhauser Effect constraints are penalized in the objective function. Experimental results indicate that our algorithm finds the optimal solution on NVR-BIP's data set which consists of 7 proteins with 25 templates (31 to 126 residues). Furthermore, for two additional large proteins, MBP and EIN (348 and 243 residues, respectively) which NVR-BIP failed to solve, it achieves 91% and 41% assignment accuracies. The executable and the input files are available for download at http://people.sabanciuniv.edu/catay/NVR-TS/NVR-TS.html
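The idea of a tabu search over a penalty relaxation can be illustrated with a small sketch. This is not the NVR-TS implementation: the function names, the swap neighborhood, the fixed tabu tenure, and the data layout (`score`, `noe_pairs`, `dist`, `cutoff`, `penalty`) are all assumptions chosen for illustration, and the guided perturbation mechanism mentioned in the abstract is left out. Each NOE term depends on the assignments of two peaks at once, which is the sense in which the penalty is quadratic in the assignment variables.

```python
import itertools
import random


def penalized_cost(assign, score, noe_pairs, dist, cutoff, penalty):
    """Linear assignment cost plus a penalty for every violated NOE pair.

    assign[p] is the residue given to peak p (hypothetical encoding).
    An NOE between peaks p and q is violated when their assigned
    residues are farther apart than `cutoff` in the template structure.
    """
    linear = sum(score[p][r] for p, r in enumerate(assign))
    violations = sum(
        1 for p, q in noe_pairs if dist[assign[p]][assign[q]] > cutoff
    )
    return linear + penalty * violations


def tabu_search(score, noe_pairs, dist, cutoff=5.0, penalty=10.0,
                iters=200, tenure=5, seed=0):
    """Swap-based tabu search over one-to-one peak-to-residue assignments."""
    rng = random.Random(seed)
    n = len(score)
    current = list(range(n))
    rng.shuffle(current)

    def cost(a):
        return penalized_cost(a, score, noe_pairs, dist, cutoff, penalty)

    best, best_cost = current[:], cost(current)
    tabu_until = {}  # swap (i, j) -> last iteration at which it stays tabu

    for it in range(iters):
        candidates = []
        for i, j in itertools.combinations(range(n), 2):
            neighbor = current[:]
            neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
            c = cost(neighbor)
            is_tabu = tabu_until.get((i, j), -1) >= it
            # Aspiration: a tabu move is allowed only if it beats the best.
            if is_tabu and c >= best_cost:
                continue
            candidates.append((c, i, j, neighbor))
        if not candidates:
            continue
        # Take the best admissible neighbor, even if it is worse than current.
        c, i, j, neighbor = min(candidates, key=lambda t: t[0])
        current = neighbor
        tabu_until[(i, j)] = it + tenure
        if c < best_cost:
            best, best_cost = neighbor[:], c
    return best, best_cost
```

Calling `tabu_search(score, noe_pairs, dist)` with an n-by-n `score` matrix returns the best assignment found and its penalized cost. Moving the NOE constraints into the objective lets the search pass through infeasible assignments instead of being confined to the feasible region, at the price of having to choose a penalty weight.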