59 research outputs found

    Exploiting structural and topological information to improve prediction of RNA-protein binding sites

    The breast and ovarian cancer susceptibility gene BRCA1 encodes a multifunctional tumor suppressor protein, BRCA1, which is involved in regulating cellular processes such as the cell cycle, transcription, DNA repair, the DNA damage response and chromatin remodeling. BRCA1 protein, located primarily in cell nuclei, interacts with multiple proteins and various DNA targets. It has been demonstrated that BRCA1 protein binds to damaged DNA and plays a role in the transcriptional regulation of downstream target genes. Although BRCA1 is a key protein in the repair of DNA double-strand breaks, its DNA-binding properties have not been reported in detail.

    Assessing dose rate distributions in VMAT plans

    Dose rate is an essential factor in radiobiology. As modern radiotherapy delivery techniques such as volumetric modulated arc therapy (VMAT) introduce dynamic modulation of the dose rate, it is important to assess the changes in dose rate. Both the rate of monitor units per minute (MU rate) and collimation are varied over the course of a fraction, leading to different dose rates in every voxel of the calculation volume at any point in time during dose delivery. Given the radiotherapy plan and machine-specific limitations, a VMAT treatment plan can be split into arc sectors between Digital Imaging and Communications in Medicine (DICOM) control points (CPs) of constant and known MU rate. By calculating dose distributions in each of these arc sectors independently and multiplying them by the MU rate, the dose rate in every single voxel at every time point during the fraction can be calculated. Independently calculated and then summed dose distributions per arc sector were compared to the whole arc dose calculation for validation. Dose measurements and video analysis were performed to validate the calculated datasets. A clinical head and neck, cranial and liver case were analyzed using the tool developed. Measurement validation of synthetic test cases showed linac agreement to precalculated arc sector times within ±0.4 s and doses ±0.1 MU (one standard deviation). Two methods for the visualization of dose rate datasets were developed: the first method plots a two-dimensional (2D) histogram of the number of voxels receiving a given dose rate over the course of the arc treatment delivery. Similar to the dose display of a treatment planning system, the second method displays the dose rate as a color wash on top of the corresponding computed tomography image, allowing the user to scroll through the variation over time.
Examining clinical cases showed dose rates spread over a continuous spectrum, with mean dose rates hardly exceeding 100 cGy/min for conventional fractionation. A tool to analyze dose rate distributions in VMAT plans with sub-second accuracy was successfully developed and validated. Dose rates encountered in clinical VMAT test cases show a continuous spectrum with a mean less than or near 100 cGy/min for conventional fractionation.
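
The arc-sector calculation described in the VMAT abstract above can be illustrated with a short sketch. This is not the authors' tool: the function names, the flat per-voxel list layout, and the sample numbers are all hypothetical. It assumes each sector's dose distribution is expressed per monitor unit (cGy/MU) and that the MU rate (MU/min) is constant within a sector, so that their product is a dose rate in cGy/min.

```python
def dose_rate_per_sector(dose_per_mu, mu_rates):
    """Return the dose rate (cGy/min) of every voxel in every arc sector.

    dose_per_mu: one list per arc sector holding per-voxel dose in cGy per MU
                 (assumed layout, for illustration only).
    mu_rates:    one MU rate (MU/min) per arc sector, assumed constant
                 between the two control points bounding the sector.
    """
    if len(dose_per_mu) != len(mu_rates):
        raise ValueError("need exactly one MU rate per arc sector")
    # Dose rate = (dose per MU) * (MU per minute), voxel by voxel.
    return [[d * rate for d in sector]
            for sector, rate in zip(dose_per_mu, mu_rates)]


def mean_dose_rate(sector_rates):
    """Mean dose rate (cGy/min) over the voxels of one arc sector."""
    return sum(sector_rates) / len(sector_rates)


# Hypothetical example: two arc sectors, three voxels each.
dose_per_mu = [[0.5, 0.25, 0.0], [0.125, 0.5, 0.25]]  # cGy/MU
mu_rates = [200.0, 400.0]                             # MU/min
rates = dose_rate_per_sector(dose_per_mu, mu_rates)
```

Keeping one list per sector mirrors the idea of a time-resolved dataset: each sector corresponds to a known time interval, so per-voxel histograms or image overlays could be built sector by sector.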

    PRIDB: a protein–RNA interface database

    The Protein–RNA Interface Database (PRIDB) is a comprehensive database of protein–RNA interfaces extracted from complexes in the Protein Data Bank (PDB). It is designed to facilitate detailed analyses of individual protein–RNA complexes and their interfaces, in addition to automated generation of user-defined data sets of protein–RNA interfaces for statistical analyses and machine learning applications. For any chosen PDB complex or list of complexes, PRIDB rapidly displays interfacial amino acids and ribonucleotides within the primary sequences of the interacting protein and RNA chains. PRIDB also identifies ProSite motifs in protein chains and FR3D motifs in RNA chains and provides links to these external databases, as well as to structure files in the PDB. An integrated JMol applet is provided for visualization of interacting atoms and residues in the context of the 3D complex structures. The current version of PRIDB contains structural information regarding 926 protein–RNA complexes available in the PDB (as of 10 October 2010). Atomic- and residue-level contact information for the entire data set can be downloaded in a simple machine-readable format. Also, several non-redundant benchmark data sets of protein–RNA complexes are provided. The PRIDB database is freely available online at http://bindr.gdcb.iastate.edu/PRIDB

    VoiceS: voice quality after transoral CO2 laser surgery versus single vocal cord irradiation for unilateral stage 0 and I glottic larynx cancer-a randomized phase III trial [study protocol].

    BACKGROUND Surgery and radiotherapy are well-established standards of care for unilateral stage 0 and I early-stage glottic cancer (ESGC). Based on comparative studies and meta-analyses, functional and oncological outcomes after both treatment modalities are similar. Historically, radiotherapy (RT) has been performed by irradiation of the whole larynx. However, only the involved vocal cord is being treated with recently introduced hypofractionated concepts that result in 8 to 10-fold smaller target volumes. Retrospective data argues for an improvement in voice quality with non-inferior local control. Based on these findings, single vocal cord irradiation (SVCI) has been implemented as a routine approach in some institutions for ESGC in recent years. However, prospective data directly comparing SVCI with surgery is lacking. The aim of VoiceS is to fill this gap. METHODS In this prospective randomized multi-center open-label phase III study with a superiority design, 34 patients with histopathologically confirmed, untreated, unilateral stage 0-I ESGC (unilateral cTis or cT1a) will be randomized to SVCI or transoral CO2-laser microsurgical cordectomy (TLM). The average difference in voice quality, measured using the voice handicap index (VHI), will be modeled over four time points (6, 12, 18, and 24 months). The primary endpoint of this study will be the patient-reported subjective voice quality between 6 and 24 months after randomization. Secondary endpoints will include perceptual impression of the voice via roughness - breathiness - hoarseness (RBH) assessment at the above-mentioned time points. Additionally, quantitative characteristics of voice, loco-regional tumor control at 2 and 5 years, and treatment toxicity at 2 and 5 years based on CTCAE v.5.0 will be reported. DISCUSSION To our knowledge, VoiceS is the first randomized phase III trial comparing SVCI with TLM. Results of this study may lead to improved decision-making in the treatment of ESGC.
TRIAL REGISTRATION ClinicalTrials.gov NCT04057209. Registered on 15 August 2019. Cantonal Ethics Committee KEK-BE 2019-01506

    Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling

    Background: Identification of functionally important sites in biomolecular sequences has broad applications ranging from rational drug design to the analysis of metabolic and signal transduction networks. Experimental determination of such sites lags far behind the number of known biomolecular sequences. Hence, there is a need to develop reliable computational methods for identifying functionally important sites from biomolecular sequences. Results: We present a mixture of experts approach to biomolecular sequence labeling that takes into account the global similarity between biomolecular sequences. Our approach combines unsupervised and supervised learning techniques. Given a set of sequences and a similarity measure defined on pairs of sequences, we learn a mixture of experts model by using spectral clustering to learn the hierarchical structure of the model and by using Bayesian techniques to combine the predictions of the experts. We evaluate our approach on two biomolecular sequence labeling problems: RNA-protein and DNA-protein interface prediction problems. The results of our experiments show that global sequence similarity can be exploited to improve the performance of classifiers trained to label biomolecular sequence data. Conclusion: The mixture of experts model helps improve the performance of machine learning methods for identifying functionally important sites in biomolecular sequences. This is a proceeding from IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 10 (2009): S4, doi: 10.1186/1471-2105-10-S4-S4. Posted with permission.

    BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features

    Background: Understanding how biomolecules interact is a major task of systems biology. To model protein-nucleic acid interactions, it is important to identify the DNA or RNA-binding residues in proteins. Protein sequence features, including the biochemical property of amino acids and evolutionary information in terms of position-specific scoring matrix (PSSM), have been used for DNA or RNA-binding site prediction. However, PSSM is rather designed for PSI-BLAST searches, and it may not contain all the evolutionary information for modelling DNA or RNA-binding sites in protein sequences. Results: In the present study, several new descriptors of evolutionary information have been developed and evaluated for sequence-based prediction of DNA and RNA-binding residues using support vector machines (SVMs). The new descriptors were shown to improve classifier performance. Interestingly, the best classifiers were obtained by combining the new descriptors and PSSM, suggesting that they captured different aspects of evolutionary information for DNA and RNA-binding site prediction. The SVM classifiers achieved 77.3% sensitivity and 79.3% specificity for prediction of DNA-binding residues, and 71.6% sensitivity and 78.7% specificity for RNA-binding site prediction. Conclusions: Predictions at this level of accuracy may provide useful information for modelling protein-nucleic acid interactions in systems biology studies. We have thus developed a web-based tool called BindN+ (http://bioinfo.ggc.org/bindn+/) to make the SVM classifiers accessible to the research community.
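
For readers unfamiliar with the two metrics reported in the BindN+ abstract above, the following sketch shows how sensitivity and specificity are computed from per-residue labels. It is a generic illustration, not BindN+ code: the function name and the toy labels are hypothetical, with 1 marking a binding residue and 0 a non-binding residue.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary per-residue labels.

    y_true, y_pred: equal-length sequences of 0/1 labels
                    (1 = binding residue, 0 = non-binding residue).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)  # fraction of binding residues recovered
    specificity = tn / (tn + fp)  # fraction of non-binding residues rejected
    return sensitivity, specificity


# Toy labels for a six-residue sequence (hypothetical data).
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Reporting both values matters for binding-site prediction because binding residues are typically a minority class, so accuracy alone can look high even when few binding residues are found.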

    Classifying RNA-Binding Proteins Based on Electrostatic Properties

    Protein structure can provide new insight into the biological function of a protein and can enable the design of better experiments to learn its biological roles. Moreover, deciphering the interactions of a protein with other molecules can contribute to the understanding of the protein's function within cellular processes. In this study, we apply a machine learning approach for classifying RNA-binding proteins based on their three-dimensional structures. The method is based on characterizing unique properties of electrostatic patches on the protein surface. Using an ensemble of general protein features and specific properties extracted from the electrostatic patches, we have trained a support vector machine (SVM) to distinguish RNA-binding proteins from other positively charged proteins that do not bind nucleic acids. Specifically, the method was applied on proteins possessing the RNA recognition motif (RRM) and successfully classified RNA-binding proteins from RRM domains involved in protein–protein interactions. Overall the method achieves 88% accuracy in classifying RNA-binding proteins, yet it cannot distinguish RNA from DNA binding proteins. Nevertheless, by applying a multiclass SVM approach we were able to classify the RNA-binding proteins based on their RNA targets, specifically, whether they bind a ribosomal RNA (rRNA), a transfer RNA (tRNA), or messenger RNA (mRNA). Finally, we present here an innovative approach that does not rely on sequence or structural homology and could be applied to identify novel RNA-binding proteins with unique folds and/or binding motifs

    Functional specialization of domains tandemly duplicated within 16S rRNA methyltransferase RsmC

    RNA methyltransferases (MTases) are important players in the biogenesis and regulation of the ribosome, the cellular machine for protein synthesis. RsmC is a MTase that catalyzes the transfer of a methyl group from S-adenosyl-l-methionine (SAM) to G1207 of 16S rRNA. Mutations of G1207 have dominant lethal phenotypes in Escherichia coli, underscoring the significance of this modified nucleotide for ribosome function. Here we report the crystal structure of E. coli RsmC refined to 2.1 Å resolution, which reveals two homologous domains tandemly duplicated within a single polypeptide. We characterized the function of the individual domains and identified key residues involved in binding of rRNA and SAM, and in catalysis. We also discovered that one of the domains is important for the folding of the other. Domain duplication and subfunctionalization by complementary degeneration of redundant functions (in particular substrate binding versus catalysis) has been reported for many enzymes, including those involved in RNA metabolism. Thus, RsmC can be regarded as a model system for functional streamlining of domains accompanied by the development of dependencies concerning folding and stability

    PDNAsite: identification of DNA-binding site from protein sequence by incorporating spatial and sequence context

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machine (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from the spatial context provide more information than those from the sequence context, and that combining the two gives a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely available to the biological research community.