2,209 research outputs found

    Subcellular location prediction of proteins using support vector machines with alignment of block sequences utilizing amino acid composition

    Background: Subcellular location prediction of proteins is an important and well-studied problem in bioinformatics. Given the amino acid sequence of a protein as input, the task is to predict which part of the cell the protein is transported to. The problem is becoming more important because information on subcellular location is helpful for annotating proteins and genes, and the number of complete genomes is rapidly increasing. Since existing predictors are based on various heuristics, it is important to develop a simple method with high prediction accuracy. Results: In this paper, we propose a novel and general prediction method that combines sequence alignment techniques with feature vectors based on amino acid composition. We implemented this method with support vector machines on plant data sets extracted from the TargetP database. In fivefold cross-validation tests, the overall accuracy and average MCC were 0.9096 and 0.8655, respectively. We also applied our method to other datasets, including that of WoLF PSORT. Conclusion: Although there is a predictor that uses gene ontology information and yields higher accuracy than ours, our accuracies are higher than those of existing predictors that use only sequence information. Since information such as gene ontology can be obtained only for known proteins, our predictor is considered useful for subcellular location prediction of newly discovered proteins. Furthermore, the idea of combining alignment and amino acid frequency is novel and general, so it may be applied to other problems in bioinformatics. Our method for plant proteins is also implemented as a web system, available at http://sunflower.kuicr.kyoto-u.ac.jp/~tamura/slpfa.html.
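
    The abstract above mentions feature vectors based on amino acid composition and reports an average MCC. The following is a minimal sketch of those two ingredients only, not the authors' implementation: the fixed-length block split in `block_features` and all function names are assumptions made for illustration, and the alignment step and the SVM are omitted.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20-dimensional amino acid frequency vector of seq."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)  # ignore non-standard symbols
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def block_features(seq, block_len):
    """Concatenate composition vectors of consecutive blocks.

    The fixed-length split is a hypothetical stand-in for the paper's
    block definition, which the abstract does not describe.
    """
    features = []
    for start in range(0, len(seq), block_len):
        features.extend(composition(seq[start:start + block_len]))
    return features


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for one class (one-vs-rest counts)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

    An average MCC over several locations can then be obtained by computing `mcc` once per location from its one-vs-rest confusion counts and averaging the results.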

    Design Novel Dual Agonists for Treating Type-2 Diabetes by Targeting Peroxisome Proliferator-Activated Receptors with Core Hopping Approach

    Owing to their unique functions in regulating glucose, lipid and cholesterol metabolism, PPARs (peroxisome proliferator-activated receptors) have drawn special attention for developing drugs to treat type-2 diabetes. By combining the lipid benefit of PPAR-alpha agonists (such as fibrates) with the glycemic advantages of PPAR-gamma agonists (such as thiazolidinediones), the dual PPAR agonist approach can both improve the metabolic effects and minimize the side effects caused by either agent alone, and hence has become a promising strategy for designing effective drugs against type-2 diabetes. In this study, by means of the “core hopping” and “glide docking” techniques, a novel class of PPAR dual agonists was discovered based on the compound GW409544, a well-known dual agonist for both PPAR-alpha and PPAR-gamma modified from the farglitazar structure. Molecular dynamics simulations showed that these novel agonists not only possessed the same function as GW409544 in activating PPAR-alpha and PPAR-gamma, but also adopted more favorable conformations for binding to the two receptors. Their ADME (absorption, distribution, metabolism, and excretion) predictions further indicated that the new agonists hold high potential to become drug candidates. At the very least, the findings reported here may stimulate new strategies or provide useful insights for discovering more effective dual agonists for treating type-2 diabetes. Since the “core hopping” technique allows for rapidly screening novel cores to help overcome unwanted properties by generating new lead compounds with improved core properties, it has not escaped our notice that the current strategy, along with the corresponding computational procedures, can also be utilized to find novel and more effective drugs for treating other illnesses.

    'Unite and conquer': enhanced prediction of protein subcellular localization by integrating multiple specialized tools

    Background: Knowing the subcellular location of proteins provides clues to their function as well as the interconnectivity of biological processes. Dozens of tools are available for predicting protein location in the eukaryotic cell. Each tool performs well on certain data sets, but their predictions often disagree for a given protein. Since the individual tools each have particular strengths, we set out to integrate them in a way that optimally exploits their potential. The method we present here is applicable to various subcellular locations, but tailored for predicting whether or not a protein is localized in mitochondria. Knowledge of the mitochondrial proteome is relevant to understanding the role of this organelle in global cellular processes. Results: In order to develop a method for enhanced prediction of subcellular localization, we integrated the outputs of available localization prediction tools by several strategies, and tested the performance of each strategy with known mitochondrial proteins. The accuracy obtained (up to 92%) far surpasses that of the individual tools. The method of integration proved crucial to the performance. For the prediction of mitochondrion-located proteins, integration via a two-layer decision tree clearly outperforms simpler methods, as it allows emphasis of biologically relevant features such as the mitochondrial targeting peptide and transmembrane domains. Conclusion: We developed an approach that enhances the prediction accuracy of mitochondrial proteins by uniting the strength of specialized tools. The combination of machine-learning based integration with biological expert knowledge leads to improved performance. This approach also alleviates the conundrum of how to choose between conflicting predictions. Our approach is easy to implement, and applicable to predicting subcellular locations other than mitochondria, as well as other biological features. For a trial of our approach, we provide a web service for mitochondrial protein prediction (named YimLOC), which can be accessed through the AnaBench suite at http://anabench.bcm.umontreal.ca/anabench/. The source code is provided in Additional file 2 (1471-2105-8-420-S2.pdf), which contains the scripts for the online server YimLOC. Note that these scripts implement only the ready-to-use STACK-mem-DT described in the main text; they do not include the training process.
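
    To make the idea of integrating conflicting tool outputs concrete, here is a minimal sketch of plain majority voting. This is an assumed example of a "simpler method", not the two-layer decision tree (STACK-mem-DT) described in the abstract, and the tool names in the usage note are hypothetical placeholders.

```python
from collections import Counter


def vote_fraction(predictions):
    """Fraction of tools that predict a mitochondrial location.

    predictions maps a tool name to True (mitochondrial) or False.
    """
    if not predictions:
        raise ValueError("at least one tool prediction is required")
    votes = Counter(predictions.values())
    return votes[True] / len(predictions)


def majority_vote(predictions):
    """Return True only if strictly more than half of the tools say True."""
    return vote_fraction(predictions) > 0.5
```

    For example, `majority_vote({"tool_a": True, "tool_b": True, "tool_c": False})` returns `True`. A vote like this weights every tool equally, which is the kind of limitation a learned integration layer is meant to address.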

    A method to improve protein subcellular localization prediction by integrating various biological data sources

    Background: Protein subcellular localization is crucial information for elucidating protein functions. Owing to the need for large-scale genome analysis, computational methods for efficiently predicting protein subcellular localization are in high demand. Although much previous work has addressed this task, the problem remains challenging for several reasons: the number of subcellular locations in practice is large; the distribution of proteins across locations is imbalanced, that is, the number of proteins in each location differs markedly; and many proteins are located in multiple locations. Thus it is necessary to explore new features and appropriate classification methods to improve the prediction performance. Results: In this paper we propose a new prediction method which combines two key ideas: 1) information on neighbouring proteins in a probabilistic gene network is integrated to enrich the prediction features; 2) fuzzy k-NN, a classification method based on fuzzy set theory, is applied to predict proteins located in multiple sites. Experiments were conducted on a dataset of budding yeast proteins covering 22 locations, and significant improvement was observed. Conclusion: Our results suggest that the neighbourhood information from functional gene networks is predictive of subcellular localization. The proposed method can thus be integrated with, and is complementary to, other available prediction methods.
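
    The fuzzy k-NN classifier named above assigns a query a degree of membership in every class rather than a single label, which is why it suits proteins with multiple locations. Below is a minimal sketch of the standard distance-weighted fuzzy k-NN rule; the Euclidean distance, the fuzzifier `m = 2`, and the membership threshold in `predict_locations` are assumptions, and the network-derived features from the abstract are not modelled.

```python
from math import dist


def fuzzy_knn(query, train_points, train_memberships, k=3, m=2.0):
    """Return one membership degree per class for the query point.

    train_memberships[j][c] is the membership of training point j in class c.
    Neighbours are weighted by 1 / distance ** (2 / (m - 1)).
    """
    # Indices of the k training points closest to the query.
    neighbours = sorted(
        range(len(train_points)), key=lambda j: dist(query, train_points[j])
    )[:k]
    exponent = 2.0 / (m - 1.0)
    weights = []
    for j in neighbours:
        d = max(dist(query, train_points[j]), 1e-12)  # avoid division by zero
        weights.append(1.0 / d ** exponent)
    total = sum(weights)
    n_classes = len(train_memberships[0])
    return [
        sum(w * train_memberships[j][c] for w, j in zip(weights, neighbours)) / total
        for c in range(n_classes)
    ]


def predict_locations(memberships, threshold=0.3):
    """Report every class whose membership reaches the (assumed) threshold."""
    return [c for c, u in enumerate(memberships) if u >= threshold]
```

    With memberships such as `[0.6, 0.4]` and a threshold of 0.3, `predict_locations` reports both classes, so a protein can be assigned to more than one site.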

    A Multi-Label Predictor for Identifying the Subcellular Locations of Singleplex and Multiplex Eukaryotic Proteins

    Subcellular locations of proteins are important functional attributes. An effective and efficient subcellular localization predictor is necessary for rapidly and reliably annotating subcellular locations of proteins. Most existing subcellular localization methods deal only with single-location proteins. In fact, proteins may simultaneously exist at, or move between, two or more different subcellular locations. To better reflect the characteristics of such multiplex proteins, new methods for dealing with them are highly desirable. In this paper, we develop a new predictor, called Euk-ECC-mPLoc, that can handle systems containing both singleplex and multiplex eukaryotic proteins. It introduces a multi-label learning approach that exploits correlations between subcellular locations, and it hybridizes gene ontology with dipeptide composition information. It can be used to identify eukaryotic proteins among the following 22 locations: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centrosome, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole. Experimental results on a stringent benchmark dataset of eukaryotic proteins, obtained by the jackknife cross-validation test, show that the average success rate and overall success rate of Euk-ECC-mPLoc were 69.70% and 81.54%, respectively, indicating that our approach is quite promising. In particular, the success rates achieved by Euk-ECC-mPLoc for small subsets were remarkably improved, indicating that it holds high potential for stimulating the development of the area. As a user-friendly web server, Euk-ECC-mPLoc is freely accessible to the public at http://levis.tongji.edu.cn:8080/bioinfo/Euk-ECC-mPLoc/. We believe that Euk-ECC-mPLoc may become a useful high-throughput tool, or at least play a complementary role to the existing predictors in identifying subcellular locations of eukaryotic proteins.
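
    Dipeptide composition, one of the two information sources mentioned above, is commonly computed as the normalized frequency of each of the 400 ordered pairs of adjacent residues. The sketch below follows that common definition; it is not taken from Euk-ECC-mPLoc, and the function and constant names are illustrative. The gene ontology features and the multi-label learner are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(seq):
    """Return the 400-dimensional frequency vector of adjacent residue pairs."""
    seq = seq.upper()
    vec = [0.0] * len(DIPEPTIDES)
    n_pairs = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in INDEX:  # skip pairs containing non-standard symbols
            vec[INDEX[pair]] += 1.0
            n_pairs += 1
    if n_pairs:
        vec = [v / n_pairs for v in vec]
    return vec
```

    Unlike the 20-dimensional amino acid composition, this vector retains some local sequence-order information, at the cost of a much sparser representation for short proteins.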

    Three-dimensionally Ordered Macroporous Structure Enabled Nanothermite Membrane of Mn2O3/Al

    Mn2O3 has been used to realize a nanothermite membrane for the first time in the literature. The Mn2O3/Al nanothermite was synthesized by magnetron sputtering a layer of Al film onto a three-dimensionally ordered macroporous (3DOM) Mn2O3 skeleton. The energy release is significantly enhanced owing to the unusual 3DOM structure, which ensures that Al and Mn2O3 integrate compactly at the nanoscale and greatly increases the effective contact area. The morphology and DSC curve of the nanothermite membrane were investigated at various aluminizing times. At the optimized aluminizing time of 30 min, the energy release reaches a maximum of 2.09 kJ∙g−1, with the Al layer thickness playing a decisive role in the total energy release. This method is highly compatible with MEMS and can easily be applied to other nanothermite systems, which should contribute greatly to little-known nanothermite research.

    Genetic Analysis of the Cytoplasmic Dynein Subunit Families

    Cytoplasmic dyneins, the principal microtubule minus-end-directed motor proteins of the cell, are involved in many essential cellular processes. The major form of this enzyme is a complex of at least six protein subunits, and in mammals all but one of the subunits are encoded by at least two genes. Here we review current knowledge concerning the subunits, their interactions, and their functional roles as derived from biochemical and genetic analyses. We also carried out extensive database searches to look for new genes and to clarify anomalies in the databases. Our analysis documents evolutionary relationships among the dynein subunits of mammals and other model organisms, and sheds new light on the role of this diverse group of proteins, highlighting the existence of two cytoplasmic dynein complexes with distinct cellular roles.

    Overexpression of P70 S6 kinase protein is associated with increased risk of locoregional recurrence in node-negative premenopausal early breast cancer patients

    The RPS6KB1 gene is amplified and overexpressed in approximately 10% of breast carcinomas and has been found to be associated with poor prognosis. We studied the prognostic significance of P70 S6 kinase protein (PS6K) overexpression in a series of 452 node-negative premenopausal early-stage breast cancer patients (median follow-up: 10.8 years). Immunohistochemistry was used to assess PS6K expression in the primary tumour, which had previously been analysed for a panel of established prognostic factors in breast cancer. In a univariate analysis, PS6K overexpression was associated with worse distant disease-free survival as well as impaired locoregional control (HR 1.80, P = 0.025 and HR 2.50, P = 0.006, respectively). In a multivariate analysis including other prognostic factors, PS6K overexpression remained an independent predictor of poor locoregional control (RR 2.67, P = 0.003). To our knowledge, P70 S6 kinase protein is the first oncogenic marker that has prognostic impact on locoregional control, and it may therefore have clinical implications for determining the local treatment strategy in early-stage breast cancer patients.