DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences

Author: Keum Jongsoo
Lee Ingoo
Nam Hojung
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 05/11/2018

Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico-based DTI prediction approaches. In several computational models, conventional protein descriptors are shown to be not informative enough to predict accurate DTIs. Thus, in this study, we employ a convolutional neural network (CNN) on raw protein sequences to capture local residue patterns participating in DTIs. With CNN on protein sequences, our model performs better than previous protein descriptor-based models. In addition, our model performs better than the previous deep learning model for massive prediction of DTIs. By examining the pooled convolution results, we found that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches. Comment: 26 pages, 7 figures
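The idea of convolving over a raw protein sequence and then inspecting the pooled results can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the one-hot encoding, the single hand-built filter, and the function names `one_hot` and `conv_global_max_pool` are assumptions made for illustration only. It shows how global max pooling keeps the strongest response of each filter, and how the position of that maximum points back to a local residue pattern.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as a (length, 20) one-hot matrix."""
    encoded = np.zeros((len(sequence), len(AMINO_ACIDS)))
    for pos, aa in enumerate(sequence):
        encoded[pos, AA_INDEX[aa]] = 1.0
    return encoded


def conv_global_max_pool(encoded, filters):
    """Slide each filter along the sequence and keep its maximum response.

    encoded: (length, 20) matrix; filters: (n_filters, window, 20) array.
    The sequence must be at least `window` residues long.
    Returns (pooled, argmax): pooled[f] is the strongest ReLU response of
    filter f, and argmax[f] is the start position of the window producing it.
    """
    n_filters, window, _ = filters.shape
    n_windows = encoded.shape[0] - window + 1
    responses = np.empty((n_filters, n_windows))
    for start in range(n_windows):
        patch = encoded[start:start + window]
        # Dot product of every filter with the current window of residues.
        responses[:, start] = np.tensordot(filters, patch, axes=([1, 2], [0, 1]))
    responses = np.maximum(responses, 0.0)  # ReLU
    return responses.max(axis=1), responses.argmax(axis=1)


# Hypothetical filter that responds to the two-residue motif "WY".
motif_filter = np.zeros((1, 2, 20))
motif_filter[0, 0, AA_INDEX["W"]] = 1.0
motif_filter[0, 1, AA_INDEX["Y"]] = 1.0

pooled, position = conv_global_max_pool(one_hot("ACDWYGH"), motif_filter)
```

In this toy example the pooled value is the filter's response at the window covering "WY", and `position` gives the index where that window starts. A trained model would learn many filters of several window sizes rather than use a hand-written one.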

arXiv.org e-Print Archive

Directory of Open Access Journals

Prediction of protein-protein interaction types using association rule based classification

Author: Gilbert D
Kim JW
Kim S
Park SH
Reyes JA
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

This article has been made available through the Brunel Open Access Publishing Fund - Copyright @ 2009 Park et al. Background: Protein-protein interactions (PPI) can be classified according to their characteristics into, for example, obligate or transient interactions. The identification and characterization of these PPI types may help in the functional annotation of new protein complexes and in the prediction of protein interaction partners by knowledge-driven approaches. Results: This work addresses pattern discovery of the interaction sites for four different interaction types in order to characterize them, and uses the discovered patterns for the prediction of PPI types employing Association Rule Based Classification (ARBC), which includes association rule generation and posterior classification. We incorporated domain information from protein complexes in SCOP proteins and identified 354 domain-interaction sites. 14 interface properties were calculated from amino acid and secondary structure composition and then used to generate a set of association rules characterizing these domain-interaction sites employing the APRIORI algorithm. Our results regarding the classification of PPI types based on a set of discovered association rules show that the discriminative ability of association rules can significantly impact the prediction power of classification models. We also showed that the accuracy of the classification can be improved through the use of structural domain information and also the use of secondary structure content. Conclusion: The advantage of our approach is that we can extract biologically significant information from the interpretation of the discovered association rules in terms of understandability and interpretability of rules. A web application based on our method can be found at http://bioinfo.ssu.ac.kr/~shpark/picasso/. SHP was supported by the Korea Research Foundation Grant funded by the Korean Government (KRF-2005-214-E00050).
JAR has been supported by the Programme Alβan, the European Union Programme of High level Scholarships for Latin America, scholarship E04D034854CL. SK was supported by Soongsil University Research Fund.
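For readers unfamiliar with association rule mining, the sketch below shows the general level-wise Apriori procedure and how the confidence of a rule is derived from itemset supports. It is a generic illustration, not the ARBC pipeline of the paper: the feature labels ("high_helix", "hydrophobic") and class labels ("obligate", "transient") are hypothetical stand-ins for the 14 interface properties, and the function names are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    size = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        size += 1
        # Join step: unite frequent itemsets into candidates one item larger.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


def rule_confidence(frequent, antecedent, consequent):
    """Confidence of antecedent -> consequent, from frequent-itemset supports."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical interaction sites described by discretised properties and a type.
sites = [
    {"high_helix", "hydrophobic", "obligate"},
    {"high_helix", "hydrophobic", "obligate"},
    {"high_helix", "transient"},
    {"hydrophobic", "obligate"},
]
frequent = apriori(sites, min_support=0.5)
confidence = rule_confidence(frequent, {"hydrophobic"}, {"obligate"})
```

Rules whose consequent is a class label and whose confidence is high can then be used to classify new sites, which is the general idea behind rule-based classification.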

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Brunel University Research Archive

Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints

Author: Greener Joe G
Jones David T
Kandathil Shaun M
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/09/2019

The inapplicability of amino acid covariation methods to small protein families has limited their use for structural annotation of whole genomes. Recently, deep learning has shown promise in allowing accurate residue-residue contact prediction even for shallow sequence alignments. Here we introduce DMPfold, which uses deep learning to predict inter-atomic distance bounds, the main chain hydrogen bond network, and torsion angles, which it uses to build models in an iterative fashion. DMPfold produces more accurate models than two popular methods for a test set of CASP12 domains, and works just as well for transmembrane proteins. Applied to all Pfam domains without known structures, confident models for 25% of these so-called dark families were produced in under a week on a small 200 core cluster. DMPfold provides models for 16% of human proteome UniProt entries without structures, generates accurate models with fewer than 100 sequences in some cases, and is freely available. Comment: JGG and SMK contributed equally to the work

arXiv.org e-Print Archive

UCL Discovery

Exact and efficient top-K inference for multi-target prediction by querying separable linear relational models

Author: De Baets Bernard
Dembczynski Krzysztof
Stock Michiel
Waegeman Willem
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Many complex multi-target prediction problems that concern large target spaces are characterised by a need for efficient prediction strategies that avoid the computation of predictions for all targets explicitly. Examples of such problems emerge in several subfields of machine learning, such as collaborative filtering, multi-label classification, dyadic prediction and biological network inference. In this article we analyse efficient and exact algorithms for computing the top-K predictions in the above problem settings, using a general class of models that we refer to as separable linear relational models. We show how to use those inference algorithms, which are modifications of well-known information retrieval methods, in a variety of machine learning settings. Furthermore, we study the possibility of scoring items incompletely, while still retaining an exact top-K retrieval. Experimental results in several application domains reveal that the so-called threshold algorithm is very scalable, often performing many orders of magnitude more efficiently than the naive approach.
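The threshold algorithm mentioned in this abstract can be sketched as follows, assuming that each item is scored by a dot product between a query vector `u` and that item's feature row in a matrix `T`. That scoring form, the function name `top_k_threshold`, and the data are assumptions for illustration, not the paper's exact formulation. The sketch walks per-dimension sorted lists in parallel and stops as soon as no unseen item can still enter the top K.

```python
import heapq

import numpy as np


def top_k_threshold(u, T, k):
    """Exact top-k items by score T @ u without scoring every item.

    u: (R,) query vector; T: (n_items, R) item feature matrix.
    Returns (results, n_scored): results is a list of (score, item) pairs,
    best first; n_scored counts the items whose full score was computed.
    """
    n_items, n_dims = T.shape
    # One list per dimension, sorted by decreasing contribution u[r] * T[i, r].
    order = [np.argsort(-(u[r] * T[:, r])) for r in range(n_dims)]
    heap = []  # min-heap holding the k best (score, item) pairs so far
    seen = set()
    for depth in range(n_items):
        threshold = 0.0
        for r in range(n_dims):
            item = int(order[r][depth])
            threshold += u[r] * T[item, r]
            if item not in seen:
                seen.add(item)
                score = float(T[item] @ u)
                if len(heap) < k:
                    heapq.heappush(heap, (score, item))
                elif score > heap[0][0]:
                    heapq.heapreplace(heap, (score, item))
        # Any unseen item lies deeper in every list, so its score cannot
        # exceed the threshold; stop once the k-th best score reaches it.
        if len(heap) == k and heap[0][0] >= threshold:
            break
    return sorted(heap, reverse=True), len(seen)


# Toy data: five items with two features each.
T = np.array([[0.9, 0.8], [0.1, 0.2], [0.8, 0.1], [0.2, 0.9], [0.5, 0.5]])
u = np.array([1.0, 1.0])
results, n_scored = top_k_threshold(u, T, k=1)
```

On this toy input the best item is found after fully scoring only three of the five items. The naive alternative computes `T @ u` for every item and sorts, which is what the threshold test avoids when the target space is large.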

arXiv.org e-Print Archive

Ghent University Academic Bibliography

Proteomic study of the membrane components of signalling cascades of Botrytis cinerea controlled by phosphorylation

Author: Amil Francisco
Blanco-Ulate Barbara
Cantoral Fernández Jesús Manuel
Carrasco Rafael
Chiva Cristina
Escobar Niño Almudena
Fernández Acero Francisco Javier
Fuentes Carlos
Lineiro Eva
Sabido Eduard
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Protein phosphorylation and membrane proteins play an important role in the infection of plants by phytopathogenic fungi, given their involvement in signal transduction cascades. Botrytis cinerea is a well-studied necrotrophic fungus taken as a model organism in fungal plant pathology, given its broad host range and adverse economic impact. To elucidate relevant events during infection, several proteomics analyses have been performed in B. cinerea, but they cover only 10% of the total proteins predicted in the genome database of this fungus. To increase coverage, we analysed by LC-MS/MS the first-reported overlapped proteome in phytopathogenic fungi, the “phosphomembranome” of B. cinerea, combining the two most important signal transduction subproteomes. Of the 1112 membrane-associated phosphoproteins identified, 64 and 243 were classified as exclusively identified or overexpressed under glucose and deproteinized tomato cell wall conditions, respectively. Seven proteins were found under both conditions, but these presented a specific phosphorylation pattern, so they were considered as exclusively identified or overexpressed proteins. From bioinformatics analysis, those differences in the membrane-associated phosphoproteins composition were associated with various processes, including pyruvate metabolism, unfolded protein response, oxidative stress response, autophagy and cell death. Our results suggest these proteins play a significant role in the B. cinerea pathogenic cycle.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

eScholarship - University of California

UPF Digital Repository

Repositorio de Objetos de Docencia e Investigación de la Universidad de Cádiz

Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

Author: Du Yushen
Gong Danyang
Jiang Lin
Shu Sara
Sun Ren
Wu Nicholas C
Wu Ting-Ting
Zhang Tianhao
Publication venue: eScholarship, University of California
Publication date: 01/11/2016

Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available.
Importance: To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available.

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California