GOPred: GO Molecular Function Prediction by Combined Classifiers
Functional protein annotation is an important matter for in vivo and in silico biology. Several computational methods have been proposed that make use of a wide range of features such as motifs, domains, homology, structure and physicochemical properties. No single method performs best in all functional classification problems, because the information obtained from any of these features depends on the function to be assigned to the protein. In this study, we present a novel approach that combines different methods to better represent protein function. First, we formulated the function annotation problem as a classification problem defined on 300 different Gene Ontology (GO) terms from the molecular function aspect. We presented a method to form positive and negative training examples that takes into account the directed acyclic graph (DAG) structure and evidence codes of GO. We applied three different methods and their combinations. Results show that combining different methods improves prediction accuracy in most cases. The proposed method, GOPred, is available as an online computational annotation tool (http://kinaz.fen.bilkent.edu.tr/gopred).
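The idea of forming training examples from the GO DAG and evidence codes can be sketched as below. This is an illustrative sketch only, not GOPred's published procedure. The rules are assumptions: proteins annotated with the target term or any descendant are positives, proteins annotated only with an ancestor are left out as possibly under-specified, and annotations with untrusted evidence codes are ignored. The function names and the toy term names are hypothetical.

```python
from collections import defaultdict

def descendants(term, children):
    """Return every term reachable from `term` by following child edges."""
    seen, stack = set(), [term]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def ancestors(term, children):
    """Return every term above `term`, found by inverting the child edges."""
    parents = defaultdict(set)
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].add(parent)
    return descendants(term, parents)

def training_sets(term, children, annotations, trusted_codes):
    """Split proteins into positives and negatives for one GO term.

    annotations: iterable of (protein, go_term, evidence_code) tuples.
    Assumed rule: ancestor-only proteins are excluded from both sets.
    """
    positive_terms = descendants(term, children) | {term}
    ambiguous_terms = ancestors(term, children)
    by_protein = defaultdict(set)
    for protein, go_term, code in annotations:
        if code in trusted_codes:  # drop annotations with untrusted evidence
            by_protein[protein].add(go_term)
    positives = {p for p, t in by_protein.items() if t & positive_terms}
    negatives = {p for p, t in by_protein.items()
                 if p not in positives and not (t & ambiguous_terms)}
    return positives, negatives
```

Excluding ancestor-only proteins is one conservative choice: such a protein might carry the more specific function without it having been recorded yet, so it is neither a safe positive nor a safe negative.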
Crowdsourced mapping of unexplored target space of kinase inhibitors
Despite decades of intensive search for compounds that modulate the activity of particular protein targets, a large proportion of the human kinome remains as yet undrugged. Effective approaches are therefore required to map the massive space of unexplored compound-kinase interactions for novel and potent activities. Here, we carry out a crowdsourced benchmarking of predictive algorithms for kinase inhibitor potencies across multiple kinase families tested on unpublished bioactivity data. We find the top-performing predictions are based on various models, including kernel learning, gradient boosting and deep learning, and their ensemble leads to a predictive accuracy exceeding that of single-dose kinase activity assays. We design experiments based on the model predictions and identify unexpected activities even for under-studied kinases, thereby accelerating experimental mapping efforts. The open-source prediction algorithms together with the bioactivities between 95 compounds and 295 kinases provide a resource for benchmarking prediction algorithms and for extending the druggable kinome. The IDG-DREAM Challenge carried out crowdsourced benchmarking of predictive algorithms for kinase inhibitor activities on unpublished data. This study provides a resource to compare emerging algorithms and prioritize new kinase activities to accelerate drug discovery and repurposing efforts.
The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens
Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
Prediction of enzyme classes in a hierarchical approach by using SPMap
Enzymes are proteins that play important roles in biochemical reactions as catalysts. They are classified by the reactions they catalyze in a hierarchical scheme defined by the International Enzyme Commission (EC). This scheme is expressed as a four-level tree structure, and a unique number is assigned to each enzyme class. There are six major classes at the top level, defined by the type of reaction catalyzed, and the sub-classes at the lower levels correspond to more specific reactions within these classes. The aim of this study was to build a three-level classification model based on the hierarchical structure of EC classes. The ENZYME database was used to extract information on EC classes, and enzymes were then assigned to these classes. Primary sequences of enzymes extracted from the UniProtKB/Swiss-Prot database were used to extract features. A subsequence-based feature extraction method, Subsequence Profile Map (SPMap), was used in this study. SPMap is a discriminative method that explicitly models the differences between positive and negative examples. SPMap considers the conserved subsequences of protein sequences in the same class. SPMap generates the feature vector of each sample protein as a probability of fixed-length subsequences of this protein with respect to a probabilistic profile matrix calculated by clustering similar subsequences in the training data set. In our case, positive and negative training datasets were prepared for each class at each level of the tree structure. SPMap was used for feature extraction and Support Vector Machines (SVMs) were used for classification. Five-fold cross-validation was used to test the performance of the system. The overall sensitivity, specificity and AUC values for the six major EC classes are 93.08%, 98.95% and 0.993, respectively. The results at the second and third levels were also comparable to those of the six major classes.
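Preparing positive and negative sets for each class at each level of the EC tree might look like the sketch below. The abstract does not say how negatives were chosen, so the rule used here is an assumption: negatives are enzymes that share the parent class but fall in a sibling class, and at the top level all other enzymes are negatives. The function names `ec_label` and `one_vs_rest` are hypothetical.

```python
def ec_label(ec_number, level):
    """Truncate an EC number such as '3.4.21.4' to its first `level` fields.

    Returns None when a needed field is a '-' placeholder, i.e. the enzyme
    is not classified that deeply.
    """
    fields = ec_number.split(".")
    if not 1 <= level <= len(fields):
        raise ValueError(f"level {level} is out of range for {ec_number!r}")
    prefix = fields[:level]
    if any(not field.isdigit() for field in prefix):
        return None
    return ".".join(prefix)

def one_vs_rest(enzymes, target, level):
    """Build positive and negative sets for one EC class at one tree level.

    enzymes: dict mapping protein id -> full EC number.
    target:  class label at `level`, e.g. '3.4' for level 2.
    """
    parent = target.rsplit(".", 1)[0] if level > 1 else None
    positives, negatives = set(), set()
    for protein, ec_number in enzymes.items():
        label = ec_label(ec_number, level)
        if label is None:
            continue  # not annotated down to this level
        if label == target:
            positives.add(protein)
        elif parent is None or ec_label(ec_number, level - 1) == parent:
            negatives.add(protein)  # sibling class under the same parent
    return positives, negatives
```

Restricting negatives to sibling classes suits a top-down classifier, where the level-2 model for class 3 only ever sees proteins already assigned to class 3 at level 1.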
Reconstruction of three dimensional models from real images
An image-based model reconstruction system is described. Real images of a rigid object acquired in a simple but controlled environment are used to recover the three-dimensional geometry and the surface appearance. Based on a multi-image calibration method, an algorithm to extract the rotation axis of a turntable has been developed. Furthermore, this can be extended to robustly estimate the initial bounding volume of the object to be modeled. The coarse volume obtained is then carved using a stereo correction method, which uses photoconsistency to remove the disadvantages of silhouette-based reconstruction. The concept of surface particles is adapted in order to extract a texture map for the model. Some metrics are defined to measure the quality of the reconstructed models.
Ore-age: An intelligent assisting and tutoring system for mining method selection
Mining method selection is among the most critical aspects of the mining engineering discipline, and past studies have attempted to build a systematic approach to it. However, these approaches rely on static databases and fail to bring the intuition and engineering judgment of experienced engineers into the selection process. In this study, a hybrid system based on 13 different expert systems and one interface agent is developed to select mining methods for given ore bodies. The learning procedure that brings the expertise of experienced engineers into the selection process is based on a neuro-fuzzy model, which combines the TSK model of fuzzy theory with a two-layered neural network trained with the back-propagation algorithm. In addition, to give users maximum assistance, the agent runs the system's tutoring procedure when an inexperienced user enters the system, to fill the gaps in his/her knowledge of mining method selection. The system being developed in this study can be introduced as the first example of dynamic, intelligent assisting and tutoring systems in the mining profession.
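For readers unfamiliar with the TSK (Takagi-Sugeno-Kang) model, the inference step can be sketched as follows. This is a generic first-order TSK sketch, not the Ore-age implementation. The Gaussian membership functions, the product rule for firing strengths, and the rule layout are assumptions. In a neuro-fuzzy setting, parameters such as the centers, widths and coefficients would be the quantities tuned by back-propagation.

```python
import math

def gaussian(x, center, width):
    """Gaussian membership degree of x in a fuzzy set (assumed shape)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def tsk_predict(x, rules):
    """First-order TSK inference for one input vector.

    rules: list of (centers, widths, coeffs, bias) tuples, one per rule.
    Each rule fires with the product of its per-input memberships and
    proposes the linear output bias + sum(coeffs[i] * x[i]).
    The result is the firing-strength-weighted average of those outputs.
    """
    numerator = denominator = 0.0
    for centers, widths, coeffs, bias in rules:
        strength = 1.0
        for xi, center, width in zip(x, centers, widths):
            strength *= gaussian(xi, center, width)
        output = bias + sum(a * xi for a, xi in zip(coeffs, x))
        numerator += strength * output
        denominator += strength
    # If no rule fires at all (all strengths underflow to 0), return 0.0.
    return numerator / denominator if denominator > 0 else 0.0
```

Because the output is a smooth function of every parameter, gradients can be passed through it, which is what makes the combination with a back-propagation-trained network workable.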
Computer vision based unistroke keyboards
In this paper we present a unistroke keyboard based on computer vision. The keyboard can be made of paper containing an image of the keyboard, which has an upside-down U shape. Each character is represented by a non-overlapping rectangular region. The user enters a character into the computer by covering the character region with a stylus. The actions of the user are captured by a camera and the covered key is recognized. During text entry the user need not raise the stylus from the keyboard, which leads to faster data entry rates. In a companion system, the user imitates writing on a surface using a pointer or a stylus. In this case, the trace of the pointer is analyzed and the characters are recognized. The character set of the continuous handwriting system is based on the Graffiti alphabet to achieve very high recognition rates.
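The key-recognition step can be illustrated with a small sketch. It assumes the stylus tip has already been located in each camera frame, which the abstract does not detail. The dwell rule (a key counts as entered once it has been covered for a set number of consecutive frames) is likewise an assumption for illustration, as are the names `key_at` and `decode`.

```python
def key_at(point, layout):
    """Return the character whose rectangle contains `point`, else None.

    layout: dict mapping character -> (left, top, right, bottom) in pixels.
    Rectangles are assumed not to overlap, as the keyboard design states.
    """
    x, y = point
    for char, (left, top, right, bottom) in layout.items():
        if left <= x < right and top <= y < bottom:
            return char
    return None

def decode(points, layout, dwell=3):
    """Turn per-frame stylus positions into text.

    A character is emitted once, at the moment the stylus has stayed on
    the same key for `dwell` consecutive frames (hypothetical rule).
    """
    text, current, run = [], None, 0
    for point in points:
        key = key_at(point, layout)
        if key == current:
            run += 1
        else:
            current, run = key, 1
        if key is not None and run == dwell:
            text.append(key)
    return "".join(text)
```

With this rule the stylus can slide across intermediate keys without raising it: keys crossed for fewer than `dwell` frames are never emitted.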