A new computational strategy for identifying essential proteins based on network topological properties and biological information
<div><p>Essential proteins are the proteins that are indispensable to the survival and development of an organism. Deleting a single essential protein will cause lethality or infertility. Identifying and analysing essential proteins are key to understanding the molecular mechanisms of living cells. There are two types of methods for predicting essential proteins: experimental methods, which require considerable time and resources, and computational methods, which overcome the shortcomings of experimental methods. However, the prediction accuracy of computational methods for essential proteins requires further improvement. In this paper, we propose a new computational strategy named CoTB for identifying essential proteins based on a combination of topological properties, subcellular localization information and orthologous protein information. First, we introduce several topological properties of the protein-protein interaction (PPI) network. Second, we propose new methods for measuring orthologous information and subcellular localization and a new computational strategy that uses a random forest prediction model to obtain a probability score for the proteins being essential. Finally, we conduct experiments on four different <i>Saccharomyces cerevisiae</i> datasets. The experimental results demonstrate that our strategy for identifying essential proteins outperforms traditional computational methods and the most recently developed method, SON. In particular, our strategy improves the prediction accuracy to 89, 78, 79, and 85 percent on the YDIP, YMIPS, YMBD and YHQ datasets at the top 100 level, respectively.</p></div>
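The "top 100 level" figures above are commonly read as the fraction of truly essential proteins among the 100 highest-scoring candidates. The sketch below illustrates that reading on a toy example; the function name `precision_at_k`, the protein identifiers and the scores are all hypothetical and are not taken from the CoTB paper.

```python
def precision_at_k(scores, essential, k):
    """Fraction of the k highest-scoring proteins that are known to be essential.

    scores: dict mapping protein id -> predicted essentiality score
    essential: set of protein ids known to be essential
    """
    # Rank proteins from highest to lowest score and keep the first k.
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = ranked[:k]
    if not top:
        return 0.0
    hits = sum(1 for protein in top if protein in essential)
    return hits / len(top)


# Hypothetical scores and labels, for illustration only.
scores = {"P1": 0.9, "P2": 0.8, "P3": 0.4, "P4": 0.2}
essential = {"P1", "P3"}

print(precision_at_k(scores, essential, 2))
```

With real data, `scores` would hold one probability per protein (for example, from a random forest) and `k` would be set to 100.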
A New Method for Identifying Essential Proteins Based on Network Topology Properties and Protein Complexes
<div><p>Essential proteins are indispensable to the viability and reproduction of an organism. The identification of essential proteins is necessary not only for understanding the molecular mechanisms of cellular life but also for disease diagnosis, medical treatments and drug design. Many computational methods have been proposed for discovering essential proteins, but the precision of the prediction of essential proteins remains to be improved. In this paper, we propose a new method, LBCC, which is based on the combination of local density, betweenness centrality (BC) and in-degree centrality of complex (IDC). First, we introduce the common centrality measures; second, we propose the densities <i>Den</i><sub>1</sub>(<i>v</i>) and <i>Den</i><sub>2</sub>(<i>v</i>) of a node <i>v</i> to describe its local properties in the network; and finally, the combined strategy of <i>Den</i><sub>1</sub>, <i>Den</i><sub>2</sub>, BC and IDC is developed to improve the prediction precision. The experimental results demonstrate that LBCC outperforms traditional topological measures for predicting essential proteins, including degree centrality (DC), BC, subgraph centrality (SC), eigenvector centrality (EC), network centrality (NC), and the local average connectivity-based method (LAC). LBCC also improves the prediction precision by approximately 10 percent on the YMIPS and YMBD datasets compared to the most recently developed method, LIDC.</p></div>
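Two of the baseline measures named above, degree centrality (DC) and betweenness centrality (BC), can be sketched on a small undirected network. This is a minimal illustration using Brandes' algorithm for unweighted graphs. It does not implement LBCC, <i>Den</i><sub>1</sub>, <i>Den</i><sub>2</sub> or IDC, whose definitions are not given in the abstract, and the five-node network is hypothetical.

```python
from collections import deque


def degree_centrality(adj):
    """Unnormalized degree: number of interaction partners per node."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair is visited from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical toy network with edges A-B, B-C, C-D and B-E.
adj = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["B"],
}

dc = degree_centrality(adj)
bc = betweenness_centrality(adj)
print(dc)
print(bc)
```

In this toy network node B has the highest value for both measures, since every shortest path from A or E to the rest of the network passes through it.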
Predicting protein complexes using a supervised learning method combined with local structural information
<div><p>The existing protein complex detection methods can be broadly divided into two categories: unsupervised and supervised learning methods. Most of the unsupervised learning methods assume that protein complexes are in dense regions of protein-protein interaction (PPI) networks even though many true complexes are not dense subgraphs. Supervised learning methods utilize the informative properties of known complexes; they often extract features from existing complexes and then use the features to train a classification model. The trained model is used to guide the search process for new complexes. However, insufficient extracted features, noise in the PPI data and the incompleteness of complex data make the classification model imprecise. Consequently, the classification model is not sufficient for guiding the detection of complexes. Therefore, we propose a new robust score function that combines the classification model with local structural information. Based on the score function, we provide a search method that works both forwards and backwards. The results from experiments on six benchmark PPI datasets and three protein complex datasets show that our approach can achieve better performance compared with the state-of-the-art supervised, semi-supervised and unsupervised methods for protein complex detection, occasionally significantly outperforming such methods.</p></div>
Jackknife curves of CoTB and the other eight methods for the HDIP network.
MOESM1 of Light/dark cycle enhancement and energy consumption of tubular microalgal photobioreactors with discrete double inclined ribs
Additional file 1: Figure S1. Independent validation of the maximum calculating time and verification of the tracked particle number: (a) the number of particles that escaped from the outlet of the PBR under different maximum calculating times with 1000 particles released, and (b) the impact of the number of released particles on f<sub>av</sub> when the maximum calculating time is 10 s.
PR curves of LBCC and the other seven previously proposed methods for the YDIP network.
PR curves of CoTB and the other methods for the YDIP network.
PR curves of LBCC and the other seven previously proposed methods for the YMIPS network.
The retromer complex predicted by ClusterSS.
<p>The red nodes represent the proteins in the true complex that are detected by the algorithm, the green nodes represent the proteins in the true complex that are not detected by the algorithm, and the blue nodes represent the proteins that do not belong to the true complex that are detected by the algorithm.</p>
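The three node colours in this caption correspond to set operations on the true complex and the predicted complex. The sketch below shows that correspondence; the function name `colour_nodes` and the protein identifiers are hypothetical placeholders, not the actual retromer subunits or the ClusterSS output.

```python
def colour_nodes(true_complex, predicted_complex):
    """Assign each protein the colour used in the caption's legend."""
    colours = {}
    for protein in true_complex & predicted_complex:
        colours[protein] = "red"    # in the true complex and detected
    for protein in true_complex - predicted_complex:
        colours[protein] = "green"  # in the true complex but not detected
    for protein in predicted_complex - true_complex:
        colours[protein] = "blue"   # detected but not in the true complex
    return colours


# Hypothetical placeholder identifiers, for illustration only.
true_complex = {"p1", "p2", "p3", "p4"}
predicted_complex = {"p1", "p2", "p3", "p5"}

colours = colour_nodes(true_complex, predicted_complex)
print(sorted(colours.items()))
```

Counting the red, green and blue nodes gives the true positives, false negatives and false positives of the prediction, respectively.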