164 research outputs found

    Development and implementation of an algorithm for detection of protein complexes in large interaction networks

    BACKGROUND: After complete sequencing of a number of genomes, the focus has now turned to proteomics. Advanced proteomics technologies such as the two-hybrid assay and mass spectrometry are producing huge data sets of protein-protein interactions that can be portrayed as networks, and one of the burning issues is to find protein complexes in such networks. The enormous size of protein-protein interaction (PPI) networks warrants development of efficient computational methods for extraction of significant complexes. RESULTS: This paper presents an algorithm for detection of protein complexes in large interaction networks. In a PPI network, a node represents a protein and an edge represents an interaction. The input to the algorithm is the associated matrix of an interaction network and the outputs are protein complexes. The complexes are determined by finding clusters, i.e. the densely connected regions in the network. We also show and analyze some protein complexes generated by the proposed algorithm from typical PPI networks of Escherichia coli and Saccharomyces cerevisiae. A comparison between a PPI and a random network is also performed in the context of the proposed algorithm. CONCLUSION: The proposed algorithm makes it possible to detect clusters of proteins in PPI networks which mostly represent molecular biological functional units. Therefore, protein complexes determined solely based on interaction data can help us to predict the functions of proteins, and they are also useful to understand and explain certain biological processes
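The abstract above describes complexes as densely connected regions of a network given as a matrix. As a minimal sketch of what "densely connected" can mean, the snippet below computes the edge density of a candidate group of proteins from a symmetric 0/1 adjacency matrix. It is not the paper's algorithm: the function name `cluster_density` and the toy matrix are assumptions made purely for illustration.

```python
from itertools import combinations


def cluster_density(adj, members):
    """Fraction of possible edges that are present among `members`.

    `adj` is a symmetric 0/1 adjacency matrix (list of lists);
    `members` is an iterable of node indices.
    """
    members = list(members)
    n = len(members)
    if n < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(members, 2) if adj[u][v])
    return edges / (n * (n - 1) / 2)


# Hypothetical toy network: proteins 0-3 form a dense group,
# protein 4 is attached only to protein 3.
adj = [
    [0, 1, 1, 1, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
]

dense = cluster_density(adj, [0, 1, 2, 3])  # 5 of 6 possible edges
whole = cluster_density(adj, range(5))      # 6 of 10 possible edges
```

A cluster-finding method would typically search for node sets whose density stays above some chosen cutoff. The cutoff and the search strategy are design choices that the abstract does not specify.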


    Jamu is an Indonesian herbal medicine made from a mixture of several plants. Nowadays, many jamu products are produced commercially by many companies in Indonesia, and each producer may have its own formula. One thing is certain, however: the efficacy of jamu is determined by the composition of the plants used. It is therefore of interest to model the plant ingredients of jamu and use the model to predict efficacy. In this analysis, Partial Least Squares Discriminant Analysis (PLSDA) is used to model jamu ingredients and predict efficacy. The results show that using the PLSDA prediction y_ij directly yields a lower False Positive Rate (FPR) in predicting the efficacy group than first using it to calculate the probability that jamu i belongs to efficacy j and then predicting efficacy from that probability. Keywords: Jamu, PLSD

    The entire organization of transcription units on the Bacillus subtilis genome

    Background: In the post-genomic era, comprehension of cellular processes and systems requires global and non-targeted approaches to handle vast amounts of biological information. Results: The present study predicts transcription units (TUs) in Bacillus subtilis, based on an integrated approach involving DNA sequence and transcriptome analyses. First, co-expressed gene clusters are predicted by calculating the Pearson correlation coefficients of adjacent genes for all the genes in a series that are transcribed in the same direction with no intervening gene transcribed in the opposite direction. Transcription factor (TF) binding sites are then predicted by detecting statistically significant TF binding sequences on the genome using a position weight matrix. This matrix is a convenient way to identify sites that are more highly conserved than others in the entire genome because any sequence that differs from a consensus sequence has a lower score. We identify genes regulated by each of the TFs by comparing gene expression between wild-type and TF mutants using a one-sided test. By applying the integrated approach to 11 σ factors and 17 TFs of B. subtilis, we are able to identify fewer candidates for genes regulated by the TFs than were identified using any single approach, and also detect the known TUs efficiently. Conclusion: This integrated approach is, therefore, an efficient tool for narrowing searches for candidate genes regulated by TFs, identifying TUs, and estimating roles of the σ factors and TFs in cellular processes and functions of genes composing the TUs.
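The first step described above, grouping adjacent genes transcribed in the same direction by the Pearson correlation of their expression profiles, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the gene names, the expression values, the `threshold=0.8` cutoff, and the function names `pearson` and `coexpressed_runs` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpressed_runs(genes, threshold=0.8):
    """Split genes (in genome order) into runs of co-expressed neighbours.

    `genes` is a list of (name, strand, expression_profile) tuples.
    A run is extended only while the next gene is on the same strand
    and correlates with its predecessor at or above `threshold`
    (an assumed cutoff, not a value taken from the study).
    """
    if not genes:
        return []
    runs, current = [], [genes[0]]
    for prev, gene in zip(genes, genes[1:]):
        same_strand = prev[1] == gene[1]
        if same_strand and pearson(prev[2], gene[2]) >= threshold:
            current.append(gene)
        else:
            runs.append([g[0] for g in current])
            current = [gene]
    runs.append([g[0] for g in current])
    return runs


# Hypothetical genes and expression profiles.
genes = [
    ("geneA", "+", [1.0, 2.0, 3.0, 4.0]),
    ("geneB", "+", [2.0, 4.1, 6.0, 8.2]),
    ("geneC", "-", [1.0, 2.0, 3.0, 4.0]),
    ("geneD", "-", [4.0, 1.0, 3.5, 0.5]),
]
runs = coexpressed_runs(genes)
```

Here `geneA` and `geneB` share a strand and are strongly correlated, so they form one candidate cluster. The strand change before `geneC` and the negative correlation between `geneC` and `geneD` each start a new run.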

    Resilience and rewiring of the passenger airline networks in the United States

    The air transportation network, a fundamental component of critical infrastructure, is formed from a collection of individual air carriers, each one with a methodically designed and engineered network structure. We analyze the individual structures of the seven largest passenger carriers in the USA and find that networks with dense interconnectivity, as quantified by large k-cores for high values of k, are extremely resilient to both targeted removal of airports (nodes) and random removal of flight paths (edges). Such networks stay connected and incur minimal increase in a heuristic travel time despite removal of a majority of nodes or edges. Similar results are obtained for targeted removal based on either node degree or centrality. We introduce network rewiring schemes that boost resilience to different levels of perturbation while preserving the total number of flight and gate requirements. Recent studies have focused on the asymptotic optimality of hub-and-spoke spatial networks under normal operating conditions, yet our results indicate that point-to-point architectures can be much more resilient to perturbations. Comment: 11 pages, 8 figures, replaced by the version to appear in Physical Review
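The k-core used above to quantify dense interconnectivity is the largest subgraph in which every node has at least k neighbours within the subgraph. A minimal sketch of the standard peeling procedure follows. The route map and the function name `k_core` are hypothetical and are not taken from the paper's data or code.

```python
def k_core(adj, k):
    """Return the node set of the k-core of an undirected graph.

    `adj` maps each node to the set of its neighbours. Nodes whose
    remaining degree falls below k are removed repeatedly until every
    surviving node has at least k surviving neighbours.
    """
    alive = set(adj)
    degree = {n: len(adj[n]) for n in adj}
    queue = [n for n in alive if degree[n] < k]
    while queue:
        n = queue.pop()
        if n not in alive:
            continue  # already removed via an earlier queue entry
        alive.discard(n)
        for m in adj[n]:
            if m in alive:
                degree[m] -= 1
                if degree[m] < k:
                    queue.append(m)
    return alive


# Hypothetical carrier network: four fully interconnected airports
# (A-D) plus a spoke chain A-E-F.
routes = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C"},
    "E": {"A", "F"},
    "F": {"E"},
}

core3 = k_core(routes, 3)
core1 = k_core(routes, 1)
```

In this toy example the 3-core is the fully interconnected group A-D. The spoke airports E and F drop out as soon as k exceeds 1, which gives a small-scale picture of hub-and-spoke versus point-to-point structure.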

    Metabolomics approach for determining growth-specific metabolites based on Fourier transform ion cyclotron resonance mass spectrometry

    Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR/MS) is the best MS technology for obtaining exact mass measurements owing to its great resolution and accuracy, and several outstanding FT-ICR/MS-based metabolomics approaches have been reported. A reliable annotation scheme is needed to deal with direct-infusion FT-ICR/MS metabolic profiling. Correlation analyses can help us not only uncover relations between the ions but also annotate the ions originating from identical metabolites (metabolite derivative ions). In the present study, we propose a procedure for metabolite annotation on direct-infusion FT-ICR/MS by taking into consideration the classification of metabolite-derived ions using correlation analyses. Integrated analysis based on isotope relations, fragmentation patterns from MS/MS analysis, co-occurring metabolites, and database searches (KNApSAcK and KEGG) makes it possible to annotate ions as metabolites and estimate cellular conditions based on metabolite composition. A total of 220 detected ions were classified into 174 metabolite derivative groups and 72 ions were assigned to candidate metabolites in the present work. Finally, metabolic profiling was able to distinguish between the growth stages with the aid of PCA. The model constructed using PLS regression for OD600 values as a function of metabolic profiles is very useful for identifying to what degree the ions contribute to the growth stages. Ten phospholipids which largely influence the constructed model are highly abundant in the cells. Our analyses reveal that global modification of those phospholipids occurs as E. coli enters the stationary phase. Thus, the integrated approach involving correlation analyses, metabolic profiling, and database searching is efficient for high-throughput metabolomics

    How and when should interactome-derived clusters be used to predict functional modules and protein function?

    Motivation: Clustering of protein–protein interaction networks is one of the most common approaches for predicting functional modules, protein complexes and protein functions. But, how well does clustering perform at these tasks

    An iterative network partition algorithm for accurate identification of dense network modules

    A key step in network analysis is to partition a complex network into dense modules. Currently, modularity is one of the most popular benefit functions used to partition network modules. However, recent studies suggested that it has an inherent limitation in detecting dense network modules. In this study, we observed that despite the limitation, modularity has the advantage of preserving the primary network structure of the undetected modules. Thus, we have developed a simple iterative Network Partition (iNP) algorithm to partition a network. The iNP algorithm provides a general framework in which any modularity-based algorithm can be implemented in the network partition step. Here, we tested iNP with three modularity-based algorithms: multi-step greedy (MSG), spectral clustering and Qcut. Compared with the original three methods, iNP achieved a significant improvement in the quality of network partition in a benchmark study with simulated networks, identified more modules with significantly better enrichment of functionally related genes in both yeast protein complex network and breast cancer gene co-expression network, and discovered more cancer-specific modules in the cancer gene co-expression network. As such, iNP should have a broad application as a general method to assist in the analysis of biological networks
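For reference, the modularity score that these partition methods optimise can be computed as below. This sketch uses the standard Newman-Girvan definition for an unweighted, undirected graph, Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside c, and d_c the total degree of nodes in c. It does not implement iNP, MSG, spectral clustering or Qcut; the function name `modularity` and the toy graph are assumptions for illustration.

```python
def modularity(edges, community):
    """Newman-Girvan modularity of a partition.

    `edges` is a list of (u, v) pairs of an undirected graph without
    duplicate edges; `community` maps each node to a community label.
    """
    m = len(edges)
    internal = {}    # community -> number of edges inside it
    degree_sum = {}  # community -> summed degree of its nodes
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
single = {n: "all" for n in range(6)}

q_split = modularity(edges, split)    # 2 * (3/7 - (7/14)**2) = 5/14
q_single = modularity(edges, single)  # 7/7 - (14/14)**2 = 0
```

Splitting the graph at the bridge scores higher than leaving it as one community, which is the behaviour a modularity-based partition step relies on.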

    Coexpression Analysis of Tomato Genes and Experimental Verification of Coordinated Expression of Genes Found in a Functionally Enriched Coexpression Module

    Gene-to-gene coexpression analysis is a powerful approach to infer the function of uncharacterized genes. Here, we report comprehensive identification of coexpression gene modules of tomato (Solanum lycopersicum) and experimental verification of coordinated expression of module member genes. On the basis of the gene-to-gene correlation coefficient calculated from 67 microarray hybridization data points, we performed a network-based analysis. This facilitated the identification of 199 coexpression modules. A gene ontology annotation search revealed that 75 out of the 199 modules are enriched with genes associated with common functional categories. To verify the coexpression relationships between module member genes, we focused on one module enriched with genes associated with the flavonoid biosynthetic pathway. A non-enzyme, non-transcription factor gene encoding a zinc finger protein in this module was overexpressed in S. lycopersicum cultivar Micro-Tom, and expression levels of flavonoid pathway genes were investigated. Flavonoid pathway genes included in the module were up-regulated in the plant overexpressing the zinc finger gene. This result demonstrates that coexpression modules, at least the ones identified in this study, represent actual transcriptional coordination between genes, and can facilitate the inference of tomato gene function

    Jerarca: Efficient Analysis of Complex Networks Using Hierarchical Clustering

    Background: Extracting useful information from complex biological networks is a major goal in many fields, especially in genomics and proteomics. We have shown in several works that iterative hierarchical clustering, as implemented in the UVCluster program, is a powerful tool to analyze many of those networks. However, the amount of computation time required to perform UVCluster analyses imposed significant limitations on its use. Methodology/Principal Findings: We describe the suite Jerarca, designed to efficiently convert networks of interacting units into dendrograms by means of iterative hierarchical clustering. Jerarca is divided into three main sections. First, weighted distances among units are computed using up to three different approaches: a more efficient version of UVCluster and two new, related algorithms called RCluster and SCluster. Second, Jerarca builds dendrograms based on those distances, using well-known phylogenetic algorithms, such as UPGMA or Neighbor-Joining. Finally, Jerarca provides optimal partitions of the trees using statistical criteria based on the distribution of intra- and intercluster connections. Outputs compatible with the phylogenetic software MEGA and the Cytoscape package are generated, allowing the results to be easily visualized. Conclusions/Significance: The four main advantages of Jerarca with respect to UVCluster are: 1) Improved speed of a novel UVCluster algorithm; 2) Additional, alternative strategies to perform iterative hierarchical clustering; 3) Automatic evaluatio