
    Information flow in interaction networks II: channels, path lengths and potentials

    In our previous publication, a framework for information flow in interaction networks based on random walks with damping was formulated with two fundamental modes: emitting and absorbing. While many other network analysis methods based on random walks or equivalent notions have been developed before and after our earlier work, one can show that they can all be mapped to one of the two modes. In addition to these two fundamental modes, a major strength of our earlier formalism was its accommodation of context-specific directed information flow that yielded plausible and meaningful biological interpretation of protein functions and pathways. However, the directed flow from origins to destinations was induced via a heuristic potential function. Here, we extend our earlier work on directed information flow with a theoretically sound approach called the channel mode. This is achieved by constructing a potential function that facilitates a purely probabilistic interpretation of the channel mode. For each network node, the channel mode combines the solutions of emitting and absorbing modes in the same context, producing what we call a channel tensor. The entries of the channel tensor at each node can be interpreted as the amount of flow passing through that node from an origin to a destination. As in our earlier model, the channel mode includes damping as a free parameter that controls the locality of information flow. Through examples involving the yeast pheromone response pathway, we illustrate the versatility and stability of our new framework. Comment: Minor changes from v3. 30 pages, 7 figures. Plain LaTeX format. This version contains some additional material compared to the journal submission: two figures, one appendix and a few paragraph
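
The absorbing mode mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `absorption_probabilities`, the adjacency-dictionary input, and the uniform choice among neighbours are all assumptions made for illustration. It estimates, for each node, the probability that a random walk damped by a factor `damping` at every step reaches a destination (sink) node.

```python
def absorption_probabilities(neighbors, sinks, damping, sweeps=1000):
    """Probability that a damped random walk from each node reaches a sink.

    neighbors: dict mapping each node to a list of adjacent nodes
               (hypothetical input format, unweighted edges assumed).
    sinks:     set of destination nodes where the walk is absorbed.
    damping:   per-step survival probability, 0 < damping <= 1.
    """
    # Sinks are absorbed with certainty; all other nodes start at zero.
    prob = {node: (1.0 if node in sinks else 0.0) for node in neighbors}
    # Repeatedly apply F(i) = damping * mean of F over the neighbours of i
    # (Gauss-Seidel style sweeps over the non-sink nodes).
    for _ in range(sweeps):
        for node, nbrs in neighbors.items():
            if node in sinks:
                continue
            prob[node] = damping * sum(prob[n] for n in nbrs) / len(nbrs)
    return prob


# Toy path graph A - B - C with C as the only destination.
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
result = absorption_probabilities(toy_graph, {"C"}, damping=0.5)
# With damping 0.5, A is about 0.1429 (1/7) and B about 0.2857 (2/7).
print(result)
```

Lowering `damping` makes the walk more local: nodes far from the destination receive smaller values, which matches the abstract's description of damping as a control on the locality of information flow.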

    Information Flow in Interaction Networks

    Interaction networks, consisting of agents linked by their interactions, are ubiquitous across many disciplines of modern science. Many methods for analyzing interaction networks have been proposed, mainly concentrating on node degree distribution or aiming to discover clusters of agents that are very strongly connected among themselves. These methods are principally based on graph theory or machine learning. We present a mathematically simple formalism for modelling context-specific information propagation in interaction networks based on random walks. The context is provided by selection of sources and destinations of information and by use of potential functions that direct the flow towards the destinations. We also use the concept of dissipation to model the aging of information as it diffuses from its source. Using examples from yeast protein-protein interaction networks and some of the histone acetyltransferases involved in control of transcription, we demonstrate the utility of the concepts and the mathematical constructs introduced in this paper. Comment: 30 pages, 5 figures. This paper was published in 2007 in Journal of Computational Biology. The version posted here does not include post peer-review change

    CytoITMprobe: a network information flow plugin for Cytoscape

    To give Cytoscape users the option of integrating ITM Probe into their workflows, we developed CytoITMprobe, a new Cytoscape plugin. CytoITMprobe maintains all the desirable features of ITM Probe and adds flexibility not achievable through its web service version. It provides access to ITM Probe either through a web server or locally. The input, consisting of a Cytoscape network, together with the desired origins and/or destinations of information and a dissipation coefficient, is specified through a query form. The results are shown as a subnetwork of significant nodes and several summary tables. Users can control the composition and appearance of the subnetwork and exchange their ITM Probe results with other software tools through tab-delimited files. The main strength of CytoITMprobe is its flexibility. It allows the user to specify as input any Cytoscape network, rather than being restricted to the pre-compiled protein-protein interaction networks available through the ITM Probe web service. Users may supply their own edge weights and directionalities. Consequently, unlike the ITM Probe web service, CytoITMprobe can be applied to many other domains of network-based research beyond protein networks. It also enables seamless integration of ITM Probe results with other Cytoscape plugins having complementary functionality for data analysis. Comment: 16 pages, 6 figures. Version

    Chemomechanical coupling and motor cycles of the molecular motor myosin V


    Learning from Partially Labeled Data: Unsupervised and Semi-supervised Learning on Graphs and Learning with Distribution Shifting

    This thesis focuses on two fundamental machine learning problems: unsupervised learning, where no label information is available, and semi-supervised learning, where a small number of labels is given in addition to unlabeled data. These problems arise in many real-world applications, such as Web analysis and bioinformatics, where a large amount of data is available, but no or only a small amount of labeled data exists. Obtaining classification labels in these domains is usually quite difficult because it involves either manual labeling or physical experimentation. This thesis approaches these problems from two perspectives: graph based and distribution based. First, I investigate a series of graph based learning algorithms that are able to exploit information embedded in different types of graph structures. These algorithms allow label information to be shared between nodes in the graph, ultimately communicating information globally to yield effective unsupervised and semi-supervised learning. In particular, I extend existing graph based learning algorithms, currently based on undirected graphs, to more general graph types, including directed graphs, hypergraphs and complex networks. These richer graph representations allow one to more naturally capture the intrinsic data relationships that exist, for example, in Web data, relational data, bioinformatics and social networks. For each of these generalized graph structures I show how information propagation can be characterized by distinct random walk models, and then use this characterization to develop new unsupervised and semi-supervised learning algorithms. Second, I investigate a more statistically oriented approach that explicitly models a learning scenario where the training and test examples come from different distributions. This is a difficult situation for standard statistical learning approaches, since they typically incorporate an assumption that the distributions for training and test sets are similar, if not identical. To achieve good performance in this scenario, I utilize unlabeled data to correct the bias between the training and test distributions. A key idea is to produce resampling weights for bias correction by working directly in a feature space and bypassing the problem of explicit density estimation. The technique can be easily applied to many different supervised learning algorithms, automatically adapting their behavior to cope with distribution shifting between training and test data

    How kinesin waits for ATP affects the nucleotide and load dependence of the stepping kinetics

    Dimeric molecular motors walk on polar tracks by binding and hydrolyzing one ATP per step. Despite tremendous progress, the waiting state for ATP binding in the well-studied kinesin, which walks on microtubules (MT), remains controversial. One experiment suggests that in the waiting state both heads are bound to the MT, while another shows that ATP binds to the leading head after the partner head detaches. To discriminate between these two scenarios, we developed a theory to calculate accurately several experimentally measurable quantities as a function of ATP concentration and resistive force. In particular, we predict that measurement of the randomness parameter could discriminate between the two scenarios for the waiting state of kinesin, thereby resolving this standing controversy

    Retinal Vascular Network Topology Reconstruction and Artery/Vein Classification via Dominant Set Clustering

    The estimation of vascular network topology in complex networks is important in understanding the relationship between vascular changes and a wide spectrum of diseases. Automatic classification of the retinal vascular trees into arteries and veins is of direct assistance to the ophthalmologist in terms of diagnosis and treatment of eye disease. However, it is challenging due to their projective ambiguity and subtle changes in appearance, contrast and geometry in the imaging process. In this paper, we propose a novel method that is capable of making the artery/vein (A/V) distinction in retinal color fundus images based on vascular network topological properties. To this end, we adapt the concept of dominant set clustering and formalize the retinal blood vessel topology estimation and the A/V classification as a pairwise clustering problem. The graph is constructed through image segmentation, skeletonization and identification of significant nodes. The edge weight is defined as the inverse Euclidean distance between its two end points in the feature space of intensity, orientation, curvature, diameter, and entropy. The reconstructed vascular network is classified into arteries and veins based on their intensity and morphology. The proposed approach has been applied to five public databases, INSPIRE, IOSTAR, VICAVR, DRIVE and WIDE, and achieved high accuracies of 95.1%, 94.2%, 93.8%, 91.1%, and 91.0%, respectively. Furthermore, we have made manual annotations of the blood vessel topologies for INSPIRE, IOSTAR, VICAVR, and DRIVE datasets, and these annotations are released for public access so as to facilitate researchers in the community

    Fractals in the Nervous System: conceptual Implications for Theoretical Neuroscience

    This essay is presented with two principal objectives in mind: first, to document the prevalence of fractals at all levels of the nervous system, giving credence to the notion of their functional relevance; and second, to draw attention to the as yet still unresolved issues of the detailed relationships among power law scaling, self-similarity, and self-organized criticality. As regards criticality, I will document that it has become a pivotal reference point in Neurodynamics. Furthermore, I will emphasize the not yet fully appreciated significance of allometric control processes. For dynamic fractals, I will assemble reasons for attributing to them the capacity to adapt task execution to contextual changes across a range of scales. The final Section consists of general reflections on the implications of the reviewed data, and identifies what appear to be issues of fundamental importance for future research in the rapidly evolving topic of this review

    Methods, tools, and computational environment for network-based analysis of biological data

    Cancer currently affects more than 18 million people worldwide annually. It is a leading cause of death, and so far a cure rate of only 60% can be reached within the most developed health care systems. The nature of cancer was a mystery for centuries, until discoveries during recent decades shed light on the underlying molecular events. This depended on progress in understanding cell and tissue biology and on developments in molecular and -omics technologies. Cancer has since emerged as a highly heterogeneous disease, albeit with some very basic mechanistic features common to all cancers. To deal with the complexity of causes and consequences of pathological changes in the molecular machinery, methods and tools of network analysis can be helpful. The complexity of this task requires easy-to-use tools that allow researchers and clinicians with no background in computer science to perform network analysis. Paper I describes a web-based framework for network enrichment analysis (NEA), using a previously developed algorithm and code. The platform lets a researcher use data pre-downloaded from various popular databases as well as their own data, perform NEA and obtain statistical estimates, and export results in different formats for publication or further use in a research pipeline. Paper II presents the development of another web server, which provided vast opportunities for exploration and integrated analysis of multiple public cancer datasets that describe in vitro and in vivo sample collections. The web server linked molecular data at the single-gene level, phenotype and pharmacological response variables, as well as pathway-level variables calculated with NEA and connected to the framework presented in Paper I. Researchers can use the platform to create multivariate models based on raw or pre-processed data from various sources, visualize the models, estimate and compare their performance, and export models for further use in their own research environments. Paper III demonstrates NEAdriver, a practical application of NEA to probabilistic evaluation of driver roles of mutations reported in ten cancer cohorts. NEAdriver results are compared with cancer gene sets produced by other methods, both network-based and network-free. The paper demonstrated that NEA can be used directly to discover novel driver genes as well as in combination with other methods. To demonstrate the benefits of using NEA, some rare cancer types and types with low mutation burden were used. Paper IV is a manuscript evaluating the performance of the most representative methods of network analysis across methods’ parameters, functional ontologies and network versions. This study emphasizes discovery of novel functional associations for known genes, as opposed to previous tests dominated by a few “gold standard” genes that were already well characterized. We performed the analysis in the context of various topological properties of networks, pathways of interest, and genes. It employed both existing, widely used topological metrics and a number of novel ones developed for this analysis