516 research outputs found
Finding undetected protein associations in cell signaling by belief propagation
External information propagates in the cell mainly through signaling cascades
and transcriptional activation, allowing it to react to a wide spectrum of
environmental changes. High throughput experiments identify numerous molecular
components of such cascades that may, however, interact through unknown
partners. Some of them may be detected using data coming from the integration
of a protein-protein interaction network and mRNA expression profiles. This
inference problem can be mapped onto the problem of finding appropriate optimal
connected subgraphs of a network defined by these datasets. The optimization
procedure turns out to be computationally intractable in general. Here we
present a new distributed algorithm for this task, inspired by statistical
physics, and apply this scheme to alpha factor and drug perturbations data in
yeast. We identify the role of the COS8 protein, a member of a gene family of
previously unknown function, and validate the results by genetic experiments.
The algorithm we present is especially suited for very large datasets, can run
in parallel, and can be adapted to other problems in systems biology. On
renowned benchmarks it outperforms other algorithms in the field.
Comment: 6 pages, 3 figures, 1 table, Supporting Information
SteinerNet: a web server for integrating ‘omic’ data to discover hidden components of response pathways
High-throughput technologies including transcriptional profiling, proteomics and reverse genetics screens provide detailed molecular descriptions of cellular responses to perturbations. However, it is difficult to integrate these diverse data to reconstruct biologically meaningful signaling networks. Previously, we have established a framework for integrating transcriptional, proteomic and interactome data by searching for the solution to the prize-collecting Steiner tree problem. Here, we present a web server, SteinerNet, to make this method available in a user-friendly format for a broad range of users with data from any species. At a minimum, a user only needs to provide a set of experimentally detected proteins and/or genes and the server will search for connections among these data from the provided interactomes for yeast, human, mouse, Drosophila melanogaster and Caenorhabditis elegans. More advanced users can upload their own interactome data as well. The server provides interactive visualization of the resulting optimal network and downloadable files detailing the analysis and results. We believe that SteinerNet will be useful for researchers who would like to integrate their high-throughput data for a specific condition or cellular response and to find biologically meaningful pathways. SteinerNet is accessible at http://fraenkel.mit.edu/steinernet.
Funding: National Institutes of Health (U.S.) (U54-CA112967); National Institutes of Health (U.S.) (R01-GM089903); National Science Foundation (Award Number DB1-0821391)
PCSF: An R-package for network-based interpretation of high-throughput data
With recent technological developments, a vast amount of high-throughput data has been profiled to understand the mechanisms of complex diseases. The current bioinformatics challenge is to interpret the data and the underlying biology, where efficient algorithms for analyzing heterogeneous high-throughput data using biological networks are becoming increasingly valuable. In this paper, we propose a software package based on the Prize-collecting Steiner Forest graph optimization approach. The PCSF package performs fast and user-friendly network analysis of high-throughput data by mapping the data onto biological networks such as protein-protein interaction, gene-gene interaction or any other correlation or coexpression based networks. Using the interaction networks as a template, it determines high-confidence subnetworks relevant to the data, which potentially leads to predictions of functional units. It also interactively visualizes the resulting subnetwork with functional enrichment analysis.
On the performance of a cavity method based algorithm for the Prize-Collecting Steiner Tree Problem on graphs
We study the behavior of an algorithm derived from the cavity method for the
Prize-Collecting Steiner Tree (PCST) problem on graphs. The algorithm is based
on the zero temperature limit of the cavity equations and as such is formally
simple (a fixed point equation resolved by iteration) and distributed
(parallelizable). We provide a detailed comparison with state-of-the-art
algorithms on a wide range of existing benchmark networks and random graphs.
Specifically, we consider an enhanced derivative of the Goemans-Williamson
heuristics and the DHEA solver, a Branch and Cut Linear/Integer Programming
based approach. The comparison shows that the cavity algorithm outperforms the
two algorithms in most large instances both in running time and quality of the
solution. Finally we prove a few optimality properties of the solutions
provided by our algorithm, including optimality under the two post-processing
procedures defined in the Goemans-Williamson derivative and global optimality
in some limit cases.
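As an illustration of what a PCST solver optimizes, here is a minimal sketch that assumes the standard formulation of the objective: the sum of the costs of the edges kept in the tree plus the sum of the prizes of the nodes left out. The function name, the toy graph and its numbers are hypothetical and are not taken from the paper; the sketch only evaluates a candidate solution and does not implement the cavity algorithm.

```python
def pcst_cost(tree_nodes, tree_edges, edge_cost, prize):
    """Evaluate a candidate PCST solution (assumed standard objective).

    Cost = sum of edge costs in the tree + sum of prizes of excluded nodes.
    The caller is trusted to pass a connected tree; this is not checked.
    """
    paid = sum(edge_cost[e] for e in tree_edges)
    forgone = sum(p for v, p in prize.items() if v not in tree_nodes)
    return paid + forgone


# Hypothetical toy instance: a path a - b - c with prizes at both ends.
prize = {"a": 5, "b": 0, "c": 5}
edge_cost = {("a", "b"): 2, ("b", "c"): 2}

# Connecting both prize nodes through b pays for two edges and forgoes nothing.
full_tree = pcst_cost({"a", "b", "c"}, [("a", "b"), ("b", "c")], edge_cost, prize)

# Keeping only node a pays for no edge but forgoes the prize of c.
single_node = pcst_cost({"a"}, [], edge_cost, prize)
```

In this toy instance the full path is the cheaper candidate; raising the cost of either edge above the prize it unlocks would reverse that, which is the trade-off the solvers compared in the abstract have to resolve on large graphs.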
Swimming Upstream: Identifying Proteomic Signals that Drive Transcriptional Changes using the Interactome and Multiple “-Omics” Datasets
Available in PMC 2013 December 23.
Signaling and transcription are tightly integrated processes that underlie many cellular responses to the environment. A network of signaling events, often mediated by post-translational modification on proteins, can lead to long-term changes in cellular behavior by altering the activity of specific transcriptional regulators and consequently the expression level of their downstream targets. As many high-throughput, “-omics” methods are now available that can simultaneously measure changes in hundreds of proteins and thousands of transcripts, it should be possible to systematically reconstruct cellular responses to perturbations in order to discover previously unrecognized signaling pathways.
This chapter describes a computational method for discovering such pathways that aims to compensate for the varying levels of noise present in these diverse data sources. Based on the concept of constraint optimization on networks, the method seeks to achieve two conflicting aims: (1) to link together many of the signaling proteins and differentially expressed transcripts identified in the experiments (“constraints”) using previously reported protein-protein and protein-DNA interactions, while (2) keeping the resulting network small and ensuring it is composed of the highest confidence interactions (“optimization”). A further distinctive feature of this approach is the use of transcriptional data as evidence of upstream signaling events that drive changes in gene expression, rather than as proxies for downstream changes in the levels of the encoded proteins.
We recently demonstrated that by applying this method to phosphoproteomic and transcriptional data from the pheromone response in yeast, we were able to recover functionally coherent pathways and to reveal many components of the cellular response that are not readily apparent in the original data. Here, we provide a more detailed description of the method, explore the robustness of the solution to the noise level of input data and discuss the effect of parameter values.
Funding: National Cancer Institute (U.S.) (NCI Grant U54-CA112967); Natural Sciences and Engineering Research Council of Canada (Postgraduate scholarship); Massachusetts Institute of Technology (Eugene Bell Career Development Chair); National Cancer Institute (U.S.) (NCI integrative cancer biology program graduate fellowship)
Density-Based Region Search with Arbitrary Shape for Object Localization
Region search is widely used for object localization. Typically, the region
search methods project the score of a classifier into an image plane, and then
search the region with the maximal score. The recently proposed region search
methods, such as efficient subwindow search and efficient region search, which
localize objects from the score distribution on an image, are much more
efficient than sliding window search. However, for some classifiers and tasks,
the projected scores are nearly all positive, and hence maximizing the score of
a region results in localizing nearly the entire image as an object, which is
meaningless.
In this paper, we observe that the large scores are mainly concentrated on or
around objects. Based on this observation, we propose a method, named level set
maximum-weight connected subgraph (LS-MWCS), which localizes objects with
arbitrary shapes by searching regions with the densest score rather than the
maximal score. The region density can be flexibly controlled by a parameter.
We also prove an important property of the proposed LS-MWCS, which guarantees
that the region with the densest score can be found. Moreover, the LS-MWCS
can be efficiently optimized by belief propagation. The method is evaluated on
the problem of weakly-supervised object localization, and the quantitative
results demonstrate the superiority of our LS-MWCS compared to other
state-of-the-art methods.
Reconstruction of the temporal signaling network in Salmonella-infected human cells
Salmonella enterica is a bacterial pathogen that usually infects its host through food sources. Translocation of the pathogen proteins into the host cells leads to changes in the signaling mechanism either by activating or inhibiting the host proteins. Using high-throughput ‘omic’ technologies, changes in the signaling components can be quantified at different levels; however, experimental hits are usually incomplete to represent the whole signaling system as some driver proteins stay hidden within the experimental data. Given that the bacterial infection modifies the response network of the host, a more coherent view of the underlying biological processes and the signaling networks can be obtained by using a network modeling approach based on reverse engineering principles, in which a confident region from the protein interactome is found by inferring hits from the omic experiments. In this work, we have used a published temporal phosphoproteomic dataset of Salmonella-infected human cells and reconstructed the temporal signaling network of the human host by integrating the interactome and the phosphoproteomic datasets. We have combined two well-established network modeling frameworks, the Prize-collecting Steiner Forest (PCSF) approach and the Integer Linear Programming (ILP) based edge inference approach. The resulting network conserves the information on temporality and direction of interactions, while revealing hidden entities in the signaling, such as the SNARE binding, mTOR signaling, immune response, cytoskeleton organization, and apoptosis pathways. Targets of the Salmonella effectors in the host cells such as CDC42, RHOA, 14-3-3δ, Syntaxin family, and Oxysterol-binding proteins were included in the reconstructed signaling network although they were not present in the initial phosphoproteomic data. We believe that integrated approaches have a high potential for the identification of clinical targets in infectious diseases, especially in Salmonella infections.
Large deviations of cascade processes on graphs
Simple models of irreversible dynamical processes such as Bootstrap
Percolation have been successfully applied to describe cascade processes in a
large variety of different contexts. However, the problem of analyzing
non-typical trajectories, which can be crucial for the understanding of the
out-of-equilibrium phenomena, is still considered to be intractable in most
cases. Here we introduce an efficient method to find and analyze optimized
trajectories of cascade processes. We show that for a wide class of
irreversible dynamical rules, this problem can be solved efficiently on
large-scale systems.
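For readers unfamiliar with Bootstrap Percolation, the following is a minimal sketch that assumes the usual threshold rule: an inactive node becomes active once at least `threshold` of its neighbors are active, and active nodes never deactivate. The function name and the four-node ring are illustrative assumptions, not taken from the paper, and the sketch only runs one cascade forward to its final state; it does not search for the optimized trajectories the abstract describes.

```python
def bootstrap_percolation(neighbors, seeds, threshold):
    """Return the final active set of an irreversible threshold cascade.

    neighbors: dict mapping each node to an iterable of adjacent nodes.
    seeds: nodes that are active at the start.
    threshold: number of active neighbors needed to activate a node.
    """
    active = set(seeds)
    changed = True
    while changed:  # sweep until no further node can activate
        changed = False
        for node, nbrs in neighbors.items():
            if node in active:
                continue  # irreversible: active nodes stay active
            if sum(n in active for n in nbrs) >= threshold:
                active.add(node)
                changed = True
    return active


# Hypothetical example: a ring of four nodes 0 - 1 - 2 - 3 - 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

spreads = bootstrap_percolation(ring, seeds={0, 2}, threshold=2)
stalls = bootstrap_percolation(ring, seeds={0}, threshold=2)
```

With two opposite seeds every remaining node sees two active neighbors and the cascade covers the ring, while a single seed never triggers anything. Because the rule is monotone, the final set does not depend on the order in which nodes are visited, so the in-place sweep is sufficient when only the end state matters.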