24 research outputs found
DLOREAN: Dynamic Location-aware Reconstruction of multiway Networks
This paper presents a method for learning time-varying higher-order interactions based on node observations, with application to short-term traffic forecasting based on traffic flow sensor measurements. We incorporate domain knowledge into the design of a new damped periodic kernel which leverages traffic flow patterns towards better structure learning. We introduce location-based regularization for learning models with desirable geographical properties (short-range or long-range interactions). We show, using experiments on synthetic and real data, that our approach performs better than static methods for reconstruction of multiway interactions, as well as time-varying methods which recover only pairwise interactions. Further, we show on real traffic data that our model is useful for short-term traffic forecasting, improving over state-of-the-art
The IncP-1 plasmid backbone adapts to different host bacterial species and evolves through homologous recombination
Plasmids are important members of the bacterial mobile gene pool, and are among the most important contributors to horizontal gene transfer between bacteria. They typically harbour a wide spectrum of host-beneficial traits, such as antibiotic resistance, inserted into their backbones. Although these inserted elements have drawn considerable interest, evolutionary information about the plasmid backbones, which encode plasmid-related traits, is sparse. Here we analyse 25 complete backbone genomes from the broad-host-range IncP-1 plasmid family. Phylogenetic analysis reveals seven clades, one of which is a novel clade represented by two plasmids that we isolated from a marine biofilm. We also found that homologous recombination is a prominent feature of the plasmid backbone evolution. Analysis of genomic signatures indicates that the plasmids have adapted to different host bacterial species. Globally circulating IncP-1 plasmids hence contain mosaic structures of segments derived from several parental plasmids that have evolved in, and adapted to, different, phylogenetically very distant host bacterial species.
Learning time-varying interaction networks
Most biological systems consist of several subcomponents which interact with each other. These interactions govern the overall behaviour of the system and in turn vary over time and in response to internal and external stress during the course of an experiment. Identifying such time-varying networks promises new insight into transient interactions and their role in the biological process. Traditional methods have focussed on identifying a single interaction network based on time series data, ignoring the dynamic rewiring of the underlying network. This thesis studies the problem of inferring time-varying interactions in gene interaction networks based on gene microarray expression data. With the advent of next-generation sequencing technologies, the amount of publicly available microarray expression data, as well as other omics data, has grown tremendously. Further, the microarray data is often generated from different experimental conditions or under network perturbations. One of the current challenges in systems biology is the integration of data generated from different experimental conditions and under different stresses towards an understanding of the dynamic interactome. NETGEM, the first study included in this thesis, describes a method for inferring time-varying gene interaction networks from microarray expression data under network perturbation. The method presents a probabilistic generative model under the assumption that the changes in the interaction network are caused by the changing functional roles of the interacting genes during the course of a biological process. This is used to infer time-varying interactions for a perturbation study in Saccharomyces cerevisiae (baker's yeast) under nutrient stress. The inferred network agrees with experimental evidence and identifies key transient interactions during the course of the experiment.
In the subsequent study, we present a survey chapter describing current approaches for inference of time-varying biological networks based on node observations. We give an overview of different methods in terms of the underlying model assumptions and applicability under different conditions. We also describe how recent advances in the theory of compressed sensing have led to the development of new network inference methods with mild assumptions on network dynamics.
Integrative Analysis of Dynamic Networks
Networks play a central role in several disciplines such as computational biology, social network analysis, transportation planning and many others; consequently, several methods have been developed for network analysis. However, in many cases, the study of a single network is insufficient to discover patterns with multiple facets and subtle signals. Integrative analysis is necessary in order to fuse weak information present in multiple networks into a more confident prediction, especially in domains where there are diverse modes of data acquisition, e.g. with modern biological technologies. This is further complicated by the fact that most real-world networks are inherently dynamic in nature. Discerning how networks evolve over time is crucial to unraveling the underlying phenomenon governing the system.
Though network science has grown to include advances from diverse fields, ranging from classical results in graph theory and approximation algorithms to newer methods focussed on the study of real-world networks, integrative analysis of multiple dynamic networks is yet to be fully explored. This thesis makes a two-fold contribution in this area.
The first part of this thesis presents work aimed at integrative analysis of multiple networks reflecting the diverse relationships among a common set of actors or nodes. We make the connection between the Lovász theta function, a celebrated result in graph theory, and kernel methods in machine learning. This allows us to develop new algorithms for classical graph-theoretic problems like planted clique recovery, graph coloring and max k-cut. We also present a new scalable method for discovering common dense subgraphs from multiple networks, with significant computational advantage over previous state-of-the-art enumerative approaches. Motivated by the SVM-theta connection, we design two new "global" graph kernels which can be used for graph classification. The kernels capture global graph properties like girth, while being competitive with existing "local" graph kernels.
The second part of this thesis investigates the problem of learning time-varying interactions based on node observation data using the framework of probabilistic graphical models. We explore two facets of this problem: modelling the influence of gene function on dynamic gene-gene interactions, and capturing higher-order time-varying networks in a transport application.