13,330 research outputs found
Understanding Complex Systems: From Networks to Optimal Higher-Order Models
To better understand the structure and function of complex systems,
researchers often represent direct interactions between components in complex
systems with networks, assuming that indirect influence between distant
components can be modelled by paths. Such network models assume that actual
paths are memoryless. That is, the way a path continues as it passes through a
node does not depend on where it came from. Recent studies of data on actual
paths in complex systems question this assumption and instead indicate that
memory in paths does have considerable impact on central methods in network
science. A growing research community working with so-called higher-order
network models addresses this issue, seeking to take advantage of information
that conventional network representations disregard. Here we summarise the
progress in this area and outline remaining challenges calling for more
research.
Comment: 8 pages, 4 figures
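The memoryless assumption described in the abstract above can be checked on path data with a small sketch like the following. It is an illustration only, not a method from the paper: the function name `transition_probs`, the `order` parameter, and the toy paths are all hypothetical. It estimates next-node probabilities conditioned on the last `order` nodes visited; if paths were memoryless, conditioning on more than one previous node would not change the estimates.

```python
from collections import Counter, defaultdict


def transition_probs(paths, order=1):
    """Estimate next-node probabilities given the last `order` nodes.

    paths: iterable of node sequences (hypothetical observed paths).
    order: memory length; order=1 is the usual memoryless network model.
    Returns {memory_tuple: {next_node: probability}}.
    """
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            memory = tuple(path[i - order:i])  # the last `order` nodes
            counts[memory][path[i]] += 1
    return {
        memory: {node: c / sum(ctr.values()) for node, c in ctr.items()}
        for memory, ctr in counts.items()
    }


# Toy data (assumed): paths through C continue differently
# depending on whether they arrived from A or from B.
paths = [("A", "C", "D"), ("B", "C", "E")]
first_order = transition_probs(paths, order=1)
second_order = transition_probs(paths, order=2)
```

In this toy data the first-order estimate at C splits evenly between D and E, while the second-order estimate is deterministic given the predecessor, which is the kind of information a conventional network representation disregards.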
Counting Causal Paths in Big Times Series Data on Networks
Graph or network representations are an important foundation for data mining
and machine learning tasks in relational data. Many tools of network analysis,
like centrality measures, information ranking, or cluster detection rest on the
assumption that links capture direct influence, and that paths represent
possible indirect influence. This assumption is invalidated in time-stamped
network data capturing, e.g., dynamic social networks, biological sequences or
financial transactions. In such data, for two time-stamped links (A,B) and
(B,C) the chronological ordering and timing determines whether a causal path
from node A via B to C exists. A number of works have shown that for that reason
network analysis cannot be directly applied to time-stamped network data.
Existing methods to address this issue require statistics on causal paths,
which is computationally challenging for big data sets.
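The causal path condition described above can be illustrated with a brute-force sketch. This is not the efficient algorithm the abstract refers to; it is a naive baseline for paths of length two, and the function name `count_causal_paths_len2` and the parameter `delta` (the maximum time difference between consecutive links) are hypothetical names chosen for illustration. It assumes that two links (a, b, t1) and (b, c, t2) form a causal path when 0 < t2 - t1 <= delta.

```python
from collections import defaultdict


def count_causal_paths_len2(edges, delta):
    """Count causal paths a -> b -> c in time-stamped link data.

    edges: iterable of (source, target, timestamp) tuples.
    delta: assumed maximum time difference between consecutive links.
    Returns {(a, b, c): number of causal link pairs}.
    """
    edges = list(edges)  # allow generators; we iterate twice
    # Index links by their source node for quick continuation lookup.
    out_links = defaultdict(list)
    for source, target, ts in edges:
        out_links[source].append((target, ts))

    counts = defaultdict(int)
    for a, b, t1 in edges:
        for c, t2 in out_links[b]:
            # The second link must come strictly after the first,
            # and no later than delta time units afterwards.
            if 0 < t2 - t1 <= delta:
                counts[(a, b, c)] += 1
    return dict(counts)


# Toy data (assumed): two B -> C links, only one close enough in time.
edges = [("A", "B", 1), ("B", "C", 2), ("B", "C", 5), ("C", "A", 0)]
result = count_causal_paths_len2(edges, delta=2)
```

With `delta=2`, the link (B, C) at time 5 arrives too late to continue the link (A, B) at time 1, so only one causal path from A via B to C is counted. The nested loop is quadratic in the worst case, which is the computational cost that motivates a more efficient approach.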
Addressing this problem, we develop an efficient algorithm to count causal
paths in time-stamped network data. Applying it to empirical data, we show that
our method is more efficient than a baseline method implemented in an
OpenSource data analytics package. Our method works efficiently for different
values of the maximum time difference between consecutive links of a causal
path and supports streaming scenarios. With it, we are closing a gap that
hinders an efficient analysis of big time series data on complex networks.
Comment: 10 pages, 2 figures
Reasoning about Independence in Probabilistic Models of Relational Data
We extend the theory of d-separation to cases in which data instances are not
independent and identically distributed. We show that applying the rules of
d-separation directly to the structure of probabilistic models of relational
data inaccurately infers conditional independence. We introduce relational
d-separation, a theory for deriving conditional independence facts from
relational models. We provide a new representation, the abstract ground graph,
that enables a sound, complete, and computationally efficient method for
answering d-separation queries about relational models, and we present
empirical results that demonstrate effectiveness.
Comment: 61 pages, substantial revisions to formalisms, theory, and related work
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
Temporal and Relational Models for Causality: Representation and Learning
Discovering causal dependence is central to understanding the behavior of complex systems and to selecting actions that will achieve particular outcomes. The majority of work in this area has focused on propositional domains, where data instances are assumed to be independent and identically distributed (i.i.d.). However, many real-world domains are inherently relational, i.e., they consist of multiple types of entities that interact with each other, and temporal, i.e., they change over time. This thesis focuses on causal modeling for these more complex relational and temporal domains. It provides an in-depth investigation of the properties of relational models and extends their expressivity to include a temporal dimension. Specifically, we first investigate alternative ways to ground relational models, and we provide an in-depth analysis of the impact of alternative grounding semantics for feature construction, causal effect estimation, and model selection. Then, we extend relational models to represent discrete time. We generalize the theory of d-separation for this class of temporal and relational models. Finally, we provide a constraint-based algorithm, TRCD, to learn the structure of temporal relational models from data.
Identifiability and transportability in dynamic causal networks
In this paper we propose a causal analog to the purely observational Dynamic Bayesian Networks, which we call Dynamic Causal Networks. We provide a sound and complete algorithm for identification of Dynamic Causal Networks, namely, for computing the effect of an intervention or experiment, based on passive observations only, whenever possible. We note the existence of two types of confounder variables that affect the identification procedures in substantially different ways, a distinction with no analog in either Dynamic Bayesian Networks or standard causal graphs. We further propose a procedure for the transportability of causal effects in Dynamic Causal Network settings, where the result of causal experiments in a source domain may be used for the identification of causal effects in a target domain.
Preprint