13,330 research outputs found

    Understanding Complex Systems: From Networks to Optimal Higher-Order Models

    Full text link
    To better understand the structure and function of complex systems, researchers often represent direct interactions between components in complex systems with networks, assuming that indirect influence between distant components can be modelled by paths. Such network models assume that actual paths are memoryless. That is, the way a path continues as it passes through a node does not depend on where it came from. Recent studies of data on actual paths in complex systems question this assumption and instead indicate that memory in paths does have considerable impact on central methods in network science. A growing research community working with so-called higher-order network models addresses this issue, seeking to take advantage of information that conventional network representations disregard. Here we summarise the progress in this area and outline remaining challenges calling for more research.Comment: 8 pages, 4 figure

    Counting Causal Paths in Big Times Series Data on Networks

    Full text link
    Graph or network representations are an important foundation for data mining and machine learning tasks in relational data. Many tools of network analysis, like centrality measures, information ranking, or cluster detection rest on the assumption that links capture direct influence, and that paths represent possible indirect influence. This assumption is invalidated in time-stamped network data capturing, e.g., dynamic social networks, biological sequences or financial transactions. In such data, for two time-stamped links (A,B) and (B,C) the chronological ordering and timing determines whether a causal path from node A via B to C exists. A number of works has shown that for that reason network analysis cannot be directly applied to time-stamped network data. Existing methods to address this issue require statistics on causal paths, which is computationally challenging for big data sets. Addressing this problem, we develop an efficient algorithm to count causal paths in time-stamped network data. Applying it to empirical data, we show that our method is more efficient than a baseline method implemented in an OpenSource data analytics package. Our method works efficiently for different values of the maximum time difference between consecutive links of a causal path and supports streaming scenarios. With it, we are closing a gap that hinders an efficient analysis of big time series data on complex networks.Comment: 10 pages, 2 figure

    Reasoning about Independence in Probabilistic Models of Relational Data

    Full text link
    We extend the theory of d-separation to cases in which data instances are not independent and identically distributed. We show that applying the rules of d-separation directly to the structure of probabilistic models of relational data inaccurately infers conditional independence. We introduce relational d-separation, a theory for deriving conditional independence facts from relational models. We provide a new representation, the abstract ground graph, that enables a sound, complete, and computationally efficient method for answering d-separation queries about relational models, and we present empirical results that demonstrate effectiveness.Comment: 61 pages, substantial revisions to formalisms, theory, and related wor

    Transforming Graph Representations for Statistical Relational Learning

    Full text link
    Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed

    Identifiability and transportability in dynamic causal networks

    Get PDF
    In this paper we propose a causal analog to the purely observational Dynamic Bayesian Networks, which we call Dynamic Causal Networks. We provide a sound and complete algorithm for identification of Dynamic Causal Networks, namely, for computing the effect of an intervention or experiment, based on passive observations only, whenever possible. We note the existence of two types of confounder variables that affect in substantially different ways the identification procedures, a distinction with no analog in either Dynamic Bayesian Networks or standard causal graphs. We further propose a procedure for the transportability of causal effects in Dynamic Causal Network settings, where the result of causal experiments in a source domain may be used for the identification of causal effects in a target domain.Preprin
    • …
    corecore