156 research outputs found

    Proteins across scales through graph partitioning: application to the major peanut allergen Ara h 1

    The analysis of community structure in complex networks has received much attention recently, in the hope that communities at various scales can affect or explain the global behaviour of the system. A plethora of community detection algorithms have been proposed; they are insightful yet often restricted by certain inherent resolutions. Proteins are multi-scale biomolecular machines with coupled structural organization across scales, which is linked to their function. To reveal this organization, we applied a recently developed multi-resolution method, Markov Stability, which is based on atomistic graph partitioning, along with theoretical mutagenesis that further allows for hot spot identification using Gaussian process regression. The methodology finds partitions of a graph without imposing a particular scale a priori and analyses the network in a computationally efficient way. Here, we show an application to peanut allergenicity, which remains poorly understood despite extensive experimental studies that focus on epitopes, the groups of atoms associated with allergenic reactions. We compare our results against available experimental data, and we further predict distal regulatory sites that may significantly alter protein dynamics.

    Exact and approximate role assignment for multi-layer networks

    The concept of role equivalence has been applied in social network analysis for decades. Early definitions recognized two social actors as role equivalent if they have identical relationships to the same other actors. Although this rather strong equivalence requirement has been relaxed in different ways, it is often challenging to detect interesting, non-trivial role equivalences, especially for social networks derived from empirical data. Multi-layer networks (MLNs) are increasingly gaining popularity for modelling collective adaptive systems, for example, engineered cyber-physical systems or animal collectives. Multiplex networks, a special case of MLNs, transparently and compactly describe such complex interactions (social, biological, transportation), where nodes can be connected by links of different types. In this work, we first propose a novel notion of exact and approximate role equivalence for multiplex MLNs. Then, we implement and experimentally evaluate the algorithm on a suite of real-world case studies. Results demonstrate that our notion of approximate role assignment not only obtains non-trivial partitions over nodes and layers, but also provides a fine-grained hierarchy of role equivalences, which is impossible to obtain by (combining) the existing role detection techniques. We demonstrate the latter by interpreting in detail the case study of Florence families, a classical benchmark from the literature.
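To make the early, strict definition concrete, the following is a minimal sketch of grouping nodes of a multiplex network whose relationships are identical in every layer. It is not the exact or approximate role equivalence proposed in the paper; the function name, the edge-list input format, and the toy layer names are assumptions made for illustration, and ties between the two compared nodes themselves are not treated specially.

```python
from collections import defaultdict


def structural_equivalence_classes(layers):
    """Group nodes that have identical neighbours in every layer.

    layers: dict mapping a layer name to a list of undirected edges (u, v).
    Returns a sorted list of sorted node lists, one per equivalence class.
    """
    signature = defaultdict(set)  # node -> set of (layer, neighbour) pairs
    nodes = set()
    for layer, edges in layers.items():
        for u, v in edges:
            nodes.update((u, v))
            signature[u].add((layer, v))
            signature[v].add((layer, u))

    classes = defaultdict(list)
    for node in sorted(nodes):
        # Nodes with the same signature relate to the same actors in the same way.
        classes[frozenset(signature[node])].append(node)
    return sorted(classes.values())


# Hypothetical two-layer toy network (names are illustrative only).
layers = {
    "marriage": [("A", "C"), ("B", "C")],
    "business": [("A", "D"), ("B", "D")],
}
classes = structural_equivalence_classes(layers)
print(classes)
```

Here A and B end up in the same class because each is tied to C in one layer and to D in the other, while C and D differ in the layer through which they are reached.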

    An approach for analysing the impact of data integration on complex network diffusion models

    Complex networks are a powerful way to reason about systems with non-trivial patterns of interaction. Attention to this research area is accelerated by the increasing availability of complex network data sets, with data often being reused as secondary data sources. Typically, multiple data sources are combined to create a larger, fuller picture of these complex networks, and in doing so scientists sometimes have to make subjective decisions about how these sources should be integrated. These seemingly trivial decisions can have a significant impact on both the resultant integrated networks and any downstream network models executed on them. We highlight the importance of this impact in online social networks and dark networks, two use-cases where data are regularly combined from multiple sources due to challenges in measurement or overlap of networks. We present a method for systematically testing how different, realistic data integration approaches can alter both the networks themselves and network models run on them, as well as an associated Python package (NIDMod) that implements this method. A number of experiments show the effectiveness of our method in identifying the impact of different data integration setups on network diffusion models.

    The GNAR-edge model: a network autoregressive model for networks with time-varying edge weights

    In economic and financial applications, there is often a need to analyse multivariate time series, comprising time series for a range of quantities. In some applications, such complex systems can be associated with some underlying network describing pairwise relationships among the quantities. Accounting for the underlying network structure for the analysis of this type of multivariate time series is required for assessing estimation error and can be particularly informative for forecasting. Our work is motivated by a dataset consisting of time series of industry-to-industry transactions. In this example, pairwise relationships between Standard Industrial Classification (SIC) codes can be represented using a network, with SIC codes as nodes and pairwise transactions between SIC codes as edges, while the observed time series of the amounts of the transactions for each pair of SIC codes can be regarded as time-varying weights on the edges. Inspired by Knight et al. (2020, J. Stat. Softw., 96, 1–36), we introduce the GNAR-edge model, which allows modelling of multiple time series utilizing the network structure, assuming that each edge weight depends not only on its past values, but also on past values of its neighbouring edges, for a range of neighbourhood stages. The method is validated through simulations. Results from the implementation of the GNAR-edge model on the real industry-to-industry data show good fitting and predictive performance of the model. The predictive performance is improved when sparsifying the network using a lead–lag analysis and thresholding edges according to a lead–lag score.
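The dependence structure described above can be illustrated with a rough one-step sketch. This is not the GNAR-edge model as specified by the authors: it assumes a single lag, only first-stage neighbours, undirected edges, equal weighting of neighbouring edges, and that two edges are neighbours when they share a node. The coefficient names `alpha` and `beta` and both function names are hypothetical.

```python
def stage1_neighbour_edges(edges):
    """Map each edge to the other edges that share an endpoint with it.

    Assumption: "neighbouring edges" means edges with a common node.
    """
    return {
        e: [f for f in edges if f != e and set(e) & set(f)]
        for e in edges
    }


def predict_next(weights, edges, alpha, beta):
    """One-step prediction for every edge weight.

    weights: dict mapping each edge to its most recent weight.
    alpha:   coefficient on the edge's own previous weight.
    beta:    coefficient on the mean previous weight of neighbouring edges.
    """
    neighbours = stage1_neighbour_edges(edges)
    prediction = {}
    for e in edges:
        nb = neighbours[e]
        nb_mean = sum(weights[f] for f in nb) / len(nb) if nb else 0.0
        prediction[e] = alpha * weights[e] + beta * nb_mean
    return prediction


# Toy path network a-b-c-d with made-up weights and coefficients.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
weights = {("a", "b"): 10.0, ("b", "c"): 20.0, ("c", "d"): 30.0}
prediction = predict_next(weights, edges, alpha=0.5, beta=0.25)
print(prediction)
```

In practice the coefficients would be estimated from the observed series, for example by least squares, and more lags and neighbourhood stages would add further terms of the same shape.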

    Targeted Community Merging provides an efficient comparison between collaboration clusters and departmental partitions

    Community detection theory is vital for the structural analysis of many types of complex networks, especially for human-like collaboration networks. In this work, we present a new community detection algorithm, the Targeted Community Merging algorithm, based on the well-known Girvan–Newman algorithm, which yields community partitions with high values of modularity and a small number of communities. We then analyse and compare the departmental and community structure of scientific collaboration networks within the University of Zaragoza. From this, we draw conclusions about the inter- and intra-departmental collaboration structure that could be useful for informing decisions on a possible departmental restructuring.
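Since the partitions above are judged by their modularity, a small sketch of the standard (Newman) modularity score for an unweighted, undirected graph may help. It only scores a given partition and is not the Targeted Community Merging algorithm; the function name and the toy graph are assumptions for illustration.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an unweighted, undirected graph.

    edges:       list of (u, v) pairs, each edge listed once.
    communities: list of sets of nodes covering every node exactly once.
    Q = sum over communities of (internal_edges / m) - (degree_sum / (2 m))^2
    """
    m = len(edges)
    community_of = {
        node: index
        for index, members in enumerate(communities)
        for node in members
    }
    internal = [0] * len(communities)
    degree_sum = [0] * len(communities)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )


# Two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
q_split = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
q_merged = modularity(edges, [{1, 2, 3, 4, 5, 6}])
print(q_split, q_merged)
```

Splitting the graph at the bridge scores higher than keeping everything in one community, which is the kind of comparison a merging procedure would rely on when deciding whether to combine two communities.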

    Maximum likelihood estimation for randomized shortest paths with trajectory data

    Randomized shortest paths (RSPs) are a tool developed in recent years for different graph and network analysis applications, such as modelling movement or flow in networks. In essence, the RSP framework considers the temperature-dependent Gibbs–Boltzmann distribution over paths in the network. At low temperatures, the distribution focuses solely on the shortest or least-cost paths, while with increasing temperature, the distribution spreads over random walks on the network. Many relevant quantities can be computed conveniently from this distribution, and these often generalize traditional network measures in a sensible way. However, when modelling real phenomena with RSPs, one needs a principled way of estimating the parameters from data. In this work, we develop methods for computing the maximum likelihood estimate of the model parameters, with focus on the temperature parameter, when modelling phenomena based on movement, flow or spreading processes. We test the validity of the derived methods with trajectories generated on artificial networks as well as with real data on the movement of wild reindeer in a geographic landscape, used for estimating the degree of randomness in the movement of the animals. These examples demonstrate the attractiveness of the RSP framework as a generic model to be used in diverse applications. Keywords: randomized shortest paths; random walk; shortest path; parameter estimation; maximum likelihood; animal movement modelling.
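The role of the temperature can be illustrated with a deliberately simplified sketch. It assumes a small, explicitly enumerated set of candidate paths with known costs and reference probabilities, whereas the RSP framework works over all paths of a network; the inverse temperature `theta`, the function names, and the grid search used for the maximum likelihood estimate are illustrative assumptions, not the estimation method derived in the paper.

```python
import math


def gibbs_path_distribution(costs, ref_probs, theta):
    """Gibbs-Boltzmann distribution over a finite set of candidate paths.

    Each path gets probability proportional to ref_prob * exp(-theta * cost),
    where theta is the inverse temperature (theta = 1 / T).
    """
    weights = [p * math.exp(-theta * c) for c, p in zip(costs, ref_probs)]
    z = sum(weights)  # normalising constant (partition function)
    return [w / z for w in weights]


def log_likelihood(observed_counts, costs, ref_probs, theta):
    """Log-likelihood of observed path counts under the model at theta."""
    probs = gibbs_path_distribution(costs, ref_probs, theta)
    return sum(n * math.log(p) for n, p in zip(observed_counts, probs))


def estimate_theta(observed_counts, costs, ref_probs, grid):
    """Pick the grid value of theta with the highest log-likelihood."""
    return max(
        grid,
        key=lambda t: log_likelihood(observed_counts, costs, ref_probs, t),
    )


# Three hypothetical paths between the same endpoints.
costs = [1.0, 2.0, 4.0]
ref_probs = [1 / 3, 1 / 3, 1 / 3]

cold = gibbs_path_distribution(costs, ref_probs, theta=20.0)  # low temperature
hot = gibbs_path_distribution(costs, ref_probs, theta=0.0)    # high temperature

# Synthetic counts that match the model at theta = 1 exactly.
counts = [1000 * p for p in gibbs_path_distribution(costs, ref_probs, 1.0)]
theta_hat = estimate_theta(counts, costs, ref_probs, [i / 10 for i in range(31)])
print(cold, hot, theta_hat)
```

At large `theta` nearly all probability sits on the least-cost path, at `theta = 0` the distribution falls back to the reference probabilities, and the grid search recovers the value of `theta` used to generate the synthetic counts.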