11 research outputs found
Optimizing complex network models
Analyzing real-world networks ultimately amounts to comparing their empirical properties with the outcome of a proper statistical model. By far the most common, and most useful, approach to defining benchmarks rests upon the so-called canonical formalism of statistical mechanics, which has led to the definition of the broad class of models known as Exponential Random Graphs (ERGs). Generally speaking, employing a model of this family boils down to maximizing a likelihood function that embodies the available information about a certain system, hence constituting the desired benchmark. Although powerful, the aforementioned models cannot be solved analytically, whence the need to rely on numerical recipes for their optimization. This is generally a hard task, since real-world networks can be enormous in size (for example, consisting of billions of nodes and links), hence requiring models with 'many' parameters (say, of the same order of magnitude as the number of nodes). This calls for optimization algorithms that are both fast and scalable: the collection of works constituting the present thesis represents an attempt to fill this gap. Chapter 1 provides a quick introduction to the topic. Chapter 2 deals specifically with ERGs: after reviewing the basic concepts constituting the pillars upon which such a framework is based, we discuss several instances of it and three different numerical techniques for their optimization. Chapter 3, instead, focuses on the detection of mesoscale structures and, in particular, on the formalism based upon surprise: as the latter allows any partition of nodes to be assigned a p-value, detecting a specific mesoscale structural organization can be understood as the problem of finding the corresponding, most significant partition, i.e. an optimization problem whose score function is, precisely, surprise. Finally, Chapter 4 deals with the application of a couple of ERGs, and of the surprise-based formalism, to cryptocurrencies (specifically, Bitcoin).
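The likelihood-maximization workflow at the heart of the ERG framework can be illustrated on its simplest instance, whose only constraint is the total number of links. The sketch below is purely illustrative (not taken from the thesis): it solves the first-order condition of the Erdos-Renyi likelihood numerically and checks it against the known closed-form solution.

```python
# Toy illustration of the canonical ERG workflow: a single global constraint
# (the total number of links L) yields a Bernoulli probability p per node
# pair; maximizing the likelihood fixes the value of p.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)
N = 50
A = rng.random((N, N)) < 0.1
A = np.triu(A, 1)
A = A + A.T                              # symmetric binary adjacency matrix
L = A.sum() // 2                         # observed number of links
pairs = N * (N - 1) // 2                 # number of node pairs

# Derivative of the log-likelihood L*log(p) + (pairs - L)*log(1 - p)
def dlogL(p):
    return L / p - (pairs - L) / (1 - p)

p_star = brentq(dlogL, 1e-9, 1 - 1e-9)   # numerical stationary point
print(p_star, L / pairs)                 # the two values coincide
```

For this one-parameter model the numerical optimum reproduces the closed-form answer p = L / pairs; the thesis deals with the non-trivial regime where the system has O(N) coupled parameters and no analytical solution.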
Reconstructing firm-level interactions in the Dutch input–output network from production constraints
Recent crises have shown that the knowledge of the structure of input–output networks, at the firm level, is crucial when studying economic resilience from the microscopic point of view of firms that try to rewire their connections under supply and demand constraints. Unfortunately, empirical inter-firm network data are protected by confidentiality, hence rarely accessible. The available methods for network reconstruction from partial information treat all pairs of nodes as potentially interacting, thereby overestimating the rewiring capabilities of the system and the implied resilience. Here, we use two big data sets of transactions in the Netherlands to represent a large portion of the Dutch inter-firm network and document its properties. We then introduce a generalized maximum-entropy reconstruction method that preserves the production function of each firm in the data, i.e. the input and output flows of each node for each product type. We confirm that the new method becomes increasingly reliable in reconstructing the empirical network as a finer product resolution is considered and can, therefore, be used as a realistic generative model of inter-firm networks with fine production constraints. Moreover, the likelihood of the model directly enumerates the number of alternative network configurations that leave each firm in its current production state, thereby estimating the reduction in the rewiring capability of the system implied by the observed input–output constraints.
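The production constraints described above fix, for each firm and product type, its total input and output flows. As a rough illustration of constraint-preserving reconstruction (a generic sketch using iterative proportional fitting, not the authors' maximum-entropy estimator), the code below rebalances a seed matrix until every row sum (a firm's total output) and column sum (its total input) matches a prescribed flow:

```python
# Generic sketch, NOT the paper's method: iterative proportional fitting
# rescales a positive seed matrix so that row sums and column sums match
# observed per-firm flows for one product type.
import numpy as np

def ipf(seed, row_tot, col_tot, n_iter=500):
    W = seed.astype(float).copy()
    for _ in range(n_iter):
        W *= row_tot / W.sum(axis=1, keepdims=True)   # match output flows
        W *= col_tot / W.sum(axis=0, keepdims=True)   # match input flows
    return W

rng = np.random.default_rng(1)
n = 6
seed = rng.random((n, n))                 # uniform-ish prior over links
row_tot = rng.uniform(1, 5, size=n)
col_tot = rng.uniform(1, 5, size=n)
col_tot *= row_tot.sum() / col_tot.sum()  # totals must be mutually consistent

W = ipf(seed, row_tot[:, None], col_tot[None, :])
print(np.abs(W.sum(axis=1) - row_tot).max(),
      np.abs(W.sum(axis=0) - col_tot).max())
```

With a strictly positive seed and consistent marginals the iteration converges quickly; the paper's estimator additionally handles the binary topology and the per-product disaggregation, which this toy omits.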
Reconstructing firm-level interactions: the Dutch input-output network
Recent crises have shown that the knowledge of the structure of input-output networks at the firm level is crucial when studying economic resilience from the microscopic point of view of firms that rewire their connections under supply and demand shocks. Unfortunately, empirical inter-firm network data are protected by confidentiality, hence rarely accessible. The available methods of network reconstruction from partial information, which have been devised for financial exposures, are inadequate for inter-firm relationships because they treat all pairs of nodes as potentially interacting, thereby overestimating the rewiring capabilities of the system. Here we use two big data sets of transactions in the Netherlands to represent a large portion of the Dutch inter-firm network and document the properties of one of the few analysed networks of this kind. We then introduce a generalized maximum-entropy reconstruction method that preserves the production function of each firm in the data, i.e. the input and output flows of each node for each product type. We confirm that the new method becomes increasingly reliable as a finer product resolution is considered and can therefore be used as a generative model of inter-firm networks with fine production constraints. The likelihood of the model, being related to the entropy, proxies the rewiring capability of the system for a fixed input-output configuration.
(Main text: 26 pages, 9 figures; supplementary material: 21 pages, 9 figures.)
Detecting mesoscale structures by surprise
The importance of identifying the presence of mesoscale structures in complex networks can hardly be overestimated. So far, much attention has been devoted to the detection of communities, bipartite and core-periphery structures on binary networks: such an effort has led to the definition of a unified framework based upon the score function called surprise, i.e. a p-value that can be assigned to any given partition of nodes, on both undirected and directed networks. Here, we aim to take a step further by extending the entire framework to the weighted case: after reviewing the application of the surprise-based formalism to the detection of binary mesoscale structures, we present a suitable generalization of it for detecting weighted mesoscale structures, a topic that has received much less attention. To this aim, we analyze four variants of surprise; from a technical point of view, this amounts to employing four variants of the hypergeometric distribution: the binomial one for the detection of binary communities, the multinomial one for the detection of binary "bimodular" structures, and their negative counterparts for the detection of communities and "bimodular" structures on weighted networks. On top of that, we define two "enhanced" variants of surprise, able to encode both binary and weighted constraints, whose definition rests upon two suitable generalizations of the hypergeometric distribution itself. As a result, we present a general, statistically grounded approach to detecting mesoscale structures on networks via a unified, surprise-based framework. To illustrate the performance of our methods, we first test them on a variety of well-established synthetic benchmarks and then apply them to several real-world networks, i.e. social, economic, financial and ecological ones. Moreover, we attach to the paper a Python code implementing all the considered variants of surprise.
(32 pages, 4 tables, 12 figures. Python code available at: https://github.com/EmilianoMarchese/SurpriseMeMor)
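A minimal sketch of how a surprise-like score is computed in practice (toy numbers and our own notation, not necessarily the paper's): for a binary community partition, surprise is the negative logarithm of the hypergeometric p-value of observing at least the empirical number of intra-community links.

```python
# Illustrative toy computation of (binary, binomial-flavoured) surprise:
# the p-value of placing at least l_in of the L links on the V_in
# intra-community node pairs, under a hypergeometric null model.
import numpy as np
from scipy.stats import hypergeom

N = 20                      # nodes, split into two communities of 10
V = N * (N - 1) // 2        # total number of node pairs
V_in = 2 * (10 * 9 // 2)    # intra-community node pairs
L = 60                      # observed links
l_in = 50                   # observed intra-community links

# P(X >= l_in) with X ~ Hypergeom(V, V_in, L); sf(k) = P(X > k)
p_value = hypergeom(V, V_in, L).sf(l_in - 1)
surprise = -np.log10(p_value)
print(p_value, surprise)
```

A large surprise (tiny p-value) flags a partition whose internal link density is far above what random link placement would produce; detection then amounts to maximizing this score over partitions, which the attached code does for all the variants discussed in the paper.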
Fast and scalable likelihood maximization for Exponential Random Graph Models with local constraints
Exponential Random Graph Models (ERGMs) have gained increasing popularity over the years. Rooted in statistical physics, the ERGMs framework has been successfully employed for reconstructing networks, detecting statistically significant patterns in graphs, and counting networked configurations with given properties. From a technical point of view, the ERGMs workflow is defined by two subsequent optimization steps: the first concerns the maximization of Shannon entropy and leads to identifying the functional form of the ensemble probability distribution that is maximally non-committal with respect to the missing information; the second concerns the maximization of the likelihood function induced by this probability distribution and leads to its numerical determination. This second step translates into the resolution of a system of O(N) non-linear, coupled equations (with N being the total number of nodes of the network under analysis), a problem that is affected by three main issues, i.e. accuracy, speed and scalability. The present paper aims at addressing these problems by comparing the performance of three algorithms (i.e. Newton's method, a quasi-Newton method and a recently proposed fixed-point recipe) in solving several ERGMs, defined by binary and weighted constraints in both a directed and an undirected fashion. While Newton's method performs best for relatively small networks, the fixed-point recipe is to be preferred when large configurations are considered, as it ensures convergence to the solution within seconds for networks with hundreds of thousands of nodes (e.g. the Internet, Bitcoin). We attach to the paper a Python code implementing the three aforementioned algorithms on all the ERGMs considered in the present work.
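To give a concrete sense of the kind of system being solved, the sketch below implements a fixed-point iteration for the Undirected Binary Configuration Model (one of the ERGMs in this family). The update rule x_i <- k_i / sum_{j != i} x_j / (1 + x_i x_j) is a common choice in the literature; this is an assumption-laden toy, not a reproduction of the paper's exact recipe or code.

```python
# Toy fixed-point solver for the UBCM: find fitnesses x such that the
# expected degree sum_{j != i} x_i x_j / (1 + x_i x_j) equals the
# observed degree k_i for every node i.
import numpy as np

def ubcm_fixed_point(k, n_iter=2000, tol=1e-10):
    x = k / np.sqrt(k.sum())                       # standard initial guess
    for _ in range(n_iter):
        denom = x[None, :] / (1.0 + np.outer(x, x))
        np.fill_diagonal(denom, 0.0)               # exclude j == i
        x_new = k / denom.sum(axis=1)
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    return x

rng = np.random.default_rng(2)
A = rng.random((30, 30)) < 0.2
A = np.triu(A, 1)
A = (A + A.T).astype(int)
k = A.sum(axis=1).astype(float)
k[k == 0] = 1.0                                    # toy guard for isolated nodes

x = ubcm_fixed_point(k)
P = np.outer(x, x) / (1.0 + np.outer(x, x))        # link probabilities
np.fill_diagonal(P, 0.0)
residual = np.max(np.abs(P.sum(axis=1) - k))       # expected vs observed degrees
print(residual)
```

Each iteration costs O(N^2) dense operations here; the appeal of fixed-point recipes at scale is that, unlike Newton's method, they need no O(N^2)-parameter Hessian to be built and inverted.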
The weighted Bitcoin Lightning Network
The Bitcoin Lightning Network (BLN) was launched in 2018 to scale up the number of transactions between Bitcoin owners. Although several contributions concerning the analysis of the BLN binary structure have recently appeared in the literature, the properties of its weighted counterpart are still largely unknown. The present contribution aims at filling this gap by considering the Bitcoin Lightning Network over a period of 18 months, ranging from 12th January 2018 to 17th July 2019, and focusing on its weighted, undirected, daily snapshot representation, each weight representing the total capacity of the channels the two involved nodes have established on a given temporal snapshot. As the study of the BLN weighted structural properties reveals, it is becoming increasingly ‘centralized’ at different levels, just like its binary counterpart: (1) the Nakamoto coefficient shows that the percentage of nodes whose degrees/strengths ‘enclose’ 51% of the total number of links/total weight is rapidly decreasing; (2) the Gini coefficient confirms that several weighted centrality measures are becoming increasingly unevenly distributed; (3) the weighted BLN topology is becoming increasingly compatible with a core–periphery structure, with the largest nodes ‘by strength’ constituting the core of such a network, whose size keeps shrinking as the BLN evolves. Further inspection of the resilience of the weighted BLN shows that removing such hubs leads to the fragmentation of the network into many components, evidence indicating potential security threats, such as the ones represented by the so-called ‘split attacks’.
ISSN:0960-0779 ISSN:1873-288
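The two centralization measures used above are straightforward to compute from a node-strength sequence; the following is a minimal sketch on toy data (helper names are ours):

```python
# Toy computation of the two inequality measures from the abstract:
# the Nakamoto coefficient (smallest number of top nodes jointly holding
# at least a 51% share of the total weight) and the Gini coefficient.
import numpy as np

def nakamoto(values, threshold=0.51):
    v = np.sort(np.asarray(values, dtype=float))[::-1]   # descending
    cum = np.cumsum(v) / v.sum()                         # cumulative shares
    return int(np.searchsorted(cum, threshold) + 1)

def gini(values):
    v = np.sort(np.asarray(values, dtype=float))         # ascending
    n = len(v)
    # Standard closed form: G = (2 * sum_i i*v_i) / (n * sum v) - (n+1)/n
    return (2 * np.sum(np.arange(1, n + 1) * v)) / (n * v.sum()) - (n + 1) / n

strengths = np.array([100.0, 50.0, 20.0, 10.0, 5.0, 5.0, 5.0, 5.0])
print(nakamoto(strengths), gini(strengths))
```

In this toy sequence the top node alone holds only 50% of the total weight, so two nodes are needed to reach the 51% threshold; a shrinking Nakamoto coefficient and a growing Gini coefficient over the daily snapshots are exactly the centralization signals the abstract reports.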