
    Analysis of the Web Graph Aggregated by Host and Pay-Level Domain

    In this paper the web is analyzed as a graph aggregated by host and pay-level domain (PLD). The publicly available web graph datasets were released by the Common Crawl Foundation and are based on a web crawl performed from May to July 2017. The host graph has ~1.3 billion nodes and ~5.3 billion arcs. The PLD graph has ~91 million nodes and ~1.1 billion arcs. We study the distributions of degree and of the sizes of strongly/weakly connected components (SCC/WCC), focusing on power-law detection using statistical methods. The statistical plausibility of the power-law model is compared with that of several alternative distributions. While there is no evidence of power-law tails at the host level, they emerge under PLD aggregation for the indegree, SCC, and WCC size distributions. Finally, we analyze distance-related features by studying the cumulative distributions of the shortest path lengths, and give an estimate of the diameters of the graphs.
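    As a rough illustration of the kind of tail fitting this abstract alludes to, the sketch below computes the standard continuous maximum-likelihood estimate of a power-law exponent, alpha = 1 + n / sum(ln(x_i / x_min)). The function name, the choice of x_min, and the use of the continuous estimator on integer-valued degrees are assumptions made here; the abstract does not say which estimator or cutoff the authors used.

```python
import math


def powerlaw_alpha_mle(values, x_min):
    """Continuous MLE of a power-law tail exponent (hypothetical helper).

    Only observations with x >= x_min enter the fit.
    Returns (alpha, number_of_tail_observations).
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all tail values equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum, len(tail)
```

    A point estimate alone does not establish a power law; as the abstract notes, the fit still has to be compared against alternative distributions.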

    Eigenvector-Based Centrality Measures for Temporal Networks

    Numerous centrality measures have been developed to quantify the importances of nodes in time-independent networks, and many of them can be expressed as the leading eigenvector of some matrix. With the increasing availability of network data that changes in time, it is important to extend such eigenvector-based centrality measures to time-dependent networks. In this paper, we introduce a principled generalization of network centrality measures that is valid for any eigenvector-based centrality. We consider a temporal network with N nodes as a sequence of T layers that describe the network during different time windows, and we couple centrality matrices for the layers into a supra-centrality matrix of size NT × NT whose dominant eigenvector gives the centrality of each node i at each time t. We refer to this eigenvector and its components as a joint centrality, as it reflects the importances of both the node i and the time layer t. We also introduce the concepts of marginal and conditional centralities, which facilitate the study of centrality trajectories over time. We find that the strength of coupling between layers is important for determining multiscale properties of centrality, such as localization phenomena and the time scale of centrality changes. In the strong-coupling regime, we derive expressions for time-averaged centralities, which are given by the zeroth-order terms of a singular perturbation expansion. We also study first-order terms to obtain first-order-mover scores, which concisely describe the magnitude of nodes' centrality changes over time. As examples, we apply our method to three empirical temporal networks: the United States Ph.D. exchange in mathematics, costarring relationships among top-billed actors during the Golden Age of Hollywood, and citations of decisions from the United States Supreme Court. Comment: 38 pages, 7 figures, and 5 tables
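    The construction described in this abstract can be sketched in a few lines. The sketch assumes the simplest coupling, in which each node is linked to its own copy in the adjacent time layers with a weight omega. It also assumes that marginal and conditional centralities are obtained by summing and normalizing the joint centrality. The abstract fixes neither choice, and the names supra_centrality, marginals, and omega are hypothetical.

```python
def supra_centrality(layers, omega=1.0, iters=1000, tol=1e-12):
    """Joint centrality from an NT x NT supra-centrality matrix.

    layers: list of T centrality matrices (N x N nested lists, non-negative).
    omega:  assumed weight coupling node i in layer t to node i in layer t+1.
    Returns joint[t][i], normalized so that all entries sum to 1.
    """
    T, N = len(layers), len(layers[0])
    size = N * T
    M = [[0.0] * size for _ in range(size)]
    for t, C in enumerate(layers):
        for i in range(N):
            for j in range(N):
                M[t * N + i][t * N + j] = float(C[i][j])  # diagonal block
            if t + 1 < T:  # identity coupling between adjacent layers
                M[t * N + i][(t + 1) * N + i] = omega
                M[(t + 1) * N + i][t * N + i] = omega
    v = [1.0 / size] * size
    for _ in range(iters):
        # Power iteration on M + I: same eigenvectors as M, but the shift
        # avoids oscillation when the supra-graph happens to be periodic.
        w = [v[r] + sum(M[r][c] * v[c] for c in range(size)) for r in range(size)]
        total = sum(w)
        w = [x / total for x in w]
        converged = max(abs(a - b) for a, b in zip(v, w)) < tol
        v = w
        if converged:
            break
    return [v[t * N:(t + 1) * N] for t in range(T)]


def marginals(joint):
    """Assumed definitions: sums over time or nodes, and per-layer normalization."""
    node = [sum(row[i] for row in joint) for i in range(len(joint[0]))]
    layer = [sum(row) for row in joint]
    conditional = [[x / layer[t] for x in row] for t, row in enumerate(joint)]
    return node, layer, conditional
```

    For two identical star-graph layers, the hub receives the largest joint centrality in both layers and the two layers carry equal marginal weight, as symmetry suggests.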

    A novel approach to study realistic navigations on networks

    We consider navigation or search schemes on networks which are realistic in the sense that not all search chains can be completed. We show that the quantity $\mu = \rho/s_d$, where $s_d$ is the average dynamic shortest distance and $\rho$ the success rate of completion of a search, is a consistent measure for the quality of a search strategy. Taking the example of realistic searches on scale-free networks, we find that $\mu$ scales with the system size $N$ as $N^{-\delta}$, where $\delta$ decreases as the searching strategy is improved. This measure is also shown to be sensitive to the distinguishing characteristics of networks. In this new approach, a dynamic small world (DSW) effect is said to exist when $\delta \approx 0$. We show that such a DSW indeed exists in social networks in which the linking probability is dependent on social distances. Comment: Text revised, references added; accepted version in Journal of Statistical Mechanics
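    The quality measure defined in this abstract is simple enough to compute directly from a log of search attempts. The sketch below assumes that each attempt is recorded as a hop count, with None marking a chain that was not completed, and that s_d is averaged over completed chains only. Both the record format and the function name are hypothetical.

```python
def search_quality(path_lengths):
    """Return mu = rho / s_d for a list of search attempts.

    path_lengths: hop count of each attempt, or None if the search failed.
    rho is the fraction of completed searches; s_d is the mean hop count
    over completed searches (an assumption about how s_d is averaged).
    """
    completed = [d for d in path_lengths if d is not None]
    if not path_lengths or not completed:
        return 0.0  # no successful search: quality is taken to be zero
    rho = len(completed) / len(path_lengths)
    s_d = sum(completed) / len(completed)
    return rho / s_d
```

    With four attempts of which two succeed in 2 and 4 hops, rho = 0.5 and s_d = 3, so mu = 1/6. A strategy is rewarded both for completing more chains and for completing them in fewer steps.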

    Detecting Community Structure in Dynamic Social Networks Using the Concept of Leadership

    Detecting community structure in social networks is a fundamental problem, as it lets us identify groups of actors with similar interests. Extensive work has focused on finding communities in static networks; in reality, however, social networks are dynamic and evolve continuously. Ignoring this dynamic aspect prevents us from capturing the evolutionary behavior of the network and from predicting the future status of individuals. Aside from being dynamic, another significant characteristic of real-world social networks is the presence of leaders, i.e., nodes with high degree centrality that strongly attract other members and hence form a local community. In this paper, we devise an efficient method to incrementally detect communities in highly dynamic social networks using the intuitive idea of the importance and persistence of community leaders over time. Our proposed method is able to find new communities based on the previous structure of the network without recomputing them from scratch. This feature enables us to detect and track communities over time efficiently. Experimental results on synthetic and real-world social networks demonstrate that our method is both effective and efficient in discovering communities in dynamic social networks.

    The developmental dynamics of terrorist organizations

    We identify robust statistical patterns in the frequency and severity of violent attacks by terrorist organizations as they grow and age. Using group-level static and dynamic analyses of terrorist events worldwide from 1968-2008 and a simulation model of organizational dynamics, we show that the production of violent events tends to accelerate with increasing size and experience. This coupling of frequency, experience and size arises from a fundamental positive feedback loop in which attacks lead to growth which leads to increased production of new attacks. In contrast, event severity is independent of both size and experience. Thus larger, more experienced organizations are more deadly because they attack more frequently, not because their attacks are more deadly, and large events are equally likely to come from large and small organizations. These results hold across political ideologies and time, suggesting that the frequency and severity of terrorism may be constrained by fundamental processes. Comment: 28 pages, 8 figures, 4 tables, supplementary material

    Semi-Supervised Overlapping Community Finding based on Label Propagation with Pairwise Constraints

    Algorithms for detecting communities in complex networks are generally unsupervised, relying solely on the structure of the network. However, these methods can often fail to uncover meaningful groupings that reflect the underlying communities in the data, particularly when those structures are highly overlapping. One way to improve the usefulness of these algorithms is by incorporating additional background information, which can be used as a source of constraints to direct the community detection process. In this work, we explore the potential of semi-supervised strategies to improve algorithms for finding overlapping communities in networks. Specifically, we propose a new method, based on label propagation, for finding communities using a limited number of pairwise constraints. Evaluations on synthetic and real-world datasets demonstrate the potential of this approach for uncovering meaningful community structures in cases where each node can potentially belong to more than one community. Comment: Fix table

    Consensus clustering in complex networks

    The community structure of complex networks reveals both their organization and hidden relationships among their constituents. Most community detection methods currently available are not deterministic, and their results typically depend on the specific random seeds, initial conditions and tie-break rules adopted for their execution. Consensus clustering is used in data analysis to generate stable results out of a set of partitions delivered by stochastic methods. Here we show that consensus clustering can be combined with any existing method in a self-consistent way, enhancing considerably both the stability and the accuracy of the resulting partitions. This framework is also particularly suitable to monitor the evolution of community structure in temporal networks. An application of consensus clustering to a large citation network of physics papers demonstrates its capability to keep track of the birth, death and diversification of topics. Comment: 11 pages, 12 figures. Published in Scientific Reports
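    A common ingredient of consensus clustering is a co-occurrence (consensus) matrix that records how often two nodes end up in the same community across repeated runs of a stochastic method. The sketch below builds such a matrix from a list of partitions. It is only meant to illustrate the idea of aggregating partitions; the abstract does not spell out the authors' procedure, and the function name and input format are assumptions.

```python
from itertools import combinations


def consensus_matrix(partitions, n_nodes):
    """Fraction of partitions in which each pair of nodes shares a community.

    partitions: list of label lists; labels[i] is the community of node i
                in one run of a (stochastic) community detection method.
    Returns an n_nodes x n_nodes symmetric matrix with a zero diagonal.
    """
    if not partitions:
        raise ValueError("at least one partition is required")
    counts = [[0.0] * n_nodes for _ in range(n_nodes)]
    for labels in partitions:
        for i, j in combinations(range(n_nodes), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1.0
                counts[j][i] += 1.0
    runs = len(partitions)
    return [[c / runs for c in row] for row in counts]
```

    Entries close to 1 mark pairs that are grouped together in nearly every run, while entries close to 0 mark pairs that almost never are. A stable partition can then be sought on this weighted matrix.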

    Hawkes process as a model of social interactions: a view on video dynamics

    We study by computer simulation the "Hawkes process" that was proposed in a recent paper by Crane and Sornette (Proc. Nat. Acad. Sci. USA 105, 15649 (2008)) as a plausible model for the dynamics of YouTube video viewing numbers. We test the claims made there that robust identification is possible for classes of dynamic response following activity bursts. Our simulated timeseries for the Hawkes process indeed fall into the different categories predicted by Crane and Sornette. However, the Hawkes process gives a much narrower spread of decay exponents than the YouTube data, suggesting limits to the universality of the Hawkes-based analysis. Comment: Added errors to parameter estimates and further description. IOP style, 13 pages, 5 figures
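    For readers unfamiliar with self-exciting processes, the following minimal sketch simulates a Hawkes process by thinning (Ogata's method). It assumes an exponential memory kernel, so the intensity is lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)). The kernel, the parameter names, and the function name are illustrative assumptions; the abstract does not state which kernel or simulation scheme the authors used.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) by thinning.

    mu:    background rate (> 0)
    alpha: jump in intensity caused by each event
    beta:  decay rate of the assumed exponential kernel
    The process is stationary when alpha / beta < 1.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # self-excited part of the intensity at time t
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for proposing the next candidate.
        lam_bar = mu + excitation
        wait = rng.expovariate(lam_bar)
        if t + wait >= horizon:
            break
        t += wait
        excitation *= math.exp(-beta * wait)  # decay over the waiting time
        lam = mu + excitation                 # true intensity at the candidate
        if rng.random() * lam_bar <= lam:     # accept with probability lam / lam_bar
            events.append(t)
            excitation += alpha               # each event raises the intensity
    return events
```

    Setting alpha to zero reduces this to a homogeneous Poisson process with rate mu, while values of alpha / beta close to 1 produce long bursts of activity.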