    Characterizing the community structure of complex networks

    Community structure is one of the key properties of complex networks and plays a crucial role in their topology and function. While an impressive amount of work has been done on the issue of community detection, very little attention has so far been devoted to the investigation of communities in real networks. We present a systematic empirical analysis of the statistical properties of communities in large information, communication, technological, biological, and social networks. We find that the mesoscopic organization of networks of the same category is remarkably similar. This is reflected in several characteristics of community structure, which can be used as "fingerprints" of specific network categories. While community size distributions are always broad, certain categories of networks consist mainly of tree-like communities, while others have denser modules. Average path lengths within communities initially grow logarithmically with community size, but the growth saturates or slows down for communities larger than a characteristic size. This behaviour is related to the presence of hubs within communities, whose roles differ across categories. Also, the community embeddedness of nodes, measured in terms of the fraction of links within their communities, has a characteristic distribution for each category. Our findings are verified by the use of two fundamentally different community detection methods. Comment: 15 pages, 20 figures, 4 tables
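
    The embeddedness measure mentioned in the abstract above (the fraction of a node's links that fall within its own community) can be illustrated with a minimal sketch. The function name `embeddedness`, the adjacency-dictionary input and the toy graph are assumptions made for illustration; they are not taken from the paper, and the community labels are supplied by hand rather than produced by a detection method.

```python
from typing import Dict, Hashable, Set


def embeddedness(
    adj: Dict[Hashable, Set[Hashable]],
    community: Dict[Hashable, int],
) -> Dict[Hashable, float]:
    """Fraction of each node's links that stay inside its own community.

    `adj` maps a node to the set of its neighbours (undirected graph);
    `community` maps a node to a community label. Both are hypothetical
    inputs chosen for this sketch.
    """
    result = {}
    for node, neighbours in adj.items():
        if not neighbours:
            continue  # the ratio is undefined for isolated nodes
        internal = sum(1 for v in neighbours if community[v] == community[node])
        result[node] = internal / len(neighbours)
    return result


# Toy graph: two triangles (a, b, c) and (d, e, f) joined by the edge c-d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
scores = embeddedness(adj, community)
```

    In this toy graph the nodes `c` and `d` each have one of their three links leaving the community, so their score is 2/3, while every other node scores 1.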

    Spin models on random graphs

    In the past decades complex networks and their behavior have attracted much attention. In the real world many such networks can be found, for instance social, information, technological and biological networks. An interesting property that many of them share is that they are scale free. Such networks have many nodes with a moderate number of links, but also a significant number of nodes with a very high number of links. The latter nodes are called hubs and play an important role in the behavior of the network. To model scale-free networks, we use power-law random graphs. This means that their degree sequences obey a power law, i.e., the fraction of vertices that have $k$ neighbors is proportional to $k^{-\tau}$ for some $\tau > 1$. Not only is the structure of these networks interesting; the behavior of processes living on them is also a fascinating subject. Examples of such processes are opinion formation, the spread of information and the spread of viruses. It is especially interesting if these processes undergo a so-called phase transition, i.e., a minor change in the circumstances suddenly results in completely different behavior. Hubs in scale-free networks again have a large influence on processes living on them. The relation between the structure of the network and processes living on the network is the main topic of this thesis. We focus on spin models, i.e., Ising and Potts models. In physics, these are traditionally used as simple models to study magnetism. When studied on a random graph, the spins can, for example, be considered as opinions. In that case the ferromagnetic or antiferromagnetic interactions can be seen as the tendency of two connected persons in a social network to agree or disagree, respectively. In this thesis we study two models: the ferromagnetic Ising model on power-law random graphs and the antiferromagnetic Potts model on the Erdős-Rényi random graph.
For the first model we derive an explicit formula for the thermodynamic limit of the pressure, generalizing a result of Dembo and Montanari to random graphs with power-law exponent $\tau > 2$, for which the variance of degrees is potentially infinite. We furthermore identify the thermodynamic limit of the magnetization, internal energy and susceptibility. For this same model, we also study the phase transition. We identify the critical temperature and compute the critical exponents of the magnetization and susceptibility. These exponents are universal in the sense that they only depend on the power-law exponent and not on any other detail of the degree distribution. The proofs rely on the locally tree-like structure of the random graph. This means that the local neighborhood of a randomly chosen vertex behaves like a branching process. Correlation inequalities are used to show that it suffices to study the behavior of the Ising model on these branching processes to obtain the results for the random graph. To compute the critical temperature and critical exponents we derive upper and lower bounds on the magnetization and susceptibility. These bounds are essentially Taylor approximations, but for power-law exponents $\tau \le 5$ a more detailed analysis is necessary. We also study the case where the power-law exponent $\tau \in (1, 2)$, for which the mean degree is infinite and the graph is no longer locally tree-like. We can, however, still say something about the magnetization of this model. For the antiferromagnetic Potts model we use an interpolation scheme to show that the thermodynamic limit exists. For this model the correlation inequalities do not hold, thus making it more difficult to study. We derive an extended variational principle and use it to give upper bounds on the pressure. Furthermore, we use a constrained second-moment method to show that the high-temperature solution is correct for high enough temperature.
We also show that this solution cannot be correct for low temperatures by showing that the entropy would become negative if it were correct, thus identifying a phase transition.
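
    To make the notion of magnetization in the abstract above concrete, here is a minimal sketch that computes the exact average magnetization of a ferromagnetic Ising model on a very small graph by enumerating all spin configurations. It assumes the standard convention of spins in {-1, +1} with Boltzmann weight exp(beta * sum over edges of s_i * s_j + field * sum of s_i); the function name and parameters are hypothetical. This brute-force enumeration is only an illustration and is not the technique of the thesis, which works in the thermodynamic limit via the locally tree-like structure of the graph.

```python
import itertools
import math
from typing import List, Tuple


def ising_magnetization(
    n: int, edges: List[Tuple[int, int]], beta: float, field: float
) -> float:
    """Exact average magnetization per spin, by enumerating all 2**n states.

    Assumed weight of a configuration s (standard ferromagnetic convention):
        exp(beta * sum_{(i, j) in edges} s_i * s_j + field * sum_i s_i)
    Only feasible for small n; the cost grows as 2**n.
    """
    partition_sum = 0.0
    weighted_magnetization = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        interaction = beta * sum(spins[i] * spins[j] for i, j in edges)
        external = field * sum(spins)
        weight = math.exp(interaction + external)
        partition_sum += weight
        weighted_magnetization += weight * sum(spins) / n
    return weighted_magnetization / partition_sum


# A single isolated spin reduces to tanh(field).
m_single = ising_magnetization(1, [], beta=0.5, field=0.3)

# Three spins on a triangle with the same field.
m_triangle = ising_magnetization(3, [(0, 1), (1, 2), (0, 2)], beta=0.5, field=0.3)
```

    With a positive field, the ferromagnetic couplings on the triangle push the magnetization above the single-spin value tanh(field), and with zero field the magnetization vanishes by the spin-flip symmetry.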

    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, and this offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains. Comment: Knowledge-based Systems

    Detecting change points in the large-scale structure of evolving networks

    Interactions among people or objects are often dynamic in nature and can be represented as a sequence of networks, each providing a snapshot of the interactions over a brief period of time. An important task in analyzing such evolving networks is change-point detection, in which we both identify the times at which the large-scale pattern of interactions changes fundamentally and quantify how large and what kind of change occurred. Here, we formalize for the first time the network change-point detection problem within an online probabilistic learning framework and introduce a method that can reliably solve it. This method combines a generalized hierarchical random graph model with a Bayesian hypothesis test to quantitatively determine if, when, and precisely how a change point has occurred. We analyze the detectability of our method using synthetic data with known change points of different types and magnitudes, and show that this method is more accurate than several previously used alternatives. Applied to two high-resolution evolving social networks, this method identifies a sequence of change points that align with known external "shocks" to these networks.
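
    The general idea of locating a change point in a sequence of network snapshots can be sketched with a much simpler stand-in than the method in the abstract above. The sketch below does not use a generalized hierarchical random graph or a Bayesian hypothesis test. Instead it assumes each snapshot is a uniform random graph with one edge density, and it finds the split that maximises a plain log-likelihood ratio between "one density throughout" and "one density before the split, another after". All names and the toy edge counts are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def _loglik(edges: int, slots: int) -> float:
    """Maximised Bernoulli log-likelihood for `edges` present among `slots` pairs."""
    if edges == 0 or edges == slots:
        return 0.0  # estimated density is 0 or 1, so the likelihood is 1
    p = edges / slots
    return edges * math.log(p) + (slots - edges) * math.log(1 - p)


def best_split(
    edge_counts: List[int], pairs_per_snapshot: int
) -> Tuple[Optional[int], float]:
    """Return (index, log-likelihood ratio) of the most likely density change.

    The returned index t means snapshots [0, t) share one edge density and
    snapshots [t, n) share another. Index None means no split improved on
    the single-density model.
    """
    n = len(edge_counts)
    total = sum(edge_counts)
    null = _loglik(total, n * pairs_per_snapshot)
    best: Tuple[Optional[int], float] = (None, 0.0)
    left = 0
    for t in range(1, n):
        left += edge_counts[t - 1]
        alternative = _loglik(left, t * pairs_per_snapshot) + _loglik(
            total - left, (n - t) * pairs_per_snapshot
        )
        ratio = alternative - null
        if ratio > best[1]:
            best = (t, ratio)
    return best


# Toy data: 8 snapshots of a 10-node network (45 possible pairs each),
# with the edge count jumping after the fourth snapshot.
counts = [10, 11, 9, 10, 30, 31, 29, 30]
split, llr = best_split(counts, pairs_per_snapshot=45)
```

    A real detector would still need a decision rule for whether the best split is significant, which is the role the hypothesis test plays in the abstract; this sketch only ranks candidate split points.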