Characterizing the community structure of complex networks
Community structure is one of the key properties of complex networks and
plays a crucial role in their topology and function. While an impressive amount
of work has been done on the issue of community detection, very little
attention has been so far devoted to the investigation of communities in real
networks. We present a systematic empirical analysis of the statistical
properties of communities in large information, communication, technological,
biological, and social networks. We find that the mesoscopic organization of
networks of the same category is remarkably similar. This is reflected in
several characteristics of community structure, which can be used as
"fingerprints" of specific network categories. While community size
distributions are always broad, certain categories of networks consist mainly
of tree-like communities, while others have denser modules. Average path
lengths within communities initially grow logarithmically with community size,
but the growth saturates or slows down for communities larger than a
characteristic size. This behaviour is related to the presence of hubs within
communities, whose roles differ across categories. Also the community
embeddedness of nodes, measured in terms of the fraction of links within their
communities, has a characteristic distribution for each category. Our findings
are verified by the use of two fundamentally different community detection
methods.
Comment: 15 pages, 20 figures, 4 tables
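The embeddedness measure described in this abstract (the fraction of a node's links that fall within its own community) can be illustrated with a minimal sketch. The function name `embeddedness`, the toy edge list, and the community labels are all hypothetical and not taken from the paper, which relies on its own detection methods and real network data.

```python
from collections import defaultdict

def embeddedness(edges, community):
    """Fraction of each node's links that stay inside its own community."""
    internal = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[u] += 1
            internal[v] += 1
    return {node: internal[node] / degree[node] for node in degree}

# Hypothetical toy graph: two triangles joined by one bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
scores = embeddedness(edges, community)
```

Nodes touching only their own triangle score 1, while the two bridge endpoints score lower because one of their three links leaves the community.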
Spin models on random graphs
In the past decades complex networks and their behavior have attracted much attention. In the real world many such networks can be found, for instance as social, information, technological and biological networks. An interesting property that many of them share is that they are scale free. Such networks have many nodes with a moderate number of links, but also a significant number of nodes with a very high number of links. The latter type of nodes are called hubs and play an important role in the behavior of the network. To model scale-free networks, we use power-law random graphs. This means that their degree sequences obey a power law, i.e., the fraction of vertices that have $k$ neighbors is proportional to $k^{-\tau}$ for some $\tau > 1$. Not only is the structure of these networks interesting, but the behavior of processes living on these networks is also a fascinating subject. Processes one can think of are opinion formation, the spread of information and the spread of viruses. It is especially interesting if these processes undergo a so-called phase transition, i.e., a minor change in the circumstances suddenly results in completely different behavior. Hubs in scale-free networks again have a large influence on processes living on them. The relation between the structure of the network and processes living on the network is the main topic of this thesis. We focus on spin models, i.e., Ising and Potts models. In physics, these are traditionally used as simple models to study magnetism. When studied on a random graph, the spins can, for example, be considered as opinions. In that case the ferromagnetic or antiferromagnetic interactions can be seen as the tendency of two connected persons in a social network to agree or disagree, respectively. In this thesis we study two models: the ferromagnetic Ising model on power-law random graphs and the antiferromagnetic Potts model on the Erdős-Rényi random graph.
For the first model we derive an explicit formula for the thermodynamic limit of the pressure, generalizing a result of Dembo and Montanari to random graphs with power-law exponent $\tau > 2$, for which the variance of degrees is potentially infinite. We furthermore identify the thermodynamic limit of the magnetization, internal energy and susceptibility. For this same model, we also study the phase transition. We identify the critical temperature and compute the critical exponents of the magnetization and susceptibility. These exponents are universal in the sense that they only depend on the power-law exponent and not on any other detail of the degree distribution. The proofs rely on the locally tree-like structure of the random graph. This means that the local neighborhood of a randomly chosen vertex behaves like a branching process. Correlation inequalities are used to show that it suffices to study the behavior of the Ising model on these branching processes to obtain the results for the random graph. To compute the critical temperature and critical exponents we derive upper and lower bounds on the magnetization and susceptibility. These bounds are essentially Taylor approximations, but for power-law exponents $\tau \leq 5$ a more detailed analysis is necessary. We also study the case where the power-law exponent $\tau \in (1, 2)$, for which the mean degree is infinite and the graph is no longer locally tree-like. We can, however, still say something about the magnetization of this model. For the antiferromagnetic Potts model we use an interpolation scheme to show that the thermodynamic limit exists. For this model the correlation inequalities do not hold, which makes it more difficult to study. We derive an extended variational principle and use it to give upper bounds on the pressure. Furthermore, we use a constrained second-moment method to show that the high-temperature solution is correct for high enough temperature.
We also show that this solution cannot be correct for low temperatures by showing that the entropy would become negative if it were correct, thus identifying a phase transition.
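To make the magnetization of the ferromagnetic Ising model concrete, here is a minimal brute-force sketch on a tiny graph. It is not the thesis's method, which works in the thermodynamic limit on random graphs. The Hamiltonian convention (unit couplings, external field `field`, inverse temperature `beta`), the function name, and the star-shaped toy graph with one small "hub" are all assumptions made for illustration.

```python
import itertools
import math

def ising_magnetization(n, edges, beta, field):
    """Exact mean magnetization per spin, enumerating all 2**n configurations.

    Assumed energy: H(s) = -sum_{(u,v) in edges} s_u * s_v - field * sum_u s_u,
    with Boltzmann weight exp(-beta * H(s)).
    """
    partition = 0.0
    weighted_m = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        energy = -sum(spins[u] * spins[v] for u, v in edges) - field * sum(spins)
        weight = math.exp(-beta * energy)
        partition += weight
        weighted_m += weight * sum(spins) / n
    return weighted_m / partition

# Hypothetical toy graph: a star whose center (node 0) acts as a small hub.
star = [(0, 1), (0, 2), (0, 3)]
m_field = ising_magnetization(4, star, beta=0.5, field=0.1)
m_free = math.tanh(0.5 * 0.1)  # magnetization of a single uncoupled spin
```

With a positive field, the ferromagnetic couplings push `m_field` above the uncoupled value `m_free`. Enumeration costs 2^n steps, so this only works for very small graphs.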
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather large amounts of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users, which
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential for cross-fertilization, i.e., the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain in other domains.
Comment: Knowledge-based System
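As a small illustration of what extracting structured data from a Web page can look like, the sketch below turns an HTML table into a list of records using only Python's standard `html.parser` module. It is not one of the techniques the survey classifies. The class name `TableExtractor` and the sample page are hypothetical, and real pages usually need far more robust handling.

```python
from html.parser import HTMLParser

class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Hypothetical sample page with a header row and one data row.
page = (
    "<table><tr><th>Product</th><th>Price</th></tr>"
    "<tr><td>Widget</td><td>9.99</td></tr></table>"
)
parser = TableExtractor()
parser.feed(page)
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]
```

Each data row becomes a dictionary keyed by the header cells, which is the kind of structured output a downstream analysis step could consume.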
Detecting change points in the large-scale structure of evolving networks
Interactions among people or objects are often dynamic in nature and can be
represented as a sequence of networks, each providing a snapshot of the
interactions over a brief period of time. An important task in analyzing such
evolving networks is change-point detection, in which we both identify the
times at which the large-scale pattern of interactions changes fundamentally
and quantify how large and what kind of change occurred. Here, we formalize for
the first time the network change-point detection problem within an online
probabilistic learning framework and introduce a method that can reliably solve
it. This method combines a generalized hierarchical random graph model with a
Bayesian hypothesis test to quantitatively determine if, when, and precisely
how a change point has occurred. We analyze the detectability of our method
using synthetic data with known change points of different types and
magnitudes, and show that this method is more accurate than several previously
used alternatives. Applied to two high-resolution evolving social networks,
this method identifies a sequence of change points that align with known
external "shocks" to these networks.
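The idea of testing whether a sequence of network snapshots contains a change point can be sketched with a much simpler stand-in than the paper's method. Instead of a generalized hierarchical random graph with a Bayesian hypothesis test, the sketch below assumes each snapshot is summarized only by its edge count, models edges as independent Bernoulli trials, and compares a no-change model against a single-split model with a maximum-likelihood ratio. All names and the sample counts are hypothetical.

```python
import math

def bernoulli_loglik(edges, pairs):
    """Maximized log-likelihood of `edges` successes among `pairs` Bernoulli trials."""
    if edges == 0 or edges == pairs:
        return 0.0
    p = edges / pairs
    return edges * math.log(p) + (pairs - edges) * math.log(1 - p)

def best_change_point(edge_counts, pairs_per_snapshot):
    """Return (t, log-likelihood ratio) for the best split into [0, t) and [t, end).

    Returns (None, 0.0) when no split improves on the no-change model.
    """
    n = len(edge_counts)
    no_change = bernoulli_loglik(sum(edge_counts), pairs_per_snapshot * n)
    best = (None, 0.0)
    for t in range(1, n):
        left, right = edge_counts[:t], edge_counts[t:]
        split = (bernoulli_loglik(sum(left), pairs_per_snapshot * len(left))
                 + bernoulli_loglik(sum(right), pairs_per_snapshot * len(right)))
        ratio = split - no_change
        if ratio > best[1]:
            best = (t, ratio)
    return best

# Hypothetical edge counts per snapshot; density jumps at snapshot index 4.
edge_counts = [10, 11, 9, 10, 30, 29, 31, 30]
change_at, ratio = best_change_point(edge_counts, pairs_per_snapshot=100)
```

A real detector would also need a threshold or a posterior on `ratio` to decide whether a change occurred at all, and a richer model to say what kind of structural change it was.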