Detecting Strong Ties Using Network Motifs
Detecting strong ties among users in social and information networks is a
fundamental operation that can improve performance on a multitude of
personalization and ranking tasks. Strong-tie edges are often readily obtained
from the social network as users often participate in multiple overlapping
networks via features such as following and messaging. These networks may vary
greatly in size, density and the information they carry. This setting leads to
a natural strong tie detection task: given a small set of labeled strong tie
edges, how well can one detect unlabeled strong ties in the remainder of the
network?
This task becomes particularly daunting for the Twitter network due to scant
availability of pairwise relationship attribute data, and sparsity of strong
tie networks such as phone contacts. Given these challenges, a natural approach
is to instead use structural network features for the task, produced by
combining the strong and "weak" edges. In this work, we demonstrate via
experiments on Twitter data that using only such structural network features is
sufficient for detecting strong ties with high precision. These structural
network features are obtained from the presence and frequency of small network
motifs on combined strong and weak ties. We observe that using motifs larger
than triads alleviates the sparsity problems that arise for smaller motifs, both
because of the increased combinatorial possibilities and because of the strong
benefit of searching beyond the ego network. Empirically, we observe that not
all motifs are equally useful, and that they need to be carefully constructed
from the combined edges in order to be effective for strong tie detection.
Finally, we reinforce our experimental findings with a theoretical justification that
suggests why incorporating these larger sized motifs as features could lead to
increased performance in planted graph models.
Comment: To appear in Proceedings of WWW 2017 (Web-science track)
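As a rough illustration of what motif-based structural features on combined strong and weak edges might look like, the sketch below counts two kinds of motifs around a candidate pair: labeled triads (a shared neighbor, keyed by the labels of the two edges that reach it) and unlabeled 4-node paths, which reach beyond the immediate neighborhood. The function names, the feature keys, and the choice of these two motifs are assumptions made for illustration; the abstract does not specify the motifs the authors use.

```python
from collections import Counter


def build_adjacency(strong_edges, weak_edges):
    """Merge labeled undirected edges into a map: node -> {neighbor: label}.

    Assumption: if a pair appears in both lists, the "strong" label wins.
    """
    adj = {}
    for label, edges in (("weak", weak_edges), ("strong", strong_edges)):
        for u, v in edges:
            adj.setdefault(u, {})[v] = label
            adj.setdefault(v, {})[u] = label
    return adj


def motif_features(adj, u, v):
    """Count small motifs that contain the candidate pair (u, v)."""
    feats = Counter()
    nu, nv = adj.get(u, {}), adj.get(v, {})

    # Triads: a common neighbor w, keyed by the labels of edges u-w and v-w.
    for w in nu.keys() & nv.keys():
        key = "triad:" + "-".join(sorted((nu[w], nv[w])))
        feats[key] += 1

    # 4-node paths u - a - b - v with a and b distinct and outside {u, v}.
    for a in nu:
        if a == v:
            continue
        for b in adj[a]:
            if b in (u, v) or b == a:
                continue
            if b in nv:
                feats["path4"] += 1
    return feats


# Toy example (hypothetical data).
strong = [("a", "b"), ("b", "c")]
weak = [("a", "c"), ("a", "d"), ("d", "e"), ("e", "c")]
adjacency = build_adjacency(strong, weak)
features = motif_features(adjacency, "a", "c")
```

Feature vectors of this kind could then be fed to any standard classifier trained on the small set of labeled strong-tie edges.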
Enhanced reconstruction of weighted networks from strengths and degrees
Network topology plays a key role in many phenomena, from the spreading of
diseases to that of financial crises. Whenever the whole structure of a network
is unknown, one must resort to reconstruction methods that identify the least
biased ensemble of networks consistent with the partial information available.
A challenging case, frequently encountered due to privacy issues in the
analysis of interbank flows and Big Data, is when there is only local
(node-specific) aggregate information available. For binary networks, the
relevant ensemble is one where the degree (number of links) of each node is
constrained to its observed value. However, for weighted networks the problem
is much more complicated. While the naive approach prescribes constraining the
strengths (total link weights) of all nodes, recent counter-intuitive results
suggest that in weighted networks the degrees are often more informative than
the strengths. This implies that the reconstruction of weighted networks would
be significantly enhanced by the specification of both strengths and degrees, a
computationally hard and bias-prone procedure. Here we solve this problem by
introducing an analytical and unbiased maximum-entropy method that works in the
shortest possible time and does not require the explicit generation of
reconstructed samples. We consider several real-world examples and show that,
while the strengths alone give poor results, the additional knowledge of the
degrees yields accurately reconstructed networks. Information-theoretic
criteria rigorously confirm that the degree sequence, as soon as it is
non-trivial, is irreducible to the strength sequence. Our results have strong
implications for the analysis of motifs and communities and whenever the
reconstructed ensemble is required as a null model to detect higher-order
patterns.
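For the simpler binary case mentioned in the abstract, where only the degree of each node is constrained, the maximum-entropy ensemble has the well-known form p_ij = x_i x_j / (1 + x_i x_j), with one hidden variable x_i per node chosen so that the expected degrees match the observed ones. The sketch below fits these variables by a plain fixed-point iteration. It illustrates the degree-only binary model, not the strengths-and-degrees method the abstract introduces; the function names and the solver are assumptions, and the iteration is not guaranteed to converge for every degree sequence.

```python
def fit_binary_configuration_model(degrees, iterations=1000, tol=1e-12):
    """Fit x_i such that sum_j x_i x_j / (1 + x_i x_j) is close to degrees[i].

    Uses the fixed-point update x_i <- k_i / sum_{j != i} x_j / (1 + x_i x_j).
    """
    n = len(degrees)
    total = sum(degrees)
    if total == 0:
        return [0.0] * n
    # Common starting guess: x_i = k_i / sqrt(sum of degrees).
    x = [k / total ** 0.5 for k in degrees]
    for _ in range(iterations):
        new_x = []
        for i in range(n):
            denom = sum(x[j] / (1.0 + x[i] * x[j]) for j in range(n) if j != i)
            new_x.append(degrees[i] / denom if denom > 0 else 0.0)
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:
            break
    return x


def expected_degrees(x):
    """Expected degree of each node under p_ij = x_i x_j / (1 + x_i x_j)."""
    n = len(x)
    return [
        sum(x[i] * x[j] / (1.0 + x[i] * x[j]) for j in range(n) if j != i)
        for i in range(n)
    ]


# Toy example (hypothetical data): four nodes, each with degree 2.
hidden = fit_binary_configuration_model([2, 2, 2, 2])
reconstructed = expected_degrees(hidden)
```

Because the link probabilities are available in closed form once the x_i are known, expected network properties can be computed directly, without generating sample networks.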
Link-Prediction Enhanced Consensus Clustering for Complex Networks
Many real networks that are inferred or collected from data are incomplete
due to missing edges. Missing edges can be inherent to the dataset (Facebook
friend links will never be complete) or the result of sampling (one may only
have access to a portion of the data). The consequence is that downstream
analyses that consume the network will often yield less accurate results than
if the edges were complete. Community detection algorithms, in particular,
often suffer when critical intra-community edges are missing. We propose a
novel consensus clustering algorithm to enhance community detection on
incomplete networks. Our framework utilizes existing community detection
algorithms that process networks imputed by our link prediction based
algorithm. The framework then merges their multiple outputs into a final
consensus output. On average our method boosts performance of existing
algorithms by 7% on artificial data and 17% on ego networks collected from
Facebook.
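The two ends of such a pipeline can be sketched as follows: impute likely missing edges with a link predictor, then merge several partitions into a consensus. This is a minimal sketch under stated assumptions. The common-neighbor score, the co-association majority rule, and all function names are illustrative choices, not the procedure from the paper, and the community detection step in the middle is left to whatever existing algorithm the caller prefers.

```python
from itertools import combinations


def impute_edges(edges, top_k):
    """Return edges plus the top_k non-adjacent pairs by common-neighbor count."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    scores = []
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            common = len(adj[u] & adj[v])
            if common > 0:
                scores.append((common, u, v))
    # Highest score first; ties broken by node order for determinism.
    scores.sort(key=lambda s: (-s[0], s[1], s[2]))
    return list(edges) + [(u, v) for _, u, v in scores[:top_k]]


def consensus_partition(partitions, threshold=0.5):
    """Merge partitions (dicts node -> label) by co-association.

    Two nodes are joined when they share a community in more than
    `threshold` of the input partitions; groups are the resulting
    connected components, found with a small union-find.
    """
    nodes = sorted(partitions[0])
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in combinations(nodes, 2):
        agree = sum(1 for p in partitions if p[u] == p[v])
        if agree / len(partitions) > threshold:
            parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Toy example (hypothetical data).
imputed = impute_edges([(1, 2), (2, 3), (1, 4), (4, 3)], top_k=1)
merged = consensus_partition([
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 2, "d": 2},
])
```

In a full pipeline, each imputed variant of the network would be passed to one or more community detection algorithms, and their outputs would be the `partitions` argument.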