98 research outputs found

    Estimating Infection Sources in Networks Using Partial Timestamps

    Full text link
    We study the problem of identifying infection sources in a network based on the network topology, and a subset of infection timestamps. In the case of a single infection source in a tree network, we derive the maximum likelihood estimator of the source and the unknown diffusion parameters. We then introduce a new heuristic involving an optimization over a parametrized family of Gromov matrices to develop a single source estimation algorithm for general graphs. Compared with the breadth-first search tree heuristic commonly adopted in the literature, simulations demonstrate that our approach achieves better estimation accuracy than several other benchmark algorithms, even though these require more information like the diffusion parameters. We next develop a multiple sources estimation algorithm for general graphs, which first partitions the graph into source candidate clusters, and then applies our single source estimation algorithm to each cluster. We show that if the graph is a tree, then each source candidate cluster contains at least one source. Simulations using synthetic and real networks, and experiments using real-world data suggest that our proposed algorithms are able to estimate the true infection source(s) to within a small number of hops with a small portion of the infection timestamps being observed.Comment: 15 pages, 15 figures, accepted by IEEE Transactions on Information Forensics and Securit

    It’s Always April Fools’ Day! On the Difficulty of Social Network Misinformation Classification via Propagation Features

    Get PDF
    Given the huge impact that Online Social Networks (OSN) had in the way people get informed and form their opinion, they became an attractive playground for malicious entities that want to spread misinformation, and leverage their effect. In fact, misinformation easily spreads on OSN and is a huge threat for modern society, possibly influencing also the outcome of elections, or even putting people’s life at risk (e.g., spreading “anti-vaccines” misinformation). Therefore, it is of paramount importance for our society to have some sort of “validation” on information spreading through OSN. The need for a wide-scale validation would greatly benefit from automatic tools. In this paper, we show that it is difficult to carry out an automatic classification of misinformation considering only structural properties of content propagation cascades. We focus on structural properties, because they would be inherently dif- ficult to be manipulated, with the the aim of circumventing classification systems. To support our claim, we carry out an extensive evaluation on Facebook posts belonging to conspiracy theories (as representative of misinformation), and scientific news (representative of fact-checked content). Our findings show that conspiracy content actually reverberates in a way which is hard to distinguish from the one scientific content does: for the classification mechanisms we investigated, classification F1-score never exceeds 0.65 during content propagation stages, and is still less than 0.7 even after propagation is complete

    Reconstructing Graph Diffusion History from a Single Snapshot

    Full text link
    Diffusion on graphs is ubiquitous with numerous high-impact applications. In these applications, complete diffusion histories play an essential role in terms of identifying dynamical patterns, reflecting on precaution actions, and forecasting intervention effects. Despite their importance, complete diffusion histories are rarely available and are highly challenging to reconstruct due to ill-posedness, explosive search space, and scarcity of training data. To date, few methods exist for diffusion history reconstruction. They are exclusively based on the maximum likelihood estimation (MLE) formulation and require to know true diffusion parameters. In this paper, we study an even harder problem, namely reconstructing Diffusion history from A single SnapsHot} (DASH), where we seek to reconstruct the history from only the final snapshot without knowing true diffusion parameters. We start with theoretical analyses that reveal a fundamental limitation of the MLE formulation. We prove: (a) estimation error of diffusion parameters is unavoidable due to NP-hardness of diffusion parameter estimation, and (b) the MLE formulation is sensitive to estimation error of diffusion parameters. To overcome the inherent limitation of the MLE formulation, we propose a novel barycenter formulation: finding the barycenter of the posterior distribution of histories, which is provably stable against the estimation error of diffusion parameters. We further develop an effective solver named DIffusion hiTting Times with Optimal proposal (DITTO) by reducing the problem to estimating posterior expected hitting times via the Metropolis--Hastings Markov chain Monte Carlo method (M--H MCMC) and employing an unsupervised graph neural network to learn an optimal proposal to accelerate the convergence of M--H MCMC. We conduct extensive experiments to demonstrate the efficacy of the proposed method.Comment: Full version of the KDD 2023 paper. Our code is available at https://github.com/q-rz/KDD23-DITT

    Rumour source detection in social networks using partial observations

    Get PDF
    The spread of information on graphs has been extensively studied in engineering, biology, and economics. Re- cently, however, several authors have started to address the more challenging inverse problem, of localizing the origin of an epidemic, given observed traces of infection. In this paper, we introduce a novel technique to estimate the location of a source of multiple epidemics on a general graph, assuming knowledge of the start times of rumours, and using observations from a small number of monitors

    Back to the Source: an Online Approach for Sensor Placement and Source Localization

    Get PDF
    Source localization, the act of finding the originator of a disease or rumor in a network, has become an important problem in sociology and epidemiology. The localization is done using the infection state and time of infection of a few designated sensor nodes; however, maintaining sensors can be very costly in practice. We propose the first online approach to source localization: We deploy a priori only a small number of sensors (which reveal if they are reached by an infection) and then iteratively choose the best location to place new sensors in order to localize the source. This approach allows for source localization with a very small number of sensors; moreover, the source can be found while the epidemic is still ongoing. Our method applies to a general network topology and performs well even with random transmission delays
    • …
    corecore