
    Multivariate Hawkes Processes for Large-scale Inference

    In this paper, we present a framework for fitting multivariate Hawkes processes to problems that are large-scale both in the number of events in the observed history $n$ and in the number of event types $d$ (i.e. dimensions). The proposed Low-Rank Hawkes Process (LRHP) framework introduces a low-rank approximation of the kernel matrix that makes it possible to perform nonparametric learning of the $d^2$ triggering kernels using at most $O(ndr^2)$ operations, where $r$ is the rank of the approximation ($r \ll d, n$). This is a major improvement over the existing state-of-the-art inference algorithms, which are in $O(nd^2)$. Furthermore, the low-rank approximation allows LRHP to learn representative patterns of interaction between event types, which may be valuable for the analysis of such complex processes in real-world datasets. The efficiency and scalability of our approach are illustrated with numerical experiments on simulated as well as real datasets. Comment: 16 pages, 5 figure
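
The cost argument in the abstract above can be illustrated with a minimal sketch. This is not the LRHP algorithm (which learns the kernels nonparametrically); it assumes, purely for illustration, exponential kernels and an excitation matrix factorized as `A = U @ V.T` with rank `r`. The names `mu`, `U`, `V`, `beta`, `decayed_counts`, `intensity_low_rank` and `intensity_dense` are all hypothetical. The point is only that applying the factors one after the other costs O(dr) per evaluation, whereas the dense product costs O(d^2).

```python
import numpy as np


def decayed_counts(t, events, d, beta):
    """Sum beta * exp(-beta * (t - t_k)) over past events, grouped by event type."""
    s = np.zeros(d)
    for t_k, j in events:
        if t_k < t:
            s[j] += beta * np.exp(-beta * (t - t_k))
    return s


def intensity_low_rank(t, events, mu, U, V, beta=1.0):
    """Intensity vector at time t with A = U @ V.T applied factor by factor."""
    s = decayed_counts(t, events, len(mu), beta)
    # V.T @ s has length r, so the two products cost O(d * r) in total.
    return mu + U @ (V.T @ s)


def intensity_dense(t, events, mu, A, beta=1.0):
    """Same intensity with the full d x d matrix, costing O(d^2)."""
    s = decayed_counts(t, events, len(mu), beta)
    return mu + A @ s


# Illustrative values only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
d, r = 50, 3
mu = np.full(d, 0.1)
U = rng.uniform(0.0, 0.1, size=(d, r))
V = rng.uniform(0.0, 0.1, size=(d, r))
events = [(0.5, 3), (1.2, 10), (2.0, 3), (2.7, 41)]

lam_low_rank = intensity_low_rank(3.0, events, mu, U, V)
lam_dense = intensity_dense(3.0, events, mu, U @ V.T)
```

Both functions return the same vector up to floating-point error; only the order of the matrix products, and hence the cost, differs.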

    Interactions in Information Spread

    Since the development of writing 5000 years ago, human-generated data has been produced at an ever-increasing pace. Classical archival methods aimed at easing information retrieval. Nowadays, archiving is no longer enough. The amount of data generated daily is beyond human comprehension and calls for new information retrieval strategies. Instead of referencing every single piece of data, as traditional archival techniques do, a more relevant approach consists in understanding the overall ideas conveyed in data flows. Spotting such general tendencies requires a precise understanding of the underlying data generation mechanisms. In the rich literature tackling this problem, the question of information interaction remains nearly unexplored. First, we investigate the frequency of such interactions. Building on recent advances in Stochastic Block Modelling, we explore the role of interactions in several social networks. We find that interactions are rare in these datasets. Then, we ask how interactions evolve over time. Earlier pieces of data should not have an everlasting influence on subsequent data generation mechanisms. We model this using recent advances in dynamic network inference. We conclude that interactions are brief. Finally, we design a framework that jointly models rare and brief interactions based on Dirichlet-Hawkes Processes. We argue that this new class of models is well suited to modelling brief and sparse interactions. We conduct a large-scale application on Reddit and find that interactions play a minor role in this dataset. From a broader perspective, our work results in a collection of highly flexible models and in a rethinking of core concepts of machine learning. Consequently, we open a range of novel perspectives both in terms of real-world applications and in terms of technical contributions to machine learning. Comment: PhD thesis defended on 2022/09/1

    Scaling edge parameters for topic-awareness in information propagation

    Social media platforms play a crucial role in regulating public discourse. Recognizing the importance of understanding this complex phenomenon, a large body of research has attempted to model how information spreads within these platforms. These models are termed information propagation models. The majority of existing information propagation models attempt to capture the causal relationship between two information spreading events, either by modeling the probabilities of information transmission between the two users or by capturing the temporal correlations that exist between the events. While these models have been successful in the past, they fail to capture various properties that have emerged recently. One emerging property presented in recent analysis is the role that the content of information plays in regulating the patterns of information spread. Specifically, social scientists believe that in the presence of large amounts of information, users tend to interact with items that help confirm their own views. This thesis explores a possible method to incorporate user-specific and event-specific features into existing information propagation models by scaling the edge parameters. By modeling the scaling factors to capture the phenomenon of selective exposure due to confirmation bias, we showcase the ability of our approach to capture complex social dynamics. Through experiments on both synthetic and real-world datasets, we validate the advantages that can be gained over the existing models. The presented approach exhibits clearly visible performance gains on the network recovery task and performs competitively against the baselines.