Interactions in Information Spread
Since the development of writing 5000 years ago, human-generated data has been
produced at an ever-increasing pace. Classical archival methods aimed to ease
information retrieval. Today, archiving alone is no longer enough. The amount of
data generated daily is beyond human comprehension and calls for
new information retrieval strategies. Instead of referencing every single piece
of data, as traditional archival techniques do, a more relevant approach is
to understand the overall ideas conveyed in data flows. To spot such general
tendencies, a precise comprehension of the underlying data generation
mechanisms is required. In the rich literature tackling this problem, the
question of information interaction remains nearly unexplored. First, we
investigate the frequency of such interactions. Building on recent advances
made in Stochastic Block Modelling, we explore the role of interactions in
several social networks. We find that interactions are rare in these datasets.
Next, we ask how interactions evolve over time. Earlier data pieces should
not have an everlasting influence on later data generation mechanisms. We
model this using recent advances in dynamic network inference. We conclude that
interactions are brief. Finally, we design a framework that jointly models rare
and brief interactions based on Dirichlet-Hawkes Processes. We argue that this
new class of models is well suited to modelling brief and sparse interactions. We conduct a
large-scale application on Reddit and find that interactions play a minor role
in this dataset. From a broader perspective, our work results in a collection
of highly flexible models and in a rethinking of core concepts of machine
learning. Consequently, we open a range of novel perspectives both in terms of
real-world applications and in terms of technical contributions to machine
learning.
Comment: PhD thesis defended on 2022/09/1