A Constrained Coupled Matrix-Tensor Factorization for Learning Time-evolving and Emerging Topics
Topic discovery has grown significantly as a field within data mining. In
particular, time-evolving topic discovery, where the evolution of a topic is
taken into account, has been instrumental in understanding the
historical context of an emerging topic in a dynamic corpus. Traditionally,
time-evolving topic discovery has focused on this notion of time. However,
especially in settings where content is contributed by a community or a crowd,
an orthogonal notion of time is the one that pertains to the level of expertise
of the content creator: the more experienced the creator, the more advanced the
topic. In this paper, we propose a novel time-evolving topic discovery method
which, in addition to extracting topics, identifies the evolution of each
topic over time, as well as its level of difficulty, as inferred from the
level of expertise of its main contributors. Our method is based on a novel
formulation of Constrained Coupled Matrix-Tensor Factorization, which adopts
constraints that are well-motivated for and, as we demonstrate, essential for
high-quality topic discovery. We qualitatively
evaluate our approach using real data from the Physics and Programming
Stack Exchange forums, and we identify topics of varying levels of
difficulty which can be linked to external events, such as the announcement of
gravitational waves by the LIGO lab in the Physics forum. We provide a quantitative
evaluation of our method by conducting a user study where experts were asked to
judge the coherence and quality of the extracted topics. Finally, our proposed
method has implications for automatic curriculum design using the extracted
topics, where the notion of the level of difficulty is necessary for the proper
modeling of prerequisites and advanced concepts.
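
To illustrate the general idea behind a coupled matrix-tensor factorization with constraints, the following is a minimal sketch and not the paper's formulation. It assumes a three-way tensor X and a matrix Y that share their first mode, uses nonnegativity as the only constraint, and fits both with standard multiplicative updates. The function name, the choice of modes, the rank, and the synthetic data are all hypothetical; the abstract does not specify the actual constraints or how the tensor and matrix are built.

```python
import numpy as np

def coupled_nn_factorization(X, Y, rank, n_iter=200, seed=0, eps=1e-9):
    """Hypothetical sketch: fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r]
    and Y ~ A @ D.T with a shared factor A, all factors nonnegative."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    M = Y.shape[1]
    A, B, C, D = (rng.random((n, rank)) for n in (I, J, K, M))
    losses = []
    for _ in range(n_iter):
        # Shared factor A receives contributions from both the tensor and the matrix.
        num = np.einsum('ijk,jr,kr->ir', X, B, C) + Y @ D
        A *= num / (A @ ((B.T @ B) * (C.T @ C) + D.T @ D) + eps)
        # Tensor-only factors.
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        # Matrix-only factor.
        D *= (Y.T @ A) / (D @ (A.T @ A) + eps)
        # Joint squared reconstruction error of tensor and matrix.
        X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
        losses.append(np.sum((X - X_hat) ** 2) + np.sum((Y - A @ D.T) ** 2))
    return A, B, C, D, losses

# Synthetic nonnegative data with a known shared factor (illustration only).
rng = np.random.default_rng(1)
A0, B0, C0, D0 = (rng.random((n, 3)) for n in (8, 6, 5, 7))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
Y = A0 @ D0.T
A, B, C, D, losses = coupled_nn_factorization(X, Y, rank=3)
print(losses[0], losses[-1])
```

Because A appears in both reconstruction terms, each of its columns has to explain structure in the tensor and in the matrix at once, which is what couples the two data sources. Multiplicative updates are used here only because they keep every factor nonnegative without a projection step; other constraints or solvers would change the update rules.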