Analyzing Evolving Stories in News Articles
There is an overwhelming number of news articles published every day around
the globe. Following the evolution of a news story is difficult because no
mechanism is available to trace back in time and study the
diffusion of the relevant events in digital news feeds. The techniques
developed so far to extract meaningful information from a massive corpus rely
on similarity search, which results in a myopic loopback to the same topic
without providing the insight needed to hypothesize the origin of a story that
may be completely different from the news today. In this paper, we present an
algorithm that mines historical data to detect the origin of an event, segments
the timeline into disjoint groups of coherent news articles, and outlines the
most important documents in a timeline with a soft probability to provide a
better understanding of the evolution of a story. Qualitative and quantitative
approaches to evaluate our framework demonstrate that our algorithm discovers
statistically significant and meaningful stories in reasonable time.
Additionally, a relevant case study on a set of news articles demonstrates that
the generated output of the algorithm holds the promise to aid prediction of
future entities in a story.
Comment: This is a pre-print of an article published in the International
Journal of Data Science and Analytics. The final authenticated version is
available online at: https://doi.org/10.1007/s41060-017-0091-
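
The abstract mentions segmenting a timeline into disjoint groups of coherent news articles but does not give the procedure. As a rough illustration of that idea only, the sketch below uses a simple stand-in technique, not the paper's algorithm: it walks a chronologically ordered list of articles and starts a new group whenever an article's cosine similarity to the running bag-of-words profile of the current group drops below a cutoff. The function names, the whitespace tokenizer, the `threshold` value, and the sample headlines are all assumptions made for this example.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector for one article (naive whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment_timeline(articles, threshold=0.2):
    """Split chronologically ordered articles into disjoint, contiguous groups.

    A new group starts whenever an article's similarity to the accumulated
    term profile of the current group falls below `threshold`
    (an assumed value, not taken from the paper).
    """
    segments = []
    profile = Counter()
    for text in articles:
        vec = bag_of_words(text)
        if segments and cosine(profile, vec) >= threshold:
            segments[-1].append(text)
            profile += vec  # fold the article into the group's profile
        else:
            segments.append([text])
            profile = Counter(vec)  # start a fresh profile for the new group
    return segments


# Hypothetical headlines, listed oldest first.
timeline = [
    "protests erupt in the capital over fuel prices",
    "fuel prices protests spread across the capital",
    "central bank raises interest rates sharply",
    "interest rates rise again as central bank acts",
]
segments = segment_timeline(timeline)
for group in segments:
    print(group)
```

Because each article is assigned to exactly one contiguous group, the groups are disjoint by construction. A hard cutoff like this yields no soft probability over documents, so it does not cover the ranking step the abstract describes.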