To create a corpus exploration method whose topics are easier to interpret
than those of standard LDA topic models, we propose combining two techniques:
entity linking and Labeled LDA. Our method identifies in
an ontology a series of descriptive labels for each document in a corpus. Then
it generates a specific topic for each label. Having a direct relation between
topics and labels makes interpretation easier; using an ontology as background
knowledge limits label ambiguity. As our topics are described with a limited
number of clear-cut labels, they promote interpretability, and this may help
quantitative evaluation. We illustrate the potential of the approach by
applying it to identify the most relevant topics addressed by each party
in the European Parliament's fifth mandate (1999-2004).

Comment: in Proceedings of Digital Humanities 2016, Krako
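
The label-to-topic step can be illustrated with a minimal sketch of Labeled LDA using collapsed Gibbs sampling. This is not the authors' implementation: the function name `labeled_lda`, the hyperparameters `alpha` and `beta`, and the toy documents and labels are all assumptions made for illustration. In the proposed method the labels would come from entity linking against an ontology; here they are simply written by hand.

```python
import random
from collections import defaultdict

def labeled_lda(docs, labels, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for Labeled LDA (illustrative sketch).

    docs   -- list of token lists
    labels -- list of label lists; each token of document d may only be
              assigned to one of labels[d], so every topic maps to one label
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    n_lw = defaultdict(lambda: defaultdict(int))  # label -> word -> count
    n_l = defaultdict(int)                        # label -> token count
    n_dl = [defaultdict(int) for _ in docs]       # doc -> label -> count
    z = []                                        # label assigned to each token

    # Random initialisation, restricted to each document's own labels.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            l = rng.choice(labels[d])
            z[d].append(l)
            n_lw[l][w] += 1
            n_l[l] += 1
            n_dl[d][l] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                l = z[d][i]
                n_lw[l][w] -= 1
                n_l[l] -= 1
                n_dl[d][l] -= 1
                # Resample among the document's labels only.
                weights = [
                    (n_dl[d][k] + alpha)
                    * (n_lw[k][w] + beta) / (n_l[k] + vocab_size * beta)
                    for k in labels[d]
                ]
                l = rng.choices(labels[d], weights=weights)[0]
                z[d][i] = l
                n_lw[l][w] += 1
                n_l[l] += 1
                n_dl[d][l] += 1
    return n_lw, z

def top_words(n_lw, label, k=3):
    """Return the k words most often assigned to a label's topic."""
    counts = n_lw[label]
    return sorted(counts, key=counts.get, reverse=True)[:k]

# Toy corpus; documents and labels are invented for this example.
docs = [
    "fishing quota fleet sea fishing".split(),
    "budget deficit euro budget".split(),
    "fishing subsidy budget fleet euro".split(),
]
labels = [["Fisheries"], ["Budget"], ["Fisheries", "Budget"]]

n_lw, z = labeled_lda(docs, labels)
for label in ("Fisheries", "Budget"):
    print(label, top_words(n_lw, label))
```

Because each token can only be assigned to a label attached to its own document, every learned topic corresponds to exactly one label, which is the direct topic-label relation the abstract describes.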