Unsupervised Terminological Ontology Learning based on Hierarchical
  Topic Modeling

Bless, Patrick; Klabjan, Diego; Zhu, Xiaofeng

Authors: Patrick Bless, Diego Klabjan, Xiaofeng Zhu
Publication date: 29 August 2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Abstract

In this paper, we present hierarchical relation-based latent Dirichlet allocation (hrLDA), a data-driven hierarchical topic model for extracting terminological ontologies from a large number of heterogeneous documents. In contrast to traditional topic models, hrLDA relies on noun phrases instead of unigrams, considers syntax and document structures, and enriches topic hierarchies with topic relations. Through a series of experiments, we demonstrate the superiority of hrLDA over existing topic models, especially for building hierarchies. Furthermore, we illustrate the robustness of hrLDA on noisy data sets, which are likely to occur in many practical scenarios. Our ontology evaluation results show that ontologies extracted by hrLDA are very competitive with ontologies created by domain experts.

DOI: 10.1109/iri.2017.18
