TiFi: Taxonomy Induction for Fictional Domains [Extended version]
Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases and highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, by identifying candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, by selecting subcategory relationships that correspond to class subsumption, and (iii) top-level construction, by mapping classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi is able to construct taxonomies for a diverse range of fictional domains such as Lord of the Rings, The Simpsons or Greek Mythology with very high precision and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.
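The three phases named in the abstract can be sketched as a small pipeline. This is only an illustration of the control flow, not TiFi's implementation: the function `induce_taxonomy`, the two predicates `is_class` and `is_subsumption`, the `top_level` mapping, and the toy data are all hypothetical stand-ins for whatever the paper actually uses to make these decisions.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (subcategory, parent category)


def induce_taxonomy(
    categories: Iterable[str],
    edges: Iterable[Edge],
    is_class: Callable[[str], bool],             # hypothetical phase (i) decision
    is_subsumption: Callable[[str, str], bool],  # hypothetical phase (ii) decision
    top_level: Dict[str, str],                   # hypothetical class -> high-level category map
) -> Set[Edge]:
    # Phase (i): category cleaning. Keep only categories judged to be classes.
    classes = {c for c in categories if is_class(c)}

    # Phase (ii): edge cleaning. Keep edges between classes that are judged
    # to express class subsumption.
    kept = {
        (child, parent)
        for child, parent in edges
        if child in classes and parent in classes and is_subsumption(child, parent)
    }

    # Phase (iii): top-level construction. Attach classes that have no
    # remaining parent to a high-level category, when a mapping is known.
    has_parent = {child for child, _ in kept}
    for root in classes - has_parent:
        if root in top_level:
            kept.add((root, top_level[root]))
    return kept


# Toy input (invented for illustration only).
categories = ["Hobbits", "Elves", "Characters", "Images of Frodo"]
edges = [
    ("Hobbits", "Characters"),
    ("Elves", "Characters"),
    ("Images of Frodo", "Hobbits"),
]
taxonomy = induce_taxonomy(
    categories,
    edges,
    is_class=lambda c: not c.startswith("Images"),  # crude placeholder rule
    is_subsumption=lambda child, parent: True,      # placeholder: accept all
    top_level={"Characters": "person"},
)
```

In this toy run the administrative category "Images of Frodo" is dropped in phase (i), which also removes its edge, and "Characters" is attached to the assumed high-level category "person" in phase (iii).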
End-to-End Reinforcement Learning for Automatic Taxonomy Induction
We present a novel end-to-end reinforcement learning approach to automatic
taxonomy induction from a set of terms. While prior methods treat the problem
as a two-phase task (i.e., detecting hypernymy pairs followed by organizing
these pairs into a tree-structured hierarchy), we argue that such two-phase
methods may suffer from error propagation, and cannot effectively optimize
metrics that capture the holistic structure of a taxonomy. In our approach, the
representations of term pairs are learned using multiple sources of information
and used to determine which term to select and where to place
it on the taxonomy via a policy network. All components are trained in an
end-to-end manner with cumulative rewards, measured by a holistic tree metric
over the training taxonomies. Experiments on two public datasets of different
domains show that our approach outperforms prior state-of-the-art taxonomy
induction methods by up to 19.6% on ancestor F1.
Comment: 11 Pages. ACL 2018 Camera Read
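The abstract reports results in ancestor F1 without defining it. A minimal sketch follows, assuming the common definition: precision and recall are computed over the sets of (ancestor, descendant) pairs in the predicted and gold trees. The child-to-parent dictionary representation, the function names, and the toy trees are assumptions made for this example, and both inputs are assumed to be acyclic.

```python
from typing import Dict, Optional, Set, Tuple


def ancestor_pairs(parent: Dict[str, Optional[str]]) -> Set[Tuple[str, str]]:
    """Collect every (ancestor, descendant) pair from a child -> parent map."""
    pairs = set()
    for node in parent:
        ancestor = parent[node]
        while ancestor is not None:  # walk up to the root (assumes no cycles)
            pairs.add((ancestor, node))
            ancestor = parent.get(ancestor)
    return pairs


def ancestor_f1(
    predicted: Dict[str, Optional[str]], gold: Dict[str, Optional[str]]
) -> float:
    """F1 over ancestor-descendant pairs of the predicted and gold trees."""
    pred_pairs = ancestor_pairs(predicted)
    gold_pairs = ancestor_pairs(gold)
    overlap = len(pred_pairs & gold_pairs)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_pairs)
    recall = overlap / len(gold_pairs)
    return 2 * precision * recall / (precision + recall)


# Toy trees (invented): the prediction misplaces "poodle" under "cat".
gold = {"animal": None, "dog": "animal", "cat": "animal", "poodle": "dog"}
predicted = {"animal": None, "dog": "animal", "cat": "animal", "poodle": "cat"}
score = ancestor_f1(predicted, gold)
```

Each toy tree yields four ancestor-descendant pairs, three of which are shared, so precision and recall are both 3/4. Because the metric counts indirect ancestors too, the misplaced "poodle" still earns credit for the pair ("animal", "poodle"), which is why the metric reflects overall tree structure and not only direct edges.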