research

Analysis of term roles along taxonomy nodes by adopting discriminant and characteristic capabilities

Abstract

Taxonomies are becoming essential to a growing number of applications, particularly in specific domains. Although taxonomies were originally built by hand, recent research has focused on generating them automatically. A main issue in automatic taxonomy building is the choice of the most suitable features. In this paper, we propose an analysis of how each feature changes its role along taxonomy nodes in a text categorization scenario, in which the features are the terms in textual documents. We argue that, in a hierarchical structure, each node should be represented by its own meaningful and discriminant terms (i.e., by performing a feature selection task for each node) instead of by a fixed feature space. To assess the discriminant power of a term, we adopt two novel metrics. Our conjecture is that a term can significantly change its discriminant power (and hence its role) along the taxonomy levels. We perform experiments aimed at showing that a significant number of terms play different roles in each taxonomy node, which emphasizes the usefulness of a distinct feature selection for each node. We assert that this analysis should support automatic taxonomy building approaches.
