Strategy Comparison for Semantic Zero-Shot Taxonomy Filters

Abstract

In information retrieval, categorised filtering based on subject taxonomies helps users formulate their information needs efficiently. Progress in machine learning classification algorithms has made it possible to automate tagging or category assignment with generally acceptable quality, provided that a sufficient number of labelled example documents from every category is available for training. This requirement, however, is a serious obstacle to flexible use across a broad range of domains and in areas where little training data is available. This contribution reports the results of experiments with transformer-based zero-shot text classification methods, which work without any specific training. Using taxonomy descriptions, sentence aggregation with saturation, and hierarchical consistency, this approach can be enhanced to perform nearly as well as more elaborate classifiers.
