A comparative study of thresholding strategies in progressive filtering
Thresholding strategies in automated text categorization are an underexplored
area of research. Indeed, thresholding strategies are often considered a
post-processing step of minor importance, the underlying assumptions being that
they do not make a difference in the performance of a classifier and that finding
the optimal thresholding strategy for any given classifier is trivial. Neither of these
assumptions is true. In this paper, we concentrate on progressive filtering, a
hierarchical text categorization technique that relies on a local-classifier-per-node
approach, thus mimicking the underlying taxonomy of categories. The focus of
the paper is on assessing TSA, a greedy threshold selection algorithm, against a
relaxed brute-force algorithm and the most relevant state-of-the-art algorithms.
Experiments, performed on Reuters, confirm the validity of TSA.
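
To make the local-classifier-per-node idea concrete, the following is a minimal sketch of progressive filtering with one threshold per category. It is not the paper's implementation and it does not implement TSA. The taxonomy, the keyword-based scorers, and all names (`progressive_filter`, `keyword_scorer`, `TAXONOMY`, `SCORERS`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical toy taxonomy: each category maps to its child categories.
TAXONOMY: Dict[str, List[str]] = {
    "news": ["economy", "sports"],
    "economy": ["markets"],
}


def keyword_scorer(keywords: List[str]) -> Callable[[str], float]:
    """Toy stand-in for a trained classifier: fraction of keywords present."""
    def score(doc: str) -> float:
        tokens = set(doc.lower().split())
        return sum(k in tokens for k in keywords) / len(keywords)
    return score


# One local scorer per node (assumed for the example, not taken from the paper).
SCORERS: Dict[str, Callable[[str], float]] = {
    "news": lambda doc: 1.0,  # the root accepts everything
    "economy": keyword_scorer(["trade", "stock"]),
    "sports": keyword_scorer(["match", "goal"]),
    "markets": keyword_scorer(["markets", "shares"]),
}


def progressive_filter(
    doc: str,
    root: str,
    taxonomy: Dict[str, List[str]],
    scorers: Dict[str, Callable[[str], float]],
    thresholds: Dict[str, float],
) -> List[str]:
    """Return the categories that accept doc, walking the taxonomy top-down.

    A document reaches a node's children only if the node's score meets
    that node's threshold, so a rejection prunes the whole subtree.
    """
    accepted: List[str] = []
    frontier = [root]
    while frontier:
        node = frontier.pop()
        if scorers[node](doc) >= thresholds[node]:
            accepted.append(node)
            frontier.extend(taxonomy.get(node, []))
    return accepted


DOC = "stock markets rally as trade grows"
STRICT = {"news": 0.5, "economy": 0.5, "sports": 0.5, "markets": 0.6}
LOOSE = {**STRICT, "markets": 0.4}

strict_result = progressive_filter(DOC, "news", TAXONOMY, SCORERS, STRICT)
loose_result = progressive_filter(DOC, "news", TAXONOMY, SCORERS, LOOSE)
```

In this toy setting, changing only the threshold of the `markets` node changes whether the document is assigned to that category, which illustrates why the choice of thresholds can affect the output of the classifier.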