Orthogonal Matching Pursuit for Text Classification
In text classification, high dimensionality leads to overfitting, which makes
regularization essential. Classic regularizers provide sparsity but fail to
return highly accurate models. In contrast, state-of-the-art group-lasso
regularizers give better results at the expense of sparsity. In this paper, we
apply a greedy variable selection algorithm, Orthogonal Matching Pursuit (OMP),
to text classification. We also extend standard Group OMP (GOMP) by introducing
overlapping GOMP, which handles overlapping groups of features. Empirical
analysis shows that both OMP and overlapping GOMP are powerful regularizers
that produce effective and very sparse models. Code and data are available online:
https://github.com/y3nk0/OMP-for-Text-Classification
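
For intuition, the greedy selection that OMP performs can be sketched as follows. This is a minimal illustration, not the authors' implementation (the repository above holds that). It assumes a squared loss for simplicity, whereas a text classifier would more likely use a classification loss, and the names `omp`, `n_nonzero`, and the toy matrix are hypothetical, chosen only for this example.

```python
import numpy as np

def omp(X, y, n_nonzero):
    """Sketch of OMP under a squared-loss assumption.

    Greedily selects up to `n_nonzero` columns (features) of X and
    returns the sparse coefficient vector and the selection order.
    """
    y = np.asarray(y, dtype=float)
    coef = np.zeros(X.shape[1])
    residual = y.copy()
    selected = []
    for _ in range(n_nonzero):
        # Score each feature by its absolute correlation with the residual.
        scores = np.abs(X.T @ residual)
        if selected:
            scores[selected] = -np.inf  # never pick the same feature twice
        selected.append(int(np.argmax(scores)))
        # Refit least squares on all selected features; this refit is what
        # distinguishes OMP from plain matching pursuit.
        w, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ w
        coef = np.zeros(X.shape[1])
        coef[selected] = w
    return coef, selected

# Toy usage: 4 "documents", 5 term features (made-up values).
X_toy = np.array([[1.0, 0.0, 2.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0, 1.0, 0.0],
                  [1.0, 1.0, 0.0, 0.0, 2.0],
                  [0.0, 0.0, 1.0, 1.0, 0.0]])
y_toy = np.array([1.0, -1.0, 1.0, -1.0])
coef_toy, picked = omp(X_toy, y_toy, n_nonzero=2)
print("selected features:", picked)
print("coefficients:", coef_toy)
```

The number of selected features directly bounds the number of nonzero weights, which is why a greedy procedure of this kind yields very sparse models. A group variant would score and add whole groups of features at each step instead of single columns.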