1 research outputs found
Feature selection with test cost constraint
Feature selection is an important preprocessing step in machine learning and
data mining. In real-world applications, costs, including money, time and other
resources, are required to acquire the features. In some cases, there is a test
cost constraint due to limited resources. We shall deliberately select an
informative and cheap feature subset for classification. This paper proposes
the feature selection with test cost constraint problem for this issue. The new
problem has a simple form while described as a constraint satisfaction problem
(CSP). Backtracking is a general algorithm for CSP, and it is efficient in
solving the new problem on medium-sized data. As the backtracking algorithm is
not scalable to large datasets, a heuristic algorithm is also developed.
Experimental results show that the heuristic algorithm can find the optimal
solution in most cases. We also redefine some existing feature selection
problems in rough sets, especially in decision-theoretic rough sets, from the
viewpoint of CSP. These new definitions provide insight to some new research
directions.Comment: 23 page