3 research outputs found

    Active Feature-Value Acquisition

    Full text link

    Active sampling for detecting irrelevant features

    No full text
    The general approach for automatically driving data collection using information from previously acquired data is called active learning. Traditional active learning addresses the problem of choosing the unlabeled examples for which the class labels are queried, with the goal of learning a classifier. In contrast, we address the problem of active feature sampling for detecting useless features. We propose a strategy to actively sample the values of new features on class-labeled examples, with the objective of feature relevance assessment. We derive an active feature sampling algorithm from an information-theoretic and statistical formulation of the problem. We present experimental results on synthetic, UCI, and real-world datasets to demonstrate that our active sampling algorithm can provide accurate estimates of feature relevance with lower data acquisition costs than random sampling and other previously proposed sampling algorithms.

    Active Sampling for Detecting Irrelevant Features

    No full text
    The general approach for automatically driving data collection using information from previously acquired data is called active learning. Traditional active learning addresses the problem of choosing the unlabeled examples for which the class labels are queried, with the goal of learning a classifier. In contrast, we address the problem of active feature sampling for detecting useless features. We propose a strategy to actively sample the values of new features on class-labeled examples, with the objective of feature relevance assessment. We derive an active feature sampling algorithm from an information-theoretic and statistical formulation of the problem. We present experimental results on synthetic, UCI, and real-world datasets to demonstrate that our active sampling algorithm can provide accurate estimates of feature relevance with lower data acquisition costs than random sampling and other previously proposed sampling algorithms.