An Online Sparse Streaming Feature Selection Algorithm

Abstract

Online streaming feature selection (OSFS), which conducts feature selection in an online manner, plays an important role in dealing with high-dimensional data. In many real applications such as intelligent healthcare platform, streaming feature always has some missing data, which raises a crucial challenge in conducting OSFS, i.e., how to establish the uncertain relationship between sparse streaming features and labels. Unfortunately, existing OSFS algorithms never consider such uncertain relationship. To fill this gap, we in this paper propose an online sparse streaming feature selection with uncertainty (OS2FSU) algorithm. OS2FSU consists of two main parts: 1) latent factor analysis is utilized to pre-estimate the missing data in sparse streaming features before con-ducting feature selection, and 2) fuzzy logic and neighborhood rough set are employed to alleviate the uncertainty between estimated streaming features and labels during conducting feature selection. In the experiments, OS2FSU is compared with five state-of-the-art OSFS algorithms on six real datasets. The results demonstrate that OS2FSU outperforms its competitors when missing data are encountered in OSFS

    Similar works

    Full text

    thumbnail-image

    Available Versions