Characteristic of partition-circuit matroid through approximation number
Rough set theory is a useful tool for dealing with uncertain, granular, and
incomplete knowledge in information systems, and it is based on equivalence
relations or partitions. Matroid theory generalizes linear independence in
vector spaces and has a variety of applications in many fields. In this paper,
we propose a new type of matroid, namely the partition-circuit matroid, which
is induced by a partition. Firstly, because a partition satisfies the circuit
axioms of matroid theory, it induces a matroid, which we call a
partition-circuit matroid. Since partitions and equivalence relations on the
same universe are in one-to-one correspondence, some characteristics of
partition-circuit matroids are studied through rough sets. Secondly, similar to
the upper approximation number proposed by Wang and Zhu, we define the lower
approximation number. Some characteristics of partition-circuit matroids and
their dual matroids are investigated through the lower and upper approximation
numbers.
Comment: 12 pages
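The construction described in this abstract can be illustrated with a minimal sketch. It checks the standard circuit axioms for a family of sets and counts the blocks contained in, or meeting, a subset X. The function names are hypothetical, and the two counting definitions are assumptions made for illustration (the abstract does not state them): the lower approximation number is taken as the number of blocks contained in X, and the upper approximation number as the number of blocks that intersect X.

```python
from itertools import combinations

def is_partition(blocks, universe):
    """True if the blocks are nonempty, pairwise disjoint, and cover the universe."""
    blocks = [frozenset(b) for b in blocks]
    if any(not b for b in blocks):
        return False
    if any(a & b for a, b in combinations(blocks, 2)):
        return False
    return frozenset().union(*blocks) == frozenset(universe)

def satisfies_circuit_axioms(family):
    """Check the three standard circuit axioms of a matroid."""
    fam = {frozenset(c) for c in family}
    # (C1) the empty set is not a circuit
    if frozenset() in fam:
        return False
    # (C2) no circuit is a proper subset of another circuit
    if any(a < b for a in fam for b in fam):
        return False
    # (C3) for distinct circuits a, b and e in both, some circuit avoids e
    #      and lies inside the union of a and b
    for a, b in combinations(fam, 2):
        for e in a & b:
            target = (a | b) - {e}
            if not any(c <= target for c in fam):
                return False
    return True

def lower_approximation_number(blocks, x):
    """Assumed definition: number of blocks entirely contained in x."""
    x = set(x)
    return sum(1 for b in blocks if set(b) <= x)

def upper_approximation_number(blocks, x):
    """Assumed definition: number of blocks that intersect x."""
    x = set(x)
    return sum(1 for b in blocks if set(b) & x)

def is_independent(blocks, x):
    """A set is independent iff it contains no circuit, i.e. no whole block."""
    return lower_approximation_number(blocks, x) == 0

universe = {1, 2, 3, 4, 5}
partition = [{1, 2}, {3}, {4, 5}]
```

Because the blocks of a partition are nonempty and pairwise disjoint, (C1) and (C2) hold and (C3) is vacuous, which is why any partition passes the check. Under the assumed definitions, a set is independent in the induced matroid exactly when its lower approximation number is zero.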
NMGRS: Neighborhood-based multigranulation rough sets
Recently, the multigranulation rough set (MGRS), which is based on multiple binary relations on the universe, has become a new direction in rough set theory. However, it is worth noting that the original MGRS cannot be used to discover knowledge from information systems with various domains of attributes. In order to extend the theory of MGRS, the objective of this study is to develop a so-called neighborhood-based multigranulation rough set (NMGRS) in the framework of multigranulation rough sets. Furthermore, by using two different approximating strategies, i.e., seeking common reserving difference and seeking common rejecting difference, we first present optimistic and pessimistic 1-type neighborhood-based multigranulation rough sets and optimistic and pessimistic 2-type neighborhood-based multigranulation rough sets, respectively. By analyzing several important properties of neighborhood-based multigranulation rough sets, we find that the new rough sets degenerate to the original MGRS when the size of the neighborhood equals zero. To obtain covering reducts under neighborhood-based multigranulation rough sets, we then propose a new definition of covering reduct to describe the smallest attribute subset that preserves the consistency of the neighborhood decision system, which can be calculated by Chen’s discernibility matrix approach. These results show that the proposed NMGRS largely extends the theory and application of classical MGRS in the context of multiple granulations.
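The optimistic and pessimistic strategies mentioned in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's 1-type or 2-type definitions: each granulation is a tuple of numeric attributes, the neighborhood uses the Chebyshev distance with a single shared radius `delta`, and all function names are hypothetical. An object enters the optimistic lower approximation if its neighborhood under at least one granulation lies inside the target set, and the pessimistic one only if that holds under every granulation.

```python
def neighborhood(data, x, attrs, delta):
    """Objects whose values differ from x by at most delta on every attribute in attrs."""
    return {
        y for y in data
        if max(abs(data[x][a] - data[y][a]) for a in attrs) <= delta
    }

def optimistic_lower(data, granulations, target, delta):
    """x qualifies if its neighborhood fits inside target for SOME granulation."""
    return {
        x for x in data
        if any(neighborhood(data, x, g, delta) <= target for g in granulations)
    }

def pessimistic_lower(data, granulations, target, delta):
    """x qualifies if its neighborhood fits inside target for EVERY granulation."""
    return {
        x for x in data
        if all(neighborhood(data, x, g, delta) <= target for g in granulations)
    }

# Toy information system (made-up values): object -> attribute -> value.
data = {
    "x1": {"a": 0.1, "b": 0.9},
    "x2": {"a": 0.2, "b": 0.1},
    "x3": {"a": 0.8, "b": 0.85},
    "x4": {"a": 0.9, "b": 0.2},
}
granulations = [("a",), ("b",)]
target = {"x1", "x2"}

opt = optimistic_lower(data, granulations, target, delta=0.15)
pes = pessimistic_lower(data, granulations, target, delta=0.15)
```

In this toy data, attribute `a` groups x1 with x2 while attribute `b` groups x1 with x3, so the optimistic lower approximation is {x1, x2} and the pessimistic one is empty. With `delta=0` each neighborhood contains only objects with identical attribute values, which mirrors the abstract's remark that the model degenerates to the original MGRS when the neighborhood size is zero.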
Multiple Relevant Feature Ensemble Selection Based on Multilayer Co-Evolutionary Consensus MapReduce
Although feature selection for large data has been intensively investigated in data mining, machine learning, and pattern recognition, the challenge is not just to invent new algorithms to handle noisy and uncertain large data in applications, but rather to link the multiple relevant feature sources, structured or unstructured, to develop an effective feature reduction method. In this paper, we propose a multiple relevant feature ensemble selection (MRFES) algorithm based on multilayer co-evolutionary consensus MapReduce (MCCM). We construct an effective MCCM model to handle feature ensemble selection of large-scale datasets with multiple relevant feature sources, and explore the unified consistency aggregation between the local solutions and global dominance solutions achieved by the co-evolutionary memeplexes, which participate in the cooperative feature ensemble selection process. This model attempts to reach a mutual decision agreement among co-evolutionary memeplexes, which calls for mechanisms to detect noncooperative co-evolutionary behaviors and achieve better Nash equilibrium resolutions. Extensive comparative experiments on some well-known benchmark datasets substantiate the effectiveness of MRFES in solving large-scale dataset problems with complex noise and multiple relevant feature sources. The algorithm can greatly facilitate the selection of relevant feature subsets from the original feature space with better accuracy, efficiency, and interpretability. Moreover, we apply MRFES to human cerebral cortex-based classification prediction. Such successful applications are expected to significantly scale up classification prediction for large-scale and complex brain data in terms of efficiency and feasibility.
An Online Sparse Streaming Feature Selection Algorithm
Online streaming feature selection (OSFS), which conducts feature selection
in an online manner, plays an important role in dealing with high-dimensional
data. In many real applications, such as intelligent healthcare platforms,
streaming features always have some missing data, which raises a crucial
challenge in conducting OSFS: how to establish the uncertain relationship
between sparse streaming features and labels. Unfortunately, existing OSFS
algorithms never consider such an uncertain relationship. To fill this gap,
this paper proposes an online sparse streaming feature selection with
uncertainty (OS2FSU) algorithm. OS2FSU consists of two main parts: 1) latent
factor analysis is utilized to pre-estimate the missing data in sparse
streaming features before conducting feature selection, and 2) fuzzy logic and
neighborhood rough sets are employed to alleviate the uncertainty between
estimated streaming features and labels during feature selection. In
the experiments, OS2FSU is compared with five state-of-the-art OSFS algorithms
on six real datasets. The results demonstrate that OS2FSU outperforms its
competitors when missing data are encountered in OSFS.
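The first part of OS2FSU, pre-estimating missing entries with latent factor analysis, can be illustrated with a generic sketch. The abstract does not specify the model, so this assumes a plain low-rank factorization trained by stochastic gradient descent on the observed entries only. The function names, the hyperparameters (`rank`, `lr`, `reg`, `epochs`), and the toy matrix are all hypothetical. The second part (fuzzy logic and neighborhood rough sets) is not shown.

```python
import random

def predict(P, Q, i, j):
    """Estimated value for row i, column j: dot product of the latent vectors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))

def fit_latent_factors(matrix, rank=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit row factors P and column factors Q on the observed (non-None) entries."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    P = [[rng.uniform(0, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    observed = [
        (i, j, v)
        for i, row in enumerate(matrix)
        for j, v in enumerate(row)
        if v is not None
    ]
    for _ in range(epochs):
        for i, j, v in observed:
            err = v - predict(P, Q, i, j)
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step on squared error with L2 regularization.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q

def observed_mse(matrix, P, Q):
    """Mean squared error over the observed entries only."""
    errs = [
        (v - predict(P, Q, i, j)) ** 2
        for i, row in enumerate(matrix)
        for j, v in enumerate(row)
        if v is not None
    ]
    return sum(errs) / len(errs)

def fill_missing(matrix, P, Q):
    """Keep observed values; replace None with the latent-factor estimate."""
    return [
        [v if v is not None else predict(P, Q, i, j) for j, v in enumerate(row)]
        for i, row in enumerate(matrix)
    ]

# Toy sparse feature matrix (made-up values): rows are instances, columns are features.
sparse = [
    [0.9, None, 0.8],
    [0.8, 0.2, None],
    [None, 0.9, 0.1],
    [0.2, 0.8, 0.2],
]
P, Q = fit_latent_factors(sparse)
completed = fill_missing(sparse, P, Q)
```

Training only on observed entries is what lets the factors generalize to the missing ones. The completed matrix could then be passed to whatever feature selection step follows.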