A Scalable and Effective Rough Set Theory based Approach for Big Data Pre-processing

Authors: Beck Gael; Chelly Dagdia Zaineb; Lebbah Mustapha; Zarges Christine
Publication date: 02/05/2020

A major challenge in the knowledge discovery process is performing data pre-processing, specifically feature selection, on large amounts of data with high-dimensional attribute sets. A variety of techniques have been proposed in the literature to deal with this challenge, with different degrees of success, as most of these techniques need further information about the given input data for thresholding, need noise levels to be specified, or use some feature ranking procedure. To overcome these limitations, rough set theory (RST) can be used to discover the dependencies within the data and reduce the number of attributes in an input data set while using the data alone and requiring no supplementary information. However, when it comes to massive data sets, RST reaches its limits as it is highly computationally expensive. In this paper, we propose a scalable and effective rough set theory-based approach for large-scale data pre-processing, specifically for feature selection, under the Spark framework. In our detailed experiments, data sets with up to 10,000 attributes have been considered, revealing that our proposed solution achieves a good speedup and performs its feature selection task well without sacrificing performance, thus making it relevant to big data.

Crossref

Aberystwyth Research Portal

INRIA a CCSD electronic archive server

HAL-Paris 13

Uncertainty Management of Intelligent Feature Selection in Wireless Sensor Networks

Author: Mal-sarkar Sanchita
Publication venue: EngagedScholarship@CSU
Publication date: 01/01/2009

Wireless sensor networks (WSN) are envisioned to revolutionize the paradigm of monitoring complex real-world systems at a very high resolution. However, the deployment of a large number of unattended sensor nodes in hostile environments, frequent changes of environment dynamics, and severe resource constraints pose uncertainties and limit the potential use of WSN in complex real-world applications. Although uncertainty management in Artificial Intelligence (AI) is well developed and well investigated, its implications in wireless sensor environments are inadequately addressed. This dissertation addresses uncertainty management issues of spatio-temporal patterns generated from sensor data. It provides a framework for characterizing spatio-temporal patterns in WSN. Using rough set theory and temporal reasoning, a novel formalism has been developed to characterize and quantify the uncertainties in predicting spatio-temporal patterns from sensor data. This research also uncovers the trade-off among the uncertainty measures, which can be used to develop a multi-objective optimization model for real-time decision making in sensor data aggregation and sampling.

OhioLINK Electronic Thesis and Dissertation Center

Cleveland-Marshall College of Law

Towards scalable fuzzy–rough feature selection

Authors: Jensen Richard; MacParthalain Neil
Publication date: 01/12/2015

Crossref

Aberystwyth Research Portal

Attribute Selection Methods in Rough Set Theory

Author: Li Xiaohan
Publication venue: SJSU ScholarWorks
Publication date: 01/04/2014

Attribute selection for rough sets is an NP-hard problem, in which fast heuristic algorithms are needed to find reducts. In this project, two reduct methods for rough sets were implemented: particle swarm optimization and Johnson’s method. Both algorithms were evaluated with five different benchmarks from the KEEL repository. The results obtained from both implementations were compared with results obtained by the ROSETTA software using the same benchmarks. The results show that the implementations achieve better correction rates than ROSETTA.

SJSU ScholarWorks

On rough sets, their recent extensions, and applications

Authors: MacParthaláin Neil Seosamh; Shen Qiang
Publication date: 01/12/2010

Aberystwyth Research Portal

A Detailed Study of the Distributed Rough Set Based Locality Sensitive Hashing Feature Selection Technique

Authors: Chelly Dagdia Zaineb; Zarges Christine
Publication date: 30/09/2021

In the context of big data, granular computing has recently been implemented by some mathematical tools, especially Rough Set Theory (RST). As a key topic of rough set theory, feature selection has been investigated to adapt the related granular concepts of RST to deal with large amounts of data, leading to the development of the distributed RST version. However, despite its scalability, the distributed RST version faces a key challenge tied to the partitioning of the feature search space in the distributed environment while guaranteeing data dependency. Therefore, in this manuscript, we propose a new distributed RST version based on Locality Sensitive Hashing (LSH), named LSH-dRST, for big data feature selection. LSH-dRST uses LSH to match similar features into the same bucket and maps the generated buckets into partitions to enable the splitting of the universe in a more efficient way. More precisely, in this paper, we perform a detailed analysis of the performance of LSH-dRST by comparing it to the standard distributed RST version, which is based on a random partitioning of the universe. We demonstrate that our LSH-dRST is scalable when dealing with large amounts of data. We also demonstrate that LSH-dRST ensures the partitioning of the high dimensional feature search space in a more reliable way, hence better preserving data dependency in the distributed environment and ensuring a lower computational cost. This work is part of a project that has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 702527.

Aberystwyth Research Portal

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL UVSQ

Rough Set Theory as a Data Mining Technique: A Case Study in Epidemiology and Cancer Incidence Prediction

Authors: A Banerjee; F Amersi; F Bagherzadeh-Khiabani; J Dean; J Schneider; J Zhang; K Thangavel; L Polkowski; M Guller; M Porta; M Woodward; NX Vinh; SJ Mooney; T Zhai; Z Pawlak
Publication venue: Springer Nature
Publication date: 18/01/2019

Crossref

Aberystwyth Research Portal

A Novel Variable Precision Reduction Approach to Comprehensive Knowledge Systems

Authors: Chen C. L. Philip; Liu Hongbo; McLoone Sean; Wu Xindong; Yang Chao
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 15/01/2018

Queen's University Belfast Research Portal

Crossref