
461 research outputs found

Fuzzy-rough set and fuzzy ID3 decision approaches to knowledge discovery in datasets

Author: Amgad Salama
O. G. Elbarbary
Publication venue: International scientific publication and consulting services (ISPACS)
Publication date: 01/01/2012

Fuzzy rough sets generalize traditional rough sets to deal with both fuzziness and vagueness in data. Existing research on fuzzy rough sets concentrates mainly on the construction of approximation operators; less effort has been devoted to knowledge discovery in datasets with fuzzy rough sets, which is the focus of this paper. After analyzing previous work on knowledge discovery with fuzzy rough sets, we introduce formal concepts of attribute reduction with fuzzy rough sets and completely study the structure of attribute reduction
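To make the notions of approximation operators and attribute reduction in the abstract above concrete, here is a minimal Python sketch. It assumes one common textbook formulation (a min-based similarity relation, the Kleene-Dienes implicator for the lower approximation, the min t-norm for the upper approximation); these choices, and the function names `similarity`, `lower_upper` and `dependency`, are illustrative assumptions and not necessarily the operators used in the paper.

```python
import numpy as np

def similarity(X):
    """Fuzzy similarity relation R(x, y): the minimum over attributes of
    1 - |a(x) - a(y)| / range(a). This is an assumed, commonly used choice."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero for constant attributes
    diff = np.abs(X[:, None, :] - X[None, :, :]) / span
    return (1.0 - diff).min(axis=2)

def lower_upper(R, A):
    """Fuzzy rough lower and upper approximations of the fuzzy set A.
    lower(x) = min_y max(1 - R(x, y), A(y))
    upper(x) = max_y min(R(x, y), A(y))"""
    A = np.asarray(A, dtype=float)
    lower = np.maximum(1.0 - R, A[None, :]).min(axis=1)
    upper = np.minimum(R, A[None, :]).max(axis=1)
    return lower, upper

def dependency(X, y, attrs):
    """Degree to which the decision y depends on the attribute subset attrs:
    the mean membership in the fuzzy positive region."""
    R = similarity(np.asarray(X, dtype=float)[:, attrs])
    y = np.asarray(y)
    pos = np.zeros(len(y))
    for c in np.unique(y):
        lower, _ = lower_upper(R, (y == c).astype(float))
        pos = np.maximum(pos, lower)
    return float(pos.mean())

# Toy data (hypothetical): attribute 0 separates the classes, attribute 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]
print(dependency(X, y, [0]), dependency(X, y, [1]))
```

Under these assumptions, an attribute subset that keeps the dependency degree of the full attribute set is a candidate reduct; on the toy data, attribute 0 scores higher than attribute 1.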

Crossref

Directory of Open Access Journals

On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems

Author: Bustince Humberto
Elkano Mikel
Galar Mikel
Uriz Mikel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 28/02/2019

We present a new distributed fuzzy partitioning method to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems. The proposed algorithm builds a fixed number of fuzzy sets for all variables and adjusts their shape and position to the real distribution of training data. A two-step process is applied: 1) transformation of the original distribution into a standard uniform distribution by means of the probability integral transform. Since the original distribution is generally unknown, the cumulative distribution function is approximated by computing the q-quantiles of the training set; 2) construction of a Ruspini strong fuzzy partition in the transformed attribute space using a fixed number of equally distributed triangular membership functions. Despite the aforementioned transformation, the definition of every fuzzy set in the original space can be recovered by applying the inverse cumulative distribution function (also known as quantile function). The experimental results reveal that the proposed methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT) induction algorithm to maintain classification accuracy with up to 6 million fewer leaves. Comment: Appeared in 2018 IEEE International Congress on Big Data (BigData Congress). arXiv admin note: text overlap with arXiv:1902.0935
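The two-step process described in the abstract above can be sketched on a single machine as follows. This is a minimal, non-distributed illustration in Python with NumPy, not the authors' implementation: the function names, the linear interpolation between quantiles, and the defaults `q=100` and `n_sets=5` are all assumptions made for the example.

```python
import numpy as np

def cdf_from_quantiles(train, q=100):
    """Step 1a: approximate the unknown CDF by the q-quantiles of the training values."""
    probs = np.linspace(0.0, 1.0, q + 1)
    cuts = np.quantile(train, probs)
    return probs, cuts

def to_uniform(x, probs, cuts):
    """Step 1b: probability integral transform, mapping x to roughly U(0, 1)
    by linear interpolation between the quantile cut points."""
    return np.interp(x, cuts, probs)

def from_uniform(u, probs, cuts):
    """Approximate quantile function (inverse CDF), used to recover the
    definition of a fuzzy set in the original attribute space."""
    return np.interp(u, probs, cuts)

def triangular_memberships(u, n_sets=5):
    """Step 2: memberships of u in n_sets equally spaced triangular fuzzy sets
    on [0, 1]. Adjacent sets cross at 0.5 and the memberships of any point
    sum to 1, which is the Ruspini strong partition condition."""
    centers = np.linspace(0.0, 1.0, n_sets)
    width = centers[1] - centers[0]
    u = np.atleast_1d(np.asarray(u, dtype=float))[:, None]
    return np.clip(1.0 - np.abs(u - centers) / width, 0.0, 1.0)

# Skewed synthetic data (hypothetical) standing in for one training attribute.
rng = np.random.default_rng(0)
train = rng.lognormal(size=1000)
probs, cuts = cdf_from_quantiles(train)
u = to_uniform(train, probs, cuts)
memberships = triangular_memberships(u)
# Centers of the fuzzy sets mapped back to the original attribute space.
centers_original = from_uniform(np.linspace(0.0, 1.0, 5), probs, cuts)
```

Because the partition is built in the transformed space, the fuzzy sets are evenly spaced there but, once mapped back through `from_uniform`, they are denser where the training data are denser.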

arXiv.org e-Print Archive

Crossref

Sistemas granulares evolutivos (Evolving granular systems)

Author: Leite Daniel Furtado
Publication venue: [s.n.]
Publication date: 21/08/2018

Advisor: Fernando Antonio Campos Gomide
Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação

Abstract: In recent years there has been increasing interest in computational modeling approaches to deal with real-world data streams. Methods and algorithms have been proposed to uncover meaningful knowledge from very large (often unbounded) data sets that, in principle, have no apparent value. This thesis introduces a framework for evolving granular modeling of uncertain data streams. Evolving granular systems comprise an array of online modeling approaches inspired by the way in which humans deal with complexity. These systems explore the information flow in dynamic environments and derive from it models that can be linguistically understood. In particular, information granulation is a natural technique to dispense with unnecessary details and emphasize transparency, interpretability and scalability of information systems. Uncertain (granular) data arise from imprecise perception or description of the value of a variable. Broadly stated, various factors can affect one's choice of data representation such that the representing object conveys the meaning of the concept it is being used to represent.

Of particular concern to this work are numerical, interval, and fuzzy types of granular data; and interval, fuzzy, and neurofuzzy modeling frameworks. Learning in evolving granular systems is based on incremental algorithms that build model structure from scratch on a per-sample basis and adapt model parameters whenever necessary. This learning paradigm is particularly important because it avoids redesigning and retraining models whenever the environment changes. Application examples in classification, function approximation, time-series prediction and control using real and synthetic data illustrate the usefulness of the granular approaches and framework proposed. The behavior of nonstationary data streams with gradual and abrupt regime shifts is also analyzed in the realm of evolving granular computing. We shed light upon the role of interval, fuzzy, and neurofuzzy computing in processing uncertain data and providing high-quality approximate solutions and rule summaries of input-output data sets. The approaches and framework introduced constitute a natural extension of evolving intelligent systems over numeric data streams to evolving granular systems over granular data streams.

Doctorate. Automation. Doutor em Engenharia Elétric

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio da Producao Cientifica e Intelectual da Unicamp

Granular computing based approach of rule learning for binary classification

Author: Cocea Mihaela
Liu Han
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2019

Rule learning is one of the most popular types of machine-learning approaches, which typically follow one of two main strategies: ‘divide and conquer’ and ‘separate and conquer’. The former is aimed at induction of rules in the form of a decision tree, whereas the latter is aimed at direct induction of if–then rules. Because the divide and conquer strategy can result in the replicated sub-tree problem, which not only leads to overfitting but also increases the computational complexity of classifying unseen instances, researchers have been motivated to develop rule learning approaches through the separate and conquer strategy. In this paper, we investigate the Prism algorithm, a representative algorithm that follows the separate and conquer strategy and learns a set of rules for each class in the setting of granular computing, where each class (referred to as the target class) is viewed as a granule. The Prism algorithm shows performance highly comparable to that of the most popular algorithms, such as ID3 and C4.5, which follow the divide and conquer strategy. However, because it needs to learn a rule set for each class, Prism usually produces very complex rule-based classifiers. In real applications, many problems involve only one target class, so it is not necessary to learn a rule set for each class, i.e., only a set of rules for the target class needs to be learned, and a default rule is used to indicate the case of non-target classes. To address the above issues of Prism, we propose a new version of the algorithm referred to as PrismSTC, where ‘STC’ stands for ‘single target class’. Our experimental results show that PrismSTC produces simpler rule-based classifiers without loss of accuracy in comparison with Prism. PrismSTC also demonstrates sufficiently good performance compared with C4.5
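The idea of learning rules for a single target class and falling back to a default rule, as described in the abstract above, can be illustrated with a simplified separate-and-conquer learner. The Python sketch below follows the general Prism scheme for categorical attributes (greedily add the attribute-value pair with the highest target-class precision, then remove the covered instances); it is an assumption-laden illustration, not the authors' PrismSTC implementation, and the function names and toy data are hypothetical.

```python
def matches(rule, row):
    """A rule is a dict attribute -> required value; an empty rule matches everything."""
    return all(row[attr] == value for attr, value in rule.items())

def learn_rules_for_target(rows, labels, target):
    """Learn if-then rules for one target class only (separate and conquer)."""
    remaining = list(zip(rows, labels))
    rules = []
    while any(label == target for _, label in remaining):
        covered, rule = remaining, {}
        # Specialise the rule until it covers target instances only.
        while any(label != target for _, label in covered):
            best = None  # (precision, positives covered, attribute, value)
            for attr in covered[0][0]:
                if attr in rule:
                    continue
                for value in sorted({row[attr] for row, _ in covered}):
                    subset = [label for row, label in covered if row[attr] == value]
                    positives = sum(label == target for label in subset)
                    candidate = (positives / len(subset), positives, attr, value)
                    if best is None or candidate[:2] > best[:2]:
                        best = candidate
            if best is None:  # every attribute already used (clashing instances)
                break
            rule[best[2]] = best[3]
            covered = [(row, label) for row, label in covered if row[best[2]] == best[3]]
        rules.append(rule)
        # Separate: drop the instances covered by the new rule, then conquer the rest.
        remaining = [(row, label) for row, label in remaining if not matches(rule, row)]
    return rules

def predict(rules, row, target, default):
    """Target class if any rule fires; otherwise the default (non-target) rule applies."""
    return target if any(matches(rule, row) for rule in rules) else default

# Hypothetical toy data: the target class "yes" occurs when it is sunny and not windy.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
]
labels = ["yes", "no", "no", "no", "yes"]
rules = learn_rules_for_target(rows, labels, target="yes")
predictions = [predict(rules, row, "yes", "no") for row in rows]
```

Learning one rule set plus a default rule, instead of one rule set per class, is what keeps the resulting classifier small in this single-target setting.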

Online Research @ Cardiff

Portsmouth University Research Portal (Pure)

Fuzzy-rough set models and fuzzy-rough data reduction

Author: Alireza Mansouri Ghroutkhar
Hassan Mishmast Nehi
Publication venue: Croatian Operational Research Society
Publication date: 01/01/2020

Rough set theory is a powerful tool for analyzing information systems. The fuzzy rough set was introduced as a fuzzy generalization of rough sets. This paper reviews the most important contributions to rough set theory, fuzzy rough set theory and their applications. In many real-world situations, some of the attribute values for an object may be in set-valued form. To handle this problem, we present a more general approach to the fuzzification of rough sets. Specifically, we define a broad family of fuzzy rough sets. This paper presents a new development for rough set theory by incorporating classical rough set theory and interval-valued fuzzy sets. The proposed methods are illustrated by a numerical example on a real case

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Fuzzy rough granular neural networks, fuzzy granules, and classification

Author: Dutta Soumitra
Ganivada Avatharam
Pal Sankar K.
Publication venue: Elsevier B.V.
Publication date: 30/09/2011

Abstract: We introduce a fuzzy rough granular neural network (FRGNN) model based on the multilayer perceptron using a back-propagation algorithm for the fuzzy classification of patterns. We provide the development strategy of the network mainly based upon the input vector, initial connection weights determined by fuzzy rough set theoretic concepts, and the target vector. While the input vector is described in terms of fuzzy granules, the target vector is defined in terms of fuzzy class membership values and zeros. Crude domain knowledge about the initial data is represented in the form of a decision table, which is divided into subtables corresponding to different classes. The data in each decision table is converted into granular form. The syntax of these decision tables automatically determines the appropriate number of hidden nodes, while the dependency factors from all the decision tables are used as initial weights. The dependency factor of each attribute and the average degree of the dependency factor of all the attributes with respect to decision classes are considered as initial connection weights between the nodes of the input layer and the hidden layer, and the hidden layer and the output layer, respectively. The effectiveness of the proposed FRGNN is demonstrated on several real-life data sets

Elsevier - Publisher Connector