1,121 research outputs found

    NIS-Apriori-based rule generation with three-way decisions and its application system in SQL

    Get PDF
    In the study, non-deterministic information systems-Apriori-based (NIS-Apriori-based) rule generation from table data sets with incomplete information, SQL implementation, and the unique characteristics of the new framework are presented. Additionally, a few unsolved new research topics are proposed based on the framework. We follow the framework of NISs and propose certain rules and possible rules based on possible world semantics. Although each rule τ depends on a large number of possible tables, we prove that each rule τ is determined by examining only two τ -dependent possible tables. The NIS-Apriori algorithm is an adjusted Apriori algorithm that can handle such tables. Furthermore, it is logically sound and complete with regard to the rules. Subsequently, the implementation of the NIS-Apriori algorithm in SQL is described and a few new topics induced by effects of NIS-Apriori-based rule generation are confirmed. One of the topics that are considered is the possibility of estimating missing values via the obtained certain rules. The proposed methodology and the environment yielded by NIS-Apriori-based rule generation in SQL are useful for table data analysis with three-way decisions

    Sistemas granulares evolutivos

    Get PDF
    Orientador: Fernando Antonio Campos GomideTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: Recentemente tem-se observado um crescente interesse em abordagens de modelagem computacional para lidar com fluxos de dados do mundo real. Métodos e algoritmos têm sido propostos para obtenção de conhecimento a partir de conjuntos de dados muito grandes e, a princípio, sem valor aparente. Este trabalho apresenta uma plataforma computacional para modelagem granular evolutiva de fluxos de dados incertos. Sistemas granulares evolutivos abrangem uma variedade de abordagens para modelagem on-line inspiradas na forma com que os humanos lidam com a complexidade. Esses sistemas exploram o fluxo de informação em ambiente dinâmico e extrai disso modelos que podem ser linguisticamente entendidos. Particularmente, a granulação da informação é uma técnica natural para dispensar atenção a detalhes desnecessários e enfatizar transparência, interpretabilidade e escalabilidade de sistemas de informação. Dados incertos (granulares) surgem a partir de percepções ou descrições imprecisas do valor de uma variável. De maneira geral, vários fatores podem afetar a escolha da representação dos dados tal que o objeto representativo reflita o significado do conceito que ele está sendo usado para representar. Neste trabalho são considerados dados numéricos, intervalares e fuzzy; e modelos intervalares, fuzzy e neuro-fuzzy. A aprendizagem de sistemas granulares é baseada em algoritmos incrementais que constroem a estrutura do modelo sem conhecimento anterior sobre o processo e adapta os parâmetros do modelo sempre que necessário. Este paradigma de aprendizagem é particularmente importante uma vez que ele evita a reconstrução e o retreinamento do modelo quando o ambiente muda. Exemplos de aplicação em classificação, aproximação de função, predição de séries temporais e controle usando dados sintéticos e reais ilustram a utilidade das abordagens de modelagem granular propostas. O comportamento de fluxos de dados não-estacionários com mudanças graduais e abruptas de regime é também analisado dentro do paradigma de computação granular evolutiva. Realçamos o papel da computação intervalar, fuzzy e neuro-fuzzy em processar dados incertos e prover soluções aproximadas de alta qualidade e sumário de regras de conjuntos de dados de entrada e saída. As abordagens e o paradigma introduzidos constituem uma extensão natural de sistemas inteligentes evolutivos para processamento de dados numéricos a sistemas granulares evolutivos para processamento de dados granularesAbstract: In recent years there has been increasing interest in computational modeling approaches to deal with real-world data streams. Methods and algorithms have been proposed to uncover meaningful knowledge from very large (often unbounded) data sets in principle with no apparent value. This thesis introduces a framework for evolving granular modeling of uncertain data streams. Evolving granular systems comprise an array of online modeling approaches inspired by the way in which humans deal with complexity. These systems explore the information flow in dynamic environments and derive from it models that can be linguistically understood. Particularly, information granulation is a natural technique to dispense unnecessary details and emphasize transparency, interpretability and scalability of information systems. Uncertain (granular) data arise from imprecise perception or description of the value of a variable. Broadly stated, various factors can affect one's choice of data representation such that the representing object conveys the meaning of the concept it is being used to represent. Of particular concern to this work are numerical, interval, and fuzzy types of granular data; and interval, fuzzy, and neurofuzzy modeling frameworks. Learning in evolving granular systems is based on incremental algorithms that build model structure from scratch on a per-sample basis and adapt model parameters whenever necessary. This learning paradigm is meaningful once it avoids redesigning and retraining models all along if the system changes. Application examples in classification, function approximation, time-series prediction and control using real and synthetic data illustrate the usefulness of the granular approaches and framework proposed. The behavior of nonstationary data streams with gradual and abrupt regime shifts is also analyzed in the realm of evolving granular computing. We shed light upon the role of interval, fuzzy, and neurofuzzy computing in processing uncertain data and providing high-quality approximate solutions and rule summary of input-output data sets. The approaches and framework introduced constitute a natural extension of evolving intelligent systems over numeric data streams to evolving granular systems over granular data streamsDoutoradoAutomaçãoDoutor em Engenharia Elétric

    Decision Rules Acquisition for Inconsistent Disjunctive Set-Valued Ordered Decision Information Systems

    Get PDF
    Set-valued information system is an important formal framework for the development of decision support systems. We focus on the decision rules acquisition for the inconsistent disjunctive set-valued ordered decision information system in this paper. In order to derive optimal decision rules for an inconsistent disjunctive set-valued ordered decision information system, we define the concept of reduct of an object. By constructing the dominance discernibility function for an object, we compute reducts of the object via utilizing Boolean reasoning techniques, and then the corresponding optimal decision rules are induced. Finally, we discuss the certain reduct of the inconsistent disjunctive set-valued ordered decision information system, which can be used to simplify all certain decision rules as much as possible

    Data mining in soft computing framework: a survey

    Get PDF
    The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included

    Rough Set Approach to Incomplete Multiscale Information System

    Get PDF
    Multiscale information system is a new knowledge representation system for expressing the knowledge with different levels of granulations. In this paper, by considering the unknown values, which can be seen everywhere in real world applications, the incomplete multiscale information system is firstly investigated. The descriptor technique is employed to construct rough sets at different scales for analyzing the hierarchically structured data. The problem of unravelling decision rules at different scales is also addressed. Finally, the reduct descriptors are formulated to simplify decision rules, which can be derived from different scales. Some numerical examples are employed to substantiate the conceptual arguments

    Discretisation of conditions in decision rules induced for continuous

    Get PDF
    Typically discretisation procedures are implemented as a part of initial pre-processing of data, before knowledge mining is employed. It means that conclusions and observations are based on reduced data, as usually by discretisation some information is discarded. The paper presents a different approach, with taking advantage of discretisation executed after data mining. In the described study firstly decision rules were induced from real-valued features. Secondly, data sets were discretised. Using categories found for attributes, in the third step conditions included in inferred rules were translated into discrete domain. The properties and performance of rule classifiers were tested in the domain of stylometric analysis of texts, where writing styles were defined through quantitative attributes of continuous nature. The performed experiments show that the proposed processing leads to sets of rules with significantly reduced sizes while maintaining quality of predictions, and allows to test many data discretisation methods at the acceptable computational costs
    corecore