1,407 research outputs found

    Data mining in soft computing framework: a survey

    The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.

    Sistemas granulares evolutivos (Evolving granular systems)

    Advisor: Fernando Antonio Campos Gomide. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: In recent years there has been increasing interest in computational modeling approaches to deal with real-world data streams. Methods and algorithms have been proposed to uncover meaningful knowledge from very large (often unbounded) data sets that, in principle, have no apparent value. This thesis introduces a framework for evolving granular modeling of uncertain data streams. Evolving granular systems comprise an array of online modeling approaches inspired by the way in which humans deal with complexity. These systems explore the information flow in dynamic environments and derive from it models that can be linguistically understood. In particular, information granulation is a natural technique to dispense with unnecessary details and emphasize transparency, interpretability and scalability of information systems. Uncertain (granular) data arise from imprecise perception or description of the value of a variable. Broadly stated, various factors can affect one's choice of data representation such that the representing object conveys the meaning of the concept it is being used to represent. Of particular concern to this work are numerical, interval, and fuzzy types of granular data; and interval, fuzzy, and neurofuzzy modeling frameworks. Learning in evolving granular systems is based on incremental algorithms that build model structure from scratch on a per-sample basis and adapt model parameters whenever necessary. This learning paradigm is meaningful because it avoids redesigning and retraining models whenever the system changes. Application examples in classification, function approximation, time-series prediction and control using real and synthetic data illustrate the usefulness of the granular approaches and framework proposed. The behavior of nonstationary data streams with gradual and abrupt regime shifts is also analyzed in the realm of evolving granular computing. We shed light upon the role of interval, fuzzy, and neurofuzzy computing in processing uncertain data and providing high-quality approximate solutions and rule summaries of input-output data sets. The approaches and framework introduced constitute a natural extension of evolving intelligent systems over numeric data streams to evolving granular systems over granular data streams. Doctorate. Automation. Doctor of Electrical Engineering.

    Perpetual Learning Framework based on Type-2 Fuzzy Logic System for a Complex Manufacturing Process

    This paper introduces a perpetual type-2 Neuro-Fuzzy modelling structure for continuous learning and its application to the complex thermo-mechanical metal process of steel Friction Stir Welding (FSW). The ‘perpetual’ property refers to the capability of the proposed system to continuously learn from new process data, in an incremental learning fashion. This is particularly important in industrial/manufacturing processes, as it eliminates the need to retrain the model in the presence of new data, or in the case of any process drift. The proposed structure evolves through incremental, hybrid (supervised/unsupervised) learning, and accommodates new sample data in a continuous fashion. The human-like information capture paradigm of granular computing is used along with an interval type-2 neural-fuzzy system to develop a modelling structure that is tolerant to the uncertainty in the manufacturing data (a common challenge in industrial/manufacturing data). The proposed method relies on the creation of new fuzzy rules, which are updated and optimised during the incremental learning process. An iterative pruning strategy is then employed to remove any rules made redundant by the incremental learning process. The rule growing/pruning strategy is used to guarantee that the proposed structure can be used in a perpetual learning mode. It is demonstrated that the proposed structure can effectively learn complex dynamics of input-output data in an adaptive way and maintain good predictive performance in the metal processing case study of steel FSW using real manufacturing data.

    Fuzzy rough granular neural networks, fuzzy granules, and classification

    We introduce a fuzzy rough granular neural network (FRGNN) model based on the multilayer perceptron using a back-propagation algorithm for the fuzzy classification of patterns. We provide the development strategy of the network mainly based upon the input vector, initial connection weights determined by fuzzy rough set theoretic concepts, and the target vector. While the input vector is described in terms of fuzzy granules, the target vector is defined in terms of fuzzy class membership values and zeros. Crude domain knowledge about the initial data is represented in the form of a decision table, which is divided into subtables corresponding to different classes. The data in each decision table is converted into granular form. The syntax of these decision tables automatically determines the appropriate number of hidden nodes, while the dependency factors from all the decision tables are used as initial weights. The dependency factor of each attribute and the average degree of the dependency factor of all the attributes with respect to decision classes are considered as initial connection weights between the nodes of the input layer and the hidden layer, and the hidden layer and the output layer, respectively. The effectiveness of the proposed FRGNN is demonstrated on several real-life data sets.
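The abstract above uses dependency factors from decision tables as initial weights. As a rough illustration only, the sketch below computes the classical (crisp) Pawlak dependency factor, i.e. the fraction of objects whose condition-attribute values determine the decision unambiguously. It is not the fuzzy rough variant used by the FRGNN authors, and the table, attribute names (`a`, `b`, `d`) and function name are hypothetical.

```python
from collections import defaultdict

def dependency_factor(rows, condition_attrs, decision_attr):
    """Crisp rough-set dependency: |positive region| / |universe|."""
    # Group objects into equivalence classes by their condition-attribute values.
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        blocks[key].append(row[decision_attr])
    # An equivalence class is in the positive region if all its decisions agree.
    positive = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return positive / len(rows)

# Hypothetical decision table (illustrative data only).
rows = [
    {"a": "low",  "b": "yes", "d": 0},
    {"a": "low",  "b": "no",  "d": 0},
    {"a": "high", "b": "yes", "d": 1},
    {"a": "high", "b": "no",  "d": 1},
    {"a": "mid",  "b": "yes", "d": 1},
    {"a": "mid",  "b": "no",  "d": 0},
]

gamma_a = dependency_factor(rows, ["a"], "d")        # "mid" objects are inconsistent
gamma_b = dependency_factor(rows, ["b"], "d")        # no class is consistent
gamma_ab = dependency_factor(rows, ["a", "b"], "d")  # every object is discernible
print(gamma_a, gamma_b, gamma_ab)
```

In a network initialised along these lines, an attribute with a higher dependency factor would start with a larger connection weight; how the paper maps factors to specific layers is described only at the level of detail given in the abstract.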

    A Fraud-Detection Fuzzy Logic Based System for the Sudanese Financial Sector

    Financial fraud is considered a global issue that faces the financial sector and the economy; as a result, many financial institutions lose hundreds of millions of dollars annually due to fraud. In Sudan, it is difficult to obtain real data from banks, and systems that explain the reasons for suspicious transactions are unavailable. Hence, there is a need for transparent techniques which can automatically detect fraud with high accuracy and identify its causes and common patterns. Some Artificial Intelligence (AI) techniques provide good predictive models; nevertheless, they are considered black-box models which are not easy to understand and analyze. In this paper, we developed a novel intelligent type-2 Fuzzy Logic System (FLS) which can detect fraud in debit cards using a real-world dataset extracted from financial institutions in Sudan. FLSs provide white-box transparent models which employ linguistic labels and IF-THEN rules that can be easily analyzed, interpreted and augmented by fraud experts. The proposed type-2 FLS learnt both its fuzzy set parameters, using Fuzzy C-means (FCM) clustering, and its rules from data. The proposed system has the potential to result in highly accurate automatic fraud detection for the Sudanese financial institutions and banking sectors.
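The abstract above mentions learning fuzzy set parameters with Fuzzy C-means (FCM). The following is a minimal sketch of standard type-1 FCM in NumPy, assuming Euclidean distance and fuzzifier `m = 2`; it is not the authors' type-2 system, and the `amounts` data and function name are hypothetical.

```python
import numpy as np

def fuzzy_c_means(x, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Standard FCM: returns (centers, memberships) with memberships of shape (c, n)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).reshape(len(x), -1)  # (n_samples, n_features)
    # Random initial memberships, normalised so each sample's degrees sum to 1.
    u = rng.random((n_clusters, x.shape[0]))
    u /= u.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        # Centers are membership-weighted means of the samples.
        centers = um @ x / um.sum(axis=1, keepdims=True)
        # Distance of every sample to every center, shape (c, n).
        dist = np.linalg.norm(x[None, :, :] - centers[:, None, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)
    return centers, u

# Hypothetical transaction amounts forming two obvious groups.
amounts = [10, 12, 11, 13, 200, 210, 205, 190]
centers, u = fuzzy_c_means(amounts, n_clusters=2)
low, high = sorted(centers.ravel())
print(low, high)
```

The cluster centers and membership spreads found this way could seed the centers and widths of linguistic labels such as "low amount" and "high amount"; extending them to type-2 sets requires additional steps not shown here.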

    A fuzzy approach to text classification with two-stage training for ambiguous instances

    Sentiment analysis is a very popular application area of text mining and machine learning. The popular methods include Support Vector Machine, Naive Bayes, Decision Trees and Deep Neural Networks. However, these methods generally belong to discriminative learning, which aims to distinguish one class from others with a clear-cut outcome, under the presence of ground truth. In the context of text classification, instances are naturally fuzzy (can be multi-labeled in some application areas) and thus are not considered clear-cut, especially given the fact that labels assigned to sentiment in text represent an agreed level of subjective opinion among multiple human annotators rather than indisputable ground truth. This has motivated researchers to develop fuzzy methods, which typically train classifiers through generative learning, i.e. a fuzzy classifier is used to measure the degree to which an instance belongs to each class. Traditional fuzzy methods typically involve generation of a single fuzzy classifier and employ a fixed rule of defuzzification that outputs the class with the maximum membership degree. A single fuzzy classifier with this fixed defuzzification rule is likely to run into text ambiguity on sentiment data, i.e. an instance may obtain equal membership degrees to both the positive and negative classes. In this paper, we focus on cyberhate classification, since the spread of hate speech via social media can have disruptive impacts on social cohesion and lead to regional and community tensions. Automatic detection of cyberhate has thus become a priority research area. In particular, we propose a modified fuzzy approach with two-stage training for dealing with text ambiguity and classifying four types of hate speech, namely religion, race, disability and sexual orientation, and compare its performance with the popular methods above as well as some existing fuzzy approaches. The features are prepared through the Bag-of-Words and Word Embedding feature extraction methods alongside the correlation-based feature subset selection method. The experimental results show that the proposed fuzzy method outperforms the other methods in most cases.
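The fixed defuzzification rule described in the abstract above, and the ambiguity it can produce, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' two-stage method: the function name, the tolerance `tol`, and the choice to return `None` on a tie are all hypothetical.

```python
from typing import Dict, Optional

def defuzzify_max(memberships: Dict[str, float], tol: float = 1e-9) -> Optional[str]:
    """Return the class with the highest membership degree, or None when the top two tie."""
    if not memberships:
        raise ValueError("memberships must not be empty")
    # Rank classes from highest to lowest membership degree.
    ranked = sorted(memberships.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and abs(ranked[0][1] - ranked[1][1]) <= tol:
        return None  # ambiguous instance: the fixed rule cannot decide
    return ranked[0][0]

clear = defuzzify_max({"positive": 0.8, "negative": 0.3})
ambiguous = defuzzify_max({"positive": 0.5, "negative": 0.5})
print(clear, ambiguous)
```

Instances for which the rule returns `None` are the ones a second training stage would need to handle.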

    Cognitive Models and Computational Approaches for improving Situation Awareness Systems

    2016 - 2017. The world of the Internet of Things is pervaded by complex environments with smart services available at any time and everywhere. In such a context, a serious open issue is the capability of information systems to support adaptive and collaborative decision processes in perceiving and elaborating huge amounts of data. This requires the design and realization of novel socio-technical systems based on the “human-in-the-loop” paradigm. The presence of both humans and software in such systems demands adequate levels of Situation Awareness (SA). Achieving and maintaining proper levels of SA is a daunting task due to the intrinsic technical characteristics of systems and the limitations of human cognitive mechanisms. In the scientific literature, such issues hindering the SA formation process are defined as SA demons. The objective of this research is to contribute to the resolution of the SA demons by identifying information processing paradigms that offer original support to SA and by defining new theoretical and practical approaches based on cognitive models and computational techniques. The research work starts with an in-depth analysis and some preliminary verifications of methods, techniques, and systems of SA. A major outcome of this analysis is that there is only a limited use of the Granular Computing paradigm (GrC) in the SA field, despite the fact that SA and GrC share many concepts and principles. The research work continues with the definition of contributions and original results for the resolution of significant SA demons, exploiting some of the approaches identified in the analysis phase (i.e., ontologies, data mining, and GrC). The first contribution addresses the issues related to the bad perception of data by users. We propose a semantic approach for quality-aware sensor data management which uses a data imputation technique based on association rule mining. The second contribution proposes an original ontological approach to situation management, namely the Adaptive Goal-driven Situation Management. The approach uses the ontological modeling of goals and situations and a mechanism that suggests the most relevant goals to the users at a given moment. Lastly, the adoption of the GrC paradigm allows the definition of a novel model for representing and reasoning on situations based on a set-theoretical framework. This model has been instantiated using rough set theory. The proposed approaches and models have been implemented in prototypical systems. Their capabilities in improving SA in real applications have been evaluated with typical methodologies used for SA systems. [edited by Author] XXX cycle

    The Application of Data Mining Techniques in Agricultural Science

    Information Technology has a positive impact on other disciplines. With today's technology, precision agriculture and Information Technology are combined. The use of Information Technology in agriculture will lead to improvements in productivity. For this purpose, the raw data is transformed into useful information through data mining. This research determined whether data mining techniques can also be used to improve pattern recognition and analysis of large experimental datasets on growth factors of ornamental plants. Furthermore, the research aimed to establish whether data mining techniques can assist classification and regression methods by determining whether meaningful patterns exist among various growth factors of ornamental plants characterized at various research sites across Kish Island. Different data mining techniques were used to analyze a large database of ornamental plant properties. The database was collected from different plants in various areas of Kish Island in order to determine, classify and predict the growth factors that affect blooming. In this research, data analyzed with a regression technique showed the effect of chlorophyll content on the number of flowers. The analysis of this agricultural database with different data mining methods may have some advantages in agriculture.

    The posterity of Zadeh's 50-year-old paper: A retrospective in 101 Easy Pieces – and a Few More

    This article was commissioned by the 22nd IEEE International Conference of Fuzzy Systems (FUZZ-IEEE) to celebrate the 50th Anniversary of Lotfi Zadeh's seminal 1965 paper on fuzzy sets. In addition to Lotfi's original paper, this note itemizes 100 citations of books and papers deemed “important (significant, seminal, etc.)” by 20 of the 21 living IEEE CIS Fuzzy Systems pioneers. Each of the 20 contributors supplied 5 citations, and Lotfi's paper makes the overall list a tidy 101, as in “Fuzzy Sets 101”. This note is not a survey in any real sense of the word, but the contributors did offer short remarks to indicate the reason for inclusion (e.g., historical, topical, seminal, etc.) of each citation. Citation statistics are easy to find and notoriously erroneous, so we refrain from reporting them - almost. The exception is that according to Google Scholar on April 9, 2015, Lotfi's 1965 paper has been cited 55,479 times.

    Evolving fuzzy and neuro-fuzzy approaches in clustering, regression, identification, and classification: A Survey

    Major assumptions in computational intelligence and machine learning consist of the availability of a historical dataset for model development, and that the resulting model will, to some extent, handle similar instances during its online operation. However, in many real-world applications, these assumptions may not hold, as the amount of previously available data may be insufficient to represent the underlying system, and the environment and the system may change over time. As the amount of data increases, it is no longer feasible to process data efficiently using iterative algorithms, which typically require multiple passes over the same portions of data. Evolving modeling from data streams has emerged as a framework to address these issues properly by self-adaptation, single-pass learning steps and evolution as well as contraction of model components on demand and on the fly. This survey focuses on evolving fuzzy rule-based models and neuro-fuzzy networks for clustering, classification, regression and system identification in online, real-time environments where learning and model development should be performed incrementally. (C) 2019 Published by Elsevier Inc. Igor Škrjanc, Jose Antonio Iglesias and Araceli Sanchis would like to thank the Chair of Excellence of Universidad Carlos III de Madrid, and the Bank of Santander Program for their support. Igor Škrjanc is grateful to the Slovenian Research Agency with the research program P2-0219, Modeling, simulation and control. Daniel Leite acknowledges the Minas Gerais Foundation for Research and Development (FAPEMIG), process APQ-03384-18. Igor Škrjanc and Edwin Lughofer acknowledge the support by the “LCM - K2 Center for Symbiotic Mechatronics” within the framework of the Austrian COMET-K2 program. Fernando Gomide is grateful to the Brazilian National Council for Scientific and Technological Development (CNPq) for grant 305906/2014-3.
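The single-pass, grow-on-demand learning described in the abstract above can be illustrated with a toy evolving clusterer. This is a minimal sketch, not any specific algorithm from the survey: the `radius` threshold, the `Cluster` class and the `evolve` function are hypothetical, and contraction (merging or pruning of components) is deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass
class Cluster:
    center: float
    count: int = 1

def evolve(stream: Iterable[float], radius: float) -> List[Cluster]:
    """Process each sample once: update the nearest cluster or create a new one."""
    clusters: List[Cluster] = []
    for x in stream:
        nearest = min(clusters, key=lambda c: abs(x - c.center), default=None)
        if nearest is None or abs(x - nearest.center) > radius:
            # Structure evolution: the sample is not covered, so add a component.
            clusters.append(Cluster(center=x))
        else:
            # Parameter adaptation: recursive running-mean update, no stored history.
            nearest.count += 1
            nearest.center += (x - nearest.center) / nearest.count
    return clusters

# Hypothetical stream with a regime shift to values near 5 and an outlier at 9.
clusters = evolve([1.0, 1.2, 0.8, 5.0, 5.2, 1.0, 9.0], radius=1.0)
print([(round(c.center, 3), c.count) for c in clusters])
```

Because each sample is seen once and then discarded, memory use depends on the number of clusters rather than on the length of the stream.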