Search CORE

2,226 research outputs found

Automatic Finding Trapezoidal Membership Functions in Mining Fuzzy Association Rules Based on Learning Automata

Author: Anari B.
Anari Z.
Hatamlou A.
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 01/01/2022
Field of study

Association rule mining is an important data mining technique used for discovering relationships among all data items. Membership functions have a significant impact on the outcome of the mining association rules. An important challenge in fuzzy association rule mining is finding an appropriate membership functions, which is an optimization issue. In the most relevant studies of fuzzy association rule mining, only triangle membership functions are considered. This study, as the first attempt, used a team of continuous action-set learning automata (CALA) to find both the appropriate number and positions of trapezoidal membership functions (TMFs). The spreads and centers of the TMFs were taken into account as parameters for the research space and a new approach for the establishment of a CALA team to optimize these parameters was introduced. Additionally, to increase the convergence speed of the proposed approach and remove bad shapes of membership functions, a new heuristic approach has been proposed. Experiments on two real data sets showed that the proposed algorithm improves the efficiency of the extracted rules by finding optimized membership functions

Re-UNIR

DIALNET

Evolutionary approaches to fuzzy modelling for classification

Author: Galea Michelle
Shen Qiang
Publication venue
Publication date: 01/01/2004
Field of study

Aberystwyth Research Portal

A Genetic Tuning to Improve the Performance of Fuzzy Rule-Based Classification Systems with Interval-Valued Fuzzy Sets: Degree of Ignorance and Lateral Position

Author: A. Fernández
Akbarzadeh-Totonchi
Alcalá
Alcalá
Alcalá
Alcalá
Alcalá-Fdez
Alcalá-Fdez
Antonelli
Bustince
Bustince
Bustince
Casillas
Cazarez-Castro
Celikyilmaz
Cordón
Cordón
Cordón
Cortes
Coupland
delaOssa
Demšar
Deschrijver
F. Herrera
Fernández
Fernández
Gacto
García
García
García
Gorlzakczany
H. Bustince
Herrera
Herrera
Herrera
Hidalgo
Ishibuchi
Ishibuchi
Ishibuchi
Ishibuchi
J. Sanz
Kaya
Liang
Luengo
Mansoori
Miller
Nojima
Palacios
Park
Sanz
Schaefer
Sheskin
Starczewski
Walker
Wu
Wu
Zarandi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011
Field of study

Fuzzy Rule-Based Systems are appropriate tools to deal with classification problems due to their good properties. However, they can suffer a lack of system accuracy as a result of the uncertainty inherent in the definition of the membership functions and the limitation of the homogeneous distribution of the linguistic labels. The aim of the paper is to improve the performance of Fuzzy Rule-Based Classification Systems by means of the Theory of Interval-Valued Fuzzy Sets and a post-processing genetic tuning step. In order to build the Interval-Valued Fuzzy Sets we define a new function called weak ignorance for modeling the uncertainty associated with the definition of the membership functions. Next, we adapt the fuzzy partitions to the problem in an optimal way through a cooperative evolutionary tuning in which we handle both the degree of ignorance and the lateral position (based on the 2-tuples fuzzy linguistic representation) of the linguistic labels. The experimental study is carried out over a large collection of data-sets and it is supported by a statistical analysis. Our results show empirically that the use of our methodology outperforms the initial Fuzzy Rule-Based Classification System. The application of our cooperative tuning enhances the results provided by the use of the isolated tuning approaches and also improves the behavior of the genetic tuning based on the 3-tuples fuzzy linguistic representation.Spanish Government TIN2008-06681-C06-01 TIN2010-1505

Elsevier - Publisher Connector

Crossref

Repositorio Institucional Universidad de Granada

Academica-e

Data mining using intelligent systems : an optimized weighted fuzzy decision tree approach

Author: Li XuQin
Publication venue
Publication date
Field of study

Data mining can be said to have the aim to analyze the observational datasets to find relationships and to present the data in ways that are both understandable and useful. In this thesis, some existing intelligent systems techniques such as Self-Organizing Map, Fuzzy C-means and decision tree are used to analyze several datasets. The techniques are used to provide flexible information processing capability for handling real-life situations. This thesis is concerned with the design, implementation, testing and application of these techniques to those datasets. The thesis also introduces a hybrid intelligent systems technique: Optimized Weighted Fuzzy Decision Tree (OWFDT) with the aim of improving Fuzzy Decision Trees (FDT) and solving practical problems. This thesis first proposes an optimized weighted fuzzy decision tree, incorporating the introduction of Fuzzy C-Means to fuzzify the input instances but keeping the expected labels crisp. This leads to a different output layer activation function and weight connection in the neural network (NN) structure obtained by mapping the FDT to the NN. A momentum term was also introduced into the learning process to train the weight connections to avoid oscillation or divergence. A new reasoning mechanism has been also proposed to combine the constructed tree with those weights which had been optimized in the learning process. This thesis also makes a comparison between the OWFDT and two benchmark algorithms, Fuzzy ID3 and weighted FDT. SIx datasets ranging from material science to medical and civil engineering were introduced as case study applications. These datasets involve classification of composite material failure mechanism, classification of electrocorticography (ECoG)/Electroencephalogram (EEG) signals, eye bacteria prediction and wave overtopping prediction. Different intelligent systems techniques were used to cluster the patterns and predict the classes although OWFDT was used to design classifiers for all the datasets. In the material dataset, Self-Organizing Map and Fuzzy C-Means were used to cluster the acoustic event signals and classify those events to different failure mechanism, after the classification, OWFDT was introduced to design a classifier in an attempt to classify acoustic event signals. For the eye bacteria dataset, we use the bagging technique to improve the classification accuracy of Multilayer Perceptrons and Decision Trees. Bootstrap aggregating (bagging) to Decision Tree also helped to select those most important sensors (features) so that the dimension of the data could be reduced. Those features which were most important were used to grow the OWFDT and the curse of dimensionality problem could be solved using this approach. The last dataset, which is concerned with wave overtopping, was used to benchmark OWFDT with some other Intelligent Systems techniques, such as Adaptive Neuro-Fuzzy Inference System (ANFIS), Evolving Fuzzy Neural Network (EFuNN), Genetic Neural Mathematical Method (GNMM) and Fuzzy ARTMAP. Through analyzing these datasets using these Intelligent Systems Techniques, it has been shown that patterns and classes can be found or can be classified through combining those techniques together. OWFDT has also demonstrated its efficiency and effectiveness as compared with a conventional fuzzy Decision Tree and weighted fuzzy Decision Tree

Warwick Research Archives Portal Repository

Mining quantitative association rules based on evolutionary computation and its application to atmospheric pollution

Author: Martínez Ballesteros María del Mar
Martínez Álvarez Francisco
Riquelme Santos José Cristóbal
Troncoso Lora Alicia
Publication venue: 'IOS Press'
Publication date: 01/01/2010
Field of study

This research presents the mining of quantitative association rules based on evolutionary computation techniques. First, a real-coded genetic algorithm that extends the well-known binary-coded CHC algorithm has been projected to determine the intervals that define the rules without needing to discretize the attributes. The proposed algorithm is evaluated in synthetic datasets under different levels of noise in order to test its performance and the reported results are then compared to that of a multi-objective differential evolution algorithm, recently published. Furthermore, rules from real-world time series such as temperature, humidity, wind speed and direction of the wind, ozone, nitrogen monoxide and sulfur dioxide have been discovered with the objective of finding all existing relations between atmospheric pollution and climatological conditions.Ministerio de Ciencia y Tecnología TIN2007-68084-C-00Junta de Andalucía P07-TIC-0261

idUS. Depósito de Investigación Universidad de Sevilla

A forecasting of indices and corresponding investment decision making application

Author: Patel Pretesh Bhoola
Publication venue
Publication date: 01/03/2007
Field of study

Student Number : 9702018F - MSc(Eng) Dissertation - School of Electrical and Information Engineering - Faculty of Engineering and the Built EnvironmentDue to the volatile nature of the world economies, investing is crucial in ensuring an individual is prepared for future financial necessities. This research proposes an application, which employs computational intelligent methods that could assist investors in making financial decisions. This system consists of 2 components. The Forecasting Component (FC) is employed to predict the closing index price performance. Based on these predictions, the Stock Quantity Selection Component (SQSC) recommends the investor to purchase stocks, hold the current investment position or sell stocks in possession. The development of the FC module involved the creation of Multi-Layer Perceptron (MLP) as well as Radial Basis Function (RBF) neural network classifiers. TCategorizes that these networks classify are based on a profitable trading strategy that outperforms the long-term “Buy and hold” trading strategy. The Dow Jones Industrial Average, Johannesburg Stock Exchange (JSE) All Share, Nasdaq 100 and the Nikkei 225 Stock Average indices are considered. TIt has been determined that the MLP neural network architecture is particularly suited in the prediction of closing index price performance. Accuracies of 72%, 68%, 69% and 64% were obtained for the prediction of closing price performance of the Dow Jones Industrial Average, JSE All Share, Nasdaq 100 and Nikkei 225 Stock Average indices, respectively. TThree designs of the Stock Quantity Selection Component were implemented and compared in terms of their complexity as well as scalability. TComplexity is defined as the number of classifiers employed by the design. Scalability is defined as the ability of the design to accommodate the classification of additional investment recommendations. TDesigns that utilized 1, 4 and 16 classifiers, respectively, were developed. These designs were implemented using MLP neural networks, RBF neural networks, Fuzzy Inference Systems as well as Adaptive Neuro-Fuzzy Inference Systems. The design that employed 4 classifiers achieved low complexity and high scalability. As a result, this design is most appropriate for the application of concern. It has also been determined that the neural network architecture as well as the Fuzzy Inference System implementation of this design performed equally well

Wits Institutional Repository on DSPACE

Technical and Fundamental Features Analysis for Stock Market Prediction with Data Mining Methods

Author: Barak Sasan
Publication venue: Vysoká škola báňská - Technická univerzita Ostrava
Publication date: 01/01/2019
Field of study

Predicting stock prices is an essential objective in the financial world. Forecasting stock returns and their risk represents one of the most critical concerns of market decision makers. This thesis investigates the stock price forecasting with three approaches from the data mining concept and shows how different elements in the stock price can help to enhance the accuracy of our prediction. For this reason, the first and second approaches capture many fundamental indicators from the stocks and implement them as explanatory variables to do stock price classification and forecasting. In the third approach, technical features from the candlestick representation of the share prices are extracted and used to enhance the accuracy of the forecasting. In each approach, different tools and techniques from data mining and machine learning are employed to justify why the forecasting is working. Furthermore, since the idea is to evaluate the potential of features in the stock trend forecasting, therefore we diversify our experiments using both technical and fundamental features. Therefore, in the first approach, a three-stage methodology is developed while in the first step, a comprehensive investigation of all possible features which can be effective on stocks risk and return are identified. Then, in the next stage, risk and return are predicted by applying data mining techniques for the given features. Finally, we develop a hybrid algorithm, based on some filters and function-based clustering; and re-predicted the risk and return of stocks. In the second approach, instead of using single classifiers, a fusion model is proposed based on the use of multiple diverse base classifiers that operate on a common input and a meta-classifier that learns from base classifiers’ outputs to obtain a more precise stock return and risk predictions. A set of diversity methods, including Bagging, Boosting, and AdaBoost, is applied to create diversity in classifier combinations. Moreover, the number and procedure for selecting base classifiers for fusion schemes are determined using a methodology based on dataset clustering and candidate classifiers’ accuracy. Finally, in the third approach, a novel forecasting model for stock markets based on the wrapper ANFIS (Adaptive Neural Fuzzy Inference System) – ICA (Imperialist Competitive Algorithm) and technical analysis of Japanese Candlestick is presented. Two approaches of Raw-based and Signal-based are devised to extract the model’s input variables and buy and sell signals are considered as output variables. To illustrate the methodologies, for the first and second approaches, Tehran Stock Exchange (TSE) data for the period from 2002 to 2012 are applied, while for the third approach, we used General Motors and Dow Jones indexes.Predicting stock prices is an essential objective in the financial world. Forecasting stock returns and their risk represents one of the most critical concerns of market decision makers. This thesis investigates the stock price forecasting with three approaches from the data mining concept and shows how different elements in the stock price can help to enhance the accuracy of our prediction. For this reason, the first and second approaches capture many fundamental indicators from the stocks and implement them as explanatory variables to do stock price classification and forecasting. In the third approach, technical features from the candlestick representation of the share prices are extracted and used to enhance the accuracy of the forecasting. In each approach, different tools and techniques from data mining and machine learning are employed to justify why the forecasting is working. Furthermore, since the idea is to evaluate the potential of features in the stock trend forecasting, therefore we diversify our experiments using both technical and fundamental features. Therefore, in the first approach, a three-stage methodology is developed while in the first step, a comprehensive investigation of all possible features which can be effective on stocks risk and return are identified. Then, in the next stage, risk and return are predicted by applying data mining techniques for the given features. Finally, we develop a hybrid algorithm, based on some filters and function-based clustering; and re-predicted the risk and return of stocks. In the second approach, instead of using single classifiers, a fusion model is proposed based on the use of multiple diverse base classifiers that operate on a common input and a meta-classifier that learns from base classifiers’ outputs to obtain a more precise stock return and risk predictions. A set of diversity methods, including Bagging, Boosting, and AdaBoost, is applied to create diversity in classifier combinations. Moreover, the number and procedure for selecting base classifiers for fusion schemes are determined using a methodology based on dataset clustering and candidate classifiers’ accuracy. Finally, in the third approach, a novel forecasting model for stock markets based on the wrapper ANFIS (Adaptive Neural Fuzzy Inference System) – ICA (Imperialist Competitive Algorithm) and technical analysis of Japanese Candlestick is presented. Two approaches of Raw-based and Signal-based are devised to extract the model’s input variables and buy and sell signals are considered as output variables. To illustrate the methodologies, for the first and second approaches, Tehran Stock Exchange (TSE) data for the period from 2002 to 2012 are applied, while for the third approach, we used General Motors and Dow Jones indexes.154 - Katedra financívyhově

DSpace at VSB Technical University of Ostrava

Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications

Author: He Yuanchen
Publication venue: ScholarWorks @ Georgia State University
Publication date: 04/12/2006
Field of study

Due to complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are highly needed to help humans to understand the inherent mechanism of diseases. For biomedical classification problems, typically it is impossible to build a perfect classifier with 100% prediction accuracy. Hence a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive to state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support on disease diagnoses due to their easy interpretability. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from huge microarray gene expression data. With fuzzy granulation, information loss in the process of gene selection is decreased. As a result, more informative genes for cancer classification are selected and more accurate classifiers can be modeled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification. And hence we expect that genes being selected can be more helpful for further biological studies

ScholarWorks @ Georgia State University

Selecting the best measures to discover quantitative association rules

Author: Martínez Ballesteros María del Mar
Martínez Álvarez Francisco
Riquelme Santos José Cristóbal
Troncoso Lora Alicia
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014
Field of study

The majority of the existing techniques to mine association rules typically use the support and the confidence to evaluate the quality of the rules obtained. However, these two measures may not be sufficient to properly assess their quality due to some inherent drawbacks they present. A review of the literature reveals that there exist many measures to evaluate the quality of the rules, but that the simultaneous optimization of all measures is complex and might lead to poor results. In this work, a principal components analysis is applied to a set of measures that evaluate quantitative association rules' quality. From this analysis, a reduced subset of measures has been selected to be included in the fitness function in order to obtain better values for the whole set of quality measures, and not only for those included in the fitness function. This is a general-purpose methodology and can, therefore, be applied to the fitness function of any algorithm. To validate if better results are obtained when using the function fitness composed of the subset of measures proposed here, the existing QARGA algorithm has been applied to a wide variety of datasets. Finally, a comparative analysis of the results obtained by means of the application of QARGA with the original fitness function is provided, showing a remarkable improvement when the new one is used.Ministerio de Ciencia y Tecnología TIN2011-28956-C0

idUS. Depósito de Investigación Universidad de Sevilla

Hybrid ACO and SVM algorithm for pattern classification

Author: Alwan Hiba Basim
Publication venue
Publication date: 01/01/2013
Field of study

Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach originated from statistical approaches. However, SVM suffers two main problems which include feature subset selection and parameter tuning. Most approaches related to tuning SVM parameters discretize the continuous value of the parameters which will give a negative effect on the classification performance. This study presents four algorithms for tuning the SVM parameters and selecting feature subset which improved SVM classification accuracy with smaller size of feature subset. This is achieved by performing the SVM parameters’ tuning and feature subset selection processes simultaneously. Hybridization algorithms between ACO and SVM techniques were proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters while the second two algorithms, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. Experimental results obtained from the proposed algorithms are better when compared with other approaches in terms of classification accuracy and size of the feature subset. The average classification accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R and IACOMV-R algorithms are 94.73%, 95.86%, 97.37% and 98.1% respectively. The average size of feature subset is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R and IACOMV-R algorithms. This study contributes to a new direction for ACO that can deal with continuous and mixed-variable ACO

Universiti Utara Malaysia: UUM eTheses