69,353 research outputs found

    Multiobjective Evolutionary Optimization of Type-2 Fuzzy Rule-Based Systems for Financial Data Classification

    Get PDF
    Classification techniques are becoming essential in the financial world for reducing risks and possible disasters. Managers are interested in not only high accuracy, but in interpretability and transparency as well. It is widely accepted now that the comprehension of how inputs and outputs are related to each other is crucial for taking operative and strategic decisions. Furthermore, inputs are often affected by contextual factors and characterized by a high level of uncertainty. In addition, financial data are usually highly skewed toward the majority class. With the aim of achieving high accuracies, preserving the interpretability, and managing uncertain and unbalanced data, this paper presents a novel method to deal with financial data classification by adopting type-2 fuzzy rule-based classifiers (FRBCs) generated from data by a multiobjective evolutionary algorithm (MOEA). The classifiers employ an approach, denoted as scaled dominance, for defining rule weights in such a way to help minority classes to be correctly classified. In particular, we have extended PAES-RCS, an MOEA-based approach to learn concurrently the rule and data bases of FRBCs, for managing both interval type-2 fuzzy sets and unbalanced datasets. To the best of our knowledge, this is the first work that generates type-2 FRBCs by concurrently maximizing accuracy and minimizing the number of rules and the rule length with the objective of producing interpretable models of real-world skewed and incomplete financial datasets. The rule bases are generated by exploiting a rule and condition selection (RCS) approach, which selects a reduced number of rules from a heuristically generated rule base and a reduced number of conditions for each selected rule during the evolutionary process. The weight associated with each rule is scaled by the scaled dominance approach on the fuzzy frequency of the output class, in order to give a higher weight to the minority class. As regards the data base learning, the membership function parameters of the interval type-2 fuzzy sets used in the rules are learned concurrently to the application of RCS. Unbalanced datasets are managed by using, in addition to complexity, selectivity and specificity as objectives of the MOEA rather than only the classification rate. We tested our approach, named IT2-PAES-RCS, on 11 financial datasets and compared our results with the ones obtained by the original PAES-RCS with three objectives and with and without scaled dominance, the FRBCs, fuzzy association rule-based classification model for high-dimensional dataset (FARC-HD) and fuzzy unordered rules induction algorithm (FURIA), the classical C4.5 decision tree algorithm, and its cost-sensitive version. Using nonparametric statistical tests, we will show that IT2-PAES-RCS generates FRBCs with, on average, accuracy statistically comparable with and complexity lower than the ones generated by the two versions of the original PAES-RCS. Further, the FRBCs generated by FARC-HD and FURIA and the decision trees computed by C4.5 and its cost-sensitive version, despite the highest complexity, result to be less accurate than the FRBCs generated by IT2-PAES-RCS. Finally, we will highlight how these FRBCs are easily interpretable by showing and discussing one of them

    Z-number-valued rule-based decision trees

    Get PDF
    As a novel architecture of a fuzzy decision tree constructed on fuzzy rules, the fuzzy rule-based decision tree (FRDT) achieved better performance in terms of both classification accuracy and the size of the resulted decision tree than other classical decision trees such as C4.5, LADtree, BFtree, SimpleCart and NBTree. The concept of Z-number extends the classical fuzzy number to model both uncertain and partial reliable information. Z-numbers have significant potential in rule-based systems due to their strong representation capability. This paper designs a Z-number-valued rulebased decision tree (ZRDT) and provides the learning algorithm. Firstly, the information gain is used to replace the fuzzy confidence in FRDT to select features in each rule. Additionally, we use the negative samples to generate the second fuzzy numbers that adjust the first fuzzy numbers and improve the model’s fit to the training data. The proposed ZRDT is compared with the FRDT with three different parameter values and two classical decision trees, PUBLIC and C4.5, and a decision tree ensemble method, AdaBoost.NC, in terms of classification effect and size of decision trees. Based on statistical tests, the proposed ZRDT has the highest classification performance with the smallest size for the produced decision tree.The project B-TIC-590-UGR20Programa Operativo FEDER 2014-2020Regional Ministry of EconomyKnowledgeEnterprise and Universities (CECEU) of AndalusiaChina Scholarship Council (CSC) (202106070037)Project PID2019-103880RB-I00MCIN/AEI/10.13039/501100011033Andalusian government through project P20_0067

    A Fuzzy Logic-Based System for Soccer Video Scenes Classification

    Get PDF
    Massive global video surveillance worldwide captures data but lacks detailed activity information to flag events of interest, while the human burden of monitoring video footage is untenable. Artificial intelligence (AI) can be applied to raw video footage to identify and extract required information and summarize it in linguistic formats. Video summarization automation usually involves text-based data such as subtitles, segmenting text and semantics, with little attention to video summarization in the processing of video footage only. Classification problems in recorded videos are often very complex and uncertain due to the dynamic nature of the video sequence and light conditions, background, camera angle, occlusions, indistinguishable scene features, etc. Video scene classification forms the basis of linguistic video summarization, an open research problem with major commercial importance. Soccer video scenes present added challenges due to specific objects and events with similar features (e.g. “people” include audiences, coaches, and players), as well as being constituted from a series of quickly changing and dynamic frames with small inter-frame variations. There is an added difficulty associated with the need to have light weight video classification systems working in real time with massive data sizes. In this thesis, we introduce a novel system based on Interval Type-2 Fuzzy Logic Classification Systems (IT2FLCS) whose parameters are optimized by the Big Bang–Big Crunch (BB-BC) algorithm, which allows for the automatic scenes classification using optimized rules in broadcasted soccer matches video. The type-2 fuzzy logic systems would be unequivocal to present a highly interpretable and transparent model which is very suitable for the handling the encountered uncertainties in video footages and converting the accumulated data to linguistic formats which can be easily stored and analysed. Meanwhile the traditional black box techniques, such as support vector machines (SVMs) and neural networks, do not provide models which could be easily analysed and understood by human users. The BB-BC optimization is a heuristic, population-based evolutionary approach which is characterized by the ease of implementation, fast convergence and low computational cost. We employed the BB-BC to optimize our system parameters of fuzzy logic membership functions and fuzzy rules. Using the BB-BC we are able to balance the system transparency (through generating a small rule set) together with increasing the accuracy of scene classification. Thus, the proposed fuzzy-based system allows achieving relatively high classification accuracy with a small number of rules thus increasing the system interpretability and allowing its real-time processing. The type-2 Fuzzy Logic Classification System (T2FLCS) obtained 87.57% prediction accuracy in the scene classification of our testing group data which is better than the type-1 fuzzy classification system and neural networks counterparts. The BB-BC optimization algorithms decrease the size of rule bases both in T1FLCS and T2FLCS; the T2FLCS finally got 85.716% with reduce rules, outperforming the T1FLCS and neural network counterparts, especially in the “out-of-range data” which validates the T2FLCSs capability to handle the high level of faced uncertainties. We also presented a novel approach based on the scenes classification system combined with the dynamic time warping algorithm to implement the video events detection for real world processing. The proposed system could run on recorded or live video clips and output a label to describe the event in order to provide the high level summarization of the videos to the user

    Adaptive imputation of missing values for incomplete pattern classification

    Get PDF
    In classification of incomplete pattern, the missing values can either play a crucial role in the class determination, or have only little influence (or eventually none) on the classification results according to the context. We propose a credal classification method for incomplete pattern with adaptive imputation of missing values based on belief function theory. At first, we try to classify the object (incomplete pattern) based only on the available attribute values. As underlying principle, we assume that the missing information is not crucial for the classification if a specific class for the object can be found using only the available information. In this case, the object is committed to this particular class. However, if the object cannot be classified without ambiguity, it means that the missing values play a main role for achieving an accurate classification. In this case, the missing values will be imputed based on the K-nearest neighbor (K-NN) and self-organizing map (SOM) techniques, and the edited pattern with the imputation is then classified. The (original or edited) pattern is respectively classified according to each training class, and the classification results represented by basic belief assignments are fused with proper combination rules for making the credal classification. The object is allowed to belong with different masses of belief to the specific classes and meta-classes (which are particular disjunctions of several single classes). The credal classification captures well the uncertainty and imprecision of classification, and reduces effectively the rate of misclassifications thanks to the introduction of meta-classes. The effectiveness of the proposed method with respect to other classical methods is demonstrated based on several experiments using artificial and real data sets

    Probabilistic Inference from Arbitrary Uncertainty using Mixtures of Factorized Generalized Gaussians

    Full text link
    This paper presents a general and efficient framework for probabilistic inference and learning from arbitrary uncertain information. It exploits the calculation properties of finite mixture models, conjugate families and factorization. Both the joint probability density of the variables and the likelihood function of the (objective or subjective) observation are approximated by a special mixture model, in such a way that any desired conditional distribution can be directly obtained without numerical integration. We have developed an extended version of the expectation maximization (EM) algorithm to estimate the parameters of mixture models from uncertain training examples (indirect observations). As a consequence, any piece of exact or uncertain information about both input and output values is consistently handled in the inference and learning stages. This ability, extremely useful in certain situations, is not found in most alternative methods. The proposed framework is formally justified from standard probabilistic principles and illustrative examples are provided in the fields of nonparametric pattern classification, nonlinear regression and pattern completion. Finally, experiments on a real application and comparative results over standard databases provide empirical evidence of the utility of the method in a wide range of applications

    Bayesian learning of models for estimating uncertainty in alert systems: application to air traffic conflict avoidance

    Get PDF
    Alert systems detect critical events which can happen in the short term. Uncertainties in data and in the models used for detection cause alert errors. In the case of air traffic control systems such as Short-Term Conflict Alert (STCA), uncertainty increases errors in alerts of separation loss. Statistical methods that are based on analytical assumptions can provide biased estimates of uncertainties. More accurate analysis can be achieved by using Bayesian Model Averaging, which provides estimates of the posterior probability distribution of a prediction. We propose a new approach to estimate the prediction uncertainty, which is based on observations that the uncertainty can be quantified by variance of predicted outcomes. In our approach, predictions for which variances of posterior probabilities are above a given threshold are assigned to be uncertain. To verify our approach we calculate a probability of alert based on the extrapolation of closest point of approach. Using Heathrow airport flight data we found that alerts are often generated under different conditions, variations in which lead to alert detection errors. Achieving 82.1% accuracy of modelling the STCA system, which is a necessary condition for evaluating the uncertainty in prediction, we found that the proposed method is capable of reducing the uncertain component. Comparison with a bootstrap aggregation method has demonstrated a significant reduction of uncertainty in predictions. Realistic estimates of uncertainties will open up new approaches to improving the performance of alert systems
    • …
    corecore