59 research outputs found

    Combining rough and fuzzy sets for feature selection

    Rule pruning techniques in the ant-miner classification algorithm and its variants: A review

    Rule-based classification is considered an important task of data classification. The ant-mining rule-based classification algorithm, inspired by the ant colony optimization algorithm, performs comparably to existing methods in the literature and outperforms them in some application domains. One problem that often arises in any rule-based classification is overfitting. Rule pruning is a framework to avoid overfitting. Furthermore, we find that the influence of rule pruning in ant-miner classification algorithms is equivalent to that of local search in stochastic methods when they aim to further improve each candidate solution. In this paper, we review the history of the pruning techniques in ant-miner and its variants. These techniques are classified into post-pruning, pre-pruning and hybrid-pruning. In addition, we compare and analyse the advantages and disadvantages of these methods. Finally, future research directions for finding new hybrid rule pruning techniques are provided.

    Machine Learning for a Medical Prediction System “Breast Cancer Detection” as a use case

    Breast cancer is a widespread and serious illness, highlighting the importance of an early detection tool that can provide prognostic information and suggest necessary lifestyle changes to prevent its advancement. Environmental changes in our daily life have also significantly increased the chances of getting cancer at an early stage of life. Machine learning has become an indispensable tool in addressing this pressing need, enhancing human capabilities and offering greater automation with reduced errors. In this article, a breast cancer detection and prediction system has been created, utilizing diverse machine learning models including KNN, LR, and XGBoost.

    Ant colony optimization algorithm for rule based classification: Issues and potential

    Classification rule discovery using ant colony optimization (ACO) imitates the foraging behavior of real ant colonies. It is considered one of the successful swarm intelligence metaheuristics for data classification. ACO has gained importance because of its stochastic feature and iterative adaptation procedure based on positive feedback, both of which allow for the exploration of a large area of the search space. Nevertheless, ACO also has several drawbacks that may reduce the classification accuracy and increase the computational time of the algorithm. This paper presents a review of related work on ACO rule classification which emphasizes the types of ACO algorithms and their issues. Potential solutions that may be considered to improve the performance of ACO algorithms in the classification domain are also presented. Furthermore, this review can be used as a source of reference by other researchers developing new ACO algorithms for rule classification.

    A Breast Cancer Detection Problem using various Machine Learning Techniques in the Context of Health Prediction System

    Today, breast cancer is one of the most common diseases; it can cause certain complications and, in the worst case, death. Thus, there is an urgent need for a diagnosis tool that can help doctors detect the disease at an early stage and recommend the necessary lifestyle changes to stop the progression of the disease. The likelihood of developing cancer at a young age has also been greatly increased by environmental changes in our everyday lives. Machine learning is urgently needed today to enhance human effort and offer higher automation with fewer errors. In this article, a breast cancer detection and prediction system is developed based on machine learning models (SVM, NB, AdaBoost). The achieved accuracies of the developed models are as follows: SVM achieved an overall score of 98.82%, NB achieved an overall score of 97.71%, and AdaBoost achieved an overall score of 97.71%.

    Neutrosophic rule-based prediction system for toxicity effects assessment of biotransformed hepatic drugs

    Measuring toxicity is an important step in drug development. However, the current experimental methods used to estimate drug toxicity are expensive and need high computational efforts. Therefore, these methods are not suitable for large-scale evaluation of drug toxicity. As a consequence, there is a high demand for computational models that can predict drug toxicity risks. In this paper, we used a dataset that consists of 553 drugs that are biotransformed in the liver.

    Clasificador difuso para diagnóstico de enfermedades

    This paper presents the application of a new fuzzy identification method to solve classification problems. The model, or fuzzy classifier, obtained after the training process contains triangular sets with an overlap of 0.5 for the antecedent and singleton sets for the consequent. Rules are evaluated with an average operator instead of a T-norm. The consequents are adjusted using recursive least squares. The proposed method achieves higher accuracy than existing methods, using a small number of rules and parameters, without sacrificing the interpretability of the fuzzy model. The proposed approach is applied to two classic classification problems, Pima Indian Diabetic and the Dermatology Problem, to show its performance and to compare the results with those obtained by other researchers.
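
    The inference step described in this abstract (triangular antecedent sets that cross at 0.5, an average operator in place of a T-norm, singleton consequents) can be sketched roughly as below. This is an illustration under assumptions, not the authors' implementation: the function names, the evenly spaced partition, and the hand-picked consequent values are hypothetical, and the recursive least squares fitting of the consequents is left out.

```python
from itertools import product


def triangular_partition(lo, hi, n_sets):
    """Peak positions of n_sets evenly spaced triangular sets on [lo, hi] (assumed layout)."""
    step = (hi - lo) / (n_sets - 1)
    return [lo + i * step for i in range(n_sets)]


def memberships(x, peaks):
    """Membership of x in each triangular set; adjacent sets cross at 0.5."""
    step = peaks[1] - peaks[0]
    return [max(0.0, 1.0 - abs(x - p) / step) for p in peaks]


def infer(inputs, partitions, consequents):
    """Weighted average of singleton consequents.

    Each rule's firing strength is the mean of its antecedent memberships
    (an average operator rather than a T-norm such as min or product).
    """
    mu = [memberships(x, peaks) for x, peaks in zip(inputs, partitions)]
    num = den = 0.0
    # One rule per combination of antecedent sets, indexed by a tuple.
    for rule in product(*(range(len(p)) for p in partitions)):
        strength = sum(mu[d][i] for d, i in enumerate(rule)) / len(rule)
        num += strength * consequents[rule]
        den += strength
    return num / den


# Hypothetical two-input model with two sets per input; consequents chosen by hand.
partitions = [triangular_partition(0.0, 1.0, 2), triangular_partition(0.0, 1.0, 2)]
consequents = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 1.0, (1, 1): 1.0}
score = infer([0.0, 0.0], partitions, consequents)
```

    Note that with the average operator a rule still fires partially when one of its antecedents has zero membership, which is why `score` is not exactly 0 at the corner input; a class label could then be taken by thresholding the score.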

    Heuristic-based feature selection for rough set approach

    The paper presents a research methodology dedicated to the application of greedy heuristics as a way of gathering information about available features. The discovered knowledge, represented in the form of generated decision rules, was employed to support the feature selection and reduction process for induction of decision rules with the classical rough set approach. Experiments were run on input data sets discretised by several methods. Experimental results show that elimination of less relevant attributes through the proposed methodology led to inferring rule sets with reduced cardinalities, while maintaining the rule quality necessary for satisfactory classification.

    An adaptive ant colony optimization algorithm for rule-based classification

    Classification is an important data mining task with different applications in many fields. Various classification algorithms have been developed to produce classification models with high accuracy. Unlike other complex and difficult classification models, rule-based classification algorithms produce models which are understandable for users. Ant-Miner is a variant of ant colony optimisation and a prominent intelligent algorithm widely used in rule-based classification. However, Ant-Miner suffers from overfitting and easily falls into local optima, which results in low classification accuracy and complex classification rules. In this study, a new Ant-Miner classifier is developed, named Adaptive Genetic Iterated-AntMiner (AGI-AntMiner), that aims to avoid local optima and overfitting problems. The components of AGI-AntMiner include: i) an Adaptive AntMiner, which is a pre-pruning technique to dynamically select the appropriate threshold based on the quality of the rules; ii) a Genetic AntMiner that improves post-pruning by adding/removing terms in a dual manner; and iii) an Iterated Local Search-AntMiner that improves exploitation based on a multiple-neighbourhood structure. The proposed AGI-AntMiner algorithm is evaluated on 16 benchmark datasets from medical, financial, gaming and social domains obtained from the University of California Irvine repository. The algorithm's performance was compared with other variants of Ant-Miner and state-of-the-art rule-based classification algorithms based on classification accuracy and model complexity. Experimental results showed that the proposed AGI-AntMiner algorithm is superior in two aspects. Hybridization of local search in AGI-AntMiner has improved the exploitation mechanism, which leads to the discovery of more accurate classification rules. The new pre-pruning and post-pruning techniques have improved the pruning ability to produce shorter classification rules which are easier for users to interpret. Thus, the proposed AGI-AntMiner algorithm is capable of conducting an efficient search for the best classification rules that balance classification accuracy and model complexity to overcome overfitting and local optima problems.