87 research outputs found

    A review of homogenous ensemble methods on the classification of breast cancer data

    In recent decades, data mining technology has emerged to assist people in making informed decisions. Data mining is a concept established by computer scientists to enable secure and reliable classification and deduction of data. In the medical field, data mining methods can assist in various medical diagnoses, including breast cancer. As the field has evolved, ensemble methods have been proposed to achieve better classification performance. These techniques combine multiple classifiers in a single model. This review of homogeneous ensemble methods for breast cancer classification is carried out to assess their overall performance. The results of the reviewed ensemble techniques, such as Random Forest and XGBoost, show that ensemble methods can outperform single-classifier methods. The reviewed ensemble methods have pros and cons and are useful for solving breast cancer classification problems. The methods are discussed thoroughly to examine their overall classification performance.
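
As a rough illustration of what "homogeneous ensemble" means in the abstract above, the sketch below bags several classifiers of the same type (one-feature threshold stumps) and combines them by majority vote. It is not the method of any reviewed paper: the toy data, the stump learner, and the names `train_stump`, `bagged_stumps`, and `predict` are all assumptions made for illustration.

```python
import random
from collections import Counter

def train_stump(samples):
    """Fit a one-feature threshold classifier: predict 1 when x >= threshold."""
    best_threshold, best_errors = None, None
    for threshold in sorted({x for x, _ in samples}):
        errors = sum((x >= threshold) != bool(y) for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagged_stumps(samples, n_models=15, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models

def predict(models, x):
    """Combine the stumps by majority vote (n_models is odd, so no ties)."""
    votes = Counter(int(x >= threshold) for threshold in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (feature value, class label).
toy_samples = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0),
               (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
ensemble = bagged_stumps(toy_samples)
print(predict(ensemble, 0.5), predict(ensemble, 5.0))
```

Every base learner here is the same kind of model trained on a different resample, which is the sense in which Random Forest is a homogeneous ensemble; real implementations use full decision trees and random feature subsets.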

    A hybridization of butterfly optimization algorithm and harmony search for fuzzy modelling in phishing attack detection

    Fuzzy systems are among the most widely used systems in decision-making and classification because they are easy to understand: the way they work is close to how humans think. A fuzzy system relies on human experts to set the membership values used to make decisions. However, it is hard to determine the fuzzy parameters manually in a complex problem; the process of generating these parameters is called fuzzy modelling. An optimization method is therefore needed, and one of the best methods to apply is the Butterfly Optimization Algorithm (BOA). In this paper, BOA was improved by combining it with Harmony Search (HS) in order to achieve optimal results in fuzzy modelling. The advantages of both algorithms are used to balance exploration and exploitation in the search process. Two datasets from the UCI Machine Learning Repository were used: the Website Phishing Dataset (WPD) and the Phishing Websites Dataset (PWD). The average accuracy for WPD and PWD was 98.69% and 98.80%, respectively. In conclusion, the proposed method shows promising and effective results compared to other methods.

    Attribute related methods for improvement of ID3 Algorithm in classification of data: A review

    The decision tree is an important data mining method for solving classification problems. Several learning algorithms implement decision trees, but the most commonly used is the ID3 algorithm. Nevertheless, ID3 has limitations that can affect classification performance. Information gain, which ID3 uses as its attribute selection criterion, does not assess the relationship between the classification and the dataset's attributes. The objective of this study is to apply attribute-related methods to address the shortcomings of ID3, such as its tendency to select attributes with many values, and to improve its performance. The attribute-related techniques studied in this paper are mutual information, association function, and attribute weighting. All of these techniques help the decision tree find the best attribute at each level of the tree. Results of the reviewed techniques show that attribute selection methods can resolve the limitations of ID3 and improve its performance. All of the reviewed techniques have advantages and disadvantages and are useful for solving classification problems. The implementation of these techniques with ID3 is discussed thoroughly.
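
The bias toward many-valued attributes mentioned in the abstract above can be seen in a small sketch of the standard information gain computation. The toy rows and the attribute names (`id`, `windy`, `play`) are hypothetical and chosen only to show the effect; none of the attribute-related remedies from the reviewed paper are implemented here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical toy data: "id" is unique per row, so it splits perfectly.
toy_rows = [
    {"id": 1, "windy": "no", "play": "yes"},
    {"id": 2, "windy": "no", "play": "yes"},
    {"id": 3, "windy": "yes", "play": "no"},
    {"id": 4, "windy": "no", "play": "no"},
]
gain_id = information_gain(toy_rows, "id", "play")
gain_windy = information_gain(toy_rows, "windy", "play")
print(gain_id, gain_windy)
```

Because every `id` value isolates a single row, the split looks perfect and `id` receives the maximum possible gain even though it cannot generalise; this is the behaviour the attribute-related methods are meant to correct.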

    An Improved Algorithm for Optimising the Production of Biochemical Systems

    This chapter presents an improved method for the constrained optimisation of biochemical systems production. The aim of the proposed method is to maximise production while minimising the total amount of chemical concentrations involved. The proposed method models biochemical systems with ordinary differential equations. The optimisation process becomes complex for large biochemical systems that contain many chemicals. In addition, several constraints, such as the steady-state constraint and the constraint on chemical concentrations, add to the computational complexity and difficulty of the optimisation process. This chapter treats the biochemical system as a system of nonlinear equations, which is solved with the Newton method. Then, both a genetic algorithm and a cooperative co-evolutionary algorithm are applied to fine-tune the components of the biochemical system to maximise production and minimise the total amount of chemical concentrations involved. Two biochemical systems were used, namely ethanol production in the Saccharomyces cerevisiae pathway and tryptophan production in the Escherichia coli pathway. In evaluating the performance of the proposed method, several comparisons with other works were performed, and the proposed method demonstrated its effectiveness in maximising production and minimising the total amount of chemical concentrations involved.
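
The Newton step for a system of nonlinear equations, as referred to in the abstract above, can be sketched for the two-variable case. The toy system below (a circle intersected with a line) is only a stand-in and is not a biochemical model; the function name `newton_2d` and its parameters are assumptions for illustration.

```python
def newton_2d(f, jacobian, x0, y0, tol=1e-10, max_iter=50):
    """Solve f(x, y) = (0, 0) by Newton's method for two equations."""
    x, y = x0, y0
    for _ in range(max_iter):
        f1, f2 = f(x, y)
        if abs(f1) < tol and abs(f2) < tol:
            return x, y
        (a, b), (c, d) = jacobian(x, y)
        det = a * d - b * c
        if det == 0:
            raise ValueError("singular Jacobian")
        # Solve J * (dx, dy) = -(f1, f2) with Cramer's rule.
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + c * f1) / det
        x, y = x + dx, y + dy
    raise RuntimeError("Newton iteration did not converge")

# Toy system (an assumption): x^2 + y^2 = 4 and x = y.
def toy_system(x, y):
    return x * x + y * y - 4.0, x - y

def toy_jacobian(x, y):
    return (2.0 * x, 2.0 * y), (1.0, -1.0)

root_x, root_y = newton_2d(toy_system, toy_jacobian, 1.0, 1.0)
print(root_x, root_y)
```

In the setting the abstract describes, the equations would presumably come from the steady-state conditions of the ODE model, and a larger system would need a general linear solver in place of Cramer's rule.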

    Review on Intrusion Detection System Based on The Goal of The Detection System

    This paper presents an extensive review of intrusion detection systems (IDS). Previous studies review IDS based on the approaches (algorithms) used or on the types of intrusion. This paper instead reviews IDS based on their goal (accuracy and time), which is its main objective. First, the IDS are classified into two types according to the goal they intend to achieve. These two types are then reviewed in detail, followed by a comparison of earlier IDS studies based on their reported results. The comparison shows that studies focusing on detection time achieve lower detection accuracy than other studies.

    Loan eligibility classification using logistic regression

    Machine learning is becoming increasingly vital in various domains, including loan eligibility classification, due to its ability to analyze large amounts of data, develop predictive models, adapt to new information, and automate processes. This research paper presents a study on loan eligibility classification using a machine learning approach, comparing the performance of three machine learning algorithms: Logistic Regression (LR), Random Forest, and Decision Tree. The research was conducted using Python and Jupyter Notebook for data analysis and model development. The models were evaluated on the testing set using accuracy, precision, recall, and F1-score. The performance of the models was compared to identify the most effective algorithm for loan eligibility classification. Among the three approaches, the LR model appears to be the most effective at classifying loan eligibility, with 82% accuracy, 82% recall, 81% precision, and a 79% F1-score.
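
For reference, the four evaluation metrics named in the abstract above can be computed from confusion-matrix counts as in the minimal sketch below. The labels are hypothetical toy values, not the paper's data, and the function name `classification_metrics` is an assumption.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 1 = eligible, 0 = not eligible.
toy_true = [1, 1, 1, 0, 0, 1, 0, 1]
toy_pred = [1, 0, 1, 0, 1, 1, 0, 1]
toy_metrics = classification_metrics(toy_true, toy_pred)
print(toy_metrics)
```

This is the single-class binary form; when scores are averaged over classes with weights, the reported F1 need not lie between the reported precision and recall.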

    Review of the machine learning methods in the classification of phishing attack

    Computer networks have developed rapidly, as shown by the growing number of computer users around the world who need to connect to the Internet. Internet access has become very important, whether for work or for social media. However, this widespread use puts the privacy of computer users at risk, especially for users who do not install security systems on their computers. This weakness allows hackers to break in and carry out network attacks. Such attacks are very dangerous for Internet users because hackers can steal confidential information such as bank or social media login credentials. Phishing is one such attack. The goal of this study is to review the types of phishing attacks and the current methods used to prevent them. Based on the literature, machine learning is widely used to prevent phishing attacks, and several algorithms can be used for this purpose. This study focuses on one such algorithm, and the methods for implementing it are discussed in detail.

    Optimization of Biochemical Systems Production Using Combination of Newton Method and Particle Swarm Optimization

    This paper presents in detail an improved method that combines the Newton method with the Particle Swarm Optimization (PSO) algorithm to optimize the production of biochemical systems. Optimizing production becomes difficult when it involves large biochemical systems with many components and many interactions between chemicals. The presence of two objectives and several constraints makes the optimization harder still. To overcome these difficulties, the proposed method treats the biochemical system as a system of nonlinear equations and then optimizes it using PSO. The method aims to improve the biochemical system's production while reducing the total chemical concentration involved. The Newton method is used to solve the system of nonlinear equations, while the PSO algorithm is used to fine-tune the variables in that system. The Newton method was chosen for its simplicity in solving systems of nonlinear equations, and PSO for its straightforward implementation and effectiveness in optimization. To evaluate the proposed method, two biochemical systems were used: the E. coli pathway and the S. cerevisiae pathway. The experimental results showed that the proposed method achieved the best result compared with other works.
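
The PSO part of the abstract above can be illustrated with a minimal, generic particle swarm sketch. It minimises a toy quadratic rather than a biochemical objective and leaves out the Newton step and the constraints; the parameter values (inertia 0.7, acceleration coefficients 1.5) and all names are assumptions, not the paper's settings.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iterations=200,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimise objective(position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_cost = [objective(p) for p in positions]
    best_index = min(range(n_particles), key=personal_cost.__getitem__)
    global_best = personal_best[best_index][:]
    global_cost = personal_cost[best_index]
    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d]))
                lo, hi = bounds[d]
                moved = positions[i][d] + velocities[i][d]
                positions[i][d] = min(max(moved, lo), hi)  # clamp to bounds
            cost = objective(positions[i])
            if cost < personal_cost[i]:
                personal_best[i] = positions[i][:]
                personal_cost[i] = cost
                if cost < global_cost:
                    global_best = positions[i][:]
                    global_cost = cost
    return global_best, global_cost

# Toy objective (an assumption): minimum at (1, -2).
def toy_objective(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best_position, best_cost = pso_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_position, best_cost)
```

In a combined scheme like the one the abstract describes, the objective evaluated for each particle would presumably involve solving the nonlinear system first, which this sketch does not attempt.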