1,279 research outputs found

    Diabetes Classification using Fuzzy Logic and Adaptive Cuckoo Search Optimization Techniques

    Diabetes is now detected in patients worldwide. The main reason for its growth is the body's inability to produce enough insulin. As a result, the majority of people today are either diabetic or pre-diabetic. It is therefore necessary to develop a system that can detect and classify diabetes effectively and efficiently within an optimal time period. The proposed system uses fuzzy logic and the adaptive cuckoo search (ACS) optimization algorithm for diabetes classification. The work is carried out in several steps: first, dimensionality reduction of the training dataset and optimal fuzzy rule generation via ACS optimization; next, fuzzy model design and testing on the fuzzified testing dataset. In this paper, the outcome of the FF-BAT algorithm is compared with that of the ACS algorithm. The experimental results indicate that the ACS algorithm appears to perform better than the FF-BAT algorithm.
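
    The abstract does not say which membership functions the fuzzy model uses. As a minimal sketch of the fuzzification step, assuming a triangular membership function (a common choice, not confirmed by the source), the degree to which a measurement belongs to a fuzzy set could be computed as follows. The function name `triangular` and the breakpoints are hypothetical and carry no clinical meaning.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 outside (a, c), rising linearly to 1 at the peak b.

    Assumes a < b < c. This shape is an illustrative assumption, not the
    membership function used in the paper.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical breakpoints for a "medium" fuzzy set; illustrative values only.
medium_degree = triangular(110.0, 90.0, 120.0, 150.0)
print(medium_degree)
```

    In a fuzzy classifier, each input attribute would typically be mapped through several such functions (for example low, medium, high) before the rules are evaluated.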

    Integrated bio-search approaches with multi-objective algorithms for optimization and classification problem

    Optimal feature selection is difficult yet crucial, particularly for classification. Traditional feature selection methods operate independently and generate collections of irrelevant features, which degrades classification accuracy. The goal of this paper is to leverage bio-inspired search algorithms, together with wrappers, to optimize the multi-objective algorithms ENORA and NSGA-II so that they generate an optimal set of features. The main steps are to identify the ideal combination of ENORA and NSGA-II with suitable bio-search algorithms, in which multiple subset generation is implemented, and then to validate the optimum feature set through subset evaluation. Eight comparison datasets of various sizes were deliberately selected for testing. The results show that the ideal combination of the multi-objective algorithms ENORA and NSGA-II with the selected bio-inspired search algorithm is promising for achieving a better optimal solution (i.e., the best features with higher classification accuracy) on the selected datasets. This finding implies that bio-inspired wrapper/filter algorithms can boost the efficiency of ENORA and NSGA-II for feature selection and classification.
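
    Multi-objective feature selection of the kind described above typically trades classification error against subset size. The sketch below shows only the non-dominated (Pareto) filtering idea that NSGA-II builds on; it is not the authors' implementation, and the candidate values, the names `dominates` and `pareto_front`, and the choice of the two objectives are assumptions made for illustration.

```python
from typing import List, Tuple

# Each candidate is (error_rate, n_features); both objectives are minimized.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the non-dominated candidates, preserving input order."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical wrapper evaluations of five feature subsets (made-up numbers).
subsets = [(0.10, 12), (0.12, 5), (0.10, 8), (0.20, 5), (0.08, 20)]
front = pareto_front(subsets)
print(front)
```

    Subsets that survive the filter represent different compromises between accuracy and compactness; a full NSGA-II run would additionally rank dominated solutions into successive fronts and apply crowding-distance selection.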

    A Survey of Feature Selection Strategies for DNA Microarray Classification

    Classification tasks in bioinformatics, which use DNA microarray technology to predict or diagnose disease in patients at an early stage, are difficult and challenging. The defining characteristics of DNA microarray data are a large number of features and small sample sizes, so classification confronts the "curse of dimensionality": the computational cost is high and the discovery of biomarkers is difficult. Feature selection algorithms can be employed to reduce the dimensionality and find the significant features without harming classification performance. Feature selection helps decrease computational time by removing irrelevant and redundant features from the data. This study briefly surveys popular feature selection methods for classifying DNA microarray data, namely filter, wrapper, embedded, and hybrid approaches. Furthermore, it describes the steps of the feature selection process used to accomplish classification tasks and their relationships to other components such as datasets, cross-validation, and classifier algorithms. In the case study, four different feature selection methods are applied to two DNA microarray datasets, and their performance is evaluated and discussed in terms of classification accuracy, stability, and the size of the selected feature subset. Keywords: Brief survey; DNA microarray data; feature selection; filter methods; wrapper methods; embedded methods; hybrid methods. DOI: 10.7176/CEIS/14-2-01 Publication date: March 31st 202

    An adaptive clustering and classification algorithm for Twitter data streaming in Apache Spark

    Big data continuously generated by social networking sites such as Twitter and Facebook has been an attractive subject of investigation for researchers in recent decades because of its timeliness, accessibility, and popularity; however, there may be a trade-off in accuracy. Moreover, the clustering of Twitter data has caught the attention of researchers. As such, an algorithm that can cluster data in less computational time, especially for data streaming, is needed. The presented adaptive clustering and classification algorithm, used for data streaming in Apache Spark to overcome the existing problems, proceeds in two phases. In the first phase, the pre-processed Twitter input data is clustered using Improved Fuzzy C-means clustering, and the clustering is further improved by an Adaptive Particle Swarm Optimization (PSO) algorithm. The clustered data stream is then evaluated using the Spark engine. In the second phase, the pre-processed Higgs input data is classified using a modified support vector machine (MSVM) classifier with grid search optimization. Finally, the optimized data is evaluated in the Spark engine, and the resulting values are used to derive the confusion matrix. The proposed work uses the Twitter dataset and the Higgs dataset for data streaming in Apache Spark. The computational experiments demonstrate the superiority of the presented approach over existing methods in terms of precision, recall, F-score, convergence, ROC curve, and accuracy.
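
    Precision, recall, and F-score all follow directly from the confusion matrix mentioned above. As a minimal sketch for the binary case, using the standard definitions and made-up counts (the function name `precision_recall_f1` and the numbers are hypothetical, not results from the paper):

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from binary confusion-matrix counts.

    tp, fp, fn are the true-positive, false-positive, and false-negative counts.
    Each metric falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=40, fp=10, fn=10)
print(p, r, f1)
```

    For multi-class problems the same formulas are usually applied per class and then averaged.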

    Enhancing Feature Selection Accuracy using Butterfly and Lion Optimization Algorithm with Specific Reference to Psychiatric Disorder Detection & Diagnosis

    As the complexity of medical computing increases, so does the use of intelligent methods based on soft computing. During the current decade, this intelligent computing has involved various meta-heuristic algorithms for optimization, and many new meta-heuristic algorithms have been proposed in the last few years. The dimensionality of the data has also grown. Feature selection plays an important role for such high-dimensional data; in intelligent computation it is an important phase that follows pre-processing. The success of any model depends on how well the optimization algorithm is chosen, and sometimes a single optimization algorithm is not enough to produce a better result. In this paper, meta-heuristic algorithms, namely the butterfly optimization algorithm and an enhanced lion optimization algorithm, are used to achieve better accuracy in feature selection. The study focuses on nature-based integrated meta-heuristic algorithms such as Butterfly Optimization and lion-based optimization, and also analyzes various other optimization algorithms. It shows how integrated methods help enhance the accuracy of a computing model for solving complex problems. Experimental results are presented for a proposed hybrid model applied to two major psychiatric disorders: autism spectrum disorder and Parkinson's disease.

    Predicting Arrhythmia Based on Machine Learning Using Improved Harris Hawk Algorithm

    Arrhythmia is widely recognized as a prominent and lethal ailment on a global scale, resulting in a significant number of fatalities annually. Timely identification of this ailment is crucial for preserving lives. Machine learning (ML), a branch of artificial intelligence (AI), has emerged as a highly efficient and cost-effective method for disease detection. The objective of this work is to develop an ML model capable of accurately predicting heart disease using the Arrhythmia dataset, with the aim of achieving optimal performance. The performance of the model is strongly influenced by the choice of machine learning method and by the features in the dataset used for training. To mitigate overfitting caused by the high dimensionality of the features in the Arrhythmia dataset, the dataset was reduced to a lower-dimensional subspace using the improved Harris hawk optimization algorithm (iHHO). The Harris hawk algorithm exhibits a rapid convergence rate and a notable degree of adaptability in identifying optimal features. The performance of the models built on the feature-selected dataset with various machine learning techniques was evaluated and compared. In this work, a total of seven classifiers (SVM, GB, GNB, RF, LR, DT, and KNN) are used to classify the data produced by the iHHO algorithm. The results show improvements of 3%, 4%, 4%, 9%, 8%, 3%, and 9% for the KNN, RF, GB, SVM, LR, DT, and GNB classifiers, respectively.

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore’s Law. Big Data Analytics is becoming particularly important for enterprise business. Modern computational technologies provide effective tools to help understand the huge volumes of accumulated data and to leverage this information for insights into the finance industry. Because the finance industry has no physical products to manufacture, data has become the most valuable asset of financial organisations seeking actionable business insights. This is where data mining techniques come to their rescue, by allowing access to the right information at the right time. The finance industry uses these techniques in areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing, and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared with loan prediction, money laundering, and time series prediction. Owing to the dynamics, uncertainty, and variety of the data, nonlinear mapping techniques have been studied more deeply than linear techniques. The review also finds that hybrid methods are the most accurate in prediction, closely followed by neural network techniques. This survey offers an overview of the applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area. In particular, it gives beginners who want to work in computational finance a good picture of the data mining techniques used in the field.