12,387 research outputs found

    Identifying Effective Features and Classifiers for Short Term Rainfall Forecast Using Rough Sets Maximum Frequency Weighted Feature Reduction Technique

    Precise rainfall forecasting is a common challenge in meteorological prediction across the globe. Because rainfall forecasting involves rather complex dynamic parameters, demand for novel approaches that improve forecasting accuracy has increased. Recently, Rough Set Theory (RST) has attracted a wide variety of scientific applications and is extensively adopted in decision support systems. Although there are several weather prediction techniques in the existing literature, identifying significant inputs for modelling effective rainfall prediction is not addressed in the present mechanisms. Therefore, this investigation examines the feasibility of using rough set based feature selection and data mining methods, namely Naïve Bayes (NB), Bayesian Logistic Regression (BLR), Multi-Layer Perceptron (MLP), J48, Classification and Regression Tree (CART), Random Forest (RF), and Support Vector Machine (SVM), to forecast rainfall. Feature selection, or feature reduction, is the process of identifying a significant feature subset that characterizes the information system as well as the complete feature set does. This paper introduces a novel rough set based Maximum Frequency Weighted (MFW) feature reduction technique for finding an effective feature subset for modelling an efficient rainfall forecast system. The experimental analysis and results indicate substantial improvements in the prediction models when they are trained using the selected feature subset. The CART and J48 classifiers achieved improved accuracies of 83.42% and 89.72%, respectively. From the experimental study, relative humidity2 (a4) and solar radiation (a6) were identified as the effective parameters for modelling rainfall prediction.
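
    The requirement that a reduced feature subset "characterize the information system as a complete feature set" is commonly checked in rough set theory with the dependency degree: the fraction of objects whose decision is fully determined by the chosen attributes. The sketch below illustrates only that generic check, not the MFW technique itself, which the abstract does not describe. The attribute names, values, and toy rows are hypothetical.

```python
from collections import defaultdict

def dependency_degree(rows, decisions, attrs):
    """Fraction of objects whose decision is determined by `attrs` alone.

    Objects with identical values on `attrs` form one equivalence class;
    a class counts toward the positive region only if all of its objects
    share the same decision.
    """
    classes = defaultdict(list)
    for row, decision in zip(rows, decisions):
        key = tuple(row[a] for a in attrs)
        classes[key].append(decision)
    positive = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return positive / len(rows)

# Hypothetical toy weather table (not data from the paper).
rows = [
    {"humidity": "high", "radiation": "low",  "wind": "calm"},
    {"humidity": "high", "radiation": "low",  "wind": "strong"},
    {"humidity": "low",  "radiation": "high", "wind": "calm"},
    {"humidity": "low",  "radiation": "low",  "wind": "calm"},
    {"humidity": "high", "radiation": "high", "wind": "strong"},
]
decisions = ["rain", "rain", "dry", "dry", "dry"]

full = dependency_degree(rows, decisions, ["humidity", "radiation", "wind"])
subset = dependency_degree(rows, decisions, ["humidity", "radiation"])
single = dependency_degree(rows, decisions, ["humidity"])
```

    In this toy table the two-attribute subset has the same dependency degree as the full attribute set, so dropping the third attribute loses no decision information, while a single attribute on its own does not suffice.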

    A new model for large dataset dimensionality reduction based on teaching learning-based optimization and logistic regression

    Breast cancer (BC) is one of the human diseases with a high mortality rate each year. Among all forms of cancer, BC is the commonest cause of death among women globally. Data mining and classification methods are among the effective ways of classifying data. These methods are particularly efficient in the medical field due to the presence of irrelevant and redundant attributes in medical datasets. Such redundant attributes are not needed to obtain an accurate estimation of disease diagnosis. Teaching learning-based optimization (TLBO) is a new metaheuristic that has been successfully applied to several intractable optimization problems in recent years. This paper presents the use of a multi-objective TLBO algorithm for the selection of feature subsets in automatic BC diagnosis. The logistic regression (LR) method was used for the classification task in this work. The results show that the proposed method produced better classification accuracy on the BC dataset (classified into malignant and benign). This result shows that the proposed TLBO is an efficient feature optimization technique for sustaining data-based decision-making systems.

    A voting-based machine learning approach for classifying biological and clinical datasets.

    BACKGROUND: Different machine learning techniques have been proposed to classify a wide range of biological/clinical data. Given the practicability of these approaches, various software packages have also been designed and developed. However, the existing methods suffer from several limitations, such as overfitting on a specific dataset, ignoring feature selection in the preprocessing step, and losing performance on large datasets. To tackle these restrictions, in this study we introduce a machine learning framework consisting of two main steps. First, our previously suggested optimization algorithm (Trader) was extended to select a near-optimal subset of features/genes. Second, a voting-based framework was proposed to classify the biological/clinical data with high accuracy. To evaluate the efficiency of the proposed method, it was applied to 13 biological/clinical datasets, and the outcomes were comprehensively compared with the prior methods. RESULTS: The results demonstrated that the Trader algorithm could select a near-optimal subset of features with a significance level of p-value < 0.01 relative to the compared algorithms. Additionally, on the large-size datasets, the proposed machine learning framework improved on prior studies by ~10% in terms of the mean values of accuracy, precision, recall, specificity, and F-measure under fivefold cross-validation. CONCLUSION: Based on the obtained results, it can be concluded that a proper configuration of efficient algorithms and methods can increase the prediction power of machine learning approaches and help researchers in designing practical diagnosis health care systems and offering effective treatment plans.
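
    The abstract does not say how the votes are combined, so the following is only a minimal sketch of one common choice, hard majority voting over the label predictions of several classifiers. The function name and the tie-breaking rule (the earliest classifier wins) are assumptions made for illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list of voted labels.

    `predictions` holds one list of labels per classifier, all of the same
    length. Ties are broken in favour of the earliest classifier in the
    list (an assumption of this sketch).
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        winner = next(label for label in labels if counts[label] == top)
        voted.append(winner)
    return voted

# Hypothetical predictions from three classifiers on three samples.
preds = [
    ["benign", "malignant", "benign"],
    ["benign", "benign", "malignant"],
    ["malignant", "malignant", "malignant"],
]
combined = majority_vote(preds)
```

    Soft voting, which averages predicted class probabilities instead of counting labels, is an equally plausible reading of "voting-based".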

    Water filtration by using apple and banana peels as activated carbon

    A water filter is an important device for reducing the contaminants in raw water. Activated carbon from charcoal is used to absorb the contaminants. Fruit peels are a suitable alternative carbon source to substitute for charcoal. The main goal is to determine the role of apple and banana peel powder as activated carbon in a water filter. The peels are dried and blended into powder so that they can absorb the contaminants. The results for raw water before and after filtering are then compared. After filtering, the pH reading was 6.8, which is within the normal pH range, and the recorded turbidity was 658 NTU. As for colour, the filtered water was clearer than the raw water. This study found that fruit peels such as banana and apple are an effective substitute for charcoal as a natural absorbent.

    A Hybrid Fish – Bee Optimization Algorithm for Heart Disease Prediction using Multiple Kernel SVM Classifier

    A heart disease detection model determines a patient's heart disease status for use by medical experts. To predict heart disease, existing techniques use an optimal classifier. Although the existing techniques achieve good results, they have some disadvantages. To address those drawbacks, the suggested technique uses an effective method for heart disease prediction. First, the input information is preprocessed, and the preprocessed result is forwarded to the feature selection process. For feature selection over the high-dimensional medical data, a hybrid fish-bee optimization algorithm (HFSBEE) is used. The proposed algorithm parallelizes the two algorithms so that the local behavior of the artificial bee colony algorithm and the global search of fish swarm optimization are both used effectively to find the optimal solution. Classification is performed by passing the medical dataset to the multi-kernel support vector machine (MKSVM). The performance of the proposed technique is evaluated in terms of accuracy, sensitivity, specificity, precision, recall and F-measure. For test analysis, datasets from the UCI machine learning repository are used, including Cleveland, Hungarian and Switzerland. The experimental outcome shows that the presented technique achieves an accuracy of 97.68% on the Cleveland dataset, compared with 96.03% for the existing hybrid kernel support vector machine (HKSVM) method and 62.25% for the optimal rough fuzzy classifier. The proposed method is implemented on the MATLAB platform. Keywords: Artificial bee colony algorithm, Fish swarm optimization, Multi kernel support vector machine, Optimal rough fuzzy, Cleveland, Hungarian and Switzerland
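
    The evaluation measures named in the abstract all follow from the four counts of a binary confusion matrix. The sketch below uses the standard textbook definitions. It is written in Python for illustration (the paper reports a MATLAB implementation), and the counts in the usage lines are made up, not results from the paper. Note that sensitivity and recall are the same quantity.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # also called recall
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "recall": sensitivity,
        "f_measure": ratio(2 * precision * sensitivity,
                           precision + sensitivity),
    }

# Hypothetical counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```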

    Classification Arabic Twitter User’s Insights Using Rough Set Theory

    Nowadays, people around the world use social media to share their daily affairs. Arabic Twitter, for example, is a platform where users read, reply to, and post messages known as ‘tweets’. Users trade opinions on different trends that are not equally important and that differ according to the users' power and interest. Tweets can provide rich information for decision making. The main objective of this paper is to present a framework for making valuable decisions by analyzing social users' insights based on their proximity to a particular trend, while highlighting their power in that trend. Tweets are highly unstructured, which makes them difficult to analyze. Our proposed model differs from previous research in this field in that it combines supervised and unsupervised machine learning algorithms. The work proceeds as follows: users are classified by their degree of closeness/interest using Mendelow’s power/interest matrix, and rough set theory is used to eliminate features found in user profiles in order to find minimal sets of data. The proposed model applies two attribute reduction algorithms to our dataset to determine the optimal number of reducts for improving decision making from the user replies. In addition, unsupervised machine learning groups the replies into subcategories such as positive, negative, or neutral. The experimental evaluation shows that the Johnson algorithm reduced the user attributes by 71% compared with the genetic algorithm used in the classification model.
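
    For readers unfamiliar with the Johnson algorithm mentioned above, here is a minimal sketch of the usual greedy formulation: collect, for every pair of objects with different decisions, the set of attributes that tell them apart, then repeatedly pick the attribute that occurs in the most remaining sets. The profile attributes and toy rows are hypothetical, and the alphabetical tie-break is an assumption of this sketch; the paper's own implementation may differ.

```python
from collections import Counter
from itertools import combinations

def johnson_reduct(rows, decisions):
    """Greedy (Johnson-style) approximation of a single rough-set reduct."""
    # One clause per pair of objects with different decisions: the set of
    # attributes on which the two objects differ.
    clauses = []
    for (r1, d1), (r2, d2) in combinations(list(zip(rows, decisions)), 2):
        if d1 != d2:
            diff = frozenset(a for a in r1 if r1[a] != r2[a])
            if diff:
                clauses.append(diff)

    reduct = []
    while clauses:
        counts = Counter(a for clause in clauses for a in clause)
        # Most frequent attribute; ties resolved alphabetically (assumption).
        best = max(sorted(counts), key=counts.get)
        reduct.append(best)
        # Clauses containing the chosen attribute are now satisfied.
        clauses = [c for c in clauses if best not in c]
    return reduct

# Hypothetical user-profile table (not data from the paper).
rows = [
    {"followers": "high", "verified": "yes", "lang": "ar"},
    {"followers": "high", "verified": "no",  "lang": "ar"},
    {"followers": "low",  "verified": "yes", "lang": "ar"},
    {"followers": "low",  "verified": "no",  "lang": "ar"},
]
decisions = ["engaged", "passive", "engaged", "passive"]
reduct = johnson_reduct(rows, decisions)
```

    The greedy choice yields a single, usually small, attribute set rather than all reducts, which is one reason it tends to be contrasted with genetic search.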