2 research outputs found

    Numeric prediction of dissolved oxygen status through two-stage training for classification-driven regression

    Dissolved oxygen in aquaculture is an important measure of the quality of the culture environment and of how aquatic products have been grown. In the machine learning context, this measure can be estimated by defining a regression problem that aims at numerical prediction of the dissolved oxygen status. In general, the vast majority of popular machine learning algorithms were designed for classification tasks. In order to adopt these algorithms effectively for numerical prediction, in this paper we propose a two-stage training approach that transforms the regression problem into a classification problem and then transforms it back into a regression problem. In particular, unsupervised discretization of continuous attributes is adopted at the first stage to transform the (numeric) target attribute into a discrete (nominal) one with several intervals, so that popular machine learning algorithms can predict the interval to which an instance belongs as a classification task. Then, based on the classification result of the first stage, some of the instances within the predicted interval are selected for training at the second stage, which predicts the numeric target attribute value of each instance. An experimental study is conducted to investigate the general effectiveness of the popular learning algorithms in the numerical prediction task and to analyze how increasing the number of training instances selected at the second training stage affects the final prediction performance. The results show that the adoption of decision tree learning and neural networks leads to better and more stable performance than Naive Bayes, K Nearest Neighbours and Support Vector Machine.
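The two-stage procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: equal-width binning stands in for the unsupervised discretization, a 1-nearest-neighbour rule stands in for the stage-one classifier, and the stage-two regressor is simply the mean target of the `k` nearest training instances inside the predicted interval. All function names and parameters (`equal_width_bins`, `bin_index`, `two_stage_predict`, `n_bins`, `k`) are hypothetical.

```python
import math


def equal_width_bins(targets, n_bins):
    """Return the edges that split [min, max] of the targets into n_bins equal intervals."""
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_index(value, edges):
    """Map a numeric target to its interval index (the last interval is closed on the right)."""
    n_bins = len(edges) - 1
    for i in range(n_bins):
        if value < edges[i + 1]:
            return i
    return n_bins - 1


def two_stage_predict(X_train, y_train, x_new, n_bins=3, k=2):
    """Predict (interval index, numeric value) for x_new.

    Assumption: 1-NN classification and a local mean are stand-ins for
    whichever classifier and regressor one would actually plug in.
    """
    # Discretize the numeric target into nominal interval labels.
    edges = equal_width_bins(y_train, n_bins)
    labels = [bin_index(y, edges) for y in y_train]

    # Stage 1: classify the new instance into an interval (1-NN stand-in).
    nearest = min(range(len(X_train)), key=lambda i: math.dist(X_train[i], x_new))
    predicted_bin = labels[nearest]

    # Stage 2: keep only training instances inside the predicted interval,
    # select the k closest ones, and regress by averaging their targets.
    candidates = [i for i, lab in enumerate(labels) if lab == predicted_bin]
    candidates.sort(key=lambda i: math.dist(X_train[i], x_new))
    chosen = candidates[:k]
    value = sum(y_train[i] for i in chosen) / len(chosen)
    return predicted_bin, value
```

Here `k` plays the role of the number of instances selected at the second stage, the quantity whose effect on prediction performance the abstract says was studied.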

    Boosting-BoW Algorithm for Finding Kidney Diseases from Medical Test Reports

    This paper introduces an approach to increase classification accuracy by employing Bag-of-Words (BoW) as a feature selection method together with machine learning algorithms. Because it can process large sets of data quickly and produce accurate results, this approach can be used in medical areas. As mentioned in the literature review, different researchers have developed different ensemble approaches to obtain good results. In this study, a novel algorithm is proposed to analyze medical kidney test reports, using BoW to select the features and analyzing them by boosting four machine learning classification algorithms: Sequential Minimal Optimization (SMO), k-Nearest Neighbors (k-NN), Random Forests (RF) and Naïve Bayes (NB). With the help of specialists in urology, the proposed algorithm is tested against multiple datasets of different kidney tests. The accuracy of the proposed boosting algorithms exceeds that of the counterpart algorithms SMO, k-NN, RF and NB when each is used on its own.
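As a rough illustration of the pipeline this abstract describes (BoW features fed into a boosted classifier), the sketch below builds token-count vectors and runs AdaBoost over word-presence decision stumps. It is an assumption-laden stand-in, not the paper's algorithm: the paper boosts SMO, k-NN, RF and NB, whereas stumps are used here only to keep the example short. The function names (`bag_of_words`, `best_stump`, `adaboost`, `predict`) are hypothetical, and any input strings would be made-up placeholders rather than real test reports.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Build a sorted vocabulary and one token-count vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


def stump_predict(row, feature, polarity):
    """Predict +polarity if the word is present in the document, else -polarity."""
    return polarity if row[feature] > 0 else -polarity


def best_stump(X, y, weights):
    """Return (error, feature, polarity) of the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for polarity in (1, -1):
            err = sum(
                w for w, row, t in zip(weights, X, y)
                if stump_predict(row, feature, polarity) != t
            )
            if best is None or err < best[0]:
                best = (err, feature, polarity)
    return best


def adaboost(X, y, rounds=3):
    """Train AdaBoost on labels in {-1, +1}; returns a list of (alpha, feature, polarity)."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, feature, polarity = best_stump(X, y, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # clip to avoid log(0) or division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feature, polarity))
        # Increase the weight of misclassified documents, then renormalize.
        weights = [
            w * math.exp(-alpha * t * stump_predict(row, feature, polarity))
            for w, row, t in zip(weights, X, y)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps; ties resolve to +1."""
    score = sum(a * stump_predict(row, f, p) for a, f, p in ensemble)
    return 1 if score >= 0 else -1
```

Swapping the stump for a stronger base learner would only change `best_stump` and `stump_predict`; the reweighting loop in `adaboost` stays the same.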