179 research outputs found

    Fuzzy rough and evolutionary approaches to instance selection


    A fuzzy logic-based text classification method for social media

    Social media offer abundant information for studying people’s behaviors, emotions, and opinions during the evolution of rare events such as natural disasters, and it is useful to analyze the correlation between social media and human-affected events. This study uses Twitter text data related to Hurricane Sandy (2012) to conduct information extraction and text classification. Because the original data cover many topics, the data related to Hurricane Sandy must first be identified. A fuzzy logic-based approach is introduced to solve this text classification problem. The inputs to the proposed fuzzy logic-based model are multiple useful features extracted from each Twitter message; the output is each message's degree of relevance to Sandy. A number of fuzzy rules are designed and different defuzzification methods are combined to obtain the desired classification results. This work compares the proposed method with the well-known keyword search method in terms of correctness rate and quantity. The results show that the proposed fuzzy logic-based approach is more suitable than keyword search for classifying Twitter messages.
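    To illustrate the style of inference the abstract describes, the following is a minimal Mamdani-style sketch, not the paper's actual model: the feature names (keyword and hashtag scores), the triangular membership shapes, and the three rules are all assumptions made for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def low(x):    return tri(x, -0.5, 0.0, 0.5)
def medium(x): return tri(x, 0.0, 0.5, 1.0)
def high(x):   return tri(x, 0.5, 1.0, 1.5)

def relevance(keyword_score, hashtag_score):
    """Degree of relevance in [0, 1] via centroid defuzzification.

    Assumed rules (for illustration only):
      R1: keyword high OR hashtag high  -> relevance high
      R2: keyword medium                -> relevance medium
      R3: keyword low AND hashtag low   -> relevance low
    """
    w_high = max(high(keyword_score), high(hashtag_score))
    w_med = medium(keyword_score)
    w_low = min(low(keyword_score), low(hashtag_score))

    # Centroid of the clipped output sets over a discretized universe [0, 1].
    num = den = 0.0
    for i in range(101):
        y = i / 100.0
        mu = max(min(w_high, high(y)), min(w_med, medium(y)), min(w_low, low(y)))
        num += y * mu
        den += mu
    return num / den if den else 0.0
```

    A message whose features all fire the "high" rule defuzzifies to a score near 0.83, while one firing only the "low" rule lands near 0.17; thresholding that score gives the binary relevant/irrelevant decision.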

    Dealing with imbalanced and weakly labelled data in machine learning using fuzzy and rough set methods


    Multistage feature selection methods for data classification

    In the data analysis process, a good decision can be made with the assistance of several sub-processes and methods, the most common of which are feature selection and classification. Various methods have been proposed to address issues faced by decision-makers such as low classification accuracy and long processing time, and the analysis becomes more complicated when dealing with complex datasets that are large and problematic. One solution is to employ an effective feature selection method to reduce data processing time, decrease memory usage, and increase decision accuracy; however, not all existing methods are capable of dealing with these issues. The aim of this research was to assist the classifier in achieving better performance on problematic datasets by generating an optimised attribute set. The proposed method comprised two stages of feature selection: a correlation-based feature selection method using a best-first search algorithm (CFS-BFS), and a soft set and rough set parameter selection method (SSRS). CFS-BFS was used to eliminate uncorrelated attributes in a dataset, while SSRS was utilized to manage problematic values such as uncertainty. Several benchmark feature selection methods, such as classifier subset evaluation (CSE) and principal component analysis (PCA), and different classifiers, such as support vector machine (SVM) and neural network (NN), were used to validate the obtained results, and ANOVA and T-tests were conducted to verify them. The obtained averages for two experimental works showed that the proposed method matched the performance of the benchmark methods in assisting the classifier to achieve high classification performance on complex datasets, while the average for another experimental work showed that the proposed method outperformed the benchmarks. In conclusion, the proposed method is a significant alternative feature selection method, able to assist classifiers in achieving better classification accuracy, especially when dealing with problematic datasets.
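    A toy sketch of the two-stage idea follows. It is not the thesis method: stage one approximates CFS with a plain feature-class correlation filter (the best-first subset search is omitted), and stage two uses a classical rough-set dependency (positive-region) check in place of the combined soft set/rough set procedure; thresholds and data layout are assumptions.

```python
from collections import defaultdict

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0

def correlation_filter(rows, labels, threshold=0.3):
    """Stage 1 (CFS-like filter): keep indices of features whose absolute
    correlation with the class reaches the threshold."""
    return [j for j in range(len(rows[0]))
            if abs(pearson([r[j] for r in rows], labels)) >= threshold]

def dependency(rows, labels, feats):
    """Rough-set dependency degree: fraction of objects whose
    indiscernibility class w.r.t. feats is label-consistent."""
    classes = defaultdict(set)
    for r, y in zip(rows, labels):
        classes[tuple(r[j] for j in feats)].add(y)
    pos = sum(1 for r, y in zip(rows, labels)
              if len(classes[tuple(r[j] for j in feats)]) == 1)
    return pos / len(rows)

def reduce_features(rows, labels, feats):
    """Stage 2 (rough-set-style reduction): greedily drop features whose
    removal does not lower the dependency degree."""
    feats = list(feats)
    base = dependency(rows, labels, feats)
    for j in list(feats):
        trial = [f for f in feats if f != j]
        if trial and dependency(rows, labels, trial) >= base:
            feats = trial
    return feats
```

    On a toy dataset where two features duplicate the label and a third is noise, stage one discards the noise feature and stage two removes one of the redundant duplicates, leaving a single informative attribute for the classifier.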

    Neutrosophic rule-based prediction system for toxicity effects assessment of biotransformed hepatic drugs

    Measuring toxicity is an important step in drug development. However, the current experimental methods used to estimate drug toxicity are expensive and require high computational effort, and are therefore unsuitable for large-scale evaluation of drug toxicity. As a consequence, there is high demand for computational models that can predict drug toxicity risks. In this paper, we used a dataset consisting of 553 drugs that are biotransformed in the liver

    Rough-set based learning methods: A case study to assess the relationship between the clinical delivery of cannabinoid medicine for anxiety, depression, sleep: patterns and predictability

    COVID-19 is an unprecedented health crisis causing a great deal of stress and mental health challenges in populations in Canada. Research is emerging that highlights the potential beneficial effects of cannabinoids on anxiety, mood, and sleep disorders, and points to increased use of medicinal cannabis since COVID-19 was declared a pandemic. Furthermore, evidence points to a correlation between mental health and sleep patterns. The objective of this research is threefold: i) to assess, using machine learning, the relationship of the clinical delivery of cannabinoid medicine to anxiety, depression, and sleep scores; ii) to discover patterns based on patient features such as specific cannabis recommendations, diagnosis information, and decreasing/increasing scores on clinical assessment tools (GAD7, PHQ9, and PSQI) over a period of time, including during the COVID timeline; and iii) to predict whether new patients could potentially experience either an increase or decrease in clinical assessment tool scores. The dataset for this thesis was derived from patient visits to Ekosi Health Centres in Manitoba and Ontario, Canada, from January 2019 to April 2021. Extensive pre-processing and feature engineering was performed. To determine the outcome of a patient's treatment, a class feature (Worse, Better, or No Change) indicative of their progress, or lack thereof, due to the treatment received was introduced. Three well-known supervised machine learning models (tree-based, rule-based, and nearest neighbour) were trained on the patient dataset, along with seven rough and rough-fuzzy hybrid methods. All experiments were conducted using a 10-fold CV method. Sensitivity and specificity measures were higher in all classes with the rough and rough-fuzzy hybrid methods, and the highest accuracy of 99.15% was obtained using the rule-based rough-set learning method. Ekosi Health Center, Mitacs. Master of Science in Applied Computer Science
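    The evaluation protocol the abstract reports, 10-fold cross-validation with per-class sensitivity and specificity, can be sketched as below. This is an illustrative skeleton, not the thesis pipeline: the fold-assignment scheme and the one-vs-rest treatment of the three-class labels are assumptions.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k roughly equal, disjoint folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity (recall on the positive class) and
    specificity (recall on the rest) for a single class label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec
```

    With the three-class outcome (Worse, Better, No Change), sensitivity and specificity would be computed once per class in this one-vs-rest fashion, then averaged over the 10 test folds.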

    Automated Resolution Selection for Image Segmentation

    It is well known in image processing in general, and hence in image segmentation in particular, that computational cost increases rapidly with the number and dimensions of the images to be processed. Several fields, such as astronomy, remote sensing, and medical imaging, use very large images, which might also be 3D and/or captured at several frequency bands, all adding to the computational expense. Multiresolution analysis is one method of increasing the efficiency of the segmentation process. One multiresolution approach is the coarse-to-fine segmentation strategy, whereby the segmentation starts at a coarse resolution and is then fine-tuned during subsequent steps. Until now, the starting resolution for segmentation has been selected arbitrarily with no clear selection criteria. The research conducted for this thesis showed that starting from different resolutions for image segmentation results in different accuracies and speeds, even for images from the same dataset. An automated method for resolution selection for an input image would thus be beneficial. This thesis introduces a framework for the selection of the best resolution for image segmentation. First proposed is a measure for defining the best resolution based on user/system criteria, which offers a trade-off between accuracy and time. A learning approach is then described for the selection of the resolution, whereby extracted image features are mapped to the previously determined best resolution. In the learning process, class (i.e., resolution) distribution is imbalanced, making effective learning from the data difficult. A variant of AdaBoost, called RAMOBoost, is therefore used in this research for the learning-based selection of the best resolution for image segmentation. RAMOBoost is designed specifically for learning from imbalanced data. Two sets of features are used: Local Binary Patterns (LBP) and statistical features. 
    Experiments conducted with four datasets using three different segmentation algorithms show that the resolutions selected through learning enable much faster segmentation than the original ones, while retaining at least the original accuracy. For three of the four datasets used, the segmentation results obtained with the proposed framework were significantly better than those at the original resolution with respect to both accuracy and time.
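    The "best resolution" measure described above trades accuracy against time under user/system criteria. A hypothetical sketch of such a measure follows; the linear weighting and the time normalization are assumptions for illustration, not the thesis formula.

```python
def best_resolution(candidates, alpha=0.7):
    """Pick the resolution maximizing a weighted accuracy/time trade-off.

    candidates: list of (resolution_label, accuracy, seconds) tuples.
    alpha: user/system preference in [0, 1]; higher favours accuracy,
           lower favours speed (assumed weighting scheme).
    """
    t_max = max(t for _, _, t in candidates)  # normalize times to [0, 1]

    def score(entry):
        _, acc, t = entry
        return alpha * acc - (1 - alpha) * (t / t_max)

    return max(candidates, key=score)[0]
```

    With a default preference of alpha = 0.7, a hypothetical candidate set where a quarter-resolution run is nearly as accurate as full resolution but several times faster would select the quarter resolution; pushing alpha to 1.0 (accuracy only) selects full resolution. The learning step then maps image features (e.g., LBP and statistical features) to this pre-computed best resolution so new images skip the exhaustive search.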

    Computational Intelligence in Healthcare

    This book is a printed edition of the Special Issue Computational Intelligence in Healthcare that was published in Electronics.

    Computational Intelligence in Healthcare

    The volume of patient health data has been estimated to have reached 2,314 exabytes by 2020. Traditional data analysis techniques are unsuitable for extracting useful information from such a vast quantity of data; thus, intelligent data analysis methods combining human expertise and computational models for accurate and in-depth analysis are necessary. The technological revolution and medical advances made by combining vast quantities of available data, cloud computing services, and AI-based solutions can provide expert insight and analysis on a mass scale and at a relatively low cost. Computational intelligence (CI) methods, such as fuzzy models, artificial neural networks, evolutionary algorithms, and probabilistic methods, have recently emerged as promising tools for the development and application of intelligent systems in healthcare practice. CI-based systems can learn from data and evolve according to changes in their environment, taking into account the uncertainty characterizing health data, including omics, clinical, sensor, and imaging data. The use of CI in healthcare can improve the processing of such data to develop intelligent solutions for prevention, diagnosis, treatment, and follow-up, as well as for the analysis of administrative processes. The present Special Issue on computational intelligence for healthcare is intended to show the potential and the practical impact of CI techniques in challenging healthcare applications.