141 research outputs found

    Breast Cancer Diagnosis from Perspective of Class Imbalance

    Get PDF
    Introduction: Breast cancer is the second cause of mortality among women. Early detection is the only rescue to reduce the risk of breast cancer mortality. Traditional methods cannot effectively diagnose tumor since they are based on the assumption of well-balanced dataset.. However, a hybrid method can help to alleviate the two-class imbalance problem existing in the diagnosis of breast cancer and establish a more accurate diagnosis. Material and Methods: The proposed hybrid approach was based on improved Laplacian score (LS) andK-nearest neighbor (KNN) algorithms called LS-KNN. An improved LS algorithm was used for obtaining the optimal feature subset. The KNN with automatic K was utilized for classifying the data which guaranteed the effectiveness of the proposed method by reducing the computational effort and making the classification more faster. The effectiveness of LS-KNN was also examined on two biased-representative breast cancer datasets using classification accuracy, sensitivity, specificity, G-mean, and Matthews correlation coefficient. Results: Applying the proposed algorithm on two breast cancer datasets indicated that the efficiency of the new method was higher than the previously introduced methods. The obtained values of accuracy, sensitivity, specificity, G-mean, and Matthews correlation coefficient were 99.27%, 99.12%, 99.51%, 99.42%, respectively. Conclusion: Experimental results showed that the proposed approach worked well with breast cancer datasets and could be a good alternative to the well-known machine learning method

    Implementing decision tree-based algorithms in medical diagnostic decision support systems

    Get PDF
    As a branch of healthcare, medical diagnosis can be defined as finding the disease based on the signs and symptoms of the patient. To this end, the required information is gathered from different sources like physical examination, medical history and general information of the patient. Development of smart classification models for medical diagnosis is of great interest amongst the researchers. This is mainly owing to the fact that the machine learning and data mining algorithms are capable of detecting the hidden trends between features of a database. Hence, classifying the medical datasets using smart techniques paves the way to design more efficient medical diagnostic decision support systems. Several databases have been provided in the literature to investigate different aspects of diseases. As an alternative to the available diagnosis tools/methods, this research involves machine learning algorithms called Classification and Regression Tree (CART), Random Forest (RF) and Extremely Randomized Trees or Extra Trees (ET) for the development of classification models that can be implemented in computer-aided diagnosis systems. As a decision tree (DT), CART is fast to create, and it applies to both the quantitative and qualitative data. For classification problems, RF and ET employ a number of weak learners like CART to develop models for classification tasks. We employed Wisconsin Breast Cancer Database (WBCD), Z-Alizadeh Sani dataset for coronary artery disease (CAD) and the databanks gathered in Ghaem Hospital’s dermatology clinic for the response of patients having common and/or plantar warts to the cryotherapy and/or immunotherapy methods. To classify the breast cancer type based on the WBCD, the RF and ET methods were employed. It was found that the developed RF and ET models forecast the WBCD type with 100% accuracy in all cases. To choose the proper treatment approach for warts as well as the CAD diagnosis, the CART methodology was employed. The findings of the error analysis revealed that the proposed CART models for the applications of interest attain the highest precision and no literature model can rival it. The outcome of this study supports the idea that methods like CART, RF and ET not only improve the diagnosis precision, but also reduce the time and expense needed to reach a diagnosis. However, since these strategies are highly sensitive to the quality and quantity of the introduced data, more extensive databases with a greater number of independent parameters might be required for further practical implications of the developed models

    Self-adaptive parameter and strategy based particle swarm optimization for large-scale feature selection problems with multiple classifiers

    Get PDF
    This work was partially supported by the National Natural Science Foundation of China (61403206, 61876089,61876185), the Natural Science Foundation of Jiangsu Province (BK20141005), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (14KJB520025), the Engineering Research Center of Digital Forensics, Ministry of Education, and the Priority Academic Program Development of Jiangsu Higher Education Institutions.Peer reviewedPostprin

    Computational models and approaches for lung cancer diagnosis

    Full text link
    The success of treatment of patients with cancer depends on establishing an accurate diagnosis. To this end, the aim of this study is to developed novel lung cancer diagnostic models. New algorithms are proposed to analyse the biological data and extract knowledge that assists in achieving accurate diagnosis results

    Storage Capacity Estimation of Commercial Scale Injection and Storage of CO2 in the Jacksonburg-Stringtown Oil Field, West Virginia

    Get PDF
    Geological capture, utilization and storage (CCUS) of carbon dioxide (CO2) in depleted oil and gas reservoirs is one method to reduce greenhouse gas emissions with enhanced oil recovery (EOR) and extending the life of the field. Therefore CCUS coupled with EOR is considered to be an economic approach to demonstration of commercial-scale injection and storage of anthropogenic CO2. Several critical issues should be taken into account prior to injecting large volumes of CO2, such as storage capacity, project duration and long-term containment. Reservoir characterization and 3D geological modeling are the best way to estimate the theoretical CO 2 storage capacity in mature oil fields. The Jacksonburg-Stringtown field, located in northwestern West Virginia, has produced over 22 million barrels of oil (MMBO) since 1895. The sandstone of the Late Devonian Gordon Stray is the primary reservoir.;The Upper Devonian fluvial sandstone reservoirs in Jacksonburg-Stringtown oil field, which has produced over 22 million barrels of oil since 1895, are an ideal candidate for CO2 sequestration coupled with EOR. Supercritical depth (\u3e2500 ft.), minimum miscible pressure (941 psi), favorable API gravity (46.5°) and good water flood response are indicators that facilitate CO 2-EOR operations. Moreover, Jacksonburg-Stringtown oil field is adjacent to a large concentration of CO2 sources located along the Ohio River that could potentially supply enough CO2 for sequestration and EOR without constructing new pipeline facilities.;Permeability evaluation is a critical parameter to understand the subsurface fluid flow and reservoir management for primary and enhanced hydrocarbon recovery and efficient carbon storage. In this study, a rapid, robust and cost-effective artificial neural network (ANN) model is constructed to predict permeability using the model\u27s strong ability to recognize the possible interrelationships between input and output variables. Two commonly available conventional well logs, gamma ray and bulk density, and three logs derived variables, the slope of GR, the slope of bulk density and Vsh were selected as input parameters and permeability was selected as desired output parameter to train and test an artificial neural network. The results indicate that the ANN model can be applied effectively in permeability prediction.;Porosity is another fundamental property that characterizes the storage capability of fluid and gas bearing formations in a reservoir. In this study, a support vector machine (SVM) with mixed kernels function (MKF) is utilized to construct the relationship between limited conventional well log suites and sparse core data. The input parameters for SVM model consist of core porosity values and the same log suite as ANN\u27s input parameters, and porosity is the desired output. Compared with results from the SVM model with a single kernel function, mixed kernel function based SVM model provide more accurate porosity prediction values.;Base on the well log analysis, four reservoir subunits within a marine-dominated estuarine depositional system are defined: barrier sand, central bay shale, tidal channels and fluvial channel subunits. A 3-D geological model, which is used to estimate theoretical CO2 sequestration capacity, is constructed with the integration of core data, wireline log data and geological background knowledge. Depending on the proposed 3-D geological model, the best regions for coupled CCUS-EOR are located in southern portions of the field, and the estimated CO2 theoretical storage capacity for Jacksonburg-Stringtown oil field vary between 24 to 383 million metric tons. The estimation results of CO2 sequestration and EOR potential indicate that the Jacksonburg-Stringtown oilfield has significant potential for CO2 storage and value-added EOR

    Applications

    Get PDF
    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine for risk modelling, diagnosis, and treatment selection for diseases in electronics, steel production and milling for quality control during manufacturing processes in traffic, logistics for smart cities and for mobile communications
    corecore