6 research outputs found

    Breast Cancer Detection by Extracting and Selecting Features Using Machine Learning

    Breast cancer is a significant cause of female death worldwide, especially in developing countries. Early diagnosis and screening are crucial for better results and higher survival rates. Machine learning (ML) methods can aid in the initial discovery and diagnosis of breast cancer by choosing the most informative elements from medical data and eliminating irrelevant ones. Feature extraction takes unstructured data and extracts a representative set of characteristics that can be used to classify or forecast data. The aim is to reduce the dimensionality of the feature space while maintaining or even improving the accuracy of the ML model. An artificial intelligence model is developed on the given features to categorize mammography images into benign and malignant groups. Different supervised learning techniques, including support vector machines, random forests, and artificial neural networks, are employed and compared in order to select the best-performing model. This research offers a comprehensive framework for utilizing machine learning methods to detect breast cancer. The technique demonstrates how it might assist radiologists in the early detection of breast cancer by effectively extracting and selecting critical characteristics, which could improve patient outcomes and potentially save lives.
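The abstract above does not say which feature-selection method was used. As a minimal sketch of the general idea only, the following assumes a simple filter approach that ranks features by their absolute Pearson correlation with the label and keeps the top `k`. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information about the label
    return cov / math.sqrt(var_x * var_y)

def select_top_k(rows, labels, k):
    """Return the indices of the k features most correlated with the labels."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]

# Made-up toy data: feature 0 tracks the label, feature 1 is constant,
# feature 2 is only weakly related to the label.
rows = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.9],
    [8.0, 5.0, 0.4],
    [9.0, 5.0, 0.9],
]
labels = [0, 0, 1, 1]  # 0 = benign, 1 = malignant

selected = select_top_k(rows, labels, k=2)
print(selected)  # [0, 2]
```

The reduced feature set would then be passed to whichever classifier is being compared. A filter method is only one option; wrapper and embedded selection methods are common alternatives.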

    Machine Learning-Based Hybrid Recommendation (SVOF-KNN) Model For Breast Cancer Coimbra Dataset Diagnosis

    An effective way to identify breast cancer is by creating a prediction algorithm using risk factors. ML models have been used to improve the effectiveness of early detection. This article analyses a KNN model combined with singular value decomposition (SVD) and the Grey Wolf Optimization (GWO) method to detect breast cancer (BC) at an early phase based on risk metrics. The SVD technique was utilized to eliminate the reliable feature vectors, the GW optimizer was used to select the feature vectors, and the KNN model was used to diagnose the BC status. The main objective of the proposed hybrid recommendation model (SVOF-KNN) for BC prediction is to give an accurate recommendation for BC prognosis through four steps: BCCD dataset collection, data pre-processing, feature selection, and classification/recommendation. It is implemented to classify the consequence of risk metrics connected with regular blood analysis (BA) in the BCCD database. The features of the BC dataset include insulin, glucose, HOMA, leptin, resistin, etc. Error measures such as RMSE and MAE are used to calculate the exception values for each instance of the BC dataset. The hybrid model recommends the best-scoring instance, the one with the minimum exception rate, as the defined features for BC prediction. It improves significance in automatic BC classification with the optimum solution. The hybrid recommendation model (SVOF-KNN) also recommends the accurate classification method for BC diagnosis. The results of this work shall enhance the QoS in BC care.
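To make the classification and error-measurement steps concrete, here is a minimal sketch of a plain k-nearest-neighbours classifier together with the standard MAE and RMSE formulas. It deliberately omits the SVD and GWO stages, and it is not the SVOF-KNN model itself. The data points are made up and are not values from the BCCD dataset.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query` (Euclidean)."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Made-up two-feature points (hypothetical, NOT from the BCCD dataset).
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = [0, 0, 0, 1, 1, 1]  # 0 = healthy control, 1 = patient
test_x = [(1.1, 1.0), (5.1, 5.0), (3.2, 3.2), (0.8, 0.9)]
test_y = [0, 1, 0, 0]

preds = [knn_predict(train_x, train_y, q, k=3) for q in test_x]
print(preds)                 # [0, 1, 1, 0]
print(mae(test_y, preds))    # 0.25
print(rmse(test_y, preds))   # 0.5
```

With 0/1 labels, MAE equals the misclassification rate and RMSE is its square root, so the two measures rank candidate models identically in this setting.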

    Predicting breast cancer progression by using cell-free DNA

    Thesis submitted to the Faculty of Information in partial fulfillment of the requirements for the award of Master of Science in Information Technology. Cancer is among the leading causes of death in Kenya, after infectious and cardiovascular diseases. Among the various forms of cancer, breast cancer accounts for a significant percentage of all new cancer incidences in the country and has a high mortality rate. On a global level, breast cancer is considered the most common cancer. Treatment methods vary from patient to patient due to factors such as the stage, age, and health. Treatment methods such as surgery, radiotherapy, chemotherapy, or a combination of these have all been used with varying degrees of success and are not always effective. However, these modalities have been employed successfully when the disease is detected early. This research applied deep neural networks coupled with genetic algorithms to build a learning model that evaluated the biomarkers obtained from cell-free DNA. The model was able to predict progression of breast cancer. In addition, the research employed an agile, data-driven methodology whose recursive nature produced a model with a higher degree of accuracy and specificity. The model developed attained an accuracy of 94% in predicting breast cancer progression.

    Ultrasound guided Diffuse Optical Tomography for Breast Cancer Diagnosis: Algorithm Development

    According to the National Breast Cancer Society, one in every eight women in the United States is diagnosed with breast cancer in her lifetime. The American Cancer Society recommends a semi-annual breast-cancer screening for every woman, which can be heavily facilitated by the availability of a low-cost, non-invasive diagnostic method with good sensitivity and penetration depth. Ultrasound (US) guided Diffuse Optical Tomography (US-guided DOT) has been explored as a breast-cancer diagnostic and screening tool over the past two decades. It has demonstrated great potential for breast-cancer diagnosis, treatment monitoring, and chemotherapy-response prediction. In this imaging method, optical measurements at four different wavelengths are used to reconstruct unknown optical absorption maps, which are then used to calculate the hemoglobin concentration of the US-visible lesion. This dissertation focuses on algorithm development for robust data processing, image reconstruction, and optimal breast cancer diagnostic strategy development in DOT. The inverse problem in DOT is ill-posed, ill-conditioned, and underdetermined. This makes the task of image reconstruction challenging, and thus regularization-based methods need to be employed. In this dissertation, a simple two-step reconstruction method that can produce accurate image estimates in DOT is proposed and investigated. In the first step, a truncated Moore-Penrose pseudoinverse solution is computed to obtain a preliminary estimate of the image. This estimate can be reliably determined from the measured data; subsequently, this preliminary estimate is incorporated into the design of a penalized least squares estimator that is employed to compute the final image estimate. Using physical phantoms, the proposed method was demonstrated to yield more accurate reconstruction than other conventional reconstruction methods. The method was also evaluated with clinical data that included 10 benign and 10 malignant cases.
The capability of reconstructing high-contrast malignant lesions was improved by the use of the proposed method. Reconstructed absorption maps are prone to image artifacts from outliers in measurement data caused by tissue heterogeneity, bad coupling between tissue and light guides, and motion by patient or operator. In this dissertation, a new automated iterative perturbation correction algorithm is proposed to reduce image artifacts based on the structural similarity index (SSIM) of absorption maps of four optical wavelengths. The SSIM was calculated for each wavelength to assess its similarity with other wavelengths. The absorption map was iteratively reconstructed and projected back into measurement space to quantify projection error. Outlier measurements with the highest projection errors were iteratively removed until all wavelength images were structurally similar, with SSIM values greater than a threshold. Clinical data demonstrated statistically significant improvement in image artifact reduction. US guidance with DOT helps to reduce the false positive rate and hence the number of unnecessary biopsies. However, DOT data processing and image reconstruction speed remains slow compared to real-time US. Real-time or near real-time diagnosis with DOT is an important step toward the clinical translation of US-guided DOT. In this dissertation, to address this important need, we present a two-stage diagnostic strategy that is computationally efficient and accurate. In the first stage, benign lesions are identified in near real-time by use of a random forest classifier acting on the DOT measurements and radiologists' US diagnostic scores. The lesions that cannot be reliably classified by the random forest classifier are passed on to the image reconstruction stage. Functional information from the reconstructed hemoglobin concentrations is used by a Support Vector Machine (SVM) classifier for diagnosis in the second stage.
This two-step classification approach, which combines both perturbation data and functional features, results in improved classification, as quantified using the receiver operating characteristic (ROC) curve. Using this two-step approach, the area under the ROC curve (AUC) is 0.937 ± 0.009, with sensitivity of 91.4% and specificity of 85.7%. By comparison, using functional features and US score, the AUC is 0.892 ± 0.027, with sensitivity of 90.2% and specificity of 74.5%. The specificity increased by more than 10 percentage points due to the implementation of the random forest classifier.
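The control flow of such a two-stage strategy can be sketched as below. This is an illustration under stated assumptions, not the dissertation's implementation: the scoring callables stand in for the random forest and the SVM, the two cutoff values are arbitrary, and the case records are made up.

```python
def two_stage_diagnosis(case, stage_one_score, stage_two_score,
                        benign_cutoff=0.1, malignant_cutoff=0.5):
    """Return (label, stage_used) for one lesion.

    stage_one_score: fast malignancy probability from raw measurements
        (hypothetical stand-in for the random forest).
    stage_two_score: malignancy probability from reconstructed features
        (hypothetical stand-in for the SVM); called only when needed.
    """
    p1 = stage_one_score(case)
    if p1 < benign_cutoff:
        # Confidently benign: skip the slow image reconstruction entirely.
        return "benign", 1
    p2 = stage_two_score(case)
    label = "malignant" if p2 >= malignant_cutoff else "benign"
    return label, 2

# Made-up cases carrying precomputed scores, for illustration only.
cases = [
    {"id": "A", "fast": 0.03, "slow": 0.10},
    {"id": "B", "fast": 0.40, "slow": 0.80},
    {"id": "C", "fast": 0.25, "slow": 0.20},
]
reconstructions = []  # records which cases needed the expensive second stage

def fast_score(case):
    return case["fast"]

def slow_score(case):
    reconstructions.append(case["id"])
    return case["slow"]

results = {c["id"]: two_stage_diagnosis(c, fast_score, slow_score) for c in cases}
print(results)          # {'A': ('benign', 1), 'B': ('malignant', 2), 'C': ('benign', 2)}
print(reconstructions)  # ['B', 'C']
```

The design point is that the second scorer is never invoked for cases the first stage settles, which is where the time saving comes from. A low `benign_cutoff` keeps the first stage conservative so that malignant lesions are unlikely to be dismissed early.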

    Implementing decision tree-based algorithms in medical diagnostic decision support systems

    As a branch of healthcare, medical diagnosis can be defined as finding the disease based on the signs and symptoms of the patient. To this end, the required information is gathered from different sources such as physical examination, medical history, and general information about the patient. The development of smart classification models for medical diagnosis is of great interest among researchers. This is mainly because machine learning and data mining algorithms are capable of detecting hidden trends between the features of a database. Hence, classifying medical datasets using smart techniques paves the way to designing more efficient medical diagnostic decision support systems. Several databases have been provided in the literature to investigate different aspects of diseases. As an alternative to the available diagnosis tools and methods, this research uses the machine learning algorithms Classification and Regression Tree (CART), Random Forest (RF), and Extremely Randomized Trees or Extra Trees (ET) to develop classification models that can be implemented in computer-aided diagnosis systems. As a decision tree (DT), CART is fast to create, and it applies to both quantitative and qualitative data. RF and ET employ a number of weak learners like CART to develop models for classification tasks. We employed the Wisconsin Breast Cancer Database (WBCD), the Z-Alizadeh Sani dataset for coronary artery disease (CAD), and the databanks gathered in Ghaem Hospital’s dermatology clinic on the response of patients having common and/or plantar warts to the cryotherapy and/or immunotherapy methods. To classify the breast cancer type based on the WBCD, the RF and ET methods were employed. It was found that the developed RF and ET models forecast the WBCD type with 100% accuracy in all cases. To choose the proper treatment approach for warts as well as the CAD diagnosis, the CART methodology was employed.
The findings of the error analysis revealed that the proposed CART models for the applications of interest attain the highest precision and that no model in the literature can rival them. The outcome of this study supports the idea that methods like CART, RF, and ET not only improve diagnostic precision, but also reduce the time and expense needed to reach a diagnosis. However, since these strategies are highly sensitive to the quality and quantity of the introduced data, more extensive databases with a greater number of independent parameters might be required for further practical implications of the developed models.
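For classification, CART chooses each split by minimizing an impurity measure, commonly the Gini impurity. The sketch below shows that single step on one numeric feature: it is a one-level split search (a decision stump), not a full tree and not the models developed in the study. The feature values and labels are made up.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    Returns (None, parent_gini) if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Made-up single feature (for example, a cell-size score) with class labels.
values = [1, 2, 3, 6, 7, 8]
labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

threshold, impurity = best_split(values, labels)
print(threshold, impurity)  # 4.5 0.0
```

A full CART tree repeats this search over every feature and then recurses into the two child nodes. RF and ET build many such trees on resampled or randomized inputs and aggregate their votes.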