1,442 research outputs found

    Random Forest as a tumour genetic marker extractor

    Identifying tumour genetic markers is an essential task for biomedicine. In this thesis, we analyse a dataset of chromosomal rearrangements in cancer samples and present a methodology for extracting genetic markers from this dataset by using a Random Forest as a feature selection tool.

    An investigation of ensemble learning methods in classification problems and an application on non-small-cell lung cancer data

    This study aims to classify death status in non-small-cell lung cancer (NSCLC) using patient records with 24 variables taken from an open-source dataset on a cancer data site. The base classifiers were SMO (Sequential Minimal Optimization), K-NN (K-Nearest Neighbor), random forest, and XGBoost (Extreme Gradient Boosting), and the ensemble learning methods were voting, bagging, boosting, and stacking. Models were compared in terms of accuracy, specificity, sensitivity, precision, and ROC curve. The random forest, SMO, K-NN, and XGBoost classifiers were evaluated on their own, within the bagging ensemble method, and within the boosting ensemble method. In addition, the performance of seven combined models is reported across these metrics: Model 1 (random forest + SMO), Model 2 (XGBoost + K-NN), Model 3 (random forest + K-NN), Model 4 (XGBoost + SMO), Model 5 (SMO + K-NN + random forest), Model 6 (SMO + K-NN + XGBoost), and Model 7 (SMO + K-NN + random forest + XGBoost). The boosting ensemble method with XGBoost gave the best classification performance, with an accuracy of 0.982, a sensitivity of 0.971, a precision of 0.989, a specificity of 0.989, and a ROC curve value of 0.998. It is recommended to use ensemble learning methods for classification problems in patients with a high prevalence of cancer to achieve successful results.
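
    As a rough illustration of two ideas this abstract relies on, the sketch below combines three classifiers by hard (majority) voting and then computes accuracy, sensitivity, specificity, and precision from the resulting confusion counts. The labels and the per-model predictions are invented toy data, and the function names are hypothetical; nothing here reproduces the study's models or results.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, keep the label predicted by most models.

    Assumes an odd number of models and binary labels, so ties cannot occur.
    """
    return [Counter(sample).most_common(1)[0][0]
            for sample in zip(*predictions_per_model)]


def _ratio(numerator, denominator):
    # Undefined ratios (empty denominator) are reported as NaN.
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and precision from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "precision": _ratio(tp, tp + fp),
    }


# Toy data (made up): 1 = death, 0 = survival.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_a = [1, 1, 0, 0, 0, 1, 1, 0]
model_b = [1, 0, 1, 0, 1, 0, 0, 0]
model_c = [1, 1, 1, 0, 0, 0, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])
ensemble_metrics = binary_metrics(y_true, ensemble)
single_metrics = binary_metrics(y_true, model_a)
print(ensemble_metrics)
```

    On this toy data the voted prediction makes fewer errors than `model_a` alone, which is the effect a voting ensemble aims for; real data offers no such guarantee.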

    Artificial intelligence in digital pathology: a diagnostic test accuracy systematic review and meta-analysis

    Ensuring diagnostic performance of AI models before clinical use is key to the safe and successful adoption of these technologies. Studies reporting AI applied to digital pathology images for diagnostic purposes have rapidly increased in number in recent years. The aim of this work is to provide an overview of the diagnostic accuracy of AI in digital pathology images from all areas of pathology. This systematic review and meta-analysis included diagnostic accuracy studies using any type of artificial intelligence applied to whole slide images (WSIs) in any disease type. The reference standard was diagnosis through histopathological assessment and/or immunohistochemistry. Searches were conducted in PubMed, EMBASE and CENTRAL in June 2022. We identified 2976 studies, of which 100 were included in the review and 48 in the full meta-analysis. Risk of bias and concerns of applicability were assessed using the QUADAS-2 tool. Data extraction was conducted by two investigators and meta-analysis was performed using a bivariate random effects model. The 100 studies identified for inclusion equate to over 152,000 WSIs and represent many disease types. Of these, 48 studies were included in the meta-analysis. These studies reported a mean sensitivity of 96.3% (CI 94.1-97.7) and mean specificity of 93.3% (CI 90.5-95.4) for AI. There was substantial heterogeneity in study design and all 100 studies identified for inclusion had at least one area at high or unclear risk of bias. This review provides a broad overview of AI performance across applications in whole slide imaging. However, there is huge variability in study design and available performance data, with details around the conduct of the study and the make-up of the datasets frequently missing. Overall, AI offers good accuracy when applied to WSIs but requires more rigorous evaluation of its performance. Comment: 26 pages, 5 figures, 8 tables + Supplementary material

    Treating colon cancer survivability prediction as a classification problem

    This work presents a survivability prediction model for colon cancer developed with machine learning techniques. Survivability was viewed as a classification task where it was necessary to determine if a patient would survive each of the five years following treatment. The model was based on the SEER dataset which, after preprocessing, consisted of 38,592 records of colon cancer patients. Six features were extracted from a feature selection process in order to construct the model. This model was compared with another one with 18 features indicated by a physician. The results show that the performance of the six-feature model is close to that of the model using 18 features, which indicates that the former may be a good compromise between usability and performance. This work has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT – Fundação para a Ciência e Tecnologia within the Project Scope UID/CEC/00319/2013. The work of Tiago Oliveira is supported by an FCT grant with the reference SFRH/BD/85291/2012.
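
    One way to picture "survive each of the five years" as a classification target is to derive five binary labels per patient, one per year. The sketch below is an assumption about how such labels could be built, not the authors' preprocessing: the inputs `survival_months` and `died` are hypothetical field names, and patients whose follow-up ends before a given year without a recorded death are marked `None` (unknown) rather than guessed.

```python
def yearly_survival_labels(survival_months, died, years=5):
    """Return one label per year: 1 = survived it, 0 = died before its end,
    None = follow-up too short to know (assumed handling of censoring)."""
    labels = []
    for year in range(1, years + 1):
        horizon = 12 * year  # months since treatment
        if survival_months >= horizon:
            labels.append(1)
        elif died:
            labels.append(0)
        else:
            labels.append(None)
    return labels


# Hypothetical patients: (months of follow-up, death recorded?)
died_at_30_months = yearly_survival_labels(30, died=True)
lost_at_30_months = yearly_survival_labels(30, died=False)
alive_at_60_months = yearly_survival_labels(60, died=False)
print(died_at_30_months, lost_at_30_months, alive_at_60_months)
```

    Each of the five label columns can then be treated as its own binary classification problem, with the unknown entries excluded from training for that year.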

    Thyroid cartilage infiltration in advanced laryngeal cancer: prognostic implications and predictive modelling

    Objective: Detection of laryngeal cartilage invasion is of great importance in staging of laryngeal squamous cell carcinoma (LSCC). The role of prognosticators in locally advanced laryngeal cancer is still widely debated. This study aimed to assess the impact of the volume of thyroid cartilage infiltration, as well as other histopathologic variables, on patient survival. Materials and methods: We retrospectively analysed 74 patients affected by pT4 LSCC and treated with total laryngectomy between 2005 and 2021 at the Department of Otorhinolaryngology - Head and Neck Surgery of the University of Brescia, Italy. We considered as potential prognosticators histological grade, perineural (PNI) and lympho-vascular invasion (LVI), thyroid cartilage infiltration, and pTN staging. Pre-operative CT or MRI scans were analysed to quantify the volume of cartilage infiltration using 3D Slicer software. Results: The 1-, 3-, and 5-year disease-free survival (DFS) rates were 76%, 66%, and 64%, respectively. Using machine learning models, we found that the volume of thyroid cartilage infiltration had a high correlation with DFS. Patients with a higher volume (> 670 mm³) of infiltration had a worse prognosis compared to those with a lower volume. Conclusions: Our study confirms the essential role of LVI as a prognosticator in advanced LSCC and, more innovatively, highlights the volume of thyroid cartilage infiltration as another promising prognostic factor.
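
    The abstract does not say how the DFS rates were estimated or how the two volume groups were compared. As a minimal sketch under that caveat, the code below uses the Kaplan-Meier estimator (a standard choice, assumed here) and splits patients at the 670 mm³ threshold mentioned above. All times, event flags, and volumes are invented toy values, and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as a list of (time, survival) at each event time.

    times  : follow-up in months
    events : 1 if recurrence/death was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, month):
    """Estimated survival probability at a given month (step function)."""
    value = 1.0
    for t, s in curve:
        if t > month:
            break
        value = s
    return value


# Toy cohort (made up): follow-up months, event flag, infiltration volume in mm^3.
times = [6, 10, 14, 20, 40, 62, 70, 80]
events = [1, 1, 0, 1, 0, 1, 0, 0]
volumes = [900, 800, 300, 750, 200, 700, 150, 400]

THRESHOLD_MM3 = 670  # cut-off quoted in the abstract
high = [i for i, v in enumerate(volumes) if v > THRESHOLD_MM3]
low = [i for i, v in enumerate(volumes) if v <= THRESHOLD_MM3]

overall_curve = kaplan_meier(times, events)
high_curve = kaplan_meier([times[i] for i in high], [events[i] for i in high])
low_curve = kaplan_meier([times[i] for i in low], [events[i] for i in low])

for month in (12, 36, 60):
    print(month, round(survival_at(overall_curve, month), 3))
```

    Comparing `survival_at(high_curve, 36)` with `survival_at(low_curve, 36)` gives a crude view of how the two groups diverge; a formal comparison would need a significance test, which this sketch leaves out.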

    DECISION TREE CLASSIFIERS FOR CLASSIFICATION OF BREAST CANCER

    Objective: Breast cancer is one of the dangerous cancers affecting women above 35 years of age worldwide. The breast is made up of lobules that secrete milk and thin milk ducts that carry milk from the lobules to the nipple. Breast cancer mostly occurs either in the lobules or in the milk ducts. The most common type of breast cancer is ductal carcinoma, which starts in the ducts and spreads across the lobules and surrounding tissues. Survey: According to the medical survey, each year about 125.0 new cases of breast cancer per 100,000 are diagnosed and 21.5 per 100,000 women die of this disease in the United States. Also, 246,660 new cases of women with cancer are estimated for the year 2016. Methods: Early diagnosis of breast cancer is a key factor for long-term survival of cancer patients. Classification is one of the vital techniques used by researchers to analyze and classify medical data. Results: This paper analyzes different decision tree classifier algorithms for the SEER breast cancer dataset using WEKA software. The performance of the classifiers is evaluated against parameters such as accuracy, Kappa statistic, Entropy, RMSE, TP Rate, FP Rate, Precision, Recall, F-Measure, ROC, Specificity, and Sensitivity. Conclusion: The simulation results show that the REPTree classifier classifies the data with 93.63% accuracy and a minimum RMSE of 0.1628. The REPTree algorithm takes less time to build the model, with 0.929 ROC and 0.959 PRC values. Comparing the classification results, we confirm that the REPTree algorithm is better than the other classification algorithms for the SEER dataset.
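
    Of the evaluation measures listed above, the Kappa statistic is the least self-explanatory: it compares observed agreement between predicted and true labels with the agreement expected by chance. The sketch below computes Cohen's kappa from scratch on invented toy labels. It is only meant to show the formula and is not a reimplementation of WEKA's evaluation output; the function name and the label values are hypothetical.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal frequencies, summed over classes.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    if expected == 1.0:
        return float("nan")  # undefined when only one class ever appears
    return (observed - expected) / (1 - expected)


# Toy labels (made up), 10 samples.
y_true = ["pos"] * 6 + ["neg"] * 4
y_pred = ["pos"] * 5 + ["neg"] + ["neg"] * 3 + ["pos"]

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
kappa = cohen_kappa(y_true, y_pred)
print(accuracy, round(kappa, 3))
```

    Here the accuracy is 0.8 but kappa is noticeably lower, because with a 6:4 class split a fair amount of agreement would occur by chance alone.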