33 research outputs found

    Conformal Prediction with Orange

    Conformal predictors estimate the reliability of outcomes made by supervised machine learning models. Instead of a point value, conformal prediction defines an outcome region that meets a user-specified reliability threshold. Provided that the data are independently and identically distributed, the user can control the level of the prediction errors and adjust it to the requirements of a given application. The quality of conformal predictions often depends on the choice of nonconformity estimate for a given machine learning method. To promote the selection of a successful approach, we have developed Orange3-Conformal, a Python library that provides a range of conformal prediction methods for classification and regression. The library also implements several nonconformity scores. It has a modular design and can be extended with new conformal prediction methods and nonconformity scores.
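
The idea of an outcome region that meets a user-specified error level can be illustrated with a minimal sketch of split (inductive) conformal classification. This is not the Orange3-Conformal API: the function names, the label names, and the nonconformity scores below are all hypothetical, and the scores are assumed to have been computed already by some underlying model on a held-out calibration set.

```python
from typing import Dict, List, Sequence


def conformal_p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Share of scores (calibration plus the test score itself) that are
    at least as nonconforming as the test score."""
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def prediction_region(
    calibration_scores: Sequence[float],
    candidate_scores: Dict[str, float],
    eps: float,
) -> List[str]:
    """Keep every candidate label whose p-value exceeds the error level eps."""
    return [
        label
        for label, score in candidate_scores.items()
        if conformal_p_value(calibration_scores, score) > eps
    ]


# Hypothetical nonconformity scores from a held-out calibration set.
calibration = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
# Hypothetical scores for one test object under each tentative label.
candidates = {"healthy": 0.15, "disease": 0.95}

region_strict = prediction_region(calibration, candidates, eps=0.2)
region_loose = prediction_region(calibration, candidates, eps=0.05)
print(region_strict, region_loose)
```

Lowering `eps` (asking for fewer errors) can only enlarge the region, which is the trade-off the abstract describes: the user fixes the error level and the predictor answers with a region of matching size.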

    Conformal predictors in early diagnostics of ovarian and breast cancers

    The paper describes an application of a recently developed machine learning technique called Mondrian predictors to risk assessment of ovarian and breast cancers. The analysis is based on mass spectrometry profiling of human serum samples that were collected in the United Kingdom Collaborative Trial of Ovarian Cancer Screening. The paper describes the technique and presents the results of classification (diagnosis) and the corresponding measures of confidence of the diagnostics. The main advantage of this approach is the proven validity of its predictions. The paper also describes an approach to improve early diagnosis of ovarian and breast cancers: the data in the United Kingdom Collaborative Trial of Ovarian Cancer Screening were collected over a period of seven years and therefore allow changes in human serum to be observed over that period. The significance of the improvement is confirmed statistically (for up to 11 months for ovarian cancer and 9 months for breast cancer). In addition, the methodology allowed us to pinpoint the same mass spectrometry peaks that were previously identified as carrying statistically significant information for discrimination between healthy and diseased patients. The results are discussed.

    Using random forest for reliable classification and cost-sensitive learning for medical diagnosis

    Background: Most machine-learning classifiers output label predictions for new instances without indicating how reliable the predictions are. The applicability of these classifiers is limited in critical domains where incorrect predictions have serious consequences, like medical diagnosis. Further, the default assumption of equal misclassification costs is most likely violated in medical diagnosis. Results: In this paper, we present a modified random forest classifier which is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme, using Kolmogorov complexity to test the randomness of a particular sample with respect to the training sets. Our method is well calibrated: the performance level can be set prior to classification, and the accuracy rate is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor which takes into account different misclassification costs for different classes and allows a different confidence level to be specified for each class. Intensive experiments on benchmark datasets and real-world applications show that the resultant classifier is well calibrated and able to control the specific risk of different classes. Conclusion: Using the RF outlier measure to design a nonconformity measure benefits the resultant predictor. Further, a label-conditional classifier is developed and turns out to be an alternative approach to the cost-sensitive learning problem that relies on label-wise predefined confidence levels. The target of minimizing the risk of misclassification is achieved by specifying different confidence levels for different classes.
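
A label-conditional predictor of the kind described here can be sketched as follows. This is a rough illustration rather than the authors' implementation: the paper derives nonconformity from a random forest outlier measure, whereas the scores, class names, and the function `label_conditional_region` below are hypothetical placeholders. The two points the sketch tries to show are that each label's p-value is computed only against calibration examples of that same class, and that each class gets its own error level.

```python
from typing import Dict, List, Sequence


def label_conditional_region(
    calibration_by_label: Dict[str, Sequence[float]],
    candidate_scores: Dict[str, float],
    eps_by_label: Dict[str, float],
) -> List[str]:
    """Return the labels whose class-wise p-value exceeds that class's error level."""
    region = []
    for label, score in candidate_scores.items():
        class_scores = calibration_by_label[label]
        # p-value relative to calibration examples of this class only
        at_least = sum(1 for s in class_scores if s >= score)
        p_value = (at_least + 1) / (len(class_scores) + 1)
        if p_value > eps_by_label[label]:
            region.append(label)
    return region


# Hypothetical per-class calibration nonconformity scores.
calibration = {
    "benign": [0.1, 0.2, 0.3, 0.4],
    "malignant": [0.2, 0.5, 0.6, 0.9],
}
# Hypothetical scores for one test object under each tentative label.
candidates = {"benign": 0.35, "malignant": 0.7}

# A small error level for "malignant" makes that label hard to exclude,
# which is one way to express that missing it is the costlier mistake.
cost_aware = label_conditional_region(
    calibration, candidates, {"benign": 0.5, "malignant": 0.05}
)
uniform = label_conditional_region(
    calibration, candidates, {"benign": 0.1, "malignant": 0.1}
)
print(cost_aware, uniform)
```

Both labels have the same p-value (0.4) in this toy example, so the difference between the two regions comes entirely from the per-class error levels.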

    Personalized prostate cancer management : AI-assisted prostate pathology and improved active surveillance

    Prostate cancer is a major global health concern and is the most common cancer-related cause of death in Sweden. Prostate cancer screening using PSA has been shown to reduce prostate cancer mortality but also leads to significant overdiagnosis and overtreatment of low-risk cancers. Improved risk stratification and effective active surveillance are crucial to balancing the benefits of screening with the risk of overdiagnosis and overtreatment. In Study I, we studied the uptake and the follow-up of active surveillance using a retrospective cohort of patients who were diagnosed with low-risk prostate cancer between 2008 and 2017 in Stockholm County. Our results showed that only 50% of eligible active surveillance patients received active surveillance as their primary treatment choice at diagnosis. Most men who enrolled in active surveillance remained on surveillance during the first years after diagnosis (82% during a median 3.5 years), but did not receive follow-up according to guidelines with regard to repeat biopsies and PSA tests. Current clinical practice has seen an increase in the use of magnetic resonance imaging (MRI) and the incorporation of risk prediction models to select men with the highest suspicion of clinically significant prostate cancer for prostate biopsy. However, the effectiveness of MRI and risk prediction models, and how they should be incorporated into active surveillance follow-up, have yet to be established. Study II evaluated the performance of MRI-targeted biopsies and a blood-based risk prediction model (the Stockholm3 test) for monitoring disease progression in patients on active surveillance and compared this to the conventional follow-up using PSA and systematic biopsies. When MRI-targeted and systematic biopsies were combined, the detection rate of clinically significant prostate cancer increased when compared to conventional systematic biopsies.
Biopsies performed in MRI-positive men resulted in a 49% reduction in performed biopsies, at the expense of failing to diagnose 1.4% clinically significant prostate cancer in MRI-negative men. The incorporation of the Stockholm3 test showed a 27% reduction in required MRI investigations and a 57% reduction in performed biopsies compared to performing only systematic biopsies. In Study III, we digitized biopsy cores from STHLM3 participants to develop an artificial intelligence (AI) system for prostate cancer diagnostics. The AI system demonstrated clinically useful performance that was comparable to that of the study pathologist for cancer detection (AUC of 0.986) and for predictions of cancer length (correlation of 0.87), and grading performance that was on par with that of expert prostate pathologists. In Study IV, we developed a conformal predictor to estimate the uncertainty of the predictions for the model in Study III. The uncertainty estimates were used to control the error rate so that only predictions with high confidence are accepted and unreliable predictions can be detected. The conformal predictor was able to identify unreliable predictions as a result of variations in digital pathology scanners, preparation of tissue in different pathology laboratories, and the existence of unusual prostate tissue that the AI model was not exposed to during training. Little is known about the relationships between prostate cancer genetic risk factors and the morphology of prostate tissue. In Study V, we investigated whether weakly supervised deep learning can learn to detect such possible associations. The findings in this paper imply relationships between prostatic tissue morphology and genetic risk factors for prostate cancer, particularly in young men. These results provide proof of principle for exploring the use of morphological information in multi-modal prostate cancer risk prediction algorithms.
In conclusion, the purpose of this thesis was to describe possible extensions to improve prostate cancer active surveillance management, as well as to develop prediction models for improved prostate cancer diagnostics.
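
The use of a conformal predictor in Study IV to accept only confident predictions and detect unreliable ones might be sketched as below. The abstract does not give the exact decision rule, so the rule used here is an assumption: a prediction is accepted only when the conformal region contains exactly one label, an empty region is treated as a sign that the sample is unlike the calibration data, and a region with several labels is treated as ambiguous. The function name `triage`, the status strings, the labels, and the p-values are all hypothetical.

```python
from typing import Dict, List, Tuple


def triage(p_values: Dict[str, float], eps: float) -> Tuple[str, List[str]]:
    """Classify one prediction as accepted or flagged, given per-label conformal p-values."""
    region = [label for label, p in p_values.items() if p > eps]
    if len(region) == 1:
        return "accept", region          # single confident label
    if not region:
        return "flag_empty", region      # no label conforms: unusual sample
    return "flag_multiple", region       # several labels remain plausible


# Hypothetical p-values for three samples at error level eps = 0.05.
confident = triage({"benign": 0.62, "cancer": 0.01}, eps=0.05)
unusual = triage({"benign": 0.02, "cancer": 0.03}, eps=0.05)
ambiguous = triage({"benign": 0.30, "cancer": 0.25}, eps=0.05)
print(confident, unusual, ambiguous)
```

Under this assumed rule, the empty-region case is the one that would correspond to situations such as a different scanner or unusual tissue, where no label looks typical relative to the calibration data.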