Conformal Prediction with Orange
Conformal predictors estimate the reliability of predictions made by supervised machine learning models. Instead of a point value, conformal prediction defines an outcome region that meets a user-specified reliability threshold. Provided that the data are independently and identically distributed, the user can control the level of prediction errors and adjust it to the requirements of a given application. The quality of conformal predictions often depends on the choice of nonconformity estimate for a given machine learning method. To promote the selection of a successful approach, we have developed Orange3-Conformal, a Python library that provides a range of conformal prediction methods for classification and regression. The library also implements several nonconformity scores. It has a modular design and can be extended with new conformal prediction methods and nonconformity scores.
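As a rough illustration of the idea (not the Orange3-Conformal API), the sketch below computes inductive conformal prediction regions with plain NumPy. It assumes class probabilities from any already-fitted classifier, a held-out calibration set, and 1 minus the predicted probability as the nonconformity score; the function names are hypothetical and chosen only for this example.

```python
import numpy as np


def conformal_p_values(cal_scores, test_scores):
    """Fraction of calibration scores (plus the candidate itself) that are
    at least as large as each candidate nonconformity score."""
    cal_scores = np.asarray(cal_scores, dtype=float)
    test_scores = np.asarray(test_scores, dtype=float)
    # Compare every candidate score against every calibration score.
    at_least = (cal_scores[None, :] >= test_scores[..., None]).sum(axis=-1)
    return (at_least + 1) / (cal_scores.size + 1)


def prediction_sets(cal_probs, cal_labels, test_probs, epsilon):
    """Boolean matrix (n_test, n_classes): True where the label is kept
    in the prediction region at significance level epsilon."""
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_probs = np.asarray(test_probs, dtype=float)
    # Assumed nonconformity score: 1 - probability of the (candidate) label.
    cal_scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    test_scores = 1.0 - test_probs
    return conformal_p_values(cal_scores, test_scores) > epsilon


# Toy calibration data: four examples, two classes.
cal_probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
cal_labels = [0, 0, 1, 1]
test_probs = [[0.95, 0.05]]

print(prediction_sets(cal_probs, cal_labels, test_probs, epsilon=0.2))
print(prediction_sets(cal_probs, cal_labels, test_probs, epsilon=0.1))
```

Lowering epsilon asks for higher confidence, so the region can only grow: a label is dropped only when its p-value is no larger than epsilon.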
Conformal predictors in early diagnostics of ovarian and breast cancers
The paper describes an application of a recently
developed machine learning technique called Mondrian
predictors to risk assessment of ovarian and breast
cancers. The analysis is based on mass spectrometry
profiling of human serum samples that were collected
in the United Kingdom Collaborative Trial of Ovarian
Cancer Screening. The paper describes the technique
and presents the results of classification (diagnosis)
and the corresponding measures of confidence of
the diagnostics. The main advantage of this approach
is the proven validity of its predictions. The paper also describes
an approach to improving early diagnosis of ovarian
and breast cancers: because the data in the United
Kingdom Collaborative Trial of Ovarian Cancer Screening
were collected over a period of seven years, they
allow changes in human serum to be observed
over that period. The significance of the improvement is
confirmed statistically (for up to 11 months for ovarian
cancer and 9 months for breast cancer). In addition,
the methodology allowed us to pinpoint the same mass
spectrometry peaks that were previously detected as carrying
statistically significant information for discrimination
between healthy and diseased patients. The results are
discussed.
Using random forest for reliable classification and cost-sensitive learning for medical diagnosis
Background: Most machine-learning classifiers output label predictions for new instances without indicating how reliable the predictions are. The applicability of these classifiers is limited in critical domains where incorrect predictions have serious consequences, such as medical diagnosis. Further, the default assumption of equal misclassification costs is most likely violated in medical diagnosis. Results: In this paper, we present a modified random forest classifier that is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme that uses Kolmogorov complexity to test the randomness of a particular sample with respect to the training sets. Our method is well calibrated: the performance level can be set prior to classification, and the accuracy rate is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor, which takes into account different misclassification costs for different classes and allows a different confidence level to be specified for each class. Extensive experiments on benchmark datasets and real-world applications show that the resulting classifier is well calibrated and able to control the specific risk of each class. Conclusion: Using the RF outlier measure to design a nonconformity measure benefits the resulting predictor. Further, a label-conditional classifier is developed and turns out to be an alternative approach to the cost-sensitive learning problem, one that relies on label-wise predefined confidence levels. The goal of minimizing the risk of misclassification is achieved by specifying a different confidence level for each class.
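A minimal sketch of the label-conditional idea described above, assuming nonconformity scores have already been computed by some method (the random forest outlier measure from the paper is not reproduced here). Each candidate label is tested only against calibration examples of the same class and against its own significance level. The function names and data layout are hypothetical.

```python
import numpy as np


def label_conditional_p_value(cal_scores, cal_labels, score, label):
    """p-value of a candidate (score, label), computed only against
    calibration examples that carry the same label."""
    cal_scores = np.asarray(cal_scores, dtype=float)
    cal_labels = np.asarray(cal_labels)
    same_class = cal_scores[cal_labels == label]
    return (np.sum(same_class >= score) + 1) / (same_class.size + 1)


def label_conditional_region(cal_scores, cal_labels, candidate_scores, epsilons):
    """candidate_scores: label -> nonconformity score of the test object
    if it were given that label. epsilons: label -> significance level.
    Returns the labels kept in the prediction region."""
    region = []
    for label, score in candidate_scores.items():
        p = label_conditional_p_value(cal_scores, cal_labels, score, label)
        if p > epsilons[label]:
            region.append(label)
    return region


# Toy calibration set: three examples of class 0, four of class 1.
cal_scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 0.9]
cal_labels = [0, 0, 0, 1, 1, 1, 1]
candidate_scores = {0: 0.25, 1: 0.85}

# A stricter (smaller) epsilon for one class makes that label harder to
# exclude, which is how class-specific error control is expressed here.
print(label_conditional_region(cal_scores, cal_labels, candidate_scores, {0: 0.1, 1: 0.45}))
print(label_conditional_region(cal_scores, cal_labels, candidate_scores, {0: 0.6, 1: 0.1}))
```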
Personalized prostate cancer management: AI-assisted prostate pathology and improved active surveillance
Prostate cancer is a major global health concern and is the most common cancer-related cause
of death in Sweden. Prostate cancer screening using PSA has been shown to reduce prostate
cancer mortality but also leads to significant overdiagnosis and overtreatment of low-risk cancers.
Improved risk stratification and effective active surveillance are crucial to balancing the
benefits of screening with the risk of overdiagnosis and overtreatment.
In Study I, we studied the uptake and the follow-up of active surveillance using a retrospective
cohort of patients who were diagnosed with low-risk prostate cancer between 2008 and 2017
in Stockholm County. Our results showed that only 50% of eligible active surveillance patients
received active surveillance as their primary treatment choice at diagnosis. Most men who
enrolled in active surveillance remained on surveillance during the first years after diagnosis
(82% during a median 3.5 years), but did not receive follow-up according to guidelines with
regard to repeat biopsies and PSA tests.
Current clinical practice has seen an increase in the use of magnetic resonance imaging (MRI)
and the incorporation of risk prediction models to select men with the highest suspicion of clinically
significant prostate cancer for prostate biopsy. However, the effectiveness of MRI
and risk prediction models in active surveillance follow-up, and how they should be incorporated, have yet to
be established. Study II evaluated the performance of MRI-targeted biopsies and a blood-based
risk prediction model (the Stockholm3 test) for monitoring disease progression in patients on
active surveillance and compared this to the conventional follow-up using PSA and systematic
biopsies. When MRI-targeted and systematic biopsies were combined, the detection rate
of clinically significant prostate cancer increased when compared to conventional systematic
biopsies. Performing biopsies only in MRI-positive men resulted in a 49% reduction in performed
biopsies, at the expense of failing to diagnose 1.4% clinically significant prostate cancer in MRI-negative
men. The incorporation of the Stockholm3 test showed a 27% reduction in required
MRI investigations and a 57% reduction in performed biopsies compared to performing only
systematic biopsies.
In Study III, we digitized biopsy cores from STHLM3 participants to develop an artificial
intelligence (AI) system for prostate cancer diagnostics. The AI system demonstrated clinically useful
performance that was comparable to that of the study pathologist for cancer detection (AUC
of 0.986) and for predictions of cancer length (correlation of 0.87), and its grading performance
was on par with that of expert prostate pathologists.
In Study IV, we developed a conformal predictor to estimate the uncertainty of the predictions
for the model in Study III. The uncertainty estimates were used to control the error rate so that
only predictions with high confidence are accepted and unreliable predictions can be detected.
The conformal predictor was able to identify unreliable predictions as a result of variations in
digital pathology scanners, preparation of tissue in different pathology laboratories, and the
existence of unusual prostate tissue that the AI model was not exposed to during training.
Little is known about the relationships between prostate cancer genetic risk factors and the
morphology of prostate tissue. In Study V, we investigated whether weakly supervised deep
learning can learn to detect such possible associations. The findings in this paper imply relationships
between prostatic tissue morphology and genetic risk factors for prostate cancer,
particularly in young men. These results provide proof of principle for exploring the use of
morphological information in multi-modal prostate cancer risk prediction algorithms.
In conclusion, the purpose of this thesis was to describe possible extensions to improve prostate
cancer active surveillance management, as well as to develop prediction models for improved
prostate cancer diagnostics.