Wisdom of the Contexts: Active Ensemble Learning for Contextual Anomaly Detection
In contextual anomaly detection (CAD), an object is only considered anomalous
within a specific context. Most existing methods for CAD use a single context
based on a set of user-specified contextual features. However, identifying the
right context can be very challenging in practice, especially in datasets with
a large number of attributes. Furthermore, in real-world systems, there might
be multiple anomalies that occur in different contexts and, therefore, require
a combination of several "useful" contexts to unveil them. In this work, we
leverage active learning and ensembles to effectively detect complex contextual
anomalies in situations where the true contextual and behavioral attributes are
unknown. We propose a novel approach, called WisCon (Wisdom of the Contexts),
that automatically creates contexts from the feature set. Our method constructs
an ensemble of multiple contexts, with varying importance scores, based on the
assumption that not all useful contexts are equally so. Experiments show that
WisCon significantly outperforms existing baselines in different categories
(i.e., active classifiers, unsupervised contextual and non-contextual anomaly
detectors, and supervised classifiers) on seven datasets. Furthermore, the
results support our initial hypothesis that there is no single perfect context
that successfully uncovers all kinds of contextual anomalies, and leveraging
the "wisdom" of multiple contexts is necessary.
Comment: Submitted to IEEE TKD
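As an illustration of the idea only — not WisCon's actual algorithm, whose context generation, importance estimation, and active-learning loop are defined in the paper — combining anomaly scores from several candidate contexts with importance weights might be sketched as follows. The function names, the group-deviation scoring rule, and the min-max normalisation are all assumptions:

```python
import numpy as np

def context_anomaly_scores(X, contextual, behavioral):
    """Score each row by how far its behavioral values deviate from the
    mean of rows sharing the same (rounded) contextual values."""
    keys = [tuple(row) for row in X[:, contextual].round(0)]
    scores = np.zeros(len(X))
    for key in set(keys):
        idx = np.array([i for i, k in enumerate(keys) if k == key])
        group = X[idx][:, behavioral]
        scores[idx] = np.abs(group - group.mean(axis=0)).sum(axis=1)
    return scores

def context_ensemble(X, contexts, weights):
    """Weighted combination of per-context anomaly scores;
    `contexts` is a list of (contextual_cols, behavioral_cols) pairs."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalise importance scores
    combined = np.zeros(len(X))
    for (ctx, beh), w in zip(contexts, weights):
        s = context_anomaly_scores(X, ctx, beh)
        rng = s.max() - s.min()
        combined += w * ((s - s.min()) / rng if rng else s)
    return combined
```

A point that looks normal globally but deviates within its context group then receives the highest combined score.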
A Decision-Making Tool for Early Detection of Breast Cancer on Mammographic Images
Breast cancer is one of the most dangerous cancers among women worldwide. In medical practice, the early detection of a breast abnormality in a mammogram can significantly decrease the death rate caused by breast cancer. Therefore, researchers have directed their focus and efforts toward finding better solutions. Whereas researchers earlier used semi-automatic machine learning algorithms, attention has recently shifted toward deep learning algorithms that automatically extract features. In this study, two pre-trained Convolutional Neural Network models, VGG16 and ResNet50, have been applied to mammogram images to classify their abnormalities into (1) Benign Calcification, (2) Malignant Calcification, (3) Benign Mass, and (4) Malignant Mass. The mammographic images of the CBIS-DDSM dataset are used. In the training phase, various experiments are performed on ROI images to decide on the best model configuration and fine-tuning depth. The experimental results showed that the VGG16 model provided a remarkable advancement over the ResNet50 model: the first obtained an accuracy of 80.0%, whereas the second, at 60.0% accuracy, classified almost randomly. Apart from accuracy, the other performance metrics used in this study are precision, recall, F1-score, and AUC. Our evaluation based on these metrics shows that both networks achieve accurate detection, with VGG16 being the more accurate. Finally, a decision support tool is developed which classifies full mammogram images, based on the fine-tuned VGG16 architecture, into Benign Calcification, Malignant Calcification, Benign Mass, and Malignant Mass.
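For illustration, the transfer-learning pattern used here — a frozen pre-trained backbone whose original 1000-way top is replaced by a new softmax head for the four abnormality classes — can be sketched as below. The 512-dimensional pooled features, the random weights, and the batch size are stand-ins, not the authors' exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 4   # benign/malignant calcification, benign/malignant mass
FEAT_DIM = 512    # assumed pooled output width of the frozen backbone

# Hypothetical frozen-backbone features for a batch of 8 ROI crops.
features = rng.normal(size=(8, FEAT_DIM))

# New trainable classification head replacing the original top layer.
W = rng.normal(scale=0.01, size=(FEAT_DIM, NUM_CLASSES))
b = np.zeros(NUM_CLASSES)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

probs = softmax(features @ W + b)          # per-ROI class probabilities
```

Fine-tuning depth, as explored in the study, would amount to also unfreezing some backbone layers rather than training only `W` and `b`.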
Weakly Supervised Learning for Breast Cancer Prediction on Mammograms in Realistic Settings
Automatic methods for early detection of breast cancer on mammography can
significantly decrease mortality. Broad uptake of those methods in hospitals is
currently hindered because the methods have too many constraints. They assume
annotations available for single images or even regions-of-interest (ROIs), and
a fixed number of images per patient. Both assumptions do not hold in a general
hospital setting. Relaxing those assumptions results in a weakly supervised
learning setting, where labels are available per case, but not for individual
images or ROIs. Not all images taken for a patient contain malignant regions
and the malignant ROIs cover only a tiny part of an image, whereas most image
regions represent benign tissue. In this work, we investigate a two-level
multi-instance learning (MIL) approach for case-level breast cancer prediction
on two public datasets (1.6k and 5k cases) and an in-house dataset of 21k
cases. Observing that breast cancer is usually present in only one breast,
while images of both breasts are taken as a precaution, we propose a
domain-specific MIL pooling variant. We show that two-level MIL can be applied
in realistic clinical settings where only case labels and a variable number of
images per patient are available. Data in realistic settings scales with continuous
patient intake, while manual annotation efforts do not. Hence, research should
focus in particular on unsupervised ROI extraction, in order to improve breast
cancer prediction for all patients.
Comment: 10 pages, 5 figures, 5 tables
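A side-aware pooling of the kind described — first pooling image-level scores within each breast side, then across sides, since malignancy is usually confined to one side — might look like the sketch below. The paper's exact pooling operator may differ; this is an assumed max/max variant that naturally handles a variable number of images per patient:

```python
import numpy as np

def side_aware_mil_pool(image_scores, sides):
    """Two-level MIL pooling for one case: max over the images of each
    breast side, then max across the two sides."""
    sides = np.asarray(sides)
    image_scores = np.asarray(image_scores, dtype=float)
    per_side = [image_scores[sides == s].max() for s in np.unique(sides)]
    return max(per_side)
```

Because pooling is per side and then per case, a case with three left views and two right views is handled the same way as one with two of each — no fixed image count is assumed.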
Studies on deep learning approach in breast lesions detection and cancer diagnosis in mammograms
In recent years, breast cancer has accounted for the largest proportion of newly diagnosed cancers in women. Early diagnosis of breast cancer can improve treatment outcomes and reduce mortality. Mammography is convenient and reliable, and is the most commonly used method for breast cancer screening. However, manual examinations are limited by the cost and experience of radiologists, which leads to high rates of false positives and erroneous examinations. Therefore, a high-performance computer-aided diagnosis (CAD) system is significant for lesion detection and cancer diagnosis. Traditional CADs for cancer diagnosis require a large number of manually selected features and retain a high false positive rate. Methods based on deep learning can automatically extract image features through the network, but their performance is limited by multicenter data biases, the complexity of lesion features, and the high cost of annotations. Therefore, it is necessary to propose a CAD system, optimized for the above problems, that improves lesion detection and cancer diagnosis.
This thesis aims to utilize deep learning methods to improve the performance and effectiveness of CAD systems for lesion detection and cancer diagnosis. Starting from the detection of multi-type lesions using deep learning methods, with full consideration of the characteristics of mammography, this thesis explores a microcalcification detection method based on multiscale feature fusion and a mass detection method based on multi-view enhancing. Then, a classification method based on multi-instance learning is developed, which integrates the detection results from the above methods to realize precise lesion detection and cancer diagnosis in mammography.
For the detection of microcalcifications, a detection network named MCDNet is proposed to overcome the problems of multicenter data biases, the low resolution of network inputs, and scale differences between microcalcifications. In MCDNet, Adaptive Image Adjustment mitigates the impact of multicenter biases and maximizes the effective input pixels. The proposed pyramid network with shortcut connections then ensures that the feature maps used for detection contain more precise localization and classification information about multiscale objects. Within this structure, a trainable Weighted Feature Fusion is proposed to improve detection performance at both scales by learning the contribution of the feature maps from different stages. The experiments show that MCDNet outperforms other methods in robustness and precision. At an average of one false positive per image, the recall rates for benign and malignant microcalcifications are 96.8% and 98.9%, respectively. MCDNet can effectively help radiologists detect microcalcifications in clinical applications.
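The trainable weighted fusion described above — learning how much each pyramid stage contributes to the fused feature map — can be illustrated with softmax-normalised weights over same-shape feature maps. This is an assumed formulation, not necessarily MCDNet's exact one:

```python
import numpy as np

def weighted_feature_fusion(feature_maps, logits):
    """Fuse same-shape feature maps from different pyramid stages using
    learnable per-stage logits, softmax-normalised so weights sum to 1."""
    logits = np.asarray(logits, dtype=float)
    w = np.exp(logits - logits.max())          # stable softmax
    w = w / w.sum()
    fused = sum(wi * f for wi, f in zip(w, feature_maps))
    return fused, w
```

During training, the logits would be updated by backpropagation, so stages whose features help small-object detection gradually receive larger weights.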
For the detection of breast masses, a weakly supervised multi-view enhancing mass detection network named MVMDNet is proposed to address the lack of lesion-level labels. MVMDNet can be trained on image-level labeled datasets and extracts extra localization information by exploring the geometric relations between multi-view mammograms. In Multi-view Enhancing, Spatial Correlation Attention is proposed to extract corresponding location information between different views, while a Sigmoid Weighted Fusion module fuses diagnostic and auxiliary features to improve localization precision. A CAM-based Detection module is proposed to provide mass detections from the classification labels. The results of experiments on both an in-house dataset and a public dataset, [email protected] and [email protected] (recall rate @ average number of false positives per image), demonstrate that MVMDNet achieves state-of-the-art performance among weakly supervised methods and has robust generalization ability to alleviate multicenter biases.
In the study of cancer diagnosis, a breast cancer classification network named CancerDNet, based on multi-instance learning, is proposed. CancerDNet solves the problem that lesion features are complex in whole-image classification by utilizing the lesion detection results from the previous chapters. Whole Case Bag Learning is proposed to combine the features extracted from the four views, working like a radiologist to classify each case. Low-capacity Instance Learning and High-capacity Instance Learning integrate the detections of multi-type lesions into CancerDNet, so that the model can fully consider lesions with complex features in the classification task. CancerDNet achieves AUCs of 0.907 and 0.925 on the in-house and public datasets, respectively, which is better than current methods. The results show that CancerDNet achieves high-performance cancer diagnosis.
Across these three parts, this thesis fully considers the characteristics of mammograms and proposes deep learning based methods for lesion detection and cancer diagnosis. The results of experiments on in-house and public datasets show that the proposed methods achieve the state of the art in microcalcification detection, mass detection, and case-level cancer classification, and have strong multicenter generalization ability. The results also prove that the proposed methods can effectively assist radiologists in making diagnoses while saving labor costs.
Exploring variability in medical imaging
Although recent successes of deep learning and novel machine learning techniques have improved
the performance of classification and (anomaly) detection in computer vision problems, applying
these methods in medical imaging pipelines remains a very challenging task. One of the main
reasons for this is the amount of variability that is encountered and encapsulated in human
anatomy, and subsequently reflected in medical images. This fundamental factor impacts most
stages of modern medical imaging processing pipelines.
Variability of human anatomy makes it virtually impossible to build large datasets for each disease
with labels and annotation for fully supervised machine learning. An efficient way to cope with this is
to try and learn only from normal samples. Such data is much easier to collect. A case study of such
an automatic anomaly detection system based on normative learning is presented in this work. We
present a framework for detecting fetal cardiac anomalies during ultrasound screening using generative
models trained using only normal/healthy subjects.
However, despite the significant improvement in automatic abnormality detection systems,
clinical routine continues to rely exclusively on the contribution of overburdened medical
experts to diagnose and localise abnormalities. Integrating human expert knowledge into the
medical imaging processing pipeline entails uncertainty, which is mainly correlated with
inter-observer variability. From the perspective of building an automated medical imaging
system, it is still an open issue to what extent this kind of variability, and the resulting
uncertainty, is introduced during the training of a model and how it affects the final
performance of the task. Consequently, it is very important to explore the effect of
inter-observer variability both on the reliable estimation of a model’s uncertainty and on
the model’s performance in a specific machine learning task. A thorough investigation of this
issue is presented in this work by leveraging automated estimates of machine learning model
uncertainty, inter-observer variability, and segmentation task performance on lung CT scan images.
Finally, an overview of the existing anomaly detection methods in medical imaging is
presented. This state-of-the-art survey includes both conventional pattern recognition
methods and deep learning based methods, and is one of the first literature surveys
attempted in this specific research area.
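The normative-learning principle described in this abstract — fit a model to healthy data only and flag inputs it fails to reconstruct — can be sketched with a PCA subspace standing in for the generative model. The thesis uses generative models on fetal cardiac ultrasound; this sketch only shows the reconstruction-error scoring idea, with all names and the subspace choice assumed:

```python
import numpy as np

def fit_normative_model(normals, k=2):
    """'Train' on healthy samples only: learn a rank-k subspace (PCA)
    standing in for the generative model."""
    mean = normals.mean(axis=0)
    _, _, vt = np.linalg.svd(normals - mean, full_matrices=False)
    return mean, vt[:k]                        # mean + top-k components

def anomaly_score(x, model):
    """Reconstruction error: large when x falls off the normal subspace."""
    mean, comps = model
    z = (x - mean) @ comps.T                   # project onto the subspace
    recon = mean + z @ comps                   # reconstruct from the projection
    return float(np.linalg.norm(x - recon))
```

A sample drawn from the same distribution as the normals reconstructs almost perfectly, while a sample with structure the normal data never exhibits scores high, which is exactly the screening signal the abstract describes.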
DEEP LEARNING BASED SEGMENTATION AND CLASSIFICATION FOR IMPROVED BREAST CANCER DETECTION
Breast cancer is a leading killer of women globally. It is a serious health concern caused by calcifications or abnormal tissue growth in the breast. Screening and identifying the nature of a tumor as benign or malignant is important to facilitate early intervention, which drastically decreases the mortality rate. This work uses ultrasound images, since they are easily accessible to most people and have few drawbacks, unlike the other most common screening technique, mammography, which in some cases may not yield a clear scan. In this thesis, the approach is to build a stacked model which makes predictions on the basis of the shape, pattern, and spread of the tumor. The typical steps are pre-processing of the images, followed by segmentation and classification. For pre-processing, the proposed approach uses histogram equalization, which improves the contrast of the image, makes the tumor stand out from its surroundings, and eases the segmentation step. For segmentation, the approach uses a UNet architecture with a ResNet backbone; the UNet architecture was designed specifically for biomedical imaging. The aim of segmentation is to separate the tumor from the ultrasound image so that the classification model can make its predictions from this mask. The F1-score of the segmentation model was 97.30%. For classification, a CNN base model is used for feature extraction from the provided masks; the extracted features are then fed into a network that makes the predictions. The base CNN model is ResNet50, and the output network is a simple 8-layer network with ReLU activation in the hidden layers and softmax in the final decision-making layer. The ResNet weights are initialized from training on ImageNet. ResNet50 returns 2048 features from each mask, which are then fed into the network for decision-making.
The hidden layers of the neural network have 1024, 512, 256, 128, 64, 32, and 10 neurons, respectively. The classification accuracy achieved by the proposed model was 98.61%, with an F1-score of 98.41%. Detailed experimental results are presented along with comparative data.
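The head described above — 2048 ResNet50 features passed through ReLU hidden layers of 1024, 512, 256, 128, 64, 32, and 10 neurons into a softmax decision layer — can be sketched as a forward pass. The two-way output width (benign vs. malignant) and the random weights are assumptions, since the abstract does not state the output layer's size:

```python
import numpy as np

rng = np.random.default_rng(0)
# 2048 ResNet50 features -> 7 hidden layers -> assumed 2-way softmax output.
SIZES = [2048, 1024, 512, 256, 128, 64, 32, 10, 2]
params = [(rng.normal(scale=0.01, size=(i, o)), np.zeros(o))
          for i, o in zip(SIZES[:-1], SIZES[1:])]

def forward(x):
    """Forward pass: ReLU in every hidden layer, softmax at the output."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)             # ReLU
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)   # softmax decision layer
```

In the thesis pipeline, `x` would be the 2048-dimensional feature vector ResNet50 extracts from each segmentation mask.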