    Wisdom of the Contexts: Active Ensemble Learning for Contextual Anomaly Detection

    In contextual anomaly detection (CAD), an object is considered anomalous only within a specific context. Most existing CAD methods use a single context based on a set of user-specified contextual features. However, identifying the right context can be very challenging in practice, especially in datasets with a large number of attributes. Furthermore, in real-world systems there may be multiple anomalies that occur in different contexts and therefore require a combination of several "useful" contexts to unveil them. In this work, we leverage active learning and ensembles to effectively detect complex contextual anomalies when the true contextual and behavioral attributes are unknown. We propose a novel approach, called WisCon (Wisdom of the Contexts), that automatically creates contexts from the feature set. Our method constructs an ensemble of multiple contexts with varying importance scores, based on the assumption that not all useful contexts are equally useful. Experiments show that WisCon significantly outperforms existing baselines in different categories (i.e., active classifiers, unsupervised contextual and non-contextual anomaly detectors, and supervised classifiers) on seven datasets. Furthermore, the results support our initial hypothesis that there is no single perfect context that successfully uncovers all kinds of contextual anomalies, and that leveraging the "wisdom" of multiple contexts is necessary.
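
    As a rough illustration of the context-ensemble idea (not the authors' implementation), the sketch below enumerates small candidate contexts, scores each point by how much its behavioral attributes deviate from peers with a similar context, and averages the per-context scores. Uniform weights stand in for the importance scores WisCon learns through active learning; all function and variable names are illustrative.

```python
import numpy as np
from itertools import combinations
from sklearn.cluster import KMeans

def context_scores(X, ctx_idx, beh_idx, n_groups=8):
    """Score each point by how far its behavioral attributes deviate
    from peers sharing a similar context (one candidate context)."""
    groups = KMeans(n_clusters=n_groups, n_init=10).fit_predict(X[:, ctx_idx])
    scores = np.zeros(len(X))
    for g in np.unique(groups):
        member = groups == g
        beh = X[member][:, beh_idx]
        center = beh.mean(axis=0)
        scores[member] = np.linalg.norm(beh - center, axis=1)
    return scores

def context_ensemble(X, max_ctx_size=2):
    """Average per-context anomaly scores over all small candidate
    contexts; uniform weights stand in for learned importance scores."""
    d = X.shape[1]
    all_scores = []
    for k in range(1, max_ctx_size + 1):
        for ctx in combinations(range(d), k):
            beh = [j for j in range(d) if j not in ctx]
            if beh:
                all_scores.append(context_scores(X, list(ctx), beh))
    return np.mean(all_scores, axis=0)
```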

    A Decision-Making Tool for Early Detection of Breast Cancer on Mammographic Images

    Breast cancer is one of the most dangerous cancers among women worldwide. Early detection of a breast abnormality in a mammogram can significantly decrease the death rate caused by breast cancer, so researchers have directed their focus and efforts toward better solutions. Whereas earlier work relied on semi-automatic machine learning algorithms, attention has recently shifted toward deep learning algorithms that extract features automatically. In this study, two pre-trained Convolutional Neural Network models, VGG16 and ResNet50, are applied to mammogram images to classify abnormalities into four categories: (1) benign calcification, (2) malignant calcification, (3) benign mass, and (4) malignant mass. The mammographic images of the CBIS-DDSM dataset are used. In the training phase, various experiments are performed on ROI images to decide on the best model configuration and fine-tuning depth. The experimental results showed that the VGG16 model provided a remarkable advancement over the ResNet50 model: VGG16 obtained an accuracy of 80.0%, whereas ResNet50 reached only 60.0% accuracy, performing almost at random. Besides accuracy, the other performance metrics used in this study are precision, recall, F1-score, and AUC. Our evaluation based on these metrics shows that both networks detect abnormalities accurately, with VGG16 being the more accurate. Finally, a decision support tool is developed that classifies full mammogram images into the four categories above based on the fine-tuned VGG16 architecture.
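
    A minimal sketch of this kind of transfer-learning setup, assuming Keras, 224x224 RGB inputs, and fine-tuning only the last VGG16 convolutional block; the classification head and hyperparameters are illustrative, not the study's exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 4  # benign/malignant calcification, benign/malignant mass

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
# Freeze early blocks; fine-tune only the last convolutional block,
# one plausible "fine-tuning depth" among those the study compared.
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),  # illustrative head
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])
```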

    Weakly Supervised Learning for Breast Cancer Prediction on Mammograms in Realistic Settings

    Automatic methods for early detection of breast cancer on mammography can significantly decrease mortality. Broad uptake of these methods in hospitals is currently hindered because they impose too many constraints: they assume annotations are available for single images or even regions of interest (ROIs), and a fixed number of images per patient. Neither assumption holds in a general hospital setting. Relaxing these assumptions results in a weakly supervised learning setting, where labels are available per case but not for individual images or ROIs. Not all images taken for a patient contain malignant regions, and the malignant ROIs cover only a tiny part of an image, whereas most image regions represent benign tissue. In this work, we investigate a two-level multi-instance learning (MIL) approach for case-level breast cancer prediction on two public datasets (1.6k and 5k cases) and an in-house dataset of 21k cases. Observing that breast cancer is usually present in only one side, while images of both breasts are taken as a precaution, we propose a domain-specific MIL pooling variant. We show that two-level MIL can be applied in realistic clinical settings where only case labels and a variable number of images per patient are available. Data in realistic settings scales with continuous patient intake, while manual annotation efforts do not. Hence, research should focus in particular on unsupervised ROI extraction in order to improve breast cancer prediction for all patients.
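
    The domain-specific pooling idea can be sketched as follows; this is a hypothetical PyTorch variant, not the paper's exact pooling. It pools image-level logits within each breast side, then takes the maximum over sides, reflecting the observation that cancer is usually one-sided.

```python
import torch
import torch.nn as nn

class CaseMIL(nn.Module):
    """Case-level MIL sketch: pool image scores within each breast side,
    then take the max over sides (cancer usually appears in one breast)."""
    def __init__(self, backbone, feat_dim=512):
        super().__init__()
        self.backbone = backbone           # any CNN mapping image -> feat_dim
        self.head = nn.Linear(feat_dim, 1)

    def forward(self, images, sides):
        # images: (n_images, C, H, W) for one case; sides: (n_images,) in {0, 1}
        logits = self.head(self.backbone(images)).squeeze(-1)  # (n_images,)
        side_scores = []
        for s in (0, 1):
            mask = sides == s
            if mask.any():
                side_scores.append(logits[mask].mean())  # within-side pooling
        # Case logit; train with e.g. nn.BCEWithLogitsLoss on the case label.
        return torch.stack(side_scores).max()
```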

    Studies on deep learning approach in breast lesions detection and cancer diagnosis in mammograms

    Breast cancer accounts for the largest proportion of newly diagnosed cancers in women. Early diagnosis of breast cancer can improve treatment outcomes and reduce mortality. Mammography is convenient and reliable and is the most commonly used method for breast cancer screening. However, manual examination is limited by cost and by the experience of radiologists, which introduces high false-positive rates and misdiagnoses. A high-performance computer-aided diagnosis (CAD) system is therefore important for lesion detection and cancer diagnosis. Traditional CAD systems for cancer diagnosis require a large number of manually selected features and retain a high false-positive rate. Methods based on deep learning can automatically extract image features through the network, but their performance is limited by multicenter data biases, the complexity of lesion features, and the high cost of annotations. This thesis aims to use deep learning methods to improve the performance and effectiveness of CAD systems for lesion detection and cancer diagnosis, optimized for the above problems. Starting from the detection of multiple lesion types with full consideration of the characteristics of mammography, the thesis explores microcalcification detection based on multiscale feature fusion and mass detection based on multi-view enhancing. It then develops a classification method based on multi-instance learning that integrates the detection results of the above methods to realize precise lesion detection and cancer diagnosis in mammography.

    For microcalcification detection, a network named MCDNet is proposed to overcome multicenter data biases, the low resolution of network inputs, and scale differences between microcalcifications. In MCDNet, Adaptive Image Adjustment mitigates the impact of multicenter biases and maximizes the effective input pixels. A pyramid network with shortcut connections then ensures that the feature maps used for detection contain precise localization and classification information about multiscale objects. Within this structure, a trainable Weighted Feature Fusion is proposed to improve detection of objects at both scales by learning the contribution of feature maps from different stages (see the sketch after this abstract). Experiments show that MCDNet outperforms other methods in robustness and precision: at an average of one false positive per image, the recall rates for benign and malignant microcalcifications are 96.8% and 98.9%, respectively. MCDNet can effectively help radiologists detect microcalcifications in clinical applications.

    For mass detection, a weakly supervised multi-view enhancing network named MVMDNet is proposed to address the lack of lesion-level labels. MVMDNet can be trained on image-level labels and extracts extra localization information by exploring the geometric relation between multi-view mammograms. In Multi-view Enhancing, Spatial Correlation Attention extracts corresponding location information between views, while a Sigmoid Weighted Fusion module fuses diagnostic and auxiliary features to improve localization precision. A CAM-based Detection module provides mass detections from the classification labels alone. Results on an in-house dataset and a public dataset, [email protected] and [email protected] (recall rate at an average number of false positives per image), demonstrate that MVMDNet achieves state-of-the-art performance among weakly supervised methods and generalizes robustly enough to alleviate multicenter biases.

    For cancer diagnosis, a breast cancer classification network named CancerDNet based on multi-instance learning is proposed. CancerDNet handles the complexity of lesion features in whole-image classification by utilizing the lesion detection results of the previous chapters. Whole Case Bag Learning combines the features extracted from the four views, working like a radiologist to classify each case. Low-capacity Instance Learning and High-capacity Instance Learning integrate the detections of multiple lesion types into CancerDNet, so the model fully considers lesions with complex features in the classification task. CancerDNet achieves AUCs of 0.907 and 0.925 on the in-house and public datasets, respectively, which is better than current methods and amounts to high-performance cancer diagnosis.

    Across these three parts, the thesis fully considers the characteristics of mammograms and proposes deep learning methods for lesion detection and cancer diagnosis. Experiments on in-house and public datasets show that the proposed methods achieve the state of the art in microcalcification detection, mass detection, and case-level cancer classification, with strong multicenter generalization. The results also show that these methods can effectively assist radiologists in making diagnoses while saving labor costs.
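
    As one plausible reading of the trainable Weighted Feature Fusion described above (the thesis' exact formulation is not given here), the sketch below fuses same-shape feature maps from different pyramid stages with learned non-negative weights normalized to sum to one, similar in spirit to BiFPN-style fusion.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFeatureFusion(nn.Module):
    """Fuse same-shape feature maps with learned non-negative weights
    (normalized to sum to 1), so the network can learn how much each
    pyramid stage contributes to detection."""
    def __init__(self, n_inputs):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))

    def forward(self, feats):
        # feats: list of tensors with identical shapes, one per stage
        w = F.relu(self.w)
        w = w / (w.sum() + 1e-6)
        return sum(wi * f for wi, f in zip(w, feats))

# Usage: fuse three stages of a feature pyramid (shapes must match,
# e.g. after resizing/1x1 convs upstream).
fusion = WeightedFeatureFusion(3)
p3, p4, p5 = (torch.randn(1, 64, 32, 32) for _ in range(3))
fused = fusion([p3, p4, p5])
```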

    Exploring variability in medical imaging

    Although recent successes of deep learning and novel machine learning techniques have improved the performance of classification and (anomaly) detection in computer vision problems, applying these methods in medical imaging pipelines remains very challenging. One of the main reasons is the amount of variability encountered and encapsulated in human anatomy and subsequently reflected in medical images. This fundamental factor impacts most stages of modern medical imaging processing pipelines. The variability of human anatomy makes it virtually impossible to build large labeled and annotated datasets for each disease for fully supervised machine learning. An efficient way to cope with this is to learn only from normal samples, since such data is much easier to collect. A case study of such an automatic anomaly detection system based on normative learning is presented in this work: a framework for detecting fetal cardiac anomalies during ultrasound screening using generative models trained only on normal/healthy subjects. However, despite significant improvements in automatic abnormality detection systems, clinical routine continues to rely exclusively on overburdened medical experts to diagnose and localise abnormalities. Integrating human expert knowledge into the medical imaging processing pipeline entails uncertainty, which is mainly correlated with inter-observer variability. From the perspective of building an automated medical imaging system, it is still an open issue to what extent this kind of variability, and the resulting uncertainty, is introduced during model training and how it affects the final task performance. It is therefore important to explore the effect of inter-observer variability both on the reliable estimation of a model's uncertainty and on the model's performance in a specific machine learning task. This issue is investigated thoroughly in this work by leveraging automated estimates of machine learning model uncertainty, inter-observer variability, and segmentation task performance on lung CT scans. Finally, an overview of existing anomaly detection methods in medical imaging is presented. This state-of-the-art survey includes both conventional pattern recognition methods and deep learning based methods, and is one of the first literature surveys in this specific research area.
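
    A generic sketch of learning only from normal samples; the work itself uses generative models for fetal cardiac ultrasound, whereas this reconstruction-based autoencoder is just an illustrative stand-in. The model is trained on normal images only, and test images with high reconstruction error are flagged as potential anomalies.

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Tiny convolutional autoencoder trained only on normal images;
    high reconstruction error at test time flags potential anomalies."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def anomaly_score(model, x):
    # Per-image mean squared reconstruction error as the anomaly score.
    with torch.no_grad():
        return ((model(x) - x) ** 2).flatten(1).mean(dim=1)
```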

    DEEP LEARNING BASED SEGMENTATION AND CLASSIFICATION FOR IMPROVED BREAST CANCER DETECTION

    Breast cancer is a leading killer of women globally. It is a serious health concern caused by calcifications or abnormal tissue growth in the breast. Screening and identifying the nature of a tumor as benign or malignant facilitates early intervention, which drastically decreases the mortality rate. This work uses ultrasound images, since they are easily accessible to most people and, unlike mammograms, rarely yield an unclear scan. In this thesis, the approach is to build a stacked model that makes predictions on the basis of the shape, pattern, and spread of the tumor. The typical steps are pre-processing of the images, followed by segmentation and classification. For pre-processing, the proposed approach uses histogram equalization, which improves the contrast of the image, makes the tumor stand out from its surroundings, and eases the segmentation step. For segmentation, the approach uses a UNet architecture with a ResNet backbone; the UNet architecture is designed specifically for biomedical imaging. The aim of segmentation is to separate the tumor from the ultrasound image so that the classification model can make its predictions from this mask. The segmentation model achieved an F1-score of 97.30%. For classification, a CNN base model extracts features from the provided masks, which are then fed into a network that makes the predictions. The base CNN model is ResNet50, with weights initialized from training on ImageNet, and the output network is a simple 8-layer network with ReLU activation in the hidden layers and softmax in the final decision-making layer. ResNet50 returns 2048 features from each mask, which are fed into the network for decision-making; the hidden layers have 1024, 512, 256, 128, 64, 32, and 10 neurons, respectively. The classification accuracy achieved by the proposed model was 98.61%, with an F1-score of 98.41%. Detailed experimental results are presented along with comparative data.
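
    The classification head can be sketched directly from the abstract, assuming Keras. The 2048-feature extractor and the hidden-layer widths from 1024 down to 10 follow the text, while the input size and the number of output classes (benign vs. malignant here) are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

NUM_CLASSES = 2  # assumed benign vs. malignant; not stated explicitly

# ResNet50 (ImageNet weights) reduces each segmentation mask, resized to
# an assumed 224x224 RGB input, to a 2048-dimensional feature vector.
extractor = ResNet50(weights="imagenet", include_top=False,
                     pooling="avg", input_shape=(224, 224, 3))

# The 8-layer head described in the abstract: seven ReLU hidden layers
# of 1024, 512, 256, 128, 64, 32, and 10 units, then a softmax output.
head = models.Sequential(
    [layers.Input(shape=(2048,))]
    + [layers.Dense(n, activation="relu")
       for n in (1024, 512, 256, 128, 64, 32, 10)]
    + [layers.Dense(NUM_CLASSES, activation="softmax")]
)
head.compile(optimizer="adam", loss="categorical_crossentropy",
             metrics=["accuracy"])
```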