344 research outputs found

    Abnormality Detection in Mammography using Deep Convolutional Neural Networks

    Breast cancer is the most common cancer in women worldwide, and mammography is the most common screening technology. To reduce the cost and workload of radiologists, we propose a computer-aided detection approach for classifying and localizing calcifications and masses in mammogram images. To improve on conventional approaches, we apply deep convolutional neural networks (CNNs) for automatic feature learning and classifier building. In computer-aided mammography, deep CNN classifiers cannot be trained directly on full mammogram images because image detail is lost when the images are resized at the input layer. Instead, our classifiers are trained on labelled image patches and then adapted to work on full mammogram images to localize the abnormalities. State-of-the-art deep convolutional neural networks are compared on their performance in classifying the abnormalities. Experimental results indicate that VGGNet achieves the best overall classification accuracy, at 92.53%. For localizing abnormalities, ResNet is selected for computing class activation maps because it is ready to be deployed without structural change or further training. Our approach demonstrates that deep convolutional neural network classifiers have remarkable localization capabilities even though no supervision on the location of abnormalities is provided. Comment: 6 pages
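
A class activation map of the kind mentioned above can be sketched in a few lines. This is only an illustration of the general technique (a class-weighted sum of the last convolutional feature maps), not the authors' code; the function names, the toy feature maps and the weights are all hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of the last conv layer's feature maps for one class.

    feature_maps:  list of K maps, each an H x W list of lists
    class_weights: list of K weights linking each globally pooled map
                   to the score of the chosen class
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def hottest_location(cam):
    """Return (row, col) of the largest activation in the map."""
    best = max(
        (value, (y, x))
        for y, row in enumerate(cam)
        for x, value in enumerate(row)
    )
    return best[1]


# Toy example: two 2x2 feature maps with made-up values and weights.
maps = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [2.0, 0.0]],
]
weights = [0.5, 1.0]
cam = class_activation_map(maps, weights)
peak = hottest_location(cam)
```

Because the map reuses weights that a classifier with global average pooling already has, no location labels are needed, which is consistent with the abstract's remark that ResNet can be used without structural change.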

    Analysis of Mammographic Images for Early Detection of Breast Cancer Using Machine Learning Techniques

    Breast cancer is a leading cause of death among women. Radiographic images obtained from mammography equipment are among the most frequently used aids for early detection of breast cancer. The motivation behind this study is to focus on the tumour types seen in breast cancer images. It is a methodology for predicting disease from the visual diagnosis of breast tumour types with precision, particularly when numerous features are involved. Breast cancer (BC) is one such case, where the phenomenon is very complex and numerous features of the tumour types are involved. In the present investigation, various pattern recognition techniques were used to classify breast cancer from mammograms using image processing. Tumour image enhancement, segmentation, texture-based image feature extraction, and subsequent classification of breast cancer mammogram images were successfully performed. When two machine learning techniques, an Artificial Neural Network (ANN) and a Support Vector Machine (SVM), were used to classify 120 images, the ANN classifier achieved a classification rate of 91.31%, while the SVM classifiers with linear and Radial Basis Function (RBF) kernels achieved the highest rates, reported as 92.11% and, for the RBF kernel, 92.85%

    COMPUTER AIDED SYSTEM FOR BREAST CANCER DIAGNOSIS USING CURVELET TRANSFORM

    Breast cancer is a leading cause of death among women worldwide. Early detection is the key to improving breast cancer prognosis. Digital mammography remains one of the most suitable tools for early detection of breast cancer. Hence, there is a strong need for computer-aided diagnosis (CAD) systems that can help radiologists in decision making. The main goal is to increase the diagnostic accuracy rate. In this thesis we developed a computer-aided system for the diagnosis and detection of breast cancer using the curvelet transform. The curvelet transform is a multiscale transform that possesses directionality and anisotropy, and it overcomes some inherent limitations of wavelets in representing edges in images. We started this study by developing a diagnosis system. Five feature extraction methods were developed with curvelet and wavelet coefficients to differentiate between breast cancer classes. The results with curvelets and wavelets were compared. The experimental results show high performance for the proposed methods, with a classification accuracy of 97.30%. The thesis then provides an automatic system for breast cancer detection. An automatic thresholding algorithm was used to separate the area composed of the breast and the pectoral muscle from the background of the image. Subsequently, a region growing algorithm was used to locate the pectoral muscle and suppress it from the breast. Then, the work concentrates on the segmentation of the region of interest (ROI). Two methods are suggested for the segmentation stage: an adaptive thresholding method and a pattern matching method. Once the ROI has been identified, it is automatically cropped from the original mammogram. Subsequently, the suggested feature extraction methods were applied to the segmented ROIs.
Finally, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) classifiers were used to determine whether the region is abnormal or normal. At this level, the study focuses on two abnormality types (mammographic masses and architectural distortion). Experimental results show that the introduced methods have very high detection accuracies. The effectiveness of the proposed methods was tested on the Mammographic Image Analysis Society (MIAS) dataset. Throughout the thesis, all proposed methods and algorithms were applied with both curvelets and wavelets for comparison, and statistical tests were also performed. The overall results show that the curvelet transform performs better than the wavelet transform and that the difference is statistically significant
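
The region-growing step described above (locating the pectoral muscle from a seed and suppressing it) can be illustrated with a minimal sketch. The thesis abstract does not give its growth criterion, so the intensity tolerance, the 4-connectivity, the seed position and the tiny image below are all assumptions made for illustration.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Collect 4-connected pixels whose intensity is within
    `tolerance` of the seed pixel's intensity."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


def suppress(image, region, fill=0):
    """Return a copy of the image with the region's pixels set to `fill`."""
    return [
        [fill if (r, c) in region else value for c, value in enumerate(row)]
        for r, row in enumerate(image)
    ]


# Toy 3x3 "mammogram": a bright corner stands in for the pectoral muscle.
image = [
    [200, 210, 90],
    [205, 95, 100],
    [90, 100, 95],
]
muscle = region_grow(image, seed=(0, 0), tolerance=20)
cleaned = suppress(image, muscle)
```

In practice the seed would be placed in the image corner where the muscle appears, and the tolerance tuned to the dataset.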

    Consistent performance measurement of a system to detect masses in mammograms based on blind feature extraction

    BACKGROUND: Breast cancer continues to be a leading cause of cancer deaths among women, especially in Western countries. In the last two decades, many methods have been proposed to achieve a robust mammography-based computer-aided detection (CAD) system. A CAD system should provide high performance over time and in different clinical situations; that is, it should be adaptable to different clinical situations and provide consistent performance. METHODS: We tested our system to obtain a measure of how far its consistent performance can be guaranteed. The method is based on blind feature extraction by independent component analysis (ICA) and classification by neural network (NN) or SVM classifiers. The test mammograms were from the Digital Database for Screening Mammography (DDSM). This database was constructed collaboratively by four institutions over more than 10 years. We took advantage of this to train our system using the mammograms from each institution separately and then test it on the remaining mammograms. We performed another experiment to compare the results and thus obtain the measure sought. In this experiment, the learning sets were formed from all available prototypes regardless of the institution in which they were generated, yielding the overall results. RESULTS: Comparing the test-set results of each single-institution experiment (training on the mammograms from one institution and testing on the rest) with the overall result, the smallest variation in success rate at an intermediate decision threshold was roughly 5%, and the largest was roughly 17%. In terms of the area under the ROC curve, however, the smallest variation was close to 4% and the largest about 6%.
CONCLUSIONS: Considering the heterogeneity of the datasets used to train and test our system in each case, we think that the variation in performance relative to the overall results is acceptable in both cases, for NN and SVM classifiers. The present method is therefore very general, in that it is able to adapt to different clinical situations and provide consistent performance
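
The single-institution protocol described above (train on one institution's mammograms, test on all the others) amounts to a particular way of splitting the data. A minimal sketch follows; the record layout with an `institution` key and the institution labels are hypothetical, not the DDSM file format.

```python
def single_institution_splits(cases):
    """Yield (institution, train, test) where training uses one
    institution's cases and testing uses every other institution's."""
    institutions = sorted({case["institution"] for case in cases})
    for institution in institutions:
        train = [c for c in cases if c["institution"] == institution]
        test = [c for c in cases if c["institution"] != institution]
        yield institution, train, test


# Made-up case records; real data would carry features and labels too.
cases = [
    {"id": 1, "institution": "A"},
    {"id": 2, "institution": "A"},
    {"id": 3, "institution": "B"},
    {"id": 4, "institution": "C"},
]
splits = list(single_institution_splits(cases))
```

Each split's score can then be compared with the score from the pooled experiment to see how much performance depends on where the training images came from.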

    Breast Cancer : automatic detection and risk analysis through machine learning algorithms, using mammograms

    Integrated Master's thesis, Biomedical Engineering and Biophysics (Clinical Engineering and Medical Instrumentation), 2021, Universidade de Lisboa, Faculdade de Ciências. With 2.3 million cases diagnosed worldwide during 2020, breast cancer became the cancer with the highest incidence that year, considering both sexes. In Portugal, approximately seven thousand (7000) new cases of breast cancer are diagnosed annually, and one thousand eight hundred (1800) women die of the disease every year, a mortality rate of approximately 5 women per day. Most breast cancer diagnoses occur within screening programs that use mammography. This imaging technique has some problems: because the image is two-dimensional, tissues overlap, which can mask the presence of tumours; and it has poor sensitivity for denser breasts, which are characteristic of women at higher risk of breast cancer. Because these two problems make mammograms harder to read, a large part of this work focused on assessing the performance of computational methods in classifying mammograms into two classes: cancer and non-cancer. The "non-cancer" class (N = 159) consisted of healthy mammograms (N = 84) and mammograms containing benign lesions (N = 75). The "cancer" class contained only mammograms with malignant lesions (N = 73). The two classes were discriminated using machine learning algorithms. Multiple classifiers were optimized and trained (Ntrain = 162, Ntest = 70) using a previously selected set of features that describes the texture of the entire mammogram rather than a single Region of Interest. These texture features are based on a search for patterns: sequences of pixels with the same intensity, or specific pairs of pixels.
The classifier with the highest performance was one of the trained Support Vector Machines (SVM), with AUC = 0.875, which indicates performance between good and excellent. Percent Mammographic Density (%PD) is an important risk factor for developing the disease, so the study examined whether adding it to the selected feature set would improve classifier performance. The classifier with the greatest discriminative ability, trained and optimized using the texture features and the %PD calculations, was a Linear Discriminant Analysis (LDA), with AUC = 0.875. Since this performance equals that of the classifier using texture features alone, it is concluded that %PD does not appear to contribute relevant information. This may be because the texture features themselves already carry information about breast density. To study how the performance of these computational methods might be affected by poorer image acquisition conditions, Gaussian noise was simulated and added to the set of images used for testing. This noise, added to each image at four different magnitudes, resulted in an AUC of 0.765 for the lowest noise value and an AUC of 0.5 for the highest. These results indicate that at lower noise levels the classifier still maintains satisfactory performance, which is no longer the case at higher noise levels. The study also examined whether filtering with a median filter could help recover information lost when the noise was added. Applying the filter to all noisy images resulted in an AUC of 0.754 for the highest noise value, a performance similar to that of the least noisy image set before filtering (AUC = 0.765).
These results seem to indicate that, under poor acquisition conditions, applying a median filter can help recover information and thus lead to better performance of the computational methods. However, the same conclusion does not seem to hold for lower noise values, where the AUC after filtering ends up lower. This may indicate that, when the noise level is lower, the filtering technique not only removes noise but also removes information about the image texture. To check whether breasts of different densities affected classifier performance, three different test sets were created, each containing images of breasts of the same density (1, 2, and 3). The results indicate that an increase in the density of the breasts analysed does not necessarily reduce the ability to discriminate the defined classes (AUC = 0.864, AUC = 0.927, AUC = 0.905 for classes 1, 2, and 3 respectively). Using the whole image for texture analysis, and using images from different datasets (with different image dimensions), could introduce a bias into the classification, especially with respect to differing breast areas. To check this, the Pearson correlation coefficient was used (ρ = 0.3), showing that breast area (and the percentage of the image it occupies) correlates weakly with the classification given to each image. Besides serving as the basis for all the tests presented, the classifier was also used to build an interactive interface that can be run as an executable file without installing any software.
This application lets the user load mammography images, exclude background that is unnecessary for the image analysis, extract features, test the built classifier, and display on screen the class corresponding to the loaded image. Disease risk analysis was carried out by visually analysing how the texture feature values varied over the years for a small set (N = 11) of women. This analysis revealed what appears to be a trend shown only by women with the disease, in the mammogram immediately preceding diagnosis. All results obtained are described in depth throughout this document, which also gives a detailed account of all the methods used to obtain them. The result of the classification using texture features alone is within the range of values reported in the state of the art, indicating that the use of texture features by themselves proved fruitful. This result also suggests that the whole mammography image, without the laborious work of defining a Region of Interest, can be used with relative confidence. The results from the analysis of the effect of breast density and area also give confidence in the use of the classifier. The interactive interface that resulted from this first phase of work potentially has a varied set of applications: in the medical field, it could serve as a diagnostic aid for the physician; in computational analysis, it could be used to define the ground truth of potential datasets that lack defined labels. As for risk analysis, the use of a small dataset nevertheless made it possible to see that there are trends in the variation of the features over the years that are specific to women who developed the disease.
The results obtained therefore indicate that this line of work, seeking to assess or predict risk, should be continued, using not only more complete datasets but also machine learning methods.
Two million three hundred thousand breast cancer (BC) cases were diagnosed in 2020, making it the type of cancer with the highest incidence that year, considering both sexes. Breast cancer diagnosis usually occurs during screening programs using mammography, which has some downsides: the masking effect due to its 2-D nature, and its poor sensitivity for dense breasts. Since these issues make mammograms difficult to read, the main part of this work aimed to verify how a computer vision method would perform in classifying mammograms into two classes: cancer and non-cancer. The non-cancer group (N = 159) was composed of images with healthy tissue (N = 84) and images with benign lesions (N = 75), while the cancer group (N = 73) contained malignant lesions. To achieve this, multiple classifiers were optimized and trained (Ntrain = 162, Ntest = 70) with a previously selected sub-set of features that describe the texture of the entire image, instead of just one small Region of Interest (ROI). The classifier with the best performance was a Support Vector Machine (SVM) (AUC = 0.875), which indicates a good-to-excellent capability for discriminating the two defined groups. To assess whether Percent Mammographic Density (%PD), an important risk factor, added important information, a new classifier was optimized and trained using the selected sub-set of texture features plus the %PD calculation. The classifier with the best performance was a Linear Discriminant Analysis (LDA) (AUC = 0.875); since it achieves the same performance as the classifier using only texture features, this seems to indicate that the %PD calculations add no relevant information.
This happens because texture already includes information on breast density. To understand how the classifier would perform under worse image acquisition conditions, Gaussian noise was added to the test images (N = 70) at four different magnitudes (AUC = 0.765 for the lowest noise value vs. AUC ≈ 0.5 for the highest). A median filter was applied to the noisy images to evaluate whether information could be recovered. For the highest noise value, after filtering, the AUC was very close to the one obtained for the lowest noise value before filtering (0.754 vs 0.765), which indicates information recovery. The effect of density on classifier performance was evaluated by constructing three different test sets, each containing images from one density class (1, 2, 3). An increase in density did not necessarily result in a decrease in performance, which indicates that the classifier is robust to density variation (AUC = 0.864, AUC = 0.927, AUC = 0.905 for classes 1, 2, and 3 respectively). Since the entire image is analyzed, and the images come from different datasets, it was checked whether breast area was adding bias to the classification. The Pearson correlation coefficient was ρ = 0.22, showing that there is a weak correlation between these two variables. Finally, breast cancer risk was assessed by visual analysis of texture features through the years, for a small set of women (N = 11). This visual analysis revealed what seems to be a pattern among women who developed the disease, in the mammogram immediately before diagnosis. The details of each phase, as well as the associated final results, are described in depth throughout this document. The work done in the first classification task resulted in state-of-the-art performance, which may serve as a foundation for new research in the area without the laborious work of ROI definition. Besides that, the use of texture features alone proved to be fruitful.
Results concerning risk may serve as a basis for future work in the area, with larger datasets and the incorporation of Computer Vision methods
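
The noise-and-filter experiment described above can be sketched with two small helpers. The thesis does not state its noise magnitudes, filter window or pixel range, so the 3x3 window, the 0 to 255 clipping range, the sigma value and the toy images below are assumptions for illustration only.

```python
import random
from statistics import median


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise to every pixel, clipped to 0..255."""
    rng = random.Random(seed)
    return [
        [min(255.0, max(0.0, value + rng.gauss(0.0, sigma))) for value in row]
        for row in image
    ]


def median_filter3(image):
    """Replace each pixel by the median of its 3x3 neighbourhood
    (the window is truncated at the image borders)."""
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(median(window))
        filtered.append(out_row)
    return filtered


# A flat toy image with one corrupted pixel: the filter removes the outlier.
spiky = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
restored = median_filter3(spiky)

# Noisy version of a flat image (sigma chosen arbitrarily).
flat = [[100.0] * 4 for _ in range(4)]
noisy = add_gaussian_noise(flat, sigma=25.0, seed=1)
denoised = median_filter3(noisy)
```

The same smoothing that removes noise also flattens fine texture, which matches the abstract's observation that filtering lowered the AUC at low noise levels.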

    Multi-Model Approach and Fuzzy Clustering for Mammogram Tumor to Improve Accuracy

    Breast cancer is one of the most common diseases among women, seriously affecting health and threatening life. At present, mammography is one of the most important tools for diagnosing breast cancer. In this work, mammograms of 1024×1024 pixels are used as the dataset for breast cancer mass detection. This work investigates the performance of various classification approaches. Overall, the support vector machine (SVM) performs better than the other underlying models in terms of log-loss and classification accuracy. Therefore, further extensions (i.e., a multi-model ensemble method, a combination of fuzzy c-means (FCM) clustering and SVM, and an FCM-clustering-based SVM model) were developed and compared with the SVM. Segmentation by FCM clustering allows one piece of data to belong to two or more clusters. The additional steps use the segmented image to enhance the tumor shape. Simulation on mini-MIAS gives an accuracy of 91.39% and an area under the ROC curve of 0.964, which confirms the effectiveness of the proposed algorithm (FCM-based SVM). This method increases the classification accuracy in the case of a malignant tumor. The simulation is based on R software. This research was funded by the Spanish Government through grant RTI2018-094336-B-100 (MCIU/AEI/FEDER, UE) and by the Basque Government through grant IT1207-19
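
The statement that fuzzy c-means lets one data point belong to two or more clusters can be made concrete with the standard FCM update rules. The paper's simulation is in R and its settings are not given in the abstract; the sketch below uses Python, 1-D intensities, a fuzzifier of m = 2 and invented data, all of which are assumptions.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update for 1-D data.

    Each row holds one point's degree of membership in every cluster;
    the degrees in a row sum to 1.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        distances = [abs(x - c) for c in centers]
        if 0.0 in distances:
            # The point coincides with a center: full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in distances]
            row = [h / sum(hits) for h in hits]
        else:
            row = [
                1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
                for d_i in distances
            ]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard FCM center update: membership-weighted mean of the points."""
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        weighted_sum = sum(w * x for w, x in zip(weights, points))
        centers.append(weighted_sum / sum(weights))
    return centers


# Invented pixel intensities and two starting centers.
points = [1.0, 2.0, 5.0, 9.0, 10.0]
u = fcm_memberships(points, centers=[0.0, 10.0])
new_centers = fcm_centers(points, u)
```

Here the point at 5.0 sits midway between the two centers and receives a membership of 0.5 in each, which is the soft assignment the abstract refers to. Alternating the two updates until the centers stop moving gives the full algorithm.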

    ANN and Adaboost application for automatic detection of microcalcifications in breast cancer

    Objective: Microcalcifications (MCs) are considered to be the basic symptoms present in mammograms for breast cancer diagnosis. Therefore, the accurate detection of MCs is essential for timely diagnosis, effective treatment and reduction of mortality rates due to breast cancer. Mammogram analysis and interpretation is a challenging task, and there are many obstacles to the accurate detection of MCs, such as the small and non-uniform shape and size of MC clusters and the low contrast of MCs compared with the rest of the tissue. These shortcomings of manual interpretation of MCs raise the need for an automatic detection system to assist radiologists in mammogram analysis. In this study, an automated system has been developed to minimize manual inference and diagnose breast cancer with good precision. In this paper, we propose a two-stage detection algorithm. In the first stage, all suspicious regions of the mammogram are segmented out. In the next stage, these suspicious regions are fed to a classifier, which determines whether each region is normal, benign or malignant. We compared the performance of a Neural Network classifier with AdaBoost. The ANN classifier shows higher sensitivity and specificity but lower accuracy than AdaBoost on the tested images. Overall results show that the developed algorithm is able to achieve high accuracy and efficiency for the detection and diagnosis of breast cancer lesions for images from the two databases used, and also for mammograms obtained from a local hospital. Conclusion: The suggested algorithm was tested on the DDSM, MIAS and a local database and showed a high level of overall accuracy (98.68%) and sensitivity (80.15%)
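
The three figures of merit quoted above are related through the confusion matrix, and a short sketch shows how overall accuracy can sit well above sensitivity when normal regions dominate. The counts below are invented for illustration and are not taken from the paper.

```python
def screening_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true lesions detected
        "specificity": tn / (tn + fp),  # share of normal regions cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts: 10 lesion regions among 200 candidate regions.
metrics = screening_metrics(tp=8, fp=1, tn=189, fn=2)
```

With these made-up counts the accuracy is 98.5% while the sensitivity is only 80%, because the many correctly cleared normal regions dominate the accuracy figure.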