57 research outputs found

    Classifying tree structures using elastic matching of sequence encodings

    This document is the Accepted Manuscript version of the following article: Angeliki Skoura, Iosif Mporas, Vasileios Megalooikonomou, ‘Classifying tree structures using elastic matching of sequence encodings’, Neurocomputing, Vol. 163, pp. 151-159, February 2015. The Version of Record is available online at: DOI: https://doi.org/10.1016/j.neucom.2014.08.083. This Manuscript version is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way. Structures of tree topology are frequently encountered in nature and in a range of scientific domains. In this paper, a multi-step framework is presented to classify tree topologies, introducing the idea of elastic matching of their sequence encodings. Initially, representative sequences of the branching topologies are obtained using node labeling and tree traversal schemes. The similarity between tree topologies is then quantified by applying elastic matching techniques. The resulting sequence alignment reveals corresponding node groups, providing a better understanding of matching tree topologies. The new similarity approach is explored using various classification algorithms and is applied to a medical dataset, outperforming state-of-the-art techniques by at least 6.6% and 3.5% in terms of absolute specificity and accuracy, respectively.
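The two steps named in the abstract above (sequence encoding of a tree, then elastic matching) can be illustrated with a minimal sketch. The abstract does not say which node labeling, traversal scheme or elastic matching technique the authors used, so everything here is an assumption: nodes are labeled with their number of children, the traversal is pre-order, and dynamic time warping (DTW) stands in for the elastic matcher. The dictionary tree format and the names `encode_preorder` and `dtw_distance` are hypothetical.

```python
from typing import Dict, List

# Hypothetical tree format: each node id maps to the ordered list of its children.
Tree = Dict[str, List[str]]


def encode_preorder(tree: Tree, root: str) -> List[int]:
    """Label every node with its number of children and list the labels in pre-order."""
    sequence: List[int] = []
    stack = [root]
    while stack:
        node = stack.pop()
        children = tree.get(node, [])
        sequence.append(len(children))
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(children))
    return sequence


def dtw_distance(a: List[int], b: List[int]) -> float:
    """Dynamic time warping cost, using the absolute label difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two small made-up trees that differ in which child of the root branches again.
tree_a: Tree = {"r": ["x", "y"], "x": ["p", "q"]}
tree_b: Tree = {"r": ["x", "y"], "y": ["p", "q"]}
seq_a = encode_preorder(tree_a, "r")
seq_b = encode_preorder(tree_b, "r")
distance = dtw_distance(seq_a, seq_b)
```

A pairwise distance of this kind could then feed a distance-based classifier such as nearest neighbour, although the abstract only states that "various classification algorithms" were explored.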

    Feature selection using correlation analysis and principal component analysis for accurate breast cancer diagnosis

    Breast cancer is one of the leading causes of death among women, more so than all other cancers. Accurate diagnosis of breast cancer is very difficult because of the complexity of the disease, changing treatment procedures and differing patient population samples. Diagnostic techniques with better performance are very important for personalized care and treatment and for reducing and controlling the recurrence of cancer. The main objective of this research was to select features using correlation analysis and the variance of the input features before passing the significant features to a classification method. We used an ensemble method to improve the classification of breast cancer. The proposed approach was evaluated using the public WBCD dataset (Wisconsin Breast Cancer Dataset). Correlation analysis and principal component analysis were used for dimensionality reduction. Performance was evaluated for well-known machine learning classifiers, and the best seven classifiers were chosen for the next step. Hyper-parameter tuning was performed to improve the performance of the classifiers. The best-performing classification algorithms were combined using two different voting techniques. Hard voting predicts the class that gets the majority vote, whereas soft voting predicts the class with the highest probability. The proposed approach performed better than state-of-the-art work, achieving an accuracy of 98.24%, high precision (99.29%) and a recall of 95.89%.
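The difference between the two voting techniques described above can be shown with a small sketch. It assumes the usual definition of soft voting (average the per-class probabilities of all classifiers, then pick the largest), which the abstract does not spell out. The probabilities are made up for illustration, and the function names `hard_vote` and `soft_vote` are hypothetical.

```python
from collections import Counter
from typing import Sequence


def hard_vote(predicted_labels: Sequence[int]) -> int:
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predicted_labels).most_common(1)[0][0]


def soft_vote(class_probabilities: Sequence[Sequence[float]]) -> int:
    """Average the per-class probabilities over classifiers and return the top class index."""
    n_models = len(class_probabilities)
    n_classes = len(class_probabilities[0])
    mean = [sum(p[c] for p in class_probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


# Made-up outputs of three classifiers for one sample: [P(benign), P(malignant)].
probabilities = [[0.55, 0.45], [0.60, 0.40], [0.05, 0.95]]
# Each classifier's hard label is its most probable class.
labels = [max(range(2), key=lambda c: p[c]) for p in probabilities]

hard_result = hard_vote(labels)
soft_result = soft_vote(probabilities)
```

In this example two weakly confident classifiers outvote one strongly confident classifier under hard voting, while soft voting follows the confident one, so the two schemes can disagree on the same sample.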

    AI-enhanced diagnosis of challenging lesions in breast MRI: a methodology and application primer

    Computer-aided diagnosis (CAD) systems have become an important tool in the assessment of breast tumors with magnetic resonance imaging (MRI). CAD systems can be used for the detection and diagnosis of breast tumors as a “second opinion” review complementing the radiologist’s review. CAD systems have many common parts, such as image pre-processing, tumor feature extraction and data classification, that are mostly based on machine learning (ML) techniques. In this review paper, we describe the application of ML-based CAD systems in MRI of the breast, covering the detection of diagnostically challenging lesions such as non-mass enhancing (NME) lesions, multiparametric MRI, neo-adjuvant chemotherapy (NAC) and radiomics, all applied to NME. Since ML has been widely used in the medical imaging community, we provide an overview of the state-of-the-art and novel techniques applied as classifiers in CAD systems. The differences between the CAD systems in MRI of the breast for several standard and novel applications for NME are explained in detail to provide important examples illustrating: (i) CAD for detection and diagnosis, (ii) CAD in multiparametric imaging, (iii) CAD in NAC and (iv) breast cancer radiomics. We aim to provide a comparison between these CAD applications and to give a global view of intelligent CAD systems based on ANN in MRI of the breast.

    Color and morphological features extraction and nuclei classification in tissue samples of colorectal cancer

    Cancer is an important public health problem and the third leading cause of death in North America. Among the highest-impact types of cancer are colorectal, breast, lung, and prostate. This thesis addresses feature extraction using different artificial intelligence algorithms that provide distinct solutions for the purpose of Computer-Aided Diagnosis (CAD). For example, classification algorithms are employed in identifying histological structures, such as lymphocytes, cancer-cell nuclei and glands, from features like existence, extension or shape. The morphological aspect of these structures indicates the degree of severity of the related disease. In this paper, we use a large dataset of 5000 images to classify eight different tissue types in the case of colorectal cancer. We compare results with another dataset. We perform image segmentation and extract statistical information about the area, perimeter, circularity, eccentricity and solidity of the interest points in the image. Finally, we use and compare four popular machine learning techniques, i.e., Naive Bayes, Random Forest, Support Vector Machine and Multilayer Perceptron, to classify and to improve the precision of category assignment. The performance of each algorithm was measured using three types of metrics (precision, recall and F1-score), representing a contribution that complements the existing literature in a quantitative way. The large number of images has helped us to circumvent the overfitting and reproducibility problems. The main contribution is the use of new characteristics different from those already studied: this work investigates the color and morphological characteristics in the images that may be useful for performing tissue classification in colorectal cancer histology.
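Several of the shape descriptors listed above have common textbook definitions that are easy to sketch. The abstract does not give the formulas used, so the ones below are assumptions: circularity as 4πA/P² and solidity as region area divided by convex hull area, both computed on a polygonal outline rather than on a pixel mask. All function names are hypothetical, and eccentricity is left out.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Shoelace formula; points are the outline vertices in order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points: List[Point]) -> float:
    """Sum of the edge lengths of the closed outline."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence) -> List[Point]:
        chain: List[Point] = []
        for p in sequence:
            # Drop points that would make a clockwise or straight turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half_hull(pts) + half_hull(reversed(pts))


def circularity(points: List[Point]) -> float:
    """Assumed definition: 4*pi*area / perimeter**2 (equals 1.0 for a perfect circle)."""
    return 4.0 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2


def solidity(points: List[Point]) -> float:
    """Assumed definition: region area divided by the area of its convex hull."""
    return polygon_area(points) / polygon_area(convex_hull(points))


# Made-up outlines: a 2x2 square and an L-shaped (concave) region.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
```

A convex outline such as the square has solidity 1.0, while the concave L-shape scores lower, which is why solidity is often used as a cue for irregular boundaries.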

    Automatic BIRAD scoring of breast cancer mammograms

    A computer-aided diagnosis (CAD) system is developed to fully characterize masses, classify them as benign or malignant, and predict BIRAD (Breast Imaging Reporting and Data System) scores using mammographic image data. The CAD includes a preprocessing step to de-noise mammograms. This is followed by an active contour segmentation that deforms an initial curve, annotated by a radiologist, to separate and define the boundary of a mass from the background. A feature extraction scheme was then used to fully characterize a mass by extracting the most relevant features that have a large impact on the outcome of a patient biopsy. For this, thirty-five medical and mathematical features based on the intensity, shape and texture of the mass were extracted. Several feature selection schemes were then applied to select the most dominant features for use in the next step, classification. Finally, hierarchical classification schemes were applied to this subset of features, first to classify masses as benign (BIRAD score 2) or malignant (BIRAD score over 4), and second to sub-classify masses with a BIRAD score over 4 into three classes (BIRAD scores 4a, 4b and 4c). Segmentation accuracy was evaluated by calculating the degree of overlap between the active contour segmentation and the manual segmentation, and the result was 98.5%. The reproducibility of the active contour under different manual initializations of the algorithm by three radiologists was also assessed, and the result was 99.5%. Classification performance was evaluated using one hundred sixty masses (80 masses with BIRAD score 2 and 80 masses with BIRAD score over 4).
The best result for classifying the data as benign or malignant was found using a combination of sequential forward floating selection (SFFS) of features and a boosted tree hybrid classifier with the AdaBoost ensemble method, decision tree learner type and a 100-learner regression tree classifier, achieving 100% sensitivity and specificity with the hold-out method, 99.4% with the cross-validation method and 98.62% average accuracy with the cross-validation method. For the further sub-classification of the eighty malignant masses with a BIRAD score over 4 (30 masses with BIRAD score 4a, 30 masses with BIRAD score 4b and 20 masses with BIRAD score 4c), the best result was achieved using the boosted tree with the bagging ensemble method and decision tree learner type with 200 learners, achieving 100% sensitivity and specificity with the hold-out method, and 98.8% accuracy and 98.41% average accuracy over ten runs with the cross-validation method. Besides those 160 masses (BIRAD score 2 and over 4), 13 masses with a BIRAD score of 3 were gathered. A score of 3 means that the patient is recommended to be examined with another medical imaging technique and to have a follow-up in six months. The CAD system was trained with masses with BIRAD scores of 2 and over 4; it was further tested using the 13 masses with a BIRAD score of 3, and the CAD results are shown to agree with the radiologist’s classification after confirmation at the six-month follow-up. The present results demonstrate high sensitivity and specificity of the proposed CAD system compared to prior research. The present research is therefore intended to contribute to the field by proposing a novel CAD system, consisting of a series of well-selected image processing algorithms, first to classify masses as benign or malignant, second to sub-classify BIRAD 4 into three groups and finally to reinterpret BIRAD 3 as BIRAD 2 without the need for a follow-up study.
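The sensitivity and specificity figures quoted above follow the standard confusion-matrix definitions, sketched here for a binary benign/malignant split. The labels are made up and are not the study's data. The function name is hypothetical, and the sketch assumes both classes occur in `y_true`.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) with 1 = malignant (positive) and 0 = benign."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: share of malignant masses that were detected.
    # Specificity: share of benign masses that were correctly cleared.
    return tp / (tp + fn), tn / (tn + fp)


# Made-up ground truth and predictions for eight masses.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
```

Reporting both values matters in this setting because a classifier can reach high accuracy while still missing malignant cases.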

    Archives of Data Science, Series A. Vol. 1,1: Special Issue: Selected Papers of the 3rd German-Polish Symposium on Data Analysis and Applications

    The first volume of Archives of Data Science, Series A is a special issue containing a selection of contributions originally presented at the 3rd Bilateral German-Polish Symposium on Data Analysis and Its Applications (GPSDAA 2013). All selected papers fit into the emerging field of data science, consisting of the mathematical sciences (computer science, mathematics, operations research, and statistics) and an application domain (e.g. marketing, biology, economics, engineering).

    Breast Cancer Analysis in DCE-MRI

    Breast cancer is the most common tumour in women worldwide, with about 2 million new cases diagnosed each year (second most common cancer overall). This disease represents about 12% of all new cancer cases and 25% of all cancers in women. Early detection of breast cancer is one of the key factors in determining the prognosis for women with malignant tumours. The standard diagnostic tool for the detection of breast cancer is x-ray mammography. The disadvantage of this method is its low specificity, especially in the case of radiographically dense breast tissue (young or under-forty women), or in the presence of scars and implants within the breast. Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) has demonstrated great potential in the screening of high-risk women for breast cancer, in staging newly diagnosed patients and in assessing therapy effects. However, due to the large amount of information, manual examination of DCE-MRI is error prone, and the data can hardly be inspected without the use of a Computer-Aided Detection and Diagnosis (CAD) system. Breast imaging analysis is made harder by the dynamical characteristics of soft tissues, since any patient movement (such as involuntary movement due to breathing) may affect the voxel-by-voxel dynamical analysis. Breast DCE-MRI computer-aided analysis needs a pre-processing stage to identify breast parenchyma and reduce motion artefacts. Among the major issues in developing CAD for breast DCE-MRI is the detection and classification of lesions according to their aggressiveness. Moreover, it would be convenient to identify those subjects who are unlikely to respond to the treatment so that a modification may be applied as soon as possible, relieving them from potentially unnecessary or toxic treatments. In this thesis, an automated CAD system is presented. The proposed CAD aims to support the radiologist in lesion detection, diagnosis and therapy assessment after a suitable preprocessing stage.
Segmentation of breast parenchyma has been addressed relying on fuzzy binary clustering, breast anatomical priors and morphological refinements. The breast mask extraction module combines three 2D Fuzzy C-Means clusterings (executed on the three projections: axial, coronal and transversal) with a geometrical characterization of breast anatomy. In particular, seven well-defined key-points have been considered in order to accurately segment breast parenchyma from air and chest wall. To diminish the effects of involuntary movement artefacts, it is usual to apply a motion correction to the DCE-MRI volumes before any data analysis. However, there is no evidence that a single Motion Correction Technique (MCT) can handle different deformations (small or large, rigid or non-rigid) and different patients or tissues. Therefore, it would be useful to develop a quality index (QI) to evaluate the performance of different MCTs. Existing QIs might not be adequate for DCE-MRI data because of the intensity variation due to contrast media. Therefore, in developing a novel QI, the underlying idea is that once DCE-MRI data have been realigned using a specific MCT, the dynamic course of the signal intensity should be as close as possible to physiological models, such as the currently accepted ones (e.g. Tofts-Kermode, Extended Tofts-Kermode, Hayton-Brady, Gamma Capillary Transit Time, etc.). The motion correction module ranks all the MCTs using the QI, selects the best MCT and applies the correction before further data analysis. The proposed lesion detection module performs the segmentation of lesions in Regions of Interest (ROIs) by means of classification at the pixel level. It is based on a Support Vector Machine (SVM) trained with dynamic features, extracted from a suitably pre-selected area using a pixel-based approach. The pre-selection mask strongly improves the final result.
The lesion classification module evaluates the malignancy of each ROI by means of 3D textural features. The Local Binary Patterns descriptor has been used in the Three Orthogonal Planes (LBP-TOP) configuration. A Random Forest has been used to achieve the final classification of a lesion as benign or malignant. The therapy assessment stage aims to predict the recurrence of the patient's primary tumour to support the physician in evaluating the effects and benefits of the therapy. For each patient who has at least one malignant lesion, the recurrence of the disease has been evaluated by means of a multiple classifier system. A set of dynamic, textural, clinicopathologic and pharmacokinetic features has been used to assess the probability of recurrence for the lesions. Finally, to improve the usability of the proposed work, we developed a framework for tele-medicine that allows advanced remote analysis of medical images in a secure and versatile client-server environment, at a low cost. The benefits of using the proposed framework will be presented in a real-case scenario where OsiriX, a widespread medical image analysis software package, is enabled to perform advanced remote image processing in a simple manner over a secure channel. The proposed CAD system has been tested on real breast DCE-MRI data for the available protocols. The breast mask extraction stage shows a median segmentation accuracy and Dice similarity index of 98% (+/-0.49) and 93% (+/-1.48), respectively, and 100% neoplastic lesion coverage. The motion correction module is able to rank the MCTs with an agreement of 74% with a 'reference ranking'. Moreover, by using only 40% of the available volume, the computational load is reduced while the best MCT is still always selected. The automatic detection maximises the area of correctly detected lesions while minimising the number of false alarms, with an accuracy of 99%, and the lesions are then diagnosed according to their stage with an accuracy of 85%.
The therapy assessment module provides a forecast of tumour recurrence with an accuracy of 78% and an AUC of 79%. Each module has been evaluated by a leave-one-patient-out approach, and results show a confidence level of 95% (p<0.05). Finally, the proposed remote architecture showed a very low transmission overhead, which settles at about 2.5% for the widespread 10/100 Mbps networks. Security has been achieved using client-server certificates and up-to-date standards.
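The Dice similarity index reported for the breast mask extraction stage has a standard definition, 2|A∩B| / (|A| + |B|), sketched below on small binary masks. The masks are made up for illustration. The name `dice_index` is hypothetical, and returning 1.0 when both masks are empty is an assumed convention.

```python
from typing import List

Mask = List[List[int]]


def dice_index(mask_a: Mask, mask_b: Mask) -> float:
    """Dice similarity between two binary masks of identical shape."""
    size_a = sum(1 for row in mask_a for value in row if value)
    size_b = sum(1 for row in mask_b for value in row if value)
    overlap = sum(
        1
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
        if a and b
    )
    if size_a + size_b == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


# Made-up 3x3 masks: an automatic segmentation and a manual reference.
automatic = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
manual = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
score = dice_index(automatic, manual)
```

Unlike pixel accuracy, the Dice index ignores the background pixels both masks agree on, which may explain why the two figures quoted for the same stage differ.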

    An automatic system for classification of breast cancer lesions in ultrasound images

    Breast cancer is the most common of all cancers and the second most deadly cancer in women in developed countries. Mammography and ultrasound imaging are the standard techniques used in cancer screening. Mammography is widely used as the primary tool for cancer screening; however, it is an invasive technique because of the radiation used. Ultrasound seems to be good at picking up many cancers missed by mammography. In addition, ultrasound is non-invasive, as no radiation is used, and is portable and versatile. However, ultrasound images usually have poor quality because of multiplicative speckle noise, which results in artifacts. Because of this noise, segmentation of suspected areas in ultrasound images is a challenging task that remains an open problem despite many years of research. In this research, a new method for automatic detection of suspected breast cancer lesions using ultrasound is proposed. In this fully automated method, new de-noising and segmentation techniques are introduced and a high-accuracy classifier using a combination of morphological and textural features is used. We use a combination of fuzzy logic and compounding to denoise ultrasound images and reduce shadows. We introduce a new method to identify the seed points and then use a region growing method to perform segmentation. For preliminary classification we use three classifiers (ANN, AdaBoost, FSVM) and then use majority voting to get the final result. We demonstrate that our automated system performs better than other state-of-the-art systems. On our database containing ultrasound images for 80 patients, we reached an accuracy of 98.75%, versus 88.75% for the ABUS method and 92.50% for the Hybrid Filtering method. Future work will involve a larger dataset of ultrasound images, and we will extend our system to handle colour ultrasound images. We will also study the impact of a larger number of texture and morphological features, as well as of a weighting scheme, on the performance of our classifier.
We will also develop an automated method to identify the "wall thickness" of a mass in breast ultrasound images. Presently, the wall thickness is extracted manually with the help of a physician.
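The region growing step mentioned in the abstract above can be sketched as a breadth-first expansion from a seed pixel. The abstract gives neither the homogeneity criterion nor the seed detection method, so this sketch assumes a given seed, 4-connectivity, and a fixed tolerance around the seed intensity. The function name `region_grow` and the toy image are hypothetical.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def region_grow(image: List[List[int]], seed: Pixel, tolerance: int) -> Set[Pixel]:
    """Collect 4-connected pixels whose intensity is within `tolerance` of the seed's."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region: Set[Pixel] = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                # Assumed criterion: compare against the seed intensity only.
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Made-up grey-level image with a dark 2x2 blob surrounded by brighter tissue.
image = [
    [200, 198, 90, 95],
    [201, 60, 62, 92],
    [199, 58, 61, 94],
    [202, 200, 197, 96],
]
lesion = region_grow(image, seed=(1, 1), tolerance=10)
```

Comparing against the seed intensity rather than a running region mean keeps the sketch simple but makes the result sensitive to the seed choice, which is one reason automatic seed selection matters.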