5 research outputs found

    Feature selection using correlation analysis and principal component analysis for accurate breast cancer diagnosis

    Breast cancer is one of the leading causes of death among women, more so than all other cancers. Accurate diagnosis of breast cancer is very difficult because of the complexity of the disease, changing treatment procedures and differing patient population samples. Diagnostic techniques with better performance are very important for personalized care and treatment, and for reducing and controlling the recurrence of cancer. The main objective of this research was to select features using correlation analysis and the variance of the input features before passing the significant features to a classification method. We used an ensemble method to improve the classification of breast cancer. The proposed approach was evaluated on the public Wisconsin Breast Cancer Dataset (WBCD). Correlation analysis and principal component analysis were used for dimensionality reduction. Performance was evaluated for well-known machine learning classifiers, and the seven best classifiers were chosen for the next step. Hyper-parameter tuning was performed to improve the performance of the classifiers. The best-performing classification algorithms were then combined using two different voting techniques. Hard voting predicts the class that receives the majority of the classifiers' votes, whereas soft voting predicts the class with the highest average predicted probability across classifiers. The proposed approach performed better than state-of-the-art work, achieving an accuracy of 98.24%, a precision of 99.29% and a recall of 95.89%.
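
The difference between the two voting schemes can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `hard_vote` and `soft_vote` and the three probability vectors are hypothetical, chosen only to show a case where the two schemes disagree.

```python
from collections import Counter


def hard_vote(labels):
    """Return the class label predicted by the most classifiers."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas):
    """Average the per-class probabilities and return the best class index."""
    n_classes = len(probas[0])
    avg = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])


# Hypothetical outputs of three classifiers for one sample:
# each row is [P(class 0), P(class 1)].
probas = [[0.55, 0.45], [0.60, 0.40], [0.10, 0.90]]

# Each classifier's own label is the class it rates most probable.
labels = [max(range(len(p)), key=lambda c: p[c]) for p in probas]

hard = hard_vote(labels)  # two of three classifiers vote for class 0
soft = soft_vote(probas)  # the averaged probabilities favor class 1
```

Here two weakly confident classifiers outvote one strongly confident classifier under hard voting, while soft voting lets the confident prediction dominate.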

    White Matter Hyperintensity and Multi-region Brain MRI Segmentation Using Convolutional Neural Network

    Accurate segmentation of white matter hyperintensity (WMH) from magnetic resonance images is a prerequisite for many precise medical procedures, especially the diagnosis of vascular dementia. Brain segmentation has important research significance and clinical application prospects, especially for the early detection of Alzheimer’s disease. To perform accurate segmentation according to the MRI characteristics of different brain regions, this thesis proposed an optimized 3D U-Net and used WMH segmentation as a pre-experiment to select good hyperparameters (i.e. network depth, image fusion method, and the implementation of the loss function) for constructing an image feature learning network with both long and short skip connections. Soft voting is used as the postprocessing procedure. Our model was evaluated with 10-fold cross-validation and achieved a Dice score of 0.78 for binary segmentation (WMH segmentation) and an accuracy of 0.96 for multi-class segmentation (segmentation of 139 brain regions), outperforming other methods.
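
For reference, the Dice score reported above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted mask A and a reference mask B. The sketch below is only an illustration on flattened binary masks. The function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the thesis.

```python
def dice_score(pred, target):
    """Dice coefficient for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks (1 = lesion voxel, 0 = background).
pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 1, 0]

score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```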

    Consciousness level assessment in completely locked-in syndrome patients using soft-clustering

    Brain-computer interfaces (BCIs) are very convenient tools for assessing the hidden states of consciousness of patients in the locked-in state (LIS) and the completely locked-in state (CLIS). For the time being, there is no ground-truth data on these states for such patients, and this lack of a gold standard makes the problem particularly challenging. In addition to consciousness assessment, BCIs provide these patients with a communication device that does not require motor responses, which they lack. Communication plays an important role in the patients' quality of life and prognosis. Significant progress has been made in providing them with EEG-based BCIs in particular. Nonetheless, the majority of existing studies move directly to communication without assessing whether the patient is even conscious. Additionally, the few studies that do assess consciousness rely mainly on evoked brain potentials, mostly the P300, which require the patient's voluntary and active participation to be elicited. Patients are easily fatigued and would consequently be less successful during the main communication task. Furthermore, when the consciousness states are determined using resting-state data, only one or two features were used. In this thesis, different sets of EEG features are used to assess the consciousness level of CLIS patients using resting-state data. This is a preliminary step that must succeed before moving on to the next step, communication with the patient. In other words, the 'conversation' is initiated only if the patient is sufficiently conscious. This variety of EEG features is used to increase the probability of correctly estimating the patients' consciousness states. Indeed, each feature captures a particular signal attribute, and combining them allows the collection of different hidden characteristics that could not have been obtained from a single feature.
Furthermore, the proposed method should make it possible to determine whether communication should be initiated with the patient at a specific time. The EEG features used are frequency-based, complexity-related and connectivity metrics. Moreover, instead of analysing results from individual channels or specific brain regions, the global activity of the brain is assessed. The estimated consciousness levels are then obtained by applying two different soft-clustering methods, namely Fuzzy c-means (FCM) and Gaussian Mixture Models (GMM), to the individual features and ensembling their results using their average or their product. The proposed approach is first applied to EEG data recorded from patients with disorders of consciousness (DoC), namely unresponsive wakefulness syndrome (UWS) and minimally conscious state (MCS), to evaluate its performance. It is subsequently applied to data from one CLIS patient, a dataset that is unique of its kind because it contains a time frame during which the experimenters affirmed that the patient was conscious. Finally, it is used to estimate the levels of consciousness of nine other CLIS patients. The results revealed that the presented approach was able to take into account the variations of the different features and deduce a single output that reflects the individual features' contributions. Some features performed better than others, which is not surprising since each person is different. The approach was also able to produce very accurate estimates of the level of consciousness under specific conditions. The approach presented in this thesis provides the medical staff with an additional diagnostic tool. Furthermore, when implemented online, it would make it possible to determine the optimal time to engage in communication with CLIS patients. Moreover, it could possibly be used to predict patients' cognitive decline and/or death.
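
The ensembling step described above, combining per-feature soft-clustering outputs by their average or their product, might be sketched as follows. This is an illustration under stated assumptions rather than the thesis's procedure: the membership values are invented, the function names are hypothetical, and renormalizing the product so that the memberships sum to one is an added assumption.

```python
from math import prod


def ensemble_average(memberships):
    """Average the cluster memberships obtained from each feature."""
    k = len(memberships[0])
    n = len(memberships)
    return [sum(m[c] for m in memberships) / n for c in range(k)]


def ensemble_product(memberships):
    """Multiply memberships across features, then renormalize (assumption)."""
    k = len(memberships[0])
    raw = [prod(m[c] for m in memberships) for c in range(k)]
    total = sum(raw)
    # Fall back to a uniform membership if every product is zero.
    return [r / total for r in raw] if total > 0 else [1.0 / k] * k


# Hypothetical memberships for one EEG time window from three features:
# each row is [membership in cluster 0, membership in cluster 1].
memberships = [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]

averaged = ensemble_average(memberships)
multiplied = ensemble_product(memberships)
```

The average treats every feature equally, whereas the product penalizes a cluster more heavily whenever any single feature assigns it a low membership.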

    Ensemble Machine Learning Model Generalizability and its Application to Indirect Tool Condition Monitoring

    A practical, accurate, robust, and generalizable system for monitoring tool condition during a machining process would enable advancements in manufacturing process automation, cost reduction, and efficiency improvement. Previously proposed systems using various individual machine learning (ML) models and other analysis techniques have struggled with low generalizability to new machining and environmental conditions, as well as a common reliance on expensive or intrusive sensory equipment which hinders their industry adoption. While ensemble ML techniques offer significant advantages over individual models in terms of performance, overfitting reduction, and generalizability improvement, they have only begun to see limited applications within the field of tool condition monitoring (TCM). To address the research gaps which currently surround TCM system generalizability and optimal ensemble model configuration for this application, nine ML model types, including five heterogeneous and homogeneous ensemble models, are employed for tool wear classification. Sound, spindle power, and axial load signals are utilized through the sensor fusion of practical external and internal machine sensors. This original experimental process data is collected through tool wear experiments using a variety of machining conditions. Four feature selection methods and multiple tool wear classification resolution values are compared for this application, and the performance of the ML models is compared across metrics including k-fold cross validation and leave-one-group-out cross validation. The generalizability of the models to data from unseen experiments and machining conditions is evaluated, and a method of improving the generalizability levels using noisy training data is examined. T-tests are used to measure the significance of model performance differences. 
The extra-trees ensemble ML method, which had never before been applied to signal-based TCM, shows the best performance of the nine models. M.S.
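
Leave-one-group-out cross-validation, mentioned above as one of the evaluation schemes, holds out all samples from one group (for example, one machining experiment) in each fold, so the model is always tested on a group it has not seen. The sketch below shows only the splitting logic. The function name and the group labels are hypothetical, and grouping by experiment is an assumption about how the thesis defines its groups.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each group."""
    for held_out in sorted(set(groups)):
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical group label (source experiment) for each of six samples.
groups = ["exp1", "exp1", "exp2", "exp3", "exp3", "exp3"]

folds = list(leave_one_group_out(groups))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Unlike ordinary k-fold splitting, no experiment contributes samples to both the training and test sides of a fold, which is what makes the score an estimate of generalizability to unseen conditions.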

    Advanced Computational Methods for Oncological Image Analysis

    Cancer is the second most common cause of death worldwide and encompasses highly variable clinical and biological scenarios. Some of the current clinical challenges are (i) early diagnosis of the disease and (ii) precision medicine, which allows for treatments targeted to specific clinical cases. The ultimate goal is to optimize the clinical workflow by combining accurate diagnosis with the most suitable therapies. To this end, large-scale machine learning research can define associations among clinical, imaging, and multi-omics studies, making it possible to provide reliable diagnostic and prognostic biomarkers for precision oncology. Such reliable computer-assisted methods (i.e., artificial intelligence), together with clinicians’ unique knowledge, can be used to properly handle typical issues in evaluation and quantification procedures (i.e., operator dependence and time-consuming tasks). These technical advances can significantly improve the repeatability of results in disease diagnosis and guide toward appropriate cancer care. Indeed, the need to apply machine learning and computational intelligence techniques has steadily increased, both to perform image processing operations effectively (such as segmentation, co-registration, classification, and dimensionality reduction) and to integrate multi-omics data.