11 research outputs found

    A Review on Classification of White Blood Cells Using Machine Learning Models

    Full text link
    The machine learning (ML) and deep learning (DL) models contribute to exceptional medical image analysis improvement. The models enhance the prediction and improve the accuracy by prediction and classification. It helps the hematologist to diagnose the blood cancer and brain tumor based on calculations and facts. This review focuses on an in-depth analysis of modern techniques applied in the domain of medical image analysis of white blood cell classification. For this review, the methodologies are discussed that have used blood smear images, magnetic resonance imaging (MRI), X-rays, and similar medical imaging domains. The main impact of this review is to present a detailed analysis of machine learning techniques applied for the classification of white blood cells (WBCs). This analysis provides valuable insight, such as the most widely used techniques and best-performing white blood cell classification methods. It was found that in recent decades researchers have been using ML and DL for white blood cell classification, but there are still some challenges. 1) Availability of the dataset is the main challenge, and it could be resolved using data augmentation techniques. 2) Medical training of researchers is recommended to help them understand the structure of white blood cells and select appropriate classification models. 3) Advanced DL networks such as Generative Adversarial Networks, R-CNN, Fast R-CNN, and faster R-CNN can also be used in future techniques.Comment: 23 page

    Imbalance Learning and Its Application on Medical Datasets

    Get PDF
    To gain more valuable information from the increasing large amount of data, data mining has been a hot topic that attracts growing attention in this two decades. One of the challenges in data mining is imbalance learning, which refers to leaning from imbalanced datasets. The imbalanced datasets is dominated by some classes (majority) and other under-represented classes (minority). The imbalanced datasets degrade the learning ability of traditional methods, which are designed on the assumption that all classes are balanced and have equal misclassification costs, leading to the poor performance on the minority classes. This phenomenon is usually called the class imbalance problem. However, it is usually the minority classes of more interest and importance, such as sick cases in the medical dataset. Additionally, traditional methods are optimized to achieve maximum accuracy, which is not suitable for evaluating the performance on imbalanced datasets. From the view of data space, class imbalance could be classified as extrinsic imbalance and intrinsic imbalance. Extrinsic imbalance is caused by external factors, such as data transmission or data storage, while intrinsic imbalance means the dataset is inherently imbalanced due to its nature.  As extrinsic imbalance could be fixed by collecting more samples, this thesis mainly focus on on two scenarios of the intrinsic imbalance,  machine learning for imbalanced structured datasets and deep learning for imbalanced image datasets.  Normally, the solutions for the class imbalance problem are named as imbalance learning methods, which could be grouped into data-level methods (re-sampling), algorithm-level (re-weighting) methods and hybrid methods. Data-level methods modify the class distribution of the training dataset to create balanced training sets, and typical examples are over-sampling and under-sampling. Instead of modifying the data distribution, algorithm-level methods adjust the misclassification cost to alleviate the class imbalance problem, and one typical example is cost sensitive methods. Hybrid methods usually combine data-level methods and algorithm-level methods. However, existing imbalance learning methods encounter different kinds of problems. Over-sampling methods increase the minority samples to create balanced training sets, which might lead the trained model overfit to the minority class. Under-sampling methods create balanced training sets by discarding majority samples, which lead to the information loss and poor performance of the trained model. Cost-sensitive methods usually need assistance from domain expert to define the misclassification costs which are task specified. Thus, the generalization ability of cost-sensitive methods is poor. Especially, when it comes to the deep learning methods under class imbalance, re-sampling methods may introduce large computation cost and existing re-weighting methods could lead to poor performance. The object of this dissertation is to understand features difference under class imbalance, to improve the classification performance on structured datasets or image datasets. This thesis proposes two machine learning methods for imbalanced structured datasets and one deep learning method for imbalance image datasets. The proposed methods are evaluated on several medical datasets, which are intrinsically imbalanced.  Firstly, we study the feature difference between the majority class and the minority class of an imbalanced medical dataset, which is collected from a Chinese hospital. After data cleaning and structuring, we get 3292 kidney stone cases treated by Percutaneous Nephrolithonomy from 2012 to 2019. There are 651 (19.78% ) cases who have postoperative complications, which makes the complication prediction an imbalanced classification task. We propose a sampling-based method SMOTE-XGBoost and implement it to build a postoperative complication prediction model. Experimental results show that the proposed method outperforms classic machine learning methods. Furthermore, traditional prediction models of Percutaneous Nephrolithonomy are designed to predict the kidney stone status and overlook complication related features, which could degrade their prediction performance on complication prediction tasks. To this end, we merge more features into the proposed sampling-based method and further improve the classification performance. Overall, SMOTE-XGBoost achieves an AUC of 0.7077 which is 41.54% higher than that of S.T.O.N.E. nephrolithometry, a traditional prediction model of Percutaneous Nephrolithonomy. After reviewing the existing machine learning methods under class imbalance, we propose a novel ensemble learning approach called Multiple bAlance Subset Stacking (MASS). MASS first cuts the majority class into multiple subsets by the size of the minority set, and combines each majority subset with the minority set as one balanced subsets. In this way, MASS could overcome the problem of information loss because it does not discard any majority sample. Each balanced subset is used to train one base classifier. Then, the original dataset is feed to all the trained base classifiers, whose output are used to generate the stacking dataset. One stack model is trained by the staking dataset to get the optimal weights for the base classifiers. As the stacking dataset keeps the same labels as the original dataset, which could avoid the overfitting problem. Finally, we can get an ensembled strong model based on the trained base classifiers and the staking model. Extensive experimental results on three medical datasets show that MASS outperforms baseline methods.  The robustness of MASS is proved over implementing different base classifiers. We design a parallel version MASS to reduce the training time cost. The speedup analysis proves that Parallel MASS could reduce training time cost greatly when applied on large datasets. Specially, Parallel MASS reduces 101.8% training time compared with MASS at most in our experiments.  When it comes to the class imbalance problem of image datasets, existing imbalance learning methods suffer from the problem of large training cost and poor performance.  After introducing the problem of implementing resampling methods on image classification tasks, we demonstrate issues of re-weighting strategy using class frequencies through the experimental result on one medical image dataset.  We propose a novel re-weighting method Hardness Aware Dynamic loss to solve the class imbalance problem of image datasets. After each training epoch of deep neural networks, we compute the classification hardness of each class. We will assign higher class weights to the classes have large classification hardness values and vice versa in the next epoch. In this way, HAD could tune the weight of each sample in the loss function dynamically during the training process. The experimental results prove that HAD significantly outperforms the state-of-the-art methods. Moreover, HAD greatly improves the classification accuracies of minority classes while only making a small compromise of majority class accuracies. Especially, HAD loss improves 10.04% average precision compared with the best baseline, Focal loss, on the HAM10000 dataset. At last, I conclude this dissertation with our contributions to the imbalance learning, and provide an overview of potential directions for future research, which include extensions of the three proposed methods, development of task-specified algorithms, and fixing the challenges of within-class imbalance.2021-06-0

    Intelligent Biosignal Processing in Wearable and Implantable Sensors

    Get PDF
    This reprint provides a collection of papers illustrating the state-of-the-art of smart processing of data coming from wearable, implantable or portable sensors. Each paper presents the design, databases used, methodological background, obtained results, and their interpretation for biomedical applications. Revealing examples are brain–machine interfaces for medical rehabilitation, the evaluation of sympathetic nerve activity, a novel automated diagnostic tool based on ECG data to diagnose COVID-19, machine learning-based hypertension risk assessment by means of photoplethysmography and electrocardiography signals, Parkinsonian gait assessment using machine learning tools, thorough analysis of compressive sensing of ECG signals, development of a nanotechnology application for decoding vagus-nerve activity, detection of liver dysfunction using a wearable electronic nose system, prosthetic hand control using surface electromyography, epileptic seizure detection using a CNN, and premature ventricular contraction detection using deep metric learning. Thus, this reprint presents significant clinical applications as well as valuable new research issues, providing current illustrations of this new field of research by addressing the promises, challenges, and hurdles associated with the synergy of biosignal processing and AI through 16 different pertinent studies. Covering a wide range of research and application areas, this book is an excellent resource for researchers, physicians, academics, and PhD or master students working on (bio)signal and image processing, AI, biomaterials, biomechanics, and biotechnology with applications in medicine

    Faster R-CNN model with momentum optimizer for RBC and WBC variants classification

    No full text
    Since many diseases and infections are dependent on the count and type of Red Blood Cells (RBCs) and White Blood Cells (WBCs) present in the blood stream, detection and classification pertaining to them is necessary and relevant. Based from existing related literature, ordinary Neural Networks are usually employed. Also, in existing researches, RBC types are the main focus. Hence, after observing research gaps, a Faster Region-based Convolutional Neural Network (Faster R-CNN) was utilized for this study, focusing not only on RBCs but also on the variants of WBCs. The aim is to have a fast and reliable system in order to achieve the goal of aiding the medical field in the classification of RBCs and WBCs. © 2020 IEEE

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    Get PDF
    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model copes with face-to-face dyadic interaction, assuming that the interactants are communicating through a continuous exchange of non verbal social signals, in addition to the spoken messages. Social signals have to be interpreted, thanks to a proper recognition phase that considers visual and audio information. The Brunswick model allows to quantitatively evaluate the quality of the interaction using statistical tools which measure how effective is the recognition phase. In this paper we cast this theory when one of the interactants is a robot; in this case, the recognition phase performed by the robot and the human have to be revised w.r.t. the original model. The model is applied to Berrick, a recent open-source low-cost robotic head platform, where the gazing is the social signal to be considered

    Enabling cardiovascular multimodal, high dimensional, integrative analytics

    Get PDF
    While traditionally the understanding of cardiovascular morbidity relied on the acquisition and interpretation of health data, the advances in health technologies has enabled us to collect far larger amount of health data. This thesis explores the application of advanced analytics that utilise powerful mechanisms for integrating health data across different modalities and dimensions into a single and holistic environment to better understand different diseases, with a focus on cardiovascular conditions. Different statistical methodologies are applied across a number of case studies supported by a novel methodology to integrate and simplify data collection. The work culminates in the different dataset modalities explaining different effects on morbidity: blood biomarkers, electrocardiogram recordings, RNA-Seq measurements, and different population effects piece together the understanding of a person morbidity. More specifically, explainable artificial intelligence methods were employed on structured datasets from patients with atrial fibrillation to improve the screening for the disease. Omics datasets, including RNA-sequencing and genotype datasets, were examined and new biomarkers were discovered allowing a better understanding of atrial fibrillation. Electrocardiogram signal data were used to assess the early risk prediction of heart failure, enabling clinicians to use this novel approach to estimate future incidences. Population-level data were applied to the identification of associations and temporal trajectory of diseases to better understand disease dependencies in different clinical cohorts

    Preface

    Get PDF

    Молодежь и современные информационные технологии : сборник трудов XIX Международной научно-практической конференции студентов, аспирантов и молодых учёных, 21-25 марта 2022 г., г. Томск

    Get PDF
    Сборник содержит доклады, представленные на XIX Международной научно-практической конференции студентов, аспирантов и молодых ученых «Молодежь и современные информационные технологии», прошедшей в Томском политехническом университете на базе Инженерной школы информационных технологий и робототехники. Материалы сборника отражают доклады студентов, аспирантов и молодых ученых, принятые к обсуждению на секциях: «Искусственный интеллект и машинное обучение», «Цифровизация, IT и цифровая экономика», «Дизайн и компьютерная графика», «Виртуальная и дополненная реальность», «Технология больших данных в индустрии», «Мехатроника и робототехника», «Автоматизация технологических процессов и производств». Сборник предназначен для специалистов в области информационных технологий, студентов и аспирантов соответствующих специальностей
    corecore