85 research outputs found

    Machine learning and feature selection methods for EGFR mutation status prediction in lung cancer

    The evolution of personalized medicine has shifted the therapeutic strategy from classical chemotherapy and radiotherapy to therapies targeted at specific genetic alterations. Biopsy is the traditional method for genetically characterizing lung cancer tumors, but it is an invasive and painful procedure for the patient. Nodule image features extracted from computed tomography (CT) scans have been used to create machine learning models that predict gene mutation status in a noninvasive, fast, and easy-to-use manner. However, recent studies have shown that radiomic features extracted from an extended region of interest (ROI) beyond the tumor might be more relevant for predicting the mutation status in lung cancer, and consequently may be used to significantly decrease the mortality rate of patients battling this condition. In this work, we investigated the relation between image phenotypes and the mutation status of Epidermal Growth Factor Receptor (EGFR), the most frequently mutated gene in lung cancer with several approved targeted therapies, using radiomic features extracted from the lung containing the nodule. A variety of linear, nonlinear, and ensemble predictive classification models, along with several feature selection methods, were used to classify the binary outcome of wild-type or mutant EGFR mutation status. The results show that a comprehensive approach using an ROI that included the lung with the nodule can capture relevant information and successfully predict the EGFR mutation status with increased performance compared to local nodule analyses. Linear Support Vector Machine, Elastic Net, and Logistic Regression, combined with the Principal Component Analysis feature selection method configured to retain 70% of the variance in the feature set, were the best-performing classifiers, reaching Area Under the Curve (AUC) values ranging from 0.725 to 0.737.
This holistic approach indicates that information from more extensive regions of the lung containing the nodule allows a more complete lung cancer characterization and should be considered in future radiogenomic studies. This work is financed by the ERDF (European Regional Development Fund) through the Operational Programme for Competitiveness and Internationalisation (COMPETE 2020 Programme) and by National Funds through the Portuguese funding agency, FCT (Fundação para a Ciência e a Tecnologia), within project POCI-01-0145-FEDER-030263.
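The abstract mentions two quantities that are easy to make concrete: a PCA step that keeps enough components to cover 70% of the variance, and the AUC used to score the classifiers. The following is a minimal sketch in plain Python, not the authors' pipeline. The function names and the toy numbers are assumptions made for illustration, and the variances passed to the first function stand in for the per-component variances a PCA would produce.

```python
def components_for_variance(explained_variances, target=0.70):
    """Return the smallest number of leading components whose cumulative
    share of the total variance reaches `target` (hypothetical helper)."""
    total = sum(explained_variances)
    ordered = sorted(explained_variances, reverse=True)
    cumulative = 0.0
    for k, variance in enumerate(ordered, start=1):
        cumulative += variance
        if cumulative / total >= target:
            return k
    return len(ordered)


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy values only: four component variances and four scored cases
# (1 = mutant, 0 = wild-type is an assumed encoding).
n_components = components_for_variance([5.0, 3.0, 1.0, 1.0], target=0.70)
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC loop is quadratic in the number of cases, which is acceptable for a sketch but would normally be replaced by a rank-based computation on larger cohorts.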

    A novel two-stage heart arrhythmia ensemble classifier

    Atrial fibrillation (AF) and ventricular arrhythmia (Arr) are among the most common and fatal cardiac arrhythmias in the world. Electrocardiogram (ECG) data, collected as part of the UK Biobank, represent an opportunity for analysis and classification of these two diseases in the UK. The main objective of our study is to investigate a two-stage model for the classification of individuals with AF and Arr in the UK Biobank dataset. The current literature addresses heart arrhythmia classification very extensively. However, the data used by most researchers lack enough instances of these common diseases. Moreover, by proposing the two-stage model and separating normal and abnormal cases, we have improved the performance of the classifiers in detecting each specific disease. Our approach consists of two stages of classification. In the first stage, features of the ECG input are classified into two main classes: normal and abnormal. In the second stage, the ECG features categorised as abnormal are further classified into the two diseases, AF and Arr. A diverse set of ECG features such as the QRS duration, PR interval and RR interval, as well as covariates such as sex, BMI, age and other factors, are used in the modelling process. For both stages, we use the XGBoost Classifier algorithm. The healthy population has been undersampled to tackle the class imbalance present in the data. This technique has been applied and evaluated using an ECG dataset from the UK Biobank repository of ECGs taken at rest. The main results of our paper are as follows: the classification performance of the proposed approach has been measured using F1 score, Sensitivity (Recall) and Specificity (Precision). The results of the proposed system are 87.22%, 88.55% and 85.95% for average F1 score, average sensitivity and average specificity, respectively.
Contribution and significance: The performance level indicates that automatic detection of AF and Arr in participants present in the UK Biobank is more precise and efficient if done in a two-stage manner. Automatic detection and classification of individuals with AF and Arr in this way would mean early diagnosis and prevention of more serious consequences later in their lives.
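The two-stage idea, the undersampling of the healthy group, and the reported metrics can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: the study uses XGBoost classifiers at both stages, whereas the threshold rules below are hypothetical stand-ins with no clinical meaning, and the feature names `rr_irregularity` and `qrs_ms` are invented for the example. The block computes precision separately from specificity, since the two are defined from different cells of the confusion matrix.

```python
import random


def undersample(records, majority="normal", seed=0):
    """Randomly drop majority-class records until that class is no larger
    than all remaining classes combined."""
    rng = random.Random(seed)
    majority_rows = [r for r in records if r["label"] == majority]
    minority_rows = [r for r in records if r["label"] != majority]
    keep = min(len(majority_rows), len(minority_rows))
    return rng.sample(majority_rows, keep) + minority_rows


def stage_one(features):
    """Stage 1: normal vs abnormal (hypothetical rule, not a clinical one)."""
    if features["rr_irregularity"] > 0.2 or features["qrs_ms"] > 120:
        return "abnormal"
    return "normal"


def stage_two(features):
    """Stage 2: split abnormal cases into AF vs Arr (hypothetical rule)."""
    return "AF" if features["rr_irregularity"] > 0.2 else "Arr"


def two_stage_predict(features):
    """Only cases flagged abnormal by stage 1 reach the stage 2 classifier."""
    if stage_one(features) == "normal":
        return "normal"
    return stage_two(features)


def binary_metrics(y_true, y_pred, positive):
    """One-vs-rest sensitivity, specificity, precision and F1 for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1}
```

In a fuller version, `stage_one` and `stage_two` would each wrap a trained model, and the per-class metrics would be averaged across classes to obtain summary figures of the kind quoted in the abstract.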

    Classification techniques using gray level co-occurrence matrix features for the detection of lung cancer using computed tomography imaging

    Lung cancer, which causes the majority of fatalities worldwide each year, is one of the deadliest diseases. The survival rate of cancer patients could be improved with better cancer detection methods. Image processing and machine learning have both been used to aid in lung cancer detection, but a method that both increases accuracy and improves a patient's survival rate has yet to be identified. In an effort to find the most effective method for accurate lung cancer recognition, this paper analyses and compares several classification algorithms. Lung computed tomography (CT) images are enhanced by removing noise using a median filter. The filtered image is then divided into distinct parts using threshold segmentation. From the segmented image, different features are extracted using the grey level co-occurrence matrix (GLCM). Several classification strategies, including support vector machine (SVM), random forest (RF), k-nearest neighbor (KNN), and decision tree (DT) methods, are used to classify lung images as malignant or normal based on the extracted features. The methods are evaluated using several performance measures: accuracy, precision, recall, and F1-score. Based on the experimental outcomes, SVM outperforms the other classification methods in accurately detecting lung cancer, with an accuracy of 99.32%.
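The preprocessing chain described here (median filtering, threshold segmentation, GLCM feature extraction) can be illustrated on a tiny image held as a list of lists. This is a rough sketch in plain Python under assumed settings, not the paper's code: the 3x3 window, the horizontal offset `(0, 1)`, the number of grey levels, and the choice of contrast, angular second moment and homogeneity as features are all assumptions made for the example.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace each pixel by the median of its neighbourhood; windows are
    clipped at the image border, and median_low keeps integer grey levels."""
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [image[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            row.append(median_low(window))
        filtered.append(row)
    return filtered


def threshold(image, level):
    """Binary segmentation: 1 where the pixel is at or above `level`."""
    return [[1 if pixel >= level else 0 for pixel in row] for row in image]


def glcm(image, levels, offset=(0, 1)):
    """Normalised grey level co-occurrence matrix for one pixel offset.
    Pixel values must be integers in range(levels)."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Three common texture descriptors computed from a normalised GLCM."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p))]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "asm": sum(v ** 2 for _, _, v in cells),  # angular second moment
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Toy 4x4 image with four grey levels (0..3).
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
features = glcm_features(glcm(toy, levels=4))
```

The resulting feature dictionary is the kind of vector that would then be passed to a classifier such as an SVM or a random forest.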

    Women in Artificial Intelligence (AI)

    This Special Issue, entitled "Women in Artificial Intelligence", includes 17 papers from leading women scientists. The papers cover a broad scope of research areas within Artificial Intelligence, including machine learning, perception, reasoning and planning, among others. The papers have applications in relevant fields such as human health, finance, and education. It is worth noting that the Issue includes three papers that deal with different aspects of gender bias in Artificial Intelligence. All the papers have a woman as the first author. We can proudly say that these women are from countries worldwide, such as France, Czech Republic, United Kingdom, Australia, Bangladesh, Yemen, Romania, India, Cuba and Spain. In conclusion, apart from its intrinsic scientific value as a Special Issue combining interesting research works, this Special Issue intends to increase the visibility of women in AI, showing where they are, what they do, and how they contribute to developments in Artificial Intelligence from their different places, positions, research branches and application fields. We planned to issue this book on Ada Lovelace Day (11/10/2022), a date internationally dedicated to the first computer programmer, a woman who had to fight the gender difficulties of her time, in the 19th century. We also thank the publisher for making this possible, thus allowing this book to become a part of the international activities dedicated to celebrating the value of women in ICT all over the world. With this book, we want to pay homage to all the women who have contributed over the years to the field of AI.