3 research outputs found

Recommended from our members

Review of Immunotherapy Classification: Application Domains, Datasets, Algorithms and Software Tools from Machine Learning Perspective

Author: Abdullatif Amr R.A.
Mahmoud Ahsanullah Y.
Neagu Daniel
Scrimieri Daniele
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/12/2022

Immunotherapy treatments can be essential in some cases and a waste of valuable resources in others, depending on the diagnosis results. Researchers in immunotherapy therefore need to stay up to date on the current state of research by exploring application domains (e.g. warts), datasets (e.g. immunotherapy), classifiers or algorithms (e.g. kNN), and software tools. The research objectives were: 1) to study the published immunotherapy literature from a supervised machine learning perspective and to reproduce the immunotherapy classifiers reported in research papers; 2) to find gaps and challenges in both publications and practical work, which may form the basis for further research. Immunotherapy, diabetes, cryotherapy, and exasens data, along with "one unbalanced dataset", are explored. The results are compared with the published literature. To address the identified gaps, further research is suggested on novel experiments, unbalanced studies, a focus on effectiveness, and a new classifier algorithm.

Bradford Scholars

Early diagnosis and personalised treatment focusing on synthetic data modelling: Novel visual learning approach in healthcare

Author: Abdullatif Amr R.A.
Mahmoud Ahsanullah Y.
Neagu Daniel
Scrimieri Daniele
Publication venue
Publication date: 09/08/2023

The early diagnosis and personalised treatment of diseases are facilitated by machine learning. The quality of data has an impact on diagnosis because medical data are usually sparse and imbalanced and contain irrelevant attributes, resulting in suboptimal diagnosis. To address the impacts of data challenges, improve resource allocation, and achieve better health outcomes, a novel visual learning approach is proposed. This study contributes to the visual learning approach by determining whether less or more synthetic data are required to improve the quality of a dataset, such as the number of observations and features, according to the intended personalised treatment and early diagnosis. In addition, numerous visualisation experiments are conducted, using statistical characteristics, cumulative sums, histograms, correlation matrices, root mean square error, and principal component analysis to visualise both original and synthetic data and so address the data challenges. Real medical datasets for cancer, heart disease, diabetes, cryotherapy and immunotherapy are selected as case studies. As a benchmark and point of classification comparison in terms of metrics such as accuracy, sensitivity, and specificity, several models are implemented, including k-Nearest Neighbours and Random Forest. To simulate algorithm implementation and data, a Generative Adversarial Network is used to create and manipulate synthetic data, whilst Random Forest is implemented to classify the data. An amendable and adaptable system is constructed by combining the Generative Adversarial Network and Random Forest models. The system model presents the working steps, an overview and a flowchart. Experiments reveal that the majority of data-enhancement scenarios allow for the application of visual learning in the first stage of data analysis as a novel approach.
To achieve a meaningful, adaptable synergy between data of appropriate quality and optimal classification performance while maintaining statistical characteristics, visual learning provides researchers and practitioners with practical human-in-the-loop machine learning visualisation tools. Prior to implementing algorithms, the visual learning approach can be used to actualise early and personalised diagnosis. For the original immunotherapy data, Random Forest performed best, with precision, recall, f-measure, accuracy, sensitivity, and specificity of 81%, 82%, 81%, 88%, 95%, and 60%, respectively, compared with 91%, 96%, 93%, 93%, 96%, and 73% for the synthetic data. Future studies might examine the optimal strategies to balance the quantity and quality of medical data.
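The six metrics quoted in the abstract above can all be derived from a confusion matrix. The following is a minimal sketch using the standard binary definitions; the abstract does not say how its figures were computed (for example per class or averaged across classes), so the function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute common classification metrics from binary confusion counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives. Standard textbook definitions;
    the paper's exact averaging scheme is not stated in the abstract.
    """
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)   # identical to recall for the positive class
    specificity = _ratio(tn, tn + fp)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "precision": precision,
        "recall": sensitivity,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical counts, for illustration only.
for name, value in binary_metrics(tp=8, fp=2, tn=6, fn=4).items():
    print(f"{name}: {value:.2%}")
```

Under these binary definitions, recall and sensitivity are the same quantity. The abstract reports them as different values, so its precision, recall and f-measure were presumably computed under a different convention than its sensitivity and specificity.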

Bradford Scholars


Novel machine learning experiments with artificially generated big data from small immunotherapy datasets

Author: Abdullatif Amr R.A.
Mahmoud Ahsanullah Y.
Neagu Daniel
Scrimieri Daniele
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/12/2022

Big data and machine learning result in agile and robust healthcare by expanding raw data into useful patterns for data-enhanced decision support. The available datasets are mostly small and unbalanced, resulting in non-optimal classification when the algorithms are implemented. In this study, five novel machine learning experiments are conducted to address the challenges of small datasets by expanding them into big data and then applying Random Forests. The experiments are based on personalised, adaptable strategies for both balanced and unbalanced datasets. Multiple cryotherapy and immunotherapy datasets are considered; however, only immunotherapy is used here. In the first experiment, artificial data are generated by increasing the number of observations in the dataset, with each new dataset four times larger than the previous one, resulting in better classification. In the second experiment, the effect of volume on classification is considered based on the number of attributes. The attributes of each new dataset are built from conditional probabilities. Increasing the number of attributes beyond 879 made no difference to the classification obtained. In the third (simulation) experiment, classes of data are classified manually by dividing the data in a two-dimensional plane. This experiment is performed first on the small data and then on the expanded big data: by increasing the observations, an accuracy of 73.68% is attained. In the fourth experiment, visualising the enlarged data did not provide better insights. In the fifth experiment, the impact of correlations among the datasets' attributes on classification is observed; however, no improvement in performance is achieved. Comparing the classification results on the original and artificial data shows that the experiments generally improved performance.

The full text of this article will be released for public view at the end of the publisher embargo, 12 months after publication.
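The first experiment in the abstract above grows a small dataset in steps, each one four times larger than the last. The abstract does not describe how the artificial observations are generated, so the sketch below swaps in an assumed technique: resampling existing rows with small Gaussian jitter. The names `expand_fourfold` and `expansion_series`, the `jitter` parameter and the seed rows are all hypothetical.

```python
import random


def expand_fourfold(rows, rng, jitter=0.05):
    """Return a dataset four times the size of `rows`.

    Each row is a (features, label) pair. New rows are copies of randomly
    chosen existing rows with Gaussian noise added to every feature.
    This generation scheme is an assumption, not the paper's method.
    """
    expanded = list(rows)
    while len(expanded) < 4 * len(rows):
        features, label = rng.choice(rows)
        # Noise scale is proportional to the feature's magnitude.
        noisy = [x + rng.gauss(0.0, jitter * abs(x)) for x in features]
        expanded.append((noisy, label))
    return expanded


def expansion_series(rows, steps, seed=0):
    """Build the original dataset followed by `steps` fourfold expansions."""
    rng = random.Random(seed)
    series = [list(rows)]
    for _ in range(steps):
        series.append(expand_fourfold(series[-1], rng))
    return series


# Hypothetical seed rows: two features and a binary label.
seed_rows = [([22.0, 11.5], 1), ([35.0, 3.0], 0)]
for dataset in expansion_series(seed_rows, steps=3):
    print(len(dataset))
```

Each expanded dataset could then be passed to a Random Forest implementation of the reader's choice to compare classification results against the original data.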

Bradford Scholars