
    Immersive analytics for oncology patient cohorts

    This thesis proposes a novel interactive immersive analytics tool and methods to interrogate cancer patient cohorts in an immersive virtual environment, namely Virtual Reality to Observe Oncology data Models (VROOM). The overall objective is to develop an immersive analytics platform comprising a data analytics pipeline from raw gene expression data to immersive visualisation on virtual and augmented reality platforms, built on a game engine; Unity3D was used to implement the visualisation. The work in this thesis could provide oncologists and clinicians with an interactive visualisation and visual analytics platform that helps them drive their analysis of treatment efficacy and achieve the goal of evidence-based personalised medicine. The thesis integrates the latest discoveries and developments in cancer patient prognosis, immersive technologies, machine learning, decision support systems and interactive visualisation into an immersive analytics platform for complex genomic data. The experimental paradigm followed in this thesis is the study of transcriptomics in cancer samples: gene expression data are investigated to determine the biological similarity revealed by the transcriptomic profiles of patients' tumour samples, which indicate the genes active in different patients. In summary, the thesis contributes: i) a novel immersive analytics platform for patient cohort data interrogation in a similarity space based on the patients' biological and genomic similarity; ii) an effective immersive environment optimisation design based on a usability study of exocentric and egocentric visualisation, and audio and sound design optimisation; iii) an integration of trusted and familiar 2D biomedical visual analytics methods into the immersive environment; iv) a novel use of game theory as the decision-making engine to support the analytics process, and an application of optimal transport theory to missing data imputation that preserves the data distribution; and v) case studies showcasing the real-world application of the visualisation and its effectiveness.
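
    As a hedged illustration of the similarity-space idea (not the thesis's actual pipeline), the sketch below places patients in 3D coordinates from a synthetic gene expression matrix; the matrix dimensions, the cosine dissimilarity and the MDS embedding are all assumptions made for this example.

```python
# Minimal sketch: place patients in a 3D "similarity space" derived from gene
# expression, in the spirit of cohort interrogation in VROOM. The expression
# matrix and its dimensions are illustrative placeholders.
import numpy as np
from sklearn.metrics.pairwise import cosine_distances
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
expr = rng.lognormal(size=(60, 5000))   # 60 patients x 5000 genes (synthetic)

log_expr = np.log1p(expr)               # variance-stabilising transform
dist = cosine_distances(log_expr)       # pairwise transcriptomic dissimilarity

# Embed the cohort into 3D coordinates suitable for a VR scene (e.g. Unity3D).
mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dist)
print(coords.shape)                     # (60, 3): one position per patient
```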

    Computational Sleep Behaviour Analysis and Application

    Sleep affects a person's health and is, therefore, assessed if health problems arise. Sleep behaviour is monitored for abnormalities in order to determine whether any treatments, such as medication or behavioural changes (modifications to sleep habits), are necessary. Assessments are typically done using two methods: polysomnography over short periods and four-week retrospective questionnaires. These standard methods, however, cannot measure current sleep status continuously and unsupervised over long periods of time in the way home-based sleep behaviour assessment can. In this work, we investigate sleep behaviour assessment using IoT devices in a natural home environment, whose potential has not yet been fully explored, to enable early abnormality detection and facilitate self-management. We developed a framework that incorporates different facets and perspectives to introduce focus and support in sleep behaviour assessment. The framework considers users' needs, the various available technologies, and the factors that influence sleep behaviour. Sleep analysis approaches are incorporated to increase the reliability of the system. This assessment is strengthened by sleep stage detection and sleep position recognition: first, influence factors are extracted and integrated into sleep stage recognition methods to create a fine-grained personalised approach; second, common but more complex sleep positions, including leg positions, are detected. The relations between medical conditions and sleep are assessed through interviews with doctors and users on various topics, including treatment satisfaction and technology acceptance. The findings from these interviews led to the investigation of sleep behaviour as a diagnostic indicator. Changes in sleep behaviour are assessed alongside medical knowledge using data mining techniques to extract information about disease development; the diseases of interest were sleep apnoea, hypertension, diabetes, and chronic kidney disease. The proposed framework is designed so that it can be integrated into existing smart home environments. We believe that our framework provides promising building blocks for reliable sleep behaviour assessment by incorporating newly developed sleep analysis approaches: a modular layered sleep behaviour assessment framework, a sleep regularity algorithm, a user-dependent visualisation concept, a higher-granularity sleep position analysis approach, a fine-grained sleep stage detection approach, a personalised sleep parameter extraction process, an in-depth understanding of the relations between sleep and chronic disease, and a sleep-wake behaviour-based chronic disease detection method. This work has been supported by the European Union's Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant agreement No. 676157.
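
    The thesis's sleep regularity algorithm is not specified in the abstract; as a hedged sketch, the snippet below computes the commonly used Sleep Regularity Index (SRI), which scores how well epoch-by-epoch sleep/wake states match those 24 hours later. The epoch length and the synthetic week of data are assumptions.

```python
# Minimal sketch of a sleep regularity index (SRI). This follows the commonly
# used definition SRI = 200 * P(state at t equals state at t + 24h) - 100,
# not necessarily the thesis's own algorithm.
import numpy as np

def sleep_regularity_index(states: np.ndarray, epochs_per_day: int) -> float:
    """states: binary sleep/wake vector (1 = asleep), one value per epoch."""
    a, b = states[:-epochs_per_day], states[epochs_per_day:]
    agreement = np.mean(a == b)        # fraction of epochs matching 24h later
    return 200.0 * agreement - 100.0   # 100 = perfectly regular, -100 = inverted

# Example: 7 days of 30-second epochs with a perfectly regular 8h sleep window.
epochs_per_day = 24 * 60 * 2
day = np.r_[np.ones(8 * 60 * 2), np.zeros(16 * 60 * 2)]
week = np.tile(day, 7)
print(round(sleep_regularity_index(week, epochs_per_day), 1))  # 100.0
```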

    Document-level sentiment analysis of email data

    Sisi Liu investigated machine learning methods for document-level sentiment analysis of email data. She developed a systematic framework that has been qualitatively and quantitatively shown to be effective and efficient in identifying sentiment in massive amounts of email data. Analytical results obtained from the document-level email sentiment analysis framework support better decision making in various business settings.
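
    The thesis's framework itself is not reproduced here; as a generic, hedged baseline for document-level sentiment classification, the sketch below trains a TF-IDF and logistic regression pipeline on a few illustrative email snippets.

```python
# Minimal baseline sketch for document-level sentiment classification of
# emails. The texts and labels are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

emails = [
    "Thanks for the quick turnaround, the report looks great.",
    "This is unacceptable, the shipment is late again.",
    "Happy with the support we received this quarter.",
    "Very disappointed with the billing errors this month.",
]
labels = ["positive", "negative", "positive", "negative"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(emails, labels)
print(clf.predict(["The meeting went really well, great job everyone."]))
```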

    A survey of the application of soft computing to investment and financial trading


    Emotion and Stress Recognition Related Sensors and Machine Learning Technologies

    This book includes impactful chapters which present scientific concepts, frameworks, architectures and ideas on sensing technologies and machine learning techniques. These are relevant in tackling the following challenges: (i) the field readiness and use of intrusive sensor systems and devices for capturing biosignals, including EEG sensor systems, ECG sensor systems and electrodermal activity sensor systems; (ii) the quality assessment and management of sensor data; (iii) data preprocessing, noise filtering and calibration concepts for biosignals; (iv) the field readiness and use of nonintrusive sensor technologies, including visual sensors, acoustic sensors, vibration sensors and piezoelectric sensors; (v) emotion recognition using mobile phones and smartwatches; (vi) body area sensor networks for emotion and stress studies; (vii) the use of experimental datasets in emotion recognition, including dataset generation principles and concepts, quality assurance, and emotion elicitation material and concepts; (viii) machine learning techniques for robust emotion recognition, including graphical models, neural network methods, deep learning methods, statistical learning and multivariate empirical mode decomposition; (ix) subject-independent emotion and stress recognition concepts and systems, including facial expression-based systems, speech-based systems, EEG-based systems, ECG-based systems, electrodermal activity-based systems, multimodal recognition systems and sensor fusion concepts; and (x) emotion and stress estimation and forecasting from a nonlinear dynamical systems perspective.

    Deep learning approaches to multimodal MRI brain age estimation

    Brain ageing remains an intricate, multifaceted process, marked not just by chronological time but by a myriad of structural, functional, and microstructural changes that often lead to discrepancies between actual age and the age inferred from neuroimaging. Machine learning methods, and especially Convolutional Neural Networks (CNNs), have proven adept at capturing patterns relating to ageing-induced changes in the brain. The differences between predicted and chronological ages, referred to as brain age deltas, have emerged as useful biomarkers for exploring factors that promote accelerated ageing or resilience, such as pathologies or lifestyle factors. However, previous studies relied overwhelmingly on structural neuroimaging for predictions, overlooking rich details inherent in other MRI modalities, such as potentially informative functional and microstructural changes. This research, utilising the extensive UK Biobank dataset, reveals that 57 different maps spanning structural, susceptibility-weighted, diffusion, and functional MRI modalities can not only predict an individual's chronological age but also encode unique ageing-related details. Through the use of both 3D CNNs and the novel 3D Shifted Window (SWIN) Transformers, this work uncovered associations between brain age deltas and 191 different non-imaging derived phenotypes (nIDPs), offering valuable insight into factors influencing brain ageing. Moreover, this work found that ensembling data from multiple maps results in higher prediction accuracies. After a thorough comparison of both linear and non-linear multimodal ensembling methods, including deep fusion networks, it was found that linear methods, such as ElasticNet, generally outperform their more complex non-linear counterparts. In addition, while ensembling was found to strengthen age prediction accuracies, it was found to weaken nIDP associations in certain circumstances where ensembled maps might have opposing sensitivities to a particular nIDP, reinforcing the need for guided selection of the ensemble components. Finally, while both CNNs and SWINs show comparable brain age prediction precision, SWIN networks stand out for their robustness against data corruption, while also providing a degree of inherent explainability. Overall, the results presented herein demonstrate that other 3D maps and modalities, which had not previously been considered for the task of brain age prediction, encode different information about the ageing brain. This research lays the foundation for further explorations into how different factors, such as off-target drug effects, impact brain ageing. It also ushers in possibilities for enhanced clinical trial design, diagnostic approaches, and therapeutic monitoring grounded in refined brain age prediction models.
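
    As a hedged sketch of the delta and ensembling ideas (not the thesis's actual models or UK Biobank data), the snippet below linearly ensembles synthetic per-modality age predictions with ElasticNet and applies the standard bias correction to the resulting brain age deltas.

```python
# Minimal sketch: linear ensembling of per-modality brain age predictions
# with ElasticNet, then computing bias-corrected brain age deltas. Subject
# counts, modality count and noise levels are synthetic placeholders.
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(0)
n_subjects, n_modalities = 500, 10
age = rng.uniform(45, 80, n_subjects)             # chronological age

# Per-modality age predictions (in practice, from per-map CNN/SWIN models).
preds = age[:, None] + rng.normal(0, 5, (n_subjects, n_modalities))

ens = ElasticNetCV(cv=5).fit(preds, age)          # linear ensemble
brain_age = ens.predict(preds)
delta = brain_age - age                           # raw brain age delta

# Standard correction for the regression-to-the-mean bias in deltas.
slope, intercept = np.polyfit(age, delta, 1)
delta_corrected = delta - (slope * age + intercept)
print(f"MAE: {np.mean(np.abs(brain_age - age)):.2f} years")
```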

    Deep learning for electronic health records: risk prediction, explainability, and uncertainty

    Background: Risk models are essential for care planning and disease prevention. The unsatisfactory performance of established clinical models has raised broad awareness and concern. An accurate, explainable, and reliable risk model is highly beneficial but remains a challenge. Objective: This thesis aims to develop deep learning models that make more accurate risk predictions while providing uncertainty estimates and medical explanations, using a large and representative electronic health records (EHR) dataset. Methods: We investigated three directions in this thesis: risk prediction, explainability, and uncertainty estimation. For risk prediction, we investigated deep learning tools that can incorporate minimally processed EHR for modelling and comprehensively compared them with established machine learning and clinical models. Additionally, post-hoc explanations were applied to the deep learning models for medical information retrieval, looking specifically at explanations in terms of risk association and counterfactual reasoning. Uncertainty estimation was qualitatively investigated using probabilistic modelling techniques. Our analyses relied on the Clinical Practice Research Datalink, which contains anonymised EHR collected from primary care, secondary care, and death registration and is representative of the UK population. Results: We introduced a deep learning model, named BEHRT, that incorporates minimally processed EHR for risk prediction. Without expert engagement, it learned meaningful representations that automatically cluster highly correlated diseases. Compared to established machine learning and clinical models that rely on expert-selected predictors, our proposed deep learning model showed superior performance on a wide range of risk prediction tasks. It also highlighted the necessity of recalibration when applying a risk model to a population with severe prior distribution shifts, and the importance of regular model updating to preserve discrimination performance under temporal data shifts. Additionally, we showed that deep learning model explanation is an excellent tool for discovering risk factors: by explaining the model, we identified not only factors highly consistent with established evidence but also factors that have not been considered in expert-driven studies. Furthermore, the deep learning model captured the interplay between risk and treated risk, and the differential association of medications across different years, which would be difficult without the temporal context included in the modelling. Besides explanations in terms of association, we introduced a framework that achieves accurate risk prediction while enabling counterfactual reasoning under hypothetical interventions. This offers counterfactual explanations that could inform clinicians in selecting those who will benefit the most; we demonstrated the benefit of the proposed framework in two exemplary case studies. Furthermore, transforming a deterministic deep learning model into a probabilistic one enables predictions with an uncertainty range. We showed that such information has many potential implications in practice, such as quantifying the confidence of a decision, indicating data insufficiency, distinguishing correct from incorrect predictions, and indicating risk associations. Conclusions: Deep learning models led to substantially improved performance for risk prediction. Uncertainty estimation can quantify the confidence of a risk prediction to further inform clinical decision-making. Deep learning model explanation can generate hypotheses to guide medical research and provide counterfactual analysis to assist clinical decision-making. This encouraging evidence supports the great potential of incorporating deep learning methods into electronic health records to inform a wide range of health applications such as care planning, disease prevention, and medical study design.
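
    As a hedged, heavily simplified sketch of the BEHRT idea (a transformer encoder over sequences of medical codes with age embeddings), the snippet below is illustrative only: the vocabulary size, dimensions, pooling choice and synthetic batch are assumptions, not the thesis's configuration.

```python
# Minimal BEHRT-style sketch: transformer encoder over medical code sequences
# with age and position embeddings, predicting a binary risk logit.
import torch
import torch.nn as nn

class TinyBEHRT(nn.Module):
    def __init__(self, n_codes=1000, n_ages=120, d=64, pad_id=0):
        super().__init__()
        self.pad_id = pad_id
        self.code_emb = nn.Embedding(n_codes, d, padding_idx=pad_id)
        self.age_emb = nn.Embedding(n_ages, d)   # age at each recorded code
        self.pos_emb = nn.Embedding(512, d)      # sequence position
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, 1)              # binary risk logit

    def forward(self, codes, ages):
        pos = torch.arange(codes.size(1), device=codes.device)
        x = self.code_emb(codes) + self.age_emb(ages) + self.pos_emb(pos)
        mask = codes == self.pad_id              # ignore padding tokens
        h = self.encoder(x, src_key_padding_mask=mask)
        return self.head(h[:, 0]).squeeze(-1)    # first token as summary

model = TinyBEHRT()
codes = torch.randint(1, 1000, (8, 32))          # batch of 8 synthetic records
ages = torch.randint(40, 90, (8, 32))
risk = torch.sigmoid(model(codes, ages))
print(risk.shape)                                # torch.Size([8])
```

    Enabling dropout at inference time (Monte Carlo dropout) is one common way to turn such a deterministic network into one that yields an uncertainty range, in the spirit of the probabilistic modelling described above.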

    Predictive Modelling Approach to Data-Driven Computational Preventive Medicine

    This thesis contributes novel predictive modelling approaches to data-driven computational preventive medicine and offers an alternative framework to statistical analysis in preventive medicine research. In the early parts of this research, the thesis proposes a synergy of machine learning methods for detecting patterns and developing inexpensive predictive models from healthcare data to classify the potential occurrence of adverse health events. In particular, the data-driven methodology is founded upon a heuristic-systematic assessment of several machine learning methods, data preprocessing techniques, model training, estimation and optimisation, and performance evaluation, yielding a novel computational data-driven framework, Octopus. Midway through this research, the thesis advances preventive medicine and data mining by proposing several new extensions in data preparation and preprocessing: new recommendations for data quality assessment checks, a novel multimethod imputation (MMI) process for missing data mitigation, and a novel imbalanced resampling approach, minority pattern reconstruction (MPR), led by information theory. The thesis also extends model performance evaluation with a novel classification performance ranking metric called XDistance. The experimental results show that building predictive models with the methods guided by the new framework (Octopus) yields reliable models whose performance was approved by domain experts. Performing the data quality checks and applying the MMI process led healthcare practitioners to prioritise predictive reliability over interpretability. The application of MPR and its hybrid resampling strategies produced performances better aligned with experts' success criteria than traditional imbalanced data resampling techniques. Finally, the XDistance ranking metric was found to be more effective at ranking several classifiers' performances while offering an indication of class bias, unlike existing performance metrics. The overall contributions of this thesis can be summarised as follows. First, several data mining techniques were thoroughly assessed to formulate the new Octopus framework and produce new reliable classifiers, alongside a further understanding of the impact of newly engineered features, the physical activity index (PAI) and biological effective dose (BED). Second, new data preparation and evaluation methods, MMI, MPR and XDistance, were developed within the framework. Finally, the newly developed and accepted predictive models help detect adverse health events, namely visceral fat-associated diseases and advanced breast cancer radiotherapy toxicity side effects. These contributions could be used to guide future theories, experiments and healthcare interventions in preventive medicine and data mining.
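
    The MMI process itself is specific to the thesis and not detailed in the abstract; as a generic, hedged illustration of multimethod imputation, the sketch below masks some observed cells, imputes them with several candidate methods, and keeps the method with the lowest reconstruction error.

```python
# Generic illustration of multimethod imputation: for each candidate imputer,
# hold out some observed values, impute, and keep the method with the lowest
# reconstruction error. This is a sketch, not the thesis's MMI procedure.
import numpy as np
from sklearn.impute import SimpleImputer, KNNImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[rng.random(X.shape) < 0.1] = np.nan        # 10% missing at random

def score(imputer, X):
    """Mask extra observed cells, impute, and measure reconstruction RMSE."""
    Xm = X.copy()
    obs = ~np.isnan(X)
    mask = obs & (rng.random(X.shape) < 0.1)  # hold out some observed cells
    Xm[mask] = np.nan
    Xi = imputer.fit_transform(Xm)
    return np.sqrt(np.mean((Xi[mask] - X[mask]) ** 2))

candidates = {
    "mean": SimpleImputer(strategy="mean"),
    "median": SimpleImputer(strategy="median"),
    "knn": KNNImputer(n_neighbors=5),
}
errors = {name: score(imp, X) for name, imp in candidates.items()}
best = min(errors, key=errors.get)
print(errors, "->", best)
```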

    A machine learning approach to Structural Health Monitoring with a view towards wind turbines

    The work of this thesis is centred around Structural Health Monitoring (SHM) and is divided into three main parts. The thesis starts by exploring different architectures of auto-association. These are evaluated in order to demonstrate the ability of nonlinear auto-association in neural networks with one nonlinear hidden layer, which is of great interest for its reduced computational complexity. It is shown that linear PCA performs poorly for novelty detection, and a key novel finding is that single-hidden-layer auto-associators do not behave in the same fashion as PCA. The second part of this study concerns formulating pattern recognition algorithms for SHM purposes that could be used in the wind energy sector, where SHM research is still at an embryonic stage compared to civil and aerospace engineering. The purpose of this part is to investigate the effectiveness and performance of such methods in structural damage detection. Experimental measurements such as high-frequency response functions (FRFs) were extracted from a 9 m wind turbine blade throughout a full-scale continuous fatigue test. A preliminary analysis of model regression on virtual SCADA data from an offshore wind farm is also proposed, using Gaussian processes and neural network regression techniques. The third part of this work introduces robust multivariate statistical methods into SHM by revealing how environmental and operational variation influences features that are sensitive to damage. The algorithms described are the Minimum Covariance Determinant estimator (MCD) and the Minimum Volume Enclosing Ellipsoid (MVEE). These robust outlier methods are inclusive, so there is no need to pre-determine an undamaged-condition data set, offering an important advantage over other multivariate methodologies. Two real-life experimental applications, to the Z24 bridge and to an aircraft wing, are analysed. Furthermore, with the use of the robust measures, the data variable correlation reveals linear or nonlinear connections.
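
    As a hedged sketch of MCD-based outlier detection (the data here are synthetic damage-sensitive features, not the Z24 bridge or aircraft wing measurements), the snippet below fits a robust covariance estimate and flags points whose robust Mahalanobis distance exceeds a chi-square cutoff.

```python
# Minimal sketch of robust novelty detection with the Minimum Covariance
# Determinant (MCD) estimator, one of the two methods named above.
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
normal = rng.multivariate_normal([0, 0, 0], np.eye(3), size=300)
damaged = rng.multivariate_normal([4, 4, 0], np.eye(3), size=20)
X = np.vstack([normal, damaged])            # mixed, unlabelled feature set

mcd = MinCovDet(random_state=0).fit(X)      # robust location and scatter
d2 = mcd.mahalanobis(X)                     # squared robust distances
threshold = chi2.ppf(0.975, df=X.shape[1])  # chi-square cutoff
outliers = d2 > threshold
print(f"{outliers.sum()} points flagged as outliers")
```

    Because the MCD fit down-weights contaminated points automatically, no undamaged-condition training set has to be designated in advance, which mirrors the "inclusive" property highlighted in the abstract.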

    Data-Driven Methods for Data Center Operations Support

    During the last decade, cloud technologies have been evolving at an impressive pace, such that we are now living in a cloud-native era where developers can leverage an unprecedented landscape of (possibly managed) services for orchestration, compute, storage, load-balancing, monitoring, etc. The possibility of on-demand access to a diverse set of configurable virtualized resources allows for building more elastic, flexible and highly resilient distributed applications. Behind the scenes, cloud providers sustain the heavy burden of maintaining the underlying infrastructures, consisting of large-scale distributed systems, partitioned and replicated among many geographically dislocated data centers to guarantee scalability, robustness to failures, high availability and low latency. The larger the scale, the more cloud providers have to deal with complex interactions among the various components, such that monitoring, diagnosing and troubleshooting issues become incredibly daunting tasks. To keep up with these challenges, development and operations practices have undergone significant transformations, especially in terms of improving the automations that make releasing new software, and responding to unforeseen issues, faster and sustainable at scale. The resulting paradigm is nowadays referred to as DevOps. However, while such automations can be very sophisticated, traditional DevOps practices fundamentally rely on reactive mechanisms that typically require careful manual tuning and supervision from human experts. To minimize the risk of outages, and the related costs, it is crucial to provide DevOps teams with suitable tools that enable a proactive approach to data center operations. This work presents a comprehensive data-driven framework to address the most relevant problems that can be experienced in large-scale distributed cloud infrastructures. These environments are indeed characterized by a very large availability of diverse data, collected at each level of the stack: time series (e.g., physical host measurements, virtual machine or container metrics, networking component logs, application KPIs); graphs (e.g., network topologies, fault graphs reporting dependencies among hardware and software components, performance issue propagation networks); and text (e.g., source code, system logs, version control system history, code review feedback). Such data are also typically updated with relatively high frequency and subject to distribution drifts caused by continuous configuration changes to the underlying infrastructure. In such a highly dynamic scenario, traditional model-driven approaches alone may be inadequate for capturing the complexity of the interactions among system components. DevOps teams would certainly benefit from robust data-driven methods to support their decisions based on historical information. For instance, effective anomaly detection capabilities may help in conducting more precise and efficient root-cause analysis, and leveraging accurate forecasting and intelligent control strategies would improve resource management. Given their ability to deal with high-dimensional, complex data, deep learning-based methods are the most straightforward option for the realization of the aforementioned support tools. On the other hand, because of their complexity, such models often require huge processing power, and suitable hardware, to be operated effectively at scale. These aspects must be carefully addressed when applying such methods in the context of data center operations: automated operations approaches must be dependable and cost-efficient, so as not to degrade the services they are built to improve.
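
    As a hedged sketch of reconstruction-based anomaly detection on data center metrics (the window size, metric count, training data and threshold are all illustrative assumptions, not the thesis's framework), the snippet below trains a small LSTM autoencoder and flags windows with unusually high reconstruction error.

```python
# Minimal sketch: LSTM autoencoder for anomaly detection on windows of
# multivariate data center metrics (e.g., CPU, memory, network, latency).
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    def __init__(self, n_features=4, hidden=32):
        super().__init__()
        self.enc = nn.LSTM(n_features, hidden, batch_first=True)
        self.dec = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):                   # x: (batch, time, features)
        _, (h, _) = self.enc(x)             # summarise each window into h
        z = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)
        y, _ = self.dec(z)                  # unroll back over time
        return self.out(y)

model = LSTMAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(64, 30, 4)                  # synthetic "normal" metric windows
for _ in range(50):                         # brief training loop
    opt.zero_grad()
    loss = loss_fn(model(x), x)
    loss.backward()
    opt.step()

# Flag windows whose reconstruction error exceeds a percentile threshold.
with torch.no_grad():
    err = ((model(x) - x) ** 2).mean(dim=(1, 2))
threshold = err.quantile(0.99)
print((err > threshold).sum().item(), "windows flagged")
```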