
    A causal inference framework for comparative effectiveness and safety research using observational data, with application in type 2 diabetes

    Randomized controlled trials are the gold standard for answering causal questions in health research, as randomization ensures balanced treatment groups and therefore makes it possible to compare average group outcomes directly. However, trials have many limitations with respect to cost, ethical considerations and practicability, and may therefore not be suitable for answering all research questions. Evidence on cause-and-effect relationships from observational studies has the potential to overcome these limitations and close important research gaps: such studies make it possible to examine subpopulations of patients who are often excluded from trials due to safety concerns, and can give insights into the risk profile of long-term endpoints. The quality of this real-world evidence depends on the quality of the data, their suitability for answering a particular research question, and the use of appropriate methods to estimate the treatment effect of interest. A key concern in observational research is bias in the treatment effect estimate due to confounding, as the treatment assignment is not controlled by the researcher and cannot be randomized. Treatment groups may therefore be unbalanced, with confounding factors in the data that influence both the treatment choice and the outcome of interest.

    The benefits of observational studies make them attractive for studying the risk and benefit profiles of oral type 2 diabetes treatments, especially of newer agent classes such as Sodium-glucose Cotransporter-2 Inhibitors. Prescribing of this treatment class has increased in recent years, and a large proportion of type 2 diabetes patients have become eligible to receive agents from this class after recent treatment guideline changes. More information about the treatment effects of Sodium-glucose Cotransporter-2 Inhibitors is needed, especially for the large patient population of older adults (e.g. 70 years or older), as possible adverse effects such as osmotic symptoms associated with this class could have severe consequences for these patients. The overall aim of this thesis is to develop a causal inference framework for the exploitation of observational data, needed to derive high-quality evidence on the benefit and safety profile of oral type 2 diabetes treatments, with a focus on the widely prescribed treatment class of Sodium-glucose Cotransporter-2 Inhibitors and the patient population of older adults.

    Chapters 1 and 2 introduce causal inference theory, including a description of all estimation methods employed in this thesis, and type 2 diabetes research, encompassing important treatment decision considerations and current research evidence on Sodium-glucose Cotransporter-2 Inhibitors. Chapter 3 presents a triangulation framework of assorted estimation methods to establish the consistency of estimation results from approaches that utilize different parts of the data and rely on different data structure assumptions. Furthermore, an Instrumental Variable approach is introduced which uses data from the period before treatment initiation to mitigate potential bias when the exchangeability assumption is violated and a history of the outcome of interest prior to treatment initiation influences the treatment decision. Chapter 4 describes a simulation study on the performance of established construction methods for a proxy Instrumental Variable based on health care providers' prescription preference. The methods are tested under different data conditions, such as changes in provider preference over time, missing data in baseline covariates, and different sample sizes within each health care provider. Additionally, a construction method is introduced that aims to address changes in preference over time and non-ignorable missingness in baseline characteristics. In Chapter 5, the conclusions about a robust Instrumental Variable estimation approach developed in the previous chapters are applied in a causal analysis of the relative benefit and risk profile of Sodium-glucose Cotransporter-2 Inhibitors versus Dipeptidyl Peptidase-4 Inhibitors in the patient population of older adults. Chapter 6 provides an overview of the main findings and implications of this thesis and discusses the limitations and future research potential of each study.
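
    As a concrete illustration of the preference-based Instrumental Variable idea, the sketch below runs a minimal two-stage least squares (2SLS) estimate in Python, using a provider's previous prescription as a proxy instrument for prescribing preference. The column names (prior_rx, treatment, outcome, age, hba1c) are hypothetical placeholders rather than the thesis's actual variables, and a real analysis would need corrected standard errors, for example via a dedicated IV package such as linearmodels.

```python
# Minimal 2SLS sketch: the provider's previous prescription ("prior_rx")
# acts as a proxy instrument for prescribing preference. All column names
# are hypothetical placeholders; naive stage-2 standard errors are too
# small and would need correction in a real analysis.
import pandas as pd
import statsmodels.api as sm

def two_stage_least_squares(df: pd.DataFrame) -> float:
    covariates = df[["age", "hba1c"]]

    # Stage 1: regress treatment on the instrument plus measured covariates.
    X1 = sm.add_constant(pd.concat([df["prior_rx"], covariates], axis=1))
    treatment_hat = sm.OLS(df["treatment"], X1).fit().fittedvalues

    # Stage 2: regress the outcome on the *predicted* treatment; the
    # coefficient is the IV estimate of the treatment effect.
    X2 = sm.add_constant(
        pd.concat([treatment_hat.rename("treatment_hat"), covariates], axis=1)
    )
    return sm.OLS(df["outcome"], X2).fit().params["treatment_hat"]
```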

    A Survey on Forensics and Compliance Auditing for Critical Infrastructure Protection

    The broadening dependency and reliance that modern societies have on essential services provided by Critical Infrastructures is increasing the relevance of their trustworthiness. However, Critical Infrastructures are attractive targets for cyberattacks, due to the potential for considerable impact, not just at the economic level but also in terms of physical damage and even loss of human life. Complementing traditional security mechanisms, forensics and compliance audit processes play an important role in ensuring Critical Infrastructure trustworthiness. Compliance auditing contributes to checking whether security measures are in place and compliant with standards and internal policies. Forensics assists the investigation of past security incidents. Since these two areas significantly overlap in terms of data sources, tools and techniques, they can be merged into unified Forensics and Compliance Auditing (FCA) frameworks. In this paper, we survey the latest developments, methodologies, challenges, and solutions addressing forensics and compliance auditing in the scope of Critical Infrastructure Protection. This survey focuses on relevant contributions capable of tackling the requirements imposed by massively distributed and complex Industrial Automation and Control Systems, in terms of handling large volumes of heterogeneous data (which can be noisy, ambiguous, and redundant) for analytic purposes, with adequate performance and reliability. The results produced a taxonomy of the FCA field whose key categories denote the relevant topics in the literature. The collected knowledge also resulted in the establishment of a reference FCA architecture, proposed as a generic template for a converged platform. These results are intended to guide future research on forensics and compliance auditing for Critical Infrastructure Protection.

    Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence

    Recent years have seen tremendous growth in Artificial Intelligence (AI)-based methodological development across a broad range of domains. In this rapidly evolving field, a large number of methods have been reported using machine learning (ML) and Deep Learning (DL) models. The majority of these models are inherently complex and lack explanations of their decision-making process, causing them to be termed 'black-box' models. One of the major bottlenecks to adopting such models in mission-critical application domains, such as banking, e-commerce, healthcare, and public services and safety, is the difficulty of interpreting them. Due to the rapid proliferation of these AI models, explaining their learning and decision-making processes is becoming harder, even as transparency and predictability are increasingly required. Finding flaws in these black-box models, in order to reduce their false negative and false positive outcomes, remains difficult and inefficient. Aiming to collate the current state of the art in interpreting black-box models, this study provides a comprehensive analysis of explainable AI (XAI) models. The development of XAI is reviewed meticulously through careful selection and analysis of the current state of XAI research. The paper also provides a comprehensive and in-depth evaluation of XAI frameworks and their efficacy, to serve as a starting point for applied and theoretical researchers. Towards the end, it highlights emerging and critical issues pertaining to XAI research, showcasing major, model-specific trends for better explanation, enhanced transparency, and improved prediction accuracy.
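
    As a small, concrete example of the kind of model-agnostic technique such XAI reviews cover, the sketch below applies permutation feature importance to an opaque model; the dataset and model are placeholders chosen for illustration, not taken from the paper.

```python
# Permutation feature importance: shuffle each feature in turn and measure
# the drop in held-out accuracy; larger drops indicate features the
# black-box model relies on more heavily. Dataset and model are placeholders.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit an opaque ("black-box") model.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)

# Print the five most influential features.
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1])
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```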

    Multi-epoch machine learning for galaxy formation

    In this thesis I utilise a range of machine learning techniques in conjunction with hydrodynamical cosmological simulations. In Chapter 2 I present a novel machine learning method for predicting the baryonic properties of dark matter only subhalos taken from N-body simulations. The model is built using a tree-based algorithm and incorporates subhalo properties over a wide range of redshifts as its input features. I train the model using a hydrodynamical simulation, which enables it to predict black hole mass, gas mass, magnitudes, star formation rate, stellar mass, and metallicity. This new model surpasses the performance of previous models. Furthermore, I explore the predictive power of each input property by examining feature importance scores from the tree-based model. By applying the method to the LEGACY N-body simulation I generate a large-volume mock catalog of the quasar population at z=3. By comparing this mock catalog with observations, I demonstrate that the IllustrisTNG subgrid model for black holes does not accurately capture the growth of the most massive objects. In Chapter 3 I apply my method to investigate the evolution of galaxy properties in different simulations, and in various environments within a single simulation. By comparing the Illustris, EAGLE, and TNG simulations I show that subgrid model physics plays a more significant role than the choice of hydrodynamics method. Using the CAMELS simulation suite I consider the impact of cosmological and astrophysical parameters on the buildup of stellar mass within the TNG and SIMBA models. In the final chapter I apply a combination of neural networks and symbolic regression methods to construct a semi-analytic model which reproduces the galaxy population from a cosmological simulation. The neural-network-based approach is capable of producing a more accurate population than a previous method of binning based on halo mass. The equations resulting from symbolic regression are found to be a good approximation of the neural network.
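
    A schematic sketch of this kind of mapping follows, assuming mock multi-epoch subhalo features and a generic tree ensemble rather than the thesis's actual pipeline or data:

```python
# Schematic version of the tree-based mapping described above: predict a
# baryonic property (here log stellar mass) from dark-matter-only subhalo
# features taken at several redshifts. Features and target are synthetic
# placeholders, not the thesis's actual inputs.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)
n = 5000
# Mock multi-epoch features, e.g. halo mass and Vmax at z = 0, 1, 2.
X = rng.lognormal(mean=0.0, sigma=1.0, size=(n, 6))
# Toy target: stellar mass loosely tied to present-day and past halo mass.
log_mstar = (0.8 * np.log10(X[:, 0] + 1)
             + 0.2 * np.log10(X[:, 2] + 1)
             + rng.normal(0, 0.1, n))

model = GradientBoostingRegressor().fit(X, log_mstar)

# Feature importance scores show which epochs/properties drive predictions.
print(model.feature_importances_.round(3))
```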

    Analysis and monitoring of single HaCaT cells using volumetric Raman mapping and machine learning

    No explorer reached a pole without a map, no chef served a meal without tasting, and no surgeon implants untested devices. Higher-accuracy maps, more sensitive taste buds, and more rigorous tests increase confidence in positive outcomes. Biomedical manufacturing necessitates rigour, whether developing drugs or creating bioengineered tissues [1]–[4]. By designing a dynamic environment that supports mammalian cells during experiments within a Raman spectroscope, this project provides a platform that more closely replicates in vivo conditions. The platform also adds the opportunity to automate the adaptation of the cell culture environment, alongside spectral monitoring of cells with machine learning and three-dimensional Raman mapping, called volumetric Raman mapping (VRM). Previous research highlighted key areas for refinement, such as a structured approach for shading Raman maps [5], [6] and the collection of VRM [7]. Refining VRM shading and collection was therefore the initial focus: k-means directed shading for vibrational spectroscopy maps was developed in Chapter 3, and depth distortion and VRM calibration were explored in Chapter 4. “Cage” scaffolds, designed using the findings from Chapter 4, were then utilised to influence cell behaviour by varying the number of cage beams to change the scaffold porosity. Altering the porosity facilitated spectroscopic investigation into previously observed changes in cell biology in response to porous scaffolds [8]. VRM visualised the changed morphology of single human keratinocyte (HaCaT) cells, providing a complementary technique for machine learning classification. This increased technical rigour justified progression, in Chapter 6, to the development of an in-situ flow chamber for Raman spectroscopy, using a psoriasis (dithranol-HaCaT) model on unfixed cells. K-means directed shading and principal component analysis (PCA) revealed HaCaT cell adaptations aligning with previous publications [5] and earlier thesis sections. The k-means directed Raman maps and PCA score plots verified the drug-supplying capacity of the flow chamber, justifying future investigation into VRM and machine learning for monitoring single cells within the flow chamber.
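
    The k-means directed shading idea can be sketched as follows; the synthetic hyperspectral cube, component count, and cluster count are illustrative assumptions, not the thesis's actual settings:

```python
# Minimal sketch of k-means directed shading: cluster the spectra of a
# Raman map and shade each pixel by its cluster label. The synthetic cube
# dimensions, PCA components, and k are illustrative assumptions only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
h, w, n_wavenumbers = 32, 32, 600
cube = rng.random((h, w, n_wavenumbers))       # mock hyperspectral Raman map
spectra = cube.reshape(-1, n_wavenumbers)      # one spectrum per pixel

# Reduce dimensionality before clustering, as is common for spectral maps.
scores = PCA(n_components=10).fit_transform(spectra)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scores)

shaded_map = labels.reshape(h, w)              # one shade per spectral cluster
print(shaded_map[:4, :4])
```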

    XGBoost Hyper-Parameter Tuning Using Particle Swarm Optimization for Stock Price Forecasting

    Investment in the capital market has become a lifestyle for millennials in Indonesia, as seen from the increase in the number of SIDs (Single Investor Identifications) from 2.4 million in 2019 to 10.3 million in December 2022. The increase is due to various reasons, from the Covid-19 pandemic, which limited the space for social interaction, to the ease of investing in the capital market through various e-commerce platforms. These investors generally use fundamental and technical analysis to maximize profits and minimize the risk of loss in stock investment. These methods may lead to problems of subjectivity and differing interpretations, and they are time-consuming because of the deep research required on financial statements, economic conditions and company reports. Machine learning applied to historical stock price data, which is time-series data, is one method that can be used for stock price forecasting. This paper proposes XGBoost optimized by Particle Swarm Optimization (PSO) for stock price forecasting. XGBoost is known for its ability to make predictions accurately and efficiently, and PSO is used to optimize the hyper-parameter values of XGBoost. The XGBoost model with hyper-parameters optimized by PSO achieved the best performance when compared with standard XGBoost, Long Short-Term Memory (LSTM), Support Vector Regression (SVR) and Random Forest. The RMSE, MAE and MAPE show the lowest values for the proposed method, at 0.0011, 0.0008, and 0.0772%, respectively, while the coefficient of determination (R²) reaches the highest value. The PSO-optimized XGBoost is thus able to predict the stock price with a low error rate and can be a promising model for stock price forecasting. These results demonstrate the contribution of the proposed method.
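
    A condensed sketch of the proposed approach follows: a small particle swarm searches XGBoost's hyper-parameter space to minimise validation RMSE. The bounds, swarm settings, and the synthetic random-walk price series are illustrative assumptions standing in for real market data.

```python
# PSO-tuned XGBoost sketch: particles are (learning_rate, max_depth,
# n_estimators) triples; fitness is RMSE on a time-ordered validation split.
import numpy as np
from sklearn.metrics import mean_squared_error
from xgboost import XGBRegressor

def rmse_for(params, X_tr, y_tr, X_va, y_va):
    """Validation RMSE for one particle's hyper-parameter triple."""
    lr, depth, n_est = params
    model = XGBRegressor(learning_rate=float(lr), max_depth=int(round(depth)),
                         n_estimators=int(round(n_est)), verbosity=0)
    model.fit(X_tr, y_tr)
    return float(np.sqrt(mean_squared_error(y_va, model.predict(X_va))))

def pso(objective, bounds, n_particles=8, n_iters=10, w=0.7, c1=1.5, c2=1.5):
    """Plain global-best particle swarm over a box-constrained search space."""
    rng = np.random.default_rng(0)
    lo, hi = bounds[:, 0], bounds[:, 1]
    pos = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()]
    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmin()]
    return gbest

# Synthetic random-walk prices stand in for real market data.
rng = np.random.default_rng(1)
prices = 100 + np.cumsum(rng.normal(0, 1, 800))
X = np.column_stack([prices[i:-(5 - i)] for i in range(5)])   # 5 lagged prices
y = prices[5:]
split = int(0.8 * len(y))                                     # no shuffling
X_tr, X_va, y_tr, y_va = X[:split], X[split:], y[:split], y[split:]

bounds = np.array([[0.01, 0.3], [2.0, 10.0], [50.0, 500.0]])  # lr, depth, trees
best = pso(lambda p: rmse_for(p, X_tr, y_tr, X_va, y_va), bounds)
print("best (learning_rate, max_depth, n_estimators):", best)
```

    Note the time-ordered split: shuffling a time series before validation would leak future information into the fitness function.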

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    Protecting Privacy in Indian Schools: Regulating AI-based Technologies' Design, Development and Deployment

    Education is one of the priority areas for the Indian government, where Artificial Intelligence (AI) technologies are touted to bring digital transformation. Several Indian states have started deploying facial recognition-enabled CCTV cameras, emotion recognition technologies, fingerprint scanners, and Radio Frequency Identification (RFID) tags in their schools, not only to provide personalised recommendations, ensure student security, and predict student drop-out rates, but also to compile 360-degree information on each student. Further, integrating Aadhaar (a digital identity card that works on biometric data) across AI technologies and learning management systems (LMS) renders schools a ‘panopticon’. Certain technologies or systems, like Aadhaar, CCTV cameras, GPS systems, RFID tags, and learning management systems, are used primarily for continuous data collection, storage, and retention. Though they cannot be termed AI technologies per se, they are fundamental to designing and developing AI systems like facial, fingerprint, and emotion recognition technologies. The large amounts of student data collected speedily through the former technologies are used to create the algorithms for the latter AI systems. Once these algorithms are processed using machine learning (ML) techniques, they learn correlations between multiple datasets, predicting each student’s identity, decisions, grades, learning growth, tendency to drop out, and other behavioural characteristics. Such autonomous and repetitive collection, processing, storage, and retention of student data without effective data protection legislation endangers student privacy. The algorithmic predictions made by AI technologies are an avatar of the data fed into the system: an AI technology is only as good as the person collecting the data, processing it into a relevant and useful output, and regularly evaluating the inputs going into the model, and it can produce inaccurate predictions if that person overlooks any relevant data. However, the belief of the state, school administrations and parents in AI technologies as a panacea for student security and educational development overlooks the context in which ‘data practices’ are conducted. A right to privacy in an AI age is inextricably connected to the data practices in which data gets ‘cooked’. Thus, data protection legislation that operates without understanding and regulating such data practices will remain ineffective in safeguarding privacy. The thesis undertakes interdisciplinary research that enables a better understanding of the interplay between the data practices of AI technologies and the social practices of an Indian school, which the present Indian data protection legislation overlooks, endangering students’ privacy from the design and development stages of an AI model through to its deployment. The thesis recommends that the Indian legislature frame legislation better equipped for the AI/ML age, and offers guidance to the Indian judiciary on evaluating the legality and reasonability of designing, developing, and deploying such technologies in schools.

    Computational and experimental studies on the reaction mechanism of bio-oil components with additives for increased stability and fuel quality

    As one of the world’s largest palm oil producers, Malaysia has encountered a major disposal problem, as vast amounts of oil palm biomass waste are produced. To overcome this problem, these biomass wastes can be liquefied into biofuel with fast pyrolysis technology. However, further upgrading of fast pyrolysis bio-oil via direct solvent addition is required to overcome its undesirable attributes. In addition, the high production cost of biofuels often hinders their commercialisation. Thus, the designed solvent-oil blend needs to achieve both fuel functionality and economic targets to be competitive with conventional diesel fuel. In this thesis, a multi-stage computer-aided molecular design (CAMD) framework was employed for bio-oil solvent design. In the design problem, molecular signature descriptors were applied to accommodate different classes of property prediction models. However, the complexity of the CAMD problem increases with the height of the signature, due to the combinatorial nature of higher-order signatures. Thus, a consistency rule was developed to reduce the size of the CAMD problem. The CAMD problem was then further extended to address the economic aspects via a fuzzy multi-objective optimisation approach. Next, a rough-set based machine learning (RSML) model was proposed to correlate the feedstock characterisation and pyrolysis conditions with the pyrolysis bio-oil properties by generating decision rules. The generated decision rules were analysed from a scientific standpoint to identify the underlying patterns, while ensuring the rules were logical. These decision rules can be used to select the optimal feedstock composition and pyrolysis conditions to produce pyrolysis bio-oil with targeted fuel properties. Next, the results obtained from the computational approaches were verified through experimental study. The generated pyrolysis bio-oils were blended with the identified solvents at various mixing ratios. In addition, emulsification of the solvent-oil blends in diesel was conducted with the help of surfactants. Lastly, potential extensions and prospective work for this study are discussed in the later part of this thesis. To conclude, this thesis presents a combination of computational and experimental approaches to upgrading the fuel properties of pyrolysis bio-oil, so that high-quality biofuel can be generated as a cleaner-burning replacement for conventional diesel fuel.
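
    The fuzzy multi-objective idea mentioned above can be illustrated with a toy max-min aggregation, in which each candidate solvent receives a membership degree per objective and the design maximises the worst degree; all candidate names and numbers below are invented for illustration:

```python
# Toy max-min fuzzy multi-objective selection: each objective gets a linear
# membership degree in [0, 1], and the chosen candidate maximises the worst
# (minimum) degree. All candidates and values are invented placeholders.
import numpy as np

def membership(value, worst, best):
    """Linear fuzzy membership: 0 at the worst acceptable value, 1 at the best."""
    return float(np.clip((value - worst) / (best - worst), 0.0, 1.0))

# Hypothetical candidate solvents with a fuel-property score (higher is
# better) and a cost (lower is better).
candidates = {
    "solvent_A": {"property": 0.82, "cost": 3.1},
    "solvent_B": {"property": 0.74, "cost": 2.2},
    "solvent_C": {"property": 0.91, "cost": 4.5},
}

def overall(c):
    mu_prop = membership(c["property"], worst=0.6, best=1.0)   # maximise property
    mu_cost = membership(-c["cost"], worst=-5.0, best=-2.0)    # minimise cost
    return min(mu_prop, mu_cost)   # max-min: optimise the worst objective

best = max(candidates, key=lambda k: overall(candidates[k]))
print(best, round(overall(candidates[best]), 3))
```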