8 research outputs found

    Inaccessible Entropy for Watermarking Generative Agents

    In this work, we construct distortion-free and unforgeable watermarks for language models and generative agents. The watermarked output cannot be forged by an adversary, nor removed by the adversary without significantly degrading model output quality. The watermarked output is also distortion-free: the watermarking algorithm does not noticeably change the quality of the model output, and without the public detection key, no efficient adversary can distinguish watermarked output from output that is not watermarked. The core of the watermarking schemes involves embedding a message and a publicly verifiable digital signature in the generated model output. The message and signature can be extracted during the detection phase and verified by any authorized entity that has a public key. We show that, under the standard cryptographic assumption that one-way functions exist, we can construct distortion-free and unforgeable watermark schemes. Our framework relies on analyzing the inaccessible entropy of the watermarking schemes, based on computational entropy notions derived from the existence of one-way functions.

    Data structures and algorithms for analysis of alternative splicing with RNA-Seq data

    Machine learning applications for personalised automated radiotherapy planning

    Automated radiotherapy planning is characterised by a reduction in manual planning due to an increase in computerised planning. Current methods can produce plans suitable for clinical use. However, every case is unique and manual intervention is often needed. The goal of this work was to determine whether it is feasible to develop a fully automated planning system producing clinically optimal plans and, if so, to begin developing it. This work explored relationships between automated planning parameters and anatomical features with respect to dosimetric outcomes. A rules-based automated planning technique was used; this algorithm requires calibration of input parameters prior to use. This calibration determines the target objectives the algorithm will optimise to. Existing calibration methods use a single set of calibrated parameters per treatment site, which is applied to all patients. This approach is considered sufficient to meet clinical goals but may not be sufficient for the development of optimal personalised planning, due to anatomical variance between patients. Using a validated rules-based planning methodology, and taking patient-bespoke, expert-driven calibrated parameters as the gold standard and validation benchmark, two machine learning techniques were explored for a priori configuration of parameters for the delivery of personalised treatment planning. The main objective was to train models to predict gold standard parameters, thereby generating expert planning automatically. A secondary objective was to determine dosimetric differences between plans generated via machine-learned parameters and plans generated via a traditional single set of parameters applied to all cases. Preliminary studies were carried out to define what would be considered the gold standard and to identify anatomical features for inclusion in the main study, as well as their relationships to calibrated parameters. The research presented here was applied to three sites: prostate, rectum and lung. Findings are also expected to provide heuristics for research to be carried out on other treatment sites.

    The blessings of explainable AI in operations & maintenance of wind turbines

    Wind turbines play an integral role in generating clean energy, but regularly suffer from operational inconsistencies and failures leading to unexpected downtimes and significant Operations & Maintenance (O&M) costs. Condition-Based Monitoring (CBM) has been utilised in the past to monitor operational inconsistencies in turbines by applying signal processing techniques to vibration data. The last decade has witnessed growing interest in leveraging Supervisory Control and Data Acquisition (SCADA) data from turbine sensors for CBM. Machine Learning (ML) techniques have been utilised to predict incipient faults in turbines and forecast vital operational parameters with high accuracy by leveraging SCADA data and alarm logs. More recently, Deep Learning (DL) methods have outperformed conventional ML techniques, particularly for anomaly prediction. Despite demonstrating immense promise in the transition to Artificial Intelligence (AI), such models are generally black boxes that cannot provide rationales behind their predictions, hampering the ability of turbine operators to rely on automated decision making. We aim to help combat this challenge by providing a novel perspective on Explainable AI (XAI) for trustworthy decision support. This thesis revolves around three key strands of XAI: DL, Natural Language Generation (NLG) and Knowledge Graphs (KGs), which are investigated by utilising data from an operational turbine. We leverage DL and NLG to predict incipient faults and alarm events in the turbine in natural language, as well as to generate human-intelligible O&M strategies to assist engineers in fixing or averting the faults. We also propose specialised DL models which can predict causal relationships in SCADA features as well as quantify the importance of vital parameters leading to failures. The thesis culminates with an interactive Question-Answering (QA) system for automated reasoning that leverages multimodal domain-specific information from a KG, enabling engineers to retrieve O&M strategies with natural language questions. By helping make turbines more reliable, we envisage wider adoption of wind energy sources towards tackling climate change.

    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its large public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.