
    DP-HyPO: An Adaptive Private Hyperparameter Optimization Framework

    Hyperparameter optimization, also known as hyperparameter tuning, is a widely recognized technique for improving model performance. Regrettably, when training private ML models, practitioners often overlook the privacy risks of hyperparameter optimization, which can expose sensitive information about the underlying dataset. Currently, the only existing approach to privacy-preserving hyperparameter optimization is to select hyperparameters uniformly at random for a number of runs and report the best-performing one. In contrast, in non-private settings, practitioners commonly use "adaptive" hyperparameter optimization methods such as Gaussian process-based optimization, which select the next candidate based on information gathered from previous outputs. This substantial contrast between private and non-private hyperparameter optimization underscores a critical concern. In our paper, we introduce DP-HyPO, a pioneering framework for "adaptive" private hyperparameter optimization, aiming to bridge the gap between private and non-private hyperparameter optimization. To accomplish this, we provide a comprehensive differential privacy analysis of our framework. Furthermore, we empirically demonstrate the effectiveness of DP-HyPO on a diverse set of real-world and synthetic datasets.
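    To make the contrast concrete, here is a minimal Python sketch of the two selection regimes the abstract describes. The non-adaptive baseline draws each candidate uniformly at random, independent of earlier results; the adaptive loop reweights candidates using previously observed scores. The function names, the softmax reweighting, and the `train_eval` callback (assumed to run a differentially private training job and return a privately released score) are all illustrative assumptions; this is not the DP-HyPO mechanism itself, whose sampling distribution and privacy accounting are developed in the paper.

    import math
    import random

    def private_random_search(candidates, train_eval, num_runs):
        """Non-adaptive baseline: sample hyperparameters uniformly at
        random for a number of runs and report the best-performing one."""
        best_hp, best_score = None, float("-inf")
        for _ in range(num_runs):
            hp = random.choice(candidates)   # data-independent, uniform draw
            score = train_eval(hp)           # assumed DP training + DP score release
            if score > best_score:
                best_hp, best_score = hp, score
        return best_hp, best_score

    def adaptive_search(candidates, train_eval, num_runs, temperature=1.0):
        """Illustrative adaptive loop: sample the next candidate with
        probability proportional to a softmax of scores seen so far.
        This stands in for adaptive methods such as Gaussian
        process-based optimization; it is a sketch, not DP-HyPO."""
        scores = {hp: 0.0 for hp in candidates}
        best_hp, best_score = None, float("-inf")
        for _ in range(num_runs):
            weights = [math.exp(scores[hp] / temperature) for hp in candidates]
            hp = random.choices(candidates, weights=weights, k=1)[0]
            score = train_eval(hp)           # feeds back into the next draw
            scores[hp] = score
            if score > best_score:
                best_hp, best_score = hp, score
        return best_hp, best_score

    The adaptive loop's data-dependent sampling is exactly what complicates the privacy analysis: each draw depends on earlier (private) outputs, so the overall privacy cost must account for that feedback.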

    Enabling cardiovascular multimodal, high dimensional, integrative analytics

    While traditionally the understanding of cardiovascular morbidity relied on the acquisition and interpretation of health data, advances in health technologies have enabled us to collect far larger amounts of health data. This thesis explores the application of advanced analytics that integrate health data across different modalities and dimensions into a single, holistic environment to better understand different diseases, with a focus on cardiovascular conditions. Different statistical methodologies are applied across a number of case studies, supported by a novel methodology to integrate and simplify data collection. The work culminates in the different dataset modalities explaining different effects on morbidity: blood biomarkers, electrocardiogram recordings, RNA-Seq measurements, and population-level effects together piece together an understanding of a person's morbidity. More specifically, explainable artificial intelligence methods were employed on structured datasets from patients with atrial fibrillation to improve screening for the disease. Omics datasets, including RNA-sequencing and genotype datasets, were examined and new biomarkers were discovered, allowing a better understanding of atrial fibrillation. Electrocardiogram signal data were used to assess early risk prediction of heart failure, enabling clinicians to use this novel approach to estimate future incidence. Population-level data were applied to the identification of associations and temporal trajectories of diseases to better understand disease dependencies in different clinical cohorts.
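    The abstract does not specify which explainable AI method was used on the structured atrial fibrillation data; the sketch below assumes one common choice, SHAP attributions over a tree-based classifier, on synthetic stand-in data. The dataset, model choice, and feature set are all illustrative assumptions, not the thesis's actual pipeline.

    import shap
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for a structured AF-screening dataset (binary label).
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)

    explainer = shap.TreeExplainer(model)        # tree-model SHAP explainer
    shap_values = explainer.shap_values(X_test)  # per-feature attributions
    shap.summary_plot(shap_values, X_test)       # which features drive predictions

    Attribution plots of this kind let clinicians inspect which input features drive a screening model's predictions, which is the general role explainability plays in the work described above.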