DP-HyPO: An Adaptive Private Hyperparameter Optimization Framework
Hyperparameter optimization, also known as hyperparameter tuning, is a widely
recognized technique for improving model performance. Regrettably, when
training private ML models, practitioners often overlook the privacy risks
associated with hyperparameter optimization, which can expose sensitive
information about the underlying dataset. Currently, the only existing
approach to privacy-preserving hyperparameter optimization is to select
hyperparameters uniformly at random for a number of runs and report the
best-performing one. In contrast, in non-private settings, practitioners
commonly use "adaptive" hyperparameter optimization methods, such as Gaussian
process-based optimization, which select the next candidate based on
information gathered from previous runs. This substantial contrast between
private and non-private hyperparameter optimization underscores a critical
concern. In our paper, we introduce DP-HyPO, a pioneering framework for
"adaptive" private hyperparameter optimization that aims to bridge the gap
between private and non-private hyperparameter optimization. To accomplish
this, we provide a comprehensive differential privacy analysis of our
framework. Furthermore, we empirically demonstrate the effectiveness of
DP-HyPO on a diverse set of real-world and synthetic datasets.
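The non-adaptive baseline the abstract contrasts with can be sketched as follows. This is a minimal illustration, not the paper's method: the candidate grid, the `dp_train_and_score` callable, and the toy `noisy_score` objective are all hypothetical stand-ins, and the privacy accounting for the random stopping rule is deliberately omitted here.

```python
import random

def private_random_tuning(candidates, dp_train_and_score, expected_runs=10, seed=None):
    """Non-adaptive private hyperparameter selection: draw the number of runs
    from a geometric distribution, sample a hyperparameter uniformly at random
    for each run, and report the best-scoring one. `dp_train_and_score` is
    assumed to be a differentially private training routine that returns a
    validation score; its privacy analysis is not reproduced here."""
    rng = random.Random(seed)
    # Random stopping: K ~ Geometric(1 / expected_runs), so E[K] = expected_runs.
    k = 1
    while rng.random() > 1.0 / expected_runs:
        k += 1
    best_hp, best_score = None, float("-inf")
    for _ in range(k):
        hp = rng.choice(candidates)      # uniform, non-adaptive draw
        score = dp_train_and_score(hp)   # one private training run
        if score > best_score:
            best_hp, best_score = hp, score
    return best_hp, best_score

# Toy stand-in for a private training run (hypothetical scoring function
# peaked near lr = 0.1, with a little noise).
def noisy_score(lr):
    return -abs(lr - 0.1) + random.gauss(0.0, 0.01)

grid = [0.001, 0.01, 0.1, 1.0]
best, score = private_random_tuning(grid, noisy_score, expected_runs=8, seed=0)
print(best)
```

An adaptive method in the sense of the abstract would replace the `rng.choice(candidates)` line with a draw from a distribution updated using the scores of previous runs (e.g. a Gaussian-process posterior), which is exactly the step that requires the additional privacy analysis DP-HyPO provides.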
Enabling cardiovascular multimodal, high dimensional, integrative analytics
While the understanding of cardiovascular morbidity has traditionally relied on the acquisition and interpretation of health data, advances in health technologies have enabled us to collect far larger amounts of health data. This thesis explores the application of advanced analytics that utilise powerful mechanisms for integrating health data across different modalities and dimensions into a single, holistic environment to better understand different diseases, with a focus on cardiovascular conditions. Different statistical methodologies are applied across a number of case studies, supported by a novel methodology to integrate and simplify data collection. The work culminates in showing how different dataset modalities explain different effects on morbidity: blood biomarkers, electrocardiogram recordings, RNA-Seq measurements, and population-level effects together piece together the understanding of a person's morbidity. More specifically, explainable artificial intelligence methods were employed on structured datasets from patients with atrial fibrillation to improve screening for the disease. Omics datasets, including RNA-sequencing and genotype datasets, were examined and new biomarkers were discovered, allowing a better understanding of atrial fibrillation. Electrocardiogram signal data were used for early risk prediction of heart failure, enabling clinicians to use this novel approach to estimate future incidence. Population-level data were applied to the identification of associations and temporal trajectories of diseases to better understand disease dependencies in different clinical cohorts.