AGILE: ARBITRARY GRID LOGISTIC REGRESSION USING INTEL SOFTWARE GUARD EXTENSIONS

Author: Jiang Chao
Publication date: 10/08/2016

Biomedical data are often collected and stored at different sites. Making the most of these data to improve patient care and to support academic research is increasingly important, and increasingly challenging given the privacy regulations associated with the data. Several barriers hinder sharing and exchanging information, such as the complexity of data formats, information leakage during data transmission, and big data issues. In this thesis, I focus on how to conduct integrated data analysis while ensuring data privacy and security during both data transmission and integration. In a small experiment with GLORE[1] implemented on both garbled circuits[2] and Intel® Software Guard Extensions (Intel® SGX), I found that Intel® SGX ran faster than garbled circuits. I therefore believe that Intel® SGX has the potential to substantially advance secure multiparty computation. Using Intel® SGX, I not only built a framework but also devised a more flexible model that lets participants cooperate more freely. My model, AGILE, leverages Intel® SGX to deliver trustworthy computations. Unlike existing models such as GLORE and VERTIGO[3], which address the integration problem when data are either horizontally or vertically partitioned, AGILE handles data that are arbitrarily partitioned. Furthermore, to demonstrate AGILE’s performance, I evaluated the model on two real datasets. The experimental results show that AGILE provides secure and accurate computation much faster than GLORE and VERTIGO.
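As a rough illustration of why logistic regression can be fitted over horizontally partitioned data without pooling patient rows, the sketch below has each site compute a gradient on its own rows and share only that aggregate. It is a minimal plain-Python sketch with made-up toy data and hypothetical function names; it is not the thesis's code and does not model Intel® SGX enclaves, garbled circuits, or arbitrary partitioning.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_gradient(rows, labels, weights):
    """Gradient of the logistic log-likelihood using one site's rows only."""
    grad = [0.0] * len(weights)
    for x, y in zip(rows, labels):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for j, xi in enumerate(x):
            grad[j] += (y - p) * xi
    return grad


def aggregate(gradients):
    """Coordinator step: sum the per-site gradients coordinate by coordinate."""
    return [sum(parts) for parts in zip(*gradients)]


# Toy data (assumed for illustration): each site holds different patients
# (rows) with the same features; the first feature is an intercept term.
site_a = ([[1.0, 0.5], [1.0, -1.2]], [1, 0])
site_b = ([[1.0, 2.0], [1.0, 0.3], [1.0, -0.7]], [1, 1, 0])
weights = [0.1, -0.2]

# Only the two gradient vectors leave the sites, never the rows themselves.
federated = aggregate([
    local_gradient(*site_a, weights),
    local_gradient(*site_b, weights),
])

# Reference value: the gradient a central analyst would get on pooled rows.
pooled = local_gradient(site_a[0] + site_b[0], site_a[1] + site_b[1], weights)
```

Because the log-likelihood is a sum over patients, the summed site gradients match the pooled gradient up to floating-point rounding, which is the property that makes row-wise (horizontal) partitioning tractable.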

SHAREOK repository

ModelChain: Decentralized Privacy-Preserving Healthcare Predictive Modeling Framework on Private Blockchain Networks

Author: Kuo Tsung-Ting
Ohno-Machado Lucila
Publication date: 05/02/2018

Cross-institutional healthcare predictive modeling can accelerate research and facilitate quality improvement initiatives, and is thus important for national healthcare delivery priorities. For example, a model that predicts risk of re-admission for a particular set of patients will be more generalizable if developed with data from multiple institutions. While privacy-protecting methods to build predictive models exist, most are based on a centralized architecture, which presents security and robustness vulnerabilities such as single-point-of-failure (and single-point-of-breach) and accidental or malicious modification of records. In this article, we describe a new framework, ModelChain, to adapt Blockchain technology for privacy-preserving machine learning. Each participating site contributes to model parameter estimation without revealing any patient health information (i.e., only model data, no observation-level data, are exchanged across institutions). We integrate privacy-preserving online machine learning with a private Blockchain network, apply transaction metadata to disseminate partial models, and design a new proof-of-information algorithm to determine the order of the online learning process. We also discuss the benefits and potential issues of applying Blockchain technology to solve the privacy-preserving healthcare predictive modeling task and to increase interoperability between institutions, to support the Nationwide Interoperability Roadmap and national healthcare delivery priorities such as Patient-Centered Outcomes Research (PCOR).

arXiv.org e-Print Archive

eScholarship - University of California

Preservation of Patient Level Privacy: Federated Classification and Calibration Models

Author: Huang Yingxiang
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

With the launch of the Precision Medicine Initiative in the United States by the National Institutes of Health, and the emergence of a large volume of electronic health records, there are many opportunities to improve clinical decision support systems. A large number of samples are needed to build predictive models that have adequate discrimination and calibration. However, protecting patient privacy is also an important issue. Patient data are typically protected in localized silos, and consolidation of datasets from different healthcare systems is difficult. Federated learning allows the training of a global model by amassing intermediate calculations from localized medical systems. The knowledge learned from the data can be transferred and aggregated to achieve better performance than that achieved by individual local models. Federated learning may help build better models, providing more accurate predictions. There are two types of measures to assess how well a model performs: discrimination and calibration. While most papers report discrimination measures, calibration has often been neglected, although it is a critical metric for evaluation. In this dissertation, I show a novel way to build classifiers and calibration models in a federated manner. I also show how I can evaluate and improve model calibration in this manner. Federated modeling enables the accumulation of knowledge and information that are otherwise locked behind local medical systems.
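To make the idea of assessing calibration from "intermediate calculations" concrete, the sketch below computes a binned expected calibration error (ECE) from per-site bin summaries, so that no patient-level prediction leaves a site. This is an assumed, generic construction in plain Python with made-up toy data and hypothetical function names; it is not the dissertation's method.

```python
def bin_summaries(probs, labels, n_bins=5):
    """Per-bin [count, sum of predicted probabilities, sum of observed labels]."""
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[b][0] += 1
        bins[b][1] += p
        bins[b][2] += y
    return bins


def merge(summaries):
    """Add the summaries from several sites, bin by bin and field by field."""
    return [[sum(vals) for vals in zip(*same_bin)] for same_bin in zip(*summaries)]


def expected_calibration_error(bins):
    """ECE = sum_b (n_b / N) * |mean predicted_b - mean observed_b|.

    The n_b factors cancel, so each non-empty bin contributes
    |sum predicted_b - sum observed_b| / N.
    """
    total = sum(count for count, _, _ in bins)
    return sum(abs(sp - sy) / total for count, sp, sy in bins if count)


# Toy predictions and outcomes held at two sites (assumed for illustration).
site_a = ([0.1, 0.4, 0.8], [0, 1, 1])
site_b = ([0.2, 0.7, 0.9, 0.35], [0, 0, 1, 0])

# Each site shares only its bin-level sums; the coordinator merges them.
federated_ece = expected_calibration_error(
    merge([bin_summaries(*site_a), bin_summaries(*site_b)])
)

# Reference value: the same metric computed on pooled predictions.
pooled_ece = expected_calibration_error(
    bin_summaries(site_a[0] + site_b[0], site_a[1] + site_b[1])
)
```

The two values agree up to floating-point rounding because the metric depends on the data only through per-bin counts and sums, which are additive across sites.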

eScholarship - University of California

Psychometric evaluation of the Simulator Sickness Questionnaire as a measure of cybersickness

Author: Stone III William B.
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2017

Some users of virtual reality (VR) technology experience negative symptoms, known as cybersickness, sometimes severe enough to cause discontinuation of VR use. Despite decades of research, there has been relatively little progress in understanding the underlying causal mechanisms of cybersickness. Review of the measures used to assess cybersickness symptoms, particularly the subjective psychological components of cybersickness, indicated that extant questionnaires may exhibit psychometric problems that could affect interpretation of results. In the present study, new data were collected (N = 202) to evaluate the psychometric properties of the Simulator Sickness Questionnaire (SSQ), the most commonly reported measure of cybersickness symptoms, in the context of virtual reality. Findings suggest that the SSQ, as commonly used, is not applicable to VR. An alternative approach to measuring cybersickness is suggested. Overall, incidence and severity of cybersickness were very low, and participants rated the VR experience as highly entertaining.

Digital Repository @ Iowa State University (ISU)

Implementing Vertical Federated Learning Using Autoencoders: Practical Application, Generalizability, and Utility Study

Author: 박유랑
성민동
차동철
Publication venue: JMIR Publications Inc.
Publication date: 01/06/2021

Background: Machine learning (ML) is now widely deployed in our everyday lives. Building robust ML models requires a massive amount of data for training. Traditional ML algorithms require training data centralization, which raises privacy and data governance issues. Federated learning (FL) is an approach to overcome this issue. We focused on applying FL to vertically partitioned data, in which an individual's record is scattered among different sites. Objective: The aim of this study was to perform FL on vertically partitioned data to achieve performance comparable to that of centralized models without exposing the raw data. Methods: We used three different datasets (Adult income, Schwannoma, and eICU datasets) and vertically divided each dataset into different pieces. Following the vertical division of data, overcomplete autoencoder-based model training was performed for each site. Following training, each site's data were transformed into latent data, which were aggregated for training. A tabular neural network model with categorical embedding was used for training. A centrally based model was used as a baseline and compared with the FL model in terms of accuracy and area under the receiver operating characteristic curve (AUROC). Results: The autoencoder-based network successfully transformed the original data into latent representations with no domain knowledge applied. These altered data were different from the original data in terms of the feature space and data distributions, indicating appropriate data security. The loss of performance was minimal when using an overcomplete autoencoder; accuracy loss was 1.2%, 8.89%, and 1.23%, and AUROC loss was 1.1%, 0%, and 1.12% in the Adult income, Schwannoma, and eICU datasets, respectively. Conclusions: We proposed an autoencoder-based ML model for vertically incomplete data. Since our model is based on unsupervised learning, no domain-specific knowledge is required in individual sites. Under circumstances where direct data sharing is not available, our approach may be a practical solution enabling both data protection and building a robust model.
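The data flow described in the Methods (each site encodes its own columns, and only latent data are aggregated) can be sketched as follows. This is a minimal plain-Python sketch under strong assumptions: a fixed linear projection stands in for each site's trained overcomplete autoencoder, and the record IDs, feature values, and function names are hypothetical. It does not reproduce the study's networks or results.

```python
def encode_site(records, projection):
    """Map each record's raw features to latent values with a site-local
    linear projection (a stand-in for a trained encoder)."""
    return {
        record_id: [sum(w * x for w, x in zip(weight_row, features))
                    for weight_row in projection]
        for record_id, features in records.items()
    }


def aggregate_latents(*encoded_sites):
    """Join latent vectors on the record IDs that every site holds."""
    shared_ids = sorted(set.intersection(*(set(site) for site in encoded_sites)))
    return {
        record_id: [value for site in encoded_sites for value in site[record_id]]
        for record_id in shared_ids
    }


# Vertically partitioned toy data: same individuals, different columns per site.
site_a = {"p1": [0.2, 0.7], "p2": [0.9, 0.1]}   # two features per record
site_b = {"p1": [0.5], "p2": [0.3]}             # one feature per record

# "Overcomplete" here means the latent dimension exceeds the input dimension.
projection_a = [[0.5, 0.5], [1.0, -1.0], [-0.3, 0.8]]   # 2 inputs -> 3 latents
projection_b = [[2.0], [-1.0]]                          # 1 input  -> 2 latents

# Only the encoded values are sent for aggregation, not the raw columns.
latent = aggregate_latents(
    encode_site(site_a, projection_a),
    encode_site(site_b, projection_b),
)
```

In this sketch the aggregated table has one row per shared individual and five latent columns (three from the first site, two from the second); a downstream model would be trained on that table rather than on the original features. A fixed linear map offers no real protection on its own and is used here only to show the shape of the pipeline.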

Yonsei University Medical Library Open Access Repository