Search CORE

32,141 research outputs found

Privacy-preserving scoring of tree ensembles : a novel framework for AI in healthcare

Author: De Cock Martine
Dowsley Rafael
Fritchman Kyle
Hughes Tyler
Nascimento Anderson
Saminathan Keerthanaa
Teredesai Ankur
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Machine Learning (ML) techniques now impact a wide variety of domains. Highly regulated industries such as healthcare and finance have stringent compliance and data governance policies around data sharing. Advances in secure multiparty computation (SMC) for privacy-preserving machine learning (PPML) can help transform these regulated industries by allowing ML computations over encrypted data with personally identifiable information (PII). Yet very little of SMC-based PPML has been put into practice so far. In this paper we present the very first framework for privacy-preserving classification of tree ensembles with application in healthcare. We first describe the underlying cryptographic protocols that enable a healthcare organization to send encrypted data securely to a ML scoring service and obtain encrypted class labels without the scoring service actually seeing that input in the clear. We then describe the deployment challenges we solved to integrate these protocols in a cloud based scalable risk-prediction platform with multiple ML models for healthcare AI. Included are system internals, and evaluations of our deployment for supporting physicians to drive better clinical outcomes in an accurate, scalable, and provably secure manner. To the best of our knowledge, this is the first such applied framework with SMC-based privacy-preserving machine learning for healthcare

TLAD 2010 Proceedings:8th international workshop on teaching, learning and assesment of databases (TLAD)

Author
Publication venue
Publication date: 01/01/2010
Field of study

This is the eighth in the series of highly successful international workshops on the Teaching, Learning and Assessment of Databases (TLAD 2010), which once again is held as a workshop of BNCOD 2010 - the 27th International Information Systems Conference. TLAD 2010 is held on the 28th June at the beautiful Dudhope Castle at the Abertay University, just before BNCOD, and hopes to be just as successful as its predecessors.The teaching of databases is central to all Computing Science, Software Engineering, Information Systems and Information Technology courses, and this year, the workshop aims to continue the tradition of bringing together both database teachers and researchers, in order to share good learning, teaching and assessment practice and experience, and further the growing community amongst database academics. As well as attracting academics from the UK community, the workshop has also been successful in attracting academics from the wider international community, through serving on the programme committee, and attending and presenting papers.This year, the workshop includes an invited talk given by Richard Cooper (of the University of Glasgow) who will present a discussion and some results from the Database Disciplinary Commons which was held in the UK over the academic year. Due to the healthy number of high quality submissions this year, the workshop will also present seven peer reviewed papers, and six refereed poster papers. Of the seven presented papers, three will be presented as full papers and four as short papers. These papers and posters cover a number of themes, including: approaches to teaching databases, e.g. group centered and problem based learning; use of novel case studies, e.g. forensics and XML data; techniques and approaches for improving teaching and student learning processes; assessment techniques, e.g. peer review; methods for improving students abilities to develop database queries and develop E-R diagrams; and e-learning platforms for supporting teaching and learning

TLAD 2010 Proceedings:8th international workshop on teaching, learning and assesment of databases (TLAD)

Author
Publication venue
Publication date: 01/01/2010
Field of study

Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

Author: A Arcuri
AL Rector
AM Wood
AS Glas
B Kulis
C Cortes
C Sammut
CC Diamond
CD Kidd
CR MacIntyre
DP Lewis
E Koumoundouros
E Rahm
EM Knorr
ES Fisher
GE Box
GM Weber
H Carter
H He
H Meyer
H Quan
HH Hoos
I Yoo
J Andreu-Perez
J Fan
J Zhao
JD Lafferty
JM Bland
JW Graham
K Lange
KP Murphy
LA King
LM Collins
M Azarm-Daigle
M Kantardzic
M Sokolova
MA Stoto
N Oreskes
PB Jensen
PK Lindenauer
PM Visscher
RJ Little
V López
V Sessions
VN Vapnik
W Raghupathi
Y Luo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/01/2018
Field of study

From medical charts to national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, today, healthcare now generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitalized patient records, it is important to remember that volume represents only one aspect of the data. In fact, drawing on data from an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques.Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 Pages, 1 Figur

arXiv.org e-Print Archive