Knowledge Discovery From Sensitive Data: Differentially Private Bayesian Rule Lists

Belardo, Armando

Knowledge Discovery From Sensitive Data: Differentially Private Bayesian Rule Lists

Authors: Armando Belardo
Publication date: 1 January 2018
Publisher: Digital Commons @ Colgate

Abstract

The utility of machine learning is rising, coming from a growing wealth of data and problems that are becoming harder to solve analytically. With these changes there is also the need for interpretable machine learning in order for users to understand how a machine learning algorithm comes to a specific output. Bayesian Rule Lists, an interpretable machine learning algorithm, offers an advanced accuracy to interpretabilty trade off when compared to other interpretable machine learning algorithms. Additionally, with the amount of data collected today, there is a lot of potentially sensitive data that we can learn from such as medical and criminal records. However, to do so, we must guarantee a degree of privacy on the dataset; differential privacy has become the standard for this private data analysis. In this paper, we propose a differentially private algorithm for Bayesian Rule Lists. We first break down the original Bayesian Rule List algorithm into three main components: frequent itemset mining, rule list sampling, and point estimate computation. We then perform a literature review to understand these algorithms, and ways to privatize them. There after we computed the necessary sensitivities for all subroutines, and ran experiments on the resulting differentially private algorithm to gauge utility. Results show that the proposed algorithm is able to output rule lists with good accuracy and decent interpretability

Similar works

Full text

Open in the Core reader

Download PDF

Available Versions

Colgate University: Digital Commons @ Colgate

oai:commons.colgate.edu:theses...

Last time updated on 17/10/2019