Optimization of transit smart card data publishing based on differential privacy

Abstract

“Privacy budget allocation is a key step in the development of differential privacy (DP)-based privacy-preserving data publishing (PPDP) algorithms, as it directly impacts the data utility of the released dataset. This research describes the development of an optimal privacy budget allocation algorithm for transit smart card data publishing, with the goal of publishing non-interactive sanitized trajectory data under a differential privacy definition. To this end, after storing the smart card trajectory data in a prefix tree structure, a query probability model is built to quantitatively measure the probability of a trajectory location pair being queried. Next, a privacy budget is calculated for each prefix tree node to minimize the query error while satisfying the differential privacy definition. The optimal privacy budget values are derived with the Lagrangian relaxation method, with several solution properties proposed. Real-life metro smart card data from Shenzhen, China, which includes a total of 2.8 million individual travelers and over 220 million records, is used in the case study section. The developed algorithm is demonstrated to output a sanitized dataset with higher utility when compared with previous research”--Abstract, page iii
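
The budget allocation idea summarized in the abstract can be illustrated with a much simplified sketch. It is not the thesis's algorithm: the abstract does not state the objective function or the query probability model, so the code below assumes one budget per prefix tree level, Laplace noise with sensitivity 1 (variance 2/eps^2), and a hypothetical weight per level standing in for the query probability. Under those assumptions, minimizing the weighted noise variance subject to the budgets summing to the total epsilon has a closed-form Lagrangian solution in which each budget is proportional to the cube root of its weight. The function names `prefix_counts`, `allocate_budget`, and `expected_error` are illustrative only.

```python
from collections import Counter


def prefix_counts(trajectories):
    """Count trajectories passing through each prefix (one prefix tree node per prefix)."""
    counts = Counter()
    for traj in trajectories:
        for depth in range(1, len(traj) + 1):
            counts[tuple(traj[:depth])] += 1
    return counts


def allocate_budget(weights, total_epsilon):
    """Split total_epsilon across tree levels (assumed, simplified objective).

    Minimizes sum_i w_i * 2 / eps_i**2 subject to sum_i eps_i = total_epsilon.
    Setting the Lagrangian's derivative to zero gives eps_i proportional to w_i**(1/3).
    """
    roots = [w ** (1.0 / 3.0) for w in weights]
    norm = sum(roots)
    return [total_epsilon * r / norm for r in roots]


def expected_error(weights, budgets):
    """Weighted noise variance; a Laplace draw with scale 1/eps has variance 2/eps**2."""
    return sum(w * 2.0 / (e * e) for w, e in zip(weights, budgets))


# Toy station sequences (hypothetical data, not from the Shenzhen dataset).
trajectories = [("A", "B", "C"), ("A", "B"), ("A", "D"), ("E", "B", "C")]
counts = prefix_counts(trajectories)

# Hypothetical per-level weights: level 1 is assumed to be queried most often.
weights = [8.0, 1.0, 1.0]
budgets = allocate_budget(weights, total_epsilon=1.0)
uniform = [1.0 / len(weights)] * len(weights)

print(counts[("A",)], counts[("A", "B")])
print(budgets)
print(expected_error(weights, budgets), expected_error(weights, uniform))
```

In this toy setting the weighted allocation gives the heavily queried first level the largest share of the budget and yields a lower weighted error than an even split. The per-level split relies on the standard composition argument that each trajectory touches one node per level, so budgets add along a root-to-leaf path while nodes within a level cover disjoint sets of trajectories.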
