    Machine Learning Algorithms for Privacy-preserving Behavioral Data Analytics

    PhD thesis

    Behavioral patterns observed in data generated by mobile and wearable devices are used by many applications, such as wellness monitoring or service personalization. However, sensitive information may be inferred from these data when they are shared with cloud-based services. In this thesis, we propose machine learning algorithms for data transformations that allow the inference of information required for specific tasks while preventing the inference of privacy-sensitive information. Specifically, we focus on protecting the user’s privacy when sharing motion-sensor data and web-browsing histories. Firstly, for human activity recognition using data from wearable sensors, we introduce two algorithms for training deep neural networks to transform motion-sensor data, focusing on two objectives: (i) to prevent the inference of privacy-sensitive activities (e.g. smoking or drinking), and (ii) to protect the user’s sensitive attributes (e.g. gender) and prevent re-identification of the user. We show how to combine these two algorithms and propose a compound architecture that protects both sensitive activities and attributes. Alongside the algorithmic contributions, we published a motion-sensor dataset for human activity recognition. Secondly, to prevent the identification of users from their web-browsing behavior, we introduce an algorithm for privacy-preserving collaborative training of contextual bandit algorithms. The proposed method improves the accuracy of personalized recommendation agents that run locally on the user’s devices. We propose an encoding algorithm for the user’s web-browsing data that preserves the information required for personalizing future content while ensuring differential privacy for the participants in collaborative training. In addition, for processing multivariate sensor data, we show how to make neural network architectures adaptive to dynamic sampling rates and sensor selection. This allows handling situations in human activity recognition where the dimensions of the input data can vary at inference time. Specifically, we introduce a customized pooling layer for neural networks and propose a customized training procedure to generalize over a large number of feasible data dimensions. Using the proposed architectural improvement, we show how to convert existing non-adaptive deep neural networks into adaptive networks while keeping the same classification accuracy. We conclude this thesis by discussing open questions and potential future directions for continuing research in this area.
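
    To illustrate why a pooling step can make a network tolerant of varying input dimensions, the following is a minimal sketch of generic adaptive average pooling. It is not the thesis's customized pooling layer, whose details the abstract does not give. The function names `adaptive_avg_pool_1d` and `pool_window` are hypothetical, and the sketch assumes each sensor channel is a plain list of numbers. Whatever the sampling rate (samples per channel) or the number of selected sensors (channels), the output always has the same length, so later layers could stay unchanged.

```python
def adaptive_avg_pool_1d(sequence, out_size):
    """Average a variable-length sequence into exactly out_size bins.

    Generic adaptive average pooling (an illustrative assumption, not the
    thesis's layer): bin i covers indices floor(i*n/out) to ceil((i+1)*n/out).
    """
    n = len(sequence)
    if n == 0 or out_size <= 0:
        raise ValueError("need a non-empty sequence and a positive out_size")
    pooled = []
    for i in range(out_size):
        start = (i * n) // out_size                  # floor(i * n / out_size)
        end = -((-(i + 1) * n) // out_size)          # ceil((i + 1) * n / out_size)
        window = sequence[start:end]                 # never empty, since end > start
        pooled.append(sum(window) / len(window))
    return pooled


def pool_window(channels, out_size):
    """Map any number of sensor channels, each of any length, to out_size values.

    Each channel is pooled over time, then the channels are averaged, so the
    result does not depend on how many sensors were selected.
    """
    if not channels:
        raise ValueError("need at least one channel")
    per_channel = [adaptive_avg_pool_1d(ch, out_size) for ch in channels]
    return [sum(col) / len(col) for col in zip(*per_channel)]


if __name__ == "__main__":
    # Two sensors sampled at different rates still yield a fixed-size output.
    print(pool_window([[1, 2, 3, 4], [10, 20]], 2))
```

    Averaging across channels is only one possible design choice here; it discards which sensor produced which signal, which a real model would likely need to handle more carefully.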