3 research outputs found

    Machine Learning Algorithms for Privacy-preserving Behavioral Data Analytics

    Get PDF
    PhD thesisBehavioral patterns observed in data generated by mobile and wearable devices are used by many applications, such as wellness monitoring or service personalization. However, sensitive information may be inferred from these data when they are shared with cloud-based services. In this thesis, we propose machine learning algorithms for data transformations to allow the inference of information required for specific tasks while preventing the inference of privacy-sensitive information. Specifically, we focus on protecting the user’s privacy when sharing motion-sensor data and web-browsing histories. Firstly, for human activity recognition using data of wearable sensors, we introduce two algorithms for training deep neural networks to transform motion-sensor data, focusing on two objectives: (i) to prevent the inference of privacy-sensitive activities (e.g. smoking or drinking), and (ii) to protect user’s sensitive attributes (e.g. gender) and prevent the re-identification of user. We show how to combine these two algorithms and propose a compound architecture that protects both sensitive activities and attributes. Alongside the algorithmic contributions, we published a motion-sensor dataset for human activity recognition. Secondly, to prevent the identification of users using their web-browsing behavior, we introduce an algorithm for privacy-preserving collaborative training of contextual bandit algorithms. The proposed method improves the accuracy of personalized recommendation agents that run locally on the user’s devices. We propose an encoding algorithm for the user’s web-browsing data that preserves the required information for the personalization of the future contents while ensuring differential privacy for the participants in collaborative training. In addition, for processing multivariate sensor data, we show how to make neural network architectures adaptive to dynamic sampling rate and sensor selection. This allows handling situations in human activity recognition where the dimensions of input data can be varied at inference time. Specifically, we introduce a customized pooling layer for neural networks and propose a customized training procedure to generalize over a large number of feasible data dimensions. Using the proposed architectural improvement, we show how to convert existing non-adaptive deep neural networks into an adaptive network while keeping the same classification accuracy. We conclude this thesis by discussing open questions and the potential future directions for continuing research in this area

    Online Privacy in Mobile and Web Platforms: Risk Quantification and Obfuscation Techniques

    Full text link
    The wide-spread use of the web and mobile platforms and their high engagement in human lives pose serious threats to the privacy and confidentiality of users. It has been demonstrated in a number of research works that devices, such as desktops, mobile, and web browsers contain subtle information and measurable variation, which allow them to be fingerprinted. Moreover, behavioural tracking is another form of privacy threat that is induced by the collection and monitoring of users gestures such as touch, motion, GPS, search queries, writing pattern, and more. The success of these methods is a clear indication that obfuscation techniques to protect the privacy of individuals, in reality, are not successful if the collected data contains potentially unique combinations of attributes relating to specific individuals. With this in view, this thesis focuses on understanding the privacy risks across the web and mobile platforms by identifying and quantifying the privacy leakages and then designing privacy preserving frameworks against identified threats. We first investigate the potential of using touch-based gestures to track mobile device users. For this purpose, we propose and develop an analytical framework that quantifies the amount of information carried by the user touch gestures. We then quantify users privacy risk in the web data using probabilistic method that incorporates all key privacy aspects, which are uniqueness, uniformity, and linkability of the web data. We also perform a large-scale study of dependency chains in the web and find that a large proportion of websites under-study load resources from suspicious third-parties that are known to mishandle user data and risk privacy leaks. The second half of the thesis addresses the abovementioned identified privacy risks by designing and developing privacy preserving frameworks for the web and mobile platforms. We propose an on-device privacy preserving framework that minimizes privacy leakages by bringing down the risk of trackability and distinguishability of mobile users while preserving the functionality of the existing apps/services. We finally propose a privacy-aware obfuscation framework for the web data having high predicted risk. Using differentially-private noise addition, our proposed framework is resilient against adversary who has knowledge about the obfuscation mechanism, HMM probabilities and the training dataset
    corecore