5,171 research outputs found

    Concealment Conserving the Data Mining of Groups & Individual

    Get PDF
    We present an overview of privacy preserving data mining, one of the most popular directions in the data mining research community. In the first part of the chapter, we presented approaches that have been proposed for the protection of either the sensitive data itself in the course of data mining or the sensitive data mining results, in the context of traditional (relational) datasets. Following that, in the second part of the chapter, we focused our attention on one of the most recent as well as prominent directions in privacy preserving data mining: the mining of user mobility data. Although still in its infancy, privacy preserving data mining of mobility data has attracted a lot of research attention and already counts a number of methodologies both with respect to sensitive data protection and to sensitive knowledge hiding. Finally, in the end of the chapter, we provided some roadmap along the field of privacy preserving mobility data mining as well as the area of privacy preserving data mining at large

    Intelligence artificielle à la périphérie du réseau mobile avec efficacité de communication

    Get PDF
    L'intelligence artificielle (AI) et l'informatique à la périphérie du réseau (EC) ont permis de mettre en place diverses applications intelligentes incluant les maisons intelligentes, la fabrication intelligente, et les villes intelligentes. Ces progrès ont été alimentés principalement par la disponibilité d'un plus grand nombre de données, l'abondance de la puissance de calcul et les progrès de plusieurs techniques de compression. Toutefois, les principales avancées concernent le déploiement de modèles dans les dispositifs connectés. Ces modèles sont préalablement entraînés de manière centralisée. Cette prémisse exige que toutes les données générées par les dispositifs soient envoyées à un serveur centralisé, ce qui pose plusieurs problèmes de confidentialité et crée une surcharge de communication importante. Par conséquent, pour les derniers pas vers l'AI dans EC, il faut également propulser l'apprentissage des modèles ML à la périphérie du réseau. L'apprentissage fédéré (FL) est apparu comme une technique prometteuse pour l'apprentissage collaboratif de modèles ML sur des dispositifs connectés. Les dispositifs entraînent un modèle partagé sur leurs données stockées localement et ne partagent que les paramètres résultants avec une entité centralisée. Cependant, pour permettre l' utilisation de FL dans les réseaux périphériques sans fil, plusieurs défis hérités de l'AI et de EC doivent être relevés. En particulier, les défis liés à l'hétérogénéité statistique des données à travers les dispositifs ainsi que la rareté et l'hétérogénéité des ressources nécessitent une attention particulière. L'objectif de cette thèse est de proposer des moyens de relever ces défis et d'évaluer le potentiel de la FL dans de futures applications de villes intelligentes. Dans la première partie de cette thèse, l'accent est mis sur l'incorporation des propriétés des données dans la gestion de la participation des dispositifs dans FL et de l'allocation des ressources. Nous commençons par identifier les mesures de diversité des données qui peuvent être utilisées dans différentes applications. Ensuite, nous concevons un indicateur de diversité permettant de donner plus de priorité aux clients ayant des données plus informatives. Un algorithme itératif est ensuite proposé pour sélectionner conjointement les clients et allouer les ressources de communication. Cet algorithme accélère l'apprentissage et réduit le temps et l'énergie nécessaires. De plus, l'indicateur de diversité proposé est renforcé par un système de réputation pour éviter les clients malveillants, ce qui améliore sa robustesse contre les attaques par empoisonnement des données. Dans une deuxième partie de cette thèse, nous explorons les moyens de relever d'autres défis liés à la mobilité des clients et au changement de concept dans les distributions de données. De tels défis nécessitent de nouvelles mesures pour être traités. En conséquence, nous concevons un processus basé sur les clusters pour le FL dans les réseaux véhiculaires. Le processus proposé est basé sur la formation minutieuse de clusters pour contourner la congestion de la communication et est capable de traiter différents modèles en parallèle. Dans la dernière partie de cette thèse, nous démontrons le potentiel de FL dans un cas d'utilisation réel impliquant la prévision à court terme de la puissance électrique dans un réseau intelligent. Nous proposons une architecture permettant l'utilisation de FL pour encourager la collaboration entre les membres de la communauté et nous montrons son importance pour l'entraînement des modèles et la réduction du coût de communication à travers des résultats numériques.Abstract : Artificial intelligence (AI) and Edge computing (EC) have enabled various applications ranging from smart home, to intelligent manufacturing, and smart cities. This progress was fueled mainly by the availability of more data, abundance of computing power, and the progress of several compression techniques. However, the main advances are in relation to deploying cloud-trained machine learning (ML) models on edge devices. This premise requires that all data generated by end devices be sent to a centralized server, thus raising several privacy concerns and creating significant communication overhead. Accordingly, paving the last mile of AI on EC requires pushing the training of ML models to the edge of the network. Federated learning (FL) has emerged as a promising technique for the collaborative training of ML models on edge devices. The devices train a globally shared model on their locally stored data and only share the resulting parameters with a centralized entity. However, to enable FL in wireless edge networks, several challenges inherited from both AI and EC need to be addressed. In particular, challenges related to the statistical heterogeneity of the data across the devices alongside the scarcity and the heterogeneity of the resources require particular attention. The goal of this thesis is to propose ways to address these challenges and to evaluate the potential of FL in future applications. In the first part of this thesis, the focus is on incorporating the data properties of FL in handling the participation and resource allocation of devices in FL. We start by identifying data diversity measures allowing us to evaluate the richness of local datasets in different applications. Then, we design a diversity indicator allowing us to give more priority to clients with more informative data. An iterative algorithm is then proposed to jointly select clients and allocate communication resources. This algorithm accelerates the training and reduces the overall needed time and energy. Furthermore, the proposed diversity indicator is reinforced with a reputation system to avoid malicious clients, thus enhancing its robustness against poisoning attacks. In the second part of this thesis, we explore ways to tackle other challenges related to the mobility of the clients and concept-shift in data distributions. Such challenges require new measures to be handled. Accordingly, we design a cluster-based process for FL for the particular case of vehicular networks. The proposed process is based on careful clusterformation to bypass the communication bottleneck and is able to handle different models in parallel. In the last part of this thesis, we demonstrate the potential of FL in a real use-case involving short-term forecasting of electrical power in smart grid. We propose an architecture empowered with FL to encourage the collaboration among community members and show its importance for both training and judicious use of communication resources through numerical results

    Quantifying Privacy Loss of Human Mobility Graph Topology

    Get PDF
    Abstract Human mobility is often represented as a mobility network, or graph, with nodes representing places of significance which an individual visits, such as their home, work, places of social amenity, etc., and edge weights corresponding to probability estimates of movements between these places. Previous research has shown that individuals can be identified by a small number of geolocated nodes in their mobility network, rendering mobility trace anonymization a hard task. In this paper we build on prior work and demonstrate that even when all location and timestamp information is removed from nodes, the graph topology of an individual mobility network itself is often uniquely identifying. Further, we observe that a mobility network is often unique, even when only a small number of the most popular nodes and edges are considered. We evaluate our approach using a large dataset of cell-tower location traces from 1 500 smartphone handsets with a mean duration of 430 days. We process the data to derive the top−N places visited by the device in the trace, and find that 93% of traces have a unique top−10 mobility network, and all traces are unique when considering top−15 mobility networks. Since mobility patterns, and therefore mobility networks for an individual, vary over time, we use graph kernel distance functions, to determine whether two mobility networks, taken at different points in time, represent the same individual. We then show that our distance metrics, while imperfect predictors, perform significantly better than a random strategy and therefore our approach represents a significant loss in privacy.</jats:p

    Split Federated Learning for 6G Enabled-Networks: Requirements, Challenges and Future Directions

    Full text link
    Sixth-generation (6G) networks anticipate intelligently supporting a wide range of smart services and innovative applications. Such a context urges a heavy usage of Machine Learning (ML) techniques, particularly Deep Learning (DL), to foster innovation and ease the deployment of intelligent network functions/operations, which are able to fulfill the various requirements of the envisioned 6G services. Specifically, collaborative ML/DL consists of deploying a set of distributed agents that collaboratively train learning models without sharing their data, thus improving data privacy and reducing the time/communication overhead. This work provides a comprehensive study on how collaborative learning can be effectively deployed over 6G wireless networks. In particular, our study focuses on Split Federated Learning (SFL), a technique recently emerged promising better performance compared with existing collaborative learning approaches. We first provide an overview of three emerging collaborative learning paradigms, including federated learning, split learning, and split federated learning, as well as of 6G networks along with their main vision and timeline of key developments. We then highlight the need for split federated learning towards the upcoming 6G networks in every aspect, including 6G technologies (e.g., intelligent physical layer, intelligent edge computing, zero-touch network management, intelligent resource management) and 6G use cases (e.g., smart grid 2.0, Industry 5.0, connected and autonomous systems). Furthermore, we review existing datasets along with frameworks that can help in implementing SFL for 6G networks. We finally identify key technical challenges, open issues, and future research directions related to SFL-enabled 6G networks

    ANGELAH: A Framework for Assisting Elders At Home

    Get PDF
    The ever growing percentage of elderly people within modern societies poses welfare systems under relevant stress. In fact, partial and progressive loss of motor, sensorial, and/or cognitive skills renders elders unable to live autonomously, eventually leading to their hospitalization. This results in both relevant emotional and economic costs. Ubiquitous computing technologies can offer interesting opportunities for in-house safety and autonomy. However, existing systems partially address in-house safety requirements and typically focus on only elder monitoring and emergency detection. The paper presents ANGELAH, a middleware-level solution integrating both ”elder monitoring and emergency detection” solutions and networking solutions. ANGELAH has two main features: i) it enables efficient integration between a variety of sensors and actuators deployed at home for emergency detection and ii) provides a solid framework for creating and managing rescue teams composed of individuals willing to promptly assist elders in case of emergency situations. A prototype of ANGELAH, designed for a case study for helping elders with vision impairments, is developed and interesting results are obtained from both computer simulations and a real-network testbed
    corecore