5,171 research outputs found
Concealment Conserving the Data Mining of Groups & Individual
We present an overview of privacy preserving data mining, one of the most popular directions in the data mining research community. In the first part of the chapter, we presented approaches that have been proposed for the protection of either the sensitive data itself in the course of data mining or the sensitive data mining results, in the context of traditional (relational) datasets. Following that, in the second part of the chapter, we focused our attention on one of the most recent as well as prominent directions in privacy preserving data mining: the mining of user mobility data. Although still in its infancy, privacy preserving data mining of mobility data has attracted a lot of research attention and already counts a number of methodologies both with respect to sensitive data protection and to sensitive knowledge hiding. Finally, in the end of the chapter, we provided some roadmap along the field of privacy preserving mobility data mining as well as the area of privacy preserving data mining at large
Recommended from our members
Context-awareness for mobile sensing: a survey and future directions
The evolution of smartphones together with increasing computational power have empowered developers to create innovative context-aware applications for recognizing user related social and cognitive activities in any situation and at any location. The existence and awareness of the context provides the capability of being conscious of physical environments or situations around mobile device users. This allows network services to respond proactively and intelligently based on such awareness. The key idea behind context-aware applications is to encourage users to collect, analyze and share local sensory knowledge in the purpose for a large scale community use by creating a smart network. The desired network is capable of making autonomous logical decisions to actuate environmental objects, and also assist individuals. However, many open challenges remain, which are mostly arisen due to the middleware services provided in mobile devices have limited resources in terms of power, memory and bandwidth. Thus, it becomes critically important to study how the drawbacks can be elaborated and resolved, and at the same time better understand the opportunities for the research community to contribute to the context-awareness. To this end, this paper surveys the literature over the period of 1991-2014 from the emerging concepts to applications of context-awareness in mobile platforms by providing up-to-date research and future research directions. Moreover, it points out the challenges faced in this regard and enlighten them by proposing possible solutions
Intelligence artificielle à la périphérie du réseau mobile avec efficacité de communication
L'intelligence artificielle (AI) et l'informatique à la périphérie du réseau (EC) ont permis de mettre en place diverses applications intelligentes incluant les maisons intelligentes, la fabrication intelligente, et les villes intelligentes. Ces progrès ont été alimentés principalement par la disponibilité d'un plus grand nombre de données, l'abondance de la puissance de calcul et les progrès de plusieurs techniques de compression. Toutefois, les principales avancées concernent le déploiement de modèles dans les dispositifs connectés. Ces modèles sont préalablement entraînés de manière centralisée. Cette prémisse exige que toutes les données générées par les dispositifs soient envoyées à un serveur centralisé, ce qui pose plusieurs problèmes de confidentialité et crée une surcharge de communication importante. Par conséquent, pour les derniers pas vers l'AI dans EC, il faut également propulser l'apprentissage des modèles ML à la périphérie du réseau.
L'apprentissage fédéré (FL) est apparu comme une technique prometteuse pour l'apprentissage collaboratif de modèles ML sur des dispositifs connectés. Les dispositifs entraînent un modèle partagé sur leurs données stockées localement et ne partagent que les paramètres résultants avec une entité centralisée. Cependant, pour permettre l' utilisation de FL dans les réseaux périphériques sans fil, plusieurs défis hérités de l'AI et de EC doivent être relevés. En particulier, les défis liés à l'hétérogénéité statistique des données à travers les dispositifs ainsi que la rareté et l'hétérogénéité des ressources nécessitent une attention particulière. L'objectif de cette thèse est de proposer des moyens de relever ces défis et d'évaluer le potentiel de la FL dans de futures applications de villes intelligentes.
Dans la première partie de cette thèse, l'accent est mis sur l'incorporation des propriétés des données dans la gestion de la participation des dispositifs dans FL et de l'allocation des ressources. Nous commençons par identifier les mesures de diversité des données qui peuvent être utilisées dans différentes applications. Ensuite, nous concevons un indicateur de diversité permettant de donner plus de priorité aux clients ayant des données plus informatives. Un algorithme itératif est ensuite proposé pour sélectionner conjointement les clients et allouer les ressources de communication. Cet algorithme accélère l'apprentissage et réduit le temps et l'énergie nécessaires. De plus, l'indicateur de diversité proposé est renforcé par un système de réputation pour éviter les clients malveillants, ce qui améliore sa robustesse contre les attaques par empoisonnement des données.
Dans une deuxième partie de cette thèse, nous explorons les moyens de relever d'autres défis liés à la mobilité des clients et au changement de concept dans les distributions de données. De tels défis nécessitent de nouvelles mesures pour être traités. En conséquence, nous concevons un processus basé sur les clusters pour le FL dans les réseaux véhiculaires. Le processus proposé est basé sur la formation minutieuse de clusters pour contourner la congestion de la communication et est capable de traiter différents modèles en parallèle.
Dans la dernière partie de cette thèse, nous démontrons le potentiel de FL dans un cas d'utilisation réel impliquant la prévision à court terme de la puissance électrique dans un réseau intelligent. Nous proposons une architecture permettant l'utilisation de FL pour encourager la collaboration entre les membres de la communauté et nous montrons son importance pour l'entraînement des modèles et la réduction du coût de communication à travers des résultats numériques.Abstract : Artificial intelligence (AI) and Edge computing (EC) have enabled various applications
ranging from smart home, to intelligent manufacturing, and smart cities. This progress
was fueled mainly by the availability of more data, abundance of computing power, and
the progress of several compression techniques. However, the main advances are in relation
to deploying cloud-trained machine learning (ML) models on edge devices. This premise
requires that all data generated by end devices be sent to a centralized server, thus raising
several privacy concerns and creating significant communication overhead. Accordingly,
paving the last mile of AI on EC requires pushing the training of ML models to the
edge of the network. Federated learning (FL) has emerged as a promising technique for
the collaborative training of ML models on edge devices. The devices train a globally
shared model on their locally stored data and only share the resulting parameters with
a centralized entity. However, to enable FL in wireless edge networks, several challenges
inherited from both AI and EC need to be addressed. In particular, challenges related
to the statistical heterogeneity of the data across the devices alongside the scarcity and
the heterogeneity of the resources require particular attention. The goal of this thesis is
to propose ways to address these challenges and to evaluate the potential of FL in future
applications. In the first part of this thesis, the focus is on incorporating the data properties of FL in
handling the participation and resource allocation of devices in FL. We start by identifying
data diversity measures allowing us to evaluate the richness of local datasets in different
applications. Then, we design a diversity indicator allowing us to give more priority to
clients with more informative data. An iterative algorithm is then proposed to jointly select
clients and allocate communication resources. This algorithm accelerates the training
and reduces the overall needed time and energy. Furthermore, the proposed diversity
indicator is reinforced with a reputation system to avoid malicious clients, thus enhancing
its robustness against poisoning attacks. In the second part of this thesis, we explore ways to tackle other challenges related to
the mobility of the clients and concept-shift in data distributions. Such challenges require
new measures to be handled. Accordingly, we design a cluster-based process for FL for the
particular case of vehicular networks. The proposed process is based on careful clusterformation
to bypass the communication bottleneck and is able to handle different models
in parallel. In the last part of this thesis, we demonstrate the potential of FL in a real use-case involving
short-term forecasting of electrical power in smart grid. We propose an architecture
empowered with FL to encourage the collaboration among community members and show
its importance for both training and judicious use of communication resources through
numerical results
Quantifying Privacy Loss of Human Mobility Graph Topology
Abstract
Human mobility is often represented as a mobility network, or graph, with nodes representing places of significance which an individual visits, such as their home, work, places of social amenity, etc., and edge weights corresponding to probability estimates of movements between these places. Previous research has shown that individuals can be identified by a small number of geolocated nodes in their mobility network, rendering mobility trace anonymization a hard task. In this paper we build on prior work and demonstrate that even when all location and timestamp information is removed from nodes, the graph topology of an individual mobility network itself is often uniquely identifying. Further, we observe that a mobility network is often unique, even when only a small number of the most popular nodes and edges are considered. We evaluate our approach using a large dataset of cell-tower location traces from 1 500 smartphone handsets with a mean duration of 430 days. We process the data to derive the top−N places visited by the device in the trace, and find that 93% of traces have a unique top−10 mobility network, and all traces are unique when considering top−15 mobility networks. Since mobility patterns, and therefore mobility networks for an individual, vary over time, we use graph kernel distance functions, to determine whether two mobility networks, taken at different points in time, represent the same individual. We then show that our distance metrics, while imperfect predictors, perform significantly better than a random strategy and therefore our approach represents a significant loss in privacy.</jats:p
Split Federated Learning for 6G Enabled-Networks: Requirements, Challenges and Future Directions
Sixth-generation (6G) networks anticipate intelligently supporting a wide
range of smart services and innovative applications. Such a context urges a
heavy usage of Machine Learning (ML) techniques, particularly Deep Learning
(DL), to foster innovation and ease the deployment of intelligent network
functions/operations, which are able to fulfill the various requirements of the
envisioned 6G services. Specifically, collaborative ML/DL consists of deploying
a set of distributed agents that collaboratively train learning models without
sharing their data, thus improving data privacy and reducing the
time/communication overhead. This work provides a comprehensive study on how
collaborative learning can be effectively deployed over 6G wireless networks.
In particular, our study focuses on Split Federated Learning (SFL), a technique
recently emerged promising better performance compared with existing
collaborative learning approaches. We first provide an overview of three
emerging collaborative learning paradigms, including federated learning, split
learning, and split federated learning, as well as of 6G networks along with
their main vision and timeline of key developments. We then highlight the need
for split federated learning towards the upcoming 6G networks in every aspect,
including 6G technologies (e.g., intelligent physical layer, intelligent edge
computing, zero-touch network management, intelligent resource management) and
6G use cases (e.g., smart grid 2.0, Industry 5.0, connected and autonomous
systems). Furthermore, we review existing datasets along with frameworks that
can help in implementing SFL for 6G networks. We finally identify key technical
challenges, open issues, and future research directions related to SFL-enabled
6G networks
ANGELAH: A Framework for Assisting Elders At Home
The ever growing percentage of elderly people within modern societies poses welfare systems under relevant stress. In fact, partial and progressive loss of motor, sensorial, and/or cognitive skills renders elders unable to live autonomously, eventually leading to their hospitalization. This results in both relevant emotional and economic costs. Ubiquitous computing technologies can offer interesting opportunities for in-house safety and autonomy. However, existing systems partially address in-house safety requirements and typically focus on only elder monitoring and emergency detection. The paper presents ANGELAH, a middleware-level solution integrating both ”elder monitoring and emergency detection” solutions and networking solutions. ANGELAH has two main features: i) it enables efficient integration between a variety of sensors and actuators deployed at home for emergency detection and ii) provides a solid framework for creating and managing rescue teams composed of individuals willing to promptly assist elders in case of emergency situations. A prototype of ANGELAH, designed for a case study for helping elders with vision impairments, is developed and interesting results are obtained from both computer simulations and a real-network testbed
- …