Pervasive Data Access in Wireless and Mobile Computing Environments
The rapid advance of wireless and portable computing technology has brought considerable research interest and momentum to the area of mobile computing. One research focus is pervasive data access: with wireless connections, users can access information at any place and at any time. However, various constraints such as limited client capability, limited bandwidth, weak connectivity, and client mobility pose many challenging technical issues. In recent years, tremendous research effort has been devoted to the issues related to pervasive data access, and a number of interesting results have been reported in the literature. This survey paper reviews important work along two dimensions of pervasive data access: data broadcast and client caching. In addition, data access techniques aimed at various application requirements (such as time, location, semantics, and reliability) are covered.
Clustered FedStack: Intermediate Global Models with Bayesian Information Criterion
Federated Learning (FL) is currently one of the most popular technologies in
the field of Artificial Intelligence (AI) due to its collaborative learning and
ability to preserve client privacy. However, it faces challenges such as
non-independent and non-identically distributed (non-IID) data and
imbalanced labels among local clients. To address these limitations, the
research community has explored various approaches such as using local model
parameters, federated generative adversarial learning, and federated
representation learning. In our study, we propose a novel Clustered FedStack
framework based on the previously published Stacked Federated Learning
(FedStack) framework. The local clients send their model predictions and output
layer weights to a server, which then builds a robust global model. This global
model clusters the local clients based on their output layer weights using a
clustering mechanism. We adopt three clustering mechanisms, namely K-Means,
Agglomerative, and Gaussian Mixture Models, into the framework and evaluate
their performance. We use the Bayesian Information Criterion (BIC) with the maximum
likelihood function to determine the number of clusters. The Clustered FedStack
models outperform baseline models with clustering mechanisms. To estimate the
convergence of our proposed framework, we use cyclical learning rates.
Comment: This work has been submitted to Elsevier for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
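As a rough illustration of how BIC can pick a cluster count, the sketch below scores each candidate with BIC = k ln(n) - 2 ln(L) and keeps the lowest score. It is not the authors' implementation: the function names are hypothetical, and the log-likelihood values are made-up toy numbers standing in for the maximized likelihoods of fitted mixture models (parameter counts follow 3k - 1 for a one-dimensional Gaussian mixture with k components).

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian Information Criterion: n_params * ln(n_samples) - 2 * log-likelihood.

    Lower values indicate a better trade-off between fit and model size.
    """
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_num_clusters(fits: dict[int, tuple[float, int]], n_samples: int) -> int:
    """Return the candidate cluster count with the lowest BIC.

    `fits` maps each candidate cluster count to a pair of
    (maximized log-likelihood, number of free parameters).
    """
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_samples))


# Toy values for illustration only (not taken from the paper).
toy_fits = {
    1: (-520.0, 2),
    2: (-410.0, 5),
    3: (-405.0, 8),
    4: (-403.0, 11),
}
best_k = select_num_clusters(toy_fits, n_samples=200)
print(best_k)
```

With these toy numbers, the likelihood gain beyond two clusters is too small to offset the penalty for extra parameters, so two clusters are selected.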
A generic persistence model for CLP systems (and two useful implementations)
This paper describes a model of persistence in (C)LP languages and two different and practically very useful ways to implement this model in current systems. The fundamental idea is that persistence is a characteristic of certain dynamic predicates (i.e., those which encapsulate state). The main effect of declaring a predicate persistent is that the dynamic changes made to such predicates persist from one execution to the next. After proposing a syntax for declaring persistent predicates, a simple, file-based implementation of the concept is presented and some examples are shown. An additional implementation is presented which stores persistent predicates in an external database. The abstraction of the concept of persistence from its implementation allows developing applications which can store their persistent predicates alternatively in files or databases with only a few simple changes to a declaration stating the location and modality used for persistent storage. The paper presents the model, the implementation approach in both the cases of using files and relational databases, a number of optimizations of the process (using information obtained from static global analysis and goal clustering), and performance results from an implementation of these ideas.
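The file-based idea can be sketched as a journal of assert and retract operations that is replayed when the program starts. The sketch is in Python, not the (C)LP syntax the paper proposes, and the class name, method names, and JSON-lines journal format are all assumptions made for illustration.

```python
import json
import os
import tempfile


class PersistentFacts:
    """Facts whose updates are appended to a journal file and replayed on load."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.facts: list[tuple] = []
        # Replay earlier executions' updates, if a journal already exists.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as journal:
                for line in journal:
                    op, fact = json.loads(line)
                    self._apply(op, tuple(fact))

    def _apply(self, op: str, fact: tuple) -> None:
        """Apply one operation to the in-memory fact list."""
        if op == "assert":
            self.facts.append(fact)
        elif op == "retract" and fact in self.facts:
            self.facts.remove(fact)

    def _log(self, op: str, fact: tuple) -> None:
        """Append one operation to the journal so it survives this execution."""
        with open(self.path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps([op, list(fact)]) + "\n")

    def assert_fact(self, *fact) -> None:
        self._apply("assert", fact)
        self._log("assert", fact)

    def retract_fact(self, *fact) -> bool:
        """Remove a fact if present; return whether anything was removed."""
        if fact not in self.facts:
            return False
        self._apply("retract", fact)
        self._log("retract", fact)
        return True


# Hypothetical usage: a second instance on the same file simulates the next execution.
path = os.path.join(tempfile.mkdtemp(), "visits.facts")
store = PersistentFacts(path)
store.assert_fact("visits", 1)
store.assert_fact("visits", 2)
store.retract_fact("visits", 1)
reloaded = PersistentFacts(path)
print(reloaded.facts)
```

Because callers only see `assert_fact` and `retract_fact`, the journal file could in principle be swapped for a database table without changing application code, which mirrors the separation of concept and storage described above.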
A Decision Technology System To Advance the Diagnosis and Treatment of Breast Cancer
Geographical variations in cancer rates have been observed for decades. Described spatial patterns and trends have provided clues for generating hypotheses about the etiology of cancer. For breast cancer, investigators have demonstrated that some variation can be explained by differences in the population distribution of known breast cancer risk factors such as menstrual and reproductive variables (Laden, Spiegelman, and Neas, 1997; Robbins, Bescianini, and Kelsey, 1997; Sturgeon, Schairer, and Gail, 1995). However, regional patterns may also reflect the effects of (Workshop on Hormones, Hormone Metabolism, Environment, and Breast Cancer, 1995): (a) environmental hazards (such as air and water pollution), (b) demographics and the lifestyle of a mobile population, (c) subgroup susceptibility, (d) changes and advances in medical practice and healthcare management, and (e) other factors. To accurately measure breast cancer risk in individuals and population groups, it is necessary to assess, singly and jointly, the association between such risk and the hypothesized factors. Various statistical models will be needed to determine the potential relationships between breast cancer development and estimated exposures to environmental contamination. To apply the models, data must be assembled from a variety of sources, converted into the statistical models' parameters, and delivered effectively to researchers and policy makers. A Web-enabled decision technology system can be developed to provide the needed functionality. This chapter will present a conceptual architecture for such a decision technology system. First, there will be a brief overview of a typical geographical analysis. Next, the chapter will present the conceptual Web-based decision technology system and illustrate how the system can assist users in diagnosing and treating breast cancer. The chapter will conclude with an examination of the potential benefits from system use and the implications for breast cancer research and practice.
Intelligence artificielle à la périphérie du réseau mobile avec efficacité de communication [Communication-Efficient Artificial Intelligence at the Mobile Network Edge]
Artificial intelligence (AI) and edge computing (EC) have enabled various applications, ranging from smart homes to intelligent manufacturing and smart cities. This progress was fueled mainly by the availability of more data, an abundance of computing power, and progress in several compression techniques. However, the main advances relate to deploying cloud-trained machine learning (ML) models on edge devices. This premise requires that all data generated by end devices be sent to a centralized server, which raises several privacy concerns and creates significant communication overhead. Accordingly, paving the last mile of AI on EC requires pushing the training of ML models to the edge of the network.
Federated learning (FL) has emerged as a promising technique for the collaborative training of ML models on edge devices. The devices train a globally shared model on their locally stored data and share only the resulting parameters with a centralized entity. However, to enable FL in wireless edge networks, several challenges inherited from both AI and EC need to be addressed. In particular, challenges related to the statistical heterogeneity of the data across devices, alongside the scarcity and heterogeneity of resources, require particular attention. The goal of this thesis is to propose ways to address these challenges and to evaluate the potential of FL in future smart city applications.
In the first part of this thesis, the focus is on incorporating data properties into the management of device participation and resource allocation in FL. We start by identifying data diversity measures that allow us to evaluate the richness of local datasets in different applications. Then, we design a diversity indicator that gives higher priority to clients with more informative data. An iterative algorithm is then proposed to jointly select clients and allocate communication resources. This algorithm accelerates training and reduces the overall time and energy needed. Furthermore, the proposed diversity indicator is reinforced with a reputation system to avoid malicious clients, thus enhancing its robustness against data poisoning attacks.
In the second part of this thesis, we explore ways to tackle other challenges related to client mobility and concept shift in data distributions. Such challenges require new measures to be handled. Accordingly, we design a cluster-based process for FL for the particular case of vehicular networks. The proposed process is based on careful cluster formation to bypass the communication bottleneck and is able to handle different models in parallel.
In the last part of this thesis, we demonstrate the potential of FL in a real use case involving short-term forecasting of electrical power in a smart grid. We propose an architecture empowered with FL to encourage collaboration among community members and show, through numerical results, its importance for both training and the judicious use of communication resources.
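The parameter-sharing step, in which devices send only trained parameters to a central entity, is commonly realized as a sample-weighted average in the style of federated averaging (FedAvg). The abstract does not name the aggregation rule used in the thesis, so the sketch below is an assumption, and the function name and flat-list parameter layout are hypothetical.

```python
def federated_average(
    client_params: list[list[float]], client_sizes: list[int]
) -> list[float]:
    """Average client parameter vectors, weighting each client by its sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    n_params = len(client_params[0])
    averaged = [0.0] * n_params
    for params, size in zip(client_params, client_sizes):
        if len(params) != n_params:
            raise ValueError("all clients must share the same model shape")
        weight = size / total
        for i, value in enumerate(params):
            averaged[i] += weight * value
    return averaged


# Two hypothetical clients: the second holds three times as much data,
# so its parameters dominate the aggregate.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)
```

Only the parameter vectors and sample counts cross the network in this scheme; the raw local data never leaves the devices.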
Heterogeneous Federated Learning: State-of-the-art and Research Challenges
Federated learning (FL) has drawn increasing attention owing to its potential
use in large-scale industrial applications. Existing federated learning works
mainly focus on model homogeneous settings. However, practical federated
learning typically faces the heterogeneity of data distributions, model
architectures, network environments, and hardware devices among participant
clients. Heterogeneous Federated Learning (HFL) is much more challenging, and
corresponding solutions are diverse and complex. Therefore, a systematic survey
of the research challenges and the state of the art on this topic is essential.
In this survey, we first summarize the various research challenges in HFL
from five aspects: statistical heterogeneity, model heterogeneity,
communication heterogeneity, device heterogeneity, and additional challenges.
In addition, recent advances in HFL are reviewed and a new taxonomy of existing
HFL methods is proposed with an in-depth analysis of their pros and cons. We
classify existing methods into three levels according to the HFL
procedure: data-level, model-level, and server-level. Finally, several critical
and promising future research directions in HFL are discussed, which may
facilitate further developments in this field. A periodically updated
collection on HFL is available at https://github.com/marswhu/HFL_Survey.
Comment: 42 pages, 11 figures, and 4 tables