317 research outputs found
Empowering Patient Similarity Networks through Innovative Data-Quality-Aware Federated Profiling
Continuous monitoring of patients involves collecting and analyzing sensory data from a multitude of sources. To overcome communication overhead, ensure data privacy and security, reduce data loss, and maintain efficient resource usage, the processing and analytics are moved close to where the data are located (e.g., the edge). However, data quality (DQ) can be degraded by imprecise or malfunctioning sensors, dynamic changes in the environment, transmission failures, or delays. It is therefore crucial to monitor data quality and detect problems as quickly as possible, so that they do not mislead clinical judgments and lead to the wrong course of action. In this article, a novel approach called federated data quality profiling (FDQP) is proposed to assess the quality of the data at the edge. FDQP is inspired by federated learning (FL) and serves as a condensed document or a guide for node data quality assurance. The FDQP formal model is developed to capture the quality dimensions specified in the data quality profile (DQP). The proposed approach uses federated feature selection to improve classifier precision and ranks features based on criteria such as feature value, outlier percentage, and missing data percentage. Extensive experiments, using a fetal dataset split across different edge nodes and a set of carefully chosen scenarios, were conducted to evaluate the proposed FDQP model. The results demonstrated that the proposed FDQP approach improved the DQ and, in turn, the accuracy of the federated patient similarity network (FPSN)-based machine learning models. The proposed data-quality-aware federated PSN architecture, which applies the FDQP model to data collected from edge nodes, can thus effectively improve both data quality and the accuracy of FPSN-based machine learning models.
Our profiling algorithm used lightweight profile exchange instead of full data processing at the edge, which achieved optimal data quality and thus improved efficiency. Overall, FDQP is an effective method for assessing data quality in the edge computing environment, and we believe that the proposed approach can be applied to other scenarios beyond patient monitoring.
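To make the profile-exchange idea concrete, the following is a minimal sketch, not the FDQP model itself. It assumes each edge node summarizes a feature by its missing-data percentage and an outlier percentage (computed here with Tukey's 1.5 x IQR fences, an assumption), and that a coordinator merges the summaries with a count-weighted average. The function names, node names, and feature names are hypothetical.

```python
from statistics import quantiles


def profile_feature(values):
    """Local data-quality profile of one feature on one node.

    `values` is a list of readings where None marks a missing value.
    The 1.5 * IQR outlier rule is an illustrative assumption.
    """
    present = [v for v in values if v is not None]
    n = len(values)
    missing_pct = 100.0 * (n - len(present)) / n if n else 0.0
    outlier_pct = 0.0
    if len(present) >= 4:
        q1, _, q3 = quantiles(present, n=4)
        spread = q3 - q1
        low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
        outliers = sum(1 for v in present if v < low or v > high)
        outlier_pct = 100.0 * outliers / len(present)
    return {"count": n, "missing_pct": missing_pct, "outlier_pct": outlier_pct}


def merge_profiles(profiles):
    """Count-weighted average of node profiles; raw data never leaves a node."""
    total = sum(p["count"] for p in profiles)
    return {
        key: sum(p[key] * p["count"] for p in profiles) / total
        for key in ("missing_pct", "outlier_pct")
    }


def rank_features(node_data):
    """Rank features from cleanest to dirtiest.

    `node_data` maps node name -> {feature name -> list of readings}.
    The score (missing % + outlier %) is a hypothetical ranking criterion.
    """
    features = sorted({f for data in node_data.values() for f in data})
    merged = {
        f: merge_profiles(
            [profile_feature(data[f]) for data in node_data.values() if f in data]
        )
        for f in features
    }
    return sorted(
        features, key=lambda f: merged[f]["missing_pct"] + merged[f]["outlier_pct"]
    )
```

Only the small dictionaries returned by `profile_feature` would need to travel between nodes and the coordinator in this sketch, which is the sense in which the exchange is lightweight.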
Symmetric separable convex resource allocation problems with structured disjoint interval bound constraints
Motivated by the problem of scheduling electric vehicle (EV) charging with a
minimum charging threshold in smart distribution grids, we introduce the
resource allocation problem (RAP) with a symmetric separable convex objective
function and disjoint interval bound constraints. In this RAP, the aim is to
allocate an amount of resource over a set of activities, where each
individual allocation is restricted to a disjoint collection of intervals.
This is a generalization of classical RAPs studied in the literature where in
contrast each allocation is only restricted by simple lower and upper bounds,
i.e., . We propose an exact algorithm that, for four special cases of the
problem, returns an optimal solution in time, where the term represents the number of flops required
for one evaluation of the separable objective function. In particular, the
algorithm runs in polynomial time when the number of intervals is fixed.
Moreover, we show how this algorithm can be adapted to also output an optimal
solution to the problem with integer variables without increasing its time
complexity. Computational experiments demonstrate the practical efficiency of
the algorithm for small values of and in particular for solving EV charging
problems. Comment: 20 pages, 4 figures
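For small integer instances, the problem structure described above can be illustrated with an exhaustive dynamic program. This is a sketch under stated assumptions and not the exact algorithm proposed in the paper: allocations are assumed to be nonnegative integers, every activity shares the same cost function (the symmetric case), and the function name `solve_interval_rap` is hypothetical.

```python
def solve_interval_rap(total, intervals, cost):
    """Minimise sum(cost(x_i)) subject to sum(x_i) == total.

    Each x_i must be a nonnegative integer lying in one of the closed
    integer intervals listed in intervals[i]. Returns (best_cost, allocation)
    or None if no feasible allocation exists. Brute-force DP, intended only
    for small instances.
    """
    # best[r] = cheapest (cost, allocation) over the activities seen so far
    # that uses exactly r units of the resource.
    best = {0: (0.0, [])}
    for options in intervals:
        allowed = sorted({x for lo, hi in options for x in range(lo, hi + 1)})
        nxt = {}
        for used, (c, alloc) in best.items():
            for x in allowed:
                r = used + x
                if r > total:
                    break  # allowed is sorted, so larger x only overshoots more
                cand = c + cost(x)
                if r not in nxt or cand < nxt[r][0]:
                    nxt[r] = (cand, alloc + [x])
        best = nxt
    return best.get(total)


# Hypothetical EV example: each vehicle either does not charge (0 units)
# or charges at least 3 and at most 5 units.
ev_bounds = [[(0, 0), (3, 5)]] * 3
result = solve_interval_rap(7, ev_bounds, lambda x: x * x)
```

With a minimum charging threshold of 3, splitting 7 units evenly is not allowed, so the solver has to leave one vehicle uncharged. That combinatorial choice is what distinguishes this problem from a RAP with simple lower and upper bounds.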
MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning
Training deep networks and tuning hyperparameters on large datasets is
computationally intensive. One of the primary research directions for efficient
training is to reduce training costs by selecting well-generalizable subsets of
training data. Compared to simple adaptive random subset selection baselines,
existing intelligent subset selection approaches are not competitive due to the
time-consuming subset selection step, which involves computing model-dependent
gradients and feature embeddings and applies greedy maximization of submodular
objectives. Our key insight is that removing the reliance on downstream model
parameters enables subset selection as a pre-processing step and enables one to
train multiple models at no additional cost. In this work, we propose MILO, a
model-agnostic subset selection framework that decouples the subset selection
from model training while enabling superior model convergence and performance
by using an easy-to-hard curriculum. Our empirical results indicate that MILO
can train models faster and tune hyperparameters
faster than full-dataset training or tuning without
compromising performance.
Vehicle as a Service (VaaS): Leverage Vehicles to Build Service Networks and Capabilities for Smart Cities
Smart cities demand resources for rich immersive sensing, ubiquitous
communications, powerful computing, large storage, and high intelligence
(SCCSI) to support various kinds of applications, such as public safety,
connected and autonomous driving, smart and connected health, and smart living.
At the same time, it is widely recognized that vehicles such as autonomous
cars, equipped with significantly powerful SCCSI capabilities, will become
ubiquitous in future smart cities. By observing the convergence of these two
trends, this article advocates the use of vehicles to build a cost-effective
service network, called the Vehicle as a Service (VaaS) paradigm, where
vehicles empowered with SCCSI capability form a web of mobile servers and
communicators to provide SCCSI services in smart cities. Towards this
direction, we first examine the potential use cases in smart cities and
possible upgrades required for the transition from traditional vehicular ad hoc
networks (VANETs) to VaaS. Then, we introduce the system architecture of
the VaaS paradigm and discuss how it can provide SCCSI services in future smart
cities. Finally, we identify the open problems of this paradigm
and future research directions, including architectural design, service
provisioning, incentive design, and security & privacy. We expect that this
paper paves the way towards developing a cost-effective and sustainable
approach for building smart cities. Comment: 32 pages, 11 figures
A Survey of Dataset Refinement for Problems in Computer Vision Datasets
Large-scale datasets have played a crucial role in the advancement of
computer vision. However, they often suffer from problems such as class
imbalance, noisy labels, dataset bias, or high resource costs, which can
inhibit model performance and reduce trustworthiness. With the advocacy of
data-centric research, various data-centric solutions have been proposed to
solve the dataset problems mentioned above. They improve the quality of
datasets by re-organizing them, which we call dataset refinement. In this
survey, we provide a comprehensive and structured overview of recent advances
in dataset refinement for problematic computer vision datasets. Firstly, we
summarize and analyze the various problems encountered in large-scale computer
vision datasets. Then, we classify the dataset refinement algorithms into three
categories based on the refinement process: data sampling, data subset
selection, and active learning. In addition, we organize these dataset
refinement methods according to the addressed data problems and provide a
systematic comparative description. We point out that these three types of
dataset refinement have distinct advantages and disadvantages for dataset
problems, which informs the choice of the data-centric method appropriate to a
particular research objective. Finally, we summarize the current literature and
propose potential future research topics. Comment: 33 pages, 10 figures, to be published in ACM Computing Surveys
To Compute or not to Compute? Adaptive Smart Sensing in Resource-Constrained Edge Computing
We consider a network of smart sensors for an edge computing application that
sample a signal of interest and send updates to a base station for remote
global monitoring. Sensors are equipped with sensing and computing capabilities, and can
either send raw data or process them on-board before transmission. Limited
hardware resources at the edge generate a fundamental latency-accuracy
trade-off: raw measurements are inaccurate but timely, whereas accurate
processed updates are available after computational delay. Also, if sensor
on-board processing entails data compression, latency caused by wireless
communication might be higher for raw measurements. Hence, one needs to decide
when sensors should transmit raw measurements or rely on local processing to
maximize overall network performance. To tackle this sensing design problem, we
model an estimation-theoretic optimization framework that embeds computation
and communication delays, and propose a Reinforcement Learning-based approach
to dynamically allocate computational resources at each sensor. Effectiveness
of our proposed approach is validated through numerical simulations with case
studies motivated by the Internet of Drones and self-driving vehicles. Comment: 14 pages, 14 figures; submitted to IEEE TNSM; revised version
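The latency-accuracy trade-off described above can be illustrated with a toy, one-shot comparison. This is not the paper's estimation framework or its Reinforcement Learning policy. It assumes a random-walk signal whose uncertainty grows linearly while an update is in flight, so the error variance of an update on arrival is its measurement noise plus the accumulated drift. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    noise_var: float  # measurement noise variance of the update
    delay: float      # computation plus communication delay, in seconds


def error_at_arrival(mode, drift_var_per_s):
    """Error variance of a single update when it reaches the base station.

    Assumes a random-walk signal: variance grows by drift_var_per_s for
    every second the update spends being processed or transmitted.
    """
    return mode.noise_var + drift_var_per_s * mode.delay


def pick_mode(modes, drift_var_per_s):
    """Choose the transmission mode with the smallest error on arrival."""
    return min(modes, key=lambda m: error_at_arrival(m, drift_var_per_s))


raw = Mode("raw", noise_var=4.0, delay=0.1)
processed = Mode("processed", noise_var=1.0, delay=1.0)

slow_signal_choice = pick_mode([raw, processed], drift_var_per_s=1.0)
fast_signal_choice = pick_mode([raw, processed], drift_var_per_s=10.0)
```

In this sketch a slowly varying signal favors the accurate but delayed processed update, while a fast-varying signal favors the timely raw measurement. This is why a fixed choice can be suboptimal and an adaptive policy is worth considering.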
Async-HFL: Efficient and Robust Asynchronous Federated Learning in Hierarchical IoT Networks
Federated Learning (FL) has gained increasing interest in recent years as a
distributed on-device learning paradigm. However, multiple challenges remain to
be addressed for deploying FL in real-world Internet-of-Things (IoT) networks
with hierarchies. Although existing works have proposed various approaches to
account for data heterogeneity, system heterogeneity, unexpected stragglers and
scalability, none of them provides a systematic solution to address all of the
challenges in a hierarchical and unreliable IoT network. In this paper, we
propose an asynchronous and hierarchical framework (Async-HFL) for performing
FL in a common three-tier IoT network architecture. In response to the largely
varied delays, Async-HFL employs asynchronous aggregations at both the gateway
and the cloud levels, thus avoiding long waiting times. To fully unleash the
potential of Async-HFL in convergence speed under system heterogeneities and
stragglers, we design device selection at the gateway level and device-gateway
association at the cloud level. Device selection chooses edge devices to
trigger local training in real-time while device-gateway association determines
the network topology periodically after several cloud epochs, both satisfying
bandwidth limitation. We evaluate Async-HFL's convergence speedup using
large-scale simulations based on ns-3 and a network topology from NYCMesh. Our
results show that Async-HFL converges 1.08-1.31x faster in wall-clock time and
saves up to 21.6% total communication cost compared to state-of-the-art
asynchronous FL algorithms (with client selection). We further validate
Async-HFL on a physical deployment and observe robust convergence under
unexpected stragglers.Comment: Accepted by IoTDI'2