317 research outputs found

    Empowering Patient Similarity Networks through Innovative Data-Quality-Aware Federated Profiling

    Get PDF
    Continuous monitoring of patients involves collecting and analyzing sensory data from a multitude of sources. To overcome communication overhead, ensure data privacy and security, reduce data loss, and maintain efficient resource usage, the processing and analytics are moved close to where the data are located (e.g., the edge). However, data quality (DQ) can be degraded because of imprecise or malfunctioning sensors, dynamic changes in the environment, transmission failures, or delays. Therefore, it is crucial to keep an eye on data quality and spot problems as quickly as possible, so that they do not mislead clinical judgments and lead to the wrong course of action. In this article, a novel approach called federated data quality profiling (FDQP) is proposed to assess the quality of the data at the edge. FDQP is inspired by federated learning (FL) and serves as a condensed document or a guide for node data quality assurance. The FDQP formal model is developed to capture the quality dimensions specified in the data quality profile (DQP). The proposed approach uses federated feature selection to improve classifier precision and rank features based on criteria such as feature value, outlier percentage, and missing data percentage. Extensive experimentation using a fetal dataset split into different edge nodes and a set of scenarios were carefully chosen to evaluate the proposed FDQP model. The results of the experiments demonstrated that the proposed FDQP approach positively improved the DQ, and thus, impacted the accuracy of the federated patient similarity network (FPSN)-based machine learning models. The proposed data-quality-aware federated PSN architecture leveraging FDQP model with data collected from edge nodes can effectively improve the data quality and accuracy of the federated patient similarity network (FPSN)-based machine learning models. Our profiling algorithm used lightweight profile exchange instead of full data processing at the edge, which resulted in optimal data quality achievement, thus improving efficiency. Overall, FDQP is an effective method for assessing data quality in the edge computing environment, and we believe that the proposed approach can be applied to other scenarios beyond patient monitoring

    Symmetric separable convex resource allocation problems with structured disjoint interval bound constraints

    Full text link
    Motivated by the problem of scheduling electric vehicle (EV) charging with a minimum charging threshold in smart distribution grids, we introduce the resource allocation problem (RAP) with a symmetric separable convex objective function and disjoint interval bound constraints. In this RAP, the aim is to allocate an amount of resource over a set of nn activities, where each individual allocation is restricted to a disjoint collection of mm intervals. This is a generalization of classical RAPs studied in the literature where in contrast each allocation is only restricted by simple lower and upper bounds, i.e., m=1m=1. We propose an exact algorithm that, for four special cases of the problem, returns an optimal solution in O((n+m2m2)(nlogn+nF))O \left(\binom{n+m-2}{m-2} (n \log n + nF) \right) time, where the term nFnF represents the number of flops required for one evaluation of the separable objective function. In particular, the algorithm runs in polynomial time when the number of intervals mm is fixed. Moreover, we show how this algorithm can be adapted to also output an optimal solution to the problem with integer variables without increasing its time complexity. Computational experiments demonstrate the practical efficiency of the algorithm for small values of mm and in particular for solving EV charging problems.Comment: 20 pages, 4 figure

    MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning

    Full text link
    Training deep networks and tuning hyperparameters on large datasets is computationally intensive. One of the primary research directions for efficient training is to reduce training costs by selecting well-generalizable subsets of training data. Compared to simple adaptive random subset selection baselines, existing intelligent subset selection approaches are not competitive due to the time-consuming subset selection step, which involves computing model-dependent gradients and feature embeddings and applies greedy maximization of submodular objectives. Our key insight is that removing the reliance on downstream model parameters enables subset selection as a pre-processing step and enables one to train multiple models at no additional cost. In this work, we propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training while enabling superior model convergence and performance by using an easy-to-hard curriculum. Our empirical results indicate that MILO can train models 3×10×3\times - 10 \times faster and tune hyperparameters 20×75×20\times - 75 \times faster than full-dataset training or tuning without compromising performance

    Vehicle as a Service (VaaS): Leverage Vehicles to Build Service Networks and Capabilities for Smart Cities

    Full text link
    Smart cities demand resources for rich immersive sensing, ubiquitous communications, powerful computing, large storage, and high intelligence (SCCSI) to support various kinds of applications, such as public safety, connected and autonomous driving, smart and connected health, and smart living. At the same time, it is widely recognized that vehicles such as autonomous cars, equipped with significantly powerful SCCSI capabilities, will become ubiquitous in future smart cities. By observing the convergence of these two trends, this article advocates the use of vehicles to build a cost-effective service network, called the Vehicle as a Service (VaaS) paradigm, where vehicles empowered with SCCSI capability form a web of mobile servers and communicators to provide SCCSI services in smart cities. Towards this direction, we first examine the potential use cases in smart cities and possible upgrades required for the transition from traditional vehicular ad hoc networks (VANETs) to VaaS. Then, we will introduce the system architecture of the VaaS paradigm and discuss how it can provide SCCSI services in future smart cities, respectively. At last, we identify the open problems of this paradigm and future research directions, including architectural design, service provisioning, incentive design, and security & privacy. We expect that this paper paves the way towards developing a cost-effective and sustainable approach for building smart cities.Comment: 32 pages, 11 figure

    A Survey of Dataset Refinement for Problems in Computer Vision Datasets

    Full text link
    Large-scale datasets have played a crucial role in the advancement of computer vision. However, they often suffer from problems such as class imbalance, noisy labels, dataset bias, or high resource costs, which can inhibit model performance and reduce trustworthiness. With the advocacy of data-centric research, various data-centric solutions have been proposed to solve the dataset problems mentioned above. They improve the quality of datasets by re-organizing them, which we call dataset refinement. In this survey, we provide a comprehensive and structured overview of recent advances in dataset refinement for problematic computer vision datasets. Firstly, we summarize and analyze the various problems encountered in large-scale computer vision datasets. Then, we classify the dataset refinement algorithms into three categories based on the refinement process: data sampling, data subset selection, and active learning. In addition, we organize these dataset refinement methods according to the addressed data problems and provide a systematic comparative description. We point out that these three types of dataset refinement have distinct advantages and disadvantages for dataset problems, which informs the choice of the data-centric method appropriate to a particular research objective. Finally, we summarize the current literature and propose potential future research topics.Comment: 33 pages, 10 figures, to be published in ACM Computing Survey

    To Compute or not to Compute? Adaptive Smart Sensing in Resource-Constrained Edge Computing

    Full text link
    We consider a network of smart sensors for edge computing application that sample a signal of interest and send updates to a base station for remote global monitoring. Sensors are equipped with sensing and compute, and can either send raw data or process them on-board before transmission. Limited hardware resources at the edge generate a fundamental latency-accuracy trade-off: raw measurements are inaccurate but timely, whereas accurate processed updates are available after computational delay. Also, if sensor on-board processing entails data compression, latency caused by wireless communication might be higher for raw measurements. Hence, one needs to decide when sensors should transmit raw measurements or rely on local processing to maximize overall network performance. To tackle this sensing design problem, we model an estimation-theoretic optimization framework that embeds computation and communication delays, and propose a Reinforcement Learning-based approach to dynamically allocate computational resources at each sensor. Effectiveness of our proposed approach is validated through numerical simulations with case studies motivated by the Internet of Drones and self-driving vehicles.Comment: 14 pages, 14 figures; submitted to IEEE TNSM; revised versio

    Async-HFL: Efficient and Robust Asynchronous Federated Learning in Hierarchical IoT Networks

    Full text link
    Federated Learning (FL) has gained increasing interest in recent years as a distributed on-device learning paradigm. However, multiple challenges remain to be addressed for deploying FL in real-world Internet-of-Things (IoT) networks with hierarchies. Although existing works have proposed various approaches to account data heterogeneity, system heterogeneity, unexpected stragglers and scalibility, none of them provides a systematic solution to address all of the challenges in a hierarchical and unreliable IoT network. In this paper, we propose an asynchronous and hierarchical framework (Async-HFL) for performing FL in a common three-tier IoT network architecture. In response to the largely varied delays, Async-HFL employs asynchronous aggregations at both the gateway and the cloud levels thus avoids long waiting time. To fully unleash the potential of Async-HFL in converging speed under system heterogeneities and stragglers, we design device selection at the gateway level and device-gateway association at the cloud level. Device selection chooses edge devices to trigger local training in real-time while device-gateway association determines the network topology periodically after several cloud epochs, both satisfying bandwidth limitation. We evaluate Async-HFL's convergence speedup using large-scale simulations based on ns-3 and a network topology from NYCMesh. Our results show that Async-HFL converges 1.08-1.31x faster in wall-clock time and saves up to 21.6% total communication cost compared to state-of-the-art asynchronous FL algorithms (with client selection). We further validate Async-HFL on a physical deployment and observe robust convergence under unexpected stragglers.Comment: Accepted by IoTDI'2
    corecore