
    Server Assignment with Time-Varying Workloads in Mobile Edge Computing

    Mobile Edge Computing (MEC) has emerged as a viable technology for mobile operators to push computing resources closer to the users so that requests can be served locally without long-haul crossing of the network core, thus improving network efficiency and user experience. In MEC, commodity servers are deployed at the edge to form a distributed network of mini datacenters. A consequential task is to partition the user cells into groups, each to be served by an edge server, so as to maximize the offloading to the edge. The conventional setting for this problem in the literature is to: (1) assume that the interaction workload between two cells has a known interaction rate, (2) compute a partition optimized for these rates, for example, by solving a weighted-graph partitioning problem, and (3) for a time-varying workload, incrementally re-compute the partition when the interaction rates change. This setting is suitable only for infrequently changing workloads: the operational and computational costs of the partition update can be expensive, and it is difficult to estimate interaction rates if they are not stable for a long period. Hence, this dissertation is motivated by the following questions: is there an efficient way to compute just one partition, with no updates needed, that is robust for a highly time-varying workload? In particular, what if we do not know the interaction rates at any time? By "robust," we mean that the cost to process the workload at any given time remains small despite unpredictable workload increases. Another consideration is geographical awareness: the edge servers should be geographically close to their respective user cells to maximize the benefits of MEC. This dissertation presents novel solutions to address these issues. The theoretical findings are substantiated by evaluation studies using real-world data.
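
The conventional setting's step (2) can be illustrated with a small sketch: cells form a weighted graph whose edge weights are the (assumed known) interaction rates, and a greedy heuristic grows balanced groups that keep heavy edges internal. This is not the dissertation's method, only a minimal baseline; all names and data are illustrative.

```python
def partition_cells(cells, rates, k):
    """Greedily assign each cell to one of k edge-server groups.

    rates: dict mapping frozenset({a, b}) -> interaction rate.
    Returns dict cell -> group index.
    """
    cap = -(-len(cells) // k)          # ceil(n / k): balance constraint
    assignment, sizes = {}, [0] * k
    # Visit cells in order of total interaction volume (heaviest first),
    # so the cells that matter most are placed while every group has slack.
    order = sorted(cells, key=lambda c: -sum(
        r for pair, r in rates.items() if c in pair))
    for c in order:
        best, best_gain = None, -1.0
        for g in range(k):
            if sizes[g] >= cap:        # group is full
                continue
            # Gain: interaction rate kept local if c joins group g.
            gain = sum(rates.get(frozenset({c, d}), 0.0)
                       for d, gd in assignment.items() if gd == g)
            if gain > best_gain:
                best, best_gain = g, gain
        assignment[c] = best
        sizes[best] += 1
    return assignment
```

With two heavily interacting pairs and k = 2, the heuristic keeps each pair on the same server group.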

    Edge Assignment and Data Valuation in Federated Learning

    Federated Learning (FL) is a recent Machine Learning method for training with private data stored separately on local machines, without gathering the data into one place for central learning. It was born to address the following challenges when applying Machine Learning in practice: (1) Communication cost: most real-world data that can be useful for training are collected locally; bringing them all to one place for central learning can be expensive, especially in real-time learning applications when time is of the essence, for example, predicting the next word when texting on a smartphone; and (2) Privacy protection: many applications must protect data privacy, such as those in the healthcare field; the private data can only be seen by its local owner, and as such the learning may only use a content-hiding representation of this data, which is much less informative. To fulfill FL's promise, this dissertation addresses three important problems regarding the need for good training data, system scalability, and uncertainty robustness:
    1. The effectiveness of FL depends critically on the quality of the local training data. We should not only incentivize participants who have good training data but also minimize the effect of bad training data on the overall learning procedure. The first problem of my research is to determine a score to value a participant's contribution. My approach is to compute such a score based on the Shapley Value (SV), a concept of cooperative game theory for profit allocation in a coalition game. In this direction, the main challenge is the exponential time complexity of the SV computation, which is further complicated by the iterative manner of the FL learning algorithm. I propose a fast and effective valuation method that overcomes this challenge.
    2. On scalability, FL depends on a central server for repeated aggregation of local training models, which is prone to becoming a performance bottleneck. A reasonable approach is to combine FL with Edge Computing: introduce a layer of edge servers, each serving as a regional aggregator to offload the main server. The scalability is thus improved, however at the cost of learning accuracy. The second problem of my research is to optimize this tradeoff. This dissertation shows that this cost can be alleviated with a proper choice of edge server assignment: which edge servers should aggregate the training models from which local machines. Specifically, I propose an assignment solution that is especially useful for the case of non-IID training data, which is well known to hinder today's FL performance.
    3. FL participants may decide on their own what devices they run on, their computing capabilities, and how often they communicate the training model with the aggregation server. The workloads they incur are therefore time-varying and unpredictable. The server capacities are finite and can vary too. The third problem of my research is to compute an edge server assignment that is robust to such dynamics and uncertainties. I propose a stochastic approach to solving this problem.
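
To make the exponential cost of exact SV computation concrete: a standard way to approximate it is permutation sampling, averaging each participant's marginal contribution over random join orders. The sketch below shows that generic Monte Carlo estimator (not the dissertation's fast valuation method); `utility` is an assumed black box, e.g., validation accuracy of a model trained on a coalition's data.

```python
import random

def shapley_montecarlo(participants, utility, num_perms=200, seed=0):
    """Monte Carlo estimate of Shapley Values by permutation sampling.

    utility: function mapping a frozenset of participants to a real-valued
    score (e.g., validation accuracy of a model trained on their data).
    Exact SV averages marginals over all n! orders; sampling num_perms
    random orders avoids that exponential blow-up.
    """
    rng = random.Random(seed)
    sv = {p: 0.0 for p in participants}
    for _ in range(num_perms):
        perm = list(participants)
        rng.shuffle(perm)
        coalition, prev = frozenset(), utility(frozenset())
        for p in perm:
            coalition = coalition | {p}
            cur = utility(coalition)
            sv[p] += cur - prev        # marginal contribution of p
            prev = cur
    return {p: v / num_perms for p, v in sv.items()}
```

For an additive utility the estimate is exact: each participant's value equals its individual contribution.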

    Graph Pattern Matching on Symmetric Multiprocessor Systems

    Graph-structured data can be found in nearly every aspect of today's world, be it road networks, social networks, or the internet itself. From a processing perspective, finding comprehensive patterns in graph-structured data is a core processing primitive in a variety of applications, such as fraud detection, biological engineering, or social graph analytics. On the hardware side, multiprocessor systems, which consist of multiple processors in a single scale-up server, are the next important wave on top of multi-core systems. In particular, symmetric multiprocessor (SMP) systems are characterized by the fact that each processor has the same architecture, e.g., every processor is a multi-core processor and all multiprocessors share a common and huge main memory space. Moreover, large SMPs feature non-uniform memory access (NUMA), whose impact on the design of efficient data processing concepts should not be neglected. The efficient usage of SMP systems, which continue to grow in size, is an interesting and ongoing research topic. Current state-of-the-art architectural design principles provide different, and in parts disjoint, suggestions on which data should be partitioned and/or how intra-process communication should be realized. In this thesis, we propose a new synthesis of four of the most well-known principles: Shared Everything, Partition Serial Execution, Data Oriented Architecture, and Delegation, to create the NORAD architecture, which stands for NUMA-aware DORA with Delegation. We built our research prototype, called NeMeSys, on top of the NORAD architecture to fully exploit the provided hardware capacities of SMPs for graph pattern matching. Being an in-memory engine, NeMeSys allows for online data ingestion as well as online query generation and processing through a terminal-based user interface. Storing a graph on a NUMA system inherently requires data partitioning to cope with the mentioned NUMA effect. Hence, we need to dissect the graph into a disjoint set of partitions, which can then be stored on the individual memory domains. This thesis analyzes the capabilities of the NORAD architecture to perform scalable graph pattern matching on SMP systems. To increase the system's performance, we further develop, integrate, and evaluate suitable optimization techniques. That is, we investigate the influence of the inherent data partitioning, the interplay of messaging with and without sufficient locality information, and the actual partition placement on any NUMA socket in the system. To underline the applicability of our approach, we evaluate NeMeSys against synthetic datasets and perform an end-to-end evaluation of the whole system stack on the real-world knowledge graph of Wikidata.
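
The requirement to dissect the graph into disjoint partitions per memory domain can be sketched minimally: hash each vertex to a home domain and keep each edge's adjacency on the source's side, as a delegation-style design would route messages to the owner. This is an illustration of the partitioning requirement, not NeMeSys's actual storage layout; the placement function is an assumption.

```python
def dissect_graph(edges, num_domains):
    """Hash-partition a directed graph's vertices across NUMA memory domains.

    Returns one adjacency dict per domain. Every vertex lives in exactly
    one partition (disjoint sets), and each edge (u, v) is stored in u's
    home domain, so traversals from u never chase remote pointers.
    """
    parts = [dict() for _ in range(num_domains)]
    home = lambda v: hash(v) % num_domains   # assumed placement function
    for u, v in edges:
        for w in (u, v):                      # register both endpoints
            parts[home(w)].setdefault(w, [])
        parts[home(u)][u].append(v)           # edge kept on source side
    return parts
```

The partitions are disjoint, together cover all vertices, and store each edge exactly once.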

    RACER: Rapid Collaborative Exploration with a Decentralized Multi-UAV System

    Although the use of multiple Unmanned Aerial Vehicles (UAVs) has great potential for fast autonomous exploration, it has received far too little attention. In this paper, we present RACER, a RApid Collaborative ExploRation approach using a fleet of decentralized UAVs. To effectively dispatch the UAVs, a pairwise interaction based on an online hgrid space decomposition is used. It ensures that all UAVs simultaneously explore distinct regions, using only asynchronous and limited communication. Further, we optimize the coverage paths of unknown space and balance the workloads partitioned to each UAV with a Capacitated Vehicle Routing Problem (CVRP) formulation. Given the task allocation, each UAV constantly updates the coverage path and incrementally extracts crucial information to support the exploration planning. A hierarchical planner finds exploration paths, refines local viewpoints, and generates minimum-time trajectories in sequence to explore the unknown space agilely and safely. The proposed approach is evaluated extensively, showing high exploration efficiency, scalability, and robustness to limited communication. Furthermore, for the first time, we achieve fully decentralized collaborative exploration with multiple UAVs in the real world. We will release our implementation as an open-source package. Comment: Conditionally accepted by TR
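
The capacity constraint behind a CVRP-style workload balance can be conveyed with a much simpler greedy stand-in (RACER solves an actual CVRP; this sketch only illustrates the balancing idea, and all positions and capacities are invented):

```python
def allocate_regions(regions, uav_positions, capacity):
    """Greedily assign coverage regions (with workloads) to UAVs.

    regions: list of (center, workload), center = (x, y).
    Each region goes to the nearest UAV that still has capacity,
    mirroring the capacity constraint of a CVRP formulation.
    """
    dist = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    loads = [0.0] * len(uav_positions)
    plan = [[] for _ in uav_positions]
    # Heaviest regions first, so large workloads are placed while slack exists.
    for center, w in sorted(regions, key=lambda r: -r[1]):
        feasible = [i for i, l in enumerate(loads) if l + w <= capacity]
        if not feasible:                      # no UAV has slack: fall back to all
            feasible = range(len(loads))
        i = min(feasible, key=lambda i: dist(uav_positions[i], center))
        plan[i].append(center)
        loads[i] += w
    return plan, loads
```

With two UAVs at opposite ends and four equal regions, each UAV receives the two regions on its side, with balanced workloads.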

    Pilot interaction with automated airborne decision making systems

    An investigation was made of the interaction between a human pilot and automated on-board decision making systems. Research was initiated on the topic of pilot problem solving in automated and semi-automated flight management systems, and attempts were made to develop a model of human decision making in a multi-task situation. A study was made of the allocation of responsibility between human and computer, and various pilot performance parameters under varying degrees of automation were discussed. Optimal allocation of responsibility between human and computer was considered, and some theoretical results found in the literature were presented. The pilot as a problem solver was discussed. Finally, the design of displays, controls, procedures, and computer aids for problem solving tasks in automated and semi-automated systems was considered.

    Distributed graph processing and partitioning for spatiotemporal queries in the context of camera networks

    This work presents a scalable, distributed architecture for processing spatiotemporal queries in the context of camera networks, based on a graph structure. With the ever-increasing presence of cameras and the emergence of camera networks, e.g., in the context of campus security, it becomes increasingly important to provide a robust and scalable architecture to store and retrieve detected events. In this work, a distributed graph processing engine is presented which is well suited for read and write tasks in the environment of spatiotemporal, image-similarity-based workloads. The key ideas presented in this work are the architecture of a scalable graph processing system well suited for processing spatiotemporal queries, and the design of a distributed and robust vertex-partitioning strategy for the graph, which is defined by the spatiotemporal attributes of the stored events. The work shows multiple lightweight heuristics for partitioning the graph among the nodes participating in the system, focusing on load balancing between workers and high edge-locality for vertices. The system and the partitioning strategies are evaluated, showing that the system scales with the number of workers and the problem size and is able to answer proportionally more queries per second. It is also shown that the lightweight heuristics for partitioning the graph produce a relatively good balancing of the vertices on the worker nodes and can be executed in an online fashion, resulting in similar performance compared to a traditional hash partitioning while providing far superior edge-locality.
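
One well-known lightweight heuristic of this kind (a sketch in the spirit of linear deterministic greedy streaming partitioning, not necessarily the heuristics of this work) scores each worker by how many of a vertex's neighbors it already holds, damped by a load penalty, so edge-locality and balance are pursued at once in a single online pass:

```python
def stream_partition(vertices, neighbors, k, slack=1.1):
    """LDG-style one-pass streaming vertex partitioner.

    neighbors: dict vertex -> iterable of neighbor vertices.
    Each vertex is placed once, online, with no later migration.
    """
    cap = slack * len(vertices) / k            # soft per-worker capacity
    place, sizes = {}, [0] * k
    for v in vertices:
        def score(p):
            # Neighbors already on worker p, damped by p's current load.
            local = sum(1 for u in neighbors.get(v, ()) if place.get(u) == p)
            return local * (1.0 - sizes[p] / cap)
        # Ties broken toward the least-loaded worker.
        p = max(range(k), key=lambda p: (score(p), -sizes[p]))
        place[v] = p
        sizes[p] += 1
    return place
```

On two disjoint triangles the heuristic keeps each triangle on one worker while splitting the load evenly, the behavior hash partitioning would not guarantee.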

    Block-Diagonal and LT Codes for Distributed Computing With Straggling Servers

    We propose two coded schemes for the distributed computing problem of multiplying a matrix by a set of vectors. The first scheme is based on partitioning the matrix into submatrices and applying maximum distance separable (MDS) codes to each submatrix. For this scheme, we prove that up to a given number of partitions the communication load and the computational delay (not including the encoding and decoding delay) are identical to those of the scheme recently proposed by Li et al., based on a single, long MDS code. However, due to the use of shorter MDS codes, our scheme yields a significantly lower overall computational delay when the delay incurred by encoding and decoding is also considered. We further propose a second coded scheme based on Luby Transform (LT) codes under inactivation decoding. Interestingly, LT codes may reduce the delay over the partitioned scheme at the expense of an increased communication load. We also consider distributed computing under a deadline and show numerically that the proposed schemes outperform other schemes in the literature, with the LT code-based scheme yielding the best performance for the scenarios considered. Comment: To appear in IEEE Transactions on Communications
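
The straggler-mitigation idea behind MDS-coded matrix-vector multiplication can be shown with the smallest possible instance, a (3, 2) code: split A row-wise into A1 and A2, give a third worker A1 + A2, and any two of the three results recover A @ x. This toy sketch (assuming an even row count) illustrates the principle, not the paper's partitioned scheme.

```python
def matvec(rows, x):
    return [sum(a * b for a, b in zip(r, x)) for r in rows]

def coded_matvec(A, x, finished):
    """(3, 2) MDS-coded A @ x over 3 workers; any 2 responses suffice.

    'finished' is the set of worker ids that responded, modelling
    stragglers. Decoding subtracts the parity result to recover the
    part held by the slow worker.
    """
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    A3 = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    results = {0: matvec(A1, x), 1: matvec(A2, x), 2: matvec(A3, x)}
    got = {w: results[w] for w in finished}
    if 0 in got and 1 in got:            # both systematic parts arrived
        return got[0] + got[1]
    if 0 in got:                          # worker 1 straggled: y2 = y3 - y1
        return got[0] + [c - a for c, a in zip(got[2], got[0])]
    return [c - b for c, b in zip(got[2], got[1])] + got[1]  # y1 = y3 - y2
```

Whichever worker straggles, the decoded product is the same.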

    Real-Time Wireless Sensor-Actuator Networks for Cyber-Physical Systems

    A cyber-physical system (CPS) employs tight integration of, and coordination between, computational, networking, and physical elements. Wireless sensor-actuator networks provide a new communication technology for a broad range of CPS applications such as process control, smart manufacturing, and data center management. Sensing and control in these systems need to meet stringent real-time performance requirements on communication latency in challenging environments. There have been limited results on real-time scheduling theory for wireless sensor-actuator networks. Real-time transmission scheduling and analysis for wireless sensor-actuator networks require new methodologies to deal with the unique characteristics of wireless communication. Furthermore, the performance of a wireless control system involves intricate interactions between real-time communication and control. This thesis research tackles these challenges and makes a series of contributions to the theory and systems for wireless CPS. (1) We establish a new real-time scheduling theory for wireless sensor-actuator networks. (2) We develop a scheduling-control co-design approach for holistic optimization of control performance in a wireless control system. (3) We design and implement a wireless sensor-actuator network for CPS in data center power management. (4) We expand our research to develop scheduling algorithms and analyses for real-time parallel computing to support computation-intensive CPS.
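
As a generic illustration of the kind of analysis such a scheduling theory builds on, the classic fixed-priority response-time test is sketched below; the thesis's wireless-specific theory extends well beyond this single-resource model, so treat this only as background.

```python
def response_times(tasks):
    """Classic fixed-priority response-time analysis (sketch).

    tasks: list of (C, T) = (execution/transmission time, period),
    sorted highest priority first, with deadline = period. Returns the
    worst-case response time of each task, or None if some task is
    unschedulable (its response time would exceed its period).
    """
    R = []
    for i, (C, T) in enumerate(tasks):
        r, prev = C, 0.0
        while r != prev:                       # fixed-point iteration
            prev = r
            # Interference from all higher-priority tasks released so far.
            r = C + sum(-(-prev // Tj) * Cj for Cj, Tj in tasks[:i])
            if r > T:
                return None
        R.append(r)
    return R
```

For the task set (1, 4), (2, 6), (3, 12) the iteration converges to response times 1, 3, and 10, all within their periods.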