5 research outputs found

    Elastic Services for Edge Computing

    Towards adaptive actors for scalable IoT applications at the edge

    Traditional device-cloud architectures do not scale to the size of future IoT deployments. While edge and fog-computing principles seem like a tangible solution, they increase the programming effort of IoT systems, do not provide the same elasticity guarantees as the cloud, and involve much greater hardware heterogeneity. Future IoT applications will be highly distributed and will place their computational tasks on any combination of end-devices (sensor nodes, smartphones, drones), edge, and cloud resources in order to achieve their application goals. These complex distributed systems require a programming model that allows developers to implement their applications in a simple way (i.e., focus on the application logic) and an execution framework that runs these applications resiliently and with high resource efficiency, while maximizing application utility. Towards such a distributed execution runtime, we propose Nandu, an actor-based system that adapts and migrates tasks dynamically, using developer-provided hints as seed information. Nandu allows developers to focus on sequential application logic and transforms their application into distributed, adaptive actors. The resulting actors support fine-grained entry points for the execution environment. These entry points allow local schedulers to adapt actors seamlessly to the current context, while optimizing the overall application utility according to developer-provided requirements.

    VersaMatch: ontology matching with weak supervision

    Ontology matching is crucial to data integration for cross-silo data sharing and has been mainly addressed with heuristic and machine learning (ML) methods. While heuristic methods are often inflexible and hard to extend to new domains, ML methods rely on substantial amounts of hard-to-obtain labeled training data. To overcome these limitations, we propose VersaMatch, a flexible, weakly-supervised ontology matching system. VersaMatch employs various weak supervision sources, such as heuristic rules, pattern matching, and external knowledge bases, to produce labels from a large amount of unlabeled data for training a discriminative ML model. For prediction, VersaMatch uses a novel ensemble model that combines the weak supervision sources with the discriminative model to support generalization while retaining high precision. Our ensemble method boosts end model performance by 4 points compared to a traditional weak-supervision baseline. In addition, compared to state-of-the-art ontology matchers, VersaMatch achieves an overall 4-point performance improvement in F1 score across 26 ontology combinations from different domains. For recently released, in-the-wild datasets, VersaMatch beats the next best matchers by 9 points in F1. Furthermore, its core weak-supervision logic can easily be improved by adding more knowledge sources and collecting more unlabeled data for training.
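The abstract does not specify how the ensemble combines its two signals, so the following is only a minimal sketch of one plausible scheme: a weighted average of the weak supervision votes and the discriminative model's probability. The function names, the `weak_weight` parameter, and the 0.5 threshold are all hypothetical and are not taken from VersaMatch.

```python
def ensemble_score(weak_votes, model_prob, weak_weight=0.5):
    """Blend weak supervision votes with a model probability.

    weak_votes: list of 1 (match), 0 (no match), or None (source abstains).
    model_prob: probability of a match from the discriminative model.
    weak_weight: hypothetical mixing weight, not a value from the paper.
    """
    cast = [v for v in weak_votes if v is not None]
    if not cast:
        # No weak source has an opinion: rely on the model alone.
        return model_prob
    weak_prob = sum(cast) / len(cast)
    return weak_weight * weak_prob + (1 - weak_weight) * model_prob


def is_match(weak_votes, model_prob, threshold=0.5):
    """Decide whether two ontology elements match (assumed threshold)."""
    return ensemble_score(weak_votes, model_prob) >= threshold
```

In this sketch the model generalizes to pairs that no rule covers, while agreeing weak sources can pull a borderline model score toward a confident decision.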

    Towards Knowledge Infusion for Robust and Transferable Machine Learning in IoT

    Machine learning (ML) applications in Internet of Things (IoT) scenarios face the issue that supervision signals, such as labeled data, are scarce and expensive to obtain. For example, it often requires a human to manually label events in a data stream by observing the same events in the real world. In addition, the performance of trained models usually depends on a specific context: (1) location, (2) time, and (3) data quality. This context is not static in reality, making it hard to achieve robust and transferable machine learning for IoT systems in practice. In this paper, we address these challenges with an envisioned method that we name Knowledge Infusion. First, we present two past case studies in which we combined external knowledge with traditional data-driven machine learning in IoT scenarios to ease the supervision effort. (1) We describe a weak-supervision approach for the IoT domain that auto-generates labels based on external knowledge (e.g., domain knowledge) encoded in simple labeling functions. Our evaluation for transport mode classification achieves a micro-F1 score of 80.2% with only seven labeling functions, on par with a fully supervised model that relies on hand-labeled data. (2) We introduce guiding functions to Reinforcement Learning (RL) to guide the agents' decisions and experience. In initial experiments, our guided reinforcement learning achieves more than three times higher reward at the beginning of its training than an agent with no external knowledge. We use the lessons learned from these experiences to develop our vision of knowledge infusion. In knowledge infusion, we aim to automate the inclusion of knowledge from existing knowledge bases and domain experts and to combine it with traditional data-driven machine learning techniques during the setup/training phase, but also during the execution phase.
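To illustrate the idea of labeling functions for transport mode classification, here is a minimal sketch that combines a few rules by majority vote. The specific rules, the field names `speed_kmh` and `near_rail`, and the speed cut-offs are assumptions made for illustration; the abstract does not list the seven labeling functions used in the paper, and it does not say how their outputs are aggregated.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


def lf_walking_speed(sample):
    # Hypothetical rule: very low speed suggests walking.
    return "walk" if sample["speed_kmh"] < 7 else ABSTAIN


def lf_high_speed(sample):
    # Hypothetical rule: high speed away from rail suggests a car.
    fast = sample["speed_kmh"] > 60
    return "car" if fast and not sample["near_rail"] else ABSTAIN


def lf_near_rail(sample):
    # Hypothetical rule: moving quickly along a rail line suggests a train.
    moving = sample["speed_kmh"] > 30
    return "train" if moving and sample["near_rail"] else ABSTAIN


LABELING_FUNCTIONS = [lf_walking_speed, lf_high_speed, lf_near_rail]


def weak_label(sample, lfs=LABELING_FUNCTIONS):
    """Return the majority label over non-abstaining functions, or ABSTAIN."""
    votes = [v for v in (lf(sample) for lf in lfs) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]
```

Samples that receive a weak label can then serve as training data for a conventional classifier, while samples on which every function abstains are simply left unlabeled.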

    Leveraging data-driven infrastructure management to facilitate AIOps for big data applications and operations

    As institutions increasingly shift to distributed and containerized application deployments on remote heterogeneous cloud/cluster infrastructures, the cost and difficulty of efficiently managing and maintaining data-intensive applications have risen. An emerging solution to this issue is Data-Driven Infrastructure Management (DDIM), where decisions regarding the management of resources are made based on data aspects and operations (both on the infrastructure and on the application levels). This chapter introduces readers to the core concepts underpinning DDIM, based on experience gained from the development of the Kubernetes-based BigDataStack DDIM platform (https://bigdatastack.eu/). The chapter covers multiple important Big Data Value (BDV) topics, including development, deployment, and operations for cluster/cloud-based big data applications, as well as data-driven analytics and artificial intelligence for smart automated infrastructure self-management. Readers will gain important insights into how next-generation DDIM platforms function, as well as how they can be used in practical deployments to improve quality of service for Big Data Applications. This chapter relates to the technical priority Data Processing Architectures of the European Big Data Value Strategic Research & Innovation Agenda [33], as well as the Data Processing Architectures horizontal and Engineering and DevOps for building Big Data Value vertical concerns. It also relates to the Reasoning and Decision Making cross-sectorial technology enablers of the AI, Data and Robotics Strategic Research, Innovation & Deployment Agenda [34].