
    Cloud Computing and Cloud Automata as a New Paradigm for Computation

    Cloud computing addresses how to make the right resources available to the right computation to improve the scaling, resiliency, and efficiency of that computation. We argue that cloud computing is indeed a new paradigm for computation with a higher order of artificial intelligence (AI), and put forward cloud automata as a new model for computation. High-level AI requires infusing features that mimic human functioning into AI systems. One of the central features is that humans learn all the time, and that learning is incremental. Consequently, for AI, we need computational models that reflect incremental learning without stopping (sentience). These features are inherent in reflexive, inductive, and limit Turing machines. To construct cloud automata, we use the mathematical theory of Oracles, which includes Oracles of Turing machines as a special case. We develop a hierarchical approach based on Oracles with different ranks that includes Oracle AI as a special case. Using a named-set approach, we describe an implementation of a high-performance edge cloud that combines hierarchical name-oriented networking with Oracle AI-based orchestration. We demonstrate how cloud automata with a control overlay allow microservice network provisioning, monitoring, and reconfiguration to address non-deterministic fluctuations affecting microservice behavior without interrupting the overall evolution of the computation.

    Orchestration of machine learning workflows on Internet of Things data

    Applications empowered by machine learning (ML) and the Internet of Things (IoT) are changing the way people live and are affecting a broad range of industries. However, creating and automating ML workflows at scale using real-world IoT data often leads to complex systems-integration and production issues; examples of challenges faced during the development of such ML applications include glue code, hidden dependencies, and data-pipeline jungles. This research proposes the Machine Learning Framework for IoT data (ML4IoT), which is designed to orchestrate ML workflows that train ML models on IoT data and serve inference from them. In the proposed framework, containerized microservices automate the execution of the tasks specified in ML workflows, which are defined through REST APIs. To address the problem of integrating big-data tools and machine learning into a unified platform, the framework enables the definition and execution of end-to-end ML workflows on large volumes of IoT data. To address the challenge of running multiple ML workflows in parallel, ML4IoT uses container-based components that provide a convenient mechanism for training and deploying numerous ML models concurrently. Finally, to address the common production issues faced during the development of ML applications, the framework adopts a microservices architecture, which brings flexibility, reusability, and extensibility. In our experiments, we demonstrated the feasibility of ML4IoT, which trained and deployed predictive ML models on two types of IoT data. The results suggest that the framework can manage real-world IoT data, providing the elasticity to execute 32 ML workflows in parallel, which trained 128 ML models simultaneously. The results also show that in ML4IoT the performance of serving online predictions is not affected when 64 ML models are deployed concurrently to infer new information from online IoT data.
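
    The abstract states that workflows are defined through REST APIs and executed by containerized microservices, but does not give the concrete API shape. Below is a minimal sketch of what registering and triggering such a workflow might look like; the base URL, endpoint paths, and payload fields are hypothetical assumptions, not ML4IoT's published interface.

```python
# Illustrative sketch only: ML4IoT's actual REST API is not specified in the
# abstract, so the endpoint paths and payload fields below are assumptions.
import requests

BASE_URL = "http://ml4iot.example.com/api/v1"  # hypothetical deployment URL

# Define an ML workflow: which IoT data source to read, which model to train,
# and how to serve predictions. All field names are hypothetical.
workflow = {
    "name": "hvac-temperature-forecast",
    "data_source": {"type": "kafka", "topic": "iot-sensor-readings"},
    "training": {"model": "random_forest", "schedule": "0 2 * * *"},
    "inference": {"mode": "online", "replicas": 2},
}

# Register the workflow; the framework would spin up containerized
# microservices to execute each task defined above.
resp = requests.post(f"{BASE_URL}/workflows", json=workflow, timeout=30)
resp.raise_for_status()
workflow_id = resp.json()["id"]

# Trigger training; inference containers would serve the model once trained.
requests.post(f"{BASE_URL}/workflows/{workflow_id}/train", timeout=30)
```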

    Microservice Transition and its Granularity Problem: A Systematic Mapping Study

    Microservices have gained wide recognition and acceptance in the software industry as an emerging architectural style for autonomic, scalable, and more reliable computing. The transition to microservices has been highly motivated by the need to better align technical design decisions with the value potential of architectures. Despite microservices' popularity, research still lacks a disciplined understanding of the transition and a consensus on the principles and activities underlying the "micro-ing" of architectures. In this paper, we report on a systematic mapping study that consolidates the various views, approaches, and activities that commonly assist in the transition to microservices. The study aims to provide a better understanding of the transition; it also contributes a working definition of the transition and the technical activities underlying it. We term the transition and the technical activities leading to microservice architectures microservitization. We then shed light on a fundamental problem of microservitization: microservice granularity and reasoning about its adaptation as a first-class entity. This study reviews the state of the art and of practice related to reasoning about microservice granularity: the modelling approaches, aspects considered, guidelines, and processes used. It also identifies opportunities for future research and development on reasoning about microservice granularity.

    Design and implementation of serverless architecture for i2b2 on AWS cloud and Snowflake data warehouse

    Informatics for Integrating Biology and the Bedside (i2b2) is an open-source medical tool for cohort discovery that allows researchers to explore and query clinical data. The i2b2 platform is designed to adopt any patient-centric data model and is used at over 400 healthcare institutions worldwide for querying patient data. The platform consists of a web client, core servers, and a database. Despite the available installation guidelines, the system's complex architecture, with numerous dependencies and configuration parameters, makes it difficult to install a functional i2b2 platform. Maintaining the scalability, security, and availability of the application is also challenging and resource-intensive. Our aim was to deploy i2b2 for the University of Missouri (UM) System in the cloud and to reduce the complexity and effort of the installation and maintenance process. Our solution encapsulates the complete installation of each component using Docker and deploys the containers in an AWS Virtual Private Cloud (VPC) using several AWS PaaS (Platform as a Service) and IaaS (Infrastructure as a Service) offerings. We deployed the application as a service on AWS Fargate, an on-demand, serverless, auto-scaling compute engine. We also extended the functionality of the i2b2 services by developing Snowflake JDBC driver support for the i2b2 backend, enabling i2b2 services to query the Snowflake analytical database directly. In addition, we created an i2b2-data-installer package to load PCORnet CDM and ACT ontology data into the i2b2 database. The i2b2 platform at the University of Missouri holds 1.26 billion facts on 2.2 million patients from UM Cerner Millennium data.
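
    The deployment details are institution-specific, but launching a containerized i2b2 component on Fargate can be sketched with boto3 as below; the cluster name, task definition, subnets, and security groups are placeholders, not the actual UM configuration.

```python
# Minimal sketch of running a containerized i2b2 component on AWS Fargate
# with boto3. Cluster, task definition, subnet, and security-group values
# are placeholders for the actual deployment.
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

response = ecs.run_task(
    cluster="i2b2-cluster",                # hypothetical ECS cluster name
    launchType="FARGATE",                  # serverless, auto-scaling compute
    taskDefinition="i2b2-core-server:1",   # hypothetical task definition
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroups": ["sg-0123456789abcdef0"],
            "assignPublicIp": "DISABLED",  # keep the service inside the VPC
        }
    },
)
print(response["tasks"][0]["taskArn"])
```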

    A Survey and Future Directions on Clustering: From WSNs to IoT and Modern Networking Paradigms

    Many Internet of Things (IoT) networks are created as an overlay over traditional ad-hoc networks such as Zigbee. IoT networks can also resemble ad-hoc networks over networks that support device-to-device (D2D) communication, e.g., D2D-enabled cellular networks and Wi-Fi Direct. In these ad-hoc types of IoT networks, efficient topology management is a crucial requirement, particularly in massive-scale deployments. Traditionally, clustering has been recognized as a common approach to topology management in ad-hoc networks such as Wireless Sensor Networks (WSNs). Topology management in WSNs and in ad-hoc IoT networks shares many design commonalities, as both need to transfer data to the destination hop by hop. Thus, WSN clustering techniques can presumably be applied to topology management in ad-hoc IoT networks. This requires a comprehensive study of WSN clustering techniques and an investigation of their applicability to ad-hoc IoT networks. In this article, we survey the field based on the objectives of clustering, such as reducing energy consumption and load balancing, as well as the network properties relevant to efficient clustering in IoT, such as network heterogeneity and mobility. Beyond that, we investigate the advantages and challenges of clustering when IoT is integrated with modern computing and communication technologies such as Blockchain, Fog/Edge computing, and 5G. This survey provides useful insights into research on IoT clustering, allows a broader understanding of its design challenges for IoT networks, and sheds light on its future applications in modern technologies integrated with IoT.
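
    As a concrete reference point for the WSN clustering techniques such a survey builds on, the sketch below implements the probabilistic cluster-head election rule of LEACH, a classic WSN clustering protocol; the node count and head fraction are illustrative parameters, and the sketch is not taken from the surveyed article itself.

```python
# Sketch of LEACH-style cluster-head election, the classic WSN clustering
# scheme. Each node serves as a cluster head once per epoch of 1/P rounds;
# the election threshold rises each round so roughly P*N heads emerge.
import random

P = 0.05          # desired fraction of cluster heads per round
NUM_NODES = 100

def elect_cluster_heads(round_no, eligible, rounds_per_epoch):
    """Each eligible node becomes a cluster head with probability T(n) =
    P / (1 - P * (r mod 1/P)); nodes that served recently sit the round out."""
    threshold = P / (1 - P * (round_no % rounds_per_epoch))
    return {n for n in eligible if random.random() < threshold}

rounds_per_epoch = round(1 / P)
eligible = set(range(NUM_NODES))   # nodes that have not yet served this epoch
for r in range(rounds_per_epoch):
    heads = elect_cluster_heads(r, eligible, rounds_per_epoch)
    eligible -= heads              # heads are ineligible until the epoch resets
    print(f"round {r}: {len(heads)} cluster heads")
```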

    Data-Driven Intelligent Scheduling for Long-Running Workloads in Large-Scale Datacenters

    Cloud computing is becoming a fundamental facility of society today. Large-scale public and private cloud datacenters, spanning millions of servers as warehouse-scale computers, support most of the business of Fortune 500 companies and serve billions of users around the world. Unfortunately, the modern industry-wide average datacenter utilization is as low as 6% to 12%. Low utilization not only hurts the operational and capital components of cost efficiency, but also becomes a scaling bottleneck due to the limits on electricity delivered by nearby utilities. Improving multi-resource efficiency in global datacenters is therefore both critical and challenging. Additionally, with the great commercial success of diverse big-data analytics services, enterprise datacenters are evolving to host heterogeneous computation workloads, including online web services, batch processing, machine learning, streaming computing, interactive query, and graph computation, on shared clusters. Most of these are long-running workloads that leverage long-lived containers to execute tasks. We survey datacenter resource-scheduling work from the last 15 years. Most previous work is designed to maximize cluster efficiency for short-lived tasks in batch-processing systems like Hadoop and is not suitable for modern long-running workloads on systems such as microservices, Spark, Flink, Pregel, Storm, or TensorFlow. New, effective scheduling and resource-allocation approaches are urgently needed to improve efficiency in large-scale enterprise datacenters. This dissertation is the first work to define and identify the problems, challenges, and scenarios of scheduling and resource management for diverse long-running workloads in modern datacenters. Such workloads rely on predictive scheduling techniques to perform reservation, auto-scaling, migration, or rescheduling, which pushes us to pursue more intelligent scheduling techniques built on adequate predictive knowledge. We specify what intelligent scheduling is, which capabilities it requires, and how to leverage it to transform NP-hard online scheduling problems into tractable offline scheduling problems. We designed and implemented an intelligent cloud datacenter scheduler that automatically performs resource-to-performance modelling, predictive optimal reservation estimation, and QoS (interference)-aware predictive scheduling to maximize resource efficiency across multiple dimensions (CPU, memory, network, disk I/O) while strictly guaranteeing service-level agreements (SLAs) for long-running workloads. Finally, we introduce a large-scale co-location technique for executing long-running and other workloads on the shared global datacenter infrastructure of Alibaba Group, which effectively improves cluster utilization from 10% to an average of 50%. This goes far beyond scheduling, involving the evolution of IDC, networking, physical datacenter topology, storage, server hardware, operating systems, and containerization. We demonstrate its effectiveness by analyzing the latest Alibaba public cluster trace, from 2017, and we are the first to reveal through data the global view of scenarios, challenges, and status in Alibaba's large-scale global datacenters, including big promotion events like Double 11. Data-driven intelligent scheduling methodologies and effective infrastructure co-location techniques are critical and necessary to maximize multi-resource efficiency in modern large-scale datacenters, especially for long-running workloads.
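
    The dissertation's actual models are not reproduced in the abstract; purely to illustrate the idea of QoS (interference)-aware predictive placement, the toy sketch below scores candidate servers with a stand-in interference predictor and multi-resource headroom. The predictor and scoring rule are assumptions, not the dissertation's models.

```python
# Toy sketch of interference-aware predictive placement. The predictor and
# scoring below are illustrative stand-ins for learned resource-to-performance
# and interference models.
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    cpu_free: float      # fraction of CPU headroom (0..1)
    mem_free: float      # fraction of memory headroom (0..1)
    colocated: int       # long-running containers already placed here

def predicted_interference(server: Server) -> float:
    """Stand-in for a learned interference model: more co-located
    long-running containers imply higher predicted slowdown."""
    return 0.05 * server.colocated

def place(job_cpu: float, job_mem: float, servers: list[Server]) -> Server:
    feasible = [s for s in servers
                if s.cpu_free >= job_cpu and s.mem_free >= job_mem]
    # Prefer the lowest predicted interference, then the most balanced
    # multi-resource headroom remaining after placement.
    return min(feasible, key=lambda s: (predicted_interference(s),
                                        -min(s.cpu_free - job_cpu,
                                             s.mem_free - job_mem)))

servers = [Server("a", 0.6, 0.5, 4), Server("b", 0.4, 0.7, 1)]
print(place(0.2, 0.2, servers).name)   # -> "b": fewer co-located containers
```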

    A Cognitive Routing Framework for Self-Organised Knowledge-Defined Networks

    This study investigates the applicability of machine-learning methods to routing protocols for achieving rapid convergence in self-organized knowledge-defined networks. The research explores the constituents of the Self-Organized Networking (SON) paradigm for 5G and beyond, aiming to design a routing protocol that complies with SON requirements. It also exploits a contemporary discipline called Knowledge-Defined Networking (KDN) to extend routing capability by calculating the "most reliable" path rather than the shortest one. The research identifies the key areas and candidate techniques for meeting these objectives by surveying the state of the art of the relevant fields, such as QoS-aware routing, hybrid SDN architectures, intelligent routing models, and service-migration techniques. The design phase focuses primarily on the mathematical modelling of the routing problem and approaches the solution by optimizing at the structural level. The work contributes Stochastic Temporal Edge Normalization (STEN), a technique that fuses link and node utilization for cost calculation; MRoute, a hybrid routing algorithm for SDN that leverages STEN to provide constant-time convergence; and Most Reliable Route First (MRRF), which uses a Recurrent Neural Network (RNN) to approximate route reliability as its metric. Additionally, the research outcomes include a cross-platform SDN integration framework (SDN-SIM) and a secure migration technique for containerized services in a Multi-access Edge Computing environment using Distributed Ledger Technology. The work now looks toward the development of 6G standards and compliance with Industry 5.0 to extend the present outcomes in the light of Deep Reinforcement Learning and Quantum Computing.
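
    The abstract does not give STEN's exact formulation; purely as an illustration of fusing link and node utilization into a single routing cost, the sketch below combines the two into one edge weight and runs a shortest-path computation over it. The fusion formula and weights are assumptions, not the thesis's STEN.

```python
# Illustrative sketch of fusing link and node utilization into a single edge
# cost and routing on it, loosely in the spirit of STEN/MRoute. The fusion
# formula and the weights ALPHA/BETA are assumptions for illustration.
import networkx as nx

ALPHA, BETA = 0.7, 0.3   # hypothetical fusion weights

G = nx.Graph()
node_util = {"s": 0.1, "a": 0.8, "b": 0.2, "d": 0.1}   # node utilization
links = [("s", "a", 0.2), ("s", "b", 0.3),
         ("a", "d", 0.2), ("b", "d", 0.3)]             # (u, v, link util)

for u, v, link_util in links:
    # Fused cost: congestion on the link itself plus load at its busier endpoint.
    cost = ALPHA * link_util + BETA * max(node_util[u], node_util[v])
    G.add_edge(u, v, cost=cost)

# "Most reliable" here means least utilized, not fewest hops: the path
# avoids the heavily loaded node "a" even though hop counts are equal.
print(nx.shortest_path(G, "s", "d", weight="cost"))    # -> ['s', 'b', 'd']
```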

    Artificial intelligence driven anomaly detection for big data systems

    The main goal of this thesis is to contribute to research on automated performance-anomaly detection and interference prediction by implementing Artificial Intelligence (AI) solutions for complex distributed systems, especially Big Data platforms within cloud computing environments. Late detection and manual resolution of performance anomalies and system interference in Big Data systems may lead to performance violations and financial penalties. Motivated by this issue, we propose AI-based methodologies for anomaly detection and interference prediction tailored to Big Data and containerized batch platforms, to better analyze system performance and effectively utilize computing resources within cloud environments. Precise and efficient performance-management methods are thus key to handling performance anomalies and interference and to improving the efficiency of datacenter resources. The first part of this thesis contributes to performance-anomaly detection for in-memory Big Data platforms. We examine the performance of Big Data platforms and justify our selection of the in-memory Apache Spark platform. An artificial-neural-network-driven methodology is proposed to detect and classify performance anomalies for batch workloads, based on RDD characteristics and operating-system monitoring metrics. Our method is evaluated against other popular machine-learning (ML) algorithms and on four different monitoring datasets. The results show that our proposed method outperforms the other ML methods, typically achieving 98-99% F-scores, and that a random start instant, a random duration, and overlapped anomalies do not significantly degrade its performance. The second contribution addresses anomaly identification within an in-memory streaming Big Data platform by investigating agile hybrid learning techniques. We develop TRACK (neural neTwoRk Anomaly deteCtion in sparK) and TRACK-Plus, two methods to efficiently train a class of machine-learning models for performance-anomaly detection using a fixed number of experiments. Our model uses artificial neural networks with Bayesian Optimization (BO) to find the training-dataset size and configuration parameters that train the anomaly-detection model to high accuracy efficiently. The objective is to accelerate the search for the training-dataset size, optimize the neural-network configuration, and improve the performance of anomaly classification. A validation on several datasets from a real Apache Spark Streaming system demonstrates that the proposed methodology can efficiently identify performance anomalies, near-optimal configuration parameters, and a near-optimal training-dataset size, while reducing the number of experiments by up to 75% compared with naive anomaly-detection training. The last contribution overcomes the challenges of predicting the completion time of containerized batch jobs and proactively avoiding performance interference by introducing an automated prediction solution that estimates interference among co-located batch jobs within the same computing environment. An AI-driven model is implemented to predict interference among batch jobs before it occurs within the system; our interference-detection model can estimate and alleviate the task slowdown caused by interference. This model assists system operators in making accurate decisions to optimize job placement. Our model is agnostic to the business logic internal to each job; instead, it learns from system performance data, applying artificial neural networks to predict the completion time of batch jobs within cloud environments. We compare our model with three baseline models (a queueing-theoretic model, operational analysis, and an empirical method) on historical measurements of job completion time and CPU run-queue size (i.e., the number of active threads in the system). The proposed model captures multithreading, operating-system scheduling, sleeping time, and job priorities. A validation across 4,500 experiments on the DaCapo benchmarking suite confirms the predictive efficiency and capabilities of the proposed model, which achieves up to 10% MAPE compared with the other models.
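
    The TRACK models themselves are not reproduced in the abstract; the sketch below shows the general pattern it describes, training a neural-network classifier on operating-system monitoring metrics to flag performance anomalies, using synthetic data and an illustrative model size rather than the thesis's actual features and architecture.

```python
# Minimal sketch of neural-network anomaly classification on monitoring
# metrics. The data is synthetic and the model size illustrative; the
# thesis's TRACK models and feature sets differ.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Synthetic monitoring samples: [cpu_util, mem_util, disk_io, net_io].
normal = rng.normal([0.4, 0.5, 0.3, 0.3], 0.05, size=(1000, 4))
anomalous = rng.normal([0.9, 0.8, 0.7, 0.2], 0.05, size=(100, 4))
X = np.vstack([normal, anomalous])
y = np.array([0] * len(normal) + [1] * len(anomalous))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                    random_state=0).fit(X_tr, y_tr)
print("F-score:", f1_score(y_te, clf.predict(X_te)))
```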