300 research outputs found

    Dependability where the mobile world meets the enterprise world

    Get PDF
    As we move toward increasingly larger scales of computing, complexity of systems and networks has increased manifold leading to massive failures of cloud providers (Amazon Cloudfront, November 2014) and geographically localized outages of cellular services (T-Mobile, June 2014). In this dissertation, we investigate the dependability aspects of two of the most prevalent computing platforms today, namely, smartphones and cloud computing. These two seemingly disparate platforms are part of a cohesive story—they interact to provide end-to-end services which are increasingly being delivered over mobile platforms, examples being iCloud, Google Drive and their smartphone counterparts iPhone and Android. ^ In one of the early work on characterizing failures in dominant mobile OSes, we analyzed bug repositories of Android and Symbian and found similarities in their failure modes [ISSRE2010]. We also presented a classification of root causes and quantified the impact of ease of customizing the smartphones on system reliability. Our evaluation of Inter-Component Communication in Android [DSN2012] show an alarming number of exception handling errors where a phone may be crashed by passing it malformed component invocation messages, even from unprivileged applications. In this work, we also suggest language extensions that can mitigate these problems. ^ Mobile applications today are increasingly being used to interact with enterprise-class web services commonly hosted in virtualized environments. Virutalization suffers from the problem of imperfect performance isolation where contention for low-level hardware resources can impact application performance. Through a set of rigorous experiments in a private cloud testbed and in EC2, we show that interference induced performance degradation is a reality. Our experiments have also shown that optimal configuration settings for web servers change during such phases of interference. Based on this observation, we design and implement the IC 2engine which can mitigate effects of interference by reconfiguring web server parameters [MW2014]. We further improve IC 2 by incorporating it into a two-level configuration engine, named ICE, for managing web server clusters [ICAC2015]. Our evaluations show that, compared to an interference agnostic configuration, IC 2 can improve response time of web servers by upto 40%, while ICE can improve response time by up to 94% during phases of interference

    Efficient data traverse paths for both I/O and computation intensive workloads

    Get PDF
    Virtualization has accomplished standard status in big business IT industry. Regardless of its across the board reception, it is realized that virtualization likewise presents non-minor overhead when executing errands on a virtual machine (VM). Specifically, a consolidated impact from gadget virtualization overhead and CPU planning idleness can cause execution debasement when computation concentrated undertakings and I/O escalated errands are co-situated on a VM. Such impedance causes additional vitality utilization, too. Right now, present Hylics, a novel network ment that empowers proficient data  navigate ways for both I/O and computation escalated remaining tasks at hand. This is accomplished with the network ment of in-memory document framework and system administration at the hypervisor level. A few significant structure issues are pinpointed and tended to during our model execution, including proficient transitional data sharing, network administration offloading, and QoS-mindful memory use the executives. In light of our genuine sending on KVM, Hylics can fundamentally improve computation and I/O execution for hybrid workloads

    On the role of performance interference in consolidated environments

    Get PDF
    Cotutela Universitat Politècnica de Catalunya i KTH Royal Institute of TechnologyWith the advent of resource shared environments such as the Cloud, virtualization has become the de facto standard for server consolidation. While consolidation improves utilization, it causes performance-interference between Virtual Machines (VMs) from contention in shared resources such as CPU, Last Level Cache (LLC) and memory bandwidth. Over-provisioning resources for performance sensitive applications can guarantee Quality of Service (QoS), however, it results in low machine utilization. Thus, assuring QoS for performance sensitive applications while allowing co-location has been a challenging problem. In this thesis, we identify ways to mitigate performance interference without undue over-provisioning and also point out the need to model and account for performance interference to improve the reliability and accuracy of elastic scaling. The end goal of this research is to leverage on the observations to provide efficient resource management that is both performance and cost aware. Our main contributions are threefold; first, we improve the overall machine utilization by executing best-e↵ort applications along side latency critical applications without violating its performance requirements. Our solution is able to dynamically adapt and leverage on the changing workload/phase behaviour to execute best-e↵ort applications without causing excessive interference on performance; second, we identify that certain performance metrics used for elastic scaling decisions may become unreliable if performance interference is unaccounted. By modelling performance interference, we show that these performance metrics become reliable in a multi-tenant environment; and third, we identify and demonstrate the impact of interference on the accuracy of elastic scaling and propose a solution to significantly minimise performance violations at a reduced cost.Con la aparición de entornos con recurso compartidos tales como la nube, la virtualización se ha convertido en el estándar de facto para la consolidación de servidores. Mientras que la consolidación mejora la utilización, también causa interferencia en el rendimiento de las máquinas virtuales (VM) debido a la contención en recursos compartidos, tales como CPU, el último nivel de cache (LLC) y el ancho de banda de memoria. El exceso de aprovisionamiento de recursos para aplicaciones sensibles al rendimiento puede garantizar la calidad de servicio (QoS), sin embargo, resulta en una baja utilización de la maquina. Por lo tanto, asegurar QoS en aplicaciones sensibles al rendimiento, al tiempo que permitir la co-localización ha sido un problema difícil. En esta tesis, se identifican las formas de mitigar la interferencia sin necesidad de sobre-aprovisionamiento y también se señala la necesidad de modelar y contabilizar la interferencia en el desempeño para mejorar la fiabilidad y la precisión del escalado elástico. El objetivo final de esta investigación consiste en aprovechar las observaciones para proporcionar una gestión eficiente de los recursos considerando tanto el rendimiento como el coste. Nuestras contribuciones principales son tres; primero, mejoramos la utilización total de la maquina mediante la ejecución de aplicaciones best-effort junto con aplicaciones críticas en latencia sin vulnerar sus requisitos de rendimiento. Nuestra solución es capaz de adaptarse de forma dinámica y sacar provecho del comportamiento cambiante de la carga de trabajo y sus cambios de fase para ejecutar aplicaciones best-effort, sin causar interferencia excesiva en el rendimiento; segundo, identificamos que ciertos parámetros de rendimiento utilizados para las decisiones de escalado elástico pueden no ser fiables si no se tiene en cuenta la interferencia en el rendimiento. Al modelar la interferencia en el rendimiento, se muestra que estas métricas de rendimiento resultan fiables en un entorno multi-proveedor; y tercero, se identifica y muestra el impacto de la interferencia en la precisión del escalado elástico y se propone una solución para minimizar significativamente vulneraciones de rendimiento con un coste reducido.Postprint (published version

    Guardauto: A Decentralized Runtime Protection System for Autonomous Driving

    Full text link
    Due to the broad attack surface and the lack of runtime protection, potential safety and security threats hinder the real-life adoption of autonomous vehicles. Although efforts have been made to mitigate some specific attacks, there are few works on the protection of the self-driving system. This paper presents a decentralized self-protection framework called Guardauto to protect the self-driving system against runtime threats. First, Guardauto proposes an isolation model to decouple the self-driving system and isolate its components with a set of partitions. Second, Guardauto provides self-protection mechanisms for each target component, which combines different methods to monitor the target execution and plan adaption actions accordingly. Third, Guardauto provides cooperation among local self-protection mechanisms to identify the root-cause component in the case of cascading failures affecting multiple components. A prototype has been implemented and evaluated on the open-source autonomous driving system Autoware. Results show that Guardauto could effectively mitigate runtime failures and attacks, and protect the control system with acceptable performance overhead

    A software approach to defeating side channels in last-level caches

    Full text link
    We present a software approach to mitigate access-driven side-channel attacks that leverage last-level caches (LLCs) shared across cores to leak information between security domains (e.g., tenants in a cloud). Our approach dynamically manages physical memory pages shared between security domains to disable sharing of LLC lines, thus preventing "Flush-Reload" side channels via LLCs. It also manages cacheability of memory pages to thwart cross-tenant "Prime-Probe" attacks in LLCs. We have implemented our approach as a memory management subsystem called CacheBar within the Linux kernel to intervene on such side channels across container boundaries, as containers are a common method for enforcing tenant isolation in Platform-as-a-Service (PaaS) clouds. Through formal verification, principled analysis, and empirical evaluation, we show that CacheBar achieves strong security with small performance overheads for PaaS workloads

    Classifying resilience approaches for protecting smart grids against cyber threats

    Get PDF
    Smart grids (SG) draw the attention of cyber attackers due to their vulnerabilities, which are caused by the usage of heterogeneous communication technologies and their distributed nature. While preventing or detecting cyber attacks is a well-studied field of research, making SG more resilient against such threats is a challenging task. This paper provides a classification of the proposed cyber resilience methods against cyber attacks for SG. This classification includes a set of studies that propose cyber-resilient approaches to protect SG and related cyber-physical systems against unforeseen anomalies or deliberate attacks. Each study is briefly analyzed and is associated with the proper cyber resilience technique which is given by the National Institute of Standards and Technology in the Special Publication 800-160. These techniques are also linked to the different states of the typical resilience curve. Consequently, this paper highlights the most critical challenges for achieving cyber resilience, reveals significant cyber resilience aspects that have not been sufficiently considered yet and, finally, proposes scientific areas that should be further researched in order to enhance the cyber resilience of SG.Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. Funding for open access charge: Universidad de Málaga / CBUA

    Autonomic Management And Performance Optimization For Cloud Computing Services

    Get PDF
    Cloud computing has become an increasingly important computing paradigm. It offers three levels of on-demand services to cloud users: software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS) . The success of cloud services heavily depends on the effectiveness of cloud management strategies. In this dissertation work, we aim to design and implement an automatic cloud management system to improve application performance, increase platform efficiency and optimize resource allocation. For large-scale multi-component applications, especially web-based cloud applica- tions, parameter setting is crucial to the service availability and quality. The increas- ing system complexity requires an automatic and efficient application configuration strategy. To improve the quality of application services, we propose a reinforcement learning(RL)-based autonomic configuration framework. It is able to adapt appli- cation parameter settings not only to the variations in workload, but also to the change of virtual resource allocation. The RL approach is enhanced with an efficient initialization policy to reduce the learning time for online decision. Experiments on Xen-based virtual cluster with TPC-W benchmarks show that the framework can drive applications into a optimal configuration in less than 25 iterations. For cloud platform service, one of the key challenges is to efficiently adapt the offered platforms to the virtualized environment, meanwhile maintaining their service features. MapReduce has become an important distributed parallel programming paradigm. Offering MapReduce cloud service presents an attractive usage model for enterprises. In a virtual MapReduce cluster, the interference between virtual machines (VMs) causes performance degradation of map and reduce tasks and renders existing data locality-aware task scheduling policy, like delay scheduling, no longer effective. On the other hand, virtualization offers an extra opportunity of data locality for co-hosted VMs. To address these issues, we present a task scheduling strategy to mitigate interference and meanwhile preserving task data locality for MapReduce applications. The strategy includes an interference-aware scheduling policy, based on a task performance prediction model, and an adaptive delay scheduling algorithm for data locality improvement. Experimental results on a 72-node Xen-based virtual cluster show that the scheduler is able to achieve a speedup of 1.5 to 6.5 times for individual jobs and yield an improvement of up to 1.9 times in system throughput in comparison with four other MapReduce schedulers. Cloud computing has a key requirement for resource configuration in a real-time manner. In such virtualized environments, both virtual machines (VMs) and hosted applications need to be configured on-the fly to adapt to system dynamics. The in- terplay between the layers of VMs and applications further complicates the problem of cloud configuration. Independent tuning of each aspect may not lead to optimal system wide performance. In this work, we propose a framework for coordinated configuration of VMs and resident applications. At the heart of the framework is a model-free hybrid reinforcement learning (RL) approach, which combines the advan- tages of Simplex method and RL method and is further enhanced by the use of system knowledge guided exploration policies. Experimental results on Xen based virtualized environments with TPC-W and TPC-C benchmarks demonstrate that the framework is able to drive a virtual server cluster into an optimal or near-optimal configuration state on the fly, in response to the change of workload. It improves the systems throughput by more than 30% over independent tuning strategies. In comparison with the coordinated tuning strategies based on basic RL or Simplex algorithm, the hybrid RL algorithm gains 25% to 40% throughput improvement

    Resource management of replicated service systems provisioned in the cloud

    Get PDF
    Service providers seek scalable and cost-effective cloud solutions for hosting their applications. Despite significant recent advances facilitating the deployment and management of services on cloud platforms, a number of challenges still remain. Service providers are confronted with time-varying requests for the provided applications, inter- dependencies between different components, performance variability of the procured virtual resources, and cost structures that differ from conventional data centers. Moreover, fulfilling service level agreements, such as the throughput and response time percentiles, becomes of paramount importance for ensuring business advantages.In this thesis, we explore service provisioning in clouds from multiple points of view. The aim is to best provide service replicas in the form of VMs to various service applications, such that their tail throughput and tail response times, as well as resource utilization, meet the service level agreements in the most cost effective manner. In particular, we develop models, algorithms and replication strategies that consider multi-tier composed services provisioned in clouds. We also investigate how a service provider can opportunistically take advantage of observed performance variability in the cloud. Finally, we provide means of guaranteeing tail throughput and response times in the face of performance variability of VMs, using Markov chain modeling and large deviation theory. We employ methods from analytical modeling, event-driven simulations and experiments. Overall, this thesis provides not only a multi-faceted approach to exploring several crucial aspects of hosting services in clouds, i.e., cost, tail throughput, and tail response times, but our proposed resource management strategies are also rigorously validated via trace-driven simulation and extensive experiment

    A smartwater metering deployment based on the fog computing paradigm

    Get PDF
    In this paper, we look into smart water metering infrastructures that enable continuous, on-demand and bidirectional data exchange between metering devices, water flow equipment, utilities and end-users. We focus on the design, development and deployment of such infrastructures as part of larger, smart city, infrastructures. Until now, such critical smart city infrastructures have been developed following a cloud-centric paradigm where all the data are collected and processed centrally using cloud services to create real business value. Cloud-centric approaches need to address several performance issues at all levels of the network, as massive metering datasets are transferred to distant machine clouds while respecting issues like security and data privacy. Our solution uses the fog computing paradigm to provide a system where the computational resources already available throughout the network infrastructure are utilized to facilitate greatly the analysis of fine-grained water consumption data collected by the smart meters, thus significantly reducing the overall load to network and cloud resources. Details of the system's design are presented along with a pilot deployment in a real-world environment. The performance of the system is evaluated in terms of network utilization and computational performance. Our findings indicate that the fog computing paradigm can be applied to a smart grid deployment to reduce effectively the data volume exchanged between the different layers of the architecture and provide better overall computational, security and privacy capabilities to the system

    RISKS IDENTIFICATION AND MITIGATION IN UAV APPLICATIONS DEVELOPMENT PROJECTS

    Get PDF
    With the recent advances in aircraft technologies, software, sensors, and communications, Unmanned Aerial Vehicles (UAVs) can offer a wide range of applications. UAVs can play important roles in applications, such as search and rescue, situation awareness in natural disasters, environmental monitoring, and perimeter surveillance. Developing UAV applications involves integrating hardware, software, sensors, and communication components with the UAV’s base system. UAV applications development projects are complex because of the various development stages and the integration complexity of high component. This research addresses the business and technical challenges encountered by UAV applications development and Project Management (PM). It identifies the risks associated with UAV applications development and compares various risk mitigation and management techniques that can be used. The study also investigates the role of Knowledge Management (KM) in reducing and managing risks. Furthermore, this study proposes a KM framework that reduces risks in UAV applications development projects. In addition, the proposed framework relies on KM and text mining techniques to enhance the efficiency of executing these projects
    corecore