68 research outputs found

    Cloud engineering is search based software engineering too

    Many of the problems posed by the migration of computation to cloud platforms can be formulated and solved using techniques associated with Search Based Software Engineering (SBSE). Much of cloud software engineering involves problems of optimisation: performance, allocation, assignment and the dynamic balancing of resources to achieve pragmatic trade-offs between many competing technical and business objectives. SBSE is concerned with the application of computational search and optimisation to solve precisely these kinds of software engineering challenges. Interest in both cloud computing and SBSE has grown rapidly in the past five years, yet there has been little work on SBSE as a means of addressing cloud computing challenges. Like many computationally demanding activities, SBSE has the potential to benefit from the cloud: ‘SBSE in the cloud’. However, this paper focuses instead on the ways in which SBSE can benefit cloud computing. It thus develops the theme of ‘SBSE for the cloud’, formulating cloud computing challenges in ways that can be addressed using SBSE.
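    To make the idea of formulating a cloud allocation problem as a search problem concrete, here is a minimal sketch. It is not taken from the paper: the task costs, the number of virtual machines, the makespan fitness function and the choice of hill climbing are all illustrative assumptions.

```python
import random

def makespan(assignment, task_costs, n_vms):
    """Fitness function: load of the busiest VM (lower is better)."""
    loads = [0.0] * n_vms
    for task, vm in enumerate(assignment):
        loads[vm] += task_costs[task]
    return max(loads)

def hill_climb(task_costs, n_vms, iterations=2000, seed=0):
    """Move one random task at a time; keep moves that do not worsen fitness."""
    rng = random.Random(seed)
    current = [rng.randrange(n_vms) for _ in task_costs]
    initial = best = makespan(current, task_costs, n_vms)
    for _ in range(iterations):
        task = rng.randrange(len(task_costs))
        old_vm = current[task]
        current[task] = rng.randrange(n_vms)
        candidate = makespan(current, task_costs, n_vms)
        if candidate <= best:
            best = candidate          # accept the move
        else:
            current[task] = old_vm    # revert the move
    return current, best, initial

# Hypothetical workload: eight tasks with made-up costs, three VMs.
task_costs = [4, 7, 2, 9, 5, 3, 8, 6]
assignment, best, initial = hill_climb(task_costs, n_vms=3)
print("assignment:", assignment, "makespan:", best, "initial:", initial)
```

    A real formulation would typically trade off several objectives (cost, performance, energy) rather than a single makespan, which is where multi-objective search techniques become relevant.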

    A Black-box Monitoring Approach to Measure Microservices Runtime Performance

    Microservices changed cloud computing by moving the applications' complexity from one monolithic executable to thousands of network interactions between small components. Given the increasing deployment sizes, the architectural exploitation challenges, and the impact on data-centers' power consumption, we need to track this complexity efficiently. Within this article, we propose a black-box monitoring approach to track microservices at scale, focusing on architectural metrics, power consumption, application performance, and network performance. The proposed approach is transparent to the monitored applications, generates less overhead than the black-box approaches available in the state of the art, and provides fine-grained, accurate metrics.

    Introducing the new paradigm of Social Dispersed Computing: Applications, Technologies and Challenges

    If the last decade viewed computational services as a utility, then surely this decade has transformed computation into a commodity. Computation is now progressively integrated into the physical networks in a seamless way that enables cyber-physical systems (CPS) and the Internet of Things (IoT) to meet their latency requirements. Similar to the concepts of “platform as a service” or “software as a service”, both cloudlets and fog computing have found their own use cases. Edge devices (which we call end or user devices for disambiguation) play the role of personal computers, dedicated to a user and to a set of correlated applications. In this new scenario, the boundaries between the network node, the sensor, and the actuator are blurring, driven primarily by the computation power of IoT nodes such as single-board computers and smartphones. The larger volumes of data generated in this type of network need clever, scalable, and possibly decentralized computing solutions that can scale independently as required. Any node can be seen as part of a graph, with the capacity to serve as a computing node, a network router node, or both. Complex applications can be distributed over this graph or network of nodes to improve overall performance, such as the amount of data processed over time. In this paper, we identify this new computing paradigm, which we call Social Dispersed Computing, and analyze its key themes, including a new outlook on its relation to agent-based applications. We architect this new paradigm by providing supportive application examples that include next-generation electrical energy distribution networks, next-generation mobility services for transportation, and applications for distributed analysis and identification of non-recurring traffic congestion in cities.
The paper analyzes the existing computing paradigms (e.g., cloud, fog, edge, mobile edge, social), resolving the ambiguity of their definitions, and discusses the relevant foundational software technologies, the remaining challenges, and research opportunities.
    Garcia Valls, MS.; Dubey, A.; Botti, V. (2018). Introducing the new paradigm of Social Dispersed Computing: Applications, Technologies and Challenges. Journal of Systems Architecture. 91:83-102. https://doi.org/10.1016/j.sysarc.2018.05.007

    lLTZVisor: a lightweight TrustZone-assisted hypervisor for low-end ARM devices

    Master's dissertation in Industrial Electronics and Computer Engineering. Virtualization is a well-established technology in the server and desktop space and has recently been spreading across different embedded industries. Facing multiple challenges arising from the advent of the Internet of Things (IoT) era, these industries are driven by a growing interest in consolidating and isolating multiple environments with mixed-criticality features, to address the complex IoT application landscape. Even though this is true for the majority of mid- to high-end embedded applications, few if any solutions have so far been proposed for low-end systems. TrustZone technology, designed by ARM to improve security on its processors, has been well adopted in the embedded market. As such, the research community became active in exploring other capabilities of TrustZone for isolation, such as an alternative form of system virtualization. The lightweight TrustZone-assisted hypervisor (LTZVisor), which mainly targets the consolidation of mixed-criticality systems on the same hardware platform, is one design example that takes advantage of TrustZone technology for ARM application processors. With the recent introduction of this technology to the new generation of ARM microcontrollers, an opportunity arose to expand this breakthrough form of virtualization to low-end devices. This work proposes the development of the lLTZVisor hypervisor, a refactored LTZVisor version that aims to provide strong isolation on resource-constrained devices, while achieving a low memory footprint, determinism and high efficiency.
The key to this is to implement a minimal, reliable, secure and predictable virtualization layer, supported by the TrustZone technology present on the newest generation of ARM microcontrollers (Cortex-M23/33).

    Infrastructural Security for Virtualized Grid Computing

    The goal of the grid computing paradigm is to make computer power as easy to access as an electrical power grid. Unlike the power grid, the computer grid uses remote resources located at a service provider. Malicious users can abuse the provided resources, which not only affects their own systems but also those of the provider and others. Resources are utilized in an environment where sensitive programs and data from competitors are processed on shared resources, again creating the potential for misuse. This is one of the main security issues, since in a business environment competitors distrust each other, and the fear of industrial espionage is always present. Currently, human trust is the strategy used to deal with these threats. The relationship between grid users and resource providers ranges from highly trusted to highly untrusted. This wide range of trust relationships arises because grid computing itself changed from a research topic with few users to a widely deployed product that included early commercial adoption. The traditional open research communities have very low security requirements, while in contrast, business customers often operate on sensitive data that represents intellectual property; thus, their security demands are very high. In traditional grid computing, most users share the same resources concurrently. Consequently, information regarding other users and their jobs can usually be acquired quite easily. For example, a user can see which processes are running on another user’s system. For business users, this is unacceptable since even the metadata of their jobs is classified. As a consequence, most commercial customers are not convinced that their intellectual property in the form of software and data is protected in the grid. This thesis proposes a novel infrastructural security solution that advances the concept of virtualized grid computing.
The work started in 2007 and led to the development of the XGE, a virtual grid management software. The XGE itself uses operating system virtualization to provide a virtualized landscape. Users’ jobs are no longer executed in a shared manner; they are executed within special sandboxed environments. To satisfy the requirements of a traditional grid setup, the solution can be coupled with an installed scheduler and grid middleware on the grid head node. To protect the prominent grid head node, a novel dual-laned demilitarized zone is introduced to make attacks more difficult. In a traditional grid setup, the head node and the computing nodes are installed in the same network, so a successful attack could also endanger the user’s software and data. While the zone complicates attacks, it is, like all security solutions, not perfect. Therefore, a network intrusion detection system is enhanced with grid-specific signatures. A novel software called Fence is introduced that supports end-to-end encryption, which means that all data remains encrypted until it reaches its final destination. It transfers data securely between the user’s computer, the head node and the nodes within the shielded, internal network. A lightweight kernel rootkit detection system ensures that only trusted kernel modules can be loaded. It is no longer possible to load untrusted modules such as kernel rootkits. Furthermore, a malware scanner for virtualized grids scans for signs of malware in all running virtual machines. Using virtual machine introspection, that scanner remains invisible to most types of malware and has full access to all system calls on the monitored system. To speed up detection, the load is distributed to multiple detection engines simultaneously. To enable multi-site service-oriented grid applications, the novel concept of public virtual nodes is presented. This is a virtualized grid node with a public IP address shielded by a set of dynamic firewalls.
It is possible to create a set of connected public nodes, located on one or more remote grid sites. A special web service allows users to modify their own rule set in both directions and in a controlled manner. The main contribution of this thesis is the presentation of solutions that improve the security of grid computing infrastructures. This includes the XGE, a software that transforms a traditional grid into a virtualized grid. Design and implementation details including experimental evaluations are given for all approaches. Nearly all parts of the software are available as open source software. A summary of the contributions and an outlook on future work conclude this thesis.

    An Artificial Intelligence Framework for Supporting Coarse-Grained Workload Classification in Complex Virtual Environments

    Cloud-based machine learning tools for enhanced Big Data applications, where the main idea is that of predicting the “next” workload occurring against the target Cloud infrastructure via an innovative ensemble-based approach that combines the effectiveness of different well-known classifiers in order to enhance the overall accuracy of the final classification, which is currently very relevant in the specific context of Big Data. The so-called workload categorization problem plays a critical role in improving the efficiency and reliability of Cloud-based big data applications. Implementation-wise, our method proposes deploying Cloud entities that participate in the distributed classification approach on top of virtual machines, which represent classical “commodity” settings for Cloud-based big data applications. Given a number of known reference workloads and an unknown workload, in this paper we deal with the problem of finding the reference workload that is most similar to the unknown one. The depicted scenario turns out to be useful in a plethora of modern information system applications. We call this problem coarse-grained workload classification because, instead of characterizing the unknown workload in terms of finer behaviors, such as CPU-, memory-, disk-, or network-intensive patterns, we classify the whole unknown workload as one of the (possible) reference workloads. Reference workloads represent a category of workloads that are relevant in a given application environment. In particular, we focus our attention on the classification problem described above in the special case represented by virtualized environments. Today, Virtual Machines (VMs) have become very popular because they offer important advantages to modern computing environments such as cloud computing or server farms.
In virtualization frameworks, workload classification is very useful for accounting, security reasons, or user profiling. Hence, our research makes more sense in such environments, and it turns out to be very useful in a special context like Cloud Computing, which is emerging now. In this respect, our approach consists of running several machine learning-based classifiers of different workload models, and then deriving the best classifier produced by the Dempster-Shafer Fusion, in order to increase the accuracy of the final classification. Experimental assessment and analysis clearly confirm the benefits derived from our classification framework. The running programs which produce unknown workloads to be classified are treated in a similar way. A fundamental aspect of this paper concerns the successful use of data fusion in workload classification. Different types of metrics are in fact fused together using the Dempster-Shafer theory of evidence combination, giving a classification accuracy of slightly less than 80%. The acquisition of data from the running process, the pre-processing algorithms, and the workload classification are described in detail. Various classical algorithms have been used to classify the workloads, and the results are compared.
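    The fusion step mentioned in the abstract can be illustrated with Dempster's rule of combination. The sketch below is not the authors' implementation: the two workload labels and all mass values are invented for illustration, and each classifier's output is assumed to have already been converted into a mass function over sets of candidate workloads.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b   # evidence that cannot be reconciled
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise so that the remaining masses sum to one.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}

# Hypothetical reference workloads and classifier outputs.
WEB, BATCH = frozenset({"web"}), frozenset({"batch"})
EITHER = WEB | BATCH                      # mass left on "do not know"

cpu_classifier = {WEB: 0.6, EITHER: 0.4}
net_classifier = {WEB: 0.5, BATCH: 0.3, EITHER: 0.2}

fused = combine(cpu_classifier, net_classifier)
decision = max((h for h in fused if len(h) == 1), key=fused.get)
print(fused, "->", set(decision))
```

    With these made-up inputs the fused belief in the web workload is higher than either classifier assigned on its own, which is the effect the abstract attributes to data fusion.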

    Mesure et analyse de latences dans les systèmes parallèles en temps réel

    Today's server infrastructures are increasingly organized around cloud and virtualization technologies, and the services that run on these infrastructures tend to rely heavily on parallelisation to scale up to the demand. With this type of distributed system, performance analyses are becoming increasingly complex. Indeed, the work required to answer a single request can be divided among multiple servers, and a problem with any of the nodes can slow down the whole request. Finding the exact source of an abnormal latency in this kind of configuration can be really difficult and requires a lot of time. The problems we encounter are not new; they are similar to the ones faced by real-time systems. The biggest issue is to automatically detect these problems in real time in production, and to have a scalable way to collect the context information required to understand and solve them. In this thesis, we propose the latency-tracker as a solution to efficiently measure and analyse latency problems in real time, and to combine it with the LTTng tracer to gather and extract traces locally and on the network.
The main objective is to make these complex analyses efficient enough to run on production machines, either servers in data centers or embedded platforms dedicated to real-time tasks. This approach to detecting and explaining latency issues on the order of tens of microseconds in the Linux kernel is new, and there is no equivalent solution today. By individually measuring the impact of all the components added in the critical path of the applications, we demonstrate that it is possible to use this approach in very demanding environments. We measure the impact on the usage of resources, down to the impact on cache lines, but we also study the scalability of our approach on highly concurrent and distributed applications. The main contribution of this research is the set of algorithms developed to accurately measure latencies with minimal impact, and to collect and extract enough context information to understand the causes of the latency. This low impact enables the use of these methodologies in production, under real loads, which would be impossible with existing tools without the risk of altering the execution conditions. We specialize and optimize the current techniques related to event aggregation, and combine them with tracing to create the new domain of stateful tracing.
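    The general idea of matching a begin event with an end event and reacting only when the measured latency crosses a threshold can be sketched in user space as follows. This is an illustrative assumption about the principle, not the API of the latency-tracker described in the thesis, which operates inside the Linux kernel; the class name, method names, and threshold are hypothetical.

```python
import time

class LatencyTracker:
    """Match begin/end events by key and report latencies above a threshold."""

    def __init__(self, threshold_s, on_slow):
        self.threshold_s = threshold_s
        self.on_slow = on_slow        # callback(key, latency_s)
        self._pending = {}            # key -> start timestamp (the kept state)

    def begin(self, key, now=None):
        self._pending[key] = time.monotonic() if now is None else now

    def end(self, key, now=None):
        start = self._pending.pop(key, None)
        if start is None:
            return None               # end event without a matching begin
        latency = (time.monotonic() if now is None else now) - start
        if latency > self.threshold_s:
            self.on_slow(key, latency)  # only slow events trigger any output
        return latency

# Explicit timestamps keep the example deterministic.
slow = []
tracker = LatencyTracker(threshold_s=0.005,
                         on_slow=lambda key, latency: slow.append((key, latency)))
tracker.begin("req-1", now=0.000)
tracker.begin("req-2", now=0.001)
tracker.end("req-1", now=0.002)       # 2 ms, below the threshold
tracker.end("req-2", now=0.011)       # 10 ms, reported as slow
print(slow)
```

    Keeping only the pending start timestamps in memory and emitting nothing for fast events is what keeps the overhead of this kind of stateful approach low; a callback like `on_slow` is where a tracer snapshot could be triggered to capture context.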

    Observing the clouds : a survey and taxonomy of cloud monitoring

    This research was supported by a Royal Society Industry Fellowship and an Amazon Web Services (AWS) grant. Date of Acceptance: 10/12/2014. Monitoring is an important aspect of designing and maintaining large-scale systems. Cloud computing presents a unique set of challenges to monitoring, including on-demand infrastructure, unprecedented scalability, rapid elasticity and performance uncertainty. There is a wide range of monitoring tools originating from cluster and high-performance computing, grid computing and enterprise computing, as well as a series of newer bespoke tools that have been designed exclusively for cloud monitoring. These tools share a number of common elements and designs, which address the demands of cloud monitoring to various degrees. This paper performs an exhaustive survey of contemporary monitoring tools, from which we derive a taxonomy that examines how effectively existing tools and designs meet the challenges of cloud monitoring. We conclude by examining the socio-technical aspects of monitoring, and investigate the engineering challenges and practices behind implementing monitoring strategies for cloud computing. Publisher PDF. Peer reviewed.