5,721 research outputs found

    High Energy Physics Forum for Computational Excellence: Working Group Reports (I. Applications Software II. Software Libraries and Tools III. Systems)

    Computing plays an essential role in all aspects of high energy physics. As computational technology evolves rapidly in new directions, and data throughput and volume continue to follow a steep trend-line, it is important for the HEP community to develop an effective response to a series of expected challenges. In order to help shape the desired response, the HEP Forum for Computational Excellence (HEP-FCE) initiated a roadmap planning activity with two key overlapping drivers -- 1) software effectiveness, and 2) infrastructure and expertise advancement. The HEP-FCE formed three working groups, 1) Applications Software, 2) Software Libraries and Tools, and 3) Systems (including systems software), to provide an overview of the current status of HEP computing and to present findings and opportunities for the desired HEP computational roadmap. The final versions of the reports are combined in this document and are presented along with introductory material. Comment: 72 pages

    An innovative approach to performance metrics calculus in cloud computing environments: a guest-to-host oriented perspective

    In virtualized systems, the task of profiling and resource monitoring is not straightforward. Many datacenters perform CPU overcommitment using hypervisors, running multiple virtual machines on a single computer where the total number of virtual CPUs exceeds the total number of physical CPUs available. From a customer's point of view, it is therefore interesting to know whether the purchased service levels are effectively respected by the cloud provider. The innovative approach to performance profiling described in this work is based on the use of virtual performance counters, only recently made available by some hypervisors to their virtual machines, to implement guest-wide profiling. Although the virtual machine cannot access the Virtual Machine Monitor, with this method it is able to gather enough information to deduce the state of resource overcommitment of the virtualization host where it is executed. Tests have been carried out inside the compute nodes of the FIWARE Genoa Node, an instance of a widely distributed federated community cloud based on OpenStack and KVM. AgiLab-DITEN, the laboratory I belonged to and where I conducted my studies, together with TnT-Lab–DITEN and CNIT-GE-Unit, designed, installed and configured the whole Genoa Node, which was hosted in DITEN-UniGE equipment rooms. All the software measuring instruments, operating systems and programs used in this research are publicly available and free, and can be easily installed in a micro instance of a virtual machine, rapidly deployable also in public clouds.
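    As a rough illustration of the guest-side idea (not the thesis' virtual-performance-counter method, but a simpler proxy): a KVM guest can estimate host CPU contention from the "steal" time it observes in /proc/stat. The sampling interval and interpretation below are arbitrary assumptions.

```python
# Minimal sketch (not the thesis' virtual-PMU approach): estimate host CPU
# contention from inside the guest by sampling "steal" time in /proc/stat,
# which a KVM guest accumulates while the hypervisor runs other VMs.
import time

def cpu_times():
    """Return (total, steal) jiffies from the aggregate line of /proc/stat."""
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]          # drop the leading "cpu" label
    user, nice, system, idle, iowait, irq, softirq, steal = map(int, fields[:8])
    return sum((user, nice, system, idle, iowait, irq, softirq, steal)), steal

def steal_ratio(interval=5.0):
    """Fraction of CPU time stolen by the host during the sampling interval."""
    total0, steal0 = cpu_times()
    time.sleep(interval)
    total1, steal1 = cpu_times()
    return (steal1 - steal0) / max(total1 - total0, 1)

if __name__ == "__main__":
    ratio = steal_ratio()
    # A sustained high steal ratio hints at vCPU overcommitment on the host.
    print(f"steal time share over the last interval: {ratio:.1%}")
```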

    Software service adaptation based on interface localisation

    The aim of Web services is the provision of software services to a range of different users in different locations. Service localisation in this context can facilitate the internationalisation and localisation of services by allowing their adaptation to different locales. The authors investigate three dimensions: (i) lingual localisation, providing service-level language translation techniques to adapt services to different languages; (ii) regulatory localisation, providing standards-based mappings to achieve regulatory compliance with regionally varying laws, standards and regulations; and (iii) social localisation, taking into account the preferences and customs of individuals and the groups or communities in which they participate. The objective is to support and implement an explicit modelling of the aspects that are relevant to localisation, together with runtime support consisting of tools and middleware services to automate deployment based on models of locales, driven by the localisation dimensions. The authors focus here on an ontology-based conceptual information model that integrates locale specification into service architectures in a coherent way.
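    As a purely illustrative sketch (the class names and scoring are hypothetical and not the authors' ontology), the following Python fragment shows how a locale covering the three dimensions above could drive the selection of a localized service variant.

```python
# Illustrative only: a locale combining the lingual, regulatory and social
# localisation dimensions, used to pick the best-matching service variant.
from dataclasses import dataclass

@dataclass(frozen=True)
class Locale:
    language: str                       # lingual dimension, e.g. "de"
    jurisdiction: str                   # regulatory dimension, e.g. "EU"
    community: frozenset = frozenset()  # social dimension: groups/preferences

@dataclass
class ServiceVariant:
    endpoint: str
    locale: Locale

def select_variant(variants, target):
    """Score variants by how many locale dimensions they match."""
    def score(v):
        return (v.locale.language == target.language,
                v.locale.jurisdiction == target.jurisdiction,
                len(v.locale.community & target.community))
    return max(variants, key=score)

variants = [
    ServiceVariant("https://api.example.com/v1/en-us", Locale("en", "US")),
    ServiceVariant("https://api.example.com/v1/de-eu", Locale("de", "EU")),
]
print(select_variant(variants, Locale("de", "EU")).endpoint)
```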

    Self-management for large-scale distributed systems

    Autonomic computing aims at making computing systems self-managing by using autonomic managers, in order to reduce the obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems, motivated by the increasing complexity of computing systems and their management.

    In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. Niche implements the autonomic computing architecture proposed by IBM in a fully decentralized way. It supports a network-transparent view of the system architecture, simplifying the design of distributed self-management, and provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We then present a methodology for designing the management part of a distributed self-managing application, defining design steps that include the partitioning of management functions and the orchestration of multiple autonomic managers.

    In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time-consuming and error-prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach replicates a management element using finite state machine replication with a reconfigurable replica set; our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store, built on a peer-to-peer network, that supports multiple consistency levels. The store enables a trade-off between high availability and data consistency. Using majorities avoids the potential drawbacks of master-based consistency control, namely a single point of failure and a potential performance bottleneck.

    In the third part, we investigate self-management for Cloud-based storage systems, with a focus on elasticity control using elements of control theory and machine learning. We have studied a number of different designs of an elasticity controller, including a state-space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using a state-space model that enables trading off performance for cost, and outline the design steps. We conclude by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control.
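    As a rough illustration of the combined feedforward/feedback idea behind such an elasticity controller (the per-node capacity, latency target and gain are assumed values, not the thesis' actual ElastMan model), consider this minimal Python sketch:

```python
# Feedforward sizes the store from the observed workload; feedback corrects
# the residual deviation of read latency from its target (assumed values).
import math

def feedforward(requests_per_sec, per_node_capacity=5000.0):
    """Nodes needed if each node sustains `per_node_capacity` req/s (assumed)."""
    return requests_per_sec / per_node_capacity

def feedback(latency_ms, target_ms=50.0, gain=0.05):
    """Proportional correction: add nodes while latency exceeds the target."""
    return gain * (latency_ms - target_ms)

def desired_nodes(requests_per_sec, latency_ms):
    wanted = feedforward(requests_per_sec) + feedback(latency_ms)
    return max(1, math.ceil(wanted))   # nodes are discrete, never scale below 1

# Example: a spike to 40k req/s with 95 ms read latency against a 50 ms target.
print(desired_nodes(requests_per_sec=40000, latency_ms=95))
```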

    Advanced Elastic Platforms for High Throughput Computing on Container-based and Serverless Infrastructures

    The main objective of this thesis is to allow scientific users to deploy and execute highly parallel, event-driven, file-processing serverless applications both in public (e.g. AWS) and in private (e.g. OpenNebula, OpenStack) cloud infrastructures. To achieve this objective, different tools and platforms have been developed and integrated to provide scientific users with a way of deploying High Throughput Computing applications based on containers that can benefit from the high elasticity of serverless environments. First, an open-source tool to deploy generic serverless workloads in the AWS public Cloud provider has been created. This tool allows scientific users to benefit from the features of AWS Lambda (e.g. high scalability, event-driven computing) for the deployment and integration of compute-intensive applications that use the Functions as a Service (FaaS) model. Second, an event-driven, file-processing, high-throughput programming model has been developed to allow users to deploy generic applications as workflows of functions in serverless architectures, offering transparent data management. Third, in order to overcome the drawbacks of public serverless services, such as limited execution time or computing capabilities, an open-source platform to support FaaS for compute-intensive applications in on-premises Clouds was created. The platform can be automatically deployed on multiple Clouds in order to create highly parallel, event-driven, file-processing serverless applications. Finally, in order to assess and validate all the developed tools and platforms, several use cases with business and scientific backgrounds have been tested. Pérez González, AM. (2020). Advanced Elastic Platforms for High Throughput Computing on Container-based and Serverless Infrastructures [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/146365
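    As a generic illustration of the event-driven file-processing pattern this thesis targets (not its actual tooling; the bucket names and the processing step are hypothetical placeholders), an AWS Lambda handler triggered by S3 uploads might look as follows:

```python
# Generic sketch: a Lambda handler fired by an S3 upload downloads the new
# object, processes it, and writes the result to an output bucket.
import os
import boto3

s3 = boto3.client("s3")
OUTPUT_BUCKET = os.environ.get("OUTPUT_BUCKET", "my-output-bucket")  # assumed

def process_file(path):
    """Placeholder for the container/binary doing the actual computation."""
    out_path = path + ".out"
    with open(path, "rb") as src, open(out_path, "wb") as dst:
        dst.write(src.read().upper())  # trivial stand-in transformation
    return out_path

def handler(event, context):
    for record in event["Records"]:                       # S3 event records
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        local = os.path.join("/tmp", os.path.basename(key))  # Lambda scratch dir
        s3.download_file(bucket, key, local)
        result = process_file(local)
        s3.upload_file(result, OUTPUT_BUCKET, os.path.basename(result))
```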

    Virtual Machine Lifecycle Management in Grid and Cloud Computing

    Virtualization technology is the foundation of two important concepts: Virtualized Grid Computing and Cloud Computing. The former is an extension of classical Grid Computing that aims to meet the requirements of commercial Grid users regarding the isolation of concurrently executed batch jobs and the security of the associated data. Applications are executed in virtual machines in order to isolate them from one another and to protect the data they process from other users. In addition, Virtualized Grid Computing solves the software deployment problem, one of the open issues of classical Grid Computing. Cloud Computing is another concept for using remote resources. With respect to Cloud Computing, this dissertation focuses on the Infrastructure as a Service model, which combines ideas from (Virtualized) Grid Computing with a novel business model: virtual machines are provided on demand and only the actual usage is billed. The use of virtualization technology increases the utilization of the underlying physical machines and simplifies their administration; for example, a virtual machine can be cloned, or a snapshot can be taken in order to return to a defined state later. However, not all problems related to virtualization technology have been solved. In particular, its use in the highly dynamic environments of Virtualized Grid Computing and Cloud Computing raises new challenges. This dissertation deals with several aspects of using virtualization technology in Virtualized Grid and Cloud Computing environments. First, the lifecycle of virtual machines in these environments is examined and models of this lifecycle are developed. Based on these models, problems are identified and solutions are devised, with a focus on the storage, deployment, and execution of virtual machines. Virtual machines are usually stored as so-called disk images, i.e. images of virtual hard disks. This format affects not only the storage of large numbers of virtual machines but also their deployment; in the environments considered here it has two concrete drawbacks: it wastes storage space and it prevents an efficient deployment of virtual machines. Measures to increase the security of virtual machines affect all three of the areas mentioned above. For example, before a virtual machine is deployed, it should be checked whether the software installed in it is still up to date, and the execution environment should provide means to effectively monitor the virtual infrastructure. The first solution presented in this dissertation is the concept of Image Composition: a combined disk image is composed of several layers, so that parts of individual layers used by multiple virtual machines can be shared between them, reducing the overall storage requirements of the virtual machines. The Marvin Image Compositor is the implementation of this concept. The second solution is the Marvin Image Store, a storage system for virtual machines that is not based on the traditionally used disk images but instead stores the contained data and metadata separately in an efficient way. Furthermore, four solutions that improve the security of virtual machines are presented. The Update Checker identifies outdated software in virtual machines, regardless of whether the respective virtual machine is currently running or not. The second security solution allows multiple virtual machines based on the Image Composition concept to be updated centrally, meaning that installing a new software version once is sufficient to bring several virtual machines up to date. The third security solution, the Online Penetration Suite, automatically scans virtual machines for vulnerabilities. Monitoring the virtual infrastructure at all levels is the purpose of the fourth security solution, which, in addition to monitoring, also enables automatic reactions to security-relevant events. Finally, a method for the migration of virtual machines is presented that enables efficient migration even without a central storage system.
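    As a back-of-the-envelope illustration of why image composition saves storage (the layer names and sizes are invented, and this is not the Marvin Image Compositor itself), consider the following Python sketch:

```python
# Illustrative only: if VM images are split into layers, layers shared by
# several VMs are stored once instead of once per VM.

def monolithic_size(vms):
    """Total storage if every VM keeps its own full disk image."""
    return sum(sum(layers.values()) for layers in vms.values())

def composed_size(vms):
    """Total storage if identical layers are stored only once and shared."""
    unique = {}
    for layers in vms.values():
        unique.update(layers)            # same layer name -> stored once
    return sum(unique.values())

# Hypothetical layer sizes in GiB: a common base OS and shared software stacks.
vms = {
    "vm-a": {"debian-base": 2.0, "python-stack": 1.0, "app-a": 0.3},
    "vm-b": {"debian-base": 2.0, "python-stack": 1.0, "app-b": 0.5},
    "vm-c": {"debian-base": 2.0, "app-c": 0.4},
}
print(round(monolithic_size(vms), 1), "GiB without sharing")
print(round(composed_size(vms), 1), "GiB with shared layers")
```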

    Monitoring tools for DevOps and microservices: A systematic grey literature review

    Microservice-based systems are usually developed according to agile practices like DevOps, which enable rapid and frequent releases to promptly react and adapt to changes. Monitoring is a key enabler for these systems, as it allows teams to continuously obtain feedback from the field and supports timely and tailored decisions for a quality-driven evolution. Within the landscape of monitoring tools available for microservices in DevOps-driven development, each with different features, assumptions, and performance, selecting a suitable tool is a task as difficult as it is impactful. This article presents the results of a systematic study of the grey literature that we performed to identify, classify and analyze the available monitoring tools for DevOps and microservices. We selected and examined a list of 71 monitoring tools, drawing a map of their characteristics, limitations, assumptions, and open challenges, meant to be useful to both researchers and practitioners working in this area. Results are publicly available and replicable.
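    As a small illustration of the pull-based metrics style that many of the surveyed tools support (the metric names and port below are arbitrary examples, not a recommendation from the study), a microservice can expose Prometheus-style metrics via the prometheus_client library:

```python
# Minimal sketch: expose a request counter and a latency histogram so that a
# monitoring system can scrape them from /metrics.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("http_requests_total", "Requests handled", ["endpoint"])
LATENCY = Histogram("request_latency_seconds", "Request latency")

def handle_request(endpoint="/orders"):
    with LATENCY.time():                       # record how long the work takes
        time.sleep(random.uniform(0.01, 0.1))  # stand-in for real work
    REQUESTS.labels(endpoint=endpoint).inc()

if __name__ == "__main__":
    start_http_server(9100)                    # serves the /metrics endpoint
    while True:
        handle_request()
```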

    A new MDA-SOA based framework for intercloud interoperability

    Cloud computing has been one of the most important topics in Information Technology; it aims to assure scalable and reliable on-demand services over the Internet. Expanding the application scope of cloud services requires cooperation between clouds from different providers that have heterogeneous functionalities. Such collaboration between different cloud vendors can provide better Quality of Service (QoS) at a lower price. However, current cloud systems have been developed without concern for seamless cloud interconnection, and in practice they do not support the intercloud interoperability needed to enable collaboration between cloud service providers. Hence, this PhD work addresses interoperability between cloud providers as its central research objective. The thesis proposes a new framework that supports inter-cloud interoperability in a cloud environment with heterogeneous computing resources, with the goal of dispatching the workload to the most effective clouds available at runtime. Analysing the methodologies that have been applied to resolve various interoperability problems led us to adopt Model Driven Architecture (MDA) and Service Oriented Architecture (SOA) as appropriate foundations for our inter-cloud framework. Moreover, since distributing operations in a cloud-based environment is an NP-complete problem, a Genetic Algorithm (GA) based job scheduler is proposed as part of the interoperability framework, offering workload migration with the best performance at the least cost. A new Agent Based Simulation (ABS) approach is proposed to model the inter-cloud environment with three types of agents: Cloud Subscriber agents, Cloud Provider agents, and Job agents. The ABS model is used to evaluate the proposed framework. Funding: Fundação para a Ciência e a Tecnologia (FCT) (grant reference SFRH/BD/33965/2009) and EC 7th Framework Programme under grant agreement n° 604674 FITMAN (http://www.fitman-fi.eu).
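    As a rough illustration of a GA-based job scheduler of the kind proposed here (the prices, speeds, job sizes and GA parameters are invented, not the thesis' implementation), a minimal Python sketch:

```python
# A chromosome assigns every job to one cloud; fitness trades money for time.
import random

CLOUD_PRICE = [1.0, 1.8, 2.5]    # assumed price per work unit, per cloud
CLOUD_SPEED = [1.0, 2.0, 4.0]    # assumed work units processed per hour
JOB_SIZE = [4, 2, 7, 1, 5, 3]    # assumed work units per job

def fitness(assignment, time_weight=2.0):
    """Lower is better: money spent plus a penalty for the slowest cloud."""
    money = sum(JOB_SIZE[j] * CLOUD_PRICE[c] for j, c in enumerate(assignment))
    load = [0.0] * len(CLOUD_PRICE)
    for j, c in enumerate(assignment):
        load[c] += JOB_SIZE[j] / CLOUD_SPEED[c]
    return money + time_weight * max(load)

def evolve(pop_size=30, generations=200, mutation_rate=0.2):
    pop = [[random.randrange(len(CLOUD_PRICE)) for _ in JOB_SIZE]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                          # keep the fittest half
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(JOB_SIZE))   # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mutation_rate:        # single-gene mutation
                child[random.randrange(len(child))] = random.randrange(len(CLOUD_PRICE))
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()
print("job-to-cloud assignment:", best, "fitness:", round(fitness(best), 2))
```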