    Virtual machine cluster mobility in inter-cloud platforms

    Modern cloud computing applications are developed from different interoperable services that interface with each other in a loosely coupled way. This work proposes the concept of Virtual Machine (VM) cluster migration, meaning that services can be migrated to various clouds based on different constraints such as computational resources and better economic offerings. Since cloud services are instantiated as VMs, an application can be seen as a cluster of VMs that together provide its functionality. We focus on VM cluster migration by exploring a more sophisticated method with regard to VM network configurations. In particular, networks are hard to manage because their internal setup changes after a migration, which depends on the configuration parameters used during re-instantiation on the new cloud platform. To address this issue, we introduce a Software Defined Networking (SDN) service that breaks the problem of network configuration into tractable pieces and uses virtual bridges instead of references to static endpoints. The architecture is modular and based on the SDN OpenFlow protocol, and it allows VMs to be paired in cluster groups that communicate with each other independently of the cloud platform on which they are deployed. The experimental analysis demonstrates migrations of VM clusters and provides a detailed discussion of service performance for different cases.

    Prebaked µVMs: Scalable, Instant VM Startup for IaaS Clouds

    IaaS clouds promise instantaneously available resources to elastic applications. In practice, however, virtual machine (VM) startup times are on the order of several minutes or, at best, several tens of seconds. This hurts the elasticity of applications such as Web servers that need to scale out to handle dynamically increasing load. VM startup time is strongly influenced by booting the VM's operating system. In this work, we propose using so-called prebaked µVMs to speed up VM startup. µVMs are snapshots of minimal VMs that can be quickly resumed and then configured to application needs by hot-plugging resources. To serve µVMs, we extend our VM boot cache service, Squirrel, so that it can store µVMs for large numbers of VM images on the hosts of a data center. Our experiments show that µVMs can start up in less than one second on a standard file system. Using 1000+ VM images from a production cloud, we show that the respective µVMs can be stored in a compressed and deduplicated file system within 50 GB of storage per host, while starting up within 2-3 seconds on average.
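The storage claim rests on compression and deduplication across many similar snapshots. The following is a minimal sketch of that idea only, not of Squirrel or of the file system the authors used: the `DedupStore` class, the 4 KiB block size, and the sample data are all assumptions made for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


class DedupStore:
    """Toy block store: identical blocks are kept once, in compressed form."""

    def __init__(self):
        self.blocks = {}     # SHA-256 digest -> compressed block
        self.snapshots = {}  # snapshot name -> ordered list of digests

    def put(self, name, data):
        digests = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:  # store each distinct block once
                self.blocks[digest] = zlib.compress(block)
            digests.append(digest)
        self.snapshots[name] = digests

    def get(self, name):
        return b"".join(zlib.decompress(self.blocks[d])
                        for d in self.snapshots[name])

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


# Two hypothetical snapshots sharing four "operating system" blocks.
base = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
snap_a = base + b"A" * BLOCK_SIZE
snap_b = base + b"B" * BLOCK_SIZE

store = DedupStore()
store.put("vm-a", snap_a)
store.put("vm-b", snap_b)
```

The two snapshots contain ten blocks in total, but only six distinct blocks are stored, and each of those is compressed. Real snapshots are far less regular than this sample data, so the ratio here says nothing about the savings reported in the abstract.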

    Availability and Preservation of Scholarly Digital Resources

    The dynamic, decentralized World Wide Web has become an essential part of scientific research and communication, representing a relatively new medium for the conveyance of scientific thought and discovery. Researchers create thousands of web sites every year to share software, data and services. Unlike those for books and journals, however, the preservation systems for these resources are not yet mature. This carries implications that go to the core of science: the ability to examine another's sources to understand and reproduce their work. These valuable resources have been documented as disappearing over time in several subject areas. This dissertation examines the problem by performing a cross-disciplinary investigation, testing the effectiveness of existing remedies and introducing new ones. As part of the investigation, 14,489 unique web pages found in the abstracts within Thomson Reuters’ Web of Science citation index were accessed. The median lifespan of these web pages was found to be 9.3 years, with 62% of them being archived. Survival analysis and logistic regression identified significant predictors of URL lifespan, including the year a URL was published, the number of times it was cited, its depth, and its domain. Statistical analysis revealed biases in current static web-page solutions.
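A median lifespan has to account for pages that were still alive when last checked, which is what survival analysis handles. The sketch below uses the Kaplan-Meier estimator as one common choice. The abstract does not say which estimator the dissertation used, and the durations here are invented for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for each time t at which at least one page died.

    durations: years each page was followed
    observed:  True if the page was seen to die, False if still alive (censored)
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group all pages that leave the study at the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += 1 if pairs[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_lifespan(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # more than half of the pages outlived the study


# Hypothetical data: eight pages, three of them still alive when last checked.
durations = [1, 3, 3, 5, 8, 9, 12, 12]
observed = [True, True, False, True, True, False, True, False]
curve = kaplan_meier(durations, observed)
median = median_lifespan(curve)
```

Censored pages still count as "at risk" up to the time they were last seen, so they lower the estimated death rate without being counted as deaths themselves.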

    Virtual Machine Lifecycle Management in Grid and Cloud Computing

    Virtualization technology is the foundation for two important concepts: Virtualized Grid Computing and Cloud Computing. The former is an extension of classical Grid Computing. Its goal is to meet the requirements of commercial Grid users regarding the isolation of concurrently executed batch jobs and the security of the associated data. Applications are executed in virtual machines in order to isolate them from one another and to protect the data they process from other users. In addition, Virtualized Grid Computing solves the problem of software deployment, one of the open problems of classical Grid Computing. Cloud Computing is another concept for using remote resources. With respect to Cloud Computing, this dissertation focuses on the "Infrastructure as a Service" model, which combines ideas from (Virtualized) Grid Computing with a novel business model. This model consists of providing virtual machines on demand and a pricing scheme in which only actual usage is billed. The use of virtualization technology increases the utilization of the (physical) computer systems involved and simplifies their administration. For example, it is possible to clone a virtual machine or to take a snapshot of a virtual machine in order to return to a defined state later. However, not all problems related to virtualization technology have been solved. In particular, its use in the highly dynamic environments of Virtualized Grid Computing and Cloud Computing creates new challenges for virtualization technology. This dissertation addresses various aspects of the use of virtualization technology in Virtualized Grid and Cloud Computing environments. 
First, the lifecycle of virtual machines in these environments is examined, and models of this lifecycle are developed. Based on these models, problems are identified and solutions to them are developed. The focus is on the storage, deployment, and execution of virtual machines. Virtual machines are usually stored in so-called disk images, that is, images of virtual hard disks. This format affects not only the storage of large numbers of virtual machines but also their deployment. In the environments studied it has two concrete drawbacks: it wastes storage space and it prevents efficient deployment of virtual machines. Measures to increase the security of virtual machines affect all three of these areas. For example, before a virtual machine is deployed, it should be checked whether the software installed in it is still up to date. Furthermore, the execution environment should provide means to monitor the virtual infrastructure effectively. The first solution presented in this dissertation is the concept of Image Composition. It describes the composition of a combined disk image from several layers. As a result, parts of the individual layers that are used by several virtual machines can be shared among them, reducing the storage required for the virtual machines as a whole. The Marvin Image Compositor is the implementation of this concept. The second solution is the Marvin Image Store, a storage system for virtual machines that is not based on the traditionally used disk images but instead stores the data and metadata they contain separately from each other in an efficient way. 
Furthermore, four solutions are presented that can improve the security of virtual machines. The Update Checker makes it possible to identify outdated software in virtual machines, regardless of whether the virtual machine in question is currently running. The second security solution makes it possible to centrally update multiple virtual machines that are based on the Image Composition concept. This means that installing a new software version once is sufficient to bring several virtual machines up to date. The third security solution, called the Online Penetration Suite, makes it possible to scan virtual machines for vulnerabilities automatically. Monitoring the virtual infrastructure at all levels is the purpose of the fourth security solution. In addition to monitoring, this solution also enables an automatic response to security-relevant events. Finally, a method for migrating virtual machines is presented that enables efficient migration even without a central storage system.
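The Image Composition idea, building a disk image from shared layers, can be illustrated with a small sketch. This is not the Marvin Image Compositor: the layers are modelled as plain dictionaries from file path to content, and every layer name and file below is hypothetical.

```python
from collections import ChainMap

# Hypothetical layers: each maps a file path to that file's content.
base_os = {"/bin/sh": "shell-v1", "/etc/os-release": "distro-1.0"}
python_rt = {"/usr/bin/python": "python-3"}
app_a = {"/opt/app/main": "job-a"}
app_b = {"/opt/app/main": "job-b", "/etc/os-release": "distro-1.0-patched"}


def compose(*layers):
    """Overlay layers from bottom to top; upper layers shadow lower ones."""
    # ChainMap looks in its first mapping first, so the top layer goes first.
    return dict(ChainMap(*reversed(layers)))


def size(layer):
    """Total content size of a layer or composed image, in characters."""
    return sum(len(content) for content in layer.values())


vm_a = compose(base_os, python_rt, app_a)
vm_b = compose(base_os, python_rt, app_b)

# Storing two full images versus storing each layer once.
flat_storage = size(vm_a) + size(vm_b)
layered_storage = sum(size(layer) for layer in (base_os, python_rt, app_a, app_b))
```

Because `base_os` and `python_rt` are stored once and referenced by both images, the layered total is smaller than the sum of the two flat images. The same structure suggests why a single update to a shared layer can reach every image composed from it.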

    A study of IPFS as a content distribution protocol in vehicular networks

    Over the last few years, vehicular ad-hoc networks (VANETs) have seen great progress due to the interest in autonomous vehicles and in distributing content not only between vehicles, but also to the Cloud. Performing a download/upload to/from a vehicle typically requires a cellular connection, but the costs associated with mobile data transfers across hundreds or thousands of vehicles quickly become prohibitive. A VANET allows the costs to be several orders of magnitude lower, while carrying the same large volumes of data, because it relies heavily on communication between vehicles (the nodes of the network) and the infrastructure. The InterPlanetary File System (IPFS) is a protocol for storing and distributing content in which information is addressed by its content instead of its location. It was created in 2014 and seeks to connect all computing devices with the same system of files, comparable to a BitTorrent swarm exchanging Git objects. It has been tested and deployed in wired networks, but never in an environment where nodes have intermittent connectivity, such as a VANET. This work focuses on understanding IPFS, how (and whether) it can be applied to the vehicular network context, and how it compares with other content distribution protocols. In this dissertation, IPFS was first tested in a small, controlled network to understand its applicability to VANETs. Issues such as long neighbor discovery times and poor hashing performance were addressed. To compare IPFS with other protocols (such as Veniam’s proprietary solution or BitTorrent) in a relevant way and at large scale, an emulation platform was created. The tests in this emulator used vehicular mobility and connectivity logs from different times of the day, with a variable number of files and file sizes. Emulated results show that IPFS is on par with Veniam’s custom protocol, built specifically for V2V, and greatly outperforms BitTorrent regarding neighbor discovery and data transfers. 
An analysis of IPFS’s performance in a real scenario was also conducted, using a subset of STCP’s vehicular network in Oporto, with the support of Veniam. Results from these tests show that IPFS can be used as a content dissemination protocol in a VANET: it copes with a constantly changing network topology and achieves throughputs of up to 2.8 MB/s, values similar to, or in some cases better than, those of Veniam’s proprietary solution.
Master's in Computer and Telematics Engineering
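Content addressing is what lets a vehicle accept a block from whichever neighbor happens to be in range. The following is a minimal sketch of that property, assuming a plain SHA-256 hex digest as the address. Real IPFS uses multihash-based content identifiers and splits files into a Merkle DAG, neither of which is modelled here, and the `Node` class and node names are hypothetical.

```python
import hashlib


def address_of(data: bytes) -> str:
    """Derive the address from the content itself (plain SHA-256 hex here)."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """Hypothetical network node holding blocks keyed by content address."""

    def __init__(self, name: str):
        self.name = name
        self.blocks = {}
        self.reachable = True  # models intermittent connectivity

    def add(self, data: bytes) -> str:
        key = address_of(data)
        self.blocks[key] = data
        return key


def fetch(key: str, neighbors):
    """Return the block from any reachable neighbor whose copy verifies."""
    for node in neighbors:
        if not node.reachable:
            continue
        data = node.blocks.get(key)
        # The address doubles as an integrity check on the received bytes.
        if data is not None and address_of(data) == key:
            return data
    return None


bus_1, bus_2, roadside = Node("bus-1"), Node("bus-2"), Node("roadside-unit")
payload = b"map update"
key = bus_1.add(payload)
roadside.add(payload)        # same content, same address, on another node

bus_1.reachable = False      # bus-1 drives out of range
received = fetch(key, [bus_1, roadside])
```

Since the key depends only on the bytes, the requester does not need to know or trust which node answers, which is the property that makes the scheme attractive when the set of reachable neighbors keeps changing.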

    Multimedia Content Distribution Management Using a Distributed Topology

    Advertising plays an important role for many companies in promoting their products and services. Placing advertisements can be expensive, with no guarantee that the message will reach the intended audience. In this field, targeted advertising is the mainstream strategy for capturing potential consumers. People are used to seeing advertisements everywhere they go, in many different forms. One of these is the screen display, which is believed to make ads more engaging. However, using digital screens to advertise may lead to some issues, such as downtime or unwanted error messages from the device that controls the screens. This can cause a bad experience for both the target audience and the advertiser. This thesis was developed within the scope of a project called Vixtape, a platform whose goal is to turn any public screen into an ad-displaying device and, in the process, reward the screen owner for exposing ads to the target audience. It also has the mission of giving the end user an optimal technological experience that is free of flaws and highly efficient. These characteristics are achieved through a new open source technology called Interplanetary File System (IPFS), which allows devices to share content among themselves in a Peer-to-Peer (P2P) topology. This content distribution method saves Internet bandwidth for the end user (i.e., the Vixtape service client) and also enables the devices to work offline if their Internet connection drops. This greatly reduces the common problems seen with ad screens, thus giving a better experience to both the audience and the end user. By the end of this document one can see that adding a distributed topology to the Vixtape platform increased the Internet usage efficiency of the ad devices by always having up-to-date content available. This prevents a device from unnecessarily requesting content from any of the other devices that had previously requested it. 
Additionally, a strategy to target a given audience was employed in order to choose the right ads to play. This further increases the number of potential consumers the advertisements are shown to.
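The bandwidth saving and offline operation described above follow from a simple lookup order: local cache, then peers, then the Internet. The sketch below shows only that order. It does not use IPFS or any Vixtape code, and the `Origin` and `Screen` classes, the screen names, and the ad identifier are hypothetical.

```python
class Origin:
    """Stand-in for the remote content server reached over the Internet."""

    def __init__(self, catalog):
        self.catalog = catalog
        self.online = True
        self.downloads = 0  # how many times the Internet link was used

    def download(self, ad_id):
        if not self.online:
            raise ConnectionError("no Internet connection")
        self.downloads += 1
        return self.catalog[ad_id]


class Screen:
    """Ad display device that prefers its own cache, then peers, then the origin."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin
        self.cache = {}
        self.peers = []

    def get_ad(self, ad_id):
        if ad_id in self.cache:
            return self.cache[ad_id]
        for peer in self.peers:
            if ad_id in peer.cache:  # a nearby device already has it
                self.cache[ad_id] = peer.cache[ad_id]
                return self.cache[ad_id]
        self.cache[ad_id] = self.origin.download(ad_id)
        return self.cache[ad_id]


origin = Origin({"ad-1": b"video-bytes"})
lobby, cafe = Screen("lobby", origin), Screen("cafe", origin)
lobby.peers, cafe.peers = [cafe], [lobby]

lobby.get_ad("ad-1")             # first request goes to the origin
origin.online = False            # the Internet connection drops
cafe_copy = cafe.get_ad("ad-1")  # served by the peer, no Internet needed
```

In this run the content crosses the Internet link once even though two screens display it, and the second screen keeps working after the connection is lost.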

    Ad hoc cloud computing

    Commercial and private cloud providers offer virtualized resources via a set of co-located and dedicated hosts that are exclusively reserved for the purpose of offering a cloud service. While both cloud models appeal to the mass market, there are many cases where outsourcing to a remote platform or procuring an in-house infrastructure may not be ideal or even possible. To offer an attractive alternative, we introduce and develop an ad hoc cloud computing platform to transform spare resource capacity from an infrastructure owner’s locally available, but non-exclusive and unreliable infrastructure, into an overlay cloud platform. The ad hoc cloud relies on transferring and instantiating lightweight virtual machines on demand upon near-optimal hosts, while virtual machine checkpoints are distributed in a P2P fashion to other members of the ad hoc cloud. Virtual machines found to be non-operational are restored elsewhere, ensuring the continuity of cloud jobs. In this thesis we investigate the feasibility, reliability and performance of ad hoc cloud computing infrastructures. We first show that the combination of volunteer computing and virtualization is the backbone of the ad hoc cloud. We outline the process of virtualizing the volunteer system BOINC to create V-BOINC. V-BOINC distributes virtual machines to volunteer hosts, allowing volunteer applications to be executed in a sandbox environment, which solves many of the shortcomings of BOINC; it also provides the basis on which an ad hoc cloud computing platform can be developed. We detail the challenges of transforming V-BOINC into an ad hoc cloud and outline the transformational process and integrated extensions. These include a BOINC job submission system, cloud job and virtual machine restoration schedulers, and a periodic P2P checkpoint distribution component. 
Furthermore, as current monitoring tools are unable to cope with the dynamic nature of ad hoc clouds, a dynamic infrastructure monitoring and management tool called the Cloudlet Control Monitoring System is developed and presented. We evaluate each of our individual contributions as well as the reliability, performance and overheads associated with an ad hoc cloud deployed on a realistically simulated unreliable infrastructure. We conclude that the ad hoc cloud is not only a feasible concept but also a viable computational alternative that offers high levels of reliability and can offer at least reasonable performance, which at times may exceed that of a commercial cloud infrastructure.
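The checkpoint-and-restore cycle described above can be sketched in a few lines. This is an illustration under simplifying assumptions, not the thesis's scheduler: peers are chosen at random instead of by any near-optimality criterion, a "checkpoint" is just a copied dictionary, and the `Host` class and all names are hypothetical.

```python
import random


class Host:
    """Hypothetical member of the ad hoc cloud."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.running = {}      # vm id -> current state of VMs running here
        self.checkpoints = {}  # vm id -> checkpoint held on behalf of a peer


def distribute_checkpoint(vm_id, owner, hosts, copies=2, rng=random):
    """Copy the VM's current state to up to `copies` other live hosts."""
    candidates = [h for h in hosts if h is not owner and h.alive]
    for peer in rng.sample(candidates, min(copies, len(candidates))):
        peer.checkpoints[vm_id] = dict(owner.running[vm_id])


def restore(vm_id, hosts):
    """Restart the VM on the first live host that holds one of its checkpoints."""
    for host in hosts:
        if host.alive and vm_id in host.checkpoints:
            host.running[vm_id] = dict(host.checkpoints[vm_id])
            return host
    return None  # no surviving checkpoint: the job's progress is lost


hosts = [Host(name) for name in ("a", "b", "c", "d")]
owner = hosts[0]
owner.running["vm-1"] = {"progress": 40}

distribute_checkpoint("vm-1", owner, hosts)
owner.alive = False                 # the unreliable host disappears
new_host = restore("vm-1", hosts)
```

The number of copies trades network and storage overhead against the chance that every checkpoint holder fails together with the owner, which is one reason the abstract evaluates reliability and overhead side by side.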