42 research outputs found

    Survey on Deduplication Techniques in Flash-Based Storage

    Get PDF
    Data deduplication is becoming more important as data volumes grow. The field is in active development and has recently been influenced by the appearance of the Solid State Drive (SSD). This new type of disk differs significantly from random access memory and from hard disk drives, and is now in widespread use. In this paper we propose a novel taxonomy that reflects the main issues related to deduplication on Solid State Drives. We present a survey of deduplication techniques focusing on flash-based storage, describe several open-source tools that implement data deduplication, and briefly outline open research problems related to data deduplication in flash-based storage systems.
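    The techniques surveyed share a common core: splitting data into blocks, fingerprinting each block, and storing identical blocks only once. The following minimal Python sketch (a generic illustration, not code from the paper) shows fixed-size block deduplication with SHA-256 fingerprints:

        import hashlib

        def deduplicate(blocks):
            # Store each unique block once, keyed by its fingerprint, and
            # return the ordered fingerprint list needed to rebuild the data.
            store = {}    # fingerprint -> block contents
            recipe = []   # ordered fingerprints describing the original stream
            for block in blocks:
                fp = hashlib.sha256(block).hexdigest()
                if fp not in store:       # only new content consumes space
                    store[fp] = block
                recipe.append(fp)
            return store, recipe

        # Example: three 4 KiB blocks, two of them identical.
        blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096]
        store, recipe = deduplicate(blocks)
        print(len(store), "unique blocks for", len(recipe), "logical blocks")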

    Resource-Efficient Replication and Migration of Virtual Machines.

    Full text link
    Continuous replication and live migration of Virtual Machines (VMs) are two vital tools in a virtualized environment, but they are resource-expensive. Continuously replicating a VM's checkpointed state to a backup host maintains high availability (HA) of the VM despite host failures, but checkpoint replication can generate significant network traffic. Each replicated VM also incurs a 100% memory overhead, since the backup unproductively reserves the same amount of memory to hold the redundant VM state. Live migration, though widely used for load balancing, power saving, and other purposes, can also generate excessive network traffic by transferring VM state iteratively. In addition, it can incur a long completion time and degrade application performance. This thesis explores ways to replicate VMs for HA and to migrate VMs quickly, with minimal execution disruption, while using resources efficiently. First, we investigate the tradeoffs in using different compression methods to reduce the network traffic of checkpoint replication in an HA system. We evaluate gzip, delta and similarity compression based on metrics that are specifically important in an HA system, and then suggest guidelines for their selection. Next, we propose HydraVM, a storage-based HA approach that eliminates the unproductive memory reservation made on backup hosts. HydraVM maintains a recent image of a protected VM in shared storage by taking and consolidating incremental VM checkpoints. When a failure occurs, HydraVM quickly resumes the execution of the failed VM by loading a small amount of essential VM state from the storage; as the VM executes, the VM state not yet loaded is supplied on demand. Finally, we propose application-assisted live migration, which skips the transfer of VM memory that is not needed to execute the running applications at the destination. We develop a generic framework for the proposed approach, and then use the framework to build JAVMM, a system that migrates VMs running Java applications while skipping the transfer of garbage in Java memory. Our evaluation results show that, compared to Xen live migration, which is agnostic of running applications, JAVMM can reduce the completion time, network traffic and application downtime caused by Java VM migration, all by up to over 90%.
    PhD thesis, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/111575/1/karenhou_1.pd
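    The compression tradeoff studied above can be illustrated with a minimal sketch (not the thesis's implementation; the page size and the use of zlib's DEFLATE as a stand-in for gzip are assumptions): delta compression keeps the previous checkpoint's copy of a page and compresses only the difference, which shrinks dramatically when few bytes change between checkpoints.

        import os
        import zlib

        PAGE_SIZE = 4096  # assumed checkpoint page size

        def gzip_compress(page: bytes) -> bytes:
            # Baseline: compress the full page without using any history.
            return zlib.compress(page)

        def delta_compress(prev_page: bytes, curr_page: bytes) -> bytes:
            # XOR against the previous checkpoint's copy, then compress
            # the (mostly zero) difference.
            diff = bytes(a ^ b for a, b in zip(prev_page, curr_page))
            return zlib.compress(diff)

        prev = os.urandom(PAGE_SIZE)          # page content at the last checkpoint
        curr = bytearray(prev)
        curr[100:116] = b"\xff" * 16          # only 16 bytes changed since then
        curr = bytes(curr)

        print("gzip only :", len(gzip_compress(curr)), "bytes")
        print("delta+gzip:", len(delta_compress(prev, curr)), "bytes")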

    Improving Application Performance in the Emerging Hyper-converged Infrastructure

    Get PDF
    University of Minnesota Ph.D. dissertation, April 2019. Major: Computer Science. Advisor: David Du. 1 computer file (PDF); viii, 118 pages.
    The hyper-converged infrastructure is emerging as a new type of infrastructure in which service providers deploy compute, network and storage services on inexpensive hardware rather than expensive proprietary hardware. It allows providers to customize the services they offer by deploying applications in Virtual Machines (VMs) or containers, while retaining control over all resources, including compute, network and storage. In this setting, improving application performance is an important issue. Throughout my Ph.D. research, I have studied how to improve the performance of applications in the emerging hyper-converged infrastructure, focusing on applications in VMs and in containers when accessing data, and on applications in the networked storage environment.
    In the hyper-converged infrastructure, administrators can provide desktop services by deploying a Virtual Desktop Infrastructure (VDI) application based on VMs. We first investigate how to identify the storage requirements of a VDI application and how to meet them with minimal storage resources. We create a model that describes the behavior of VDI and collect real VDI traces to populate this model. The model allows us to identify the storage requirements of VDI and determine the potential bottlenecks in storage. Based on this information, we can tell what capacity and minimum capability a storage system needs in order to support and satisfy a given VDI configuration. We show that our model can describe more fine-grained storage requirements of VDI than the rules of thumb currently used in industry.
    In the hyper-converged infrastructure, more and more applications run in containers. We design and implement a system, called k8sES (k8s Enhanced Storage), that efficiently supports applications with various storage SLOs (Service Level Objectives), along with all their other requirements, deployed in the container-based Kubernetes environment. Kubernetes (k8s) is a system for managing containerized applications across multiple hosts, but its current storage support for containerized applications is limited. To satisfy users' SLOs, k8s administrators must manually configure storage in advance, and users must know the configurations and capabilities of the different types of provided storage. In k8sES, storage resources are dynamically allocated based on users' requirements: given users' SLOs, k8sES selects a node and storage that can meet those requirements when scheduling applications. The storage allocation mechanism in k8sES also improves storage utilization efficiency. In addition, we provide a tool to monitor the I/O activities of both applications and storage devices in Kubernetes.
    With the ability to control client, network and storage through hyper-convergence, we study how to coordinate the different components along the I/O path to ensure latency SLOs for applications in the networked storage environment. We propose and implement JoiNS, a system that aims to ensure latency SLOs for applications that access data on remote networked storage. JoiNS carefully considers all the components along the I/O path and controls them in a coordinated fashion. It has global network and storage visibility through a logically centralized controller that continuously monitors the status of each involved component, and it adjusts the priority of I/Os in each component based on the latency SLO, network and storage status, time estimation, and the characteristics of each I/O request.
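    The k8sES scheduling idea described above can be sketched in a few lines of Python (hypothetical names and numbers, not the actual k8sES code): among the storage options that satisfy an application's latency SLO and capacity need, pick the one with the least latency headroom, so faster devices remain available for more demanding workloads.

        from dataclasses import dataclass
        from typing import List, Optional

        @dataclass
        class StorageOption:
            node: str
            storage_class: str     # e.g. "local-nvme", "remote-hdd"
            p99_latency_ms: float  # measured 99th-percentile latency
            free_gib: int

        def pick_storage(options: List[StorageOption],
                         latency_slo_ms: float,
                         capacity_gib: int) -> Optional[StorageOption]:
            # Keep only options that meet the SLO and have enough space,
            # then choose the one with the smallest latency slack.
            feasible = [o for o in options
                        if o.p99_latency_ms <= latency_slo_ms
                        and o.free_gib >= capacity_gib]
            return min(feasible,
                       key=lambda o: latency_slo_ms - o.p99_latency_ms,
                       default=None)

        options = [
            StorageOption("node-a", "local-nvme", 0.5, 200),
            StorageOption("node-b", "remote-ssd", 2.0, 500),
            StorageOption("node-c", "remote-hdd", 12.0, 2000),
        ]
        print(pick_storage(options, latency_slo_ms=5.0, capacity_gib=100))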

    Virtual Machine Lifecycle Management in Grid and Cloud Computing

    Get PDF
    Virtualization technology is the foundation of two important concepts: Virtualized Grid Computing and Cloud Computing. The former is an extension of classical Grid Computing; its goal is to satisfy the requirements of commercial Grid users regarding the isolation of concurrently executed batch jobs and the security of the associated data. To this end, applications are executed in virtual machines in order to isolate them from one another and to protect the data they process from other users. Virtualized Grid Computing also solves the problem of software provisioning, one of the open problems of classical Grid Computing. Cloud Computing is another concept for using remote resources. With respect to Cloud Computing, this dissertation focuses on the “Infrastructure as a Service” model, which combines ideas from (Virtualized) Grid Computing with a novel business model: virtual machines are provided on demand and only actual usage is charged. The use of virtualization technology increases the utilization of the underlying (physical) machines and simplifies their administration; for example, a virtual machine can be cloned, or a snapshot can be taken in order to return to a defined state later. However, not all problems related to virtualization technology have been solved. In particular, its use in the highly dynamic environments of Virtualized Grid Computing and Cloud Computing creates new challenges. This dissertation addresses various aspects of the use of virtualization technology in Virtualized Grid and Cloud Computing environments. First, the lifecycle of virtual machines in these environments is examined and models of this lifecycle are developed. Based on these models, problems are identified and solutions are developed, focusing on the storage, provisioning and execution of virtual machines. Virtual machines are usually stored as so-called disk images, i.e. images of virtual hard disks. This format affects not only the storage of large numbers of virtual machines but also their provisioning, and in the environments examined it has two concrete disadvantages: it wastes storage space and it prevents efficient provisioning of virtual machines. Measures that increase the security of virtual machines affect all three of the aforementioned areas; for example, before a virtual machine is provisioned, it should be checked whether the software installed in it is still up to date, and the execution environment should provide means to effectively monitor the virtual infrastructure. The first solution presented in this dissertation is the concept of image composition, which describes the composition of a combined disk image from multiple layers. Parts of individual layers that are used by several virtual machines can thus be shared among them, reducing the storage requirements of the virtual machines as a whole. The Marvin Image Compositor is the implementation of this concept.
    The second solution is the Marvin Image Store, a storage system for virtual machines that is not based on the traditionally used disk images but instead stores the contained data and metadata separately from each other in an efficient way. Furthermore, four solutions are presented that can improve the security of virtual machines. The Update Checker makes it possible to identify outdated software in virtual machines, regardless of whether the respective virtual machine is currently running or not. The second security solution makes it possible to centrally update multiple virtual machines that are based on the image composition concept: installing a new software version once is sufficient to bring several virtual machines up to date. The third security solution, called the Online Penetration Suite, makes it possible to automatically scan virtual machines for vulnerabilities. Monitoring the virtual infrastructure at all levels is the purpose of the fourth security solution; in addition to monitoring, it also enables an automatic reaction to security-relevant events. Finally, a method for migrating virtual machines is presented that enables efficient migration even without a central storage system.
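    The image composition concept can be illustrated with a short Python sketch (a generic illustration, not the Marvin Image Compositor itself): a VM's file view is the union of ordered layers, so layers shared by many VMs need to be stored only once.

        def compose_image(layers):
            # Compose a VM's file view from ordered layers
            # (base OS -> site software -> per-VM data); entries in later
            # layers override entries in earlier ones.
            view = {}
            for layer in layers:   # applied bottom to top
                view.update(layer)
            return view

        base_os   = {"/bin/sh": "sh-v1", "/etc/os-release": "distro 1.0"}
        site_apps = {"/opt/solver": "solver-2.3"}
        vm_a_data = {"/home/a/job.conf": "job A"}
        vm_b_data = {"/home/b/job.conf": "job B"}

        # Both VMs share the base_os and site_apps layers on disk;
        # only the small per-VM layers differ.
        vm_a = compose_image([base_os, site_apps, vm_a_data])
        vm_b = compose_image([base_os, site_apps, vm_b_data])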

    How to Use Litigation Technology to Prepare & Present Your Case at Trial October 27, 2021

    Get PDF
    Meeting proceedings of a seminar by the same name, held October 27, 2021.

    Introduction to Development Engineering

    Get PDF
    This open access textbook introduces the emerging field of Development Engineering and its constituent theories, methods, and applications. It is both a teaching text for students and a resource for researchers and practitioners engaged in the design and scaling of technologies for low-resource communities. The scope is broad, ranging from the development of mobile applications for low-literacy users to hardware and software solutions for providing electricity and water in remote settings. It is also highly interdisciplinary, drawing on methods and theory from the social sciences as well as engineering and the natural sciences. The opening section reviews the history of “technology-for-development” research, and presents a framework that formalizes this body of work and begins its transformation into an academic discipline. It identifies common challenges in development and explains the book’s iterative approach of “innovation, implementation, evaluation, adaptation.” Each of the next six thematic sections focuses on a different sector: energy and environment; market performance; education and labor; water, sanitation and health; digital governance; and connectivity. These thematic sections contain case studies from landmark research that directly integrates engineering innovation with technically rigorous methods from the social sciences. Each case study describes the design, evaluation, and/or scaling of a technology in the field and follows a single form, with common elements and discussion questions, to create continuity and pedagogical consistency. Together, they highlight successful solutions to development challenges, while also analyzing the rarely discussed failures. The book concludes by reiterating the core principles of development engineering illustrated in the case studies, highlighting common challenges that engineers and scientists will face in designing technology interventions that sustainably accelerate economic development. Development Engineering provides, for the first time, a coherent intellectual framework for attacking the challenges of poverty and global climate change through the design of better technologies. It offers the rigorous discipline needed to channel the energy of a new generation of scientists and engineers toward advancing social justice and improved living conditions in low-resource communities around the world.

    Enhancing security in public IaaS cloud systems through VM monitoring: a consumer’s perspective

    Get PDF
    Cloud computing is attractive for both consumers and providers, who benefit from potential economies of scale that reduce the cost of use (for consumers) and of operating the infrastructure (for providers). In the IaaS service deployment model of the cloud, consumers can launch their own virtual machines (VMs) on an infrastructure made available by a cloud provider, enabling a number of different applications to be hosted within the VM. The cloud provider generally has full control of and access to the VM, giving the provider the potential to access both VM configuration parameters and the hosted data. Trust between the consumer and the provider is key in this context and is generally assumed to exist; however, relying on this assumption alone can be limiting. We argue that the VM owner must have greater access to the operations that the provider carries out on their VM and greater visibility into how the VM and its data are stored and processed in the cloud. If the provider migrates a VM to another region without notifying the owner, this can raise privacy concerns; therefore, mechanisms must be in place to ensure that violations of confidentiality, integrity and the SLA do not happen.
    In this thesis, we present a number of contributions in the field of cloud security which aim at supporting trustworthy cloud computing. We propose monitoring of security-related VM events as a solution to some of the cloud security challenges, and present a system design and architecture to monitor such events in public IaaS cloud systems. To enable focused monitoring, we propose a taxonomy of security-related VM events. The architecture is supported by a prototype implementation of the monitoring tool, called VMInformant, which keeps the user informed and alerted about various events that have taken place on their VM. The tool was evaluated to learn about the performance and storage overheads associated with monitoring such events, using CPU- and I/O-intensive benchmarks. Since events in multiple VMs belonging to the same owner may be related, we propose the architecture of a system, called Inspector Station, to aggregate and analyse events from multiple VMs. This system enables the consumer: (1) to learn about the overall security status of multiple VMs; (2) to find patterns in the events; and (3) to make informed decisions related to security. To ensure that VMs are not migrated to another region without notifying the owner, we propose a hybrid approach which combines multiple metrics to estimate the likelihood of a migration event. The technical aspects in this thesis are backed up by practical experiments that evaluate the approaches in real public IaaS cloud systems, e.g. Amazon AWS and Google Cloud Platform. We argue that having this level of transparency is essential to improving the trust between a cloud consumer and provider, especially in the context of a public cloud system.
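    The hybrid migration-detection idea can be sketched as follows (the metric names, weights and values are illustrative assumptions, not the thesis's actual metrics): normalize the change of each probe relative to its baseline and combine the changes into a single likelihood score.

        def migration_likelihood(baseline, current, weights=None):
            # Combine relative changes in several probes (e.g. RTT to fixed
            # reference hosts, local disk read latency) into a 0..1 score; a
            # large combined change suggests the VM may have been moved to
            # another region.
            weights = weights or {name: 1.0 for name in baseline}
            score, total = 0.0, 0.0
            for name, base in baseline.items():
                change = abs(current[name] - base) / max(base, 1e-9)
                score += weights[name] * min(change, 1.0)
                total += weights[name]
            return score / total

        baseline = {"rtt_eu_ms": 12.0, "rtt_us_ms": 95.0, "disk_read_ms": 0.4}
        current  = {"rtt_eu_ms": 88.0, "rtt_us_ms": 15.0, "disk_read_ms": 0.4}
        print(f"migration likelihood: {migration_likelihood(baseline, current):.2f}")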