Search CORE

25 research outputs found

Enabling Hyperscale Web Services

Author: Sriraman Akshitha
Publication venue
Publication date: 01/01/2021
Field of study

Modern web services such as social media, online messaging, web search, video streaming, and online banking often support billions of users, requiring data centers that scale to hundreds of thousands of servers, i.e., hyperscale. In fact, the world continues to expect hyperscale computing to drive more futuristic applications such as virtual reality, self-driving cars, conversational AI, and the Internet of Things. This dissertation presents technologies that will enable tomorrow’s web services to meet the world’s expectations. The key challenge in enabling hyperscale web services arises from two important trends. First, over the past few years, there has been a radical shift in hyperscale computing due to an unprecedented growth in data, users, and web service software functionality. Second, modern hardware can no longer support this growth in hyperscale trends due to a decline in hardware performance scaling. To enable this new hyperscale era, hardware architects must become more aware of hyperscale software needs and software researchers can no longer expect unlimited hardware performance scaling. In short, systems researchers can no longer follow the traditional approach of building each layer of the systems stack separately. Instead, they must rethink the synergy between the software and hardware worlds from the ground up. This dissertation establishes such a synergy to enable futuristic hyperscale web services. This dissertation bridges the software and hardware worlds, demonstrating the importance of that bridge in realizing efficient hyperscale web services via solutions that span the systems stack. The specific goal is to design software that is aware of new hardware constraints and architect hardware that efficiently supports new hyperscale software requirements. This dissertation spans two broad thrusts: (1) a software and (2) a hardware thrust to analyze the complex hyperscale design space and use insights from these analyses to design efficient cross-stack solutions for hyperscale computation. In the software thrust, this dissertation contributes uSuite, the first open-source benchmark suite of web services built with a new hyperscale software paradigm, that is used in academia and industry to study hyperscale behaviors. Next, this dissertation uses uSuite to study software threading implications in light of today’s hardware reality, identifying new insights in the age-old research area of software threading. Driven by these insights, this dissertation demonstrates how threading models must be redesigned at hyperscale by presenting an automated approach and tool, uTune, that makes intelligent run-time threading decisions. In the hardware thrust, this dissertation architects both commodity and custom hardware to efficiently support hyperscale software requirements. First, this dissertation characterizes commodity hardware’s shortcomings, revealing insights that influenced commercial CPU designs. Based on these insights, this dissertation presents an approach and tool, SoftSKU, that enables cheap commodity hardware to efficiently support new hyperscale software paradigms, improving the efficiency of real-world web services that serve billions of users, saving millions of dollars, and meaningfully reducing the global carbon footprint. This dissertation also presents a hardware-software co-design, uNotify, that redesigns commodity hardware with minimal modifications by using existing hardware mechanisms more intelligently to overcome new hyperscale overheads. Next, this dissertation characterizes how custom hardware must be designed at hyperscale, resulting in industry-academia benchmarking efforts, commercial hardware changes, and improved software development. Based on this characterization’s insights, this dissertation presents Accelerometer, an analytical model that estimates gains from hardware customization. Multiple hyperscale enterprises and hardware vendors use Accelerometer to make well-informed hardware decisions.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/169802/1/akshitha_1.pd

Deep Blue Documents at the University of Michigan

Recommended from our members

New Container Architectures for Mobile, Drone, and Cloud Computing

Author: Van't Hof Alexander Edward
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2023
Field of study

Containers are increasingly used across many different types of computing to isolate and control apps while efficiently sharing computing resources. By using lightweight operating system virtualization, they can provide apps with a virtual computing abstraction while imposing minimal hardware requirements and a small footprint. My thesis is that new container architectures can provide additional functionality, better resource utilization, and stronger security for mobile, drone, and cloud computing. To demonstrate this, we introduce three new container architectures that enable new mobile app migration functionality, a new notion of virtual drones and efficient utilization of drone hardware, and stronger security for cloud computing by protecting containers against untrusted operating systems. First, we introduce Flux to support multi-surface apps, apps that seamlessly run across multiple user devices, through app migration. Flux introduces two key mechanisms to overcome device heterogeneity and residual dependencies associated with app migration to enable app migration. Selective Record/Adaptive Replay to record just those device-agnostic app calls that lead to the generation of app-specific device-dependent state in services and replay them on the target. Checkpoint/Restore in Android (CRIA) to transition an app into a state in which device-specific information the app contains can be safely discarded before checkpointing and restoring the app within a containerized environment on the new device. Second, we introduce AnDrone, a drone-as-a-service solution that makes drones accessible in the cloud. AnDrone provides a drone virtualization architecture to leverage the fact that computational costs are cheap compared to the operational and energy costs of putting a drone in the air. This enables multiple virtual drones to run simultaneously on the same physical drone at very little additional cost. To enable multiple virtual drones to run in an isolated and secure manner, each virtual drone runs its own containerized operating system instance. AnDrone introduces a new device container architecture, providing virtual drones with secure access to a full range of drone hardware devices, including sensors such as cameras and geofenced flight control. Finally, we introduce BlackBox, a new container architecture that provides fine-grain protection of application data confidentiality and integrity without the need to trust the operating system. BlackBox introduces a container security monitor, a small trusted computing base that creates separate and independent physical address spaces for each container, such that there is no direct information flow from container to operating system or other container physical address spaces. Containerized apps do not need to be modified, can still make full use of operating system services via system calls, yet their CPU and memory state are isolated and protected from other containers and the operating system

Columbia University Academic Commons

Execution Environments for Running Legacy Applications in Multi-Party Trust Settings

Author: Herwig Stephen Mark
Publication venue
Publication date: 01/01/2021
Field of study

Applications often assume that the same party owns all of the application’s resources, and that these resources require the same level of privacy. This assumption no longer holds when organizations outsource applications to a third-party cloud, or when the application requires access to not only public content, but private configuration, such as authentication and keying material. The result of this broken assumption is that applications either must be re-written to accommodate each new security posture, or used as-is, accepting that one party exposes private data to another. In this dissertation, I argue the following thesis: it is possible to run legacy application binaries with confidentiality and integrity guarantees that reflect a multi-party trust setting. I support this thesis through the design, implementation, and evaluation of two distinct application-level virtualization layers that handle trust concerns on behalf of the application: conclaves and SecureMigration. Conclaves assume the availability of Intel SGX secure hardware enclaves and extend prior work in developing runtimes that execute legacy applications within an enclave. In contrast, SecureMigration does not use secure hardware, but rather composes information flow control with process migration to execute a process across multiple physical machines owned and operated by distinct principals, while shielding each principal’s sensitive portion of the process from its peers

Digital Repository at the University of Maryland

High Performance Web Servers: A Study In Concurrent Programming Models

Author: Radhakrishnan Srihari
Publication venue: 'University of Waterloo'
Publication date: 14/05/2019
Field of study

With the advent of commodity large-scale multi-core computers, the performance of software running on these computers has become a challenge to researchers and enterprise developers. While academic research and industrial products have moved in the direction of writing scalable and highly available services using distributed computing, single machine performance remains an active domain, one which is far from saturated. This thesis selects an archetypal software example and workload in this domain, and describes software characteristics affecting performance. The example is highly-parallel web-servers processing a static workload. Particularly, this work examines concurrent programming models in the context of high-performance web-servers across different architectures — threaded (Apache, Go and μKnot), event-driven (Nginx, μServer) and staged (WatPipe) — compared with two static workloads in two different domains. The two workloads are a Zipf distribution of file sizes representing a user session pulling an assortment of many small and a few large files, and a 50KB file representing chunked streaming of a large audio or video file. Significant effort is made to fairly compare eight web-servers by carefully tuning each via their adjustment parameters. Tuning plays a significant role in workload-specific performance. The two domains are no disk I/O (in-memory file set) and medium disk I/O. The domains are created by lowering the amount of RAM available to the web-server from 4GB to 2GB, forcing files to be evicted from the file-system cache. Both domains are also restricted to 4 CPUs. The primary goal of this thesis is to examine fundamental performance differences between threaded and event-driven concurrency models, with particular emphasis on user-level threading models. Additionally, a secondary goal of the work is to examine high-performance software under restricted hardware environments. Over-provisioned hardware environments can mask architectural and implementation shortcomings in software – the hypothesis in this work is that restricting resources stresses the application, bringing out important performance characteristics and properties. Experimental results for the given workload show that memory pressure is one of the most significant factors for the degradation of web-server performance, because it forces both the onset and amount of disk I/O. With an ever increasing need to support more content at faster rates, a web-server relies heavily on in-memory caching of files and related content. In fact, personal and small business web-servers are even run on minimal hardware, like the Raspberry Pi, with only 1GB of RAM and a small SD card for the file system. Therefore, understanding behaviour and performance in restricted contexts should be a normal aspect of testing a web server (and other software systems)

University of Waterloo's Institutional Repository

Event-driven servers using asynchronous, non-blocking network I/O: Performance evaluation of kqueue and epoll

Author: Leonard Lorcan
Publication venue: Technological University Dublin
Publication date: 01/01/2021
Field of study

This research project evaluates the performance of kqueue and epoll in the context of event-driven servers. The evaluation is done through benchmarking and tracing which are used to measure throughput and execution time respectively. The experiment is repeated for both a virtualised and native server environment. The results from the experiment are statistically analysed and compared. These results show significant differences between kqueue and epoll, and a profound impact of virtualisation as a variable

Arrow@TUDublin

Direct Inter-Process Communication (dIPC): Repurposing the CODOMs architecture to accelerate IPC

Author: Etsion Yoav
Jordà Peroliu Marc
Navarro Nacho
Valero Cortés Mateo
Vilanova Lluis
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017
Field of study

In current architectures, page tables are the fundamental mechanism that allows contemporary OSs to isolate user processes, binding each thread to a specific page table. A thread cannot therefore directly call another process's function or access its data; instead, the OS kernel provides data communication primitives and mediates process synchronization through inter-process communication (IPC) channels, which impede system performance. Alternatively, the recently proposed CODOMs architecture provides memory protection across software modules. Threads can cross module protection boundaries inside the same process using simple procedure calls, while preserving memory isolation. We present dIPC (for "direct IPC"), an OS extension that repurposes and extends the CODOMs architecture to allow threads to cross process boundaries. It maps processes into a shared address space, and eliminates the OS kernel from the critical path of inter-process communication. dIPC is 64.12× faster than local remote procedure calls (RPCs), and 8.87× faster than IPC in the L4 microkernel. We show that applying dIPC to a multi-tier OLTP web server improves performance by up to 5.12× (2.13× on average), and reaches over 94% of the ideal system efficiency.We thank Diego Marr´on for helping with MariaDB, the anonymous reviewers for their feedback and, especially, Andrew Baumann for helping us improve the paper. This research was partially funded by HiPEAC through a collaboration grant for Lluís Vilanova (agreement number 687698 for the EU’s Horizon2020 research and innovation programme), the Israel Science Fundation (ISF grant 769/12) and the Israeli Ministry of Science, Technology and Space.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

Universal Skeptic Binder-Droid - Towards Arresting Malicious Communication of Colluding Apps in Android

Author: Nadimpalli Srinath
Publication venue
Publication date: 21/09/2015
Field of study

Since its first release, Android has been increasingly adopted by people and companies worldwide. It is currently estimated that around 1.1 billion Android devices are in use. Even though Android was built with Security principles and comes with a sound security model, it is a favorite target for malware authors. McAfee observed a 76% year on year growth in Android malware during the year 2014 alone. Thus malware is a predominant threat in Android ecosystem. A common attack vector for Android malware is the use of colluding apps. Colluding apps involve two or more applications and operate in two phases. In Phase 1, one application steals private sensitive data of the user In Phase 2, the same application sends the data to another application via covert communication channels. There are several covert channels in Android frameworks. Until now, several solutions in the literature have focused on preventing the extraction of sensitive data from the phone. We, to the best of our knowledge, are the first to stop the flow of sensitive info via the covert channels. We propose, Universal Skeptic Binder-Droid, an enhanced Binder module which enforces policies regarding the use of communication channels and prevents apps from colluding. With our proposed system, we have the added advantage of dynamically configuring policies at run time. Our initial implementation and results on our test bed reflect on the effectiveness and the ease of use of such a system

Texas A&M Repository

Automating Seccomp Filter Generation for Linux Applications

Author: Canella Claudio
Gruss Daniel
Schwarz Michael
Werner Mario
Publication venue
Publication date: 04/12/2020
Field of study

Software vulnerabilities in applications undermine the security of applications. By blocking unused functionality, the impact of potential exploits can be reduced. While seccomp provides a solution for filtering syscalls, it requires manual implementation of filter rules for each individual application. Recent work has investigated automated approaches for detecting and installing the necessary filter rules. However, as we show, these approaches make assumptions that are not necessary or require overly time-consuming analysis. In this paper, we propose Chestnut, an automated approach for generating strict syscall filters for Linux userspace applications with lower requirements and limitations. Chestnut comprises two phases, with the first phase consisting of two static components, i.e., a compiler and a binary analyzer, that extract the used syscalls during compilation or in an analysis of the binary. The compiler-based approach of Chestnut is up to factor 73 faster than previous approaches without affecting the accuracy adversely. On the binary analysis level, we demonstrate that the requirement of position-independent binaries of related work is not needed, enlarging the set of applications for which Chestnut is usable. In an optional second phase, Chestnut provides a dynamic refinement tool that allows restricting the set of allowed syscalls further. We demonstrate that Chestnut on average blocks 302 syscalls (86.5%) via the compiler and 288 (82.5%) using the binary-level analysis on a set of 18 widely used applications. We found that Chestnut blocks the dangerous exec syscall in 50% and 77.7% of the tested applications using the compiler- and binary-based approach, respectively. For the tested applications, Chestnut prevents exploitation of more than 62% of the 175 CVEs that target the kernel via syscalls. Finally, we perform a 6 month long-term study of a sandboxed Nginx server

arXiv.org e-Print Archive

CISPA – Helmholtz-Zentrum für Informationssicherheit

Analyse de maliciels sur Android par l'analyse de la mémoire vive

Author: Lebel Bernard
Publication venue
Publication date: 24/05/2018
Field of study

Les plateformes mobiles font partie intégrante du quotidien. Leur flexibilité a permis aux développeurs d’applications d’y proposer des applications de toutes sortes : productivité, jeux, messageries, etc. Devenues des outils connectés d’agrégation d’informations personnelles et professionnelles, ces plateformes sont perçues comme un écosystème lucratif par les concepteurs de maliciels. Android est un système d’exploitation libre de Google visant le marché des appareils mobiles et est l’une des cibles de ces attaques, en partie grâce à la popularité de celuici. Dans la mesure où les maliciels Android constituent une menace pour les consommateurs, il est essentiel que la recherche visant l’analyse de maliciels s’intéresse spécifiquement à cette plateforme mobile. Le travail réalisé dans le cadre de cette maîtrise s’est intéressé à cette problématique, et plus spécifiquement par l’analyse de la mémoire vive. À cette fin, il a fallu s’intéresser aux tendances actuelles en matière de maliciels sur Android et les approches d’analyses statiques et dynamiques présentes dans la littérature. Il a été, par la suite, proposé d’explorer l’analyse de la mémoire vive appliquée à l’analyse de maliciels comme un complément aux approches actuelles. Afin de démontrer l’intérêt de l’approche pour la plateforme Android, une étude de cas a été réalisée où un maliciel expérimental a été conçu pour exprimer les comportements malicieux problématiques pour la plupart des approches relevées dans la littérature. Une approche appelée l’analyse différentielle de la mémoire vive a été présentée afin de faciliter l’analyse. Cette approche utilise le résultat de la différence entre les éléments présents après et avant le déploiement du maliciel pour réduire la quantité d’éléments à analyser. Les résultats de cette étude ont permis de démontrer que l’approche est prometteuse en tant que complément aux approches actuelles. Il est recommandé qu’elle soit le sujet d’études subséquentes afin de mieux détecter les maliciels sur Android et d’en automatiser son application.Mobile devices are at the core of modern society. Their versatility has allowed third-party developers to generate a rich experience for the user through mobile apps of all types (e.g. productivity, games, communications). As mobile platforms have become connected devices that gather nearly all of our personal and professional information, they are seen as a lucrative market by malware developers. Android is an open-sourced operating system from Google targeting specifically the mobile market and has been targeted by malicious activity due the widespread adoption of the latter by the consumers. As Android malwares threaten many consumers, it is essential that research in malware analysis address specifically this mobile platform. The work conducted during this Master’s focuses on the analysis of malwares on the Android platform. This was achieved through a literature review of the current malware trends and the approaches in static and dynamic analysis that exists to mitigate them. It was also proposed to explore live memory forensics applied to the analysis of malwares as a complement to existing methods. To demonstrate the applicability of the approach and its relevance to the Android malwares, a case study was proposed where an experimental malware has been designed to express malicious behaviours difficult to detect through current methods. The approach explored is called differential live memory analysis. It consists of analyzing the difference in the content of the live memory before and after the deployment of the malware. The results of the study have shown that this approach is promising and should be explored in future studies as a complement to current approaches

CorpusUL