    Security comparison of ownCloud, Nextcloud, and Seafile in open source cloud storage solutions

    Cloud storage has become one of the most efficient and economical ways to store data over the web. Although most organizations have adopted cloud storage, numerous privacy and security concerns remain about cloud storage and collaboration. Furthermore, adopting public cloud storage may be costly for many enterprises. An open-source cloud storage solution for cloud file sharing is a possible alternative in this instance. Despite widespread awareness of such solutions, there is limited information on their system architecture, security measures, and overall throughput consequences to guide selection. No comprehensive comparisons are available that evaluate open-source cloud storage solutions (specifically ownCloud, Nextcloud, and Seafile) and analyze the impact of platform selection. This thesis presents the concept of cloud storage and a detailed account of three popular open-source solutions: their features, architecture, security features, vulnerabilities, and other aspects. The goal of the study is to compare these cloud solutions so that users can better understand the various open-source cloud storage solutions and make better-informed selections. The author focuses on four attributes of the three cloud storage solutions (ownCloud, Nextcloud, and Seafile): features, architecture, security, and vulnerabilities, since most of the critical issues fall into one of these categories. The findings show that, while the three services take slightly different approaches to confidentiality, integrity, and availability, they all achieve the same purpose. As a result of this research, users will have a better understanding of these factors and will be able to make a more informed decision on cloud storage options.

    Evaluation of Serverless Computing Frameworks Based on Kubernetes

    Recent advancements in virtualization and software architectures have led to the birth of the new paradigm of serverless computing. Serverless computing, also known as function-as-a-service, allows developers to deploy functions as computing units without worrying about the underlying infrastructure. Moreover, no resources are allocated or billed until a function is invoked. Thus, the major benefits of serverless computing are reduced developer concern about infrastructure, reduced time to market and lower cost. Currently, serverless computing is generally available through various public cloud service providers. However, public cloud platforms have certain drawbacks, such as vendor lock-in, computation restrictions and regulatory restrictions. Thus, there is growing interest in implementing serverless computing on private infrastructure. One of the preferred ways of implementing serverless computing is through the use of containers. A container-based solution makes it possible to use features of existing orchestration frameworks, such as Kubernetes. This thesis discusses the implementation of serverless computing on Kubernetes. To this end, we carry out a feature evaluation of four open source serverless computing frameworks, namely Kubeless, OpenFaaS, Fission and OpenWhisk. Based on predefined criteria, we select Kubeless, Fission and OpenFaaS for further evaluation. First, we describe the developer experience on each framework. Next, we compare three different modes in which OpenFaaS functions are executed: HTTP, serializing and streaming. We evaluate the response time of function invocation and the ease of monitoring and managing logs. We find that HTTP mode is the preferred mode for OpenFaaS. Finally, we evaluate the performance of the considered frameworks under different workloads. We find that Kubeless has the best performance among the three frameworks, both in terms of response time and the ratio of successful responses.

    VISOR: virtual machine images management service for cloud infrastructures

    Cloud Computing is a relatively novel paradigm that aims to fulfill the dream of computing as a utility. It emerged to make it possible to provide computing resources (such as servers, storage and networks) as a service and on demand, making them accessible through common Internet protocols. Through cloud offers, users only need to pay for the amount of resources they need and for the time they use them. Virtualization is the cloud's key technology, acting upon virtual machine images to deliver fully functional virtual machine instances. Therefore, virtual machine images play an important role in Cloud Computing and their efficient management becomes a key concern that should be carefully addressed. To tackle this requirement, most cloud offers provide their own image repository, where images are stored and from which they are retrieved in order to instantiate new virtual machines. However, the rise of Cloud Computing has brought new problems in managing large collections of images. Existing image repositories are not able to efficiently manage, store and catalogue virtual machine images from other clouds through the same centralized service repository. This becomes especially important when considering the management of multiple heterogeneous cloud offers. In fact, despite the hype around Cloud Computing, there are still barriers to its widespread adoption. Among them, cloud interoperability is one of the most notable issues. Interoperability limitations arise from the fact that current cloud offers provide proprietary interfaces, and their services are tied to their own requirements. Therefore, when dealing with multiple heterogeneous clouds, users face hard-to-manage integration and compatibility issues. The management and delivery of virtual machine images across different clouds is an example of such interoperability constraints. This dissertation presents VISOR, a cloud-agnostic virtual machine image management service and repository.
Our work on VISOR aims to provide a service that is not designed to fit a specific cloud offer but rather to overcome sharing and interoperability limitations among different clouds. With VISOR, the management of cloud interoperability can be seamlessly abstracted from the details of the underlying procedures. In this way, it aims to provide users with the ability to manage and expose virtual machine images across heterogeneous clouds through the same generic and centralized repository and management service. VISOR is open source software with a community-driven development process, so it can be freely customized and further improved by anyone. The tests conducted to evaluate its performance and resource usage have shown VISOR to be a stable and high-performance service, even when compared with other services already in production. Lastly, placing clouds as the main target audience is not a limitation for other use cases. In fact, virtualization and virtual machine images are not exclusively linked to cloud environments. Therefore, given the service's agnostic design, it is possible to adapt it to other usage scenarios as well.

    On the Efficient Design and Testing of Dependable Systems Software

    Modern computing systems that enable increasingly smart and complex applications permeate our daily lives. We strive for a fully connected and automated world to simplify our lives and increase comfort by offloading tasks to smart devices and systems. We have become dependent on the complex and ever-growing ecosystem of software that drives the innovations of our smart technologies. With this dependence on complex software systems arises the question of whether these systems are dependable, i.e., whether we can actually trust them to perform their intended functions. As software is developed by human beings, it must be expected to contain faults, and we need strategies and techniques that minimize both the number of faults and the severity of their impact, and that scale with the increase in software complexity. Common approaches to achieve dependable operation include fault acceptance and fault avoidance strategies. The former gracefully handle faults when they occur during operation, e.g., by isolating and restarting faulty components, whereas the latter try to remove faults before system deployment, e.g., by applying correctness testing and software fault injection (SFI) techniques. Against this background, this thesis aims at improving the efficiency of fault isolation for operating system kernel components, which are especially critical for dependable operation, as well as at improving the efficiency of dynamic testing activities to cope with the increasing complexity of software. Using the widely used Linux kernel, we demonstrate that partial fault isolation techniques for kernel software components can be enhanced with dynamic runtime profiles to strike a balance between the expected overheads imposed by the isolation mechanism and the achieved degree of isolation according to user requirements.
With the increase in software complexity, comprehensive correctness and robustness assessments using testing and SFI require a substantially increasing number of individual tests whose execution requires a considerable amount of time. Considering different levels of the software stack, we study whether modern parallel hardware can be employed to mitigate this increase. In particular, we demonstrate that SFI tests can benefit from parallel execution if such tests are carefully designed and conducted. We furthermore introduce a novel SFI framework to efficiently conduct such experiments. Moreover, we investigate whether existing test suites for correctness testing can already benefit from parallel execution and provide an approach that offers a migration path for test suites that were not originally designed for parallel execution.

    Money & Trust in Digital Society, Bitcoin and Stablecoins in ML enabled Metaverse Telecollaboration

    We present a state-of-the-art and positioning book about digital society tools, namely Web3, Bitcoin, Metaverse, AI/ML, accessibility, safeguarding and telecollaboration. A high-level overview of Web3 technologies leads to a description of blockchain, and the Bitcoin network is specifically selected for detailed examination. Suitable components of the extended Bitcoin ecosystem are described in more depth. Other mechanisms for native digital value transfer are described, with a focus on 'money'. Metaverse technology is surveyed, primarily from the perspective of Bitcoin and extended reality. Bitcoin is selected as the best contender for value transfer in metaverses because of its free and open source nature and its network effect. Challenges and risks of this approach are identified. A deployment guide for a cloud-deployable, virtual-machine-based technology stack, with a focus on cybersecurity best practice, can be downloaded from GitHub to experiment with the technologies. This deployable lab is designed to inform the development of secure value transactions for small and medium-sized companies.

    A systematic review on cloud testing

    A systematic literature review is presented that surveyed the topic of cloud testing over the period 2012-2017. Cloud testing can refer either to testing cloud-based systems (testing of the cloud), or to leveraging the cloud for testing purposes (testing in the cloud): both approaches (and their combination into testing of the cloud in the cloud) have drawn research interest. An extensive paper search was conducted by both automated querying of popular digital libraries and snowballing, which resulted in the final selection of 147 primary studies. Over the course of the survey, a framework was incrementally derived that classifies cloud testing research along six main areas and their topics. The paper includes a detailed analysis of the selected primary studies to identify trends and gaps, as well as an extensive report of the state of the art as it emerges from answering the identified Research Questions. We find that cloud testing is an active research field, although not all topics have so far received enough attention, and conclude by presenting the most relevant open research challenges for each area of the classification framework. This paper describes research work mostly undertaken in the context of the European Project H2020 731535: ElasTest. This work has also been partially supported by: the Italian MIUR PRIN 2015 Project: GAUSS; the Regional Government of Madrid (CM) under project Cloud4BigData (S2013/ICE-2894) cofunded by FSE & FEDER; and the Spanish Government under project LERNIM (RTC-2016-4674-7) cofunded by the Ministry of Economy and Competitiveness, FEDER & AEI.

    Programming and parallelising applications for distributed infrastructures

    The last decade has witnessed unprecedented changes in parallel and distributed infrastructures. Due to the diminished gains in processor performance from increasing clock frequency, manufacturers have moved from uniprocessor architectures to multicores; as a result, clusters of computers have incorporated such new CPU designs. Furthermore, the ever-growing need of scientific applications for computing and storage capabilities has motivated the appearance of grids: geographically-distributed, multi-domain infrastructures based on sharing of resources to accomplish large and complex tasks. More recently, clouds have emerged by combining virtualisation technologies, service-orientation and business models to deliver IT resources on demand over the Internet. The size and complexity of these new infrastructures pose a challenge for programmers who want to exploit them. On the one hand, some of the difficulties are inherent to concurrent and distributed programming themselves, e.g. dealing with thread creation and synchronisation, messaging, data partitioning and transfer, etc. On the other hand, other issues are related to the singularities of each scenario, like the heterogeneity of Grid middleware and resources or the risk of vendor lock-in when writing an application for a particular Cloud provider. In the face of such a challenge, programming productivity - understood as a tradeoff between programmability and performance - has become crucial for software developers. There is a strong need for high-productivity programming models and languages, which should provide simple means for writing parallel and distributed applications that can run on current infrastructures without sacrificing performance. In that sense, this thesis contributes Java StarSs, a programming model and runtime system for developing and parallelising Java applications on distributed infrastructures.
The model has two key features: first, the user programs in a fully-sequential standard-Java fashion - no parallel construct, API call or pragma must be included in the application code; second, it is completely infrastructure-unaware, i.e. programs do not contain any details about deployment or resource management, so that the same application can run in different infrastructures with no changes. The only requirement for the user is to select the application tasks, which are the model's unit of parallelism. Tasks can be either regular Java methods or web service operations, and they can handle any data type supported by the Java language, namely files, objects, arrays and primitives. For the sake of simplicity of the model, Java StarSs shifts the burden of parallelisation from the programmer to the runtime system. The runtime is responsible for modifying the original application to make it create asynchronous tasks and synchronise data accesses from the main program. Moreover, the implicit inter-task concurrency is automatically found as the application executes, thanks to a data dependency detection mechanism that integrates all the Java data types. This thesis provides a fairly comprehensive evaluation of Java StarSs on three different distributed scenarios: Grid, Cluster and Cloud. For each of them, a runtime system was designed and implemented to exploit their particular characteristics as well as to address their issues, while keeping the infrastructure unawareness of the programming model. The evaluation compares Java StarSs against state-of-the-art solutions, both in terms of programmability and performance, and demonstrates how the model can bring remarkable productivity to programmers of parallel distributed applications.
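The data dependency detection mentioned in the abstract above can be illustrated with a minimal sketch. It is not the Java StarSs runtime: the `Task` record, the `find_dependencies` helper and the sample program are hypothetical, and Python is used only for brevity. The sketch assumes each task declares the data it reads and writes, and it orders two tasks whenever one writes a datum that the other reads or writes.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record: a name plus the data it reads and writes."""
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def find_dependencies(tasks):
    """Return (earlier, later) pairs that must keep their program order.

    Two tasks conflict when one writes a datum the other reads or writes
    (read-after-write, write-after-read, write-after-write).
    """
    edges = []
    for i, later in enumerate(tasks):
        for earlier in tasks[:i]:
            raw = earlier.writes & later.reads
            war = earlier.reads & later.writes
            waw = earlier.writes & later.writes
            if raw or war or waw:
                edges.append((earlier.name, later.name))
    return edges


# A sequential program expressed as an ordered list of tasks (illustrative).
program = [
    Task("load", writes=frozenset({"a"})),
    Task("scale", reads=frozenset({"a"}), writes=frozenset({"b"})),
    Task("count", reads=frozenset({"a"}), writes=frozenset({"c"})),
    Task("merge", reads=frozenset({"b", "c"}), writes=frozenset({"d"})),
]
deps = find_dependencies(program)
```

In this example `scale` and `count` share no conflicting data, so no edge connects them and a runtime would be free to execute them concurrently once `load` has finished.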

    Memory Subsystem Optimization for Efficient System Resource Utilization in Data-Intensive Applications

    Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2020. 염헌영. With explosive data growth, data-intensive applications, such as relational databases and key-value stores, have become increasingly popular in a variety of domains in recent years. To meet the growing performance demands of data-intensive applications, it is crucial to efficiently and fully utilize memory resources for the best possible performance. However, general-purpose operating systems (OSs) are designed to provide system resources fairly, at system level, to all applications running on a system. Because of this system-level fairness, a single application may find it difficult to fully exploit the system's best performance. For performance reasons, many data-intensive applications implement their own versions of mechanisms that OSs already provide, under the assumption that they know more about their data than the OS does. They can be greedily optimized for performance, but this may result in inefficient use of system resources. In this dissertation, we claim that simple OS support with minor application modifications can yield even higher application performance without sacrificing system-level resource utilization. We optimize and extend the OS memory subsystem to better support applications while addressing three memory-related issues in data-intensive applications. First, we introduce a memory-efficient cooperative caching approach between the application and the kernel buffer to address the double caching problem, where the same data resides in multiple layers. Second, we present a memory-efficient, transparent zero-copy read I/O scheme to avoid the performance interference problem caused by memory copy behavior during I/O. Third, we propose a memory-efficient fork-based checkpointing mechanism for in-memory database systems to mitigate the memory footprint problem of the existing fork-based checkpointing scheme, in which memory usage increases incrementally (up to 2x) during checkpointing for update-intensive workloads.
To show the effectiveness of our approach, we implement and evaluate our schemes on real multi-core systems. The experimental results demonstrate that our cooperative approach can more effectively address the above issues related to data-intensive applications than existing non-cooperative approaches while delivering better performance (in terms of transaction processing speed, I/O throughput, or memory footprint).
Contents:
Chapter 1 Introduction: 1.1 Motivation (1.1.1 Importance of Memory Resources; 1.1.2 Problems); 1.2 Contributions; 1.3 Outline.
Chapter 2 Background: 2.1 Linux Kernel Memory Management (2.1.1 Page Cache; 2.1.2 Page Reclamation; 2.1.3 Page Table and TLB Shootdown; 2.1.4 Copy-on-Write); 2.2 Linux Support for Applications (2.2.1 fork; 2.2.2 madvise; 2.2.3 Direct I/O; 2.2.4 mmap).
Chapter 3 Memory Efficient Cooperative Caching: 3.1 Motivation (3.1.1 Problems of Existing Datastore Architecture; 3.1.2 Proposed Architecture); 3.2 Related Work; 3.3 Design and Implementation (3.3.1 Overview; 3.3.2 Kernel Support; 3.3.3 Migration to DBIO); 3.4 Evaluation (3.4.1 System Configuration; 3.4.2 Methodology; 3.4.3 TPC-C Benchmarks; 3.4.4 YCSB Benchmarks); 3.5 Summary.
Chapter 4 Memory Efficient Zero-copy I/O: 4.1 Motivation (4.1.1 The Problems of Copy-Based I/O); 4.2 Related Work (4.2.1 Zero Copy I/O; 4.2.2 TLB Shootdown; 4.2.3 Copy-on-Write); 4.3 Design and Implementation (4.3.1 Prerequisites for z-READ; 4.3.2 Overview of z-READ; 4.3.3 TLB Shootdown Optimization; 4.3.4 Copy-on-Write Optimization; 4.3.5 Implementation); 4.4 Evaluation (4.4.1 System Configurations; 4.4.2 Effectiveness of the TLB Shootdown Optimization; 4.4.3 Effectiveness of CoW Optimization; 4.4.4 Analysis of the Performance Improvement; 4.4.5 Performance Interference Intensity; 4.4.6 Effectiveness of z-READ in Macrobenchmarks); 4.5 Summary.
Chapter 5 Memory Efficient Fork-based Checkpointing: 5.1 Motivation (5.1.1 Fork-based Checkpointing; 5.1.2 Approach); 5.2 Related Work; 5.3 Design and Implementation (5.3.1 Overview; 5.3.2 OS Support; 5.3.3 Implementation); 5.4 Evaluation (5.4.1 Experimental Setup; 5.4.2 Performance); 5.5 Summary.
Chapter 6 Conclusion.
Abstract in Korean.
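The baseline fork-based checkpointing scheme that the dissertation sets out to improve can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's mechanism: the `checkpoint` helper, the dictionary store and the JSON snapshot format are hypothetical, and `os.fork` is available on Unix-like systems only. The child process sees a copy-on-write image of the parent's memory as of the fork, so it can write a consistent snapshot while the parent keeps applying updates.

```python
import json
import os
import tempfile


def checkpoint(state, path):
    """Snapshot `state` to `path` from a forked child; return the child's pid.

    After fork() the child works on a copy-on-write view of the parent's
    memory, so later parent updates do not leak into the snapshot.
    """
    pid = os.fork()
    if pid == 0:
        # Child: serialize the frozen view, then exit immediately without
        # running cleanup handlers inherited from the parent.
        status = 1
        try:
            with open(path, "w") as f:
                json.dump(state, f)
            status = 0
        finally:
            os._exit(status)
    return pid


store = {"k1": 1, "k2": 2}
snapshot_path = os.path.join(tempfile.mkdtemp(), "snapshot.json")

child = checkpoint(store, snapshot_path)
store["k1"] = 100      # the parent keeps serving updates meanwhile
store["k3"] = 3
os.waitpid(child, 0)   # wait until the snapshot is fully written

with open(snapshot_path) as f:
    snapshot = json.load(f)
```

Every page the parent modifies while the child is alive has to be duplicated by the kernel, which is the source of the memory growth of up to 2x that the abstract reports for update-intensive workloads.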

    Operating System Support for Redundant Multithreading

    Failing hardware is a fact, and trends in microprocessor design indicate that the fraction of hardware suffering from permanent and transient faults will continue to increase in future chip generations. Researchers have proposed various solutions to this issue, each with different downsides: Specialized hardware components make hardware more expensive in production and consume additional energy at runtime. Fault-tolerant algorithms and libraries enforce specific programming models on the developer. Compiler-based fault tolerance requires the source code for all applications to be available for recompilation. In this thesis I present ASTEROID, an operating system architecture that integrates applications with different reliability needs. ASTEROID is built on top of the L4/Fiasco.OC microkernel and extends the system with Romain, an operating system service that transparently replicates user applications. Romain supports single- and multi-threaded applications without requiring access to the application's source code. Romain replicates applications and their resources completely and thereby does not rely on hardware extensions, such as ECC-protected memory. In my thesis I describe how to efficiently implement replication as a form of redundant multithreading in software. I develop mechanisms to manage replica resources and to make multi-threaded programs behave deterministically for replication. I furthermore present an approach to handle applications that use shared-memory channels with other programs. My evaluation shows that Romain provides 100% error detection and more than 99.6% error correction for single-bit flips in memory and general-purpose registers. At the same time, Romain's execution time overhead is below 14% for single-threaded applications running in triple-modular redundant mode.
The last part of my thesis acknowledges that software-implemented fault tolerance methods often rely on the correct functioning of a certain set of hardware and software components, the Reliable Computing Base (RCB). I introduce the concept of the RCB and discuss what constitutes the RCB of the ASTEROID system and of other fault tolerance mechanisms. Thereafter I show three case studies that evaluate approaches to protecting RCB components and thereby aim to achieve a software stack that is fully protected against hardware errors.
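The triple-modular redundant mode mentioned in the abstract rests on majority voting among replicas, which can be sketched in a few lines. This is an illustrative sketch only and not how Romain is implemented: Romain replicates whole applications at the operating system level, whereas the hypothetical `run_replicated`, `vote` and `checksum` functions below simply call a function three times in one process. The `inject_fault_in` parameter is an assumed test hook that imitates a single-bit flip.

```python
from collections import Counter


def vote(outputs):
    """Majority-vote over replica outputs.

    Returns (value, faulty_indices). Raises if no strict majority exists,
    which corresponds to a detected but uncorrectable error.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("replicas disagree without a majority")
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty


def run_replicated(func, arg, replicas=3, inject_fault_in=None):
    """Run `func` once per replica and vote on the results.

    `inject_fault_in` is a hypothetical test hook: it flips the lowest bit
    of that replica's integer result to imitate a single-bit error.
    """
    outputs = []
    for i in range(replicas):
        result = func(arg)
        if i == inject_fault_in:
            result ^= 1
        outputs.append(result)
    return vote(outputs)


def checksum(data):
    """Toy workload: sum of the bytes modulo a prime."""
    return sum(data) % 65521


clean = run_replicated(checksum, b"payload")
corrected = run_replicated(checksum, b"payload", inject_fault_in=1)
```

With three replicas a single faulty result is outvoted and can be corrected; with only two replicas a mismatch can be detected but not resolved, which is why `vote` raises when no strict majority exists.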