123 research outputs found

    Zackup, a scalable centralized backup service

    Currently available open-source, centralized, disk-based backup services use custom data storage structures to store backup sets efficiently. While custom structures are necessary to store backup sets in a differential or space-efficient manner, they are often hard to navigate and increase the complexity of the backup service. To solve these problems, a backup service was developed that relies on ZFS, the file system open-sourced by Sun Microsystems, Inc. Because of the variety of features ZFS includes, most of the common tasks involved in creating backup sets were delegated to the file system. Along with ZFS, a number of Ruby libraries were used to further reduce the amount of author-written code. Lastly, to increase the scalability of the backup service, a job-queue-based communication architecture between service entities was employed, allowing for a solution with multiple backup nodes.
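
The job-queue idea in the Zackup abstract can be illustrated with a minimal sketch. This is not Zackup's code (the abstract says the service is built on Ruby libraries and ZFS); Python is used only for illustration, and the function name `run_backup_nodes`, the host names, and the in-process queue are all assumptions. Each worker thread stands in for a backup node that pulls the next pending host from a shared queue.

```python
import queue
import threading


def run_backup_nodes(hosts, node_count=2):
    """Hand out one backup job per host to `node_count` worker 'nodes'.

    Returns a list of (node_id, host) pairs recording which node took which job.
    """
    jobs = queue.Queue()
    for host in hosts:
        jobs.put(host)

    completed = []
    lock = threading.Lock()

    def node(node_id):
        while True:
            try:
                host = jobs.get_nowait()
            except queue.Empty:
                return  # no jobs left, this node goes idle
            # Hypothetical placeholder: a real node would copy the host's
            # files and take a file-system snapshot at this point.
            with lock:
                completed.append((node_id, host))
            jobs.task_done()

    workers = [threading.Thread(target=node, args=(i,)) for i in range(node_count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return completed
```

Because nodes only ever talk to the queue, adding capacity in this sketch means starting more workers; no job needs to know which node will run it.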

    M2: Malleable Metal as a Service

    Existing bare-metal cloud services that provide users with physical nodes have a number of serious disadvantages compared with their virtual alternatives, including slow provisioning times, difficulty for users to release nodes and then reuse them to handle changes in demand, and poor tolerance to failures. We introduce M2, a bare-metal cloud service that uses network-mounted boot drives to overcome these disadvantages. We describe the architecture and implementation of M2 and compare its agility, scalability, and performance to existing systems. We show that M2 can reduce provisioning time by over 50% while offering richer functionality and run-time performance comparable to that of tools that provision images onto local disks. M2 is open source and available at https://github.com/CCI-MOC/ims. Comment: IEEE International Conference on Cloud Engineering 201

    InSight2: An Interactive Web Based Platform for Modeling and Analysis of Large Scale Argus Network Flow Data

    Monitoring systems are paramount to the proactive detection and mitigation of performance and security problems in computer networks. Degraded performance and compromised end nodes can cost network operators downtime, data loss, and reputation. InSight2 is a platform that models, analyzes, and visualizes large-scale Argus network flow data using up-to-date geographical data, organizational information, and emerging threats. It is engineered to meet the needs of network administrators, with flexibility and modularity in mind. Scalability is ensured by multi-core processing built on a robust software architecture. Extensibility is achieved by enabling the end user to enrich flow records using additional user-provided databases. Deployment is streamlined by an automated installation script. State-of-the-art visualizations are presented in a secure, user-friendly web interface, giving the end user greater insight into the network.

    Suorituskyky ja skaalautuvuus sensoridatan tallennuksessa (Performance and scalability in sensor data storage)

    Modern artificial intelligence and machine learning applications build on analysis and training using large datasets. New research and development does not always start with existing big datasets, but accumulates data over time. The same storage solution does not necessarily cover the required scale over the whole lifetime of the research, especially when scaling up from common workgroup storage technologies. The storage infrastructure at ZenRobotics has grown using standard workgroup technologies. The current approach is starting to show its limits, while storage growth is predicted to continue and accelerate. Successful capacity planning and expansion require a better understanding of storage usage patterns and growth. This master's thesis examines the current storage architecture and the stored data from different perspectives in order to gain a better understanding of the situation. By performing a number of experiments, we determine key properties of the employed technologies. The combination of these factors allows us to make informed decisions about future storage solutions. Current usage patterns are in many ways inefficient, and changes are needed in order to work with larger volumes of data. Some changes would allow the current architecture to scale a bit further, but in order to scale horizontally instead of just vertically, scalability must be designed into the future system architecture.

    Kernel-space inline deduplication file systems for virtual machine image storage.

    We explore the use of deduplication for eliminating the storage of redundant data in RAID from a file-system design perspective. We propose ScaleDFS, a deduplication file system that seeks to achieve scalable read/write throughput in RAID. ScaleDFS is built on three novel design features. First, we improve the write throughput by exploiting multiple CPU cores to parallelize the processing of the cryptographic fingerprints that are used to identify redundant data. Second, we improve the read throughput by specifically caching in memory the recently read blocks that have been deduplicated. Third, we reduce the memory usage by enhancing the data structures that are used for fingerprint lookups. ScaleDFS is implemented as a POSIX-compliant, kernel-space driver module that can be deployed in commodity hardware configurations. We conduct microbenchmark experiments using synthetic workloads, and macrobenchmark experiments using a dataset of 42 VM images of different Linux distributions. We show that ScaleDFS achieves higher read/write throughput than existing open-source deduplication file systems in RAID.

    Detailed summary in vernacular field only. Ma, Mingcao. "October 2012." Thesis (M.Phil.)--Chinese University of Hong Kong, 2013. Includes bibliographical references (leaves 39-42). Abstracts also in Chinese.

    Chapter 1: Introduction
    Chapter 2: Literature Review
        2.1 Backup systems
        2.2 Use of special hardware
        2.3 Scalable storage
        2.4 Inline DFSs
        2.5 VM image storage with deduplication
    Chapter 3: ScaleDFS Background
        3.1 Spatial Locality of Fingerprint Placement
        3.2 Prefetching of Fingerprint Stores
        3.3 Journaling
    Chapter 4: ScaleDFS Design
        4.1 Parallelizing Deduplication
        4.2 Caching Read Blocks
        4.3 Reducing Memory Usage
    Chapter 5: Implementation
        5.1 Choice of Hash Function
        5.2 OpenStack Deployment
    Chapter 6: Experiments
        6.1 Microbenchmarks
        6.2 OpenStack Deployment
        6.3 VM Image Operations in a RAID Setup
    Chapter 7: Conclusions and Future Work
    Bibliography
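
The first ScaleDFS design feature, computing fingerprints in parallel and storing each unique block once, can be sketched in user space as follows. This is only an illustration under stated assumptions, not the kernel module described in the thesis: the 4 KiB block size, the use of SHA-1, the in-memory `store` dictionary, and the function names are all hypothetical choices. Threads are used because CPython's `hashlib` releases the global interpreter lock while hashing larger buffers, so the fingerprinting step can overlap across cores.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4096  # assumed block size, not taken from the thesis


def fingerprint(block: bytes) -> str:
    """Cryptographic fingerprint of one block (SHA-1 is an assumption)."""
    return hashlib.sha1(block).hexdigest()


def dedup_write(data: bytes, store: dict, workers: int = 4) -> list:
    """Split data into blocks, fingerprint them in parallel, and keep
    only one copy of each distinct block in `store`.

    Returns the 'recipe': the ordered fingerprints needed to rebuild data.
    """
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        recipe = list(pool.map(fingerprint, blocks))
    for fp, block in zip(recipe, blocks):
        store.setdefault(fp, block)  # a duplicate block is not stored again
    return recipe


def read_back(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its recipe."""
    return b"".join(store[fp] for fp in recipe)
```

Writing three identical blocks followed by one different block stores two blocks in this sketch while the recipe still lists four fingerprints, which is the space saving that deduplication targets.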

    OpenStack

    The aim of this thesis is to present OpenStack, an open-source software platform for managing telecommunications resources in a cloud environment. The thesis first describes the architecture of the cloud environment and the service models used. It then presents the Network Function Virtualization (NFV) architecture applied in telecommunications, in accordance with the standards set by the European Telecommunications Standards Institute. The main topic of the thesis is the presentation of the OpenStack software used by the NFV architecture. These chapters attempt to describe, as fully and in as much detail as possible, the functions of OpenStack and the components it consists of. Finally, a techno-economic analysis compares the cost of implementing the NFV architecture with that of the architecture currently used in telecommunication networks. The results of this work are the many possibilities for implementing and applying the new architecture, and its very low operating cost compared with the technology in use until now.

    HetFS: A heterogeneous file system for everyone

    Storage devices have become more and more diverse during the last decade. The advent of SSDs made it painfully clear that rotating devices, such as HDDs or magnetic tapes, were lacking with regard to response time. However, SSDs currently have a limited number of write cycles and a significantly higher price per unit of capacity, which has prevented rotational technologies from being abandoned. Additionally, Non-Volatile Memories (NVMs) have lately been gaining traction, offering devices that typically outperform NAND-based SSDs but exhibit a whole new set of idiosyncrasies. Therefore, in order to appropriately support this diversity, intelligent mechanisms will be needed in the near future to balance the benefits and drawbacks of each storage technology available to a system. In this paper, we present a first step towards such a mechanism, called HetFS, an extension to the ZFS file system that is capable of choosing the storage device a file should be kept on according to preprogrammed filters. We introduce the prototype and show some preliminary results of the effects obtained when placing specific files on different devices. The research leading to these results has received funding from the European Community under the BIGStorage ETN (Project 642963 of the H2020-MSCA-ITN-2014), from the Spanish Ministry of Economy and Competitiveness under the TIN2015-65316 grant, and from the Catalan Government under the 2014-SGR-1051 grant. To learn more about the BigStorage project, please visit http://bigstorage-project.eu/. Peer Reviewed. Postprint (author's final draft)
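
Placing files according to preprogrammed filters, as the HetFS abstract describes, might look roughly like the sketch below. The abstract does not say what the filters match on, so the rule format (filename pattern plus minimum size), the device labels, and the function name `choose_device` are all assumptions made for illustration; this is not the HetFS or ZFS interface.

```python
import fnmatch

# Hypothetical filter rules, checked in order:
# (filename pattern, minimum file size in bytes, target device)
RULES = [
    ("*.log", 0, "hdd"),             # append-heavy logs go to the rotating disk
    ("*.db", 0, "nvm"),              # latency-sensitive databases go to NVM
    ("*", 64 * 1024 * 1024, "hdd"),  # any file of 64 MiB or more goes to the HDD
]
DEFAULT_DEVICE = "ssd"


def choose_device(name: str, size: int, rules=RULES, default=DEFAULT_DEVICE) -> str:
    """Return the device named by the first rule that matches the file."""
    for pattern, min_size, device in rules:
        if fnmatch.fnmatchcase(name, pattern) and size >= min_size:
            return device
    return default
```

First-match-wins ordering keeps the rules easy to reason about in this sketch: specific patterns come first and the catch-all size rule last.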

    Hera Object Storage : a seamless, automated multi-tiering solution on top of OpenStack Swift

    Over the last couple of decades, the demand for cloud storage has grown exponentially. Distributed cloud storage, and object storage for the increasing share of unstructured data, are a major focus of both academic and industrial research. At the same time, storage efficiency and cost often pull in opposite directions, creating a trade-off for any proposed solution. For this reason, classifying data by access probability has become a hot topic. This paper introduces Hera Object Storage, a storage system built on top of OpenStack Swift that aims to select the most appropriate storage tier for each object to be stored. The goal of the proposed multi-tiering storage is to be automated and seamless, guaranteeing the required storage performance at the lowest possible cost. The paper discusses the design challenges and the proposed algorithmic solutions and, based on a prototype implementation, presents a basic proof-of-concept validation.
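
The stated goal of Hera Object Storage, meeting the required performance at the lowest possible cost, can be illustrated with a small selection rule. The abstract does not describe Hera's algorithms, so everything here is a hypothetical sketch: the tier names, latency and cost figures, the mapping from access probability to a latency requirement, and the function names are assumptions, and no OpenStack Swift API is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    latency_ms: float   # assumed typical access latency
    cost_per_gb: float  # assumed monthly cost


# Illustrative figures only
TIERS = [
    Tier("ssd", 1.0, 0.20),
    Tier("hdd", 10.0, 0.05),
    Tier("archive", 1000.0, 0.01),
]


def required_latency_ms(access_probability: float) -> float:
    """Hypothetical policy: likelier accesses demand lower latency."""
    if access_probability >= 0.5:
        return 5.0
    if access_probability >= 0.05:
        return 50.0
    return float("inf")


def select_tier(access_probability: float, tiers=TIERS) -> Tier:
    """Pick the cheapest tier that still meets the latency requirement."""
    limit = required_latency_ms(access_probability)
    eligible = [t for t in tiers if t.latency_ms <= limit]
    if not eligible:
        # No tier is fast enough, so fall back to the fastest one available
        return min(tiers, key=lambda t: t.latency_ms)
    return min(eligible, key=lambda t: t.cost_per_gb)
```

Separating the requirement (`required_latency_ms`) from the choice (`select_tier`) means the classification of objects can change without touching the cost minimization step.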