
    Securing Internet Protocol (IP) Storage: A Case Study

    Storage networking technology has enjoyed strong growth in recent years, but security concerns and threats facing networked data have grown equally fast. Today, many potential threats target storage networks, including data modification, destruction and theft, DoS attacks, malware, hardware theft and unauthorized access, among others. For a Storage Area Network (SAN) to be secure, each of these threats must be addressed individually. In this paper, we present a comparative study implementing different security methods in an IP storage network.
    Comment: 10 pages, IJNGN Journal

    CloudJet4BigData: Streamlining Big Data via an Accelerated Socket Interface

    Big data applications need to feed users with fresh processing results, and cloud platforms can be used to speed them up. This paper describes a new data communication protocol (CloudJet) for long-distance, large-volume big data access operations that alleviates the large latencies encountered when sharing big data resources in the cloud. It encapsulates a dynamic multi-stream/multi-path engine at the socket level, which conforms to the Portable Operating System Interface (POSIX) and can therefore accelerate any POSIX-compatible application across IP-based networks. CloudJet was demonstrated to accelerate typical big data applications such as very large databases (VLDB), data mining, media streaming and office applications by up to tenfold in real-world tests.
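    The dynamic multi-stream engine sits below the POSIX socket interface, so applications benefit without modification. As a rough, hypothetical user-space illustration of the multi-stream idea only (not CloudJet's actual implementation; host, port and stream count are made up), a single transfer can be striped across several parallel TCP connections:

        import socket
        from concurrent.futures import ThreadPoolExecutor

        def send_striped(host: str, port: int, payload: bytes, streams: int = 4) -> None:
            """Split payload into `streams` chunks and send each over its own TCP connection."""
            chunk = (len(payload) + streams - 1) // streams
            pieces = [payload[i * chunk:(i + 1) * chunk] for i in range(streams)]

            def send_one(index: int, data: bytes) -> None:
                with socket.create_connection((host, port)) as sock:
                    # A real protocol would frame stream index and offset so the
                    # receiver can reassemble; here a 1-byte index stands in for that.
                    sock.sendall(bytes([index]) + data)

            with ThreadPoolExecutor(max_workers=streams) as pool:
                for i, piece in enumerate(pieces):
                    pool.submit(send_one, i, piece)

        # Example (assumes a receiver listening on example-host:9000):
        # send_striped("example-host", 9000, b"x" * 10_000_000, streams=8)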

    SDN Enabled Network Efficient Data Regeneration for Distributed Storage Systems

    Distributed Storage Systems (DSSs) have seen increasing levels of deployment in data centers and in cloud storage networks. DSSs provide efficient and cost-effective ways to store large amounts of data. To ensure reliability and resilience to failures, DSSs employ mirroring and coding schemes at the block and file level. While mirroring techniques provide an efficient way to recover lost data, they do not utilize disk space efficiently, resulting in large storage overheads. Coding techniques, on the other hand, provide a better way to recover data, as they reduce the amount of storage space required for recovery. However, the current recovery process for coded data is inefficient because large amounts of data must be transferred to regenerate the data lost after a failure, causing significant delays and excessive network traffic and creating a major performance bottleneck. In this thesis, we propose a new architecture for efficient data regeneration in distributed storage systems. A key idea of our architecture is to enable network switches to perform network coding operations, i.e., to combine the packets they receive over incoming links and forward the resulting packet towards the destination, and to do this in a principled manner. Another key element of our framework is a transport-layer reverse multicast protocol that takes advantage of network coding to minimize the rebuild time by allowing more efficient utilization of network bandwidth. The architecture is realized using the principles of Software Defined Networking (SDN), with extensions made where required. To enable the switches to perform network coding operations, we propose an extension of the packet-processing pipeline in the data plane of a software switch. Our testbed experiments show that the proposed architecture results in modest performance gains.
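    A minimal sketch of the in-switch combining step described above, assuming the common XOR form of network coding: a switch combines the packets arriving on its incoming links into one coded packet and forwards that instead of each packet separately, and a node that already holds all but one of the originals can recover the missing fragment. This is a generic illustration, not the thesis' actual dataplane pipeline extension:

        from functools import reduce

        def xor_combine(packets: list) -> bytes:
            """Combine equal-length packets into a single coded packet by bytewise XOR."""
            length = len(packets[0])
            assert all(len(p) == length for p in packets), "packets must be equal length"
            return bytes(reduce(lambda acc, p: [a ^ b for a, b in zip(acc, p)],
                                packets, [0] * length))

        def recover_missing(coded: bytes, known: list) -> bytes:
            """Given the coded packet and all but one original, recover the missing one."""
            return xor_combine([coded] + known)

        # Fragments from three surviving nodes converge at a switch:
        #   coded = xor_combine([frag_a, frag_b, frag_c])
        # A rebuilding node that already holds frag_b and frag_c recovers frag_a:
        #   frag_a = recover_missing(coded, [frag_b, frag_c])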

    Network Planning for Disaster Recovery


    Network Optimizations for Distributed Storage Networks

    Distributed file systems enable the reliable storage of exabytes of information on thousands of servers distributed throughout a network. These systems achieve reliability and performance by storing three or more copies of data in different locations across the network. The management of these copies is commonly handled by intermediate servers that track and coordinate the placement of data in the network. This introduces potential network bottlenecks, as multiple transfers to fast storage nodes can saturate the network links connecting the intermediate servers to the storage. The advent of open Network Operating Systems presents an opportunity to alleviate this bottleneck, as it is now possible to treat network elements as intermediate nodes in the distributed file system and have them perform the task of replicating data across storage nodes. In this thesis, we propose a new design paradigm for distributed file systems, driven by a new fundamental component that runs on network elements such as switches or routers. We describe the component's architecture and how it can be integrated into existing distributed file systems to increase their performance. To measure this performance increase over current approaches, we emulate a distributed file system by creating a block-level storage array distributed across multiple iSCSI targets presented in a network. Furthermore, we emulate more complicated redundancy schemes likely to be used in future distributed file systems to determine what effect this approach has on those systems and what benefits it offers. We find that the new component offers a decrease in request latency proportional to the number of storage nodes involved in the request. We also find that the benefits of this approach are limited by the ability of switch hardware to process the incoming data of a request, but that these limitations can be surmounted through the proposed design paradigm.
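    A hedged sketch of the replication component, under the assumption (made for illustration only) that the network element receives each block once and fans it out to the replica storage nodes itself, so the link from the client carries a single copy rather than three. The addresses and the trivial framing below are invented:

        import socket

        # Hypothetical replica targets reachable from the network element.
        REPLICAS = [("10.0.0.11", 3260), ("10.0.0.12", 3260), ("10.0.0.13", 3260)]

        def replicate_block(block_id: int, data: bytes, replicas=REPLICAS) -> None:
            """Forward one block, received once from the client, to every replica target."""
            header = block_id.to_bytes(8, "big") + len(data).to_bytes(4, "big")
            for host, port in replicas:
                with socket.create_connection((host, port)) as sock:
                    sock.sendall(header + data)

        # replicate_block(42, b"\x00" * 4096)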

    HEC: Collaborative Research: SAM^2 Toolkit: Scalable and Adaptive Metadata Management for High-End Computing

    The increasing demand for exabyte-scale storage capacity by high-end computing applications requires a higher level of scalability and dependability than current file and storage systems provide. This proposal addresses file systems research on metadata management for scalable cluster-based parallel and distributed file storage systems in the HEC environment. It aims to develop a scalable and adaptive metadata management (SAM2) toolkit that extends the features of, and fully leverages the peak performance promised by, state-of-the-art cluster-based parallel and distributed file storage systems used by the high performance computing community. There is a large body of research on scaling data movement and management; however, the need to scale up the attributes of cluster-based file systems and I/O, that is, metadata, has been underestimated. Understanding the characteristics of metadata traffic, and applying proper load-balancing, caching, prefetching and grouping mechanisms to metadata management accordingly, will lead to high scalability. It is anticipated that by plugging the scalable and adaptive metadata management components into state-of-the-art cluster-based parallel and distributed file storage systems, one could increase the performance of applications and file systems and help translate the promise of high peak performance into real application performance improvements. The project involves the following components:
    1. Develop multi-variable forecasting models to analyze and predict file metadata access patterns.
    2. Develop scalable and adaptive file name mapping schemes using the duplicative Bloom filter array technique to enforce load balance and increase scalability (a simplified sketch follows below).
    3. Develop decentralized, locality-aware metadata grouping schemes to facilitate bulk metadata operations such as prefetching.
    4. Develop an adaptive cache coherence protocol using a distributed shared object model for client-side and server-side metadata caching.
    5. Prototype the SAM2 components in the state-of-the-art parallel virtual file system PVFS2 and a distributed storage data caching system, set up an experimental framework for a DOE CMS Tier 2 site at the University of Nebraska-Lincoln, and conduct benchmark, evaluation and validation studies.
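    As a simplified sketch of the name-mapping idea in component 2 (a generic Bloom-filter lookup, not the proposal's duplicative Bloom filter array): each metadata server is summarized by a Bloom filter of the names it manages, and a lookup probes the filters instead of broadcasting to every server:

        import hashlib

        class BloomFilter:
            def __init__(self, size_bits: int = 1 << 16, hashes: int = 4):
                self.size, self.hashes = size_bits, hashes
                self.bits = bytearray(size_bits // 8)

            def _positions(self, name: str):
                for i in range(self.hashes):
                    digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
                    yield int.from_bytes(digest[:8], "big") % self.size

            def add(self, name: str) -> None:
                for pos in self._positions(name):
                    self.bits[pos // 8] |= 1 << (pos % 8)

            def may_contain(self, name: str) -> bool:
                return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(name))

        def lookup(path: str, server_filters: dict) -> list:
            """Return the metadata servers that may hold path (false positives are possible)."""
            return [server for server, bf in server_filters.items() if bf.may_contain(path)]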

    Cheetah: An Economical Distributed RAM Drive

    Current hard drive technology shows a widening gap between the ability to store vast amounts of data and the ability to process it. To overcome the problems of this long-term trend, we explore the use of available distributed RAM resources to effectively replace a mechanical hard drive. The essential approach is a distributed Linux block device that spreads its blocks throughout spare RAM on a cluster and transfers blocks over the network. The presented solution is LAN-scalable, easy to deploy, and faster than a commodity hard drive. The specific driving problem is I/O-intensive applications, particularly digital forensics. The prototype implementation is a Linux 2.4 kernel module that connects to Unix-based clients. It features an adaptive prefetching scheme that fetches future data blocks for each read request. We present experimental results, based on generic benchmarks as well as digital forensic applications, that demonstrate significant performance gains over commodity hard drives.
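    A minimal sketch of the two ideas in the abstract, assuming a round-robin placement of blocks across nodes and a purely sequential prefetch policy; the real system is a Linux 2.4 kernel block driver, and the in-memory "nodes" below only stand in for spare RAM on remote machines:

        BLOCK_SIZE = 4096
        PREFETCH_DEPTH = 8

        class DistributedRamDrive:
            def __init__(self, node_count: int):
                # Each dict stands in for spare RAM on one remote cluster node.
                self.nodes = [dict() for _ in range(node_count)]
                self.cache = {}  # locally cached blocks, filled by prefetching

            def _node_for(self, block: int) -> dict:
                return self.nodes[block % len(self.nodes)]

            def write_block(self, block: int, data: bytes) -> None:
                self._node_for(block)[block] = data

            def read_block(self, block: int) -> bytes:
                if block in self.cache:
                    data = self.cache.pop(block)
                else:
                    data = self._node_for(block).get(block, b"\x00" * BLOCK_SIZE)
                # Simplest possible stand-in for the adaptive prefetcher: assume the
                # next reads are sequential and pull the following blocks ahead of time.
                for ahead in range(1, PREFETCH_DEPTH + 1):
                    nxt = block + ahead
                    if nxt in self._node_for(nxt):
                        self.cache[nxt] = self._node_for(nxt)[nxt]
                return data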

    Fairness in a data center

    Existing data centers utilize several networking technologies in order to handle the performance requirements of different workloads. Maintaining diverse networking technologies increases complexity and is not cost-effective. This has led to the current trend of converging all traffic onto a single networking fabric. Ethernet is both cost-effective and ubiquitous, and as such it has been chosen as the technology for the converged fabric. However, traditional Ethernet does not satisfy the needs of all traffic workloads, largely due to its lossy nature, and therefore has to be enhanced to allow for full convergence. The resulting technology, Data Center Bridging (DCB), is a set of IEEE standards that make Ethernet lossless even in the presence of congestion. As with any new networking technology, it is critical to analyze how the different protocols within DCB interact with each other and with existing technologies in other layers of the protocol stack. This dissertation presents two novel schemes that address critical issues in DCB networks: fairness with respect to packet lengths and fairness with respect to flow control and bandwidth utilization. The Deficit Round Robin with Adaptive Weight Control (DRR-AWC) algorithm actively monitors the incoming streams and adjusts the scheduling weights of the outbound port. The algorithm was implemented on a real DCB switch and shown to increase fairness for traffic consisting of mixed-length packets. Targeted Priority-based Flow Control (TPFC) provides a hop-by-hop flow control mechanism that restricts aggressor streams while allowing victim streams to continue unimpeded. Two variants of the targeting mechanism within TPFC are presented and their performance is evaluated through simulation.
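    For reference, a baseline Deficit Round Robin loop, the scheduler DRR-AWC builds on: each queue receives a quantum per round and may transmit while its deficit counter covers the head packet's length, with unused deficit carried over. The adaptive re-weighting of the quanta from the observed packet-length mix, which is the dissertation's contribution, is not reproduced here:

        from collections import deque

        def drr_schedule(queues: list, quanta: list, rounds: int) -> list:
            """queues: per-flow FIFOs of packets (bytes); quanta: per-queue quantum in bytes."""
            deficits = [0] * len(queues)
            sent = []
            for _ in range(rounds):
                for i, q in enumerate(queues):
                    if not q:
                        deficits[i] = 0  # an empty queue does not bank credit
                        continue
                    deficits[i] += quanta[i]
                    while q and len(q[0]) <= deficits[i]:
                        pkt = q.popleft()
                        deficits[i] -= len(pkt)
                        sent.append(pkt)
            return sent

        # Two flows, one sending 1500-byte packets and one sending 64-byte packets,
        # drain at roughly equal byte rates when given equal quanta:
        # drr_schedule([deque([b"A" * 1500] * 10), deque([b"B" * 64] * 200)], [1500, 1500], rounds=5)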

    U-LITE: 6 years of scientific computing at LNGS

    The computing infrastructure of the Laboratori Nazionali del Gran Sasso (LNGS) is the primary platform for data storage, analysis, computing and simulation for the LNGS-based experiments, which are part of the research activities of the Istituto Nazionale di Fisica Nucleare (INFN). The groups running these experiments have diverse needs and adopt different approaches in developing the computing frameworks that support their activities. Since the emergence of the Cloud paradigm, the Computing and Network Service has built on its experience in operating and managing the LNGS computing infrastructure to develop U-LITE, a versatile environment suited to hosting such a varied ecosystem while providing LNGS scientific users with a familiar computing interface that hides the complexities of modern data center management. Over the last six years, U-LITE has proved to be a valuable tool for the LNGS experiments and provides an example of the effective use of the Cloud computing approach in a real scientific context.