    Ceph as WAN Filesystem – Performance and Feasibility Study through Simulation

    Recent developments in object-based distributed file systems (DFSs) such as Ceph and GlusterFS, as well as more established ones such as Lustre and GPFS, have presented new opportunities to set up the next generation of storage infrastructure for cloud computing, big data, and the Internet of Things (IoT). However, existing DFSs are typically deployed on a Local Area Network (LAN) and generally used for high-performance computing. Extending these DFSs to geographically distributed sites over a Campus Area Network (CAN) or Wide Area Network (WAN) for enterprise applications presents a completely different set of challenges and issues. Unlike most implementations, which choose a traditional multi-site deployment, i.e., each site implements a virtual storage (via LAN) and the sites are linked through RESTful APIs (via WAN), we attempt to create a single virtual storage over WAN using Ceph. In this paper, we demonstrate that a properly designed and configured virtualized environment is a valuable tool for researchers to simulate a distributed file system over WAN without an actual physical environment. When a few guidelines are followed, the read and write performance results in a simulated environment can indicate the trend of the read and write performance in the actual physical environment. This implies that the storage design can be verified, and a performance baseline established, prior to actual deployment. An obvious benefit is that the initial investment in a storage solution is lower. Furthermore, this paper discusses the challenges of setting up such an environment, the feasibility of using Ceph as a single virtual store, and some possible future work.

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication, and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Report

    CERN openlab Whitepaper on Future IT Challenges in Scientific Research

    This whitepaper describes the major IT challenges in scientific research at CERN and several other European and international research laboratories and projects. Each challenge is exemplified through a set of concrete use cases drawn from the requirements of large-scale scientific programs. The paper is based on contributions from many researchers and IT experts of the participating laboratories, as well as input from the existing CERN openlab industrial sponsors. The views expressed in this document are those of the individual contributors and do not necessarily reflect the view of their organisations and/or affiliates.

    Investigating grid computing technologies for use with commercial simulation packages

    As simulation experimentation in industry becomes more computationally demanding, grid computing can be seen as a promising technology that has the potential to bind together the computational resources needed to quickly execute such simulations. To investigate how this might be possible, this paper reviews the grid technologies that can be used together with commercial-off-the-shelf simulation packages (CSPs) used in industry. The paper identifies two specific forms of grid computing (Public Resource Computing and Enterprise-wide Desktop Grid Computing) and the middleware associated with them (BOINC and Condor) as being suitable for grid-enabling existing CSPs. It further proposes three different CSP-grid integration approaches and identifies one of them as the most appropriate. It is hoped that this research will encourage simulation practitioners to consider grid computing as a technologically viable means of executing CSP-based experiments faster.