115,735 research outputs found

    StoRM: A Manager for Storage Resource in Grid

    Data-intensive applications now demand high-performance, large-capacity storage systems capable of serving several petabytes of storage space. Common solutions adopted in data centres therefore include Storage Area Networks (SAN) and cluster parallel file systems, such as GPFS from IBM and Lustre from Sun Microsystems. To make these storage systems available in modern Data Grid architectures, standard interfaces are needed. The Grid Storage Resource Manager (SRM) interface is one of them. Grid storage services implementing the SRM standard provide common capabilities and advanced functionality such as dynamic space allocation and file management on shared storage systems. In this paper, we describe StoRM (STOrage Resource Manager), a flexible and high-performing implementation of the standard SRM interface version 2.2. The software architecture of StoRM allows easy integration with different underlying storage systems via a plug-in mechanism. In particular, StoRM takes advantage of storage systems based on cluster file systems. Currently, StoRM is installed and used in production in various data centres, including the WLCG Italian Tier-1. In addition, Economics and Financial communities, as represented by the EGRID Project, adopt StoRM in production as well.

    Scalable data abstractions for distributed parallel computations

    The ability to express a program as a hierarchical composition of parts is an essential tool in managing the complexity of software, and a key abstraction it provides is the separation of the representation of data from the computation. Many current parallel programming models use a shared memory model to provide data abstraction, but this does not scale well to large numbers of cores because of non-determinism and access latency. This paper proposes a simple programming model that allows scalable parallel programs to be expressed with distributed representations of data, and it gives the programmer the flexibility to employ shared or distributed styles of data-parallelism where applicable. The model admits an efficient implementation and, given a small set of primitive capabilities in the hardware, it can be compiled to operate directly on the hardware, in the same way that stack-based allocation operates for subroutines in sequential machines.

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication, and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Report

    Cooperative Multi-Bitrate Video Caching and Transcoding in Multicarrier NOMA-Assisted Heterogeneous Virtualized MEC Networks

    Cooperative video caching and transcoding in mobile edge computing (MEC) networks is a new paradigm for future wireless networks, e.g., 5G and beyond, that reduces the use of scarce and expensive backhaul resources by prefetching video files within radio access networks (RANs). Integrating this technique with other emerging technologies, such as wireless network virtualization and multicarrier non-orthogonal multiple access (MC-NOMA), provides more flexible video delivery opportunities, which improves both the network's revenue and the end-users' service experience. In this regard, we propose a two-phase RAF for parallel cooperative joint multi-bitrate video caching and transcoding in heterogeneous virtualized MEC networks. In the cache placement phase, we propose novel proactive delivery-aware cache placement strategies (DACPSs) that jointly allocate physical and radio resources based on network stochastic information to exploit flexible delivery opportunities. Then, for the delivery phase, we propose a delivery policy based on the user requests and network channel conditions. The optimization problems corresponding to both phases aim to maximize the total revenue of network slices, i.e., virtual networks. Both problems are non-convex and suffer from high computational complexity. For each phase, we show how the problem can be solved efficiently. We also propose a low-complexity RAF in which the complexity of the delivery algorithm is significantly reduced. A delivery-aware cache refreshment strategy (DACRS) in the delivery phase is also proposed to handle dynamic changes in network stochastic information. Extensive numerical assessments demonstrate a performance improvement of up to 30% for our proposed DACPSs and DACRS over traditional approaches. Comment: 53 pages, 24 figures

    Many-Task Computing and Blue Waters

    This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.