4,912 research outputs found

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Full text link
    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication, and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems in order to better understand their goals and methodology, which helps evaluate their applicability to similar problems. The taxonomy also provides a "gap analysis" of the area, through which researchers can identify new issues for investigation. We also hope that the proposed taxonomy and mapping give new practitioners an easy way to understand this complex area of research.
    Comment: 46 pages, 16 figures, Technical Report
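    The "gap analysis" use of such a taxonomy can be made concrete with a small sketch. The Python fragment below uses invented category and system names (they are not taken from the paper): represent each surveyed system as a tuple of taxonomy categories, then list the category combinations no surveyed system occupies.

    # Hypothetical illustration of taxonomy-based gap analysis; the
    # dimensions and systems below are invented for this sketch.
    from itertools import product

    REPLICATION = ["centralized", "decentralized"]
    TRANSPORT = ["GridFTP-like", "HTTP-based"]

    systems = {
        "SystemA": ("centralized", "GridFTP-like"),
        "SystemB": ("decentralized", "GridFTP-like"),
    }

    occupied = set(systems.values())
    gaps = [combo for combo in product(REPLICATION, TRANSPORT)
            if combo not in occupied]
    print("Combinations no surveyed system occupies:", gaps)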

    Performance analysis of downlink shared channels in a UMTS network

    Get PDF
    In light of the expected growth in wireless data communications and the commonly anticipated up/downlink asymmetry, we present a performance analysis of downlink data transfer over Downlink Shared Channels (DSCHs), arguably the most efficient UMTS transport channel for medium-to-large data transfers. Our objective is to provide qualitative insight into the different aspects that influence the data Quality of Service (QoS). As the most important factor, the data traffic load affects the data QoS in two distinct manners: (i) a heavier data traffic load implies greater competition for DSCH resources and thus longer transfer delays; and (ii) since each data call served on a DSCH must maintain an Associated Dedicated Channel (A-DCH) for signalling purposes, a heavier data traffic load implies a higher interference level, a higher frame error rate and thus a lower effective aggregate DSCH throughput: the greater the demand for service, the smaller the aggregate service capacity. The latter effect is further amplified in a multicellular scenario, where a DSCH experiences additional interference from the DSCHs and A-DCHs in surrounding cells, causing a further degradation of its effective throughput. Following an insightful two-stage performance evaluation approach, which segregates the interference aspects from the traffic dynamics, a set of numerical experiments is executed in order to demonstrate these effects and obtain qualitative insight into the impact of various system aspects on the data QoS.
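    The self-limiting effect described above ("the greater the demand, the smaller the aggregate service capacity") can be illustrated with a toy model. This sketch is not the paper's interference analysis; the linear frame-error-rate growth per A-DCH and all the numbers are assumptions made purely for illustration.

    # Toy model: each active data call adds A-DCH interference, raising the
    # frame error rate (FER) and shrinking effective aggregate DSCH throughput.
    def effective_throughput(n_calls, peak_rate_kbps=384.0, fer_per_call=0.01):
        # Hypothetical linear FER growth; the paper uses an interference model.
        fer = min(1.0, fer_per_call * n_calls)
        return peak_rate_kbps * (1.0 - fer)

    for n in (1, 10, 25, 50):
        print(f"{n:3d} calls -> {effective_throughput(n):6.1f} kbit/s effective")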

    HARE: Final Report

    Get PDF
    This report documents the results of work done over a six-year period under the FAST-OS programs. The first effort, Right-Weight Kernels (RWK), was concerned with improving measurements of OS noise so it could be treated quantitatively, and with evaluating two operating systems, Linux and Plan 9, on HPC systems to determine how they needed to be extended or changed for HPC while still retaining their general-purpose nature. The second program, HARE, explored the creation of alternative runtime models, building on RWK. All of the HARE work was done on Plan 9. The HARE researchers were mindful of the very good Linux and LWK work being done at other labs and saw no need to recreate it. Even given this limited funding, the two efforts had outsized impact:
    - Helped Cray decide to use Linux, instead of a custom kernel, and provided the tools needed to make Linux perform well
    - Created a successor operating system to Plan 9, NIX, which has been taken in by Bell Labs for further development
    - Created a standard system measurement tool, Fixed Time Quantum (FTQ), which is widely used for measuring operating system impact on applications (the idea is sketched after this abstract)
    - Spurred the use of the 9p protocol in several organizations, including IBM
    - Built software in use at many companies, including IBM, Cray, and Google
    - Spurred the creation of alternative runtimes for use on HPC systems
    - Demonstrated that, with proper modifications, a general-purpose operating system can provide communications up to 3 times as effective as user-level libraries
    Open source was a key part of this work. The code developed for this project is in wide use and available in many places. The core Blue Gene code is available at https://bitbucket.org/ericvh/hare. We describe details of these impacts in the following sections. The rest of this report is organized as follows: first, we describe commercial impact; next, we describe the FTQ benchmark and its impact in more detail; operating systems and runtime research follows; we discuss infrastructure software; and we close with a description of the new NIX operating system, future work, and conclusions.
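    As a hedged illustration of the FTQ idea, here is a minimal Python sketch: count how much synthetic work completes in each fixed-length time quantum; dips in the per-quantum counts expose OS noise stealing cycles. The real FTQ benchmark is far more careful about timing and is not written in Python; this is only a toy.

    import time

    def ftq(quantum_s=0.001, n_quanta=1000):
        # Count loop iterations completed inside each fixed time quantum.
        counts = []
        deadline = time.perf_counter()
        for _ in range(n_quanta):
            deadline += quantum_s
            work = 0
            while time.perf_counter() < deadline:
                work += 1  # one unit of synthetic work
            counts.append(work)
        return counts

    counts = ftq()
    print(f"mean {sum(counts)/len(counts):.0f}, min {min(counts)}, max {max(counts)}")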

    A stochastic control approach for scheduling multimedia transmissions over a polled multiaccess fading channel

    Get PDF
    We develop scheduling strategies for carrying multimedia traffic over a polled multiple-access wireless network with fading. We consider a slotted system with three classes of traffic (voice, streaming media and file transfers). A Markov model is used for the fading and also for modeling voice packet arrivals and streaming arrivals. The performance objectives are a loss probability for voice, mean network delay for streaming media, and time-average throughput for file transfers. A central scheduler (e.g., the access point in a single-cell IEEE 802.11 wireless local area network (WLAN)) is assumed to be able to keep track of all the available state information and make the scheduling decision in each slot (e.g., as would be the case for PCF-mode operation of the IEEE 802.11 WLAN). The problem is modeled as a constrained Markov decision problem. By using constraint relaxations (a linear relaxation and Whittle-type relaxations), an index-based policy is obtained. For the file transfers, the decision problem turns out to be one with partial state information. Numerical comparisons are provided with the performance obtained from some simple policies.
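    A hedged sketch of what an index policy looks like in operation: in each slot the scheduler computes a per-class index from the current state and serves the class with the largest index. The index formulas and the state update below are placeholder heuristics invented for illustration; they are not the Whittle indices derived in the paper.

    import random

    def indices(voice_q, stream_delay, ft_backlog, gain):
        # Placeholder per-class indices (not the paper's derivation).
        return {
            "voice": 10.0 * voice_q,          # loss-sensitive class
            "streaming": 2.0 * stream_delay,  # mean-delay target
            "file": 0.5 * ft_backlog * gain,  # channel-aware throughput
        }

    voice_q, stream_delay, ft_backlog, gain = 3, 1.0, 40, 0.8
    for slot in range(5):
        idx = indices(voice_q, stream_delay, ft_backlog, gain)
        served = max(idx, key=idx.get)
        print(f"slot {slot}: serve {served}")
        if served == "voice":
            voice_q = max(0, voice_q - 2)
        elif served == "streaming":
            stream_delay = max(0.0, stream_delay - 0.5)
        else:
            ft_backlog = max(0, ft_backlog - 5)
        gain = random.choice([0.5, 0.8, 1.0])  # crude stand-in for Markov fading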

    CERN openlab Whitepaper on Future IT Challenges in Scientific Research

    Get PDF
    This whitepaper describes the major IT challenges in scientific research at CERN and several other European and international research laboratories and projects. Each challenge is exemplified through a set of concrete use cases drawn from the requirements of large-scale scientific programs. The paper is based on contributions from many researchers and IT experts of the participating laboratories, as well as input from the existing CERN openlab industrial sponsors. The views expressed in this document are those of the individual contributors and do not necessarily reflect the views of their organisations and/or affiliates.

    Dynamic workflow management for large scale scientific applications

    Get PDF
    The increasing computational and data requirements of scientific applications have made the use of large clustered systems as well as distributed resources inevitable. Although executing large applications in these environments brings increased performance, automating the process becomes more and more challenging. Complex workflow management systems have been a viable solution for this automation. In this thesis, we study a broad range of workflow management tools and compare their capabilities, especially in terms of the dynamic and conditional structures they support, which are crucial for the automation of complex applications. We then apply some of these tools to two real-life scientific applications: i) simulation of DNA folding, and ii) reservoir uncertainty analysis. Our implementation is based on the Pegasus workflow planning tool, the DAGMan workflow execution system, the Condor-G computational scheduler, and the Stork data scheduler. The designed abstract workflows are converted to concrete workflows using Pegasus, where jobs are matched to resources; DAGMan makes sure these jobs execute reliably and in the correct order on the remote resources; Condor-G performs the scheduling of the computational tasks; and Stork optimizes the data movement between the different components. The integrated solution built from these tools allows the automation of large-scale applications while providing reliability and efficiency in executing complex workflows. We have also developed a new site selection mechanism on top of these systems, which can choose the most available computing resources for the submission of tasks. The details of our design and implementation, as well as experimental results, are presented.
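    To make the two core mechanisms concrete, here is a minimal Python sketch combining dependency-ordered execution of a workflow DAG with a trivial site-selection step. Task names, site names, and the load metric are invented for illustration; the thesis builds on Pegasus/DAGMan/Condor-G/Stork rather than anything this simple. (graphlib requires Python 3.9+.)

    from graphlib import TopologicalSorter

    # Each task maps to the set of tasks it depends on (hypothetical names).
    dag = {"fetch_seq": set(), "fold_dna": {"fetch_seq"}, "analyze": {"fold_dna"}}
    site_load = {"cluster_a": 2, "cluster_b": 0}  # pending jobs per site (stub)

    def select_site():
        # Stand-in for the thesis's availability-based site selection:
        # pick the least-loaded site.
        return min(site_load, key=site_load.get)

    for task in TopologicalSorter(dag).static_order():  # dependency order
        site = select_site()
        site_load[site] += 1
        print(f"submit {task} -> {site}")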

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Get PDF
    Datacenters provide cost-effective and flexible access to the scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected by a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements, including user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, and general traffic control challenges and objectives in datacenters. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to provide readers with a wide range of options and factors to weigh when considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control, including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of the paper, we briefly review inter-datacenter networks, which connect geographically dispersed datacenters, have been receiving increasing attention recently, and pose interesting and novel research problems.
    Comment: Accepted for publication in IEEE Communications Surveys and Tutorials
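    As a hedged sketch of two of the techniques listed above, the following Python fragment illustrates strict prioritization across traffic classes and ECMP-style hashing for multipathing. Class names, queue contents, and the path count are invented; this illustrates the general ideas, not any specific system from the survey.

    import hashlib

    queues = {"interactive": ["p1", "p2"], "deadline": ["d1"], "bulk": ["b1"]}
    PRIORITY = ["interactive", "deadline", "bulk"]  # highest first

    def dequeue():
        # Strict priority: always serve the highest-priority non-empty class.
        for cls in PRIORITY:
            if queues[cls]:
                return cls, queues[cls].pop(0)
        return None

    def pick_path(flow, n_paths=4):
        # ECMP-style hashing: a flow's packets stay on one path, avoiding
        # reordering while spreading distinct flows across paths.
        digest = hashlib.md5(repr(flow).encode()).digest()
        return int.from_bytes(digest[:4], "big") % n_paths

    print(dequeue())
    print(pick_path(("10.0.0.1", "10.0.0.2", 12345, 80, "tcp")))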
