1,891 research outputs found

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Full text link
    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also helps to provide an easy way for new practitioners to understand this complex area of research.Comment: 46 pages, 16 figures, Technical Repor

    Beltway Buffers: Avoiding the OS Traffic Jam

    Get PDF

    An Overview of Internet Measurements:Fundamentals, Techniques, and Trends

    Full text link
    The Internet presents great challenges to the characterization of its structure and behavior. Different reasons contribute to this situation, including a huge user community, a large range of applications, equipment heterogeneity, distributed administration, vast geographic coverage, and the dynamism that are typical of the current Internet. In order to deal with these challenges, several measurement-based approaches have been recently proposed to estimate and better understand the behavior, dynamics, and properties of the Internet. The set of these measurement-based techniques composes the Internet Measurements area of research. This overview paper covers the Internet Measurements area by presenting measurement-based tools and methods that directly influence other conventional areas, such as network design and planning, traffic engineering, quality of service, and network management

    Network stack specialization for performance

    Get PDF
    Contemporary network stacks are masterpieces of generality, supporting a range of edge-node and middle-node functions. This generality comes at significant performance cost: current APIs, memory models, and implementations drastically limit the effectiveness of increasingly powerful hardware. Generality has historically been required to allow individual systems to perform many functions. However, as providers have scaled up services to support hundreds of millions of users, they have transitioned toward many thousands (or even millions) of dedicated servers performing narrow ranges of functions. We argue that the overhead of generality is now a key obstacle to effective scaling, making specialization not only viable, but necessary. This paper presents Sandstorm, a clean-slate userspace network stack that exploits knowledge of web server semantics, improving throughput over current off-the-shelf designs while retaining use of conventional operating-system and programming frameworks. Based on Netmap, our novel approach merges application and network-stack memory models, aggressively amortizes stack-internal TCP costs based on application-layer knowledge, tightly couples with the NIC event model, and exploits low-latency hardware access. We compare our approach to the FreeBSD and Linux network stacks with nginx as the web server, demonstrating ∼3.5x throughput improvement, while experiencing low CPU utilization, linear scaling on multicore systems, and saturating current NIC hardware

    Mitigating Interference During Virtual Machine Live Migration through Storage Offloading

    Get PDF
    Today\u27s cloud landscape has evolved computing infrastructure into a dynamic, high utilization, service-oriented paradigm. This shift has enabled the commoditization of large-scale storage and distributed computation, allowing engineers to tackle previously untenable problems without large upfront investment. A key enabler of flexibility in the cloud is the ability to transfer running virtual machines across subnets or even datacenters using live migration. However, live migration can be a costly process, one that has the potential to interfere with other applications not involved with the migration. This work investigates storage interference through experimentation with real-world systems and well-established benchmarks. In order to address migration interference in general, a buffering technique is presented that offloads the migration\u27s read, eliminating interference in the majority of scenarios

    Network stack specialization for performance

    Full text link

    High speed network traffic analysis with commodity multi-core systems

    Full text link

    Dynamic Optical Networks for Data Centres and Media Production

    Get PDF
    This thesis explores all-optical networks for data centres, with a particular focus on network designs for live media production. A design for an all-optical data centre network is presented, with experimental verification of the feasibility of the network data plane. The design uses fast tunable (< 200 ns) lasers and coherent receivers across a passive optical star coupler core, forming a network capable of reaching over 1000 nodes. Experimental transmission of 25 Gb/s data across the network core, with combined wavelength switching and time division multiplexing (WS-TDM), is demonstrated. Enhancements to laser tuning time via current pre-emphasis are discussed, including experimental demonstration of fast wavelength switching (< 35 ns) of a single laser between all combinations of 96 wavelengths spaced at 50 GHz over a range wider than the optical C-band. Methods of increasing the overall network throughput by using a higher complexity modulation format are also described, along with designs for line codes to enable pulse amplitude modulation across the WS-TDM network core. The construction of an optical star coupler network core is investigated, by evaluating methods of constructing large star couplers from smaller optical coupler components. By using optical circuit switches to rearrange star coupler connectivity, the network can be partitioned, creating independent reserves of bandwidth and resulting in increased overall network throughput. Several topologies for constructing a star from optical couplers are compared, and algorithms for optimum construction methods are presented. All of the designs target strict criteria for the flexible and dynamic creation of multicast groups, which will enable future live media production workflows in data centres. The data throughput performance of the network designs is simulated under synthetic and practical media production traffic scenarios, showing improved throughput when reconfigurable star couplers are used compared to a single large star. An energy consumption evaluation shows reduced network power consumption compared to incumbent and other proposed data centre network technologies

    All-optical header processing in a 42.6Gb/s optoelectronic firewall

    Get PDF
    A novel architecture to enable future network security systems to provide effective protection in the context of continued traffic growth and the need to minimise energy consumption is proposed. It makes use of an all-optical pre-filtering stage operating at the line rate under software control to distribute incoming packets to specialised electronic processors. An experimental system that integrates software controls and electronic interfaces with an all-optical pattern recognition system has demonstrated the key functions required by the new architecture. As an example, the ability to sort packets arriving in a 42.6Gb/s data stream according to their service type was shown experimentally

    Hardware-Aware Algorithm Designs for Efficient Parallel and Distributed Processing

    Get PDF
    The introduction and widespread adoption of the Internet of Things, together with emerging new industrial applications, bring new requirements in data processing. Specifically, the need for timely processing of data that arrives at high rates creates a challenge for the traditional cloud computing paradigm, where data collected at various sources is sent to the cloud for processing. As an approach to this challenge, processing algorithms and infrastructure are distributed from the cloud to multiple tiers of computing, closer to the sources of data. This creates a wide range of devices for algorithms to be deployed on and software designs to adapt to.In this thesis, we investigate how hardware-aware algorithm designs on a variety of platforms lead to algorithm implementations that efficiently utilize the underlying resources. We design, implement and evaluate new techniques for representative applications that involve the whole spectrum of devices, from resource-constrained sensors in the field, to highly parallel servers. At each tier of processing capability, we identify key architectural features that are relevant for applications and propose designs that make use of these features to achieve high-rate, timely and energy-efficient processing.In the first part of the thesis, we focus on high-end servers and utilize two main approaches to achieve high throughput processing: vectorization and thread parallelism. We employ vectorization for the case of pattern matching algorithms used in security applications. We show that re-thinking the design of algorithms to better utilize the resources available in the platforms they are deployed on, such as vector processing units, can bring significant speedups in processing throughout. We then show how thread-aware data distribution and proper inter-thread synchronization allow scalability, especially for the problem of high-rate network traffic monitoring. We design a parallelization scheme for sketch-based algorithms that summarize traffic information, which allows them to handle incoming data at high rates and be able to answer queries on that data efficiently, without overheads.In the second part of the thesis, we target the intermediate tier of computing devices and focus on the typical examples of hardware that is found there. We show how single-board computers with embedded accelerators can be used to handle the computationally heavy part of applications and showcase it specifically for pattern matching for security-related processing. We further identify key hardware features that affect the performance of pattern matching algorithms on such devices, present a co-evaluation framework to compare algorithms, and design a new algorithm that efficiently utilizes the hardware features.In the last part of the thesis, we shift the focus to the low-power, resource-constrained tier of processing devices. We target wireless sensor networks and study distributed data processing algorithms where the processing happens on the same devices that generate the data. Specifically, we focus on a continuous monitoring algorithm (geometric monitoring) that aims to minimize communication between nodes. By deploying that algorithm in action, under realistic environments, we demonstrate that the interplay between the network protocol and the application plays an important role in this layer of devices. Based on that observation, we co-design a continuous monitoring application with a modern network stack and augment it further with an in-network aggregation technique. In this way, we show that awareness of the underlying network stack is important to realize the full potential of the continuous monitoring algorithm.The techniques and solutions presented in this thesis contribute to better utilization of hardware characteristics, across a wide spectrum of platforms. We employ these techniques on problems that are representative examples of current and upcoming applications and contribute with an outlook of emerging possibilities that can build on the results of the thesis
    • …
    corecore