Increasing Performances of TCP Data Transfers Through Multiple Parallel Connections
Although Transmission Control Protocol (TCP) is a widely deployed and successful protocol, it shows some limitations in present-day environments. In particular, it is unable to exploit multiple (physical or logical) paths between two hosts. This paper presents PATTHEL, a session-layer solution designed for parallelizing stream data transfers. Parallelization is achieved by striping the data flow among multiple TCP channels. This solution does not require invasive changes to the networking stack and can be implemented entirely in user space. Moreover, it is flexible enough to suit several scenarios - e.g. it can be used to split a data transfer among multiple relays within a peer-to-peer overlay networ
Predicting Intermediate Storage Performance for Workflow Applications
Configuring a storage system to better serve an application is a challenging
task complicated by a multidimensional, discrete configuration space and the
high cost of space exploration (e.g., by running the application with different
storage configurations). To enable selecting the best configuration in a
reasonable time, we design an end-to-end performance prediction mechanism that
estimates the turn-around time of an application using a storage system under a
given configuration. This approach focuses on a generic object-based storage
system design, supports exploring the impact of optimizations targeting
workflow applications (e.g., various data placement schemes) in addition to
other, more traditional, configuration knobs (e.g., stripe size or replication
level), and models the system operation at data-chunk and control message
level.
This paper presents our experience to date with designing and using this
prediction mechanism. We evaluate this mechanism using micro- as well as
synthetic benchmarks mimicking real workflow applications, and a real
application. A preliminary evaluation shows that we are on track to
meet our objectives: it can scale to model a workflow application run on an
entire cluster while offering an over 200x speedup factor (normalized by
resource) compared to running the actual application, and can achieve, in the
limited number of scenarios we study, a prediction accuracy that enables
identifying the best storage system configuration
Packet Striping for Multi-Interfaces
In future mobile systems, end terminals will be considerably more diverse than they are today, and users will have a greater choice of access technologies offering different QoS, cost, security and so on. A mobile terminal equipped with multiple interfaces can achieve a much higher bandwidth by aggregating the bandwidth offered by the individual networks. In this paper, we present a system based on Mobile IPv6 that achieves these objectives. We discuss in detail the architectural requirements and algorithms needed to support this system. We also extend the proposed scheme to handle the case in which the correspondent node also has multiple interfaces. Simulation results show that the proposed algorithm can uniformly distribute data packets among multiple channels and deliver the packets perfectly in order at the receiver to achieve bandwidth aggregation
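As a rough illustration of the idea of striping packets across channels and restoring order at the receiver, the sketch below uses plain round-robin assignment with sequence numbers and a resequencing buffer. It is not the algorithm proposed in the paper, which the abstract does not specify; the names `stripe` and `Resequencer` are hypothetical and chosen only for this example.

```python
import heapq
from collections import deque


def stripe(packets, num_channels):
    """Tag each packet with a sequence number and assign it round-robin to a channel."""
    channels = [deque() for _ in range(num_channels)]
    for seq, payload in enumerate(packets):
        channels[seq % num_channels].append((seq, payload))
    return channels


class Resequencer:
    """Receiver-side buffer that releases payloads strictly in sequence order."""

    def __init__(self):
        self.next_seq = 0
        self.pending = []  # min-heap of (seq, payload), keyed by unique seq

    def receive(self, seq, payload):
        """Accept one packet; return the payloads that can now be delivered in order."""
        heapq.heappush(self.pending, (seq, payload))
        released = []
        while self.pending and self.pending[0][0] == self.next_seq:
            released.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
        return released


def deliver_slow_first(packets, num_channels):
    """Simulate asymmetric paths by draining the last channel first, then the others."""
    channels = stripe(packets, num_channels)
    rx = Resequencer()
    delivered = []
    for channel in reversed(channels):
        while channel:
            delivered.extend(rx.receive(*channel.popleft()))
    return delivered
```

Even when one channel's packets all arrive before the other's, `deliver_slow_first` returns the payloads in their original order, because the buffer holds early arrivals until the missing sequence numbers show up.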
Performance evaluation of multipath transport protocol in heterogeneous network environments
The performance of multipath transport protocols is known to be sensitive to path asymmetry. Differences between paths in bandwidth, delay and packet loss can significantly decrease the overall performance of a data flow carried over multiple asymmetric paths. In this paper, we evaluate and analyse reliable data transfer in the Concurrent Multipath Transfer extension of the Stream Control Transmission Protocol (CMT-SCTP) under various conditions of network asymmetry, with a focus on the use case where 3G and Wi-Fi networks are simultaneously available. We identify various causes of performance degradation, review the impact of the CMT-SACK extension under path asymmetry and show that the total achievable goodput of a reliable in-order data flow over multiple heterogeneous paths is ruled by the characteristics of the worst path as perceived by the transport protocol. To support our study, we derive a simple analytical model of receiver window blocking and validate it via simulation
Instruction-Level Execution Migration
We introduce the Execution Migration Machine (EM²), a novel data-centric multicore memory system architecture based on computation migration. Unlike traditional distributed memory multicores, which rely on complex cache coherence protocols to move the data to the core where the computation is taking place, our scheme always moves the computation to the core where the data resides. By doing away with the cache coherence protocol, we are able to boost the effectiveness of per-core caches while drastically reducing hardware complexity. To evaluate the potential of EM² architectures, we developed a series of PIN/Graphite-based models of an EM² multicore with 64 x86 cores and, under some simplifying assumptions (a timing model restricted to data memory performance, no instruction cache modeling, high-bandwidth fixed-latency interconnect allowing concurrent migrations), compared them against corresponding directory-based cache-coherent architecture models. We justify our assumptions and show that our conclusions are valid even if our assumptions are removed. Experimental results on a range of SPLASH-2 and PARSEC benchmarks indicate that EM² can significantly improve per-core cache performance, decreasing overall miss rates by as much as 84% and reducing average memory latency by up to 58%
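The principle of moving the computation to the data can be pictured with a toy trace replay. This is only a sketch under stated assumptions, not the EM² hardware design or the PIN/Graphite models: the static address-to-core mapping in `home_core`, the 64-byte block size, and the function names are all hypothetical.

```python
def home_core(address, num_cores, block_size=64):
    """Assumed static mapping: each block of addresses has exactly one home core."""
    return (address // block_size) % num_cores


def run_thread(addresses, start_core, num_cores, block_size=64):
    """Replay a memory trace, migrating the thread whenever the data lives on another core.

    Returns the number of migrations and the sequence of cores the thread ran on.
    """
    core = start_core
    migrations = 0
    visited = [core]
    for address in addresses:
        target = home_core(address, num_cores, block_size)
        if target != core:
            # The data is never copied; the thread moves to the data's home core.
            core = target
            migrations += 1
            visited.append(core)
    return migrations, visited
```

For example, `run_thread([0, 8, 64, 72, 128, 0], start_core=0, num_cores=4)` migrates three times, visiting cores 0, 1, 2 and then 0 again; consecutive accesses to the same block cost no migration.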