
    Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks

    Manuscript submitted to Computer Communication Review

    As network bandwidth and delay increase, TCP becomes inefficient. Data-intensive applications over high-speed networks, such as computational grids, need a new transport protocol to support them. This paper describes a general-purpose, high-performance data transfer protocol as an application-level solution. The protocol, named UDT (UDP-based Data Transfer protocol), works above UDP with reliability and congestion control mechanisms. UDT uses both positive and negative acknowledgements to guarantee data reliability. It combines rate-based and window-based congestion control mechanisms to ensure efficiency and fairness, including two particular fairness objectives: TCP friendliness and delay independence. Both simulation and implementation results show that UDT meets these objectives very well. This paper describes the details of the UDT protocol with simulation and implementation results and analysis.
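    The reliability scheme the abstract mentions can be sketched as follows: the receiver returns a cumulative positive ACK for in-order data, and a negative acknowledgement (NAK) listing sequence numbers it detects as missing. This is a minimal toy model for illustration, not the actual UDT packet format or API.

    ```python
    class NakReceiver:
        """Toy model of UDT-style reliability: cumulative ACKs plus
        NAKs for gaps detected in the incoming sequence space."""

        def __init__(self):
            self.expected = 0      # next in-order sequence number
            self.buffered = set()  # out-of-order packets held back

        def on_packet(self, seq):
            naks = []
            if seq > self.expected:
                # Gap detected: immediately NAK every missing number,
                # so the sender can retransmit without waiting for a timeout.
                naks = list(range(self.expected, seq))
            self.buffered.add(seq)
            # Advance the in-order frontier past any buffered packets.
            while self.expected in self.buffered:
                self.buffered.discard(self.expected)
                self.expected += 1
            ack = self.expected    # cumulative positive acknowledgement
            return ack, naks
    ```

    For example, receiving packets 0, 2, 1 in that order yields ACK 1, then ACK 1 with NAK [1], then ACK 3 once the gap is filled. Reporting losses explicitly via NAKs is what lets a protocol like this recover quickly on high bandwidth-delay product paths, where timeout-driven recovery is costly.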

    Lessons learned from a year's worth of benchmarks of large data clouds

    In this paper, we discuss some of the lessons we have learned working with the Hadoop and Sector/Sphere systems. Both are cloud-based systems designed to support data-intensive computing, and both include distributed file systems with closely coupled systems for processing data in parallel. Hadoop uses MapReduce, while Sphere supports the execution of an arbitrary user-defined function over the data managed by Sector. We compare and contrast these systems and discuss some of the design trade-offs necessary in data-intensive computing. In our experimental studies over the past year, Sector/Sphere has consistently performed about 2–4 times faster than Hadoop. We discuss some of the reasons that might be responsible for this difference in performance.
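    The programming-model contrast the abstract draws can be illustrated with a toy word count: MapReduce constrains computation to a map phase followed by a reduce phase, while Sphere-style processing applies one arbitrary user-defined function to each data segment. The function and variable names below are illustrative, not the Sector/Sphere or Hadoop APIs.

    ```python
    from collections import Counter
    from functools import reduce

    # Two data segments, as a distributed file system might partition a file.
    segments = [["grid", "cloud"], ["cloud", "data", "cloud"]]

    # MapReduce style: a per-segment map phase, then a global reduce phase.
    def map_phase(segment):
        return Counter(segment)

    def reduce_phase(a, b):
        return a + b

    mr_counts = reduce(reduce_phase, map(map_phase, segments))

    # Sphere style: an arbitrary UDF runs over each segment. Here it happens
    # to compute the same counts, but it is not limited to the map/reduce
    # shape -- it could sort, filter, or join within a segment instead.
    def udf(segment):
        return Counter(segment)

    sphere_counts = reduce(lambda a, b: a + b, (udf(s) for s in segments))

    assert mr_counts == sphere_counts
    ```

    The trade-off sketched here is flexibility versus structure: the fixed map/reduce shape lets the framework schedule and shuffle automatically, while an arbitrary UDF gives the programmer more freedom per segment.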