101 research outputs found
On the Energy Efficiency of MapReduce Shuffling Operations in Data Centers
This paper aims to quantitatively measure the impact of different data centers networking topologies on the performance and energy efficiency of shuffling operations in MapReduce. Mixed Integer Linear Programming (MILP) models are utilized to optimize the shuffling in several data center topologies with electronic, hybrid, and all-optical switching while maximizing the throughput and reducing the power consumption. The results indicate that the networking topology has a significant impact on the performance of MapReduce. They also indicate that with comparable performance, optical-based data centers can achieve an average of 54% reduction in the energy consumption when compared to electronic switching data centers
FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short
We introduce FatPaths: a simple, generic, and robust routing architecture
that enables state-of-the-art low-diameter topologies such as Slim Fly to
achieve unprecedented performance. FatPaths targets Ethernet stacks in both HPC
supercomputers as well as cloud data centers and clusters. FatPaths exposes and
exploits the rich ("fat") diversity of both minimal and non-minimal paths for
high-performance multi-pathing. Moreover, FatPaths uses a redesigned "purified"
transport layer that removes virtually all TCP performance issues (e.g., the
slow start), and incorporates flowlet switching, a technique used to prevent
packet reordering in TCP networks, to enable very simple and effective load
balancing. Our design enables recent low-diameter topologies to outperform
powerful Clos designs, achieving 15% higher net throughput at 2x lower latency
for comparable cost. FatPaths will significantly accelerate Ethernet clusters
that form more than 50% of the Top500 list and it may become a standard routing
scheme for modern topologies
MaxHadoop: An Efficient Scalable Emulation Tool to Test SDN Protocols in Emulated Hadoop Environments
AbstractThis paper presents MaxHadoop, a flexible and scalable emulation tool, which allows the efficient and accurate emulation of Hadoop environments over Software Defined Networks (SDNs). Hadoop has been designed to manage endless data-streams over networks, making it a tailored candidate to support the new class of network services belonging to Big Data. The development of Hadoop is contemporary with the evolution of networks towards the new architectures "Software Defined." To create our emulation environment, tailored to SDNs, we employ MaxiNet, given its capability of emulating large-scale SDNs. We make it possible to emulate realistic Hadoop scenarios on large-scale SDNs using low-cost commodity hardware, by resolving a few key limitations of MaxiNet through appropriate configuration settings. We validate the MaxHadoop emulator by executing two benchmarks, namely WordCount and TeraSort, to evaluate a set of Key Performance Indicators. The tests' outcomes evidence that MaxHadoop outperforms other existing emulation tools running over commodity hardware. Finally, we show the potentiality of MaxHadoop by utilizing it to perform a comparison of SDN-based network protocols
- …