
    Energy Efficient Data-Intensive Computing With Mapreduce

    Power and energy consumption are critical constraints in data center design and operation. In data centers, MapReduce data-intensive applications demand significant resources and energy. Recognizing the importance and urgency of optimizing the energy usage of MapReduce applications, this work aims to provide instrumental tools to measure and evaluate MapReduce energy efficiency and techniques to conserve energy without impacting performance. Energy conservation for data-intensive computing requires enabling technology to provide detailed and systemic energy information and to identify inefficiencies in the underlying system hardware and software. To address this need, we present eTune, a fine-grained, scalable energy profiling framework for data-intensive computing on large-scale distributed systems. eTune leverages performance monitoring counters (PMCs) on modern computer components and statistically builds power-performance correlation models. Using the learned models, eTune augments direct measurement with a software-based power estimator that runs on compute nodes and reports power at multiple levels, including node, core, memory, and disks, with high accuracy. Data-intensive computing differs from traditional high-performance computing in that most execution time is spent moving data between storage devices, nodes, and components. Since data movements are potential performance and energy bottlenecks, we propose an analysis framework with methods and metrics for evaluating and characterizing costly built-in MapReduce data movements. The revealed data movement energy characteristics can be exploited in system design and resource allocation to improve data-intensive computing energy efficiency. Finally, we present an optimization technique that targets inefficient built-in MapReduce data movements to conserve energy without impacting performance. 
The optimization technique allocates the optimal number of compute nodes to applications and dynamically schedules processor frequency during their execution based on data movement characteristics. Experimental results show significant energy savings, though improvements depend on both workload characteristics and the policies for resource and dynamic voltage and frequency scheduling. As data volume doubles every two years and more data centers are put into production, energy consumption is expected to grow further. We expect these studies to provide direction and insight for building more energy-efficient data-intensive systems and applications, and the tools and techniques to be adopted by other researchers for their own energy-efficiency studies.

    Large-scale photonic natural language processing

    Modern machine-learning applications require huge artificial networks demanding computational power and memory. Light-based platforms promise ultrafast and energy-efficient hardware, which may help realize next-generation data processing devices. However, current photonic networks are limited by the number of input-output nodes that can be processed in a single shot. This restricted network capacity prevents their application to relevant large-scale problems such as natural language processing. Here, we realize a photonic processor for supervised learning with a capacity exceeding 1.5 × 10^10 optical nodes, more than one order of magnitude larger than any previous implementation, which enables photonic large-scale text encoding and classification. By exploiting the full three-dimensional structure of the optical field propagating in free space, we overcome the interpolation threshold and reach the over-parameterized region of machine learning, a condition that allows high-performance sentiment analysis with a minimal fraction of training points. Our results provide a novel solution to scale up light-driven computing and open the route to photonic natural language processing.