DeepOBS: A Deep Learning Optimizer Benchmark Suite
Because the choice and tuning of the optimizer affect the speed, and ultimately the performance, of deep learning, this area has attracted significant past and recent research. Yet, perhaps surprisingly, there is no generally
agreed-upon protocol for the quantitative and reproducible evaluation of
optimization strategies for deep learning. We suggest routines and benchmarks
for stochastic optimization, with special focus on the unique aspects of deep
learning, such as stochasticity, tunability and generalization. As the primary
contribution, we present DeepOBS, a Python package of deep learning
optimization benchmarks. The package addresses key challenges in the
quantitative assessment of stochastic optimizers, and automates most steps of
benchmarking. The library includes a wide and extensible set of ready-to-use
realistic optimization problems, such as training Residual Networks for image
classification on ImageNet or character-level language prediction models, as
well as popular classics like MNIST and CIFAR-10. The package also provides
realistic baseline results for the most popular optimizers on these test
problems, ensuring a fair comparison to the competition when benchmarking new optimizers without having to run costly experiments. It comes with output
back-ends that directly produce LaTeX code for inclusion in academic
publications. It supports TensorFlow and is available open source.
Comment: Accepted at ICLR 2019. 9 pages, 3 figures, 2 tables.
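As a usage illustration, the sketch below benchmarks TensorFlow's momentum optimizer on one of the packaged test problems. It follows the StandardRunner pattern from the DeepOBS documentation, but the exact runner and argument names vary between package versions, so treat the details as assumptions rather than a definitive interface.

```python
# Hedged sketch of benchmarking an optimizer with DeepOBS. The runner class
# and argument names follow the pattern in the DeepOBS documentation, but
# they differ between versions; treat the details as assumptions.
import tensorflow as tf
from deepobs import tensorflow as tfobs

# The optimizer under test, plus a declaration of its tunable hyperparameters
# so the runner can expose them (e.g. on the command line).
optimizer_class = tf.train.MomentumOptimizer
hyperparams = [{"name": "momentum", "type": float},
               {"name": "use_nesterov", "type": bool, "default": False}]

runner = tfobs.runners.StandardRunner(optimizer_class, hyperparams)

# Train on a packaged test problem; the runner writes JSON logs that the
# plotting back-end turns into figures and LaTeX tables.
runner.run(testproblem="cifar10_3c3d", learning_rate=0.01, momentum=0.9,
           num_epochs=100)
```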
BEEBS: Open Benchmarks for Energy Measurements on Embedded Platforms
This paper presents and justifies an open benchmark suite named BEEBS,
targeted at evaluating the energy consumption of embedded processors.
We explore the possible sources of energy consumption, then select individual
benchmarks from contemporary suites to cover these areas. Version one of BEEBS
is presented here and contains 10 benchmarks that cover a wide range of typical
embedded applications. The benchmark suite is portable across diverse
architectures and is freely available.
The benchmark suite is extensively evaluated, and the properties of its
constituent programs are analysed. Using real hardware platforms, we present case examples that illustrate the differences in power dissipation between three processor architectures and their related ISAs. We observe significant differences of 4.4x in the average instruction dissipation between the architectures, specifically 170 µW/MHz (ARM Cortex-M0), 65 µW/MHz (Adapteva Epiphany) and 88 µW/MHz (XMOS XS1-L1).
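A convenient property of the µW/MHz unit is that it is numerically equal to picojoules per clock cycle, so these averages can be compared directly as per-cycle energies. The short sketch below (plain Python, using the three figures from the abstract) makes the conversion explicit.

```python
# 1 uW/MHz = 1e-6 W / 1e6 Hz = 1e-12 J per cycle = 1 pJ/cycle, so the quoted
# averages can be read directly as per-cycle energies.
POWER_UW_PER_MHZ = {
    "ARM Cortex-M0": 170.0,
    "Adapteva Epiphany": 65.0,
    "XMOS XS1-L1": 88.0,
}

for core, uw_per_mhz in POWER_UW_PER_MHZ.items():
    pj_per_cycle = uw_per_mhz  # numerically identical after unit conversion
    print(f"{core}: {pj_per_cycle:.0f} pJ per clock cycle")

# Note: energy per *instruction* also depends on cycles per instruction,
# so per-instruction comparisons need not match these per-cycle ratios.
```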
BigDataBench: a Big Data Benchmark Suite from Internet Services
As architecture, systems, and data management communities pay greater
attention to innovative big data systems and architectures, the pressure of
benchmarking and evaluating these systems rises. Given the broad use of big data systems, big data benchmarks must cover a diversity of data and workloads. Most state-of-the-art big data benchmarking efforts target specific types of applications or system software stacks, and hence cannot serve this broader purpose. This paper
presents our joint research efforts on this issue with several industrial
partners. Our big data benchmark suite BigDataBench not only covers broad
application scenarios, but also includes diverse and representative data sets.
BigDataBench is publicly available from http://prof.ict.ac.cn/BigDataBench .
Also, we comprehensively characterize 19 big data workloads included in
BigDataBench with varying data inputs. On a typical state-of-practice
processor, an Intel Xeon E5645, we make the following observations. First, in comparison with traditional benchmarks, including PARSEC, HPCC, and SPEC CPU, the big data applications have very low operation intensity. Second, the volume of the input data has a non-negligible impact on micro-architecture characteristics, which may pose challenges for simulation-based big data architecture research. Last but not least, corroborating the observations in CloudSuite and DCBench (which use smaller data inputs), we find that the numbers of L1 instruction cache misses per 1000 instructions of the big data applications are higher than in the traditional benchmarks, and that L3 caches are effective for the big data applications, corroborating the observation in DCBench.
Comment: 12 pages, 6 figures. The 20th IEEE International Symposium on High Performance Computer Architecture (HPCA-2014), February 15-19, 2014, Orlando, Florida, US.
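Two of the metrics behind these observations, operation intensity and L1 instruction-cache misses per kilo-instruction (MPKI), are simple ratios over hardware performance-counter totals. The sketch below shows how they are computed; the counter values are hypothetical illustrations, not measurements from the paper.

```python
# Sketch of the two characterization metrics, computed from hardware
# performance-counter totals. The counter values below are hypothetical.

def mpki(misses: int, instructions: int) -> float:
    """Misses per 1000 instructions (e.g. L1 instruction-cache MPKI)."""
    return 1000.0 * misses / instructions

def operation_intensity(ops: int, bytes_moved: int) -> float:
    """Computation operations per byte of memory traffic; low values
    indicate memory-bound, data-movement-heavy workloads."""
    return ops / bytes_moved

# Hypothetical counters for one Hadoop job:
instructions = 4_200_000_000_000
l1i_misses   =    98_000_000_000
ops          = 1_100_000_000_000
dram_bytes   = 9_000_000_000_000

print(f"L1i MPKI:            {mpki(l1i_misses, instructions):.1f}")
print(f"Operation intensity: {operation_intensity(ops, dram_bytes):.3f} ops/byte")
```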
A Benchmark Suite for Template Detection and Content Extraction
Template detection and content extraction are two of the main areas of
information retrieval applied to the Web. They perform different analyses over
the structure and content of webpages to extract some part of the document.
However, their objectives differ: while template detection identifies the template of a webpage (usually by comparing it with other webpages of the same website), content extraction identifies the main content of the webpage, discarding the rest. The two tasks are therefore complementary, because the main content is not part of the template. It has been measured that
templates represent between 40% and 50% of data on the Web. Therefore,
identifying templates is essential for indexing tasks because templates usually
contain irrelevant information such as advertisements, menus and banners.
Processing and storing this information is likely to lead to a waste of
resources (storage space, bandwidth, etc.). Similarly, identifying the main
content is essential for many information retrieval tasks. In this paper, we
present a benchmark suite to test different approaches for template detection
and content extraction. The suite is public and contains real, heterogeneous webpages that have been labelled so that different techniques can be suitably (and automatically) compared.
Comment: 13 pages, 3 tables.
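Comparing extractors against labelled pages typically reduces to scoring the extracted text against the gold main content. The sketch below uses word-level precision/recall/F1, a common scheme for this task, though not necessarily the suite's own metric.

```python
# Scoring a content extractor against a labelled page with bag-of-words
# precision/recall/F1. A common evaluation scheme for content extraction,
# not necessarily the exact metric used by the benchmark suite.
from collections import Counter

def score(extracted: str, gold: str) -> tuple[float, float, float]:
    ext, ref = Counter(extracted.split()), Counter(gold.split())
    overlap = sum((ext & ref).values())  # each word credited at most once
    precision = overlap / max(sum(ext.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return precision, recall, f1

gold = "benchmark suites make template detection techniques comparable"
extracted = "home login benchmark suites make template detection easy"
print(score(extracted, gold))  # penalises the leaked menu words
```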
ShenZhen transportation system (SZTS): a novel big data benchmark suite
Data analytics is at the core of the supply chain for both products and services in modern economies and societies. Big data workloads, however, are placing unprecedented demands on computing technologies, calling for a deep understanding and characterization of these emerging workloads. In this paper, we propose ShenZhen Transportation System (SZTS), a novel big data Hadoop benchmark suite comprising real-life transportation analysis applications with real-life input data sets from Shenzhen in China. SZTS uniquely focuses on a specific and real-life application domain, whereas other existing Hadoop benchmark suites, such as HiBench and CloudRank-D, consist of generic algorithms with synthetic inputs. We perform a cross-layer workload characterization at the microarchitecture level, the operating system (OS) level, and the job level, revealing unique characteristics of SZTS compared to existing Hadoop benchmarks as well as the general-purpose multi-core PARSEC benchmarks. We also study the sensitivity of workload behavior with respect to input data size, and we propose a methodology for identifying representative input data sets.
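The abstract does not spell out the representative-input methodology, but a common approach in workload characterization is to normalize per-input metrics, reduce dimensionality with PCA, cluster the inputs, and keep the one closest to each cluster centre. The sketch below (scikit-learn, with synthetic feature values) illustrates that assumed approach, not SZTS's exact procedure.

```python
# Hedged sketch of picking representative inputs via PCA plus clustering.
# This mirrors standard workload-similarity methodology and is an
# assumption, not the paper's exact procedure. Feature values are synthetic.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# rows = candidate input data sets, columns = characterization metrics
# (e.g. IPC, cache MPKIs, I/O rate).
rng = np.random.default_rng(0)
features = rng.normal(size=(12, 6))

reduced = PCA(n_components=2).fit_transform(
    StandardScaler().fit_transform(features))
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(reduced)

# The representative input for each cluster is the member nearest its centroid.
for c in range(km.n_clusters):
    members = np.flatnonzero(km.labels_ == c)
    dists = np.linalg.norm(reduced[members] - km.cluster_centers_[c], axis=1)
    print(f"cluster {c}: representative input #{members[np.argmin(dists)]}")
```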