The Implications of Diverse Applications and Scalable Data Sets in Benchmarking Big Data Systems
We now live in an era of big data, and big data applications are becoming
more and more pervasive. How to benchmark data center computer systems running
big data applications (in short, big data systems) is a hot topic. In this
paper, we focus on measuring the performance impacts of diverse applications
and scalable volumes of data sets on big data systems. For four typical data
analysis applications, an important class of big data applications, we find
two major results through experiments. First, the data scale has a significant
impact on the performance of big data systems, so we must provide scalable
volumes of data sets in big data benchmarks. Second, even though all four
applications use simple algorithms, their performance trends differ
with increasing data scales, and hence we must consider not only a
variety of data sets but also a variety of applications in benchmarking big data
systems.
Comment: 16 pages, 3 figures
ALOJA: A framework for benchmarking and predictive analytics in Hadoop deployments
This article presents the ALOJA project and its analytics tools, which leverage machine learning to interpret Big Data benchmark performance data and tuning. ALOJA is part of a long-term collaboration between BSC and Microsoft to automate the characterization of cost-effectiveness of Big Data deployments, currently focusing on Hadoop. Hadoop presents a complex run-time environment, where costs and performance depend on a large number of configuration choices. The ALOJA project has created an open, vendor-neutral repository, featuring over 40,000 Hadoop job executions and their performance details. The repository is accompanied by a test-bed and tools to deploy and evaluate the cost-effectiveness of different hardware configurations, parameters and Cloud services. Despite early success within ALOJA, a comprehensive study requires automation of modeling procedures to allow an analysis of large and resource-constrained search spaces. The predictive analytics extension, ALOJA-ML, provides an automated system allowing knowledge discovery by modeling environments from observed executions. The resulting models can forecast execution behaviors, predicting execution times for new configurations and hardware choices. This also enables model-based anomaly detection and efficient benchmark guidance by prioritizing executions. In addition, the community can benefit from the ALOJA data sets and framework to improve the design and deployment of Big Data applications.
This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595). This work is partially supported by the Ministry of Economy of Spain under contracts TIN2012-34557 and 2014SGR1051.
Peer Reviewed. Postprint (published version)
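As a rough illustration of forecasting execution times from observed executions and flagging anomalies against the model, the following minimal sketch uses a k-nearest-neighbour predictor. It is not ALOJA-ML's actual method or API: the feature layout `(nodes, cores_per_node)`, the toy `history` values, and the function names `predict_time` and `is_anomalous` are all hypothetical, and features are left unscaled for brevity.

```python
import math

# Hypothetical toy history: ((nodes, cores_per_node), runtime in seconds).
# These values are invented for illustration and are not ALOJA data.
history = [
    ((4, 8), 620.0), ((4, 16), 540.0),
    ((8, 8), 330.0), ((8, 16), 290.0),
    ((16, 8), 180.0), ((16, 16), 150.0),
]

def predict_time(history, config, k=3):
    """Predict runtime as the mean over the k nearest observed configurations."""
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], config))[:k]
    return sum(runtime for _, runtime in nearest) / len(nearest)

def is_anomalous(history, config, observed, tolerance=0.5, k=3):
    """Flag a run whose observed time deviates from the prediction
    by more than `tolerance` (as a fraction of the predicted time)."""
    predicted = predict_time(history, config, k)
    return abs(observed - predicted) > tolerance * predicted

predicted = predict_time(history, (8, 12), k=2)
slow_run_flagged = is_anomalous(history, (8, 12), 900.0, k=2)
normal_run_flagged = is_anomalous(history, (8, 12), 320.0, k=2)
print(predicted, slow_run_flagged, normal_run_flagged)
```

A real model would need feature scaling and many more configuration dimensions; the sketch only shows the predict-then-compare pattern the abstract describes.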
Comparing Computing Platforms for Deep Learning on a Humanoid Robot
The goal of this study is to test two different computing platforms with
respect to their suitability for running deep networks as part of a humanoid
robot software system. One of the platforms is the CPU-centered Intel NUC7i7BNH
and the other is an NVIDIA Jetson TX2 system that puts more emphasis on GPU
processing. The experiments addressed a number of benchmarking tasks including
pedestrian detection using deep neural networks. Some of the results were
unexpected but demonstrate that platforms exhibit both advantages and
disadvantages when taking computational performance and electrical power
requirements of such a system into account.
Comment: 12 pages, 5 figures
Spark deployment and performance evaluation on the MareNostrum supercomputer
In this paper we present a framework to enable data-intensive Spark workloads on MareNostrum, a petascale supercomputer designed mainly for compute-intensive applications. As far as we know, this is the first attempt to investigate optimized deployment configurations of Spark on a petascale HPC setup. We detail the design of the framework and present some benchmark data to provide insights into the scalability of the system. We examine the impact of different configurations including parallelism, storage and networking alternatives, and we discuss several aspects of executing Big Data workloads on a computing system that is based on the compute-centric paradigm. Further, we derive conclusions aiming to pave the way towards systematic and optimized methodologies for fine-tuning data-intensive applications on large clusters, with an emphasis on parallelism configurations.
Peer Reviewed. Postprint (author's final draft)
Characterizing and Subsetting Big Data Workloads
Big data benchmark suites must include a diversity of data and workloads to
be useful in fairly evaluating big data systems and architectures. However,
using truly comprehensive benchmarks poses great challenges for the
architecture community. First, we need to thoroughly understand the behaviors
of a variety of workloads. Second, our usual simulation-based research methods
become prohibitively expensive for big data. As big data is an emerging field,
more and more software stacks are being proposed to facilitate the development
of big data applications, which aggravates these challenges. In this paper, we
first use Principal Component Analysis (PCA) to identify the most important
characteristics from 45 metrics to characterize big data workloads from
BigDataBench, a comprehensive big data benchmark suite. Second, we apply a
clustering technique to the principal components obtained from the PCA to
investigate the similarity among big data workloads, and we verify the
importance of including different software stacks for big data benchmarking.
Third, we select seven representative big data workloads by removing redundant
ones and release the BigDataBench simulation version, which is publicly
available from http://prof.ict.ac.cn/BigDataBench/simulatorversion/.
Comment: 11 pages, 6 figures, 2014 IEEE International Symposium on Workload Characterization
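The characterize-then-subset pipeline described in this abstract (PCA on workload metrics, clustering in principal-component space, one representative per cluster) can be sketched roughly as follows. This is a minimal standard-library illustration and not the authors' implementation: the toy `metrics` matrix (6 workloads by 3 metrics, rather than 45 metrics), the choice of single-linkage clustering, and all function names are assumptions made for the example.

```python
import math
import random

def standardize(rows):
    """Scale each metric (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def top_components(rows, k, iters=200):
    """Top-k principal directions via power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]
    rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    comps = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        comps.append(v)
        # Deflate: remove the found direction so the next pass finds the next one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return comps

def project(rows, comps):
    return [[sum(a * b for a, b in zip(r, c)) for c in comps] for r in rows]

def single_linkage(points, k):
    """Merge the two closest clusters until only k remain."""
    clusters = [[i] for i in range(len(points))]
    def gap(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)
    while len(clusters) > k:
        pairs = [(x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))]
        a, b = min(pairs, key=lambda p: gap(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters

def representative(points, members):
    """Pick the member closest to the cluster centroid."""
    centroid = [sum(points[i][d] for i in members) / len(members)
                for d in range(len(points[0]))]
    return min(members, key=lambda i: math.dist(points[i], centroid))

# Hypothetical toy data: rows are workloads, columns are metrics.
metrics = [
    [1.0, 10.0, 0.10], [1.1, 11.0, 0.12], [0.9, 9.5, 0.09],
    [5.0, 2.0, 0.90], [5.2, 2.1, 0.95], [4.8, 1.9, 0.88],
]
scaled = standardize(metrics)
scores = project(scaled, top_components(scaled, k=2))
clusters = single_linkage(scores, k=2)
representatives = [representative(scores, c) for c in clusters]
print(clusters, representatives)
```

Keeping one workload per cluster is what shrinks the suite: workloads that land close together in principal-component space are treated as redundant for simulation purposes.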
Evolutionary conservation of influenza A PB2 sequences reveals potential target sites for small molecule inhibitors.
The influenza A basic polymerase protein 2 (PB2) functions as part of a heterotrimer to replicate the viral RNA genome. To investigate novel PB2 antiviral target sites, this work identified evolutionarily conserved regions across the PB2 protein sequence amongst all sub-types and hosts, as well as ligand binding hot spots which overlap with highly conserved areas. Fifteen binding sites were predicted in different PB2 domains, some of which reside in areas of unknown function. Virtual screening of ~50,000 drug-like compounds showed binding affinities of up to 10.3 kcal/mol. The highest affinity molecules were found to interact with conserved residues including Gln138, Gly222, Ile529, Asn540 and Thr530. A library containing 1738 FDA-approved drugs was additionally screened, revealing Paliperidone as a top hit with a binding affinity of -10 kcal/mol. Predicted ligands are ideal leads for new antivirals as they were targeted to evolutionarily conserved binding sites.
On the acceleration of wavefront applications using distributed many-core architectures
In this paper we investigate the use of distributed graphics processing unit (GPU)-based architectures to accelerate pipelined wavefront applications, a ubiquitous class of parallel algorithms used for the solution of a number of scientific and engineering applications. Specifically, we employ a recently developed port of the LU solver (from the NAS Parallel Benchmark suite) to investigate the performance of these algorithms on high-performance computing solutions from NVIDIA (Tesla C1060 and C2050) as well as on traditional clusters (AMD/InfiniBand and IBM BlueGene/P). Benchmark results are presented for problem classes A to C and a recently developed performance model is used to provide projections for problem classes D and E, the latter of which represents a billion-cell problem. Our results demonstrate that while the theoretical performance of GPU solutions will far exceed that of many traditional technologies, the sustained application performance is currently comparable for scientific wavefront applications. Finally, a breakdown of the GPU solution is conducted, exposing PCIe overheads and decomposition constraints. A new k-blocking strategy is proposed to improve the future performance of this class of algorithm on GPU-based architectures.
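For readers unfamiliar with the wavefront pattern, the dependency structure can be sketched in a few lines. This is a generic 2D illustration, not the LU solver or the k-blocking strategy from the paper: the grid size, the boundary value of 1, and the `update` rule are assumptions chosen for the example. Each cell depends on its north and west neighbours, so all cells on one anti-diagonal are independent and could be computed in parallel.

```python
def wavefront_sweep(n, m, update, boundary=1):
    """Sweep an n x m grid one anti-diagonal at a time.

    Cell (i, j) needs (i-1, j) and (i, j-1); every cell with the same
    i + j is independent, which is what a parallel version would exploit.
    """
    grid = [[0] * m for _ in range(n)]
    for d in range(n + m - 1):                      # d = i + j
        for i in range(max(0, d - m + 1), min(n, d + 1)):
            j = d - i
            north = grid[i - 1][j] if i > 0 else boundary
            west = grid[i][j - 1] if j > 0 else boundary
            grid[i][j] = update(north, west)
    return grid

def row_major_sweep(n, m, update, boundary=1):
    """Sequential reference sweep used to check the wavefront ordering."""
    grid = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            north = grid[i - 1][j] if i > 0 else boundary
            west = grid[i][j - 1] if j > 0 else boundary
            grid[i][j] = update(north, west)
    return grid

add = lambda north, west: north + west
wave = wavefront_sweep(4, 5, add)
reference = row_major_sweep(4, 5, add)
print(wave == reference)
```

The inner loop over `i` is the part a GPU or multi-core implementation would distribute; the outer loop over `d` stays sequential, which is why the first and last diagonals limit the available parallelism.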