ALOJA: A benchmarking and predictive platform for big data performance analysis

Author: Berral García Josep Lluís
Carrera Pérez David
Poggi Nicolas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The main goals of the ALOJA research project from BSC-MSR are to explore and automate the characterization of the cost-effectiveness of Big Data deployments. In its first year, the project has produced an open source benchmarking platform, an online public repository of results with over 42,000 Hadoop job runs, and web-based analytic tools to gather insights about system cost-performance. This article describes the evolution of the project's focus and research lines over more than a year of continuously benchmarking Hadoop under different configuration and deployment options, presents results, and discusses the technical and market-based motivation for these changes. During this time, ALOJA's target has evolved from low-level profiling of the Hadoop runtime, through extensive benchmarking and evaluation of a large body of results via aggregation, to currently leveraging Predictive Analytics (PA) techniques. Modeling benchmark executions allows us to estimate the results of new or untested configurations or hardware set-ups automatically, by learning from past observations, saving benchmarking time and costs. This work is partially supported by the BSC-Microsoft Research Centre, the Spanish Ministry of Education (TIN2012-34557), the MINECO Severo Ochoa Research program (SEV-2011-0067) and the Generalitat de Catalunya (2014-SGR-1051). Peer Reviewed. Postprint (author's final draft).

Crossref

UPCommons. Portal del coneixement obert de la UPC

BigDataBench: a Big Data Benchmark Suite from Internet Services

Author: Gao Wanling
He Yongqiang
Jia Zhen
Li Xiaona
Lu Gang
Luo Chunjie
Qiu Bizhu
Shi Yingjie
Wang Lei
Yang Qiang
Zhan Jianfeng
Zhan Kent
Zhang Shujie
Zheng Chen
Zhu Yuqing
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/02/2014

As the architecture, systems, and data management communities pay greater attention to innovative big data systems and architectures, the pressure to benchmark and evaluate these systems rises. Considering the broad use of big data systems, big data benchmarks must include a diversity of data and workloads. Most state-of-the-art big data benchmarking efforts target specific types of applications or system software stacks, and hence they are not suited to these purposes. This paper presents our joint research efforts on this issue with several industrial partners. Our big data benchmark suite BigDataBench not only covers broad application scenarios, but also includes diverse and representative data sets. BigDataBench is publicly available from http://prof.ict.ac.cn/BigDataBench. We also comprehensively characterize 19 big data workloads included in BigDataBench with varying data inputs. On a typical state-of-practice processor, the Intel Xeon E5645, we make the following observations. First, compared with traditional benchmarks, including PARSEC, HPCC, and SPECCPU, big data applications have very low operation intensity. Second, the volume of data input has a non-negligible impact on micro-architecture characteristics, which may impose challenges for simulation-based big data architecture research. Last but not least, corroborating the observations in CloudSuite and DCBench (which use smaller data inputs), we find that the numbers of L1 instruction cache misses per 1000 instructions of the big data applications are higher than in the traditional benchmarks; we also find that L3 caches are effective for the big data applications, corroborating the observation in DCBench. Comment: 12 pages, 6 figures, The 20th IEEE International Symposium On High Performance Computer Architecture (HPCA-2014), February 15-19, 2014, Orlando, Florida, US

arXiv.org e-Print Archive

Crossref

Evaluating the benefits of key-value databases for scientific applications

Author: Becerra Fontal Yolanda
Gil Eloy
Glock Philipp
Oden Lena
Santamaria Mateu Pol
Sirvent Raül
Torres Viñals Jordi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

The convergence of Big Data applications with High-Performance Computing requires new methodologies to store, manage and process large amounts of information. Traditional storage solutions are unable to scale, which results in complex coding strategies. For example, the brain atlas of the Human Brain Project faces the challenge of processing large amounts of high-resolution brain images. Given the computing needs, we study the effects of replacing a traditional storage system with a distributed Key-Value database on a cell segmentation application. The original code uses HDF5 files on GPFS through an intricate interface, imposing synchronizations. In contrast, by using Apache Cassandra or ScyllaDB through Hecuba, the application code is greatly simplified. Thanks to the Key-Value data model, the number of synchronizations is reduced and the time dedicated to I/O scales when increasing the number of nodes. This project/research has received funding from the European Union's Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreement No. 720270 (Human Brain Project SGA1) and the Specific Grant Agreement No. 785907 (Human Brain Project SGA2). This work has also been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat de Catalunya (contract 2017-SGR-1414). Postprint (author's final draft).

Crossref

UPCommons. Portal del coneixement obert de la UPC

Experimental Performance Evaluation of Cloud-Based Analytics-as-a-Service

Author: Carra Damiano
Michiardi Pietro
Milanesio Marco
Pace Francesco
Venzano Daniele
Publication date: 01/01/2016

An increasing number of Analytics-as-a-Service solutions have recently emerged in the landscape of cloud-based services. These services allow flexible composition of compute and storage components that create powerful data ingestion and processing pipelines. This work is a first attempt at an experimental evaluation of the performance of analytic applications executed using a wide range of storage service configurations. We present an intuitive notion of data locality, which we use as a proxy to rank different service compositions in terms of expected performance. Through an empirical analysis, we dissect the performance achieved by analytic workloads and unveil problems due to the impedance mismatch that arises in some configurations. Our work paves the way to a better understanding of modern cloud-based analytic services and their performance, both for their end-users and their providers. Comment: Longer version of the paper in Submission at IEEE CLOUD'1

arXiv.org e-Print Archive

Crossref

Catalogo dei prodotti della ricerca

Scipedia

Contours of Inclusion: Inclusive Arts Teaching and Learning

Author: Bill Henderson
Deborah Kronenberg
Don Glass
Kati Blair
Leah Barnum
Nicole Agois Hurel
Richard Jenkins
Publication venue: 'MRVSA Publishing House'
Publication date: 06/06/2010

The purpose of this publication is to share models and case examples of the process of inclusive arts curriculum design and evaluation. The first section explains the conceptual and curriculum frameworks that were used in the analysis and generation of the featured case studies (i.e. Understanding by Design, Differentiated Instruction, and Universal Design for Learning). Data for the case studies was collected from three urban sites (i.e. Los Angeles, San Francisco, and Boston) and included participant observations, student and teacher interviews, curriculum documentation, digital documentation of student learning, and transcripts of discussion forum and teleconference discussions from a professional learning community. The initial case studies by Glass and Barnum use the curricular frameworks to analyze and understand what inclusive practices look like in two case studies of arts-in-education programs that included students with disabilities. The second set of precedent case studies, by Kronenberg and Blair and by Jenkins and Agois Hurel, uses the frameworks to explain their process of including students by providing flexible arts learning options to support student learning of content standards. Both sets of case studies illuminate curricular design decisions and instructional strategies that supported the active engagement and learning of students with disabilities in educational settings shared with their peers. The second set of cases also illustrates the reflective process of using frameworks like Universal Design for Learning (UDL) to guide curricular design, responsive instructional differentiation, and the use of the arts as a rich, meaningful, and engaging option to support learning. Appended are curriculum design and evaluation tools. (Individual chapters contain references.)

IssueLab

Disrupting Illicit Supply Networks: New Applications of Operations Research and Data Analytics to End Modern Slavery

Author: Busch-Armendariz Noël
Kammer-Kerwick Matt
Talley McKenna
Publication venue: Bureau of Business Research
Publication date: 01/05/2018

Report from a 2017 National Science Foundation workshop on promising research directions for applications of operations research and data analytics toward the disruption of illicit supply networks like human trafficking. The workshop was funded by the NSF’s Operations Engineering (ENG) and the Law & Social Sciences Program (SBE) under grant # CMMI-1726895. The report addresses the opportunity to apply advances from the fields of operations research, management science, analytics, machine learning, and data science toward the development of disruptive interventions against illicit networks. Such an extension of the current research agenda for trafficking would move understanding of such dynamic systems from descriptive characterization and predictive estimation toward improved dynamic operational control.

Texas ScholarWorks

Advanced Techniques for Assets Maintenance Management

Author: Crespo Márquez Adolfo
Fuente Antonio de la
González-Prida Vicente
Guillén López Antonio Jesús
Gómez Fernández Juan Francisco
Publication venue: IFAC (International Federation of Automatic Control) - Elsevier
Publication date: 01/01/2018

16th IFAC Symposium on Information Control Problems in Manufacturing INCOM 2018 Bergamo, Italy, 11–13 June 2018. Edited by Marco Macchi, László Monostori, Roberto Pinto. The aim of this paper is to highlight the importance of new and advanced techniques that support decision making in different business processes for maintenance and assets management, as well as the basic need to adopt a management framework with a clear process map and the corresponding supporting IT systems. Framework processes and systems will be the key enablers for success and for continuous improvement. The suggested framework will help to define and improve business policies and work procedures for the operation and maintenance of assets along their life cycle. The following sections present some achievements in this area and finally propose possible future lines for a research agenda within the field of assets management.

idUS. Depósito de Investigación Universidad de Sevilla