Tracing Distributed Data Stream Processing Systems

Author: Benczúr András
Hermann Gábor
Szabo PGN
Zvara Zoltán
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Crossref

SZTAKI Publication Repository

Optimizing distributed data stream processing by tracing

Author: Balázs Barnabás Lóránt
Benczúr András, ifj
Szabó Péter
Zvara Zoltán
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

Heterogeneous mobile, sensor, IoT, smart environment, and social networking applications have recently started to produce unbounded, fast, and massive-scale streams of data that have to be processed “on the fly”. Systems that process such data have to be enhanced with detection for operational exceptions and with triggers for both automated and manual operator actions. In this paper, we illustrate how tracing in distributed data processing systems can be applied to detecting changes in data and operational environment to maintain the efficiency of heterogeneous data stream processing systems under potentially changing data quality and distribution. By tracing individual input records, we can (1) identify outliers in a web crawling and document processing system and use the insights to define URL filtering rules; (2) identify heavy keys, such as NULL, that should be filtered before processing; (3) give hints to improve the key-based partitioning mechanisms; and (4) measure the limits of overpartitioning if heavy thread-unsafe libraries are imported. Using Apache Spark as an illustration, we show how various data stream processing efficiency issues can be mitigated or optimized by our distributed tracing engine. We describe and qualitatively compare two different designs, one based on reporting to a distributed database and another based on trace piggybacking. Our prototype implementation consists of wrappers suitable for JVM environments in general, with minimal impact on the source code of the core system. Our tracing framework is the first to solve tracing in multiple systems across boundaries and to provide detailed performance measurements suitable for automated optimization, not just debugging.
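
Item (2) of the abstract, spotting heavy keys such as NULL from traced records, can be illustrated with a minimal sketch. This is not the authors' tracing engine and does not use Spark: the function name `find_heavy_keys`, the `threshold` parameter, and the sample data are all hypothetical, and the traced keys are assumed to be already collected into a plain Python list.

```python
from collections import Counter


def find_heavy_keys(traced_keys, threshold=0.2):
    """Return {key: share} for keys whose share of traced records exceeds `threshold`.

    `traced_keys` is assumed to be a sample of partitioning keys gathered by
    tracing individual records (hypothetical input format).
    """
    counts = Counter(traced_keys)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items() if n / total > threshold}


# Hypothetical sample: None stands in for a NULL key that dominates the stream.
sample = [None, "a", None, "b", None, None, "c", "a", None, None]
heavy = find_heavy_keys(sample, threshold=0.3)
print(heavy)
```

Keys reported this way are candidates for filtering before the keyed stage, or for special handling in the partitioner; the right threshold depends on the number of partitions and is left as an assumption here.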

SZTAKI Publication Repository

Characterizing Deep-Learning I/O Workloads in TensorFlow

Author: Chien Steven W. D.
Herman Pawel
Laure Erwin
Markidis Stefano
Narasimhamurthy Sai
Santos Luis
Sishtla Chaitanya Prasad
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/10/2018

The performance of Deep-Learning (DL) computing frameworks relies on the performance of data ingestion and checkpointing. In fact, during training, a considerably high number of relatively small files are first loaded and pre-processed on CPUs and then moved to the accelerator for computation. In addition, checkpointing and restart operations are carried out to allow DL computing frameworks to restart quickly from a checkpoint. Because of this, I/O affects the performance of DL applications. In this work, we characterize the I/O performance and scaling of TensorFlow, an open-source programming framework developed by Google and specifically designed for solving DL problems. To measure TensorFlow I/O performance, we first design a micro-benchmark to measure TensorFlow reads, and then use a TensorFlow mini-application based on AlexNet to measure the performance cost of I/O and checkpointing in TensorFlow. To improve the checkpointing performance, we design and implement a burst buffer. We find that increasing the number of threads increases TensorFlow bandwidth by a maximum of 2.3x and 7.8x on our benchmark environments. The use of the TensorFlow prefetcher results in a complete overlap of computation on the accelerator and the input pipeline on the CPU, eliminating the effective cost of I/O on the overall performance. The use of a burst buffer to checkpoint to a fast, small-capacity storage and asynchronously copy the checkpoints to a slower, large-capacity storage resulted in a performance improvement of 2.6x with respect to checkpointing directly to slower storage on our benchmark environment. Comment: Accepted for publication at pdsw-DISCS 201
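
The burst-buffer idea in the abstract (write the checkpoint to fast storage, then copy it to slow storage in the background) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the authors' implementation and not a TensorFlow API: `checkpoint_with_burst_buffer`, the directory names, and the file name are hypothetical, and two temporary directories stand in for the fast and slow storage tiers.

```python
import os
import shutil
import tempfile
import threading


def checkpoint_with_burst_buffer(data, name, fast_dir, slow_dir):
    """Write `data` to fast storage synchronously, then drain it to slow storage.

    Returns the background thread so the caller can wait for the drain
    before removing the fast copy.
    """
    fast_path = os.path.join(fast_dir, name)
    with open(fast_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the checkpoint reached the fast tier

    def drain():
        # Asynchronous copy to the slower, larger tier.
        shutil.copy2(fast_path, os.path.join(slow_dir, name))

    worker = threading.Thread(target=drain)
    worker.start()
    return worker


# Temporary directories stand in for the two storage tiers (assumption).
fast_dir = tempfile.mkdtemp(prefix="fast_")
slow_dir = tempfile.mkdtemp(prefix="slow_")
worker = checkpoint_with_burst_buffer(b"model-state", "ckpt-0001.bin", fast_dir, slow_dir)
# Training could continue here while the copy runs in the background.
worker.join()
```

The caller blocks only for the write to the fast tier; the cost of the slow tier is hidden as long as the drain finishes before the next checkpoint is due.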

arXiv.org e-Print Archive

Crossref

Scipedia

BSML: A Binding Schema Markup Language for Data Interchange in Problem Solving Environments (PSEs)

Author: Bae Kyung Kyoon
He Jian
Jiang Jing
Ramakrishnan Naren
Rappaport Theodore S.
Shaffer Clifford A.
Tranter William H.
Verstak Alex
Watson Layne T.
Publication date: 18/02/2002

We describe a binding schema markup language (BSML) for describing data interchange between scientific codes. Such a facility is an important constituent of scientific problem solving environments (PSEs). BSML is designed to integrate with a PSE or application composition system that views model specification and execution as a problem of managing semistructured data. The data interchange problem is addressed by three techniques for processing semistructured data: validation, binding, and conversion. We present BSML and describe its application to a PSE for wireless communications system design.

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals