Performance Evaluation of Distributed Computing Environments with Hadoop and Spark Frameworks
Recently, due to the rapid development of information and communication
technologies, data are being created and consumed at an avalanche-like rate.
Distributed computing creates the preconditions for analyzing and processing such
Big Data by distributing the computations among a number of compute nodes. In
this work, performance of distributed computing environments on the basis of
Hadoop and Spark frameworks is estimated for real and virtual versions of
clusters. As a test task, we chose the classic use case of word counting in
texts of various sizes. The running times were found to grow very fast with
the dataset size, faster even than a power function. This tendency is similar
for the real and virtual cluster implementations and for both the Hadoop and
Spark frameworks. Moreover, speedup values decrease significantly as the
dataset size grows, especially for the virtual cluster configuration. The
problem of growing data generated by IoT and
multimodal (visual, sound, tactile, neuro and brain-computing, muscle and eye
tracking, etc.) interaction channels is presented. In the context of this
problem, the current observations as to the running times and speedup on Hadoop
and Spark frameworks in real and virtual cluster configurations can be very
useful for the proper scaling-up and efficient job management, especially for
machine learning and Deep Learning applications, where Big Data are widely
present.
Comment: 5 pages, 1 table, 2017 IEEE International Young Scientists Forum on
Applied Physics and Engineering (YSF-2017) (Lviv, Ukraine
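The word-counting test task named in the abstract above follows a map-and-merge pattern. The following is a minimal single-machine sketch of that pattern, not the Hadoop or Spark code used in the paper; the function names `count_words` and `word_count`, the tokenization rule, and the sample chunks are assumptions made for illustration. Threads are used only to show the structure; in CPython they do not provide CPU parallelism for this workload.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: str) -> Counter:
    """Map step: count lowercase words in one chunk of text."""
    return Counter(re.findall(r"[a-z']+", chunk.lower()))


def word_count(chunks, workers=4):
    """Reduce step: merge the per-chunk counts into one Counter."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_words, chunks):
            total.update(partial)
    return total


# Hypothetical input: each string stands in for one split of a larger text.
counts = word_count(["big data big clusters", "Big Data on Spark"])
```

In a real cluster the chunks would be file splits distributed across nodes, and the merge would happen in a shuffle and reduce phase rather than in a local loop.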
Sam2bam: High-Performance Framework for NGS Data Preprocessing Tools
This paper introduces a high-throughput software tool framework called
sam2bam that enables users to significantly speed up pre-processing for
next-generation sequencing data. sam2bam is especially efficient on
single-node multi-core large-memory systems. It can reduce the runtime of data
pre-processing in marking duplicate reads on a single node system by 156-186x
compared with de facto standard tools. sam2bam consists of parallel
software components that can fully utilize the multiple processors, available
memory, high-bandwidth storage, and hardware compression accelerators if
available.
sam2bam provides file format conversion between well-known genome file
formats, from SAM to BAM, as a basic feature. Additional features such as
analyzing, filtering, and converting the input data are provided by
plug-in tools, e.g., duplicate marking, which can be attached to sam2bam at
runtime.
We demonstrated that sam2bam could significantly reduce the runtime of NGS
data pre-processing from about two hours to about one minute for a whole-exome
data set on a 16-core single-node system using up to 130 GB of memory.
sam2bam could also reduce the runtime for whole-genome sequencing data from about 20
hours to about nine minutes on the same system using up to 711 GB of memory.
Batch Size Influence on Performance of Graphic and Tensor Processing Units during Training and Inference Phases
The impact of the maximum possible batch size (for better runtime) on the
performance of graphics processing units (GPUs) and tensor processing units (TPUs)
during the training and inference phases is investigated. Numerous runs of a
selected deep neural network (DNN) were performed on the standard MNIST and
Fashion-MNIST datasets. A significant speedup was obtained even for extremely
low-scale usage of Google TPUv2 units (8 cores only) in comparison to the quite
powerful NVIDIA Tesla K80 GPU card: up to 10x for the training
stage (without taking overheads into account) and up to 2x for the
prediction stage (with and without taking overheads into account). The precise
speedup values depend on the utilization level of the TPUv2 units and increase with
the volume of data being processed. For the datasets used in
this work (MNIST and Fashion-MNIST, with 28x28 images), the speedup was
observed for batch sizes >512 images in the training phase and >40 000 images in the
prediction phase. These results were obtained without
detriment to the prediction accuracy and loss, which were equal for the GPU and
TPU runs up to the 3rd significant digit for the MNIST dataset and up to the 2nd
significant digit for the Fashion-MNIST dataset.
Comment: 10 pages, 7 figures, 2 table
Teaching Concurrent Software Design: A Case Study Using Android
In this article, we explore various parallel and distributed computing topics
from a user-centric software engineering perspective. Specifically, in the
context of mobile application development, we study the basic building blocks
of interactive applications in the form of events, timers, and asynchronous
activities, along with related software modeling, architecture, and design
topics.
Comment: Submitted to CDER NSF/IEEE-TCPP Curriculum Initiative on Parallel and
Distributed Computing - Core Topics for Undergraduate
Evoplex: A platform for agent-based modeling on networks
Agent-based modeling and network science have been used extensively to
advance our understanding of emergent collective behavior in systems that are
composed of a large number of simple interacting individuals or agents. With
the increasing availability of high computational power in affordable personal
computers, dedicated efforts to develop multi-threaded, scalable and
easy-to-use software for agent-based simulations are needed more than ever.
Evoplex meets this need by providing a fast, robust and extensible platform for
developing agent-based models and multi-agent systems on networks. Each agent
is represented as a node and interacts with its neighbors, as defined by the
network structure. Evoplex is ideal for modeling complex systems, for example
in evolutionary game theory and computational social science. In Evoplex, the
models are not coupled to the execution parameters or the visualization tools,
and a user-friendly graphical interface makes it easy for all
users, from newcomers to experts, to create, analyze, replicate and
reproduce the experiments.
Comment: 6 pages, 5 figures; accepted for publication in SoftwareX [software
available at https://evoplex.org
SInC: An accurate and fast error-model based simulator for SNPs, Indels and CNVs coupled with a read generator for short-read sequence data
We report SInC (SNV, Indel and CNV) simulator and read generator, an
open-source tool capable of simulating biological variants taking into account
a platform-specific error model. SInC is capable of simulating and generating
single- and paired-end reads with user-defined insert size with high efficiency
compared to the other existing tools. SInC, due to its multi-threaded
capability during read generation, has a low time footprint. SInC is currently
optimised to work in limited infrastructure setup and can efficiently exploit
the commonly used quad-core desktop architecture to simulate short sequence
reads with deep coverage for large genomes. SInC can be downloaded from
https://sourceforge.net/projects/sincsimulator/
Software Model Checking with Explicit Scheduler and Symbolic Threads
In many practical application domains, the software is organized into a set
of threads, whose activation is exclusive and controlled by a cooperative
scheduling policy: threads execute, without any interruption, until they either
terminate or yield the control explicitly to the scheduler. The formal
verification of such software poses significant challenges. On the one side,
each thread may have infinite state space, and might call for abstraction. On
the other side, the scheduling policy is often important for correctness, and
an approach based on abstracting the scheduler may result in loss of precision
and false positives. Unfortunately, the translation of the problem into a
purely sequential software model checking problem turns out to be highly
inefficient for the available technologies. We propose a software model
checking technique that exploits the intrinsic structure of these programs.
Each thread is translated into a separate sequential program and explored
symbolically with lazy abstraction, while the overall verification is
orchestrated by the direct execution of the scheduler. The approach is
optimized by filtering the exploration of the scheduler with the integration of
partial-order reduction. The technique, called ESST (Explicit Scheduler,
Symbolic Threads), has been implemented and experimentally evaluated on a
significant set of benchmarks. The results demonstrate that the ESST technique is
far more effective than software model checking applied to the sequentialized
programs, and that partial-order reduction can lead to further performance
improvements.
Comment: 40 pages, 10 figures, accepted for publication in journal of logical
methods in computer scienc
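The cooperative scheduling policy described in the abstract above, where a thread runs without interruption until it terminates or explicitly yields, can be illustrated with Python generators. This is a sketch of the execution model only, under the assumption of a simple round-robin scheduler; it is not the ESST verification technique, and the names `worker` and `run` are hypothetical.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative thread: does one step, then yields control explicitly."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run(threads):
    """Round-robin scheduler: exactly one thread is active at a time."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # run until the next yield
            ready.append(thread)  # still alive, so schedule it again
        except StopIteration:
            pass                  # thread terminated, so drop it


log = []
run([worker("A", 2, log), worker("B", 3, log)])
```

Because no thread is preempted, the interleaving recorded in `log` is fully determined by the scheduler, which is why the scheduling policy matters for correctness in such programs.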
Massively-Parallel Break Detection for Satellite Data
The field of remote sensing is nowadays faced with huge amounts of data.
While this offers a variety of exciting research opportunities, it also yields
significant challenges regarding both computation time and space requirements.
In practice, the sheer data volumes render existing approaches too slow for
processing and analyzing all the available data. This work aims at accelerating
BFAST, one of the state-of-the-art methods for break detection given satellite
image time series. In particular, we propose a massively-parallel
implementation for BFAST that can effectively make use of modern parallel
compute devices such as GPUs. Our experimental evaluation shows that the
proposed GPU implementation is up to four orders of magnitude faster than the
existing publicly available implementation and up to ten times faster than a
corresponding multi-threaded CPU execution. The dramatic decrease in running
time renders the analysis of significantly larger datasets possible in seconds
or minutes instead of hours or days. We demonstrate the practical benefits of
our implementations given both artificial and real datasets.
Comment: 10 page