A multi-stage recurrent neural network better describes decision-related activity in dorsal premotor cortex
We studied how a network of recurrently connected
artificial units solves a visual perceptual decision-making
task. The goal of this task is to discriminate the dominant
color of a central static checkerboard and report the
decision with an arm movement. This task has been used
to study neural activity in the dorsal premotor (PMd)
cortex. When a single recurrent neural network (RNN)
was trained to perform the task, the activity of artificial
units in the RNN differed from neural recordings in PMd,
suggesting that inputs to PMd differed from inputs to the
RNN. We expanded our architecture and examined how
a multi-stage RNN performed the task. In the multi-stage
RNN, the last stage exhibited similarities with PMd by
representing direction information but not color
information. We then investigated how the
representations of color and direction information evolve
across RNN stages. Together, our results demonstrate
the importance of incorporating architectural
constraints into RNN models. These
constraints can improve the ability of RNNs to model
neural activity in association areas.
https://doi.org/10.32470/CCN.2019.1123-0
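As a rough illustration of the kind of architecture described above, the following PyTorch sketch stacks several recurrent stages so that sensory input enters only the first stage and the direction readout sits on the last. All dimensions, the stage count, and the training setup are invented here, not taken from the paper.

```python
# Hypothetical sketch of a multi-stage RNN (PyTorch), not the authors' code.
# Checkerboard color evidence enters the first stage; only the final stage's
# readout drives the direction (arm-movement) output.
import torch
import torch.nn as nn

class MultiStageRNN(nn.Module):
    def __init__(self, input_dim=4, hidden_dim=100, n_stages=3, output_dim=2):
        super().__init__()
        # Each stage is its own recurrent module; stage k only receives
        # the hidden state of stage k-1, not the sensory input directly.
        self.stages = nn.ModuleList(
            [nn.RNN(input_dim if i == 0 else hidden_dim, hidden_dim,
                    nonlinearity="tanh", batch_first=True)
             for i in range(n_stages)]
        )
        self.readout = nn.Linear(hidden_dim, output_dim)  # left vs. right reach

    def forward(self, x):
        h = x
        for stage in self.stages:
            h, _ = stage(h)  # feed-forward chain of recurrent stages
        return self.readout(h)  # direction decision per time step

model = MultiStageRNN()
evidence = torch.randn(8, 50, 4)  # batch x time x input features (made up)
decision = model(evidence)        # shape (8, 50, 2)
```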
Ruya: Memory-Aware Iterative Optimization of Cluster Configurations for Big Data Processing
Selecting appropriate computational resources for data processing jobs on
large clusters is difficult, even for expert users like data engineers.
Inadequate choices can result in vastly increased costs, without significantly
improving performance. One crucial aspect of selecting an efficient resource
configuration is avoiding memory bottlenecks. By knowing the required memory of
a job in advance, the search space for an optimal resource configuration can be
greatly reduced.
Therefore, we present Ruya, a method for memory-aware optimization of data
processing cluster configurations based on iteratively exploring a
narrowed-down search space. First, we perform job profiling runs with small
samples of the dataset on just a single machine to model the job's memory usage
patterns. Second, we prioritize cluster configurations with a suitable amount
of total memory, and within this reduced search space we iteratively search for
the best cluster configuration with Bayesian optimization. This search process
stops once it converges on a configuration that is believed to be optimal for
the given job. In our evaluation on a dataset with 1031 Spark and Hadoop jobs,
we see a reduction of search iterations to find an optimal configuration by
around half, compared to the baseline.
Comment: 9 pages, 5 Figures, 3 Tables; IEEE BigData 2022. arXiv admin note: substantial text overlap with arXiv:2206.1385
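The following Python sketch illustrates the two-phase idea as described: a profiling-based memory estimate first narrows the configuration space, then a Gaussian-process surrogate with a lower-confidence-bound acquisition searches within it. The configuration grid, memory model, and cost function are all placeholders, not Ruya's actual logic.

```python
# Illustrative sketch of a memory-aware, two-phase configuration search
# (placeholder logic, not the Ruya implementation).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Hypothetical candidates: (node_count, memory_per_node_gb).
configs = [(n, m) for n in (2, 4, 8, 16) for m in (8, 16, 32, 64)]

def predicted_job_memory_gb(sample_peak_gb, sample_fraction):
    # Placeholder memory model: scale the peak memory observed when running
    # the job on a small data sample on one machine up to the full dataset.
    return sample_peak_gb / sample_fraction

def run_job(config):
    # Stand-in for executing the job on a cluster; returns a cost value.
    n, m = config
    return 100.0 / n + 0.2 * n * m  # made-up cost surface

# Phase 1: keep only configurations with enough, but not wasteful, memory.
required = predicted_job_memory_gb(sample_peak_gb=6.0, sample_fraction=0.1)
candidates = [c for c in configs if required <= c[0] * c[1] <= 3 * required]

# Phase 2: Bayesian optimization with a GP surrogate and a lower
# confidence bound acquisition over the narrowed-down space.
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(3):  # initial random evaluations
    c = candidates[rng.integers(len(candidates))]
    X.append(c); y.append(run_job(c))
gp = GaussianProcessRegressor(normalize_y=True)
for _ in range(10):  # a real system would stop on convergence instead
    gp.fit(np.array(X), np.array(y))
    mu, sigma = gp.predict(np.array(candidates), return_std=True)
    best = candidates[int(np.argmin(mu - 1.96 * sigma))]
    X.append(best); y.append(run_job(best))
print(min(zip(y, X)))  # cheapest configuration found
```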
Selecting Efficient Cluster Resources for Data Analytics: When and How to Allocate for In-Memory Processing?
Distributed dataflow systems such as Apache Spark or Apache Flink enable
parallel, in-memory data processing on large clusters of commodity hardware.
Consequently, the appropriate amount of memory to allocate to the cluster is a
crucial consideration.
In this paper, we analyze the challenge of efficient resource allocation for
distributed data processing, focusing on memory. We emphasize that in-memory
processing with distributed dataflow frameworks can undermine resource
efficiency. Based on the findings of our trace data analysis, we compile
requirements towards an automated solution for efficient cluster resource
allocation.
Comment: 4 pages, 3 Figures; ACM SSDBM 2022
Leveraging Reinforcement Learning for Task Resource Allocation in Scientific Workflows
Scientific workflows are designed as directed acyclic graphs (DAGs) and
consist of multiple dependent task definitions. They are executed over a large
amount of data, often resulting in thousands of tasks with heterogeneous
compute requirements and long runtimes, even on cluster infrastructures. In
order to optimize the workflow performance, enough resources, e.g., CPU and
memory, need to be provisioned for the respective tasks. Typically, workflow
systems rely on user resource estimates which are known to be highly
error-prone and can result in over- or underprovisioning. While resource
overprovisioning leads to high resource wastage, underprovisioning can result
in long runtimes or even failed tasks.
In this paper, we propose two different reinforcement learning approaches
based on gradient bandits and Q-learning, respectively, in order to minimize
resource wastage by selecting suitable CPU and memory allocations. We provide a
prototypical implementation in the well-known scientific workflow management
system Nextflow, evaluate our approaches with five workflows, and compare them
against the default resource configurations and a state-of-the-art feedback
loop baseline. The evaluation yields that our reinforcement learning approaches
significantly reduce resource wastage compared to the default configuration.
Further, our approaches also reduce the allocated CPU hours compared to the
state-of-the-art feedback loop by 6.79% and 24.53%.
Comment: Paper accepted at the 2022 IEEE International Conference on Big Data, Workshop BPOD 2022
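A toy sketch of the gradient-bandit variant applied to memory allocation might look as follows; the action set, reward shaping, and usage distribution are invented for illustration and are not the paper's formulation (the Q-learning variant is analogous).

```python
# Toy gradient-bandit allocator (illustrative only, not the paper's code).
# Actions are discrete memory allocations; the reward penalizes wastage and
# failures, so preferences shift toward tight-but-sufficient allocations.
import numpy as np

actions_gb = np.array([2, 4, 8, 16, 32])   # candidate memory allocations
H = np.zeros(len(actions_gb))              # action preferences
avg_reward, alpha, rng = 0.0, 0.1, np.random.default_rng(0)

def reward(allocated_gb, used_gb):
    if used_gb > allocated_gb:             # task failed: strong penalty
        return -1.0
    return -(allocated_gb - used_gb) / allocated_gb  # penalize wastage

for t in range(1, 1001):
    probs = np.exp(H - H.max()); probs /= probs.sum()  # softmax policy
    a = rng.choice(len(actions_gb), p=probs)
    used = rng.normal(5.0, 1.0)            # observed memory usage (made up)
    r = reward(actions_gb[a], used)
    avg_reward += (r - avg_reward) / t     # running reward baseline
    onehot = np.zeros(len(actions_gb)); onehot[a] = 1.0
    H += alpha * (r - avg_reward) * (onehot - probs)   # gradient-bandit update

print("preferred allocation:", actions_gb[np.argmax(H)])  # likely 8 GB here
```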
Predicting Dynamic Memory Requirements for Scientific Workflow Tasks
With the increasing amount of data available to scientists in disciplines as
diverse as bioinformatics, physics, and remote sensing, scientific workflow
systems are becoming increasingly important for composing and executing
scalable data analysis pipelines. When writing such workflows, users need to
specify the resources to be reserved for tasks so that sufficient resources are
allocated on the target cluster infrastructure. Crucially, underestimating a
task's memory requirements can result in task failures. Therefore, users often
resort to overprovisioning, resulting in significant resource wastage and
decreased throughput.
In this paper, we propose a novel online method that uses monitoring time
series data to predict task memory usage in order to reduce the memory wastage
of scientific workflow tasks. Our method predicts a task's runtime, divides it
into k equally-sized segments, and learns the peak memory value for each
segment depending on the total file input size. We evaluate the prototype
implementation of our method using workflows from the publicly available
nf-core repository, showing an average memory wastage reduction of 29.48%
compared to the best state-of-the-art approach.
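A minimal sketch of the segment-wise idea, assuming per-segment linear models over total input size; the value of k, the data, and the safety margin are made up, and the paper's runtime-prediction step is omitted here.

```python
# Hedged sketch of segment-wise peak-memory prediction (not the paper's code).
# Idea: split a task's runtime into k segments and fit one model per segment
# that maps total input size to that segment's peak memory.
import numpy as np
from sklearn.linear_model import LinearRegression

k = 4  # number of runtime segments (hypothetical choice)

def segment_peaks(mem_timeseries, k):
    # Peak memory within each of k equally sized runtime segments.
    chunks = np.array_split(np.asarray(mem_timeseries), k)
    return [float(c.max()) for c in chunks]

# Historical monitoring data: input size (GB) -> memory time series (GB).
history = [
    (1.0, [0.5, 0.9, 1.2, 0.8, 0.6, 0.4, 0.3, 0.3]),
    (2.0, [0.9, 1.7, 2.3, 1.5, 1.1, 0.7, 0.5, 0.4]),
    (4.0, [1.8, 3.2, 4.5, 3.0, 2.1, 1.3, 0.9, 0.7]),
]
X = np.array([[size] for size, _ in history])
Y = np.array([segment_peaks(series, k) for _, series in history])

models = [LinearRegression().fit(X, Y[:, i]) for i in range(k)]

# Per-segment allocation for a new 3 GB input: reserve only what each
# segment needs, plus headroom, instead of the global peak throughout.
new_input = np.array([[3.0]])
alloc = [1.1 * m.predict(new_input)[0] for m in models]  # 10% margin (made up)
print([round(a, 2) for a in alloc])
```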
The Nucleon Spin Polarizability at Order O(p^4) in Chiral Perturbation Theory
We calculate the forward spin-dependent photon-nucleon Compton amplitude as a
function of photon energy at next-to-leading order, O(p^4), in
chiral perturbation theory, from which we extract the contribution to the nucleon
spin polarizability. The result shows a large correction to the leading order
contribution.
Comment: 7 pages, LaTeX, 2 figures included as .eps files
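For context, in one common convention the forward spin polarizability gamma_0 is defined as the omega^3 coefficient of the spin-flip amplitude, whose leading term is fixed by a low-energy theorem; normalizations vary between papers, so the following is a standard textbook form rather than necessarily the one used here.

```latex
% One common convention for the forward Compton amplitude on the nucleon:
%   T(\omega) = f(\omega)\,\vec{\epsilon}'^{*}\!\cdot\vec{\epsilon}
%             + g(\omega)\, i\,\vec{\sigma}\cdot(\vec{\epsilon}'^{*}\times\vec{\epsilon}).
% The low-energy theorem fixes the O(\omega) term of g via the anomalous
% magnetic moment \kappa; the forward spin polarizability \gamma_0 is the
% \omega^3 coefficient and satisfies a sum rule over helicity-dependent
% photoabsorption cross sections.
\[
  g(\omega) = -\frac{e^{2}\kappa^{2}}{8\pi M^{2}}\,\omega
              + \gamma_{0}\,\omega^{3} + \mathcal{O}(\omega^{5}),
  \qquad
  \gamma_{0} = \frac{1}{4\pi^{2}} \int_{\omega_{0}}^{\infty}
               \frac{\sigma_{1/2}(\omega)-\sigma_{3/2}(\omega)}{\omega^{3}}\,
               \mathrm{d}\omega .
\]
```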
Lotaru: Locally Predicting Workflow Task Runtimes for Resource Management on Heterogeneous Infrastructures
Many resource management techniques for task scheduling, energy and carbon
efficiency, and cost optimization in workflows rely on a-priori task runtime
knowledge. Building runtime prediction models on historical data is often not
feasible in practice as workflows, their input data, and the cluster
infrastructure change. Online methods, on the other hand, which estimate task
runtimes on specific machines while the workflow is running, have to cope with
a lack of measurements during start-up. Frequently, scientific workflows are
executed on heterogeneous infrastructures consisting of machines with different
CPU, I/O, and memory configurations, further complicating predicting runtimes
due to different task runtimes on different machine types.
This paper presents Lotaru, a method for locally predicting the runtimes of
scientific workflow tasks before they are executed on heterogeneous compute
clusters. Crucially, our approach does not rely on historical data and copes
with a lack of training data during start-up. To this end, we use
microbenchmarks, reduce the input data to quickly profile the workflow locally,
and predict a task's runtime with a Bayesian linear regression based on the
gathered data points from the local workflow execution and the microbenchmarks.
Due to its Bayesian approach, Lotaru provides uncertainty estimates that can be
used for advanced scheduling methods on distributed cluster infrastructures.
In our evaluation with five real-world scientific workflows, our method
outperforms two state-of-the-art runtime prediction baselines and decreases the
absolute prediction error by more than 12.5%. In a second set of experiments,
using our method's predicted runtimes for state-of-the-art scheduling,
carbon reduction, and cost prediction enables results close to those
achieved with perfect prior knowledge of runtimes.
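A condensed sketch of this prediction scheme, using scikit-learn's Bayesian ridge regression as a stand-in for the paper's Bayesian linear regression; the profiling points, benchmark scores, and scaling rule are invented for illustration.

```python
# Illustrative sketch of Lotaru-style local runtime prediction (not the
# authors' implementation). Local profiling runs on downsampled inputs give
# (input size -> runtime) points; a Bayesian linear regression extrapolates
# to full-size inputs, and microbenchmark ratios adjust for target machines.
import numpy as np
from sklearn.linear_model import BayesianRidge

# Local profiling: task runtimes (s) on small input samples (GB), made up.
X_local = np.array([[0.1], [0.2], [0.4], [0.8]])
y_local = np.array([3.1, 5.8, 11.5, 22.6])

model = BayesianRidge().fit(X_local, y_local)

# Predict the runtime on the full 10 GB input, with an uncertainty
# estimate that advanced schedulers can use.
mean, std = model.predict(np.array([[10.0]]), return_std=True)

# Hypothetical microbenchmark scores relate the local machine to each
# heterogeneous target node (higher = faster).
bench = {"local": 1.0, "node_a": 1.6, "node_b": 0.7}
for node, score in bench.items():
    scaled = mean[0] * bench["local"] / score  # simple ratio-based scaling
    print(f"{node}: {scaled:.1f}s +/- {std[0] / score:.1f}s")
```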
Volatile working memory representations crystallize with practice.
Working memory, the process through which information is transiently maintained and manipulated over a brief period, is essential for most cognitive functions [1-4]. However, the mechanisms underlying the generation and evolution of working-memory neuronal representations at the population level over long timescales remain unclear. Here, to identify these mechanisms, we trained head-fixed mice to perform an olfactory delayed-association task in which the mice made decisions depending on the sequential identity of two odours separated by a 5 s delay. Optogenetic inhibition of secondary motor neurons during the late-delay and choice epochs strongly impaired the task performance of the mice. Mesoscopic calcium imaging of large neuronal populations of the secondary motor cortex (M2), retrosplenial cortex (RSA) and primary motor cortex (M1) showed that many late-delay-epoch-selective neurons emerged in M2 as the mice learned the task. Working-memory late-delay decoding accuracy substantially improved in the M2, but not in the M1 or RSA, as the mice became experts. During the early expert phase, working-memory representations during the late-delay epoch drifted across days, while the stimulus and choice representations stabilized. In contrast to single-plane layer 2/3 (L2/3) imaging, simultaneous volumetric calcium imaging of up to 73,307 M2 neurons, which included superficial L5 neurons, also revealed stabilization of late-delay working-memory representations with continued practice. Thus, delay- and choice-related activities that are essential for working-memory performance drift during learning and stabilize only after several days of expert performance.
Macaw: The Machine Learning Magnetometer Calibration Workflow
In Earth Systems Science, many complex data pipelines combine different data
sources and apply data filtering and analysis steps. Typically, such data
analysis processes are historically grown and implemented with many
sequentially executed scripts. Scientific workflow management systems (SWMS)
allow scientists to use their existing scripts and provide support for
parallelization, reusability, monitoring, or failure handling. However, many
scientists still rely on their sequentially called scripts and do not profit
from the out-of-the-box advantages a SWMS can provide. In this work, we
transform the data analysis processes of a machine-learning-based approach,
which calibrates the platform magnetometers of non-dedicated satellites using
neural networks, into a workflow called Macaw (MAgnetometer CAlibration
Workflow). We provide details on the workflow and the steps needed to port
these scripts to a scientific workflow. Our experimental evaluation compares
the original sequential script executions on the original HPC cluster with our
workflow implementation on a commodity cluster. Our results show that through
porting, our implementation decreased the allocated CPU hours by 50.2% and the
memory hours by 59.5%, leading to significantly less resource wastage. Further,
through parallelizing single tasks, we reduced the runtime by 17.5%.
Comment: Paper accepted at the 2022 IEEE International Conference on Data Mining Workshops (ICDMW)
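The porting pattern described above can be illustrated with a generic Python sketch: each formerly sequential script becomes a task with explicit inputs and outputs so that independent partitions run in parallel. Script and file names are placeholders, and a real SWMS such as Nextflow would express this as workflow processes rather than a process pool.

```python
# Generic sketch of the porting pattern (not the Macaw code): wrap each
# existing script as a task with explicit inputs/outputs, then fan out
# over independent data partitions and fan in for the final step.
from concurrent.futures import ProcessPoolExecutor
import subprocess

def run_script(script, infile, outfile):
    # One task wraps one existing script; a SWMS would also add monitoring,
    # retries, and per-task resource requests here.
    subprocess.run(["python", script, infile, outfile], check=True)
    return outfile

if __name__ == "__main__":
    days = ["day1.cdf", "day2.cdf", "day3.cdf"]  # hypothetical partitions
    with ProcessPoolExecutor() as pool:
        # Fan out: per-partition preprocessing steps are independent.
        cleaned = list(pool.map(run_script,
                                ["preprocess.py"] * len(days), days,
                                [f"clean_{d}" for d in days]))
    # Fan in: calibration consumes all preprocessed partitions at once.
    run_script("calibrate.py", ",".join(cleaned), "calibration_model.out")
```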