71,951 research outputs found
Building Near-Real-Time Processing Pipelines with the Spark-MPI Platform
Advances in detectors and computational technologies provide new
opportunities for applied research and the fundamental sciences. Concurrently,
dramatic increases in the three Vs (Volume, Velocity, and Variety) of
experimental data and the scale of computational tasks produced the demand for
new real-time processing systems at experimental facilities. Recently, this
demand was addressed by the Spark-MPI approach connecting the Spark
data-intensive platform with the MPI high-performance framework. In contrast
with existing data management and analytics systems, Spark introduced a new
middleware based on resilient distributed datasets (RDDs), which decoupled
various data sources from high-level processing algorithms. The RDD middleware
significantly advanced the scope of data-intensive applications, spreading from
SQL queries to machine learning to graph processing. Spark-MPI further extended
the Spark ecosystem with the MPI applications using the Process Management
Interface. The paper explores this integrated platform within the context of
online ptychographic and tomographic reconstruction pipelines.Comment: New York Scientific Data Summit, August 6-9, 201
Asynchronous Execution of Python Code on Task Based Runtime Systems
Despite advancements in the areas of parallel and distributed computing, the
complexity of programming on High Performance Computing (HPC) resources has
deterred many domain experts, especially in the areas of machine learning and
artificial intelligence (AI), from utilizing performance benefits of such
systems. Researchers and scientists favor high-productivity languages to avoid
the inconvenience of programming in low-level languages and costs of acquiring
the necessary skills required for programming at this level. In recent years,
Python, with the support of linear algebra libraries like NumPy, has gained
popularity despite facing limitations which prevent this code from distributed
runs. Here we present a solution which maintains both high level programming
abstractions as well as parallel and distributed efficiency. Phylanx, is an
asynchronous array processing toolkit which transforms Python and NumPy
operations into code which can be executed in parallel on HPC resources by
mapping Python and NumPy functions and variables into a dependency tree
executed by HPX, a general purpose, parallel, task-based runtime system written
in C++. Phylanx additionally provides introspection and visualization
capabilities for debugging and performance analysis. We have tested the
foundations of our approach by comparing our implementation of widely used
machine learning algorithms to accepted NumPy standards
DPP-PMRF: Rethinking Optimization for a Probabilistic Graphical Model Using Data-Parallel Primitives
We present a new parallel algorithm for probabilistic graphical model
optimization. The algorithm relies on data-parallel primitives (DPPs), which
provide portable performance over hardware architecture. We evaluate results on
CPUs and GPUs for an image segmentation problem. Compared to a serial baseline,
we observe runtime speedups of up to 13X (CPU) and 44X (GPU). We also compare
our performance to a reference, OpenMP-based algorithm, and find speedups of up
to 7X (CPU).Comment: LDAV 2018, October 201
Grids and the Virtual Observatory
We consider several projects from astronomy that benefit from the Grid paradigm and
associated technology, many of which involve either massive datasets or the federation
of multiple datasets. We cover image computation (mosaicking, multi-wavelength
images, and synoptic surveys); database computation (representation through XML,
data mining, and visualization); and semantic interoperability (publishing, ontologies,
directories, and service descriptions)
Inviwo -- A Visualization System with Usage Abstraction Levels
The complexity of today's visualization applications demands specific
visualization systems tailored for the development of these applications.
Frequently, such systems utilize levels of abstraction to improve the
application development process, for instance by providing a data flow network
editor. Unfortunately, these abstractions result in several issues, which need
to be circumvented through an abstraction-centered system design. Often, a high
level of abstraction hides low level details, which makes it difficult to
directly access the underlying computing platform, which would be important to
achieve an optimal performance. Therefore, we propose a layer structure
developed for modern and sustainable visualization systems allowing developers
to interact with all contained abstraction levels. We refer to this interaction
capabilities as usage abstraction levels, since we target application
developers with various levels of experience. We formulate the requirements for
such a system, derive the desired architecture, and present how the concepts
have been exemplary realized within the Inviwo visualization system.
Furthermore, we address several specific challenges that arise during the
realization of such a layered architecture, such as communication between
different computing platforms, performance centered encapsulation, as well as
layer-independent development by supporting cross layer documentation and
debugging capabilities
From Social Simulation to Integrative System Design
As the recent financial crisis showed, today there is a strong need to gain
"ecological perspective" of all relevant interactions in
socio-economic-techno-environmental systems. For this, we suggested to set-up a
network of Centers for integrative systems design, which shall be able to run
all potentially relevant scenarios, identify causality chains, explore feedback
and cascading effects for a number of model variants, and determine the
reliability of their implications (given the validity of the underlying
models). They will be able to detect possible negative side effect of policy
decisions, before they occur. The Centers belonging to this network of
Integrative Systems Design Centers would be focused on a particular field, but
they would be part of an attempt to eventually cover all relevant areas of
society and economy and integrate them within a "Living Earth Simulator". The
results of all research activities of such Centers would be turned into
informative input for political Decision Arenas. For example, Crisis
Observatories (for financial instabilities, shortages of resources,
environmental change, conflict, spreading of diseases, etc.) would be connected
with such Decision Arenas for the purpose of visualization, in order to make
complex interdependencies understandable to scientists, decision-makers, and
the general public.Comment: 34 pages, Visioneer White Paper, see http://www.visioneer.ethz.c
Common Visual Representations as a Source for Misconceptions of Preservice Teachers in a Geometry Connection Course
In this paper, we demonstrate how atypical visual representations of a triangle, square or a parallelogram may hinder students’ understanding of a median and altitude. We analyze responses and reasoning given by 16 preservice middle school teachers in a Geometry Connection class. Particularly, the data were garnered from three specific questions posed on a cumulative final exam, which focused on computing and comparing areas of parallelograms, and triangles represented by atypical images. We use the notions of concept image and concept definition as our theoretical framework for an analysis of the students’ responses. Our findings have implication on how typical images can impact students’ cognitive process and their concept image. We provide a number of suggestions that can foster conceptualization of the notions of median and altitude in a triangle that can be realized in an enacted lesson
- …