TRACTABLE DATA-FLOW ANALYSIS FOR DISTRIBUTED SYSTEMS
Automated behavior analysis is a valuable technique in the development and maintenance of distributed systems. In this paper, we present a tractable dataflow analysis technique for the detection of unreachable states and actions in distributed systems. The technique follows an approximate approach described by Reif and Smolka, but delivers a more accurate result in assessing unreachable states and actions. The higher accuracy is achieved by the use of two concepts: action dependency and history sets. Although the technique does not exhaustively detect all possible errors, it detects nontrivial errors with a worst-case complexity quadratic in the system size. It can be automated and applied to systems with arbitrary loops and nondeterministic structures. The technique thus provides practical and tractable behavior analysis for preliminary designs of distributed systems. This makes it an ideal candidate for an interactive checker in software development tools. The technique is illustrated with case studies of a pump control system and an erroneous distributed program. Results from a prototype implementation are presented.
BrainFrame: A node-level heterogeneous accelerator platform for neuron simulations
Objective: The advent of High-Performance Computing (HPC) in recent years has
led to its increasing use in brain study through computational models. The
scale and complexity of such models are constantly increasing, leading to
challenging computational requirements. Even though modern HPC platforms can
often deal with such challenges, the vast diversity of the modeling field does
not permit a single acceleration (or homogeneous) platform to effectively
address the complete array of modeling requirements. Approach: In this paper we
propose and build BrainFrame, a heterogeneous acceleration platform,
incorporating three distinct acceleration technologies, a Dataflow Engine, a
Xeon Phi and a GP-GPU. The PyNN framework is also integrated into the platform.
As a challenging proof of concept, we analyze the performance of BrainFrame on
different instances of a state-of-the-art neuron model, modeling the
Inferior-Olivary Nucleus using a biophysically-meaningful, extended Hodgkin-Huxley
representation. The model instances take into account not only the
neuronal-network dimensions but also different network-connectivity circumstances that
can drastically change application workload characteristics. Main results: The
synthetic approach of three HPC technologies demonstrated that BrainFrame is
better able to cope with the modeling diversity encountered. Our performance
analysis shows clearly that the model directly affects performance and that all three
technologies are required to cope with all the model use cases. Comment: 16 pages, 18 figures, 5 tables
Real-time support for high performance aircraft operation
The feasibility of real-time processing schemes using artificial neural networks (ANNs) is investigated. A rationale for digital neural nets is presented and a general processor architecture for control applications is illustrated. Research results on ANN structures for real-time applications are given. Research results on ANN algorithms for real-time control are also shown.
Automatic translation of non-repetitive OpenMP to MPI
Cluster platforms with distributed-memory architectures are becoming increasingly available low-cost solutions for high performance computing. Delivering a productive programming environment that hides the complexity of clusters and allows writing efficient programs is urgently needed. Despite multiple efforts to provide shared memory abstraction, message-passing (MPI) is still the state-of-the-art programming model for distributed-memory architectures. Writing efficient MPI programs is challenging. In contrast, OpenMP is a shared-memory programming model that is known for its programming productivity. Researchers introduced automatic source-to-source translation schemes from OpenMP to MPI so that programmers can use OpenMP while targeting clusters. Those schemes limited their focus to OpenMP programs with repetitive communication patterns (where the analysis of communication can be simplified). This dissertation reduces this limitation and presents a novel OpenMP-to-MPI translation scheme that covers OpenMP programs with both repetitive and non-repetitive communication patterns. We target laboratory-size clusters of ten to a hundred nodes (commonly found in research laboratories and small enterprises). With our translation scheme, six non-repetitive and four repetitive OpenMP benchmarks have been efficiently scaled to a cluster of 64 cores. By contrast, the state-of-the-art translator scaled only the four repetitive benchmarks. In addition, our translation scheme was shown to outperform or perform as well as the state-of-the-art translator. We also compare the translation scheme with available hand-coded MPI and Unified Parallel C (UPC) programs.
Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)
Real-time analytics that requires integration and aggregation of
heterogeneous and distributed streaming and static data is a typical task in
many industrial scenarios such as diagnostics of turbines at Siemens. The OBDA
approach has great potential to facilitate such tasks; however, it has a
number of limitations in dealing with analytics that restrict its use in
important industrial applications. Based on our experience with Siemens, we
argue that in order to overcome those limitations OBDA should be extended and
become analytics, source, and cost aware. In this work we propose such an
extension. In particular, we propose an ontology, mapping, and query language
for OBDA, where aggregate and other analytical functions are first class
citizens. Moreover, we develop query optimisation techniques that allow
efficient processing of analytical tasks over static and streaming data. We
implement our approach in a system and evaluate our system with Siemens turbine
data.
Partitioning SKA Dataflows for Optimal Graph Execution
Optimizing data-intensive workflow execution is essential to many modern
scientific projects such as the Square Kilometre Array (SKA), which will be the
largest radio telescope in the world, collecting terabytes of data per second
for the next few decades. At the core of the SKA Science Data Processor is the
graph execution engine, scheduling tens of thousands of algorithmic components
to ingest and transform millions of parallel data chunks in order to solve a
series of large-scale inverse problems within the power budget. To tackle this
challenge, we have developed the Data Activated Liu Graph Engine (DALiuGE) to
manage data processing pipelines for several SKA pathfinder projects. In this
paper, we discuss the DALiuGE graph scheduling sub-system. By extending
previous studies on graph scheduling and partitioning, we lay the foundation on
which we can develop polynomial time optimization methods that minimize both
workflow execution time and resource footprint while satisfying resource
constraints imposed by individual algorithms. We show preliminary results
obtained from three radio astronomy data pipelines. Comment: Accepted in HPDC ScienceCloud 2018 Workshop
VegaProf: Profiling Vega Visualizations
Vega is a popular domain-specific language (DSL) for visualization
specification. At runtime, Vega's DSL is first transformed into a dataflow
graph and then functions to render visualization primitives. While the Vega
abstraction of implementation details simplifies visualization creation, it
also makes Vega visualizations challenging to debug and profile without
adequate tools. Our formative interviews with three practitioners at Sigma
Computing showed that existing developer tools are not suited for visualization
profiling as they are disconnected from the semantics of the Vega DSL
specification and its resulting dataflow graph. We introduce VegaProf, the
first performance profiler for Vega visualizations. VegaProf effectively
instruments the Vega library by associating the declarative specification with
its compilation and execution. Using interactive visualizations, VegaProf
enables visualization engineers to interactively profile visualization
performance at three abstraction levels: function, dataflow graph, and
visualization specification. Our evaluation through two use cases and feedback
from five visualization engineers at Sigma Computing shows that VegaProf makes
visualization profiling tractable and actionable. Comment: Submitted to EuroVis'2