Search CORE

390 research outputs found

Reo + mCRL2: A Framework for Model-checking Dataﬂow in Service Compositions

Author: Kokash N. (Natallia)
Krause , (Christian) C. (born Köhler, , C.)
Vink E.P. (Erik) de
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/08/2011
Field of study

The paradigm of service-oriented computing revolutionized the field of software engineering. According to this paradigm, new systems are composed of existing stand-alone services to support complex cross-organizational business processes. Correct communication of these services is not possible without a proper coordination mechanism. The Reo coordination language is a channel-based modeling language that introduces various types of channels and their composition rules. By composing Reo channels, one can specify Reo connectors that realize arbitrary complex behavioral protocols. Several formalisms have been introduced to give semantics to Reo. In their most basic form, they reflect service synchronization and dataflow constraints imposed by connectors. To ensure that the composed system behaves as intended, we need a wide range of automated verification tools to assist service composition designers. In this paper, we present our framework for the verification of Reo using the toolset. We unify our previous work on mapping various semantic models for Reo, namely, constraint automata, timed constraint automata, coloring semantics and the newly developed action constraint automata, to the process algebraic specification language of , address the correctness of this mapping, discuss tool support, and present a detailed example that illustrates the use of Reo empowered with for the analysis of dataflow in service-based process models

CWI's Institutional Repository

Reo + mCRL2: A Framework for Model-Checking Dataflow in Service Compositions

Author: Kokash N. (Natallia)
Krause , (Christian) C. (born Köhler, , C.)
Vink E.P. (Erik) de
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2012
Field of study

The paradigm of service-oriented computing revolutionized the field of software engineering. According to this paradigm, new systems are composed of existing stand-alone services to support complex cross-organizational business processes. Correct communication of these services is not possible without a proper coordination mechanism. The Reo coordination language is a channel-based modeling language that introduces various types of channels and their composition rules. By composing Reo channels, one can specify Reo connectors that realize arbitrary complex behavioral protocols. Several formalisms have been introduced to give semantics to Reo. In their most basic form, they reflect service synchronization and dataflow constraints imposed by connectors. To ensure that the composed system behaves as intended, we need a wide range of automated verification tools to assist service composition designers. In this paper, we present our framework for the verification of Reo using the mCRL2 toolset. We unify our previous work on mapping various semantic models for Reo, namely, constraint automata, timed constraint automata, coloring semantics and the newly developed action constraint automata, to the process algebraic specification language of mCRL2, address the correctness of this mapping, discuss tool support, and present a detailed example that illustrates the use of Reo empowered with mCRL2 for the analysis of dataflow in service-based process models

CWI's Institutional Repository

Temporal Vectorization: A Compiler Approach to Automatic Multi-Pumping

Author: Ben-Nun Tal
De Matteis Tiziano
Hoefler Torsten
Johnsen Carl-Johannes
Licht Johannes de Fine
Publication venue
Publication date: 01/01/2022
Field of study

The multi-pumping resource sharing technique can overcome the limitations commonly found in single-clocked FPGA designs by allowing hardware components to operate at a higher clock frequency than the surrounding system. However, this optimization cannot be expressed in high levels of abstraction, such as HLS, requiring the use of hand-optimized RTL. In this paper we show how to leverage multiple clock domains for computational subdomains on reconfigurable devices through data movement analysis on high-level programs. We offer a novel view on multi-pumping as a compiler optimization - a superclass of traditional vectorization. As multiple data elements are fed and consumed, the computations are packed temporally rather than spatially. The optimization is applied automatically using an intermediate representation that maps high-level code to HLS. Internally, the optimization injects modules into the generated designs, incorporating RTL for fine-grained control over the clock domains. We obtain a reduction of resource consumption by up to 50% on critical components and 23% on average. For scalable designs, this can enable further parallelism, increasing overall performance

arXiv.org e-Print Archive

Repository for Publications and Research Data

Copenhagen University Research Information System

The force on the flex: Global parallelism and portability

Author: Jordan H. F.
Publication venue
Publication date
Field of study

A parallel programming methodology, called the force, supports the construction of programs to be executed in parallel by an unspecified, but potentially large, number of processes. The methodology was originally developed on a pipelined, shared memory multiprocessor, the Denelcor HEP, and embodies the primitive operations of the force in a set of macros which expand into multiprocessor Fortran code. A small set of primitives is sufficient to write large parallel programs, and the system has been used to produce 10,000 line programs in computational fluid dynamics. The level of complexity of the force primitives is intermediate. It is high enough to mask detailed architectural differences between multiprocessors but low enough to give the user control over performance. The system is being ported to a medium scale multiprocessor, the Flex/32, which is a 20 processor system with a mixture of shared and local memory. Memory organization and the type of processor synchronization supported by the hardware on the two machines lead to some differences in efficient implementations of the force primitives, but the user interface remains the same. An initial implementation was done by retargeting the macros to Flexible Computer Corporation's ConCurrent C language. Subsequently, the macros were caused to directly produce the system calls which form the basis for ConCurrent C. The implementation of the Fortran based system is in step with Flexible Computer Corporations's implementation of a Fortran system in the parallel environment

NASA Technical Reports Server

Automatic mapping of graphical programming applications to microelectronic technologies

Author: Ong Sze-Wei
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/05/2001
Field of study

Adaptive computing systems (ACSs) and application-specific integrated circuits (ASICs) can serve as flexible hardware accelerators for applications in domains such as image processing and digital signal processing. However, the mapping of applications onto ACSs and ASICs using the traditional methods can take months for a hardware engineer to develop and debug. In this dissertation, a new approach for automatic mapping of software applications onto ACSs and ASICs has been developed, implemented and validated. This dissertation presents the design flow of the software environment called CHAMPION, which is being developed at the University of Tennessee. This environment permits high-level design entry using the Cantata graphical programming software fromKRI. Using Cantata as the design entry, CHAMPION hides from the user the low-level details of the hardware architecture and the finer issues of application mapping onto the hardware. Validation of the CHAMPION environment was performed using multiple applications of moderate complexity. In one case, theapplication mapping time which required six weeks to perform manually took only six minutes for CHAMPION, yet comparable results were produced. Furthermore, the CHAMPION environment was constructed such that retargeting to a new adaptive computing system could be accomplished in just a few hours as opposed to weeks using manual methods. Thus, CHAMPION permits both ACSs and ASICs to be utilized by a wider audience and application development accomplished in less time

University of Tennessee, Knoxville: Trace

The Synchronized Filtering Dataflow

Author: Li Peng
Publication venue: Washington University Open Scholarship
Publication date: 15/12/2014
Field of study

In the past decade, the world has seen the rise of big data, which calls for a paradigm shift in data processing. Streaming processing, where data are processed in their spatial or temporal order, is increasingly common. Meanwhile, parallel computing has become a household term in the computing world. The combination of streaming processing and parallel computing, streaming computing, has been playing an important role in data processing. A streaming computing system is a network of nodes connected by unidirectional first-in first-out (FIFO) data channels. When a node has multiple input channels, to ensure the deterministic behavior of the whole system, synchronization is required on those channels when the node consumes data. After a streaming computing node finishes a computation, it may choose not to produce output on some of its output channels. This behavior, known as filtering, is data-dependent and unpredictable. When filtered data streams are synchronized, applications can deadlock due to empty and full channel buffers. To avoid deadlocks and ensure bounded-memory execution, we turn to model-based approaches. In this dissertation, we propose the synchronized filtering dataflow (SFDF) to model synchronization and filtering behaviors. We avoid deadlocks in SFDF applications by augmenting data streams with dummy messages. We design decentralized algorithms that compute a dummy interval for each channel during compilation time and schedule dummy messages according to the dummy intervals during runtime. The runtime parts of our algorithms are very efficient, adding little overhead to computing nodes, but computing dummy intervals could be very time-consuming on general dataflow graphs. We design efficient algorithms to compute dummy intervals for streaming applications with special topologies. In particular, we focus on series-parallel directed acyclic graphs (SP-DAGs) and CS4 DAGs, where each undirected cycle is single-source and single-sink. We further extend our work to describe a set of polyhedral constraints that define all sets of safe dummy intervals for any dataflow graphs, which gives us more flexibility to choose dummy intervals. We also provide a polynomial-time algorithm to verify the safety of given dummy intervals for SP-DAGs. Dummy messages are only one type of control message used by streaming applications. We extend our SFDF model to support more types of control message, which are precisely synchronized with data streams. We use two types of control messages, dummy message and credit message, to guarantee bounded-memory execution. We demonstrate that the extended model can help improve performance of some applications by adding filtering behavior to non-filtering applications

Washington University St. Louis: Open Scholarship

Integrating compile-time and runtime parallelism management through revocable thread serialization

Author: Maa Gino K. (Gino Ko)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1995
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995.Includes bibliographical references (leaves 125-128).by Gino K. Maa.Ph.D

DSpace@MIT