202 research outputs found
Recommended from our members
R-SHIM: Deterministic Concurrency with Recursion and Shared Variables
Concurrent programming languages are good for embedded systems because they match the parallelism of their environments, but most concurrent languages are nondeterministic, making coding in them unwieldy. We present R-SHIM, the core of a language with concurrent recursive procedure calls and disciplined shared variables that remains deterministic -- the behavior of a program is scheduling-independen
Instantaneous Transitions in Esterel
Esterel is an imperative synchronous programming language for the specification of deterministic concurrent reactive systems. While providing the usual control-flow constructs—sequences, loops, conditionals, and exceptions—its lack of a goto instruction makes the programming of arbitrary finite state machines awkward and hinders the design of source-to-source program transformations. We previously introduced to Esterel a non-instantaneous gotopause instruction, which prevents the synchronous execution of code before and code after the transition. Here, we tackle instantaneous transitions. Concurrency demands we assign scopes and priorities to gotos, so we extend Esterel's exception handling mechanism to allow exception handlers in arbitrary locations. We advocate for and formalize the resulting language. We observe that instantaneous gotos complement but do not replace non-instantaneous gotopauses
Recommended from our members
Deterministic Receptive Processes are Kahn Processes
Deterministic asynchronous concurrent formalisms are valuable because determinism greatly simplifies the design and validation of such systems and most concurrent formalisms are nondeterministic. This paper connects two of the more successful deterministic asynchronous formalisms: Kahn's dataflow networks and Josephs's deterministic receptive processes. The main result: a divergence-free deterministic receptive process is a Kahn process in that it can be modeled by a continuous function from input to output sequences, thus verifying it is compositionally deterministic. This result provides a bridge between two communities, enabling results from the asynchronous digital hardware community to be used in the context of dataflow computation and vice versa
Recommended from our members
Scheduling-Independent Threads and Exceptions in SHIM
Concurrent programming languages should be a good fit for embedded systems because they match the intrinsic parallelism of their architectures and environments. Unfortunately, typical concurrent programming formalisms are prone to races and nondeterminism, despite the presence of mechanisms such as monitors. In this paper, we propose SHIM, the core of a deterministic concurrent language, meaning the behavior of a program is independent of the scheduling of concurrent operations. SHIM does not sacrifice power or flexibility to achieve this determinism. It supports both synchronous and asynchronous paradigms-loosely and tightly synchronized threads-the dynamic creation of threads and shared variables, recursive procedures, and exceptions. We illustrate our programming model with examples including breadth-first-search algorithms and pipelines. By construction, they are race-free. We provide the formal semantics of SHIM and a preliminary implementation
Recommended from our members
Specifying Confluent Processes
We address the problem of specifying concurrent processes that can make local nondeterministic decisions without affecting global system behavior---the sequence of events communicated along each inter-process communication channel. Such nondeterminism can be used to cope with unpredictable execution rates and communication delays. Our model resembles Kahn's, but does not include unbounded buffered communication, so it is much simpler to reason about and implement. After formally characterizing these so-called confluent processes, we propose a collection of operators, including sequencing, parallel, and our own creation, confluent choice, that guarantee confluence by construction. The result is a set of primitive constructs that form the formal basis of a concurrent programming language for both hardware and software systems that gives deterministic behavior regardless of the relative execution rates of the processes. Such a language greatly simplifies the verification task because any correct implementation of such a system is guaranteed to have the same behavior, a property rarely found in concurrent programming environments
Recommended from our members
SHIM: A Deterministic Approach to Programming with Threads
Concurrent programming languages should be a good fit for embedded systems because they match the intrinsic parallelism of their architectures and environments. Unfortunately, most concurrent programming formalisms are prone to races and nondeterminism, despite the presence of mechanisms such as monitors. In this paper, we propose SHIM, the core of a concurrent language with disciplined shared variables that remains deterministic, meaning the behavior of a program is independent of the scheduling of concurrent operations. SHIM does not sacrifice power or flexibility to achieve this determinism. It supports both synchronous and asynchronous paradigms---loosely and tightly synchronized threads---the dynamic creation of threads and shared variables, recursive procedures, and exceptions. We illustrate our programming model with examples including breadth-first-search algorithms and pipelines. By construction, they are race-free. We provide the formal semantics of SHIM and a preliminary implementation
Recommended from our members
High-Level Optimization by Combining Retiming and Shannon Decomposition
Applying Shannon decomposition can reshape sequential circuits and improve opportunities for retiming. Both Shannon decomposition and retiming only rely on limited information about combinational blocks (timing estimates), so both techniques are suitable for high-level synthesis. We describe an efficient algorithm to preprocess a circuit using Shannon decomposition to increase retiming efficacy. It assembles complex chains of Shannon decompositions while carefully avoiding parallel ones in order to limit the area overhead due to logic duplication. We compare a traditional retiming flow with the same flow augmented with our algorithm. Although our algorithm provides no improvement on half of our benchmarks, for the other half we obtain a 25% speed-up on average (7% to 61%), while only increasing area by 5% (3% to 12%)
Optimizing Sequential Cycles Through Shannon Decomposition and Retiming
Optimizing sequential cycles is essential for many types of high-performance circuits, such as pipelines for packet processing. Retiming is a powerful technique for speeding pipelines, but it is stymied by tight sequential cycles. Designers usually attack such cycles by manually combining Shannon decomposition with retiming-effectively a form of speculation-but such manual decomposition is error prone. We propose an efficient algorithm that simultaneously applies Shannon decomposition and retiming to optimize circuits with tight sequential cycles. While the algorithm is only able to improve certain circuits (roughly half of the benchmarks we tried), the performance increase can be dramatic (7%-61%) with only a modest increase in area (1%-12%). The algorithm is also fast, making it a practical addition to a synthesis flow
GLB: Lifeline-based Global Load Balancing library in X10
We present GLB, a programming model and an associated implementation that can
handle a wide range of irregular paral- lel programming problems running over
large-scale distributed systems. GLB is applicable both to problems that are
easily load-balanced via static scheduling and to problems that are hard to
statically load balance. GLB hides the intricate syn- chronizations (e.g.,
inter-node communication, initialization and startup, load balancing,
termination and result collection) from the users. GLB internally uses a
version of the lifeline graph based work-stealing algorithm proposed by
Saraswat et al. Users of GLB are simply required to write several pieces of
sequential code that comply with the GLB interface. GLB then schedules and
orchestrates the parallel execution of the code correctly and efficiently at
scale. We have applied GLB to two representative benchmarks: Betweenness
Centrality (BC) and Unbalanced Tree Search (UTS). Among them, BC can be
statically load-balanced whereas UTS cannot. In either case, GLB scales well--
achieving nearly linear speedup on different computer architectures (Power,
Blue Gene/Q, and K) -- up to 16K cores
- …