Fault tolerant architectures for integrated aircraft electronics systems, task 2
The architectural basis for an advanced fault-tolerant on-board computer to succeed the current generation of fault-tolerant computers is examined. The network error-tolerant system architecture is studied, with particular attention to intercluster configurations and communication protocols and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made, is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand-driven data-flow architecture, which appears to have possible application in fault-tolerant systems, is described, and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported.
The use of Ethernet in the DataFlow of the ATLAS Trigger & DAQ
The article analyzes a proposed network topology for the ATLAS DAQ DataFlow
and identifies the Ethernet features required for proper operation of the
network: MAC address table size, switch performance in terms of throughput and
latency, and the use of Flow Control, Virtual LANs, and Quality of Service. We
investigate these features on several Ethernet switches and assess their
usefulness for the ATLAS DataFlow network.
Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
conference (CHEP03), La Jolla, CA, March 2003. 10 pages, LaTeX, 10 eps
figures. PSN MOGT01
DALiuGE: A Graph Execution Framework for Harnessing the Astronomical Data Deluge
The Data Activated Liu Graph Engine - DALiuGE - is an execution framework for
processing large astronomical datasets at a scale required by the Square
Kilometre Array Phase 1 (SKA1). It includes an interface for expressing complex
data reduction pipelines consisting of both data sets and algorithmic
components and an implementation run-time to execute such pipelines on
distributed resources. By mapping the logical view of a pipeline to its
physical realisation, DALiuGE separates the concerns of multiple stakeholders,
allowing them to collectively optimise large-scale data processing solutions in
a coherent manner. Execution in DALiuGE is data-activated: each individual
data item autonomously triggers its own processing. Such
decentralisation also makes the execution framework very scalable and flexible,
supporting pipeline sizes ranging from less than ten tasks running on a laptop
to tens of millions of concurrent tasks on the second fastest supercomputer in
the world. DALiuGE has been used in production for reducing interferometry data
sets from the Karl G. Jansky Very Large Array and the Mingantu Ultrawide
Spectral Radioheliograph; and is being developed as the execution framework
prototype for the Science Data Processor (SDP) consortium of the Square
Kilometre Array (SKA) telescope. This paper presents a technical overview of
DALiuGE and discusses case studies from the CHILES and MUSER projects that use
DALiuGE to execute production pipelines. In a companion paper, we provide
in-depth analysis of DALiuGE's scalability to very large numbers of tasks on
two supercomputing facilities.
Comment: 31 pages, 12 figures, currently under review by Astronomy and
Computing
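To make the data-activated idea concrete, here is a minimal sketch in Python of a data item ("drop") that triggers its consumers once its payload is written, so that completing one item cascades through the pipeline. The class and method names (DataDrop, AppDrop, input_completed) are hypothetical illustrations of the concept, not DALiuGE's actual API.

```python
# Minimal sketch of data-activated execution: each data item ("drop")
# notifies its consumers when it completes, and a consumer app fires as
# soon as all of its inputs are complete. Names are hypothetical, not
# the actual DALiuGE API.

class DataDrop:
    def __init__(self, name):
        self.name = name
        self.consumers = []      # apps that read this data item
        self.payload = None

    def write(self, payload):
        # Completing the data item is what triggers downstream work.
        self.payload = payload
        for consumer in self.consumers:
            consumer.input_completed(self)

class AppDrop:
    def __init__(self, name, func, inputs, output):
        self.name = name
        self.func = func
        self.inputs = inputs
        self.pending = set(inputs)    # inputs not yet completed
        self.output = output
        for d in inputs:
            d.consumers.append(self)

    def input_completed(self, drop):
        self.pending.discard(drop)
        if not self.pending:          # all inputs ready: run autonomously
            result = self.func(*[d.payload for d in self.inputs])
            self.output.write(result) # which may trigger further drops

# Usage: a two-stage pipeline; writing to `raw` cascades through both apps.
raw, calibrated, image = DataDrop("raw"), DataDrop("cal"), DataDrop("img")
AppDrop("calibrate", lambda v: v - 1, [raw], calibrated)
AppDrop("image", lambda v: v * 2, [calibrated], image)
raw.write(10)
print(image.payload)  # 18
```

Note how there is no central scheduler in the sketch: control is decentralized into the drops themselves, which is the property the abstract credits for DALiuGE's scalability.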
Behavioral synthesis from VHDL using structured modeling
This dissertation describes work in behavioral synthesis involving the development of a VHDL Synthesis System (VSS), which accepts a VHDL behavioral input specification and performs technology-independent synthesis to generate a circuit netlist of generic components. The VHDL language is used for input and output descriptions. An intermediate representation which incorporates signal typing and component attributes simplifies compilation and facilitates design optimization. A Structured Modeling methodology has been developed to suggest standard VHDL modeling practices for synthesis. Structured Modeling provides recommendations for the use of available VHDL description styles so that optimal designs will be synthesized. A design composed of generic components is synthesized from the input description through a process of Graph Compilation, Graph Criticism, and Design Compilation. Experiments were performed to demonstrate the effects of different modeling styles on the quality of the design produced by VSS. Several alternative VHDL models were examined for each benchmark, illustrating the improvements in design quality achieved when Structured Modeling guidelines were followed.
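As a rough sketch of the three-phase flow named above, the following Python illustrates how a behavioral description might pass through graph compilation, graph criticism (graph-level optimization), and design compilation to yield a netlist. Every name and data structure here is a hypothetical placeholder, since the dissertation's actual intermediate representation is not detailed in this summary.

```python
# Hypothetical sketch of a VSS-style three-phase synthesis flow.
# Data structures are toy placeholders, not the dissertation's IR.

def graph_compilation(behavior):
    """Translate a behavioral description into a typed dataflow graph."""
    # e.g. "c = a + b" becomes one ADD node with two inputs
    return [{"op": "ADD", "ins": ["a", "b"], "out": "c"}]

def graph_criticism(graph):
    """Optimize the graph, e.g. drop operations whose result is unused."""
    used = {i for node in graph for i in node["ins"]}
    return [n for n in graph if n["out"] in used or n is graph[-1]]

def design_compilation(graph):
    """Map graph nodes onto generic components, producing a netlist."""
    return [f'{node["op"]}_unit -> {node["out"]}' for node in graph]

netlist = design_compilation(graph_criticism(graph_compilation("c = a + b")))
print(netlist)  # ['ADD_unit -> c']
```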
Compiler Discovered Dynamic Scheduling of Irregular Code in High-Level Synthesis
Dynamically scheduled high-level synthesis (HLS) achieves higher throughput
than static HLS for codes with unpredictable memory accesses and control flow.
However, excessive dataflow scheduling results in circuits that use more
resources and have a slower critical path, even when only a part of the circuit
exhibits dynamic behavior. Recent work has shown that marking parts of a
dataflow circuit for static scheduling can save resources and improve
performance (hybrid scheduling), but the dynamic part of the circuit still
bottlenecks the critical path. We propose instead to selectively introduce
dynamic scheduling into static HLS. This paper presents an algorithm for
identifying code regions amenable to dynamic scheduling and shows a methodology
for introducing dynamically scheduled basic blocks, loops, and memory
operations into static HLS. Our algorithm is informed by modulo scheduling and
can be integrated into any modulo-scheduled HLS tool. On a set of ten
benchmarks, we show that our approach achieves up to 3.7× and 3× average
speedups against dynamic and hybrid scheduling, respectively, with an area
overhead of 1.3× and a frequency degradation of 0.74× when compared to static
HLS.
Comment: To appear in the 33rd International Conference on Field-Programmable
Logic and Applications (2023)
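The abstract does not spell out the selection criterion, but one plausible reading of a modulo-scheduling-informed algorithm is sketched below: a loop is marked for dynamic scheduling only when its recurrence-constrained initiation interval (II) is inflated by a memory dependence the compiler cannot prove. The names and the II computation are illustrative assumptions, not the paper's actual algorithm.

```python
# Hedged sketch: decide per loop whether dynamic scheduling could pay off.
# A modulo scheduler computes a recurrence-constrained initiation interval
# (rec_ii). If rec_ii is driven up only by *possible* (unproven) memory
# dependences, resolving them at runtime could restore a lower II, so the
# loop is a candidate for dynamic scheduling. Purely illustrative.

from dataclasses import dataclass

@dataclass
class Dep:
    latency: int      # cycles along the dependence cycle
    distance: int     # iteration distance of the recurrence
    proven: bool      # True if the dependence definitely exists

def rec_ii(deps):
    """Recurrence II = max over cycles of ceil(latency / distance)."""
    return max((-(-d.latency // d.distance) for d in deps), default=1)

def schedule_mode(deps):
    ii_all = rec_ii(deps)                            # assume every dep holds
    ii_proven = rec_ii([d for d in deps if d.proven])
    # Dynamic scheduling helps only if unproven deps are the bottleneck.
    return "dynamic" if ii_all > ii_proven else "static"

# A loop whose only long recurrence is an unproven store-to-load dependence:
loop = [Dep(latency=4, distance=1, proven=False),
        Dep(latency=1, distance=1, proven=True)]
print(schedule_mode(loop))  # dynamic
```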
A High-Frequency Load-Store Queue with Speculative Allocations for High-Level Synthesis
Dynamically scheduled high-level synthesis (HLS) enables the use of
load-store queues (LSQs), which can disambiguate data hazards at circuit
runtime, increasing throughput in codes with unpredictable memory accesses.
However, the increased throughput comes at the price of lower clock frequency
and higher resource usage compared to statically scheduled circuits without
LSQs. The lower frequency often nullifies any throughput improvements over
static scheduling, while the resource usage becomes prohibitively expensive
with large queue sizes. This paper presents a method for achieving dynamically
scheduled memory operations in HLS without significant clock period and
resource usage increase. We present a novel LSQ based on shift registers,
enabled by the opportunity to specialize queue sizes to a target code in HLS.
We show a method to speculatively allocate addresses to our LSQ, significantly
increasing pipeline parallelism in codes that could not benefit from an LSQ
before. In stark contrast to traditional load value speculation, we do not
require pipeline replays and incur no overhead on misspeculation. On a set of
benchmarks with data hazards, our approach achieves an average speedup of 11×
against static HLS and 5× against dynamic HLS that uses a state-of-the-art LSQ
from previous work. Our LSQ also uses several times fewer resources, scaling
to queues with hundreds of entries, and supports both on-chip and off-chip
memory.
Comment: To appear in the International Conference on Field Programmable
Technology (FPT'23), Yokohama, Japan, 11-14 December 2023
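For readers unfamiliar with load-store queues, the sketch below simulates the core disambiguation rule an LSQ enforces at runtime: a load may complete only once every older store has a known address, either bypassing non-aliasing stores or taking a forwarded value from an aliasing one. This models the generic mechanism only; the paper's shift-register organization and speculative address allocation (which avoids the stall shown here) are not modeled, and all names are hypothetical.

```python
# Minimal simulation of LSQ-style memory disambiguation (hypothetical
# names; generic rule only, not the paper's shift-register LSQ).

from collections import deque

class LSQ:
    def __init__(self):
        self.stores = deque()          # older stores in program order

    def allocate_store(self):
        entry = {"addr": None, "value": None}  # address not yet computed
        self.stores.append(entry)
        return entry

    def issue_load(self, addr, memory):
        """Return the load's value, or None if it must stall."""
        forwarded = None
        for st in self.stores:                 # oldest -> youngest
            if st["addr"] is None:
                return None                    # unknown older store: stall
            if st["addr"] == addr:
                forwarded = st["value"]        # store-to-load forwarding
        return forwarded if forwarded is not None else memory[addr]

# Usage: a load behind a pending store must wait until the store's
# address is known, then can safely bypass it or take its value.
mem = {0: 10, 4: 20}
lsq = LSQ()
st = lsq.allocate_store()
print(lsq.issue_load(4, mem))  # None: older store address unknown -> stall
st["addr"], st["value"] = 0, 99
print(lsq.issue_load(4, mem))  # 20: no alias, load safely bypasses store
print(lsq.issue_load(0, mem))  # 99: aliasing store forwards its value
```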
Efficient Software Implementation of Stream Programs
The way we use computers and mobile phones today requires large amounts of processing of data streams. Examples include digital signal processing for wireless transmission, audio and video coding for recording and watching videos, and noise reduction for phone calls. These tasks can be performed by stream programs, that is, computer programs that process streams of data. Stream programs can be composed of other stream programs. Components of a composition are connected in a network, i.e., the output streams of one component are sent as input streams to other components. The components that perform the actual computation are called kernels. They can be described in different styles and programming languages. There are also formal models for describing the kernels and the networks. One such model is the actor machine. This dissertation evaluates the actor machine and how it facilitates creating efficient software implementations of stream programs. The evaluation is divided into four aspects: (1) analyzability of its structure, (2) generality in what languages and styles it can express, (3) efficient implementation of kernels, and (4) efficient implementation of networks. This dissertation demonstrates all four aspects through the implementation and evaluation of a stream program compiler based on actor machines.
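As a small illustration of the composition model described above, the sketch below wires two kernels into a network using Python generators as streams. This is a generic stream-programming example under assumed names (scale, moving_sum), not the actor machine formalism or the dissertation's compiler.

```python
# Tiny stream network built from Python generators (illustrative only;
# the actor machine model from the dissertation is not modeled here).
# Each kernel consumes an input stream and produces an output stream,
# and composition is just connecting outputs to inputs.

def scale(stream, factor):
    """Kernel: multiply every sample by a constant."""
    for sample in stream:
        yield sample * factor

def moving_sum(stream, window=2):
    """Kernel: sliding-window sum, a toy stand-in for a filter."""
    buf = []
    for sample in stream:
        buf.append(sample)
        if len(buf) > window:
            buf.pop(0)
        yield sum(buf)

# Network: source -> scale -> moving_sum -> sink.
source = iter([1, 2, 3, 4])
pipeline = moving_sum(scale(source, 10))
print(list(pipeline))  # [10, 30, 50, 70]
```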