Search CORE

662 research outputs found

DALiuGE: A Graph Execution Framework for Harnessing the Astronomical Data Deluge

Author: An Tao
Boulton Mark
Cooper Ian
Dodson Richard
Dolensky Markus
Lao Baoqiang
Mei Ying
Pallot Dave
Tobar Rodrigo
Vinsen Kevin
Wang Feng
Wang Ruonan
Wicenec Andreas
Wu Chen
Publication venue
Publication date: 01/01/2017
Field of study

The Data Activated Liu Graph Engine - DALiuGE - is an execution framework for processing large astronomical datasets at a scale required by the Square Kilometre Array Phase 1 (SKA1). It includes an interface for expressing complex data reduction pipelines consisting of both data sets and algorithmic components and an implementation run-time to execute such pipelines on distributed resources. By mapping the logical view of a pipeline to its physical realisation, DALiuGE separates the concerns of multiple stakeholders, allowing them to collectively optimise large-scale data processing solutions in a coherent manner. The execution in DALiuGE is data-activated, where each individual data item autonomously triggers the processing on itself. Such decentralisation also makes the execution framework very scalable and flexible, supporting pipeline sizes ranging from less than ten tasks running on a laptop to tens of millions of concurrent tasks on the second fastest supercomputer in the world. DALiuGE has been used in production for reducing interferometry data sets from the Karl E. Jansky Very Large Array and the Mingantu Ultrawide Spectral Radioheliograph; and is being developed as the execution framework prototype for the Science Data Processor (SDP) consortium of the Square Kilometre Array (SKA) telescope. This paper presents a technical overview of DALiuGE and discusses case studies from the CHILES and MUSER projects that use DALiuGE to execute production pipelines. In a companion paper, we provide in-depth analysis of DALiuGE's scalability to very large numbers of tasks on two supercomputing facilities.Comment: 31 pages, 12 figures, currently under review by Astronomy and Computin

arXiv.org e-Print Archive

Open predicate path expressions for distributed environments: notation, implementation, and extensions

Author: Headington Mark Roger
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/1984
Field of study

This dissertation introduces open predicate path expressions --a non-procedural, very-high-level language notation for the synchronization of concurrent accesses to shared data in distributed computer systems. The target environment is one in which resource modules (totally encapsulated instances of abstract data types) are the basic building blocks in a network of conventional, von Neumann computers or of functional, highly parallel machines. Each resource module will contain two independent submodules: a synchronization submodule which coordinates requests for access to the resource\u27s data and an access-mechanism submodule which localizes the code for operations on that data;Open predicate path expressions are proposed as a specification language for the synchronization submodule and represent a blend of two existing path notations: open path expressions and predicate path expressions. Motivations for the adoption of this new notation are presented, and an implementation semantics for the notation is presented in the form of dataflow graphs;An algorithm is presented which will automatically synthesize an open predicate path expression into a dataflow graph, which is then implemented by a network of communicating submodules written in either a sequential or an applicative language. Finally, an extended notation for the synchronization submodule is proposed, the purpose of which is to provide greater expressive power for certain synchronization problems which are difficult to specify using path expressions alone

Web service composition: A survey of techniques and tools

Author: Benatallah Boualem
Daniel Florian
Lemos Angel Lagares
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

Web services are a consolidated reality of the modern Web with tremendous, increasing impact on everyday computing tasks. They turned the Web into the largest, most accepted, and most vivid distributed computing platform ever. Yet, the use and integration of Web services into composite services or applications, which is a highly sensible and conceptually non-trivial task, is still not unleashing its full magnitude of power. A consolidated analysis framework that advances the fundamental understanding of Web service composition building blocks in terms of concepts, models, languages, productivity support techniques, and tools is required. This framework is necessary to enable effective exploration, understanding, assessing, comparing, and selecting service composition models, languages, techniques, platforms, and tools. This article establishes such a framework and reviews the state of the art in service composition from an unprecedented, holistic perspective

Archivio istituzionale della ricerca - Politecnico di Milano

Habitat Preferences of Adult Spotted Seatrout, Cynoscion nebulosus, in Lake Pontchartrain, Louisiana

Author: Bramer Noelle Marie
Publication venue: LSU Digital Commons
Publication date: 01/01/2015
Field of study

Spotted seatrout (Cynoscion nebulosus) are a highly sought after sportfish, making up over 90% of the recreational fishery in Louisiana. As a significant portion of every life history stage is spent within its natal estuary, it is an ideal bio-indicator of estuarine health. As one of the largest estuaries in Louisiana, Lake Pontchartrain represents one such supporting ecosystem. From November 2012 to April 2014 acoustic tagging of individual fish, a lake-wide receiver array, and ArcGIS mapping software were utilized to determine the spatial distribution of spotted seatrout within the lake. Receivers were placed in representative locations including man-made and natural structures. The prevalence of fish “hot spots,” between these locations were compared and thus bottom habitat preferences examined. Water quality parameters, including water temperature, dissolved oxygen, salinity, turbidity, colored dissolved organic matter, chlorophyll a, and hydrocarbons are biochemical factors with the potential to drive the species’ distribution. As such, a flow-through water sampling system was used to obtain monthly “snapshots” of conditions across the lake. Combined with presence/absence receiver data, any water quality preferences were examined. Overall, spotted seatrout showed a distinct preference for the central, central-north, and northeastern areas of the lake. It was also noted, however, that the species showed no distinct preference for a single bottom type, but utilized every habitat in the lake. With respect to water quality, salinity and temperature were determined to be the most important features for the species’ distribution. According to the generalized linear model produced, every unit increase in salinity (ppt) improved the odds of observing a spotted seatrout by almost five times while a unit increase in temperature improved the odds by approximately 11 percent. The above results are in agreement with the extensive literature on the species and its relationship to bottom types and water chemistry, but still leave questions of habitat use/preference in relation to the potential influences of life stage adaptations, availability of food resources, food web dynamics, or major environmental events. Future research in these areas will serve as important additions to the ecosystem-based strategy of management for this valuable Louisiana fishery

Louisiana State University

A Methodology to Design Pipelined Simulated Annealing Kernel Accelerators on Space-Borne Field-Programmable Gate Arrays

Author: Carver Jeffrey Michael
Publication venue: DigitalCommons@USU
Publication date: 01/05/2009
Field of study

Increased levels of science objectives expected from spacecraft systems necessitate the ability to carry out fast on-board autonomous mission planning and scheduling. Heterogeneous radiation-hardened Field Programmable Gate Arrays (FPGAs) with embedded multiplier and memory modules are well suited to support the acceleration of scheduling algorithms. A methodology to design circuits specifically to accelerate Simulated Annealing Kernels (SAKs) in event scheduling algorithms is shown. The main contribution of this thesis is the low complexity scoring calculation used for the heuristic mapping algorithm used to balance resource allocation across a coarse-grained pipelined data-path. The methodology was exercised over various kernels with different cost functions and problem sizes. These test cases were benchedmarked for execution time, resource usage, power, and energy on a Xilinx Virtex 4 LX QR 200 FPGA and a BAE RAD 750 microprocessor

HLS-ENABLEDDYNAMIC STREAM PROCESSING

Author: Kritikakis Charalampos
Publication venue
Publication date: 31/12/2021
Field of study

Compiling and optimizing spreadsheets for FPGA and multicore execution

Author: Hirsch Amir
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2007
Field of study

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007."September 2007."Includes bibliographical references (p. 102-104).A major barrier to developing systems on multicore and FPGA chips is an easy-to-use development environment. This thesis presents the RhoZeta spreadsheet compiler and Catalyst optimization system for programming multiprocessors and FPGAs. Any spreadsheet frontend may be extended to work with RhoZeta's multiple interpreters and behavioral abstraction mechanisms. RhoZeta synchronizes a variety of cell interpreters acting on a global memory space. RhoZeta can also compile a group of cells to multithreaded C or Verilog. The result is an easy-to-use interface for programming multicore microprocessors and FPGAs. A spreadsheet environment presents parallelism and locality issues of modem hardware directly to the user and allows for a simple global memory synchronization model. Catalyst is a spreadsheet graph rewriting system based on performing behaviorally invariant guarded atomic actions while a system is being interpreted by RhoZeta. A number of optimization macros were developed to perform speculation, resource sharing and propagation of static assignments through a circuit. Parallelization of a 64-bit serial leading-zero-counter is demonstrated with Catalyst. Fault tolerance macros were also developed in Catalyst to protect against dynamic faults and to offset costs associated with testing semiconductors for static defects. A model for partitioning, placing and profiling spreadsheet execution in a heterogeneous hardware environment is also discussed. The RhoZeta system has been used to design several multithreaded and FPGA applications including a RISC emulator and a MIDI controlled modular synthesizer.by Amir Hirsch.M.Eng

Restructuring Software Code for High-Level Synthesis using a Graph-based Approach Targeting FPGAs

Author: Afonso Soares Canas Ferreira
Publication venue
Publication date: 20/07/2018
Field of study

An accurate analysis for guaranteed performance of multiprocessor streaming applications

Author: Poplavko P.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2008
Field of study

Already for more than a decade, consumer electronic devices have been available for entertainment, educational, or telecommunication tasks based on multimedia streaming applications, i.e., applications that process streams of audio and video samples in digital form. Multimedia capabilities are expected to become more and more commonplace in portable devices. This leads to challenges with respect to cost efficiency and quality. This thesis contributes models and analysis techniques for improving the cost efficiency, and therefore also the quality, of multimedia devices. Portable consumer electronic devices should feature flexible functionality on the one hand and low power consumption on the other hand. Those two requirements are conflicting. Therefore, we focus on a class of hardware that represents a good trade-off between those two requirements, namely on domain-specific multiprocessor systems-on-chip (MP-SoC). Our research work contributes to dynamic (i.e., run-time) optimization of MP-SoC system metrics. The central question in this area is how to ensure that real-time constraints are satisfied and the metric of interest such as perceived multimedia quality or power consumption is optimized. In these cases, we speak of quality-of-service (QoS) and power management, respectively. In this thesis, we pursue real-time constraint satisfaction that is guaranteed by the system by construction and proven mainly based on analytical reasoning. That approach is often taken in real-time systems to ensure reliable performance. Therefore the performance analysis has to be conservative, i.e. it has to use pessimistic assumptions on the unknown conditions that can negatively influence the system performance. We adopt this hypothesis as the foundation of this work. Therefore, the subject of this thesis is the analysis of guaranteed performance for multimedia applications running on multiprocessors. It is very important to note that our conservative approach is essentially different from considering only the worst-case state of the system. Unlike the worst-case approach, our approach is dynamic, i.e. it makes use of run-time characteristics of the input data and the environment of the application. The main purpose of our performance analysis method is to guide the run-time optimization. Typically, a resource or quality manager predicts the execution time, i.e., the time it takes the system to process a certain number of input data samples. When the execution times get smaller, due to dependency of the execution time on the input data, the manager can switch the control parameter for the metric of interest such that the metric improves but the system gets slower. For power optimization, that means switching to a low-power mode. If execution times grow, the manager can set parameters so that the system gets faster. For QoS management, for example, the application can be switched to a different quality mode with some degradation in perceived quality. The real-time constraints are then never violated and the metrics of interest are kept as good as possible. Unfortunately, maintaining system metrics such as power and quality at the optimal level contradicts with our main requirement, i.e., providing performance guarantees, because for this one has to give up some quality or power consumption. Therefore, the performance analysis approach developed in this thesis is not only conservative, but also accurate, so that the optimization of the metric of interest does not suffer too much from conservativity. This is not trivial to realize when two factors are combined: parallel execution on multiple processors and dynamic variation of the data-dependent execution delays. We achieve the goal of conservative and accurate performance estimation for an important class of multiprocessor platforms and multimedia applications. Our performance analysis technique is realizable in practice in QoS or power management setups. We consider a generic MP-SoC platform that runs a dynamic set of applications, each application possibly using multiple processors. We assume that the applications are independent, although it is possible to relax this requirement in the future. To support real-time constraints, we require that the platform can provide guaranteed computation, communication and memory budgets for applications. Following important trends in system-on-chip communication, we support both global buses and networks-on-chip. We represent every application as a homogeneous synchronous dataflow (HSDF) graph, where the application tasks are modeled as graph nodes, called actors. We allow dynamic datadependent actor execution delays, which makes HSDF graphs very useful to express modern streaming applications. Our reason to consider HSDF graphs is that they provide a good basic foundation for analytical performance estimation. In this setup, this thesis provides three major contributions: 1. Given an application mapped to an MP-SoC platform, given the performance guarantees for the individual computation units (the processors) and the communication unit (the network-on-chip), and given constant actor execution delays, we derive the throughput and the execution time of the system as a whole. 2. Given a mapped application and platform performance guarantees as in the previous item, we extend our approach for constant actor execution delays to dynamic datadependent actor delays. 3. We propose a global implementation trajectory that starts from the application specification and goes through design-time and run-time phases. It uses an extension of the HSDF model of computation to reflect the design decisions made along the trajectory. We present our model and trajectory not only to put the first two contributions into the right context, but also to present our vision on different parts of the trajectory, to make a complete and consistent story. Our first contribution uses the idea of so-called IPC (inter-processor communication) graphs known from the literature, whereby a single model of computation (i.e., HSDF graphs) are used to model not only the computation units, but also the communication unit (the global bus or the network-on-chip) and the FIFO (first-in-first-out) buffers that form a ‘glue’ between the computation and communication units. We were the first to propose HSDF graph structures for modeling bounded FIFO buffers and guaranteed throughput network connections for the network-on-chip communication in MP-SoCs. As a result, our HSDF models enable the formalization of the on-chip FIFO buffer capacity minimization problem under a throughput constraint as a graph-theoretic problem. Using HSDF graphs to formalize that problem helps to find the performance bottlenecks in a given solution to this problem and to improve this solution. To demonstrate this, we use the JPEG decoder application case study. Also, we show that, assuming constant – worst-case for the given JPEG image – actor delays, we can predict execution times of JPEG decoding on two processors with an accuracy of 21%. Our second contribution is based on an extension of the scenario approach. This approach is based on the observation that the dynamic behavior of an application is typically composed of a limited number of sub-behaviors, i.e., scenarios, that have similar resource requirements, i.e., similar actor execution delays in the context of this thesis. The previous work on scenarios treats only single-processor applications or multiprocessor applications that do not exploit all the flexibility of the HSDF model of computation. We develop new scenario-based techniques in the context of HSDF graphs, to derive the timing overlap between different scenarios, which is very important to achieve good accuracy for general HSDF graphs executing on multiprocessors. We exploit this idea in an application case study – the MPEG-4 arbitrarily-shaped video decoder, and demonstrate execution time prediction with an average accuracy of 11%. To the best of our knowledge, for the given setup, no other existing performance technique can provide a comparable accuracy and at the same time performance guarantees

Repository TU/e