Run Time Approximation of Non-blocking Service Rates for Streaming Systems
Stream processing is a compute paradigm that promises safe and efficient
parallelism. Modern big-data problems are often well suited for stream
processing's throughput-oriented nature. Realization of efficient stream
processing requires monitoring and optimization of multiple communications
links. Most techniques to optimize these links use queueing network models or
network flow models, which require some idea of the actual execution rate of
each independent compute kernel within the system. What we want to know is how
fast each kernel can process data independently of other communicating kernels.
This is known as the "service rate" of the kernel within the queueing
literature. Current approaches to divining service rates are static. Modern
workloads, however, are often dynamic. Shared cloud systems also present
applications with highly dynamic execution environments (multiple users,
hardware migration, etc.). It is therefore desirable to continuously re-tune an
application during run time (online) in response to changing conditions. Our
approach enables online service rate monitoring under most conditions,
obviating the need for reliance on steady state predictions for what are
probably non-steady state phenomena. First, some of the difficulties associated
with online service rate determination are examined. Second, the algorithm to
approximate the online non-blocking service rate is described. Lastly, the
algorithm is implemented within the open source RaftLib framework for
validation using a simple microbenchmark as well as two full streaming
applications. Comment: technical report
Advanced subsonic transport propulsion
A brief review of the current NASA Energy Efficient Engine (E³) Project is presented. Included in this review are the factors that influenced the design of these turbofan engines and the advanced technology incorporated in them to reduce fuel consumption and improve environmental characteristics. In addition, factors such as the continuing spiral in fuel cost that could influence future aircraft propulsion systems beyond those represented by the E³ engines are also discussed. Advanced technologies that will address these influencing factors and provide viable future propulsion systems are described. The potential importance of other propulsion system types, such as geared fans and turboshaft engines, is presented.
Investigation of reactivity of launch vehicle materials with liquid oxygen Quarterly report, 23 Jul. - 22 Oct. 1968
Reactivity of launch vehicle organic materials with liquid oxygen
Throughput-optimal systolic arrays from recurrence equations
Many compute-bound software kernels have seen order-of-magnitude speedups on special-purpose accelerators built on specialized architectures such as field-programmable gate arrays (FPGAs). These architectures are particularly good at implementing dynamic programming algorithms that can be expressed as systems of recurrence equations, which in turn can be realized as systolic array designs. To efficiently find good realizations of an algorithm for a given hardware platform, we pursue software tools that can search the space of possible parallel array designs to optimize various design criteria. Most existing design tools in this area produce a design that is latency-space optimal. However, we instead wish to target applications that operate on a large collection of small inputs, e.g., a database of biological sequences. For such applications, overall throughput rather than latency per input is the most important measure of performance. In this work, we introduce a new procedure to optimize throughput of a systolic array subject to resource constraints, in this case the area and bandwidth constraints of an FPGA device. We show that the throughput of an array depends on the maximum number of lattice points executed by any processor in the array, which to a close approximation is determined solely by the array’s projection vector. We describe a bounded search process to find throughput-optimal projection vectors and a tool to perform automated design space exploration, discovering a range of array designs that are optimal for inputs of different sizes. We apply our techniques to the Nussinov RNA folding algorithm to generate multiple mappings of this algorithm into systolic arrays. By combining our library of designs with run-time reconfiguration of an FPGA device to dynamically switch among them, we predict significant speedup over a single, latency-space optimal array.
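The dependence on the projection vector can be illustrated with a minimal sketch. This is not the authors' tool. It assumes a two-dimensional integer domain and a primitive projection vector u = (a, b), so that two lattice points land on the same processor exactly when they differ by an integer multiple of u. The function names and the triangular domain (chosen because Nussinov-style recurrences fill an upper-triangular table) are illustrative assumptions.

```python
from collections import Counter
from math import gcd


def triangular_domain(n):
    """Lattice points (i, j) with 0 <= i <= j < n (hypothetical example domain)."""
    return [(i, j) for i in range(n) for j in range(i, n)]


def max_points_per_processor(points, u):
    """Largest number of lattice points mapped to one processor under projection u.

    Assumes u = (a, b) is a primitive integer vector (gcd(a, b) == 1). Points on
    the same line along u share the value a*j - b*i, which serves as a processor id.
    """
    a, b = u
    if gcd(a, b) != 1:
        raise ValueError("projection vector must be nonzero and primitive")
    load = Counter(a * j - b * i for i, j in points)
    return max(load.values())


if __name__ == "__main__":
    domain = triangular_domain(4)
    for u in [(1, 0), (1, 1), (1, -1)]:
        print(u, max_points_per_processor(domain, u))
```

In this toy domain the busiest processor handles four points under (1, 0) or (1, 1) but only two under (1, -1), which is the kind of difference a search over projection vectors would exploit.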
Investigation of Reactivity of Launch Vehicle Materials with Liquid Oxygen
Impact sensitivity and ignition mechanism of organic compounds in liquid oxygen correlated with chemical and physical properties