15 research outputs found
Effect of activation energy on detonation re-initiation behaviors in hydrogen-air mixtures
Two-dimensional simulations of a detonation propagating over a semi-cylinder in a channel filled with a stoichiometric hydrogen-air mixture are presented. The full set of Navier-Stokes equations is solved using a third-order WENO algorithm with HLLC flux, coupled with a calibrated, single-step chemical diffusive model (CDM). Simulation results for five effective activation energies (4, 6, 10, 12, and 14) are presented, featuring four distinct detonation attenuation regimes: unattenuated detonation transmission (4), critical detonation re-initiation (6 and 10), cycled detonation re-initiation (12), and complete quenching (14). The degree of cell irregularity and the intensity of triple points are found to be positively correlated with the effective activation energy. With a low effective activation energy (4), the CDM captures a regular cellular pattern, and the cellular structure remains intact as it propagates over the obstacle. With intermediate effective activation energies (6 and 10), the detonation cell size increases and the cell structures become less regular, with emerging multi-level cell structures. Here, a critical detonation re-initiation event is captured, in which a strong transverse detonation wave forms following the Mach shock reflection and eventually leads to steady detonation propagation. At high effective activation energy (12), the initial transverse detonations fail to produce a self-sustained detonation wave, and multiple ignition and quenching events occur before the detonation wave is finally established.
Computational diagnostics for flame acceleration and transition to detonation in a hydrogen/air mixture
A new computational diagnostic method for pressure-induced compressibility is proposed by projecting its local contribution to the chemical explosive mode (CEM) in the chemical explosive mode analysis (CEMA) framework. The new method is validated for the study of detonation development during the deflagration-to-detonation transition (DDT) process. The flame characteristics are identified through the quantification of individual CEM contributions of chemical reaction, diffusion, and pressure-induced compressibility. Numerical simulations are performed to investigate the DDT processes in a stoichiometric hydrogen-air mixture. A Godunov algorithm, fifth-order in space and third-order in time, is used to solve the fully compressible Navier-Stokes equations on a dynamically adapting mesh. A single-step, calibrated chemical diffusive model (CDM) described by Arrhenius kinetics is used for energy release and conservation between the fuel and the product. The new diagnostic method is first applied to one-dimensional (1D) canonical flame configurations, followed by two-dimensional (2D) simulations of DDT in an obstructed channel, where different detonation initiation scenarios are examined using the new CEMA projection formulation. The idealized configuration of detonation initiation through a shock-focusing mechanism at a flame front is also examined in detail using the new formulation. A comparison of the currently proposed CEMA projection and the original formulation by the authors suggests that including the pressure-induced compressibility is essential for the use of CEMA in the DDT process. The results also show that the new formulation of CEMA projection can successfully capture detonation initiation through either a gradient mechanism or a direct initiation mechanism, and can therefore be used as an effective local analytical tool for the computational diagnostics of detonation initiation in a DDT process.
It was found that detonation development is characterized by a strong contribution of chemistry to the CEM, which is pivotal to the initiation of detonation. The role of compressibility is found to be enhanced at the edge of the detonation front, where diffusion is found to have minimal effect on detonation development.
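As a rough illustration of the single-step Arrhenius kinetics mentioned in the abstract above, the sketch below evaluates a rate of the form A * rho * Y * exp(-Ea / (R * T)). The functional form, the name `arrhenius_rate`, and every numerical value are assumptions chosen for illustration. They are not the calibrated CDM parameters of the paper.

```python
import math

R_UNIVERSAL = 8.314462618  # universal gas constant, J/(mol K)


def arrhenius_rate(rho, Y, T, A, Ea):
    """Hypothetical single-step fuel consumption rate: A * rho * Y * exp(-Ea / (R T))."""
    return A * rho * Y * math.exp(-Ea / (R_UNIVERSAL * T))


# Illustrative, uncalibrated values (assumptions, not taken from the paper).
A = 1.0e9    # pre-exponential factor
Ea = 1.0e5   # activation energy, J/mol
rho = 1.0    # density
Y = 1.0      # fuel mass fraction

cold = arrhenius_rate(rho, Y, 1000.0, A, Ea)
hot = arrhenius_rate(rho, Y, 1500.0, A, Ea)

# The ratio shows how strongly the rate responds to temperature;
# a larger Ea makes this ratio larger for the same temperature change.
print(hot / cold)
```

The ratio of the two rates depends only on Ea and the two temperatures, which is why the activation energy controls how sensitive such a model is to temperature perturbations.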
OLLIE: Derivation-based Tensor Program Optimizer
Boosting the runtime performance of deep neural networks (DNNs) is critical
due to their wide adoption in real-world tasks. Existing approaches to
optimizing the tensor algebra expression of a DNN only consider expressions
representable by a fixed set of predefined operators, missing possible
optimization opportunities between general expressions. We propose OLLIE, the
first derivation-based tensor program optimizer. OLLIE optimizes tensor
programs by leveraging transformations between general tensor algebra
expressions, enabling a significantly larger expression search space that
includes those supported by prior work as special cases. OLLIE uses a hybrid
derivation-based optimizer that effectively combines explorative and guided
derivations to quickly discover highly optimized expressions. Evaluation on
seven DNNs shows that OLLIE can outperform existing optimizers by up to
2.73× (1.46× on average) on an A100 GPU and up to 2.68×
(1.51× on average) on a V100 GPU.
PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR
Deep neural networks (DNNs) are critical in many domains. To
accelerate DNN computation, tensor compilers have been proposed to generate efficient
code on different domain-specific accelerators. Existing tensor compilers
mainly focus on optimizing computation efficiency. However, memory access is
becoming a key performance bottleneck because the computational performance of
accelerators is increasing much faster than memory performance. The lack of a
direct description of memory access and data dependence in current tensor
compilers' intermediate representation (IR) makes it difficult to
generate memory-efficient code.
In this paper, we propose IntelliGen, a tensor compiler that can generate
high-performance code for memory-intensive operators by considering both
computation and data movement optimizations. IntelliGen represents a DNN program
using GIR, which includes primitives indicating its computation, data movement,
and parallel strategies. This information is further composed into an
instruction-level dataflow graph to perform holistic optimizations by searching
different memory access patterns and computation operations, and generating
memory-efficient code on different hardware. We evaluate IntelliGen on NVIDIA
GPU, AMD GPU, and Cambricon MLU, showing speedups of up to 1.97x, 2.93x, and
16.91x (1.28x, 1.23x, and 2.31x on average), respectively, compared to the current
most performant frameworks.
Comment: 12 pages, 14 figures
Robust estimation of bacterial cell count from optical density
Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.
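The calibration idea in the abstract above, relating OD readings to particle count through a serial dilution of microspheres, can be sketched as a proportional fit. This is a minimal sketch under stated assumptions: the function name `particles_per_od`, the least-squares fit through the origin, and all numbers are hypothetical and are not the study's protocol or data. It also assumes every reading is blank-subtracted and lies within the instrument's linear range.

```python
def particles_per_od(counts, od_values):
    """Least-squares slope through the origin for the model count ~ k * OD."""
    numerator = sum(c * od for c, od in zip(counts, od_values))
    denominator = sum(od * od for od in od_values)
    return numerator / denominator


# Hypothetical two-fold serial dilution of microspheres (made-up numbers).
start_count = 3.0e8
counts = [start_count / 2**i for i in range(6)]

# Made-up, blank-subtracted OD readings for the same wells.
od_values = [0.80, 0.40, 0.20, 0.10, 0.05, 0.025]

k = particles_per_od(counts, od_values)

# Convert a sample's OD reading into an estimated particle (cell) count.
sample_od = 0.3
estimated_cells = k * sample_od
print(k, estimated_cells)
```

In practice, readings that deviate from proportionality at the high or low end of the dilution series would be excluded before fitting, since they fall outside the effective linear range.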
Programming Matrices as Staged Sparse Rows to Generate Efficient Matrix-free Differential Equation Solver
Solving differential equations is a critical task in scientific computing.
Domain-specific languages (DSLs) have been a promising direction in achieving
performance and productivity, but the current state of the art only supports
stencil computation, leaving solvers requiring loop-carried dependencies aside.
Alternatively, sparse matrices can represent such equation solvers and are more
general than existing DSLs, but the performance is sacrificed.
This paper points out that sparse matrices can be represented as programs
instead of data, having both the generality from the matrix-based
representation and the performance from program optimizations. Based on the
idea, we propose the Staged Sparse Row (SSR) sparse matrix representation that
can efficiently cover applications on structured grids. With SSR
representation, users can intuitively define SSR matrices using generator
functions and use SSR matrices through a concise object-oriented interface. SSR
matrices can then be chained and applied to construct the algorithm, including
those with loop-carried dependences. We then apply a set of dedicated
optimizations, and ultimately simplify the SSR matrix-based codes into
straightforward matrix-free ones, which are efficient and friendly for further
analysis.
Implementing BT pseudo application in the NAS Parallel Benchmark, with less
than lines of code compared with the matrix-free reference FORTRAN
implementation, we achieved up to performance. Implementing a
matrix-free variant for the High-Performance Conjugate Gradient benchmark, we
achieve performance compared with the reference implementation,
while our implementation shares the same algorithm on the same programming
abstraction, which is sparse matrices.
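To illustrate the general idea of representing a sparse matrix as a program rather than as stored data, the sketch below defines the rows of a matrix with a generator function and applies the matrix without ever materializing it. This is not the SSR interface from the paper: the names `laplacian_row` and `apply_matrix` are hypothetical, and the sketch interprets the rows at run time, whereas the abstract describes simplifying such code into matrix-free form through dedicated optimizations.

```python
def laplacian_row(i, n):
    """Generator function: yield (column, value) pairs for row i of a 1D Laplacian."""
    if i > 0:
        yield i - 1, -1.0
    yield i, 2.0
    if i < n - 1:
        yield i + 1, -1.0


def apply_matrix(row_fn, x):
    """Matrix-free product y = A x: each row is produced on demand, never stored."""
    n = len(x)
    return [sum(value * x[col] for col, value in row_fn(i, n)) for i in range(n)]


x = [1.0, 2.0, 3.0, 4.0]
y = apply_matrix(laplacian_row, x)
print(y)
```

Because the matrix exists only as a function of the row index, the same definition works for any grid size, and no index or value arrays are kept in memory.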
Elastic properties of (Ti,Al,Si)N nanocomposite films
(Ti,Al,Si)N films have been prepared by d.c. and rf reactive magnetron sputtering, with Si contents in the range 2-11 at.% and Al contents between 4 and 19 at.%. Samples prepared in rotation mode (three magnetrons) presented densities between 4.0 and 4.6 g/cm³, while samples prepared in static mode (magnetron with Ti target with small pieces of Si and Al) displayed densities mainly in the range 3.0-3.9 g/cm³. For comparison purposes, Young's modulus was evaluated by both depth-sensing indentation and surface acoustic wave (SAW) techniques. Indentation results were systematically higher than those obtained by SAW. These discrepancies might be related to the relatively low density of the films. Hardness values of approximately 60 GPa were obtained for samples with a composition of approximately 28.5 at.% titanium, 12 at.% aluminium, 9.5 at.% silicon and 50 at.% nitrogen. XRD patterns showed the presence of two different crystalline phases, as in the case of (Ti,Si)N films. One is assigned to the TiN phase (lattice parameter of approx. 0.429 nm), and the second, the so-called solid solution that develops in situations of low surface mobility, revealed a lattice parameter (0.419 nm) slightly lower than that of bulk TiN.
http://www.sciencedirect.com/science/article/B6TVV-43WTXCV-N/1/19b034221f9e7e8378dbcc26ff46ca1