From partial data to out-of-sample parameter and observation estimation with diffusion maps and geometric harmonics

Abstract

Peer reviewed

A data-driven framework is presented that enables the prediction of quantities, either observations or parameters, given sufficient partial data. The framework is illustrated via a computational model of the deposition of Cu in a Chemical Vapor Deposition (CVD) reactor, where the reactor pressure, the deposition temperature, and the feed mass flow rate are important process parameters that determine the outcome of the process. The sampled observations are high-dimensional vectors containing the outputs of a detailed steady-state CFD model of the process, i.e. the values of velocity, pressure, temperature, and species mass fractions at each point in the discretization. A machine learning workflow is presented that predicts, out of sample, (a) observations (e.g. mass fraction in the reactor), given process parameters (e.g. inlet temperature); (b) process parameters, given observation data; and (c) partial observations (e.g. temperature in the reactor), given other partial observations (e.g. mass fraction in the reactor). The proposed workflow relies on two manifold learning schemes: Diffusion Maps and the associated Geometric Harmonics. Diffusion Maps are used to discover a reduced representation of the available data, and Geometric Harmonics to extend functions defined on the discovered manifold. In our work, a special use case of Geometric Harmonics, which we call Double Diffusion Maps, is formulated and implemented to map from the reduced representation back to (partial) observations and process parameters. A comparison of our manifold learning scheme to the traditional Gappy-POD approach is provided: ours can be thought of as a “Gappy DMAPs” approach. The presented methodology is easily transferable to application domains beyond reactor engineering.
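The two building blocks named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`sq_dists`, `diffusion_maps`, `gh_fit`, `gh_predict`), the toy curve standing in for the CFD data, the median-distance choice of kernel scale, and the eigenvalue cutoff `delta` are all assumptions made for illustration. The sketch computes diffusion-map coordinates with the density-removing (alpha = 1) normalization, then fits Geometric Harmonics on those reduced coordinates to map back to observations and the parameter, which is the role the abstract assigns to Double Diffusion Maps.

```python
import numpy as np


def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    return np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)


def diffusion_maps(X, eps, n_coords=1):
    """Return all eigenvalues (descending) and the leading non-trivial coordinates."""
    K = np.exp(-sq_dists(X, X) / eps)
    q = K.sum(axis=1)
    K = K / np.outer(q, q)                  # alpha = 1: factor out sampling density
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))         # symmetric matrix similar to the Markov matrix
    vals, vecs = np.linalg.eigh(S)          # eigh returns ascending eigenvalues
    vals, vecs = vals[::-1], vecs[:, ::-1]  # reorder to descending
    psi = vecs / np.sqrt(d)[:, None]        # eigenvectors of the Markov matrix
    return vals, psi[:, 1:n_coords + 1]     # skip the trivial constant eigenvector


def gh_fit(Y, F, eps, delta=1e-8):
    """Fit Geometric Harmonics for function values F sampled at the points Y."""
    W = np.exp(-sq_dists(Y, Y) / eps)
    lam, V = np.linalg.eigh(W)
    keep = lam > delta * lam.max()          # drop ill-conditioned harmonics (assumed cutoff)
    lam, V = lam[keep], V[:, keep]
    return {"Y": Y, "eps": eps, "lam": lam, "V": V, "coef": V.T @ F}


def gh_predict(model, Y_new):
    """Extend the fitted function to new points (Nystrom-type extension)."""
    k = np.exp(-sq_dists(Y_new, model["Y"]) / model["eps"])
    V_new = (k @ model["V"]) / model["lam"]  # extended harmonics at the new points
    return V_new @ model["coef"]


# Toy stand-in for the sampled data (assumption): one parameter p and
# three-component "observations" lying on a smooth curve.
p = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.cos(2.0 * p), np.sin(2.0 * p), p ** 2])

# Step 1: reduced representation of the observations.
vals, psi = diffusion_maps(X, eps=np.median(sq_dists(X, X)))

# Step 2: Geometric Harmonics on the reduced coordinate, mapping back to
# observations (first three columns) and the parameter (last column).
targets = np.column_stack([X, p])
model = gh_fit(psi, targets, eps=np.median(sq_dists(psi, psi)))
recon = gh_predict(model, psi)
```

Calling `gh_predict` with reduced coordinates that were not in the training set gives the out-of-sample estimate. Obtaining those coordinates from new partial observations needs an additional extension step that this sketch does not cover. In practice the kernel scales and the cutoff have to be tuned to the data.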
