A machine-learned (ML) model is developed to enhance the accuracy of
the turbulence transport equations of a Reynolds-averaged Navier-Stokes (RANS)
solver and is applied to the periodic hill test case, which involves complex flow
regimes such as attached boundary layers, shear layers, and separation and reattachment.
The accuracy of the model is investigated in extrapolation mode, i.e., the
test case has a much larger separation bubble and higher turbulence than the
training cases. A parametric study is also performed to understand the effect
of network hyperparameters on training and model accuracy and to quantify the
uncertainty in model accuracy due to the non-deterministic nature of the neural
network training. The study reveals that, for any network, a smaller-than-optimal
mini-batch size results in overfitting, and a larger-than-optimal mini-batch size
reduces accuracy. Data clustering is found to be an efficient approach to
prevent the machine-learned model from over-training on more prevalent flow
regimes, and results in a model with similar accuracy using almost one-third of
the training dataset. Feature importance analysis reveals that turbulence
production is correlated with shear strain in the free-shear region, with shear
strain and wall-distance and local velocity-based Reynolds number in the
boundary layer regime, and with streamwise velocity gradient in the
accelerating flow regime. The flow direction is found to be key in identifying
the flow separation and reattachment regimes. Machine-learned models perform poorly
in extrapolation mode, wherein the prediction shows less than 10% correlation
with Direct Numerical Simulation (DNS). A priori tests reveal that model
predictability improves significantly when part of the hill dataset is added
during training in a partial extrapolation mode; e.g., adding only 5% of the
hill data increases the correlation with DNS to 80%.

Comment: 50 pages, 18 figures
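
The data-clustering idea summarized above can be illustrated with a minimal sketch. It assumes each training sample has already been assigned a flow-regime cluster label by some clustering step; the abstract does not specify the clustering algorithm or the sampling rule, so the function name `balanced_subsample`, the `per_cluster` cap, and the label values below are hypothetical and are not the procedure used in the study. The sketch only shows how capping the number of samples drawn per cluster keeps a prevalent regime from dominating the training set.

```python
import random
from collections import defaultdict


def balanced_subsample(labels, per_cluster, seed=0):
    """Return sorted sample indices with at most `per_cluster` picks per cluster.

    `labels[i]` is the (assumed, precomputed) cluster label of sample i.
    """
    rng = random.Random(seed)

    # Group sample indices by their cluster label.
    by_cluster = defaultdict(list)
    for idx, label in enumerate(labels):
        by_cluster[label].append(idx)

    # Draw without replacement from each cluster, capped at `per_cluster`.
    chosen = []
    for label in sorted(by_cluster):
        members = by_cluster[label]
        k = min(per_cluster, len(members))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)


# Hypothetical labels: cluster 0 (e.g. a prevalent regime) dominates the raw data.
labels = [0] * 80 + [1] * 15 + [2] * 5
subset = balanced_subsample(labels, per_cluster=5)
counts = {c: sum(1 for i in subset if labels[i] == c) for c in set(labels)}
print(len(subset), counts)
```

In this toy example the reduced set holds an equal number of samples from each cluster, even though the raw data are heavily skewed toward cluster 0. The reduction ratio here is arbitrary and is not meant to reproduce the one-third figure reported above.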