Wastewater treatment plants (WWTPs) are designed to eliminate pollutants and
alleviate environmental pollution. However, the construction and operation of
WWTPs consume resources, emit greenhouse gases (GHGs), and produce residual
sludge, and therefore require further optimization. WWTPs are complex to control and
optimize because of high nonlinearity and variation. This study used a novel
technique, multi-agent deep reinforcement learning, to simultaneously optimize
dissolved oxygen and chemical dosage in a WWTP. The reward function was
specially designed from a life cycle perspective to achieve sustainable
optimization. Five scenarios were considered: a baseline, three scenarios with
different effluent quality, and a cost-oriented scenario. The results show that
optimization based on life cycle assessment (LCA) has lower environmental
impacts than the baseline scenario, with cost, energy consumption, and
greenhouse gas emissions reduced to 0.890 CNY/m3-ww, 0.530 kWh/m3-ww, and
2.491 kg CO2-eq/m3-ww, respectively. The
cost-oriented control strategy exhibits overall performance comparable to the
LCA-driven strategy, since it sacrifices environmental benefits but has a lower
cost of 0.873 CNY/m3-ww. It is worth mentioning that the retrofitting of WWTPs
based on resources should be implemented with consideration of impact
transfer. Specifically, the LCA SW scenario decreases eutrophication potential
by 10 kg PO4-eq compared to the baseline within 10 days, while
significantly increasing other indicators. The major contributors to each
indicator are identified for future study and improvement. Finally, the study
notes that novel dynamic control strategies require advanced sensors or
large amounts of data, so the selection of control strategies should also
consider economic and ecological conditions.