In this paper, we formally introduce, with rigorous derivations,
reinforcement learning to the field of inverse problems by designing an
iterative algorithm, called REINFORCE-IP, for solving a general class of
non-linear inverse problems. By choosing specific probability models for the
action-selection rule, we connect our approach to the conventional
regularization methods of Tikhonov regularization and iterative regularization.
For the numerical implementation of our approach, we parameterize the
solution-searching rule with the help of neural networks and iteratively
improve the parameter using the reinforcement-learning algorithm REINFORCE.
Under standard assumptions, we prove the almost sure convergence of the
parameter to a locally optimal value. Our work provides two typical examples
(non-linear integral equations and parameter-identification problems in partial
differential equations) of how reinforcement learning can be applied in solving
non-linear inverse problems. Our numerical experiments show that REINFORCE-IP
is an efficient algorithm that can escape from local minima and identify
multiple solutions for inverse problems with non-uniqueness.

Comment: 33 pages, 10 figures
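
As a rough illustration of the idea summarized above, the following minimal sketch applies a REINFORCE-style update to a toy non-linear inverse problem. It is not the REINFORCE-IP algorithm of the paper: the forward operator F(x) = x^2, the Gaussian sampling rule with a single mean parameter `theta` (in place of a neural network), the batch-mean baseline, and all names and hyperparameters are assumptions chosen only for illustration.

```python
import numpy as np


def forward(x):
    """Hypothetical toy forward operator F(x) = x^2 (two solutions for y > 0)."""
    return x ** 2


def reinforce_toy(y_obs, theta0, sigma=0.3, lr=0.01, n_iters=500, batch=32, seed=0):
    """REINFORCE-style search for x with F(x) close to y_obs.

    Candidates are sampled from N(theta, sigma^2); only the mean theta is learned.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    theta = float(theta0)
    for _ in range(n_iters):
        # Sample a batch of candidate solutions from the current Gaussian rule.
        x = theta + sigma * rng.standard_normal(batch)
        # Reward is the negative squared data misfit.
        reward = -(forward(x) - y_obs) ** 2
        # Batch-mean baseline to reduce the variance of the gradient estimate.
        baseline = reward.mean()
        # Score function: d/dtheta log N(x; theta, sigma^2) = (x - theta) / sigma^2.
        score = (x - theta) / sigma ** 2
        # Monte Carlo policy-gradient estimate, followed by a gradient-ascent step.
        grad = np.mean((reward - baseline) * score)
        theta += lr * grad
    return theta


if __name__ == "__main__":
    # Different initial means are drawn toward different solutions of x^2 = 4.
    print(reinforce_toy(4.0, theta0=1.0))
    print(reinforce_toy(4.0, theta0=-1.0))
```

In this toy setting, the two initializations settle near the two distinct solutions x = 2 and x = -2, which loosely mirrors the non-uniqueness discussed in the abstract. The learned mean need not coincide exactly with a solution, because the objective is averaged over the sampling noise.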