We study the problem of learning good heuristic functions for classical
planning tasks with neural networks, using samples that consist of states paired
with their cost-to-goal estimates. It is well known that the quality of a learned
model depends on the quality of its training data. Our main goal is to better understand the
influence of sample generation strategies on the performance of a greedy
best-first heuristic search guided by a learned heuristic function. In a set of
controlled experiments, we find that two main factors determine the quality of
the learned heuristic: the regions of the state space included in the samples
and the quality of the cost-to-goal estimates. These two factors are also
interdependent: having perfect cost-to-goal estimates is insufficient if the
samples cover an unrepresentative part of the state space.
Additionally, we study the effects of restricting samples to only include
states that could be evaluated when solving a given task and the effects of
adding samples with high-value estimates. Based on our findings, we propose
practical strategies to improve the quality of learned heuristics: three
strategies that aim to generate more representative states and two strategies
that improve the cost-to-goal estimates. Our resulting neural network heuristic
has higher coverage than a basic satisficing heuristic. Moreover, compared to a
baseline learned heuristic, our best neural network heuristic almost doubles
the mean coverage and, for some domains, increases it by more than six times.

Comment: 27 pages