Development and evaluation of automated localization and reconstruction
of all fruits on tomato plants in a greenhouse based on multi-view perception
and 3D multi-object tracking
Accurate representation and localization of relevant objects is important for
robots to perform tasks. Building a generic representation that can be used
across different environments and tasks is not easy, as the relevant objects
vary depending on the environment and the task. Agro-food environments pose an
additional challenge because of their complexity and their high levels of
clutter and occlusion. In this paper, we present a method to build generic
representations in highly occluded agro-food environments using multi-view
perception and 3D multi-object tracking. Our representation is built upon a
detection algorithm that generates a partial point cloud for each detected
object. The detected objects are then passed to a 3D multi-object tracking
algorithm that creates and updates the representation over time. The whole
process is performed at a rate of 10 Hz. We evaluated the accuracy of the
representation in a real-world agro-food environment, where it was able to
successfully represent and locate tomatoes on tomato plants despite a high
level of occlusion. We were able to estimate the total count of tomatoes with a
maximum error of 5.08% and to track tomatoes with a tracking accuracy up to
71.47%. Additionally, we showed that an evaluation using tracking metrics gives
more insight into the errors in localizing and representing the fruits.
Comment: Pre-print, article submitted and in review process
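
The detect-then-track loop described in the abstract can be illustrated with a minimal sketch. This is not the method used in the paper: the abstract does not say how detections are associated with existing objects, so the sketch assumes each detection is reduced to the 3D centroid of its partial point cloud and uses a simple greedy nearest-neighbour association. The names `Track` and `update_tracks`, the 5 cm gating distance `max_dist`, and the smoothing factor `alpha` are all hypothetical choices made for illustration.

```python
import math


class Track:
    """One tracked object in the representation (hypothetical structure)."""

    def __init__(self, track_id, position):
        self.id = track_id
        self.position = tuple(position)  # (x, y, z) centroid, assumed in metres
        self.hits = 1                    # number of frames this object was seen


def update_tracks(tracks, detections, max_dist=0.05, alpha=0.5):
    """Update the representation with the detections from one viewpoint.

    tracks     -- list of Track objects, modified in place and returned
    detections -- list of (x, y, z) centroids of detected partial point clouds
    max_dist   -- assumed gating distance; farther pairs are never matched
    alpha      -- assumed smoothing weight given to the new observation
    """
    # Enumerate every track/detection pair, closest pairs first.
    pairs = sorted(
        (math.dist(t.position, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )

    used_tracks, used_dets = set(), set()
    for dist, ti, di in pairs:
        if dist > max_dist:
            break  # all remaining pairs are even farther apart
        if ti in used_tracks or di in used_dets:
            continue
        # Matched: blend the old position with the new observation.
        t, d = tracks[ti], detections[di]
        t.position = tuple((1 - alpha) * p + alpha * q
                           for p, q in zip(t.position, d))
        t.hits += 1
        used_tracks.add(ti)
        used_dets.add(di)

    # Unmatched detections become new objects in the representation.
    next_id = max((t.id for t in tracks), default=-1) + 1
    for di, d in enumerate(detections):
        if di not in used_dets:
            tracks.append(Track(next_id, d))
            next_id += 1
    return tracks


# Two viewpoints: the second re-observes both fruits and reveals a third.
tracks = update_tracks([], [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0)])
tracks = update_tracks(tracks, [(0.02, 0.0, 0.0), (0.2, 0.0, 0.0), (0.5, 0.5, 0.5)])
estimated_count = len(tracks)
```

In this sketch the fruit count is simply the number of tracks after all viewpoints have been processed. That is why association errors, such as one fruit split into two tracks, show up directly in both the count and the tracking metrics.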