In geospatial planning, it is often essential to represent objects in a
vectorized format, as this format easily translates to downstream tasks such as
web development, graphics, or design. Such problems are frequently addressed
with semantic segmentation, which requires non-trivial post-processing to
vectorize the predicted objects. We instead present an Image-to-Sequence model
that infers shapes directly and is ready for
vector-based workflows out of the box. We evaluate the model in several ways,
including under perturbations of the input image that correspond to variations
or artifacts commonly encountered in remote sensing applications.
Our model outperforms prior work when using ground-truth bounding boxes (one
object per image), achieving the lowest maximum tangent angle error.

Comment: ICLR 2023 Workshop on Machine Learning in Remote Sensing