Humans infer missing visual information by focusing on spatial relationships in the context of their surroundings. Machine learning aims to replicate this skill through image completion, a fundamental task in current computer vision research. While advances in self-attention layers have recently enhanced generative models for text, these mechanisms still lack the capability to handle sparse image completion efficiently. We introduce a distance-based attention mechanism that uses radial basis weights to reconstruct an image efficiently. We compare this mechanism with self-attention and a fully connected network on an image completion task using the MNIST dataset. Our results show that the distance-based attention mechanism outperforms both baselines, producing more accurate reconstructions with greater efficiency.
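To make the core idea concrete, below is a minimal sketch of what a distance-based attention layer with radial weights might look like for sparse completion. This is an illustrative reading of the abstract, not the paper's actual implementation: the function name `radial_attention`, the choice of a Gaussian kernel, and the bandwidth hyperparameter `sigma` are all assumptions.

```python
import numpy as np

def radial_attention(query_coords, key_coords, key_values, sigma=2.0):
    """Distance-based attention sketch: each query position aggregates
    observed pixel values with weights that decay radially with distance.

    query_coords: (Q, 2) positions to reconstruct.
    key_coords:   (K, 2) positions of the observed (sparse) pixels.
    key_values:   (K, C) intensities/features at the observed pixels.
    sigma:        Gaussian bandwidth (assumed hyperparameter).
    """
    # Pairwise squared Euclidean distances between queries and keys.
    diff = query_coords[:, None, :] - key_coords[None, :, :]   # (Q, K, 2)
    sq_dist = (diff ** 2).sum(axis=-1)                          # (Q, K)

    # Radial weights: nearby observed pixels contribute more than distant ones.
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))             # (Q, K)
    weights /= weights.sum(axis=1, keepdims=True)               # row-normalize

    # Each output is a distance-weighted average of the observed values.
    return weights @ key_values                                 # (Q, C)

# Toy usage: fill in a 28x28 (MNIST-sized) image from ~10% observed pixels.
rng = np.random.default_rng(0)
grid = np.stack(np.meshgrid(np.arange(28), np.arange(28), indexing="ij"),
                axis=-1).reshape(-1, 2).astype(float)           # (784, 2)
mask = rng.random(784) < 0.1                                    # sparse sample
recon = radial_attention(grid, grid[mask], rng.random((mask.sum(), 1)))
print(recon.shape)  # (784, 1)
```

Unlike standard self-attention, the weights here depend only on fixed pixel coordinates rather than on learned query-key products, which is one plausible source of the efficiency gain the abstract reports.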