Scale-Equivariant UNet for Histopathology Image Segmentation
Digital histopathology slides are scanned and viewed under different
magnifications and stored as images at different resolutions. Convolutional
Neural Networks (CNNs) trained on such images at a given scale fail to
generalise to those at different scales. This inability is often addressed by
augmenting training data with re-scaled images, allowing a model with
sufficient capacity to learn the requisite patterns. Alternatively, designing
CNN filters to be scale-equivariant frees up model capacity to learn
discriminative features. In this paper, we propose the Scale-Equivariant UNet
(SEUNet) for image segmentation by building on scale-space theory. The SEUNet
contains groups of filters that are linear combinations of Gaussian basis
filters, whose scale parameters are trainable but constrained to span disjoint
scales through the layers of the network. Extensive experiments on a nuclei
segmentation dataset and a tissue type segmentation dataset demonstrate that
our method outperforms other approaches, with far fewer trainable parameters.
Comment: This paper was accepted by GeoMedIA 202
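The idea of building a filter as a linear combination of Gaussian basis filters can be illustrated with a minimal NumPy sketch. This is not the SEUNet implementation: the use of Gaussian derivatives up to second order, the function names `gaussian_basis` and `combine_basis`, and the fixed (non-trainable) `sigma` are all assumptions made for illustration. In a trained network the weights and the scale parameter would be learned.

```python
import numpy as np

def gaussian_basis(sigma, radius=None):
    """Stack of 2D Gaussian-derivative filters at scale sigma (hypothetical basis choice)."""
    if radius is None:
        radius = int(np.ceil(4 * sigma))  # truncate the kernel at 4 sigma
    ax = np.arange(-radius, radius + 1, dtype=np.float64)
    g = np.exp(-ax**2 / (2 * sigma**2))
    g /= g.sum()                                   # normalised 1D Gaussian
    g1 = -ax / sigma**2 * g                        # first derivative
    g2 = (ax**2 - sigma**2) / sigma**4 * g         # second derivative
    one_d = [g, g1, g2]
    # Separable 2D filters with total derivative order <= 2 (six filters).
    basis = [np.outer(one_d[i], one_d[j])
             for i in range(3) for j in range(3) if i + j <= 2]
    return np.stack(basis)                         # shape (6, k, k)

def combine_basis(basis, weights):
    """Linear combination of basis filters: sum_n weights[n] * basis[n]."""
    return np.tensordot(weights, basis, axes=1)    # shape (k, k)

# Same weights, two scales: the combined filter keeps its shape family
# while its spatial extent follows sigma.
weights = np.array([0.5, 1.0, -0.25, 1.0, 0.0, -0.25])
kernel_small = combine_basis(gaussian_basis(1.5), weights)
kernel_large = combine_basis(gaussian_basis(3.0), weights)
```

Because only the combination weights (and, in the paper's setting, the scale parameter) are free, the number of parameters per filter is independent of the kernel's spatial size.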
-Equivariant Vision Transformer
Vision Transformer (ViT) has achieved remarkable performance in computer
vision. However, positional encoding in ViT makes it considerably harder to
learn the intrinsic equivariance in data. Initial attempts have been made at
designing equivariant ViTs, but this paper proves them defective in some cases.
To address this issue, we design a Group Equivariant Vision Transformer
(GE-ViT) via a novel, effective positional encoding operator. We prove that
GE-ViT meets all the theoretical requirements of an equivariant neural network.
Comprehensive experiments are conducted on standard benchmark datasets,
demonstrating that GE-ViT significantly outperforms non-equivariant
self-attention networks. The code is available at
https://github.com/ZJUCDSYangKaifan/GEVit.
Comment: Accepted to UAI202
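The equivariance property discussed in this abstract, f(g·x) = g·f(x), can be checked numerically on a toy example. The sketch below is not GE-ViT or its positional encoding operator. It uses hypothetical helper names and a simple neighbour-averaging layer to suggest why adding an absolute positional map to the input can break equivariance under 90-degree rotations.

```python
import numpy as np

def local_mean(x):
    """Average of the four periodic neighbours; symmetric under 90-degree rotation."""
    return (np.roll(x, 1, 0) + np.roll(x, -1, 0)
            + np.roll(x, 1, 1) + np.roll(x, -1, 1)) / 4.0

def with_absolute_position(x):
    """Same layer, but a fixed row-index map is added to the input first."""
    n = x.shape[0]
    pos = np.arange(n, dtype=np.float64)[:, None] * np.ones((1, n))
    return local_mean(x + pos)

def is_rot90_equivariant(f, x):
    """Check f(rot90(x)) == rot90(f(x)) for one input x."""
    return bool(np.allclose(f(np.rot90(x)), np.rot90(f(x))))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))

plain_ok = is_rot90_equivariant(local_mean, x)              # layer without positions
pos_ok = is_rot90_equivariant(with_absolute_position, x)    # layer with absolute positions
```

Here `plain_ok` holds because the neighbourhood is rotation-symmetric, while `pos_ok` fails because the positional map does not rotate with the input.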