132 research outputs found

    Solid Spherical Energy (SSE) CNNs for Efficient 3D Medical Image Analysis

    Invariance to local rotation, as distinct from the global rotation of images and objects, is required in various texture analysis problems. It has led to several breakthrough methods such as local binary patterns, maximum response and steerable filterbanks. In particular, textures in medical images often exhibit local structures at arbitrary orientations. Locally Rotation Invariant (LRI) Convolutional Neural Networks (CNNs) were recently proposed, using 3D steerable filters to combine LRI with Directional Sensitivity (DS). Steerability avoids the cost of convolving with rotated kernels and comes with a parametric representation that drastically reduces the number of trainable parameters. Yet the potential bottleneck of this approach, in both memory and computation, lies in the need to recombine responses for a set of predefined discretized orientations. In this paper, we propose to calculate invariants from the responses to the set of spherical harmonics projected onto 3D kernels, in the form of a lightweight Solid Spherical Energy (SSE) CNN. It offers a compromise between the high kernel specificity of the LRI-CNN and a low memory/operations requirement. The computational gain is evaluated on 3D synthetic and pulmonary nodule classification experiments. The performance of the proposed approach is compared with steerable LRI-CNNs and standard 3D CNNs, showing competitive results with the state of the art.
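    The idea of an energy invariant can be illustrated with a small NumPy sketch. It is not the authors' implementation: it is restricted to degree 1, uses an assumed Gaussian radial profile, and checks only a right-angle rotation, for which the invariance is exact on a voxel grid. The names `degree1_kernels` and `degree1_energy` are hypothetical.

```python
import numpy as np

def degree1_kernels(size=7, sigma=1.5):
    """Three kernels g(r)*x, g(r)*y, g(r)*z on a centered cubic grid.

    The Gaussian radial profile g is an assumption made for illustration.
    """
    ax = np.arange(size) - size // 2
    z, y, x = np.meshgrid(ax, ax, ax, indexing="ij")
    g = np.exp(-(x**2 + y**2 + z**2) / (2 * sigma**2))
    return np.stack([g * x, g * y, g * z])

def degree1_energy(patch, kernels):
    """Sum of squared responses of the patch to the three degree-1 kernels."""
    coeffs = np.tensordot(kernels, patch, axes=3)  # shape (3,)
    return float(np.sum(coeffs**2))

rng = np.random.default_rng(0)
patch = rng.standard_normal((7, 7, 7))
kernels = degree1_kernels()

e_original = degree1_energy(patch, kernels)
# A right-angle rotation only permutes the three responses and flips signs,
# so the summed squares are unchanged.
e_rotated = degree1_energy(np.rot90(patch, 1, axes=(0, 1)), kernels)
print(np.isclose(e_original, e_rotated))
```

    The two energies should agree up to floating-point error, although the three individual responses differ. For rotations that are not multiples of 90 degrees, the agreement on a voxel grid is only approximate.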

    CubeNet: Equivariance to 3D Rotation and Translation

    3D Convolutional Neural Networks are sensitive to transformations applied to their input. This is a problem because a voxelized version of a 3D object and its rotated clone will look unrelated to each other after passing through to the last layer of a network. Instead, an idealized model would preserve a meaningful representation of the voxelized object while explaining the pose difference between the two inputs. An equivariant representation vector has two components: the invariant identity part and a discernible encoding of the transformation. Models that cannot explain pose differences risk "diluting" the representation in pursuit of optimizing a classification or regression loss function. We introduce a Group Convolutional Neural Network with linear equivariance to translations and right-angle rotations in three dimensions. We call this network CubeNet, reflecting its cube-like symmetry. By construction, this network helps preserve a 3D shape's global and local signature as it is transformed through successive layers. We apply this network to a variety of 3D inference problems, achieving state-of-the-art results on the ModelNet10 classification challenge and comparable performance on the ISBI 2012 Connectome Segmentation Benchmark. To the best of our knowledge, this is the first 3D rotation equivariant CNN for voxel representations. Comment: Preprint
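    Equivariance to right-angle rotations can be checked numerically with a short NumPy sketch. It is not the CubeNet code: it evaluates a single filter at a single location, and the helper names `cube_rotations` and `lifted_responses` are hypothetical. The filter is applied in all 24 cube orientations; rotating the input should only permute those 24 responses.

```python
import numpy as np

def cube_rotations(vol):
    """Yield the 24 right-angle rotations of a cubic 3D array."""
    def spins(v, axes):
        for k in range(4):
            yield np.rot90(v, k, axes)

    # Axis 0 pointing "up" or "down": 4 spins about axis 0 each.
    yield from spins(vol, (1, 2))
    yield from spins(np.rot90(vol, 2, axes=(0, 2)), (1, 2))
    # Axis 0 tipped onto +/- axis 2: 4 spins about axis 2 each.
    yield from spins(np.rot90(vol, 1, axes=(0, 2)), (0, 1))
    yield from spins(np.rot90(vol, -1, axes=(0, 2)), (0, 1))
    # Axis 0 tipped onto +/- axis 1: 4 spins about axis 1 each.
    yield from spins(np.rot90(vol, 1, axes=(0, 1)), (0, 2))
    yield from spins(np.rot90(vol, -1, axes=(0, 1)), (0, 2))

def lifted_responses(x, f):
    """Inner product of x with each of the 24 rotated copies of filter f."""
    return np.array([np.sum(x * r) for r in cube_rotations(f)])

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5, 5))   # input patch
f = rng.standard_normal((5, 5, 5))   # filter

resp = lifted_responses(x, f)
resp_rot = lifted_responses(np.rot90(x, 1, axes=(0, 2)), f)

# Same 24 values in a different order: the rotation is encoded as a
# permutation of the orientation channel, and pooling over it is invariant.
print(np.allclose(np.sort(resp), np.sort(resp_rot)))
print(np.isclose(resp.max(), resp_rot.max()))
```

    The permutation of the orientation channel plays the role of the "encoding of the transformation" in the abstract, while any symmetric function of the 24 responses (here the maximum) gives the invariant part.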

    Regular SE(3) Group Convolutions for Volumetric Medical Image Analysis

    Regular group convolutional neural networks (G-CNNs) have been shown to increase model performance and improve equivariance to different geometrical symmetries. This work addresses the problem of SE(3), i.e., roto-translation equivariance, on volumetric data. Volumetric image data is prevalent in many medical settings. Motivated by the recent work on separable group convolutions, we devise an SE(3) group convolution kernel separated into a continuous SO(3) (rotation) kernel and a spatial kernel. We approximate equivariance to the continuous setting by sampling uniform SO(3) grids. Our continuous SO(3) kernel is parameterized via RBF interpolation on similarly uniform grids. We demonstrate the advantages of our approach in volumetric medical image analysis. Our SE(3) equivariant models consistently outperform CNNs and regular discrete G-CNNs on challenging medical classification tasks and show significantly improved generalization capabilities. Our approach achieves up to a 16.5% gain in accuracy over regular CNNs. Comment: 10 pages, 1 figure, 2 tables, accepted at MICCAI 2023. Updated to the camera-ready version

    I2I: Image to Icosahedral Projection for SO(3) Object Reasoning from Single-View Images

    Reasoning about 3D objects based on 2D images is challenging due to large variations in appearance caused by viewing the object from different orientations. Ideally, our model would be invariant or equivariant to changes in object pose. Unfortunately, this is typically not possible with 2D image input because we do not have an a priori model of how the image would change under out-of-plane object rotations. The only SO(3)-equivariant models that currently exist require point cloud input rather than 2D images. In this paper, we propose a novel model architecture based on icosahedral group convolution that reasons in SO(3) by projecting the input image onto an icosahedron. As a result of this projection, the model is approximately equivariant to rotation in SO(3). We apply this model to an object pose estimation task and find that it outperforms reasonable baselines.

    General E(2)-Equivariant Steerable CNNs
