Leveraging Self-Supervised Vision Transformers for Neural Transfer Function Design
In volume rendering, transfer functions are used to classify structures of
interest, and to assign optical properties such as color and opacity. They are
commonly defined as 1D or 2D functions that map simple features to these
optical properties. As designing a transfer function is typically tedious and
unintuitive, several approaches have been proposed for its interactive
specification. In this paper, we present a novel method to
define transfer functions for volume rendering by leveraging the feature
extraction capabilities of self-supervised pre-trained vision transformers. To
design a transfer function, users simply select the structures of interest in a
slice viewer, and our method automatically selects similar structures based on
the high-level features extracted by the neural network. Contrary to previous
learning-based transfer function approaches, our method does not require
training of models and allows for quick inference, enabling an interactive
exploration of the volume data. Our approach reduces the number of necessary
annotations by interactively informing the user about the current
classification, so they can focus on the structures of interest that still
require annotation. In practice, this allows users to design transfer
functions within seconds instead of minutes. We compare our method to existing
learning-based approaches in terms of annotation and compute time, as well as
with respect to segmentation accuracy. Our accompanying video showcases the
interactivity and effectiveness of our method.
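The core idea of the abstract can be approximated with off-the-shelf components. Below is a minimal sketch, assuming a DINO ViT-S/8 backbone loaded via torch.hub and cosine similarity as the matching criterion; the paper's exact backbone, similarity measure, and function names (patch_features, similarity_map) are assumptions for illustration, not the authors' API:

```python
import torch
import torch.nn.functional as F

# A self-supervised vision transformer; DINO ViT-S/8 is one option
# (the paper's exact backbone is an assumption here).
model = torch.hub.load("facebookresearch/dino:main", "dino_vits8")
model.eval()

PATCH = 8  # patch size of the chosen backbone


@torch.no_grad()
def patch_features(slice_rgb):
    """L2-normalized per-patch features for one (3, H, W) slice.

    The slice should be ImageNet-normalized, with H and W multiples of
    the patch size (grayscale volume slices replicated to 3 channels).
    Returns a (h, w, C) feature grid.
    """
    _, H, W = slice_rgb.shape
    tokens = model.get_intermediate_layers(slice_rgb.unsqueeze(0), n=1)[0]
    feats = tokens[0, 1:]                    # drop the CLS token
    h, w = H // PATCH, W // PATCH
    return F.normalize(feats.reshape(h, w, -1), dim=-1)


@torch.no_grad()
def similarity_map(feats, annotated_yx):
    """Cosine similarity of every patch to the mean annotated feature.

    annotated_yx holds (row, col) patch indices the user brushed in the
    slice viewer; thresholding the returned (h, w) map yields the
    classification that drives color and opacity assignment.
    """
    query = torch.stack([feats[y, x] for y, x in annotated_yx]).mean(0)
    query = F.normalize(query, dim=-1)
    return feats @ query                     # values in [-1, 1]
```

Because only the lightweight similarity step runs per interaction (feature extraction happens once per slice), this kind of pipeline needs no training and can re-classify at interactive rates as the user annotates, which is the property the abstract emphasizes.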
Spatially Guiding Unsupervised Semantic Segmentation Through Depth-Informed Feature Distillation and Sampling
Traditionally, training neural networks to perform semantic segmentation has
required expensive human-made annotations. More recently, advances in
unsupervised learning have made significant progress on this issue, narrowing
the gap to supervised algorithms. To achieve this, semantic
knowledge is distilled by learning to correlate randomly sampled features from
images across an entire dataset. In this work, we build upon these advances by
incorporating information about the structure of the scene into the training
process through the use of depth information. We achieve this by (1) learning
depth-feature correlation, spatially correlating the feature maps with the
depth maps to induce knowledge about the structure of the scene, and (2)
implementing farthest-point sampling to select relevant features more
effectively by applying 3D sampling techniques to the depth information of the
scene.
Finally, we demonstrate the effectiveness of our technical contributions
through extensive experimentation and present significant improvements in
performance across multiple benchmark datasets.
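Both contributions lend themselves to short sketches. The following PyTorch version is assumption-laden: the farthest-point sampler is the standard greedy algorithm run on depth points back-projected with a pinhole camera model, and depth_feature_correlation_loss is a hypothetical stand-in whose exact form (including the exponential depth-similarity kernel) is not taken from the paper:

```python
import torch
import torch.nn.functional as F


def backproject(depth_map, fx, fy, cx, cy):
    """Back-project an (H, W) depth map to (H*W, 3) camera-space points
    using pinhole intrinsics (fx, fy, cx, cy)."""
    H, W = depth_map.shape
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    z = depth_map.flatten()
    x = (u.flatten() - cx) * z / fx
    y = (v.flatten() - cy) * z / fy
    return torch.stack([x, y, z], dim=-1)


def farthest_point_sampling(points, k):
    """Greedy farthest-point sampling over (N, 3) points.

    Returns indices of k points that are maximally spread out in 3D,
    replacing uniform random sampling in image space.
    """
    N = points.shape[0]
    idx = torch.zeros(k, dtype=torch.long)
    dist = torch.full((N,), float("inf"))
    idx[0] = torch.randint(N, (1,))
    for i in range(1, k):
        # Distance of every point to its nearest already-chosen sample.
        dist = torch.minimum(
            dist, (points - points[idx[i - 1]]).pow(2).sum(-1))
        idx[i] = dist.argmax()
    return idx


def depth_feature_correlation_loss(feats, depth, sample_idx):
    """Hypothetical loss aligning feature similarity with depth similarity.

    feats: (N, C) per-pixel features, depth: (N,) depth values.
    Encourages pixels at similar depth to have correlated features.
    """
    f = F.normalize(feats[sample_idx], dim=-1)
    feat_sim = f @ f.T                                       # (k, k)
    d = depth[sample_idx]
    depth_sim = torch.exp(-(d[:, None] - d[None, :]).abs())  # (k, k)
    return F.mse_loss(feat_sim, depth_sim)
```

Sampling in back-projected 3D space rather than uniformly in the image plane spreads the sampled features across the actual scene geometry, which is the motivation the abstract gives for using depth-informed sampling in the distillation step.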