Interactive Segmentation for Diverse Gesture Types Without Context
Interactive segmentation entails a human marking an image to guide how a
model either creates or edits a segmentation. Our work addresses limitations of
existing methods: they either only support one gesture type for marking an
image (e.g., either clicks or scribbles) or require knowledge of the gesture
type being employed, and they require specifying whether marked regions should
be included or excluded in the final segmentation. We instead propose a
simplified interactive segmentation task where a user must only mark an image,
and the input can be of any gesture type without specifying the gesture type.
We support this new task by introducing the first interactive segmentation
dataset with multiple gesture types as well as a new evaluation metric capable
of holistically evaluating interactive segmentation algorithms. We then analyze
numerous interactive segmentation algorithms, including ones adapted for our
novel task. While we observe promising performance overall, we also highlight
areas for future improvement. To facilitate further extensions of this work, we
publicly share our new dataset at https://github.com/joshmyersdean/dig
Semantic Photo Manipulation with a Generative Image Prior
Despite the recent success of GANs in synthesizing images conditioned on
inputs such as a user sketch, text, or semantic labels, manipulating the
high-level attributes of an existing natural photograph with GANs is
challenging for two reasons. First, it is hard for GANs to precisely reproduce
an input image. Second, after manipulation, the newly synthesized pixels often
do not fit the original image. In this paper, we address these issues by
adapting the image prior learned by GANs to image statistics of an individual
image. Our method can accurately reconstruct the input image and synthesize new
content, consistent with the appearance of the input image. We demonstrate our
interactive system on several semantic image editing tasks, including
synthesizing new objects consistent with background, removing unwanted objects,
and changing the appearance of an object. Quantitative and qualitative
comparisons against several existing methods demonstrate the effectiveness of
our method. Comment: SIGGRAPH 201
Road Redesign Technique Achieving Enhanced Road Safety by Inpainting with a Diffusion Model
Road infrastructure can affect the occurrence of road accidents. Therefore,
identifying roadway features with high accident probability is crucial. Here,
we introduce image inpainting that can assist authorities in achieving safe
roadway design with minimal intervention in the current roadway structure.
Our technique uses a diffusion model to inpaint safe roadway elements into a
roadway image, replacing accident-prone (AP) features. After
object-level segmentation, the AP features identified by the properties of
accident hotspots are masked by a human operator and safe roadway elements are
inpainted. With an average image inpainting time of only 2 min, the
likelihood of an image being classified as an accident hotspot drops by an
average of 11.85%. In addition, safe urban spaces can be designed considering
human factors of commuters such as gaze saliency. Considering this, we
introduce saliency enhancement that suggests chrominance alteration for a safe
road view. Comment: 9 pages, 6 figures, 4 tables
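The masking step described above can be illustrated with a minimal sketch of mask-based compositing: pixels outside the operator's mask are kept from the original image, and only masked pixels are taken from the generated content. This is an assumption about how "minimal intervention" might be enforced, not the authors' implementation; the function name `composite_inpainting` and the toy pixel values are hypothetical, and the diffusion model itself is replaced by a precomputed `generated` image.

```python
def composite_inpainting(original, generated, mask):
    """Blend two same-sized grayscale images given as nested lists.

    Where mask is 1 (the region marked by the operator), take the pixel
    from `generated`; where mask is 0, keep the pixel from `original`.
    `generated` stands in for the output of an inpainting model.
    """
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("all inputs must have the same number of rows")
    result = []
    for o_row, g_row, m_row in zip(original, generated, mask):
        if not (len(o_row) == len(g_row) == len(m_row)):
            raise ValueError("all rows must have the same width")
        result.append([g if m else o for o, g, m in zip(o_row, g_row, m_row)])
    return result


# Toy 2x2 example: only the top-right pixel is masked for replacement.
original = [[10, 20], [30, 40]]
generated = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]
print(composite_inpainting(original, generated, mask))
```

Compositing this way guarantees that unmasked parts of the roadway image are unchanged, whatever the generative model produces.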
Self-Sampling Meta SAM: Enhancing Few-shot Medical Image Segmentation with Meta-Learning
While the Segment Anything Model (SAM) excels in semantic segmentation for
general-purpose images, its performance significantly deteriorates when applied
to medical images, primarily attributable to insufficient representation of
medical images in its training dataset. Nonetheless, gathering comprehensive
datasets and training models that are universally applicable is particularly
challenging due to the long-tail problem common in medical images. To address
this gap, here we present a Self-Sampling Meta SAM (SSM-SAM) framework for
few-shot medical image segmentation. Our innovation lies in the design of three
key modules: 1) an online fast gradient descent optimizer, further optimized by
a meta-learner, which ensures swift and robust adaptation to new tasks; 2) a
Self-Sampling module designed to provide well-aligned visual prompts for
improved attention allocation; and 3) a robust attention-based decoder
specifically designed for medical few-shot learning to capture the relationships
between different slices. Extensive experiments on a popular abdominal CT
dataset and an MRI dataset demonstrate that the proposed method achieves
significant improvements over state-of-the-art methods in few-shot
segmentation, with average improvements of 10.21% and 1.80% in terms of DSC,
respectively. In conclusion, we present a novel approach for rapid online
adaptation in interactive image segmentation, adapting to a new organ in just
0.83 minutes. Code will be made publicly available on GitHub upon acceptance.
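The abstract above reports results in DSC without expanding the abbreviation; in medical image segmentation it conventionally denotes the Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The following is a minimal sketch of that formula on flattened binary masks. The function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two flat binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(round(dice_coefficient(pred, target), 4))  # 2 * 2 / (3 + 3) -> 0.6667
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixels.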