503 research outputs found
Proposal Flow: Semantic Correspondences from Object Proposals
Finding image correspondences remains a challenging problem in the presence
of intra-class variations and large changes in scene layout. Semantic flow
methods are designed to handle images depicting different instances of the same
object or scene category. We introduce a novel approach to semantic flow,
dubbed proposal flow, that establishes reliable correspondences using object
proposals. Unlike prevailing semantic flow approaches that operate on pixels or
regularly sampled local regions, proposal flow benefits from the
characteristics of modern object proposals, that exhibit high repeatability at
multiple scales, and can take advantage of both local and geometric consistency
constraints among proposals. We also show that the corresponding sparse
proposal flow can effectively be transformed into a conventional dense flow
field. We introduce two new challenging datasets that can be used to evaluate
both general semantic flow techniques and region-based approaches such as
proposal flow. We use these benchmarks to compare different matching
algorithms, object proposals, and region features within proposal flow, to the
state of the art in semantic flow. This comparison, along with experiments on
standard datasets, demonstrates that proposal flow significantly outperforms
existing semantic flow methods in various settings.Comment: arXiv admin note: text overlap with arXiv:1511.0506
Convolutional neural network architecture for geometric matching
We address the problem of determining correspondences between two images in
agreement with a geometric model such as an affine or thin-plate spline
transformation, and estimating its parameters. The contributions of this work
are three-fold. First, we propose a convolutional neural network architecture
for geometric matching. The architecture is based on three main components that
mimic the standard steps of feature extraction, matching and simultaneous
inlier detection and model parameter estimation, while being trainable
end-to-end. Second, we demonstrate that the network parameters can be trained
from synthetically generated imagery without the need for manual annotation and
that our matching layer significantly increases generalization capabilities to
never seen before images. Finally, we show that the same model can perform both
instance-level and category-level matching giving state-of-the-art results on
the challenging Proposal Flow dataset.Comment: In 2017 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR 2017
Attention Mechanisms in Medical Image Segmentation: A Survey
Medical image segmentation plays an important role in computer-aided
diagnosis. Attention mechanisms that distinguish important parts from
irrelevant parts have been widely used in medical image segmentation tasks.
This paper systematically reviews the basic principles of attention mechanisms
and their applications in medical image segmentation. First, we review the
basic concepts of attention mechanism and formulation. Second, we surveyed over
300 articles related to medical image segmentation, and divided them into two
groups based on their attention mechanisms, non-Transformer attention and
Transformer attention. In each group, we deeply analyze the attention
mechanisms from three aspects based on the current literature work, i.e., the
principle of the mechanism (what to use), implementation methods (how to use),
and application tasks (where to use). We also thoroughly analyzed the
advantages and limitations of their applications to different tasks. Finally,
we summarize the current state of research and shortcomings in the field, and
discuss the potential challenges in the future, including task specificity,
robustness, standard evaluation, etc. We hope that this review can showcase the
overall research context of traditional and Transformer attention methods,
provide a clear reference for subsequent research, and inspire more advanced
attention research, not only in medical image segmentation, but also in other
image analysis scenarios.Comment: Submitted to Medical Image Analysis, survey paper, 34 pages, over 300
reference
STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos
Existing methods for instance segmentation in videos typi-cally involve
multi-stage pipelines that follow the tracking-by-detectionparadigm and model a
video clip as a sequence of images. Multiple net-works are used to detect
objects in individual frames, and then associatethese detections over time.
Hence, these methods are often non-end-to-end trainable and highly tailored to
specific tasks. In this paper, we pro-pose a different approach that is
well-suited to a variety of tasks involvinginstance segmentation in videos. In
particular, we model a video clip asa single 3D spatio-temporal volume, and
propose a novel approach thatsegments and tracks instances across space and
time in a single stage. Ourproblem formulation is centered around the idea of
spatio-temporal em-beddings which are trained to cluster pixels belonging to a
specific objectinstance over an entire video clip. To this end, we introduce
(i) novel mix-ing functions that enhance the feature representation of
spatio-temporalembeddings, and (ii) a single-stage, proposal-free network that
can rea-son about temporal context. Our network is trained end-to-end to
learnspatio-temporal embeddings as well as parameters required to clusterthese
embeddings, thus simplifying inference. Our method achieves state-of-the-art
results across multiple datasets and tasks. Code and modelsare available at
https://github.com/sabarim/STEm-Seg.Comment: 28 pages, 6 figure
- …