1,311 research outputs found
Coarse-to-Fine Annotation Enrichment for Semantic Segmentation Learning
Rich high-quality annotated data is critical for semantic segmentation
learning, yet acquiring dense and pixel-wise ground-truth is both labor- and
time-consuming. Coarse annotations (e.g., scribbles, coarse polygons) offer an
economical alternative, with which training phase could hardly generate
satisfactory performance unfortunately. In order to generate high-quality
annotated data with a low time cost for accurate segmentation, in this paper,
we propose a novel annotation enrichment strategy, which expands existing
coarse annotations of training data to a finer scale. Extensive experiments on
the Cityscapes and PASCAL VOC 2012 benchmarks have shown that the neural
networks trained with the enriched annotations from our framework yield a
significant improvement over that trained with the original coarse labels. It
is highly competitive to the performance obtained by using human annotated
dense annotations. The proposed method also outperforms among other
state-of-the-art weakly-supervised segmentation methods.Comment: CIKM 2018 International Conference on Information and Knowledge
Managemen
Gabor Barcodes for Medical Image Retrieval
In recent years, advances in medical imaging have led to the emergence of
massive databases, containing images from a diverse range of modalities. This
has significantly heightened the need for automated annotation of the images on
one side, and fast and memory-efficient content-based image retrieval systems
on the other side. Binary descriptors have recently gained more attention as a
potential vehicle to achieve these goals. One of the recently introduced binary
descriptors for tagging of medical images are Radon barcodes (RBCs) that are
driven from Radon transform via local thresholding. Gabor transform is also a
powerful transform to extract texture-based information. Gabor features have
exhibited robustness against rotation, scale, and also photometric
disturbances, such as illumination changes and image noise in many
applications. This paper introduces Gabor Barcodes (GBCs), as a novel framework
for the image annotation. To find the most discriminative GBC for a given query
image, the effects of employing Gabor filters with different parameters, i.e.,
different sets of scales and orientations, are investigated, resulting in
different barcode lengths and retrieval performances. The proposed method has
been evaluated on the IRMA dataset with 193 classes comprising of 12,677 x-ray
images for indexing, and 1,733 x-rays images for testing. A total error score
as low as ( accuracy for the first hit) was achieved.Comment: To appear in proceedings of The 2016 IEEE International Conference on
Image Processing (ICIP 2016), Sep 25-28, 2016, Phoenix, Arizona, US
Binary Representation Learning for Large Scale Visual Data
The exponentially growing modern media created large amount of multimodal or multidomain visual data, which usually reside in high dimensional space. And it is crucial to provide not only effective but also efficient understanding of the data.In this dissertation, we focus on learning binary representation of visual dataset, whose primary use has been hash code for retrieval purpose. Simultaneously it serves as multifunctional feature that can also be used for various computer vision tasks. Essentially, this is achieved by discriminative learning that preserves the supervision information in the binary representation.By using deep networks such as convolutional neural networks (CNNs) as backbones, and effective binary embedding algorithm that is seamlessly integrated into the learning process, we achieve state-of-the art performance on several settings. First, we study the supervised binary representation learning problem by using label information directly instead of pairwise similarity or triplet loss. By considering images and associated textual information, we study the cross-modal representation learning. CNNs are used in both image and text embedding, and we are able to perform retrieval and prediction across these modalities. Furthermore, by utilizing unlabeled images from a different domain, we propose to use adversarial learning to connect these domains. Finally, we also consider progressive learning for more efficient learning and instance-level representation learning to provide finer granularity understanding. This dissertation demonstrates that binary representation is versatile and powerful under various circumstances with different tasks
- …