1,464 research outputs found
Clothing Co-Parsing by Joint Image Segmentation and Labeling
This paper aims at developing an integrated system of clothing co-parsing, in
order to jointly parse a set of clothing images (unsegmented but annotated with
tags) into semantic configurations. We propose a data-driven framework
consisting of two phases of inference. The first phase, referred as "image
co-segmentation", iterates to extract consistent regions on images and jointly
refines the regions over all images by employing the exemplar-SVM (E-SVM)
technique [23]. In the second phase (i.e. "region co-labeling"), we construct a
multi-image graphical model by taking the segmented regions as vertices, and
incorporate several contexts of clothing configuration (e.g., item location and
mutual interactions). The joint label assignment can be solved using the
efficient Graph Cuts algorithm. In addition to evaluate our framework on the
Fashionista dataset [30], we construct a dataset called CCP consisting of 2098
high-resolution street fashion photos to demonstrate the performance of our
system. We achieve 90.29% / 88.23% segmentation accuracy and 65.52% / 63.89%
recognition rate on the Fashionista and the CCP datasets, respectively, which
are superior compared with state-of-the-art methods.Comment: 8 pages, 5 figures, CVPR 201
Scene Graph Generation by Iterative Message Passing
Understanding a visual scene goes beyond recognizing individual objects in
isolation. Relationships between objects also constitute rich semantic
information about the scene. In this work, we explicitly model the objects and
their relationships using scene graphs, a visually-grounded graphical structure
of an image. We propose a novel end-to-end model that generates such structured
scene representation from an input image. The model solves the scene graph
inference problem using standard RNNs and learns to iteratively improves its
predictions via message passing. Our joint inference model can take advantage
of contextual cues to make better predictions on objects and their
relationships. The experiments show that our model significantly outperforms
previous methods for generating scene graphs using Visual Genome dataset and
inferring support relations with NYU Depth v2 dataset.Comment: CVPR 201
Convolutional Sparse Kernel Network for Unsupervised Medical Image Analysis
The availability of large-scale annotated image datasets and recent advances
in supervised deep learning methods enable the end-to-end derivation of
representative image features that can impact a variety of image analysis
problems. Such supervised approaches, however, are difficult to implement in
the medical domain where large volumes of labelled data are difficult to obtain
due to the complexity of manual annotation and inter- and intra-observer
variability in label assignment. We propose a new convolutional sparse kernel
network (CSKN), which is a hierarchical unsupervised feature learning framework
that addresses the challenge of learning representative visual features in
medical image analysis domains where there is a lack of annotated training
data. Our framework has three contributions: (i) We extend kernel learning to
identify and represent invariant features across image sub-patches in an
unsupervised manner. (ii) We initialise our kernel learning with a layer-wise
pre-training scheme that leverages the sparsity inherent in medical images to
extract initial discriminative features. (iii) We adapt a multi-scale spatial
pyramid pooling (SPP) framework to capture subtle geometric differences between
learned visual features. We evaluated our framework in medical image retrieval
and classification on three public datasets. Our results show that our CSKN had
better accuracy when compared to other conventional unsupervised methods and
comparable accuracy to methods that used state-of-the-art supervised
convolutional neural networks (CNNs). Our findings indicate that our
unsupervised CSKN provides an opportunity to leverage unannotated big data in
medical imaging repositories.Comment: Accepted by Medical Image Analysis (with a new title 'Convolutional
Sparse Kernel Network for Unsupervised Medical Image Analysis'). The
manuscript is available from following link
(https://doi.org/10.1016/j.media.2019.06.005
Recent Progress in Image Deblurring
This paper comprehensively reviews the recent development of image
deblurring, including non-blind/blind, spatially invariant/variant deblurring
techniques. Indeed, these techniques share the same objective of inferring a
latent sharp image from one or several corresponding blurry images, while the
blind deblurring techniques are also required to derive an accurate blur
kernel. Considering the critical role of image restoration in modern imaging
systems to provide high-quality images under complex environments such as
motion, undesirable lighting conditions, and imperfect system components, image
deblurring has attracted growing attention in recent years. From the viewpoint
of how to handle the ill-posedness which is a crucial issue in deblurring
tasks, existing methods can be grouped into five categories: Bayesian inference
framework, variational methods, sparse representation-based methods,
homography-based modeling, and region-based methods. In spite of achieving a
certain level of development, image deblurring, especially the blind case, is
limited in its success by complex application conditions which make the blur
kernel hard to obtain and be spatially variant. We provide a holistic
understanding and deep insight into image deblurring in this review. An
analysis of the empirical evidence for representative methods, practical
issues, as well as a discussion of promising future directions are also
presented.Comment: 53 pages, 17 figure
- …