Search CORE

14 research outputs found

ImageSpirit: Verbal Guided Image Parsing

Author: Cheng Ming-Ming
Crook Nigel
Lin Wen-Yan
Mitra Niloy
Sturgess Paul
Torr Philip
Vineet Vibhav
Warrell Jonathan
Zheng Shuai
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

Humans describe images in terms of nouns and adjectives while algorithms operate on images represented as sets of pixels. Bridging this gap between how humans would like to access images versus their typical representation is the goal of image parsing, which involves assigning object and attribute labels to pixel. In this paper we propose treating nouns as object labels and adjectives as visual attribute labels. This allows us to formulate the image parsing problem as one of jointly estimating per-pixel object and attribute labels from a set of training images. We propose an efficient (interactive time) solution. Using the extracted labels as handles, our system empowers a user to verbally refine the results. This enables hands-free parsing of an image into pixel-wise object/attribute labels that correspond to human semantics. Verbally selecting objects of interests enables a novel and natural interaction modality that can possibly be used to interact with new generation devices (e.g. smart phones, Google Glass, living room devices). We demonstrate our system on a large number of real-world images with varying complexity. To help understand the tradeoffs compared to traditional mouse based interactions, results are reported for both a large scale quantitative evaluation and a user study.Comment: http://mmcheng.net/imagespirit

arXiv.org e-Print Archive

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

UCL Discovery

Oxford University Research Archive

Oxford Brookes University: RADAR

Calipso: Physics-based Image and Video Editing through CAD Model Proxies

Author: Cotin Stephane
Courtecuisse Hadrien
Haouchine Nazim
Nießner Matthias
Roy Frederick
Publication venue
Publication date: 12/08/2017
Field of study

We present Calipso, an interactive method for editing images and videos in a physically-coherent manner. Our main idea is to realize physics-based manipulations by running a full physics simulation on proxy geometries given by non-rigidly aligned CAD models. Running these simulations allows us to apply new, unseen forces to move or deform selected objects, change physical parameters such as mass or elasticity, or even add entire new objects that interact with the rest of the underlying scene. In Calipso, the user makes edits directly in 3D; these edits are processed by the simulation and then transfered to the target 2D content using shape-to-image correspondences in a photo-realistic rendering process. To align the CAD models, we introduce an efficient CAD-to-image alignment procedure that jointly minimizes for rigid and non-rigid alignment while preserving the high-level structure of the input shape. Moreover, the user can choose to exploit image flow to estimate scene motion, producing coherent physical behavior with ambient dynamics. We demonstrate Calipso's physics-based editing on a wide range of examples producing myriad physical behavior while preserving geometric and visual consistency.Comment: 11 page

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-Rennes 1

Intelligent visual media processing: when graphics meets vision

Author: Cheng Ming-Ming
Hou Qi-Bin
Rosin Paul L
Zhang Song-Hai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

The computer graphics and computer vision communities have been working closely together in recent years, and a variety of algorithms and applications have been developed to analyze and manipulate the visual media around us. There are three major driving forces behind this phenomenon: i) the availability of big data from the Internet has created a demand for dealing with the ever increasing, vast amount of resources; ii) powerful processing tools, such as deep neural networks, provide e�ective ways for learning how to deal with heterogeneous visual data; iii) new data capture devices, such as the Kinect, bridge between algorithms for 2D image understanding and 3D model analysis. These driving forces have emerged only recently, and we believe that the computer graphics and computer vision communities are still in the beginning of their honeymoon phase. In this work we survey recent research on how computer vision techniques bene�t computer graphics techniques and vice versa, and cover research on analysis, manipulation, synthesis, and interaction. We also discuss existing problems and suggest possible further research directions

Crossref

Online Research @ Cardiff

Global contrast based salient region detection

Author: Cheng Ming-Ming
Hu Shi-Min
Huang Xiaolei
Mitra Niloy J
Torr Philip HS
Publication venue: IEEE
Publication date: 04/08/2014
Field of study

Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information

Oxford University Research Archive

A brief survey of visual saliency detection

Author: Guo Jie
Hussain Sumaira
Jian Muwei
Ullah Inam
Wang Xing
Yin Yilong
Yu Hui
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/04/2020
Field of study

Portsmouth University Research Portal (Pure)

Interactive sketch-driven image synthesis

Author: Campbell Neill D. F.
Goldman Dan B.
Kautz Jan
Turmukhambetov Daniyar
Publication venue: 'Wiley'
Publication date: 13/08/2015
Field of study

OPUS

Crossref

Interactive Sketch-Driven Image Synthesis

Author: Campbell ND
Goldman DB
Kautz J
Turmukhambetov D
Publication venue: Blackwell Publishing Ltd
Publication date: 01/01/2015
Field of study

We present an interactive system for composing realistic images of an object under arbitrary pose and appearance specified by sketching. Our system draws inspiration from a traditional illustration workflow: The user first sketches rough 'masses' of the object, as ellipses, to define an initial abstract pose that can then be refined with more detailed contours as desired. The system is made robust to partial or inaccurate sketches using a reduced-dimensionality model of pose space learnt from a labelled collection of photos. Throughout the composition process, interactive visual feedback is provided to guide the user. Finally, the user's partial or complete sketch, complemented with appearance requirements, is used to constrain the automatic synthesis of a novel, high-quality, realistic image

UCL Discovery

Guiding Image Manipulations using Shape-appearance Subspaces from Co-alignment of Image Collections

Author: Allen
AN
Barnes
Belongie
Bookstein
Caetano
Cashman
Cheng
Cootes
Feng
Gastal
Goldberg
HACohen
He
Horn
Huang
Igarashi
Lang
Lee
Levin
Li
Liao
Lischinski
Liu
Müller
Ovsjanikov
Pellacini
Schaefer
Seol
Serre
Seung
Shapira
Shih
Sumner
Tao
Turk
Tversky
Wang
XU
Publication venue: 'Wiley'
Publication date
Field of study

Crossref

Advanced Visual Computing for Image Saliency Detection

Author: Yuan Yuchen
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 07/04/2017
Field of study

Saliency detection is a category of computer vision algorithms that aims to filter out the most salient object in a given image. Existing saliency detection methods can generally be categorized as bottom-up methods and top-down methods, and the prevalent deep neural network (DNN) has begun to show its applications in saliency detection in recent years. However, the challenges in existing methods, such as problematic pre-assumption, inefficient feature integration and absence of high-level feature learning, prevent them from superior performances. In this thesis, to address the limitations above, we have proposed multiple novel models with favorable performances. Specifically, we first systematically reviewed the developments of saliency detection and its related works, and then proposed four new methods, with two based on low-level image features, and two based on DNNs. The regularized random walks ranking method (RR) and its reversion-correction-improved version (RCRR) are based on conventional low-level image features, which exhibit higher accuracy and robustness in extracting the image boundary based foreground / background queries; while the background search and foreground estimation (BSFE) and dense and sparse labeling (DSL) methods are based on DNNs, which have shown their dominant advantages in high-level image feature extraction, as well as the combined strength of multi-dimensional features. Each of the proposed methods is evaluated by extensive experiments, and all of them behave favorably against the state-of-the-art, especially the DSL method, which achieves remarkably higher performance against sixteen state-of-the-art methods (including ten conventional methods and six learning based methods) on six well-recognized public datasets. The successes of our proposed methods reveal more potential and meaningful applications of saliency detection in real-life computer vision tasks

Sydney eScholarship