
    Hybrid image representation methods for automatic image annotation: a survey

    In most automatic image annotation systems, images are represented with low-level features using either global or local methods. Global methods treat the entire image as a single unit. Local methods divide images into blocks, using fixed-size sub-image blocks as sub-units, or into regions, using segmented regions as sub-units. In contrast to typical automatic image annotation methods that use either global or local features exclusively, several recent methods incorporate both kinds of information, on the premise that combining the two levels of features is beneficial for annotating images. In this paper, we survey automatic image annotation techniques from the perspective of feature extraction and, to complement existing surveys in the literature, focus on the emerging class of image annotation methods: hybrid methods that combine global and local features for image representation.
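    To make the global/local distinction concrete, here is a minimal sketch of a hybrid representation, assuming simple intensity histograms as the low-level feature and a fixed 4x4 block grid; both choices are illustrative and not taken from any particular method in the survey.

```python
import numpy as np

def global_histogram(image, bins=16):
    """Global descriptor: one intensity histogram over the whole image."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def block_histograms(image, grid=4, bins=16):
    """Local descriptor: one histogram per fixed-size sub-image block."""
    h, w = image.shape[:2]
    feats = []
    for i in range(grid):
        for j in range(grid):
            block = image[i * h // grid:(i + 1) * h // grid,
                          j * w // grid:(j + 1) * w // grid]
            hist, _ = np.histogram(block, bins=bins, range=(0, 256))
            feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)

def hybrid_features(image):
    """Hybrid representation: concatenate global and local descriptors."""
    return np.concatenate([global_histogram(image), block_histograms(image)])

# Usage on a random grayscale "image":
img = np.random.randint(0, 256, size=(128, 128))
vec = hybrid_features(img)  # length = 16 + 4*4*16 = 272
```

    The annotation model then learns from the concatenated vector, so both the whole-image statistics and the spatially localized statistics contribute to each predicted label.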

    ImageSpirit: Verbal Guided Image Parsing

    Humans describe images in terms of nouns and adjectives, while algorithms operate on images represented as sets of pixels. Bridging this gap between how humans would like to access images and their typical representation is the goal of image parsing, which involves assigning object and attribute labels to pixels. In this paper we propose treating nouns as object labels and adjectives as visual attribute labels. This allows us to formulate the image parsing problem as one of jointly estimating per-pixel object and attribute labels from a set of training images. We propose an efficient (interactive-time) solution. Using the extracted labels as handles, our system empowers a user to verbally refine the results. This enables hands-free parsing of an image into pixel-wise object/attribute labels that correspond to human semantics. Verbally selecting objects of interest enables a novel and natural interaction modality that could be used to interact with a new generation of devices (e.g., smartphones, Google Glass, living room devices). We demonstrate our system on a large number of real-world images of varying complexity. To help understand the tradeoffs compared to traditional mouse-based interaction, results are reported for both a large-scale quantitative evaluation and a user study.
    Comment: http://mmcheng.net/imagespirit
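    The joint per-pixel object/attribute formulation can be illustrated with a toy sketch. The label vocabularies, the `parse_stub` stand-in for the parser, and the `refine` command grammar below are all hypothetical; the actual system infers these label maps with a trained model rather than random stubs.

```python
import numpy as np

# Hypothetical label vocabularies; the real system's sets differ.
OBJECTS = ["background", "sofa", "wall", "floor"]
ATTRIBUTES = ["wooden", "red", "glossy"]

def parse_stub(h, w):
    """Stand-in for the joint parser: per-pixel object ids plus a
    boolean attribute map of shape (h, w, num_attributes)."""
    rng = np.random.default_rng(0)
    objects = rng.integers(0, len(OBJECTS), size=(h, w))
    attributes = rng.random((h, w, len(ATTRIBUTES))) > 0.5
    return objects, attributes

def refine(objects, attributes, command):
    """Toy verbal refinement: a command like 'sofa is red' sets the
    named attribute on all pixels currently labelled as that object."""
    noun, _, adjective = command.partition(" is ")
    mask = objects == OBJECTS.index(noun)
    attributes[mask, ATTRIBUTES.index(adjective)] = True
    return attributes

objects, attributes = parse_stub(64, 64)
attributes = refine(objects, attributes, "sofa is red")
```

    The key design point is that objects and attributes live in separate per-pixel maps, so a verbal command can address one without disturbing the other.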

    Discriminative Region Proposal Adversarial Networks for High-Quality Image-to-Image Translation

    Image-to-image translation has made much progress with the adoption of Generative Adversarial Networks (GANs). However, translation tasks that demand high quality remain very challenging, especially at high resolution and with photorealism. In this paper, we present Discriminative Region Proposal Adversarial Networks (DRPAN) for high-quality image-to-image translation. We decompose the image-to-image translation task into three iterated steps: first, generate an image with correct global structure but some local artifacts (via a GAN); second, use our DRPnet to propose the most fake region in the generated image; and third, perform "image inpainting" on that region with a reviser to produce a more realistic result. The system (DRPAN) is thus gradually optimized to synthesize images with more attention on the most artifact-prone local parts. Experiments on a variety of image-to-image translation tasks and datasets validate that our method outperforms the state of the art in producing high-quality translation results, in terms of both human perceptual studies and automatic quantitative measures.
    Comment: ECCV 201
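    The three iterated steps can be sketched as a loop, with hypothetical stand-ins for the GAN generator, the DRPnet region proposer, and the reviser; the toy "fakeness" score (local variance) and mean-fill inpainting below are placeholders for the paper's learned networks, kept only to show the control flow.

```python
import numpy as np

def generator(x):
    """Step 1 stand-in: translate input x into an image with plausible
    global structure but possible local artifacts (a GAN in the paper)."""
    return x + np.random.normal(0, 0.1, x.shape)

def propose_fakest_region(y, size=16):
    """Step 2 stand-in (DRPnet): score sliding windows with a toy
    'fakeness' measure (local variance here) and return the worst one."""
    h, w = y.shape
    best, best_score = (0, 0), -1.0
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            score = y[i:i + size, j:j + size].var()
            if score > best_score:
                best, best_score = (i, j), score
    return best

def reviser(y, top_left, size=16):
    """Step 3 stand-in: 'inpaint' the proposed region; here we smooth it."""
    i, j = top_left
    y = y.copy()
    y[i:i + size, j:j + size] = y[i:i + size, j:j + size].mean()
    return y

x = np.random.rand(64, 64)
y = generator(x)
for _ in range(3):  # the iterated refinement loop
    region = propose_fakest_region(y)
    y = reviser(y, region)
```

    In the actual method, the proposer and reviser are trained adversarially, so each pass concentrates model capacity on whichever local region still looks least realistic.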
