Webly Supervised Learning of Convolutional Networks
We present an approach to utilize large amounts of web data for learning
CNNs. Specifically inspired by curriculum learning, we present a two-step
approach for CNN training. First, we use easy images to train an initial visual
representation. We then use this initial CNN and adapt it to harder, more
realistic images by leveraging the structure of data and categories. We
demonstrate that our two-stage CNN outperforms a fine-tuned CNN trained on
ImageNet on Pascal VOC 2012. We also demonstrate the strength of webly
supervised learning by localizing objects in web images and training an R-CNN
style detector. It achieves the best performance on VOC 2007 where no VOC
training data is used. Finally, we show our approach is quite robust to noise
and performs comparably even when we use image search results from March 2013
(the pre-CNN image search era).
Zapping index: Using smile to measure advertisement zapping likelihood
In marketing and advertising research, 'zapping' is defined as a viewer's action of stopping watching a commercial. Researchers analyze users' behavior in order to prevent zapping, which helps advertisers design effective commercials. Since emotions can be used to engage consumers, in this paper we leverage automated facial expression analysis to understand consumers' zapping behavior. Firstly, we provide an accurate moment-to-moment smile detection algorithm. Secondly, we formulate a binary classification problem (zapping/non-zapping) based on real-world scenarios, and adopt the smile response as the feature to predict zapping. Thirdly, to cope with the lack of a metric in advertising evaluation, we propose a new metric called the Zapping Index (ZI). ZI is a moment-to-moment measurement of a user's zapping probability; it gauges not only a user's reaction but also a user's preference for commercials. Finally, extensive experiments are performed to provide insights, and we make recommendations that will be useful to both advertisers and advertisement publishers.
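The idea of a moment-to-moment zapping probability derived from smile intensity can be sketched as a simple thresholded score; the window size, threshold, and aggregation below are illustrative assumptions, not the authors' actual classifier:

```python
# Minimal sketch: predict zapping from a moment-to-moment smile signal.
# The threshold and aggregation are illustrative assumptions only.

def zapping_index(smile_intensities, window=3):
    """Moment-to-moment zapping score: low recent smile -> high ZI."""
    zi = []
    for t in range(len(smile_intensities)):
        recent = smile_intensities[max(0, t - window + 1):t + 1]
        avg = sum(recent) / len(recent)
        zi.append(1.0 - avg)          # less smiling -> more likely to zap
    return zi

def predict_zap(smile_intensities, threshold=0.7):
    """Binary zap/non-zap decision from the peak zapping index."""
    return max(zapping_index(smile_intensities)) >= threshold

engaged = [0.6, 0.8, 0.9, 0.7, 0.8]   # viewer keeps smiling
bored   = [0.5, 0.3, 0.1, 0.0, 0.1]   # smile fades out
print(predict_zap(engaged), predict_zap(bored))   # False True
```

In the paper's setting the binary decision would come from a trained classifier over the smile response rather than a fixed threshold; the sketch only shows the shape of the moment-to-moment measurement.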
A generic news story segmentation system and its evaluation
The paper presents an approach to segmenting broadcast TV news programmes automatically into individual news stories. We first segment the programme into individual shots, and then run a number of analysis tools on the programme to extract features representing each shot. The results of these feature extraction tools are then combined using a support vector machine trained to detect anchorperson shots. A news broadcast can then be segmented into individual stories based on the location of the anchorperson shots within the programme. We use one generic system to segment programmes from two different broadcasters, illustrating the robustness of our feature extraction process to the production styles of different broadcasters.
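The final segmentation step described above, splitting the programme into stories at anchorperson shots, can be sketched as follows; the shot features and the SVM itself are omitted, and only the predicted anchor/non-anchor flags are assumed as input:

```python
# Sketch of the story-segmentation step: once a classifier has flagged
# anchorperson shots, each story runs from one anchor shot up to (but not
# including) the next.

def segment_stories(anchor_flags):
    """Group shot indices into stories, each beginning at an anchor shot."""
    stories, current = [], []
    for shot, is_anchor in enumerate(anchor_flags):
        if is_anchor and current:
            stories.append(current)
            current = []
        current.append(shot)
    if current:
        stories.append(current)
    return stories

# Shots 0 and 4 are anchorperson shots -> two stories.
flags = [True, False, False, False, True, False]
print(segment_stories(flags))   # [[0, 1, 2, 3], [4, 5]]
```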
Spectrum Sensing of DVB-T2 Signals using a Low Computational Noise Power Estimation
Cognitive radio is a promising technology that addresses the spectrum scarcity problem arising from the proliferation of wireless networks and mobile services. In this paper, spectrum sensing of digital video broadcasting-second generation terrestrial (DVB-T2) signals in AWGN, WRAN and COST207 multipath fading environments is considered. Energy detection (ED) is known to achieve increased performance among low-computational-complexity detectors, but it is susceptible to noise uncertainty. Taking into consideration the edge pilot and scattered pilot periodicity in DVB-T2 signals, a low-computational noise power estimator is proposed. Analytical forms for the detector are derived. Simulation results show that with the noise power estimator, ED significantly outperforms the pilot correlation-based detectors. Simulations also show that the proposed scheme enables ED to obtain increased detection performance in multipath fading environments. Moreover, based on this algorithm, a practical sensing scheme for cognitive radio networks is proposed.
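Energy detection as described above compares the average received power against a threshold scaled by the noise power estimate. A minimal generic sketch (the paper's pilot-based DVB-T2 noise estimator is not reproduced here; the threshold factor and signal model are assumptions):

```python
# Minimal energy-detection sketch for spectrum sensing. The test statistic
# is the average received power, compared against a threshold derived from
# an estimated noise power. Threshold factor and signals are illustrative.
import numpy as np

def energy_detect(x, noise_power_est, factor=1.2):
    """Declare the channel occupied if average power exceeds the threshold."""
    statistic = np.mean(np.abs(x) ** 2)
    threshold = factor * noise_power_est
    return statistic > threshold

rng = np.random.default_rng(0)
n = 4000
noise = rng.normal(0, 1, n)              # AWGN with unit noise power
signal = noise + rng.normal(0, 2, n)     # noise plus a strong signal

print(energy_detect(noise, noise_power_est=1.0))   # noise only -> False
print(energy_detect(signal, noise_power_est=1.0))  # signal present -> True
```

The sketch also illustrates why an accurate noise power estimate matters: if `noise_power_est` is biased low, the detector raises false alarms on noise alone, which is exactly the noise-uncertainty weakness the proposed estimator targets.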
Speaker-following Video Subtitles
We propose a new method for improving the presentation of subtitles in video
(e.g. TV and movies). With conventional subtitles, the viewer has to constantly
look away from the main viewing area to read the subtitles at the bottom of the
screen, which disrupts the viewing experience and causes unnecessary eyestrain.
Our method places on-screen subtitles next to the respective speakers to allow
the viewer to follow the visual content while simultaneously reading the
subtitles. We use novel identification algorithms to detect the speakers based
on audio and visual information. Then the placement of the subtitles is
determined using global optimization. A comprehensive usability study indicated
that our subtitle placement method outperformed both conventional
fixed-position subtitling and another previous dynamic subtitling method in
terms of enhancing the overall viewing experience and reducing eyestrain.