
    Adaptation of Images and Videos for Different Screen Sizes

    With the increasing popularity of smartphones and similar mobile devices, the demand for media to consume on the go rises. As most images and videos today are captured in HD or even higher resolutions, they need to be adapted in a content-aware fashion before they can be watched comfortably on small screens with varying aspect ratios. This process is called retargeting. Most distortions during this process are caused by a change of the aspect ratio, so retargeting mainly focuses on adapting the aspect ratio of a video while the rest can be scaled uniformly. The main objective of this dissertation is to contribute to modern image and video retargeting, especially regarding the potential of the seam carving operator. There are still unsolved problems in this research field that should be addressed in order to improve the quality of the results or to speed up the retargeting process. This dissertation presents novel algorithms that retarget images, videos, and stereoscopic videos while dealing with problems such as the preservation of straight lines or the reduction of the required memory space and computation time. Additionally, a GPU implementation is used to retarget videos in real time. Furthermore, an enhancement of face detection is presented that distinguishes between faces that are important for the retargeting and faces that are not. Results show that the developed techniques are suitable for the desired scenarios.
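    The seam carving operator the dissertation builds on can be sketched in a few lines. The following is a generic textbook version in pure Python, not the dissertation's implementation: it computes a gradient-magnitude energy map, finds the lowest-energy 8-connected vertical seam by dynamic programming, and removes it to shrink the image width by one pixel.

```python
# Minimal seam-carving sketch (illustrative only): images are lists of
# rows of grayscale values.

def energy(img):
    """Gradient-magnitude energy of a 2D grayscale image."""
    h, w = len(img), len(img[0])
    e = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            dy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            e[y][x] = abs(dx) + abs(dy)
    return e

def find_vertical_seam(img):
    """Dynamic programming: lowest-energy 8-connected vertical seam."""
    e = energy(img)
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            # Cheapest predecessor among the three cells above.
            cost[y][x] += min(cost[y - 1][max(x - 1, 0):min(x + 2, w)])
    # Backtrack from the cheapest bottom-row cell.
    seam = [min(range(w), key=lambda x: cost[h - 1][x])]
    for y in range(h - 2, -1, -1):
        x = seam[-1]
        lo, hi = max(x - 1, 0), min(x + 2, w)
        seam.append(min(range(lo, hi), key=lambda i: cost[y][i]))
    return seam[::-1]  # top-to-bottom column indices

def remove_seam(img, seam):
    """Drop one pixel per row at the seam's column index."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]
```

Repeating find-and-remove adapts the aspect ratio one seam at a time; the dissertation's contributions (straight-line preservation, memory reduction, GPU real-time operation) extend this basic operator.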

    Objective quality prediction of image retargeting algorithms

    Quality assessment of image retargeting results is useful when comparing different methods. However, performing the necessary user studies is a long, cumbersome process. In this paper, we propose a simple yet efficient objective quality assessment method based on five key factors: i) preservation of salient regions; ii) analysis of the influence of artifacts; iii) preservation of the global structure of the image; iv) compliance with well-established aesthetics rules; and v) preservation of symmetry. Experiments on the RetargetMe benchmark, as well as a comprehensive additional user study, demonstrate that our proposed objective quality assessment method outperforms other existing metrics while correlating better with human judgements. This makes our metric a good predictor of subjective preference.
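    A metric built from several factors is typically collapsed into one score. The sketch below is a hypothetical weighted-mean aggregation over the five factors named in the abstract; the factor names follow the abstract, but the weighting scheme and score values are invented for illustration and are not the paper's actual formula.

```python
# Hypothetical aggregation of per-factor quality scores (each in [0, 1],
# higher is better) into a single objective score via a weighted mean.

FACTORS = ("saliency", "artifacts", "structure", "aesthetics", "symmetry")

def quality_score(scores, weights=None):
    """scores: dict factor -> value in [0, 1]; returns the weighted mean."""
    if weights is None:
        weights = {f: 1.0 for f in FACTORS}  # uniform by default
    total = sum(weights[f] for f in FACTORS)
    return sum(weights[f] * scores[f] for f in FACTORS) / total
```

In practice such weights would be fitted to user-study preference data rather than chosen by hand.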

    Backtracking Spatial Pyramid Pooling (SPP)-based Image Classifier for Weakly Supervised Top-down Salient Object Detection

    Top-down saliency models produce a probability map that peaks at target locations specified by a task/goal such as object detection. They are usually trained in a fully supervised setting involving pixel-level annotations of objects. We propose a weakly supervised top-down saliency framework using only binary labels that indicate the presence/absence of an object in an image. First, the probabilistic contribution of each image region to the confidence of a CNN-based image classifier is computed through a backtracking strategy to produce top-down saliency. From a set of saliency maps of an image produced by fast bottom-up saliency approaches, we select the best saliency map suitable for the top-down task. The selected bottom-up saliency map is combined with the top-down saliency map. Features having high combined saliency are used to train a linear SVM classifier to estimate feature saliency. This is integrated with the combined saliency and further refined through a multi-scale superpixel-averaging of the saliency map. We evaluate the proposed weakly supervised top-down saliency framework and achieve performance comparable to fully supervised approaches. Experiments are carried out on seven challenging datasets, and quantitative results are compared with 40 closely related approaches across 4 different applications. Comment: 14 pages, 7 figures
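    Two steps of the pipeline above, combining the top-down and bottom-up maps and the superpixel averaging, can be illustrated with a toy sketch. This is not the paper's code: the averaging fusion and the precomputed superpixel labels are simplifying assumptions made for the example.

```python
# Illustrative sketch: fuse a top-down and a bottom-up saliency map by
# pixel-wise averaging, then smooth by averaging within superpixels.
# Maps are lists of rows of floats; labels assign a superpixel id per pixel.

def combine_saliency(top_down, bottom_up):
    """Pixel-wise average of two saliency maps of the same shape."""
    return [[(t + b) / 2.0 for t, b in zip(tr, br)]
            for tr, br in zip(top_down, bottom_up)]

def superpixel_average(sal, labels):
    """Replace each pixel by the mean saliency of its superpixel region."""
    sums, counts = {}, {}
    for srow, lrow in zip(sal, labels):
        for s, l in zip(srow, lrow):
            sums[l] = sums.get(l, 0.0) + s
            counts[l] = counts.get(l, 0) + 1
    return [[sums[l] / counts[l] for l in lrow] for lrow in labels]
```

The multi-scale refinement in the paper would repeat the averaging over superpixel segmentations of several granularities and merge the results.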

    3D Shape Modeling Using High Level Descriptors


    Characterness: An indicator of text in the wild

    Text in an image provides vital information for interpreting its contents, and text in a scene can aid a variety of tasks, from navigation to obstacle avoidance and odometry. Despite its value, however, detecting general text in images remains a challenging research problem. Motivated by the need to consider the widely varying forms of natural text, we propose a bottom-up approach to the problem, which reflects the characterness of an image region. In this sense, our approach mirrors the move from saliency detection methods to measures of objectness. In order to measure the characterness, we develop three novel cues that are tailored for character detection and a Bayesian method for their integration. Because text is made up of sets of characters, we then design a Markov random field model so as to exploit the inherent dependencies between characters. We experimentally demonstrate the effectiveness of our characterness cues as well as the advantage of Bayesian multi-cue integration. The proposed text detector outperforms state-of-the-art methods on a few benchmark scene text detection data sets. We also show that our measurement of characterness is superior to state-of-the-art saliency detection models when applied to the same task. © 2013 IEEE
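    Bayesian integration of independent cues, as described above, has a standard naive-Bayes form. The sketch below is a generic illustration of that form, assuming conditional independence of the cues given the text/non-text class; the specific cues, likelihood values, and prior are invented for the example and are not taken from the paper.

```python
# Naive-Bayes-style fusion of per-region "characterness" cue likelihoods
# into a single posterior probability that the region contains text.

def characterness_posterior(cue_likelihoods, prior=0.5):
    """cue_likelihoods: iterable of (P(cue | text), P(cue | non-text))
    pairs, assumed conditionally independent given the class."""
    p_text, p_bg = prior, 1.0 - prior
    for lik_text, lik_bg in cue_likelihoods:
        p_text *= lik_text
        p_bg *= lik_bg
    return p_text / (p_text + p_bg)  # normalize over the two classes
```

Each factor multiplies evidence into the two class hypotheses, and the final division renormalizes, which is why cues that individually only weakly favor text can jointly yield a confident posterior.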

    NARRATE: A Normal Assisted Free-View Portrait Stylizer

    In this work, we propose NARRATE, a novel pipeline that enables simultaneous editing of portrait lighting and perspective in a photorealistic manner. As a hybrid neural-physical face model, NARRATE leverages the complementary benefits of geometry-aware generative approaches and normal-assisted physical face models. In a nutshell, NARRATE first inverts the input portrait to a coarse geometry and employs neural rendering to generate images resembling the input, as well as producing convincing pose changes. However, the inversion step introduces mismatches, yielding low-quality images with fewer facial details. As such, we further estimate the portrait normal to enhance the coarse geometry, creating a high-fidelity physical face model. In particular, we fuse the neural and physical renderings to compensate for the imperfect inversion, resulting in realistic and view-consistent novel perspective images. In the relighting stage, previous works focus on single-view portrait relighting while ignoring consistency between different perspectives, leading to unstable and inconsistent lighting effects under view changes. We extend Total Relighting to fix this problem by unifying its multi-view input normal maps with the physical face model. NARRATE conducts relighting with consistent normal maps, imposing cross-view constraints and exhibiting stable and coherent illumination effects. We experimentally demonstrate that NARRATE achieves more photorealistic, reliable results than prior works. We further bridge NARRATE with animation and style transfer tools, supporting pose change, light change, facial animation, and style transfer, either separately or in combination, all at a photographic quality. We showcase vivid free-view facial animations as well as 3D-aware relightable stylization, which help facilitate various AR/VR applications like virtual cinematography, 3D video conferencing, and post-production. Comment: 14 pages, 13 figures. https://youtu.be/mP4FV3evmy
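    The idea of compensating an imperfect neural rendering with a physical rendering can be pictured as a per-pixel blend. The sketch below is only a conceptual illustration under a strong simplifying assumption, namely that the fusion is a fixed confidence-weighted mask; NARRATE's actual fusion is learned and operates on full renderings, not scalar toy images.

```python
# Hypothetical per-pixel fusion of a neural and a physical rendering.
# mask[y][x] in [0, 1] is the weight given to the physical rendering;
# images here are lists of rows of grayscale floats for simplicity.

def fuse_renderings(neural, physical, mask):
    """Convex per-pixel blend of two renderings of the same shape."""
    return [[(1.0 - m) * n + m * p
             for n, p, m in zip(nrow, prow, mrow)]
            for nrow, prow, mrow in zip(neural, physical, mask)]
```

A mask near 1 where the inversion lost facial detail would let the physical rendering dominate exactly where the neural branch is unreliable, which is the compensation the abstract describes.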

    A Visual Computing Unified Application Using Deep Learning and Computer Vision Techniques

    Vision Studio aims to utilize a diverse range of modern deep learning and computer vision principles and techniques to provide a broad array of functionalities in image and video processing. Deep learning is a distinct class of machine learning algorithms that utilize multiple layers to gradually extract more advanced features from raw input. This is well suited to matrix-structured inputs such as the pixels of a photo or the frames of a video. Computer vision is a field of artificial intelligence that teaches computers to interpret and comprehend the visual domain. The main functions implemented include deepfake creation, digital aging (de-aging), image animation, and deepfake detection. Deepfake creation allows users to utilize deep learning methods, particularly autoencoders, to overlay source images onto a target video. This creates a video of the source person imitating or saying things that the target person does. Digital aging utilizes generative adversarial networks (GANs) to digitally simulate the aging process of an individual. Image animation utilizes first-order motion models to create highly realistic animations from a source image and driving video. Deepfake detection is achieved by using advanced and highly efficient convolutional neural networks (CNNs), primarily employing the EfficientNet family of models.
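    The autoencoder-based face-swapping scheme mentioned above typically uses one encoder shared between both identities and a separate decoder per identity, so that swapping means encoding a face of person A and decoding it with person B's decoder. The structural sketch below is not the project's code; the class, its method names, and the toy encoder/decoders in the usage are invented to illustrate the wiring only, with the actual networks abstracted away as callables.

```python
# Structural sketch of the shared-encoder / per-identity-decoder scheme
# commonly used for autoencoder-based face swapping.

class FaceSwapper:
    def __init__(self, encoder, decoders):
        self.encoder = encoder    # one encoder shared across identities
        self.decoders = decoders  # dict: identity -> decoder

    def reconstruct(self, face, identity):
        """Training-time path: encode and decode with the same identity."""
        return self.decoders[identity](self.encoder(face))

    def swap(self, face, target_identity):
        """Swap: encode the source face, decode with the target's decoder."""
        return self.decoders[target_identity](self.encoder(face))
```

Because the encoder is shared, it learns an identity-agnostic face representation, and each decoder learns to render that representation as its own identity, which is what makes the cross-decoding swap work.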