Perceptually-Driven Video Coding with the Daala Video Codec

Author: Bankoski
Daede
Daede
Dai
de Oliveira
Duda
Egge
Egge
Fukuma
Fuldseth
Grange
Han
Ponomarenko
Reader
Sezer
Stuiver
Terriberry
Terriberry
Tran
Valin
Valin
Valin
Wang
Watanabe
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 08/10/2016

The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media. Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 201

arXiv.org e-Print Archive

Crossref

An efficient rate control algorithm for a wavelet video codec

Author: He Z.
J. COSMAS
K. K. LOO
M. TUN
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 02/10/2009

Rate control plays an essential role in video coding and transmission, providing the best video quality at the receiver's end under the constraints of given network conditions. In this paper, a rate control algorithm using the Quality Factor (QF) optimization method is proposed for the wavelet-based video codec and implemented on the open source Dirac video encoder. A mathematical model, which we call the Rate-QF (R-QF) model, is derived to generate the optimum QF for the current coding frame according to the target bitrate. The proposed algorithm is a complete one-pass process and does not require complex mathematical calculation. The process of calculating the QF is quite simple, and further calculation is not required for each coded frame. The experimental results show that the proposed algorithm can control the bitrate precisely (within 1% of the target bitrate on average). Moreover, the variation of bitrate over each Group of Pictures (GOP) is lower than that of H.264. This is an advantage in preventing buffer overflow and underflow in real-time multimedia data streaming.

Crossref

Middlesex University Research Repository

A New Fast Motion Estimation and Mode Decision algorithm for H.264 Depth Maps encoding in Free Viewpoint TV

Author: Cabrera Quesada Julian
Cernigliaro Gianluca
García Santos Narciso
Jaureguizar Núñez Fernando
Naccari M.
Pereira F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

In this paper, we consider a scenario where 3D scenes are modeled through a View+Depth representation. This representation is to be used at the rendering side to generate synthetic views for free viewpoint video. The encoding of both types of data (view and depth) is carried out using two H.264/AVC encoders. In this scenario we address the reduction of the encoding complexity of depth data. First, an analysis of the Mode Decision and Motion Estimation processes is conducted for both view and depth sequences, in order to capture the correlation between them. Taking advantage of this correlation, we propose a fast mode decision and motion estimation algorithm for the depth encoding. Results show that the proposed algorithm reduces the computational burden with a negligible loss in terms of quality of the rendered synthetic views. Quality measurements have been conducted using the Video Quality Metric.

Crossref

Archivo Digital UPM

Perception of categories: from coding efficiency to reaction times

Author: Abbott
Abramson
Ashby
Ashby
Ashby
Ashby
Beale
Beck
Bialek
Bogacz
Bonnasse-Gahot
Bornstein
Britten
Brunel
Clarke
Cover
De Baene
Ecker
Freedman
Freedman
Freedman
Gold
Goldstone
Hallé
Haussler
Heekeren
Heeren
Herschkowitz
Huk
Jean-Pierre Nadal
Kim
Kriegeskorte
Kruschke
Kuhl
Kuhl
Kuhl
Laurent Bonnasse-Gahot
Liberman
Link
Link
McMurray
Meyers
Nosofsky
Ohl
Op de Beeck
Pisoni
Polka
Prather
Ratcliff
Ratcliff
Renart
Repp
Rissanen
Salinas
Seung
Shadlen
Shadlen
Sigala
Smith
Studdert-Kennedy
Usher
Vickers
Wald
Werker
Xu
Ylinen
Yoon
Zohary
Publication venue: 'Elsevier BV'
Publication date: 23/02/2011

Reaction times in perceptual tasks are the subject of many experimental and theoretical studies. With the neural decision-making process as the main focus, most of these works concern discrete (typically binary) choice tasks, implying the identification of the stimulus as an exemplar of a category. Here we address issues specific to the perception of categories (e.g. vowels, familiar faces, ...), making a clear distinction between identifying a category (an element of a discrete set) and estimating a continuous parameter (such as a direction). We exhibit a link between optimal Bayesian decoding and coding efficiency, the latter being measured by the mutual information between the discrete category set and the neural activity. We characterize the properties of the best estimator of the likelihood of the category, when this estimator takes its inputs from a large population of stimulus-specific coding cells. Adopting the diffusion-to-bound approach to model the decisional process, this allows us to relate analytically the bias and variance of the diffusion process underlying decision making to macroscopic quantities that are behaviorally measurable. A major consequence is the existence of a quantitative link between reaction times and discrimination accuracy. The resulting analytical expression of mean reaction times during an identification task accounts for empirical facts, both qualitatively (e.g. more time is needed to identify a category from a stimulus at the boundary than from a stimulus lying within a category) and quantitatively (working on published experimental data on phoneme identification tasks).

arXiv.org e-Print Archive

Crossref

Temporal Dynamics of Decision-Making during Motion Perception in the Visual Cortex

Author: Grossberg Stephen
Pilly Praveen K.
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/02/2008

How does the brain make decisions? Speed and accuracy of perceptual decisions covary with certainty in the input, and correlate with the rate of evidence accumulation in parietal and frontal cortical "decision neurons." A biophysically realistic model of interactions within and between Retina/LGN and cortical areas V1, MT, MST, and LIP, gated by basal ganglia, simulates dynamic properties of decision-making in response to ambiguous visual motion stimuli used by Newsome, Shadlen, and colleagues in their neurophysiological experiments. The model clarifies how brain circuits that solve the aperture problem interact with a recurrent competitive network with self-normalizing choice properties to carry out probabilistic decisions in real time. Some scientists claim that perception and decision-making can be described using Bayesian inference or related general statistical ideas that estimate the optimal interpretation of the stimulus given priors and likelihoods. However, such concepts do not propose the neocortical mechanisms that enable perception and make decisions. The present model explains behavioral and neurophysiological decision-making data without an appeal to Bayesian concepts and, unlike other existing models of these data, generates perceptual representations and choice dynamics in response to the experimental visual stimuli. Quantitative model simulations include the time course of LIP neuronal dynamics, as well as behavioral accuracy and reaction time properties, during both correct and error trials at different levels of input ambiguity in both fixed duration and reaction time tasks. Model MT/MST interactions compute the global direction of random dot motion stimuli, while model LIP computes the stochastic perceptual decision that leads to a saccadic eye movement. National Science Foundation (SBE-0354378, IIS-02-05271); Office of Naval Research (N00014-01-1-0624); National Institutes of Health (R01-DC-02852

Elsevier - Publisher Connector

Boston University Institutional Repository (OpenBU)