
15 research outputs found

Gradient Ascent Subjective Multimedia Quality Testing

Author: Andrew Catellier
Stephen Voran
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2011

Polar coordinate quantizers that minimize mean-squared error

Author: Scharf Louis L.
Voran Stephen D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1994

Includes bibliographical references. A quantizer for complex data is defined by a partition of the complex plane and a representation point associated with each cell of the partition. A polar coordinate quantizer independently quantizes the magnitude and phase angle of complex data. We derive design equations for minimum mean-squared error polar coordinate quantizers and report some interesting theoretical results on their performance, including performance limits for "phase-only" representations. The results provide a concrete example of a biased estimator whose mean-squared error is smaller than that of any unbiased estimator. Quantizer design examples show the relative importance of magnitude and phase encoding. This work was supported by the Office of Naval Research under contract No. N00014-89-J-1070 and by the NSF Center for Optoelectronic Computing Systems at the University of Colorado, under contract No. 8622236.
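As a rough illustration of what "independently quantizes the magnitude and phase angle" means, the sketch below snaps a complex sample to the nearest magnitude level and to the nearest of a set of uniformly spaced phase angles. The function name `polar_quantize`, the magnitude levels, and the uniform phase spacing are all assumptions made for this example; it does not implement the minimum mean-squared error design equations derived in the paper.

```python
import cmath
import math


def polar_quantize(z, mag_levels, n_phase):
    """Quantize the magnitude and phase of a complex number independently.

    mag_levels: hypothetical list of magnitude representation points.
    n_phase:    number of uniformly spaced phase angles (an assumption;
                the paper's MMSE design is not reproduced here).
    """
    r, theta = cmath.polar(z)

    # Magnitude: pick the nearest representation point.
    r_q = min(mag_levels, key=lambda m: abs(m - r))

    # Phase: round to the nearest multiple of the uniform phase step.
    step = 2.0 * math.pi / n_phase
    theta_q = round(theta / step) * step

    return cmath.rect(r_q, theta_q)


# Example: three magnitude levels and four phase angles.
q1 = polar_quantize(0.9 + 0.1j, [0.5, 1.0, 2.0], 4)
q2 = polar_quantize(2j, [0.5, 1.0, 2.0], 4)
```

Because the two coordinates are handled separately, the number of representation points is the product of the number of magnitude levels and the number of phase angles (12 in this example).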

Mountain Scholar (Digital Collections of Colorado and Wyoming)

This Page Intentionally Left Blank This Page Intentionally Left Blank CONTENTS

Author: Stephen D. Voran
William M. Daley

2. DELAY ESTIMATION

CiteSeerX

Wideband Audio Waveform Evaluation Networks: Efficient, Accurate Estimation of Speech Qualities

Author: Andrew A. Catellier
Stephen D. Voran
Publication venue: IEEE
Publication date: 01/01/2023

Speech quality and speech intelligibility can vary dramatically across the wide range of currently available telecommunications systems, devices, and operating environments. This creates a strong demand for efficient real-time measurements of quality and intelligibility. Wideband Audio Waveform Evaluation Networks (WAWEnets) are convolutional neural networks (CNNs) that operate directly on wideband audio waveforms in order to produce evaluations of those waveforms. In the present work these evaluations give qualities of telecommunications speech (e.g., noisiness, intelligibility, overall speech quality). WAWEnets are no-reference networks because they do not require “reference” (original or undistorted) versions of the waveforms they evaluate. Our initial 2020 WAWEnet publication introduces four WAWEnets and each emulates the output of an established full-reference speech quality or intelligibility estimation algorithm. We have updated the WAWEnet architecture to be more efficient and effective. Here we present a single WAWEnet that closely tracks seven different quality and intelligibility values with per-segment correlations in the range of 0.92 to 0.96. We create a second network that additionally tracks four subjective speech quality dimensions. We offer a third network that focuses on just subjective quality scores and achieves a per-segment correlation of 0.97. The performance of our WAWEnet architecture compares favorably to models with orders-of-magnitude more parameters and computational complexity. This work has leveraged 334 hours of speech in 13 languages, more than two million full-reference target values, and more than 93,000 subjective mean opinion scores. We also interpret the operation of WAWEnets and identify the key to their operation using the language of signal processing: ReLUs strategically move spectral information from non-DC components into the DC component. The DC values of 96 output signals define a vector in a 96-D latent space, and this vector is then mapped to a quality or intelligibility value for the input waveform.
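The remark that ReLUs move spectral information from non-DC components into the DC component can be seen on a toy signal. The sketch below is only an illustration under assumed parameters (a pure tone, 1000 samples, 5 cycles); it is not taken from the WAWEnet code. A zero-mean sinusoid has essentially no DC component, but after a ReLU its mean is close to 1/π, the average of a half-wave rectified sine.

```python
import math


def relu(x):
    """Rectified linear unit applied to a single sample."""
    return x if x > 0.0 else 0.0


def dc_component(signal):
    """DC component of a signal, i.e. its mean value."""
    return sum(signal) / len(signal)


# Assumed toy input: a pure tone spanning a whole number of cycles,
# so its mean (DC component) is zero up to rounding error.
n_samples = 1000
n_cycles = 5
tone = [math.sin(2.0 * math.pi * n_cycles * k / n_samples)
        for k in range(n_samples)]

rectified = [relu(v) for v in tone]

dc_before = dc_component(tone)       # essentially zero
dc_after = dc_component(rectified)   # close to 1 / pi

print(f"DC before ReLU: {dc_before:.6f}")
print(f"DC after ReLU:  {dc_after:.6f}")
```

The tone's amplitude, which originally lived only at a non-DC frequency, now shows up in the mean of the rectified signal; scaling the tone scales that mean by the same factor.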

arXiv.org e-Print Archive

Directory of Open Access Journals