
77 research outputs found

An Open source Implementation of ITU-T Recommendation P.808 with Validation

Author: Cutler Ross
Naderi Babak
Publication venue: 'International Speech Communication Association'
Publication date: 16/05/2020

The ITU-T Recommendation P.808 provides a crowdsourcing approach for conducting a subjective assessment of speech quality using the Absolute Category Rating (ACR) method. We provide an open-source implementation of the ITU-T Rec. P.808 that runs on the Amazon Mechanical Turk platform. We extended our implementation to include Degradation Category Ratings (DCR) and Comparison Category Ratings (CCR) test methods. We also significantly speed up the test process by integrating the participant qualification step into the main rating task, compared to a two-stage qualification and rating solution. We provide program scripts for creating and executing the subjective test, and for cleansing and analyzing the answers, to avoid operational errors. To validate the implementation, we compare the Mean Opinion Scores (MOS) collected through our implementation with MOS values from a standard laboratory experiment conducted based on the ITU-T Rec. P.800. We also evaluate the reproducibility of the result of the subjective speech quality assessment through crowdsourcing using our implementation. Finally, we quantify the impact of parts of the system designed to improve the reliability: environmental tests, gold and trapping questions, rating patterns, and a headset usage test.
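As a rough illustration of the quantity this abstract compares, the sketch below computes a Mean Opinion Score and a 95% confidence half-width from a list of ACR ratings on the 1 to 5 scale. It is not taken from the authors' scripts: the function name `mos_with_ci` is hypothetical, and the interval uses a normal approximation (z = 1.96) rather than a t-distribution, which is an assumption made to keep the example dependency-free.

```python
import math
from statistics import mean, stdev


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, CI half-width) for one test condition.

    Assumption: normal approximation with z = 1.96 for a 95% interval.
    A t-distribution would be more appropriate for small vote counts.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate a CI")
    mos = mean(ratings)
    # Standard error of the mean, using the sample standard deviation.
    half_width = z * stdev(ratings) / math.sqrt(n)
    return mos, half_width


# Hypothetical ACR votes (1 = bad ... 5 = excellent) for one condition.
votes = [4, 5, 3, 4, 4]
mos, half = mos_with_ci(votes)
print(f"MOS = {mos:.2f} +/- {half:.2f}")
```

Per-condition MOS values computed this way in a crowdsourcing run can then be set against laboratory MOS values for the same conditions.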

arXiv.org e-Print Archive

Crossref

Transformation of Mean Opinion Scores to Avoid Misleading of Ranked based Statistical Techniques

Author: Möller Sebastian
Naderi Babak
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/04/2020

The rank correlation coefficients and the rank-based statistical tests (as a subset of non-parametric techniques) might be misleading when they are applied to subjectively collected opinion scores. Those techniques assume that the data is measured at least at an ordinal level and define a sequence of scores to represent a tied rank when they have precisely an equal numeric value. In this paper, we show that the definition of tied rank, as mentioned above, is not suitable for Mean Opinion Scores (MOS) and might lead to misleading conclusions from rank-based statistical techniques. Furthermore, we introduce a method to overcome this issue by transforming the MOS values considering their 95% Confidence Intervals. The rank correlation coefficients and rank-based statistical tests can then be safely applied to the transformed values. We also provide open-source software packages in different programming languages to facilitate the application of our transformation method in the quality of experience domain. Comment: This paper has been accepted for publication in the 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX).
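To make the tied-rank issue concrete, here is a minimal sketch of one way ranks could take confidence intervals into account. It is an illustrative assumption, not the transformation defined in the paper: conditions are sorted by MOS, and a condition joins the current tie group when its 95% interval overlaps the interval of the group's lowest member. The function name `ci_aware_ranks` and the grouping rule are hypothetical.

```python
def ci_aware_ranks(mos, ci_half):
    """Assign average ranks, treating conditions with overlapping CIs as tied.

    Illustrative assumption only (not the published method): a condition
    joins the current tie group when the lower end of its interval does not
    exceed the upper end of the interval of the group's lowest member.
    """
    order = sorted(range(len(mos)), key=lambda i: mos[i])
    ranks = [0.0] * len(mos)
    start = 0
    while start < len(order):
        anchor = order[start]
        upper = mos[anchor] + ci_half[anchor]
        end = start + 1
        while end < len(order) and mos[order[end]] - ci_half[order[end]] <= upper:
            end += 1
        # Mean of the 1-based ranks start+1 .. end.
        avg_rank = (start + 1 + end) / 2
        for k in range(start, end):
            ranks[order[k]] = avg_rank
        start = end
    return ranks


# Hypothetical MOS values and CI half-widths for four conditions.
mos = [3.0, 3.1, 4.2, 1.5]
ci = [0.2, 0.2, 0.2, 0.2]
print(ci_aware_ranks(mos, ci))  # [2.5, 2.5, 4.0, 1.0]
```

With exact-equality ties, 3.0 and 3.1 would receive distinct ranks even though their intervals overlap; in this sketch they share rank 2.5.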

arXiv.org e-Print Archive

Crossref

Application of Just-Noticeable Difference in Quality as Environment Suitability Test for Crowdsourcing Speech Quality Assessment Task

Author: Möller Sebastian
Naderi Babak
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/04/2020

Crowdsourcing micro-task platforms facilitate subjective media quality assessment by providing access to a highly scalable, geographically distributed and demographically diverse pool of crowd workers. Those workers participate in the experiment remotely from their own working environment, using their own hardware. In the case of speech quality assessment, preliminary work showed that environmental noise at the listener's side and the listening device (loudspeaker or headphone) significantly affect perceived quality, and consequently the reliability and validity of subjective ratings. As a consequence, ITU-T Rec. P.808 specifies requirements for the listening environment of crowd workers when assessing speech quality. In this paper, we propose a new Just Noticeable Difference of Quality (JNDQ) test as a remote screening method for assessing the suitability of the work environment for participating in speech quality assessment tasks. In a laboratory experiment, participants performed this JNDQ test with different listening devices in different listening environments, including a silent room according to ITU-T Rec. P.800 and a simulated background noise scenario. Results show a significant impact of the environment and the listening device on the JNDQ threshold. Thus, the combination of listening device and background noise needs to be screened in a crowdsourcing speech quality test. We propose a minimum threshold of our JNDQ test as an easily applicable screening method for this purpose. Comment: This paper has been accepted for publication in the 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX).

arXiv.org e-Print Archive

Crossref

Multi-dimensional Speech Quality Assessment in Crowdsourcing

Author: Cutler Ross
Naderi Babak
Ristea Nicolae-Catalin
Publication date: 13/09/2023

Subjective speech quality assessment is the gold standard for evaluating speech enhancement processing and telecommunication systems. The commonly used standard ITU-T Rec. P.800 defines how to measure speech quality in lab environments, and ITU-T Rec. P.808 extended it for crowdsourcing. ITU-T Rec. P.835 extends P.800 to measure the quality of speech in the presence of noise. ITU-T Rec. P.804 targets the conversation test and introduces perceptual speech quality dimensions which are measured during the listening phase of the conversation. The perceptual dimensions are noisiness, coloration, discontinuity, and loudness. We create a crowdsourcing implementation of a multi-dimensional subjective test following the scales from P.804 and extend it to include reverberation, the speech signal, and overall quality. We show the tool is both accurate and reproducible. The tool has been used in the ICASSP 2023 Speech Signal Improvement challenge and we show the utility of these speech quality dimensions in this challenge. The tool will be publicly available as open-source at https://github.com/microsoft/P.808

arXiv.org e-Print Archive

VCD: A Video Conferencing Dataset for Video Compression

Author: Cutler Ross
Hosseinkashi Yasaman
Khongbantabam Nabakumar Singh
Naderi Babak
Publication date: 13/09/2023

Commonly used datasets for evaluating video codecs are all very high quality and not representative of video typically used in video conferencing scenarios. We present the Video Conferencing Dataset (VCD) for evaluating video codecs for real-time communication, the first such dataset focused on video conferencing. VCD includes a wide variety of camera qualities and spatial and temporal information. It includes both desktop and mobile scenarios and two types of video background processing. We report the compression efficiency of H.264, H.265, H.266, and AV1 in low-delay settings on VCD and compare it with the non-video conferencing datasets UVC, MLC-JVC, and HEVC. The results show the source quality and the scenarios have a significant effect on the compression efficiency of all the codecs. VCD enables the evaluation and tuning of codecs for this important scenario. The VCD is publicly available as an open-source dataset at https://github.com/microsoft/VCD

arXiv.org e-Print Archive

Design optimization of switched reluctance motor for noise reduction

Author: Babak Ganji
Omid Naderi Samani
Publication venue: Faculty of Engineering/Faculty of Civil Engineering, University of Rijeka
Publication date: 01/01/2016

Using the finite element method (FEM) with the ANSYS finite element (FE) package, an electromagnetic-structural simulation model is introduced for the switched reluctance motor (SRM). Since the main cause of noise and vibration in the SRM is the radial force applied to the stator poles, a 2D FE transient analysis is carried out in the electromagnetic model to predict the instantaneous radial force. Based on 3D FEM, a modal analysis is carried out in the developed structural model to determine mode shapes and natural frequencies. Using the developed simulation model and an evolutionary algorithm, a method is proposed for design optimization of the SRM to decrease noise. To evaluate the proposed method, simulation results are presented for an 8/6 switched reluctance motor.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Cryptanalysis of CRUSH hash structure

Author: Babak Sadeghiyan
Majid Naderi
Nasour Bagheri
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 29/01/2008

In this paper, we present a cryptanalysis of the CRUSH hash structure. Surprisingly, our attack can find a pre-image for any desired length of internal message. The time complexity of this attack is completely negligible. We show that the time complexity of finding a pre-image of any length is O(1). In this attack, an adversary can freely find a pre-image with the length of their own choice for any given message digest. With our attack, we can also find second pre-images, collisions, and multi-collisions at the same complexity. In this paper, we also introduce a stronger variant of the algorithm, and show that an adversary would still be able to produce collisions for this stronger variant of the CRUSH hash structure with a time complexity less than that of a birthday attack.

Cryptology ePrint Archive

Full Reference Video Quality Assessment for Machine Learning-Based Video Codecs

Author: Cho Juhee
Cutler Ross
Hosseinkashi Yasaman
Majeedi Abrar
Martinez Ruben Alvarez
Naderi Babak
Publication date: 01/09/2023

Machine learning-based video codecs have made significant progress in the past few years. A critical area in the development of ML-based video codecs is an accurate evaluation metric that does not require an expensive and slow subjective test. We show that existing evaluation metrics that were designed and trained on DSP-based video codecs are not highly correlated with subjective opinion when used with ML video codecs, because the video artifacts differ considerably between ML-based and DSP-based video codecs. We provide a new dataset of ML video codec videos that have been accurately labeled for quality. We also propose a new full reference video quality assessment (FRVQA) model that achieves a Pearson Correlation Coefficient (PCC) of 0.99 and a Spearman's Rank Correlation Coefficient (SRCC) of 0.99 at the model level. We make the dataset and FRVQA model open source to help accelerate research in ML video codecs, and so that others can further improve the FRVQA model.
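For readers unfamiliar with the two figures quoted in this abstract, the sketch below shows how PCC and SRCC are commonly computed between predicted and subjective scores. It is a generic, dependency-free illustration, not the authors' evaluation code; the function names and the sample numbers are hypothetical. SRCC is computed as the Pearson correlation of average ranks, with exact-equality ties sharing their mean rank.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; exactly equal values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical model-level scores: metric predictions vs. subjective MOS.
predicted = [1.0, 2.0, 3.0, 4.0]
subjective = [1.0, 4.0, 9.0, 16.0]
print(round(pearson(predicted, subjective), 4), spearman(predicted, subjective))
```

In this toy data the relationship is monotonic but not linear, so SRCC is 1 while PCC is slightly below 1.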

arXiv.org e-Print Archive