11 research outputs found

Low Delay Sparse and Mixed Excitation CELP Coders for Wideband Speech Coding

Author: Dymarski Przemyslaw Grzegorz
Publication venue: Electronics and Telecommunications Committee
Publication date: 01/01/2020

Code Excited Linear Prediction (CELP) algorithms are proposed for compression of speech in the 8 kHz band at a switched or variable bit rate, with algorithmic delay not exceeding 2 msec. Two structures of low-delay CELP coders are analyzed: low-delay sparse excitation CELP and mixed excitation CELP. Sparse excitation is based on MP-MLQ and multilayer models. The mixed excitation CELP algorithm stems from the narrowband G.728 standard. As opposed to the G.728 LD-CELP coder, the mixed excitation codebook consists of pseudorandom vectors and sequences obtained with Long-Term Prediction (LTP). Variable rate coding consists in maximizing the vector dimension while keeping the required speech quality. Good speech quality (MOS = 3.9 according to the PESQ algorithm) is obtained at an average bit rate of 33.5 kbit/sec.

Biblioteka Nauki - repozytorium artykułów

International Journal of Electronics and Telecommunications (Warsaw University of Technology)
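The delay and bit-rate figures in the abstract above can be related with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes one codebook index is sent per excitation vector, that algorithmic delay is dominated by buffering one vector, and that the 8 kHz band is sampled at 16 kHz. The function names and the quality table in the usage note are hypothetical.

```python
def bit_rate(sample_rate_hz, vector_dim, bits_per_vector):
    """Bits per second when one index of bits_per_vector bits is sent per vector."""
    return sample_rate_hz * bits_per_vector / vector_dim


def max_vector_dim(sample_rate_hz, max_delay_ms):
    """Largest vector length in samples that fits in the delay budget.

    Assumption: the delay is just the time needed to buffer one vector.
    """
    return (sample_rate_hz * max_delay_ms) // 1000


def largest_acceptable_dim(candidate_dims, quality_of, required_quality):
    """Pick the largest vector dimension whose quality meets the target.

    quality_of is a hypothetical callable mapping a dimension to a quality
    score (for example a MOS estimate). Falls back to the smallest dimension
    when no candidate reaches the target.
    """
    acceptable = [d for d in candidate_dims if quality_of(d) >= required_quality]
    return max(acceptable) if acceptable else min(candidate_dims)
```

As a sanity check, the narrowband G.728 configuration (8000 Hz sampling, 5-sample vectors, 10-bit index) gives `bit_rate(8000, 5, 10)` equal to 16000 bit/s. Under the 16 kHz assumption, a 2 ms budget allows vectors of at most `max_vector_dim(16000, 2)`, which is 32 samples.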

Scalable Audio/Speech Coders Based on Adaptive Time-Frequency Analysis of Audio Signals

Author: Петровский Александр Александрович
Петровский Алексей Александрович
Publication venue: СПб ФИЦ РАН
Publication date: 02/02/2017

The paper discusses methods of perceptual sub-band audio signal processing with a dynamic transformation of the time-frequency map, based on the discrete wavelet packet (WP) transform. Their advantage is that the WP tree grows from the top down, without returning to smaller scale levels of decomposition and without needing to build a complete WP tree, which fits the concept of implementing scalable audio/speech coders in real time. An objective quality assessment of the proposed coders based on the PEMO-Q technique is given, together with a comparison against the widespread encoders Opus and Vorbis. The results show that the reconstructed signal complies with ITU-R PEAQ at a high compression ratio of 18 times or more and contains no artifacts: the total noise-to-mask ratio (NMR_total) is less than -9 dB.

Информатика и автоматизация
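The top-down growth of a wavelet packet tree described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: it uses Haar filters and a simple l1-norm cost as stand-ins for the paper's filters and perceptual criterion, and the names `haar_split`, `l1_cost`, and `grow_wp_tree` are hypothetical. Each node is visited once and either split or kept as a leaf, so the full tree is never built and no level is revisited.

```python
import numpy as np


def haar_split(x):
    """One orthonormal Haar analysis step: return (low-pass, high-pass) halves."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)


def l1_cost(c):
    """Placeholder cost (assumption): sum of absolute coefficient values."""
    return float(np.abs(c).sum())


def grow_wp_tree(x, max_depth):
    """Grow a wavelet packet tree top-down and return its leaves.

    Each leaf is (path, coefficients), where path is a tuple of 'l'/'h'
    choices from the root. A node is split only if its two children have
    a lower total cost than the node itself.
    """
    leaves = []
    stack = [((), np.asarray(x, dtype=float))]
    while stack:
        path, node = stack.pop()
        can_split = len(path) < max_depth and len(node) >= 2 and len(node) % 2 == 0
        if can_split:
            lo, hi = haar_split(node)
            if l1_cost(lo) + l1_cost(hi) < l1_cost(node):
                stack.append((path + ("h",), hi))
                stack.append((path + ("l",), lo))
                continue
        leaves.append((path, node))
    return leaves
```

Because the Haar step is orthonormal, the leaves always hold as many coefficients as the input and preserve its energy, whichever nodes the cost decides to split.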

Reviews on Technology and Standard of Spatial Audio Coding

Author: Elfitri Ikhwana
Luthfi Amirul
Publication venue: Perpustakaan Universitas Andalas
Publication date: 16/03/2017

Market demand for more impressive entertainment media has motivated the delivery of three-dimensional (3D) audio content to home consumers through Ultra High Definition TV (UHDTV), the next generation of TV broadcasting, in which spatial audio coding plays a fundamental role. This paper reviews the fundamental concepts of spatial audio coding, including technology, standards, and applications. The basic principle of object-based audio reproduction systems is also elaborated and compared with the traditional channel-based system, to give a good understanding of this popular interactive audio reproduction system, which gives end users the flexibility to render their own preferred audio composition.

Keywords: spatial audio, audio coding, multi-channel audio signals, MPEG standard, object-based audio

Jurnal Nasional Teknik Elektro

Development and Implementation of a Sound Acquisition and Storage Device for Precision Livestock Farming

Author: Arrasin Cesar H
Chelotti José O.
Giovanini Leonardo L.
Rufiner Hugo Leonardo
Vanrell Sebastián R.
Publication date: 01/11/2014

Accurate monitoring of the feeding activities of ruminants (rumination and grazing) is an important indicator of their health and welfare. Close monitoring of the diet directly affects the quality and quantity of the milk and meat produced by the animal. This work describes the development and implementation of a signal acquisition and storage device for monitoring feeding activities in cattle. The device aims to capture the sounds produced by the animals while they feed, without interfering with their normal behavior and without operator intervention. The proposed system consists of three modules: i) a module for acquiring and cleaning the sound produced by the animal, ii) a module for compressing the resulting signal and organizing and storing the data, and iii) a power management module. The system was designed to run autonomously for one week and to withstand the operating conditions found in the field, such as noise and adverse weather.

Sociedad Argentina de Informática e Investigación Operativa (SADIO)

Perceptual Impact of Zeroing Out the Plosive Segments of Speech

Author: Santini Vincent
Publication venue: Universite de Sherbrooke
Publication date: 01/01/2016

In audio signal processing, plosives are speech sounds that are very important for intelligibility and quality. Plosives are nevertheless difficult to model with the usual techniques (linear prediction and transform coding), because of their large intrinsic dynamic range and their unpredictable nature. This study presents an example of a complete system able to detect, segment, and alter plosives in a speech stream. The system is used to test the following hypothesis: the burst phase of plosives can be set to zero in a perceptually equivalent way. The impact of this transformation on subjective quality is evaluated on a database of recorded sentences. The results of this highly destructive alteration of the signals tend to show that the perceptual impact is minor. The implications of these results for speech coding are discussed.

Savoirs UdeS
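The alteration tested in the abstract above, setting the burst phase of each plosive to zero, amounts to a simple operation once the burst boundaries are known. The sketch below is illustrative only: it assumes the detection and segmentation stages have already produced sample-index ranges, and the function name `zero_segments` is hypothetical rather than taken from the thesis.

```python
import numpy as np


def zero_segments(signal, segments):
    """Return a copy of signal with each [start, end) sample range set to zero.

    segments is an iterable of (start, end) sample indices, assumed to come
    from a separate plosive burst detector that is not shown here.
    """
    out = np.array(signal, dtype=float, copy=True)
    for start, end in segments:
        out[start:end] = 0.0
    return out
```

The input is left untouched, so the original and altered versions can be compared side by side in a listening test.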