End to End Deep Neural Network Frequency Demodulation of Speech Signals

Author: DE Rumelhart
Indranil Hatai
M Amini
M Schuster
M Önder
N Srivastava
RE Turner
S Hochreiter
T Goehring
Y Xu
Publication date: 07/10/2017

Frequency modulation (FM) has been widely used for radio broadcasting for almost a century. We propose a software-defined-radio (SDR) receiver for FM demodulation that adopts an end-to-end learning-based approach and uses prior information about the transmitted speech message in the demodulation process. The receiver detects and enhances speech from the in-phase and quadrature components of the baseband signal. The new system performs well under both acoustic disturbances and communication channel noise, and is expected to outperform established methods in low signal-to-noise ratio (SNR) conditions in terms of both mean square error and perceptual evaluation of speech quality score.
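
For context, the conventional baseline that such a receiver is compared against can be sketched in a few lines. The example below is not the learned receiver described in the abstract; it is a classical polar discriminator that recovers the message from the phase difference between consecutive I/Q samples. The function names, the 75 kHz deviation, and the 240 kHz sample rate are assumptions chosen for illustration.

```python
import cmath
import math


def fm_modulate_baseband(message, freq_dev_hz, sample_rate_hz):
    """Return complex baseband I/Q samples for a real message in [-1, 1].

    Hypothetical helper: it only exists to produce test input for the
    demodulator below.
    """
    phase = 0.0
    iq = []
    for m in message:
        # The instantaneous frequency offset is freq_dev_hz * m, so the
        # phase advances by 2*pi*f/fs on every sample.
        phase += 2.0 * math.pi * freq_dev_hz * m / sample_rate_hz
        iq.append(cmath.exp(1j * phase))
    return iq


def fm_demodulate(iq, freq_dev_hz, sample_rate_hz):
    """Classical polar discriminator (not the neural receiver).

    The angle of cur * conj(prev) is the phase step between two samples,
    which is proportional to the message value at the current sample.
    """
    scale = sample_rate_hz / (2.0 * math.pi * freq_dev_hz)
    return [
        cmath.phase(cur * prev.conjugate()) * scale
        for prev, cur in zip(iq, iq[1:])
    ]
```

On a noise-free signal the discriminator returns the message shifted by one sample, provided the per-sample phase step stays below pi (here 75 kHz / 240 kHz keeps it well inside that limit). Its output degrades quickly at low SNR, which is the regime the abstract targets.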

arXiv.org e-Print Archive

Crossref

Criteria for evaluating internet tutorials in speech communication sciences

Author: Bowerman C
Eriksson A
Huckvale M
Rosner M
Tatham M
Wolters M
Publication date: 01/01/1999

UCL Discovery

IMAGINE Final Report

Author: Arana C
Dattani I
Pick R
Recio I
Schmidt P
Publication venue: s.n.
Publication date: 01/09/2003

Southampton (e-Prints Soton)

A Prototype Toolkit For Evaluating Indoor Environmental Quality In Commercial Buildings

Author: Anwar George
Dickerhoff Darryl
Heinzerling David
Hoyt Tyler
Webster Tom
Publication venue: eScholarship, University of California
Publication date: 01/01/2013

Measurement of building environmental parameters is often complex, expensive, and not easily proceduralized in a manner that covers all commercial buildings. Evaluating building indoor environmental quality performance is therefore not standard practice. This project developed a prototype toolkit that addresses existing barriers to widespread indoor environmental quality performance evaluation. A toolkit with both hardware and software elements was designed for practitioners around the indoor environmental quality requirements of the American Society of Heating, Refrigerating and Air-Conditioning Engineers / Chartered Institution of Building Services Engineers / U.S. Green Building Council Performance Measurement Protocols. The toolkit was built on a wireless mesh network with a web-based data collection, analysis, and reporting application. It provided fast, robust deployment of sensors, real-time data analysis, Performance Measurement Protocol-based analysis methods, and scorecard and report generation tools. A web-enabled Geographic Information System-based metadata collection system also reduced field-study deployment time. The toolkit was evaluated through three case studies, which are discussed in this report.

eScholarship - University of California

Voice input/output capabilities at Perception Technology Corporation

Author: Ferber Leon A.

Condensed resumes of key company personnel at the Perception Technology Corporation are presented. The staff has expertise in recognition, speech synthesis, speaker authentication, and language identification. The capabilities of the hardware and software engineers are included.

NASA Technical Reports Server

Objective dysphonia quantification in vocal fold paralysis: comparing nonlinear with classical measures

Author: Declan A. E. Costello
Max A. Little
Meredydd L. Harries
Publication date: 20/04/2009

Clinical acoustic voice recording analysis is usually performed using classical perturbation measures including jitter, shimmer and noise-to-harmonic ratios. However, restrictive mathematical limitations of these measures prevent analysis for severely dysphonic voices. Previous studies of alternative nonlinear random measures addressed wide varieties of vocal pathologies. Here, we analyze a single vocal pathology cohort, testing the performance of these alternative measures alongside classical measures.

We present voice analysis, pre- and post-operatively, of unilateral vocal fold paralysis (UVFP) patients undergoing standard medialisation thyroplasty surgery and of healthy controls, using jitter, shimmer and noise-to-harmonic ratio (NHR) alongside the nonlinear measures recurrence period density entropy (RPDE), detrended fluctuation analysis (DFA) and correlation dimension. Having systematized the preparative editing of the recordings, we found that the novel measures were more stable, and hence more reliable, than the classical measures on healthy controls.

RPDE and jitter are sensitive to improvements pre- to post-operation. Shimmer, NHR and DFA showed no significant change (p > 0.05). All measures detect statistically significant and clinically important differences between controls and patients, both treated and untreated (p < 0.001, AUC > 0.7). Pre- to post-operation, GRBAS ratings show statistically significant and clinically important improvement in overall dysphonia grade (G) (AUC = 0.946, p < 0.001).

Re-calculating AUCs from other study data, we compare these results in terms of clinical importance. We conclude that, when preparative editing is systematized, nonlinear random measures may be useful UVFP treatment effectiveness monitoring tools, and there may be applications for other forms of dysphonia.
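
As a rough illustration of the classical perturbation measures named in this abstract, the sketch below computes local jitter and local shimmer from already-extracted cycle data. It assumes the common "local" definition (mean absolute difference between consecutive cycles divided by the mean value); the abstract does not state which variant the study used, and the function names are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values over their mean.

    Assumed "local" definition of a perturbation measure; other variants
    (for example smoothed over several cycles) exist.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(periods_s):
    """Cycle-to-cycle variation of the glottal period (in seconds)."""
    return relative_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude."""
    return relative_perturbation(peak_amplitudes)
```

Both functions need a reliable list of glottal cycles as input. For severely dysphonic voices that cycle extraction may fail, which is one way to read the "restrictive mathematical limitations" the abstract mentions.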

Nature Precedings

Using Transcoding for Hidden Communication in IP Telephony

Author: Mazurczyk Wojciech
Szaga Pawel
Szczypiorski Krzysztof
Publication date: 04/11/2011

The paper presents a new steganographic method for IP telephony called TranSteg (Transcoding Steganography). Typically, in steganographic communication it is advised that covert data be compressed in order to limit its size. In TranSteg it is the overt data that is compressed to make space for the steganogram. The main innovation of TranSteg is to find, for a chosen voice stream, a codec that gives similar voice quality but a smaller voice payload size than the originally selected codec. The voice stream is then transcoded. At this step the original voice payload size is intentionally left unaltered and the change of codec is not indicated. Instead, after the transcoded voice payload is placed, the remaining free space is filled with hidden data. A proof-of-concept implementation of TranSteg was designed and developed. The experimental results obtained are included in this paper. They show that the proposed method is feasible and offers a high steganographic bandwidth. TranSteg is difficult to detect when inspection is performed at a single network location. Comment: 17 pages, 16 figures, 4 tables
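
The capacity argument in this abstract reduces to simple arithmetic, sketched below. The codec pair is an assumption made for illustration (G.711 at 64 kbit/s as the overt codec, G.726 at 32 kbit/s as the covert codec, 20 ms of voice per packet); the abstract does not name the codecs used, and the function names are hypothetical.

```python
def payload_bytes(bitrate_bps, frame_ms):
    """Voice payload size in bytes for one packet of frame_ms milliseconds."""
    return bitrate_bps * frame_ms // 8000


def transteg_capacity(overt_bps, covert_bps, frame_ms=20):
    """Return (free bytes per packet, hidden-data bandwidth in bit/s).

    The packet keeps the payload size of the overt codec, the voice is
    re-encoded with the smaller covert codec, and the bytes left over
    carry the steganogram.
    """
    original = payload_bytes(overt_bps, frame_ms)
    transcoded = payload_bytes(covert_bps, frame_ms)
    if transcoded > original:
        raise ValueError("covert codec must not need more space than the overt codec")
    free = original - transcoded
    return free, free * 8 * 1000 // frame_ms


# Assumed example: G.711 (64 kbit/s) overt, G.726 (32 kbit/s) covert.
free_bytes, hidden_bps = transteg_capacity(64000, 32000)
```

Under these assumptions each 160-byte payload carries 80 bytes of transcoded voice and 80 bytes of hidden data, which corresponds to 32 kbit/s of steganographic bandwidth.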

arXiv.org e-Print Archive

Springer - Publisher Connector