616 research outputs found

    Noise-robust detection of peak-clipping in decoded speech

    A modulation property of time-frequency derivatives of filtered phase and its application to aperiodicity and fo estimation

    We introduce a simple, linear SNR estimator (strictly speaking, an estimator of the periodic-to-random power ratio) that covers 0 dB to 80 dB without additional calibration or linearization and provides reliable descriptions of aperiodicity in speech corpora. The main idea of this method is to estimate the background random noise level without directly extracting the background noise. The proposed method is applicable to a wide variety of time windowing functions with very low sidelobe levels. The estimate combines the frequency derivative and the time-frequency derivative of the mapping from filter center frequency to the output instantaneous frequency. This procedure can replace the periodicity detection and aperiodicity estimation subsystems of the recently introduced open-source YANG vocoder. Source code of a MATLAB implementation of this method will also be open sourced. Comment: 8 pages 9 figures, Submitted and accepted in Interspeech201

    An evaluation of intrusive instrumental intelligibility metrics

    Instrumental intelligibility metrics are commonly used as an alternative to listening tests. This paper evaluates 12 monaural intrusive intelligibility metrics: SII, HEGP, CSII, HASPI, NCM, QSTI, STOI, ESTOI, MIKNN, SIMI, SIIB, and $\text{sEPSM}^\text{corr}$. In addition, this paper investigates the ability of intelligibility metrics to generalize to new types of distortions and analyzes why the top performing metrics have high performance. The intelligibility data were obtained from 11 listening tests described in the literature. The stimuli included Dutch, Danish, and English speech that was distorted by additive noise, reverberation, competing talkers, pre-processing enhancement, and post-processing enhancement. SIIB and HASPI had the highest performance, achieving average correlations with listening test scores of $\rho=0.92$ and $\rho=0.89$, respectively. The high performance of SIIB may, in part, be the result of SIIB's developers having access to all the intelligibility data considered in the evaluation. The results show that intelligibility metrics tend to perform poorly on data sets that were not used during their development. By modifying the original implementations of SIIB and STOI, the advantage of reducing statistical dependencies between input features is demonstrated. Additionally, the paper presents a new version of SIIB called $\text{SIIB}^\text{Gauss}$, which has similar performance to SIIB and HASPI but takes two orders of magnitude less time to compute. Comment: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 201

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    The implicit objective of the biennial "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference. Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1

    The Bit Error Rate (BER) Performance in Multi-Carrier (OFDM) and Single-Carrier

    The spectacular growth of wireless communication tools has raised the number of mobile subscribers from almost 700 million in 2000 to more than 4 billion in 2009. This huge number of subscribers has led to several issues with how service is provided. High user demand has forced developers to overcome the problems of the old analog systems and to introduce OFDM as a promising technique that can meet users' demands. The technique is well suited to high-data-rate connections and provides higher capacity for subscribers. OFDM, as a multi-carrier scheme, is more complex than single-carrier transmission. However, OFDM maintains better bit error rate (BER) performance at high data rates. This thesis compares multi-carrier OFDM with single-carrier transmission to confirm the theoretical expectation by simulation. Despite its advantages, the OFDM scheme has several drawbacks. One of them is the high peak-to-average power ratio (PAPR). To overcome this problem, power reduction techniques can be applied to the signal to reduce its peak power. One of these is the clipping and filtering technique: a maximum level is set for the transmitted signal to limit its peaks, and the signal then passes through a filter to reduce the in-band distortion and out-of-band radiation.
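
    The clipping and filtering idea in the abstract above can be illustrated with a minimal Python sketch. It is not the thesis's simulation: the QPSK mapping, the 64 subcarriers, the 4x oversampling, the clip ratio of 1.5 times the RMS level, and all function names are assumptions chosen for illustration. The filter here simply zeroes the out-of-band frequency bins; it does not attempt to correct in-band distortion.

```python
import cmath
import math
import random


def idft(freq):
    """Naive inverse DFT, O(N^2); adequate for a small illustrative N."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def dft(time):
    """Naive forward DFT, O(N^2)."""
    n = len(time)
    return [
        sum(time[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def papr_db(signal):
    """Peak-to-average power ratio in dB."""
    powers = [abs(x) ** 2 for x in signal]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def ofdm_symbol(n_sub, oversample, rng):
    """One OFDM symbol: random QPSK on n_sub (even) subcarriers,
    zero-padded in frequency so the time signal is oversampled."""
    qpsk = [
        complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2)
        for _ in range(n_sub)
    ]
    n = n_sub * oversample
    half = n_sub // 2
    freq = [0j] * n
    freq[:half] = qpsk[:half]        # lowest bins of the spectrum
    freq[n - half:] = qpsk[half:]    # highest bins (negative frequencies)
    return freq, idft(freq)


def clip(signal, clip_ratio):
    """Limit each sample's magnitude to clip_ratio * RMS, keeping its phase."""
    rms = math.sqrt(sum(abs(x) ** 2 for x in signal) / len(signal))
    a_max = clip_ratio * rms
    return [x if abs(x) <= a_max else a_max * x / abs(x) for x in signal]


def remove_out_of_band(signal, n_sub):
    """Frequency-domain filter: zero every bin outside the occupied band."""
    n = len(signal)
    half = n_sub // 2
    freq = dft(signal)
    for k in range(half, n - half):
        freq[k] = 0j
    return idft(freq)


rng = random.Random(0)  # fixed seed so the run is repeatable
_, x = ofdm_symbol(64, 4, rng)
clipped = clip(x, 1.5)
filtered = remove_out_of_band(clipped, 64)
print(f"PAPR original: {papr_db(x):.2f} dB")
print(f"PAPR clipped:  {papr_db(clipped):.2f} dB")
print(f"PAPR filtered: {papr_db(filtered):.2f} dB")
```

    Filtering can let some peaks grow back after clipping, which is why the two steps are often repeated for a few iterations in practice.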

    On the efficiency of PAPR reduction schemes deployed for DRM systems

    Digital Radio Mondiale (DRM) is the universal, openly standardized digital broadcasting system for all frequencies, including the LW, MW, and SW bands as well as the VHF bands. Alongside providing high audio quality to listeners, DRM satisfies technological requirements posed by broadcasters, manufacturers and regulatory authorities and thus has great potential for the future of global radio. One of the key issues here concerns green broadcasting. Given the need for high-power transmitters to cover wide areas, there is room for improvement in the power efficiency of DRM transmitters. A major drawback of DRM is its high peak-to-average power ratio (PAPR), a consequence of its transmission technology based on Orthogonal Frequency Division Multiplexing (OFDM), which results in non-linearities in the emitted signal, low power efficiency, and high transmitter costs. To overcome this, numerous schemes have been investigated for reducing PAPR in OFDM systems. In this paper, we review and analyze various technologies to reduce PAPR, provided that technical feasibility, the DRM-specific system architecture, and the boundary conditions on system performance (modulation error rate, compliance with the frequency mask, and synchronization efficiency) are respected. All evaluations are carried out with I/Q signals monitored in real operation to present the actual performance of the proposed PAPR techniques. Subsequently, the capability of the best approach is evaluated via measurements on a DRM test platform, where a transmit power gain of 10 dB is demonstrated. According to our evaluation results, PAPR reduction schemes based on active constellation extension followed by a filter are promising for the practical realization of power-efficient transmitters. © 2016, The Author(s)