
    An evaluation of intrusive instrumental intelligibility metrics

    Instrumental intelligibility metrics are commonly used as an alternative to listening tests. This paper evaluates 12 monaural intrusive intelligibility metrics: SII, HEGP, CSII, HASPI, NCM, QSTI, STOI, ESTOI, MIKNN, SIMI, SIIB, and $\text{sEPSM}^\text{corr}$. In addition, this paper investigates the ability of intelligibility metrics to generalize to new types of distortions and analyzes why the top-performing metrics have high performance. The intelligibility data were obtained from 11 listening tests described in the literature. The stimuli included Dutch, Danish, and English speech that was distorted by additive noise, reverberation, competing talkers, pre-processing enhancement, and post-processing enhancement. SIIB and HASPI had the highest performance, achieving average correlations with listening test scores of $\rho = 0.92$ and $\rho = 0.89$, respectively. The high performance of SIIB may, in part, be the result of SIIB's developers having access to all the intelligibility data considered in the evaluation. The results show that intelligibility metrics tend to perform poorly on data sets that were not used during their development. By modifying the original implementations of SIIB and STOI, the advantage of reducing statistical dependencies between input features is demonstrated. Additionally, the paper presents a new version of SIIB called $\text{SIIB}^\text{Gauss}$, which has similar performance to SIIB and HASPI but is two orders of magnitude faster to compute. Comment: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 201
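
The abstract scores each metric by its correlation with listening test results. A minimal sketch of that kind of comparison follows, assuming a plain Pearson correlation (the abstract does not say which correlation measure, or which mapping from metric to score, is used). The function name and the sample numbers in the usage lines are hypothetical and are not data from the paper.

```python
import numpy as np

def pearson_correlation(metric_scores, listening_scores):
    """Pearson correlation between metric outputs and listening test scores.

    Assumption: one metric value and one measured intelligibility score
    per test condition. Constant inputs are not handled (division by zero).
    """
    m = np.asarray(metric_scores, dtype=float)
    s = np.asarray(listening_scores, dtype=float)
    m_centered = m - m.mean()
    s_centered = s - s.mean()
    denominator = np.sqrt(np.sum(m_centered ** 2) * np.sum(s_centered ** 2))
    return float(np.sum(m_centered * s_centered) / denominator)

# Hypothetical values, for illustration only.
example_metric = [0.21, 0.35, 0.52, 0.70, 0.88]
example_listening = [12.0, 30.0, 55.0, 78.0, 95.0]  # percent words correct
rho = pearson_correlation(example_metric, example_listening)
```

A value of `rho` close to 1 would indicate that the metric ranks and spaces the conditions much as listeners do.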

    Data compression techniques applied to high resolution high frame rate video technology

    An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey of video data compression methods described in the open literature was conducted. The survey examines compression methods employing digital computing. The results of the survey are presented. They include a description of each method and an assessment of image degradation and video data parameters. An assessment is made of present and near-term technology for implementing video data compression in high-speed imaging systems. Results of the assessment are discussed and summarized. The results of a study of a baseline HHVT video system, and approaches for implementing video data compression, are presented. Case studies of three microgravity experiments are presented, and specific compression techniques and implementations are recommended.

    Picture coding in viewdata systems

    Viewdata systems in commercial use at present offer the facility for transmitting alphanumeric text and graphic displays via the public switched telephone network. An enhancement to the system would be to transmit true video images instead of graphics. Such a system, under development in Britain at present, uses Differential Pulse Code Modulation (DPCM) and a transmission rate of 1200 bits/sec. Error protection is achieved by the use of error protection codes, which increases the channel requirement. In this thesis, error detection and correction of DPCM coded video signals without the use of channel error protection is studied. The scheme operates entirely at the receiver by examining the local statistics of the received data to determine the presence of errors. Error correction is then undertaken by interpolation from adjacent correct or previously corrected data. DPCM coding of pictures has the inherent disadvantages of a slow build-up of the displayed picture at the receiver and difficulties with image size manipulation. In order to fit the pictorial information into a viewdata page, its size has to be reduced. Unitary transforms, typically the discrete Fourier transform (DFT), the discrete cosine transform (DCT) and the Hadamard transform (HT), enable lowpass filtering and decimation to be carried out in a single operation in the transform domain. Size reductions of different orders are considered and the merits of the DFT, DCT and HT are investigated. With limited channel capacity, it is desirable to remove the redundancy present in the source picture in order to reduce the bit rate. Orthogonal transformation decorrelates the spatial sample distribution and packs most of the image energy into the low order coefficients. This property is exploited in bit-reduction schemes which are adaptive to the local statistics of the different source pictures used. In some cases, bit rates of less than 1.0 bit/pel are achieved with satisfactory received picture quality.
Unlike DPCM systems, transform coding has the advantage of being able to rapidly display a low-resolution picture by initial inverse transformation of the low order coefficients only. Picture resolution is then progressively built up as more coefficients are received and decoded. Different sequences of picture update are investigated to find the one that achieves the best subjective quality with the fewest possible coefficients transmitted.
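
Two ideas from this abstract can be sketched in a few lines: size reduction by keeping only the low order transform coefficients (lowpass filtering and decimation in one step), and progressive build-up from a growing set of coefficients. The sketch below is a one-dimensional illustration using an orthonormal DCT-II; it is not the thesis's scheme, which works on pictures and also considers the DFT and HT. All function names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    k = np.arange(n)[:, None]       # coefficient index
    m = np.arange(n)[None, :]       # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)      # DC row has its own normalization
    return c

def dct_downsample(x, m):
    """Reduce a length-n signal to length m in the transform domain.

    Keeping the first m coefficients is the lowpass step; inverting with
    a length-m DCT is the decimation step. The sqrt(m / n) factor keeps
    the amplitude of the signal unchanged.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    coeffs = dct_matrix(n) @ x
    low = coeffs[:m] * np.sqrt(m / n)
    return dct_matrix(m).T @ low

def progressive_reconstruction(x, keep):
    """Full-size reconstruction from only the first `keep` coefficients."""
    x = np.asarray(x, dtype=float)
    c = dct_matrix(len(x))
    coeffs = c @ x
    coeffs[keep:] = 0.0             # coefficients not yet received
    return c.T @ coeffs

# Example: a smooth ramp, halved in size, then rebuilt progressively.
signal = np.linspace(0.0, 1.0, 16)
half_size = dct_downsample(signal, 8)
errors = [np.sum((signal - progressive_reconstruction(signal, k)) ** 2)
          for k in range(1, 17)]
```

Because the transform is orthonormal, the reconstruction error in `errors` equals the energy of the coefficients still missing, so it can only shrink as more coefficients arrive.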

    Orthogonal transmultiplexers : extensions to digital subscriber line (DSL) communications

    An orthogonal transmultiplexer which unifies multirate filter bank theory and communications theory is investigated in this dissertation. Various extensions of the orthogonal transmultiplexer techniques have been made for digital subscriber line communication applications. It is shown that the theoretical performance bounds of single carrier modulation based transceivers and multicarrier modulation based transceivers are the same under the same operational conditions. Single carrier based transceiver systems such as Quadrature Amplitude Modulation (QAM) and Carrierless Amplitude and Phase (CAP) modulation schemes, and multicarrier based transceiver systems such as Orthogonal Frequency Division Multiplexing (OFDM) or Discrete Multi Tone (DMT) and Discrete Subband (Wavelet) Multicarrier based transceiver (DSBMT) techniques, are considered in this investigation. The performance and robustness of DMT and DSBMT based transceiver systems under narrow band interference are also investigated. It is shown that the performance of a DMT based transceiver system is quite sensitive to the location and strength of a single tone (narrow band) interference. The performance sensitivity is highlighted in this work. It is shown that an adaptive interference exciser can alleviate the sensitivity problem of a DMT based system. The improved spectral properties of the DSBMT technique reduce the performance sensitivity to variations of a narrow band interference. It is shown that the DSBMT technique outperforms DMT and has a more robust performance than the latter. The superior performance robustness is shown in this work. Optimal orthogonal basis design using a cosine modulated multirate filter bank is discussed. An adaptive linear combiner at the output of the analysis filter bank is implemented to eliminate the intersymbol and interchannel interferences. It is shown that DSBMT is the most suitable technique for a narrow band interference environment.
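
For readers unfamiliar with the multicarrier side of this comparison, the following is a minimal sketch of an FFT-based multicarrier link of the OFDM/DMT kind. It is a simplified complex-baseband model with no noise, not the dissertation's transceiver: a practical DMT system would also impose Hermitian symmetry to obtain a real line signal. The function name, channel taps, and block size are hypothetical.

```python
import numpy as np

def multicarrier_roundtrip(symbols, channel, cp_len):
    """Send one block of subcarrier symbols through an FIR channel.

    Assumptions: noiseless channel, cp_len >= len(channel) - 1, and a
    channel frequency response with no zeros on the FFT grid.
    """
    symbols = np.asarray(symbols, dtype=complex)
    n = len(symbols)
    tx = np.fft.ifft(symbols)                        # modulate subcarriers
    tx_cp = np.concatenate([tx[n - cp_len:], tx])    # prepend cyclic prefix
    rx = np.convolve(tx_cp, channel)                 # linear channel filtering
    rx_block = rx[cp_len:cp_len + n]                 # discard the prefix
    freq = np.fft.fft(rx_block)                      # demodulate
    h = np.fft.fft(channel, n)                       # per-subcarrier gain
    return freq / h                                  # one-tap equalization

# Hypothetical example: 16 QPSK subcarriers over a 3-tap channel.
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(16, 2))
qpsk = (2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)
recovered = multicarrier_roundtrip(qpsk, np.array([1.0, 0.5, 0.2]), cp_len=4)
```

The cyclic prefix turns the channel's linear convolution into a circular one over the retained block, which is why a single complex division per subcarrier is enough to recover the symbols here.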
A blind channel identification scheme and an optimal MMSE based equalizer employing a nonmaximally decimated filter bank precoder / postequalizer structure are proposed. The performance of the blind channel identification scheme is shown not to be sensitive to the characteristics of the unknown channel. The performance of the proposed optimal MMSE based equalizer is shown to be superior to that of the zero-forcing equalizer.
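
The advantage of an MMSE equalizer over a zero-forcing one can be illustrated on the simplest possible case, a single complex channel gain with additive noise. This is a textbook scalar sketch under that assumption, not the filter bank precoder / postequalizer structure proposed in the dissertation. The function names and numeric values are hypothetical.

```python
import numpy as np

def zero_forcing_tap(h):
    """Invert the channel gain exactly, ignoring noise."""
    return 1.0 / h

def mmse_tap(h, noise_var, signal_var=1.0):
    """Tap that minimizes the mean squared error for y = h * s + n."""
    return np.conj(h) / (np.abs(h) ** 2 + noise_var / signal_var)

def mean_squared_error(h, w, noise_var, signal_var=1.0):
    """E|w * (h * s + n) - s|^2 for uncorrelated, zero-mean s and n."""
    residual_signal = np.abs(w * h - 1.0) ** 2 * signal_var
    amplified_noise = np.abs(w) ** 2 * noise_var
    return float(residual_signal + amplified_noise)

# Hypothetical weak channel gain, where noise enhancement hurts the most.
h = 0.1 + 0.05j
noise_var = 0.01
mse_zf = mean_squared_error(h, zero_forcing_tap(h), noise_var)
mse_mmse = mean_squared_error(h, mmse_tap(h, noise_var), noise_var)
```

Zero forcing removes the channel distortion completely but amplifies the noise by `1 / |h|^2`; the MMSE tap accepts a small residual distortion in exchange for less noise, so `mse_mmse` never exceeds `mse_zf`.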