8,843 research outputs found

    Proceedings of the Second International Mobile Satellite Conference (IMSC 1990)

    Presented here are the proceedings of the Second International Mobile Satellite Conference (IMSC), held June 17-20, 1990 in Ottawa, Canada. Topics covered include future mobile satellite communications concepts, aeronautical applications, modulation and coding, propagation and experimental systems, mobile terminal equipment, network architecture and control, regulatory and policy considerations, vehicle antennas, and speech compression.

    Near-Instantaneously Adaptive HSDPA-Style OFDM Versus MC-CDMA Transceivers for WIFI, WIMAX, and Next-Generation Cellular Systems

    Burst-by-burst (BbB) adaptive high-speed downlink packet access (HSDPA) style multicarrier systems are reviewed, identifying their most critical design aspects. These systems exhibit numerous attractive features, rendering them eminently eligible for employment in next-generation wireless systems. It is argued that BbB-adaptive or symbol-by-symbol adaptive orthogonal frequency division multiplex (OFDM) modems counteract the near-instantaneous channel quality variations and hence attain an increased throughput or robustness in comparison to their fixed-mode counterparts. Although they act quite differently, various diversity techniques, such as Rake receivers and space-time block coding (STBC), are also capable of mitigating the channel quality variations in their effort to reduce the bit error ratio (BER), provided that the individual antenna elements experience independent fading. By contrast, in the presence of correlated fading imposed by shadowing or time-variant multiuser interference, the benefits of space-time coding erode and it is unrealistic to expect that a fixed-mode space-time coded system remains capable of maintaining a near-constant BER.
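
The burst-by-burst adaptation described in this abstract can be illustrated with a minimal sketch, assuming the common scheme in which the transmitter picks a modulation mode for each burst by comparing the estimated channel SNR against a set of switching thresholds. The threshold values, the mode list, and the function names below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from bisect import bisect_right

# Hypothetical switching thresholds in dB (illustration only). A real system
# would tune these so that each mode meets a target bit error ratio.
THRESHOLDS_DB = [4.0, 10.0, 16.0, 22.0]

# One more mode than thresholds: below the lowest threshold nothing is sent.
MODES = ["no transmission", "BPSK", "QPSK", "16-QAM", "64-QAM"]
BITS_PER_SYMBOL = {
    "no transmission": 0,
    "BPSK": 1,
    "QPSK": 2,
    "16-QAM": 4,
    "64-QAM": 6,
}


def select_mode(snr_db: float) -> str:
    """Return the highest-order mode whose threshold the burst SNR reaches."""
    return MODES[bisect_right(THRESHOLDS_DB, snr_db)]


def mean_bits_per_symbol(burst_snrs_db: list[float]) -> float:
    """Average throughput, in bits per symbol, over a sequence of bursts."""
    if not burst_snrs_db:
        return 0.0
    total = sum(BITS_PER_SYMBOL[select_mode(s)] for s in burst_snrs_db)
    return total / len(burst_snrs_db)


if __name__ == "__main__":
    fading_trace = [2.5, 7.0, 12.0, 18.5, 25.0]  # made-up per-burst SNRs in dB
    for snr in fading_trace:
        print(f"{snr:5.1f} dB -> {select_mode(snr)}")
    print("mean bits/symbol:", mean_bits_per_symbol(fading_trace))
```

A fixed-mode modem would use one entry of `MODES` for every burst; the adaptive variant trades throughput for robustness on a per-burst basis as the channel quality changes.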

    New Directions in Subband Coding

    Two very different subband coders are described. The first is a modified dynamic bit-allocation subband coder (D-SBC) designed for variable rate coding situations and easily adaptable to noisy channel environments. It can operate at rates as low as 12 kb/s and still give good quality speech. The second coder is a 16-kb/s waveform coder, based on a combination of subband coding and vector quantization (VQ-SBC). The key feature of this coder is its short coding delay, which makes it suitable for real-time communication networks. The speech quality of both coders has been enhanced by adaptive postfiltering. The coders have been implemented on a single AT&T DSP32 signal processor.
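
Dynamic bit allocation across subbands can be sketched with a generic greedy rule: give each bit to the band whose quantization noise is currently largest, assuming that every extra bit lowers that band's noise power by about 6 dB (a factor of 4). This is a textbook illustration under those assumptions, not the D-SBC algorithm of the paper; the function name and the example energies are hypothetical.

```python
import heapq


def allocate_bits(band_energies: list[float], total_bits: int) -> list[int]:
    """Greedily distribute total_bits over subbands.

    Each bit goes to the band with the largest remaining noise power, and
    that band's noise power is then divided by 4 (about 6 dB per bit).
    """
    bits = [0] * len(band_energies)
    if not band_energies:
        return bits
    # Max-heap emulated by negating the noise power; ties go to the lower band.
    heap = [(-energy, index) for index, energy in enumerate(band_energies)]
    heapq.heapify(heap)
    for _ in range(total_bits):
        neg_noise, index = heapq.heappop(heap)
        bits[index] += 1
        heapq.heappush(heap, (neg_noise / 4.0, index))
    return bits


if __name__ == "__main__":
    energies = [16.0, 4.0, 1.0]  # made-up per-band signal energies
    print(allocate_bits(energies, 3))
    print(allocate_bits(energies, 6))
```

Re-running the allocation for every frame lets the bit budget follow the short-term spectrum of the speech, which is what makes the allocation "dynamic".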

    Improved compactly computable objective measures for predicting the acceptability of speech communications systems

    Issued as Monthly status reports [1-7], and Final report, Project no. E-21-61

    Adaptive Feedback Cancellation With Band-Limited LPC Vocoder in Digital Hearing Aids


    BigEAR: Inferring the Ambient and Emotional Correlates from Smartphone-based Acoustic Big Data

    This paper presents a novel BigEAR big data framework that employs a psychological audio processing chain (PAPC) to process smartphone-based acoustic big data collected when the user performs social conversations in naturalistic scenarios. The overarching goal of BigEAR is to identify moods of the wearer from various activities such as laughing, singing, crying, arguing, and sighing. These annotations are based on ground truth relevant for psychologists who intend to monitor/infer the social context of individuals coping with breast cancer. We pursued a case study on couples coping with breast cancer to understand how their conversations affect emotional and social well-being. In current practice, psychologists and their teams must listen to the audio recordings and make these inferences by subjective evaluation, which is not only time-consuming and costly but also demands manual data coding for thousands of audio files. The BigEAR framework automates the audio analysis. We computed the accuracy of BigEAR with respect to the ground truth obtained from a human rater. Our approach yielded an overall average accuracy of 88.76% on real-world data from couples coping with breast cancer. Comment: 6 pages, 10 equations, 1 Table, 5 Figures, IEEE International Workshop on Big Data Analytics for Smart and Connected Health 2016, June 27, 2016, Washington DC, US

    The intensity JND comes from Poisson neural noise: Implications for image coding

    While the problems of image coding and audio coding have frequently been assumed to have similarities, specific sets of relationships have remained vague. One area where there should be a meaningful comparison is with central masking noise estimates, which define the codec's quantizer step size. In the past few years, progress has been made on this problem in the auditory domain (Allen and Neely, J. Acoust. Soc. Am., 102, 1997, 3628-46; Allen, 1999, Wiley Encyclopedia of Electrical and Electronics Engineering, Vol. 17, p. 422-437, Ed. Webster, J.G., John Wiley & Sons, Inc, NY). It is possible that some useful insights might now be obtained by comparing the auditory and visual cases. In the auditory case it has been shown, directly from psychophysical data, that below about 5 sones (a measure of loudness, a unit of psychological intensity), the loudness JND is proportional to the square root of the loudness, $\Delta L(L) \propto \sqrt{L(I)}$. This is true for both wideband noise and tones having a frequency of 250 Hz or greater. Allen and Neely interpret this to mean that the internal noise is Poisson, as would be expected from neural point process noise. It follows directly that the Ekman fraction (the relative loudness JND) decreases as one over the square root of the loudness, namely $\Delta L / L \propto 1/\sqrt{L}$. Above $L = 5$ sones, the relative loudness JND $\Delta L / L \approx 0.03$ (i.e., Ekman law). It would be very interesting to know if this same relationship holds for the visual case between brightness $B(I)$ and the brightness JND $\Delta B(I)$. This might be tested by measuring both the brightness JND and the brightness as a function of intensity, and transforming the intensity JND into a brightness JND, namely $\Delta B(I) = B(I + \Delta I) - B(I) \approx \Delta I \, \frac{dB}{dI}$. If the Poisson nature of the loudness relation (below 5 sones) is a general result of central neural noise, as is anticipated, then one would expect that it would also hold in vision, namely that $\Delta B(B) \propto \sqrt{B(I)}$. The history of this problem is fascinating, starting with Weber and Fechner. It is well documented that the exponent in the S.S. Stevens' power law is the same for loudness and brightness (Stevens, 1961) (i.e., both brightness $B(I)$ and loudness $L(I)$ are proportional to $I^{0.3}$). Furthermore, the brightness JND data are more like Riesz's near miss data than recent 2AFC studies of JND measures (Hecht, 1934; Gescheider, 1997).
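
The relations in this abstract can be checked numerically with a small sketch. It assumes a Stevens power law with exponent 0.3 and unit scale factor, and a Poisson-like JND equal to a constant times the square root of the loudness; the constant `K` and all function names are hypothetical. The square-root rule is applied everywhere here, although the abstract restricts it to loudness below about 5 sones.

```python
import math

STEVENS_EXPONENT = 0.3  # from the abstract: loudness proportional to I**0.3
K = 0.1                 # hypothetical proportionality constant, illustration only


def loudness(intensity: float) -> float:
    """Stevens power law with an assumed unit scale factor."""
    return intensity ** STEVENS_EXPONENT


def loudness_jnd(loud: float) -> float:
    """Poisson-like internal noise: the JND grows as the square root of loudness."""
    return K * math.sqrt(loud)


def relative_jnd(loud: float) -> float:
    """Ekman fraction, which falls as 1/sqrt(loudness) under the Poisson rule."""
    return loudness_jnd(loud) / loud


def intensity_jnd(intensity: float) -> float:
    """First-order intensity JND, from delta_L ~= delta_I * dL/dI."""
    slope = STEVENS_EXPONENT * intensity ** (STEVENS_EXPONENT - 1.0)
    return loudness_jnd(loudness(intensity)) / slope


if __name__ == "__main__":
    for loud in (0.5, 1.0, 2.0, 4.0):
        print(f"L = {loud:4.1f}  relative JND = {relative_jnd(loud):.4f}")
    print("intensity JND at I = 1:", intensity_jnd(1.0))
```

Quadrupling the loudness halves the relative JND in this model, which is the 1/sqrt(L) behavior the abstract attributes to Poisson noise. The same functions could be reused for brightness by relabeling the variables, which is the comparison the abstract proposes.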

    A Subband-Based SVM Front-End for Robust ASR

    This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) front-end that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. The proposed front-end is compared with state-of-the-art ASR front-ends in terms of robustness to additive noise and linear filtering. Experiments performed on the TIMIT phoneme classification task demonstrate the benefits of the proposed subband-based SVM front-end: it outperforms the standard cepstral front-end in the presence of noise and linear filtering for signal-to-noise ratios (SNR) below 12 dB. A combination of the proposed front-end with a conventional front-end such as MFCC yields further improvements over the individual front-ends across the full range of noise levels.

    Perceptual models in speech quality assessment and coding

    The ever-increasing demand for good communications/toll quality speech has created a renewed interest in the perceptual impact of rate compression. Two general areas are investigated in this work, namely speech quality assessment and speech coding. In the field of speech quality assessment, a model is developed which simulates the processing stages of the peripheral auditory system. At the output of the model a "running" auditory spectrum is obtained. This represents the auditory (spectral) equivalent of any acoustic sound such as speech. Auditory spectra from coded speech segments serve as inputs to a second model. This model simulates the information centre in the brain which performs the speech quality assessment. [Continues.]