961 research outputs found

    A single channel speech enhancement technique exploiting human auditory masking properties

    Get PDF
    To enhance extreme corrupted speech signals, an Improved Psychoacoustically Motivated Spectral Weighting Rule (IPMSWR) is proposed, that controls the predefined residual noise level by a time-frequency dependent parameter. Unlike conventional Psychoacoustically Motivated Spectral Weighting Rules (PMSWR), the level of the residual noise is here varied throughout the enhanced speech based on the discrimination between the regions with speech presence and speech absence by means of segmental SNR within critical bands. Controlling in such a way the level of the residual noise in the noise only region avoids the unpleasant residual noise perceived at very low SNRs. To derive the gain coefficients, the computation of the masking curve and the estimation of the corrupting noise power are required. Since the clean speech is generally not available for a single channel speech enhancement technique, the rough clean speech components needed to compute the masking curve are here obtained using advanced spectral subtraction techniques. To estimate the corrupting noise, a new technique is employed, that relies on the noise power estimation using rapid adaptation and recursive smoothing principles. The performances of the proposed approach are objectively and subjectively compared to the conventional approaches to highlight the aforementioned improvement

    Aerospace medicine and biology: A continuing bibliography with indexes

    Get PDF
    This bibliography lists 138 reports, articles, and other documents introduced into the NASA scientific and technical information system in Jun. 1980

    Perceptually motivated blind source separation of convolutive audio mixtures

    Get PDF

    Reviews on Technology and Standard of Spatial Audio Coding

    Get PDF
    Market  demands  on a more impressive entertainment media have motivated for delivery of three dimensional  (3D) audio content to  home consumers  through Ultra  High  Definition  TV  (UHDTV), the next generation of TV broadcasting, where spatial  audio  coding plays  fundamental role. This paper reviews fundamental concept on spatial audio coding which includes technology, standard, and application. Basic principle of object-based audio reproduction system  will also be elaborated, compared  to  the  traditional channel-based system, to provide good understanding on this popular interactive audio reproduction system which gives end users flexibility to render  their  own preferred  audio composition.Keywords : spatial audio, audio coding, multi-channel audio signals, MPEG standard, object-based audi

    Efficient audio signal processing for embedded systems

    Get PDF
    We investigated two design strategies that would allow us to efficiently process audio signals on embedded systems such as mobile phones and portable electronics. In the first strategy, we exploit properties of the human auditory system to process audio signals. We designed a sound enhancement algorithm to make piezoelectric loudspeakers sound "richer" and "fuller," using a combination of bass extension and dynamic range compression. We also developed an audio energy reduction algorithm for loudspeaker power management by suppressing signal energy below the masking threshold. In the second strategy, we use low-power analog circuits to process the signal before digitizing it. We designed an analog front-end for sound detection and implemented it on a field programmable analog array (FPAA). The sound classifier front-end can be used in a wide range of applications because programmable floating-gate transistors are employed to store classifier weights. Moreover, we incorporated a feature selection algorithm to simplify the analog front-end. A machine learning algorithm AdaBoost is used to select the most relevant features for a particular sound detection application. We also designed the circuits to implement the AdaBoost-based analog classifier.PhDCommittee Chair: Anderson, David; Committee Member: Hasler, Jennifer; Committee Member: Hunt, William; Committee Member: Lanterman, Aaron; Committee Member: Minch, Bradle

    Acoustic source separation based on target equalization-cancellation

    Full text link
    Normal-hearing listeners are good at focusing on the target talker while ignoring the interferers in a multi-talker environment. Therefore, efforts have been devoted to build psychoacoustic models to understand binaural processing in multi-talker environments and to develop bio-inspired source separation algorithms for hearing-assistive devices. This thesis presents a target-Equalization-Cancellation (target-EC) approach to the source separation problem. The idea of the target-EC approach is to use the energy change before and after cancelling the target to estimate a time-frequency (T-F) mask in which each entry estimates the strength of target signal in the original mixture. Once the mask is calculated, it is applied to the original mixture to preserve the target-dominant T-F units and to suppress the interferer-dominant T-F units. On the psychoacoustic modeling side, when the output of the target-EC approach is evaluated with the Coherence-based Speech Intelligibility Index (CSII), the predicted binaural advantage closely matches the pattern of the measured data. On the application side, the performance of the target-EC source separation algorithm was evaluated by psychoacoustic measurements using both a closed-set speech corpus and an open-set speech corpus, and it was shown that the target-EC cue is a better cue for source separation than the interaural difference cues

    Scalable and perceptual audio compression

    Get PDF
    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the similarity between the original signal\u27s psychoacoustic parameters and that of the synthesized signal are compared. The psychoacoustic parameters used are loudness, sharpness, tonahty and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner
    corecore