A single channel speech enhancement technique exploiting human auditory masking properties
To enhance extremely corrupted speech signals, an Improved Psychoacoustically
Motivated Spectral Weighting Rule (IPMSWR) is proposed that controls the
predefined residual noise level by a time-frequency dependent parameter.
Unlike conventional Psychoacoustically Motivated Spectral Weighting Rules
(PMSWR), the level of the residual noise is here varied throughout the
enhanced speech based on the discrimination between the regions with speech
presence and speech absence by means of segmental SNR within critical bands.
Controlling the level of the residual noise in noise-only regions in this way
avoids the unpleasant residual noise perceived at very low SNRs. To
derive the gain coefficients, the computation of the masking curve and the
estimation of the corrupting noise power are required. Since the clean speech
is generally not available for a single channel speech enhancement technique,
the rough clean speech components needed to compute the masking curve are
here obtained using advanced spectral subtraction techniques. To estimate the
corrupting noise, a new technique is employed that relies on noise power
estimation using rapid adaptation and recursive smoothing principles. The
performance of the proposed approach is objectively and subjectively
compared with that of conventional approaches to highlight the aforementioned
improvement.
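The two ingredients named in this abstract, recursive smoothing of the noise power and a spectral gain with a controlled residual noise level, can be illustrated with a minimal sketch. This is not the IPMSWR itself: it uses a plain power spectral subtraction gain, and the function names, the smoothing constant `alpha`, the floor values, and the SNR threshold are all assumptions chosen for illustration.

```python
import math

def smooth_noise_power(power_frames, alpha=0.9):
    """Track per-bin noise power with first-order recursive smoothing.

    power_frames: list of frames, each a list of noisy power values |Y|^2.
    alpha: smoothing constant in [0, 1); an assumed value, not from the paper.
    """
    estimate = list(power_frames[0])  # initialise from the first frame
    track = [list(estimate)]
    for frame in power_frames[1:]:
        # N[t] = alpha * N[t-1] + (1 - alpha) * |Y[t]|^2, per frequency bin
        estimate = [alpha * n + (1.0 - alpha) * p for n, p in zip(estimate, frame)]
        track.append(list(estimate))
    return track

def pick_floor(segmental_snr_db, threshold_db=0.0, speech_floor=0.1, noise_floor=0.02):
    """Choose a gain floor from the segmental SNR (hypothetical rule and values)."""
    return speech_floor if segmental_snr_db > threshold_db else noise_floor

def floored_gain(noisy_power, noise_power, floor):
    """Power spectral subtraction gain per bin, bounded below by `floor`."""
    gains = []
    for p, n in zip(noisy_power, noise_power):
        raw = math.sqrt(max(1.0 - n / p, 0.0)) if p > 0.0 else 0.0
        gains.append(max(raw, floor))  # the floor sets the residual noise level
    return gains
```

Here the floor plays the role of the residual noise level: selecting it per segment from the SNR is one simple way to make it vary between speech-present and noise-only regions.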
Aerospace medicine and biology: A continuing bibliography with indexes
This bibliography lists 138 reports, articles, and other documents introduced into the NASA scientific and technical information system in Jun. 1980
Reviews on Technology and Standard of Spatial Audio Coding
Market demand for more impressive entertainment media has motivated the delivery of three-dimensional (3D) audio content to home consumers through Ultra High Definition TV (UHDTV), the next generation of TV broadcasting, in which spatial audio coding plays a fundamental role. This paper reviews the fundamental concepts of spatial audio coding, including technology, standards, and applications. The basic principle of the object-based audio reproduction system is also elaborated and compared with the traditional channel-based system, to provide a good understanding of this popular interactive audio reproduction system, which gives end users the flexibility to render their own preferred audio composition.
Keywords: spatial audio, audio coding, multi-channel audio signals, MPEG standard, object-based audio
Efficient audio signal processing for embedded systems
We investigated two design strategies that would allow us to efficiently process audio signals on embedded systems such as mobile phones and portable electronics. In the first strategy, we exploit properties of the human auditory system to process audio signals. We designed a sound enhancement algorithm to make piezoelectric loudspeakers sound "richer" and "fuller," using a combination of bass extension and dynamic range compression. We also developed an audio energy reduction algorithm for loudspeaker power management by suppressing signal energy below the masking threshold. In the second strategy, we use low-power analog circuits to process the signal before digitizing it. We designed an analog front-end for sound detection and implemented it on a field programmable analog array (FPAA). The sound classifier front-end can be used in a wide range of applications because programmable floating-gate transistors are employed to store classifier weights. Moreover, we incorporated a feature selection algorithm to simplify the analog front-end. A machine learning algorithm, AdaBoost, is used to select the most relevant features for a particular sound detection application. We also designed the circuits to implement the AdaBoost-based analog classifier.
PhD
Committee Chair: Anderson, David; Committee Member: Hasler, Jennifer; Committee Member: Hunt, William; Committee Member: Lanterman, Aaron; Committee Member: Minch, Bradle
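As a rough illustration of how AdaBoost can double as a feature selector, the sketch below boosts single-feature decision stumps and records which feature each round picks. It is a generic textbook formulation, not the thesis's analog implementation; the function names, the stump form, and the toy data are assumptions.

```python
import math

def stump_predict(row, feature, threshold, polarity):
    """Predict +1 or -1 from one feature compared against a threshold."""
    return polarity if row[feature] >= threshold else -polarity

def best_stump(X, y, weights):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                error = sum(w for row, label, w in zip(X, y, weights)
                            if stump_predict(row, feature, threshold, polarity) != label)
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best

def adaboost_select(X, y, rounds=3):
    """Run AdaBoost with stumps; return a list of (feature, threshold, polarity, alpha)."""
    n = len(X)
    weights = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        error, feature, threshold, polarity = best_stump(X, y, weights)
        error = min(max(error, 1e-10), 1.0 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1.0 - error) / error)  # vote strength of this stump
        chosen.append((feature, threshold, polarity, alpha))
        # Increase the weight of misclassified samples, then renormalise
        weights = [w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen
```

The set of feature indices that appear in the returned list is the selected subset; features that no round chooses would not need to be computed by the front-end.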
Analysis of nonlinear behavior of loudspeakers using the instantaneous frequency: Abstracts of papers
Acoustic source separation based on target equalization-cancellation
Normal-hearing listeners are good at focusing on the target talker while ignoring the interferers in a multi-talker environment. Therefore, efforts have been devoted to building psychoacoustic models to understand binaural processing in multi-talker environments and to developing bio-inspired source separation algorithms for hearing-assistive devices. This thesis presents a target-Equalization-Cancellation (target-EC) approach to the source separation problem. The idea of the target-EC approach is to use the energy change before and after cancelling the target to estimate a time-frequency (T-F) mask in which each entry estimates the strength of the target signal in the original mixture. Once the mask is calculated, it is applied to the original mixture to preserve the target-dominant T-F units and to suppress the interferer-dominant T-F units. On the psychoacoustic modeling side, when the output of the target-EC approach is evaluated with the Coherence-based Speech Intelligibility Index (CSII), the predicted binaural advantage closely matches the pattern of the measured data. On the application side, the performance of the target-EC source separation algorithm was evaluated by psychoacoustic measurements using both a closed-set speech corpus and an open-set speech corpus, and it was shown that the target-EC cue is a better cue for source separation than the interaural difference cues.
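The mask idea described in this abstract can be sketched as follows. The abstract does not give the exact mask formula, so the relative energy drop used here, the clipping to [0, 1], and the function names are assumptions made purely for illustration.

```python
def target_mask(energy_before, energy_after, eps=1e-12):
    """Estimate a T-F mask from the energy change caused by cancelling the target.

    energy_before / energy_after: 2-D lists (time x frequency) of unit energies
    before and after cancellation. A large relative drop suggests the unit was
    target-dominant. The ratio form is an assumption, not the thesis's formula.
    """
    mask = []
    for row_before, row_after in zip(energy_before, energy_after):
        mask.append([min(max((b - a) / (b + eps), 0.0), 1.0)
                     for b, a in zip(row_before, row_after)])
    return mask

def apply_mask(mixture, mask):
    """Scale each T-F unit of the mixture by its mask value."""
    return [[m * x for m, x in zip(mask_row, mix_row)]
            for mask_row, mix_row in zip(mask, mixture)]
```

Units whose energy barely changes after cancellation receive a mask value near zero and are suppressed, while units that lose most of their energy are kept almost unchanged.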
Scalable and perceptual audio compression
This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable-to-lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner and in a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the original signal's psychoacoustic parameters are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is a novel method used in this thesis, and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner.