571 research outputs found

    Perception-aware low-power audio processing techniques for portable devices

    Ph.D., Doctor of Philosophy

    Non-linear echo cancellation - a Bayesian approach

    The echo cancellation literature is reviewed, then a Bayesian model is introduced and it is shown how it can be used to model and fit nonlinear channels. An algorithm for canceling echo over a nonlinear channel is developed and tested. The nonlinear algorithm is shown to converge for both linear and nonlinear channels and to outperform linear echo cancellation when the echo path is nonlinear.

    Three-dimensional media for mobile devices

    This paper provides an overview of the core technologies enabling the delivery of 3-D media to next-generation mobile devices. Designing such a system successfully requires profound knowledge of the human visual system and the visual cues that form the perception of depth, combined with an understanding of the user requirements for designing the user experience of mobile 3-D media. These aspects are addressed first and related to the critical parts of the generic system within a novel user-centered research framework. Next-generation mobile devices are characterized through their portable 3-D displays, as those are considered critical for enabling a genuine 3-D experience on mobiles. Quality of 3-D content is emphasized as the most important factor for the adoption of the new technology. Quality is characterized through the most typical 3-D-specific visual artifacts on portable 3-D displays and through subjective tests addressing the acceptance of and satisfaction with different 3-D video representation, coding, and transmission methods. Emphasis is put on 3-D video broadcast over digital video broadcasting-handheld (DVB-H) in order to illustrate the importance of joint source-channel optimization of 3-D video for efficient compression and robust transmission over error-prone channels. The comparative results identify the best coding and transmission approaches and shed light on the interaction between video quality and depth perception, along with the influence of the context of media use. Finally, the paper speculates on the role and place of 3-D multimedia mobile devices in the future internet continuum, which involves users in the co-creation and refinement of rich 3-D media content.

    On-the-fly Auditory Masking for Scalable VOIP Bridges

    Endpoints or conference servers of current audio-conferencing solutions use all the audio frames they receive and mix them into one final aggregate stream. However, at any given instant, some of this content may not be audible due to auditory masking. Sending the corresponding frames through the network therefore wastes bandwidth, while decoding them for mixing or spatial audio processing increases processor load. In this paper, we propose a solution based on an efficient on-the-fly auditory masking evaluation. Our technique prioritizes audio frames so that only those audible to each connected client are selected. We present results of quality tests showing the transparency of the algorithm. We describe its integration in a France Telecom audio conference server. Tests in a 3D game environment with spatialized chat capabilities show a 70% average reduction in required bandwidth, demonstrating the efficiency of our method.

    Audio/Video Transmission over IEEE 802.11e Networks: Retry Limit Adaptation and Distortion Estimation

    This thesis focuses on audio and video transmission over wireless networks that adopt the IEEE 802.11x family of standards. In particular, it addresses four issues: adaptive retransmission, the comparison of video quality indexes for retry limit adaptation, the estimation of distortion, and the joint adaptation of the maximum number of retransmissions for voice and video flows.