705 research outputs found

    Unbalanced Quantized Multi-State Video Coding

    Get PDF
    Multi-State Video Coding (MSVC) is a multiple description scheme based on frame-wise splitting of the video sequence into two or more subsequences. Each subsequence is encoded separately to generate descriptions which can be decoded independently. Due to subsequence splitting the prediction gain decreases. but since reconstruction capabilities improves, error resilience of the system increases. Our focus is on Multi-State Video Coding with unbalanced quantized descriptions, which is particularly interesting for video streaming applications over heterogeneous networks where path diversity is used and transmission channels have varying transmission characteristics. The total bitrate is kept constant while the subsequences are quantized with different step sizes depending on the sequence as well as on the transmission conditions. Our goal is to figure out under which transmission conditions unbalanced bitstreams lead to good system performance in terms of the average reconstructed PSNR. Besides, we investigate the effects of intra-coding on the error resilience of the system and show that the sequence characteristics, and in particular the degree of motion in the sequence, have an important impact on the decoding performance. Finally, we propose a distortion model that is the core of an optimized rate allocation strategy, which is dependent on the network characteristics and status as well as on the video sequence characteristics

    Graded quantization for multiple description coding of compressive measurements

    Get PDF
    Compressed sensing (CS) is an emerging paradigm for acquisition of compressed representations of a sparse signal. Its low complexity is appealing for resource-constrained scenarios like sensor networks. However, such scenarios are often coupled with unreliable communication channels and providing robust transmission of the acquired data to a receiver is an issue. Multiple description coding (MDC) effectively combats channel losses for systems without feedback, thus raising the interest in developing MDC methods explicitly designed for the CS framework, and exploiting its properties. We propose a method called Graded Quantization (CS-GQ) that leverages the democratic property of compressive measurements to effectively implement MDC, and we provide methods to optimize its performance. A novel decoding algorithm based on the alternating directions method of multipliers is derived to reconstruct signals from a limited number of received descriptions. Simulations are performed to assess the performance of CS-GQ against other methods in presence of packet losses. The proposed method is successful at providing robust coding of CS measurements and outperforms other schemes for the considered test metrics

    Source-Channel Diversity for Parallel Channels

    Full text link
    We consider transmitting a source across a pair of independent, non-ergodic channels with random states (e.g., slow fading channels) so as to minimize the average distortion. The general problem is unsolved. Hence, we focus on comparing two commonly used source and channel encoding systems which correspond to exploiting diversity either at the physical layer through parallel channel coding or at the application layer through multiple description source coding. For on-off channel models, source coding diversity offers better performance. For channels with a continuous range of reception quality, we show the reverse is true. Specifically, we introduce a new figure of merit called the distortion exponent which measures how fast the average distortion decays with SNR. For continuous-state models such as additive white Gaussian noise channels with multiplicative Rayleigh fading, optimal channel coding diversity at the physical layer is more efficient than source coding diversity at the application layer in that the former achieves a better distortion exponent. Finally, we consider a third decoding architecture: multiple description encoding with a joint source-channel decoding. We show that this architecture achieves the same distortion exponent as systems with optimal channel coding diversity for continuous-state channels, and maintains the the advantages of multiple description systems for on-off channels. Thus, the multiple description system with joint decoding achieves the best performance, from among the three architectures considered, on both continuous-state and on-off channels.Comment: 48 pages, 14 figure

    AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes

    Full text link
    We propose a method named AudioFormer,which learns audio feature representations through the acquisition of discrete acoustic codes and subsequently fine-tunes them for audio classification tasks. Initially,we introduce a novel perspective by considering the audio classification task as a form of natural language understanding (NLU). Leveraging an existing neural audio codec model,we generate discrete acoustic codes and utilize them to train a masked language model (MLM),thereby obtaining audio feature representations. Furthermore,we pioneer the integration of a Multi-Positive sample Contrastive (MPC) learning approach. This method enables the learning of joint representations among multiple discrete acoustic codes within the same audio input. In our experiments,we treat discrete acoustic codes as textual data and train a masked language model using a cloze-like methodology,ultimately deriving high-quality audio representations. Notably,the MPC learning technique effectively captures collaborative representations among distinct positive samples. Our research outcomes demonstrate that AudioFormer attains significantly improved performance compared to prevailing monomodal audio classification models across multiple datasets,and even outperforms audio-visual multimodal classification models on select datasets. Specifically,our approach achieves remarkable results on datasets including AudioSet (2M,20K),and FSD50K,with performance scores of 53.9,45.1,and 65.6,respectively. We have openly shared both the code and models: https://github.com/LZH-0225/AudioFormer.git.Comment: 9 pages, 4 figure

    Graded quantization for multiple description coding of compressive measurements

    Get PDF
    Compressed sensing (CS) is an emerging paradigm for acquisition of compressed representations of a sparse signal. Its low complexity is appealing for resource-constrained scenarios like sensor networks. However, such scenarios are often coupled with unreliable communication channels and providing robust transmission of the acquired data to a receiver is an issue. Multiple description coding (MDC) effectively combats channel losses for systems without feedback, thus raising the interest in developing MDC methods explicitly designed for the CS framework, and exploiting its properties. We propose a method called Graded Quantization (CS-GQ) that leverages the democratic property of compressive measurements to effectively implement MDC, and we provide methods to optimize its performance. A novel decoding algorithm based on the alternating directions method of multipliers is derived to reconstruct signals from a limited number of received descriptions. Simulations are performed to assess the performance of CS-GQ against other methods in presence of packet losses. The proposed method is successful at providing robust coding of CS measurements and outperforms other schemes for the considered test metrics
    • …
    corecore