677 research outputs found
A new model-based algorithm for optimizing the MPEG-AAC in MS-stereo
In this paper, a new model-based algorithm for optimizing the MPEG Advanced Audio Coder (AAC) in MS-stereo mode is presented. This algorithm extends prior work on a statistical model of quantization noise to stereo signals. Traditionally, MS-stereo coding approaches replace the Left (L) and Right (R) channels with the Middle (M) and Side (S) channels, each channel being processed independently, almost like a monophonic signal. In contrast, our method takes a global approach that codes both channels in the same process. A model of the quantization error allows us to tune the quantizers on channels M and S with respect to a distortion constraint on the reconstructed channels L and R as they will appear in the decoder. This approach leads to more efficient perceptual noise shaping and avoids the need for complex psychoacoustic models built on the M and S channels. Furthermore, it provides a straightforward scheme for choosing between LR and MS modes in each subband for each frame. Subjective listening tests show that the coding efficiency at a medium bitrate (96 kbit/s for both channels) is significantly better with our algorithm than with the standard algorithm, without any increase in complexity.
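As a rough illustration of why a distortion constraint on L and R couples the M and S quantizers, the sketch below assumes the common mid/side convention M = (L + R) / 2 and S = (L - R) / 2, with the decoder rebuilding L = M + S and R = M - S. The function names are hypothetical and are not taken from the paper; the paper's noise model and quantizer tuning are not reproduced here.

```python
def ms_encode(left, right):
    """Map L/R samples to mid/side (assumed convention: M=(L+R)/2, S=(L-R)/2)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert the mapping at the decoder: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def decoded_errors(err_mid, err_side):
    """Propagate quantization errors on M and S to the decoded L and R.

    Because decoding is linear, e_L = e_M + e_S and e_R = e_M - e_S,
    so each decoded channel carries error from both M and S.
    """
    err_left = [em + es for em, es in zip(err_mid, err_side)]
    err_right = [em - es for em, es in zip(err_mid, err_side)]
    return err_left, err_right
```

Under this convention, both quantizers contribute to the noise heard in each decoded channel, which is consistent with the abstract's point that M and S should not be tuned as if they were independent monophonic signals.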
How are Oil Revenues redistributed in an Oil Economy? The case of Kazakhstan
Kazakhstan’s economy has been driven by an oil boom since the discovery of large new oilfields coincided with the upturn of world oil prices after 1998. This paper uses national household expenditure survey data to examine whether Kazakhstan’s experience supports a curse or a blessing outcome. We assess the extent to which the benefits from the oil boom are retained in the oil-producing regions, or spread evenly across the national economy, or are concentrated in the cities where the country’s elite lives. We then analyze the data to determine the transmission mechanisms (higher wages, social transfers or informal income) from the oil boom to household expenditure. Keywords: resource boom; redistribution
Calculation of an entropy-constrained quantizer for exponentially damped sinusoids parameters
Technical report, 5 pages. The Exponentially Damped Sinusoids (EDS) model can efficiently represent real-world audio signals. In the context of low bit rate parametric audio coding, the EDS model could bring a significant improvement over classical sinusoidal models. The inclusion of an additional damping parameter calls for a specific quantization scheme. In this report, we describe a new joint-scalar quantization scheme for EDS parameters under the high-resolution hypothesis, which is much easier to implement than a vector quantization scheme. A performance evaluation of this quantizer in comparison with a 3-dimensional vector quantizer is presented in a paper submitted to IEEE Signal Processing Letters named "Entropy-Constrained Quantization of Exponentially Damped Sinusoids Parameters".
Entropy-constrained quantization of exponentially damped sinusoids parameters
Sinusoidal modeling is traditionally one of the most popular techniques for low bitrate audio coding. Usually, the sinusoidal parameters are kept constant within a time segment, but the exponentially damped sinusoidal (EDS) model is also an efficient alternative. However, the inclusion of an additional damping parameter calls for a specific quantization scheme. In this paper, we propose an asymptotically optimal entropy-constrained quantization method for amplitude, phase and damping parameters. We show that this scheme is nearly optimal in terms of rate-distortion trade-off. We also show that damping consumes the smallest part of the total entropy of quantization indexes, which suggests that the EDS model is truly efficient for audio coding.
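For readers unfamiliar with the EDS model, a minimal synthesis sketch follows. It assumes the usual form x[n] = sum_k a_k exp(-d_k n) cos(w_k n + phi_k), with the damping d_k taken as non-negative; the function name and the parameter ordering are illustrative assumptions, and the quantization scheme described in the abstract is not reproduced.

```python
import math


def eds_synthesize(components, n_samples):
    """Synthesize a sum of exponentially damped sinusoids.

    components: iterable of (amplitude, damping, omega, phase) tuples,
    with omega in radians per sample (assumed parameterization).
    """
    signal = []
    for n in range(n_samples):
        value = 0.0
        for amplitude, damping, omega, phase in components:
            # Each component decays as exp(-damping * n); damping = 0
            # recovers a classical constant-amplitude sinusoid.
            value += amplitude * math.exp(-damping * n) * math.cos(omega * n + phase)
        signal.append(value)
    return signal
```

Setting every damping value to zero reduces this to the classical sinusoidal model with constant parameters within a segment, which makes the role of the one extra parameter per component explicit.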
Audio Signal Representations for Factorization in the sparse domain
In this paper, a new class of audio representations is introduced, together with a corresponding fast decomposition algorithm. The main feature of these representations is that they are both sparse and approximately shift-invariant, which allows similarity search in a sparse domain. The common sparse support of detected similar patterns is then used to factorize their representations. The potential of this method for simultaneous structural analysis and compression tasks is illustrated by preliminary experiments on simple musical data.
Exploring new features for music classification
Automatic music classification aims at grouping unknown songs into predefined categories such as music genre or induced emotion. To obtain perceptually relevant results, it is necessary to design appropriate features that carry important information for semantic inference. In this paper, we explore novel features and evaluate them in an automatic music tagging task. The proposed features span various aspects of the music: timbre, textual metadata, visual descriptors of cover art, and features characterizing the lyrics of sung music. The merit of these novel features is then evaluated using a classification system based on a boosting algorithm on binary decision trees. Their effectiveness for the task at hand is discussed with reference to the very common Mel Frequency Cepstral Coefficients features. We show that some of these features alone bring useful information, and that the classification system takes great advantage of a description covering such diverse aspects of songs.
An overview of informed audio source separation
Audio source separation consists in recovering different unknown signals called sources by filtering their observed mixtures. In music processing, most mixtures are stereophonic songs and the sources are the individual signals played by the instruments, e.g. bass, vocals, guitar, etc. Source separation is often achieved through a classical generalized Wiener filtering, which is controlled by parameters such as the power spectrograms and the spatial locations of the sources. For an efficient filtering, those parameters need to be available and their estimation is the main challenge faced by separation algorithms. In the blind scenario, only the mixtures are available and performance strongly depends on the mixtures considered. In recent years, much research has focused on informed separation, which consists in using additional available information about the sources to improve the separation quality. In this paper, we review some recent trends in this direction.
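The Wiener-style filtering mentioned above can be sketched in its simplest single-channel form, where each source estimate is the mixture scaled by that source's share of the total power in each time-frequency bin. This is a simplified illustration that ignores the spatial parameters the abstract refers to; the function name, the flat list-of-bins layout, and the small eps constant are assumptions made for the example.

```python
def wiener_separate(mixture, source_powers, eps=1e-12):
    """Estimate each source from a single-channel mixture.

    mixture: list of mixture coefficients, one per time-frequency bin.
    source_powers: one list of non-negative power values per source,
    each the same length as mixture.
    eps: small constant guarding against division by zero.
    """
    estimates = []
    for powers in source_powers:
        estimate = []
        for i, x in enumerate(mixture):
            total = sum(p[i] for p in source_powers)
            # Gain is this source's fraction of the total power in bin i.
            estimate.append(x * powers[i] / (total + eps))
        estimates.append(estimate)
    return estimates
```

Because the gains in each bin sum to (almost exactly) one, the estimates add back up to the mixture, so the quality of the separation depends entirely on how well the power spectrograms are known. That is where the blind and informed scenarios differ.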
Efficient Bayesian Model Selection in PARAFAC via Stochastic Thermodynamic Integration
Parallel factor analysis (PARAFAC) is one of the most popular tensor factorization models. Even though it has proven successful in diverse application fields, the performance of PARAFAC usually hinges upon the rank of the factorization, which is typically specified manually by the practitioner. In this study, we develop a novel parallel and distributed Bayesian model selection technique for rank estimation in large-scale PARAFAC models. The proposed approach integrates ideas from the emerging field of stochastic gradient Markov Chain Monte Carlo, statistical physics, and distributed stochastic optimization. As opposed to the existing methods, which are based on some heuristics, our method has a clear mathematical interpretation, and has significantly lower computational requirements, thanks to data subsampling and parallelization. We provide formal theoretical analysis on the bias induced by the proposed approach. Our experiments on synthetic and large-scale real datasets show that our method is able to find the optimal model order while being significantly faster than the state-of-the-art.