510 research outputs found

    Advanced automatic mixing tools for music

    PhD thesis. This thesis presents research on several independent systems that, when combined, can generate an automatic sound mix from an unknown set of multi-channel inputs. The research explores the possibility of reproducing the mixing decisions of a skilled audio engineer with minimal or no human interaction. The research is restricted to non-time-varying mixes for large room acoustics. This research has applications in dynamic sound music concerts, remote mixing, recording and postproduction as well as live mixing for interactive scenes. Currently, automated mixers are capable of saving a set of static mix scenes that can be loaded for later use, but they lack the ability to adapt to a different room or to a different set of inputs. In other words, they lack the ability to automatically make mixing decisions. The automatic mixer research described here distinguishes between the engineering mixing and the subjective mixing contributions. This research aims to automate the technical tasks related to audio mixing while freeing the audio engineer to perform the fine-tuning involved in generating an aesthetically pleasing sound mix. Although the system mainly deals with the technical constraints involved in generating an audio mix, it takes advantage of common practices performed by sound engineers whenever possible. The system also makes use of inter-dependent channel information for controlling signal processing tasks while aiming to maintain system stability at all times. A working implementation of the system is described, and a subjective evaluation comparing a human mix with the automatic mix is used to measure the success of the automatic mixing tools.

    Orthogonal transmultiplexers : extensions to digital subscriber line (DSL) communications

    An orthogonal transmultiplexer which unifies multirate filter bank theory and communications theory is investigated in this dissertation. Various extensions of the orthogonal transmultiplexer techniques have been made for digital subscriber line communication applications. It is shown that the theoretical performance bounds of single-carrier modulation based transceivers and multicarrier modulation based transceivers are the same under the same operational conditions. Single-carrier transceiver systems such as Quadrature Amplitude Modulation (QAM) and Carrierless Amplitude and Phase (CAP) modulation, and multicarrier transceiver systems such as Orthogonal Frequency Division Multiplexing (OFDM) or Discrete Multi Tone (DMT) and the Discrete Subband (Wavelet) Multicarrier based transceiver (DSBMT), are considered in this investigation. The performance of DMT and DSBMT based transceiver systems under narrowband interference, and their robustness, are also investigated. It is shown that the performance of a DMT based transceiver system is quite sensitive to the location and strength of a single-tone (narrowband) interference, and that an adaptive interference exciser can alleviate this sensitivity. The improved spectral properties of the DSBMT technique reduce the performance sensitivity to variations of a narrowband interference. It is shown that the DSBMT technique outperforms DMT and is more robust. Optimal orthogonal basis design using a cosine modulated multirate filter bank is discussed. An adaptive linear combiner at the output of the analysis filter bank is implemented to eliminate the intersymbol and interchannel interferences. It is shown that DSBMT is the most suitable technique for a narrowband interference environment.
A blind channel identification scheme and an optimal MMSE based equalizer employing a nonmaximally decimated filter bank precoder / postequalizer structure are proposed. The performance of the blind channel identification scheme is shown not to be sensitive to the characteristics of the unknown channel. The performance of the proposed optimal MMSE based equalizer is shown to be superior to that of the zero-forcing equalizer.
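
The closing claim of this abstract, that an MMSE equalizer outperforms a zero-forcing one, can be illustrated in the simplest possible setting. The sketch below is not the dissertation's filter bank precoder / postequalizer structure: it assumes a single flat complex channel tap `h`, a unit-power symbol, and additive noise of variance `noise_var`, and all function names are hypothetical.

```python
def zf_tap(h):
    """Zero-forcing equalizer tap: inverts the channel and ignores the noise."""
    return 1 / h


def mmse_tap(h, noise_var, signal_var=1.0):
    """MMSE equalizer tap: trades residual distortion against noise gain."""
    return h.conjugate() * signal_var / (abs(h) ** 2 * signal_var + noise_var)


def mse(w, h, noise_var, signal_var=1.0):
    """Mean square error E|w(hx + n) - x|^2 for symbol x and noise n.

    Assumes x and n are zero-mean and uncorrelated (an assumption of this sketch).
    """
    return abs(w * h - 1) ** 2 * signal_var + abs(w) ** 2 * noise_var


# Illustrative values only: a weak channel tap where noise enhancement matters.
h = 0.1 + 0.2j
noise_var = 0.05
print("ZF   MSE:", mse(zf_tap(h), h, noise_var))
print("MMSE MSE:", mse(mmse_tap(h, noise_var), h, noise_var))
```

On a weak tap the zero-forcing inverse amplifies the noise, while the MMSE tap accepts a small bias in exchange for a lower total error. In this scalar model the MMSE error is never larger than the zero-forcing error.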

    Efficient Algorithms for Immersive Audio Rendering Enhancement

    Immersive audio rendering is the process of creating an engaging and realistic sound experience in 3D space. In immersive audio systems, head-related transfer functions (HRTFs) are used for binaural synthesis over headphones since they express how humans localize a sound source. HRTF interpolation algorithms can be introduced to reduce the number of measurement points and to create a reliable sound movement. Binaural reproduction can also be performed over loudspeakers. However, the involvement of two or more loudspeakers causes the problem of crosstalk. In this case, crosstalk cancellation (CTC) algorithms are needed to cancel unwanted interference signals. In this thesis, starting from a comparative analysis of HRTF measurement techniques, a binaural rendering system based on HRTF interpolation is proposed and evaluated for real-time applications. The proposed method shows good performance in comparison with a reference technique. The interpolation algorithm is also applied to immersive audio rendering over loudspeakers by adding a fixed crosstalk cancellation algorithm, which assumes that the listener is in a fixed position. In addition, an adaptive crosstalk cancellation system, which includes tracking of the listener's head, is analyzed and a real-time implementation is presented. The adaptive CTC implements a subband structure, and experimental results show that a higher number of bands improves the performance in terms of total error and convergence rate. The reproduction system and the characteristics of the listening room may affect the performance due to their non-ideal frequency response.
Audio equalization is used to adjust the balance of different audio frequencies in order to achieve desired sound characteristics. The equalization can be manual, as in the case of graphic equalization, where the gain of each frequency band can be modified by the user, or automatic, where the equalization curve is calculated automatically after the room impulse response measurement. Room response equalization can also be applied to multichannel systems, which employ two or more loudspeakers, and the equalization zone can be enlarged by measuring the impulse responses at different points of the listening zone. In this thesis, efficient graphic equalizers (GEQs) and an adaptive room response equalization system are presented. In particular, three low-complexity linear- and quasi-linear-phase graphic equalizers are proposed and examined in depth. Experiments confirm the effectiveness of the proposed GEQs in terms of accuracy, computational complexity, and latency. Subsequently, a subband adaptive structure is introduced for the development of a multichannel, multiple-position room response equalizer. Experimental results verify the effectiveness of the subband approach in comparison with the single-band case. Finally, a linear-phase crossover network is presented for multichannel systems, showing very good results in terms of magnitude flatness, cutoff rates, polar diagram, and phase response. Active noise control (ANC) systems can be designed to reduce the effects of noise pollution and can be used simultaneously with an immersive audio system. ANC works by creating a sound wave that has an opposite phase with respect to the sound wave of the unwanted noise. The additional sound wave creates destructive interference, which reduces the overall sound level. Finally, this thesis presents an ANC system used for noise reduction.
The proposed approach implements an online secondary path estimation and is based on cross-update adaptive filters applied to the primary path estimation that aim at improving the performance of the whole system. The proposed structure allows for a better convergence rate in comparison with a reference algorithm.
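
The destructive-interference principle behind ANC can be sketched with a plain LMS adaptive filter. This is a deliberately simplified illustration and not the thesis's method: it assumes an ideal (unity) secondary path, so no online secondary path estimation or cross-update scheme is involved, and the primary path coefficients, step size, and function name are all hypothetical.

```python
import random


def lms_anc(reference, primary_path, num_taps=4, mu=0.1):
    """Adapt an FIR filter so its output cancels the noise at the error sensor.

    reference:    samples picked up by the reference microphone
    primary_path: FIR coefficients of the (unknown) noise path to the listener
    Returns the final filter weights and the residual error signal.
    """
    w = [0.0] * num_taps
    x_buf = [0.0] * max(num_taps, len(primary_path))
    errors = []
    for x in reference:
        x_buf = [x] + x_buf[:-1]  # newest sample first
        d = sum(p * xb for p, xb in zip(primary_path, x_buf))  # noise at listener
        y = sum(wk * xb for wk, xb in zip(w, x_buf))  # anti-noise estimate
        e = d - y  # residual after the anti-noise is subtracted acoustically
        w = [wk + mu * e * xb for wk, xb in zip(w, x_buf)]  # LMS update
        errors.append(e)
    return w, errors


rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
weights, residual = lms_anc(noise, primary_path=[0.8, -0.4, 0.2])
print("learned weights:", [round(wk, 3) for wk in weights])
```

As the weights approach the primary path coefficients, the anti-noise matches the disturbance and the residual decays toward zero. A practical system must also account for the secondary path between the loudspeaker and the error microphone, which is the part the thesis addresses.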

    Non-linear echo cancellation - a Bayesian approach

    Echo cancellation literature is reviewed, then a Bayesian model is introduced and it is shown how it can be used to model and fit nonlinear channels. An algorithm for cancellation of echo over a nonlinear channel is developed and tested. It is shown that this nonlinear algorithm converges for both linear and nonlinear channels and is superior to linear echo cancellation for canceling an echo through a nonlinear echo-path channel.

    Frequency Spreading Equalization in Multicarrier Massive MIMO

    Application of filter bank multicarrier (FBMC) as an effective method for signaling over massive MIMO channels has been recently proposed. This paper further expands the application of FBMC to massive MIMO by applying frequency spreading equalization (FSE) to these channels. FSE allows us to achieve a more accurate equalization. Hence, a higher number of bits per symbol can be transmitted and the bandwidth of each subcarrier can be widened. Widening the bandwidth of each subcarrier leads to (i) higher bandwidth efficiency; (ii) lower complexity; (iii) lower sensitivity to carrier frequency offset (CFO); (iv) reduced peak-to-average power ratio (PAPR); and (v) reduced latency. All these appealing advantages have a direct impact on the digital as well as analog circuitry that is needed for the system implementation. In this paper, we develop the mathematical formulation of the minimum mean square error (MMSE) FSE for massive MIMO systems. This analysis guides us to decide on the number of subcarriers that will be sufficient for practical channel models. Comment: Accepted in IEEE ICC 2015 - Workshop on 5G & Beyond - Enabling Technologies and Application

    Performance Assessment of Dual-Polarized 5G Waveforms and Beyond in Directly Modulated DFB-Laser using Volterra Equalizer

    We investigate the performance of 25-Gbps dual-polarized orthogonal frequency division multiplexing (OFDM)-based modulation in a directly modulated distributed feedback (DFB) laser over 25 km of single-mode fiber. A Volterra equalizer is used to compensate for the nonlinear effects of the optical fiber. The results show that FBMC-OQAM modulation outperforms OFDM, universal filtered multicarrier (UFMC), and generalized frequency division multiplexing (GFDM) waveforms. Indeed, a target bit error rate of approximately 3.8 × 10⁻³ [forward error correction (FEC) limit] for FBMC, UFMC, OFDM, and GFDM can be achieved at -30.5, -26, -16, and -14.9 dBm, respectively. The effect of the DFB laser is also investigated for UFMC, OFDM, and GFDM, and they undergo a Q penalty of 2.44, 2.77, and 4.14 dB, respectively, at their FEC limit points. For FBMC-OQAM, the signal is perfectly recovered when excluding the DFB laser at -30.5 dBm. © 2020 Society of Photo-Optical Instrumentation Engineers (SPIE)
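
For readers unfamiliar with the Volterra equalizer mentioned in this abstract, the sketch below shows the general shape of a second-order Volterra filter: a linear FIR term plus products of pairs of delayed samples. The order, memory length, and kernel values are assumptions chosen for illustration and are not taken from the paper; in practice the kernels would be trained, for example with a least-squares or LMS procedure.

```python
def volterra2(x, h1, h2):
    """Second-order Volterra filter output for input sequence x.

    h1: linear kernel, h1[i] weights x[n - i]
    h2: quadratic kernel as a dict, h2[(i, j)] weights x[n - i] * x[n - j]
        (only pairs with i <= j are needed because the product is symmetric)
    Samples before the start of x are treated as zero.
    """
    memory = len(h1)
    window = [0.0] * memory
    out = []
    for sample in x:
        window = [sample] + window[:-1]  # window[i] holds x[n - i]
        linear = sum(h1[i] * window[i] for i in range(memory))
        quadratic = sum(c * window[i] * window[j] for (i, j), c in h2.items())
        out.append(linear + quadratic)
    return out


# Hypothetical kernels with a memory of two samples.
h1 = [1.0, 0.5]
h2 = {(0, 0): 0.1, (0, 1): -0.2, (1, 1): 0.0}
print(volterra2([1.0, 2.0, -1.0], h1, h2))
```

The quadratic terms are what let the filter model, and therefore compensate, distortion that a purely linear equalizer cannot represent.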