156 research outputs found

    The use of spectral information in the development of novel techniques for speech-based cognitive load classification

    The cognitive load of a user refers to the amount of mental demand imposed on the user when performing a particular task. Estimating the cognitive load (CL) level of users is necessary to adjust the workload imposed on them accordingly in order to improve task performance. Current speech-based CL classification systems are not adequate for commercial use due to their low performance, particularly in noisy environments. This thesis proposes several techniques to improve the performance of the speech-based cognitive load classification system in both clean and noisy conditions. This thesis analyses and presents the effectiveness of speech features such as spectral centroid frequency (SCF) and spectral centroid amplitude (SCA) for CL classification. Sub-systems based on SCF and SCA features were developed and fused with the traditional mel-frequency cepstral coefficient (MFCC) based system, producing relative error rate reductions of 8.9% and 31.5%, respectively, when compared to the MFCC-based system alone. The Stroop test corpus was used in these experiments. The investigation into cognitive load information in the form of spectral distribution in different subbands shows that significantly more information is distributed in the low-frequency subband than in the high-frequency subband. Two different methods are proposed to utilize this finding. The first method, called the multi-band approach, uses a weighting scheme to emphasize the speech features in low-frequency subbands. The cognitive load classification accuracy of this approach is shown to be higher than that of a system based on a non-weighting scheme. The second method is to design an effective filterbank based on the spectral distribution of cognitive load information using the Kullback-Leibler distance measure. It is shown that the designed filterbank consistently provides higher classification accuracies than other existing filterbanks such as mel, Bark, and equivalent rectangular bandwidth. 
A discrete cosine transform based speech enhancement technique is proposed in order to increase the robustness of the CL classification system and is found to be more suitable than the other methods investigated. The proposed method provides a 3.0% average relative error rate reduction across the seven types of noise and five SNR levels used. In particular, it provides a maximum relative error rate reduction of 7.5% for the F16 noise (from the NOISEX-92 database) at 20 dB SNR.
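
The SCF and SCA features named in this abstract can be illustrated with a short sketch. The abstract does not give their formulas, so the definitions below are an assumption: SCF is taken as the magnitude-weighted mean frequency of a subband, and SCA as the frequency-weighted mean magnitude. The rectangular subband, the function names and the test tone are likewise hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame, sample_rate):
    """Return (frequencies in Hz, magnitudes) of the one-sided DFT of a frame."""
    n = len(frame)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def subband_centroids(freqs, mags, f_lo, f_hi):
    """Return (SCF, SCA) for the rectangular subband f_lo <= f < f_hi.

    Assumed definitions (not taken from the thesis):
      SCF = sum(f * |S(f)|) / sum(|S(f)|)
      SCA = sum(f * |S(f)|) / sum(f)
    """
    band = [(f, m) for f, m in zip(freqs, mags) if f_lo <= f < f_hi]
    weighted = sum(f * m for f, m in band)
    total_mag = sum(m for _, m in band)
    total_freq = sum(f for f, _ in band)
    scf = weighted / total_mag if total_mag > 0 else 0.0
    sca = weighted / total_freq if total_freq > 0 else 0.0
    return scf, sca


# Hypothetical test frame: a 1 kHz tone that falls exactly on a DFT bin.
SAMPLE_RATE = 8000
FRAME = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(64)]

freqs, mags = magnitude_spectrum(FRAME, SAMPLE_RATE)
scf, sca = subband_centroids(freqs, mags, 500.0, 1500.0)
print(f"SCF = {scf:.1f} Hz, SCA = {sca:.3f}")
```

For a pure tone inside the subband, the SCF lands on the tone frequency. A practical front end would presumably use overlapping, shaped subband filters and an FFT rather than this direct DFT.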

    Computer Models for Musical Instrument Identification

    PhD thesis. A particular aspect in the perception of sound is concerned with what is commonly termed texture or timbre. From a perceptual perspective, timbre is what allows us to distinguish sounds that have similar pitch and loudness. Indeed, most people are able to discern a piano tone from a violin tone or to distinguish different voices or singers. This thesis deals with timbre modelling. Specifically, the formant theory of timbre is the main theme throughout. This theory states that acoustic musical instrument sounds can be characterised by their formant structures. Following this principle, the central point of our approach is to propose a computer implementation for building musical instrument identification and classification systems. Although the main thrust of this thesis is to propose a coherent and unified approach to the musical instrument identification problem, it is oriented towards the development of algorithms that can be used in Music Information Retrieval (MIR) frameworks. Drawing on research in speech processing, a complete supervised system taking into account both physical and perceptual aspects of timbre is described. The approach is composed of three distinct processing layers. First, parametric models that allow us to represent signals through mid-level physical and perceptual representations are considered. Next, the use of the Line Spectrum Frequencies as spectral envelope and formant descriptors is emphasised. Finally, the use of generative and discriminative techniques for building instrument and database models is investigated. Our system is evaluated under realistic recording conditions using databases of isolated notes and melodic phrases.

    Efficient Acquisition and Denoising of Full-Range Event-Related Potentials Following Transient Stimulation of the Auditory Pathway

    This body of work relates to recent advances in the field of human auditory event-related potentials (ERP), specifically fast, deconvolution-based ERP acquisition as well as single-response based preprocessing, denoising and subsequent analysis methods. Its goal is to contribute a cohesive set of methods facilitating the fast, reliable acquisition of the whole electrophysiological response generated by the auditory pathway from the brainstem to the cortex following transient acoustical stimulation. The present manuscript is divided into three sequential areas of investigation. First, the general feasibility of simultaneously acquiring auditory brainstem, middle-latency and late ERP single responses is demonstrated using recordings from 15 normal-hearing subjects. Favourable acquisition parameters (i.e., sampling rate, bandpass filter settings and interstimulus intervals) are established, followed by signal analysis of the resulting ERP in terms of their dominant intrinsic scales to determine the properties of an optimal signal representation with maximally reduced sample count by means of nonlinear resampling on a logarithmic timebase. In this way, a compression ratio of 16.59 is achieved. Time-scale analysis of the linear-time and logarithmic-time ERP single responses is employed to demonstrate that no important information is lost during compressive resampling, which is additionally supported by a comparative evaluation of the resulting average waveforms: all prominent waves remain visible, with their characteristic latencies and amplitudes remaining essentially unaffected by the resampling process. The linear-time and resampled logarithmic-time signal representations are comparatively investigated regarding their susceptibility to the types of physiological and technical noise frequently contaminating ERP recordings. 
While in principle there already exists a plethora of well-investigated approaches towards the denoising of ERP single-response representations to improve signal quality and/or reduce necessary acquisition times, the substantially altered noise characteristics of the obtained, resampled logarithmic-time single-response representations, as opposed to their linear-time equivalents, necessitate a reevaluation of the available methods on this type of data. Additionally, two novel, efficient denoising algorithms based on transform coefficient manipulation in the sinogram domain and on an analytic, discrete wavelet filterbank are proposed and subjected to a comparative performance evaluation together with two established denoising methods. To facilitate a thorough comparison, the real-world ERP dataset obtained in the first part of this work is employed alongside synthetic data generated using a phenomenological ERP model evaluated at different signal-to-noise ratios (SNR), with individual gains in multiple outcome metrics being used to objectively assess algorithm performance. Results suggest that the proposed denoising algorithms substantially outperform the state-of-the-art methods in terms of the employed outcome metrics as well as their respective processing times. Furthermore, an efficient stimulus sequence optimization method for use with deconvolution-based ERP acquisition methods is introduced, which achieves consistent noise attenuation within a broad designated frequency range. A novel stimulus presentation paradigm for the fast, interleaved acquisition of auditory brainstem, middle-latency and late responses, featuring alternating periods of optimized, high-rate deconvolution sequences and subsequent low-rate stimulation, is proposed and investigated in 20 normal-hearing subjects. 
Deconvolved sequence responses containing early and middle-latency ERP components are fused with subsequent late responses using a time-frequency resolved weighted averaging method based on cross-trial regularity, yielding a uniform SNR of the full-range auditory ERP across the investigated timescales. The obtained average ERP waveforms exhibit morphologies consistent with both literature values and the reference recordings obtained in the first part of this manuscript, with all prominent waves being visible in the grand average waveforms. The novel stimulation approach cuts acquisition time by a factor of 3.4 while at the same time yielding a substantial gain in the SNR of the obtained ERP data. Results suggest that the proposed interleaved stimulus presentation and associated postprocessing methodology are suitable for the fast, reliable extraction of full-range neural correlates of auditory processing in future studies.
Funding: Bundesministerium für Bildung und Forschung (German Federal Ministry of Education and Research) | Bimodal Fusion - Eine neurotechnologische Optimierungsarchitektur für integrierte bimodale Hörsysteme | 2016-201
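
The idea of resampling a response on a logarithmic timebase can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the thesis: the geometric spacing of time points, the linear interpolation, the function names and the 1 kHz ramp signal are all hypothetical, and the resulting ratio of 10 is unrelated to the reported compression ratio of 16.59.

```python
def log_time_points(t_start, t_end, n_points):
    """Return n_points time stamps spaced geometrically from t_start to t_end."""
    ratio = (t_end / t_start) ** (1.0 / (n_points - 1))
    return [t_start * ratio ** i for i in range(n_points)]


def resample_linear(signal, sample_rate, times):
    """Read a uniformly sampled signal at arbitrary times by linear interpolation."""
    last = len(signal) - 1
    out = []
    for t in times:
        # Clamp the fractional sample position to the valid index range.
        pos = min(max(t * sample_rate, 0.0), float(last))
        i = min(int(pos), last - 1)
        frac = pos - i
        out.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
    return out


# Hypothetical input: one second at 1 kHz, with value equal to time (a ramp).
SAMPLE_RATE = 1000
signal = [i / SAMPLE_RATE for i in range(1000)]

times = log_time_points(0.001, 0.999, 100)
resampled = resample_linear(signal, SAMPLE_RATE, times)
compression_ratio = len(signal) / len(resampled)
print(f"{len(signal)} -> {len(resampled)} samples, ratio {compression_ratio:.2f}")
```

Early, fast components keep dense sampling while late, slow components are sampled sparsely. A real pipeline would presumably low-pass filter the late segments before decimating them to avoid aliasing, which this sketch omits.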

    Deep Neural Networks for Sound Event Detection

    The objective of this thesis is to develop novel classification and feature learning techniques for the task of sound event detection (SED) in real-world environments. Throughout their lives, humans experience a consistent learning process on how to assign meanings to sounds. Thanks to this, most humans can easily recognize the sound of thunder, a dog barking, a doorbell, a bird singing, etc. In this work, we aim to develop systems that can automatically detect the sound events commonly present in our daily lives. Such systems can be utilized in, e.g., context-aware devices, acoustic surveillance, bio-acoustical and healthcare monitoring, and smart-home cities. In this thesis, we propose to apply the modern machine learning methods known as deep learning to SED. The relationship between the commonly used time-frequency representations for SED (such as mel spectrogram and magnitude spectrogram) and the target sound event labels is highly complex. Deep learning methods such as deep neural networks (DNN) utilize a layered structure of units to extract features from the given sound representation input with increased abstraction at each layer. This increases the network's capacity to efficiently learn the highly complex relationship between the sound representation and the target sound event labels. We found that the proposed DNN approach performs significantly better than established classifier techniques for SED such as Gaussian mixture models. In a time-frequency representation of an audio recording, a sound event can often be recognized as a distinct pattern that may exhibit shifts in both dimensions. The intra-class variability of the sound events may cause small shifts in the frequency domain content, and the time domain shift results from the fact that a sound event can occur at any time in a given audio recording. 
We found that convolutional neural networks (CNN) are useful for learning shift-invariant filters that are essential for robust modeling of sound events. In addition, we show that recurrent neural networks (RNN) are effective in modeling the long-term temporal characteristics of the sound events. Finally, we combine the convolutional and recurrent layers in a single classifier called a convolutional recurrent neural network (CRNN), which combines the benefits of both and provides state-of-the-art results on multiple SED benchmark datasets. Aside from learning the mappings between the time-frequency representations and the sound event labels, we show that deep learning methods can also be utilized to learn a direct mapping between the target labels and a lower-level representation such as the magnitude spectrogram or even the raw audio signals. In this thesis, we propose to integrate the feature learning capabilities of deep learning methods with empirical knowledge of human auditory perception by initializing layer weights with filterbank coefficients. This results in an optimal, ad-hoc filterbank that is obtained through gradient-based optimization of the original coefficients to improve the SED performance.

    Machine Learning for Human Activity Detection in Smart Homes

    Recognizing human activities in domestic environments from audio and active power consumption sensors is a challenging task since, on the one hand, environmental sound signals are multi-source, heterogeneous, and varying in time and, on the other hand, the active power consumption varies significantly between electrical appliances of a similar type. Many systems have been proposed to process environmental sound signals for event detection in ambient assisted living applications. Typically, these systems use feature extraction, selection, and classification. However, despite major advances, several important questions remain unanswered, especially in real-world settings. A part of this thesis contributes to the body of knowledge in the field by addressing the following problems for ambient sounds recorded in various real-world kitchen environments: 1) which features and which classifiers are most suitable in the presence of background noise? 2) what is the effect of signal duration on recognition accuracy? 3) how do the SNR and the distance between the microphone and the audio source affect the recognition accuracy in an environment in which the system was not trained? We show that for systems that use traditional classifiers, it is beneficial to combine gammatone frequency cepstral coefficients and discrete wavelet transform coefficients and to use a gradient boosting classifier. For systems based on deep learning, we consider 1D and 2D CNNs using mel-spectrogram energies and mel-spectrogram images as inputs, respectively, and show that the 2D CNN outperforms the 1D CNN. We obtained competitive classification results for two such systems and validated the performance of our algorithms on public datasets (Google Brain/TensorFlow Speech Recognition Challenge and the 2017 Detection and Classification of Acoustic Scenes and Events Challenge). 
Regarding the problem of energy-based human activity recognition in a household environment, machine learning techniques are applied to infer the state of household appliances from their energy consumption data, and rule-based scenarios that exploit these states are used to detect human activity. Since most activities within a house are related to the operation of an electrical appliance, this unimodal approach has a significant advantage in that it uses inexpensive smart plugs and smart meters for each appliance. This part of the thesis proposes the use of unobtrusive and easy-to-install tools (smart plugs) for data collection and a decision engine that combines energy signal classification using dominant classifiers (compared in advance with grid search) and a probabilistic measure for appliance usage. It helps preserve the privacy of the resident, since all the activities are stored in a local database. DNNs have received great research interest in the field of computer vision. In this thesis, we adapted different architectures for the problem of human activity recognition. We analyze the quality of the extracted features, and more specifically how model architectures and parameters affect the ability of the features automatically extracted by DNNs to separate activity classes in the final feature space. Additionally, the architectures that we applied to our main problem were also applied to text classification, in which we consider the input text as an image and apply 2D CNNs to learn the local and global semantics of the sentences from the variations of the visual patterns of words. This work serves as a first step towards creating a dialogue agent that would not require any natural language preprocessing. 
Finally, since in many domestic environments human speech is present together with other environmental sounds, we developed a Convolutional Recurrent Neural Network to separate the sound sources and applied novel post-processing filters in order to obtain an end-to-end noise-robust system. Our algorithm ranked first in the Apollo-11 Fearless Steps Challenge. Funding: Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No. 676157, project ACROSSIN

    Probabilistic models of contextual effects in Auditory Pitch Perception

    Perception was recognised by Helmholtz as an inferential process whereby learned expectations about the environment combine with sensory experience to give rise to percepts. Expectations are flexible, built from past experiences over multiple time-scales. What is the nature of perceptual expectations? How are they learned? How do they affect perception? These are the questions I propose to address in this thesis. I focus on two important yet simple perceptual attributes of sounds whose perception is widely regarded as effortless and automatic: pitch and frequency. In a first study, I aim to propose a definition of pitch as the solution of a computational goal. Pitch is a fundamental and salient perceptual attribute of many behaviourally important sounds including speech and music. The effortless nature of its perception has led to the search for a direct physical correlate of pitch and for mechanisms to extract pitch from peripheral neural responses. I propose instead that pitch is the outcome of a probabilistic inference of an underlying periodicity in sounds given a learned statistical prior over naturally pitch-evoking sounds, explaining in a single model a wide range of psychophysical results. In two further psychophysical studies, I examine how and at what time-scales recent sensory history affects the perception of frequency shifts and pitch shifts. (1) When subjects are presented with ambiguous pitch shifts (using octave-ambiguous Shepard tone pairs), I show that sensory history is used to leverage the ambiguity in a way that reflects expectations of spectro-temporal continuity of auditory scenes. 
(2) In delayed two-tone frequency discrimination tasks, I explore the contraction bias: when asked to report which of two tones separated by a brief silence is higher, subjects behave as though they hear the earlier tone 'contracted' in frequency towards a combination of recently presented stimulus frequencies and the mean of the overall distribution of tones used in the experiment. I propose that expectations (the statistical learning of the sampled stimulus distribution) are built online and combined with sensory evidence in a statistically optimal fashion. Models derived in the thesis embody the concept of perception as unconscious inference. The results support the view that even apparently primitive acoustic percepts may derive from subtle statistical inference, suggesting that such inferential processes operate at all levels across our sensory systems.
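
The contraction bias described here has a simple textbook form when both the expectation and the sensory evidence are Gaussian. The sketch below uses that generic conjugate-Gaussian model as an assumption and is not the model derived in the thesis. The log2 frequency axis, the variances and the tone frequencies are hypothetical values chosen for illustration.

```python
import math


def posterior_mean(observation, sensory_var, prior_mean, prior_var):
    """Combine a noisy observation with a Gaussian prior.

    The estimate is a precision-weighted average, so it is pulled from the
    observation towards the prior mean (the 'contraction').
    """
    weight = prior_var / (prior_var + sensory_var)
    return weight * observation + (1.0 - weight) * prior_mean


# Hypothetical values on a log2(frequency) axis.
prior_mean = math.log2(1000.0)   # mean of the tones heard so far
first_tone = math.log2(1200.0)   # remembered first tone of the pair

perceived = posterior_mean(first_tone, sensory_var=0.04,
                           prior_mean=prior_mean, prior_var=0.04)
perceived_hz = 2.0 ** perceived
print(f"first tone 1200.0 Hz is remembered as {perceived_hz:.1f} Hz")
```

With equal variances the estimate sits halfway between the tone and the prior mean on the log axis. A noisier memory of the first tone (larger `sensory_var`) pulls the estimate further towards the prior mean.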

    D13.2 Techniques and performance analysis on energy- and bandwidth-efficient communications and networking

    Deliverable D13.2 of the European project NEWCOM#. The report presents the status of the research work of the various Joint Research Activities (JRA) in WP1.3 and the results that were developed up to the second year of the project. For each activity there is a description, an illustration of its adherence and relevance to the identified fundamental open issues, a short presentation of the main results, and a roadmap for future joint research. In the Annex, the main technical details of the specific scientific activities are described for each JRA. Peer reviewed. Postprint (published version).

    Low-Complexity Algorithms for Channel Estimation in Optimised Pilot-Assisted Wireless OFDM Systems

    Orthogonal frequency division multiplexing (OFDM) has recently become a dominant transmission technology considered for next-generation fixed and mobile broadband wireless communication systems. OFDM has the advantage of lessening the severe effects of frequency-selective (multipath) fading by splitting the band into relatively flat fading subchannels, and allows for low-complexity transceiver implementation based on fast Fourier transform algorithms. Combining OFDM modulation with multilevel frequency-domain symbol mapping (e.g., QAM) and spatial multiplexing (SM) over multiple-input multiple-output (MIMO) channels can theoretically achieve near-Shannon capacity of the communication link. However, a high-rate and spectrum-efficient system implementation requires coherent detection at the receiving end, which is possible only when accurate channel state information (CSI) is available. Since in practice the response of the wireless channel is unknown and is subject to random variation with time, the receiver typically employs a channel estimator for CSI acquisition. The channel response information retrieved by the estimator is then used by the data detector and can also be fed back to the transmitter by means of in-band or out-of-band signalling, so that the latter can adapt power loading, modulation and coding parameters according to the channel conditions. Thus, the design of an accurate and robust channel estimator is a crucial requirement for reliable communication through a channel that is selective in time and frequency. In a MIMO configuration, a separate channel estimator has to be associated with each transmit/receive antenna pair, making the complexity of the estimation algorithm a primary concern. Pilot-assisted methods, relying on the insertion of reference symbols in certain frequencies and time slots, have been found attractive for identification of doubly-selective radio channels from both the complexity and performance standpoints. 
In this dissertation, a family of reduced-complexity estimators for single- and multiple-antenna OFDM systems is developed. The estimators are based on transform-domain processing and have the same order of computational complexity, irrespective of the number of pilot subcarriers and their positioning. The common estimator structure represents a cascade of successive small-dimension filtering modules. The number of modules, as well as their order inside the cascade, is determined by the class of the estimator (one- or two-dimensional) and the availability of the channel statistics (correlation and signal-to-noise power ratio). For fine-precision estimation in multipath channels with statistics not known a priori, we propose a recursive design of the filtering modules. Simulation results show that in the steady state, the performance of the recursive estimators approaches that of their theoretical counterparts, which are optimal in the minimum mean square error (MMSE) sense. In contrast to the majority of the channel estimators developed so far, our modular-type architectures are suitable for reconfigurable OFDM transceivers, where the actual channel conditions influence the decision of what class of filtering algorithm to use and how to allot pilot subcarrier positions in the band. In pilot-assisted transmissions, channel estimation and detection are performed separately from each other over distinct subcarrier sets. The estimator output is used only to construct the detector transform, but not as the detector input. Since the performance of both channel estimation and detection depends on the signal-to-noise power ratio (SNR) at the corresponding subcarriers, there is a dilemma of optimal power allocation between the data and the pilot symbols, as these are conflicting requirements under the total transmit power constraint. The problem is exacerbated by the variety of channel estimators. 
Each kind of estimation algorithm is characterised by its own SNR gain, which in general can vary depending on the channel correlation. In this dissertation, we optimise pilot-data power allocation for the developed low-complexity one- and two-dimensional MMSE channel estimators. The resulting contribution takes the form of closed-form analytical expressions for the upper bound (suboptimal approximate value) on the optimal pilot-to-data power ratio (PDR) as a function of a number of design parameters (number of subcarriers, number of pilots, number of transmit antennas, effective order of the channel model, maximum Doppler shift, SNR, etc.). The resulting PDR equations can be applied to MIMO-OFDM systems with an arbitrary arrangement of the pilot subcarriers, operating in an arbitrary multipath fading channel. These properties and the relatively simple functional representation of the derived analytical PDR expressions are intended to ease the challenging task of on-the-fly optimisation of an adaptive SM-MIMO-OFDM system, which is capable of adjusting the transmit signal configuration (e.g., block length, number of pilot subcarriers or antennas) according to the established channel conditions.
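
Pilot-assisted channel estimation, as described in this abstract, can be illustrated with the simplest baseline: a least-squares (LS) estimate at the pilot subcarriers followed by linear interpolation across the band. This is a swapped-in baseline for illustration only, not the transform-domain MMSE estimators developed in the dissertation. The pilot positions, symbols and the noiseless channel below are hypothetical.

```python
def ls_pilot_estimates(received, pilots):
    """LS channel estimate at each pilot subcarrier: H_k = Y_k / X_k."""
    return {k: received[k] / x for k, x in pilots.items()}


def interpolate_channel(pilot_est, n_subcarriers):
    """Linearly interpolate pilot estimates over all subcarriers.

    Subcarriers outside the pilot range hold the nearest pilot estimate.
    """
    idx = sorted(pilot_est)
    est = []
    for k in range(n_subcarriers):
        lo = max((p for p in idx if p <= k), default=idx[0])
        hi = min((p for p in idx if p >= k), default=idx[-1])
        if lo == hi:
            est.append(pilot_est[lo])
        else:
            frac = (k - lo) / (hi - lo)
            est.append((1.0 - frac) * pilot_est[lo] + frac * pilot_est[hi])
    return est


# Hypothetical noiseless example: 9 subcarriers, pilots on 0, 4 and 8.
N = 9
true_channel = [complex(1.0 + 0.1 * k, -0.05 * k) for k in range(N)]
pilots = {0: 1 + 0j, 4: -1 + 0j, 8: 1 + 0j}
transmitted = [pilots.get(k, 1 + 1j) for k in range(N)]
received = [h * x for h, x in zip(true_channel, transmitted)]

estimate = interpolate_channel(ls_pilot_estimates(received, pilots), N)
max_error = max(abs(e - h) for e, h in zip(estimate, true_channel))
print(f"max estimation error: {max_error:.2e}")
```

The example channel varies linearly across subcarriers, so interpolation recovers it almost exactly. With noise and a realistic multipath channel, the LS estimate degrades, which is the gap that MMSE-type filtering aims to close.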

    Performance analysis of FBMC over OFDM in Cognitive Radio Network

    A Cognitive Radio (CR) system is an adaptive, reconfigurable communication system that can intuitively adjust its parameters to meet user or network demands. The major objective of CR is to provide a platform for the Secondary User (SU) to fully utilize the available spectrum resource by sensing the existence of spectrum holes without causing interference to the Primary User (PU). However, PU detection has been one of the main challenges in CR technology. In comparison to traditional wireless communication systems, CR systems pose new challenges for Resource Allocation (RA) because of the Cross-Channel Interference (CCI) caused to the PU by the adjacent channels used by the SU. Past efforts have focused on Orthogonal Frequency Division Multiplexing (OFDM) based CR systems. However, the OFDM technique shows various limitations in CR applications due to its large spectrum leakage. Filter Bank based Multicarrier (FBMC) has been proposed as a promising Multicarrier Modulation (MCM) candidate that has numerous advantages over OFDM. In this dissertation, a critical analysis of the performance of FBMC relative to OFDM was carried out, with a CR system used as the testing platform. First, spectrum sensing in OFDM-based and FBMC-based CR systems was surveyed from the literature; then the performance of the two schemes was analysed and compared in terms of spectral efficiency. A resource allocation algorithm was proposed, with particular attention to the interference and power constraints. The proposed algorithms have been verified using MATLAB simulations, and numerical results show that FBMC can attain higher spectrum efficiency and offers attractive benefits in terms of spectrum sensing compared with OFDM. The contributions of this dissertation have heightened interest in further research on how FBMC can be improved for future application in CR systems.