854 research outputs found

    Application of sound source separation methods to advanced spatial audio systems

    This thesis belongs to the field of Sound Source Separation (SSS). It addresses the development and evaluation of SSS techniques for the resynthesis of high-realism sound scenes by means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are preserved in two-channel stereo format, special up-converters are required to use advanced spatial audio reproduction formats such as WFS. The reason is that WFS needs the original source signals in order to accurately synthesize the acoustic field inside an extended listening area, so object-based mixing is required. Source separation problems in digital signal processing are those in which several signals have been mixed together and the objective is to recover the original signals. SSS algorithms can therefore be applied to existing two-channel mixtures to extract the different objects that compose the stereo scene. Unfortunately, most stereo mixtures are underdetermined, i.e., there are more sound sources than audio channels. This condition makes the SSS problem especially difficult, and stronger assumptions have to be made, often related to the sparsity of the sources under some signal transformation. This thesis focuses on the application of SSS techniques to spatial sound reproduction, and its contributions fall within these two areas. First, two underdetermined SSS methods are proposed to deal efficiently with the separation of stereo sound mixtures. These techniques are based on a multi-level thresholding segmentation approach, which enables fast, unsupervised separation of sound sources in the time-frequency domain. Although both techniques rely on the same type of clustering, the features considered by each are related to different localization cues, which enable the separation of either instantaneous or real mixtures. Additionally, two post-processing techniques aimed at improving the isolation of the separated sources are proposed. The performance achieved by several SSS methods in the resynthesis of WFS sound scenes is then evaluated by means of listening tests, paying special attention to the change observed in the perceived spatial attributes. Although the estimated sources are distorted versions of the original ones, the masking effects involved in their spatial remixing make artifacts less perceptible, which improves the overall assessed quality. Finally, some novel developments related to the application of time-frequency processing to source localization and enhanced sound reproduction are presented.
    Cobos Serrano, M. (2009). Application of sound source separation methods to advanced spatial audio systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8969
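The thresholding idea in the abstract above can be illustrated with a rough sketch. This is not the thesis's algorithm: it assumes an instantaneous (amplitude-panned) stereo mixture, uses a simple pan value per time-frequency bin as the feature, and picks thresholds with an exhaustive Otsu-style search. The function names `pan_index`, `multilevel_thresholds` and `assign_sources`, the bin count, and the toy magnitudes are all hypothetical.

```python
from itertools import combinations


def pan_index(left_mag, right_mag):
    """Pan value per time-frequency bin: 0 = hard left, 1 = hard right."""
    return [r / (l + r) if (l + r) > 0 else 0.5
            for l, r in zip(left_mag, right_mag)]


def multilevel_thresholds(values, n_classes, n_bins=50):
    """Exhaustive Otsu-style search over histogram cut points.

    Maximising sum_k w_k * mu_k**2 is equivalent to maximising the
    between-class variance, because the global mean is fixed.
    """
    hist = [0] * n_bins
    for v in values:
        hist[min(int(v * n_bins), n_bins - 1)] += 1
    total = sum(hist)
    centers = [(i + 0.5) / n_bins for i in range(n_bins)]

    best_score, best_cuts = -1.0, None
    for cuts in combinations(range(1, n_bins), n_classes - 1):
        edges = (0,) + cuts + (n_bins,)
        score = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            weight = sum(hist[lo:hi])
            if weight == 0:
                continue  # an empty class contributes nothing
            mean = sum(hist[i] * centers[i] for i in range(lo, hi)) / weight
            score += (weight / total) * mean * mean
        if score > best_score:
            best_score, best_cuts = score, cuts
    return [c / n_bins for c in best_cuts]


def assign_sources(pans, thresholds):
    """Label each bin by how many thresholds its pan value reaches."""
    return [sum(p >= t for t in thresholds) for p in pans]


# Toy magnitudes for six bins: two bins each from sources panned
# left, centre and right (made-up numbers, not data from the thesis).
left_mag = [0.89, 0.87, 0.51, 0.49, 0.11, 0.09]
right_mag = [0.11, 0.13, 0.49, 0.51, 0.89, 0.91]

pans = pan_index(left_mag, right_mag)
thresholds = multilevel_thresholds(pans, n_classes=3)
labels = assign_sources(pans, thresholds)
print(thresholds, labels)
```

In a full system the labels would become binary time-frequency masks applied to the mixture spectrogram before inverting the transform. The exhaustive search is only practical for a few classes, and it is an illustrative stand-in for whatever faster scheme the thesis uses.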

    Design of a Simulator for Neonatal Multichannel EEG: Application to Time-Frequency Approaches for Automatic Artifact Removal and Seizure Detection

    The electroencephalogram (EEG) is used to monitor brain activity noninvasively; it is the most widely used tool for detecting abnormalities such as seizures. In recent studies, the detection of neonatal EEG seizures has been automated to assist neurophysiologists in interpreting EEG, because manual detection is time consuming and subjective; however, automated detection still lacks the robustness required for clinical implementation. Moreover, although EEG is intended to record cerebral activity, extra-cerebral activity is also recorded; these signals are called "artifacts" and can seriously degrade the accuracy of seizure detection. Seizures are one of the most common neurological problems managed by hospitals, occurring in 0.1%-0.5% of live births. Neonates with seizures are at higher risk of mortality and are reported to be 55-70 times more likely to have severe cerebral palsy. Early and accurate detection of neonatal seizures is therefore important to prevent long-term neurological damage. Several attempts have been made to model the neonatal EEG and its artifacts, but most did not consider the multichannel case. Furthermore, these models were used to test artifact detection or seizure detection separately, but not together. This study aims to design synthetic models that generate clean or corrupted multichannel EEG in order to test the accuracy of available artifact and seizure detection algorithms in a controlled environment. In this thesis, a synthetic neonatal EEG model is constructed from single-channel EEG simulators, a head model, 21 electrodes, and propagation equations to produce clean multichannel EEG. A neonatal EEG artifact model is then designed, using synthetic signals to corrupt the EEG waveforms. After that, an automated EEG artifact detection and removal system is designed in both the time and time-frequency domains. Artifact detection is optimised and removal performance is evaluated. Finally, an automated seizure detection technique is developed, using fused and extended multichannel features together with a cross-validated SVM classifier. Results show that the synthetic EEG model mimics real neonatal EEG with an average correlation of 0.62, and that corrupted EEG can degrade the average seizure detection accuracy from 100% to 70.9%. They also show that artifact detection and removal raises the average accuracy to 89.6%, and that the extended features raise it to 97.4% and strengthen its robustness.

    Binaural localization and separation techniques

    Based on binaural signals, i.e. the signals observed at the two ears, a listener can localize and recognize different sound sources and then focus on one of them. For decades, researchers have tried to build a machine that can do the same under similar conditions. Despite these efforts, the human auditory system remains far superior to any machine that has been devised. The topic of this thesis is computational techniques for the localization and separation of sources in binaural signals. To give an overview of the research areas that have considered the problems of source localization and separation, we start with a review of existing techniques. This provides the background for the techniques that we propose subsequently. Binaural localization: The most important cues for the localization of sound sources in binaural signals are the level and time differences between the ears. We propose a technique for the joint evaluation of these cues, in which noisy level difference estimates are combined with less noisy but ambiguous time difference estimates to provide accurate azimuth estimates. The proposed technique enables the localization of sources and their tracking in dynamic scenes. Head model: Based on a study of the level and time differences as a function of azimuth angle for different heads, we propose a generic model that is parametrized only by the distance between the ears. This enables the use of the binaural localization technique mentioned above for a listener whose head-related transfer functions have not been measured. Binaural separation: For the separation of sources, we propose a method based on spatial windowing in the azimuth parameter space. Separation of overlapping partials: Finally, we propose a technique for the separation of overlapping partials in mixtures of harmonic instruments. The technique is based on the similarity of temporal envelopes between the different partials of a harmonic note.
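As a rough illustration of the two cues named in this abstract, the sketch below estimates a broadband level difference and a time difference from a pair of ear signals. It does not reproduce the thesis's joint evaluation of the cues or its head model. The function names `ild_db` and `itd_seconds`, the sign conventions, the sample rate, and the synthetic test signal are all assumptions made for the example.

```python
import math


def ild_db(left, right):
    """Interaural level difference in dB (positive: left ear louder)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)


def itd_seconds(left, right, fs, max_lag):
    """Interaural time difference from the cross-correlation peak.

    Positive result: the right-ear signal lags the left-ear signal.
    """
    n = len(left)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Only sum over indices where both left[i] and right[i + lag] exist.
        start, stop = max(0, -lag), min(n, n - lag)
        val = sum(left[i] * right[i + lag] for i in range(start, stop))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag / fs


# Synthetic example: a Gaussian pulse reaches the right ear three
# samples later and at half the amplitude (made-up values).
fs = 48000
delay = 3
left = [math.exp(-((i - 20) / 4.0) ** 2) for i in range(64)]
right = [0.0] * delay + [0.5 * x for x in left[:-delay]]

ild = ild_db(left, right)
itd = itd_seconds(left, right, fs, max_lag=8)
print(round(ild, 2), itd)
```

For this synthetic pair the estimates should come out at about 6 dB and three samples of delay. Real binaural recordings would normally be analysed per frequency band and per time frame, which is where the ambiguity of the time difference cue mentioned in the abstract arises.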

    An Online Solution for Localisation, Tracking and Separation of Moving Speech Sources

    Separating a time-varying number of speech sources in a room is a difficult problem. The challenge lies in estimating the number and location of the speech sources, and the tracked sources must then be separated. This thesis proposes a solution that uses the Random Finite Set approach to estimate the number and location of the speech sources and subsequently separates the speech mixture via time-frequency masking.

    A unified approach to sparse signal processing

    A unified view of the area of sparse signal processing is presented in tutorial form by bringing together various fields in which the property of sparsity has been successfully exploited. For each of these fields, various algorithms and techniques, which have been developed to leverage sparsity, are described succinctly. The common potential benefits of significant reduction in sampling rate and processing manipulations through sparse signal processing are revealed. The key application domains of sparse signal processing are sampling, coding, spectral estimation, array processing, component analysis, and multipath channel estimation. In terms of the sampling process and reconstruction algorithms, linkages are made with random sampling, compressed sensing and rate of innovation. The redundancy introduced by channel coding i