235 research outputs found

    Audio-Visual Speaker Tracking

    Target motion tracking has found applications in interdisciplinary fields, including but not limited to surveillance and security, forensic science, intelligent transportation systems, driving assistance, monitoring of prohibited areas, medical science, robotics, action and expression recognition, individual speaker discrimination in multi-speaker environments, and video conferencing in the fields of computer vision and signal processing. Among these applications, speaker tracking in enclosed spaces has been gaining relevance due to the widespread advances of devices and technologies and the necessity for seamless solutions in real-time tracking and localization of speakers. However, speaker tracking is a challenging task in real-life scenarios, as several distinctive issues influence the tracking process, such as occlusions and an unknown number of speakers. One approach to overcome these issues is to use multi-modal information, as it conveys complementary information about the state of the speakers compared to single-modal tracking. To use multi-modal information, several approaches have been proposed, which can be classified into two categories, namely deterministic and stochastic. This chapter aims at providing multimedia researchers with a state-of-the-art overview of tracking methods that combine multiple modalities to accomplish various multimedia analysis tasks, classifying them into different categories and listing new and future trends in this field.

    ‘Did the speaker change?’: Temporal tracking for overlapping speaker segmentation in multi-speaker scenarios

    Diarization systems are an essential part of many speech processing applications, such as speaker indexing, improving automatic speech recognition (ASR) performance and making single speaker-based algorithms available for use in multi-speaker domains. This thesis focuses on the first task of the diarization process, namely speaker segmentation, which can be thought of as trying to answer the question ‘Did the speaker change?’ in an audio recording. This thesis starts by showing that time-varying pitch properties can be used advantageously within the segmentation step of a multi-talker diarization system. It is then highlighted that an individual’s pitch is smoothly varying and, therefore, can be predicted by means of a Kalman filter. Subsequently, it is shown that if the pitch is not predictable, then this is most likely due to a change in the speaker. Finally, a novel system is proposed that uses this approach of pitch prediction for speaker change detection. This thesis then goes on to demonstrate how voiced harmonics can be useful in detecting when more than one speaker is talking, such as during overlapping speaker activity. A novel system is proposed to track multiple harmonics simultaneously, allowing for the determination of onsets and end-points of a speaker’s utterance in the presence of an additional active speaker. This thesis then extends this work to explore the use of a new multimodal approach for overlapping speaker segmentation that tracks both the fundamental frequency (F0) and direction of arrival (DoA) of each speaker simultaneously. The proposed multiple hypothesis tracking system, which simultaneously tracks both features, shows an improvement in segmentation performance when compared to tracking these features separately. Lastly, this thesis focuses on the DoA estimation part of the newly proposed multimodal approach. It does this by exploring a polynomial extension to the multiple signal classification (MUSIC) algorithm, spatio-spectral polynomial (SSP)-MUSIC, and evaluating its performance when using speech sound sources.

    Sparse Bases and Bayesian Inference of Electromagnetic Scattering

    Many approaches in computational electromagnetics (CEM) rely on the decomposition of complex radiation and scattering behavior with a set of basis vectors. Accurate estimation of the quantities of interest can be synthesized through a weighted sum of these vectors. In addition to basis decompositions, sparse signal processing techniques developed in the compressive sensing (CS) community can be leveraged when only a small subset of the basis vectors is required to sufficiently represent the quantity of interest. We investigate several concepts in which novel bases are applied to common electromagnetic problems and leverage the sparsity property to improve performance and/or reduce computational burden. The first concept explores the use of multiple types of scattering primitives to reconstruct scattering patterns of electrically large targets. Using a combination of isotropic point scatterers and wedge diffraction primitives as our bases, a 40% reduction in reconstruction error can be achieved. Next, a sparse basis is used to improve direction of arrival (DOA) estimation. We implement the block sparse Bayesian learning (BSBL) technique to determine the angle of arrival of multiple incident signals with only a single snapshot of data from an arbitrary arrangement of non-isotropic antennas. This is an improvement over the current state-of-the-art, where restrictions on the antenna type, configuration, and a priori knowledge of the number of signals are often assumed. Lastly, we investigate the feasibility of a basis set to reconstruct the scattering patterns of electrically small targets. The basis is derived from the theory of characteristic modes (TCM) and can capture non-localized scattering behavior. Preliminary results indicate that this basis may be used in an interpolation and extrapolation scheme to generate scattering patterns over multiple frequencies.

    Modelling, Simulation and Data Analysis in Acoustical Problems

    Modelling and simulation in acoustics is currently gaining importance. In fact, with the development and improvement of innovative computational techniques and with the growing need for predictive models, an impressive boost has been observed in several research and application areas, such as noise control, indoor acoustics, and industrial applications. This led us to propose a special issue on “Modelling, Simulation and Data Analysis in Acoustical Problems”, as we believe in the importance of these topics in modern acoustics studies. In total, 81 papers were submitted and 33 of them were published, with an acceptance rate of 37.5%. Judging by the number of papers submitted, this is a trending topic in the scientific and academic community, and this special issue aims to provide a future reference for the research that will be developed in coming years.

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019


    Applications of Antenna Technology in Sensors

    During the past few decades, information technologies have been evolving at a tremendous rate, causing profound changes to our world and to our ways of living. Emerging applications have opened up new routes and set new trends for antenna sensors. With the advent of the Internet of Things (IoT), the adaptation of antenna technologies for sensor and sensing applications has become more important. Antennas must now be reconfigurable, flexible, low-profile, and low-cost, for applications ranging from airborne and vehicular systems to machine-to-machine, IoT, 5G, etc. This reprint aims to introduce and treat a series of advanced and emerging topics in the field of antenna sensors.

    User-Symbiotic Speech Enhancement for Hearing Aids
