235 research outputs found
Audio-Visual Speaker Tracking
Target motion tracking has found applications in interdisciplinary fields, including but not limited to surveillance and security, forensic science, intelligent transportation systems, driving assistance, monitoring of prohibited areas, medical science, robotics, action and expression recognition, individual speaker discrimination in multi-speaker environments, and video conferencing, within the fields of computer vision and signal processing. Among these applications, speaker tracking in enclosed spaces has been gaining relevance due to widespread advances in devices and technologies and the need for seamless solutions for real-time tracking and localization of speakers. However, speaker tracking is a challenging task in real-life scenarios, as several distinctive issues influence the tracking process, such as occlusions and an unknown number of speakers. One approach to overcoming these issues is to use multi-modal information, as it conveys complementary information about the state of the speakers compared to single-modal tracking. To use multi-modal information, several approaches have been proposed, which can be classified into two categories, namely deterministic and stochastic. This chapter aims to provide multimedia researchers with a state-of-the-art overview of tracking methods that combine multiple modalities to accomplish various multimedia analysis tasks, classifying them into different categories and listing new and future trends in this field.
"Did the speaker change?": Temporal tracking for overlapping speaker segmentation in multi-speaker scenarios
Diarization systems are an essential part of many speech processing applications, such as speaker indexing, improving automatic speech recognition (ASR) performance, and making single-speaker algorithms available for use in multi-speaker domains. This thesis focuses on the first task of the diarization process, speaker segmentation, which can be thought of as trying to answer the question "Did the speaker change?" in an audio recording.
This thesis starts by showing that time-varying pitch properties can be used advantageously within the segmentation step of a multi-talker diarization system. It is then highlighted that an individual's pitch is smoothly varying and, therefore, can be predicted by means of a Kalman filter. Subsequently, it is shown that if the pitch is not predictable, then this is most likely due to a change in the speaker. Finally, a novel system is proposed that uses this approach of pitch prediction for speaker change detection.
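The idea of predicting pitch with a Kalman filter and treating unpredictable frames as likely speaker changes can be illustrated with a minimal sketch. This is not the system proposed in the thesis: the constant-velocity state model, the noise variances `q` and `r`, the threshold `gate`, and the function name `detect_pitch_changes` are all assumptions chosen for illustration.

```python
import numpy as np

def detect_pitch_changes(f0, q=1.0, r=4.0, gate=9.0):
    """Flag frames whose pitch a Kalman filter fails to predict.

    f0   : 1-D sequence of pitch values in Hz, one per voiced frame
    q, r : assumed process and measurement noise variances (hypothetical)
    gate : assumed threshold on the normalised squared innovation
    """
    F = np.array([[1.0, 1.0], [0.0, 1.0]])  # state: [pitch, pitch slope]
    H = np.array([[1.0, 0.0]])              # only the pitch is observed
    Q = q * np.eye(2)

    x = np.array([f0[0], 0.0])              # initial state estimate
    P = 100.0 * np.eye(2)                   # initial uncertainty
    changes = []
    for t in range(1, len(f0)):
        # Predict the next pitch value from the smooth-variation model.
        x = F @ x
        P = F @ P @ F.T + Q
        innov = f0[t] - (H @ x)[0]
        S = (H @ P @ H.T)[0, 0] + r
        if innov ** 2 / S > gate:
            # Pitch was not predictable: record a candidate speaker change
            # and restart the filter on the new pitch level.
            changes.append(t)
            x = np.array([f0[t], 0.0])
            P = 100.0 * np.eye(2)
            continue
        # Standard Kalman update with the accepted measurement.
        K = (P @ H.T)[:, 0] / S
        x = x + K * innov
        P = (np.eye(2) - np.outer(K, H[0])) @ P
    return changes
```

For a smoothly varying track the innovation stays well inside the gate, while an abrupt jump in pitch level is flagged at the frame where it occurs. In practice the thresholds would need tuning on real pitch tracks.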
This thesis then goes on to demonstrate how voiced harmonics can be useful in detecting when more than one speaker is talking, such as during overlapping speaker activity. A novel system is proposed to track multiple harmonics simultaneously, allowing for the determination of onsets and end-points of a speaker's utterance in the presence of an additional active speaker.
This thesis then extends this work to explore the use of a new multimodal approach for overlapping speaker segmentation that tracks both the fundamental frequency (F0) and direction of arrival (DoA) of each speaker simultaneously. The proposed multiple hypothesis tracking system, which simultaneously tracks both features, shows an improvement in segmentation performance when compared to tracking these features separately.
Lastly, this thesis focuses on the DoA estimation part of the newly proposed multimodal approach. It does this by exploring a polynomial extension to the multiple signal classification (MUSIC) algorithm, spatio-spectral polynomial (SSP)-MUSIC, and evaluating its performance when using speech sound sources.
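For orientation, the classical narrowband MUSIC algorithm that SSP-MUSIC extends can be sketched as follows. This is not the polynomial SSP-MUSIC variant studied in the thesis; the uniform linear array, half-wavelength spacing, search grid, and the function names `steering` and `music_doa` are assumptions made for this example.

```python
import numpy as np

def steering(theta_deg, n_sensors, spacing=0.5):
    """Steering vectors of a uniform linear array (spacing in wavelengths, assumed 0.5)."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    n = np.arange(n_sensors)[:, None]
    return np.exp(-2j * np.pi * spacing * n * np.sin(theta)[None, :])

def music_doa(X, n_sources, grid=np.arange(-90.0, 90.5, 0.5)):
    """Estimate DoAs in degrees from snapshots X of shape (n_sensors, n_snapshots)."""
    n_sensors = X.shape[0]
    R = X @ X.conj().T / X.shape[1]             # sample covariance matrix
    _, eigvecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    En = eigvecs[:, : n_sensors - n_sources]    # noise subspace
    A = steering(grid, n_sensors)
    # Pseudospectrum: large where steering vectors are orthogonal to the noise subspace.
    spectrum = 1.0 / np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
    # Keep the n_sources strongest local maxima of the pseudospectrum.
    peaks = [i for i in range(1, len(grid) - 1)
             if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]]
    peaks.sort(key=lambda i: spectrum[i], reverse=True)
    return sorted(grid[peaks[:n_sources]])

# Simulated example: two uncorrelated narrowband sources at -20 and 30 degrees.
rng = np.random.default_rng(0)
n_sensors, n_snapshots = 8, 200
S = (rng.standard_normal((2, n_snapshots))
     + 1j * rng.standard_normal((2, n_snapshots))) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((n_sensors, n_snapshots))
               + 1j * rng.standard_normal((n_sensors, n_snapshots))) / np.sqrt(2)
X = steering([-20.0, 30.0], n_sensors) @ S + noise
estimates = music_doa(X, n_sources=2)
```

Note that the sketch assumes narrowband signals and a known number of sources. Speech is wideband, which is one reason extensions of MUSIC are of interest for speech sound sources.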
Sparse Bases and Bayesian Inference of Electromagnetic Scattering
Many approaches in computational electromagnetics (CEM) rely on the decomposition of complex radiation and scattering behavior with a set of basis vectors. Accurate estimates of the quantities of interest can be synthesized through a weighted sum of these vectors. In addition to basis decompositions, sparse signal processing techniques developed in the compressive sensing (CS) community can be leveraged when only a small subset of the basis vectors is required to sufficiently represent the quantity of interest. We investigate several concepts in which novel bases are applied to common electromagnetic problems and leverage the sparsity property to improve performance and/or reduce computational burden. The first concept explores the use of multiple types of scattering primitives to reconstruct scattering patterns of electrically large targets. Using a combination of isotropic point scatterers and wedge diffraction primitives as our bases, a 40% reduction in reconstruction error can be achieved. Next, a sparse basis is used to improve direction-of-arrival (DOA) estimation. We implement the block sparse Bayesian learning (BSBL) technique to determine the angle of arrival of multiple incident signals with only a single snapshot of data from an arbitrary arrangement of non-isotropic antennas. This is an improvement over the current state-of-the-art, where restrictions on the antenna type and configuration, and a priori knowledge of the number of signals, are often assumed. Lastly, we investigate the feasibility of a basis set to reconstruct the scattering patterns of electrically small targets. The basis is derived from the theory of characteristic modes (TCM) and can capture non-localized scattering behavior. Preliminary results indicate that this basis may be used in an interpolation and extrapolation scheme to generate scattering patterns over multiple frequencies.
Modelling, Simulation and Data Analysis in Acoustical Problems
Modelling and simulation in acoustics are currently gaining importance. In fact, with the development and improvement of innovative computational techniques and with the growing need for predictive models, an impressive boost has been observed in several research and application areas, such as noise control, indoor acoustics, and industrial applications. This led us to propose a special issue on "Modelling, Simulation and Data Analysis in Acoustical Problems", as we believe in the importance of these topics in modern acoustics studies. In total, 81 papers were submitted and 33 of them were published, with an acceptance rate of 37.5%. Judging by the number of papers submitted, this is a trending topic in the scientific and academic community, and this special issue aims to provide a future reference for the research that will be developed in the coming years.
Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019
Applications of Antenna Technology in Sensors
During the past few decades, information technologies have been evolving at a tremendous rate, causing profound changes to our world and to our ways of living. Emerging applications have opened up new routes and set new trends for antenna sensors. With the advent of the Internet of Things (IoT), the adaptation of antenna technologies for sensor and sensing applications has become more important. Antennas must now be reconfigurable, flexible, low-profile, and low-cost, for applications ranging from airborne and vehicular platforms to machine-to-machine communication, IoT, 5G, etc. This reprint aims to introduce and treat a series of advanced and emerging topics in the field of antenna sensors.