    Web Technologies for Scientific Hearing Experiments and Teaching - An Overview

    Scientists of many audio-related fields need to verify their theories by conducting controlled experiments with human test subjects. Developing and conducting such experiments often poses non-trivial challenges to scientists and test subjects alike. Web technologies promise simple delivery of experiments as interactive websites, possibly even on subjects' own computers. Similar benefits are possible for teaching science. While many tasks in hearing experiments and teaching are well supported by current web-based tools, support for scientific data structures, signal processing operations and statistical data analysis methods is still incomplete in comparison with entrenched non-web tools. These shortcomings could easily be overcome with a few libraries, which would be a great boon to scientists and educators.

    Geometry-aware DoA Estimation using a Deep Neural Network with mixed-data input features

    Unlike model-based direction of arrival (DoA) estimation algorithms, supervised learning-based DoA estimation algorithms based on deep neural networks (DNNs) are usually trained for one specific microphone array geometry, resulting in poor performance when applied to a different array geometry. In this paper we illustrate the fundamental difference between supervised learning-based and model-based algorithms that leads to this sensitivity. Aiming at a supervised learning-based DoA estimation algorithm that generalizes well to different array geometries, we propose a geometry-aware DoA estimation algorithm. The algorithm uses a fully connected DNN and takes mixed data as input features, namely the time lags maximizing the generalized cross-correlation with phase transform (GCC-PHAT) and the microphone coordinates, which are assumed to be known. Experimental results for a reverberant scenario demonstrate the flexibility of the proposed algorithm towards different array geometries and show that the proposed algorithm outperforms model-based algorithms such as steered response power with phase transform. Comment: Submitted to ICASSP 202
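The time-lag feature named in this abstract can be illustrated with a minimal NumPy sketch of GCC-PHAT for one microphone pair. This is a generic textbook formulation, not the authors' implementation; the function name `gcc_phat_lag`, the zero-padding choice, and the `max_lag` search window are assumptions made for illustration.

```python
import numpy as np

def gcc_phat_lag(x, y, max_lag):
    """Return the lag in samples by which y is delayed relative to x.

    Hypothetical helper: picks the lag that maximizes the generalized
    cross-correlation with phase transform within [-max_lag, max_lag].
    """
    n = len(x) + len(y)  # zero-pad so the circular correlation does not wrap
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12  # phase transform: discard magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that index 0 corresponds to lag -max_lag
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag
```

In a geometry-aware setup like the one described, lags of this kind for all microphone pairs would be combined with the known microphone coordinates into one input vector; how exactly the paper arranges that vector is not stated in the abstract.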

    Long-term Conversation Analysis: Exploring Utility and Privacy

    The analysis of conversations recorded in everyday life requires privacy protection. In this contribution, we explore a privacy-preserving feature extraction method based on input feature dimension reduction, spectral smoothing and a low-cost speaker anonymization technique based on the McAdams coefficient. We assess the utility of the feature extraction methods with a voice activity detection and a speaker diarization system, while privacy protection is determined with a speech recognition and a speaker verification model. We show that the combination of the McAdams coefficient and spectral smoothing maintains utility while improving privacy. Comment: Submitted to ITG Conference on Speech Communication, 202
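As a rough illustration of the McAdams-coefficient idea, the sketch below shifts the angles of the complex poles of a linear-prediction (LPC) polynomial by raising them to a power `alpha`. It shows only the pole transformation for a single frame; the function name `mcadams_shift` and the default `alpha=0.8` are assumptions, and the LPC analysis and resynthesis around this step are omitted.

```python
import numpy as np

def mcadams_shift(lpc, alpha=0.8):
    """Move each complex LPC pole from angle phi to sign(phi) * |phi|**alpha.

    Hypothetical helper: `lpc` holds the polynomial coefficients
    [a0, a1, ..., ap]; real poles are left untouched.
    """
    poles = np.roots(lpc).astype(complex)
    is_complex = np.abs(poles.imag) > 1e-12
    angles = np.angle(poles[is_complex])
    new_angles = np.sign(angles) * np.abs(angles) ** alpha
    # Keep each pole radius, change only its angle
    poles[is_complex] = np.abs(poles[is_complex]) * np.exp(1j * new_angles)
    # Conjugate pairs stay conjugate, so the polynomial is real up to rounding
    return np.real(np.poly(poles)) * lpc[0]
```

With `alpha=1.0` the coefficients are returned unchanged, which is a convenient sanity check before applying the shift frame by frame.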

    HTML Web Audio Elements: Easy Interaction with Web Audio API Through HTML

    The JavaScript Web Audio API has a powerful but low-level and complicated structure. Therefore, many JavaScript-based wrapper libraries exist, which are intended to simplify its usage. This paper presents a completely new approach, which translates the API into HTML Custom Elements and allows definition, usage and control of complex audio scenarios using only normal HTML elements.

    An Alternative Implementation of the Superdirective Beamformer

    In this contribution we introduce a new implementation of superdirective beamformers. The new structure has the advantage of reduced computational complexity. This advantage is due to a GSC-like (generalized sidelobe canceller) scheme. Unlike the conventional GSC, the filters in the sidelobe cancelling path are fixed and can be computed in advance by using the Wiener solution. The new structure yields exactly the same noise reduction performance as the superdirective beamformer.
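The equivalence claimed in this abstract can be checked numerically for a single frequency with a generic GSC decomposition. The sketch below is not the authors' structure; it assumes a linear array, a spherically isotropic (diffuse) noise model, a small diagonal loading `mu`, and a blocking matrix taken from the null space of the steering vector. All function names are hypothetical.

```python
import numpy as np

def diffuse_coherence(mic_pos, freq, c=343.0):
    """Coherence matrix of a spherically isotropic noise field (linear array)."""
    dist = np.abs(mic_pos[:, None] - mic_pos[None, :])
    # np.sinc(x) is sin(pi*x)/(pi*x), so this equals sin(k*d)/(k*d), k = 2*pi*f/c
    return np.sinc(2.0 * freq * dist / c)

def steering_vector(mic_pos, freq, theta, c=343.0):
    """Far-field steering vector; theta is measured from the array axis."""
    return np.exp(-2j * np.pi * freq * mic_pos * np.cos(theta) / c)

def superdirective_direct(gamma, d, mu=1e-3):
    """Closed form: w = G^-1 d / (d^H G^-1 d) with G = gamma + mu * I."""
    g = gamma + mu * np.eye(len(d))
    num = np.linalg.solve(g, d)
    return num / np.vdot(d, num)

def superdirective_gsc(gamma, d, mu=1e-3):
    """GSC form: fixed delay-and-sum path minus a fixed sidelobe-cancelling path."""
    g = gamma + mu * np.eye(len(d))
    w_q = d / np.vdot(d, d)                      # delay-and-sum beamformer
    _, _, vh = np.linalg.svd(d.conj()[None, :])  # null space of d^H
    B = vh[1:].conj().T                          # blocking matrix, B^H d = 0
    # Wiener solution for the cancelling path, computable in advance
    w_a = np.linalg.solve(B.conj().T @ g @ B, B.conj().T @ g @ w_q)
    return w_q - B @ w_a
```

Both functions minimize the same quadratic noise power under the same distortionless constraint, so for a four-microphone array with 5 cm spacing at 1 kHz, for example, their weight vectors should agree up to numerical precision.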

    Near-ear sound pressure level distribution in everyday life considering the user’s own voice and privacy

    Recently, exploring acoustic conditions of people in their everyday environments has drawn a lot of attention. One of the most important and disturbing sound sources is the test participant’s own voice. This contribution proposes an algorithm to determine the own-voice audio segments (OVS) for blocks of 125 ms and a method for measuring sound pressure levels (SPL) without violating privacy laws. The own voice detection (OVD) algorithm developed here is based on a machine learning algorithm and a set of acoustic features that do not allow for speech reconstruction. A manually labeled real-world recording of one full day showed reliable and robust detection results. Moreover, the OVD algorithm was applied to 13 near-ear recordings of hearing-impaired participants in an ecological momentary assessment (EMA) study. The analysis shows that the grand mean percentage of predicted OVS during one day was approximately 10%, which corresponds well to other published data. These OVS had a small impact on the median SPL over all data. However, for short analysis intervals, significant differences of up to 30 dB occurred in the measured SPL, depending on the proportion of OVS and the SPL of the background noise.
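The block-wise level measurement mentioned in this abstract can be sketched as follows. The code assumes a calibrated sound pressure signal in pascals and the standard reference pressure of 20 µPa; the function name `block_spl` and the handling of leftover samples are illustrative assumptions, not details from the paper.

```python
import numpy as np

P_REF = 20e-6  # reference sound pressure in Pa

def block_spl(pressure, fs, block_s=0.125):
    """Return the SPL in dB for consecutive non-overlapping blocks.

    Hypothetical helper: `pressure` is a calibrated signal in Pa and
    `fs` is the sampling rate in Hz; trailing samples that do not fill
    a whole block are dropped.
    """
    block_len = int(round(block_s * fs))
    n_blocks = len(pressure) // block_len
    blocks = np.reshape(pressure[:n_blocks * block_len], (n_blocks, block_len))
    rms = np.sqrt(np.mean(blocks ** 2, axis=1))
    return 20.0 * np.log10(np.maximum(rms, 1e-20) / P_REF)
```

A tone with an RMS pressure of 1 Pa yields about 94 dB SPL in every block. Storing only such per-block levels, instead of the waveform, is one way a measurement can avoid keeping reconstructible speech.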