    Impact Thresholds of Parameters of Binaural Room Impulse Responses (BRIRs) on Perceptual Reverberation

    This paper presents a study on the perceived importance of different acoustic parameters in Binaural Room Impulse Response (BRIR) rendering. A headphone-based listening test was conducted with twenty expert participants. Three BRIRs, generated from simulations of three different rooms, were convolved with a dry speech signal and used as reference audio samples. Four BRIR parameters, Initial Time Delay Gap (ITDG), Forward Early Reflections (FER), Reverse Early Reflections (RER) and Late Reverberation (LR), were systematically altered and convolved with a speech signal to generate the test conditions. A staircase method was used to obtain the threshold at which a change to each BRIR parameter was perceived as different from the reference audio sample. The average perceived impact threshold of each parameter was then calculated across the twenty participants. Results show that RER removal and ITDG extension have a clear impact on the perceptual reverberation of speech audio. Subjects were less sensitive to FER removal, and the effect of LR removal on perceptual reverberation was hard to distinguish. RER and ITDG are therefore of particular importance when designing artificial reverberation algorithms, whilst more research is needed to understand the perceptual contribution of LR. Minor changes in FER and LR are less significant.
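The staircase procedure mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state which staircase rule, step size, or stopping criterion was used, so the 1-up/1-down rule, the level units, the reversal count and the simulated listener below are all assumptions made purely for illustration.

```python
def run_staircase(respond, start_level, step, n_reversals=6,
                  min_level=0, max_level=100, max_trials=200):
    """Estimate a threshold with a simple 1-up/1-down staircase (assumed rule).

    respond(level) returns True when the listener reports that the altered
    sample differs from the reference at this alteration level.
    """
    level = start_level
    last_direction = None
    reversals = []
    for _ in range(max_trials):
        heard = respond(level)
        # Heard a difference: make the alteration smaller; otherwise larger.
        direction = -1 if heard else 1
        if last_direction is not None and direction != last_direction:
            reversals.append(level)
            if len(reversals) == n_reversals:
                break
        last_direction = direction
        level = min(max_level, max(min_level, level + direction * step))
    if not reversals:
        raise ValueError("no reversals recorded; check the level range")
    # Threshold estimate: mean of the levels at which the response flipped.
    return sum(reversals) / len(reversals)


def simulated_listener(level):
    """Hypothetical ideal listener with a fixed threshold of 18 units."""
    return level >= 18


threshold = run_staircase(simulated_listener, start_level=40, step=5)
print(threshold)  # 17.5
```

With a deterministic listener the track oscillates between the two levels that bracket the true threshold, so the mean of the reversal levels lands between them. Real listeners respond probabilistically, which is why per-participant thresholds are then averaged, as the abstract describes.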

    The Perception of Formant Tuning in Soprano Voices

    Introduction: At the upper end of the soprano singing range, singers are known to alter the shape of their vocal tracts to bring one or more vocal tract resonances nearer to a harmonic of the voice source. This process, known as resonance tuning, increases the amplitude of the sound produced with little effort from the singer. This study investigated the perception of first and second resonance tuning, key strategies observed in classically trained soprano voices. It was expected that the strategies most commonly used by singers would be preferred by listeners. Since previous investigations have often focussed on a single vowel sound (usually /A/), the test also allowed comparison of different tuning strategies between vowels. Method: Synthetic vowel sounds were generated using an L-F glottal flow model passed through a series of filters representing the vocal tract resonances. Listeners compared sounds comprising 3 vowels at 4 fundamental frequencies (f0), to which 4 different tuning strategies were applied: (A) the expected formant values in speech, (B) the first formant tuned to the fundamental, (C) the second formant tuned to the second harmonic, and (D) both first and second formants tuned to the first and second harmonics respectively. Participants were asked three sets of questions: how much they preferred different tuning strategies, how natural they found them, and which vowel they heard in each sound. Results: The results varied greatly between vowels. For the /A/ vowel, results were similar for preference and naturalness, but no clear pattern was seen for vowel identification. For the /u/ vowel, there was no clear difference between tuning strategies for preference and only a little separation for naturalness; vowel identification was generally very poor. The results for the /i/ vowel were striking: strategies including R2 tuning were both preferred and perceived as more natural than those without, whereas strategies without R2 tuning were most often correctly identified. Conclusion: The results indicate that the perception of different tuning strategies depends on the vowel, on the perceptual quality investigated (preference, naturalness, or vowel identification), and on whether the first and second harmonics fall above or below the first or second formants. For some vowels and perceptual qualities, formant tuning was found to be beneficial at lower f0 values than current understanding of formant tuning in practice would suggest.
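The four tuning strategies (A) to (D) described in the abstract above reduce to a simple mapping from the fundamental frequency to the first two resonance targets. The sketch below is illustrative only: the function name is hypothetical, and the speech formant values in the usage example are placeholders, not values taken from the study.

```python
def tuned_resonances(f0, speech_r1, speech_r2, strategy):
    """Return (R1, R2) targets in Hz for tuning strategies A to D.

    f0        -- fundamental frequency in Hz
    speech_r1 -- first formant value expected in speech, in Hz
    speech_r2 -- second formant value expected in speech, in Hz
    """
    if strategy == "A":   # expected speech formant values, no tuning
        return speech_r1, speech_r2
    if strategy == "B":   # first formant tuned to the fundamental
        return f0, speech_r2
    if strategy == "C":   # second formant tuned to the second harmonic
        return speech_r1, 2 * f0
    if strategy == "D":   # both formants tuned to harmonics 1 and 2
        return f0, 2 * f0
    raise ValueError(f"unknown strategy: {strategy!r}")


# Placeholder speech formants (hypothetical, not from the paper).
for strategy in "ABCD":
    print(strategy, tuned_resonances(880.0, 800.0, 1200.0, strategy))
```

The resulting pairs would then serve as the centre frequencies of the filters standing in for the vocal tract resonances in the synthesis chain.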

    Determining The Relevant Criteria For 3D Vocal Tract Characterisation

    Introduction: Soprano singers face a number of specific challenges when singing vowels at high frequencies, due to the wide spacing of harmonics in the voice source. The varied and complex techniques used to overcome these are still not fully understood. Magnetic resonance imaging (MRI) has become increasingly popular in recent years for singing voice analysis. This study proposes a new protocol using 3D MRI to investigate the articulatory parameters relevant to resonance tuning, a technique whereby the singer alters their vocal tract to shift its resonances nearer to a voice source harmonic, increasing the amplitude of the sound produced. Method: The protocol was tested with a single soprano opera singer. Drawing on previous MRI studies, articulatory measurements from 3D MRI images were compared to vocal tract resonances measured directly using broad-band noise excitation. The suitability of the protocol was assessed using statistical analysis. Results: No clear linear relationships were apparent between articulatory characteristics and vocal tract resonances. The results were highly vowel-dependent, showing different patterns of resonance tuning and interactions between variables. This potentially indicates a complex interaction between the vocal tract and sung vowels in soprano voices, meriting further investigation. Conclusion: The effective interpretation of MRI data is essential for a deeper understanding of soprano voice production, and in particular the phenomenon of resonance tuning. This paper presents a new protocol that contributes towards this aim, and the results suggest that a more vowel-specific approach is necessary in the wider investigation of resonance tuning in female voices.

    The Impact of Gender on Conference Authorship in Audio Engineering: Analysis Using a New Data Collection Method

    Contribution: This paper provides evidence of the lack of gender diversity at audio engineering conferences, using a novel and inclusive gender determination method to produce a new dataset of author gender. Background: Audio engineering has historically been male-dominated; whilst the number of non-male audio engineers has increased recently, the industry mindset has changed very little. Studies into the gender diversity of this field are required to force a shift in mindset and create a more inclusive environment. Research Questions: To what extent is there an imbalance in the representation of different genders at audio engineering conferences? Do conference topic, presentation type, or author position have an impact on the gender balance? Methodology: A novel method was designed to obtain pronouns of authors where possible, avoiding removal of data or potential false positives. The main limitation of this methodology is the time required for gender determination. Gender composition was analyzed across 20 conferences, with gender balance further analyzed within four key categories: conference topic, presentation type, position in the author byline, and the number of authors listed. Findings: This data-driven study demonstrates a clear lack of gender diversity in conference authorship in audio engineering. The results show low overall representation of non-male authors at audio engineering conferences, with significant differences across conference topics, and a notable lack of gender diversity within invited presentations. Index Terms: Audio Engineering, Conferences, Gender, Underrepresentation, Bias, Discrimination, STEM, Engineering Pipeline

    A New Method of Onset and Offset Detection in Ensemble Singing

    This paper presents a novel method combining electrolaryngography and acoustic analysis to detect the onset and offset of phonation, as well as the beginning and ending of notes within a sung legato phrase, through the application of a peak-picking algorithm, TIMEX. Evaluated on a set of singing duo recordings, the method achieves an overall performance of 78% within a tolerance window of 50 ms when compared with manual annotations performed by three experts. This compares favourably with the state-of-the-art techniques presented at MIREX in 2016, which yielded an overall performance of around 60%. The new method was applied in a pilot study with two duets to analyse synchronization between singers during ensemble performances. Results from this investigation demonstrate bidirectional temporal adaptations between performers, and suggest that the precision and consistency of synchronization, and the tendency to precede or lag a co-performer, might be affected by visual contact between singers and by leader–follower relationships. The outcomes of this paper promise to be beneficial for future investigations of synchronization in singing ensembles.
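Scoring detections against manual annotations within a tolerance window, as in the evaluation described above, can be sketched as follows. The abstract does not say exactly how the 78% figure was computed, so the metric here (the share of annotated events matched one-to-one by a detection within ±50 ms) is an assumption, and the event times in the example are invented.

```python
def match_rate(detected_ms, annotated_ms, tolerance_ms=50):
    """Fraction of annotated events matched by a detection within tolerance.

    Each detection may be matched to at most one annotation; every
    annotation takes the closest unused detection inside the window.
    """
    detected = sorted(detected_ms)
    used = [False] * len(detected)
    hits = 0
    for ref in sorted(annotated_ms):
        best = None
        for i, det in enumerate(detected):
            if used[i]:
                continue
            error = abs(det - ref)
            if error <= tolerance_ms and (
                best is None or error < abs(detected[best] - ref)
            ):
                best = i
        if best is not None:
            used[best] = True
            hits += 1
    return hits / len(annotated_ms) if annotated_ms else 0.0


# Invented onset times in milliseconds, for illustration only.
annotated = [500, 1200, 2000, 2750]
detected = [520, 1300, 2040, 2740]
print(match_rate(detected, annotated))  # 0.75
```

In the example the second detection is 100 ms from its annotation and so falls outside the window, leaving three of the four annotated onsets matched.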

    Protocol for a cohort study of adolescent mental health service users with a nested cluster randomised controlled trial to assess the clinical and cost-effectiveness of managed transition in improving transitions from child to adult mental health services (the MILESTONE study)

    Introduction: Disruption of care during transition from child and adolescent mental health services (CAMHS) to adult mental health services may adversely affect the health and well-being of service users. The MILESTONE (Managing the Link and Strengthening Transition from Child to Adult Mental Healthcare) study evaluates the longitudinal course and outcomes of adolescents approaching the transition boundary (TB) of their CAMHS and determines the effectiveness of the model of managed transition in improving outcomes, compared with usual care. Methods and analysis: This is a cohort study with a nested cluster randomised controlled trial. Recruited CAMHS have been randomised to provide either (1) managed transition using the Transition Readiness and Appropriateness Measure score summary as a decision aid, or (2) usual care for young people reaching the TB. Participants are young people within 1 year of reaching the TB of their CAMHS in eight European countries; one parent/carer and a CAMHS clinician for each recruited young person; and an adult mental health clinician or other community-based care provider, if the young person transitions. The primary outcome is the Health of the Nation Outcome Scale for Children and Adolescents (HoNOSCA), measuring health and social functioning at 15 months postintervention. The secondary outcomes include mental health, quality of life, transition experience and healthcare usage assessed at 9, 15 and 24 months postintervention. With a mean cluster size of 21, a total of 840 participants randomised in a 1:2 intervention-to-control ratio are required, providing 89% power to detect a difference in HoNOSCA score of 0.30 SD. The addition of 210 recruits for the cohort study ensures sufficient power for studying predictors, resulting in 1050 participants and an approximate 1:3 randomisation. Ethics and dissemination: The study protocol was approved by the UK National Research Ethics Service (15/WM/0052) and equivalent ethics boards in participating countries. Results will be reported at conferences, in peer-reviewed publications and to all relevant stakeholder groups.
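The sample-size figures quoted in the protocol abstract can be checked with some simple arithmetic. The sketch below uses only the numbers stated above; the assumption that the 210 additional cohort recruits all fall on the usual-care side is one reading of the "approximate 1:3" statement and is not confirmed by the abstract. The power calculation itself is not reproduced, since it depends on quantities such as the intracluster correlation that the abstract does not report.

```python
def sample_size_breakdown(randomised=840, mean_cluster_size=21,
                          cohort_extra=210):
    """Break down the quoted sample sizes (arithmetic check only)."""
    clusters = randomised // mean_cluster_size      # number of CAMHS clusters
    intervention = randomised // 3                  # 1:2 intervention:control
    control = randomised - intervention
    total = randomised + cohort_extra
    # Assumption: the extra cohort recruits all receive usual care.
    usual_care_total = control + cohort_extra
    return {
        "clusters": clusters,
        "intervention": intervention,
        "control": control,
        "total": total,
        "usual_care_per_intervention": usual_care_total / intervention,
    }


for name, value in sample_size_breakdown().items():
    print(name, value)
```

Under that assumption there are 280 intervention participants to 770 usual-care participants, a ratio of 1:2.75, which is consistent with the "approximate 1:3" figure in the abstract.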