416 research outputs found

    Exploiting correlogram structure for robust speech recognition with multiple speech sources

    This paper addresses the problem of separating and recognising speech in a monaural acoustic mixture in the presence of competing speech sources. The proposed system treats sound source separation and speech recognition as tightly coupled processes. In the first stage, sound source separation is performed in the correlogram domain. For periodic sounds, the correlogram exhibits symmetric tree-like structures whose stems are located at the delays that correspond to multiples of the pitch period. These pitch-related structures are exploited in the study to group spectral components at each time frame. Local pitch estimates are then computed for each spectral group and are used to form simultaneous pitch tracks for temporal integration. These processes segregate a spectral representation of the acoustic mixture into several time-frequency regions such that the energy in each region is likely to have originated from a single periodic sound source. The identified time-frequency regions, together with the spectral representation, are passed to a 'speech fragment decoder', which uses 'missing data' techniques with clean speech models to simultaneously search for the acoustic evidence that best matches model sequences. The paper presents evaluations based on artificially mixed simultaneous speech utterances. A coherence-measuring experiment is first reported, which quantifies the consistency of the identified fragments with a single source. The system is then evaluated in a speech recognition task and compared to a conventional fragment generation approach. Results show that the proposed system produces more coherent fragments across different conditions, which results in significantly better recognition accuracy.
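The pitch cue described in this abstract can be illustrated with a minimal sketch. It is not the paper's system: a real correlogram computes an autocorrelation for every channel of an auditory filterbank, whereas this example uses a single unfiltered channel and a synthetic tone. The sample rate, pitch, window length, lag range and all function names are assumptions chosen for illustration.

```python
import math

FS = 8000                    # sample rate in Hz (assumed)
F0 = 100.0                   # pitch of the synthetic source in Hz (assumed)
WINDOW = 800                 # samples summed per lag: exactly 10 pitch periods
MIN_LAG, MAX_LAG = 20, 120   # lag search range in samples (400 Hz down to ~67 Hz)


def harmonic_tone(n_samples):
    """Sum of the first three harmonics of F0 with amplitudes 1, 1/2, 1/3."""
    return [
        sum(math.sin(2 * math.pi * k * F0 * n / FS) / k for k in (1, 2, 3))
        for n in range(n_samples)
    ]


def autocorrelation(x, lag):
    """Short-time autocorrelation of x at one lag over a fixed window."""
    return sum(x[n] * x[n + lag] for n in range(WINDOW))


# The signal must be long enough for the largest lag in the search range.
signal = harmonic_tone(WINDOW + MAX_LAG)

# One "row" of a correlogram: autocorrelation as a function of lag.
acf = {lag: autocorrelation(signal, lag) for lag in range(MIN_LAG, MAX_LAG + 1)}

# The strongest peak sits at the pitch period (FS / F0 = 80 samples).
best_lag = max(acf, key=acf.get)
pitch_hz = FS / best_lag
print(best_lag, pitch_hz)    # 80 100.0
```

The lag range stops short of twice the pitch period on purpose. Widening it to 160 samples would admit a second peak of equal height, which is the "multiple pitch periods" structure the abstract refers to.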

    Human-centred design of clinical auditory alarms

    Auditory alarms are often poorly designed, providing little to no information or guidance. In the healthcare context, the poor acoustics of alarms is one contributor to the noise problem. The goal of this thesis is to propose a human-centred methodology for the design of clinical auditory alarms, making them less disruptive and more informative and thus improving the healthcare soundscape. It implements this methodology from concept to evaluation and validation, combining psychoacoustics with usability and user experience methods. Another aim of this research was to understand the limitations and possibilities offered by online tools for scientific studies. Thus, different processes and methodologies were implemented, and the corresponding results are discussed. To understand the acoustic healthcare environment, field visits, interviews, and surveys were performed with healthcare professionals. Additionally, sound pressure level measurements and frequency analysis of several surgeries in different hospitals provided specific sound design requirements, which were added to an existing body of knowledge on clinical alarm design. A second stage consisted of prototyping very simple sounds to understand which temporal and spectral parameters of sound could be manipulated to communicate clinical information. Parameters such as frequency, speed, onset, and rhythm were studied, and relations between subjective perception and physical parameters were established. In parallel, and heavily influenced by the new IEC 60601-1-8 (General requirements, tests and guidance for alarm systems in medical electrical equipment and medical electrical systems), a design strategy with auditory icons was created. This strategy was intended to provide as much information as possible in an auditory alarm. To do so, it involved two main components: a priority pointer indicating the priority of the alarm, and an auditory icon indicating the cause of the alarm.
A third component indicating increasing or decreasing tendency of the vital sign was designed, but not validated with users. After online validation of the priority pointer and auditory icon for eight categories (cardiac, drug administration, ventilation, blood pressure, perfusion, oxygen, temperature, and power down), a new library of clinical auditory alarms is proposed.

    A perceptually evaluated signal model: Collisions between a vibrating object and an obstacle

    The collision interaction mechanism between a vibrating string and a non-resonant obstacle is at the heart of many musical instruments. This paper focuses on the identification of perceptually salient auditory features related to this phenomenon. The objective is to design a signal-based synthesis process, with an eye towards developing intuitive control strategies. To this end, a database of synthesized sounds is assembled through physics-based emulation of a string/obstacle collision, in order to characterize the effect of collisions on time-frequency content. The investigation of this database reveals characteristic time-frequency patterns related to the position of the obstacle during the interaction. In particular, a frequency shift of certain modes is apparent for strong interactions, which, alongside the generation of new frequency components, leads to increased perceived roughness and inharmonicity. These observations enable the design of a real-time compatible signal-based sound synthesis process, with a mapping of synthesis parameters linked to the perceived location of the obstacle. The accuracy of the signal model with respect to the physical model sound output and recorded sounds was evaluated through listening tests: time-frequency patterns reproduced by the signal model enabled listeners to precisely recognize the transverse location of the obstacle.

    A Thousand Bells: Acoustical implementation of bell spectra using the finite element method and its compositional realization

    This dissertation focuses primarily on the analysis of acoustical models of bell sounds and the modelling of virtual bell shapes and their spectra using the Finite Element Method (FEM). The first chapter provides a brief introduction to pre- and post-spectral music that is inspired by, or employs, bell sounds as the source of its central materials. The second chapter introduces bell acoustics and the creation of new spectral profiles of optimal bell tone colors based upon just tuning ratios. In this chapter, I discuss how the concepts of consonance and Just Noticeable Difference in psychoacoustics are applied to use the 96-tone equal temperament tuning system for bell harmonic profiles. The third chapter covers the theoretical basis of the FEM and its application to isoparametric 2-D quadrilateral elements, which form the theoretical foundation for calculating bell harmonies mathematically. This includes the central concepts of the FEM, such as the Principle of Virtual Work/Displacement, master to global coordinate transformation, FE shape functions, the use of Jacobian matrices, and numerical integration of the stiffness matrix and the equivalent nodal force vector for the element by using the Gauss-Lagrange quadrature. In the fourth chapter, I create the bell model geometry using a 2-D bell nominal curve and adjustable design variables. Physical parameters, such as the Poisson ratio, Young's modulus, and material properties, are also adopted from previous bell design research. Based upon the aforementioned prototypes, I create 24 different 3-D bell geometries and analyze the spectra of these virtual bells. These bell models are analyzed, optimized and tuned to create the tone colors that are defined in Chapter 2. After validation of the bell model, the general background of optimization theory is also introduced and analyzed for the purpose of creating 3-D virtual bells.
For a general background of campanology, I use André Lehr's Campanology textbook to provide a brief history, types, mechanism, casting, forms and parts, and tones of different bells. For acoustical and computational realizations of virtual bells, the analysis focuses on the research of Albertus Johannes Gerardus Schoofs and his follower PJM Roozen-Kroon on FEM bell optimization, upon which the first prototype of the major-third bell was designed and cast.
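The relationship between just tuning ratios and 96-tone equal temperament mentioned in this abstract can be made concrete with a short worked example. This is only a sketch of the arithmetic, not the dissertation's procedure: the choice of ratios and the function name `nearest_96tet` are assumptions for illustration.

```python
import math

STEPS_PER_OCTAVE = 96
STEP_CENTS = 1200 / STEPS_PER_OCTAVE   # 12.5 cents per 96-TET step


def nearest_96tet(ratio):
    """Return (step index, tuning error in cents) for a frequency ratio.

    The error is the 96-TET pitch minus the just pitch, so a positive
    value means the tempered step is sharp of the just interval.
    """
    just_cents = 1200 * math.log2(ratio)
    step = round(just_cents / STEP_CENTS)
    return step, step * STEP_CENTS - just_cents


# Example just ratios (assumed for illustration, not taken from the thesis).
examples = {
    "major third 5/4": 5 / 4,
    "perfect fifth 3/2": 3 / 2,
    "harmonic seventh 7/4": 7 / 4,
}

results = {name: nearest_96tet(ratio) for name, ratio in examples.items()}
for name, (step, error) in results.items():
    print(f"{name}: step {step}, error {error:+.2f} cents")
```

Because each step is 12.5 cents wide, rounding to the nearest step can never be off by more than 6.25 cents. Whether that stays below a listener's Just Noticeable Difference depends on the threshold assumed, which this listing does not state.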

    Integrating the exposition in music-composition research
