416 research outputs found
Exploiting correlogram structure for robust speech recognition with multiple speech sources
This paper addresses the problem of separating and recognising speech in a monaural acoustic mixture in the presence of competing speech sources. The proposed system treats sound source separation and speech recognition as tightly coupled processes. In the first stage, sound source separation is performed in the correlogram domain. For periodic sounds, the correlogram exhibits symmetric tree-like structures whose stems are located on the delay that corresponds to multiple pitch periods. These pitch-related structures are exploited in the study to group spectral components at each time frame. Local pitch estimates are then computed for each spectral group and are used to form simultaneous pitch tracks for temporal integration. These processes segregate a spectral representation of the acoustic mixture into several time-frequency regions such that the energy in each region is likely to have originated from a single periodic sound source. The identified time-frequency regions, together with the spectral representation, are passed to a 'speech fragment decoder', which applies 'missing data' techniques with clean speech models to simultaneously search for the acoustic evidence that best matches model sequences. The paper presents evaluations based on artificially mixed simultaneous speech utterances. A coherence-measuring experiment is first reported which quantifies the consistency of the identified fragments with a single source. The system is then evaluated in a speech recognition task and compared to a conventional fragment generation approach. Results show that the proposed system produces more coherent fragments over different conditions, which results in significantly better recognition accuracy.
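The pitch cue behind the correlogram can be illustrated with a short sketch. A correlogram holds one autocorrelation function per frequency channel; the code below is a deliberately reduced, single-channel version and is not the paper's system. The function names, the 8 kHz sample rate, and the synthetic 200 Hz test frame are all assumptions made for illustration.

```python
import math

def autocorrelation(signal, max_lag):
    """Unnormalised autocorrelation of `signal` for lags 0..max_lag."""
    n = len(signal)
    return [
        sum(signal[t] * signal[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]

def estimate_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    acf = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])

# Hypothetical test frame: a 200 Hz tone plus its second harmonic at 8 kHz.
sample_rate = 8000
f0 = 200.0
frame = [
    math.sin(2 * math.pi * f0 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / sample_rate)
    for t in range(400)
]

period = estimate_period(frame, min_lag=20, max_lag=60)
print("estimated period (samples):", period)
print("estimated pitch (Hz):", sample_rate / period)
```

In a full correlogram the same peak would appear at a common delay across the channels dominated by one periodic source, which is the kind of structure the abstract describes exploiting for grouping.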
Recommended from our members
Topology of spatial texture in the acousmatic medium
This research explores the dynamic fabric of experienced space in acousmatic music. The topology of spatial texture is a network of concepts treating music as a flexible, textural space, which deforms, shapes, and transforms in time. A comprehensive terminology is introduced, along with five fixed-media electroacoustic compositions, which exemplify a manifestation of spatial texture in composition and musical thinking.
The theory draws from research on the cross-modality of texture perception, philosophical discourse on embodied meaning, physics, psychology of visual art, and discourse on space in acousmatic music. Several different structural perspectives are discussed, which reveal how spatial texture incorporates lower sound-structural levels, materiality, states and processes, motion, global networks and terrains, and relationships between space and time. Emphasis is put on visual and physical connections with spatiality in the acousmatic experience: cogency in spatial structure and dynamics reinforces links among modalities.
The concepts and terminology are intended as a contribution to theory in the acousmatic medium, relevant to composition, analysis, and listening. The music represents an aesthetic orientation which emphasises materiality and morphology in texture, transformative processes, spatial design, and spatiotemporal polyvalence
Human-centred design of clinical auditory alarms
Auditory alarms are often poorly designed, providing little to no information or
guidance. In the healthcare context, the poor acoustics of alarms is one contributor to the
noise problem. The goal of this thesis is to propose a human-centred methodology for the
design of clinical auditory alarms, by making them less disruptive and more informative,
thus improving the healthcare soundscape. It implements this methodology from concept
to evaluation and validation, combining psychoacoustics with usability and user
experience methods. Another aim of this research was to understand the
limitations and possibilities offered by online tools for scientific studies. Thus, different
processes and methodologies were implemented, and corresponding results were
discussed.
To understand the acoustic healthcare environment, field visits, interviews, and surveys
were performed with healthcare professionals. Additionally, sound pressure levels and
frequency analysis of several surgeries in different hospitals provided specific sound design
requirements, which were added to an existing body of knowledge on clinical alarm
design. A second stage consisted of prototyping very simple sounds to comprehend which
temporal and spectral parameters of sound could be manipulated to communicate clinical
information. Parameters such as frequency, speed, onset, and rhythm were studied, and
relations between subjective perception and physical parameters were established. In
parallel, and heavily influenced by the new IEC 60601-1-8 - General requirements, tests and
guidance for alarm systems in medical electrical equipment and medical electrical systems,
a design strategy with auditory icons was created. This strategy intended to provide as
much information as possible in an auditory alarm. To do so, it involved two main
components: a priority pointer indicating the priority of the alarm, and an auditory icon
indicating the cause of the alarm. A third component indicating increasing or decreasing
tendency of the vital sign was designed, but not validated with users. After online
validation of the priority pointer and auditory icon for eight categories (cardiac, drug
administration, ventilation, blood pressure, perfusion, oxygen, temperature, and power
down), a new library of clinical auditory alarms is proposed.
Auditory alarms are usually poorly designed, giving little information or guidance about
the situation that triggered the warning. In the healthcare context, the poor acoustics
of alarms is one of the contributors to the noise problem. The goal of this thesis is to
improve the soundscape of clinical environments by proposing a human-centred methodology
for the design of clinical auditory alarms, making them less disruptive and more
informative. This methodology is implemented from concept to evaluation and validation,
combining methods from psychoacoustics with usability and user experience methods.
Another goal of this research is to understand the limitations and possibilities offered
by online tools for scientific studies. Thus, several processes and methodologies were
implemented, and the corresponding results are discussed.
To understand the clinical acoustic environment, field visits, interviews, and surveys
were carried out with healthcare professionals. In addition, the sound pressure level and
frequencies of several surgeries in different hospitals were assessed. This activity
provided specific sound design requirements, which were added to an existing body of
knowledge on clinical alarm design. A second stage consisted of prototyping simple sounds
to understand which temporal and spectral parameters of sound could be manipulated to
communicate clinical information. Parameters such as frequency, speed, envelope, and
rhythm were studied, and the relations between subjective perception and physical
parameters were established. In parallel, and heavily influenced by the new standard
IEC 60601-1-8 - General requirements, tests and guidance for alarm systems in medical
electrical equipment and medical electrical systems, a design strategy with auditory
icons was created. This strategy was intended to incorporate as much information as
possible in an auditory alarm. To do so, it involved two main components: a priority
pointer that indicates the priority of the alarm, and an auditory icon that indicates the
cause of the alarm. A third component, indicating the trend (increase or decrease in the
value of the vital sign), was created but not validated with users. After validation of
the priority pointer and auditory icon for eight categories (cardiac, drug
administration, ventilation, blood pressure, perfusion, oxygen, temperature, and
equipment failure), a new library of clinical auditory alarms is proposed.
A perceptually evaluated signal model: Collisions between a vibrating object and an obstacle
The collision interaction mechanism between a vibrating string and a non-resonant obstacle is at the heart of many musical instruments. This paper focuses on the identification of perceptually salient auditory features related to this phenomenon. The objective is to design a signal-based synthesis process, with an eye towards developing intuitive control strategies. To this end, a database of synthesized sounds is assembled through physics-based emulation of a string/obstacle collision, in order to characterize the effect of collisions on time-frequency content. The investigation of this database reveals characteristic time-frequency patterns related to the position of the obstacle during the interaction. In particular, a frequency shift of certain modes is apparent for strong interactions, which, alongside the generation of new frequency components, leads to increased perceived roughness and inharmonicity. These observations enable the design of a real-time compatible signal-based sound synthesis process, with a mapping of synthesis parameters linked to the perceived location of the obstacle. The accuracy of the signal model with respect to the physical model sound output and recorded sounds was evaluated through listening tests: time-frequency patterns reproduced by the signal model enabled listeners to precisely recognize the transverse location of the obstacle
Composing with Sound-Objects: A Methodology
Technology presents us with the ability to record and manipulate the entire universe of sound in a musical composition. As a result, composers are faced with an overwhelming, often paralyzing, amount of available musical options. My methodology focuses on how a sound-object informs the organization, collection, manipulation, and culmination of a work of electronic music. I believe that breaking down the number of choices to manageable bite-sized portions helps minimize ambiguity, and imposing limits on musical parameters helps the composer focus on productive musical options. This is a methodology where the sound-object holds primacy over the work and serves as the motivic touchstone from which to make all compositional decisions.
Part one of the dissertation provides a definition of a sound-object and an historical overview. Part two is my methodology, which is divided into three working stages: onset, continuant, and termination. The onset stage discusses a compositional approach to organizing a piece of music based on the sound-object as motivic touchstone; it introduces the organizational process according to functional considerations as well as conceptual approaches. The continuant stage is the composer's playground where sound is transformed. It includes the technical and practical approaches used to assess the many parameters of a sound-object, as well as how the object itself informs the transformations. Additionally, the continuant stage represents an approach to composition and improvisation, informed by the sound-object, that uses acoustic instruments. Finally, the termination stage brings all these elements together in order to finish the piece. This stage explores how the sound-object can inform the structure of the piece at the subsequent levels of event, phrase, section, and overall form.
I will demonstrate this methodology by explaining how I composed five original pieces with sound-objects: Acoustic Memories (2019), Modular Voices (2019), Interconnected (2019), Synthetic Objects (2019), and Gucci Concrète (2019). Through the framework presented in this methodology, composers of electronic music will better understand this flexible medium of composition, by moving beyond the traditional grid of discrete pitches and rhythms, in order to control the entire universe of sound for their palette of inspiration.
A Thousand Bells: Acoustical implementation of bell spectra using the finite element method and its compositional realization
This dissertation focuses primarily on the analysis of acoustical models of bell sounds and the modelling of virtual bell shapes and their spectra using the Finite Element Method (FEM) technique.
The first chapter provides a brief introduction to pre- and post-spectral music that is inspired by or employs bell sounds from which it derives its central materials.
The second chapter introduces bell acoustics and the creation of new spectral profiles of optimal bell tone colors based upon just tuning ratios. In this chapter, I discuss how the concepts of consonance and Just Noticeable Difference in psychoacoustics are applied to use the 96 tone equal temperament tuning system for bell harmonic profiles.
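To make the tuning arithmetic concrete, the sketch below maps a just ratio to its nearest step in 96-tone equal temperament and reports the deviation in cents. One step of 96-tone equal temperament is 1200 / 96 = 12.5 cents, so rounding to the nearest step can never be off by more than 6.25 cents. The abstract does not say how the dissertation performs this mapping; the function names and the example ratios are assumptions chosen for illustration.

```python
import math

def ratio_to_cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)

def nearest_equal_step(ratio, divisions=96):
    """Return (step index, error in cents) of the closest equal-tempered step."""
    step_cents = 1200.0 / divisions
    exact = ratio_to_cents(ratio)
    step = round(exact / step_cents)
    return step, step * step_cents - exact

# Illustrative just ratios only; these are not the bell profiles from the thesis.
just_ratios = {
    "minor third": 6 / 5,
    "major third": 5 / 4,
    "perfect fifth": 3 / 2,
    "octave": 2 / 1,
}

for name, ratio in just_ratios.items():
    step, error = nearest_equal_step(ratio)
    print(f"{name}: step {step} of 96, error {error:+.2f} cents")
```

Whether such an error is audible depends on the Just Noticeable Difference assumed, which is the psychoacoustic question the chapter is said to address.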
The third chapter includes the theoretical basis of the FEM and its application to the isoparametric 2-D quadrilateral elements, which are the fundamental theories of how bell harmonies are mathematically calculated. This includes the central concepts of the FEM, such as the Principle of Virtual Work/Displacement, master to global coordinate transformation, FE shape functions, the use of Jacobian matrices, and numerical integration of the stiffness matrix and the equivalent nodal force vector for the element by using Gauss-Legendre quadrature.
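The coordinate transformation and numerical integration listed above can be sketched for a single four-node isoparametric quadrilateral. The example assumes bilinear shape functions and a 2x2 Gauss quadrature rule (Gauss-Legendre points at ±1/√3, each with weight 1), and it integrates a scalar function instead of assembling a stiffness matrix. All names and the sample node coordinates are hypothetical; this is not the dissertation's code.

```python
import math

# 2x2 Gauss-Legendre rule on [-1, 1]: two points, each with weight 1.
GAUSS_POINTS = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))

def shape_derivatives(xi, eta):
    """Derivatives of the four bilinear shape functions in master coordinates.

    Nodes are ordered counter-clockwise starting at (xi, eta) = (-1, -1).
    """
    d_xi = (-(1 - eta) / 4, (1 - eta) / 4, (1 + eta) / 4, -(1 + eta) / 4)
    d_eta = (-(1 - xi) / 4, -(1 + xi) / 4, (1 + xi) / 4, (1 - xi) / 4)
    return d_xi, d_eta

def jacobian_det(nodes, xi, eta):
    """Determinant of the master-to-global Jacobian at (xi, eta)."""
    d_xi, d_eta = shape_derivatives(xi, eta)
    dx_dxi = sum(d * x for d, (x, _) in zip(d_xi, nodes))
    dy_dxi = sum(d * y for d, (_, y) in zip(d_xi, nodes))
    dx_deta = sum(d * x for d, (x, _) in zip(d_eta, nodes))
    dy_deta = sum(d * y for d, (_, y) in zip(d_eta, nodes))
    return dx_dxi * dy_deta - dx_deta * dy_dxi

def integrate(nodes, f):
    """Integrate f(xi, eta) over the element with the 2x2 Gauss rule."""
    return sum(
        f(xi, eta) * jacobian_det(nodes, xi, eta)
        for xi in GAUSS_POINTS
        for eta in GAUSS_POINTS
    )

# Hypothetical element: an irregular quadrilateral in global coordinates.
nodes = [(0.0, 0.0), (2.0, 0.0), (3.0, 2.0), (0.0, 1.0)]
area = integrate(nodes, lambda xi, eta: 1.0)
print("element area:", area)
```

Integrating the constant 1 returns the element area, which gives a quick check of the Jacobian: for a bilinear mapping the determinant is linear in each master coordinate, so the 2x2 rule integrates it exactly.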
In the fourth chapter, I create bell model geometry using a 2D bell nominal curve and adjustable design variables. Physical parameters, such as the Poisson ratio, Young's modulus, and material properties are also adopted from previous bell design research. Based upon the aforementioned prototypes, I create 24 different 3-D bell geometries, and analyze the spectra of these virtual bells. These bell models are analyzed, optimized and tuned to create tone colors that are defined in Chapter 2. After validation of the bell model, the general background of optimization theory is also introduced and analyzed for the purpose of creating 3-D virtual bells.
For a general background of campanology, I use André Lehr's Campanology textbook to provide the brief history, types, mechanism, casting, forms and parts, and tones of different bells. For acoustical and computational realizations of virtual bells, the analysis focuses on the research of Albertus Johannes Gerardus Schoofs and his follower PJM Roozen-Kroon on the FEM bell optimization, upon which the first prototype of the major-third bell was designed and cast.
Materials, meaning and metaphor: Unveiling spatiotemporal pertinences in acousmatic music
This dissertation addresses two topics. The first is a preliminary investigation into the listening strategies for electroacoustic music by François Delalande. A listening experiment was undertaken to test Delalande's strategies and to learn from listeners' responses in order to apply them to compositional practice. This process prompted the conception of a new, integrated reception behaviour framework for electroacoustic music that comprises four listening strategies: sonic properties, structural attributes, self-orientation, and imaginary realms. The second topic is the poietico-esthesic analysis of the folio of acousmatic compositions from the perspective of the reception behaviours framework. The intention of the reception behaviours framework is to illuminate those sounds and structures in electroacoustic music that could be perceived as carriers of meaning. The analysis of the acousmatic compositions in the portfolio, from the perspective of the reception behaviours framework, aims to illustrate how the acousmatic composer can attempt to create meaning in an acousmatic work. While space is observed as the common denominator in the reception behaviours framework from an esthesic perspective, space and time are proposed as common denominators that carry all poietic intention. Hence, space and time can be seen as universal carriers through which meaning can subsequently be conveyed and perceived.