416 research outputs found

Exploiting correlogram structure for robust speech recognition with multiple speech sources

Authors: Barker J.; Coy A.; Green P.; Ma N.
Publication venue: Elsevier BV
Publication date: 01/01/2007

This paper addresses the problem of separating and recognising speech in a monaural acoustic mixture in the presence of competing speech sources. The proposed system treats sound source separation and speech recognition as tightly coupled processes. In the first stage, sound source separation is performed in the correlogram domain. For periodic sounds, the correlogram exhibits symmetric tree-like structures whose stems are located on the delay that corresponds to multiple pitch periods. These pitch-related structures are exploited in the study to group spectral components at each time frame. Local pitch estimates are then computed for each spectral group and are used to form simultaneous pitch tracks for temporal integration. These processes segregate a spectral representation of the acoustic mixture into several time-frequency regions such that the energy in each region is likely to have originated from a single periodic sound source. The identified time-frequency regions, together with the spectral representation, are passed to a 'speech fragment decoder', which uses 'missing data' techniques with clean speech models to simultaneously search for the acoustic evidence that best matches model sequences. The paper presents evaluations based on artificially mixed simultaneous speech utterances. A coherence-measuring experiment is first reported, which quantifies the consistency of the identified fragments with a single source. The system is then evaluated in a speech recognition task and compared to a conventional fragment generation approach. Results show that the proposed system produces more coherent fragments over different conditions, which results in significantly better recognition accuracy.

CiteSeerX

White Rose Research Online

Topology of spatial texture in the acousmatic medium

Author: Nyström Erik
Publication date: 01/01/2013

This research explores the dynamic fabric of experienced space in acousmatic music. The topology of spatial texture is a network of concepts treating music as a flexible, textural space, which deforms, shapes, and transforms in time. A comprehensive terminology is introduced, along with five fixed-media electroacoustic compositions, which exemplify a manifestation of spatial texture in composition and musical thinking. The theory draws from research on the cross-modality of texture perception, philosophical discourse on embodied meaning, physics, psychology of visual art, and discourse on space in acousmatic music. Several different structural perspectives are discussed, which reveal how spatial texture incorporates lower sound-structural levels, materiality, states and processes, motion, global networks and terrains, and relationships between space and time. Emphasis is put on visual and physical connections with spatiality in the acousmatic experience: cogency in spatial structure and dynamics reinforces links among modalities. The concepts and terminology are intended as a contribution to theory in the acousmatic medium, relevant to composition, analysis, and listening. The music represents an aesthetic orientation which emphasises materiality and morphology in texture, transformative processes, spatial design, and spatiotemporal polyvalence

City Research Online

OpenGrey Repository

Human-centred design of clinical auditory alarms

Author: Vieira Joana Catarina Fernandes
Publication date: 05/05/2022

Auditory alarms are commonly badly designed, providing little to no information or guidance. In the healthcare context, the poor acoustics of alarms is one contributor to the noise problem. The goal of this thesis is to propose a human-centred methodology for the design of clinical auditory alarms, by making them less disruptive and more informative, thus improving the healthcare soundscape. It implements this methodology from concept to evaluation and validation, combining psychoacoustics with usability and user experience methods. Another aim of this research was to understand the limitations and possibilities offered by online tools for scientific studies. Thus, different processes and methodologies were implemented, and the corresponding results were discussed. To understand the acoustic healthcare environment, field visits, interviews, and surveys were performed with healthcare professionals. Additionally, sound pressure levels and frequency analysis of several surgeries in different hospitals provided specific sound design requirements, which were added to an existing body of knowledge on clinical alarm design. A second stage consisted of prototyping very simple sounds to comprehend which temporal and spectral parameters of sound could be manipulated to communicate clinical information. Parameters such as frequency, speed, onset, and rhythm were studied, and relations between subjective perception and physical parameters were established. In parallel, and heavily influenced by the new IEC 60601-1-8 - General requirements, tests and guidance for alarm systems in medical electrical equipment and medical electrical systems, a design strategy with auditory icons was created. This strategy intended to provide as much information as possible in an auditory alarm. To do so, it involved two main components: a priority pointer indicating the priority of the alarm, and an auditory icon indicating the cause of the alarm.
A third component indicating an increasing or decreasing tendency of the vital sign was designed, but not validated with users. After online validation of the priority pointer and auditory icon for eight categories (cardiac, drug administration, ventilation, blood pressure, perfusion, oxygen, temperature, and power down), a new library of clinical auditory alarms is proposed.

UTL Repository

Spectral noise levels and roughness severity ratings for normal and simulated rough vowels produced by adult males

Author: Sansone Frank Edward
Publication date: 01/01/1969

SHAREOK repository

A perceptually evaluated signal model: Collisions between a vibrating object and an obstacle

Authors: Aramaki Mitsuko; Bilbao Stefan; Kronland Martinet Richard; Poirot Samuel; Ystad Solvi
Publication date: 01/01/2023

The collision interaction mechanism between a vibrating string and a non-resonant obstacle is at the heart of many musical instruments. This paper focuses on the identification of perceptually salient auditory features related to this phenomenon. The objective is to design a signal-based synthesis process, with an eye towards developing intuitive control strategies. To this end, a database of synthesized sounds is assembled through physics-based emulation of a string/obstacle collision, in order to characterize the effect of collisions on time-frequency content. The investigation of this database reveals characteristic time-frequency patterns related to the position of the obstacle during the interaction. In particular, a frequency shift of certain modes is apparent for strong interactions, which, alongside the generation of new frequency components, leads to increased perceived roughness and inharmonicity. These observations enable the design of a real-time compatible signal-based sound synthesis process, with a mapping of synthesis parameters linked to the perceived location of the obstacle. The accuracy of the signal model with respect to the physical model sound output and recorded sounds was evaluated through listening tests: time-frequency patterns reproduced by the signal model enabled listeners to precisely recognize the transverse location of the obstacle

HAL AMU

Edinburgh Research Explorer

Composing with Sound-Objects: A Methodology

Author: Wheeler George Stockton
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Technology presents us with the ability to record and manipulate the entire universe of sound in a musical composition. As a result, composers are faced with an overwhelming, often paralyzing, number of available musical options. My methodology focuses on how a sound-object informs the organization, collection, manipulation, and culmination of a work of electronic music. I believe that breaking down the number of choices into manageable, bite-sized portions helps minimize ambiguity, and imposing limits on musical parameters helps the composer focus on productive musical options. This is a methodology where the sound-object holds primacy over the work and serves as the motivic touchstone from which to make all compositional decisions. Part one of the dissertation provides a definition of a sound-object and a historical overview. Part two is my methodology, which is divided into three working stages: onset, continuant, and termination. The onset stage discusses a compositional approach to organizing a piece of music based on the sound-object as motivic touchstone; it introduces the organizational process according to functional considerations as well as conceptual approaches. The continuant stage is the composer’s playground where sound is transformed. It includes the technical and practical approaches used to assess the many parameters of a sound-object, as well as how the object itself informs the transformations. Additionally, the continuant stage represents an approach to composition and improvisation, informed by the sound-object, that uses acoustic instruments. Finally, the termination stage brings all these elements together in order to finish the piece. This stage explores how the sound-object can inform the structure of the piece at the subsequent levels of event, phrase, section, and overall form.
I will demonstrate this methodology by explaining how I composed five original pieces with sound-objects: Acoustic Memories (2019), Modular Voices (2019), Interconnected (2019), Synthetic Objects (2019), and Gucci Concrète (2019). Through the framework presented in this methodology, composers of electronic music will better understand this flexible medium of composition, by moving beyond the traditional grid of discrete pitches and rhythms, in order to control the entire universe of sound for their palette of inspiration.

eScholarship - University of California

A Thousand Bells: Acoustical implementation of bell spectra using the finite element method and its compositional realization

Author: Lee Dongryul
Publication date: 01/08/2020

This dissertation focuses primarily on the analysis of acoustical models of bell sounds and the modelling of virtual bell shapes and their spectra using the Finite Element Method (FEM). The first chapter provides a brief introduction to pre- and post-spectral music that is inspired by or employs bell sounds from which it derives its central materials. The second chapter introduces bell acoustics and the creation of new spectral profiles of optimal bell tone colors based upon just tuning ratios. In this chapter, I discuss how the concepts of consonance and Just Noticeable Difference in psychoacoustics are applied to use the 96-tone equal temperament tuning system for bell harmonic profiles. The third chapter includes the theoretical basis of the FEM and its application to the isoparametric 2-D quadrilateral elements, which are the fundamental theories of how bell harmonies are mathematically calculated. This includes the central concepts of the FEM, such as the Principle of Virtual Work/Displacement, master to global coordinate transformation, FE shape functions, usages of Jacobian matrices, and numerical integration of the stiffness matrix and the equivalent nodal force vector for the element by using the Gauss-Lagrange quadrature. In the fourth chapter, I create bell model geometry by using a 2-D bell nominal curve and adjustable design variables. Physical parameters, such as the Poisson ratio, Young’s modulus, and material properties, are also adopted from previous bell design research. Based upon the aforementioned prototypes, I create 24 different 3-D bell geometries, and analyze the spectra of these virtual bells. These bell models are analyzed, optimized and tuned to create tone colors that are defined in Chapter 2. After a validation process for the bell model, the general background of optimization theory is also introduced and analyzed for the purpose of creating 3-D virtual bells.
For a general background of campanology, I use André Lehr’s Campanology textbook to provide the brief history, types, mechanism, casting, forms and parts, and tones of different bells. For acoustical and computational realizations of virtual bells, the analysis focuses on the research of Albertus Johannes Gerardus Schoofs and his follower PJM Roozen-Kroon on the FEM bell optimization, upon which the first prototype of the major-third bell was designed and cast.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Integrating the exposition in music-composition research

Author: Roels Hans
Publication venue: Leiden University Press
Publication date: 01/01/2014

Ghent University Academic Bibliography

Materials, meaning and metaphor: Unveiling spatiotemporal pertinences in acousmatic music

Author: Anderson E.L.

This dissertation addresses two topics. The first is a preliminary investigation into the listening strategies for electroacoustic music by François Delalande. A listening experiment was undertaken to test Delalande’s strategies and to learn from listeners’ responses in order to apply them to compositional practice. This process prompted the conception of a new, integrated reception behaviour framework for electroacoustic music that comprises four listening strategies: sonic properties, structural attributes, self-orientation, and imaginary realms. The second topic is the poietico-esthesic analysis of the folio of acousmatic compositions from the perspective of the reception behaviours framework. The intention of the reception behaviours framework is to illuminate those sounds and structures in electroacoustic music that could be perceived as carriers of meaning. The analysis of the acousmatic compositions in the portfolio, from the perspective of the reception behaviours framework, aims to illustrate how the acousmatic composer can attempt to create meaning in an acousmatic work. While space is observed as the common denominator in the reception behaviours framework from an esthesic perspective, space and time are proposed as common denominators that carry all poietic intention. Hence, space and time can be seen as universal carriers through which meaning can subsequently be conveyed and perceived

City Research Online