30 research outputs found

Model-based speech enhancement for hearing aids

Author: Kavalekalam Mathew Shaji
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2018
Field of study

VBN

Support Vector Machine Active Learning for Music Retrieval

Author: Ellis Daniel P. W.; Mandel Michael I.; Poliner Graham E.
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2006
Field of study

Searching and organizing growing digital music collections requires a computational model of music similarity. This paper describes a system for performing flexible music similarity queries using SVM active learning. We evaluated the system by classifying 1210 pop songs according to mood and style (from an online music guide) and by performing artist. In comparing a number of representations for songs, we found the statistics of mel-frequency cepstral coefficients to perform best in precision-at-20 comparisons. We also show that by choosing training examples intelligently, active learning requires half as many labeled examples to achieve the same accuracy as a standard scheme.

Columbia University Academic Commons

Hear what you feel, feel what you hear : the effect of musical sequences on emotional processing

Author: Esteves Ana Marta Tavares
Publication venue
Publication date: 01/07/2021
Field of study

Master's dissertation, Psychology (Specialization in Clinical and Health Psychology - Cognitive-Behavioral and Integrative Psychotherapy), Universidade de Lisboa, Faculdade de Psicologia, 2021.
Music has a unique ability to access affective and motivational systems of the brain (Thaut & Wheeler, 2010). However, there is a large gap in research on the association between musical stimuli and their impact on emotional processing, a crucial component for the success of the therapeutic process (Greenberg & Paivio, 1997). The present study thus sought to explore both the capacity of music to access the affective system, to induce emotions, and to change emotional states, and its capacity to facilitate emotional processing leading to the resolution of emotional distress. An empirically validated sequential model of emotional processing from Pascual-Leone and Greenberg (2007) was used to test this dual capacity. Three musical sequences with distinct components were developed and presented on an online platform. One musical sequence followed the order of the sequential model (EED>AMM Sequence), another inverted that order (AMM>EED Sequence), and the last was intended to serve as a baseline (Control Sequence). All musical sequences not only led to shifts in participants' emotional states but also increased participants' emotional resolution. Some of the results were surprising, since the Control Sequence also led to an increase in emotional resolution and the EED>AMM Sequence did not emerge as the sequence leading to the highest emotional resolution. Nevertheless, these results still demonstrated the power of music to impact emotional processing and allow future studies to keep exploring this relationship.
Music can be described as the creation of emotions. Juslin and Västfjäll (2008, p. 572) emphasize this idea when they state that "what makes musical emotions unique is not their underlying mechanisms or the emotions they evoke, but the fact that music is often intentionally designed to induce emotions". At the very least, it is difficult, if not impossible, to imagine the absence of a relationship between music and emotion. Music has a strong presence in human culture (Blood & Zatorre, 2001; Marin & Bhattacharya, 2011; Sacks, 2007; Zentner et al., 2008), and the literature points to several ways in which it is used in daily life, among them the release and regulation of emotions (DeNora, 2000; Juslin et al., 2008; Juslin & Västfjäll, 2008; Knobloch & Zillmann, 2002; Marin & Bhattacharya, 2011), comfort and stress relief (DeNora, 2000, 2016), reliving valued past experiences (Hays & Minichiello, 2005), and accompaniment to everyday tasks (Sloboda et al., 2009). Despite this strong cultural presence, the study of the relationship between music and emotion is sharply divided between two positions: a cognitivist position and an emotivist position. On the one hand, the cognitivist position holds that musical stimuli cannot truly induce emotions (Kivy, 1990; Meyer, 1956; Scherer, 2004; Zentner et al., 2008), so emotions are merely perceived in the musical stimulus. On the other hand, the emotivist position offers several theories explaining how and why musical stimuli truly induce emotions. Blood and Zatorre (2001), Juslin and Västfjäll (2008), Koelsch (2012), Krumhansl (1997), and others reviewed by Juslin and Sloboda (2010) demonstrated that musical stimuli can induce basic emotions, among them sadness, fear, disgust, anger, and happiness. Within the emotivist position, the view was then formulated that music has a unique ability to access the affective and motivational systems of the brain (Thaut, 2005).
Specifically, Thaut and Wheeler (2010) state that music has been regarded as one of the major mechanisms of therapeutic effectiveness because it (1) plays an effective role in influencing and modifying affective states and (2) plays a central role, through affective modification, by accessing the whole of the patient's cognitions, perceptions, states, and behavioral organization. However, Thaut and Wheeler (2010) note that there are still no unifying theories explaining the neuropsychological and psychological mechanisms underlying affective responses to music listening, nor scientific models of the role of music-evoked emotions in a therapeutic context. Moreover, although there is no shortage of articles and studies showing that musical stimuli can indeed induce emotions, there is a marked lack of studies analyzing the influence of musical stimuli on emotional processing. In the field of music therapy, the Bonny Method of Guided Imagery and Music (Bonny, 1994) stands out as the only commonly known and used method, in which music interacts with the brain to evoke images that induce emotions and memories, allowing painful emotions to be transformed into positive ones (Lee et al., 2016). This method is not supported by, and has no link to, any theoretical model of emotional processing; it is largely based on free exploration and interpretation. The client's role is thus to share openly their perceptions and experiences within the music, and the therapist's role is to facilitate reflection on and integration of the client's shared feelings. This is an important gap to reflect on, since emotional processing is considered one of the main elements of the therapeutic process.
Greenberg and Paivio (1997) describe emotional processing in three steps: (1) evoking emotional states, (2) exploring the associated cognitive-affective sequences, and (3) restructuring the affective states by introducing something new. These steps underlay specific therapeutic tasks, but Pascual-Leone and Greenberg (2007) presented a sequential model of emotional processing, at a higher level of abstraction, that explains the resolution of emotional distress as therapy progresses. In this sequential model, one moves from undifferentiated and non-integrated emotions (states representative of Early Expressions of Distress, EED) to emotional experiences of acceptance (states representative of Advanced Meaning Making, AMM), regardless of the specific therapeutic tasks undertaken. This independence allows the exploration of different methods with therapeutic potential, even outside psychotherapy. The main aim of the present study was to explore whether exposure to musical sequences with particular characteristics leads to lower or higher emotional resolution of some emotional distress.
It was therefore hypothesized that:
• Listening to both types of musical sequences (experimental and control) will lead to a change in participants' emotional state, reflected in changes in the dimensions of valence, arousal, and control
• Listening to the experimental musical sequences, regardless of the order of progression, and compared with the control musical sequence, will lead to an increase in emotional resolution, with participants reporting less intra- or interpersonal distress
• Listening to the musical sequence with the EED-AMM progression will lead to a higher level of emotional resolution than the control musical sequence (with no specified progression) and the sequence with the AMM-EED progression
To address these hypotheses, the present study used quantitative methods and a brief qualitative element, making it a mixed-methods study. The quantitative methods were experimental, since the aim was to analyze and explore causal relationships between different musical sequences, emotional state, and emotional resolution. Three distinct groups were used: two experimental conditions and one control condition. A pre-post test design was applied, since it is a robust design with several advantages that allow the effect to be better isolated in the analyses (Christensen et al., 2015). The musical stimuli consisted of three musical sequences: two experimental sequences intended to simulate the affective states described in the sequential model (the EED-AMM Sequence, which followed the order of the model, and the AMM-EED Sequence, which inverted it), and a control sequence.
The musical excerpts making up the two experimental sequences were selected in two phases: (1) candidate options based on the literature review and (2) the best candidate based on a pre-test administered to the general population. For the control sequence, the first 6 minutes and 6 seconds of the piece Les Sylphides, by Chopin, were selected, since in a study by Zimny and Weidenfeller (2015) the data revealed an association between this piece and a state of neutrality based on GSR (galvanic skin response) measures. The measures and scales used were:
• Self-Assessment Manikin (SAM): this scale made it possible to determine whether the musical sequences affected participants' emotional state in the dimensions of valence, arousal, and control
• Resolution of Long-Standing Interpersonal Grievances (UFB-RS): this scale made it possible to determine whether the musical sequences affected the level of emotional resolution felt by participants who selected the emotional marker of resentment and hurt in an important relationship
• Resolution of Long-Standing Emotional Self-Neglect (ESN-RS): this scale made it possible to determine whether the musical sequences affected the level of emotional resolution felt by participants who selected the emotional marker of self-neglect
• Bern Post Session Report (BPSR-P): this scale made it possible to examine in more depth the possible therapeutic impact of the musical sequences on participants
• Expressive Writing Task: immediately after the experimental induction, participants were given the choice of completing an expressive writing task, which was intended to better assess the level of emotional processing induced
The study was run as an online questionnaire on the Qualtrics platform (www.qualtrics.com). First, participants chose which emotional marker to work on: Self-Neglect or Resentment and Hurt in an Important Relationship. Next, participants were asked to complete the SAM and either the UFB-RS or the ESN-RS. The experimental induction then began with listening to one of the three musical sequences. Immediately after listening, participants were asked whether they had noticed any internal change or transformation regarding the theme they had chosen to work on and, if they answered yes, were asked to describe the transformation in some detail. Finally, they were asked to complete the SAM, the UFB-RS or ESN-RS, and the BPSR-P. Quantitative data were analyzed statistically using IBM SPSS Statistics (version 26.0), and qualitative data were analyzed using NVivo 12. For all statistical analyses, data from participants assigned to the UFB-RS or ESN-RS scales were pooled, since both measure the level of emotional resolution. The aim was not to differentiate the level of emotional resolution reached for each marker, but to assess the overall level reached. Variables were analyzed separately for each condition (EED-AMM Sequence, n = 30; AMM-EED Sequence, n = 30; Control Sequence, n = 30). Regarding the first hypothesis, the results showed that, within each condition, changes occurred in the valence, arousal, and control dimensions of participants' emotional state between pre- and post-induction.
In addition, between conditions, participants were shown to be equally emotionally aroused both before and after the experimental induction. Regarding the second hypothesis, contrary to expectations, the results showed that all musical sequences led to greater emotional resolution. Regarding the third and final hypothesis, the quantitative results showed that the musical sequence with the EED-AMM progression did not lead to a higher level of emotional resolution than the other sequences. These results made it possible to explore a relationship between listening to musical sequences that simulate the sequential model of emotional processing of Pascual-Leone and Greenberg (2007) and an attempt to reach greater emotional resolution. This study invites reflection on the potential of shorter, simpler, and more economical interventions, even outside psychotherapy. The data recorded here also begin to fill a gap by linking musical stimuli to an empirical model of emotional processing. In conclusion, the present study shows that music can be described not only as the creation of emotions but also as the transformation of emotions.

Universidade de Lisboa: Repositório.UL

A Content-Aware Interactive Explorer of Digital Music Collections: The Phonos Music Explorer

The thesis aims to use the most recent Music Information Retrieval (MIR) technologies to build an interactive explorer of music catalogues. The software uses advanced techniques such as dimensionality reduction via FastMap, generation and over-the-network streaming of audio content, and segmentation and descriptor extraction from audio signals. In addition, the software can adapt its output in real time based on the user's interactions.

Padua Thesis and Dissertation Archive

A computational framework for sound segregation in music signals

Author: Martins Luís Gustavo Pereira Marques
Publication venue
Publication date: 01/01/2008
Field of study

Doctoral thesis. Electrical and Computer Engineering. Faculdade de Engenharia, Universidade do Porto. 2008

Repositório Aberto da Universidade do Porto

Towards a better understanding of mix engineering

Author: De Man Brecht
Publication venue: Queen Mary University of London
Publication date: 21/09/2017
Field of study

PhD. This thesis explores how the study of realistic mixes can expand current knowledge about multitrack music mixing. An essential component of music production, mixing remains an esoteric matter with few established best practices. Research on the topic is challenged by a lack of suitable datasets, and consists primarily of controlled studies focusing on a single type of signal processing. However, considering one of these processes in isolation neglects the multidimensional nature of mixing. For this reason, this work presents an analysis and evaluation of real-life mixes, demonstrating that it is a viable and even necessary approach to learn more about how mixes are created and perceived. Addressing the need for appropriate data, a database of 600 multitrack audio recordings is introduced, and mixes are produced by skilled engineers for a selection of songs. This corpus is subjectively evaluated by 33 expert listeners, using a new framework tailored to the requirements of comparison of musical signal processing. By studying the relationship between these assessments and objective audio features, previous results are confirmed or revised, new rules are unearthed, and descriptive terms can be defined. In particular, it is shown that examples of inadequate processing, combined with subjective evaluation, are essential in revealing the impact of mix processes on perception. As a case study, the percept 'reverberation amount' is expressed as a function of two objective measures, and a range of acceptable values can be delineated. To establish the generality of these findings, the experiments are repeated with an expanded set of 180 mixes, assessed by 150 subjects with varying levels of experience from seven different locations in five countries. This largely confirms initial findings, showing few distinguishable trends between groups. Increasing experience of the listener results in a larger proportion of critical and specific statements, and agreement with other experts.
Yamaha Corporation, the Audio Engineering Society, Harman International Industries, the Engineering and Physical Sciences Research Council, the Association of British Turkish Academics, and Queen Mary University of London's School of Electronic Engineering and Computer Science

Queen Mary Research Online

An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

Author: Cahill Niall M.
Publication venue
Publication date: 01/02/2012
Field of study

In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the user's signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform (STFT) domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the incoming far-end user's speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address DTD for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on double-talk. Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false double-talk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique.

MURAL - Maynooth University Research Archive Library

Irish Universities

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Automating the Production of the Balance Mix in Music Production

Author: Jillings Nicholas
Publication venue
Publication date: 16/10/2023
Field of study

Historically, the junior engineer is an individual who would assist the sound engineer to produce a mix by performing a number of mixing and pre-processing tasks ahead of the main session. With improvements in technology, these tasks can be done more efficiently, so many aspects of this role are now assigned to the lead engineer. Similarly, these technological advances mean amateur producers now have access to similar mixing tools at home, without the need for any studio time or record label investments. As the junior engineer's role is now embedded into the process, it creates a steeper learning curve for these amateur engineers and adds time to the mixing process. In order to build tools to help users overcome the hurdles associated with this increased workload, we first aim to quantify the role of a modern studio engineer. To do this, a production environment was built to collect session data, allowing subjects to construct a balance mix, which is the starting point of the mixing life-cycle. This balance mix is generally designed to ensure that all the recordings in a mix are audible, as well as to build routing structures and apply pre-processing. Improvements in web technologies allow this data-collection system to run in a browser, making remote data acquisition feasible in a short space of time. The data collected in this study was then used to develop a set of assistive tools, designed to be non-intrusive and to provide guidance, allowing the engineer to understand the process. From the data, grouping of the audio tracks proved to be one of the most important, yet overlooked, tasks in the production life-cycle. This step is often misunderstood by novice engineers, and can enhance the quality of the final product. The first assistive tool we present in this thesis takes multi-track audio sessions and uses semantic information to group and label them. The system can work with any collection of audio tracks, and can be embedded into a production environment. It was also apparent from the data that the minimisation of masking is a primary task of the mixing stage. We therefore present a tool which can automatically balance a mix by minimising the masking between separate audio tracks. Using evolutionary computing as a solver, the mix space can be searched effectively without the requirement for complex models to be trained on production data. The evaluation of these systems shows they are capable of producing a session structure similar to that of a real engineer. This provides a balance mix which is routed and pre-processed before creative mixing takes place, giving the engineer several steps completed for them, similar to the work of a junior engineer.

BCU Open Access

Advances in Architectural Acoustics

Author
Publication venue: MDPI AG
Publication date: 06/07/2022
Field of study

Satisfactory acoustics is crucial for the ability of spaces such as auditoriums and lecture rooms to perform their primary function. The acoustics of dwellings and offices greatly affects our quality of life, since we are all consciously or subconsciously aware of the sounds to which we are subjected daily. Architectural acoustics, which encompasses room and building acoustics, is the scientific field that deals with these topics and can be defined as the study of the generation, propagation, and effects of sound in enclosures. Modeling techniques, as well as related acoustic theories for accurately calculating the sound field, have been the center of many major new developments. In addition, the image conveyed by a purely physical description of sound would be incomplete without regard to human perception; hence, the interrelation between objective stimuli and subjective sensations is a field of important investigations. A holistic approach in terms of research and practice is the optimum way of solving the perplexing problems which arise in the design or refurbishment of spaces, since current trends in contemporary architecture, such as transparency, openness, and a preference for bare sound-reflecting surfaces, continue to push the very limits of functional acoustics. We hope that the advances in architectural acoustics gathered in this Special Issue inspire researchers and acousticians to explore new directions in this age of scientific convergence.

Directory of Open Access Books (DOAB)