Modulation-frequency acts as a primary cue for auditory stream segregation
In the acoustic world around us, sounds are produced by different sources and interfere with each other before reaching the ears. A key function of the auditory system is to provide consistent and robust descriptions of the coherent sound groupings and sequences (auditory objects), which likely correspond to the various sound sources in the environment. This function has been termed auditory stream segregation. In the current study we tested the effects of separation in the frequency of amplitude modulation on the segregation of concurrent sound sequences in the auditory stream-segregation paradigm (van Noorden 1975). The aim of the study was to assess 1) whether differential amplitude modulation would help in separating concurrent sound sequences and 2) whether this cue would interact with previously studied static cues (carrier frequency and location difference) in segregating concurrent streams of sound. We found that amplitude modulation difference is used as a primary cue for stream segregation and that it interacts with other primary cues such as frequency and location difference.
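As a rough illustration of the kind of stimulus this abstract describes, the sketch below builds two tones that share a carrier frequency but differ in amplitude-modulation rate, then alternates them in a sequence. The function names, the sinusoidal modulator, the simple A-B alternation, and all numeric values are assumptions made for illustration; the abstract does not specify the stimulus parameters.

```python
import math

def am_tone(carrier_hz, mod_hz, duration_s, sample_rate=16000, depth=1.0):
    """Return samples of a sine carrier with sinusoidal amplitude modulation.

    The envelope is 1 + depth * sin(2*pi*mod_hz*t); dividing by (1 + depth)
    keeps every sample within [-1, 1].
    """
    n = int(round(duration_s * sample_rate))
    samples = []
    for i in range(n):
        t = i / sample_rate
        envelope = 1.0 + depth * math.sin(2.0 * math.pi * mod_hz * t)
        carrier = math.sin(2.0 * math.pi * carrier_hz * t)
        samples.append(envelope * carrier / (1.0 + depth))
    return samples

def alternating_sequence(tone_a, tone_b, gap_samples, repeats):
    """Concatenate A, gap, B, gap, ... for the given number of A-B pairs."""
    gap = [0.0] * gap_samples
    sequence = []
    for _ in range(repeats):
        sequence.extend(tone_a)
        sequence.extend(gap)
        sequence.extend(tone_b)
        sequence.extend(gap)
    return sequence

# Hypothetical values: same 1 kHz carrier, modulation rates of 100 Hz and 200 Hz.
tone_a = am_tone(1000.0, 100.0, 0.1)
tone_b = am_tone(1000.0, 200.0, 0.1)
stimulus = alternating_sequence(tone_a, tone_b, gap_samples=160, repeats=5)
```

Because the two tones differ only in modulation rate, any difference in how the sequence is heard can be attributed to that cue rather than to carrier frequency.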
Different roles of similarity and predictability in auditory stream segregation
Sound sources often emit trains of discrete sounds, such as a series of footsteps. Previously, two different principles have been suggested for how the human auditory system binds discrete sounds together into perceptual units. The feature similarity principle is based on linking sounds with similar characteristics over time. The predictability principle is based on linking sounds that follow each other in a predictable manner. The present study compared the effects of these two principles. Participants were presented with tone sequences and instructed to continuously indicate whether they perceived a single coherent sequence or two concurrent streams of sound. We investigated the influence of separate manipulations of similarity and predictability on these perceptual reports. Both grouping principles affected perception of the tone sequences, albeit with different characteristics. In particular, results suggest that whereas predictability is only analyzed for the currently perceived sound organization, feature similarity is also analyzed for alternative groupings of sound. Moreover, changing similarity or predictability within an ongoing sound sequence led to markedly different dynamic effects. Taken together, these results provide evidence for different roles of similarity and predictability in auditory scene analysis, suggesting that forming auditory stream representations and competition between alternatives rely on partly different processes.
On the Informed Source Separation Approach for Interactive Remixing in Stereo
Informed source separation (ISS) has become a popular trend in the audio signal processing community over the past few years. Its purpose is to decompose a mixture signal into its constituent parts at the desired or the best possible quality level given some metadata. In this paper we present a comparison between two ISS systems and relate the ISS approach in various configurations with conventional coding of separate tracks for interactive remixing in stereo. The compared systems are Underdetermined Source Signal Recovery (USSR) and Enhanced Audio Object Separation (EAOS). The latter forms a part of MPEG's Spatial Audio Object Coding technology. The performance is evaluated using objective difference grades computed with PEMO-Q. The results suggest that USSR performs perceptually better than EAOS and has a lower computational complexity.
Pitch-Informed Solo and Accompaniment Separation
This thesis addresses the development of a system for pitch-informed solo
and accompaniment separation capable of separating main instruments from
music accompaniment regardless of the musical genre of the track, or type
of music accompaniment. For the solo instrument, only pitched monophonic
instruments were considered in a single-channel scenario where no panning
or spatial location information is available.
In the proposed method, pitch information is used as an initial stage of a
sinusoidal modeling approach that attempts to estimate the spectral
information of the solo instrument from a given audio mixture. Instead of
estimating the solo instrument on a frame by frame basis, the proposed
method gathers information on tone objects to perform separation.
Tone-based processing allowed the inclusion of novel processing stages for
attack refinement, transient interference reduction, common amplitude
modulation (CAM) of tone objects, and for better estimation of non-harmonic
elements that can occur in musical instrument tones. The proposed solo and
accompaniment algorithm is efficient enough for real-time processing, which
makes it suitable for real-world applications.
A study was conducted to better model magnitude, frequency, and phase of
isolated musical instrument tones. As a result of this study, temporal
envelope smoothness, inharmonicity of musical instruments, and phase
expectation were exploited in the proposed separation method. Additionally,
an algorithm for harmonic/percussive separation based on phase expectation
was proposed. The algorithm shows improved perceptual quality with respect
to state-of-the-art methods for harmonic/percussive separation.
The proposed solo and accompaniment method obtained perceptual quality
scores comparable to other state-of-the-art algorithms under the SiSEC 2011
and SiSEC 2013 campaigns, and outperformed the comparison algorithm on the
instrumental dataset described in this thesis. As a use case of solo and
accompaniment separation, a listening test was conducted to
assess separation quality requirements in the context of music education.
Results from the listening test showed that solo and accompaniment tracks
should be optimized differently to suit quality requirements of music
education. The Songs2See application was presented as commercial music
learning software which includes the proposed solo and accompaniment
separation method.
Informed Separation of Spatial Images of Stereo Music Recordings Using Second-Order Statistics
In this work we address a reverse audio engineering problem, i.e. the separation of stereo tracks of professionally produced music recordings. More precisely, we apply a spatial filtering approach with a quadratic constraint using an explicit source-image-mixture model. The model parameters are "learned" from a given set of original stereo tracks, reduced in size, and used afterwards to demix the desired tracks at the best possible quality from a preexisting mixture. Our approach requires a side-information rate of 10 kbps per source or channel and has a low computational complexity. The results obtained for the SiSEC 2013 dataset are intended to be used as a reference for comparison with unpublished approaches.
Informed Source Separation from compressed mixtures using spatial Wiener filter and quantization noise estimation
In a previous work, we proposed an Informed Source Separation system based on Wiener filtering for active listening of music from uncompressed (16-bit PCM) multichannel mix signals. In the present work, the system is improved to work with (MPEG-2 AAC) compressed mix signals: quantization noise is estimated from the AAC bitstream at the decoder and explicitly taken into account in the source separation process. Also, a direct MDCT-to-STFT transform is used to optimize the computational efficiency of the process in the STFT domain from AAC-decoded MDCT coefficients.
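The idea of accounting for quantization noise in a Wiener filter can be sketched for a single time-frequency bin. This is a simplified single-channel illustration, not the spatial (multichannel) filter of the paper: it uses the textbook Wiener gain, in which each source's share of the mixture equals its power divided by the total power of all sources plus noise. The function names and numeric values are hypothetical.

```python
def wiener_gains(source_powers, noise_power=0.0):
    """Per-source Wiener gains for one time-frequency bin.

    source_powers: assumed power of each source in this bin (side information).
    noise_power: assumed estimate of the codec's quantization noise power.
    """
    total = sum(source_powers) + noise_power
    if total <= 0.0:
        # Nothing to distribute: silence in every source.
        return [0.0 for _ in source_powers]
    return [p / total for p in source_powers]

def separate_bin(mix_coeff, source_powers, noise_power=0.0):
    """Scale one (possibly complex) mixture coefficient by each source's gain."""
    return [g * mix_coeff for g in wiener_gains(source_powers, noise_power)]

# Hypothetical bin: two sources with powers 3 and 1, mixture coefficient 2+0j.
clean = separate_bin(complex(2.0, 0.0), [3.0, 1.0])
noisy = separate_bin(complex(2.0, 0.0), [3.0, 1.0], noise_power=4.0)
```

With a nonzero noise estimate the gains no longer sum to one, so the part of the mixture attributed to quantization noise is left out of every source estimate instead of being distributed among them.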
Serial dependence in a simulated clinical visual search task.
In everyday life, we continuously search for and classify objects in the environment around us. This kind of visual search is extremely important when performed by radiologists in cancer image interpretation and officers in airport security screening. During these tasks, observers often examine large numbers of uncorrelated images (tumor x-rays, checkpoint x-rays, etc.) one after another. An underlying assumption of such tasks is that search and recognition are independent of our past experience. Here, we simulated a visual search task reminiscent of medical image search and found that shape classification performance was strongly impaired by recent visual experience, biasing classification errors 7% more towards the previous image content. This perceptual attraction exhibited the three main tuning characteristics of Continuity Fields: serial dependence extended over 12 seconds back in time (temporal tuning), it occurred only between similar tumor-like shapes (feature tuning), and only within a limited spatial region (spatial tuning). Taken together, these results demonstrate that serial dependence influences shape perception and occurs in visual search tasks. They also raise the possibility of a detrimental impact of serial dependence in clinical and practically relevant settings, such as medical image perception.