Real-time Sound Source Separation For Music Applications
Sound source separation refers to the task of extracting individual sound sources from some number of mixtures of those sound sources. In this thesis, a novel sound source separation algorithm for musical applications is presented. It leverages the fact that the vast majority of commercially recorded music since the 1950s has been mixed down for two-channel reproduction, more commonly known as stereo. The algorithm, presented in Chapter 3 of this thesis, requires no prior knowledge or learning and performs the task of separation based purely on azimuth discrimination within the stereo field. The algorithm exploits the use of the pan pot as a means to achieve image localisation within stereophonic recordings. As such, only an interaural intensity difference exists between left and right channels for a single source. We use gain scaling and phase cancellation techniques to expose frequency-dependent nulls across the azimuth domain, from which source separation and resynthesis are carried out. The algorithm is demonstrated not only to be state of the art in the field of sound source separation but also to be a useful pre-processing step for other tasks such as music segmentation and surround sound upmixing.
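As a rough illustration of the gain scaling and phase cancellation idea described in this abstract, the Python sketch below scans a range of gains for each frequency bin and records the gain at which the left channel cancels against the scaled right channel. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function name `azimuth_nulls`, the `steps` resolution, and the two synthetic sources are all hypothetical, and each source is assumed to occupy its own bin and to differ between channels only in intensity.

```python
import cmath


def azimuth_nulls(left, right, steps=100):
    """For each frequency bin, return the gain g in [0, 1] that minimises |L - g*R|.

    `left` and `right` are sequences of complex spectral values for one frame.
    The gain grid resolution `steps` is an illustrative assumption.
    """
    gains = [i / steps for i in range(steps + 1)]
    nulls = []
    for l_bin, r_bin in zip(left, right):
        # Scale the right channel by each candidate gain and subtract it
        # from the left channel; the residual magnitude dips where they cancel.
        magnitudes = [abs(l_bin - g * r_bin) for g in gains]
        best = min(range(len(gains)), key=magnitudes.__getitem__)
        nulls.append(gains[best])
    return nulls


# Two hypothetical sources, one per frequency bin, panned by intensity only:
# each left-channel value is the right-channel value times a pan ratio.
src_a = cmath.rect(1.0, 0.3)
src_b = cmath.rect(0.8, -1.1)
right = [src_a, src_b]
left = [0.25 * src_a, 0.75 * src_b]

nulls = azimuth_nulls(left, right)
print(nulls)
```

In this toy case the null for each bin falls at the pan ratio used to build it, which is the cue that lets bins be grouped by azimuth. Real recordings have overlapping sources in each bin, so the nulls there are less clean than in this sketch.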
An Adaptive Strategy for Sensory Processing
Recognizing objects and detecting associations among them is essential for the survival of organisms. The ability to perform these tasks is derived from the representations of objects obtained through processing information along sensory pathways. Our current understanding of sensory processing is based on two sets of foundational theories: the Efficient Coding Hypothesis and the hierarchical assembly of object representations. These theories suggest that sensory processing aims to identify independent features of the environment and progressively represent objects in terms of comprehensive combinations of these features. Separately, the two sets of theories have successfully explained the detection of associations and perceptual invariance, respectively; however, reconciling them in one unified theory has remained challenging. The Efficient Coding Hypothesis deems independent features essential for detecting associations, but to achieve consistency in representations, multiple comprehensive structures corresponding to the same object must be hierarchically assembled, ignoring independence among such structures.
Here we propose an alternative framework for sensory processing in which the system, instead of finding the truly independent components of the environment, aims to represent objects based on their most informative structures. Using theoretical arguments, we show that following such a strategy allows the system to efficiently represent sensory cues without necessarily acquiring knowledge about statistical properties of all possible inputs. Through mathematical simulations, we find that the framework can describe the known characteristics of early sensory processing stages and permits consistent input representations observed at later stages of processing. We also demonstrate that the framework can be implemented in a biologically plausible neuronal circuit and explain aspects of experience and learning from corrupted inputs. Thus, this framework provides a novel perspective and a unified description of sensory processing in its entirety.