
    Gaussian Framework for Interference Reduction in Live Recordings

    Here, typical live full-length music recordings are considered. In this scenario, some instrumental voices are captured by microphones intended for other voices, leading to so-called “interferences”. Reducing this phenomenon is desirable because it opens new possibilities for sound engineers, and it has also been shown to improve the performance of music analysis and processing tools (e.g. pitch tracking). In this work we propose a fast NMF-based algorithm to solve this problem.
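    The paper's specific algorithm is not reproduced here, but the following minimal sketch illustrates the general idea of NMF-based interference reduction: factor a microphone's magnitude spectrogram with plain multiplicative-update NMF, then keep only the components assigned to the target voice via a soft, Wiener-style mask. Component counts, iteration counts, and the component-assignment step are illustrative assumptions only.

```python
# Generic NMF-based interference reduction sketch (NumPy only); NOT the paper's algorithm.
import numpy as np

def nmf(V, n_components=8, n_iter=200, eps=1e-10):
    """Factorise a non-negative magnitude spectrogram V ~= W @ H."""
    n_freq, n_time = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_time)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius cost; they keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def suppress_interference(V_mic, target_components, n_components=8):
    """Keep only the chosen components in one microphone's magnitude spectrogram."""
    W, H = nmf(V_mic, n_components)
    V_target = W[:, target_components] @ H[target_components, :]
    mask = V_target / (W @ H + 1e-10)   # soft, Wiener-style mask in [0, 1]
    return mask * V_mic                 # interference-reduced magnitude
```

    In practice the target components would still have to be identified, for example by comparing their activations against the close microphone of the intended voice; that assignment step is outside this sketch.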

    Iterative decoding and equalization for 2-D recording channels

    A networking approach to sharing music studio resources

    This thesis investigates the extent to which networking technology can be used to provide remote workstation access to a pool of shared music studio resources. A pilot system is described in which MIDI messages, studio control data, and audio signals flow between the workstations and a studio server. A booking and timing facility avoids contention and allows for accurate reports of studio usage. The operation of the system has been evaluated in terms of its ability to satisfy three fundamental goals, namely remote, shared and centralized access to studio resources. Three essential network configurations have been identified, incorporating a mix of star and bus topologies, and their relative potential for satisfying the fundamental goals has been highlighted.
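    As a rough illustration of the booking and timing facility described above (not the thesis's actual design; class and method names are hypothetical), a server can avoid contention by rejecting overlapping reservations and can accumulate per-workstation time for usage reports:

```python
# Hypothetical booking facility sketch: reject overlapping slots, report usage per workstation.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Booking:
    workstation: str
    start: datetime
    end: datetime

class StudioBookingServer:
    def __init__(self):
        self.bookings: list[Booking] = []

    def request(self, workstation: str, start: datetime, end: datetime) -> bool:
        """Grant the slot only if it does not overlap an existing booking."""
        for b in self.bookings:
            if start < b.end and b.start < end:   # intervals overlap
                return False
        self.bookings.append(Booking(workstation, start, end))
        return True

    def usage_report(self) -> dict[str, timedelta]:
        """Accumulate booked time per workstation for accurate usage reports."""
        report: dict[str, timedelta] = {}
        for b in self.bookings:
            report[b.workstation] = report.get(b.workstation, timedelta()) + (b.end - b.start)
        return report
```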

    hpDJ: An automated DJ with floorshow feedback

    Many radio stations and nightclubs employ Disk-Jockeys (DJs) to provide a continuous uninterrupted stream or “mix” of dance music, built from a sequence of individual song-tracks. In the last decade, commercial pre-recorded compilation CDs of DJ mixes have become a growth market. DJs exercise skill in deciding an appropriate sequence of tracks and in mixing 'seamlessly' from one track to the next. Online access to large-scale archives of digitized music via automated music information retrieval systems offers users the possibility of discovering many songs they like, but the majority of consumers are unlikely to want to learn the DJ skills of sequencing and mixing. This paper describes hpDJ, an automatic method by which compilations of dance-music can be sequenced and seamlessly mixed by computer, with minimal user involvement. The user may specify a selection of tracks, and may give a qualitative indication of the type of mix required. The resultant mix can be presented as a continuous single digital audio file, whether for burning to CD, or for play-out from a personal playback device such as an iPod, or for play-out to rooms full of dancers in a nightclub. Results from an early version of this system have been tested on an audience of patrons in a London nightclub, with very favourable results. Subsequent to that experiment, we designed technologies which allow the hpDJ system to monitor the responses of crowds of dancers/listeners, so that hpDJ can dynamically react to those responses from the crowd. The initial intention was that hpDJ would monitor the crowd’s reaction to the song-track currently being played, and use that response to guide its selection of subsequent song-tracks in the mix. In that version, it is assumed that all the song-tracks exist in some archive or library of pre-recorded files. However, once reliable crowd-monitoring technology is available, it becomes possible to use the crowd-response data to dynamically “remix” existing song-tracks (i.e., alter the track in some way, tailoring it to the response of the crowd) and even to dynamically “compose” new song-tracks suited to that crowd. Thus, the music played by hpDJ to any particular crowd of listeners on any particular night becomes a direct function of that particular crowd’s particular responses on that particular night. On a different night, the same crowd of people might react in a different way, leading hpDJ to create different music. Thus, the music composed and played by hpDJ could be viewed as an “emergent” property of the dynamic interaction between the computer system and the crowd, and the crowd could then be viewed as having collectively collaborated on composing the music that was played on that night. This en masse collective composition raises some interesting legal issues regarding the ownership of the composition (i.e., who, exactly, is the author of the work?), but revenue-generating businesses can nevertheless plausibly be built from such technologies.
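    The paper does not specify how crowd response maps to track selection; purely as a toy illustration of the feedback idea, the sketch below scores candidate tracks by their distance in tempo and energy from the currently playing track, moving towards similar tracks when the response is positive and away from them when it is negative. The feature names and scoring rule are assumptions for illustration only.

```python
# Toy crowd-feedback track selection; feature names and scoring rule are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Track:
    title: str
    tempo: float    # beats per minute
    energy: float   # normalised 0..1

def pick_next(current: Track, candidates: list[Track], crowd_response: float) -> Track:
    """crowd_response in [-1, 1]: +1 means the dance floor loved the current track."""
    def distance(t: Track) -> float:
        return abs(t.tempo - current.tempo) / 40.0 + abs(t.energy - current.energy)

    # Positive response: prefer small distance (keep the vibe).
    # Negative response: prefer large distance (change direction).
    return min(candidates, key=lambda t: crowd_response * distance(t))

# Example: a warm reception keeps the mix close to 126 BPM / high energy.
now = Track("A", 126.0, 0.8)
pool = [Track("B", 128.0, 0.85), Track("C", 100.0, 0.4)]
print(pick_next(now, pool, crowd_response=0.9).title)   # -> "B"
```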

    Systematic evaluation of perceived spatial quality

    The evaluation of perceived spatial quality calls for a method that is sensitive to changes in the constituent dimensions of that quality. In order to devise a method accounting for these changes, several processes have to be performed. This paper describes the development of scales by elicitation and structuring of verbal data, followed by validation of the resulting attribute scales.

    Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds

    Humans can robustly recognize and localize objects by integrating visual and auditory cues. While machines are now able to do the same with images, less work has been done with sounds. This work develops an approach for dense semantic labelling of sound-making objects, purely based on binaural sounds. We propose a novel sensor setup and record a new audio-visual dataset of street scenes with eight professional binaural microphones and a 360-degree camera. The co-existence of visual and audio cues is leveraged for supervision transfer. In particular, we employ a cross-modal distillation framework that consists of a vision 'teacher' method and a sound 'student' method -- the student method is trained to generate the same results as the teacher method. This way, the auditory system can be trained without using human annotations. We also propose two auxiliary tasks, namely a) a novel task on Spatial Sound Super-resolution to increase the spatial resolution of sounds, and b) dense depth prediction of the scene. We then formulate the three tasks into one end-to-end trainable multi-tasking network aiming to boost the overall performance. Experimental results on the dataset show that 1) our method achieves promising results for semantic prediction and the two auxiliary tasks; 2) the three tasks are mutually beneficial -- training them together achieves the best performance; and 3) the number and orientations of microphones are both important. The data and code will be released to facilitate research in this new direction. Project page: https://www.trace.ethz.ch/publications/2020/sound_perception/index.htm
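    As a minimal sketch of the cross-modal distillation objective described above, assuming the vision teacher outputs per-pixel class probabilities and the sound student outputs raw logits (the paper's architectures and loss weighting are not reproduced), the student can be trained to minimise a per-pixel KL divergence to the teacher:

```python
# Minimal cross-modal distillation loss sketch (NumPy); teacher probabilities supervise the student.
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_probs, eps=1e-12):
    """Mean KL(teacher || student) over all pixels.

    student_logits: (H, W, C) raw scores predicted from the binaural audio.
    teacher_probs:  (H, W, C) class probabilities from the vision teacher.
    """
    student_probs = softmax(student_logits)
    kl = teacher_probs * (np.log(teacher_probs + eps) - np.log(student_probs + eps))
    return kl.sum(axis=-1).mean()

# Toy check: identical distributions give a (near-)zero loss.
t = softmax(np.random.randn(4, 4, 3))
print(distillation_loss(np.log(t), t))   # ~0.0
```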

    Iterative detection and decoding for separable two-dimensional intersymbol interference

    Music Visualization Using Source Separated Stereophonic Music

    This thesis introduces a music visualization system for stereophonic source-separated music. Music visualization systems are a popular way to represent information from audio signals through computer graphics. Visualization can help people better understand music and its complex, interacting elements. This music visualization system extracts pitch, panning, and loudness features from source-separated audio files to create the visuals. Most state-of-the-art visualization systems develop their visual representation of the music either from the fully mixed final song recording, where all of the instruments and vocals are combined into one file, or from the digital audio workstation (DAW) data containing multiple independent recordings of individual audio sources. Original source recordings are not always readily available to the public, so music source separation (MSS) can be used to obtain estimated versions of the audio source files. This thesis surveys different approaches to MSS and music visualization, and introduces a new music visualization system specifically for source-separated music.
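    A small sketch of the kind of per-stem feature extraction described above is given below: frame-wise loudness (RMS in dB) and a simple panning index computed from a stereo source-separated stem. The frame sizes and panning formula are illustrative assumptions rather than the thesis's exact pipeline, and pitch would additionally require a dedicated estimator.

```python
# Per-stem loudness and panning features from a stereo stem; parameters are illustrative assumptions.
import numpy as np

def frame(signal, frame_len=2048, hop=1024):
    n = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n)])

def loudness_db(mono, frame_len=2048, hop=1024, eps=1e-12):
    """Frame-wise RMS loudness in decibels."""
    rms = np.sqrt((frame(mono, frame_len, hop) ** 2).mean(axis=1))
    return 20.0 * np.log10(rms + eps)

def panning_index(left, right, frame_len=2048, hop=1024, eps=1e-12):
    """-1 = fully left, 0 = centre, +1 = fully right, per frame."""
    el = (frame(left, frame_len, hop) ** 2).sum(axis=1)
    er = (frame(right, frame_len, hop) ** 2).sum(axis=1)
    return (er - el) / (er + el + eps)

# Example on a synthetic stereo stem panned towards the right channel.
t = np.linspace(0, 1, 44100, endpoint=False)
sine = np.sin(2 * np.pi * 440 * t)
print(panning_index(0.3 * sine, 0.9 * sine).mean())   # ~ +0.8 (panned right)
```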