Estimating Single-Channel Source Separation Masks: Relevance Vector Machine Classifiers vs. Pitch-Based Masking
Audio sources frequently concentrate much of their energy into a relatively small proportion of the available time-frequency cells in a short-time Fourier transform (STFT). This sparsity makes it possible to separate sources, to some degree, simply by selecting STFT cells dominated by the desired source, setting all others to zero (or to an estimate of the obscured target value), and inverting the STFT to a waveform. The problem of source separation then becomes one of identifying the cells containing good target information. We treat this as a classification problem, and train a Relevance Vector Machine (a probabilistic relative of the Support Vector Machine) to perform this task. We compare this classifier both against SVMs (which achieve similar accuracy but are less efficient than RVMs), and against a traditional Computational Auditory Scene Analysis (CASA) technique based on a noise-robust pitch tracker, which the RVM outperforms significantly. Differences between the RVM- and pitch-tracker-based mask estimates suggest benefits could be obtained by combining both approaches.
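The select-and-invert pipeline described above can be sketched as follows. This is a minimal illustration using an oracle mask on two synthetic tones, not the RVM classifier from the paper; the sample rate, frequencies, and STFT parameters are all invented for the demo.

```python
# Sketch of mask-based source separation in the STFT domain, assuming the
# binary mask has already been estimated (here by an oracle that compares
# the isolated sources; in the paper, by an RVM classifier).
import numpy as np
from scipy.signal import stft, istft

def separate_with_mask(mixture, mask, fs=16000, nperseg=512):
    """Apply a binary time-frequency mask and resynthesize a waveform."""
    _, _, X = stft(mixture, fs=fs, nperseg=nperseg)
    assert mask.shape == X.shape
    X_masked = X * mask  # keep cells dominated by the target, zero the rest
    _, y = istft(X_masked, fs=fs, nperseg=nperseg)
    return y

# Toy demo: two well-separated tones standing in for target and interferer.
fs = 16000
time = np.arange(fs) / fs
target = np.sin(2 * np.pi * 440 * time)       # hypothetical target source
interferer = np.sin(2 * np.pi * 1320 * time)  # hypothetical interfering source

# "Oracle" mask: keep STFT cells where the target dominates the interferer.
_, _, T = stft(target, fs=fs, nperseg=512)
_, _, I = stft(interferer, fs=fs, nperseg=512)
oracle_mask = (np.abs(T) > np.abs(I)).astype(float)

estimate = separate_with_mask(target + interferer, oracle_mask, fs=fs)
```

Because the two tones occupy disjoint STFT bins, the masked resynthesis recovers the target with far less interferer energy than the raw mixture contains.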
A variational EM algorithm for learning eigenvoice parameters in mixed signals
We derive an efficient learning algorithm for model-based source separation for use on single channel speech mixtures where the precise source characteristics are not known a priori. The sources are modeled using factor-analyzed hidden Markov models (HMMs) where source specific characteristics are captured by an "eigenvoice" speaker subspace model. The proposed algorithm is able to learn adaptation parameters for two speech sources when only a mixture of the signals is observed. We evaluate the algorithm on the 2006 Speech Separation Challenge data set and show that it is significantly faster than our earlier system, at a small cost in performance.
Learning, Using, and Adapting Models in Scene Analysis
Discusses models of source behavior as a way to overcome the uncertainty inherent in mixtures.
Monaural speech separation using source-adapted models
We propose a model-based source separation system for use on single channel speech mixtures where the precise source characteristics are not known a priori. We do this by representing the space of source variation with a parametric signal model based on the eigenvoice technique for rapid speaker adaptation. We present an algorithm to infer the characteristics of the sources present in a mixture, allowing for significantly improved separation performance over that obtained using unadapted source models. The algorithm is evaluated on the task defined in the 2006 Speech Separation Challenge [1] and compared with separation using source-dependent models.
Combining Localization Cues and Source Model Constraints for Binaural Source Separation
We describe a system for separating multiple sources from a two-channel recording based on interaural cues and prior knowledge of the statistics of the underlying source signals. The proposed algorithm effectively combines information derived from low level perceptual cues, similar to those used by the human auditory system, with higher level information related to speaker identity. We combine a probabilistic model of the observed interaural level and phase differences with a prior model of the source statistics and derive an EM algorithm for finding the maximum likelihood parameters of the joint model. The system is able to separate more sound sources than there are observed channels in the presence of reverberation. In simulated mixtures of speech from two and three speakers, the proposed algorithm gives a signal-to-noise ratio improvement of 1.7 dB over a baseline algorithm which uses only interaural cues. Further improvement is obtained by incorporating eigenvoice speaker adaptation to enable the source model to better match the sources present in the signal. This improves performance over the baseline by 2.7 dB when the speakers used for training and testing are matched. However, the improvement is minimal when the test data is very different from that used in training.
Source Separation Based on Binaural Cues and Source Model Constraints
We describe a system for separating multiple sources from a two-channel recording based on interaural cues and known characteristics of the source signals. We combine a probabilistic model of the observed interaural level and phase differences with a prior model of the source statistics and derive an EM algorithm for finding the maximum likelihood parameters of the joint model. The system is able to separate more sound sources than there are observed channels. In simulated reverberant mixtures of three speakers, the proposed algorithm gives a signal-to-noise ratio improvement of 2.1 dB over a baseline algorithm using only interaural cues.
CNN Architectures for Large-Scale Audio Classification
Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio. We use various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels. We examine fully connected Deep Neural Networks (DNNs), AlexNet [1], VGG [2], Inception [3], and ResNet [4]. We investigate varying the size of both the training set and the label vocabulary, finding that analogs of the CNNs used in image classification do well on our audio classification task, and that larger training and label sets help up to a point. A model using embeddings from these classifiers does much better than raw features on the Audio Set [5] Acoustic Event Detection (AED) classification task.

Comment: Accepted for publication at ICASSP 2017.
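The embeddings-versus-raw-features idea from the abstract can be illustrated with a toy numpy sketch. The random ReLU projection below merely stands in for a trained CNN's penultimate layer, and the data, dimensions, and linear classifier are all invented for illustration.

```python
# Toy sketch: frame-level "embeddings" from a stand-in network are
# average-pooled per clip and fed to a shallow linear classifier.
import numpy as np

rng = np.random.default_rng(0)
n_clips, n_frames, n_mel, n_emb, n_classes = 200, 10, 64, 128, 3

# Synthetic log-mel "spectrograms": class k gets a class-specific offset.
labels = rng.integers(0, n_classes, size=n_clips)
X = rng.normal(size=(n_clips, n_frames, n_mel)) + labels[:, None, None]

# Stand-in for a pretrained CNN embedding layer: random projection + ReLU.
W_emb = rng.normal(size=(n_mel, n_emb)) / np.sqrt(n_mel)
emb = np.maximum(X @ W_emb, 0.0)       # frame-level embeddings
clip_features = emb.mean(axis=1)       # average pooling over time

# One-vs-rest least-squares linear classifier on the pooled embeddings.
Y = np.eye(n_classes)[labels]
W_clf, *_ = np.linalg.lstsq(clip_features, Y, rcond=None)
pred = (clip_features @ W_clf).argmax(axis=1)
train_acc = float((pred == labels).mean())
```

The point of the design is the pipeline shape, not the numbers: a fixed feature extractor turns variable-length audio into pooled clip-level vectors, so only the small classifier on top needs task-specific training.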
Clustering beat-chroma patterns in a large music database
A musical style or genre implies a set of common conventions and patterns combined and deployed in different ways to make individual musical pieces; for instance, most would agree that contemporary pop music is assembled from a relatively small palette of harmonic and melodic patterns. The purpose of this paper is to use a database of tens of thousands of songs in combination with a compact representation of melodic-harmonic content (the beat-synchronous chromagram) and data-mining tools (clustering) to attempt to explicitly catalog this palette, at least within the limitations of the beat-chroma representation. We use online k-means clustering to summarize 3.7 million 4-beat bars in a codebook of a few hundred prototypes. By measuring how accurately such a quantized codebook can reconstruct the original data, we can quantify the degree of diversity (distortion as a function of codebook size) and temporal structure (i.e. the advantage gained by jointly quantizing multiple frames) in this music. The most popular codewords themselves reveal the common chords used in the music. Finally, the quantized representation of music can be used for music retrieval tasks such as artist and genre classification, and identifying songs that are similar in terms of their melodic-harmonic content.
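The codebook-and-distortion measurement described above can be sketched on toy data. The online k-means below and the synthetic "beat-chroma" patches are illustrative assumptions, not the paper's 3.7-million-bar pipeline.

```python
# Sketch: online (one-sample-at-a-time) k-means over 4-beat x 12-bin
# chroma patches, then distortion as a function of codebook size.
import numpy as np

rng = np.random.default_rng(1)

def online_kmeans(patches, k, n_passes=5):
    """Online k-means: nearest-center assignment with a running-mean update."""
    codebook = patches[rng.choice(len(patches), k, replace=False)].copy()
    counts = np.ones(k)
    for _ in range(n_passes):
        for x in patches[rng.permutation(len(patches))]:
            j = int(np.argmin(((codebook - x) ** 2).sum(axis=1)))
            counts[j] += 1
            codebook[j] += (x - codebook[j]) / counts[j]
    return codebook

def distortion(patches, codebook):
    """Mean squared error of quantizing each patch to its nearest codeword."""
    d = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return float(d.min(axis=1).mean())

# Toy "beat-chroma" data: 4 beats x 12 chroma bins (48-dim), drawn as
# noisy copies of a few underlying prototype patterns.
protos = rng.random((8, 48))
patches = protos[rng.integers(0, 8, 2000)] + 0.05 * rng.normal(size=(2000, 48))

d_small = distortion(patches, online_kmeans(patches, k=2))
d_large = distortion(patches, online_kmeans(patches, k=16))
```

As in the paper, plotting distortion against codebook size would quantify the diversity of the data: a larger codebook reconstructs the patches more accurately, with diminishing returns once the true number of recurring patterns is covered.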
Impact of EMA regulatory label changes on systemic diclofenac initiation, discontinuation, and switching to other pain medicines in Scotland, England, Denmark, and The Netherlands
Purpose: In June 2013 a European Medicines Agency referral procedure concluded that diclofenac was associated with an elevated risk of acute cardiovascular events, and contraindications, warnings, and changes to the product information were implemented across the European Union. This study measured the impact of the regulatory action on the prescribing of systemic diclofenac in Denmark, The Netherlands, England, and Scotland. Methods: Quarterly time series analyses measuring diclofenac prescription initiation, discontinuation, and switching to other systemic nonsteroidal anti-inflammatory drugs (NSAIDs), topical NSAIDs, paracetamol, opioids, and other chronic pain medication in those who discontinued diclofenac. Absolute effects were estimated using interrupted time series regression. Results: Overall, diclofenac prescription initiations fell during the observation periods of all countries. Compared with Denmark, where there appeared to be a more limited effect, the regulatory action was associated with significant immediate reductions in diclofenac initiation in The Netherlands (−0.42%, 95% CI, −0.66% to −0.18%), England (−0.09%, 95% CI, −0.11% to −0.08%), and Scotland (−0.67%, 95% CI, −0.79% to −0.55%); and falling trends in diclofenac initiation in The Netherlands (−0.03%, 95% CI, −0.06% to −0.01% per quarter) and Scotland (−0.04%, 95% CI, −0.05% to −0.02% per quarter). There was no significant impact on diclofenac discontinuation in any country. The regulatory action was associated with modest differences in switching to other pain medicines following diclofenac discontinuation. Conclusions: The regulatory action was associated with significant reductions in overall diclofenac initiation which varied by country and type of exposure. There was no impact on discontinuation and variable impact on switching.
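The interrupted (segmented) time-series regression used in such studies can be sketched as follows: the outcome is regressed on a pre-intervention trend, a post-intervention level-shift indicator, and a post-intervention slope-change term. The quarterly data below are simulated, and the coefficients bear no relation to the study's actual estimates.

```python
# Sketch of an interrupted time series regression: quarterly initiation
# rates with a level shift and slope change at the intervention quarter.
import numpy as np

rng = np.random.default_rng(2)
n_q = 24
t = np.arange(n_q)                   # quarter index
post = (t >= 12).astype(float)       # 1 after the (hypothetical) intervention
t_post = post * (t - 12)             # quarters elapsed since the intervention

# Simulated initiation rate (%): gentle pre-trend, then an immediate drop
# and a steeper downward slope after the intervention, plus noise.
y = 5.0 - 0.02 * t - 0.6 * post - 0.04 * t_post + rng.normal(0, 0.05, n_q)

# Ordinary least squares on [intercept, trend, level shift, slope change].
X = np.column_stack([np.ones(n_q), t, post, t_post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
level_change, slope_change = float(beta[2]), float(beta[3])
```

The "immediate reductions" reported per country correspond to the level-shift coefficient, and the "falling trends" to the slope-change coefficient; a full analysis would additionally model autocorrelation and seasonality.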
Why have asset price properties changed so little in 200 years
We first review empirical evidence that asset prices have had episodes of large fluctuations and been inefficient for at least 200 years. We briefly review recent theoretical results, as well as the neurological basis of trend following, and finally argue that these asset price properties can be attributed to two fundamental mechanisms that have not changed for many centuries: an innate preference for trend following, and the collective tendency to exploit as much as possible any detectable price arbitrage, which leads to destabilizing feedback loops.