Deep clustering: Discriminative embeddings for segmentation and separation
We address the problem of acoustic source separation in a deep learning
framework we call "deep clustering." Rather than directly estimating signals or
masking functions, we train a deep network to produce spectrogram embeddings
that are discriminative for partition labels given in training data. Previous
deep network approaches provide great advantages in terms of learning power and
speed, but previously it has been unclear how to use them to separate signals
in a class-independent way. In contrast, spectral clustering approaches are
flexible with respect to the classes and number of items to be segmented, but
it has been unclear how to leverage the learning power and speed of deep
networks. To obtain the best of both worlds, we use an objective function
to train embeddings that yield a low-rank approximation to an ideal pairwise
affinity matrix, in a class-independent way. This avoids the high cost of
spectral factorization and instead produces compact clusters that are amenable
to simple clustering methods. The segmentations are therefore implicitly
encoded in the embeddings, and can be "decoded" by clustering. Preliminary
experiments show that the proposed method can separate speech: when trained on
spectrogram features containing mixtures of two speakers, and tested on
mixtures of a held-out set of speakers, it can infer masking functions that
improve signal quality by around 6dB. We show that the model can generalize to
three-speaker mixtures despite training only on two-speaker mixtures. The
framework can be used without class labels, and therefore has the potential to
be trained on a diverse set of sound types, and to generalize to novel sources.
We hope that future work will lead to segmentation of arbitrary sounds, with
extensions to microphone array methods as well as image segmentation and other
domains.
Comment: Originally submitted on June 5, 201
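The objective described in the abstract above can be illustrated with a small sketch. It assumes the loss is the squared Frobenius distance between the embedding affinity V V^T and the ideal affinity Y Y^T, which the abstract does not spell out. Here V holds one embedding per time-frequency bin and Y holds one-hot partition labels. The function names and the toy data are hypothetical. The expansion into three small Gram matrices avoids ever forming the N x N affinity matrices.

```python
# Sketch only: assumes the loss ||V V^T - Y Y^T||_F^2 (not stated in the abstract).
# V: N x D embeddings, Y: N x C one-hot labels, both as lists of rows.

def gram(a, b):
    """Return a^T b for row-major matrices a (N x D) and b (N x C)."""
    d, c = len(a[0]), len(b[0])
    out = [[0.0] * c for _ in range(d)]
    for row_a, row_b in zip(a, b):
        for i in range(d):
            for j in range(c):
                out[i][j] += row_a[i] * row_b[j]
    return out


def sq_frobenius(m):
    """Squared Frobenius norm of a matrix given as a list of rows."""
    return sum(x * x for row in m for x in row)


def affinity_loss(v, y):
    """Compute ||V V^T - Y Y^T||_F^2 without building any N x N matrix.

    Uses the identity
    ||V^T V||_F^2 - 2 ||V^T Y||_F^2 + ||Y^T Y||_F^2.
    """
    return (sq_frobenius(gram(v, v))
            - 2.0 * sq_frobenius(gram(v, y))
            + sq_frobenius(gram(y, y)))


# Toy example: three bins, two sources (hypothetical data).
labels = [[1, 0], [1, 0], [0, 1]]
perfect = [[1, 0], [1, 0], [0, 1]]   # embeddings that match the partition
wrong = [[1, 0], [0, 1], [0, 1]]     # second bin grouped with the wrong source

print(affinity_loss(perfect, labels))  # 0.0
print(affinity_loss(wrong, labels))    # 4.0
```

The cost of this form grows with N times D times C rather than with N squared, which is one way to read the abstract's remark about avoiding expensive factorization of the full affinity matrix.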
Deep Clustering and Conventional Networks for Music Separation: Stronger Together
Deep clustering is the first method to handle general audio separation
scenarios with multiple sources of the same type and an arbitrary number of
sources, performing impressively in speaker-independent speech separation
tasks. However, little is known about its effectiveness in other challenging
situations such as music source separation. Contrary to conventional networks
that directly estimate the source signals, deep clustering generates an
embedding for each time-frequency bin, and separates sources by clustering the
bins in the embedding space. We show that deep clustering outperforms
conventional networks on a singing voice separation task, in both matched and
mismatched conditions, even though conventional networks have the advantage of
end-to-end training for best signal approximation, presumably because deep clustering's more
flexible objective engenders better regularization. Since the strengths of deep
clustering and conventional network architectures appear complementary, we
explore combining them in a single hybrid network trained via an approach akin
to multi-task learning. Remarkably, the combination significantly outperforms
either of its components.
Comment: Published in ICASSP 201
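The separation step mentioned in the abstract above, clustering time-frequency bins in the embedding space, can be sketched as follows. This is a minimal illustration that assumes plain k-means (Lloyd iterations) with caller-supplied initial centers and binary masks. The abstract does not say which clustering method or mask type is used, and the function names and toy embeddings are hypothetical.

```python
# Sketch only: k-means is an assumed choice of clustering method.

def kmeans_labels(points, centers, iters=10):
    """Run Lloyd iterations and return one cluster index per point."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centers[k])))
            for pt in points
        ]
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def binary_masks(labels, n_sources):
    """Turn per-bin cluster labels into one 0/1 mask per source."""
    return [[1 if lab == k else 0 for lab in labels] for k in range(n_sources)]


# Toy example: four bins with 2-D embeddings (hypothetical data).
embeddings = [[0.9, 0.1], [1.0, 0.0], [0.1, 0.9], [0.0, 1.0]]
labels = kmeans_labels(embeddings, centers=[[1.0, 0.0], [0.0, 1.0]])
print(labels)                    # [0, 0, 1, 1]
print(binary_masks(labels, 2))   # [[1, 1, 0, 0], [0, 0, 1, 1]]
```

In practice each mask would be reshaped to the spectrogram's time-frequency grid and applied to the mixture; that step is omitted here.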
Can Weight Loss Improve the Cardiovascular Outcomes of Patients with Obesity and Obstructive Sleep Apnea?
Cardiovascular events are the primary cause of mortality in patients with obstructive sleep apnea and obesity. The rising prevalence of obstructive sleep apnea in recent decades has been linked to increasing rates of obesity. Obstructive sleep apnea has also been linked with many different cardiovascular diseases including coronary artery disease, stroke, heart failure, hypertension, and atrial fibrillation. Obesity is an increasing health concern globally, in part because obesity complications such as hypertension, diabetes, and obstructive sleep apnea increase the risk of cardiovascular diseases. More than 10% weight loss may be required to prevent or reverse obesity complications. Treatment approaches to obesity include nutritional therapy, exercise therapy, pharmacotherapy, and surgical therapies. This review intends to identify the effects of weight loss on cardiovascular outcomes in patients with obesity and obstructive sleep apnea. Despite the strong association between cardiovascular diseases and obstructive sleep apnea, randomized trials have failed to demonstrate that treatment of obstructive sleep apnea reduces cardiovascular events, even in patients with established cardiovascular diseases. Weight loss in patients with obstructive sleep apnea improves HbA1c, systolic blood pressure, HDL cholesterol, and triglycerides, but thus far no changes in cardiovascular events have been shown. The combination of weight loss with continuous positive airway pressure (CPAP) appears more beneficial than either treatment in isolation. Large well-controlled trials in patients with obstructive sleep apnea to assess the effects of different weight reduction programs on cardiovascular disease are still needed.
Coordinate Descent for Mixed-norm NMF
Nonnegative matrix factorization (NMF) is widely used in a variety of machine learning tasks
involving speech, documents and images. Being able to specify the structure of the matrix factors
is crucial in incorporating prior information. The factors correspond to the feature matrix and
the learnt representation. In particular, we allow a user-friendly specification of sparsity on the
groups of features using the L1/L2 measure. Also, we propose a pairwise coordinate descent
algorithm to minimize the objective. Experimental evidence of the efficacy of this approach is
provided on the ORL faces dataset.
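As a rough illustration of a sparsity measure built from the ratio of L1 to L2 norms, the sketch below implements Hoyer's sparseness measure for a single nonnegative vector. Whether the abstract's "L1/L2 measure" is exactly this form, and how it is applied to groups of features, is an assumption; the function name is hypothetical.

```python
import math

# Sketch only: Hoyer's sparseness measure is an assumed reading of "L1/L2 measure".

def hoyer_sparseness(x):
    """Return (sqrt(n) - ||x||_1 / ||x||_2) / (sqrt(n) - 1).

    The value is 0 for a vector with all entries equal in magnitude
    and 1 for a vector with a single nonzero entry.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two entries")
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if l2 == 0.0:
        raise ValueError("sparseness is undefined for the zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


print(hoyer_sparseness([1.0, 0.0, 0.0, 0.0]))  # 1.0 (maximally sparse)
print(hoyer_sparseness([1.0, 1.0, 1.0, 1.0]))  # 0.0 (fully dense)
```

A constraint of this kind lets a user request a target sparseness between 0 and 1 for a factor, which may be what the abstract means by a user-friendly specification.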