Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect
We study the cocktail party problem and propose a novel attention network
called Tune-In, abbreviated for training under negative environments with
interference. It first learns two separate spaces of speaker-knowledge and
speech-stimuli based on a shared feature space, where a new block structure is
designed as the building block for all spaces, and then cooperatively solves
different tasks. Between the two spaces, information is cast towards each other
via a novel cross- and dual-attention mechanism, mimicking the bottom-up and
top-down processes of a human's cocktail party effect. It turns out that
substantially discriminative and generalizable speaker representations can be
learnt in severely interfered conditions via our self-supervised training. The
experimental results verify this seeming paradox. The learnt speaker embedding
has greater discriminative power than a standard speaker verification method;
meanwhile, Tune-In achieves remarkably better speech separation performance in
terms of SI-SNRi and SDRi consistently in all test modes, and especially at
lower memory and computational consumption, than state-of-the-art benchmark
systems.
Comment: Accepted in AAAI 202
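The SI-SNRi figure quoted in the abstract above is the gain in scale-invariant signal-to-noise ratio of the separated output over the unprocessed mixture. A minimal pure-Python sketch of the commonly used definition follows. The function names and the toy sine-wave signals are illustrative assumptions, not code or data from the paper.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    n = len(reference)
    # Remove the mean so the metric ignores DC offsets.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to isolate the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Whatever is left after the projection counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise) + eps
    return 10.0 * math.log10(target_energy / noise_energy + eps)


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: SI-SNR of the estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


# Toy example (hypothetical signals): one target tone plus one interfering tone.
reference = [math.sin(0.1 * t) for t in range(400)]
interferer = [math.sin(0.37 * t + 1.0) for t in range(400)]
mixture = [r + i for r, i in zip(reference, interferer)]
# Pretend a separator removed 90% of the interferer's amplitude.
estimate = [r + 0.1 * i for r, i in zip(reference, interferer)]

improvement = si_snr_improvement(estimate, mixture, reference)
print(f"SI-SNRi: {improvement:.1f} dB")
```

Because the estimate is projected onto the reference before energies are compared, rescaling the estimate leaves the score essentially unchanged, which is the property that distinguishes SI-SNR from plain SNR.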
Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation
One of the leading single-channel speech separation (SS) models is based on a
TasNet with a dual-path segmentation technique, where the size of each segment
remains unchanged throughout all layers. In contrast, our key finding is that
multi-granularity features are essential for enhancing contextual modeling and
computational efficiency. We introduce a self-attentive network with a novel
sandglass-shape, namely Sandglasset, which advances the state-of-the-art (SOTA)
SS performance at significantly smaller model size and computational cost.
Forward along each block inside Sandglasset, the temporal granularity of the
features gradually becomes coarser until reaching half of the network blocks,
and then successively turns finer towards the raw signal level. We also show
that residual connections between features with the same granularity are
critical for preserving information after passing through the bottleneck layer.
Experiments show our Sandglasset with only 2.3M parameters has achieved the
best results on two benchmark SS datasets -- WSJ0-2mix and WSJ0-3mix, where the
SI-SNRi scores have been improved by an absolute 0.8 dB and 2.4 dB, respectively,
compared with the prior SOTA results.
Comment: Accepted in ICASSP 202
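The sandglass shape described in the abstract above (granularity coarsens until the middle block, then becomes finer again, with residual links between blocks of equal granularity) can be sketched as a simple schedule. The downsampling base of 2 and both helper names are hypothetical choices for illustration; the abstract does not state the actual factors.

```python
def sandglass_schedule(num_blocks, base=2):
    """Temporal downsampling factor per block: coarser towards the middle.

    `base` is an assumed factor, not a value taken from the paper.
    """
    return [base ** min(b, num_blocks - 1 - b) for b in range(num_blocks)]


def residual_pairs(num_blocks):
    """Pair each early block with the later block of the same granularity."""
    return [(b, num_blocks - 1 - b) for b in range(num_blocks // 2)]


schedule = sandglass_schedule(6)
pairs = residual_pairs(6)
print(schedule)  # factors rise to the middle of the stack, then fall
print(pairs)     # each pair shares one downsampling factor
```

Each pair returned by `residual_pairs` indexes two blocks with the same entry in the schedule, which is where a same-granularity residual connection would be placed in this sketch.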
Contrastive Separative Coding for Self-supervised Representation Learning
To extract robust deep representations from long sequential modeling of
speech data, we propose a self-supervised learning approach, namely Contrastive
Separative Coding (CSC). Our key finding is to learn such representations by
separating the target signal from contrastive interfering signals. First, a
multi-task separative encoder is built to extract shared separable and
discriminative embedding; secondly, we propose a powerful cross-attention
mechanism performed over speaker representations across various interfering
conditions, allowing the model to focus on and globally aggregate the most
critical information to answer the "query" (current bottom-up embedding) while
paying less attention to interfering, noisy, or irrelevant parts; lastly, we
form a new probabilistic contrastive loss which estimates and maximizes the
mutual information between the representations and the global speaker vector.
While most prior unsupervised methods have focused on predicting the future,
neighboring, or missing samples, we take a different perspective of predicting
the interfered samples. Moreover, our contrastive separative loss is free from
negative sampling. The experiment demonstrates that our approach can learn
useful representations achieving a strong speaker verification performance in
adverse conditions.
Comment: Accepted in ICASSP 202
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model
Expressive human speech generally abounds with rich and flexible speech
prosody variations. The speech prosody predictors in existing expressive speech
synthesis methods mostly produce deterministic predictions, which are learned
by directly minimizing the norm of prosody prediction error. Its unimodal
nature leads to a mismatch with ground truth distribution and harms the model's
ability in making diverse predictions. Thus, we propose a novel prosody
predictor based on the denoising diffusion probabilistic model to take
advantage of its high-quality generative modeling and training stability.
Experimental results confirm that the proposed prosody predictor outperforms the
deterministic baseline on both the expressiveness and diversity of prediction
results with even fewer network parameters.
Comment: Proceedings of Interspeech 2023 (doi: 10.21437/Interspeech.2023-715),
demo site at https://thuhcsi.github.io/interspeech2023-DiffVar
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Denoising diffusion probabilistic models (DDPMs) have recently achieved
leading performances in many generative tasks. However, the cost of the inherited
iterative sampling process has hindered their application to speech synthesis. This
paper proposes FastDiff, a fast conditional diffusion model for high-quality
speech synthesis. FastDiff employs a stack of time-aware location-variable
convolutions of diverse receptive field patterns to efficiently model long-term
time dependencies with adaptive conditions. A noise schedule predictor is also
adopted to reduce the sampling steps without sacrificing the generation
quality. Based on FastDiff, we design an end-to-end text-to-speech synthesizer,
FastDiff-TTS, which generates high-fidelity speech waveforms without any
intermediate feature (e.g., Mel-spectrogram). Our evaluation of FastDiff
demonstrates state-of-the-art results with higher-quality (MOS 4.28) speech
samples. Also, FastDiff enables a sampling speed 58x faster than real-time
on a V100 GPU, making diffusion models practically applicable to speech
synthesis deployment for the first time. We further show that FastDiff
generalized well to the mel-spectrogram inversion of unseen speakers, and
FastDiff-TTS outperformed other competing methods in end-to-end text-to-speech
synthesis. Audio samples are available at https://FastDiff.github.io/.
Comment: Accepted by IJCAI 202
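For readers unfamiliar with why DDPM sampling is iterative, the sketch below shows the standard forward (noising) process in closed form, x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. It uses the linear beta schedule of the original DDPM formulation (1e-4 to 0.02 over 1000 steps) as an assumption. It is not FastDiff's learned noise schedule, and all function names are illustrative.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, not FastDiff's)."""
    if num_steps < 2:
        raise ValueError("need at least two diffusion steps")
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """abar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def diffuse(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0 using the closed-form forward process."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in x0]


alpha_bars = cumulative_alphas(linear_betas(1000))
rng = random.Random(0)
clean = [math.sin(0.2 * n) for n in range(16)]   # toy "waveform"
slightly_noisy = diffuse(clean, 10, alpha_bars, rng)
almost_noise = diffuse(clean, 999, alpha_bars, rng)
print(f"signal weight at t=10:  {math.sqrt(alpha_bars[10]):.4f}")
print(f"signal weight at t=999: {math.sqrt(alpha_bars[999]):.4f}")
```

Generation runs this process in reverse, one denoising step per schedule entry, so a schedule with fewer steps translates directly into fewer network evaluations at synthesis time.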
Free energy barrier for melittin reorientation from a membrane-bound state to a transmembrane state
An important step in phospholipid membrane pore formation by the melittin
antimicrobial peptide is the reorientation of the peptide from a surface-bound into a
transmembrane conformation. In this work we perform umbrella sampling
simulations to calculate the potential of mean force (PMF) for the
reorientation of melittin from a surface-bound state to a transmembrane state
and provide a molecular level insight into understanding peptide and lipid
properties that influence the existence of the free energy barrier. The PMFs
were calculated for peptide-to-lipid (P/L) ratios of 1/128 and 4/128. We
observe that the free energy barrier is reduced when the P/L ratio is increased.
In addition, we study the cooperative effect; specifically, we investigate whether
the barrier is smaller for a second melittin reorientation, given that another
neighboring melittin was already in the transmembrane state. We observe that
indeed the barrier of the PMF curve is reduced in this case, thus confirming
the presence of a cooperative effect.
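As background for the umbrella sampling mentioned in the abstract above, the sketch below shows how a single harmonic-bias window is unbiased: A(xi) = -kT ln P_biased(xi) - w(xi) + const, with w(xi) = (k/2)(xi - xi0)^2. The reaction coordinate, force constant, and window centre are hypothetical, and combining many windows into one PMF (for example with WHAM) is not shown.

```python
import math

KT = 2.494  # k_B * T in kJ/mol at roughly 300 K


def harmonic_bias(xi, center, k):
    """Umbrella restraint energy w(xi) = (k / 2) * (xi - center)^2."""
    return 0.5 * k * (xi - center) ** 2


def unbias_window(xis, biased_probs, center, k, kt=KT):
    """Recover the PMF, up to an additive constant, from one biased window.

    Every entry of `biased_probs` must be positive. The result is shifted
    so that its minimum is zero.
    """
    pmf = [-kt * math.log(p) - harmonic_bias(x, center, k)
           for x, p in zip(xis, biased_probs)]
    lowest = min(pmf)
    return [a - lowest for a in pmf]


# Synthetic check: build the biased distribution from a known PMF and recover it.
xis = [0.1 * i for i in range(21)]
true_pmf = [3.0 * (x - 1.0) ** 2 for x in xis]          # hypothetical profile
weights = [math.exp(-(a + harmonic_bias(x, 1.2, 10.0)) / KT)
           for x, a in zip(xis, true_pmf)]
total = sum(weights)
biased_probs = [w / total for w in weights]
recovered = unbias_window(xis, biased_probs, center=1.2, k=10.0)
```

In a real simulation the biased probabilities come from histograms of the sampled coordinate, so bins far from the window centre are noisy and overlapping windows are needed to cover the whole reorientation path.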
Cosmological parameters from SDSS and WMAP
We measure cosmological parameters using the three-dimensional power spectrum
P(k) from over 200,000 galaxies in the Sloan Digital Sky Survey (SDSS) in
combination with WMAP and other data. Our results are consistent with a
"vanilla" flat adiabatic Lambda-CDM model without tilt (n=1), running tilt,
tensor modes or massive neutrinos. Adding SDSS information more than halves the
WMAP-only error bars on some parameters, tightening 1 sigma constraints on the
Hubble parameter from h~0.74+0.18-0.07 to h~0.70+0.04-0.03, on the matter
density from Omega_m~0.25+/-0.10 to Omega_m~0.30+/-0.04 (1 sigma) and on
neutrino masses from <11 eV to <0.6 eV (95%). SDSS helps even more when
dropping prior assumptions about curvature, neutrinos, tensor modes and the
equation of state. Our results are in substantial agreement with the joint
analysis of WMAP and the 2dF Galaxy Redshift Survey, which is an impressive
consistency check with independent redshift survey data and analysis
techniques. In this paper, we place particular emphasis on clarifying the
physical origin of the constraints, i.e., what we do and do not know when using
different data sets and prior assumptions. For instance, dropping the
assumption that space is perfectly flat, the WMAP-only constraint on the
measured age of the Universe tightens from t0~16.3+2.3-1.8 Gyr to
t0~14.1+1.0-0.9 Gyr by adding SDSS and SN Ia data. Including tensors, running
tilt, neutrino mass and equation of state in the list of free parameters, many
constraints are still quite weak, but future cosmological measurements from
SDSS and other sources should allow these to be substantially tightened.
Comment: Minor revisions to match accepted PRD version. SDSS data and ppt
figures available at http://www.hep.upenn.edu/~max/sdsspars.htm
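To illustrate how the quoted parameters relate to an age of the Universe, the sketch below evaluates the textbook closed-form age for a flat Lambda-CDM model with matter and a cosmological constant only: t0 = 2 / (3 H0 sqrt(Omega_L)) * asinh(sqrt(Omega_L / Omega_m)). Radiation is neglected and flatness is assumed, so this is not a reproduction of the paper's fits, which quote the age with the flatness assumption dropped. The function names are illustrative.

```python
import math

MPC_IN_KM = 3.0857e19      # kilometres per megaparsec
GYR_IN_S = 3.15576e16      # seconds per gigayear (Julian years)


def hubble_time_gyr(h):
    """1 / H0 in Gyr for H0 = 100 * h km/s/Mpc."""
    h0_per_second = 100.0 * h / MPC_IN_KM
    return 1.0 / h0_per_second / GYR_IN_S


def age_flat_lcdm_gyr(h, omega_m):
    """Age of a flat matter + Lambda universe; requires 0 < omega_m < 1."""
    if not 0.0 < omega_m < 1.0:
        raise ValueError("omega_m must lie strictly between 0 and 1")
    omega_l = 1.0 - omega_m
    factor = 2.0 / (3.0 * math.sqrt(omega_l))
    return hubble_time_gyr(h) * factor * math.asinh(math.sqrt(omega_l / omega_m))


# Central values quoted in the abstract for the combined WMAP + SDSS fit.
age = age_flat_lcdm_gyr(h=0.70, omega_m=0.30)
print(f"flat Lambda-CDM age: {age:.2f} Gyr")
```

As omega_m approaches 1 the expression tends to the matter-only result 2 / (3 H0), which gives a quick sanity check on the formula.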
Antimicrobial resistance among migrants in Europe: a systematic review and meta-analysis
BACKGROUND: Rates of antimicrobial resistance (AMR) are rising globally and there is concern that increased migration is contributing to the burden of antibiotic resistance in Europe. However, the effect of migration on the burden of AMR in Europe has not yet been comprehensively examined. Therefore, we did a systematic review and meta-analysis to identify and synthesise data for AMR carriage or infection in migrants to Europe to examine differences in patterns of AMR across migrant groups and in different settings.
METHODS: For this systematic review and meta-analysis, we searched MEDLINE, Embase, PubMed, and Scopus with no language restrictions from Jan 1, 2000, to Jan 18, 2017, for primary data from observational studies reporting antibacterial resistance in common bacterial pathogens among migrants to 21 European Union-15 and European Economic Area countries. To be eligible for inclusion, studies had to report data on carriage or infection with laboratory-confirmed antibiotic-resistant organisms in migrant populations. We extracted data from eligible studies and assessed quality using piloted, standardised forms. We did not examine drug resistance in tuberculosis and excluded articles solely reporting on this parameter. We also excluded articles in which migrant status was determined by ethnicity, country of birth of participants' parents, or was not defined, and articles in which data were not disaggregated by migrant status. Outcomes were carriage of or infection with antibiotic-resistant organisms. We used random-effects models to calculate the pooled prevalence of each outcome. The study protocol is registered with PROSPERO, number CRD42016043681.
FINDINGS: We identified 2274 articles, of which 23 observational studies reporting on antibiotic resistance in 2319 migrants were included. The pooled prevalence of any AMR carriage or AMR infection in migrants was 25·4% (95% CI 19·1-31·8; I² = 98%), including meticillin-resistant Staphylococcus aureus (7·8%, 4·8-10·7; I² = 92%) and antibiotic-resistant Gram-negative bacteria (27·2%, 17·6-36·8; I² = 94%). The pooled prevalence of any AMR carriage or infection was higher in refugees and asylum seekers (33·0%, 18·3-47·6; I² = 98%) than in other migrant groups (6·6%, 1·8-11·3; I² = 92%). The pooled prevalence of antibiotic-resistant organisms was slightly higher in high-migrant community settings (33·1%, 11·1-55·1; I² = 96%) than in migrants in hospitals (24·3%, 16·1-32·6; I² = 98%). We did not find evidence of high rates of transmission of AMR from migrant to host populations.
INTERPRETATION: Migrants are exposed to conditions favouring the emergence of drug resistance during transit and in host countries in Europe. Increased antibiotic resistance among refugees and asylum seekers and in high-migrant community settings (such as refugee camps and detention facilities) highlights the need for improved living conditions, access to health care, and initiatives to facilitate detection of and appropriate high-quality treatment for antibiotic-resistant infections during transit and in host countries. Protocols for the prevention and control of infection and for antibiotic surveillance need to be integrated in all aspects of health care, which should be accessible for all migrant groups, and should target determinants of AMR before, during, and after migration.
FUNDING: UK National Institute for Health Research Imperial Biomedical Research Centre, Imperial College Healthcare Charity, the Wellcome Trust, and UK National Institute for Health Research Health Protection Research Unit in Healthcare-associated Infections and Antimicrobial Resistance at Imperial College London.
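The abstract above reports pooled prevalences from random-effects models together with I² heterogeneity values. One common way to obtain such figures is the DerSimonian-Laird estimator applied to raw proportions, sketched below. The abstract does not say which estimator or transformation the authors used, so this choice is an assumption, and the study counts in the example are invented for illustration.

```python
import math


def pooled_prevalence(events, totals):
    """DerSimonian-Laird random-effects pooling of raw proportions.

    Returns (pooled, ci_low, ci_high, i_squared). Every study needs
    0 < events < total, otherwise its variance estimate is zero.
    """
    props = [e / n for e, n in zip(events, totals)]
    variances = [p * (1.0 - p) / n for p, n in zip(props, totals)]
    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * p for wi, p in zip(w, props)) / sum_w
    q = sum(wi * (p - fixed) ** 2 for wi, p in zip(w, props))
    df = len(props) - 1
    # Between-study variance tau^2, truncated at zero.
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * p for wi, p in zip(w_star, props)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, i_squared


# Hypothetical studies: resistant isolates found / migrants screened.
events = [12, 30, 8, 45]
totals = [100, 120, 90, 150]
pooled, low, high, i2 = pooled_prevalence(events, totals)
print(f"pooled prevalence {pooled:.1%} (95% CI {low:.1%} to {high:.1%}), "
      f"I2 = {i2:.0%}")
```

An I² close to 100%, as in most of the reported subgroups, means that nearly all of the observed spread between studies reflects real differences rather than sampling error, which is why a random-effects model is preferred over a fixed-effect one here.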
Mechanistic Studies of Ethylene Hydrophenylation Catalyzed by Bipyridyl Pt(II) Complexes
This article discusses mechanistic studies of ethylene hydrophenylation catalyzed by bipyridyl Pt(II) complexes