99 research outputs found
Spatial-temporal Graph Based Multi-channel Speaker Verification With Ad-hoc Microphone Arrays
The performance of speaker verification degrades significantly in adverse
acoustic environments with strong reverberation and noise. To address this
issue, this paper proposes a spatial-temporal graph convolutional network (GCN)
method for multi-channel speaker verification with ad-hoc microphone
arrays. It includes a feature aggregation block and a channel selection block,
both of which are built on graphs. The feature aggregation block fuses speaker
features across different time steps and channels with a spatial-temporal GCN. The
graph-based channel selection block discards the noisy channels that may
contribute negatively to the system. The proposed method is flexible in
incorporating various kinds of graphs and prior knowledge. We compared the
proposed method with six representative methods in both real-world and
simulated environments.
Experimental results show that the proposed method achieves a relative equal
error rate (EER) reduction over the strongest reference method on both the
simulated and the real datasets. Moreover, its performance is robust across
different signal-to-noise ratios and reverberation times.
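For readers unfamiliar with the metric, the following is a minimal sketch of how an EER and a relative EER reduction are commonly computed from verification scores. The function names and the toy scores are hypothetical illustrations, not taken from the paper, and the threshold sweep is a simple approximation rather than the authors' evaluation script.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER for similarity scores (higher = more likely same speaker).

    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    best_gap, eer = float("inf"), 1.0
    # Sweep the decision threshold over every observed score.
    for threshold in sorted(set(scores)):
        # False rejection: a target trial scored below the threshold.
        frr = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold) / n_target
        # False acceptance: a non-target trial scored at or above the threshold.
        far = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold) / n_nontarget
        # The EER is the operating point where the two error rates meet.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction of a new system with respect to a baseline."""
    return (baseline_eer - new_eer) / baseline_eer


# Toy trial list (hypothetical values, for illustration only).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

A relative reduction is reported with respect to the baseline, so two systems with EERs of 10% and 8% would differ by a relative reduction of 20%.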
Wespeaker baselines for VoxSRC2023
This report showcases the results achieved using the wespeaker toolkit for
the VoxSRC2023 Challenge. Our aim is to provide participants, especially those
with limited experience, with clear and straightforward guidelines to develop
their initial systems. Through well-structured recipes and strong results, we hope
to offer an accessible and sufficiently good starting point for all interested
individuals. In this report, we describe the results achieved on the VoxSRC2023
dev set using the pretrained models; results on the evaluation set can be
checked on the CodaLab evaluation server.
Deep Learning Based Stage-wise Two-dimensional Speaker Localization with Large Ad-hoc Microphone Arrays
While deep-learning-based speaker localization has shown advantages in
challenging acoustic environments, it often yields only direction-of-arrival
(DOA) cues rather than precise two-dimensional (2D) coordinates. To address
this, we propose a novel deep-learning-based 2D speaker localization method
leveraging ad-hoc microphone arrays, where an ad-hoc microphone array is
composed of randomly distributed microphone nodes, each of which is equipped
with a traditional array. Specifically, we first employ convolutional neural
networks at each node to estimate speaker directions. Then, we integrate these
DOA estimates using triangulation and clustering techniques to obtain 2D speaker
locations. To further boost the estimation accuracy, we introduce a node
selection algorithm that strategically selects the most reliable nodes.
Extensive experiments on both simulated and real-world data demonstrate that
our approach significantly outperforms conventional methods. The proposed node
selection further refines performance. The real-world dataset used in the
experiments, Libri-adhoc-node10, is newly recorded and described
for the first time in this paper. It is available online at
https://github.com/Liu-sp/Libri-adhoc-nodes10
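The triangulation step can be illustrated with a minimal sketch, assuming each node has a known 2D position and reports a bearing in a shared global frame. This is a generic least-squares intersection of bearing lines; the function name and the toy geometry are hypothetical, and the clustering and node selection stages described in the abstract are not reproduced here.

```python
import math


def triangulate(node_positions, bearings):
    """Least-squares intersection of bearing lines from several nodes.

    node_positions: list of (x, y) node coordinates.
    bearings: list of DOA angles in radians, measured in a shared global frame.
    Returns the point minimising the summed squared perpendicular distance
    to all bearing lines.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), theta in zip(node_positions, bearings):
        # Unit normal of the bearing line through the node.
        nx, ny = -math.sin(theta), math.cos(theta)
        # Accumulate the 2x2 normal equations: sum(n n^T) x = sum(n n^T p).
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        c = nx * px + ny * py
        b1 += nx * c
        b2 += ny * c
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearing lines are (nearly) parallel; no unique intersection")
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return x, y


# Toy setup (hypothetical): three nodes observing a speaker at (2, 3).
speaker = (2.0, 3.0)
nodes = [(0.0, 0.0), (4.0, 0.0), (0.0, 5.0)]
doas = [math.atan2(speaker[1] - py, speaker[0] - px) for px, py in nodes]
estimate = triangulate(nodes, doas)
```

With noise-free bearings the estimate coincides with the true position; with noisy bearings, discarding unreliable nodes before solving is one plausible reason a node selection step would help.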
Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames
Recently, the unified streaming and non-streaming two-pass (U2/U2++)
end-to-end model for speech recognition has shown great performance in terms of
streaming capability, accuracy and latency. In this paper, we present
fast-U2++, an enhanced version of U2++ to further reduce partial latency. The
core idea of fast-U2++ is to output partial results of the bottom layers in its
encoder with a small chunk, while using a large chunk in the top layers of its
encoder to compensate for the performance degradation caused by the small chunk.
Moreover, we use a knowledge distillation method to reduce the token emission
latency. We present extensive experiments on the Aishell-1 dataset. Experiments and
ablation studies show that, compared to U2++, fast-U2++ reduces model latency
from 320ms to 80ms, and achieves a character error rate (CER) of 5.06% with a
streaming setup. Comment: 5 pages, 3 figures
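To make the role of chunk size concrete, here is a minimal sketch of a chunk-based self-attention mask of the kind used in chunk-wise streaming encoders. It assumes full left context and that each frame may attend up to the end of its own chunk; the function names are hypothetical, and this is not the fast-U2++ implementation.

```python
def chunk_attention_mask(num_frames, chunk_size):
    """Boolean mask where mask[i][j] is True if frame i may attend to frame j.

    Each frame sees all past frames and every frame up to the end of its
    own chunk, so the chunk size bounds how much future context is needed.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    mask = []
    for i in range(num_frames):
        # Index one past the last frame of the chunk that contains frame i.
        chunk_end = min((i // chunk_size + 1) * chunk_size, num_frames)
        mask.append([j < chunk_end for j in range(num_frames)])
    return mask


def max_lookahead(mask):
    """Largest number of future frames any single frame is allowed to see."""
    return max(sum(row[i + 1:]) for i, row in enumerate(mask))


# A small chunk needs little future context; a large chunk needs more.
small = chunk_attention_mask(8, 2)
large = chunk_attention_mask(8, 4)
```

In this sketch, a layer with a small chunk can emit partial results after waiting for fewer future frames, while a layer with a large chunk sees more context per frame, which matches the trade-off the abstract describes between the bottom and top encoder layers.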
Dopamine D2-receptor neurons in nucleus accumbens regulate sevoflurane anesthesia in mice
Introduction: The mechanism of general anesthesia remains elusive. In recent years, numerous investigations have indicated that its mode of action is closely associated with the sleep-wake pathway. This study therefore aimed to explore the involvement of dopamine D2 receptor (D2R) expressing neurons located in the nucleus accumbens (NAc), a critical nucleus governing sleep-wake regulation, in sevoflurane anesthesia. Methods: This exploration was carried out using calcium fiber photometry and optogenetics, with cortical electroencephalogram (EEG), loss of righting reflex (LORR), and recovery of righting reflex (RORR) as experimental indicators. Results: Calcium fiber photometry revealed a decrease in the activity of NAcD2R neurons during the induction phase of sevoflurane anesthesia, with subsequent recovery during the emergence phase. Moreover, optogenetic activation of NAcD2R neurons shortened the anesthesia induction process and prolonged the arousal process in mice. Conversely, inhibition of these neurons had the opposite effect. Furthermore, optogenetic activation of NAcD2R neurons projecting to the ventral pallidum (VP) shortened the induction time for mice under sevoflurane anesthesia. Discussion: In conclusion, our results suggest that NAcD2R neurons play a promotive role in the sevoflurane general anesthesia process in mice, and that their activation can reduce the induction time of anesthesia via the ventral pallidum (VP).
Age-associated microbiome shows the giant panda lives on hemicelluloses, not on cellulose
The giant panda feeds almost exclusively on bamboo, a diet highly enriched in lignin and cellulose, but is characterized by a digestive tract similar to that of carnivores. It is still largely unknown if and how the giant panda gut microbiota contributes to lignin and cellulose degradation. Here we show that the giant panda's gut microbiota does not significantly contribute to cellulose and lignin degradation. We found that no operational taxonomic unit whose nearest neighbor was identified as a cellulolytic species or strain had a significantly higher abundance in juveniles than in cubs; that putative lignin and cellulose degradation genes existed at very low abundance in some of the analyzed samples; and that genes involved in starch and hemicellulose degradation had a significantly higher abundance in juveniles than in cubs. Moreover, a significantly lower abundance of putative cellulolytic genes and a significantly higher abundance of putative α-amylase and hemicellulase gene families were present in giant pandas than in omnivores or herbivores.
Precision Higgs physics at the CEPC
The discovery of the Higgs boson with its mass around 125 GeV by the ATLAS
and CMS Collaborations marked the beginning of a new era in high energy
physics. The Higgs boson will be the subject of extensive studies in the
ongoing LHC program. At the same time, lepton collider based Higgs factories
have been proposed as a possible next step beyond the LHC, with the main goal
of precisely measuring the properties of the Higgs boson and probing potential new
physics associated with the Higgs boson. The Circular Electron Positron
Collider (CEPC) is one such proposed Higgs factory. The CEPC is a
circular collider proposed by and to be hosted in China. Located in a
tunnel of approximately 100 km in circumference, it will operate at a
center-of-mass energy of 240 GeV as the Higgs factory. In this paper, we
present the first estimates of the precision of the Higgs boson property
measurements achievable at the CEPC and discuss implications of these
measurements. Comment: 46 pages, 37 figures
Fault Detection of Stator Inter-Turn Short-Circuit in PMSM on Stator Current and Vibration Signal
The stator inter-turn short-circuit fault is one of the most common and critical faults in permanent magnet synchronous motors (PMSMs). This paper introduces a time–frequency method for inter-turn fault detection in the stator winding of a PMSM using an improved wavelet packet transform. Both the stator current signal and the vibration signal are used to detect short-circuit faults. Two different sets of experimental data from a three-phase PMSM were processed and analyzed with this time–frequency method in LabVIEW. The feasibility of the approach is shown by the experimental tests.
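As a rough illustration of the underlying idea, the sketch below performs a plain Haar wavelet packet decomposition and computes the energy in each sub-band, a feature often inspected when looking for fault signatures. It is not the improved transform from the paper or its LabVIEW implementation; the function names and toy signals are hypothetical, and the sub-bands are returned in natural (not frequency) order.

```python
import math


def haar_step(signal):
    """Split a signal into orthonormal Haar approximation and detail halves."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet(signal, levels):
    """Full wavelet packet tree: split BOTH approximation and detail at each level."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def band_energies(signal, levels):
    """Energy of each terminal sub-band of the wavelet packet tree."""
    return [sum(v * v for v in node) for node in wavelet_packet(signal, levels)]


# Toy signals (hypothetical): a constant and a fast alternating sequence.
constant = [1.0] * 8
alternating = [1.0, -1.0] * 4
constant_energy = band_energies(constant, 2)
alternating_energy = band_energies(alternating, 2)
```

Because the Haar filters are orthonormal, the sub-band energies sum to the energy of the input, so a shift of energy between bands in current or vibration data can be compared directly across measurements.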