99 research outputs found

    Spatial-temporal Graph Based Multi-channel Speaker Verification With Ad-hoc Microphone Arrays

    The performance of speaker verification degrades significantly in adverse acoustic environments with strong reverberation and noise. To address this issue, this paper proposes a spatial-temporal graph convolutional network (GCN) method for multi-channel speaker verification with ad-hoc microphone arrays. It includes a feature aggregation block and a channel selection block, both of which are built on graphs. The feature aggregation block fuses speaker features across different times and channels with a spatial-temporal GCN. The graph-based channel selection block discards noisy channels that may contribute negatively to the system. The proposed method is flexible in incorporating various kinds of graphs and prior knowledge. We compared the proposed method with six representative methods in both real-world and simulated environments. Experimental results show that the proposed method achieves a relative equal error rate (EER) reduction of 15.39% over the strongest reference method on the simulated datasets, and of 17.70% over the same method on the real datasets. Moreover, its performance is robust across different signal-to-noise ratios and reverberation times.
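
    For readers unfamiliar with the metric, the relative EER reduction quoted in this abstract is conventionally the drop in EER expressed as a percentage of the baseline EER. The sketch below assumes that convention; the function name and the input values are hypothetical and are not taken from the paper.

```python
def relative_reduction(baseline_eer: float, proposed_eer: float) -> float:
    """Return the relative EER reduction as a percentage of the baseline EER."""
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - proposed_eer) / baseline_eer


# Hypothetical EER values in percent, chosen only for illustration.
print(relative_reduction(10.0, 8.0))
```

    Note that a relative reduction says nothing about the absolute EER values, which the abstract does not report.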

    Wespeaker baselines for VoxSRC2023

    This report showcases the results achieved using the wespeaker toolkit for the VoxSRC2023 Challenge. Our aim is to provide participants, especially those with limited experience, with clear and straightforward guidelines for developing their initial systems. Through well-structured recipes and strong results, we hope to offer an accessible and good-enough starting point for all interested individuals. In this report, we describe the results achieved on the VoxSRC2023 dev set using the pretrained models; results on the evaluation set can be checked on the CodaLab evaluation server.

    Deep Learning Based Stage-wise Two-dimensional Speaker Localization with Large Ad-hoc Microphone Arrays

    While deep-learning-based speaker localization has shown advantages in challenging acoustic environments, it often yields only direction-of-arrival (DOA) cues rather than precise two-dimensional (2D) coordinates. To address this, we propose a novel deep-learning-based 2D speaker localization method leveraging ad-hoc microphone arrays, where an ad-hoc microphone array is composed of randomly distributed microphone nodes, each of which is equipped with a traditional array. Specifically, we first employ convolutional neural networks at each node to estimate speaker directions. Then, we integrate these DOA estimates using triangulation and clustering techniques to obtain 2D speaker locations. To further boost the estimation accuracy, we introduce a node selection algorithm that strategically filters the most reliable nodes. Extensive experiments on both simulated and real-world data demonstrate that our approach significantly outperforms conventional methods. The proposed node selection further refines performance. The real-world dataset used in the experiments, named Libri-adhoc-node10, is newly recorded and described for the first time in this paper; it is available online at https://github.com/Liu-sp/Libri-adhoc-nodes10
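
    As a rough illustration of the triangulation step mentioned in this abstract, the sketch below intersects the bearing lines from two nodes to get a 2D point. It is not the authors' implementation: the function name, the noise-free bearings, and the two-node setup are assumptions, and the CNN-based DOA estimation, clustering, and node selection stages are not shown.

```python
import math


def triangulate(p1, theta1, p2, theta2):
    """Intersect two bearing lines p_i + t_i * (cos(theta_i), sin(theta_i)).

    p1, p2 are (x, y) node positions; theta1, theta2 are DOA angles in radians.
    Returns the (x, y) intersection, or None if the bearings are parallel.
    """
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    # Solve t1 * d1 - t2 * d2 = p2 - p1 for t1 with Cramer's rule.
    det = -d1[0] * d2[1] + d2[0] * d1[1]
    if abs(det) < 1e-12:
        return None
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (-rx * d2[1] + d2[0] * ry) / det
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


# Hypothetical node positions and bearings, chosen only for illustration.
print(triangulate((0.0, 0.0), math.pi / 4, (2.0, 0.0), 3 * math.pi / 4))
```

    With more than two nodes, each pair yields one candidate point, which is presumably where a clustering step becomes useful for merging noisy candidates.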

    Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames

    Recently, the unified streaming and non-streaming two-pass (U2/U2++) end-to-end model for speech recognition has shown great performance in terms of streaming capability, accuracy and latency. In this paper, we present fast-U2++, an enhanced version of U2++ that further reduces partial latency. The core idea of fast-U2++ is to output partial results from the bottom layers of its encoder with a small chunk, while using a large chunk in the top layers of its encoder to compensate for the performance degradation caused by the small chunk. Moreover, we use a knowledge distillation method to reduce the token emission latency. We present extensive experiments on the Aishell-1 dataset. Experiments and ablation studies show that, compared to U2++, fast-U2++ reduces model latency from 320ms to 80ms and achieves a character error rate (CER) of 5.06% with a streaming setup. Comment: 5 pages, 3 figures

    Dopamine D2-receptor neurons in nucleus accumbens regulate sevoflurane anesthesia in mice

    Introduction: The mechanism of general anesthesia remains elusive. In recent years, numerous investigations have indicated that its mode of action is closely associated with the sleep-wake pathway. This study therefore aimed to explore the involvement of dopamine D2 receptor (D2R) expressing neurons located in the nucleus accumbens (NAc), a critical nucleus governing sleep-wake regulation, in sevoflurane anesthesia. Methods: This exploration was carried out using calcium fiber photometry and optogenetics, with cortical electroencephalogram (EEG), loss of righting reflex (LORR), and recovery of righting reflex (RORR) as experimental indicators. Results: Calcium fiber photometry revealed a decrease in the activity of NAcD2R neurons during the induction phase of sevoflurane anesthesia, with subsequent recovery during the emergence phase. Moreover, optogenetic activation of NAcD2R neurons shortened the anesthesia induction process and prolonged the arousal process in mice. Conversely, inhibition of these neurons had the opposite effect. Furthermore, optogenetic activation of NAcD2R neurons projecting to the ventral pallidum (VP) shortened the induction time for mice under sevoflurane anesthesia. Discussion: In conclusion, our results suggest that NAcD2R neurons play a promotive role in the sevoflurane general anesthesia process in mice, and that their activation can reduce the induction time of anesthesia via the ventral pallidum (VP).

    Age-associated microbiome shows the giant panda lives on hemicelluloses, not on cellulose

    The giant panda feeds almost exclusively on bamboo, a diet highly enriched in lignin and cellulose, but is characterized by a digestive tract similar to that of carnivores. It is still largely unknown if and how the giant panda gut microbiota contributes to lignin and cellulose degradation. Here we show that the giant panda's gut microbiota does not significantly contribute to cellulose and lignin degradation. We found that no operational taxonomic unit whose nearest neighbor was identified as a cellulolytic species or strain had a significantly higher abundance in juveniles than in cubs; that putative lignin and cellulose genes were present at very low abundance in some of the analyzed samples; and that genes involved in starch and hemicellulose degradation were significantly more abundant in juveniles than in cubs. Moreover, a significantly lower abundance of putative cellulolytic genes and a significantly higher abundance of putative α-amylase and hemicellulase gene families were present in giant pandas than in omnivores or herbivores.

    Precision Higgs physics at the CEPC

    The discovery of the Higgs boson with a mass around 125 GeV by the ATLAS and CMS Collaborations marked the beginning of a new era in high energy physics. The Higgs boson will be the subject of extensive studies in the ongoing LHC program. At the same time, lepton collider based Higgs factories have been proposed as a possible next step beyond the LHC, with the main goal of precisely measuring the properties of the Higgs boson and probing potential new physics associated with it. The Circular Electron Positron Collider (CEPC) is one such proposed Higgs factory. The CEPC is an $e^+e^-$ circular collider proposed by and to be hosted in China. Located in a tunnel of approximately 100 km in circumference, it will operate at a center-of-mass energy of 240 GeV as the Higgs factory. In this paper, we present the first estimates of the precision of the Higgs boson property measurements achievable at the CEPC and discuss implications of these measurements. Comment: 46 pages, 37 figures

    Fault Detection of Stator Inter-Turn Short-Circuit in PMSM on Stator Current and Vibration Signal

    The stator inter-turn short circuit fault is one of the most common and critical faults in permanent magnet synchronous motors (PMSMs). This paper introduces a time–frequency method for inter-turn fault detection in the stator winding of a PMSM using an improved wavelet packet transform. Both the stator current signal and the vibration signal are used for the detection of short circuit faults. Two different sets of experimental data from a three-phase PMSM were processed and analyzed by this time–frequency method in LabVIEW. The feasibility of this approach is shown by the experimental test.