70 research outputs found

Polyphonic audio tagging with sequentially labelled data using CRNN with learnable gated linear units

Authors: Hou Yuanbo, Kong Qiuqiang, Li Shengchen, Wang Jun
Publication date: 01/01/2018

Audio tagging aims to detect the types of sound events occurring in an audio recording. To tag polyphonic audio recordings, we propose using the Connectionist Temporal Classification (CTC) loss function on top of a Convolutional Recurrent Neural Network (CRNN) with learnable Gated Linear Units (GLU-CTC), based on a new type of audio label data: Sequentially Labelled Data (SLD). In GLU-CTC, the CTC objective function maps the frame-level probability of labels to the clip-level probability of labels. To compare the mapping ability of GLU-CTC for sound events, we train a CRNN with GLU based on Global Max Pooling (GLU-GMP) and a CRNN with GLU based on Global Average Pooling (GLU-GAP). We also compare the proposed GLU-CTC system with a baseline system, which is a CRNN trained using the CTC loss function without GLU. The experiments show that GLU-CTC achieves an Area Under Curve (AUC) score of 0.882 in audio tagging, outperforming GLU-GMP (0.803), GLU-GAP (0.766) and the baseline system (0.837). That is, for the same CRNN model with GLU, CTC mapping performs better than GMP and GAP mapping, and when both systems use CTC mapping, the CRNN with GLU outperforms the CRNN without GLU.
Comment: DCASE2018 Workshop. arXiv admin note: text overlap with arXiv:1808.0193
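
The two pooling baselines named in the abstract above, GMP and GAP, can be illustrated with a minimal sketch. It assumes the standard gated linear unit form (a linear output multiplied element-wise by a sigmoid gate) and a small hypothetical matrix of frame-level probabilities; the abstract does not give the network details, so this is not the authors' implementation, and the CTC mapping itself is not shown.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def glu(linear, gate):
    """Standard gated linear unit: linear output scaled by a sigmoid gate."""
    return [a * sigmoid(b) for a, b in zip(linear, gate)]

def global_max_pool(frame_probs):
    """Clip-level probability per class = maximum over all frames."""
    return [max(col) for col in zip(*frame_probs)]

def global_avg_pool(frame_probs):
    """Clip-level probability per class = mean over all frames."""
    return [sum(col) / len(col) for col in zip(*frame_probs)]

# Hypothetical frame-level probabilities: 4 frames x 2 sound-event classes.
frame_probs = [
    [0.1, 0.8],
    [0.9, 0.7],
    [0.2, 0.6],
    [0.0, 0.9],
]

clip_gmp = global_max_pool(frame_probs)  # one confident frame is enough
clip_gap = global_avg_pool(frame_probs)  # short events are averaged down
```

In this toy input the first class is active in only one frame, so max pooling reports it strongly while average pooling dilutes it. That is one plausible reason the two mappings can score differently on polyphonic clips.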

arXiv.org e-Print Archive

University of Surrey

Surrey Research Insight

Keyword Spotting System and Evaluation of Pruning and Quantization Methods on Low-power Edge Microcontrollers

Authors: Li Shengchen, Wang Jingyi
Publication date: 04/08/2022

Keyword spotting (KWS) is beneficial for voice-based user interaction with low-power devices at the edge. Edge devices are usually always-on, so edge computing brings bandwidth savings and privacy protection. The devices, for example Cortex-M based microcontrollers, typically have limited memory, computational performance, power and cost budgets. The challenge is to meet the high computation and low-latency requirements of deep learning on these devices. This paper first presents our small-footprint KWS system running on an STM32F7 microcontroller with a Cortex-M7 core at 216 MHz and 512 KB of static RAM. Our selected convolutional neural network (CNN) architecture uses a reduced number of operations for KWS to meet the constraints of edge devices. Our baseline system generates a classification result every 37 ms, including real-time audio feature extraction. This paper further evaluates the actual performance of different pruning and quantization methods on the microcontroller, including different granularities of sparsity, skipping zero weights, weight-prioritized loop order, and SIMD instructions. The results show that, on microcontrollers, accelerating unstructured pruned models poses considerable challenges, and structured pruning is friendlier than unstructured pruning. The results also verify the performance improvement from quantization and SIMD instructions.
Comment: Submitted to DCASE2022 Workshop. Code available at: https://github.com/RoboBachelor/Keyword-Spotting-STM3
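
The pruning and quantization ideas listed in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function names, the magnitude threshold, the row-wise notion of "structure", and the symmetric per-tensor int8 scheme are all hypothetical choices, and a real microcontroller implementation would be written in C rather than Python.

```python
def prune_unstructured(weights, threshold):
    """Zero individual weights whose magnitude is below the threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

def prune_structured(weights, keep_rows):
    """Zero whole rows, keeping the keep_rows rows with the largest L1 norm."""
    order = sorted(range(len(weights)),
                   key=lambda i: sum(abs(w) for w in weights[i]),
                   reverse=True)
    kept = set(order[:keep_rows])
    return [row[:] if i in kept else [0.0] * len(row)
            for i, row in enumerate(weights)]

def matvec_skip_zeros(weights, x):
    """Matrix-vector product that skips zero weights.

    Returns the result and the number of multiply-accumulates performed.
    """
    out, macs = [], 0
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w != 0.0:  # per-weight test, needed for unstructured sparsity
                acc += w * xi
                macs += 1
        out.append(acc)
    return out, macs

def quantize_int8(weights):
    """Symmetric per-tensor quantization to the range [-127, 127]."""
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [[max(-127, min(127, round(w / scale))) for w in row] for row in weights]
    return q, scale

# Hypothetical 3x3 weight matrix.
W = [[0.5, -0.05, 0.3],
     [0.02, 0.01, -0.04],
     [-0.7, 0.2, 0.03]]

W_unstructured = prune_unstructured(W, threshold=0.1)
W_structured = prune_structured(W, keep_rows=2)
_, macs_dense = matvec_skip_zeros(W, [1.0, 1.0, 1.0])
_, macs_unstructured = matvec_skip_zeros(W_unstructured, [1.0, 1.0, 1.0])
W_q, scale = quantize_int8(W)
```

With row-wise structured pruning, an entire row could be dropped from the loop without testing each weight, whereas unstructured sparsity needs the per-weight check shown in the inner loop. That check is one plausible source of the overhead the abstract reports on microcontrollers.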

arXiv.org e-Print Archive

Efficient coding schemes for low‐rate wireless personal area networks

Authors: Dai Shengchen, Kang Kai, Qian Hua, Wang Xudong
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 01/05/2016

Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/166246/1/cmu2bf01608.pd

Deep Blue Documents

Recommended from our members

Prompting Fab Yeast Surface Display Efficiency by ER Retention and Molecular Chaperon Co-expression.

Authors: Ge Xin, Iverson Brent L, Lee Ki Baek, Li Junhong, Mei Meng, Wang Shengchen, Yi Li, Zhang Guimin
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

For antibody discovery and engineering, yeast surface display (YSD) of antigen-binding fragments (Fabs) coupled with fluorescence-activated cell sorting (FACS) provides intact paratopic conformations and quantitative analysis at the monoclonal level, and thus holds great promise for numerous applications. Using the anti-TNFα mAbs Infliximab, Adalimumab, and its variants as model Fabs, this study systematically characterized complementary approaches for the optimization of Fab YSD. Results suggested that by using the divergent promoter GAL1-GAL10 and endoplasmic reticulum (ER) signal peptides for co-expression of the light chain and the heavy chain-Aga2 fusion, assembled Fabs were functionally displayed on the yeast cell surface with sigmoidal binding responses toward TNFα. Co-expression of the Hsp70 family molecular chaperone Kar2p and/or protein-disulfide isomerase (Pdi1p) significantly improved the efficiency of functional display (defined as the ratio of cells displaying functional Fab over cells displaying assembled Fab). Moreover, fusing ER retention sequences (ERSs) with the light chain also enhanced Fab display quality at the expense of display quantity; the degree of improvement correlated with the strength of the ERSs and was more significant for Infliximab than for Adalimumab. The feasibility of affinity maturation was further demonstrated by isolating a high-affinity Fab clone from 1:10^3 or 1:10^5 spiked libraries.

eScholarship - University of California