Polyphonic audio tagging with sequentially labelled data using CRNN with learnable gated linear units
Audio tagging aims to detect the types of sound events occurring in an audio
recording. To tag polyphonic audio recordings, we propose using a
Connectionist Temporal Classification (CTC) loss function on top of a
Convolutional Recurrent Neural Network (CRNN) with learnable Gated Linear Units
(GLU-CTC), based on a new type of audio label data: Sequentially Labelled Data
(SLD). In GLU-CTC, the CTC objective function maps the frame-level probability of
labels to clip-level probability of labels. To compare the mapping ability of
GLU-CTC for sound events, we train a CRNN with GLU based on Global Max Pooling
(GLU-GMP) and a CRNN with GLU based on Global Average Pooling (GLU-GAP). We
also compare the proposed GLU-CTC system with the baseline system, which is a
CRNN trained using the CTC loss function without GLU. The experiments show that the
GLU-CTC achieves an Area Under Curve (AUC) score of 0.882 in audio tagging,
outperforming the GLU-GMP of 0.803, GLU-GAP of 0.766 and baseline system of
0.837. That is, based on the same CRNN model with GLU, CTC mapping performs
better than GMP and GAP mapping, and with both systems using CTC mapping, the
CRNN with GLU outperforms the CRNN without GLU.
Comment: DCASE2018 Workshop. arXiv admin note: text overlap with
arXiv:1808.0193
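The three mappings from frame-level to clip-level output compared in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the toy probabilities, the `"-"` blank symbol, and the use of best-path collapsing to stand in for CTC are all assumptions made for the example, and the GLU shown is the standard form (linear output times a sigmoid gate) rather than the paper's exact learnable variant.

```python
import math

def glu(linear, gate):
    """Standard gated linear unit: linear output scaled by a sigmoid gate."""
    return [a * (1.0 / (1.0 + math.exp(-b))) for a, b in zip(linear, gate)]

def global_max_pool(frame_probs):
    """Clip-level probability per class is the maximum over all frames."""
    return [max(col) for col in zip(*frame_probs)]

def global_avg_pool(frame_probs):
    """Clip-level probability per class is the mean over all frames."""
    return [sum(col) / len(col) for col in zip(*frame_probs)]

def ctc_collapse(frame_labels, blank="-"):
    """CTC collapsing rule: merge consecutive repeats, then drop blanks."""
    out, prev = [], None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

# Toy frame-level probabilities: 3 frames x 2 classes (hypothetical values).
frame_probs = [
    [0.1, 0.8],
    [0.9, 0.7],
    [0.2, 0.6],
]
clip_gmp = global_max_pool(frame_probs)
clip_gap = global_avg_pool(frame_probs)

# Toy per-frame best labels, with "-" as the CTC blank (hypothetical).
frames = ["-", "dog", "dog", "-", "speech", "speech", "-", "dog"]
event_sequence = ctc_collapse(frames)
clip_tags = sorted(set(event_sequence))
```

Max pooling lets a single confident frame decide the clip-level score, while average pooling dilutes short events across the whole clip. The collapsing step keeps the order of events, which is the information that sequentially labelled data provides.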
Keyword Spotting System and Evaluation of Pruning and Quantization Methods on Low-power Edge Microcontrollers
Keyword spotting (KWS) is beneficial for voice-based user interactions with
low-power devices at the edge. The edge devices are usually always-on, so edge
computing brings bandwidth savings and privacy protection. The devices
typically have limited memory space, computational performance, power and
cost, for example, Cortex-M based microcontrollers. The challenge is to meet
the high computation and low-latency requirements of deep learning on these
devices. This paper first presents our small-footprint KWS system running on an
STM32F7 microcontroller with a Cortex-M7 core @216MHz and 512KB static RAM. Our
selected convolutional neural network (CNN) architecture has a reduced number
of operations for KWS to meet the constraints of edge devices. Our baseline
system generates classification results every 37ms, including the real-time audio
feature extraction part. This paper further evaluates the actual performance
of different pruning and quantization methods on the microcontroller, including
different granularities of sparsity, skipping zero weights, weight-prioritized
loop order, and SIMD instructions. The results show that for microcontrollers,
there are considerable challenges in accelerating unstructured pruned models,
and that structured pruning is more friendly than unstructured pruning. The
results also verify the performance improvement from quantization and SIMD
instructions.
Comment: Submitted to DCASE2022 Workshop. Code available at:
https://github.com/RoboBachelor/Keyword-Spotting-STM3
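The difference between unstructured and structured sparsity, and the idea of skipping zero weights, can be illustrated with a small sketch. It is not taken from the linked repository: the function names, the magnitude and L1-norm pruning criteria, and the toy weight matrix are assumptions chosen for the example, and plain Python stands in for the microcontroller kernels.

```python
def prune_unstructured(weights, sparsity):
    """Zero the individual weights with the smallest magnitudes."""
    cells = sorted(
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    )
    pruned = [list(row) for row in weights]
    for _, r, c in cells[: int(len(cells) * sparsity)]:
        pruned[r][c] = 0.0
    return pruned

def prune_structured(weights, sparsity):
    """Zero whole rows (e.g. output channels) with the smallest L1 norm."""
    norms = sorted((sum(abs(w) for w in row), r) for r, row in enumerate(weights))
    pruned = [list(row) for row in weights]
    for _, r in norms[: int(len(norms) * sparsity)]:
        pruned[r] = [0.0] * len(pruned[r])
    return pruned

def matvec_skip_zeros(weights, x):
    """Matrix-vector product that skips multiplications by zero weights."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w != 0.0:  # the per-weight branch is the cost of unstructured sparsity
                acc += w * xi
        out.append(acc)
    return out

# Hypothetical 4x4 weight matrix and input vector.
weights = [
    [0.5, -0.1, 0.3, 0.05],
    [0.02, 0.9, -0.04, 0.6],
    [0.01, -0.03, 0.02, 0.04],
    [-0.7, 0.2, 0.08, -0.4],
]
x = [1.0, 2.0, 3.0, 4.0]

unstructured = prune_unstructured(weights, 0.5)
structured = prune_structured(weights, 0.5)
y_unstructured = matvec_skip_zeros(unstructured, x)
y_structured = matvec_skip_zeros(structured, x)
```

With unstructured pruning the zeros are scattered, so every weight needs a test (or an index lookup) before it can be skipped. With structured pruning entire rows vanish and can be dropped from the loop altogether, which is one plausible reason it maps more easily onto a microcontroller.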
Efficient coding schemes for low‐rate wireless personal area networks
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/166246/1/cmu2bf01608.pd
Prompting Fab Yeast Surface Display Efficiency by ER Retention and Molecular Chaperon Co-expression.
For antibody discovery and engineering, yeast surface display (YSD) of antigen-binding fragments (Fabs) coupled with fluorescence-activated cell sorting (FACS) provides intact paratopic conformations and quantitative analysis at the monoclonal level, and thus holds great promise for numerous applications. Using anti-TNFα mAbs Infliximab, Adalimumab, and its variants as model Fabs, this study systematically characterized complementary approaches for the optimization of Fab YSD. Results suggested that by using the divergent promoter GAL1-GAL10 and endoplasmic reticulum (ER) signal peptides for co-expression of light chain and heavy chain-Aga2 fusion, assembled Fabs were functionally displayed on the yeast cell surface with sigmoidal binding responses toward TNFα. Co-expression of the Hsp70 family molecular chaperone Kar2p and/or protein-disulfide isomerase (Pdi1p) significantly improved the efficiency of functional display (defined as the ratio of cells displaying functional Fab over cells displaying assembled Fab). Moreover, fusing ER retention sequences (ERSs) with the light chain also enhanced Fab display quality at the expense of display quantity, and the degree of improvement was correlated with the strength of the ERSs and was more significant for Infliximab than Adalimumab. The feasibility of affinity maturation was further demonstrated by isolating a high-affinity Fab clone from 1:10³ or 1:10⁵ spiked libraries.