
    Exploring RWKV for Memory Efficient and Low Latency Streaming ASR

    Recently, self-attention-based transformers and conformers have been introduced as alternatives to RNNs for ASR acoustic modeling. Nevertheless, the full-sequence attention mechanism is non-streamable and computationally expensive, and thus requires modifications, such as chunking and caching, for efficient streaming ASR. In this paper, we propose to apply RWKV, a variant of the linear attention transformer, to streaming ASR. RWKV combines the superior performance of transformers with the inference efficiency of RNNs, which makes it well suited to streaming ASR scenarios where the budget for latency and memory is restricted. Experiments at varying scales (100h - 10000h) demonstrate that RWKV-Transducer and RWKV-Boundary-Aware-Transducer achieve accuracy comparable to or even better than that of the chunk conformer transducer, with minimal latency and inference memory cost. Comment: submitted to ICASSP 202
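The streaming property claimed in this abstract can be illustrated with a minimal sketch. It is not RWKV's actual formulation, which the abstract does not spell out; it is a simplified, exponentially decayed weighted average with one scalar key per frame, and the names `StreamingDecayedAttention`, `decay`, and `full_sequence_reference` are hypothetical. The point it shows is that the recurrent form keeps a fixed-size state per frame, while the equivalent full-sequence form has to revisit the whole history.

```python
import math
from typing import List, Sequence


class StreamingDecayedAttention:
    """Illustrative decayed-attention recurrence (NOT the exact RWKV update).

    Assumption: one scalar key per frame and a single shared decay factor.
    No numerical stabilisation is applied, so very large keys can overflow.
    """

    def __init__(self, dim: int, decay: float = 0.9) -> None:
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay
        self.num = [0.0] * dim  # running weighted sum of values
        self.den = 0.0          # running sum of weights

    def step(self, key: float, value: Sequence[float]) -> List[float]:
        """Consume one frame and return its output using O(dim) state."""
        weight = math.exp(key)
        self.num = [self.decay * n + weight * v for n, v in zip(self.num, value)]
        self.den = self.decay * self.den + weight
        return [n / self.den for n in self.num]


def full_sequence_reference(
    keys: Sequence[float], values: Sequence[Sequence[float]], decay: float
) -> List[List[float]]:
    """Same outputs computed non-recurrently by re-reading all past frames."""
    outputs = []
    dim = len(values[0])
    for t in range(len(keys)):
        weights = [decay ** (t - i) * math.exp(keys[i]) for i in range(t + 1)]
        total = sum(weights)
        outputs.append(
            [sum(w * values[i][d] for i, w in enumerate(weights)) / total
             for d in range(dim)]
        )
    return outputs
```

Calling `step` once per incoming frame yields the same values as `full_sequence_reference` up to floating-point error, without storing earlier frames.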

    An Approximation of Label Distribution-Based Ensemble Learning Method for Online Educational Prediction

    Online education has become increasingly important since traditional learning was heavily disrupted by COVID-19. To better develop personalized learning plans for students, it is necessary to build a model that can automatically evaluate students’ performance in online education. For this purpose, in this study we propose an ensemble learning method named light gradient boosting channel attention network (LGBCAN), which is based on label distribution estimation. First, the light gradient boosting machine (LightGBM) is used to predict performance in online learning tasks. Then the channel attention network (CAN) model further improves on LightGBM by focusing on the better results in the K-fold CrossEntropy of LightGBM. The results are converted into predicted classes through a post-processing method, named approximation of label distribution, to complete the classification task. The experiments are conducted on two datasets, data science bowl (DSB) and answer correctness prediction (ACP). The experimental results on both datasets suggest that our model has better robustness and generalization ability.
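The abstract does not describe how the label-distribution post-processing works. One common way to turn continuous model scores into classes is quantile thresholding: cut points are chosen so that the share of predictions in each class matches a target label distribution. The sketch below shows that swapped-in technique, not the paper's own procedure; the function names and the example fractions are hypothetical.

```python
from typing import List, Sequence


def thresholds_from_distribution(
    scores: Sequence[float], class_fractions: Sequence[float]
) -> List[float]:
    """Pick score cut points so predicted class shares match class_fractions.

    Assumption: classes are ordinal (0, 1, 2, ...) and a higher score means
    a higher class. class_fractions would typically come from training labels.
    """
    if abs(sum(class_fractions) - 1.0) > 1e-9:
        raise ValueError("class_fractions must sum to 1")
    ordered = sorted(scores)
    n = len(ordered)
    thresholds = []
    cumulative = 0.0
    for fraction in class_fractions[:-1]:
        cumulative += fraction
        index = min(n - 1, max(0, round(cumulative * n)))
        thresholds.append(ordered[index])
    return thresholds


def assign_classes(scores: Sequence[float], thresholds: Sequence[float]) -> List[int]:
    """A score's class is the number of thresholds it meets or exceeds."""
    return [sum(score >= t for t in thresholds) for score in scores]
```

With ten distinct scores and target fractions of 0.5, 0.3, and 0.2, `assign_classes` places five, three, and two items in classes 0, 1, and 2 respectively.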

    Facial Expression Retargeting from Human to Avatar Made Easy

    Facial expression retargeting from humans to virtual characters is a useful technique in computer graphics and animation. Traditional methods use markers or blendshapes to construct a mapping between the human and avatar faces. However, these approaches require a tedious 3D modeling process, and the performance relies on the modelers' experience. In this paper, we propose a brand-new solution to this cross-domain expression transfer problem via nonlinear expression embedding and expression domain translation. We first build low-dimensional latent spaces for the human and avatar facial expressions with a variational autoencoder. Then we construct correspondences between the two latent spaces guided by geometric and perceptual constraints. Specifically, we design geometric correspondences to reflect geometric matching and utilize a triplet data structure to express users' perceptual preferences for avatar expressions. A user-friendly method is proposed to automatically generate triplets for a system that allows users to annotate the correspondences easily and efficiently. Using both geometric and perceptual correspondences, we train a network for expression domain translation from human to avatar. Extensive experimental results and user studies demonstrate that even nonprofessional users can apply our method to generate high-quality facial expression retargeting results with less time and effort. Comment: IEEE Transactions on Visualization and Computer Graphics (TVCG), to appea

    Analysis of Continuous Development with Nuclear Energy

    As the energy crisis gradually approaches, the development of new energy sources has become an inevitable need. The use of solar energy, nuclear energy, wind energy, and other new energy sources continues to grow. To study the economic impact of new energy, this paper reviews related papers that discuss the advantages and disadvantages of new energy in comparison with traditional energy. The results show the impact of new energy on the economy. By analyzing its internal and external strengths, the paper forecasts the future trends new energy could follow. By analyzing its shortcomings, the paper shows that the weaknesses, threats, and difficulties people will face in using new energy include its high cost and the problem of disposing of its waste. The advantages and drawbacks of the future development of new energy are examined in this article, along with some potential solutions that may help increase the advantages and reduce the problems.

    Moving Metric Detection and Alerting System at eBay

    At eBay, there are thousands of product health metrics for different domain teams to monitor. We built a two-phase alerting system, based on anomaly detection and alert retrieval, to notify users with actionable alerts. In the first phase, we developed an efficient anomaly detection algorithm, called Moving Metric Detector (MMD), to identify potential alerts among metrics using distribution-agnostic criteria. In the second phase, alert retrieval, we built additional logic with feedback to select valid actionable alerts using a point-wise ranking model and business rules. Compared with other trend and seasonality decomposition methods, our decomposer is faster and better at detecting anomalies in unsupervised cases. Our two-phase approach dramatically improves alert precision and avoids alert spamming in eBay production. Comment: This work was presented orally at the AAAI-20 Workshop on Cloud Intelligence, 202