113 research outputs found

Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference

Author: Eng Nicholas
Higuchi Yosuke
Someki Masao
Watanabe Shinji
Publication date: 30/09/2023

Attention-based encoder-decoder models with autoregressive (AR) decoding have proven to be the dominant approach for automatic speech recognition (ASR) due to their superior accuracy. However, they often suffer from slow inference. This is primarily attributed to the incremental calculation of the decoder. This work proposes a partially AR framework, which employs segment-level vectorized beam search for improving the inference speed of an ASR model based on the hybrid connectionist temporal classification (CTC) attention-based architecture. It first generates an initial hypothesis using greedy CTC decoding, identifying low-confidence tokens based on their output probabilities. We then utilize the decoder to perform segment-level vectorized beam search on these tokens, re-predicting in parallel with minimal decoder calculations. Experimental results show that our method is 12 to 13 times faster in inference on the LibriSpeech corpus over AR decoding whilst preserving high accuracy.
Comment: Accepted at ASRU 202

arXiv.org e-Print Archive

ESPnet-ONNX: Bridging a Gap Between Research and Production

Author: Hayashi Tomoki
Higuchi Yosuke
Someki Masao
Watanabe Shinji
Publication date: 20/09/2022

In the field of deep learning, researchers often focus on inventing novel neural network models and improving benchmarks. In contrast, application developers are interested in making models suitable for actual products, which involves optimizing a model for faster inference and adapting a model to various platforms (e.g., C++ and Python). In this work, to fill the gap between the two, we establish an effective procedure for optimizing a PyTorch-based research-oriented model for deployment, taking ESPnet, a widely used toolkit for speech processing, as an instance. We introduce different techniques to ESPnet, including converting a model into an ONNX format, fusing nodes in a graph, and quantizing parameters, which lead to approximately 1.3-2× speedup in various tasks (i.e., ASR, TTS, speech translation, and spoken language understanding) while keeping its performance without any additional training. Our ESPnet-ONNX will be publicly available at https://github.com/espnet/espnet_onnx
Comment: Accepted to APSIPA ASC 202

arXiv.org e-Print Archive

Worker Displacement in Japan and Canada

Author: Abe Masahiro
Higuchi Yoshio
Kuhn Peter Joseph
Nakamura Masao
Sweetman Arthur
Publication venue: Upjohn Research
Publication date: 01/01/2002

Statistics Canada for generously providing customized counts of separation and displacement rate

CiteSeerX

Upjohn Research

Social Consciousness and Local Politics under the Post-1955 Regime : Analyzing Voting Behaviors in Tokushima, Japan

Author: Higuchi Naoto
Kubota Shigeru
Maruyama Masao
Matsutani Mitsuru
Murase Hiroshi
Takaki Ryosuke
Yabe Takuya
Publication date: 20/04/2021

Tokushima University Institutional Repository

Stereotactic body radiotherapy for lung tumors : Preliminary results from a single institution

Author: HIGUCHI Makiko
HIRATSUKA Junichi
KAKUBA Koki
KAMITANI Nobuhiko
KONISHI Kei
NAGASE Naomi
NAKATA Masao
OKA Mikio
TANI Tadash
TOKIYA Ryoji
YODEN Eisaku
Publication venue: 'Radiological Society of North America (RSNA)'
Publication date: 01/01/2012

Kawasaki Medical School Institutional Repository / 川崎医科大学学術機関リポジトリ