A* shortest string decoding for non-idempotent semirings
The single shortest path algorithm is undefined for weighted finite-state
automata over non-idempotent semirings because such semirings do not guarantee
the existence of a shortest path. However, in non-idempotent semirings
admitting an order satisfying a monotonicity condition (such as the plus-times
or log semirings), the notion of shortest string is well-defined. We describe
an algorithm which finds the shortest string for a weighted non-deterministic
automaton over such semirings using the backwards shortest distance of an
equivalent deterministic automaton (DFA) as a heuristic for A* search performed
over a companion idempotent semiring, which is proven to return the shortest
string. While there may be exponentially more states in the DFA, this algorithm
needs to visit only a small fraction of them if determinization is performed
"on the fly".
Comment: Ten pages, two figures. To appear in the proceedings of the 18th
Conference of the European Chapter of the Association for Computational
Linguistics
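To make the distinction between shortest path and shortest string concrete, here is a minimal sketch in Python. It is not the algorithm from the abstract (no A* search, no determinization); it only illustrates why the two notions diverge in a non-idempotent semiring. The toy path list, the function names, and the use of negative log probabilities as weights are all assumptions made for illustration.

```python
import math

def log_plus(a, b):
    """Log-semiring addition on negative log weights: -log(e^-a + e^-b)."""
    return -math.log(math.exp(-a) + math.exp(-b))

# Hypothetical toy automaton, flattened to its accepting paths.
# Each entry is (string spelled by the path, path weight as -log probability).
# The string "b" is spelled by two distinct paths, "a" by one.
paths = [
    ("a", -math.log(0.4)),
    ("b", -math.log(0.3)),
    ("b", -math.log(0.3)),
]

def shortest_path_string(paths):
    """Tropical (min, +) view: pick the string of the single best path."""
    return min(paths, key=lambda p: p[1])[0]

def shortest_string(paths):
    """Log-semiring view: sum the weights of all paths per string, then minimize."""
    totals = {}
    for s, w in paths:
        totals[s] = log_plus(totals[s], w) if s in totals else w
    return min(totals, key=totals.get)

if __name__ == "__main__":
    # min is idempotent (min(x, x) == x); log_plus is not (x (+) x == x - log 2).
    print(min(1.0, 1.0), log_plus(1.0, 1.0))
    print(shortest_path_string(paths), shortest_string(paths))
```

In this toy case the best single path spells "a", while the string with the best total weight is "b", because its two paths combine under the non-idempotent addition. Enumerating paths like this is only feasible for tiny examples; the abstract's approach avoids it by searching an on-the-fly determinized automaton.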
Factor oracle : a new structure for pattern matching
We introduce a new automaton on a word p, a sequence of letters over an alphabet Σ, which we call the factor oracle. This automaton is acyclic, recognizes at least the factors of p, has m+1 states (where m is the length of p), and has a linear number of transitions. We give an on-line construction to build it. We use this new structure in string matching algorithms that we conjecture to be optimal according to the experimental results. These algorithms are as efficient as existing ones while using less memory and being easier to implement.
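The on-line construction mentioned above can be sketched as follows. This is a minimal Python sketch of the commonly described incremental construction using a supply (suffix-link) function; the function names `build_factor_oracle` and `accepts`, and the dictionary-based transition table, are illustrative choices rather than anything specified in the abstract.

```python
def build_factor_oracle(p):
    """Build a factor oracle for p on-line, one letter at a time.

    Returns (trans, supply), where trans[i] maps a letter to a target state
    and supply[i] is the supply link of state i (-1 for the initial state).
    States are numbered 0..len(p), so there are len(p) + 1 of them.
    """
    m = len(p)
    trans = [dict() for _ in range(m + 1)]
    supply = [-1] * (m + 1)
    for i in range(1, m + 1):
        c = p[i - 1]
        trans[i - 1][c] = i              # internal transition i-1 -> i
        k = supply[i - 1]
        # Follow supply links, adding external transitions where c is missing.
        while k > -1 and c not in trans[k]:
            trans[k][c] = i
            k = supply[k]
        supply[i] = 0 if k == -1 else trans[k][c]
    return trans, supply

def accepts(trans, w):
    """Every state is accepting, so w is accepted if the walk never gets stuck."""
    state = 0
    for c in w:
        if c not in trans[state]:
            return False
        state = trans[state][c]
    return True

if __name__ == "__main__":
    word = "abbbaab"
    trans, _ = build_factor_oracle(word)
    print(len(trans), sum(len(t) for t in trans))
    print(accepts(trans, "bbaa"), accepts(trans, "aba"), accepts(trans, "abab"))
```

For the word "abbbaab", this sketch accepts every factor but also accepts "aba", which is not a factor. That is the sense of "recognizes at least the factors of p": the oracle may produce false positives, so matching algorithms built on it rely on rejections rather than on acceptances.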
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study
In the era of large models, the autoregressive nature of decoding often
results in latency serving as a significant bottleneck. We propose a
non-autoregressive LM-fused ASR system that effectively leverages the
parallelization capabilities of accelerator hardware. Our approach combines the
Universal Speech Model (USM) and the PaLM 2 language model in per-segment
scoring mode, achieving an average relative WER improvement across all
languages of 10.8% on FLEURS and 3.6% on YouTube captioning. Furthermore, our
comprehensive ablation study analyzes key parameters such as LLM size, context
length, vocabulary size, and fusion methodology. For instance, we explore the
impact of LLM size ranging from 128M to 340B parameters on ASR performance.
This study provides valuable insights into the factors influencing the
effectiveness of practical large-scale LM-fused speech recognition systems.
Comment: ICASSP 202