Fast computation of non-Gaussian covariance of redshift-space galaxy power spectrum multipoles
The non-Gaussian part of the covariance matrix of the galaxy power spectrum
involves the connected four-point correlation in Fourier space, i.e.
trispectrum. This paper introduces a fast method to compute the non-Gaussian
part of the covariance matrix of the galaxy power spectrum multipoles in
redshift space at tree-level standard perturbation theory. For the tree-level
galaxy trispectrum, the angular integral between two wavevectors can be
evaluated analytically by employing an FFTLog. The new implementation computes
the non-Gaussian covariance of the power spectrum monopole, quadrupole,
hexadecapole and their cross-covariance in O(10) seconds, for an effectively
arbitrary number of instances of cosmological and galaxy bias parameters and
redshift, without any parallelization or acceleration. This is a large advantage
over conventional numerical integration. We demonstrate that the computation of
the covariance at k = 0.005 - 0.4 h/Mpc gives results with 0.1 - 1% accuracy.
The efficient computation of the analytic covariance can be useful for future
galaxy surveys, especially utilizing multi-tracer analysis.
Comment: 13 pages, 4 figures, to be submitted to Phys. Rev.
A Case Study of Bootstrap Masker Quality Assessment for Speech-Privacy Protection
In this paper, we discuss the quality assessment of a new method for the generation of a masker for speech privacy protection. This masker includes speech characteristics that prevent eavesdroppers from overhearing conversations in public spaces. Previous research shows that maskers generated from the target speech interfere with listening more effectively than other maskers. Therefore, we propose a bootstrap (BS) masker method that efficiently generates a masker from a small sample of the recorded speech. We evaluate the subjective speech intelligibility and establish that the BS masker can achieve the same level of intelligibility as that of the conventional additional masker at an approximately 4 dB lower target-to-masker ratio.
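The target-to-masker ratio (TMR) quoted in this abstract is conventionally the ratio of target speech power to masker power, expressed in decibels. The following is a minimal sketch of that definition, assuming signals are given as plain lists of samples; the function names `tmr_db` and `masker_gain_for_tmr` are hypothetical and do not come from the paper.

```python
import math

def mean_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)

def tmr_db(target, masker):
    """Target-to-masker ratio in dB: 10 * log10(P_target / P_masker)."""
    return 10.0 * math.log10(mean_power(target) / mean_power(masker))

def masker_gain_for_tmr(target, masker, desired_tmr_db):
    """Amplitude gain to apply to the masker so that the TMR equals desired_tmr_db."""
    current = tmr_db(target, masker)
    # Scaling the masker amplitude by g lowers the TMR by 20 * log10(g) dB.
    return 10.0 ** ((current - desired_tmr_db) / 20.0)

# Toy signals (assumed values, for illustration only).
target = [1.0, -1.0, 1.0, -1.0]
masker = [0.5, -0.5, 0.5, -0.5]
gain = masker_gain_for_tmr(target, masker, desired_tmr_db=0.0)
scaled_masker = [gain * m for m in masker]
```

Under this definition, a 4 dB change in TMR corresponds to a change in the power ratio by a factor of about 2.5 (10^0.4).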
Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition
We present a novel integration of an instruction-tuned large language model
(LLM) and end-to-end automatic speech recognition (ASR). Modern LLMs can
perform a wide range of linguistic tasks within zero-shot learning when
provided with a precise instruction or a prompt to guide the text generation
process towards the desired task. We explore using this zero-shot capability of
LLMs to extract linguistic information that can contribute to improving ASR
performance. Specifically, we direct an LLM to correct grammatical errors in an
ASR hypothesis and harness the embedded linguistic knowledge to conduct
end-to-end ASR. The proposed model is built on the hybrid connectionist
temporal classification (CTC) and attention architecture, where an
instruction-tuned LLM (i.e., Llama2) is employed as a front-end of the decoder.
An ASR hypothesis, subject to correction, is obtained from the encoder via CTC
decoding, which is then fed into the LLM along with an instruction. The decoder
subsequently takes as input the LLM embeddings to perform sequence generation,
incorporating acoustic information from the encoder output. Experimental
results and analyses demonstrate that the proposed integration yields promising
performance improvements, and our approach largely benefits from LLM-based
rescoring.
Comment: Submitted to ICASSP202
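The CTC decoding step that yields the ASR hypothesis can be illustrated with greedy (best-path) decoding: pick the most likely token in each frame, merge consecutive repeats, then drop blanks. This is only a sketch. The abstract does not say which CTC decoding variant is used, and the blank index 0 and the toy posteriors below are assumptions.

```python
def ctc_greedy_decode(frame_probs, blank=0):
    """Greedy CTC decoding over per-frame token probabilities.

    frame_probs: list of per-frame probability lists (one entry per token).
    blank: index of the CTC blank token (assumed to be 0 here).
    """
    # 1. Best path: the argmax token for every frame.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]

    # 2. Merge consecutive repeats, then remove blanks.
    decoded = []
    prev = None
    for tok in best_path:
        if tok != prev and tok != blank:
            decoded.append(tok)
        prev = tok
    return decoded

# Toy posteriors over {0: blank, 1, 2} for five frames (assumed values).
frames = [
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
    [0.9, 0.05, 0.05],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
]
hypothesis = ctc_greedy_decode(frames)
```

The blank between the two runs of token 1 is what lets the same token appear twice in the output instead of being merged into one.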
Neural magnetic field dependent fMRI toward direct functional connectivity measurements: A phantom study
Imaging functional connectivity in the brain has recently become a central issue in neuroscience. However, no modality that can measure functional connectivity directly has yet been developed. Here, we present a novel MRI sequence, called the partial spinlock sequence, aimed at direct measurement of functional connectivity. This study investigates the possible measurement of phase differences directly associated with functional connectivity. With partial spinlock imaging, the neural magnetic field might influence the magnetic resonance signals. Using simulation and phantom studies to model the neural magnetic fields, we showed that magnetic resonance signals vary depending on the phase of an externally applied oscillating magnetic field with non-right flip angles. These results suggest that the partial spinlock sequence is a promising modality for functional connectivity measurements.
InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss
This paper presents InterMPL, a semi-supervised learning method of end-to-end
automatic speech recognition (ASR) that performs pseudo-labeling (PL) with
intermediate supervision. Momentum PL (MPL) trains a connectionist temporal
classification (CTC)-based model on unlabeled data by continuously generating
pseudo-labels on the fly and improving their quality. In contrast to
autoregressive formulations, such as the attention-based encoder-decoder and
transducer, CTC is well suited for MPL, or PL-based semi-supervised ASR in
general, owing to its simple/fast inference algorithm and robustness against
generating collapsed labels. However, CTC generally performs worse than
the autoregressive models due to the conditional independence assumption,
thereby limiting the performance of MPL. We propose to enhance MPL by
introducing intermediate loss, inspired by the recent advances in CTC-based
modeling. Specifically, we focus on self-conditional and hierarchical
conditional CTC, which apply auxiliary CTC losses to intermediate layers such
that the conditional independence assumption is explicitly relaxed. We also
explore how pseudo-labels should be generated and used as supervision for
intermediate losses. Experimental results in different semi-supervised settings
demonstrate that the proposed approach outperforms MPL and improves an ASR
model with an absolute performance gain of up to 12.1%. In addition, our detailed
analysis validates the importance of the intermediate loss.
Comment: Submitted to ICASSP202